Building an IT Taxonomy with Cooccurrence Analysis, Hierarchical Clustering, and Multidimensional Scaling

Chia-jung Tsui, Ping Wang, Kenneth R. Fleischmann, Asad B. Sayeed, Amy Weinberg, and Douglas Oard

The abundance of IT is a challenge for both IT management & information management.
[Cartoon by Sidney Harris]

We Have Lots of IT, But
  Portable Personality, Mashup, OSS, SOA, Semantic Web, Web2.0, Tera-architectures, Ajax, Ultramobile Devices, Identity Management, SCM, SaaS, BPO, Chatbots, Thin Provisioning, DRM, RFID, Business Intelligence, Cloud Computing, CRM, Application Quality Dashboards, VoIP, Distributed Encryption

Little and Dated Understanding
  [Figures dated 1993 and 1998]

Extant Approach to IT Taxonomy
  Compile a list of ITs through empirical surveys.
  Experts rate ITs according to their assessments of the technologies' functions or features.
  Limitations
    Narrow representation: arbitrary and limited choices of features and functions; few ITs
    Static: snapshots few and far between
    Not scalable: more ITs, lower reliability

Scalable Computational Approach
  Downloaded full-text articles published in 1998-2007 from six magazines:
    ComputerWorld & InformationWeek
    BusinessWeek & The Economist
    Newsweek & US News and World Report
  Extracted ~220,000 paragraphs containing 50 IT concepts.

IT Concepts Included in Analysis
  Label           Concept
  AI              Artificial intelligence
  ASP             Application service provider
  BI              Business intelligence
  Blog            Blog
  Bluetooth       Bluetooth
  BizProReen      Business process reengineering
  CloudCom        Cloud computing
  CRM             Customer relationship management
  DigiCam         Digital camera
  DLearn          Distance learning
  DSL             Digital subscriber line
  DecisionSS      Decision support system
  DW              Data warehouse
  ebiz            Electronic business
  ecom            Electronic commerce
  EDI             Electronic data interchange
  ERP             Enterprise resource planning
  ExpertSys       Expert system
  GPS             Global positioning system
  Grpware         Groupware
  IM              Instant messaging
  iphone          iPhone
  ipod            iPod
  KM              Knowledge management
  Linux           Linux
  Multimedia      Multimedia
  MP3             MP3 player
  MySpace         MySpace
  NeuralNet       Neural net
  OLAP            Online analytical processing
  OSS             Open source software
  Outsource       Outsourcing
  PDA             Personal digital assistant
  RFID            Radio frequency identification
  SmartCard       Smart card
  SCM             Supply chain management
  SFA             Salesforce automation
  SocNet          Social networking
  SOA             Service oriented architecture
  Telecommute     Telecommuting
  TabletPC        Tablet PC
  UtiComp         Utility computing
  Virtualization  Virtualization
  VPN             Virtual private network
  Web2            Web 2.0
  WebServ         Web services
  WiFi            Wi-Fi
  Wiki            Wiki
  Wikipedia       Wikipedia
  YouTube         YouTube

Scalable Computational Approach (continued)
  Counted co-occurrence of IT concepts in paragraphs.
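
The counting step can be illustrated with a minimal Python sketch. It assumes each concept is detected with a simple keyword pattern; the slides do not give the actual matching rules, so the patterns and the names `CONCEPT_PATTERNS`, `concepts_in`, and `cooccurrence_counts` are hypothetical, and only three of the 50 concepts are shown.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical keyword patterns for three of the 50 concepts. The source does
# not state how concepts were detected, so these are assumptions.
CONCEPT_PATTERNS = {
    "ERP": re.compile(r"\bERP\b|enterprise resource planning", re.IGNORECASE),
    "SCM": re.compile(r"\bSCM\b|supply[- ]chain management", re.IGNORECASE),
    "Grpware": re.compile(r"\bgroupware\b", re.IGNORECASE),
}

def concepts_in(paragraph):
    """Return the sorted labels of the concepts mentioned in one paragraph."""
    return sorted(
        label for label, pattern in CONCEPT_PATTERNS.items()
        if pattern.search(paragraph)
    )

def cooccurrence_counts(paragraphs):
    """Count, for each pair of concepts, the paragraphs that mention both."""
    counts = Counter()
    for paragraph in paragraphs:
        for pair in combinations(concepts_in(paragraph), 2):
            counts[pair] += 1
    return counts

# Toy input for illustration only.
paragraphs = [
    "The next big thing beyond ERP is supply-chain management.",
    "Links between groupware and ERP applications speed users' access to key business data.",
    "Linux adoption keeps growing.",
]
counts = cooccurrence_counts(paragraphs)
```

Each paragraph contributes at most one count per concept pair, so a paragraph that mentions ERP five times and SCM once still adds a single ERP/SCM co-occurrence.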

Co-Occurrence of IT Concepts
  Example 1: "Over the past few years, we have seen the ERP vendors, led by SAP, move into different business areas," says Byron Miller, an analyst with the Giga Information Group. The competitive advantage of just having ERP has diminished. The next big thing beyond ERP is supply-chain management.
  Example 2: Links between groupware and ERP applications speed users' access from within a groupware application to key business data, such as purchase orders, inventory, customer histories, and other supply-chain information.

Hierarchical Clustering
  [Figure: hierarchical clustering of the IT concepts, with groups labeled Cluster 1 through Cluster 8]
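
A small sketch of how co-occurrence counts could feed hierarchical clustering follows. The slides do not say which similarity measure or linkage rule was used, so the Jaccard-style distance and average linkage here are assumptions, as are the function names and the toy counts. The sketch is plain Python; a library such as SciPy would be the usual choice at the scale of 50 concepts.

```python
from itertools import combinations

def jaccard_distance(co, n_a, n_b):
    """1 - (paragraphs with both) / (paragraphs with either). Assumed measure."""
    union = n_a + n_b - co
    return 1.0 - (co / union if union else 0.0)

def average_linkage(labels, dist, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[label] for label in labels]

    def between(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: between(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]

# Toy numbers for illustration only; they are not from the study.
labels = ["ERP", "SCM", "Blog", "Wiki"]
paragraph_counts = {"ERP": 100, "SCM": 80, "Blog": 60, "Wiki": 50}
co = {
    frozenset(("ERP", "SCM")): 40,
    frozenset(("Blog", "Wiki")): 20,
    frozenset(("ERP", "Blog")): 2,
}

dist = {}
for a, b in combinations(labels, 2):
    pair = frozenset((a, b))
    dist[pair] = jaccard_distance(co.get(pair, 0), paragraph_counts[a], paragraph_counts[b])

result = average_linkage(labels, dist, n_clusters=2)
```

With these toy counts the enterprise concepts (ERP, SCM) and the social-web concepts (Blog, Wiki) end up in separate clusters. The study itself reports eight clusters over the 50 concepts.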

Face Validity of Our Approach

Benefits of This Approach
  Representative
    More IT concepts to study
      Monitor and understand popularity
    More data sources
      Represent reality by pooling data
      Compare to examine segments of communities
  Dynamic
    Multiple periods
      Reveal what exactly is diffusing
      Visualize species and speciation of innovations
  Scalable

Popularity of E-commerce & E-business
  [Figure: number of paragraphs per year, 1998-2007, for ebiz and ecom; vertical axis 0 to 3000. Source: InformationWeek]

Popularity of Web Services & SOA
  [Figure: number of paragraphs per year, 1998-2007, for SOA and WebServ; vertical axis 0 to 800. Source: InformationWeek]
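
Popularity curves like these come down to counting, for each year, the paragraphs that mention a concept. A minimal sketch, assuming each paragraph has already been reduced to a (year, set of concept labels) record; the record layout, the function name `popularity_by_year`, and the toy records are hypothetical.

```python
from collections import Counter

def popularity_by_year(records, concept):
    """Count the paragraphs mentioning `concept` in each year, sorted by year."""
    counts = Counter(year for year, labels in records if concept in labels)
    return dict(sorted(counts.items()))

# Toy records for illustration only; they are not from the study.
records = [
    (1999, {"ecom", "ebiz"}),
    (1999, {"ecom"}),
    (2000, {"ecom", "ERP"}),
    (2004, {"WebServ", "SOA"}),
    (2006, {"SOA"}),
]
ecom_trend = popularity_by_year(records, "ecom")
soa_trend = popularity_by_year(records, "SOA")
```

Years with no matching paragraphs are simply absent from the result, so a plotting step would need to fill them in with zero.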

Implications for IT Management
  When expert knowledge is not readily available, this approach offers maps of IT domains or sub-domains.
  A new technology's cluster membership may suggest its broader type.
  A taxonomy is useful for vendors in product/service labeling and for adopters in IT portfolio management.

Takeaways
  Computational discourse analysis based on co-occurrence and hierarchical clustering can help us explore complex relationships among IT concepts in a representative, dynamic, and scalable way.
  Socio-technical approach: we used social artifacts (language/discourse) to chart technological terrains.
  Effective information management and effective IT management go hand in hand.

Thank You from the PopIT Team
  Thanks to the National Science Foundation for grants IIS-0729459 and SBE-0915645
  http://terpconnect.umd.edu/~pwang/popit/
  pwang@umd.edu