CSS 692
Social Network Analysis

Maksim (Max) Tsvetovat
Center for Social Complexity
George Mason University
mtsvetov@gmu.edu

January 19, 2010
This is version 6.0

1 Introduction

There has been a dramatic rise in the use of social network analysis over the last decade. The availability of standard texts and robust software has undoubtedly contributed to this increase. Social network analysis focuses on the relationships between actors and acknowledges that an individual's behaviour is influenced by those around them. Actors and their actions are viewed as interdependent rather than independent units. This view means that the unit of analysis is not the individual, but an entity consisting of the individuals and the linkages connecting them.

The purpose of this class is to introduce you to both the social-science and the mathematical concepts underlying the field of social network analysis. We shall look at the description and visualisation of network data and consider issues of validity and representation. We will then focus on uncovering structural properties of individual actors and on the detection and description of groups. Finally, we will consider how to test network hypotheses.

This is a research-oriented course; its purpose is to give you basic tools for navigating the Social Network Analysis literature and to introduce you to the methods of doing social network analysis on real data.

2 Course Mechanics

2.1 Course schedule

The course meets weekly, on Tuesdays at 4:30 pm, in Innovation Hall 135.

2.2 Facebook

I'm going to attempt to use Facebook as our course website this semester. I've never done this before, so if it fails we'll fall back to old-fashioned email. If you don't have a Facebook account already, please create one (you don't have to friend me, I won't be offended).

Once you are logged onto Facebook, search for "CSS 692" and the first search result will be our class page with this syllabus. Click "Become a Fan" and you will be allowed to post and comment on the page.

One of the side-benefits of using Facebook as our class site is that we'll be able to capture our online interactions and analyze our class social network, which, if the experiment is successful, will be one of your homeworks.

2.3 Office Hours

The official office hours for this course are between 3 pm and 4:15 pm on Tuesdays and Thursdays, and by appointment at mutually convenient times. My office is in room 381 of the Research 1 building.

The easiest way to reach me with a question or concern is by email or Facebook. If the matter requires a face-to-face meeting, we can also schedule appointments at mutually convenient times.

3 Readings

Stanley Wasserman and Katherine Faust, Social Network Analysis: Methods and Applications
    Paperback: 857 pages (hardcover also available)
    Publisher: Cambridge University Press (November 25, 1994)
    ISBN: 0521387078
    List Price: $32.95

This book is the Bible and the Cookbook of SNA, and will answer every one of your questions as long as it begins with the words "how do I...". If your question begins with "why", you may have a slightly harder time. For that, I will provide plenty of supplemental readings on the website.

A number of research papers will also be included in the readings. In a way, these are more important than the textbook, as they illustrate the history of the field as well as the state of the art. Readings for each class will be posted on the website as PDF files.

4 Schedule of Topics

Week 1 - Jan 19
    What is social network analysis; social network analysis and link analysis; survey of tools and applications; basic graph theory; nodes, edges, graphs; graph density; walks, paths and geodesics.

Week 2 - Jan 26
    Centrality in social networks; degree centrality; closeness centrality; betweenness centrality; power centrality. Network Workbench, UCINET and NetDraw Lab; first practical analysis session. Homework 1 handed out.

Week 3 - Feb 2
    Cohesive subgroups; cliques; clusters; clans; Sept. 11 Hijacker network; potentially taught by a guest lecturer.

Week 4 - Feb 9
    Brokerage and structural holes; cohesion and closure; friendship vs. competition; tradeoffs in efficiency vs. inclusiveness; implication in organization theory, politics. Homework 1 due.

Week 5 - Feb 16
    Most likely, a guest lecture. I'll be at the ISA Conference in New Orleans.

Week 6 - Feb 23
    Presentations of project proposals by students.

Week 7 - Mar 2
    Block modeling; finding distinct roles in social networks; analyzing social groups as systems of roles.

SPRING BREAK - Mar 9

Week 8 - Mar 16
    Distance and clustering in social networks; analyzing similarities and differences; multi-dimensional scaling.

Week 9 - Mar 23
    Strength of ties; dealing with non-binary networks; strength of weak ties; strength of strong ties. Homework 2 handed out.

Week 10 - Mar 30
    First approach to 2-mode networks; knowledge networks; feature matrices; networks of similarities.

Week 11 - Apr 6
    Formation of social networks through information exchange; effect of networks on information exchange. Project Checkpoint due.
Week 12 - Apr 13
    Rich social networks; PCANS; MetaMatrix; Semantic Social Networks; Link Analysis Networks. Homework 2 due.

Week 13 - Apr 20
    Dynamic social networks; dealing with the dimension of time; evolution of networks over time; forces in networks.

Week 14 - Apr 27
    Visualization of networks; pretty pictures; what looks good and how to make it; network movies. Paper draft due.

May 4 - FINAL PRESENTATIONS

5 Assignments and Projects

The goal of this course is to familiarize you with the research techniques and interpretations that comprise the field of Social Network Analysis. Thus, the course work is designed to expose you not only to the primary concepts, but also to the real-world techniques and their limitations and pitfalls.

5.1 Homeworks

There will be 2 homeworks where you will get a chance to test and apply the techniques you learn in class, using available data. Think of the homeworks as a walled-off playground where you can test your analysis tools and skills. This is also the right place and time to resolve any questions you may have about the material we are working with. Each homework is worth 20% of your grade.

5.2 Course Project - Small Groups

The goal of the course project is to expose you to the way Social Network Analysis is done in the real world. In the course project, you will need to complete a social network study, complete with data acquisition, analysis, visualization and interpretation.

Given the project-oriented nature of this course, you will learn more and achieve more interesting results if you work in small groups. I recommend that you work in groups of 2-3 people. I also recommend that all members of the group participate in the project at the data collection and analysis stages. After the Checkpoint, you should designate one person to act as an editor; this will result in a higher quality of writing in the end product.

The goal of the Checkpoint and the Draft deadlines is to prevent procrastination on the course projects.
In the ideal world, you should make steady progress towards the goals you stated in your project proposal from the beginning of the course. This will result in better overall quality of your final paper, and in a stress-free final presentation.

Please start thinking about your project topic, and recruit members of your project team as soon as you can. I have a few project topics that I could give you if you are stuck, but please make a reasonable effort to come up with one of your own.

The project is broken down into several stages:

Project Proposal (due September 25, 5% of the grade)
    Please submit a 1 or 2-page abstract of what you are planning to do. Each project group will give a 10-minute in-class presentation of the proposal, followed by 5 minutes of Q and A. Please prepare a short PowerPoint presentation (2-5 slides) and email it to me ahead of the presentation. If you are having difficulties zeroing in on the project topic, please talk to me earlier rather than later; once the project proposal is presented, please consider it set in stone.

Checkpoint (due October 6, 5% of the grade)
    By the time of the checkpoint, you should have completed data collection and. If you are having major problems with any of the steps, this is the time to talk about it. The checkpoint will be graded on a 5-point scale (0 = nothing done yet, 5 = strong progress towards stated goal).

Paper Draft (due November 27, 5% of the grade)
    An assessment of your progress. The research itself should be practically complete; we should be able to have a fruitful discussion about your results and their interpretation. I will act as an editor and give you written feedback, both on the quality of your research and the quality of your writing. From that point on, you will have between two and three weeks to finish writing and produce a polished piece. The draft will be graded on a 5-point scale (0 = not started writing or not submitted the draft, 5 = only minor editing required for final submission).

Project Paper (due December 18, 25% of the grade)
    The final product of the course project should be a scholarly paper describing the motivation for the project, data collection and analysis methods, results and discussion thereof.
    The goal is to create a paper that may be presented at a social network analysis conference, and potentially lead to a longer-term research project with multiple publications. The paper will be graded on its scientific merit, as well as the quality of writing. While I am not expecting a written masterpiece, the paper should at least be readable. I can put you in touch with writing and editorial help and resources, if you require this kind of help.

5.3 Mini-Conference, 10% of the grade

Instead of a formal final exam, we will conclude this course with a mini-conference open to the public. The presentation format of the mini-conference will mimic that of a real research meeting, and serve as a training ground for further presentations in the field.

I repeat: the final presentations will be open to the public. Please choose your project topic in such a way that a public presentation will not get you (and me!) fired or sued.

The Mini-Conference will occur on Tuesday, December 11, between 4:30 pm and 8:00 pm. Every course project will be given a 20-minute presentation slot, with a 5-10 minute question-and-answer period at the end. We will order pizzas during final presentations, and finish the course with a happy hour in one of the local pubs.

6 Grade Breakdown

Course Project - 50% of the total grade, broken down as follows:
    Project Proposal - 5%
    Checkpoint - 5%
    Paper Draft - 5%
    Project Paper - 25%
    Presentation - 10%
Homeworks - 40%, or 20% each
Course Participation - 10%

6.1 Late Assignments

Homeworks will be accepted up to a day late with no penalty. After this grace period, the penalty is 10% per day.

Lateness in course projects will most likely be caused by overly ambitious project proposals, so be careful not to bite off more than you can chew. If you end up with an overly ambitious project, write up a portion of it for the course and turn it in on time, and then continue to work on the project as an independent study course.

7 Software

The purpose of this course is NOT to teach you how to use software packages for network analysis, but to work through the concepts and methodologies of the field. There are a number of good packages available for social network analysis, and all of them have a place under the sun. I will make each of the packages below available as a download, and also
post downloadable documentation for them. For the final project, you will have to make a choice of software tools, and you are responsible for learning how to use them.

7.1 Social Network Analysis and Visualization

Every piece of software below has its strengths and weaknesses, and none are perfect. You are free to experiment with all of them and decide what works best for you.

I will make an assumption in this class that you will be able to learn the software on your own. RTFM, please. All of the tools have strange user interfaces (if they have one at all); some are well-documented and some are not. I'll run a UCINET tutorial before homework 1 is due, but that's about all the help you'll get from me unless you're really stuck.

Network Workbench: http://nwb.slis.indiana.edu/
    A brand new package, very impressive start. We'll try to use it in class and see how far we can get. Free and Open Source.

Python NetworkX: http://networkx.lanl.gov/
    A very nice package for those who can program. If you can deal with SAS or Matlab or the like, it may be a very good choice for you. Free and Open Source.

R SNA Package: http://erzuli.ss.uci.edu/r.stuff/
    Some swear by it, many swear AT it. Best mathematics implementation, best statistical methods, best of breed as far as rigor. If you already know R, you'll be right at home. If not, you'll be in a lot of pain. Free and Open Source.

UCINET, NetDraw, Mage: http://www.analytictech.com/downloaduc6.htm
    UCINET exposes to the user much of the mechanics of doing social network analysis. The package is very comprehensive and covers most tasks you will face in analysis. The major drawback is an outdated user interface, which makes it difficult to do multiple analysis sets. Free for 30 days ("trial copy"), $40 for a student license.

ORA: http://www.casos.cs.cmu.edu/projects/ora
    A very capable package with a nice visualization engine and clean user interface.
    I recommend ORA for all non-technical users as it is much easier to learn (even if not 100% complete). Free as Beer, not as Speech.

Pajek: http://vlado.fmf.uni-lj.si/pub/networks/pajek/
    A Slovenian package. Stellar capabilities, but strange user interface. You can use it instead of UCINET if you prefer. Free as Beer, not as Speech.

NetMiner: www.netminer.com
    A commercial package; nice analysis and visualization with a clean interface, but very expensive. There's a 30-day trial version that you could check out.

GUESS Graph Exploration package: http://www.hpl.hp.com/research/idl/projects/graphs/guess.html
    An open-source program for interactive graph exploration. Fully programmable in Python; nice visualization. Does not include a full set of network algorithms yet, but can be very useful for the programmers in the group. Free and Open Source, but poorly maintained (if at all).

7.2 Other software

Acrobat Reader
Mathematics software - Matlab, Octave, SciLab, Mathematica. Optional.
Stats software - Stata, SPSS, R. Optional.

7.3 Writing

Use any word processor you are comfortable with; for electronic submissions, I recommend sending me PDF files. Anybody willing to learn LaTeX and use it for paper writing will be rewarded with free pizza (beer, lunch, take your pick) ;-)

8 Collaboration and Plagiarism

This course's plagiarism policy adheres to standard academic practices. If you continue to work in a university setting or publish in scholarly publications, you can expect to face very similar standards.

Homeworks are designed to help you enhance your analysis skills, and to teach you to use the software packages. Most of the software we use does not have clean interfaces, and you will have a pretty difficult time learning it. Therefore, it is OK if you work in groups during the analysis stage of your project. However, all writing should be individual and reflect personal interpretations and conclusions drawn from the data.

Longer assignments (the Project Proposal, Checkpoint and the Project Paper) can and should be collaboratively authored. Make sure that the title page of the paper lists all of the coauthors.

If you receive help during the project in any significant form (including, but not limited to, programming, data processing, visualization, editing and proofreading) from any person outside of your project team, please thank this person in the Acknowledgements section.
A good guide to proper citation and acknowledgement of source material can be found at http://www.dartmouth.edu/~sources/contents.html

If your group is experiencing internal dysfunction - for example, if one person is doing all of the work while the others do nothing - this will inevitably affect the quality of the end product, and everybody's grades. If your group is not communicating well and not sharing the workload, please talk to me as soon as you can.

8.1 Plagiarism

Given the fact that collaboration is allowed and encouraged, we will probably never encounter this provision in the course. However, I am obligated to remind you that GMU functions on the Honor Code system, which means there is a Zero Tolerance policy for plagiarized assignments.

In this course, an assignment will be considered plagiarized if it consists of a verbatim copy or simple paraphrase of another student's assignment - or makes significant use of copied text, data or figures without proper acknowledgements or citations. An assignment will also be considered plagiarized if you copy research results from a published paper, unless they are presented in a context of critical evaluation and properly cited in the bibliography.

This is what the University requires me to say in regard to plagiarism:

    Any plagiarized assignment will receive an automatic grade of F. This may lead to failure for the course, resulting in dismissal from the university. This dismissal will be noted on the student's transcript. For foreign students who are on a university-sponsored visa (e.g. F-1, J-1 or J-2), dismissal also results in the revocation of their visa.

Sounds scary, doesn't it? So, PLEASE CITE YOUR COLLABORATORS!