Constructing Kirq, software for set-theoretic social research: A software development travelogue




Claude Rubinson
University of Houston Downtown
rubinsonc@uhd.edu
cjr@grundrisse.org
http://grundrisse.org
PyHou, Houston, Texas
March 19, 2013

Overview
- Introduction to QCA
- History of QCA software
- First (mis-)steps in developing Kirq: the fsqca package for R
- Use cases and design goals for acq and Kirq
- Python's role in meeting these design goals
- Lessons learned: Using Python for academic software projects

What is Qualitative Comparative Analysis?
- A method of conducting social research by analyzing subset relationships, using Boolean algebra
- Example: Religious fundamentalists tend to be politically conservative.
  [Figure: set diagram labeled "Set of Political Conservatives" and "Set of Religious Fundamentalists"]
- Example: Wealthy individuals tend to come from privileged families.
  [Figure: set diagram labeled "Set of People with Rich Parents" and "Set of Rich People"]

What is Qualitative Comparative Analysis?
- Particularly concerned with two types of causal relationships: necessary conditions and sufficient conditions
- Necessary condition: cause must be present for outcome to occur
  Example: Must be exposed to HIV to contract AIDS
  [Figure: set diagram labeled "Set of people with AIDS" and "Set of people exposed to HIV"]
- Sufficient condition: if cause occurs, outcome will occur
  Example: Abortion or miscarriage will terminate pregnancy
  [Figure: set diagram labeled "Set of women with abortion", "Set of women with miscarriage", and "Set of women with terminated pregnancy"]

What is Qualitative Comparative Analysis?
- Sufficient condition: if cause occurs, outcome will occur
  [Figure: set diagram with the following labels]
  Outcome: Recently deported women who do not plan to cross again
  X1: High SES women who haven't lived in the U.S. and aren't traveling with family
  X2: High SES women who haven't lived in the U.S., have only attempted to cross a few times, and felt that their last crossing experience was very dangerous
  Overlap: Women belonging to sets X1 and X2
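Both kinds of condition are subset relations, so they can be checked directly with Python's built-in set type. The sketch below is only an illustration of the definitions above: the function names and the case identifiers are hypothetical, and it handles crisp (yes/no) set membership only.

```python
def is_necessary(cause, outcome):
    """Necessary: every case with the outcome also has the cause (outcome is a subset of cause)."""
    return outcome <= cause


def is_sufficient(cause, outcome):
    """Sufficient: every case with the cause also has the outcome (cause is a subset of outcome)."""
    return cause <= outcome


# Hypothetical case identifiers, invented for illustration only.
exposed_to_hiv = {"a", "b", "c", "d"}
has_aids = {"a", "b"}

abortion = {"e"}
miscarriage = {"f", "g"}
terminated_pregnancy = {"e", "f", "g", "h"}

print(is_necessary(exposed_to_hiv, has_aids))    # exposure is necessary for AIDS
print(is_sufficient(exposed_to_hiv, has_aids))   # but exposure alone is not sufficient

# "Abortion or miscarriage" is the union of the two causal sets.
print(is_sufficient(abortion | miscarriage, terminated_pregnancy))
```

In this toy data, case "h" has a terminated pregnancy without belonging to either causal set, which is why the union is sufficient but not necessary.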

What is Qualitative Comparative Analysis?
- Challenges conventional statistical analysis, which is based upon a linear-additive model
- Complements other set-theoretic research methods (e.g., SNA and QNA)
- Does not depend upon degrees of freedom, so is useful for small-, medium-, and large-n studies
- Encourages research process that is retroductive and case-oriented

What is Qualitative Comparative Analysis?
Example: Brown and Boswell (1995)
Truth Table with Contradiction (from Table 4 of Brown and Boswell 1995)

Recent Black Migrants  Weak Union  Black Strikebreaking  Observations
T                      T           T                     East Chicago, Pittsburgh, Youngstown
T                      F           Con                   Buffalo, Chicago, Gary, Johnstown, [Cleveland]
F                      T           F                     Bethlehem, Joliet, McKeesport, Milwaukee, New Castle, Reading
F                      F           F                     Decatur, Wheeling

What is Qualitative Comparative Analysis?
Example: Brown and Boswell (1995)
Revised Truth Table without Contradiction (from Table 5 of Brown and Boswell 1995)

Recent Black Migration  Weak Union  Local Govt Repression  Black Strikebreaking  Observations
T                       T           T                      T                     East Chicago, Pittsburgh, Youngstown
T                       T           F
T                       F           T                      T                     Buffalo, Chicago, Gary, Johnstown
T                       F           F                      F                     Cleveland
F                       T           T                      F                     Bethlehem, Joliet, McKeesport, New Castle, Reading
F                       T           F                      F                     Milwaukee
F                       F           T                      F                     Decatur
F                       F           F                      F                     Wheeling
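The step from the first table to the revised one can be reproduced mechanically: group the cases by configuration and flag any configuration whose cases disagree on the outcome. The following is a minimal sketch, not the code used by acq or Kirq. The case-level values are transcribed from the revised truth table, and the names `CASES`, `CONDITIONS`, and `truth_table` are hypothetical.

```python
from collections import defaultdict

# (city, RBM, WU, LGR, black strikebreaking), transcribed from the revised table.
CASES = [
    ("East Chicago", 1, 1, 1, 1), ("Pittsburgh", 1, 1, 1, 1), ("Youngstown", 1, 1, 1, 1),
    ("Buffalo", 1, 0, 1, 1), ("Chicago", 1, 0, 1, 1), ("Gary", 1, 0, 1, 1),
    ("Johnstown", 1, 0, 1, 1), ("Cleveland", 1, 0, 0, 0),
    ("Bethlehem", 0, 1, 1, 0), ("Joliet", 0, 1, 1, 0), ("McKeesport", 0, 1, 1, 0),
    ("New Castle", 0, 1, 1, 0), ("Reading", 0, 1, 1, 0), ("Milwaukee", 0, 1, 0, 0),
    ("Decatur", 0, 0, 1, 0), ("Wheeling", 0, 0, 0, 0),
]
CONDITIONS = ("RBM", "WU", "LGR")


def truth_table(cases, use):
    """Group cases by their configuration on the conditions named in `use`."""
    idx = [CONDITIONS.index(name) for name in use]
    rows = defaultdict(lambda: {"outcomes": set(), "cases": []})
    for city, *values, outcome in cases:
        config = tuple(values[i] for i in idx)
        rows[config]["outcomes"].add(outcome)
        rows[config]["cases"].append(city)
    table = {}
    for config, row in rows.items():
        if len(row["outcomes"]) > 1:
            label = "Con"  # cases sharing a configuration disagree on the outcome
        else:
            label = "T" if row["outcomes"] == {1} else "F"
        table[config] = (label, row["cases"])
    return table


two_conditions = truth_table(CASES, ("RBM", "WU"))
three_conditions = truth_table(CASES, CONDITIONS)
for config, (label, cities) in sorted(two_conditions.items(), reverse=True):
    print(config, label, ", ".join(cities))
```

With only RBM and WU, the configuration (1, 0) mixes Cleveland with four cities that experienced strikebreaking, so it is labeled "Con". Adding LGR separates Cleveland and removes the contradiction.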

What is Qualitative Comparative Analysis?
Example: Brown and Boswell (1995)
Revised Truth Table without Contradiction (from Table 5 of Brown and Boswell 1995)

Recent Black Migration (RBM)  Weak Union (WU)  Local Govt Repression (LGR)  Black Strikebreaking  Observations
T                             T                T                            T                     East Chicago, Pittsburgh, Youngstown
T                             F                T                            T                     Buffalo, Chicago, Gary, Johnstown

RBM * WU * LGR + RBM * ~WU * LGR = Black Strikebreaking
RBM * LGR = Black Strikebreaking
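The reduction to RBM * LGR uses the Boolean rule that two terms differing in exactly one condition can be merged, dropping that condition. A minimal sketch of this pairwise merging follows. It is an assumption-laden illustration, not Kirq's algorithm: terms are tuples of 1 (present), 0 (absent), or None (eliminated), the function names are hypothetical, and the sketch stops at the merged terms without selecting a minimal cover.

```python
from itertools import combinations


def combine(a, b):
    """Merge two terms that differ in exactly one condition; otherwise return None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1:
        return None
    i = diff[0]
    if a[i] is None or b[i] is None:
        return None  # a condition already eliminated in one term cannot be merged away
    return a[:i] + (None,) + a[i + 1:]


def minimize(terms):
    """Repeatedly merge terms until no pair can be combined."""
    terms = set(terms)
    while True:
        merged, used = set(), set()
        for a, b in combinations(terms, 2):
            c = combine(a, b)
            if c is not None:
                merged.add(c)
                used.update((a, b))
        if not merged:
            return terms
        terms = (terms - used) | merged


def show(term, names):
    """Render a term in the notation used above, with ~ marking absence."""
    parts = [n if v else "~" + n for n, v in zip(names, term) if v is not None]
    return " * ".join(parts)


names = ("RBM", "WU", "LGR")
positive_rows = {(1, 1, 1), (1, 0, 1)}  # the two rows with Black Strikebreaking = T
solution = minimize(positive_rows)
print(" + ".join(show(t, names) for t in solution))
```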

Technical and Usability Challenges
- QCA algorithms are:
  - NP-hard (no exact algebraic solution)
  - O(2^N) complexity, where N indicates the number of variables (not observations) in the data set
- Because data sets tend to be small and matrix algebra isn't used, no use for NumPy
- How to maintain and encourage retroductive, case-oriented research process?
- How to make software that's efficient, useful, and usable?
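The O(2^N) growth comes from the truth table itself: N binary conditions yield 2^N possible configurations, regardless of how many cases are observed. The sketch below enumerates them with the standard library and lists the configurations that no case exhibits (often called logical remainders in the QCA literature). The function names are hypothetical, and the observed set is transcribed from the revised Brown and Boswell table.

```python
from itertools import product


def all_configurations(n):
    """Yield every combination of n binary conditions: 2**n rows in total."""
    return product((True, False), repeat=n)


def logical_remainders(observed, n):
    """Return the configurations that no observed case exhibits."""
    return [c for c in all_configurations(n) if c not in observed]


# Configurations (RBM, WU, LGR) with at least one city in the revised table.
observed = {
    (True, True, True), (True, False, True), (True, False, False),
    (False, True, True), (False, True, False), (False, False, True),
    (False, False, False),
}
remainders = logical_remainders(observed, 3)
print(remainders)

# The table doubles with every added condition.
for n in (3, 10, 20, 30):
    print(n, "conditions ->", 2 ** n, "rows")
```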

Lineage of QCA Software
- QCA (Drass and Ragin 1992)
- TOSMANA (Cronqvist 2011)
- fs/qca (Ragin, Drass, and Davey 2009)
- QCA (R) (Dusa and Thiem 2012)
- QCA3 (R) (Huang 2012)
- fuzzy (Stata) (Longest and Vaisey 2008)
- acq & Kirq (Rubinson and Reichert 2012)

Plan to throw one away; you will, anyhow
- fsqca module for R
  - Cross-platform, but requires R
  - Not user-friendly
  - Too slow
  - R programming considered harmful
- But: allowed me to realize that the user interface should be task-oriented


Use Cases for acq and Kirq
- acq: QCA at the Unix command line
  - a scratch-my-own-itch project
- Kirq: QCA for everybody else
  - a user-friendly, cross-platform GUI program

Design Goals for acq and Kirq
Software that is efficient, useful, and usable:
- Conform to the Unix philosophy
- Good out-of-the-box performance, plus ability to optimize performance
- Support and encourage good QCA research practices
- Kirq should be cross-platform and user-friendly
Also important:
- Avoid sucking up all of my time

Why Python?
The surrounding ecosystem
- Ability to hire others
- Confidence that the supporting environment is stable and will continue to be maintained
- Python is lingua franca in academia
- Rich environment for GUI toolkits, installers, etc.
- Chose Qt for GUI toolkit and PyInstaller for installer

Design Goal: Avoiding a time sink
- Relatively easy to recruit and hire good programmers
- Easy to mix procedural and OOP programming
- Official online documentation remains top notch
- The core Python language remains relatively compact
  - but not the standard library, and certainly not the surrounding environment (PyPI, etc.)

Design Goal: Conform to the Unix Philosophy
- Build a prototype as soon as possible
- Small is beautiful/do one thing well
  - acq's GUI scripts: gtt and concov
  - have resisted adding a data editor to Kirq
  - decided to offload Kirq's visualizations
- Make every program a filter
  - Because Kirq can read data from the command line, it's easy for other programs to call out to it
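"Make every program a filter" means reading data from one stream and writing results to another, so that programs can be chained together. The sketch below shows the general pattern in Python. It is a hypothetical example, not code from acq or Kirq: the function name `run_filter` and the CSV layout are assumptions, and the in-memory streams only stand in for standard input and output.

```python
import csv
import io
from collections import Counter


def run_filter(infile, outfile):
    """Read case rows as CSV; write one row per distinct configuration with its count."""
    reader = csv.reader(infile)
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to report
    counts = Counter(tuple(row) for row in reader if row)
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(header + ["n"])
    for config, n in sorted(counts.items()):
        writer.writerow(list(config) + [n])


# Demonstration with in-memory streams standing in for stdin and stdout.
demo_in = io.StringIO("RBM,WU\n1,1\n1,0\n1,1\n")
demo_out = io.StringIO()
run_filter(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

Because the function takes any pair of file-like objects, a script could pass `sys.stdin` and `sys.stdout` to it and then sit in the middle of a shell pipeline.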

Design Goal: Good out-of-the-box performance and ability to optimize
- acq had fewer lines of code than the fsqca module for R, and was faster
  - compare to the QCA module for R
- Good tools for profiling
- Some standard, well-documented practices for improving performance, although Python optimization often requires expertise
- Promise of projects such as PyPy
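As one example of the profiling tools mentioned above, the standard library's cProfile and pstats modules report where a program spends its time. The following is a small sketch, with a stand-in workload (`count_configurations`, a hypothetical function) rather than any real QCA code.

```python
import cProfile
import io
import pstats
from itertools import product


def count_configurations(n):
    """Stand-in workload: walk all 2**n configurations of n binary conditions."""
    return sum(1 for _ in product((0, 1), repeat=n))


profiler = cProfile.Profile()
profiler.enable()
total = count_configurations(12)
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top five entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The same profiler can also be run on a whole script from the shell with `python -m cProfile script.py`, which avoids editing the code being measured.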

Design Goal: Support and encourage good QCA research practices
- Less concern for performance means more attention to user-interface issues
- Writing acq as Unix shell scripts helped me streamline the QCA analysis; both acq and Kirq make it easy to modify and rerun analyses
- Have designed Kirq to facilitate interrogation and comparisons of solutions
- Lots of GUI niceties, such as tooltips and pop-out windows
- Importance of eating your own dogfood

Design Goal: Kirq should be cross-platform and user-friendly
- Standardized on Python 2.7, for PyQt
- Only minor compatibility issues with PyQt bindings and OS X, and none with Windows
- Kirq always feels native
- Could never build Qt for OS X; used MacPorts instead (slow, but works well)
- PyInstaller works really well, but needed to use development branch for OS X (dev branch is now stable)
- Session history is Kirq's killer feature

Lessons Learned: Python's Advantages (for Academic Projects)
- Core language is relatively compact, with excellent documentation
- Relatively easy to find developers
- Strong, well-developed environment of GUI toolkits, installers, etc.
- Good performance out of the box, with ability to optimize if necessary

Lessons Learned: Python's Disadvantages (for Academic Projects)
- Package distribution is a mess, as is associated documentation
- Churn in the standard library is too rapid to keep up with for a part-time developer
- Dead tree documentation is lousy
- Online signal-to-noise ratio is low
- Python community online is too insular; overly concerned with idiomatic Python