What is Management Reporting from a Data Warehouse and What Does It Have to Do with Institutional Research? Emily Thomas Stony Brook University AIRPO Winter Workshop January 2006
Data to Information Historically Standard paper reports Systems of record Special reports written by programmers A few direct users IR extracts
Data to Information Current Technology Data Many On-line warehouse self-service transaction users processing Institutional systems reporting including tool IR
Hypothesis New information technologies are creating greater demand for information and the need for data warehouses. Building a data warehouse requires transforming raw data into reporting measures and categories. Meeting the demand for information requires creating useful reports and report templates. Institutional researchers are experts at displaying information and constructing reporting variables. Therefore participating in the development of institutional reporting programs is a new role for institutional research.
Questions What is a data warehouse? What kinds of reporting does higher education do with warehouse data? What roles are institutional researchers playing in the development of institutional reporting programs?
What is a data warehouse? A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of data in support of management s decisi`ons. (W.H. Inmon, Building the Data Warehouse)
What is a data warehouse? A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of data to describe an organization s activities and support of management s decisions.
What is a data warehouse? A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of derived data to describe an organization s activities and support of management s decisions.
What is a data warehouse? A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of derived data that are managed and institutionally recognized as a shared data resource used to describe an organization s activities and support of management s decisions.
What is a data warehouse? A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of data in support of management s decisions. (W.H. Inmon, Building the Data Warehouse) A data warehouse is a subject oriented, integrated, non-volatile, and time variant collection of derived data that are managed and institutionally recognized as a shared data resource used to describe an organization s activities and support management s decisions.
There is no question that the power user is the most important person in the corporation in regard to establishment of the data warehouse and the unleashing of the power of informational processing. (Inmon 1994,116)
Reporting Categories Operations monitoring: Which students are ready to be cleared for graduation? Operations analysis: Which students were affected by an error in processing graduation clearance? Management reporting: How many students graduated in each major in each of the last five years? Management analysis: Did the graduation GPA of recent graduates vary with whether they entered as freshmen or transfer students? Analytics: Did a new freshman program improve the graduation rate?
Star Schema Data Model Instructor Dimension Student Dimension Instructor ID Student ID Name Fact Table Name Department Student ID Class Title Course ID Major Full/Part-Time Instructor ID Location ID Course Dimension Location Dimension Term Course ID Location ID Credits Course Title Room Grade Course Department Building Gen Ed Indicator Example: http://web.mit.edu/warehouse/
Reporting Matrix Operations monitoring Operations analysis Management reporting Management Analysis Analytics Detail or aggregates Ad hoc or recurring Who does it? Output format Distribution Tools User skills Data timing Data sources Data access
Reporting Matrix: Contents and Repetition Contents Detail to support action on individual students Aggregate data to describe performance or trends Repetition Recurring Ad hoc Recurring Ad hoc
Reporting Matrix: Reporting Contents and Repetition Reporting category Contents Operations Operations monitoring analysis Detail to support action on individual students Management Management reporting Analysis Aggregate data to describe performance or trends Repetition Recurring Ad hoc Recurring Ad hoc
Reporting Matrix: Who Does the Reporting? Reporting category Objective Operations monitoring Operations analysis Detail to support action on individual students Management Management reporting Analysis Aggregate data to describe performance or trends Analytics Support for conclusions Question type Recurring Ad hoc Recurring Ad hoc Typical reporters Functional area staff Functional area technical experts Core management such as dept. chairs Management analysts Institutional researchers
Reporting Matrix: Typical Output, Distribution, Skills, Tools Reporting Operations Operations Management Management Analytics category monitoring analysis reporting Analysis Typical output lists/counts lists/counts tables/graphs tables/graphs tables/graphs statistics Distribution system output system output Tools User skill pre-programmed queries standard reports OLAP Low use of preprogrammed reports/queries/ cubes SQL reporting software analytic software High use raw data via SQL or similar extraction paper report/ web standard reports OLAP dashboards Low easy information access paper report reporting software spreadsheets Moderate manipulation of raw data with a reporting tool text document reporting software analytic softw are Very high statistical analysis and data mining
Reporting Matrix: Data Timing, Data Sources and User Access Reporting category Data timing Data sources Data access Operations monitoring real time/ daily extract transaction system/ warehouse all or restricted veiw Operations analysis real time/ daily extract transaction system/ warehouse all Management reporting snapshots/ longitudinal data cubes from warehouse or data marts all or restricted view Management Analysis snapshots/ longitudinal data warehouse/ data marts all or restricted view Analytics snapshots/ longitudinal data warehouse/ longitudinal data marts all
Trends Information culture and data availability generate increased demand for information. Web-based report delivery and user-friendly tools facilitate self-service reporting. Increased reporting generates interest in institution-wide reporting solutions. New transaction systems add data complexity that motivates warehousing.
New Roles for Institutional Research New responsibilities for designing and implementing disseminated management reporting systems New responsibilities for shared data designs including data warehouses New means of ensuring the accuracy of management information: within the data source Less staff time devoted to meeting simple data requests New relationships with IT
IR and IT Historical Responsibilties Shared Responsibility? Operational reporting IT Operational data Management reporting Wide data disseminatation --data extracted and transformed for reporting Analytics IR Extracted/constructed data --wide dissemination Special purpose IR data
Institutional Research Contributions Assessing reporting needs Advocating for new forms of information delivery Defining an institutional reporting strategy/program Defining warehouse variables and table structure Selecting an institutional reporting tool Designing standard reports or templates Managing a management information delivery system
www.stonybrook.edu/hedw
Two Types of Best Practice? (1) Fully-developed data warehouse Core of an institutional reporting program Source for all or most reporting Well-developed data model Fully defined and documented data management procedures Substantial institutional commitment and staff http://web.mit.edu/warehouse/
Two Types of Best Practice? (2) Pragmatic low-budget approach Build something. Identify the data needed to meet key reporting needs Create tables to meet those needs Clean, expand, integrate, and document the tables and extend their use
Courtesy of Henry Stewart
Sources The Data Warehousing Institute. http://tdwi.org Davenport, TH (1997). Information Ecology: Why Technology is Not Enough for Success in the Information Age. New York and Oxford: Oxford University Press. Greenfield, L (1995). The Data Warehousing Information Center. www.dwinfocenter.org. Inmon, WH (1996). Building the Data Warehouse. New York: John Wiley & Sons, Inc. Inmon WH and RD Hackathorn (1994). Using the Data Warehouse. New York: John Wiley & Sons, Inc.
Sources Kimball, R, M Ross and W Thornthwaite (1998). The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses. New York: John Wiley & Sons, Inc. Kimball, R and M Ross (2002). The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (second edition). New York: John Wiley & Sons, Inc. Sanders, L (editor), How Technology is Changing Institutional Research. New Directions in Institutional Research, 103. Fall 1999. Serban, AM and J Luan. Knowledge Management: Building a Competitive Advantage in Higher Education. New Directions in Institutional Research, 113, Spring 2002. Wierschem, D, R McBroom and J McMillen. Methodology for Developing an Institutional Data Warehouse. AIR Professional File 88, 2003.
Hypothesis New information technologies are creating greater demand for information and the need for data warehouses. Building a data warehouse requires transforming raw data into reporting measures and categories. Meeting the demand for information requires creating useful reports and report templates. Institutional researchers are experts at displaying information and constructing reporting variables. Therefore participating in the development of institutional reporting programs is becoming a new IR role.