DKAN. Data Warehousing, Visualization, and Mapping




Acknowledgements

We'd like to acknowledge the NuCivic team, led by Andrew Hoppin, which has done amazing work creating open source tools to make data available to the world; it's been a pleasure improving DKAN together over the past two years. Gemima Barlow and the NDI Nigeria team initially supported the development of color shaded maps, teaching us the meaning of the word "choropleth" in the process. We also thank NDI's Gender, Women and Democracy team for identifying and funding important usability improvements.

This content is available under a Creative Commons Attribution-ShareAlike 4.0 International Public License. You are free to:

Share: copy and redistribute the material in any medium or format.
Adapt: remix, transform, and build upon the material for any purpose, even commercially.

The licensor cannot revoke these freedoms as long as you follow the license terms. The license terms include:

Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Table of Contents

Acknowledgements
Table of Contents
Introduction
    Purpose of DKAN
    Features
Adding Data to DKAN
    Adding a New Dataset and Resource(s)
        Step 1: Create the Dataset
        Step 2: Add one or more Resources to the Dataset
        Step 3: Adding Metadata to a Dataset
Visualizations
    Charts
    Choropleth Maps
    Publishing Visualizations
Questions
Data Stories

Introduction

Many governments, institutions, and organizations are now moving towards open data, collecting and publishing large quantities of information in an effort to increase transparency and use data to inform policy. However, open data alone is not enough to improve lives: raw data has to be organized, processed, and presented in human-readable formats so that citizens, analysts, and policymakers can effectively use it. Many organizations lack the resources and technical capability to use commercial data visualization services or develop platforms of their own. As a result, the organizations in the best position to collect data and work closely with the communities the data comes from often lack the ability to present and share this information in effective ways.

Purpose of DKAN

Spreadsheets of raw numbers are difficult for most of us to understand. With DKAN, organizations can take large amounts of data and quickly organize, display, analyze, and visualize this information. This data-driven storytelling can help policymakers understand the data and make better decisions, and each form of visualization can be created as needed. Choropleth maps show regional trends and variations at a glance, and a large dataset can be organized into multiple charts and graphs comparing changes over time, region, funding, or any number of variables. While other programs can easily be used to create individual graphs or sort lists of data, DKAN provides a comprehensive data warehousing, browsing, and visualization solution for large sets of data tagged with multiple variables, with highly customizable options based on the same set of data. DKAN is especially useful for rapidly prototyping multiple visualizations, aggregating data, and displaying changes over time or by geographic region.
It has been particularly successful in releasing data from elections, censuses, health monitoring, and economic analysis.

Features

Ready to use out of the box, DKAN offers data warehousing, publishing, and visualization capabilities. With this tool, users can quickly publish and display open data, creating data narratives with charts, graphs, and maps. The content management system (CMS) can be integrated with blogs, and DKAN is compatible with major open data standards, including the White House's Project Open Data and data.gov. Since DKAN is open source, users can download the source code from our GitHub or Drupal for free and use the same tool that governments pursuing open data, and NDI in multiple elections, have used for publishing and visualizing data.

Adding Data to DKAN

DKAN's data publishing model is based on the concept of datasets and resources. A dataset is a collection of one or more resources; a resource is the actual data being published, such as a CSV table or a GeoJSON data file.

Adding a New Dataset and Resource(s)

In our example, we'll be adding a dataset with Wisconsin polling places to a DKAN site. The data may look familiar; it's one of the sample datasets provided with DKAN upon installation.

Step 1: Create the Dataset

By default, only authenticated ("logged in") users can add new Datasets and Resources to a DKAN website. Once logged in, we can use the "Add Dataset" link in the main navigation bar. Depending on your user permissions, you may have access to the administration menu; in that case, you may also navigate to Content >> Add Content >> Dataset to access the Create Dataset form.

The Dataset is simply the container or folder for the actual data resource files. It holds basic, higher-level information that applies across all the data, such as title, description, category tags, and license. Once we've entered information about the data, we can click the "Next: Add data" button to begin adding data.

Step 2: Add one or more Resources to the Dataset

After creating a dataset, we're prompted to add one or more data resources to it. There are three types of Resources that can be added to a Dataset, depending on the type and location of the Resource:

Upload a file: this option allows publishers to upload data files to the DKAN site. As with the Link to a file option, the data within the file will be imported into your DKAN site's Datastore for preview and analysis by your users. See The DKAN Datastore for more information.

Link to a file: this option allows publishers to create a link to a data file published on another website. Although the file itself will remain on the other site, the data within the file can be imported into your DKAN site's Datastore for preview and analysis by your users. See The DKAN Datastore for more information.

Link to an API: some data resources aren't standalone files but queryable online databases; the interface to these databases is known as an API. Adding links to these types of online database interfaces to your DKAN data catalog can be very useful for developers interested in working with your data.

Typically, you'll need to upload a file (almost always a .csv), so please feel free to ignore the linking options if you don't need them.

To continue with our Wisconsin Polling Places example, we'll add one resource file to the Dataset we created in Step 1. Our resource file is in CSV (comma-separated values) format, a popular file format for exchanging tabular data. Let's explore the example resource shown here and the various fields within:

Resource / Choose File: upload a file from your local hard drive.

Resource / Recline Views: DKAN's Data Preview feature allows visitors to preview published data in three views:
    Map: data with latitude and longitude coordinates can be previewed in a map interface.
    Graph: tabular (spreadsheet) data can be graphed by users, letting them create their own meaningful visualizations. (Please note this is a method for the data intake, not for rendering the graphs themselves.)
    Grid: by default, tabular data is presented in a basic spreadsheet view, with filter, sort, and search capabilities.

Title: this is the title of the individual data file, not the parent dataset container.

Description: a rich text editor field is provided so publishers can offer detailed and useful descriptions.

Format: entering the file format here lets users search for data by specific format.

Dataset: this is the parent dataset container; this field should already be populated if you're adding a Resource subsequent to adding a Dataset.

At the bottom of the Add Resource page, we can choose:

Save: save progress on this resource and immediately return to it for further editing.
Save and add another: save this resource and add another resource to the same dataset.
Next: Additional Info: save this resource and enter optional metadata.

In our example, we're only adding a single resource, so we'll click "Next: Additional Info" to move on to Step 3. If we had more than one resource to add to this dataset, we would choose the "Save and add another" option.

Step 3: Adding Metadata to a Dataset

Organizations may be interested in providing valuable information about their dataset to both human visitors to the website and machines discovering the dataset through one of DKAN's public APIs. All the fields below are optional, but they provide important context on the data's type, kind, and function. Adding metadata to the dataset further clarifies how the data can be used by others.

Let's take a closer look at some of the metadata fields available on this form:

Author: the dataset's author, in plain text.

Spatial / Geographical Coverage Area: lets us define what region the data applies to. In this case, the US State of Wisconsin. You can use the map widget to draw an outline around the state borders, or click the "Add data manually" button if you already have a GeoJSON string you can paste in.

Spatial / Geographical Coverage Location: the region the data applies to, written in plain text. This can be used instead of or in addition to the Coverage Area field.

Frequency: how often is this dataset updated? We might expect our list of polling places to be updated every year, so we could select "annually." However, often we don't expect the data to be updated (even in this case, perhaps we plan to post the next version of the data as a separate dataset), in which case we can leave this blank.

Temporal Coverage: like Geographic Coverage, this field lets us give some context to the data, but now for the relevant time period. Here we could enter the year or years for which our polling places data is accurate.

Granularity: this is a somewhat open-ended metadata field that lets you describe the granularity or accuracy of your data. For instance: "Year".

Data Dictionary: another open-ended field, this is a space for almost any kind of explanation that helps in understanding the terminology, units, column names, etc. in our dataset. In most cases, this will be a simple URL to a Data Dictionary resource elsewhere on the web.

Additional Info: lets us arbitrarily define other metadata fields. See Additional Info field for more information.

Resources: this field is a reference to the resources you have already added. You should generally leave this field alone and use the workflows outlined here and in Updating Datasets in DKAN to add, edit, and remove resources from your Dataset.

After you click "Save", the metadata we entered will appear on the page for this Dataset.
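For the Coverage Area field, a GeoJSON string can also be prepared ahead of time and pasted in via "Add data manually". The following is a minimal sketch in Python, not part of DKAN itself: the helper name `bbox_polygon` is hypothetical, the coordinates are a rough illustrative box around Wisconsin rather than an official boundary, and it assumes the field accepts a plain GeoJSON Polygon geometry.

```python
import json


def bbox_polygon(west, south, east, north):
    """Return a GeoJSON Polygon geometry covering a lon/lat bounding box."""
    # GeoJSON positions are [longitude, latitude]. The ring is listed
    # counterclockwise and must end on the point it started from.
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Rough, illustrative bounding box around Wisconsin (an assumption,
# not an official state boundary).
wisconsin_area = bbox_polygon(-92.9, 42.5, -86.8, 47.1)

# The string that could be pasted into the "Add data manually" box.
coverage_string = json.dumps(wisconsin_area)
print(coverage_string)
```

A bounding box is only a coarse stand-in for a state outline; if a precise boundary file is available, its geometry can be serialized the same way with `json.dumps`.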

Visualizations

Charts

For numeric data that's best rendered comparatively, you'll want to make charts with your resources. You can make bar charts, pie charts, scatterplots, or line graphs. Navigate to the dataset you want to base your chart on, then click the Explore Data button.

Right-click (or, on a Mac, Control-click) the download button to copy the URL of the resource file. Saving this link will allow you to revisit your resource directly in the future. Now use the administration menu at the top to navigate to Structure » Entity types » Visualization » Chart » Add Chart.

Enter values for the title, description, categories, and tags fields. At the bottom of the form, paste the resource link you just copied into the Source field. Now, click the Next button. If the URL loaded properly, you will have two fields to fill in under the title 'Define Variables'. The first one, 'Series', stands for the Y axis, and the second, 'X Field', stands for the X axis. In these fields, choose the columns that you are going to display. Only the Series field can contain multiple values. If the column names are not displayed properly, check again that your source URL is correct. Leave the radio buttons set to 'auto'. After making sure that everything is correct, click the Next button.
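If the column names still do not appear, one way to rule out a problem with the file itself is to inspect its header row offline. The sketch below uses only Python's standard library and is independent of DKAN; the function names and the sample columns (`month`, `registered`, `turnout`) are hypothetical.

```python
import csv
import io


def csv_columns(csv_text):
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty list if the file has no rows
    return [name.strip() for name in header]


def missing_columns(csv_text, wanted):
    """Return the names in `wanted` that are absent from the CSV header."""
    available = set(csv_columns(csv_text))
    return [name for name in wanted if name not in available]


# Hypothetical sample; in practice, read the text of the downloaded
# resource file instead.
sample = "month,registered,turnout\nApril,1200,640\nMay,1350,710\n"

print(csv_columns(sample))
print(missing_columns(sample, ["registered", "turnout", "region"]))
```

Any name reported by `missing_columns` would not be available as a Series or X Field choice, which usually points to a typo in the header or to the wrong file being linked.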

Now you can select the type of chart you want to create. Click on the image of the chart type you would like to use. The charts on this screen are generic images and not based on the data you loaded. To see the actual chart, click the Next button. If everything went well, you should see your chart displayed. The data might be slightly misplaced, so in the right column you can edit the X Format for the labels (number, date, etc.), the Label Rotation, the color of the lines or columns, the X and Y labels for the axes themselves, and the margins, which move the chart as well as the labels. If you would like to see what this data looks like in another type of chart or graph, click Back at the bottom of the page and repeat these steps with another chart or graph selection. After editing and customizing the chart to your liking, click the Finish button.

Now you have created your chart. On the chart's page, there will be an Embed button. Click on it to reveal the HTML embed code, which you can add to any website to embed a live, dynamic chart that will update if you change the chart on your DKAN site. You can also set the height and width of the embedded chart by typing values into the Height and Width boxes above the embed code.

Choropleth Maps

A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map, such as population density or per capita income. The choropleth map provides an easy way to visualize how a measurement varies across a geographic area or to show the level of variability within a region. Choropleth maps can be used to report area values at virtually any scale, from global to local, and the data can be examined at many different levels of analysis, from general overall patterns to the detection of details. They are especially helpful for finding intriguing hot spots.

1. Look for Content > Add Content > Resource in the admin menu and click on it.

2. Upload a CSV file for the resource.
3. Fill in the required fields and save the resource.

4. Look for Structure > Entity Types > Geo File > geojson > Add geojson in the admin menu and click on it. GeoJSON is a widely used data format for displaying vector features in web maps. It is based on JavaScript Object Notation (JSON), a simple and minimalist format for expressing data structures using syntax from JavaScript. In GeoJSON, a vector feature and its attributes are represented as a JavaScript object, allowing for easy parsing of the geometry and fields.

5. Set the Title.
6. Upload a GeoJSON file.
7. Fill in the name attribute with the name of the column in the data (CSV resource) whose values match the name property of the features in the GeoJSON file.

8. Click Save.

9. Look for Structure > Entity Types > Visualization > Choropleth Visualization > Add Choropleth Visualization in the admin menu and click on it.

10. Fill in the Title.
11. Select the GeoJSON file we created for the geojson field.
12. Select the resource file we created for the resource field.

13. Select the colors you would like to use for the choropleth map.
14. Fill in the data column field with the column or columns in your CSV data that you want to display on the map. Separate multiple columns with a comma. The columns that you choose will appear as radio buttons on the side of your visualization, which you can toggle between to see the effect of different data. If you leave this field blank, you'll get a list of radio buttons for all of the columns in your data sheet. Selecting only certain columns can be helpful when, for instance, trying to show change over a certain time period: you could choose the April, May, and June columns but leave out July, August, and September.

15. Fill in the data breakpoints with comma-separated numbers. If you leave this field blank, breakpoints will be calculated for you based on the data. Breakpoints determine which data values are captured by which colors on the visualization. For instance, if you use 25, 50, 75, 100 as your data breakpoints, your visualization will display 4 different shades: one for values between 0 and 25, a slightly darker shade for values between 25 and 50, an even darker shade for values between 50 and 75, and the darkest shade for values between 75 and 100. Remember to choose your breakpoints wisely based upon the data that you want to display!

16. Click Save & Enjoy!

Publishing Visualizations

After you finish creating the visualization, click on the blue Embed button to get an embed code for sharing it on other platforms. You can alter the height and width of the embedded visualization by entering the desired values in the corresponding text boxes. Once you've copied the code, you can embed your visualization anywhere that has a field for embedding an HTML element. Even on other sites, the graph will automatically update to reflect any change made to the source data or settings on DKAN.
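Before sharing an embed code for a choropleth map, it can be worth confirming offline that its inputs line up. The sketch below is not DKAN's own code: it illustrates, under stated assumptions, the two choices made in the choropleth steps, namely the name attribute (CSV values must match the `name` property of the GeoJSON features) and the breakpoints (which shade a value falls into). The function names, the sample regions, and the rules for values on or above a breakpoint are all assumptions made for illustration.

```python
import bisect
import csv
import io


def unmatched_names(csv_text, geojson, name_column):
    """Return CSV values in `name_column` with no same-named GeoJSON feature."""
    feature_names = {
        feature["properties"].get("name") for feature in geojson["features"]
    }
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row[name_column] for row in rows if row[name_column] not in feature_names]


def shade_index(value, breakpoints):
    """Return the 0-based shade for `value`; 0 is the lightest shade."""
    # Assumption: a value equal to a breakpoint belongs to the lower band
    # (25 falls in 0-25), and values above the last breakpoint get the
    # darkest shade. DKAN's actual boundary handling may differ.
    ordered = sorted(breakpoints)
    return min(bisect.bisect_left(ordered, value), len(ordered) - 1)


# Hypothetical inputs: two map regions, three data rows.
regions = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "North"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "South"}, "geometry": None},
    ],
}
data = "region,April,May\nNorth,20,55\nSouth,80,100\nEast,40,60\n"

print(unmatched_names(data, regions, "region"))
print([shade_index(v, [25, 50, 75, 100]) for v in (20, 55, 80, 100)])
```

In this sample, the row for "East" has no matching feature, so it would have nothing to shade on the map; fixing such mismatches in the CSV or the GeoJSON file before publishing avoids blank regions in the embedded visualization.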

Questions

DKAN not only renders data visualizations; it can also serve as a standalone data storytelling platform. The first function available for telling data stories is creating a question, which allows users to combine visualizations with companion text and images. Fill in the fields as desired, attach files, and categorize the question as fits the content. Fields marked with a red asterisk (*) are required to create the question. Make sure the entity URL matches the one auto-generated for the question. Previously rendered visualizations can be added to the question by pasting the embed code into the corresponding field. Click Save at the bottom and your question is ready for viewing.

Data Stories

Telling stories based on data is a primary goal of DKAN. Visualizations can be used to create a clear understanding of a complex situation. Furthermore, elements of storytelling can be used to illustrate what the findings actually mean. The best method for drawing out the narrative in your data with DKAN is creating a data story. Data stories consist of multiple elements and pieces of content, allowing you to build unique and engaging bulletins showcasing your data.

Title it and add any images, body text, or tags, then select the layout that best fits how you want to represent your data and content. Click Save and you'll be greeted with a screen prompting you to add and define your content. The functional icons do the following:

plus icons allow you to add content
gear icons permit you to modify formatting options
paintbrush icons allow you to change the style of the content's pane
arrow icons enable you to change the position of the content
trash can icons allow you to delete the content

You can add all kinds of content, new or existing, and organize it as you see fit. When you've finished building and organizing content, click the Save button at the bottom and your data story is ready!