Installing OGDI DataLab version 5 on Azure Using the Azure Portal
ver. 2, August 2012 (updated February 2013)

Configuring a Windows Azure Account

This set of articles will walk you through the setup of the Windows Azure account you will use to install the OGDI platform. It will explain how to set up a Cloud Service, which will host the OGDI web and worker roles. It will also explain how to create the two storage accounts needed for the configuration tables and the data storage itself.

To install OGDI on Azure, you need to set up one Cloud Service and two storage accounts. If you do not have an Azure account yet, you need to sign up for one using a Windows Live ID. For this walkthrough we will use a trial account of Windows Azure, which provides the same functionality as a paid one, except that it is time-limited. All screens below are shown as examples; use your own information to create the required services and accounts.

Please note that all screenshots are based on Visual Studio 2010 and the Windows Azure Management Portal released in 2012: https://manage.windowsazure.com/

Creating the Cloud Service

The Cloud Service will run the publicly accessible data provider, which exposes the catalogues (datasets) to any consumer client app or direct HTTP request. To create a Cloud Service:

1. Start the Windows Azure Management Portal.
2. Select Cloud Services on the left sidebar.
3. Click New in the left corner of the bottom toolbar.
4. In the new menu select Cloud Service, then Quick Create, and type a URL for your new site. This is the address your users will see unless you have a domain name to assign to it, so choose the name carefully.
5. Also choose a region or affinity group so that all your storage accounts and cloud services are in the same geographic location. This will decrease data traffic and speed up your services. If you do not have an affinity group already, you will not see any in the drop-down selection.
6. Click Create Cloud Service and wait for the process to finish. After a few seconds you will see the newly created Cloud Service in the list.

This is all you need to prepare the Cloud Service on Windows Azure.

Creating the two storage accounts

For OGDI v.5, you need two storage accounts:
- a configuration storage account, where the OGDI project keeps the data endpoints;
- a data storage account, where the actual data is stored and made available to the public.

To create the two storage accounts:

1. Start the Windows Azure Management Portal, if you haven't done so yet.
2. Select Storage.
3. Click New in the left corner of the bottom toolbar.
4. Select Storage and then Quick Create.
5. Enter a prefix for the URL that will be used to connect to the account. We suggest including the word "config" in it, so you can easily recognise this account later.
6. Do not forget to select the same region or affinity group you chose for the Cloud Service. This ensures both services are in the same geographic location.
7. Click Create Storage Account to create the account.

Perform the same steps to create the storage account for the data, which will hold the publicly exposed catalogues. The difference is that in step 5 you should put the word "data" in the prefix. We also suggest you use the same structure for the prefix. In our example we used myopencityconfig and myopencitydata. Please note that these are public URLs; if the names are already taken, the Management Portal will notify you and ask you to choose different ones.

At the end, the Storage panel will display the two new accounts (config and data) you have just created. These are all the steps you need to perform to create the needed Cloud Service and storage accounts.
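For reference, with our example names the public endpoints created in this part would look like this (a sketch using the standard Azure DNS patterns; substitute your own prefixes):

    Cloud Service:              http://myopencity.cloudapp.net
    Config storage (tables):    https://myopencityconfig.table.core.windows.net
    Data storage (tables):      https://myopencitydata.table.core.windows.net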
Installing Dependencies

In the second part of the walkthrough, we will cover the preparation of the OGDI DataLab solution for compiling and publishing on the Windows Azure instance. When you download the files, the 3rd-party dependencies for some of the projects in the package will be missing. Before you start the steps below, you need to have completed all the steps on setting up the Windows Azure account, storage accounts, and service described above.

Getting the OGDI DataLab dependencies

1. Once you have downloaded the OGDI DataLab solution version 5, decompress it into a folder and open it in Visual Studio 2010. Make sure you run VS 2010 as administrator (right-click the VS 2010 icon while holding the [Shift] key and select Run as administrator from the context menu).
2. Once you open the solution, you may notice that some references are missing. To get the latest versions of these 3rd-party components, you can use the NuGet Package Manager.
3. Locate the DataBrowser.WebRole project, right-click on it, and select Manage NuGet Packages.
4. In the NuGet window, search for OGDI DataLab; you should see the four dependency packages, which are named after the projects they relate to. There are a few other packages there, so be careful to select the ones starting with OGDI DataLab and ending in Dependencies.
5. Select the OGDI DataLab DataBrowser.WebRole Dependencies item and click the [Install] button.
6. Once the installation is complete, follow the same logic and install the other packages:
   - for the DataService.WebRole project, install OGDI DataLab DataService.WebRole Dependencies
   - for the DataLoader project, install OGDI DataLab DataLoader Dependencies
7. To check that all dependencies were installed correctly, rebuild the solution and make sure there are no errors. If you see errors, install the necessary dependency packages for the projects with errors.
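If you prefer the Package Manager Console (Tools > Library Package Manager > Package Manager Console), the same installs can be done with Install-Package. This is only a sketch: the package IDs in angle brackets are placeholders, so use the exact IDs shown in the NuGet window.

    Install-Package <DataBrowser.WebRole dependencies package id> -ProjectName DataBrowser.WebRole
    Install-Package <DataService.WebRole dependencies package id> -ProjectName DataService.WebRole
    Install-Package <DataLoader dependencies package id> -ProjectName DataLoader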
You have now completed all the steps that will allow you to compile the project, publish it on the Windows Azure instance, and publish your data.
Configuring the Roles

In this third part of the walkthrough, we will configure the web role and the worker role projects so you can publish them on your Windows Azure Cloud Service. This configuration is necessary to connect the catalogue data with the metadata of the endpoint. These roles are what provide the data to any client that visits your site. Before you start the steps below, you need to have completed all the steps on setting up the Windows Azure account, storage accounts, and service, as well as installing all 3rd-party dependencies of the projects, as described previously in this guide.

Configuring the web role

1. Load the solution (if you haven't done so yet), locate the Solution Explorer, and find the DataBrowser.Cloud project.
2. Expand the branches under the project and look for the folder Roles, where you will find the two roles: DataBrowser.WebRole and DataBrowser.WorkerRole.
3. Double-click on DataBrowser.WebRole to open the configuration table. Select the Settings tab on the left.
4. In the newly opened window, locate the Service Configuration drop-down box and select Cloud. This way, any changes you make will affect only the configuration for the Azure cloud installation.
5. Click on DataConnectionString and choose ConnectionString in the Type column. Doing so will display a browse button at the end of the same row.
6. Click the browse [...] button to open the Storage Account Connection String form.
7. Once it opens, select Enter storage account credentials. These settings will allow you to connect to the storage accounts you created earlier in this article.
8. In the Account name field, type the name of the storage account you chose for the configuration database. You may remember that we asked you to name the accounts so you can recognize them later: data and config. Here, you should use the one with the word config in its name.
9. In the Account key field, enter the access key for the config storage account. You can get it from the Windows Azure Management Portal: locate the Storage panel and select the configuration storage account.
10. In the Azure Portal, on the bottom toolbar, you will see Manage Keys. Clicking it will display the Primary access key and Secondary access key. Both grant access, but we suggest using the second one. Keep the first for administration purposes and the second for the OGDI DataLab installation itself. This way, at a later point you will be able to regenerate the second key and cancel access for everyone who is not an administrator.
11. Go back to Visual Studio and paste the key into the Storage Account Connection String form.
12. Click OK to save the settings and close the Storage Account Connection String form.
13. Repeat the same process for the DiagnosticsConnectionString setting of the web role project (starting from step 5), using the same storage account name and key.
14. There is one more setting that needs changing: serviceUri. It is on the same Settings tab in Visual Studio. Go back to the Windows Azure Management Portal and select the Cloud Services screen. Click the service you created in the first part of this article and locate the URL column to the right.
15. Copy the DNS prefix. It is just the first part of the serviceUri. All standard Azure DNS names end with .cloudapp.net, so the service URI you will use starts with the DNS prefix you copied and finishes with .cloudapp.net:8080/v1/. In our example, the serviceUri is http://myopencity.cloudapp.net:8080/v1/
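For reference, after these steps the three settings in the cloud service configuration file (ServiceConfiguration.Cloud.cscfg) should look roughly like this. This is only a sketch using our example names; the account name, key, and URI are placeholders for your own values, and the exact setting names may differ in casing from your project.

    <Setting name="DataConnectionString"
             value="DefaultEndpointsProtocol=https;AccountName=myopencityconfig;AccountKey=<your-secondary-key>" />
    <Setting name="DiagnosticsConnectionString"
             value="DefaultEndpointsProtocol=https;AccountName=myopencityconfig;AccountKey=<your-secondary-key>" />
    <Setting name="serviceUri" value="http://myopencity.cloudapp.net:8080/v1/" />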
After you have completed the changes to the three settings, the table of settings should look similar to this:

Configuring the worker role

Now you need to configure the same three settings for the DataBrowser.WorkerRole. Follow the exact same steps from above (starting from step 3), but double-click on DataBrowser.WorkerRole in the Solution Explorer. The values for all settings are the same. Once you are done, the screen will look similar to this:

You have just completed the configuration of the DataBrowser project, and you will be able to publish it on Azure in part 4 of this walkthrough.
Publishing the DataBrowser Project

In this part, we will show you how to upload the DataBrowser.Cloud project, which includes the web role and the worker role, to the Windows Azure instance you created in part 1. These roles will become your webpages, where users will have access to the data in your catalogue. To complete the steps in this part, you must have completed all previous parts of the walkthrough and have access to the Windows Azure management panel.

1. After you have set the connection strings as described in part 3 of this walkthrough, go to the Solution Explorer, right-click on the DataBrowser.Cloud project, and click Publish.
2. The publishing wizard will open. If you are publishing the project for the first time, you will see no subscription in the drop-down box. Click Sign in to download credentials; the wizard will open a web browser window and ask you to sign in with the Live ID you used to open your Windows Azure account. If you were already signed in, you will not see the login screen.
3. After a few seconds, the browser will download a configuration file with the .publishsettings extension. Save it somewhere on your hard drive.
4. Go back to the wizard, click [Import...], and browse to the downloaded settings file. When you open it, you will see the drop-down box change to the name of your subscription. Click [Next >].
5. The next screen will show all the settings you received through the imported settings file. Verify that you have selected the Release build configuration and the Cloud service configuration. Click [Next >].
6. The Summary page will show you how the project will be published on your hosting account. Click [Publish] to start the automatic upload of all necessary files.
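For reference, the .publishsettings file you imported in step 4 is a small XML document along these lines (a sketch; the subscription ID, name, and certificate data are placeholders for the values generated for your account):

    <PublishData>
      <PublishProfile PublishMethod="AzureServiceManagementAPI"
                      Url="https://management.core.windows.net/"
                      ManagementCertificate="<base64-encoded certificate>">
        <Subscription Id="<your-subscription-id>" Name="<your-subscription-name>" />
      </PublishProfile>
    </PublishData>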
This process will take several minutes. During that time, you will be able to see the progress in the Windows Azure Activity Log screen in Visual Studio. You will also be able to see the upload in the management panel. At the end of the publishing, the hosting service with the DataBrowser roles will be started.

Please keep in mind that you will not yet have any data in the database, nor any catalogues defined in it. That is what we will do next. For now, check that your Cloud Service is running on Azure and read our next part.
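One quick way to check that the deployment is up (using our example names; substitute your own DNS prefix) is to open the two public URLs in a browser:

    http://myopencity.cloudapp.net             (the DataBrowser web role)
    http://myopencity.cloudapp.net:8080/v1/    (the data service endpoint)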
Defining Endpoints

In this part, we will explain how to define the OGDI catalogues necessary for the connections between the data storage and the data service. The configuration of the endpoints uses the ConfigTool project from the OGDI DataLab solution. Its use is quite simple, and common sense can guide you through it. But if you want to see how we did it, just follow along. For this part of the walkthrough we need to thank Ivan Dragolov for his help on web security and access rights while analysing the solution and configuring our sample catalogues.

To complete this task, you have to complete all previous parts of this walkthrough, as well as have access to the Windows Azure panel or have the necessary storage account name and key.

The ConfigTool project is executed on your local computer, which is why you may be surprised by its simple user interface. You will need to run it only once, and it will not be visible to your clients and users, so trust us when we say: it will do the job. Before you can start it, though, you have to configure the storage account information.

1. Locate the Web.config file in the ConfigTool project and open it.
2. On the DataConnectionString line, find the [StorageName] after the AccountName attribute and change it to the real storage account name you will use. Again, you should use the storage account that you created to keep the configurations, the one we asked you to include the word config in. You can get it from the Windows Azure Management Portal. (See the sketch after these steps for how the finished line might look.)
3. On the same line, locate the [StorageKey] after the AccountKey attribute and replace it with the secondary access key from the same config storage account.
4. Select Debug as the solution configuration in Visual Studio, right-click on the ConfigTool project, point to Debug, and click Start new instance to start the project. It will start a new instance of your web browser and show you a simple form. Of course, you need to have IIS installed on your machine, but what developer doesn't?
5. Fill out the fields as follows:
   o Alias: add a short name that describes what information is included in the catalogue. Examples are NewYorkStreetParking, TorontoDayCareFacilities, MyExperimentalCatalogue.
   o Description: here you can put a short, user-friendly description of the catalogue, this time with more descriptive words. For example, New York Parking Facilities.
   o TableStorageAccount: here you need to input the second storage account name, the one we asked you to include the word data in. It is the only time you will be using the second storage account name and access key. Again, you can take this from the Windows Azure Management Portal.
   o TableStorageAccountKey: use the secondary access key of the same data storage account.
6. Once you make sure all of the fields have valid information, click the [Add] button and the information for the new catalogue will be added to the config storage. You will see a confirmation of the information under the form once the webpage reloads.

You have completed all necessary steps to create the definition of the catalogue, and in the next article you will see how to upload data into it.
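For reference, after steps 2 and 3 the DataConnectionString line in the ConfigTool's Web.config might look like this. This is only a sketch: the exact element may differ in your version, and the account name and key are placeholders for your own config storage credentials.

    <add key="DataConnectionString"
         value="DefaultEndpointsProtocol=https;AccountName=myopencityconfig;AccountKey=<your-secondary-key>" />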
Uploading Data into the Catalogue

In this part, we will upload data into the catalogue we created and configured in the previous parts of this walkthrough. The OGDI DataLab v.5 solution has two projects that can help us with that: a console application and a Windows application. We will focus on the more visual and easier to use one: DataLoaderGuiApp.

To complete this step, you will need a CSV or a KML file that contains data for your catalogue. If you do not have one and you are experimenting with OGDI, you can always search the web for the term "open data", perhaps together with a city name. Most large municipalities already have their data published in plain files. An example is the Region of Peel in Ontario, Canada; its data is available online as of 26 June 2012. In our example, we will use a CSV file with parking facilities in New York. We suggest a smaller file, so you can visually inspect it and make sure it is valid; smaller files also need less time to upload. You can always add data later using the same procedure we describe below.

1. Start the DataLoaderGuiApp project and you will see the OGDI Data Loader Windows application start.
2. First, we need to configure the connection settings. Click on the Settings tab and click the [Connection] button.
3. In the Endpoint Setting form, add the account name of the data storage account in the Account name field and its secondary key in the Account key field. If you do not remember what they were, log in to your Windows Azure management panel and look for the storage account with the word data at the end. You can also check part 1 of our walkthrough for details.
4. Click [Save].
5. Go back to the File tab and choose Open.
6. Choose a CSV or KML file to upload, but make sure it really is a comma-separated-value file and all rows have valid information in each column.
7. Click [Open] to load the file into the application. It will not go into the storage yet. Now we need to configure the catalogue information in the OGDI Metadata Designer.
8. The Dataset Metadata tab is mainly the information you want to use to describe the catalogue. This is what users will see to learn what data you offer, how often you update it, and so on. Try to add descriptive information so you can attract more people to look at your catalogue.
9. The next tab, Dataset Properties, is more critical for the proper functioning of your catalogue. On it you define the unique fields the catalogue will use to distinguish each record. There are two fields in Windows Azure storage for that: Dataset Primary Key and Dataset RowKey. You can use New.Guid, which means the storage will generate random, unique values for these two fields, or you can choose fields from your CSV file that you know are unique and will remain unique in the future. This is entirely your call, but we recommend you use the New.Guid feature and forget about saving a few bytes of space per record.
10. The Data Source Timezone again defines a reference to the location where this data is published.
11. Go to the Dataset Columns tab, where you can define the type of the data in each column, as well as the two columns of geographic location data, if your file contains such.
12. The OGDI Metadata Designer is smart and looks for columns that may contain Latitude and Longitude. If it finds them, you will see them selected in the two drop-down boxes on the right. If not, you can select them yourself or leave the fields blank.
13. Check the Bing To Map checkbox if you have such geo-data and want it identified as such in your catalogue. This will allow the DataBrowser we installed in part 4 to present each item of your catalogue on a map.
14. In the Map Push Pin Text Formatting String text box, you can add some of the data from the other columns of your catalogue to the pop-up box on the map. If you do not want to do that, skip to step 16.
15. To do so, select identifiers for the columns in the Push Pin Mapping column. These identifiers are fixed, and there is a short explanation under the textbox.
16. You can define extra information about the fields in the catalogue on the Dataset Columns Metadata tab, but the application is smart enough to do it for you. We still suggest you add at least a better description of the fields in the Description column.
17. Click [Save] when you are done, and the application will save all the configuration data in a file next to the CSV or KML file.
18. Now the OGDI Data Loader shows a line with some information about the data catalogue you just configured. You are ready to upload it.
The important part is to select the Upload Method from the drop-down box and decide whether you also need to preserve the original data, using the similarly named checkbox. In our case, we do not have any data in the catalogue yet, so we choose Create.

19. Click [Start] and monitor the progress of the upload. Depending on the size of your file and your internet connection, you may need to wait several minutes. In the log screen (the lower part of the form) you will see textual information about the upload and any errors that may occur. Even if you experience errors, some of your data may have been uploaded; the loader will skip the bad records in your file and let you know about them on the log screen.

That is all! Check your data by opening the webpages generated by the DataBrowser instance you installed on Windows Azure. The URL looks like this: http://mycloudservicename.cloudapp.net. You can also locate it in the Properties panel of your particular Cloud Service in your Windows Azure account; you created it in part 1 of this walkthrough.

Good luck making the world open! And please spread the news that we exist!

The team
February 2013