NaviCell Data Visualization Python API

NaviCell Data Visualization Python API Tutorial - Version 1.0

The NaviCell Data Visualization Python API is a Python module that lets computational biologists write programs that interact with the molecular maps displayed on the NaviCell and ACSN websites (https://navicell.curie.fr, https://acsn.curie.fr). Using this API, you can upload different data types and visualize them on any of the molecular maps available on our websites. Currently accepted data types are gene lists, expression data, copy number data and mutation data. Programs can also retrieve useful information from the server, such as the list of modules or HUGO gene lists. APIs for other programming languages such as R or Java have been started and should be available soon (check the website!).

Download and Installation

The Python API can be downloaded from the NaviCell website, in the section NaviCell Web Service. The package is a zip archive containing all the necessary files, including the examples described in this tutorial. The archive must be extracted into a directory, on Windows as well as on Mac OS / Linux.

Python 3.x must already be installed on your computer. Any version of Python 3 should be fine, but the library does not work with Python 2.x. A recent web browser is also required. The API was tested with, and works best with, Firefox and Google Chrome.

To see whether Python is installed on your system and which version you have, open a terminal window (Mac OS / Linux) or a PowerShell session (Windows 7 and above), and launch the Python interpreter with the appropriate command:

    u900-bdd-1-78x-6993:navicell_python_api eric$ python3.3
    Python 3.3.3 (default, Nov 23 2013, 15:25:10)
    [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/apple/clang-421.11.66))] on darwin
    Type "help", "copyright", "credits" or "license" for more information.

In the example above, the installed version is 3.3.3.
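The version requirement can also be checked from inside Python. The snippet below is a minimal sketch that uses only the standard library. The helper name is_supported_python is hypothetical and is not part of the NaviCell API.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True for Python 3.x or later (hypothetical helper, not NaviCell API)."""
    # The tutorial states that any Python 3 works and that Python 2.x does not.
    return version_info[0] >= 3


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print("Python " + found + " should work with the NaviCell Python API.")
    else:
        print("Python " + found + " is too old; Python 3 is required.")
```

Running this with the interpreter you plan to use for nvpy confirms that the executable name on your system points to a Python 3 installation.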
Using the API

After extracting the archive, open a terminal session and go to the directory where it was extracted.

For Mac OS / Linux users, it is recommended to set three environment variables before launching the session:

    NV_BROWSER_COMMAND: the command to open the browser, or the path to the browser
    NV_PROXY_URL: the URL of the NaviCell server
    NV_MAP_URL: the URL of the default map used to start a session

For instance, here we set those variables in a Terminal session under Mac OS. The browser command contains spaces, so it must be quoted:

    export NV_BROWSER_COMMAND="open -a Firefox"
    export NV_PROXY_URL=https://acsn.curie.fr/nv2.2/cgi-bin/nv_proxy.php
    export NV_MAP_URL=https://acsn.curie.fr/nv2.2/navicell/maps/cellcycle_light/master/index.php

For Windows, the variables can be set when launching the Python API. For convenience, we have included a Windows command file (cmd.bat) with the options preset.

The Python interactive session can be launched with this command (run it from the directory where the archive was extracted):

    $ python3.3 -i nvpy
    Welcome in NaviCell python client
    Use python variable 'nv' as the NaviCell object
    Type 'nv.examples()' to get examples
    >>>

The name of the Python executable depends on the operating system and the Python version installed, so adjust it to your system. Windows users can start the session with default parameters by running the following command in a PowerShell window:

    .\cmd.bat

Once the interactive Python session is running, you can start issuing NaviCell commands. The NaviCell Python script automatically creates a NaviCell object called nv, which is used to interact with the server and the web browser.
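Before launching a session on Mac OS / Linux, it can help to confirm that the three environment variables are actually set in the current shell. The following is a minimal sketch using only the standard library. The function name missing_nv_variables is hypothetical, and whether nvpy performs a similar check itself is not stated in this tutorial.

```python
import os

# Variable names as listed in this tutorial.
REQUIRED_VARIABLES = ("NV_BROWSER_COMMAND", "NV_PROXY_URL", "NV_MAP_URL")


def missing_nv_variables(env=None):
    """Return the names of the NaviCell variables that are unset or empty.

    Hypothetical helper: `env` is any mapping and defaults to os.environ.
    """
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_nv_variables()
    if missing:
        print("Not set: " + ", ".join(missing))
    else:
        print("All three NaviCell variables are set.")
```

Passing a plain dictionary instead of os.environ makes the helper easy to try out without touching the shell configuration.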

Let's open an interactive session in a web browser with this command:

    >>> nv.launchbrowser()
    >>>

This command uses the default browser command and the default map URL specified earlier to launch a browser session. Note that a session ID is created and is visible in the URL field of the browser.

If you quit the Python session, you can start another one later and reuse the same ID with this command:

    $ python3.3 -i nvpy
    Welcome in NaviCell python client
    Use python variable 'nv' as the NaviCell object
    Type 'nv.examples()' to get examples
    >>> nv.attachlastsession()

Now you can use commands to manipulate the maps, for example to set the zoom level:

    >>> nv.setzoom('', 4)
    >>> nv.setzoom('', 1)

These commands zoom to level 4 of the map and return to the first level, respectively.

Now, let's load some expression data and visualize it on the map. In the data directory of the distribution, there are four sample data files:

    $ ls data/
    ovca_copynumber.txt  ovca_expression.txt  ovca_mutations.txt  ovca_sampleinfo.txt

These files were extracted from TCGA ovarian cancer data, and represent copy number calls, mRNA expression, gene mutations and sample annotations, respectively. We will use them in this tutorial to demonstrate how to work with data through the API.

Here we create a data object (in fact an expression matrix) and then send it to our active session:

    filename = "data/ovca_expression.txt"
    dat = nv.makedatafromfile(filename)
    nv.importdatatables(dat, "ovca-exp", "mrna expression data")

Note that the data matrix is automatically filtered before it is sent to the server: the list of all the genes (HUGO IDs) present on the map is first retrieved from the server, and this list is used as a filter on the expression data matrix. At this stage, if you click on the My Data button of the map interface, you should see the ovca-exp data set listed under the Datatables tab.

Now, we will send annotation data that will be used to create groups of samples.

    nv.sampleannotationimport("https://acsn.curie.fr/data/ovca_sampleinfo.txt")
    nv.sampleannotationselectannotation("", "IntrinsicExprClassJCI")
    nv.sampleannotationapply("")

These commands import the annotation table ovca_sampleinfo.txt into our session, select the annotation named IntrinsicExprClassJCI, which corresponds to a column in the annotation table, and finally apply this selection to our dataset to create the groups. To see the groups, click the My Data button and open the Groups tab, which shows the partition of samples according to the groups.
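The gene filtering applied to the expression matrix before upload can be pictured with a small standalone sketch. This is not the library's implementation. It assumes a tab-delimited matrix whose first row is a header and whose first column holds gene symbols. The function name filter_expression_rows is hypothetical, and the gene names and values in the demo are toy data invented for illustration.

```python
def filter_expression_rows(lines, map_genes):
    """Keep the header line plus the rows whose gene symbol is in map_genes.

    Hypothetical helper. Assumes tab-delimited text with the gene symbol in
    the first column and a header (sample names) on the first line.
    """
    wanted = set(map_genes)
    kept = []
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if index == 0:
            kept.append(line)  # header row is always kept
            continue
        gene = line.split("\t", 1)[0]
        if gene in wanted:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Toy matrix and toy gene list, for illustration only.
    matrix = [
        "GENE\tsample1\tsample2\n",
        "CDK1\t1.5\t-0.3\n",
        "NOTONMAP\t0.1\t0.2\n",
        "TP53\t-2.0\t0.7\n",
    ]
    for row in filter_expression_rows(matrix, ["CDK1", "TP53"]):
        print(row)
```

Filtering on the client side in this way means that rows for genes absent from the map are never sent to the server.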

Now we will create a heatmap visualization according to one of the groups that we have selected:

    >>> nv.heatmapeditorselectdatatable('', 0, 'ovca-exp')
    >>> nv.heatmapeditorselectsample('', 0, 'IntrinsicExprClassJCI: Proliferative')
    >>> nv.heatmapeditorapply('')

We first select the source datatable, then select the group IntrinsicExprClassJCI: Proliferative, and finally apply the modifications. The heatmaps are now visible on the Cell Cycle map.
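The three heatmap commands can be grouped into one reusable function, as in the sketch below. The function show_group_heatmap and the class RecordingStub are hypothetical and not part of the NaviCell API. Only the three nv method names and their argument order come from this tutorial. The meaning of the first argument ('' in the tutorial) and of the second argument (0 in the tutorial) is not described here, so the parameter names target and index are assumptions.

```python
def show_group_heatmap(nv, datatable, group, index=0, target=''):
    """Run the three heatmap editor steps from the tutorial in order.

    Hypothetical wrapper. `target` and `index` are passed through unchanged;
    the tutorial uses '' and 0 without explaining them.
    """
    nv.heatmapeditorselectdatatable(target, index, datatable)
    nv.heatmapeditorselectsample(target, index, group)
    nv.heatmapeditorapply(target)


class RecordingStub:
    """Stand-in for `nv` that records calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Invoked only for attributes that do not exist, i.e. any method name.
        def method(*args):
            self.calls.append((name, args))
        return method


if __name__ == "__main__":
    # Dry run without a NaviCell session: print the calls that would be made.
    stub = RecordingStub()
    show_group_heatmap(stub, 'ovca-exp', 'IntrinsicExprClassJCI: Proliferative')
    for name, args in stub.calls:
        print(name, args)
```

In an interactive nvpy session, the real nv object would be passed in place of the stub, for example show_group_heatmap(nv, 'ovca-exp', 'IntrinsicExprClassJCI: Proliferative').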