Streamlining analytics and visualization infrastructure at the University of Calgary

Submitted to the Vice-President (Research) by the Advisory Committee on Analytics & Visualization (ACAV)

Big data = big opportunity

When he was a kid in 1969, James Gosling saw his first computer in a lab at the University of Calgary. He thought it was so cool that he kept coming back to the computer science lab, breaking in by figuring out the simple combination lock on the door and using one of the smallest computers, the size of a refrigerator, to teach himself how to write code. Gosling graduated from the University of Calgary with his BSc in 1977 and went on to create Java, the universal programming language that helped the Internet develop.

In the four decades since, the Internet and technological infrastructure have quite simply revolutionized our society, our lives and our work. And the revolution continues. Information technology is advancing at a staggeringly fast rate. Along with the hardware and software, data continues to proliferate exponentially: it's estimated that 90 per cent of the data in the world has been created in the last few years. And it all has to be stored, analyzed and understood so that better decisions can be made for the future.

The University of Calgary is striving to become one of the top research universities in the country. Our scholars are ever active in collecting and creating more data. Big data is creating a big opportunity for the university to organize and streamline how we manage our essential cyberinfrastructure, how we make sense of vast amounts of research data, and how we increase the impact of research results on society.

But we have not only reached, but surpassed, our capacity in high-performance computing. Our inability to keep pace with technology is slowing us down. And if we don't speed it up, we will no longer be competitive.
For example, advancements in cryo-electron microscopy, which allows researchers to see individual proteins in atomistic detail, are revolutionizing structural biology and accelerating the amount of structural information available. But without sufficient HPC, our very strong concentration of experts in protein simulation won't be able to compete. Our researchers are losing ground to groups at other universities, and we are at risk of being unable to attract funding, excellent graduate students, postdocs and collaborators. Without increasing our HPC capacity to be competitive, some of our scholars' research programs are not viable.

Consider:

- The Reservoir Simulation Group at the Schulich School of Engineering needs to run models with 2 billion grid cells to accurately predict the performance of a petroleum reservoir. But that's impossible given the current compute power on campus. The group has an IBM cluster with limited memory size and speed, and so it also uses the Parallel cluster to run large-scale reservoir
simulations. Parallel's memory speed isn't fast enough and it's often bottlenecked. Because many labs use Parallel, the queue to run large simulations can be weeks or more.

- Libin Institute researcher Wayne Chen has discovered a protein, the ryanodine receptor, that's responsible for the initiation of calcium waves and calcium-triggered arrhythmias. This will lead to a better understanding of the molecular basis of anti-arrhythmic treatment. But we do not have sufficient HPC to analyze the protein and capitalize on our expertise. The research team had to form temporary collaborations with institutions that do have the computing capacity.

- Computational biophysics researcher Gurpreet Singh needed to run a scaling test for a molecular simulation that required 50 Nvidia graphics processing units (GPUs) for 20 minutes. The WestGrid/Compute Canada Parallel cluster is the only machine in Western Canada capable of running the test, and it is booked solid. Luckily, the cluster was taken down for an annual operating system upgrade and Singh's job was allowed to run. Without an outage of some kind, our researchers would have had no chance to access the machine in the near future.

- Mark Lowerison of the Clinical Research Unit (CRU) was also able to take advantage of WestGrid downtime. He needed to run 300,000 simulations using the R statistical package within a three-week period, with each run taking five to 90 minutes. Normally, this would be impossible. In this case, the annual upgrade and subsequent downtime created spare cycles on the three WestGrid clusters.

Plugging in to high performance computing (HPC)

Significant compute and storage capacity for all researchers at the university, in the form of high performance computing, is every bit as essential as having sufficient electricity to power our operations. We have a state-of-the-art cogeneration plant to help provide our main campus with power.
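The scale of the CRU example above can be sized with a quick back-of-the-envelope calculation. This is only a sketch: the 300,000-run and five-to-90-minute figures come from the example itself, while the assumption that runtimes average out to the midpoint of that range is ours.

```python
# Back-of-envelope sizing for 300,000 R simulation runs in three weeks.
# Run counts and the 5-90 minute range are from the report; treating the
# average runtime as the midpoint of that range is an assumption.
runs = 300_000
mean_minutes = (5 + 90) / 2          # assumed average: 47.5 min per run

total_core_minutes = runs * mean_minutes

# Wall-clock budget: three weeks, running around the clock.
window_minutes = 3 * 7 * 24 * 60     # 30,240 minutes

cores_needed = total_core_minutes / window_minutes
print(f"{total_core_minutes:,.0f} core-minutes of compute")
print(f"~{cores_needed:.0f} cores busy non-stop for three weeks")
```

Even under this optimistic averaging, the workload demands several hundred dedicated cores for the entire three-week window, which is why it could only fit into the spare cycles freed by the upgrade outage.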
It's time we had a strategy to ensure we have access to leading-edge compute technology. Our current cyberinfrastructure is at capacity. As individual researchers and projects apply for funding and procure technology, we're not always acquiring the technology that would best serve the entire university community, nor are we creating a sustainable support infrastructure.

The three WestGrid clusters on campus are already over-committed to researchers from the Compute Canada catchment area. We have little room to maneuver except when an outage is scheduled (as the examples above illustrate). The WestGrid machines are due for retirement starting in 2016, but replacement clusters, which still would need to be funded, will only replace the current cycles; they will not add much-needed supplemental capacity. And the new Compute Canada clusters will not be placed on campus but at other institutions, reducing our ability to access downtime cycles.

The global trend in HPC is toward shared services, and we see Campus Alberta capacity as a prerequisite for many projects. With very few exceptions, having access and control is more important than the physical location of any equipment.

It's imperative that we are competitive. We have to ensure that our research community has the capacity it requires. We have to be more strategic in how and what cyberinfrastructure we acquire, as well as how we make it accessible to researchers on campus and beyond.

The Advisory Committee on Analytics & Visualization (ACAV) was struck in 2013 by the Vice-President (Research) to examine the university's growing requirements for technology, review our current methods of procuring technology (through research grant proposals and other means), and identify and explore opportunities for improvement.
Strategic recommendations for the path forward

- Take inventory of all major analytics and visualization equipment across campus and define clear paths to access and support it. Surveying faculties and departments would provide a clearer picture of cyberinfrastructure demands, requirements and bottlenecks. The survey could also help build an asset map that incorporates research interests.

- As well as the hardware and infrastructure, we need sufficient support and the correct expertise to manage it. These expenses should be considered an annual operational expenditure, not a capital one. The level of cyberinfrastructure service provided needs to be evaluated regularly to ensure sufficient capacity.

- Clearly define points of contact to manage and access cyberinfrastructure on and off campus, including senior-level research personnel to coordinate initiatives across faculties, a UCIT contact person for system architecture, and data scientists.

- Access to and control of HPC resources is more important than physical location. Deployment of new cyberinfrastructure in non-UofC data centres could be used if it is cost-effective.

- Ensure major cyberinfrastructure grants are coordinated centrally. The university needs to provide a high level of service and cover partial costs to encourage researchers to contribute to a shared infrastructure. Researchers successful in major infrastructure grant proposals will have priority access to the equipment, with surplus capacity made available to other researchers on campus.

- Develop an institutional strategy to catalogue and curate a research data library and make it available for future research projects. The research data archive can also facilitate sharing data, which would enable collaborations with people inside and outside the university. Different researchers and research groups would have easy access to data in a controlled and secure manner.
- Create a Digital Data Commons (DDC), a physical space, as a nexus for collaboration.
  - Sharing space will facilitate face-to-face collaboration between analytics and visualization research groups and application researchers. Core analytics/visualization researchers can be hosted in this space and be joined by other researchers with needs that could be addressed by big data analytics.
  - The Digital Data Commons should have oversight of shared digital research infrastructure (i.e. HPC, data analytics hardware, curated archives of research data).
  - Support personnel should be part of the DDC, and research leadership needs to be provided (i.e. an academic director).

- Develop benchmarks in 2015/16 that would demonstrate progress on access to HPC/data analytics capabilities.

Strategic recommendations for infrastructure investment

- Invest in general research data storage and archiving. This would benefit many groups across campus and would also help meet requirements of tri-council and other funding bodies to keep research data for five years or longer.

- Invest in an analytics cloud system, such as one based on Hadoop. To overcome our currently limited capacity, we need to engage with partner organizations to increase our capabilities. Support personnel and data scientists need to be made available to all researchers.
- Invest in supercomputers that combine GPUs or other accelerators with more standard CPUs. Canada lags significantly in this area, with no large GPU-based HPC facilities. Storing results in separate facilities and moving them to local equipment for analysis is no longer adequate where data analysis requires the power of the HPC facility combined with substantial storage for intermediate and final results (e.g. genomics, large-scale biomolecular simulation, whole-cell simulations, other computational biology, materials research, large-scale geospatial modeling).

- Invest in next-generation sequencing. Its related techniques already pose severe challenges for storing data in genomics and bioinformatics. Certain datasets have specific security and privacy concerns, particularly in the growing field of patient-related data. Sequencing data could soon be linked with patient data to enable personalized medicine, creating specific requirements for secure storage.

- Invest in large-scale visualization equipment for making sense of vast amounts of data and inviting opportunities for collaborative work. Upgrade the two existing major visualization facilities (CCIT and TFDL), including more powerful graphics hardware driving the CCIT projectors. Invest in software such as TechViz XL to allow seamless integration of key software applications (such as Petrel, Matlab and Paraview) with our virtual reality environments.

- Further demand may exist for high-throughput streaming analytics capacity, which would be of use for oil, gas and health applications. An important application area has been smart cities and the analytics problems related to managing the cities of the future.

Future of ACAV: Strategic recommendations

- Identify co-chairs who can act as the first point of contact for researchers and senior administration.

- ACAV becomes a strategic oversight committee that will, among other things, review strategic recommendations every few years.
- Sponsor domain-specific networking events that connect researchers across campus with complementary research interests. Examples could be "Informatics for the life sciences" or "Energy analytics".
ACAV committee members 2014-2015

Members: Frank Maurer (co-chair), Sam Wiebe (co-chair), Carey Williamson, Christopher Hugenholtz, Deborah Marshall, Jason de Koning, Karen Bourrier, Kim Koh, Laleh Behjat, Loren Falkenberg, Michael Ranelli, Michael Ullyot, Parsa Samavati, Paul Galpern, Peter Tieleman, Robin Winsor, Sergei Noskov, Sheelagh Carpendale, Stafford Dean, Steve Liang, Thomas Hickerson

Faculties and departments represented: Medicine/CRU, Geography, Community Health Sciences, Genomics/ACRI, Arts, Education, ENEL, Haskayne, UCIT, Undergraduate students, EVDS, Bio, Cybera, Biological Sciences, AHS (DIMR), Geomatics, Library