Visualisation of Large Datasets with Houdini

Ben Simons
Data Arena Lead Developer, University of Technology, Sydney
ben.simons@uts.edu.au
bsimons@acm.org

New UTS Broadway Building

UTS Data Arena ~ April 2014

Today's Outline - Big Data
1. Some strategies used in Film Visual FX
2. Visualisation Techniques in Houdini
3. VFX Data Formats & Disk Systems

Happy Feet 2
2 Petabytes (2,000,000 GB) of data
3D Stereo HD images
Render: 18,000 CPU cores
Parallel access to data
HDF5 data on BlueArc & Isilon NAS Disk Systems
Linux software: Maya, Houdini, Naiad, Nuke, 3Delight
Entirely made at Carriageworks in Sydney at Dr D Studios

Resident Evil 3: Extinction
The Desert Undead: 18-layer images (RenderMan AOVs)
Each single image frame was split into 96 tiles
Rendered on 96 machines, then each frame tile-joined
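The split-and-join step can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the studio's pipeline: the 8 x 12 grid (96 tiles), the function names, and the requirement that the frame size divides evenly are all assumptions.

```python
def split_into_tiles(image, tiles_y, tiles_x):
    """Split a list-of-rows image into tiles_y * tiles_x rectangular tiles.

    Assumes the image height and width divide evenly by the grid size.
    """
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // tiles_y, width // tiles_x
    tiles = {}
    for ty in range(tiles_y):
        rows = image[ty * tile_h:(ty + 1) * tile_h]
        for tx in range(tiles_x):
            tiles[(ty, tx)] = [row[tx * tile_w:(tx + 1) * tile_w] for row in rows]
    return tiles


def join_tiles(tiles, tiles_y, tiles_x):
    """Reassemble the full frame from tiles keyed by (tile_row, tile_column)."""
    image = []
    for ty in range(tiles_y):
        tile_h = len(tiles[(ty, 0)])
        for r in range(tile_h):
            row = []
            for tx in range(tiles_x):
                row.extend(tiles[(ty, tx)][r])
            image.append(row)
    return image


# A small stand-in frame: 80 rows x 120 columns of pixel values.
frame = [[y * 120 + x for x in range(120)] for y in range(80)]

# Hypothetical 8 x 12 grid = 96 tiles, one per render machine.
tiles = split_into_tiles(frame, 8, 12)
joined = join_tiles(tiles, 8, 12)
```

Each tile depends only on its own pixel range, so the 96 pieces can be processed independently and joined afterwards.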

Houdini www.sidefx.com

Houdini across 2 screens

Houdini Object Nodes

Houdini Procedural Network

Houdini Parameters

Houdini Chops
A channel is a column of data
Plain text files are fine: separate columns with tabs
Interactive channel graph (zoom in)
Visual programming
Filtering, sampling, shading, instancing, and rendering
Hands-on tomorrow will be Chops & Vops
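To make the "channel is a column" idea concrete, here is a minimal sketch that reads tab-separated text into named channels. It runs outside Houdini and does not use any Houdini API. The header row with channel names and the sample values are assumptions for illustration.

```python
import csv
import io


def load_channels(text):
    """Parse tab-separated text into a dict of channel name -> list of floats.

    Assumes the first row holds the channel names.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    channels = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            channels[name].append(float(value))
    return channels


# Hypothetical sample data: two channels, three samples each.
sample = "time\ttemp\n0\t20.5\n1\t21.0\n2\t19.5\n"
channels = load_channels(sample)
```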

Spitzer Glimpse Dataset
http://data.spitzer.caltech.edu/popular/glimpse/20070416_enhanced_v2/source_lists/south/

Spitzer Space Telescope GLIMPSE Dataset
South: ~300 files, 78 different channels, 145K rows
Gzipped .tbl data loaded into Houdini
Houdini Chops used to filter & calculate 'colours'
Show difference of infra-red magnitude bands
Point colours and scales calculated by VOPs (SIMD shaders)
Houdini movie rendered (Mantra PBR)
36M points, filtered to <12M
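A 'colour' here is the difference between two magnitude bands. The sketch below shows that calculation together with a simple filter. The column names (`mag1`, `mag2`), the sample values, and the choice to drop rows with a missing magnitude are assumptions; the filter actually used for the movie is not stated.

```python
def filter_and_colour(rows, band_a, band_b):
    """Keep rows that have both magnitudes and attach colour = band_a - band_b."""
    kept = []
    for row in rows:
        a, b = row[band_a], row[band_b]
        if a is None or b is None:
            continue  # assumed filter: skip sources missing either band
        kept.append({**row, "colour": a - b})
    return kept


# Hypothetical source rows; None marks a missing measurement.
rows = [
    {"mag1": 12.1, "mag2": 11.6},
    {"mag1": None, "mag2": 10.2},
    {"mag1": 9.8, "mag2": 9.9},
]
coloured = filter_and_colour(rows, "mag1", "mag2")
```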

Shading & VOPs
A shader is a mini-program which makes data
It can be better to generate data than to load it
Shaders allow an additional level of management
Geometry shaders on Happy Feet 2 generated 1 billion snow particles per image frame (impossible to load)
Houdini VOPs are SIMD
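The "generate rather than load" idea can be illustrated with a deterministic generator: the particles are recomputed from a seed and a frame number whenever they are needed, so nothing is stored on disk. This is a plain-Python sketch, not a Houdini or renderer shader. The function name and the seeding scheme are assumptions.

```python
import random


def snow_particles(frame, count, seed=0):
    """Yield `count` (x, y, z) positions in the unit cube for a given frame.

    The same (seed, frame) pair always produces the same particles, so the
    data never has to be written out or read back.
    """
    rng = random.Random(seed * 1_000_003 + frame)
    for _ in range(count):
        yield (rng.random(), rng.random(), rng.random())


# Particles are produced lazily, one at a time, instead of being held in memory.
first_five = list(snow_particles(frame=1, count=5))
```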

Houdini VOP Network

Instancing
Saves memory & I/O by re-using geometry
Copies generated at render time
Each instance can be varied based on point attributes
Referencing one instance object provides a massive data reduction
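A rough sketch of why instancing reduces data: one prototype geometry is stored once, and each instance carries only a few per-point attributes. The `Instance` class, the choice of position and scale as the attributes, and the example counts are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    position: tuple  # (x, y, z) of the instancing point
    scale: float     # per-point attribute used to vary the copy


def expand(instance, prototype):
    """Produce the instance's points on demand (as a renderer would at render time)."""
    px, py, pz = instance.position
    s = instance.scale
    return [(px + s * x, py + s * y, pz + s * z) for x, y, z in prototype]


def floats_stored(points_per_object, n_instances, instanced):
    """Count the floats held in memory with and without instancing."""
    per_instance = 4  # position (3) + scale (1)
    if instanced:
        return points_per_object * 3 + n_instances * per_instance
    return points_per_object * 3 * n_instances


prototype = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
instances = [Instance((float(i), 0.0, 0.0), 1.0 + 0.5 * i) for i in range(4)]
second_copy = expand(instances[1], prototype)
```

With a hypothetical 10,000-point object and 1,000 instances, `floats_stored` gives 34,000 floats when instanced against 30,000,000 when every copy is stored in full.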

Adaptive Meshes, LOD, Caching & Filtering
Data reduction techniques:
Level of Detail (distance from camera)
Adaptive meshes
Cache common files locally
Filter textures (images) - mipmapping
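Mipmapping and distance-based level of detail can be sketched together: build a chain of progressively halved images, then pick a coarser level as the camera moves away. The box filter, the square power-of-two image, and the "one level per doubling of distance" rule are simplifying assumptions.

```python
import math


def downsample(image):
    """Halve a square greyscale image by averaging each 2x2 block."""
    size = len(image) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(size)
        ]
        for y in range(size)
    ]


def build_mipmaps(image):
    """Return [full resolution, half, quarter, ...] down to a single pixel."""
    levels = [image]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1]))
    return levels


def select_level(levels, distance, base_distance=1.0):
    """Pick a mip level: one level coarser for each doubling of distance."""
    if distance <= base_distance:
        return 0
    level = int(math.log2(distance / base_distance))
    return min(level, len(levels) - 1)


image = [[float(4 * y + x) for x in range(4)] for y in range(4)]
levels = build_mipmaps(image)
```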

Other tricks
Baked Lighting & Shadows
Pre-calculate lighting & shadows, bake new textures & reapply onto geometry
Sydney Harbour Multi-Beam Sonar Survey, 30cm data
Interactive 3D Flythrough

Know Your Limits: Memory & I/O
I/O will bottleneck - partition the problem & then scale it up
Split the job across many independent machines (e.g. render)
Segment data access for each machine (e.g. HDF5)
Alternate memory hardware:
Vector (array) processor - SIMD as in Cray, now Intel SSE/MMX and Nvidia GPU
IBM Cell Processor has a vector processor
Content-Addressable Memory - associative arrays are used by network routers
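Partitioning a job into independent pieces can be as simple as assigning each machine a contiguous range of items (frames, rows, or files). The sketch below shows one way to compute such ranges. The function name and the even-split policy are assumptions, not a description of any particular render farm scheduler.

```python
def partition(n_items, n_workers):
    """Return half-open (start, stop) ranges whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_workers)
    ranges = []
    start = 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Each worker only touches its own slice, so no worker waits on another.
data = list(range(10))
ranges = partition(len(data), 3)
partial_sums = [sum(data[start:stop]) for start, stop in ranges]
```

Because each range is disjoint, every machine can also read only its own segment of the input, which is the point of segmenting data access.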

Types of System Memory
Virtual Memory: swapping is good, thrashing is bad
SMP vs MPI
SMP (Symmetric Multiprocessing): multiple CPUs with common/shared memory. Multi-threaded apps. E.g. Intel Xeon, Core 2 Duo are SMP. Cache coherency, snooping bus; ccNUMA on distributed shared memory
MPI (Message Passing): PVM clusters, Beowulf, etc. (memory not shared)
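The difference between the two programming models can be sketched with the standard library alone. In the first function, workers update one shared total under a lock (the shared-memory style). In the second, workers keep their data private and send results as messages through a queue. Threads stand in for both models here purely for illustration: a real message-passing job would run separate processes, typically on separate machines, through an MPI library.

```python
import queue
import threading


def shared_memory_sum(chunks):
    """Shared-memory style: every worker adds into one shared total."""
    total = [0]
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:  # protect the shared value from concurrent updates
            total[0] += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(chunks):
    """Message-passing style: workers share nothing and send results back."""
    inbox = queue.Queue()

    def worker(chunk):
        inbox.put(sum(chunk))  # the only communication is this message

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in chunks)


chunks = [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
```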

Data Formats
HDF5 - Hierarchical Data Format www.hdfgroup.org
Browsable container of data (HDFView)
Has groups & datasets, like directories & files
Data stored in B-trees
Can also store binary data
HDF5 for Python www.h5py.org
Operate on HDF5 data via dictionary-style access & NumPy arrays - www.numpy.org
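A minimal h5py sketch of the group/dataset layout and the dictionary-style access. It assumes `h5py` and `numpy` are installed. The group name `frame_0001`, the dataset name `positions`, and the `units` attribute are made up for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Write: groups behave like directories, datasets like files.
with h5py.File(path, "w") as f:
    group = f.create_group("frame_0001")
    group.create_dataset(
        "positions", data=np.arange(12, dtype=np.float32).reshape(4, 3)
    )
    group.attrs["units"] = "metres"

# Read: index by name like a dictionary, slice like a NumPy array.
with h5py.File(path, "r") as f:
    names = list(f["frame_0001"].keys())
    first_two = f["frame_0001/positions"][:2]  # reads only the requested rows
```

Slicing a dataset reads just the selected region from disk, which is what makes it practical to give each machine its own segment of a large file.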

Disk Systems
Network Attached Storage (NAS)
BlueArc (now Hitachi) - implemented via FPGA
Isilon (now EMC) - clustered filesystem, 100GB/s
Lustre Filesystem - multiple SSD nodes & maintains global file coherency
Experimental
Parallel distributed filesystem - can have multiple copies of a file, one master
Venti (Bell Labs Plan 9 & Inferno) - WORM archive. Shares blocks by secure SHA-1 hash
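The idea of sharing blocks by hash can be sketched as a small content-addressed store: each block is keyed by its SHA-1 digest, so identical blocks are stored once, and a stored block is never overwritten. This is an illustration in the spirit of Venti, not its actual protocol or on-disk format. The class name and the tiny block size are assumptions.

```python
import hashlib


class BlockStore:
    """Write-once store that addresses fixed-size blocks by their SHA-1 hash."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}

    def write(self, data):
        """Store `data` and return the list of block keys needed to rebuild it."""
        keys = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            key = hashlib.sha1(block).hexdigest()
            self.blocks.setdefault(key, block)  # existing blocks are never replaced
            keys.append(key)
        return keys

    def read(self, keys):
        """Rebuild the original bytes from a list of block keys."""
        return b"".join(self.blocks[key] for key in keys)


store = BlockStore(block_size=4)
file_a = store.write(b"AAAABBBBAAAA")
file_b = store.write(b"BBBBCCCC")
```

Five blocks are written in total, but only three distinct ones are held, because the repeated `AAAA` and `BBBB` blocks hash to keys that already exist.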

Data Formats 2
OpenVDB www.openvdb.org
Hierarchical structure for volumetric data (clouds)
Good for sparse volumetric time-varying data
Fast access (constant-time) to voxels
Large set of operators (Level Set tools, filters, transforms & morphological operators)
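Sparse storage means only voxels that differ from a background value take up memory. The sketch below uses a Python dictionary as a stand-in to show that idea. It is not the OpenVDB data structure (which is a hierarchical tree) and does not use the OpenVDB API. The class and method names are assumptions.

```python
class SparseVolume:
    """Store only the voxels whose value differs from the background."""

    def __init__(self, background=0.0):
        self.background = background
        self.voxels = {}  # (i, j, k) -> value

    def set(self, ijk, value):
        if value == self.background:
            self.voxels.pop(ijk, None)  # background voxels cost nothing
        else:
            self.voxels[ijk] = value

    def get(self, ijk):
        """Dictionary lookup: constant time on average, whatever the grid extent."""
        return self.voxels.get(ijk, self.background)

    def active_count(self):
        return len(self.voxels)


cloud = SparseVolume(background=0.0)
cloud.set((10, 20, 30), 0.8)
cloud.set((1000000, 0, 0), 0.1)  # far-away voxel, no dense grid needed
cloud.set((10, 20, 30), 0.0)     # resetting to background frees the voxel
```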

Data Formats 3
Disney Ptex http://ptex.us/
Eliminates UV texture assignment: no (u,v)'s required!
No seams visible
Works on sub-d/poly faces
Stores face adjacency data & filters
Efficiently stores 10^6 mipmapped texture files
Multi-channel, compressed separately
Used in Disney's Bolt

D3 Data-Driven Documents
D3: an amazing data visualisation web framework (JavaScript) http://d3js.org
See: https://github.com/mbostock/d3/wiki/gallery
Offers Parallel Coordinates
Demo? Nutrient Contents - An interactive visualization of the USDA Nutrient Database. http://exposedata.com/parallel/

Parallel Co-ordinates
Axes: protein, calcium, sodium, fibre, vitamin c, potassium, carbohydrate, sugar, fat, water, calories, saturated,...