Big Data Research @ Integrated Media Systems Center http://imsc.usc.edu/ Cyrus Shahabi, Ph.D. Professor of Computer Science & Electrical Engineering Director, Integrated Media Systems Center (IMSC) Viterbi School of Engineering University of Southern California Los Angeles, CA 90089-0781 shahabi@usc.edu 1
OUTLINE IMSC Background & Overview Vision: Geo-Immersion Big Data Research @ IMSC Closing Remarks 2
IMSC Background One of the 46 existing National Science Foundation (NSF) Engineering Research Centers (29 graduated) - Focuses on Multimedia ERC emphasis: Multidisciplinary, Industry Presence Founded by Max Nikias in 1996, graduated in 2007, operating in self-sustained mode. New management, team & vision in July 2010 Current Management Team 3
IMSC Vision: Geo-Immersion
Real World: Remote Sensing: Satellite & Multi-Spectral Imagery. Aerial Sensing: LiDAR, aerial imagery, UAV video. Ground Sensing: Traffic loop detectors, CCTV, pollution stations. People Sensing: smart phones, GPS, navigation. Human Body Sensing. Virtual World.
Blending the real and virtual worlds: Beyond augmented-reality, virtual-reality. Beyond integrating data and information. Towards fusion of human behaviors in both worlds.
We fuse geo-data from the same time AND space into actionable knowledge. 4
Why Big Data? Big Analytics Small Files! Big Data Small Analytics Machine Learning, Data Mining, etc. Parallel DBs, Cloud, HPC, Distributed Sys, etc. 5
Who Big Data? Application: Health, Transportation, Energy, etc. Machine Learning, Data Mining, etc. Parallel DBs, Cloud, HPC, Distributed Sys, etc. 6
OUTLINE IMSC Background & Overview Vision: Geo-Immersion Big Data Research @ IMSC Closing Remarks 7
Big Data Research @ IMSC Intelligent Transportation Big data acquisition, storage and access Intelligent Surveillance Big data analytics Intelligent Campus Big data collection 8
Big Data Research @ IMSC Intelligent Transportation Big data acquisition, storage and access Intelligent Surveillance Big data analytics Intelligent Campus Big data collection Prof. James Moore (VSoE) Prof. Genevieve Giuliano (Price School of Policy) Prof. Marlon Boarnet (Price School of Policy) Prof. Hamid Nazerzadeh (Marshall School of Bus.) USC Stevens 9
PROBLEM Intelligent Transportation
Traffic congestion is an $87.2 billion annual drain on the U.S. economy [1]: 4.2 billion lost hours (one work week for every traveler) [1] and 2.8 billion gallons of wasted fuel (three weeks' worth of gas for every traveler) [1].
[1] Texas Transportation Institute Urban Mobility Report, 2007 data
Location data could save consumers worldwide more than $600 billion annually by 2020. The biggest single consumer benefit will be from time and fuel savings from location-based services tapping into real-time traffic and weather data that help drivers avoid congestion and suggest alternative routes.
Congestion costs $713 per commuter per year, in extra fuel and wasted time: Los Angeles 64 hrs, $1,334; San Francisco 50 hrs, $1,019; Chicago 51 hrs, $1,568; Washington 74 hrs, $1,495. 10
Traffic Data Lifecycle: Data Aggregator
An Exclusive Contract w/ LA-Metro

| Data Type Sample XML | File Size (in KB) | Cycle Duration (in seconds) | Minute (in KB) | Hourly (in KB) | Daily (in KB) | Annual (in KB) | 3 Years (in KB) |
|---|---|---|---|---|---|---|---|
| bus_mta_inv2.xml | 23 | 86400 | 0.96 | 0.96 | 23.00 | 8,395.00 | 25,185.00 |
| bus_mta_rt2.xml | 1065 | 120 | 532.50 | 31,950.00 | 766,800.00 | 279,882,000.00 | 839,646,000.00 |
| cctv_inv.xml | 57 | 86400 | 0.04 | 2.38 | 57.00 | 20,805.00 | 62,415.00 |
| cms_inv.xml | 52 | 86400 | 0.04 | 2.17 | 52.00 | 18,980.00 | 56,940.00 |
| cms_rt.xml | 48 | 75 | 38.40 | 2,304.00 | 55,296.00 | 20,183,040.00 | 60,549,120.00 |
| event_d7.xml | 11 | 75 | 8.80 | 528.00 | 12,672.00 | 4,625,280.00 | 13,875,840.00 |
| rail_mta_inv.xml | 1 | 86400 | 0.00 | 0.04 | 1.00 | 365.00 | 1,095.00 |
| rail_rt.xml | 8 | 60 | 8.00 | 480.00 | 11,520.00 | 4,204,800.00 | 12,614,400.00 |
| rms_inv.xml | 865 | 86400 | 0.60 | 36.04 | 865.00 | 315,725.00 | 947,175.00 |
| rms_rt.xml | 1236 | 75 | 988.80 | 59,328.00 | 1,423,872.00 | 519,713,280.00 | 1,559,139,840.00 |
| signal_inv.xml | 2095 | 86400 | 1.45 | 87.29 | 2,095.00 | 764,675.00 | 2,294,025.00 |
| signal_rt.xml | 2636 | 45 | 3,514.67 | 210,880.00 | 5,061,120.00 | 1,847,308,800.00 | 5,541,926,400.00 |
| tt_d7_inv.xml | 746 | 86400 | 0.52 | 31.08 | 746.00 | 272,290.00 | 816,870.00 |
| tt_d7_rt.xml | 152 | 60 | 152.00 | 9,120.00 | 218,880.00 | 79,891,200.00 | 239,673,600.00 |
| vds_art_d7_inv.xml | 115 | 86400 | 0.08 | 4.79 | 115.00 | 41,975.00 | 125,925.00 |
| vds_art_d7_rt.xml | 45 | 60 | 45.00 | 2,700.00 | 64,800.00 | 23,652,000.00 | 70,956,000.00 |
| vds_art_ladot_inv.xml | 2538 | 86400 | 1.76 | 105.75 | 2,538.00 | 926,370.00 | 2,779,110.00 |
| vds_art_ladot_rt.xml | 969 | 60 | 969.00 | 58,140.00 | 1,395,360.00 | 509,306,400.00 | 1,527,919,200.00 |
| vds_fr_d7_inv.xml | 957 | 86400 | 0.66 | 39.88 | 957.00 | 349,305.00 | 1,047,915.00 |
| vds_fr_d7_rt.xml | 361 | 30 | 722.00 | 43,320.00 | 1,039,680.00 | 379,483,200.00 | 1,138,449,600.00 |
| Total KB from XML data | 13980 | 864660 | 6,985.28 | 419,060.38 | 10,057,449.00 | 3,670,968,885.00 | 11,012,906,655.00 |

Slide annotations: Variety (gps, video, loop sensor, events), Velocity, Volume 11
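The projected volumes in the table appear to follow from each feed's file size and cycle duration. A minimal sketch of that arithmetic, assuming one file of the listed size is published per cycle and a 365-day year (the function name `projected_volume_kb` is hypothetical, not from the slides):

```python
SECONDS_PER_DAY = 86_400


def projected_volume_kb(file_size_kb: float, cycle_seconds: int) -> dict:
    """Project the data volume (in KB) of a feed that emits one file per cycle."""
    # Number of files per day times the size of each file.
    per_day = file_size_kb * SECONDS_PER_DAY / cycle_seconds
    return {
        "minute": per_day / 1440,      # 1440 minutes per day
        "hourly": per_day / 24,
        "daily": per_day,
        "annual": per_day * 365,       # assumption: 365-day year
        "three_years": per_day * 365 * 3,
    }


# signal_rt.xml from the table: 2636 KB every 45 seconds.
signal_rt = projected_volume_kb(2636, 45)
print(round(signal_rt["minute"], 2))   # 3514.67
print(signal_rt["daily"])              # 5061120.0
```

Under these assumptions the output matches the signal_rt.xml row (3,514.67 KB per minute, 5,061,120 KB per day).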
TransDec: Big data acquisition, storage & access
Pipeline stages: Input Traffic Data; Data Processing; Storage; Retrieval, Analysis & Visualization.
Inputs: Highway (4313), Arterial (4780), Bus & Rail (2000), Ramp meter, Events & CMS (800/day).
Figure labels: 46 MB/min; 15 MB/min; 26 TB/Year; Data Cleansing; Spatiotemporal Indexing; Real-time Queries; Sensor 1, Sensor 2, Sensor 3, Sensor 4; Event Location.
E.g., Accident impact analysis & prediction 12
Technology Transfer: ClearPath 3 Pending Patents! 13
Google Option #1 8:00 AM Thursday Source: W Washington Blvd & Beethoven St Destination: USC 14
Google Option #2 8:00 AM Thursday Source: W Washington Blvd & Beethoven St Destination: USC 15
Google Option #3 8:00 AM Thursday Source: W Washington Blvd & Beethoven St Destination: USC 16
ClearPath 8:00 AM Thursday Source: W Washington Blvd & Beethoven St Destination: USC 17
Sample 2012 Research Result Sensor Data Analytics B. Pan, U. Demiryurek, and C. Shahabi, Utilizing Real-World Transportation Data for Accurate Traffic Prediction, IEEE International Conference on Data Mining (ICDM), Brussels, Belgium, December 2012 Machine Learning, Data Mining, etc. 18
VOA: March 6, 2013 http://www.voanews.com/content/traffic-technology-clearpath/1616682.html 19
Big Data Research @ IMSC Intelligent Transportation Big data acquisition, storage and access Intelligent Surveillance Big data analytics Intelligent Campus Big data collection Prof. Gerard Medioni (VSoE) Prof. Ram Nevatia (VSoE) Prof. Antonio Ortega (VSoE) Prof. Yan Liu (VSoE) Prof. Jon Taplin (Annenberg School of Comm) Prof. Carolee Winstein (Keck School of Medicine) Prof. Murali Annavaram (VSoE) USC DPS, CREATE, CTSI, CiSoft, AMI 20
Intelligent Surveillance Janus: Spatiotemporal fusion of unstructured intelligence 21
Video Analytics at Scale Face detection & recognition Event detection (e.g., abandoned bag) Utilizing Intel Viewmont co-processor for real-time analysis 22
Text Analytics (Tech Transfer) With Prof. Craig Knoblock (ISI) 23
iwatch Health Monitoring and Evaluation of Parkinson's Disease Patient Performance using Kinect
Data Collection: Provides data sensing and communication capabilities that enable real-time data acquisition in various selected modalities.
Denoise: Removes noise from the raw data.
Classification: Defines and classifies the performance of the patient.
User Interface: Enables efficient management of the patient performance. 24
(2) Kinect sensors + 1 digital camera; skeleton data has 15 features; Principal Component Analysis 25
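The slide names Principal Component Analysis on the 15-feature skeleton data but gives no further detail. As a rough illustration only, the sketch below finds the first principal component of a 15-feature dataset by power iteration on the sample covariance matrix. The data is synthetic, the function name `first_principal_component` is hypothetical, and nothing here is taken from the project's actual pipeline.

```python
import math
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Return (unit direction, variance along it) of the top principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by the covariance and renormalize.
    rng = random.Random(seed)
    v = [rng.random() for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, variance


# Synthetic stand-in for skeleton frames: 10 frames, 15 features each,
# all varying along the single direction (1, 2, ..., 15).
frames = [[t * (j + 1) for j in range(15)] for t in range(10)]
direction, variance = first_principal_component(frames)
```

For this synthetic input the recovered direction is proportional to (1, 2, ..., 15), so a single component captures all of the variance; real sensor data would typically need several components.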
Sample 2012 Research Result Human Activities Analytics H. Shirani-Mehr, F. Banaei-Kashani, and C. Shahabi, Efficient Reachability Query Evaluation in Large Spatiotemporal Contact Datasets, 38th International Conference on Very Large Databases (VLDB), Istanbul, Turkey, August 2012. Parallel DBs, Cloud, HPC, Distributed Sys, etc. 26
Big Data Research @ IMSC Intelligent Transportation Big data acquisition, storage and access Intelligent Surveillance Big data analytics Intelligent Campus Big data collection Prof. Burcin Becerik-Gerber (VSoE) Prof. Bhaskar Krishnamachari (VSoE) Prof. Dennis McLeod (VSoE) Prof. François Bar (Annenberg School of Comm) Prof. Nonny De La Pena (School of Cinematic Arts) 27
Geo-Crowdsourcing
Crowdsourcing: Outsourcing a set of tasks to a set of workers (e.g., Amazon Mechanical Turk).
Spatial Crowdsourcing: Crowdsourcing a set of spatial tasks to a set of workers.
Spatial task: a task related to a location.
Workers can perform the spatial tasks by physically traveling to the locations of the tasks.
E.g., take a picture of Tommy Trojan before the USC-UCLA game. 28
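To make the definition concrete, here is a minimal sketch of matching spatial tasks to workers who must physically travel to them. It uses a simple greedy nearest-worker rule, which is an illustrative stand-in and not the assignment method of GeoCrowd. The `Worker` and `Task` classes, the `max_km` travel limit, and all coordinates are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    lat: float
    lon: float
    max_km: float  # hypothetical: farthest this worker is willing to travel


@dataclass
class Task:
    name: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def assign_greedy(tasks, workers):
    """Give each task to the closest still-free worker whose range covers it."""
    free = list(workers)
    assignment = {}
    for task in tasks:
        in_range = []
        for w in free:
            dist = haversine_km(task.lat, task.lon, w.lat, w.lon)
            if dist <= w.max_km:
                in_range.append((dist, w))
        if not in_range:
            continue  # nobody can reach this task; leave it unassigned
        _, best = min(in_range, key=lambda pair: pair[0])
        assignment[task.name] = best.name
        free.remove(best)
    return assignment


# Illustrative, approximate coordinates only.
workers = [Worker("alice", 34.0224, -118.2851, 1.0),
           Worker("bob", 34.0522, -118.2437, 2.0)]
tasks = [Task("tommy_trojan_photo", 34.0205, -118.2854),
         Task("downtown_photo", 34.0500, -118.2500)]
assignment = assign_greedy(tasks, workers)
print(assignment)  # {'tommy_trojan_photo': 'alice', 'downtown_photo': 'bob'}
```

A greedy pass is easy to follow but can leave tasks unassigned that a globally optimized matching would cover.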
IMSC GeoCrowd Worker Requester Spatial Crowdsourcing Server (SC-server) In Collaboration w/ Prof. Zimmermann, NUS 29
PBS NEWSHOUR Inauguration Coverage with IMSC GeoCrowd 15 journalism students cover the streets of Washington DC during the event using Android phones 30
Automatic Panoramic Picture Generation 31
Sample 2012 Research Result Spatial Crowdsourcing L. Kazemi and C. Shahabi, GeoCrowd: Enabling Query Answering with Spatial Crowdsourcing, ACM SIGSPATIAL GIS, Redondo Beach, CA, November 2012. Application: Health, Transportation, Energy, etc. 32
Big Data Research @ IMSC Intelligent Transportation Big data acquisition, storage and access Intelligent Surveillance Big data analytics Intelligent Campus Big data collection MediaQ 33
Unmet Need
Smartphones can capture high-resolution imagery & video, but storage on the phone is limited (e.g., 64 GB on a MicroSD card) and unreliable (e.g., lost phone, broken phone).
Need for a reliable, searchable backend storage solution. E.g., Dropbox, Google Docs, and iCloud are very difficult to search as they are file-based systems.
Users need to be able to share their pictures and videos. E.g., media outlets buying image/video coverage of an event from willing participants who happen to be in the area, etc. 34
MediaQ Keyword search Map-based search Search results with accuracy rating Time Slider People Search 35
BigData is REAL! Not a toy problem!
Real-World Applications: Transportation; Security, Health, Energy; Social, Urban.
Real Systems: TransDec, Janus, GeoCrowd, MediaQ.
Real Research: IEEE ICDM 12 (SIGKDD), VLDB 12 (SIGMOD), ACMGIS 12 (SIGSPATIAL).
Commercialization: ClearPath, Geosemble. 36
OUTLINE IMSC Background & Overview Vision: Geo-Immersion Big Data Research @ IMSC Closing Remarks 37
Research Influence 38
Education 39
International Collaboration Namseoul University, Korea (Christy Shin) Media Management Research Lab, NUS, Singapore (Roger Zimmermann) Civil and Environmental Engineering, Technion, Israel (Barak Fishbain) Tsinghua University, China (Lin Zhang, Yunhao Liu) Touch Center, NCKU, Taiwan (Kevin Yang) Institute of Information Systems and Applications, NTHU, Taiwan (Von-Wun Soo, Yi-Shin Chen) Hong Kong University of Science & Technology (Lei Chen, Dimitris Papadias) 40
In the News 41
IMSC Advisory Board 27 Members from Academia, Government & Industry Government: NGA (1) Academia (7) UCSB, UMN, UI, UCB, UCI, Tsinghua, NSU, Tokyo U., RAND Industry (19) BAE (1) Chevron (1) Facebook (1) Google (2) HP (1) IBM (1) Intel (2) Microsoft (2) NEC (1) Northrop Grumman (2) Oracle (2) Samsung (1) 42
Industry Sponsorship Status Sponsors: Intel, Chevron, Microsoft, Google, HP Labs, NGC, IBM, Qualcomm, Vizio, Samsung, Oracle, BAE, Facebook, ESRI, Navteq, NEC [Chart: Amount Raised in Year-1, Year-2, Year-3; axis from 0 to 6,000,000] 43
IMSC Value Add to its Partners Our vision, expertise, background & experience in Fundamental and applied research Multidisciplinary research Integrated system development Our test-beds Government/Federal customers Industry Partners Global Reach Educational Presence 44
Questions? 45