Speaker: Prof. Mubarak Shah, University of Central Florida
Title: Representing Human Actions as Motion Patterns

Abstract: Automatic analysis of videos is one of the most challenging problems in computer vision. In this talk I will introduce the problem of action, event, and activity representation and recognition from video sequences. I will begin with a brief overview of a few interesting methods for this problem, including trajectory-based, volume-based, and local-interest-point-based representations. The main part of the talk will focus on a newly developed framework for the discovery and statistical representation of motion patterns in videos, which can act as primitive, atomic actions. These action primitives are employed as a generalizable representation of articulated human actions, gestures, and facial expressions. The motion primitives are learned by hierarchical clustering of observed optical flow in a four-dimensional space of spatial position and motion flow, and a sequence of these primitives can be represented as a simple string, a histogram, or a hidden Markov model. I will then describe methods that extend the motion-pattern estimation framework to multi-agent activity recognition. First, I will discuss transformation-invariant matching of motion patterns for recognizing simple events in surveillance scenarios. I will end by presenting a framework in which a motion pattern represents the behavior of a single agent, while a multi-agent activity takes the form of a graph that can be compared to other activity graphs by attributed inexact graph matching. This method is applied to the recognition of American football plays.

Bio: Dr. Mubarak Shah, Agere Chair Professor of Computer Science, is the founding director of the Computer Vision Lab at the University of Central Florida (UCF). He is a co-author of three books (Motion-Based Recognition (1997), Video Registration (2003), and Automated Multi-Camera Surveillance: Algorithms and Practice (2008)), all published by Springer. He has published extensively on topics related to visual surveillance, tracking, human activity and action recognition, object detection and categorization, shape from shading, geo-registration, and visual crowd analysis. Dr. Shah is a fellow of IEEE, IAPR, AAAS, and SPIE. In 2006 he was awarded the Pegasus Professor award, the highest award at UCF, given to a faculty member who has made a significant impact on the university. He is an ACM Distinguished Speaker. He was an IEEE Distinguished Visitor speaker for 1997-2000 and received the IEEE Outstanding Engineering Educator Award in 1997. He received the Harris Corporation Engineering Achievement Award in 1999; TOKTEN awards from UNDP in 1995, 1997, and 2000; the SANA award in 2007; an honorable mention for the ICCV 2005 "Where Am I?" Challenge Problem; and nominations for the best paper award at the ACM Multimedia Conference in 2005 and 2010. At UCF he received the Scholarship of Teaching and Learning (SoTL) award in 2011; the College of Engineering and Computer Science Advisory Board award for faculty excellence in 2011; Teaching Incentive Program awards in 1995 and 2003; Research Incentive Awards in 2003 and 2009; Millionaires' Club awards in 2005, 2006, 2009, 2010, and 2011; and University Distinguished Researcher awards in 2007 and 2012. He is an editor of the international book series on Video Computing, editor-in-chief of the Machine Vision and Applications journal, and an associate editor of the ACM Computing Surveys journal. He was an associate editor of IEEE Transactions on PAMI and a guest editor of the special issue of the International Journal of Computer Vision on Video Computing. He was program co-chair of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) in 2008.
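The clustering idea in the abstract can be sketched in a few lines: each optical-flow sample lives in a four-dimensional (x, y, u, v) space of pixel position plus flow vector, and cutting the dendrogram of a hierarchical clustering yields candidate motion primitives. This is a minimal illustration on synthetic data; the Ward linkage, the cluster count, and all data values are assumptions for the sketch, not details taken from the talk.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Hypothetical optical-flow samples: each row is (x, y, u, v) --
# pixel position plus flow vector, the 4-D space described in the talk.
flow = np.vstack([
    rng.normal(loc=[10, 10, 1.0, 0.0], scale=0.5, size=(50, 4)),   # rightward-motion patch
    rng.normal(loc=[40, 40, 0.0, -1.0], scale=0.5, size=(50, 4)),  # upward-motion patch
])

# Agglomerative (hierarchical) clustering; Ward linkage is one common choice.
Z = linkage(flow, method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the dendrogram at 2 clusters

# Each cluster is a candidate motion primitive; over time, a video becomes a
# sequence of primitive labels, usable as a string, a histogram, or HMM input.
print(sorted(set(labels.tolist())))  # -> [1, 2]
```

With well-separated synthetic patches the two recovered clusters match the two generating motions; on real flow fields the number of primitives would come from the dendrogram structure rather than being fixed in advance.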

Speaker: Prof. Irfan Essa, Georgia Tech, prof.irfanessa.com
Title: Extracting Content and Context from Video

Abstract: In this talk, I will describe various efforts aimed at extracting context and content from video. I will highlight some of our recent work on extracting spatio-temporal features and the related saliency information from video, which can be used to detect and localize regions of interest. Then I will describe approaches that use structured and unstructured representations to recognize complex and extended-time actions. I will also discuss the need for unsupervised activity discovery and for detection of anomalous activities in videos. I will show a variety of examples, including online videos, mobile videos, surveillance and home-monitoring video, and sports videos. Finally, I will pose a series of questions and make observations about how we need to extend our current paradigms of video understanding to go beyond local spatio-temporal features and standard time-series and bag-of-words models.

Bio: Irfan Essa is a Professor in the School of Interactive Computing (IC) of the College of Computing (CoC), Georgia Institute of Technology (GA Tech), in Atlanta, Georgia, USA. At GA Tech, he is primarily affiliated with two interdepartmental centers: the Robotics & Machine Intelligence (RIM@GT) Center and the GVU Center. He founded the Computational Perception Laboratory (CPL) at GA Tech in 1996, which he now co-directs with four other faculty members. He is interested in the analysis, interpretation, authoring, and synthesis of video, with the goals of building aware environments and supporting healthy living, recognizing and modeling human behaviors, empowering humans to interact effectively with each other, with media, and with technologies, and developing dynamic and generative representations of time-varying streams. He has published over 150 scholarly articles in leading journals and conference venues on these topics.
For further information, see his website at http://prof.irfanessa.com
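The "standard bag-of-words model" that the abstract proposes to move beyond can be summarized concretely: quantize local spatio-temporal descriptors against a learned codebook, then represent each video as a normalized histogram of codeword assignments. The sketch below uses random vectors in place of real descriptors; the descriptor dimensionality, codebook size, and k-means settings are all illustrative assumptions.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(1)
# Hypothetical local spatio-temporal descriptors (16-D here) from two videos.
video_a = rng.normal(0.0, 1.0, size=(200, 16))
video_b = rng.normal(0.5, 1.0, size=(150, 16))

# Learn a codebook on the pooled descriptors (k-means), as in the
# standard bag-of-words pipeline the talk contrasts with structured models.
k = 8
codebook, _ = kmeans2(np.vstack([video_a, video_b]), k, minit="++")

def bow(desc):
    # Assign each descriptor to its nearest codeword, count, and normalize.
    dist = ((desc[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    hist = np.bincount(dist.argmin(axis=1), minlength=k).astype(float)
    return hist / hist.sum()

print(bow(video_a).shape)  # each video becomes one k-dimensional histogram
```

The histogram discards all temporal ordering, which is exactly the limitation that motivates the structured and time-series representations discussed in the talk.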

Speaker: Dr. Apostol (Paul) Natsev, Google
Title: Machine Perception for Content Discovery at YouTube

Abstract: YouTube's mission is for YOU to discover and shape the world through video. At the heart of this mission is content discovery: the problem of finding interesting content relevant to a given topic or user. This problem is particularly challenging given the variety and volume of YouTube videos: one hour of video is uploaded to YouTube every second (more than ten years' worth of content every day). In this talk, I will give an overview of work in the machine perception department at Google Research aimed at improving content discovery at YouTube. Specifically, I will present several case studies of applying machine perception and machine learning at YouTube scale to problems such as automatically identifying and labeling celebrities and tourist landmarks in video, tagging videos with large unconstrained vocabularies, discovering musical or comedy talent on YouTube, and using gamification to crowdsource video discovery.

Bio: Apostol (Paul) Natsev received the M.S. and Ph.D. degrees in computer science from Duke University, Durham, NC, in 1997 and 2001, respectively. He is currently a Software Engineer and Manager in the Video Content Analysis Group at Google Research, Mountain View, CA. Previously, he was a Research Staff Member at IBM Research, Hawthorne, NY, from 2001 to 2011, and Manager of the Multimedia Research Group from 2007 to 2011. Dr. Natsev's research agenda is to advance the science and practice of systems that enable users to manage and search vast repositories of unstructured multimedia content. His research interests span image and video analysis and retrieval, computer vision, and large-scale machine learning.

Speaker: Dr. Anthony Hoogs, Kitware
Title: Action and Activity Recognition: Scaling Across Domains

Abstract: Over the past 10 years, the vision community has achieved significant breakthroughs in action, event, and activity recognition. We have solved the fundamental problems posed by the Weizmann and KTH datasets, and are making substantial improvements each year on less constrained datasets such as UCF YouTube Sports and Hollywood Human Actions (HOHA). Because these videos were not filmed by vision researchers but were compiled from the web and movie archives, they exhibit real-world conditions and complexity. More recently, the TRECVID Multimedia Event Detection competition was conducted on a very large collection of web videos showing complex events such as a wedding, changing a tire, and doing a woodworking project. With hundreds of exemplars of each event, plus thousands of videos of random events, this collection represents the largest publicly available web video dataset today. Surprisingly, initial event detection accuracy on this dataset exceeded expectations, with Pd > 25% at a false alarm rate < 5%. Apparently, the problem was easier than expected. In the related domain of video surveillance, the most extensive datasets are demonstrating the opposite effect. Released last year, the VIRAT Video Dataset [Oh et al., CVPR 2011] has 11 scenes, 8.5 hours of video in total, 11 annotated event types, and annotated bounding boxes on all movers. Initial performance on this dataset, using the same algorithms that do so well on HOHA and UCF, is much worse: at Pd = 25%, precision < 1%. Similarly poor results have been observed on the TRECVID Surveillance Event Detection dataset, which has similar content but more limited scene variety. Why is the seemingly less complex domain of surveillance more difficult than highly complex web videos? In this talk I will describe the methods we've used to achieve the stated levels of performance on these datasets, and present reasons why surveillance video appears to be the more difficult case.

Bio: Dr. Hoogs founded and directs the computer vision group at Kitware, Inc., which currently has more than 30 members, half with PhDs. Over the past 20 years, he has supervised and performed research in various areas of computer vision, including event, activity, and behavior recognition; motion pattern learning and anomaly detection; tracking; content-based retrieval; and segmentation. At Kitware he has led large, collaborative projects in video analysis involving universities, companies, and government institutions. He has published more than 60 papers in computer vision, pattern recognition, artificial intelligence, and remote sensing, and regularly serves as a program committee member and/or area chair for major vision conferences. Dr. Hoogs received his Ph.D. in Computer and Information Science from the University of Pennsylvania, his M.S. from the University of Illinois at Urbana-Champaign, and his B.A. magna cum laude from Amherst College.
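The two operating points quoted in the abstract (Pd > 25% at false alarm rate < 5% on web video, versus precision < 1% at Pd = 25% on VIRAT) use different metrics, and the gap between them is largely a base-rate effect. The sketch below computes detection probability, false alarm rate, and precision from hypothetical confusion counts chosen to reproduce those numbers; the counts themselves are illustrative, not taken from either benchmark.

```python
# Hypothetical confusion counts at one detector operating point.
tp, fn = 25, 75        # 100 true events, 25 detected -> Pd = 25%
fp, tn = 5000, 95000   # far more non-event windows than events, as in surveillance

pd = tp / (tp + fn)          # detection probability (recall)
far = fp / (fp + tn)         # false alarm rate over negatives
precision = tp / (tp + fp)   # fraction of detections that are correct

print(f"Pd={pd:.0%}  FAR={far:.0%}  precision={precision:.2%}")
# With vastly more negatives than true events, even a modest 5% false alarm
# rate swamps precision: Pd = 25% yet precision is below 1%.
```

The same false alarm rate thus looks acceptable when reported against negatives but disastrous when reported as precision, which is one way to reconcile the seemingly contradictory results on the two domains.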

Speaker: Dr. John Smith, IBM
Title: TBD

Bio: Dr. John Smith is a senior manager of the Intelligent Information Management Department at the IBM T. J. Watson Research Center. He leads a research department addressing technical challenges in database systems and information management. His team includes the Database Research Group and the Intelligent Information Analysis Group. In addition to his managerial responsibilities, Dr. Smith currently serves as Chair of the Data Management research area at Watson and as IBM Research Campus Relationship Manager for Columbia University. From 2001 to 2004, Dr. Smith served as Chair of the ISO/IEC JTC1/SC29 WG11 Moving Picture Experts Group (MPEG) Multimedia Description Schemes group, with responsibilities in the development of the MPEG-7 and MPEG-21 standards. Dr. Smith also served as co-project editor for the following parts of the MPEG-7 standard: "MPEG-7 Multimedia Description Schemes," "MPEG-7 Conformance," "MPEG-7 Extraction and Use," and "MPEG-7 Schema Definition." He also serves on the Advisory Committee for the NIST TREC Video Retrieval Evaluation. Dr. Smith received his M.Phil. and Ph.D. degrees in Electrical Engineering from Columbia University in 1994 and 1997, respectively.