isecure: Integrating Learning Resources for Information Security Research and Education The isecure team 1
isecure NSF-funded collaborative project (2012-2015) Faculty NJIT Vincent Oria Jim Geller Reza Curtmola CUNY Soon Ae Chun MSU Edina Renfro-Michel Students NJIT Sheetal Rajgure (PhD) Arwa Wali (PhD) Deepthi Mandepudi (MS) Krithika Raghavan (MS) MSU Sailume Walo (PhD) 2
Millennials Multitask Have Short Attention Spans Tend to be Visual Learners Bore Easily Want Instant Gratification Want Control Over Their Learning Have an Expectation to Achieve Lack Self-Reflection Skills 3
Millennials Successful online courses need to utilize a variety of teaching tools Enhance learning for students using different learning styles Keep student attention Individualize learning (pace & information) 4
Online courses It can be difficult for students to coordinate all of the course material in a meaningful way How to reduce attrition rates for online courses? How can we make online teaching more attractive and effective? 5
isecure Online lecture materials are very popular Coursera MIT OpenCourseWare Multimedia data often relies on annotation. However, manual annotation is tedious How to effectively and efficiently search such a huge amount of heterogeneous data? 6
isecure Integration of multi-media teaching materials (slides, videos and textbooks) for Security courses based on a security ontology Segmenting and annotating learning media based on their learning content Building a Security Ontology to be used for annotating and querying the learning objects Adapting lecture contents to student learning styles Evaluating the usability and effectiveness of the linked multimedia learning system 7
Presentation Outline isecure modules Ultimate Course Search tool Security Ontology Evaluation & Learning styles Conclusion 8
isecure and Multimedia Learning Materials Multimedia is increasingly used in teaching: slides, textbook, videos No link across the media The classical approaches based on visual features are not applicable Learning concepts: complex semantics 9
Security Ontology isecure Modules SLOB: Security Ontology Browsing and Searching SKAT: Security Knowledge Acquisition Tool Team: James Geller, Soon Ae Chun, Arwa Wali, with Reza Curtmola and Roberto Rubino as the Information Security experts Ultimate Course Search (UCS) Slide, video and textbook indexing and search Team: Vincent Oria, Sheetal Rajgure, Krithika Raghavan, Deepthi Mandepudi, Hardikrasiklal Dasadia (MS), Saishashank Devannagari (MS) Evaluation UCS currently being used in CS 357 @ NJIT Team: Edina Renfro-Michel, Sailume Walo 10
Ultimate Course Search Indexing, Searching (Scoring and Ranking) 11
isecure and Multimedia Learning Materials: Goal The purpose of this tool is to bring different learning media (textbook, PowerPoint slides, and videos) to a single platform Enhance learning of students Enable easy search of lecture materials 12
TF-IDF Vector Space Model from Information Retrieval Documents and queries are represented as weighted vectors Each distinct index term (keyword) is treated as a dimension in the vector space, and the weights are TF-IDF values TermFrequency(t,x) increases with the number of occurrences of the term t in x InverseDocumentFrequency(t) decreases with the number of documents containing term t 13
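The weighting described on this slide can be sketched as follows. This is a minimal illustration, not the project's code: the exact formula (raw term count times log(N / df)) is an assumption, since the slide only says that TF grows with occurrences and IDF shrinks with document frequency. The function name `tf_idf_vectors` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: raw count times log(N / df).
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


# Hypothetical, already tokenized documents.
docs = [
    ["firewall", "packet", "filter", "firewall"],
    ["packet", "encryption"],
    ["encryption", "key", "key"],
]
vectors = tf_idf_vectors(docs)
```

A term that is frequent in one document but rare across the collection ("firewall" here) receives a larger weight than a term shared by several documents ("packet").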
PowerPoint Presentation Structuring the slides Every slide's content is extracted and structured in the form of XML elements consisting of slide number, slide title and slide text
<Presentation title="Ptitle">
  <Slide slidenumber="1">
    <SlideTitle> title </SlideTitle>
    <SlideText> text </SlideText>
  </Slide>
</Presentation>
Analyze the content Text is broken into a series of atomic elements/tokens The keywords are identified after the text is analyzed to remove whitespaces, stop words etc. The keywords are stemmed using English Stemmer, an updated version of Porter Stemmer's algorithm 14
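The structuring and analysis steps on this slide can be sketched with the Python standard library. This is an illustration under stated assumptions: the stop-word list is a tiny hypothetical one, the sample slide content is invented, and the stemming step is left out because the English Stemmer is not part of the standard library.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a tiny illustrative stop-word list, not the one used by the tool.
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to"}

# Hypothetical slide content in the XML layout described on the slide.
SAMPLE = """
<Presentation title="Ptitle">
  <Slide slidenumber="1">
    <SlideTitle>Types of Firewalls</SlideTitle>
    <SlideText>A firewall filters the packets.</SlideText>
  </Slide>
</Presentation>
"""


def keywords(text):
    """Lowercase, split into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def parse_presentation(xml_text):
    """Return one record per slide with its title and body keywords."""
    root = ET.fromstring(xml_text.strip())
    slides = []
    for slide in root.iter("Slide"):
        slides.append({
            "number": int(slide.get("slidenumber")),
            "title": keywords(slide.findtext("SlideTitle", default="")),
            "body": keywords(slide.findtext("SlideText", default="")),
        })
    return slides


slides = parse_presentation(SAMPLE)
```

A stemmer would be applied to the output of `keywords` before indexing.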
PowerPoint Presentation Indexing Build an inverted index for stemmed keywords For every keyword, keep a list of triples: (slide number, frequency in slide title, frequency in slide body) For every presentation, keep a list of slides associated with that presentation 15
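A minimal sketch of such an inverted index, assuming the keywords are already stemmed. The function name `build_index`, the sample keywords, and the file name `lecture1.ppt` are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(slides):
    """slides: {slide_number: (title_keywords, body_keywords)}.

    Returns {keyword: [(slide_number, freq_in_title, freq_in_body), ...]}.
    """
    index = defaultdict(list)
    for number in sorted(slides):
        title, body = slides[number]
        title_counts, body_counts = Counter(title), Counter(body)
        for kw in sorted(set(title) | set(body)):
            # One triple per (keyword, slide) pair, as described on the slide.
            index[kw].append((number, title_counts[kw], body_counts[kw]))
    return dict(index)


# Hypothetical keywords, assumed to be stemmed already.
slides = {
    1: (["attack"], ["attack", "filter", "packet"]),
    2: (["encrypt"], ["key", "encrypt", "key"]),
    3: (["packet", "filter"], ["attack", "packet"]),
}
index = build_index(slides)

# Per-presentation list of slides (the file name is a placeholder).
presentations = {"lecture1.ppt": sorted(slides)}
```

Keeping title and body frequencies separate lets the scoring step weight a title match differently from a body match.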
PowerPoint Presentation Searching A slide's relevance is based on: Term frequency / Inverse document frequency Cosine similarity: $\mathrm{Sim}(q, d) = \dfrac{V(q) \cdot V(d)}{\lVert V(q) \rVert \, \lVert V(d) \rVert}$ Keyword location in the slide (title, body) Only consider slides that contain query keywords A presentation's relevance is based on: An aggregation (sum) of the relevancy scores of the slides in the presentation Presentations are ranked and displayed based on their relevancy (higher to lower) 16
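The scoring and ranking described on this slide can be sketched as below. The cosine similarity and the sum over slides follow the slide; the way keyword location enters the score is not specified there, so the multiplicative `TITLE_BOOST` is purely an assumption, as are all names and sample vectors.

```python
import math

TITLE_BOOST = 2.0  # Assumption: the real weighting of title matches is not given.


def cosine(q, d):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0


def slide_score(query_vec, slide_vec, title_terms):
    # Only slides that contain at least one query keyword are considered.
    if not any(t in slide_vec for t in query_vec):
        return 0.0
    boost = TITLE_BOOST if any(t in title_terms for t in query_vec) else 1.0
    return boost * cosine(query_vec, slide_vec)


def rank_presentations(query_vec, presentations):
    """presentations: {name: [(slide_vec, title_terms), ...]}."""
    totals = {
        name: sum(slide_score(query_vec, vec, title) for vec, title in slides)
        for name, slides in presentations.items()
    }
    # Higher relevancy first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


query = {"attack": 1.0}
presentations = {
    "A": [({"attack": 1.0}, {"attack"}), ({"key": 1.0}, set())],
    "B": [({"attack": 1.0, "key": 1.0}, set())],
}
ranked = rank_presentations(query, presentations)
```

In this example presentation "A" ranks first because one of its slides matches the query in the title.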
Lecture Video To identify which portion of a lecture refers to a specific slide, we need to segment the lecture video We determine slide transitions in the video: Based on features of the recording software (for new videos) Based on image processing techniques (for previously recorded videos) Once the transitions are found, we can easily determine the duration of each slide segment in the video 18
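Once the transition times are known, deriving the segments is simple arithmetic. The sketch below assumes that slides are shown in order and once each, which the slide does not state. The function name and the timestamps are hypothetical.

```python
def segments_from_transitions(transitions_ms, video_end_ms):
    """transitions_ms[i] is the time (ms) at which slide i + 1 first appears.

    Assumption: slides appear in order and are not revisited.
    """
    bounds = list(transitions_ms) + [video_end_ms]
    return [
        {
            "slide": i + 1,
            "start": bounds[i],
            "end": bounds[i + 1],
            "duration": bounds[i + 1] - bounds[i],
        }
        for i in range(len(transitions_ms))
    ]


# Hypothetical transition times for a five-minute video with three slides.
segments = segments_from_transitions([0, 95_000, 240_000], 300_000)
```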
Lecture Video Indexing The video lectures are converted to an XML hierarchy:
<Video filename="videofilename.mp4" presentationfilename="relatedpresentationfile.ppt">
  <Slide slidenumber="1">
    <StartTime> in ms </StartTime>
    <EndTime> in ms </EndTime>
  </Slide>
</Video>
For every video segment corresponding to a presentation slide, record the slide number and the presentation associated with that video segment 19
Lecture Video Searching For incorporating search on videos, we rely on extracted text and scoring for PowerPoint slides When a user searches for a keyword, the video files corresponding to the slides are linked to the result with the start and end time 20
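A minimal sketch of how slide hits could be linked to video segments, using the XML hierarchy described for video indexing. The times, scores, and function names are hypothetical. The lookup key (presentation file, slide number) is an assumption about how the link is stored.

```python
import xml.etree.ElementTree as ET

# Hypothetical index file in the video XML layout; times are in milliseconds.
VIDEO_XML = """<Video filename="videofilename.mp4" presentationfilename="relatedpresentationfile.ppt">
  <Slide slidenumber="1"><StartTime>0</StartTime><EndTime>95000</EndTime></Slide>
  <Slide slidenumber="2"><StartTime>95000</StartTime><EndTime>240000</EndTime></Slide>
</Video>"""


def load_segments(xml_text):
    """Map (presentation, slide number) to (video file, start ms, end ms)."""
    root = ET.fromstring(xml_text)
    presentation = root.get("presentationfilename")
    table = {}
    for slide in root.iter("Slide"):
        table[(presentation, int(slide.get("slidenumber")))] = (
            root.get("filename"),
            int(slide.findtext("StartTime")),
            int(slide.findtext("EndTime")),
        )
    return table


def attach_video(slide_hits, segments):
    """slide_hits: [(presentation, slide_number, score), ...] from slide search."""
    return [(p, n, score, segments.get((p, n))) for p, n, score in slide_hits]


segments = load_segments(VIDEO_XML)
results = attach_video(
    [("relatedpresentationfile.ppt", 2, 0.8), ("relatedpresentationfile.ppt", 5, 0.3)],
    segments,
)
```

A hit on a slide with no recorded segment simply carries `None` in place of the video link.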
Textbook Indexing and Searching We use the textbook's Index of Terms and the Security Ontology to create our index An inverted index that keeps a list of textbook page numbers for every Index Term The book's index terms and page numbers of each term's occurrences are organized as the following XML:
<Textbook title="textbooktitle">
  <Keyword>
    <Word> index term </Word>
    <PageNumber> 1,2,3 </PageNumber>
  </Keyword>
</Textbook>
To search, we match query keywords with Index Terms 22
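The textbook index and lookup can be sketched as follows. The index terms and page numbers are invented. The matching rule (a term matches when it shares at least one whole word with the query) is an assumption, since the slide only says that query keywords are matched with Index Terms.

```python
import xml.etree.ElementTree as ET

# Hypothetical index entries in the textbook XML layout.
TEXTBOOK_XML = """<Textbook title="textbooktitle">
  <Keyword><Word>firewall</Word><PageNumber>12,13,240</PageNumber></Keyword>
  <Keyword><Word>public key encryption</Word><PageNumber>88</PageNumber></Keyword>
</Textbook>"""


def load_textbook_index(xml_text):
    """Return {index term: [page numbers]}."""
    index = {}
    for entry in ET.fromstring(xml_text).iter("Keyword"):
        term = entry.findtext("Word").strip().lower()
        index[term] = [int(p) for p in entry.findtext("PageNumber").split(",")]
    return index


def search_textbook(query, index):
    """Assumed rule: keep terms sharing at least one whole word with the query."""
    words = set(query.lower().split())
    return {term: pages for term, pages in index.items() if words & set(term.split())}


textbook_index = load_textbook_index(TEXTBOOK_XML)
```

For example, `search_textbook("encryption", textbook_index)` returns the entry for "public key encryption" with its page list.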
isecure: Security Ontology Soon Ae Chun, CUNY-CSI James Geller, NJIT Reza Curtmola, NJIT Arwa Wali, NJIT PhD Student Amy Luo, CUNY-CSI Honors College 24
Outline Project Goals Develop Security Ontology Approach & results Bootstrapping approach SOB Tool Expert Tool SKAT Tool Ongoing and Next Steps 25
Project Goals: Cyber Security Ontology Learning starts with grasping the Concepts and Relationships in a domain. Model Cyber Security Domain Knowledge o Develop Cyber Security Ontology Apply Cyber Security Ontology for Cyber Security Education o Semantic Description o Semantic Linking o Semantic Search and Browsing o Security Learning by Concepts o Support sharing, querying and reuse of domain knowledge 26
Develop Cyber Security Ontology Challenges: Knowledge Engineering has been labor intensive, resorting to manual processes Our approach Semi-automatically build ontology from textbook index Challenge of identifying Security Domain Concepts Challenge of building hierarchical structure of ontology & other relationships Categories or classes (is-a or taxonomic relations) Relationships between concepts (property) Challenge of verifying ontology Manual categorization Validation of ontology 27
Our Approach: Seed Ontology-based bootstrapping Use of Textbook index terms as domain concepts Use an existing Ontology as a basic scaffold Herzog's ontology Fenz's ontology Develop a human expert tool Verification and manual classification and editing 28
Seed Ontology: Top-Level Concepts and Relationships Asset Countermeasure/Control Threat Vulnerability Fenz et al. 2009 29
Seed Ontology: Upper level Security Ontology Threat Countermeasures Asset Asset, Counter Measures, Vulnerability, Threat Goal, Model, Product, defense strategy Herzog et al. 2007, http://www.ida.liu.se/~iislab/projects/secont/main.jpg 30
Herzog et al. Fenz et al. Criteria to choose a seed ontology: Annotation power Useful concepts to annotate the learning objects Covering power Number of terms covered in the security textbook 31
Ontology Enrichment Components Textbook Index Pre-processing Seed Ontology Ontology Enrichment Engine Exact String matching Substring Matching Wikipedia Category TOC for relationships Concept Definitions Enriched Security Ontology Wikipedia NIST Definitions 32
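The exact and substring matching steps of the enrichment engine can be sketched as below. This is only an illustration: the normalization, the whole-word substring rule, and the sample index terms are assumptions. The Wikipedia, table-of-contents, and definition steps are not covered.

```python
def normalize(term):
    """Lowercase, treat hyphens as spaces, collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def match_index_terms(index_terms, ontology_concepts):
    """Split index terms into exact matches, substring matches, and leftovers."""
    concepts = {normalize(c): c for c in ontology_concepts}
    exact, partial, leftover = {}, {}, []
    for term in index_terms:
        key = normalize(term)
        if key in concepts:
            exact[term] = concepts[key]
            continue
        # Assumed rule: a concept name occurring as whole words inside the term.
        hits = sorted(c for k, c in concepts.items() if f" {k} " in f" {key} ")
        if hits:
            partial[term] = hits
        else:
            leftover.append(term)
    return exact, partial, leftover


# Top-level seed concepts from the deck; the index terms are hypothetical.
concepts = ["Asset", "Countermeasure", "Threat", "Vulnerability"]
exact, partial, leftover = match_index_terms(
    ["Threat", "software vulnerability", "Kerberos"], concepts
)
```

Terms that end up in `leftover` are the ones that would need further automatic steps or expert classification.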
Results Total 724 index terms Seed Ontology 375 88 Index terms 636 375 Enriched Ontology 263 Index terms 361 33
SKAT: Security Knowledge Acquisition Tool Allow domain experts to classify leftover index terms from a security textbook into the security ontology. Design Goals Recommendation of concept classes for unclassified index terms, to minimize cognitive burden Help the security expert define the most related class for an index term 34
Recommending Ontology Concepts Co-occurrence Analysis For the leftover index terms, we use human experts to classify them to the ontology Recommend the potentially related ontology classes (concepts) for each index term Co-occurrence analysis From Textbook, identify all sentences where Term t in index co-occurs with concept classes (c1, c2, ...) in the ontology Recommend these co-occurring concepts for the term t 271 Terms co-occur with ontology concepts and 44 do not co-occur with any concepts An expert knowledge acquisition TOOL 35
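The co-occurrence analysis on this slide can be sketched as follows. The sentence splitter, the case-insensitive substring test, and ranking by co-occurrence count are assumptions. The sample text is invented.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def recommend_concepts(term, sentences, concepts, top_n=3):
    """Recommend ontology concepts that share a sentence with the index term."""
    counts = Counter()
    needle = term.lower()
    for sentence in sentences:
        lowered = sentence.lower()
        if needle not in lowered:
            continue
        for concept in concepts:
            if concept.lower() in lowered:
                counts[concept] += 1
    # Assumption: more shared sentences means a stronger recommendation.
    return [concept for concept, _ in counts.most_common(top_n)]


# Hypothetical textbook excerpt.
text = (
    "A buffer overflow is a common vulnerability. "
    "A buffer overflow vulnerability can become a threat. "
    "Encryption is a countermeasure."
)
concepts = ["Asset", "Countermeasure", "Threat", "Vulnerability"]
sentences = split_sentences(text)
suggestions = recommend_concepts("buffer overflow", sentences, concepts)
```

A term that never shares a sentence with any concept gets an empty list and is left entirely to the expert.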
Ontology Bootstrapping Components Textbook Index Pre-processing Ontology Enrichment Engine Exact String matching Substring Matching Wikipedia Category TOC for relationships Concept Definitions Enriched Security Ontology Bootstrap Ontology SKAT Tool Wikipedia NIST Definitions 36
Current activities SKAT tool-based expert classification Evaluation of efficacy of the tool Evaluation of the automated bootstrapping based ontology SOB: Cyber Security Ontology Browser Enrich Security Ontology Pre-requisite relations between Learning concepts, difficulty levels of concepts, etc. 37
SKAT Tool 38
SOB Tool 39
Next Phase Activities Develop Linked Data of Learning Resources Not only the classroom materials but also open learning objects, learning data Cyber-security ontology based annotation/indexing Modeling of Learners Learning preferences Learning history SLOBS: Security Learning by Ontology Browsing and Searching Learning concept definitions and related resources Ranking and presentation of search results Consider Learners model for Search and Presentation Evaluation of the tool for learning cyber security 40
Evaluation Integrate Learning Information, teach this information to students Acquire Feedback from Students to Improve UCS Determine Student Usage of UCS Determine Learning Outcomes of UCS 41
Learning Preferences Index of Learning Preferences (Felder & Soloman, 1993) Four Types of Learners: Active - Reflective, Sensing - Intuitive, Visual - Verbal, Sequential - Global 42
ILS Learning Styles Inventory Active and Reflective Active learners like to interact with new material Reflective learners like to think about applying material Sensing and Intuitive Sensing learners like material that applies to the real world Intuitive learners enjoy learning new material and dislike repetition 43
ILS Learning Styles Inventory Visual and Verbal Visual learners remember material presented through pictures or demonstrations Verbal learners tend to remember written and spoken explanations Sequential and Global Sequential learners like to link information in logical steps Global learners like looking at the big picture and how it relates to larger themes before understanding all of the details 44
ILS Results on a Continuum [Figure: four scales, ACT-REF, SEN-INT, VIS-VRB and SEQ-GLO, each running 11a 9a 7a 5a 3a 1a 1b 3b 5b 7b 9b 11b] 45
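A position on one of these scales can be computed from the answers for that dimension. The sketch below assumes 11 two-choice ("a" or "b") items per dimension, with the score being the difference between the two counts, labeled with the more frequent letter. This matches the odd values from 1 to 11 on the continuum but is not taken from the deck. The function name is hypothetical.

```python
def ils_score(answers):
    """answers: the 11 'a'/'b' responses for a single dimension.

    Assumption: 'a' leans toward the first-named pole (ACT, SEN, VIS, SEQ)
    and 'b' toward the second (REF, INT, VRB, GLO).
    """
    if len(answers) != 11 or set(answers) - {"a", "b"}:
        raise ValueError("expected exactly 11 answers, each 'a' or 'b'")
    a_count = answers.count("a")
    b_count = 11 - a_count
    # With 11 items the difference is always odd, so there is never a tie.
    letter = "a" if a_count > b_count else "b"
    return f"{abs(a_count - b_count)}{letter}"


example = ils_score("aaaaaaaabbb")  # eight 'a' answers, three 'b' answers
```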
The Research Currently Collecting Data in a Security Course Control and Experimental Demographics, Pre-Post Test, ILS User Feedback and Suggestions Back End User Data 46
Midterm Survey Feedback How are the students using UCS? Search for Information/specific words & terms Review video podcast lectures As a reference To study for the Midterm To help complete class projects 47
Midterm Survey Feedback What did the students like about UCS? User friendly Search engine Fast and accurate Search exact words Tabs and specific information Search Videos Searches lead to a lot of information 48
Implications for Higher Education Reduce attrition Increase clarity of course organization Increase student interaction with materials Individualize learning Create connections within and between courses 49
Q&A CONTACT INFO Ultimate Course Search Tool Vincent Oria: vincent.oria@njit.edu Security Ontology Soon Ae Chun: soon.chun@csi.cuny.edu James Geller: james.geller@njit.edu Evaluation & Learning styles Edina Renfro-Michel: renfromichee@mail.montclair.edu 50