Combining Social Data and Semantic Content Analysis for L'Aquila Social Urban Network


1 I-CiTies, CINI Annual Workshop on ICT for Smart Cities and Communities, Palermo (Italy), October 29-30, 2015. Combining Social Data and Semantic Content Analysis for L'Aquila Social Urban Network (Università degli Studi di Bari Aldo Moro, Italy, SWAP Research Group)

2 L'Aquila: April 6, magnitude earthquake. 20 billion in damages, 70,000 people displaced, 309 people died.

3 L'Aquila 2015: six years later. 7 billion in funding still needed, 22,000 people still displaced. Diaspora.

4 L'Aquila: 19 new towns around L'Aquila; 15,200 people live there today.

5 L'Aquila: What about the consequences? Loss of trust, sense of belonging, relationships.

6 L'Aquila: Loss of social capital

7 L'Aquila Social Urban Network

8 L'Aquila Social Urban Network: Our contribution!

9 L'Aquila Social Urban Network. Research question: Is it possible to extract and process social media content to monitor, in real time, people's feelings, opinions and sentiments about the current state of the social capital of L'Aquila?

10 CrowdPulse: a framework for real-time semantic analysis of social streams

11 CrowdPulse features: Social Data Extraction, Sentiment Analysis, Semantic Tagging, Processing & Visualization

12 CrowdPulse workflow

13 CrowdPulse Step 1: Social Data Extraction

14 CrowdPulse Step 1: Social Data Extraction. Source; Extraction Heuristics

16 CrowdPulse Step 1: Social Data Extraction. Source; Extraction Heuristics: Content (e.g. #earthquake, #traffic), User, Geo, Content+Geo, Page, Group

17 CrowdPulse Step 1: Social Data Extraction. Source; Extraction Heuristics: Content (e.g. #earthquake, #traffic), User, Geo, Content+Geo, Page, Group. We only extract public content.

18 Use Case: L'Aquila Social Urban Network. CROWDPULSE SETTINGS. Heuristics: Twitter users (local newspapers, mentions of politicians); Twitter content+geo (50 km around L'Aquila and/or specific hashtags such as #laquila, #earthquake, etc.)
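The content+geo heuristic on this slide can be illustrated with a minimal sketch. It is not CrowdPulse's actual extraction code: the post dictionaries, the `matches_content_geo` helper and the approximate coordinates of L'Aquila are assumptions made for illustration, and "and/or" is read here as a logical OR.

```python
import math

# Approximate coordinates of L'Aquila (assumed for illustration).
LAQUILA = (42.35, 13.40)
RADIUS_KM = 50.0
HASHTAGS = {"#laquila", "#earthquake"}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def matches_content_geo(post):
    """Keep a post if it is within the radius or carries a tracked hashtag."""
    tags = {w.lower().rstrip(".,!?") for w in post["text"].split()
            if w.startswith("#")}
    geo = post.get("geo")
    near = geo is not None and haversine_km(geo, LAQUILA) <= RADIUS_KM
    return near or bool(tags & HASHTAGS)


# Hypothetical posts: hashtag match, geo match, and neither.
posts = [
    {"text": "Nuovi cantieri in centro #laquila", "geo": None},
    {"text": "Traffic jam this morning", "geo": (42.36, 13.39)},  # close to the city
    {"text": "Sunny day", "geo": (41.90, 12.50)},  # Rome, outside the radius
]
kept = [p for p in posts if matches_content_geo(p)]
```

Requiring both conditions instead (a stricter reading of "and/or") would only change the last line of `matches_content_geo` from `or` to `and`.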

19 Use Case: L'Aquila Social Urban Network. CROWDPULSE SETTINGS. Heuristics: Facebook groups (identified after a thorough analysis); Facebook pages (identified after a thorough analysis)

20 Use Case: L'Aquila Social Urban Network. CROWDPULSE SETTINGS. Extracted content (examples): tweets about the fear of new earthquakes; Facebook posts about citizens' proposals; tweets about people worried about the situation; tweets about new buildings in the city.

21 Use Case: L'Aquila Social Urban Network. CROWDPULSE SETTINGS. Sentiment Analysis and Semantic Tagging of the content

22 Semantic Tagging: Motivations. Does "aquila" (Italian) mean the eagle or the Italian city? A keyword-based representation introduces a lot of noise into the analysis.

23 Semantic Tagging: Motivations. "Fate qualcosa per favore, l'Aquila sta morendo!" Does it mean "Please, do something: L'Aquila is dying!" or "Please, do something: the eagle is dying!"?

24 CrowdPulse Step 2: Semantic Tagging. Identification and disambiguation of the entities mentioned in the text. Non-trivial NLP tasks (stopword removal, n-gram identification, named entity recognition and disambiguation) are performed automatically.
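To make the tasks listed on this slide concrete, here is a toy sketch of stopword removal, n-gram identification and a cue-word disambiguation of "aquila". The stopword list, the `SENSES` cue sets and all function names are hypothetical. The slides do not say which techniques CrowdPulse uses, so the overlap-based disambiguation is only a stand-in.

```python
import re

# Tiny illustrative stopword list (an assumption, not a real resource).
STOPWORDS = {"the", "is", "a", "of", "in", "to", "do", "please"}

# Hypothetical cue words for the two senses discussed on the slides.
SENSES = {
    "L'Aquila (city)": {"city", "earthquake", "buildings", "centre", "citizens"},
    "eagle (bird)": {"bird", "nest", "wings", "fly"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stopwords(tokens):
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def disambiguate(tokens):
    """Pick the sense whose cue words overlap most with the context.

    Returns None when no cue word appears at all.
    """
    context = set(tokens)
    best = max(SENSES, key=lambda s: len(SENSES[s] & context))
    return best if SENSES[best] & context else None


tokens = remove_stopwords(tokenize("New buildings in the city centre of L'Aquila"))
bigrams = ngrams(tokens, 2)
sense = disambiguate(tokens)
```

In this example the context words "buildings", "city" and "centre" point to the city rather than the bird, which is exactly the distinction a purely keyword-based representation would miss.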

25 CrowdPulse Step 3: Sentiment Analysis

26 Sentiment Analysis: Motivations. Is this content conveying any opinion?

27 Sentiment Analysis: Motivations. Is this content conveying any opinion? This is a crucial issue if people-based findings have to be generated.

28 Sentiment Analysis: Definition. "It is the field of study that analyzes people's opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes" (*). We concentrated on the polarity detection task. (*) Pang, Bo, and Lillian Lee. "Opinion mining and sentiment analysis." Foundations and Trends in Information Retrieval, 2008.

29 CrowdPulse Step 3: Sentiment Analysis. Overall sentiment: :-(

30 CrowdPulse Step 3: Sentiment Analysis. Overall sentiment: :-( The process can be iterated over a larger set of content to gauge how the population feels about a certain topic.
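Polarity detection iterated over a set of content can be sketched as follows. This is a deliberately simple lexicon-based stand-in, since the slides do not describe the actual sentiment algorithm. The word lists and the function names are assumptions.

```python
import re

# Toy polarity lexicons (hypothetical, for illustration only).
POSITIVE = {"good", "great", "hope", "rebuilt", "happy", "thanks"}
NEGATIVE = {"fear", "worried", "bad", "dying", "lost", "afraid"}


def polarity(text):
    """Return +1, -1 or 0 from the balance of positive and negative words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = (sum(t in POSITIVE for t in tokens)
             - sum(t in NEGATIVE for t in tokens))
    return (score > 0) - (score < 0)


def overall_sentiment(texts):
    """Average polarity over a collection; 0.0 for an empty collection."""
    texts = list(texts)
    return sum(polarity(t) for t in texts) / len(texts) if texts else 0.0


posts = [
    "Please, do something: L'Aquila is dying!",
    "I am worried about the situation",
    "Great to see the square rebuilt",
]
mood = overall_sentiment(posts)  # negative on balance for this sample
```

A negative average corresponds to the ":-(" overall sentiment shown on the slide.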

32 CrowdPulse Step 4: Processing & Visualization

33 Use Case: L'Aquila Social Urban Network. CROWDPULSE SETTINGS. How do we map each piece of content to the social indicator it refers to?

34 Use Case: L'Aquila Social Urban Network. CROWDPULSE SETTINGS. Given a fixed set of social capital indicators, we built a classification model that associates each piece of content (along with its sentiment) with the social indicator it refers to.
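One way to picture such a classification model is a small multinomial Naive Bayes classifier trained on labelled examples. The slides do not name the learning algorithm, so Naive Bayes is a swapped-in choice here. The training sentences and the indicator names are hypothetical too (the names are borrowed loosely from the consequences listed on slide 5).

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Fit a multinomial Naive Bayes model from (text, indicator) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def classify(model, text):
    """Return the indicator with the highest log-posterior (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())

    def log_posterior(label):
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(label_counts[label] / total)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((counts[w] + 1) / denom)
        return score

    return max(label_counts, key=log_posterior)


# Hypothetical labelled examples; indicator names are illustrative only.
examples = [
    ("I do not trust the institutions anymore", "trust"),
    ("Nobody trusts the promises about funding", "trust"),
    ("I miss my neighbours and my old friends", "relationships"),
    ("We met old friends at the new square", "relationships"),
    ("This city is still my home", "sense of belonging"),
    ("I feel I belong to this city", "sense of belonging"),
]
model = train(examples)
label = classify(model, "I met my friends today")
```

With this toy training set the new post is assigned to "relationships"; a real model would need far more labelled content per indicator.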

35 Use Case: L'Aquila Social Urban Network. Tweet about new buildings in the city. Social Capital Mapper

36 Use Case: L'Aquila Social Urban Network. Tweet about new buildings in the city. Input: social indicators + classification model

37 Use Case: L'Aquila Social Urban Network. Tweet about new buildings in the city. Domain-specific processing: classification task

38 Use Case: L'Aquila Social Urban Network. Tweet about new buildings in the city. Output: (multi-class) classification + sentiment

39 Use Case: L'Aquila Social Urban Network. Tweet about new buildings in the city. The score of a social indicator is the average sentiment of all the content referring to it.
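The scoring rule stated on this slide is a plain per-indicator average, which a short sketch makes explicit. The `indicator_scores` name, the indicator labels and the sample sentiment values (+1 positive, -1 negative) are assumptions for illustration.

```python
from collections import defaultdict


def indicator_scores(items):
    """Average the sentiment of all content mapped to each indicator.

    `items` is an iterable of (indicator, sentiment) pairs, i.e. the
    output of the classification and sentiment steps for each post.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for indicator, sentiment in items:
        totals[indicator] += sentiment
        counts[indicator] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Hypothetical classified content.
items = [
    ("trust", -1), ("trust", 1), ("trust", -1),
    ("relationships", 1), ("relationships", 1),
]
scores = indicator_scores(items)
```

Indicators with no associated content simply do not appear in the result, which avoids dividing by zero.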

40 Use Case: L'Aquila Social Urban Network. CROWDPULSE OUTPUT. Overall score of the social indicators between March and August

41 Use Case: L'Aquila Social Urban Network. Real-world application of the output: the community promoter monitors the state of the social indicators (CrowdPulse output) and defines some initiatives to empower the social capital.

42 Lessons Learned

43 Lessons Learned. Definition of a framework for real-time semantic content analysis. Pipeline of state-of-the-art techniques: Semantic Processing, Sentiment Analysis, Machine Learning, Data Visualization. Use case: L'Aquila Social Urban Network. Thanks to the huge availability of textual data, very complex phenomena can be analyzed in a totally new way.

44 questions? Cataldo Musto,