Visualization of Corpus Data
|
|
|
- Thomas McCoy
- 10 years ago
- Views:
Transcription
1 MASARYK UNIVERSITY FACULTY OF INFORMATICS Visualization of Corpus Data BACHELOR S THESIS Dominika Talianová Brno, 2014
2 ii
3 Declaration I, hereby declare that this paper is my original authorial work, which I worked out by my own. All the sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source. Dominika Talianová Advisor: Mgr. et Mgr. Vít Baisa iii
4 Acknowledgement In the first place, I would like to thank my advisor Mgr. et Mgr. Vít Baisa for his help, guidance and assistance while working on this thesis. Then I would like to thank my parents, for without their help I wouldn t be where I am, and especially my mother who has supported me the most. And I must not forget my friend Jan Volmut for his help with the administrative part of this thesis and while working on this text. iv
5 Abstract This thesis focuses on corpus data represented in graphical form. More closely, it consists of a recherché on visualization tools and a website created to hold visualizations based on two features of Sketch Engine, namely Word Sketch and Sketch-diff. These visualizations represent collocations and their salience in connection to different lemmas. The data essential for these visualizations are processed with the use of JavaScript and its D3 library in a JSON format and are provided by Natural Language Processing Centre at Masaryk University, Faculty of Informatics in Brno. v
6 Keywords: corpus, Word Sketch, Sketch-diff, d3.js, collocation, lemma, salience, JSON, visualization, JavaScript vi
7 Contents 1 Introduction Corpus data Natural Language Processing Sketch Engine The Word Sketch The Word Sketch Difference JSON Data Visualizations Visualization tools recherché R Prefuse Flare Protovis Processing Raphaël Wordle practical part D3.js up close New features to JavaScript: Accessing the data Sketch-diff visualization Form Obtaining the data Sorting the data Setting up parameters of the SVG... 22
8 4.3.5 Assigning place in the grid Visualizing the data Sketch-diff in its full size Word sketch visualization Form Obtaining data Sorting the data Visualizing the data Putting it all together Sketch-diff page Word Sketch page Conclusion
9 1 Introduction Visualization of data of whichever type makes the data easier to take in. It is no difference with corpus data. There is a large amount of techniques to extract corpus data from, such as dictionaries, thesauruses, collocations, frequency statistics, etc. For easier study of natural language, sorted data can be easier to look at in graphical form. One of the purposes of this thesis was to research the methods used to visualizing corpus data. The second part consisted of creating particular visualizations, which are possible to access through a website created for this purpose only. The data essential for these visualizations were obtained from corpora that are available at the Natural Language Processing Centre [see section 2.1]. To be exact, these visualization are based on two of the features of Sketch Engine [read more on Sketch Engine in section 2.2], namely Word Sketch and Sketch-diff. The web site allows user to select from two corpora. The first is a corpus of English language named Araneum Angelicum Minus, and the second one, called Bruna Bohemica, is a corpus of Czech language. Then, user can chose the word they want to be visualized and the number of words to be visualized. The visualizations are interactional and also provide the user a direct link to Sketch Engine s version of these features. The thesis is practically divided into theoretical and practical part. The theoretical part serves as an introduction for the next part, contains information on corpus data and the research on existing methods in data visualization. In the practical part is a short description on a JavaScript library used to create the visualizations called D3, and then there s a description of how the particular visualizations were created and how the website, for both of them was implemented. 3
10 2 Corpus data The word corpus, in plural - corpora, comes from Latin and means body. In the connection to linguistics it represents large and structured collections of texts taken from spoken or written language and composed in a certain language. Corpora can be monolingual (containing text in one language) or multilingual (containing text from multiple different languages). Basically any collection of words can be called a corpus, but in modern linguistics, it is not that simple. Today, the term is often used to refer to systematic text collections that have been computerized. Where systematic means that the structure and contents of the corpus follows certain extralinguistic principles [1]. The corpora usually consists of rather large amount of data (millions or sometime even billions of words) therefore they need to be electronically stored and processed. Each corpus has its own specifications, but there are some properties that are common for all corpora. For easier use and more useful linguistic research, the corpus is subjected to a process, called annotation. This means that some particular metadata is attached to every word of the corpus. These metadata are also attached to every grammatical relation or cluster of given words. For example, each word is given a particular part of speech tag (e.g. noun, verb, adjective, etc.). The words that appears in these corpora are called lemmas ( lemma in singular), or headwords. Lemma is a canonical form, dictionary form or citation form that represents the inflected forms of a set of words. (e.g. in corpus lemma do represents the inflected forms of does, doing, did and done ). Lemmas are mostly significant for highly inflected languages, such as Spanish or Czech. As a result of processing corpus data can be considered another element useful in analysing the data called concordance. Concordance shows how the given lemma changes according to context of the text it is related to. 4
11 2.1 Natural Language Processing Natural Language Processing is a field of computer science that focuses on linguistics, interactions between humans and computers, and its main focus is natural language understanding. One of the challenges characteristic for natural language processing is teaching computers to learn and understand human speech. It is a branch of artificial intelligence that is similar to the area of human-computer interaction. Collecting and utilizing lexical data, creating and further improving the corpora and text analysis are the ways to obtain these goals. In Natural Language Processing Centre (later on only NLP Centre) at Faculty of Informatics, Masaryk University in Brno 1 the research is directed at theoretical and applied results in corpus linguistic, lexical databases, representations of knowledge, meaning representation of expressions of natural language and utilizing the methods for machine learning leading to disambiguate the corpus data. The research here, involves the development of several projects, such as for example Sketch Engine. 2.2 Sketch Engine Sketch Engine is a web-based program with a number of language-analysis functions. It is a Corpus Query System, taking as input corpus with appropriate level of linguistic mark-up of any language. Sketch Engine is a product of a small research company called Lexical Computing, founded by Adam Kilgarriff in 2003 and is, since then, being developed by Mgr. Pavel Rychlý Ph.D., at NLP Centre at Masaryk University in Brno [2]. The core language-analysis functions are the Concordancer and the Word Sketch. The concordance is the basic tool for anyone working with a corpus showing the raw data, the analysis is based on [3]. It displays corresponding 1 NLP Centre at Masaryk University, Faculty of Informatics in Bnro < 5
12 concordances for chosen lemma, with the possibility of adjusting the output by user s needs. (I.e. user can decide, for example, what query type he wants to search for. Whether it s a simple word, phrase or lemma, etc.) The output in this case is an index of sentences, or parts of sentences, showing the contextual occurrence of the chosen lemma. The word sketches are corpus-derived summaries of a words grammatical and collocational behaviour [see more in section 2.2.1]. Sketch Engine provides corpora in different languages, from Arabic, through English, Greek, Russian to Vietnamese, Welsh and many others. With the exceptions of few, these corpora are mostly accessible to registered users only, but regarding basic research, every user (with, or without, an account) has the same possibilities to see and try out several features that Sketch Engine provides for each corpus (some of these corpora are quite large as it can be seen on figure 2-1 as they may contain several millions of words). For full understanding and easier use of Sketch Engine s many functions, there s a user manual, where is explained every aspect of each individual feature and its property 1. Among these features belong, for example, Word sketch and Word sketch difference. Figure 2-1. Open corpora preview 1 Getting Started with the Sketch Engine < 6
13 2.2.1 The Word Sketch Word sketch is, along the Concordancer, one of the two basic features that Sketch Engine provides. It is a one-page, automatic, corpus-derived summary of a word s grammatical and collocational behaviour [4]. Word sketch uses statistic and lemmatization based on the use of grammar patterns. Meaning, it considers the aspect of each grammatical relation the word participates in. The full helpfulness of the Word sketch is seen on a large amounts of data, since Words sketch for a rare item can be analysed by hand easily enough. Word sketch takes as an input one lemma and based on the corpus it belongs to, requires the adjustment of a part-of-speech (PoS) factor, which tells the programs what lexical category the lemma comes under. Word sketches working with English corpora require the attribute lempos, which is a combination of the lemma and the PoS tag. (E.g. in English, the word work can be taken as a verb, and also as a noun, the lempos attribute creates a string that looks like work-v (meaning verb) or work-n (meaning noun) depending on what PoS tag was selected. By this suffix it can be easily decided what lexical class the lemma belongs to. While working with Czech corpora this parameter is not required.) Word sketch also offers additional settings, which can define the minimum frequency (the lemma is used in collocation with a particular word), minimum score (which shows the salience score for the collocate), set the maximum number of items displayed in each grammatical relation or for example select which grammatical relation the user wants to see. User can also choose between monolingual or bilingual word sketches. The output, based on the selected parameters, consists of a table or set of tables. Each table represents one grammatical relation and displays the collocations (i.e. the word from the collocation with the given lemma). The given lemma can be called a node word as it is connected with every line of the tables into collocations. Every single word has its score and frequency. By clicking on the frequency, the user is redirected on a page where they can see the concordance matching the node word and its chosen collocation. 7
14 What follows (in figure 2-2) is a preview of Word sketch. This is an example of the lemma mother in an open corpus of English language (Araneum Angelicum Minus). More about Word Sketch can be found in an article by Vladimír Benko called Compatible Sketch Grammars for Comparable Corpora [5]. Figure 2-2. Word sketch preview As it can be seen, the given lemma is in the left upper corner, followed by its lexical category, then there is displayed the name of selected corpus and general frequency it occurs in the corpus. The maximum number of items, was limited to ten and each table represents particular grammatical relation, where every word in connection with the node word creates a collocation. The collocates are sorted by their salience (score). The frequency value, supplies a direct link to concordance. The example of concordance for the collocation mother grandfather is shown in the small fragment in figure 2-3. Figure 2-3. Concordance preview 8
15 2.2.2 The Word Sketch Difference The Word sketch difference (Sketch-diff) displays the variations between words. Whereas it can illustrate the contrasts between total opposites it is mostly useful for its ability to help determine the difference between nearlysynonyms or synonyms. The input of Sketch-diff are two lemmas. For some corpora the part-ofspeech parameter is required, and therefore both lemmas have to be of the same lexical category. There are some advanced options to select from as well, such as minimum frequency, maximum number of items displayed in one grammatical relation or a user can choose to display the collocations for both lemmas in one box or in separate boxes, where the output is divided into three boxes (one for Common Patterns, one for first lemma only pattern and one for second lemma only pattern ). The boxes (tables) are further divided into grammatical relations. Both options show collocations of the lemmas with other words. Here, both given lemmas are called node words, as they represent one part of the collocation shown in the tables. In every table and on every line (i.e. for every word of the table) there is information about frequency it is used in collocation with both of the lemmas, as well as score. By clicking on the frequency one can see the concordance for that certain collocation. It can occur, that a word doesn t have a collocation with one of the node words, and the score and frequency are null in that occasion. For better imagination and orientation in the tables, the collocations are aligned and coloured, so that the words with higher score in the collocation with first lemma are green and due to the bottom of the table, and on the top are words coloured red, with higher score in the connection with the second lemma. Words that are used almost evenly or evenly with both node words are in the middle of the table and are coloured white. 9
16 Figure 2-4. Sketch-diff preview This example, shown in figure 2-4, was made from an open corpus of English language, (Araneum Angelicum Minus). The chosen headwords are lemmas work and job and the lexical category is non-verb. Collocates with higher salience in connection with work are at the bottom of the tables and are coloured green. And due to the top of the tables are red coloured collocates with higher salience with job. This sketch-diff has a limited number of items displayed in each grammatical relation (i.e. in each table). From this example, one can observe, that even though they are synonyms, some collocations are exclusively used in connection to only one of them. Words like progress are used only in collocation with work in progress in work and can hardly be ever seen in progress in job. As can be seen in figure 2-4, some collocates are evenly or almost evenly used with both headwords. These are the ones in the middle of the tables, and are in white background. As the difference between the words decreases, the colours are more evenly distributed. 10
17 2.2.3 JSON All corpus data from Sketch Engine are obtainable in JSON format. JSON stands for JavaScript Object Notation 1 and it is a lightweight interchange format. JSON is easy to work with for both humans and computers and it can be used as an input or output format. 1 Web JSON Documentation: < 11
18 3 Data Visualizations To visualize information in general means creating images, diagrams, or animations, mainly to prove a point or to highlight some interesting or important part of message. In connection to corpus data, the data used to visualize consist of corpora. The visualization then displays the amount of use of each word or relationships and associations between words. 3.1 Visualization tools recherché There's unlimited amount of data to be visualized and different ways to do it, with the usage of different programming languages, namely object-oriented, supported by special libraries R One of the many ways to visualizing data is by using R. R is an environment and language that computes statistical data and produces graphical output. It was created by Ross Ihaka and Robert Gentleman in [6] It is a GNU project, similar to S language and environment, and is licensed as an Open Source. By the Google Trend Chart it is, to this day, considered one of the most closely visualization tools associated with data visualization [7] Prefuse Other option is to use the Prefuse visualization toolkit, which is a set of software tools for interactive data visualizations. Prefuse uses Java 2D graphics library and requires Java plug-in in web browsers. A project based on Prefuse, called Enron was created by Jeffrey Michael Heer to present a data visualization system integrating automated applied natural language processing techniques [8]. 12
19 3.1.3 Flare Then there's an ActionScript library called Flare, for visualizations that run in the Adobe Flash Player. Flare is used for example for BBC News' to map "Top 100 sites on the Internet" 1. A combination of Prefuse and Flare was used at the IBM Visual Communication Lab for the Many-Eyes visualization service [9], which can be used for visualizing corpus data, as we can see in Jeff Clark's Word Tree visualization [10]. Many Eyes service however require a Java plugin so later was created a Many Eyes.js library which provides interactive data visualizations using the native web technologies of JavaScript and HTML5 [11] Protovis Based on experience from developing Prefuse and Flare, Heer, Michael Bostock and Vadim Ogievetsky created Protovis 2, which was released as an opensource under the BSD license (i.e. the code can be edited, published or used for commercial and non-commercial purposes) and uses JavaScript and SVG for web-native visualizations. The development of Protovis was stopped in 2011 to focus on developing a new visualization library, D3.js. [read more about d3.js in section 4.2] Processing Another approach is by using Processing, which is a programming language, development environment and online community [12]. It's based on Java, allows to make interactive visualizations and is free to use. It is used, for example, in a project called 3D Dewey Data Visualizations by Syed Reza Ali, which explores the topics of 3D Space, particle systems, OpenGL and java, alpha blending, bill boarding, user interactivity, self-organizing algorithms (Kohonen), and electromagnetic attractions & repulsion [13]. Processing.js is the sister project of Processing and is designed for web, allowing the code to run 1 flare Data Visualization for the Web < 2 Protovis at GitHub < 13
20 by any HTML5 compatible browser [14]. An example of using Processing.js is a Wiki visualization by Matt Ryall 1. In this visualization is also used a JavaScript library called Raphaël Raphaël Another way of approach is by using Raphaël, small JavaScript library, which purpose is to simplify the work with vector graphics on the web. Every graphical object created in Raphaël is a DOM as well, so it can be attached event handlers or modified later [15]. It is developed by Dmitry Baranovskiy and is licensed under MIT Wordle In concern with corpus data, it may be interesting to mention a project named Wordle 2. It is was created by Jonathan Feinberg and is based on Java 2D API, therefore requires Java plug-in installed in a browser. Wordle allows free and easy word cloud visualization of any text found online and submitted through URL or manually typed right in to the form of the application. 1 Wiki Visualization < 2 Wordle < 14
21 4 practical part One of the purposes of this thesis was to create a practical tool that visualizes particular data (i.e. displays given data in graphical form). The data, in this case, are taken from the corpus of English language called Araneum Angelicum Minus and Czech language corpus called Bruna Bohemica. The visualizations are built on the functions of Sketch Engine, namely Word Sketch and Sketch-diff, and provide a different look at the data obtained through these features. The source code that creates the visualization is written in JavaScript and uses the D3.js library, while the data is accessed by php proxy. 4.1 D3.js up close D3 stands for Data-Driven Documents 1 and the name is an allusion to the W3, World Wide Web, which serves as an implication of technologies underlying the D3 itself [16]. D3 is a JavaScript library with improved support for animation and interaction. It is built on many of the concepts of Protovis including the usage of wildly implemented Scalable vector Graphics (SVG) and JavaScript as well as HTML5 and Cascading Style Sheets (CSS3) standards. Therefore, D3 doesn't require any special plug-ins just a modern web browser. D3 can be considered a rather young library as its development was noted only in It was released as an Open Source and is licensed under BSD on GitHub [16] New features to JavaScript: One of many improved and simplified aspects is, that D3 allows to bind data directly to DOM (Document Object Model) elements in the web page. This is done through functions called selections. What previously had to be done by 1 D3 < 15
22 document.getelementbyid(elementid) can be now done by simply selecting the element by calling d3.select( #elementid ) or to call more elements, one can use d3.selectall( #elementid ), which doesn t require the use of loops and such methods. These selections can be used to create a new element, that doesn t exist in the DOM as of yet, by applying a specific selection.enter() or to remove element by calling.remove(). Selectors are supported by modern web browsers without bigger problems, and solution for older browsers is provided in the form of a JavaScript CSS selector engine called Sizzle 1. D3 also introduces the possibility of using dynamic properties, meaning styles, attributes and other properties can be specified as functions. D3 provides large number of graphical layouts. Each one of them has its own defined properties and attributes. These layouts don t provide an already created visualization, but transform the data according to the needs of the specific attribute of given visual task. Some of the most common layouts are: Pie, Force, Treemap, Tree, Partition, etc. It was created and is being developed with the goal to process large amount of data, so it is extremely fast and supports dynamic behaviours for interactive visualizations and animations. And therefore the input can be in a JSON format or even a geojson, which is a specific use of JSON that provides encoded geodata for web applications. For this purpose, D3 has its own variation of a function to load JSON encoded data. It is d3.json(). This function requires an input in the form of an external script or using XMLHttpRequest, or XHR. This allows the data to load asynchronously and the rest of the page is displayed while the data is loading. Second part, that is for d3.json() obligatory is a callback function. The code that depends on the loaded data, should be contained within the callback function. 1 Sizzle < 16
23 Usage of the D3 library can be seen, for example, in the D3 gallery on GitHub 1. A great example is work by Daniel G. Taylor, who made a visualization of The Holy Bible contradictions 2. It's a whole complex of website with interactive visualization with active links. This project was inspired by the Reason Project's poster of biblical contradictions 3, which was in turn inspired by Chris Harrison's Bible Visualization 4. Concerning corpus data, Harrison made a visualization called Web Trigrams: visualizing Google's Tri-Gram Data. Trigrams are parts of sentences consisting of three words, taken from the Google corpus. The visualization shows interesting comparison between the first words and the following words for each one of them [17]. A more often data visualization example is a Word Cloud. For example, by Jason Davies Accessing the data The data is loaded by d3.json(), which as an input takes an URL. To avoid problems with its asynchrony [see section 4.1.1], data is accessed with the use of a short, simple php script, utilizing php s proxy properties. The use of a php script also prevents possible cross-domain limitations. In the file proxy2.php is defined simple $_GET method to retrieve the URL form the main script. As the URL is handed over in its encoded form, in the php script it needs to be decoded, using function urldecode(). This helps profoundly in manipulation with words from Czech language. In this script, it is also specified, that the output is in JSON format. 1 GitHub D3 Gallery < 2 Visualization of The Holy Bible contradictions < 3 Project Reason < 4 Chris Harrison < 5 Jason Davies Word Cloud < 17
24 4.3 Sketch-diff visualization This visualization shows the difference between words and their collocations. It is based on the Sketch-diff feature of Sketch Engine [see section 2.2.2], and therefore it preserves most of its properties. To partially match the appearance it is coloured in white with shades of red and green. The onlooker can select between Czech corpus and English corpus. Based on selected corpus, they then write in first and second lemma. If the English corpus was selected, the tool requires the determination of an appropriate lexical category, which has to match both given lemmas. Then there s the possibility to limit the number of collocates to be visualized, based on limiting the number displayed for every grammatical category. This limitation is not required and, while not specified differently, the default number is ten. The layout of the visualization also, to some extension, mirrors Sketch Engine s version of Sketch-diff. The lemmas are at the top and bottom, posing as opposite poles, while their collocations are displayed between them, and arranged due to their salience in collocation with the first, or the second lemma. The salience is also shown in the font size of each word (i.e. the bigger is the salience, the bigger is the size of font). The collocations are divided into seven individual categories, due to the class attribute, obtained from loaded data. This matches their salience and frequency of use in connection to first or second lemma, too. In addition, it is also possible to access the concordance of every displayed word by clicking on it. By clicking on one of the headwords, one can go to Sketch Engine s version of Sketch-diff. The basic idea of this visualization is an abstract grind, where each field is reserved for one word. This guarantees that the words of the same salience, won t be written over each other, and therefore, the final result will be readable and every visualized word can be accessible, by clicking on it. 18
25 4.3.1 Form The attributes that are obligatory for the visualization and affect it, are entered by the user into a form at the top of the web page of the particular visualization. The form is then processed in an external JavaScript file, called myformsd.js, in function parseform(form). In this file is the processed the whole code for sorting the data and the visualization. Here (in function parseform(form)), each attribute is firstly tested for containing any unacceptable value. If such value is found, the user is warned by a pop up box alert(), which stands the possible problem. This also happens if user enters values, by which it is impossible to find the corresponding sketch-diff, for example if one selects corpus of Czech language and as lemma enters a word in English. By sorting the individual attributes obtained in the form, parse- Form(form) assigns them values that are recognizable for Sketch Engine in a query. These attributes are: a) corpus representing language corpus, it can possess values: en_araneum_minus, or cz_bruna depending on the chosen corpus. b) lemma1 first lemma, for Sketch Engine recognizable as lemma c) lemma2 second lemma, in the query detected as lemma2 d) lempos for Sketch Engine, known as lpos, obtaining the value of part-of-speech attribute. The value being -x for non-verb words and -v for verbs. It is possible to select blank option, which is sufficient only if the Czech corpus was chosen for the query. To prevent human errors its default value is an empty string, and only if the English corpus is chosen, the user is alerted by a pop up box, informing them about their mistake. e) maxcommon representing maximum number of collocates in one grammatical relation, in the query represented as maxcommon. This is needed in the query, but it is not obligatory to fill in the value by user. If they chose to leave the field empty, the default value is
26 Once, the attributes are assigned their value, they re used to compose the query referring to the specific sketch difference in Sketch Engine, with the help of the proxy2.php script file. The data is obtained in JSON format, by adding &format=json at the end of the query and are handed over to method urlencode() to help with parsing additional characters (e.g. some of the Czech alphabet letters). The output of this external script is a string, consisting of proxy2.php?url and the generated link from above Obtaining the data The data is obtained through d3.json(), which as an input takes the returned values of the parseform(form)function and makes a query to Sketch Engine, (demanding sketch difference characterized by the values included in the query) and the values are handed over to the callback function as data Sorting the data The acquired data is in JSON format. More accurately, it is an object, containing objects that contain objects or arrays of objects, or objects of arrays of objects. Example of this can be seen in figure
27 Figure 4-1. Sketch-diff data in JSON format preview To work with data with this kind of hierarchy can prove rather difficult and for the purpose of the visualization the important part is located in the Rows array. Because of that, the needed data is stored in a variable com representing the common key from data. As of now, it is an array of objects of arrays and it is still not simple enough to access individual keys and their values. 21
28 Therefore, new variable called newobject is introduced, and represents a simple array of objects. Each object in this array contains: a) grammatical relation of each word b) the word from the collocation c) frequency for each word d) attribute seek for each word (important for creating link to concordance [see end of section 4.3.6]) e) colour, containing string color, for now f) rank of each word, representing the salience of the word in relation to the lemmas g) category, representing the class attribute, used in further division of the data To guarantee that the position of a word, in the final visualization, will be moved on the y axis due to the salience, the data from newobject is divided into seven separated arrays due to the category attribute. This arrays are named from height1array to height7array. In this part, the categories are also assigned their colour. Arrays height1array to height6array represent a range from the top to the bottom, where height1array is at the top and is coloured green, and height6array is at the bottom and is coloured in red. Array height7array is the middle part and is coloured white. The arrays are then connected into single array of objects, by function concat(). The new array is called joineddata and has the same attributes as newobject but the order of the objects is sorted so that each can be assigned their place in the grid easily Setting up parameters of the SVG Parameters here, represent the aspects that affect the final visualization. The height and the width is in this case set to a given number and it doesn t change, but the count of words that need to be visualized affects the general look and readability of the visualization. 22
29 Function setparameters(newobject) is where this is taken care of. It takes as an input newobject because it needs to know the count of objects in the array. The count is store in variable temp for easier manipulation through the function. There are given 4 possibilities that can occur and they are distinguished by the count of words to be visualized. In each option is set different number of columns (variable colcount). This helps to determine the count of rows (rowcount), the height of each row (rowheight) and the width of each column (colwidth). The rows are counted as count of words divided by the count of columns. In case the count of the words is an odd number, function Math.floor() is applied to this, and to ensure that no word will be left out there is 1 line added to the rows. To set the variables colcount, rowcount, rowheight and col- Width is the purpose of this function Assigning place in the grid The place in a grid of every individual word is set by variables ypos indicating its position on y axis (starting at value of rowheight + 40), xpos referring to its position on x axis, and index which helps later with keeping the index of the word, that already has set its position. This positions are modified and allocated to the objects in the function preparedata(joineddata). This function takes joneddata as an input, assigns them to temporary variable newdata, while going through two loops, assigns each object its position. The first is while() with the condition at the beginning and it is set so, that the position on y axis will be increasing, until it reaches the height of the SVG minus 40 pixels (the reason for this will be explained later). Inside the while() the function computes words position on the x axis as width of a column multiplied by the current number of the column. All the x positions are moved a little bit to the right, by adding padding to the final calculated x position. This is repeated while index doesn t reaches the length of newdata. Inside the inner loop are these positions added to a new variable called usabledata. To this array are later pushed two, manually created objects with the same keys. These objects represent lemma1 and lemma2. Their position is outside the grid and is computed separately. The x axis position is set to be 23
30 the exact middle of the whole width of the SVG. It is done as width/2 (lemma1.length/2). The same principle applies to lemma2. Using just width/2 would make the word start at the centre of the x axis, but, since the word doesn t consist of only one point, in the final effect it would be still closer to the right edge. That is taken care of by shifting it to the left by half of the length of the given word. Concerning the y axis, it is set to be at 40px for the lemma1 and 5px above the bottom edge of the SVG (the 5 pixels at the bottom serve as a padding). Both words need empty space, and that s the reason for the ypos to be set to rowheight + 40 and for the condition to control, if the height didn t cross height-40 boundary Visualizing the data First of all is used function to remove the SVG, even if there is none to remove yet. This makes sure, that when the script is called the second time (when someone wants to visualize some different or even the same data again) the new SVG will be created in the place of the old one, and not under it. The creation of new SVG is a series of basic chain of commands from d3.js library and it is appended to an already prepared DOM element visualization. Which is, right under it, appended shape of a rectangle that serves as a background for the visualization. The actual visualization is then created as text with input data usable- Data. The text displayed is taken from the object s key word and there is an exception for lemma1 and lemma2, making it so that they acquire the values entered in the form at the beginning. What follows is a set of attributes. The x, y and fill (colour) attribute are extracted from the objects from usable- Data. The font-size is set with the help of a function called setrangeedges() and d3.scale.linear(). The size of font depends on the size of the box the word is placed in. As the d3.scale.linear() works with two parts, first it takes an input domain of values and then needs an output range. In this case, the domain is limited with the minimum and maximum score. These values are calculated with d3.min and d3.max. They take as an input an array of 24
31 numeric values, and therefore it is easier to create a new auxiliary array of score values, taken from the newobject. As these data was had a string values it needed to be parsed and converted into numbers, before it could be used with the functions maximum and minimum. Then the range is calculated in the setrangeedges() function. The font size alters by the score for all words but lemma1 and lemma2 which have the size set to 40px. The size of other words is handled as sizescale(d.rank) depending on the higher score. The interactivity of the visualization is assured by mouseover function, which helps the user to orient in the text by enlarging the word, the cursor is currently hovering over. By clicking on the word is, in the reserved place above the visualization, displayed link leading to a concordance of the chosen word. This is done with the help of attribute seek. Each word has two seek attributes and the function chooses the one with higher score. With this helps the function return- MaxScore(d). By clicking at one of the headwords, one can create link leading to Sketch Engine s Sketch-diff. This takes is taken care of by the function createlink(d). This is shown in figure 4-2 on next page, where mouse cursor is hovering over the word creative and it was clicked at, so in a space over the visualization can be seen link, leading to concordance. In the picture below, in figure 4-3, is shown what this particular concordance looks like. 25
32 Figure 4-2. Sketch-diff visuaization preview Figure 4-3. Concordance for the words work and creative preview 26
33 4.3.7 Sketch-diff in its full size The bigger the amount of data to be visualized, the harder it becomes to read and decipher the words in the main visualization. For this reason, there is a zoom version available on the site. Clicking at a button show that will be created after executing function visualize() will open new tab, where the visualization is displayed in its full size. It has the same input aspects as the main visualization. These attributes are handed over from the main page with method get and then parsed in external script myformsd2.js. As it is taken care of in the main visualization, there is no need to check for human errors like unfilled obligatory attribute (e.g. partof-speech tag). The major difference between this visualization and the main one, is the function that computes the width and height aspects of the final SVG. While in the main one, set values are width and height, here are set attributes: col- Width and rowheight. Again, there are four possibilities to arrange the data and in each the width is computed as colwidth multiplied by colcount and the height is calculated as rowheight multiplied by rowcount. 27
34 Figure 4-4. Separated Sketch-diff for maximum of 70 words per grammatical relation preview What can be seen in figure 4-4 is a Sketch-diff visualization with maximum number of words in one grammatical relation set to 70. It is clear that such visualization isn t readable at all. Because of this, there is the possibility to open it in new tab. This can be seen in figure 4-5. (This picture had to be split into two as it is rather large.) 28
35 Figure 4-5. Enlarged Sketch-diff visualization opened in new tab preview 29
36 4.4 Word sketch visualization This visualization displays the salience of words in collocations with given headword (lemma). It is built on the Sketch Engine s Word Sketch, but in comparison to the Sketch-diff visualization [section 4.2] its appearance is completely different. This tool as an input requires to specify corpus, headword and the number of the words to be visualized. If the corpus of English language was chosen, it is obligatory to select part of speech corresponding to the given lemma. In the final result, an onlooker can see a colourful ring, surrounded by visualized words. The layout used in this visualization is called pie and the type of this pie is a donut (because of the empty space at the centre). In the centre of the ring is displayed the given headword. These words are chosen by their salience, making the first one at the top of the ring, the one with highest score and continuing clock-wise to the end of the circle. The various colours represent grammatical relation for each word Form The attributes necessary for the visualization, are also (as they are in Sketchdiff) entered into a form at the top of the page. The whole visualization is executed by clicking on the button visualize which, in return calls function submitparamters(form). First of all, the function submit(form) is called. Here, as well, is firstly taken care of the possibilitiy that user leaves some obligatory form field empty, or the value filled in isn t correct [see section 4.2.1]. The major difference is that while Sketch-diff needs two headwords, Word sketch requires only one, and therefore, only one lemma is entered into the form. Then user enters the number of words they want to be visualized, which is limited to a range from 10 to 200. In this function is, as well, composed the link for Sketch Engine s query. The link is stored in a variable inputlink and is then encoded and connected with a string proxy2.php?url= in another variable called inputdata. To set this variable (inputdata) is the purpose of this function. 30
37 4.4.2 Obtaining data The data is obtained through d3.json() function with the assistance of the external proxy2.php script. The php script decodes the link and creates a query for corresponding Word Sketch in Sketch Engine. This technique of creating the particular query prevents the d3.json() to be executed asynchronously and limits possible cross-domain problems, while the urlencode() and urldecode() functions make sure, there isn t a problem with processing letters from Czech alphabet Sorting the data The acquired data is in JSON format and its hierarchy is quite complicated for easy use in other functions. The original data represent an object of arrays of objects that contain another arrays. This is shown in figure
38 Figure 4-6. Word sketch data in JSON format preview. To make the access easier, the data is stored in variable gramrel. Later is created a new variable called words with simpler hierarchy, as it consists of an array of objects. An object is pushed into this array only if it has a word key. Every object then contains the same set of keys and they are: a) gramrel signifying grammatical relation of the word b) color colour belonging to the specific word based on its grammatical relation (How are the colours chosen will be explained later.) c) word word participating in the collocation d) score salience of the word e) seek necessary to create the query for concordance [see more at the end of section 4.4.4] The colours are manually chosen and entered in an array named colors in hexadecimal code. On this array is then applied function gramrelcolor() 32
39 which uses d3 s d3.scale.ordinal and as an input domain takes an array of grammatical relations (gramrels) and as an output range takes the array of colours. The final visualization needs to visualize only the first n values with the highest score ( n being the maximum number of words to be visualized in the form at the beginning), so these objects are then sorted, based on their score. Beginning with the highest score to the lowest. As the final part of sorting the data, is created yet another new variable to store the data that will be used in the visualization. It is called newwords and takes only the first n words from the sorted array words Visualizing the data As first, to ensure the new visualization would always replace the old one, is used a function remove(). The visualization is based on the example posted on GitHub 1. After that, new SVG is appended to the visualization element, already existing in the DOM. It has set width and height and its centre is moved to the middle of the SVG. Then a rectangle is appended to the SVG, serving as a background for the visualization. It has the same properties as the SVG. After that, the text of the headword (lemma) is printed in the middle. The size of font for the headword is calculated in a function lemmafontsizescale(). It takes as an input the smallest possible size of the inner radius and its possible biggest size and as an output the range from 15 to 40. This range represents the font size of the lemma based on the size of inner radius. Next follows the function to set the inner and outer radius of the final ring. For calculation of the size of the radius is used the function radiusscale(). This function uses d3 s linear scaling and as an input is given the minimum and maximum possible number of the words that are to be visualized. Then this scale is applied to outerradius from which is later computed inner- Radius. The inner radius is computed as 4/5 of outer radius. Once the radius 1 Simple D3 Pie Chart with Magnitudes in Arcs and Legends Outside and Along Arcs < 33
40 is set, the pie layout is applied its aspects. This particular layout has its wedges divided into even pieces. Their count depends on the number of words given to the visualization. This is assured by an anonymous function that returns value of one for every component of the data. The sort attribute is set to null which, therefore, defaults to sorting the order in the pie clock-wise with the biggest value at the top. In another part are prepared the wedges (the arcs of the pie). By using data(pie) the layout is given the piefied data, where each arc contains its stratangle, endangle and value. Each arc is appended a g element of the DOM and every slice has a class wedge. In d3, anything that doesn t have ordinary shape (such as rectangle or circle) can be drawn using path. This is now used to draw the actual arcs. After that, each arc is assigned its own colour, depending on the grammatical relation. The main part of the visualization consists of the text appended to the SVG. The words are written around the ring, in an angle corresponding the angle of the arcs. This is assured by several attributes. At first there is attribute textanchor telling the visualization on which side of the ring the words is situated. This function is taken from a question on stackoverlow 1. For this part to work, it is necessary to calculate the length of the arc. This is based on the Pythagorean Theorem. This makes sure, that each word starts at the end of an arc and keeps its angle and is still readable. The words are then rotated by function angle(d), which. This function is used to convert from radians to degrees as well. What follows next is the fill attribute given the colour from the data. Attribute font-size changes its value due to the salience of the lemmas. It is computed and stored in variable fontsize, which is later used in font- SizeScale to scale the size, making it largest at the top of the ring and continuously dropping off to half of the original size. The scale function is based on the count of words (objects) and in visualization is represented as the index i of the word (fontsizescale(i)). The basic idea is that it calculates the circumference of the ring based on its outerradius that is then divided by 1 Rotating text labels around pie < 34
41 the count of the words. To make sure the word wouldn t be bigger than the place reserved for it, on this result is applied function Math.floor, helping in case of non-integral division. The 4px are added just to make the words more readable. As means of interactivity there is couple features used. The words enlarge while hovering over them with mouse s cursor. When a word in the visualization is clicked on, the link to its concordance is displayed above the visualization. If the headword is clicked on, the displayed link leads to Sketch Engine s Word sketch. Example can be seen in figure 4-7, where the word widowed is clicked on and link to its concordance is displayed above the visualization. Figure 4-7. Word Sketch visualization with clicked word preview What follows are three pictures of Word Sketch visualization with different number of words to be visualized. (figure 4-8, figure 4-9 and figure 4-10) 35
42 Figure 4-8. Word Sketch visualization with 10 words preview 36
43 Figure 4-9. Word Sketch visualization with 60 words preview Figure Word Sketch visualization with 150 words preview 37
44 4.5 Putting it all together Both visualizations are connected placed on a website 1. This website allows the onlooker to read a little about these visualization, while it provides a possibility to try out the particular visualizations. The layout of the web isn t complicated and consist of a head, body and footer. Under the head is a simple horizontal navigation bar with links to: a) home - home page containing some basic information about corpus data b) about - displaying the information about these particular visualizations c) contact providing the contact address of the author and links to Sketch Engine s and NLP Centre s websites d) Sketch-diff providing the Sketch-diff visualization e) Word sketch providing Word sketch visualization. The actual web consists of 6 html files, because to the main Sketch-diff belongs an additional Sketch-diff which can be opened separately in different tab. The web also uses external CSS file and a reset file generated at meyerweb Sketch-diff page The first thing seen on the Sketch-diff page is a little information explaining how to manipulate with the visualization and how to fill in the form at the beginning. This includes some basic rules that need to be complied, otherwise it won t be possible to create the visualization. Next follows the actual form, where user selects corpus, enters first and second headword, selects the part of 1 website vizualizace_korpusovych_dat < 2 CSS Tools, Reset CSS < 38
45 speech for chosen headwords and writes in the maximum number of items in one grammatical relation. For drawing the visualization based on these parameters serves the button visualize. On the page, under the form, is also a space reserved for link to the concordance or Sketch Engine s Sketch-diff, which is written out after clicking on the visualized word. Here will, after executing the visualization, also appear the button allowing to show this visualization in a new tab in its original size. This can be observed in figure 4-11 and figure Figure Sketch-diff web page form preview 39
46 Figure Sketch-diff web page visualization preview 40
47 4.5.2 Word Sketch page First thing that can be seen at this page is also a bit of information, explaining how to handle the visualization and what the particular parts of the visualization mean. Here are, as well, written few simple rules for the visualization. Then, there is the form where user can select the corpus, enter the headword, select its lexical category, and choose the number of words they want to be visualized. This all is executed by clicking on the button visualize. Under the form is a reserved space for a link to the concordance or Sketch Engine s Word Sketch. This link will appear once the visualization is executed and the user clicks on a word in it. This can be seen in figure 4-13 and figure Figure Word Sketch web page form preview 41
48 Figure Word Sketch web page visualization preview 42
49 5 Conclusion The final result of this work is a website. This site allows users to use two of Sketch Engine s features, namely Sketch-diff and Word Sketch. The manipulation with these features is more intuitive than on Sketch Engine and the result is easier to look at. Even though it doesn t display so much information at first sight, this data is included in the visualization itself. For Sketch-diff it means that the size of each word mirrors its score, and the position of the word itself can tell with which headword the lemma is associated more often. Considering the Word Sketch visualization, the most outstanding aspect is that words that are visualized are not from one grammatical relation, but are selected as the words with highest score from all the relations on Sketch Engine s Word Sketch for this particular query. Their score is also visible in their size. If a user requires to see it in more ordinary way, they can easily get to the original feature on Sketch Engine through the headword. These visualizations use 2 different language corpora, but it is not difficult to add more corpora in the future or change the conditions on which the words are visualized. The same way can be changed the information displayed after clicking on the word, or while hovering a mouse over it. 43
50 Bibliography [1] Nesselhauf, N.: Corpus Linguistics: A Practical Introduction, 2005 [retrieved 5/15/2014], from < hauf/files/corpus%20linguistics%20practical%20introduction.pdf> [2] The Sketch Engine, SketchEngine, 2003 [retrieved 5/15/2014], from: < [3] Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P.,Suchomel, V.: The Sketch Engine: Ten Years On, 2014 page: 7, in print [4] Lexical Computing, Ltd.: Sketch Engine features, Sketch Engine, 2003 [re- trieved 5/15/2014] from < [5] Benko, V.: Compatible Sketch Grammars for Comparable Corpora, 2013 Bratislava [6] Baker, T.: Intro to R with Data Visuaalization, Safari Books Online, 2013 [re- trieved 5/152014] from < data-visualization-intro-to-r/> [7] Trends, Google, 2014 [retrieved 5/15/2014] from < Visualization%2C%20D3.js%2C%20Processing.js&cmpt=q> [8] Heer, J.: exploring enron, Standford HCI Group, 2004 [retrieved 1/28/2014] rom < [9] IBM.: Many Eyes, 2007 [retrieved 5/15/2014] from < 958.ibm.com/software/analytics/manyeyes/>
51 [10] Clark, J.: Word Trees from Many Eyes, Neoformix, 2007 [retrieved 5/ ] from < Tree.html> [11] Rasmussen, J., Assogba, J.: Visualization and Behavious Group, IBM Re- search, [retrieved 5/15/2014] from < searcher/view_group.php?id=3419> [12] Processing 2, Processing [retrieved 5/15/2014] from < [13]. 3D Dewey Data visualization [Processing], Creative Applications Network, 2009 [retrieved 5/15/2014]. Available from < tions.net/pro cessing/3d-dewey-data-visualization-processing/> [14] Processing.js, Processingjs [retrieved 5/15/2014] from < [15] Baranovskiy, D.: Raphaël JavaScript Library, Raphaeljsm, [retrieved 5/15/ 2014] from < [16] Murray, S.: Interactive Data Visualization for the Web, 2013 [retrieved 5/15/2014] from < [17] Harrison, C.: Web Trigrams, Chris Harrison, 2014, [retrieved 5/15/2014] from < 2
52 Electronic appendixes Part of this thesis is also this list of attachments: about.html - web page with info about this thesis contact.html - web page with contact index.html home page wordsketch.html web page with Word Sketch visualization sketchdiff.html - web page with Sketch-diff visualization sketchdiff2.html web page with zoomable Sketch-diff visualization myformsd.js JavaScript file to process the Sketch-diff visualization myformsd2.js JavaScript file to process the zoomable Sketch-diff visualization myformws.js JavaScript file to process Word Sketch visualization proxy2.php php script file d3.v3.js - D3 JavaScript library (version 3) css/newstylesheet.css style sheet for the website css/reset.css reset style sheet for the website 3
Introduction to D3.js Interactive Data Visualization in the Web Browser
Datalab Seminar Introduction to D3.js Interactive Data Visualization in the Web Browser Dr. Philipp Ackermann Sample Code: http://github.engineering.zhaw.ch/visualcomputinglab/cgdemos 2016 InIT/ZHAW Visual
Lab 2: Visualization with d3.js
Lab 2: Visualization with d3.js SDS235: Visual Analytics 30 September 2015 Introduction & Setup In this lab, we will cover the basics of creating visualizations for the web using the d3.js library, which
Team Members: Christopher Copper Philip Eittreim Jeremiah Jekich Andrew Reisdorph. Client: Brian Krzys
Team Members: Christopher Copper Philip Eittreim Jeremiah Jekich Andrew Reisdorph Client: Brian Krzys June 17, 2014 Introduction Newmont Mining is a resource extraction company with a research and development
Chapter 14: Links. Types of Links. 1 Chapter 14: Links
1 Unlike a word processor, the pages that you create for a website do not really have any order. You can create as many pages as you like, in any order that you like. The way your website is arranged and
Visualizing Data: Scalable Interactivity
Visualizing Data: Scalable Interactivity The best data visualizations illustrate hidden information and structure contained in a data set. As access to large data sets has grown, so has the need for interactive
Mobile Web Design with HTML5, CSS3, JavaScript and JQuery Mobile Training BSP-2256 Length: 5 days Price: $ 2,895.00
Course Page - Page 1 of 12 Mobile Web Design with HTML5, CSS3, JavaScript and JQuery Mobile Training BSP-2256 Length: 5 days Price: $ 2,895.00 Course Description Responsive Mobile Web Development is more
We automatically generate the HTML for this as seen below. Provide the above components for the teaser.txt file.
Creative Specs Gmail Sponsored Promotions Overview The GSP creative asset will be a ZIP folder, containing four components: 1. Teaser text file 2. Teaser logo image 3. HTML file with the fully expanded
Basic tutorial for Dreamweaver CS5
Basic tutorial for Dreamweaver CS5 Creating a New Website: When you first open up Dreamweaver, a welcome screen introduces the user to some basic options to start creating websites. If you re going to
Course Information Course Number: IWT 1229 Course Name: Web Development and Design Foundation
Course Information Course Number: IWT 1229 Course Name: Web Development and Design Foundation Credit-By-Assessment (CBA) Competency List Written Assessment Competency List Introduction to the Internet
Creating Interactive PDF Forms
Creating Interactive PDF Forms Using Adobe Acrobat X Pro Information Technology Services Outreach and Distance Learning Technologies Copyright 2012 KSU Department of Information Technology Services This
Interactive Voting System. www.ivsystem.nl. IVS-Basic IVS-Professional 4.4
Interactive Voting System www.ivsystem.nl IVS-Basic IVS-Professional 4.4 Manual IVS-Basic 4.4 IVS-Professional 4.4 1213 Interactive Voting System The Interactive Voting System (IVS ) is an interactive
Table of Contents. I. Banner Design Studio Overview... 4. II. Banner Creation Methods... 6. III. User Interface... 8
User s Manual Table of Contents I. Banner Design Studio Overview... 4 II. Banner Creation Methods... 6 a) Create Banners from scratch in 3 easy steps... 6 b) Create Banners from template in 3 Easy Steps...
WEB DEVELOPMENT IA & IB (893 & 894)
DESCRIPTION Web Development is a course designed to guide students in a project-based environment in the development of up-to-date concepts and skills that are used in the development of today s websites.
First Bytes Programming Lab 2
First Bytes Programming Lab 2 This lab is available online at www.cs.utexas.edu/users/scottm/firstbytes. Introduction: In this lab you will investigate the properties of colors and how they are displayed
Responsive Web Design Creative License
Responsive Web Design Creative License Level: Introduction - Advanced Duration: 16 Days Time: 9:30 AM - 4:30 PM Cost: 2197 Overview Web design today is no longer just about cross-browser compatibility.
Raising the Bar (Chart)
Raising the Bar (Chart) THE NEXT GENERATION OF VISUALIZATION TOOLS Jeffrey Heer @jeffrey_heer Univ. of Washington + Trifacta ? Visualizing Big Data! Stratified Sampling Binned Aggregation immens: Real-Time
Create a Poster Using Publisher
Contents 1. Introduction 1. Starting Publisher 2. Create a Poster Template 5. Aligning your images and text 7. Apply a background 12. Add text to your poster 14. Add pictures to your poster 17. Add graphs
PaperlessPrinter. Version 3.0. User s Manual
Version 3.0 User s Manual The User s Manual is Copyright 2003 RAREFIND ENGINEERING INNOVATIONS All Rights Reserved. 1 of 77 Table of Contents 1. 2. 3. 4. 5. Overview...3 Introduction...3 Installation...4
Hypercosm. Studio. www.hypercosm.com
Hypercosm Studio www.hypercosm.com Hypercosm Studio Guide 3 Revision: November 2005 Copyright 2005 Hypercosm LLC All rights reserved. Hypercosm, OMAR, Hypercosm 3D Player, and Hypercosm Studio are trademarks
d3.js Data-Driven Documents Scott Murray, Jerome Cukier & Jeffrey Heer VisWeek 2012 Tutorial
d3.js Data-Driven Documents Scott Murray, Jerome Cukier & Jeffrey Heer VisWeek 2012 Tutorial How much data (bytes) did we produce in 2010? 2010: 1,200 exabytes Gantz et al, 2008, 2010 2010: 1,200 exabytes
Intro to Excel spreadsheets
Intro to Excel spreadsheets What are the objectives of this document? The objectives of document are: 1. Familiarize you with what a spreadsheet is, how it works, and what its capabilities are; 2. Using
Portal Connector Fields and Widgets Technical Documentation
Portal Connector Fields and Widgets Technical Documentation 1 Form Fields 1.1 Content 1.1.1 CRM Form Configuration The CRM Form Configuration manages all the fields on the form and defines how the fields
Creating a Poster in Powerpoint
Creating a Poster in Powerpoint January 2013 Contents 1. Starting Powerpoint 2. Setting Size and Orientation 3. Display a Grid 5. Apply a background 7. Add text to your poster 9. Add WordArt to your poster
Using Microsoft Word. Working With Objects
Using Microsoft Word Many Word documents will require elements that were created in programs other than Word, such as the picture to the right. Nontext elements in a document are referred to as Objects
D3.JS: Data-Driven Documents
D3.JS: Data-Driven Documents Roland van Dierendonck Leiden University [email protected] Sam van Tienhoven Leiden University [email protected] Thiago Elid Leiden University [email protected]
SAS BI Dashboard 4.3. User's Guide. SAS Documentation
SAS BI Dashboard 4.3 User's Guide SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2010. SAS BI Dashboard 4.3: User s Guide. Cary, NC: SAS Institute
38 Essential Website Redesign Terms You Need to Know
38 Essential Website Redesign Terms You Need to Know Every industry has its buzzwords, and web design is no different. If your head is spinning from seemingly endless jargon, or if you re getting ready
Microsoft Expression Web Quickstart Guide
Microsoft Expression Web Quickstart Guide Expression Web Quickstart Guide (20-Minute Training) Welcome to Expression Web. When you first launch the program, you ll find a number of task panes, toolbars,
Virto Pivot View for Microsoft SharePoint Release 4.2.1. User and Installation Guide
Virto Pivot View for Microsoft SharePoint Release 4.2.1 User and Installation Guide 2 Table of Contents SYSTEM/DEVELOPER REQUIREMENTS... 4 OPERATING SYSTEM... 4 SERVER... 4 BROWSER... 4 INSTALLATION AND
JavaFX Session Agenda
JavaFX Session Agenda 1 Introduction RIA, JavaFX and why JavaFX 2 JavaFX Architecture and Framework 3 Getting Started with JavaFX 4 Examples for Layout, Control, FXML etc Current day users expect web user
CSE 512 - Data Visualization. Visualization Tools. Jeffrey Heer University of Washington
CSE 512 - Data Visualization Visualization Tools Jeffrey Heer University of Washington How do people create visualizations? Chart Typology Pick from a stock of templates Easy-to-use but limited expressiveness
Visualizing the Top 400 Universities
Int'l Conf. e-learning, e-bus., EIS, and e-gov. EEE'15 81 Visualizing the Top 400 Universities Salwa Aljehane 1, Reem Alshahrani 1, and Maha Thafar 1 [email protected], [email protected], [email protected]
Voice Driven Animation System
Voice Driven Animation System Zhijin Wang Department of Computer Science University of British Columbia Abstract The goal of this term project is to develop a voice driven animation system that could take
Guide To Creating Academic Posters Using Microsoft PowerPoint 2010
Guide To Creating Academic Posters Using Microsoft PowerPoint 2010 INFORMATION SERVICES Version 3.0 July 2011 Table of Contents Section 1 - Introduction... 1 Section 2 - Initial Preparation... 2 2.1 Overall
MASTERTAG DEVELOPER GUIDE
MASTERTAG DEVELOPER GUIDE TABLE OF CONTENTS 1 Introduction... 4 1.1 What is the zanox MasterTag?... 4 1.2 What is the zanox page type?... 4 2 Create a MasterTag application in the zanox Application Store...
Deposit Identification Utility and Visualization Tool
Deposit Identification Utility and Visualization Tool Colorado School of Mines Field Session Summer 2014 David Alexander Jeremy Kerr Luke McPherson Introduction Newmont Mining Corporation was founded in
AUTOMATED CONFERENCE CD-ROM BUILDER AN OPEN SOURCE APPROACH Stefan Karastanev
International Journal "Information Technologies & Knowledge" Vol.5 / 2011 319 AUTOMATED CONFERENCE CD-ROM BUILDER AN OPEN SOURCE APPROACH Stefan Karastanev Abstract: This paper presents a new approach
Excel Unit 4. Data files needed to complete these exercises will be found on the S: drive>410>student>computer Technology>Excel>Unit 4
Excel Unit 4 Data files needed to complete these exercises will be found on the S: drive>410>student>computer Technology>Excel>Unit 4 Step by Step 4.1 Creating and Positioning Charts GET READY. Before
Word 2007: Basics Learning Guide
Word 2007: Basics Learning Guide Exploring Word At first glance, the new Word 2007 interface may seem a bit unsettling, with fat bands called Ribbons replacing cascading text menus and task bars. This
Excel Intermediate Session 2: Charts and Tables
Excel Intermediate Session 2: Charts and Tables Agenda 1. Introduction (10 minutes) 2. Tables and Ranges (5 minutes) 3. The Report Part 1: Creating and Manipulating Tables (45 min) 4. Charts and other
Visualization with Excel Tools and Microsoft Azure
Visualization with Excel Tools and Microsoft Azure Introduction Power Query and Power Map are add-ins that are available as free downloads from Microsoft to enhance the data access and data visualization
COGNOS 8 Business Intelligence
COGNOS 8 Business Intelligence QUERY STUDIO USER GUIDE Query Studio is the reporting tool for creating simple queries and reports in Cognos 8, the Web-based reporting solution. In Query Studio, you can
Switching from PC SAS to SAS Enterprise Guide Zhengxin (Cindy) Yang, inventiv Health Clinical, Princeton, NJ
PharmaSUG 2014 PO10 Switching from PC SAS to SAS Enterprise Guide Zhengxin (Cindy) Yang, inventiv Health Clinical, Princeton, NJ ABSTRACT As more and more organizations adapt to the SAS Enterprise Guide,
BT CONTENT SHOWCASE. JOOMLA EXTENSION User guide Version 2.1. Copyright 2013 Bowthemes Inc. [email protected]
BT CONTENT SHOWCASE JOOMLA EXTENSION User guide Version 2.1 Copyright 2013 Bowthemes Inc. [email protected] 1 Table of Contents Introduction...2 Installing and Upgrading...4 System Requirement...4
Drawing a histogram using Excel
Drawing a histogram using Excel STEP 1: Examine the data to decide how many class intervals you need and what the class boundaries should be. (In an assignment you may be told what class boundaries to
Using Adobe Dreamweaver CS4 (10.0)
Getting Started Before you begin create a folder on your desktop called DreamweaverTraining This is where you will save your pages. Inside of the DreamweaverTraining folder, create another folder called
Catalog Creator by On-site Custom Software
Catalog Creator by On-site Custom Software Thank you for purchasing or evaluating this software. If you are only evaluating Catalog Creator, the Free Trial you downloaded is fully-functional and all the
Participant Guide RP301: Ad Hoc Business Intelligence Reporting
RP301: Ad Hoc Business Intelligence Reporting State of Kansas As of April 28, 2010 Final TABLE OF CONTENTS Course Overview... 4 Course Objectives... 4 Agenda... 4 Lesson 1: Reviewing the Data Warehouse...
Web Development. Owen Sacco. ICS2205/ICS2230 Web Intelligence
Web Development Owen Sacco ICS2205/ICS2230 Web Intelligence Introduction Client-Side scripting involves using programming technologies to build web pages and applications that are run on the client (i.e.
Adobe Illustrator CS5 Part 1: Introduction to Illustrator
CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES Adobe Illustrator CS5 Part 1: Introduction to Illustrator Summer 2011, Version 1.0 Table of Contents Introduction...2 Downloading
10CS73:Web Programming
10CS73:Web Programming Question Bank Fundamentals of Web: 1.What is WWW? 2. What are domain names? Explain domain name conversion with diagram 3.What are the difference between web browser and web server
Access 2007 Creating Forms Table of Contents
Access 2007 Creating Forms Table of Contents CREATING FORMS IN ACCESS 2007... 3 UNDERSTAND LAYOUT VIEW AND DESIGN VIEW... 3 LAYOUT VIEW... 3 DESIGN VIEW... 3 UNDERSTAND CONTROLS... 4 BOUND CONTROL... 4
YouthQuest Quick Key FOB Project
YouthQuest Quick Key FOB Project This project is designed to demonstrate how to use the 3D design application, Moment of inspiration, to create a custom key fob for printing on the Cube3 3D printer. Downloading
Web Design Specialist
UKWDA Training: CIW Web Design Series Web Design Specialist Course Description CIW Web Design Specialist is for those who want to develop the skills to specialise in website design and builds upon existing
Designing a poster. Big is important. Poster is seldom linear, more like a MindMap. The subjects belonging together are located close to each other.
Designing a poster Poster is seldom linear, more like a MindMap The subjects belonging together are located close to each other. Big is important. Warm colours bring closer, cold ones estrange. A human
STATGRAPHICS Online. Statistical Analysis and Data Visualization System. Revised 6/21/2012. Copyright 2012 by StatPoint Technologies, Inc.
STATGRAPHICS Online Statistical Analysis and Data Visualization System Revised 6/21/2012 Copyright 2012 by StatPoint Technologies, Inc. All rights reserved. Table of Contents Introduction... 1 Chapter
A Tutorial on dynamic networks. By Clement Levallois, Erasmus University Rotterdam
A Tutorial on dynamic networks By, Erasmus University Rotterdam V 1.0-2013 Bio notes Education in economics, management, history of science (Ph.D.) Since 2008, turned to digital methods for research. data
Interspire Website Publisher Developer Documentation. Template Customization Guide
Interspire Website Publisher Developer Documentation Template Customization Guide Table of Contents Introduction... 1 Template Directory Structure... 2 The Style Guide File... 4 Blocks... 4 What are blocks?...
DataPA OpenAnalytics End User Training
DataPA OpenAnalytics End User Training DataPA End User Training Lesson 1 Course Overview DataPA Chapter 1 Course Overview Introduction This course covers the skills required to use DataPA OpenAnalytics
How To Change Your Site On Drupal Cloud On A Pcode On A Microsoft Powerstone On A Macbook Or Ipad (For Free) On A Freebie (For A Free Download) On An Ipad Or Ipa (For
How-to Guide: MIT DLC Drupal Cloud Theme This guide will show you how to take your initial Drupal Cloud site... and turn it into something more like this, using the MIT DLC Drupal Cloud theme. See this
Advanced Microsoft Excel 2010
Advanced Microsoft Excel 2010 Table of Contents THE PASTE SPECIAL FUNCTION... 2 Paste Special Options... 2 Using the Paste Special Function... 3 ORGANIZING DATA... 4 Multiple-Level Sorting... 4 Subtotaling
Creating and Configuring a Custom Visualization Component
ORACLE CORPORATION Creating and Configuring a Custom Visualization Component Oracle Big Data Discovery Version 1.1.x Oracle Big Data Discovery v1.1.x Page 1 of 38 Rev 1 Table of Contents Overview...3 Process
Interactive HTML Reporting Using D3 Naushad Pasha Puliyambalath Ph.D., Nationwide Insurance, Columbus, OH
Paper DV09-2014 Interactive HTML Reporting Using D3 Naushad Pasha Puliyambalath Ph.D., Nationwide Insurance, Columbus, OH ABSTRACT Data visualization tools using JavaScript have been flourishing recently.
Adobe Dreamweaver CC 14 Tutorial
Adobe Dreamweaver CC 14 Tutorial GETTING STARTED This tutorial focuses on the basic steps involved in creating an attractive, functional website. In using this tutorial you will learn to design a site
Creating a website using Voice: Beginners Course. Participant course notes
Creating a website using Voice: Beginners Course Topic Page number Introduction to Voice 2 Logging onto your website and setting passwords 4 Moving around your site 5 Adding and editing text 7 Adding an
White Paper Using PHP Site Assistant to create sites for mobile devices
White Paper Using PHP Site Assistant to create sites for mobile devices Overview In the last few years, a major shift has occurred in the number and capabilities of mobile devices. Improvements in processor
CUSTOMER+ PURL Manager
CUSTOMER+ PURL Manager October, 2009 CUSTOMER+ v. 5.3.1 Section I: Creating the PURL 1. Go to Administration > PURL Management > PURLs 2. Click Add Personalized URL 3. In the Edit PURL screen, Name your
Utilizing Microsoft Access Forms and Reports
Utilizing Microsoft Access Forms and Reports The 2014 SAIR Conference Workshop #3 October 4 th, 2014 Presented by: Nathan Pitts (Sr. Research Analyst The University of North Alabama) Molly Vaughn (Associate
DOING MORE WITH WORD: MICROSOFT OFFICE 2010
University of North Carolina at Chapel Hill Libraries Carrboro Cybrary Chapel Hill Public Library Durham County Public Library DOING MORE WITH WORD: MICROSOFT OFFICE 2010 GETTING STARTED PAGE 02 Prerequisites
Information Literacy Program
Information Literacy Program Excel (2013) Advanced Charts 2015 ANU Library anulib.anu.edu.au/training [email protected] Table of Contents Excel (2013) Advanced Charts Overview of charts... 1 Create a chart...
Data Deduplication in Slovak Corpora
Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences, Bratislava, Slovakia Abstract. Our paper describes our experience in deduplication of a Slovak corpus. Two methods of deduplication a plain
Netigate User Guide. Setup... 2. Introduction... 5. Questions... 6. Text box... 7. Text area... 9. Radio buttons...10. Radio buttons Weighted...
Netigate User Guide Setup... 2 Introduction... 5 Questions... 6 Text box... 7 Text area... 9 Radio buttons...10 Radio buttons Weighted...12 Check box...13 Drop-down...15 Matrix...17 Matrix Weighted...18
Dreamweaver Tutorial - Dreamweaver Interface
Expertrating - Dreamweaver Interface 1 of 5 6/14/2012 9:21 PM ExpertRating Home ExpertRating Benefits Recommend ExpertRating Suggest More Tests Privacy Policy FAQ Login Home > Courses, Tutorials & ebooks
4) How many peripheral devices can be connected to a single SCSI port? 4) A) 8 B) 32 C) 1 D) 100
FALL 07-08 COMP100 MIDTERM-2 EXAM /FACULTY OF ECON. AND ADMINISTRATIVE SCIENCES OF EUL Student Registration No: Instructor: Prof.Dr.Hüseyin Oğuz Student Name-Surname: Dept. of Computer Information Systems
UH CMS Basics. Cascade CMS Basics Class. UH CMS Basics Updated: June,2011! Page 1
UH CMS Basics Cascade CMS Basics Class UH CMS Basics Updated: June,2011! Page 1 Introduction I. What is a CMS?! A CMS or Content Management System is a web based piece of software used to create web content,
Figure 3.5: Exporting SWF Files
Li kewhatyou see? Buyt hebookat t hefocalbookst or e Fl ash + Af t eref f ect s Chr i sjackson ISBN 9780240810317 Flash Video (FLV) contains only rasterized images, not vector art. FLV files can be output
Virtual Exhibit 5.0 requires that you have PastPerfect version 5.0 or higher with the MultiMedia and Virtual Exhibit Upgrades.
28 VIRTUAL EXHIBIT Virtual Exhibit (VE) is the instant Web exhibit creation tool for PastPerfect Museum Software. Virtual Exhibit converts selected collection records and images from PastPerfect to HTML
bbc Creating a Purchase Order Form Adobe LiveCycle Designer ES2 November 2009 Version 9
bbc Adobe LiveCycle Designer ES2 November 2009 Version 9 2009 Adobe Systems Incorporated. All rights reserved. Adobe LiveCycle Designer ES2 (9.0) for Microsoft Windows November 2009 This tutorial is licensed
Sizmek Features. Wallpaper. Build Guide
Features Wallpaper Build Guide Table Of Contents Overview... 3 Known Limitations... 4 Using the Wallpaper Tool... 4 Before you Begin... 4 Creating Background Transforms... 5 Creating Flash Gutters... 7
Introducing Apache Pivot. Greg Brown, Todd Volkert 6/10/2010
Introducing Apache Pivot Greg Brown, Todd Volkert 6/10/2010 Speaker Bios Greg Brown Senior Software Architect 15 years experience developing client and server applications in both services and R&D Apache
Using MindManager 14
Using MindManager 14 Susi Peacock, Graeme Ferris, Susie Beasley, Matt Sanders and Lindesay Irvine Version 4 September 2014 2011 Queen Margaret University 1. Navigating MindManager 14... 3 Tool Bars and
The Essential Guide to HTML Email Design
The Essential Guide to HTML Email Design Index Introduction... 3 Layout... 4 Best Practice HTML Email Example... 5 Images... 6 CSS (Cascading Style Sheets)... 7 Animation and Scripting... 8 How Spam Filters
To determine the fields in a table decide what you need to know about the subject. Here are a few tips:
Access Introduction Microsoft Access is a relational database software product that you can use to organize your data. What is a "database"? A database is an integrated collection of data that shares some
Dreamweaver. Introduction to Editing Web Pages
Dreamweaver Introduction to Editing Web Pages WORKSHOP DESCRIPTION... 1 Overview 1 Prerequisites 1 Objectives 1 INTRODUCTION TO DREAMWEAVER... 1 Document Window 3 Toolbar 3 Insert Panel 4 Properties Panel
Creating Online Surveys with Qualtrics Survey Tool
Creating Online Surveys with Qualtrics Survey Tool Copyright 2015, Faculty and Staff Training, West Chester University. A member of the Pennsylvania State System of Higher Education. No portion of this
USER GUIDE. Unit 2: Synergy. Chapter 2: Using Schoolwires Synergy
USER GUIDE Unit 2: Synergy Chapter 2: Using Schoolwires Synergy Schoolwires Synergy & Assist Version 2.0 TABLE OF CONTENTS Introductions... 1 Audience... 1 Objectives... 1 Before You Begin... 1 Getting
Responsive web design Are we ready for the new age?
Responsive web design Are we ready for the new age? Nataša Subić, The Higher Education Technical School of Professional Studies in Novi Sad, Serbia, [email protected] Tanja Krunić, The Higher Education
Working with the Ektron Content Management System
Working with the Ektron Content Management System Table of Contents Creating Folders Creating Content 3 Entering Text 3 Adding Headings 4 Creating Bullets and numbered lists 4 External Hyperlinks and e
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
CREATING EXCEL PIVOT TABLES AND PIVOT CHARTS FOR LIBRARY QUESTIONNAIRE RESULTS
CREATING EXCEL PIVOT TABLES AND PIVOT CHARTS FOR LIBRARY QUESTIONNAIRE RESULTS An Excel Pivot Table is an interactive table that summarizes large amounts of data. It allows the user to view and manipulate
Mail Programming Topics
Mail Programming Topics Contents Introduction 4 Organization of This Document 4 Creating Mail Stationery Bundles 5 Stationery Bundles 5 Description Property List 5 HTML File 6 Images 8 Composite Images
ADOBE DREAMWEAVER CS3 TUTORIAL
ADOBE DREAMWEAVER CS3 TUTORIAL 1 TABLE OF CONTENTS I. GETTING S TARTED... 2 II. CREATING A WEBPAGE... 2 III. DESIGN AND LAYOUT... 3 IV. INSERTING AND USING TABLES... 4 A. WHY USE TABLES... 4 B. HOW TO
^/ CS> KRIS. JAMSA, PhD, MBA. y» A- JONES & BARTLETT LEARNING
%\ ^/ CS> v% Sr KRIS JAMSA, PhD, MBA y» A- JONES & BARTLETT LEARNING Brief Contents Acknowledgments Preface Getting Started with HTML Integrating Images Using Hyperlinks to Connect Content Presenting Lists
MICROSOFT OFFICE ACCESS 2007 - NEW FEATURES
MICROSOFT OFFICE 2007 MICROSOFT OFFICE ACCESS 2007 - NEW FEATURES Exploring Access Creating and Working with Tables Finding and Filtering Data Working with Queries and Recordsets Working with Forms Working
Systems Analysis Input and Output 1. Input and Output
Systems Analysis Input and Output 1 Input and Output A course in information architecture or web design, complemented with work in Human-Computer Interaction, will help the analyst understand how to improve
EBSCOhost User Guide Searching. Basic, Advanced & Visual Searching, Result List, Article Details, Additional Features. support.ebsco.
EBSCOhost User Guide Searching Basic, Advanced & Visual Searching, Result List, Article Details, Additional Features Table of Contents What is EBSCOhost... 3 System Requirements... 3 Inside this User Guide...
Create interactive web graphics out of your SAS or R datasets
Paper CS07 Create interactive web graphics out of your SAS or R datasets Patrick René Warnat, HMS Analytical Software GmbH, Heidelberg, Germany ABSTRACT Several commercial software products allow the creation
