Pixel-Oriented Network Visualization
|
|
|
- Aron Cole
- 9 years ago
- Views:
Transcription
1 Pixel-Oriented Network Visualization Static Visualization of Change in Social Networks Klaus Stein, René Wegener, and Christoph Schlieder Abstract Most common network visualizations rely on graph drawing. While without doubt useful, graphs suffer from limitations like cluttering and important patterns may not be realized especially when networks change over time. We propose a novel approach for the visualization of user interactions in social networks: a pixeloriented visualization of a graphical network matrix where activity timelines are folded to inner glyphs within each matrix cell. Users are ordered by similarity which allows to uncover interesting patterns. The visualization is exemplified using social networks based on corporate wikis. Key words: network visualization, pixel-oriented visualization, wiki Introduction One important aspect of social network analysis consists in finding interaction patterns between social actors by appropriate visualization paradigms. Social network visualizations offer great help in getting deeper insights in structure and relations. Most commonly graphs are used to represent social networks giving an overview especially over smaller networks, but normally do not cover network dynamics. Too little attention has been paid to networks changing over time: new users come in, old ones leave, new links between users get established, interaction between certain users declines, etc. Klaus Stein Christoph Schlieder Computing in the Cultural Sciences, University of Bamberg, Germany {klaus.stein,christoph.schlieder}@uni-bamberg.de René Wegener Information Systems, Kassel University, Germany, [email protected] This chapter is based on a conference paper at ASONAM 00.
2 Klaus Stein, René Wegener, and Christoph Schlieder Techniques like animated graphs give some insights into network dynamics but also inherit several shortcomings: graphs tend to clutter when larger networks are being displayed, and animation requires digital media as well as more cognitive resources compared to static visualizations. A static visualization technique able to display dynamics even in large social networks is needed. The visualization problem addressed in this chapter first arised when we analyzed user collaboration in organizational wikis where we missed a good visualization for temporal dynamics which could be used in the context of visual data mining. The social networks studied are extended coauthor networks extracted from organizational wikis using the interlocking measure. The idea of interlocking is simple: if an user B edits a page previously edited by another user A, a directed link from B to A is established. An user C editing the same page afterwards establishes links to A and B and so on. Each link is associated with a time stamp, so the network holds the interaction record of users in time. The analysis of interlocking networks can be applied to many different data sets from SVN or GIT repositories over corpuses to newsgroups, discussion boards, CMS. Interlocking gives a directed network, as each user action is considered as an answer to actions of other users before. The visualizations presented in this chapter are nevertheless also useful on undirected graphs, as long as time stamps of interaction are available. This chapter describes the following contributions to the state of the art:. we introduce a new kind of social network visualization based on the pixel-oriented visualization paradigm and extend it for the presentation of networks evolving in time in a way that supports visual data mining;. we compare different glyph layout patterns and their application to the problem domain;. we provide measures for arranging (sorting) nodes; and. we show that temporal patterns indicating cooccurrence and similar behavior are perceptually salient in our visualization even on larger networks. The chapter is organized as follows. We discuss common visualization approaches for social networks and time series in section. Section introduces our pixel-oriented network visualization approach and describes the adequate choices of glyph layouts and node arrangement. Section presents a case study that illustrates how to apply our approach to real-world network data. We conclude in section with a discussion of the results and an outlook on future work. Related work Visual data mining techniques take advantage of the efficient perceptual grouping processes of the human visual system (see e. g. [7, ]). Even in large data sets, perceptual saliency draws the observer s attentions to patterns. Visualization is most useful to generate hypothesis about regularities in a data set. On the other hand see [] for a detailed discussion of this measure
3 Pixel-Oriented Network Visualization visualization does not provide a proof. For instance nodes positioned close together by a layout algorithm suggest some correlation which vanishes by using another layout. Shneiderman [9] gives an overviews on visualization of large datasets, a detailed description can be found in [].. Social Network Graphs Much research has been conducted in the field of social network visualization (for an overview see [8, 0]). For a discussion of the explanatory power of network visualization see []. The most common way to present a social network is the network graph. The nodes are arranged by one of various graph layout algorithms which continue to be improved in their computational properties as well as their usability (see e. g. []). Most software packages for network analysis include graph drawing functionality. Furthermore, a variety of specialized graph drawing packages are available. Alternatively networks can be presented by adjacency matrices that represent the network by some kind of numbers for actors and relations. Ghoniem, Fekete and Castagliola [] compare node-link diagrams and matrices. While probably less appealing to the user, matrix visualizations avoid common problems like cluttering. The authors conclude that node-link representations are best suited for smaller graphs while a higher number of nodes and higher degree of density are better visualized as matrix representations. Shneiderman and Aris [0] state that network graphs can be understood best if they contain between 0 0 nodes and 0 00 links. A higher number of nodes and links makes it more difficult to follow the links, count or identify nodes etc. One way to avoid cluttering is to group related nodes. For example Shi et al. [8] and Balzer and Deussen [] use hierarchical clustering, Peng and SiKun [] propose a subgroup analysis layout algorithm based on attributes and Chen et al. [9] create subgraphs for subgroups. Leung and Carmichel [] also group nodes and arrange them in a rectangular grid using horizontal edges to show relations between different actors. Henry, Fekete and McGuffin [] take a different approach. They combine nodelink layouts and matrix representations in order to give an easy to understand overview of a network while also revealing details that couldn t be recognized in a pure node-link visualization (see also [7]). e. g. UCINET ( JUNG (http: //jung.sourceforge.net/), Graphviz ( GUESS ( Pajek ( Visone ( and others. See also com/top/science/math/combinatorics/software/graph_drawing/, and the INSNA software list ( insna.org/software/).
4 Klaus Stein, René Wegener, and Christoph Schlieder Mar. 007 Sep. 008 (a) full timespan Mar. Sep. 007 (b) first term Sep. 007 Mar. 008 (c) second term Mar. Sep. 008 (d) third term Fig. network evolution in time (students wiki). Change in Time For visualizing time-oriented data a variety of methods is available, for an overview see []. Broadly speaking, the methods for visualizing time oriented data in social networks fall into two classes: sequences of snapshots and (interactive) animations. Moody [] considers a third class: network summary statistics plotted as a line graph over time. Since the network topology cannot be recovered from this visualization, we will not consider it. Two major problems arise with using a temporal sequence of snapshots as visualization:. the temporal resolution, i. e., the number of elements of the sequence, is severely restricted,. the optimization of the layout algorithm conflicts with changes. We illustrate the problems of visualizations using sequences of snapshots with a network from our own data. The network presented in figure shows students working on a larger project across three terms using a wiki as documentation platform. So in each term a bunch of new students join, others leave. The visualization uses a spring embedding layout algorithm that optimizes the length of the edges between the nodes (see []) in order to achieve a grouping of highly interconnected sets of nodes. (b) to (d) show snapshots of the network at different times that illustrate the temporal change. While for each subgraph both spatial dimensions are used to lay out the graph on the big scale, the x-axis represents time from left (past) to right (future). The spatial extension of each data point (network graph) restricts the temporal resolution to few (three) time points. We could not layout each of the graphs with spring embedding but had to keep each node at a fixed position to be able to compare the graphs. So while the change shows up very well in (b) to (d) the grouping in the single graphs is not optimal. Moody states that the poor job of representing change in the network is a problem fundamental to the media and suggests to use animations. Network animation software is available either stand alone (e. g. SoNIA, see []) or as part of dedicated
5 Pixel-Oriented Network Visualization network analysis software (e. g. SONIVIS ) as most of the network visualization libraries support dynamic node and edge change. By interpolating between the graphs that represent the different intervals of the network, nodes are moved to smoothly re-layout the graph responding to changing edges. One drawback here is that this representation cannot be published in print, some kind of digital medium is needed. More important, animations are nice and striking in presentations but less useful for analysis as they do not give an overview at a glance. Animations inherit the issues of change blindness which means that some changes (maybe just because the glimpse of an eye) will not be realized by the user [7]. The static graphical presentation of data allows to externalize concepts as everything is available on paper. The visual system provides the data analyst with scanning routines that permit a very efficient processing of externalized data. Presenting all the information on one single figure instead of using an animation takes advantage of this ability. In fact, switching our attention from one area of a single figure to another will usually not only be much faster than finding the right interval of an animation with the help of a slide bar [], it will also allow to for deeper inspection. Human s visual working memory only stores a handful of objects at a time [] and watching animations forces us to remember what we have seen while a static representation keeps everything accessible in parallel. Pixel-oriented Visualization of Network Data The idea of pixel-oriented visualization was introduced by [9] and further developed in [8], [7] and other publications. Although the original publications do not provide a concise defintion, the basic idea of the visualiation methods is simple to describe. Each pixel of the screen is used to visualize one data point, representing its value by its color. This allows to visualize great amounts of data while avoiding overlapping and cluttering. Pixel-oriented visualization has been applied by Guo et al. [] for the visualization of very large scale network matrices (BOSAM 7 ), but to the best of our knowledge it has not been tried on temporal and weighted networks. In this section we show, how to adapt the idea of pixel-oriented visualization for a static representation of social network data in time.. Collaboration in Time Figure a shows typical time series data in graph representation: the x-axis represents time as the independent variable, while the dependent variable (interaction For an animated representation of the network shown in Fig. see wiai.uni-bamberg.de/mwstat/examples/wiki weekly_network.swf 7 bitmap of sorted adjacency matrix
6 Klaus Stein, René Wegener, and Christoph Schlieder 0 0 u u u u0 u u u 0 Apr 07 Jul 07 Oct 07 Jan 08 Apr 08 Jul 08 Oct 08 Jan 09 Apr 09 Jul 09 (a) monthly 0 8 u u u u0 u u u 0 Apr 07 Jul 07 Oct 07 Jan 08 Apr 08 Jul 08 Oct 08 Jan 09 Apr 09 Jul 09 (b) daily u u u u0 u u u (c) monthly u u u u0 u u u (d) daily Fig. collaboration activity per user (workgroup wiki). All figures show the same timespan from April 007 to Sep. 009 (x-axis), only the resolution is different. activity) is displayed on the y-axis. Different colors (levels of gray) are used to distinguish the different users. Figure c gives exactly the same data in pixel-oriented representation. The x-axis again represents time, the y-axis now is used to separate the users line by line. Color (grayscale) represents interaction intensity. The advantage of the graph-based representation consists in a visually salient rendering of extreme values (intensity spikes around June 07, August 08 and April 09) but makes it hard to distinguish the different users (figure a) as their graphs overlap. In the pixel-oriented representation the activity of different users is clearly visible as there is no overlapping. The user timelines are easy to compare (we see in figure c which users where active in August 08 and which in April 09), but we do not get the absolute intensity values, only the relative intensity distribution is visible. Figures (b) and (d) present the same data in higher resolution, i. e. day by day instead of month by month following the idea that to map each data value to a colored pixel [... ] allow[s] us to visualize the largest amount of data which is possible on current displays [7]. The pixel-oriented visualization delivers additional implications: As you can see high collaboration values in one month are usually generated by the high collaboration rates of only a few days and not by continuing collaboration during this month. (d) uses the calendar metaphor which means to arrange the timeline in a weekly zigzag: 8 9 Monday Tuesday Wednesday user 8 9 Thursday 9 0 Friday 0 7 Saturday 7 8 Sunday Although the single dots are too fine to be able to tell which day exactly it represents we nevertheless get the interesting patterns. We not only get an impression which users collaborate a lot at which time during the year (as every pixel column
7 Pixel-Oriented Network Visualization 7 u u u u0 u u u u u u u0 u u u (a) matrix u u u u0 u u u u u u u0 u u u (b) week steps u u u u0 u u u u u u u0 u u u (c) weekly steps Fig. workgroup wiki network matrix. (a) shows the adjacency matrix of the network. Each matrix element is colored according to its value. In (b) and (c) each glyph shows the folded timeline row by row. The nodes are sorted by their (weighted) degree. represents a week), we additionally see 8 that only users U and U are working on Sunday while no one works on Saturday.. The Pixel Matrix We adapt the idea of pixel-oriented visualization applying it to social network graphs. We present the network as an adjacency matrix with each row and each column corresponding to one node and the matrix elements giving the weight of the edge between the corresponding nodes, as shown in figure a. Following the pixel-oriented visualization paradigm to present each value by one colored pixel we can inject the whole timeline of user interaction in one matrix element by folding it. Figure b gives the network matrix with each matrix element showing one glyph which holds the timeline folded row by row, each pixel representing the user-user-interaction within four weeks. This is somehow the inverse representation to figure where time is given on the outer x-axis while now time is folded at the inside, i. e. within each glyph. The scale is changed from monthly to four weeks to get timespans of equal length. While monthly data is easier to present and describe it has the disadvantage that each month covers a different number of days and a different number of certain weekdays: June 00 has four Fridays, July has five. This is important when work is organized weekly, e. g. the workgroup has a weekly meeting on Friday generating extra social interaction. On a monthly raster this generates 0 % more activity for July while everything is even. The other way around if there is some monthly event the four-weeks raster would show irritating data. 8 at least I hope this is visible in the printout
8 8 Klaus Stein, René Wegener, and Christoph Schlieder We see users U, U7, U and U interacting a lot with each other over the whole timespan. Users U8 and U come in after around one year for three months, working closely together with small interaction with others (U8 has few connections to the most prominent four users while U only meets U). And users U0, U and U join the network even later, interacting mainly with the prominent four and not with each other. This small example gives first insight on the potential of pixel-oriented network visualization (PONV). We do not know exactly at which time users U8 and U came in and we do not get a detailed analysis on the interaction between users U to U, but we see gray dots sprinkled over the whole glyph telling us that there was interaction between these users all the time. And we see that U0, U and U came in at the same time even though their timelines are not layed out side by side. Our visual perception is perfect in detecting that these gray dots are on the same position within their glyph. While we are not able to read absolute data from the graph (neither exact times nor exact intensity values of collaboration) we get a good overview over the temporal behavior of each user as well as temporal cooccurrence of certain events. And this is exactly what visual data mining is meant for: an exploratory analysis of the data to detect regularities. In a next step, all evidence in the data set is evaluated that supports the hypothetical regularity or conflicts with it. For instance, it could be checked at what time U did his first edit operation, at what time U8 her last edit operation and so on. In other words, the cues from visual examination are backed up with hard data. An additional remark about U. As we see in the U row he comes in rather early, interacts once and is gone, but in the U column there are several interactions visible even at later times. This is a result of the network creation method we used. Interlocking response graphs are directed and an edge from user B to user A is set at the time B edits a page A edited before. And in the network matrix as presented in figure the row gives the tail node (R) and the column gives the head node (C), so the edge points from R to C. So what we see here is that U edited a page users U to U edited before, and months later users U to U8 responded by editing this page. And we can also see some closer interaction between U and U as the (U,U)-glyph shows us U must have edited this page shortly before U.. Inner Glyphs Our visual perception is good in pattern recognition and edge detection. Looking at figure b we detect some interesting vertical bars at (U,U), (U,U) and (U7,U). Unfortunately these findings are arbitrary. The timeline is packed into the glyph row by row (see figure a) and each glyph has pixel. This means that two pixel aligned vertically contain data that is weeks apart. Looking at the same data with glyphs of width 7 would reveal other arbitrary vertical bars. On the other hand we may miss timespans of high activity which start at the end of one column
9 Pixel-Oriented Network Visualization 9 row by row row snake col by col col snake diagonal diag. snake hilbert z-curve (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) (m) (n) (o) (p) Fig. Layout patterns for inner glyphs. (a) to (h) give the layout path while (i) to (p) show example glyphs for a color gradient from black to white in the corresponding layout. (See also [8, 7]) and end at the beginning of the next one. The latter problem can be reduced by laying out the rows in a snake-like way (see figure b) as here continuous timespans are not ripped apart. Things are different in the example before (figure d). 9 Each column represents one week with Monday in the top and Sunday in the bottom row which allowed us to see who is working on weekends, and obviously a snake pattern (figure d) would disturb. While figure b kind of allows to distinguish single pixels and with some effort even to see which row a pixel belongs to we are lost in figure c with a weekly resolution. There are darker and lighter regions and you can hardly see how many rows are involved, in other words: you see darker and lighter region patterns which may be incidental as our vision groups events which are layed out next to each other in successive rows while in fact being numbers of weeks apart. Space-filling curves like the well-known Hilbert curve (figure g) are one answer to this problem as these curves lay out one-dimensional data in two-dimensional space in a locality-preserving way, i. e. data points being close together in the onedimensional representation are kept close in the two-dimensional layout. Unfortunately it also brings data points close which were very far away from each other as visible in figure o. A second disadvantage of this layout is that the main data order is non-linear. While the layouts (a) to (f) show a main linear direction (top down, left right, topleft downright) 0 the main timeline for the Hilbert curve is U-shaped which is kind of irritating not only for the uninformed reader. The recursive z-curve (h) roughly maintains an overall topleft downright direction with partly more locality than the pure diagonal layouts (e) and (f) but on the other hand shows leaps. We found the row and column layouts least irritating to the uninformed as well as expert user, they are easy to explain and the more complicated layouts are not 9 where the timeline was presented column by column 0 which could be flipped e. g. for Arabic readers who may prefer right left
10 0 Klaus Stein, René Wegener, and Christoph Schlieder u u u0 u u u u (a) monthly u u u u u u u0 (b) daily Fig. collaboration activity per user (workgroup wiki), users arranged by similarity of timelines. capable of solving all the issues described above. And finally its up to the decision of the user which layout he or she prefers. If the data timeline is known to be structured by periods relevant to the problem domain, e. g. quarters or terms, the pixel-oriented visualization can easily adapt to that fact. For example, to present 0 years of data month by month use a row by row or column by column layout with 0, and you may be able to see some pattern for Christmas time, for one year ( weeks) day by day use or 8 and so on. If no such rhythm applies choose quadratic glyphs with any layout you may like but keep in mind how to interpret it and which patterns may show up spuriously.. Arranging the Glyphs Displaying the timelines of the users row by row as in figures c and d raises the question about the best order to arrange them. A similar issue arises with the matrix representation (figure ). In many cases additional background knowledge is available about the members of the social network. They belong to some part of the organization, work together in a certain project, have a certain role (manager, clerk, intern etc.). Grouping users by one of this aspects can help to identify certain patterns (are members of the same projects working together within certain timespans? Are there different patterns showing up for different roles?). A second criterion for grouping is some node feature. For the examples up to now we used the (weighted) degree of each node, that is we sorted by a network parameter. This is reasonable as such parameters provide well-defined sorting criteria, but it is also kind of arbitrary as one would need to argue why not to base sorting on content-based criteria such as activity, timespan, or grouped nodes highly connected in the network. The pixel-oriented network visualization supports the identification of patterns in the timelines. The layout algorithm can support the pattern recognition process by choosing arrangements placing similar objects side by side. Figure shows the same user collaboration timelines as figure with users grouped by similarity of their timelines. The most outstanding change is that U0, U and U, the three
11 Pixel-Oriented Network Visualization users only working in April 009, are grouped together and not longer disturbed by U. We give a larger and more substantial example in section (figures 9 and 0). The rest of this section focusses on ordering by these similar patterns. First an approach for computing distances (dissimilarities) on user timelines is given. Based on these distances the timelines are then ordered by multidimensional scaling (MDS). And finally we describe additional considerations for matrix ordering... Comparing Timelines Each user timeline gives a high dimensional vector, where each data point (user activity at a certain time) is mapped to one vector component. So each timeline is represented as a point in a high dimensional space. This allows to compute the distance (i. e. the discrepancy) between each pair of timelines and to group users with close timelines and put them side by side. We will not discuss the large number of algorithms for high dimensional clustering but concentrate on the specifics of the network data. Consider the following timelines ( =, =, = 0): day T T T T T T T7 Each timeline is a vector presenting a point in -dimensional space. Simply computing the distance between these points using some metric works for T and T. They are close compared to the others as they only differ in their 9. component, so with euclidean distance we get: d(t, T) =, compared to d(t, T) = 0, d(t, T) =.8, d(t, T) = 0 and so on. But d(t, T) =., so T is closer to T than to T. Mathematically, this is obvious as dimensions are independent to each other and T and T differ in coordinates by 0 =, but it may not be what we want. Users T, T and T were active in the sixt week so it may be reasonable to count their timelines as similar, even if they did not show active on the same days. One way to solve this would be to use a coarser raster for computing the distances, we can use weekly timelines for ordering and daily timelines for presentation. So T and T show activity intensity of in week six and no activity in other weeks giving a complete match. But it would not work for users T and T, as we are unlucky with the weekly border: T shows activity in week three while T is active in week four. Therefore, we propose another solution. The basic idea consists in not computing the distances on the timelines as given above but instead to smoothen (blur) them. For each component a i of the timeline vector the neighbors a i k,a i k+,...,a i e. g. CLIQUE []. For an overview on cluster analysis see Everitt et al., 009 [].
12 Klaus Stein, René Wegener, and Christoph Schlieder and a i+,a i+,...,a i+k are modified by the value of a i, e. g. by addition of a i. For example with k = by computing: a i = a i + a i + a i+ we get the ordering timeline for all timepoints i. This gives modified timelines ( =, =., =, = 0): day T T T T d(t, T ) =., d(t, T ) =., d(t, T ) =, d(t, T ) = 7.9. So T and T are closer than T and T, as requested. So in general smoothing works. How to do the smoothing exactly, e. g. how to choose the size of the neighborhood k, depends on the specific problem domain. For the daily timelines of the workgroup wiki (figure b) we used k = 7 which means that activities of two users are overlapping as long as they are not more than one week apart which is feasible for users collaborating in a wiki over several years... Distance Measures In the examples given the distances are computed using the euclidean distance metric. Comparing the timelines for T, T and T7 gives the euclidean distances d(t, T) =.9 and d(t, T7) =.7, so T is much closer to T7 than to T. This holds for all minkowski (p-norm) metrices, e. g. manhattan metric gives d(t, T) = and d(t, T7) = and maximum metric gives d(t, T) = and d(t, T7) =. Whether this is the desired result depends on the scenario. Are two users which are active at the same time similar even if the activity of one is very high and of the other is very low or is low activity more similar to no activity? The former can be achieved in several ways. Adding some δ to each value greater than 0 increases the relative gap between 0 (no activity) and other values. Alternatively transforming the activity values by a i = m a i reduces the importance of high values, for very large m it is similar to setting all non-zero values to. Now a distance measure can be applied to the modified timelines. For example, with m = and the euclidean metric we get d(t, T) = 0. and d(t, T7) =.7 as expected. Other measures like the Pearson correlation (known as r) or the cosine coefficient (which computes the cosine of the angle between two vectors) can be used to compare timelines by their relative intensity pattern. As they compute the similarity s of two vectors with indicating maximum similarity, and not the distance, we define d(ta, TB) = s(ta, TB) to get a distance measure as required for MDS. Two timeline vectors with TA = λ TB,(λ > 0) are considered similar by these measures, so we get d(t, T) = 0. With cosine similarity we get d(t, T ) = 0., d(t, T ) = (the maximum distance possible) and d(t, T ) = 0.09, i. e. sensible values. for indices outside the time range (e. g. a ) the value 0 is substituted. which are in fact very similar, see e. g. []
13 Pixel-Oriented Network Visualization u u u u u u u0 u u u u u u u0 U8 U U U U U U0 U7 U U Fig. workgroup wiki network matrix ordered by collaboration activity. Fig. 7 Two-dimensional MDS representation of the workgroup wiki timelines Unfortunately both measures do not allow to compare with the zero vector (which has no direction). While this is not a problem for ordering collaboration activities as in figure where users with empty timelines could simply be put at the end, it will give problems for ordering matrices (section..) as here a lot of glyphs with empty timelines are normal (see e. g. figure ). Therefore comparison with the zero vector has to be handled as special case. Two empty timelines have distance 0. As absolute intensity is ignored by cosine, it seems natural to take the limes for λ 0 and set the distance of any to the zero vector to 0. As this does not give the desired effect a possible solution is to use a fixed maximum distance of (which is larger than all other distances)... From Timeline Distances to User Arrangement Applying distance measures as described for all pairs of timelines gives a distance matrix. Now timelines with low distances can be clustered together. Using multidimensional scaling down to one dimension assigns a one dimensional position to each timeline. Please note that this even works if the distance matrix does not fulfil all properties of a metric space (e. g. triangle inequality). Sorting the users according to these positions of their corresponding timelines arranges similar ones next to each other as visible in figure. While the sorting is done on the smoothened timelines which, as noted above, may have coarser resolution than the original timelines, the original ones are used in visualization. Choosing good projections for MDS is known as a difficult problem [], especially to only one dimension. We tested a lot of distance measures and parameters (more than described above) on social network data of several wikis (examples are given in section ) and found the ordering of the nodes to be very sensitive even multidimensional scaling (MDS, []) in several variants [0].
14 Klaus Stein, René Wegener, and Christoph Schlieder to small changes of the distance matrix while the grouping of users with very similar timelines stays rather robust. This means small parameter changes give different layouts, but within these similar users (e. g. U and U8 in figure ) are grouped together. So it is important to choose the right distance measure for the general decision whether to focus on absolute intensity or on relative cooccurrence of activity, as this defines which timelines should be grouped together, and adjusting the parameters can improve the visual appearance... Arranging Matrix Rows and Columns For the network matrix further parameters have to be considered for sorting the nodes. While grouping by external knowledge or some node feature stays the same arranging by similarity gets more complex as now each node does not longer correspond to exactly one timeline but (for n nodes) is connected to (n ) timelines (the corresponding row and column). We cannot simply arrange these n(n ) timelines by similarity as this would destroy the matrix structure. We can only rearrange whole columns and rows so that rows/columns containing similar timelines are placed next to each other. While it is possible to have a different order for rows and columns we would not recommend this, as it is unintuitive for a network matrix. An easy and often sufficient approach is to use the node arrangement computed on their collaboration activity timelines for the matrix (figures b, ). Our workgroup wiki example will not gain any further improvements by more sophisticated measures. For other (larger) networks things are different as we will see later (figures 9, 0 and ). The user timelines only show the unspecific activity pattern of the users. So two users being active at the same time are grouped together even if their activity is totally unrelated. To compute the similarity of two user rows in the network matrix we therefore compute the distances of their glyphs column by column, e. g. for the users U and U in figure (third and second bottommost row) we compare the glyph distances d(u/u8, U/U8), d(u/u, U/U),..., d(u/u0, U/U0). The distance of the rows U and U is the sum of the distances of their glyphs. Please note that for comparison of the single glyph timelines everything described in section.. (smoothing, distance measures etc.) can be used. Pairs involving diagonal elements (in the example i. e. d(u/u, U/U) and d(u/u, U/U)) cannot be computed and are either skipped or the diagonal glyph is virtually set to a timeline with each point set to the maximum of all values in the matrix. The latter measure supports users working closely together to be considered more similar, e. g. d(u/u, U/U) < d(u0/u, U/U). Both options work and we found the effect of the latter negligible on large networks.
15 Pixel-Oriented Network Visualization Fig. 8 Graphical user interface of PONVA. The main window shows a pixel-oriented visualization of the network matrix while the dialog in front allows to change γ-correction. To the right detailed information about one selected pixel is displayed.. Interactive Visual Data Mining The visualization paradigm described in this section has been implemented as part of a network analysis package. Figure 8 gives a screenshot of PONVA, our interactive pixel-oriented network visualization application, which allows us to experiment with different layout algorithms and settings on various networks. Zooming into the full plot as well as zooming the glyphs by changing the resolution (figure (b) to (c)) allows to switch between overview and detailed inspection. Selecting the active timespan within one glyph or one column or row and highlighting the corresponding pixels in the other glyphs helps comparing user activity in time. Same holds for selecting one user or a pair of users and highlighting the corresponding row(s), column(s) (and intersections). And clicking on a pixel reveals detailed data from the point in time it represents to its numeric value. Selecting one of the different glyph layout mechanisms, color scales and some γ-correction appropriate to the network to be analyzed is done interactively. And finally rows and columns can be sorted according to various similarity measures, which are not discussed within this chapter due to limited space. Additionally an interactive application allows to use the distance information available from MDS. Figure 7 shows a two-dimensional configuration of the time- POVNA is written in Java. Not all features described are included in the current stable release.
16 Klaus Stein, René Wegener, and Christoph Schlieder lines (folded to row-by-row glyphs) scaled to two dimensions. This places the users/glyphs according to the similarity relations of their timelines, so similar glyphs are visually clustered together. This representation is is not too useful on a printed paper as it takes a larger amount of space which conflicts with the paradigm of pixeloriented visualization that requires to use every pixel on the screen to represent one data point. The visualizations also suffers from cluttering as there are problems with overlapping glyphs and the annotation of the user names. However it is useful for interactive exploration where one can scroll and zoom in and annotations are dynamically displayed on mouseover or mouseclick. Exploring Larger Networks In the following we demonstrate the practical value of pixel-oriented visualization with three real world data sets. The datasets were collected during our research on organizational wikis. A particularity of the data set is the fact that rich background knowledge about the organization context of the social networks has been collected independently. We show how user interaction patterns in time become visually salient in the pixel-oriented visualization and demonstrate that these findings correspond to social patterns in the organization. Additionally whe show how different node ordering methods improve the visualization.. Students Wiki In section. we introduced the students wiki (figure ). Figure 9 shows the pixeloriented visualization of this network sorted by (weighted) node degree: (b) gives the interaction timeline for each user (day by day grouped weekly) and (a) the corresponding network matrix. Both plots only give users with at least 0 interactions as the full matrix with users would not be readable on this paper size. 7 The first thing catching our eye in figure 9b is this dotted horizontal bar showing up at about one third of the timeline (users marked with ). This is the start of the second term and obviously the new and some of the older students did a lot of work in this week. Next we see heavy traffic at the beginning of the timeline for a two week timespan ( ). Some users continue to participate, others leave, but most of the active users within the first two terms do not show up in the third one, and the users active in the third term ( ) were not present before. Exceptions are only users U8 and U0 who show up again at the end of the third term and U8 who came in in the middle of the second term but without being very active. The users of the In-depth case studies of the wikis used here including other types of (not pixel-oriented) visualizations are provided by [] 7 this is less a problem in interactive usage where we can scroll.
17 Pixel-Oriented Network Visualization 7 u0 u u9 u u0 u0 u u u0 u u0 9 0 u99 8 u 0 u0 u u97 u0 u0 u0 u u u09 u08 9 u u0 8 u0 u u9 u u0 u0 u u u0 u u0 9 0 u99 8 u 0 u0 u u97 u0 u0 u0 u u u09 u08 9 u u0 8 (a) network matrix u0 u u9 u u0 u0 u u u0 u u0 9 0 u99 8 u 0 u0 u u97 u0 u0 u0 u u u09 u08 9 u u0 8 (b) collaboration intensity. Time goes from bottom to top Fig. 9 students wiki (users with at least 0 links)
18 8 Klaus Stein, René Wegener, and Christoph Schlieder u0 9 u u0 u99 u u u0 u u9 u u0 u0 u u0 u u u0 u u09 u 9 8 u0 u0 u08 u 0 u0 u0 9 u u0 u99 u u u0 u u9 u u0 u0 u u0 u u u0 u u09 u 9 8 u0 u0 u08 u 0 u0 (a) network matrix u0 9 u u0 u99 u u u0 u u9 u u0 u0 u u0 u u u0 u u09 u 9 8 u0 u0 u08 u 0 u0 (b) collaboration intensity. Time goes from bottom to top Fig. 0 students wiki (users with at least 0 links), grouped by collaboration intensity timeline similarity
19 Pixel-Oriented Network Visualization 9 u0 u9 u u u u99 u0 8 9 u0 0 u97 u u0 u u0 u0 u u0 u u09 u 9 u u0 u08 u0 0 u0 8 u u0 u9 u u u u99 u0 8 9 u0 0 u97 u u0 u u0 u0 u u0 u u09 u 9 u u0 u08 u0 0 u0 8 u (a) network matrix u0 u9 u u u u99 u0 8 9 u0 0 u97 u u0 u u0 u0 u u0 u u09 u 9 u u0 u08 u0 0 u0 8 u (b) collaboration intensity. Time goes from bottom to top Fig. students wiki (users with at least 0 links), grouped by glyph row similarity
20 0 Klaus Stein, René Wegener, and Christoph Schlieder first and second term overlap but there is low connection to the third term, and this confirms what we see in figure (b) to (d). While all this is visible in figure 9b, it gets more obvious in figure 0b where the user timelines are arranged by similarity as described in section... The students active in the first, second and third term are now grouped together with the ones active during a whole year (first and second term) nicely placed in between. The network matrix in figure 9a presents a nearly regular pattern of similar glyphs interrupted by rows and columns with rather empty glyphs and some few darker ones. By looking closer we can distinguish users with different glyph patterns. The changed arrangement of rows and columns in figure 0a makes things easier by revealing patterns which give further insights: while there is collaboration between the students of the first and second term visible in the matrix the third term is nearly fully disconnected. Most of the students present in the first term collaborate with most of their fellows, for the wiki this means they all worked on the same wiki pages, they did not split up in subgroups working on different pages. Only user U0 stands out. While joining the wiki in the third term contrary to his fellows, he connects to most of the users of the first and second term, i. e. works on the pages they edited before. In figure not the collaboration intensity of the student timelines but the similarity of the glyph rows in the matrix is used to arrange the students. The difference to figure 0 is not as big as between figures 9 and 0 but noticeable. The three groups of students are clearly visible in the matrix, as well as the connection between the first and second group (users U8 to U, i. e. the first six rows/columns in the matrix, plus some connection for rows/columns U to U). The exact configuration of rows and columns is highly dependent from the parameters chosen (smooth factor, distance metric, etc.) as if the distances between rows are small even small changes give different MDS to the one-dimensional space. Nevertheless the general patterns visible in the figure are rather stable. So figures 9 to are a good example for the perceptual salience of regular temporal collaboration patterns on one hand and the improvements possible by sophisticated grouping of the nodes. They not only show cooccurrence of activity of single users but also which users and groups of users are connected at which time.. Facilitation Wiki Figure introduces a new network. It shows the internal wiki of an organization that lends support for information and communication technologies to small and medium enterprises. The wiki features mostly research articles, project reports, and later publications thereof. The wiki is maintained as a dedicated project, and employees are enforced to generate a certain amount of input per anno.
21 Pixel-Oriented Network Visualization u u 0 u u 0 u9 u9 u u u u u 0 u u0 u u (a) collaboration intensity u u 0 u u 0 u9 u9 u u u u u 0 u u0 u u u u 0 u u 0 u9 u9 u u u u u 0 u u0 u u (b) network matrix Fig. facilitation wiki (users with at least 0 links) The first thing catching the eye when inspecting the collaboration intensity timelines (a) is that users U ( ) and U8 ( ) stand out. 8 U was the project manager of the wiki project and obviously did a lot of work until she left the company after three quarters of the timeline, where U8 had to take over this position. Both users were connected to almost everybody else as we can see in the first two rows of the network matrix (figure b). Only a connection from U to U is missing and as we can see in the user timelines U joined the wiki after U left. We could backup our findings by interviews where we learned that U (and later U8) was asked by all others to assist them with the wiki. However, the wiki was never really accepted by the users, and almost nobody started to use the wiki as a collaboration tool, so she had to work as a gardener, formating the articles others had written. And the missing collaboration is visible in the network matrix, most of the glyphs are empty or very sparsely filled, only the pair U80 and U stands out, here direct collaboration shows up. Contrary to the last example no uniform patterns are visible, but temporal cooccurrence as well as temporal connection draw attention when present in the fuzzy gray-spotted area of glyphs as our visual perception smoothens single dots to larger areas. 8 we may also see that they never edited the wiki on weekends.
22 Klaus Stein, René Wegener, and Christoph Schlieder u u u u u0 u9 u u u u u u u0 u u u u9 u u (a) collaboration intensity u u u u u0 u9 u u u u u u u0 u u u u9 u u u u u u u0 u9 u u u u u u u0 u u u u9 u u (b) network matrix Fig. startup wiki network matrix. Startup Wiki Our last example (figure ) shows the startup wiki. It is the company wiki of an European market leader for one-stop solutions and services in mobile or proximity marketing. It was installed during the founding of the company and since then keeps its role as primary collaboration and main content management system for the engineers. Growing from to users the primary users never stopped to use the wiki and new employees joining get connected to the others. Wiki growth at its best, and nicely visible in the user interlocking network matrix. We see users with high activity as well as ones showing up only once which is not surprising as e. g. accounting uses other specialized software and other groups only need to read wiki contents without editing. Contrary to the facilitation wiki users connect to each other, we do not have this one gardener getting all the connections but collaboration between the active users. Not everyone is using the wiki intensely but those who do are connected to each other.
23 Pixel-Oriented Network Visualization Conclusions and Outlook In this chapter we introduced pixel-oriented network visualization (PONV), a new visualization for weighted social networks changing in time. Our approach focuses on using static visualization of dynamic network data as a method of data exploration by the means of visual data mining which works best on medium-size networks where the whole matrix fits on the screen or paper at once. We discussed different patterns for pixel as well as the glyph arrangement. Furthermore, we presented different pixel layouts from simple ones like plain rows to more complex space-filling curves. It turns out that the arrangement of the glyphs should be adapted according to the actual task and can be lead by external knowledge, some node feature or similarity. We therefore introduced different distance measures. The interpretation of PONV is not immediately obvious but easily learned by the interested expert. Using several interlocking coauthor networks extracted from organizational wikis we could show how pixel-oriented network visualization reveals perceptual salient patterns for temporal cooccurrence of user collaboration as well as user-user-connection. In our examples we tended to stick to the simple row by row approach for the arrangement of the pixels which we found to be least irritating. In addition a glyph arrangement according to similarity lead to the visually most appealing results. PONV allows to detect similar collaboration patterns across users and to reveal the collaboration between a pair of users across time. It is nevertheless meant as complementary visualization besides network graphs and others and not as a substitution, as interactions between groups of more than two users only show up indirectly, it focuses on temporal patterns, not on the detection of network clusters. This also means that PONV is less useful on rather sparse networks with low traffic where no visual patterns emerge. Topics for future work are the development of more sophisticated glyph layouts as well as combining pixel-oriented visualization with other layout algorithms like using the glyphs as nodes in a graph representation integrating both visualization techniques. Finally the understandability and usefulness of PONV shall be examined and at best improved by user studies. The analysis and graphics in this chapter are produced using the pixvis/ponv module of our own Wiki Explorator library. 9 It is available for download under an open source licence. 0 9 Wiki Explorator is written in Ruby using R, Gnuplot, Graphviz and other open source software. 0 We also provide an online wiki analysis service based on this library at
24 Klaus Stein, René Wegener, and Christoph Schlieder Acknowledgements Part of this work was supported by the Volkswagenstiftung through Grant No. II/8 09. We thank the reviewers for their helpful comments. References. Aigner, W., Miksch, S., Müller, W., Schumann, H., Tominski, C.: Visualizing time-oriented data. a systematic view. Computers & Graphics (), 0 09 (007). Ankerst, M., Berchtold, S., Keim, D.A.: Similarity clustering of dimensions for an enhanced visualization of multidimensional data. In: INFOVIS 98: Proceedings of the 998 IEEE Symposium on Information Visualization. IEEE Computer Society (998). Balzer, M., Deussen, O.: Level-of-detail visualization of clustered graph layouts. In: APVIS 07., pp. 0. IEEE (007). Bender-deMoll, S., McFarland, D.A.: The art and science of dynamic network visualization. Journal of Social Structure 7() (00). URL articles/volume7/demollmcfarland. Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is nearest neighbor meaningful? Database Theory ICDT 99 pp. 7 (999). Brandes, U., Kenis, P., Raab, J.: Explanation through network visualization. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences (), (00) 7. Chen, C.: Information Visualization: Beyond the Horizon, edn. Springer, London (00) 8. Chen, I.X., Yang, C.Z.: Visualization of social networks. In: B. Furht (ed.) Handbook of Social Network Technologies and Applications, pp Springer US (00) 9. Chen, W., Ji, M., Tang, X., Zhang, B., Shi, B.: Visualization tools for exploring social networks and travel behavior. In: Environmental Science and Information Application Technology (ES- IAT), 00 International Conference on, vol., pp. 9. IEEE (00) 0. Cox, M.A.A., Cox, T.F.: Multidimensional scaling. In: C.h. Chen, W. Härdle, A. Unwin (eds.) Handbook of Data Visualization, Springer Handbooks Comp.Statistics, pp. 7. Springer Berlin Heidelberg (008). DOI Eppstein, D., Gansner, E.R. (eds.): Graph Drawing. 7th International Symposium, GD 009, LNCS, vol. 89 (00). Everitt, B.S., Landau, S., Leese, M.: Cluster Analysis, th edn. Wiley Publishing (009). Ghoniem, M., Fekete, J.D., Castagliola, P.: A comparison of the readability of graphs using node-link and matrix-based representations. In: INFOVIS 0: Proceedings of the IEEE Symposium on Information Visualization, pp. 7. IEEE, Washington, DC, USA (00). DOI Guo, Y., Chen, C., Zhou, S.: Fingerprint for network topologies. In: J. Zhou (ed.) Complex Sciences, pp. 77. Springer (009). Henry, N., Fekete, J.D., McGuffin, M.J.: Nodetrix: a hybrid visualization of social networks. IEEE Transactions on Visualization and Computer Graphics () (007). DOI org/0.09/tvcg Kamada, T., Kawai, S.: An Algorithm for Drawing General Undirected Graphs. Information Processing Letters () (989) 7. Keim, D.A.: Designing pixel-oriented visualization techniques: Theory and applications. IEEE Transactions on Visualization and Computer Graphics (), 9 78 (000). DOI 8. Keim, D.A., Ankerst, M., Kriegel, H.P.: Recursive pattern: A technique for visualizing very large amounts of data. In: VIS 9: Proceedings of the th conference on Visualization 9. IEEE Computer Society, Washington, DC, USA (99)
25 Pixel-Oriented Network Visualization 9. Keim, D.A., Keim, D.A., Kriegel, H.P., Kriegel, H.P.: Visdb: Database exploration using multidimensional visualization. IEEE Computer Graphics and Applications (99) 0. Krempel, L.: Network visualization. In: J. Scott, P.J. Carrington (eds.) Sage Handbook of Social Network Analysis (009). Kruskal, J.B., Wish, M.: Multidimensional Scaling. Sage University Paper series on Quantitative Application in the Social Sciences. Sage Publications (978). Leung, C., Carmichael, C.: Exploring social networks: A frequent pattern visualization approach. In: IEEE International Conference on Social Computing/IEEE International Conference on Privacy, Security, Risk and Trust, pp. 9. IEEE (00). Leydesdorff, L.: On the normalization and visualization of author co-citation data: Salton s cosine versus the jaccard index. Journal of the American Society for Information Science and Technology 9(), 77 8 (008). Moody, J., McFarland, D., Bender-deMoll, S.: Dynamic network visualization. American Journal of Sociology 0(), 0 (00). DOI 0.08/09. Peng, W., SiKun, L.: Social network visualization via domain ontology. In: International Conference on Information Engineering and Computer Science, pp.. IEEE (009). Reeve, L., Han, H., Chen, C.: Information visualization and the semeantic web. In: V. Geroimenko, C. Chen (eds.) Visualizing the Semantic Web: XML-Based Internet and Information Visualization. Springer, London (00) 7. Riche, N.H., Fekete, J.D.: Novel visualizations and interactions for social networks exploration. In: B. Furht (ed.) Handbook of Social Network Technologies and Applications, pp.. Springer US (00). DOI 8. Shi, L., Cao, N., Liu, S., Qian, W., Tan, L., Wang, G., Sun, J., Lin, C.: HiMap: Adaptive visualization of large-scale online social networks. In: Visualization Symposium, PacificVis 09. IEEE (009) 9. Shneiderman, B.: Extreme visualization: squeezing a billion records into a million pixels. In: Proceedings of the 008 ACM SIGMOD international conference on Management of data, pp.. ACM (008) 0. Shneiderman, B., Aris, A.: Network visualization by semantic substrates. IEEE Transactions on Visualization and Computer Graphics (), 7 70 (00). DOI 09/TVCG.00.. Stein, K., Blaschke, S.: Corporate Wikis: Comparative Analysis of Structures and Dynamics. In: K. Hinkelmann, H. Wache (eds.) Proceedings of the th Conference on Professional Knowledge Management, Lecture Notes in Informatics, pp Gesellschaft für Informatik, Bonn (009). Stein, K., Blaschke, S.: Interlocking communication. measuring collaborative intensity in social networks. In: Social Networks Analysis and Mining: Foundations and Applications. Springer (00). Unwin, A., Theus, M., Hofmann, H.: Graphics of large datasets: visualizing a million. Springer-Verlag New York Inc (00). Ware, C.: Visual Queries: The Foundation of Visual Thinking. Springer, Berlin, Heidelberg (00). Wegener, R.: Kollaborationsprozesse in Wikis: Entwurf und Umsetzung eines Analysewerkzeugs. Master s thesis, University of Bamberg (009)
Clustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
Principles of Data Visualization for Exploratory Data Analysis. Renee M. P. Teate. SYS 6023 Cognitive Systems Engineering April 28, 2015
Principles of Data Visualization for Exploratory Data Analysis Renee M. P. Teate SYS 6023 Cognitive Systems Engineering April 28, 2015 Introduction Exploratory Data Analysis (EDA) is the phase of analysis
DATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
Interactive Voting System. www.ivsystem.nl. IVS-Basic IVS-Professional 4.4
Interactive Voting System www.ivsystem.nl IVS-Basic IVS-Professional 4.4 Manual IVS-Basic 4.4 IVS-Professional 4.4 1213 Interactive Voting System The Interactive Voting System (IVS ) is an interactive
What is Visualization? Information Visualization An Overview. Information Visualization. Definitions
What is Visualization? Information Visualization An Overview Jonathan I. Maletic, Ph.D. Computer Science Kent State University Visualize/Visualization: To form a mental image or vision of [some
Voronoi Treemaps in D3
Voronoi Treemaps in D3 Peter Henry University of Washington [email protected] Paul Vines University of Washington [email protected] ABSTRACT Voronoi treemaps are an alternative to traditional rectangular
Solving Simultaneous Equations and Matrices
Solving Simultaneous Equations and Matrices The following represents a systematic investigation for the steps used to solve two simultaneous linear equations in two unknowns. The motivation for considering
VISUAL ALGEBRA FOR COLLEGE STUDENTS. Laurie J. Burton Western Oregon University
VISUAL ALGEBRA FOR COLLEGE STUDENTS Laurie J. Burton Western Oregon University VISUAL ALGEBRA FOR COLLEGE STUDENTS TABLE OF CONTENTS Welcome and Introduction 1 Chapter 1: INTEGERS AND INTEGER OPERATIONS
Assignment 4 CPSC 217 L02 Purpose. Important Note. Data visualization
Assignment 4 CPSC 217 L02 Purpose You will be writing a Python program to read data from a file and visualize this data using an external drawing tool. You will structure your program using modules and
The Value of Visualization 2
The Value of Visualization 2 G Janacek -0.69 1.11-3.1 4.0 GJJ () Visualization 1 / 21 Parallel coordinates Parallel coordinates is a common way of visualising high-dimensional geometry and analysing multivariate
9. Text & Documents. Visualizing and Searching Documents. Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08
9. Text & Documents Visualizing and Searching Documents Dr. Thorsten Büring, 20. Dezember 2007, Vorlesung Wintersemester 2007/08 Slide 1 / 37 Outline Characteristics of text data Detecting patterns SeeSoft
December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS
December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation
Graph/Network Visualization
Graph/Network Visualization Data model: graph structures (relations, knowledge) and networks. Applications: Telecommunication systems, Internet and WWW, Retailers distribution networks knowledge representation
Visualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
Chapter 6: The Information Function 129. CHAPTER 7 Test Calibration
Chapter 6: The Information Function 129 CHAPTER 7 Test Calibration 130 Chapter 7: Test Calibration CHAPTER 7 Test Calibration For didactic purposes, all of the preceding chapters have assumed that the
Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
Agenda. TreeMaps. What is a Treemap? Basics
Agenda TreeMaps What is a Treemap? Treemap Basics Original Treemap Algorithm (Slice-and-dice layout) Issues for Treemaps Cushion Treemaps Squarified Treemaps Ordered Treemaps Quantum Treemaps Other Treemaps
Multi-Dimensional Data Visualization. Slides courtesy of Chris North
Multi-Dimensional Data Visualization Slides courtesy of Chris North What is the Cleveland s ranking for quantitative data among the visual variables: Angle, area, length, position, color Where are we?!
An approach of detecting structure emergence of regional complex network of entrepreneurs: simulation experiment of college student start-ups
An approach of detecting structure emergence of regional complex network of entrepreneurs: simulation experiment of college student start-ups Abstract Yan Shen 1, Bao Wu 2* 3 1 Hangzhou Normal University,
Modeling Guidelines Manual
Modeling Guidelines Manual [Insert company name here] July 2014 Author: John Doe [email protected] Page 1 of 22 Table of Contents 1. Introduction... 3 2. Business Process Management (BPM)... 4 2.1.
Bernice E. Rogowitz and Holly E. Rushmeier IBM TJ Watson Research Center, P.O. Box 704, Yorktown Heights, NY USA
Are Image Quality Metrics Adequate to Evaluate the Quality of Geometric Objects? Bernice E. Rogowitz and Holly E. Rushmeier IBM TJ Watson Research Center, P.O. Box 704, Yorktown Heights, NY USA ABSTRACT
Network Metrics, Planar Graphs, and Software Tools. Based on materials by Lala Adamic, UMichigan
Network Metrics, Planar Graphs, and Software Tools Based on materials by Lala Adamic, UMichigan Network Metrics: Bowtie Model of the Web n The Web is a directed graph: n webpages link to other webpages
CS 4620 Practicum Programming Assignment 6 Animation
CS 4620 Practicum Programming Assignment 6 Animation out: Friday 14th November 2014 due: : Monday 24th November 2014 1 Introduction In this assignment, we will explore a common topic in animation: key
CSU, Fresno - Institutional Research, Assessment and Planning - Dmitri Rogulkin
My presentation is about data visualization. How to use visual graphs and charts in order to explore data, discover meaning and report findings. The goal is to show that visual displays can be very effective
Big Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs [email protected] Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
Guidelines for Effective Email Creative
Guidelines for Effective Email Creative While the need for quality and effective design has always existed from a marketing standpoint, challenges unique to the email space require a different look at
5. Binary objects labeling
Image Processing - Laboratory 5: Binary objects labeling 1 5. Binary objects labeling 5.1. Introduction In this laboratory an object labeling algorithm which allows you to label distinct objects from a
THE ULTIMATE BEST PRACTICES
THE ULTIMATE BEST PRACTICES FOR EFFECTIVE BANNER AD DESIGN AND DEVELOPMENT By Amir Dori, Marketing Manager at Digitaland ? IS THIS EBOOK RIGHT FOR ME? This ebook is meant to be a guide for the best practices
Chapter ML:XI (continued)
Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained
Data Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data
Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
RapidMiner 5.2: Advanced Charts
RapidMiner 5.2: Advanced Charts Manual Rapid-I www.rapid-i.com c 2012 by Rapid-I GmbH. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted,
A HYBRID APPROACH FOR AUTOMATED AREA AGGREGATION
A HYBRID APPROACH FOR AUTOMATED AREA AGGREGATION Zeshen Wang ESRI 380 NewYork Street Redlands CA 92373 [email protected] ABSTRACT Automated area aggregation, which is widely needed for mapping both natural
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
Classroom Tips and Techniques: The Student Precalculus Package - Commands and Tutors. Content of the Precalculus Subpackage
Classroom Tips and Techniques: The Student Precalculus Package - Commands and Tutors Robert J. Lopez Emeritus Professor of Mathematics and Maple Fellow Maplesoft This article provides a systematic exposition
TIBCO Spotfire Network Analytics 1.1. User s Manual
TIBCO Spotfire Network Analytics 1.1 User s Manual Revision date: 26 January 2009 Important Information SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED OR BUNDLED TIBCO
Matrix Multiplication
Matrix Multiplication CPS343 Parallel and High Performance Computing Spring 2016 CPS343 (Parallel and HPC) Matrix Multiplication Spring 2016 1 / 32 Outline 1 Matrix operations Importance Dense and sparse
Medical Information Management & Mining. You Chen Jan,15, 2013 [email protected]
Medical Information Management & Mining You Chen Jan,15, 2013 [email protected] 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?
Unit 7 Quadratic Relations of the Form y = ax 2 + bx + c
Unit 7 Quadratic Relations of the Form y = ax 2 + bx + c Lesson Outline BIG PICTURE Students will: manipulate algebraic expressions, as needed to understand quadratic relations; identify characteristics
Representing Geography
3 Representing Geography OVERVIEW This chapter introduces the concept of representation, or the construction of a digital model of some aspect of the Earth s surface. The geographic world is extremely
the points are called control points approximating curve
Chapter 4 Spline Curves A spline curve is a mathematical representation for which it is easy to build an interface that will allow a user to design and control the shape of complex curves and surfaces.
Simple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
MassArt Studio Foundation: Visual Language Digital Media Cookbook, Fall 2013
INPUT OUTPUT 08 / IMAGE QUALITY & VIEWING In this section we will cover common image file formats you are likely to come across and examine image quality in terms of resolution and bit depth. We will cover
Excel -- Creating Charts
Excel -- Creating Charts The saying goes, A picture is worth a thousand words, and so true. Professional looking charts give visual enhancement to your statistics, fiscal reports or presentation. Excel
GelAnalyzer 2010 User s manual. Contents
GelAnalyzer 2010 User s manual Contents 1. Starting GelAnalyzer... 2 2. The main window... 2 3. Create a new analysis... 2 4. The image window... 3 5. Lanes... 3 5.1 Detect lanes automatically... 3 5.2
Cluster Analysis: Advanced Concepts
Cluster Analysis: Advanced Concepts and dalgorithms Dr. Hui Xiong Rutgers University Introduction to Data Mining 08/06/2006 1 Introduction to Data Mining 08/06/2006 1 Outline Prototype-based Fuzzy c-means
Visualization of 2D Domains
Visualization of 2D Domains This part of the visualization package is intended to supply a simple graphical interface for 2- dimensional finite element data structures. Furthermore, it is used as the low
Universal Simple Control, USC-1
Universal Simple Control, USC-1 Data and Event Logging with the USB Flash Drive DATA-PAK The USC-1 universal simple voltage regulator control uses a flash drive to store data. Then a propriety Data and
Exploratory Data Analysis for Ecological Modelling and Decision Support
Exploratory Data Analysis for Ecological Modelling and Decision Support Gennady Andrienko & Natalia Andrienko Fraunhofer Institute AIS Sankt Augustin Germany http://www.ais.fraunhofer.de/and 5th ECEM conference,
BIG DATA VISUALIZATION. Team Impossible Peter Vilim, Sruthi Mayuram Krithivasan, Matt Burrough, and Ismini Lourentzou
BIG DATA VISUALIZATION Team Impossible Peter Vilim, Sruthi Mayuram Krithivasan, Matt Burrough, and Ismini Lourentzou Let s begin with a story Let s explore Yahoo s data! Dora the Data Explorer has a new
Exploratory Spatial Data Analysis
Exploratory Spatial Data Analysis Part II Dynamically Linked Views 1 Contents Introduction: why to use non-cartographic data displays Display linking by object highlighting Dynamic Query Object classification
Visualization Quick Guide
Visualization Quick Guide A best practice guide to help you find the right visualization for your data WHAT IS DOMO? Domo is a new form of business intelligence (BI) unlike anything before an executive
Dotted Chart and Control-Flow Analysis for a Loan Application Process
Dotted Chart and Control-Flow Analysis for a Loan Application Process Thomas Molka 1,2, Wasif Gilani 1 and Xiao-Jun Zeng 2 Business Intelligence Practice, SAP Research, Belfast, UK The University of Manchester,
ANALYTIC HIERARCHY PROCESS (AHP) TUTORIAL
Kardi Teknomo ANALYTIC HIERARCHY PROCESS (AHP) TUTORIAL Revoledu.com Table of Contents Analytic Hierarchy Process (AHP) Tutorial... 1 Multi Criteria Decision Making... 1 Cross Tabulation... 2 Evaluation
Section 1.1. Introduction to R n
The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to
Copyright 2008 Stephen Few, Perceptual Edge Page of 11
Copyright 2008 Stephen Few, Perceptual Edge Page of 11 Dashboards can keep people well informed of what s going on, but most barely scratch the surface of their potential. Most dashboards communicate too
How To Fix Out Of Focus And Blur Images With A Dynamic Template Matching Algorithm
IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 10 April 2015 ISSN (online): 2349-784X Image Estimation Algorithm for Out of Focus and Blur Images to Retrieve the Barcode
SAS BI Dashboard 3.1. User s Guide
SAS BI Dashboard 3.1 User s Guide The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2007. SAS BI Dashboard 3.1: User s Guide. Cary, NC: SAS Institute Inc. SAS BI Dashboard
Visibility optimization for data visualization: A Survey of Issues and Techniques
Visibility optimization for data visualization: A Survey of Issues and Techniques Ch Harika, Dr.Supreethi K.P Student, M.Tech, Assistant Professor College of Engineering, Jawaharlal Nehru Technological
KaleidaGraph Quick Start Guide
KaleidaGraph Quick Start Guide This document is a hands-on guide that walks you through the use of KaleidaGraph. You will probably want to print this guide and then start your exploration of the product.
Component visualization methods for large legacy software in C/C++
Annales Mathematicae et Informaticae 44 (2015) pp. 23 33 http://ami.ektf.hu Component visualization methods for large legacy software in C/C++ Máté Cserép a, Dániel Krupp b a Eötvös Loránd University [email protected]
TEXT-FILLED STACKED AREA GRAPHS Martin Kraus
Martin Kraus Text can add a significant amount of detail and value to an information visualization. In particular, it can integrate more of the data that a visualization is based on, and it can also integrate
Updates to Graphing with Excel
Updates to Graphing with Excel NCC has recently upgraded to a new version of the Microsoft Office suite of programs. As such, many of the directions in the Biology Student Handbook for how to graph with
Get The Picture: Visualizing Financial Data part 1
Get The Picture: Visualizing Financial Data part 1 by Jeremy Walton Turning numbers into pictures is usually the easiest way of finding out what they mean. We're all familiar with the display of for example
An Introduction to Point Pattern Analysis using CrimeStat
Introduction An Introduction to Point Pattern Analysis using CrimeStat Luc Anselin Spatial Analysis Laboratory Department of Agricultural and Consumer Economics University of Illinois, Urbana-Champaign
MultiExperiment Viewer Quickstart Guide
MultiExperiment Viewer Quickstart Guide Table of Contents: I. Preface - 2 II. Installing MeV - 2 III. Opening a Data Set - 2 IV. Filtering - 6 V. Clustering a. HCL - 8 b. K-means - 11 VI. Modules a. T-test
Creating Interactive PDF Forms
Creating Interactive PDF Forms Using Adobe Acrobat X Pro Information Technology Services Outreach and Distance Learning Technologies Copyright 2012 KSU Department of Information Technology Services This
DHL Data Mining Project. Customer Segmentation with Clustering
DHL Data Mining Project Customer Segmentation with Clustering Timothy TAN Chee Yong Aditya Hridaya MISRA Jeffery JI Jun Yao 3/30/2010 DHL Data Mining Project Table of Contents Introduction to DHL and the
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
Hierarchical Data Visualization
Hierarchical Data Visualization 1 Hierarchical Data Hierarchical data emphasize the subordinate or membership relations between data items. Organizational Chart Classifications / Taxonomies (Species and
CHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION Exploration is a process of discovery. In the database exploration process, an analyst executes a sequence of transformations over a collection of data structures to discover useful
The P.A.R.C. Principles of Visual Design. Proximity. CMPT 165: Design Principles
The P.A.R.C. Principles of Visual Design CMPT 165: Design Principles Tamara Smyth, [email protected] School of Computing Science, Simon Fraser University October 24, 2011 In The Non-Designer s Design Book,
STAN. Structure Analysis for Java. Version 2. White Paper. Fall 2009
STAN Structure Analysis for Java Version 2 White Paper Fall 2009 Abstract: This paper gives a brief introduction to structure analysis using STAN, a static code analysis tool bringing together Java development
Using Library Dependencies for Clustering
Using Library Dependencies for Clustering Jochen Quante Software Engineering Group, FB03 Informatik, Universität Bremen [email protected] Abstract: Software clustering is an established approach
HierarchyMap: A Novel Approach to Treemap Visualization of Hierarchical Data
P a g e 77 Vol. 9 Issue 5 (Ver 2.0), January 2010 Global Journal of Computer Science and Technology HierarchyMap: A Novel Approach to Treemap Visualization of Hierarchical Data Abstract- The HierarchyMap
Gephi Tutorial Quick Start
Gephi Tutorial Welcome to this introduction tutorial. It will guide you to the basic steps of network visualization and manipulation in Gephi. Gephi version 0.7alpha2 was used to do this tutorial. Get
Design Elements & Principles
Design Elements & Principles I. Introduction Certain web sites seize users sights more easily, while others don t. Why? Sometimes we have to remark our opinion about likes or dislikes of web sites, and
DICON: Visual Cluster Analysis in Support of Clinical Decision Intelligence
DICON: Visual Cluster Analysis in Support of Clinical Decision Intelligence Abstract David Gotz, PhD 1, Jimeng Sun, PhD 1, Nan Cao, MS 2, Shahram Ebadollahi, PhD 1 1 IBM T.J. Watson Research Center, New
1 Example of Time Series Analysis by SSA 1
1 Example of Time Series Analysis by SSA 1 Let us illustrate the 'Caterpillar'-SSA technique [1] by the example of time series analysis. Consider the time series FORT (monthly volumes of fortied wine sales
Evaluation of a New Method for Measuring the Internet Degree Distribution: Simulation Results
Evaluation of a New Method for Measuring the Internet Distribution: Simulation Results Christophe Crespelle and Fabien Tarissan LIP6 CNRS and Université Pierre et Marie Curie Paris 6 4 avenue du président
A Learning Based Method for Super-Resolution of Low Resolution Images
A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 [email protected] Abstract The main objective of this project is the study of a learning based method
Microsoft Excel 2010 Charts and Graphs
Microsoft Excel 2010 Charts and Graphs Email: [email protected] Web Page: http://training.health.ufl.edu Microsoft Excel 2010: Charts and Graphs 2.0 hours Topics include data groupings; creating
an introduction to VISUALIZING DATA by joel laumans
an introduction to VISUALIZING DATA by joel laumans an introduction to VISUALIZING DATA iii AN INTRODUCTION TO VISUALIZING DATA by Joel Laumans Table of Contents 1 Introduction 1 Definition Purpose 2 Data
Access 2007 Creating Forms Table of Contents
Access 2007 Creating Forms Table of Contents CREATING FORMS IN ACCESS 2007... 3 UNDERSTAND LAYOUT VIEW AND DESIGN VIEW... 3 LAYOUT VIEW... 3 DESIGN VIEW... 3 UNDERSTAND CONTROLS... 4 BOUND CONTROL... 4
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
A Short Introduction to Computer Graphics
A Short Introduction to Computer Graphics Frédo Durand MIT Laboratory for Computer Science 1 Introduction Chapter I: Basics Although computer graphics is a vast field that encompasses almost any graphical
Time series clustering and the analysis of film style
Time series clustering and the analysis of film style Nick Redfern Introduction Time series clustering provides a simple solution to the problem of searching a database containing time series data such
Visualizing an Auto-Generated Topic Map
Visualizing an Auto-Generated Topic Map Nadine Amende 1, Stefan Groschupf 2 1 University Halle-Wittenberg, information manegement technology [email protected] 2 media style labs Halle Germany [email protected]
Models of Cortical Maps II
CN510: Principles and Methods of Cognitive and Neural Modeling Models of Cortical Maps II Lecture 19 Instructor: Anatoli Gorchetchnikov dy dt The Network of Grossberg (1976) Ay B y f (
Tracking Project Progress
L E S S O N 2 Tracking Project Progress Suggested lesson time 45-55 minutes Lesson objectives To begin tracking an active project, you will: a b c Modify the environment for tracking. You will use the
Data Visualization Techniques
Data Visualization Techniques From Basics to Big Data with SAS Visual Analytics WHITE PAPER SAS White Paper Table of Contents Introduction.... 1 Generating the Best Visualizations for Your Data... 2 The
m ac romed ia Fl a s h Curriculum Guide
m ac romed ia Fl a s h Curriculum Guide 1997 1998 Macromedia, Inc. All rights reserved. Macromedia, the Macromedia logo, Dreamweaver, Director, Fireworks, Flash, Fontographer, FreeHand, and Xtra are trademarks
Data Mining. Cluster Analysis: Advanced Concepts and Algorithms
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based
20 A Visualization Framework For Discovering Prepaid Mobile Subscriber Usage Patterns
20 A Visualization Framework For Discovering Prepaid Mobile Subscriber Usage Patterns John Aogon and Patrick J. Ogao Telecommunications operators in developing countries are faced with a problem of knowing
