Data Structures for Big Data
Int. Journal of Computing and Optimization, Vol. 1, 2014, no. 2, HIKARI Ltd

Ranjit Biswas
Department of Computer Science, Jamia Hamdard University
Hamdard Nagar, New Delhi, India

Copyright 2014 Ranjit Biswas. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The present-day world of big data (expanding very fast in the 4Vs: Volume, Variety, Velocity and Veracity) needs new and more advanced logical and physical storage structures, new heterogeneous data structures, new mathematical theories and new models for processing giant big data in the 4Vs. In the literature there is no appropriate data structure, network topology or distributed system for big data. The existing data structures, network topologies and types of distributed system are not sufficiently rich to deal with big data. Existing network topologies such as the tree topology or the bus, ring, star, mesh and hybrid topologies seem to be weak for big data processing. To succeed, there is no other way but to develop new data structures; new types of distributed systems that are highly compatible with those data structures; new types of network topologies to support the new distributed systems; and, of course, new mathematical and logical models. The next important issue is how to integrate all of these into a single, simple and scalable system for lay users.
With these views in mind, Biswas [11, 12, 15] introduced a new fundamental data structure, the atrain, exclusively for heterogeneous big data (the data structure train being exclusively for homogeneous big data), and then introduced a new distributed system called the Atrain Distributed System (ADS), which can process big data of any 4V of any momentum. The merit of ADS [15] lies in its layout architecture, which makes it infinitely scalable both in breadth (horizontally) and in depth (vertically). The ADS can be unitier or multitier, having a single unique Pilot Computer (PC) and several Distributed Computers (DCs) connected by new
types of network topology called the Multi-horse Cart Topology and the Cycle Topology. ADS is completely compatible with the data structure atrain. The data structure atrain [11, 12, 15] is a powerful, dynamic and flexible data structure, and the first data structure introduced exclusively for big data. Thus Data Structures for Big Data is to be regarded as a new subject in Big Data Science, not just as a new topic, considering the explosive momentum of big data. Viewed philosophically, big data could be regarded as a higher order tensor. Based upon the train-atrain notion of Biswas [11, 12, 15], in this paper the author introduces a few more data structures for big data: the homogeneous stacks train stack and rt-coach stack; the heterogeneous stacks atrain stack and ra-coach stack; the homogeneous queues train queue and rt-coach queue; the heterogeneous queues atrain queue and ra-coach queue; the homogeneous binary trees train binary tree and rt-coach binary tree; the heterogeneous binary trees atrain binary tree and ra-coach binary tree; the homogeneous trees train tree and rt-coach tree; and the heterogeneous trees atrain tree and ra-coach tree, to enrich the subject Data Structures for Big Data for big data science.

Mathematics Subject Classification: 68P05, 68M14, 11C20, 15B36

Keywords: train, CD-table, atrain, atrain distributed system (ADS), Multi-horse Cart Topology and Cycle Topology, train stack, atrain stack, rt-coach stack, ra-coach stack, train queue, atrain queue, rt-coach queue, ra-coach queue, train binary tree, atrain binary tree, rt-coach binary tree, ra-coach binary tree, train tree, atrain tree, rt-coach tree and ra-coach tree

1 Introduction

Big data [5-7, 9, 10-14, 15, 18] is expanding very fast in the 4Vs (Volume, Variety, Velocity and Veracity), and also in many more directions. How to deal with the explosive momentum of big data, how to process big data efficiently within limited resources, etc.
are of major concern to computer scientists nowadays. Big data has a variable mass of 4Vs in various directions, and in this sense it may be regarded philosophically as a vector, or rather as a tensor. The existing data structures of computer science are neither appropriate nor sufficient to deal with big data. The homogeneous data structure r-train and the heterogeneous data structure r-atrain were introduced by Biswas in [11, 12, 15] as the first attempt to provide a data structure exclusively for big data. The architecture of a new type of distributed system, the Atrain Distributed System (ADS), designed exclusively for processing big data, was then introduced by Biswas in [15]. The merit of ADS rests on the fact that both the train and the atrain are fully compatible with it for processing big data, in particular for coping easily with 4Vs of any momentum.
Data Structures for Big Data is therefore to be regarded as a new subject in big data science, not just as a new topic, considering the explosive momentum of big data. One could view big data philosophically as a higher order tensor. The data structures train and atrain with ADS, together with mathematical models such as the solid matrix/latrix and the hematrix/helatrix for storing big data of any size and any datatype, have opened the way for large organizations and their developers to deal with big data.

In this paper we introduce a few more new data structures: the homogeneous stacks train stack and rt-coach stack; the heterogeneous stacks atrain stack and ra-coach stack; the homogeneous queues train queue and rt-coach queue; the heterogeneous queues atrain queue and ra-coach queue; the homogeneous binary trees train binary tree and rt-coach binary tree; the heterogeneous binary trees atrain binary tree and ra-coach binary tree; the homogeneous trees train tree and rt-coach tree; and the heterogeneous trees atrain tree and ra-coach tree; to enrich the subject Data Structures for Big Data with the extended notion of the architecture of the data structures r-atrain and r-train of Biswas [11, 12, 15] for big data. As a prerequisite for this paper, the reader may consult Biswas [11, 12, 15] for the basics of r-train, r-atrain and ADS for big data.

2 r-train Stack and r-atrain Stack

In this section we introduce two new data structures for big data: (1) the r-train stack, or train stack in short, and (2) the r-atrain stack, or atrain stack in short. (The natural number r is suitably decided in advance and fixed by the programmer, depending upon the big data problem under consideration and upon the organization or industry for which the problem is posed. We do not consider it appropriate to treat r in an r-train or an r-atrain as a variable.)
2.1 Introducing r-train Stack (or Train Stack, in short)

First of all we define a few terms.

Coach of a Train Stack

The term coach is taken from the notion of the data structures train and atrain for big data [11, 12, 15]. By a coach C in a train stack we mean a non-empty larray (which could, however, be a null larray) containing data of homogeneous datatype.
The following figure shows a coach of a train in a 7-train stack as a larray.

[Figure 1. A coach of a train in a 7-train stack]

Suppose that the larray C has m elements. If each element of C is of size x bytes, then exactly m·x consecutive bytes are required to store the coach C in memory.

Status of a Coach and Tagged Coach (TC) of a Train Stack

The status s of a coach is a non-negative integer variable equal to the number of ɛ-elements present in it (i.e. in its larray) at this point of time, where ɛ denotes the null element of the coach's datatype [12, 15]. Therefore 0 ≤ s ≤ r. The significance of the variable s is that it tells us the exact number of free spaces available in the coach at this point of time. If there is no ɛ-element in the larray of the coach C, then the value of s is 0.

If C is a coach, then the corresponding tagged coach (TC) is denoted by (C, s), where s is the status of the coach. This means that C is a coach tagged with information on the total amount of free space (here measured in ɛ-elements) available inside it at this time.

For example, consider the coach C in Figure 1. Its corresponding TC is denoted by (C, 0), as shown in Figure 2.

[Figure 2. The tagged coach (C, 0) of the coach C in a train of a 7-train stack]

Consider the following coach H in a train of a 7-train stack, represented as a larray in Figure 3; two of its elements are ɛ.

[Figure 3. A coach H of a train in a 7-train stack]

Its corresponding TC is denoted by (H, 2), as shown in Figure 4.

[Figure 4. The tagged coach (H, 2) of the coach H]
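The notion of a tagged coach can be illustrated with a minimal Python sketch. It assumes that the status s counts the free (ɛ) cells of the coach, as in the examples (C, 0) and (H, 2). The names EPS and TaggedCoach, the use of None for ɛ, and all element values are illustrative assumptions, not part of the paper.

```python
from dataclasses import dataclass
from typing import Any, List

EPS = None  # assumed stand-in for the null element of the coach


@dataclass
class TaggedCoach:
    """A coach (larray of r cells) viewed together with its status s."""

    cells: List[Any]

    @property
    def r(self) -> int:
        # Capacity of the coach: the number of cells in its larray.
        return len(self.cells)

    @property
    def status(self) -> int:
        # s = number of free (EPS) cells, hence 0 <= s <= r.
        return sum(1 for cell in self.cells if cell is EPS)


# Element values are made up; only the pattern of free cells matters.
full = TaggedCoach([11, 3, 5, 61, 8, 2, 9])
h = TaggedCoach([4, EPS, 3, EPS, 61, 7, 20])
print(full.status, h.status)  # prints: 0 2
```

Here the status is computed on demand rather than stored, which keeps the tag and the larray from drifting apart; a stored counter would be an equally valid design.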
r-train Stack (or Train Stack)

An r-train stack (or train stack) is a crisp stack of r-trains containing data of homogeneous datatype throughout the whole stack. For a given r-train stack, the value of the natural number r is fixed throughout. For different r-train stacks, the value of r could be different. The r-train stack is thus a linear data structure in which an r-train may be added or removed only at one end, called the TOP of the train stack, in LIFO manner as in a classical stack. If TOP = 0 (Null), the train stack is empty.

[Figure 5. A train stack]

Thus a train stack T (see Figure 5) may be written as an array of trains T: T1, T2, T3, ..., Tn, with the implication that the rightmost element is the top element (train Tn). In a train stack all the trains contain data of identical datatype, and so a train stack is to be regarded as a homogeneous data structure. Insertion and deletion can occur only at the top of the train stack. The two basic operations associated with train stacks are:

(i) PUSH (to insert a new train into a train stack)
(ii) POP (to delete a train from a train stack)

The train stack has the following two important operations too, which can be easily implemented using PUSH and POP multiple times, and the properties of the data structure larray:
(iii) c-PUSH for pushing a coach (to insert a tagged coach in a given train of the train stack)
(iv) c-POP for popping a coach (to delete a tagged coach from a given train of the train stack)

rt-coach Stack (or Coach Stack)

An rt-coach stack (or coach stack in short) is a special case of the r-train stack where every train consists of one coach only. For a given rt-coach stack, the value of the natural number r is fixed throughout. For different rt-coach stacks, the value of r could be different. Thus an rt-coach stack is a crisp stack of tagged coaches containing data of homogeneous datatype. It is a linear data structure in which a tagged coach may be added or removed only at one end, called the TOP of the coach stack, in LIFO manner as in a classical stack. If TOP = 0 (Null), the coach stack is empty. (The term rt-coach signifies that it is a coach of an r-train.)

[Figure 6. An rt-coach stack]

Thus an rt-coach stack T (see Figure 6) may be written as an array of TCs T: (C1, s1), (C2, s2), (C3, s3), (C4, s4), ..., (Cn, sn), with the implication that the rightmost element is the top element (tagged coach). In an rt-coach stack all the coaches contain data of identical datatype, and so an rt-coach stack is to be regarded as a homogeneous data structure. Insertion and deletion can occur only at the top of the rt-coach stack. The two basic operations associated with rt-coach stacks are:
(i) c-PUSH (to insert a new tagged coach into an rt-coach stack)
(ii) c-POP (to delete a tagged coach from an rt-coach stack)

The rt-coach stack has the following two important operations too, which can be easily implemented using the PUSH, POP, c-PUSH and c-POP operations multiple times, and the properties of the data structure larray:

(iii) e-PUSH for pushing an element (to insert an element into a tagged coach (TC) of an rt-coach stack whose status s is not 0, i.e. at least 1). Insertion of an element (a new passenger) x into the coach Ci is feasible if x is of the same datatype as the other passengers of the coach and if there is an empty space available inside Ci. If the status of Ci is greater than 0, the data can be stored successfully in the coach; otherwise the insertion fails. After each successful insertion, the status s of the coach is updated by s = s − 1. To insert x, we can replace the lowest-indexed ɛ passenger of Ci with x.

(iv) e-POP for popping an element (to delete an element from a tagged coach of an rt-coach stack, retaining the tagged coach in memory and adjusting its status by s = s + 1). We can delete a data element eij from the coach Ci. Deletion of a data element (passenger) from a coach means replacement of that element by an ɛ-element (of the same datatype). Consequently, if eij = ɛ, the question of deletion does not arise; it is assumed here that eij is a non-ɛ member of the coach Ci. Thus deletion of eij is done by replacing it with the null element ɛ and updating the status by s = s + 1. Deletion of a data element (passenger) does not affect the size r of the coach. For example, consider the tagged coach (Ci, m) where Ci = < ei1, ei2, ei3, ei4, ..., eir >. If we delete ei3 from the coach Ci, then the updated tagged coach is (Ci, m+1) where Ci = < ei1, ei2, ɛ, ei4, ..., eir >.
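The four operations on an rt-coach stack can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name CoachStack and its method names are hypothetical, None stands in for the null element ɛ, the status is derived by counting free cells (so the updates s = s − 1 and s = s + 1 happen implicitly), and the check that a new element has the coach's datatype is omitted for brevity.

```python
from typing import Any, List, Optional

EPS = None  # assumed stand-in for the null element


class CoachStack:
    """Sketch of an rt-coach stack: a LIFO stack of coaches of r cells each."""

    def __init__(self, r: int) -> None:
        self.r = r
        self.coaches: List[List[Any]] = []  # the last item is the TOP coach

    def status(self, i: int = -1) -> int:
        """Status s of coach i: its number of free (EPS) cells."""
        return sum(1 for cell in self.coaches[i] if cell is EPS)

    def c_push(self, coach: Optional[List[Any]] = None) -> None:
        """c-PUSH: put a coach (all cells free by default) on top."""
        coach = [EPS] * self.r if coach is None else list(coach)
        if len(coach) != self.r:
            raise ValueError("a coach must have exactly r cells")
        self.coaches.append(coach)

    def c_pop(self) -> List[Any]:
        """c-POP: remove and return the top coach."""
        if not self.coaches:
            raise IndexError("coach stack is empty")
        return self.coaches.pop()

    def e_push(self, x: Any, i: int = -1) -> None:
        """e-PUSH: store x in the lowest-indexed free cell of coach i."""
        coach = self.coaches[i]
        for j, cell in enumerate(coach):
            if cell is EPS:
                coach[j] = x  # the status drops by one
                return
        raise OverflowError("coach has status 0, insertion fails")

    def e_pop(self, j: int, i: int = -1) -> Any:
        """e-POP: replace cell j of coach i by EPS and return the old value."""
        coach = self.coaches[i]
        if coach[j] is EPS:
            raise ValueError("cell is already free, nothing to delete")
        x, coach[j] = coach[j], EPS  # the status rises by one
        return x


stack = CoachStack(r=3)
stack.c_push()          # one empty coach, status 3
stack.e_push(10)
stack.e_push(20)
stack.e_pop(0)          # delete the first passenger
print(stack.coaches[-1], stack.status())  # prints: [None, 20, None] 2
```

Note that e-POP leaves the coach at its full size r; only the content of one cell changes, which mirrors the remark that deletion does not affect the size of the coach.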
2.2 Introducing r-atrain Stack (or Atrain Stack)

Before introducing the r-atrain stack, a few terms need to be defined.

Code of a Datatype and CD-Table for an Atrain Stack

The concept of the CD-Table in an atrain stack is analogous to that of the CD-Table introduced for the r-atrain (atrain) by Biswas [11, 12, 15]. A table is to be created by the user (i.e. by the concerned organization) to fix a unique integer code
for each datatype in use in the organization. This is not an absolute set of codes to be followed universally by every organization; it is a local document for the concerned organization, and for different organizations the table could be different. Once it is fixed by an organization it should not be altered by that organization, except that new datatypes and their corresponding codes may be added to the table at any later stage, retaining the existing records. This table is called the Code of Datatype Table, or CD-Table in short. A sample CD-Table of a hypothetical organization is shown in Table 1 for the sake of understanding.

It may be noted that for any organization the individual space requirements of datatypes such as character, integer and boolean are fixed. For a particular organization, however, the space requirement of the datatypes String-1, String-2 and String-3 (the String type appears three times in this case) has been fixed in Table 1 at 10 bytes for one kind, 20 bytes for another and 50 bytes for the third. Similarly there are four types of file in the CD-Table, File-1, File-2, File-3 and File-4, whose space requirements have been fixed at 100 KB, 1 MB, 10 MB and 25 MB respectively by choice of the concerned organization (developers).

Table 1. A hypothetical example of a CD-Table of an organization

Sr. No.   Datatype    Space required in bytes (n)   Code of Datatype (c)
1         Character   ...                           ...
2         Integer     ...                           ...
3         Real        ...                           ...
4         String-1    10                            ...
5         String-2    20                            ...
6         String-3    50                            ...
7         File-1      100 KB                        6
8         File-2      1 MB                          7
9         File-3      10 MB                         8
10        File-4      25 MB                         ...

Coach of an Atrain Stack

As seen in Section 2.1 earlier, all the coaches in a train stack contain data of one homogeneous datatype across the train. In an atrain stack, a coach still stores homogeneous data only (not heterogeneous data), but different coaches of the atrain stack store data elements of different datatypes.
Thus the data structure atrain stack is a kind of heterogeneous data structure. To construct a coach for an atrain stack in an organization, we must know in advance the datatype of the data to be stored in it. For this we have to look at the CD-Table of the organization and reserve space accordingly for r data elements.
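As a rough illustration of how a CD-Table drives space reservation, the following Python sketch stores a table as a dictionary and computes the bytes needed for a coach of r elements. The String and File sizes follow the sample described in the text; the integer codes, the pairing of sizes with String-1 to String-3, the binary interpretation of KB and MB, and the names CD_TABLE, register and coach_bytes are all assumptions made for the example.

```python
from typing import Dict, Tuple

KB, MB = 1024, 1024 * 1024  # assumption: binary units

# code -> (datatype name, bytes per element).
# The integer codes below are illustrative assumptions.
CD_TABLE: Dict[int, Tuple[str, int]] = {
    3: ("String-1", 10),
    4: ("String-2", 20),
    5: ("String-3", 50),
    6: ("File-1", 100 * KB),
    7: ("File-2", 1 * MB),
    8: ("File-3", 10 * MB),
    9: ("File-4", 25 * MB),
}


def register(code: int, name: str, size: int) -> None:
    """Add a new datatype; existing records are never altered."""
    if code in CD_TABLE:
        raise KeyError(f"code {code} is already fixed in the CD-Table")
    CD_TABLE[code] = (name, size)


def coach_bytes(code: int, r: int) -> int:
    """Bytes to reserve for a coach holding r elements of the coded datatype."""
    _name, size = CD_TABLE[code]
    return r * size


print(coach_bytes(4, 7))  # 7 elements of 20 bytes each: prints 140
```

The register function reflects the rule that an organization may extend its CD-Table later but should not change codes that are already fixed.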
Suppose that the larray of a coach C has r elements of a given datatype. If each element is of size x bytes (refer to the CD-Table), then exactly r·x consecutive bytes are required to store the coach C in memory, and the coach is created accordingly by the programmer (concerned organization).

In our discussion henceforth, by the phrase datatype of a coach we shall always mean the datatype of the data elements of the coach. A coach can store only homogeneous data (i.e. data of identical datatype), but the datatype may be different for different coaches in an atrain stack.

By a coach C in an atrain stack we mean a non-empty larray (which could, however, be a null larray) containing data of homogeneous datatype. The following two figures show two coaches C1 and C2 of an atrain stack T.

[Figure 6. A coach C1 of a 7-atrain stack T]

[Figure 7. Another coach C2 of the 7-atrain stack T]

Status of a Coach and Tagged Coach (TC) of an Atrain Stack

The status s of a coach in an atrain stack is a pair of information (c, n), where c is a non-negative integer variable which is the code of the datatype (with reference to the concerned CD-Table) of the data to be stored in this coach, and n is a non-negative integer variable equal to the number of ɛ-elements present in it (i.e. in its larray) at this point of time. Therefore 0 ≤ n ≤ r. In the status s = (c, n) of a coach, c is called the code-status of the coach and n is called the availability-status of the coach at this time. The significance of the variable n is that it tells us the exact number of free spaces available in the coach at this point of time. If there is no ɛ-element at this time in the larray of the coach C, then the value of n is 0. Thus the status of a coach cannot and should not be fixed without referring to the CD-Table.
If C is a coach in an atrain stack, then the corresponding tagged coach (TC) is denoted by [C, s], where s = (c, n) is the status of the coach. This means that C is a coach tagged with the following two pieces of information:

(i) one signifying the datatype of the data of the coach;
(ii) the other giving the total amount of free space (here measured in ɛ-elements) available inside the coach at this time.
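A tagged coach of an atrain stack can be sketched along the same lines, now carrying the pair (c, n). This is an illustrative sketch only: the class AtrainCoach, the mapping PY_TYPE from codes to Python types, the use of None for ɛ, and the sample values are assumptions for the example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

EPS = None  # assumed stand-in for the null element

# Hypothetical mapping from CD-Table codes to Python types (assumption).
PY_TYPE: Dict[int, type] = {0: str, 1: int}


@dataclass
class AtrainCoach:
    """A coach of an atrain stack, tagged with its status (c, n)."""

    code: int          # c: code-status, the datatype code of the coach
    cells: List[Any]   # the larray of r cells

    def __post_init__(self) -> None:
        # A coach is homogeneous: every occupied cell has the coded datatype.
        for cell in self.cells:
            if cell is not EPS and not isinstance(cell, PY_TYPE[self.code]):
                raise TypeError("all data in a coach must share its datatype")

    @property
    def status(self) -> Tuple[int, int]:
        # n: availability-status, the number of free (EPS) cells.
        n = sum(1 for cell in self.cells if cell is EPS)
        return (self.code, n)


# A 7-cell coach of characters with two free cells (values are made up).
c2 = AtrainCoach(0, ["#", EPS, "&", EPS, "#", "a", "b"])
print(c2.status)  # prints: (0, 2)
```

The datatype check sits in the constructor so that a coach can never be created in a heterogeneous state; heterogeneity only arises between coaches with different codes.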
(Thus the concept of TC for an atrain stack is slightly different from the concept of TC for a train stack defined earlier in Section 2.1.)

For example, consider the two coaches C1 and C2 of the atrain stack T in Figure 6 and Figure 7 respectively. The following two figures (Figure 8 and Figure 9) show the tagged coaches of C1 and C2 in the atrain stack T.

[Figure 8. The tagged coach [C1, (2, 0)] of the coach C1 in the 7-atrain stack T]

[Figure 9. The tagged coach [C2, (0, 2)] of the coach C2 in the 7-atrain stack T]

r-atrain Stack (or Atrain Stack)

An atrain stack is a crisp stack of r-atrains containing data of heterogeneous datatype. Each coach is homogeneous internally, containing data of identical datatype, but different coaches may hold data of different datatypes. For a given r-atrain stack, the value of the natural number r is fixed throughout. For different r-atrain stacks, the value of r could be different.

The datatype in an atrain stack may vary from coach to coach (unlike in train stacks), but within a coach all data must be homogeneous, i.e. of identical datatype. Thus each coach is homogeneous although the atrain stack is heterogeneous, and in this sense the atrain stack is regarded as a heterogeneous data structure. (In a homogeneous data structure the data elements are all of the same datatype, as in an array, linked list, train, etc., whereas in a heterogeneous data structure the data elements are of various datatypes, as in an atrain.)

Analogous to the train stack, an atrain stack is also a linear data structure in which an atrain may be added or removed only at one end, called the TOP of the atrain stack, in LIFO manner as in a classical stack (see Figure 10). If TOP = 0 (Null), the atrain stack is empty.
[Figure 10. An atrain stack]

Thus an atrain stack T (see Figure 10) may be written as an array of atrains T: T1, T2, T3, ..., Tn, with the implication that the rightmost element is the top element (atrain Tn). Insertion and deletion can occur only at the top of the atrain stack. The two basic operations associated with atrain stacks are:

(i) PUSH (to insert a new atrain into an atrain stack)
(ii) POP (to delete an atrain from an atrain stack)

The atrain stack has the following two important operations too, which can be easily implemented using PUSH and POP multiple times, and the properties of the data structure larray:

(iii) c-PUSH for pushing a coach (to insert a tagged coach in a given atrain of the atrain stack)
(iv) c-POP for popping a coach (to delete a tagged coach from a given atrain of the atrain stack)

It is important to note that in the heterogeneous data structure atrain stack the sizes of the coaches in bytes are not the same, although each coach accommodates r data elements (including ɛ, which is of the same datatype, as explained in [12, 15]). The size depends upon the code from the CD-Table. Therefore, each time we PUSH a new atrain onto an atrain stack (assuming that there is no overflow), the TOP gets updated accordingly in a dynamic way. Similarly, each time we POP an atrain from an atrain stack, the TOP gets updated accordingly.
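The atrain stack operations listed above can be sketched in Python as follows. The sketch makes several assumptions that are not in the paper: an atrain is modelled as a list of (code, cells) pairs, TOP is taken to be the number of atrains on the stack, and c-PUSH and c-POP reach the k-th atrain by direct indexing rather than by repeated PUSH and POP. The names AtrainStack, push, pop, c_push and c_pop are hypothetical.

```python
from typing import Any, List, Tuple

Coach = Tuple[int, List[Any]]  # (datatype code c, larray of r cells)
Atrain = List[Coach]


class AtrainStack:
    """Sketch of an r-atrain stack: a LIFO stack of atrains."""

    def __init__(self, r: int) -> None:
        self.r = r
        self._atrains: List[Atrain] = []

    @property
    def top(self) -> int:
        # TOP = 0 means the atrain stack is empty.
        return len(self._atrains)

    def _check(self, coach: Coach) -> None:
        if len(coach[1]) != self.r:
            raise ValueError("every coach must have exactly r cells")

    def push(self, atrain: Atrain) -> None:
        """PUSH: insert a new atrain at the top."""
        for coach in atrain:
            self._check(coach)
        self._atrains.append(list(atrain))

    def pop(self) -> Atrain:
        """POP: delete and return the top atrain."""
        if self.top == 0:
            raise IndexError("atrain stack is empty")
        return self._atrains.pop()

    def c_push(self, k: int, coach: Coach) -> None:
        """c-PUSH: append a tagged coach to atrain Tk (1-based)."""
        if not 1 <= k <= self.top:
            raise IndexError("no such atrain")
        self._check(coach)
        self._atrains[k - 1].append(coach)

    def c_pop(self, k: int) -> Coach:
        """c-POP: remove the last coach of atrain Tk (1-based)."""
        if not 1 <= k <= self.top:
            raise IndexError("no such atrain")
        return self._atrains[k - 1].pop()


s = AtrainStack(r=2)
s.push([(0, ["a", "b"])])                     # T1: one coach of characters
s.push([(1, [1, 2]), (0, ["x", None])])       # T2: two coaches, mixed codes
s.c_push(1, (1, [7, 8]))                      # T1 gains an integer coach
print(s.top)  # prints: 2
```

Because coaches with different codes occupy different numbers of bytes, a lower-level implementation would track TOP as a memory location; counting atrains, as done here, is a simplification.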
ra-coach Stack (or Coach Stack)

An ra-coach stack (or coach stack in short) is a special case of the r-atrain stack where every atrain consists of one coach only. For a given ra-coach stack, the value of the natural number r is fixed throughout. For different ra-coach stacks, the value of r could be different. Thus an ra-coach stack is a crisp stack of tagged coaches (of atrains) containing data of heterogeneous datatype. It is a linear data structure in which a tagged coach may be added or removed only at one end, called the TOP of the ra-coach stack, in LIFO manner as in a classical stack. If TOP = 0 (Null), the ra-coach stack is empty. (The term ra-coach signifies that it is a coach of an r-atrain.)

[Figure 11. An ra-coach stack]

Thus an ra-coach stack T (see Figure 11) may be written as an array of TCs T: [C1, s1], [C2, s2], [C3, s3], [C4, s4], ..., [Cn, sn], i.e. as T: [C1, (c1, n1)], [C2, (c2, n2)], [C3, (c3, n3)], [C4, (c4, n4)], ..., [Cn, (cn, nn)], with the implication that the rightmost element is the top element (TC). Insertion and deletion can occur only at the top of the ra-coach stack. The two basic operations associated with ra-coach stacks are:

(i) c-PUSH (to insert a new tagged coach into an ra-coach stack)
(ii) c-POP (to delete a tagged coach from an ra-coach stack)

The ra-coach stack has the following two important operations too, which can be easily implemented using the PUSH, POP, c-PUSH and c-POP operations multiple times, and the properties of the data structure larray:
(iii) e-PUSH for pushing an element (to insert an element into a tagged coach (TC) of an ra-coach stack whose availability-status n is not 0, i.e. at least 1). Insertion of an element (a new passenger) x into the coach Ci is feasible if x is of the same datatype as the other passengers of the coach and if there is an empty space available inside Ci. If the availability-status of Ci is greater than 0, the data can be stored successfully in the coach; otherwise the insertion fails. After each successful insertion, the availability-status of the coach is updated by n = n − 1. To insert x, we can replace the lowest-indexed ɛ passenger of Ci with x.

(iv) e-POP for popping an element (to delete an element from a tagged coach of an ra-coach stack, retaining the tagged coach in memory and adjusting its availability-status by n = n + 1). We can delete a data element eij from the coach Ci. Deletion of a data element (passenger) from a coach means replacement of that element by an ɛ-element (of the same datatype). Consequently, if eij = ɛ, the question of deletion does not arise; it is assumed here that eij is a non-ɛ member of the coach Ci in the atrain stack. For j = 1, 2, ..., r, deletion of eij is done by replacing it with the null element ɛ and updating the availability-status by n = n + 1. Deletion of a data element (passenger) does not affect the size r of the coach. For example, consider the tagged coach [Ci, (ci, m)] where Ci = < ei1, ei2, ei3, ei4, ..., eir >. If we delete ei3 from the coach Ci, then the updated tagged coach is [Ci, (ci, m+1)] where Ci = < ei1, ei2, ɛ, ei4, ..., eir >.

It can be noted that the data structure rt-coach stack is a special case of the data structure r-train stack, and the data structure ra-coach stack is a special case of the data structure r-atrain stack. The classical stack (in Computer Science) is a 1a-coach stack, the special case of the ra-coach stack for r = 1.
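The closing remark about r = 1 can be made concrete with a small Python sketch. With one cell per coach, a c-PUSH followed by an e-PUSH behaves like a classical push, and a c-POP like a classical pop. The class OneCoachStack, its method names, and the integer codes used in the example are illustrative assumptions.

```python
from typing import Any, List

EPS = None  # assumed stand-in for the null element


class OneCoachStack:
    """Sketch of an ra-coach stack with r = 1: one cell per coach."""

    def __init__(self) -> None:
        self._coaches: List[List[Any]] = []  # each coach is [code, cell]

    def push(self, code: int, x: Any) -> None:
        coach = [code, EPS]          # c-PUSH a free coach of the given code
        coach[1] = x                 # e-PUSH fills its single cell
        self._coaches.append(coach)

    def pop(self) -> Any:
        if not self._coaches:
            raise IndexError("stack is empty")
        _code, x = self._coaches.pop()  # c-POP removes the top coach
        return x

    def __len__(self) -> int:
        return len(self._coaches)


st = OneCoachStack()
st.push(1, 42)     # code 1 is a made-up code for integers
st.push(0, "a")    # code 0 is a made-up code for characters
print(st.pop(), st.pop())  # prints: a 42
```

The elements come back in LIFO order exactly as in a classical stack; the only extra information carried is the datatype code of each one-cell coach.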
3 r-train Queue and r-atrain Queue

In this section we introduce two new data structures for big data: (1) the r-train queue, or train queue in short, and (2) the r-atrain queue, or atrain queue in short. The terms coach, status and tag in a train queue have the same meaning as for a coach in a train stack, as discussed in Section 2.1. The terms coach, status, tag and CD-table in an atrain queue have the same meaning as for a coach in an atrain stack, as discussed earlier in Section 2.2.
An r-train queue (or train queue) is a crisp queue of r-trains containing data of homogeneous datatype across the whole queue. It is a linear data structure in which deletion of a train can take place only at one end, called the front, and insertion of a train can take place only at the other end, called the rear, in the manner of a FIFO list, as shown in Figure 12.

[Figure 12. A train queue]

In the larray of a train queue there are two pointer variables, FRONT and REAR. FRONT contains the location of the front TC and REAR contains the location of the rear TC. If FRONT = Null, the train queue is empty. Whenever a TC is deleted from the train queue, the value of FRONT is incremented by 1. Similarly, whenever a TC is added to the train queue, the value of REAR is incremented by 1.

An rt-coach queue is a crisp queue of tagged coaches (of trains) containing data of homogeneous datatype. It is a linear data structure in which deletion of a tagged coach can take place only at one end, called the front, and insertion of a tagged coach can take place only at the other end, called the rear, in the manner of a FIFO list, as shown in Figure 13.

[Figure 13. An rt-coach queue]

An r-atrain queue (or atrain queue) is a crisp queue of r-atrains containing data of heterogeneous datatype, where each coach is homogeneous in itself and the datatypes of different coaches may differ. Thus it is a linear data structure in which deletion of an atrain can take place only at one end, called the front, and insertion of an atrain can take place only at the other end, called the rear, in the manner of a FIFO list, as shown in Figure 14.

[Figure 14. An atrain queue]

The datatype in an atrain queue may vary from coach to coach (unlike in a train
queue), but within a coach all data must be homogeneous, i.e. of identical datatype. Thus each coach is homogeneous although the atrain queue is heterogeneous, and in this sense the atrain queue is regarded as a heterogeneous data structure.

An ra-coach queue is a crisp queue of tagged coaches (as defined in Section 2.2; see 2.2.3) containing data of heterogeneous datatype (see Figure 15).

[Figure 15. An ra-coach queue]

In an atrain queue or ra-coach queue, different TCs are of different sizes, depending upon the CD-table concerned.

It can be noted that the data structure rt-coach queue is a special case of the data structure r-train queue, and the data structure ra-coach queue is a special case of the data structure r-atrain queue. The classical queue is a special case of the data structures r-train queue and r-atrain queue.

4 r-train Binary Tree and r-atrain Binary Tree

In this section we introduce two new non-linear data structures for big data: (1) the r-train binary tree, or train binary tree in short, and (2) the r-atrain binary tree, or atrain binary tree in short. They are basically extensions of the notion of the classical binary tree, and are defined in a similar way in the context of the train-atrain architecture for big data. The terms coach, status and tag in a train binary tree or train tree have the same meaning as for a coach in a train stack, as discussed in Section 2.1. The terms coach, status, tag and CD-table in an atrain binary tree or atrain tree have the same meaning as for a coach in an atrain stack, as discussed earlier in Section 2.2.

An r-train binary tree (or train binary tree, in short) T is defined as a finite set of r-trains called train-nodes such that:

(a) T is empty (called the null tree or empty tree), or
(b) T contains a distinguished train-node R called the root-train of T, and the remaining train-nodes of T form an ordered pair of disjoint train binary trees T1 and T2.
If T does contain a root-train R, then the two train binary trees T1 and T2 are called respectively the left train subtree and the right train subtree of R. If T1 is non-empty then its root-train is called the left successor of R, and if T2 is non-empty then its root-train is called the right successor of R (see Figure 16). The concepts of left child train, right child train, parent train, sibling trains, etc. can also be understood in the classical way.

[Figure 16. A train binary tree]

An rt-coach binary tree T is defined as a finite set of tagged train coaches called coach-nodes such that:

(a) T is empty (called the null tree or empty tree), or
(b) T contains a distinguished coach-node R called the root-coach of T, and the remaining coach-nodes of T form an ordered pair of disjoint rt-coach binary trees T1 and T2 (see Figure 17).

[Figure 17. An rt-coach binary tree]
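The recursive definition of an rt-coach binary tree maps directly onto a node type with two optional subtrees. The following Python sketch is an illustration only: the names CoachNode and inorder are hypothetical, a coach is modelled as a plain list of r cells with None for ɛ, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Any, Iterator, List, Optional


@dataclass
class CoachNode:
    """A coach-node: one coach plus its left and right subtrees."""

    cells: List[Any]                       # the coach (larray of r cells)
    left: Optional["CoachNode"] = None     # root-coach of T1, if non-empty
    right: Optional["CoachNode"] = None    # root-coach of T2, if non-empty


def inorder(t: Optional[CoachNode]) -> Iterator[List[Any]]:
    """Yield the coaches of the tree in left, root, right order."""
    if t is not None:          # None plays the role of the empty tree
        yield from inorder(t.left)
        yield t.cells
        yield from inorder(t.right)


root = CoachNode(
    [1, 2, None],
    left=CoachNode([3, 4, 5]),
    right=CoachNode([None, None, 6]),
)
print(list(inorder(root)))
# prints: [[3, 4, 5], [1, 2, None], [None, None, 6]]
```

A train binary tree could be sketched the same way by storing a whole train (a sequence of coaches) in each node instead of a single coach.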
An r-atrain binary tree (or atrain binary tree, in short) can be defined in an analogous way (see Figure 18), with tagged coaches as defined in Section 2.2.

[Figure 18. An atrain binary tree]

An ra-coach binary tree T is defined as a finite set of tagged atrain coaches called coach-nodes such that:

(a) T is empty (called the null tree or empty tree), or
(b) T contains a distinguished coach-node R called the root-coach of T, and the remaining coach-nodes of T form an ordered pair of disjoint ra-coach binary trees T1 and T2 (see Figure 19).

[Figure 19. An ra-coach binary tree]

An r-train tree (or train tree in short) T is defined as a non-empty finite set of r-trains called train-nodes such that:

(a) T contains a distinguished train-node R called the root-train of T, and
(b) the remaining train-nodes of T form an ordered collection of zero or more disjoint train trees T1, T2, ..., Tm.
The train trees T1, T2, ..., Tm are called train subtrees of the root-train R, and the root-trains of T1, T2, ..., Tm are called successors of R. The concepts of child trains, parent train, sibling trains, etc. can be understood in the classical way. An r-atrain tree (or atrain tree, in short), an rt-coach tree and an ra-coach tree can be defined in an analogous way.

The data structures r-train, r-atrain, r-train tree and r-atrain tree should not be confused with any of the data structures (basically for multidimensional data) of the group: X-tree, R-tree, R+-tree, R*-tree, M-tree, KD-tree, PQ-tree, SPQR-tree, Hilbert R-tree, etc.

It may be noted that the data structure rt-coach binary tree is a special case of the data structure r-train binary tree, and the data structure ra-coach binary tree is a special case of the data structure r-atrain binary tree. The classical binary tree is a special case of the data structures r-train binary tree and r-atrain binary tree. Similarly, the classical tree is a special case of the data structures r-train tree and r-atrain tree.

The ADS (unitier or multitier), with the new types of network topology Multi-horse Cart Topology and Cycle Topology defined by Biswas in [15], is the appropriate distributed system in which to implement the atrain tree for big data of any 4Vs of any momentum, as the ADS is scalable to any desired extent both in breadth and in depth.

5 Conclusion

Today's supercomputers and multiprocessor systems, which can provide huge parallelism, have become the dominant computing platform (through the proliferation of multi-core processors). In most giant business organizations the system has to deal with big data of homogeneous or heterogeneous nature, for which the data structures of the existing literature cannot always give the desired performance. The existing data structures of Computer Science are neither appropriate nor sufficient to deal with big data.
Very common and frequent operations such as insertion, deletion and searching are required to be much faster, even if the Big Data is of heterogeneous types. Such situations require methods that work more efficiently than simple rudimentary data structures such as arrays, linked lists, hash tables, binary search trees and dictionaries. There is therefore a need for a new or better-performing heterogeneous data structure that is, at the same time, rudimentary in nature. The data structure r-train (or train, in short) is a homogeneous data structure that can deal with homogeneous Big Data very efficiently. The data structure r-atrain (or atrain) is a robust, dynamic heterogeneous data structure that encapsulates the merits of arrays and of linked lists, yet differs from both, and can deal with heterogeneous Big Data. The strong merit of the atrain distributed system (ADS) introduced in [15] is that it is an infinitely scalable architecture (in both breadth and depth) of distributed system for processing big data of any momentum of 4Vs. It is so constructed that it has 100% compatibility with the data structure atrain (and train), designed exclusively for big data. The data structures train tree and atrain tree can be well implemented in atrain distributed systems for big data.

Although big data is expanding explosively in its 4Vs, the subject has not been growing at the same pace. One of the reasons is that the existing data structures are not appropriate for big data. The first attempt to introduce a data structure for big data is the data structures atrain (and train) in [11, 12, 15], together with the distributed system ADS and its new network topologies, Multi-horse Cart Topology and Cycle Topology. "Data Structures for Big Data" is to be regarded as a new subject, not just as a new topic in Big Data Science, considering the explosive momentum of big data. Consequently, in this paper we introduce a few more new data structures for big data: homogeneous stacks (train stack and rt-coach stack), heterogeneous stacks (atrain stack and ra-coach stack), homogeneous queues (train queue and rt-coach queue), heterogeneous queues (atrain queue and ra-coach queue), homogeneous binary trees (train binary tree and rt-coach binary tree), heterogeneous binary trees (atrain binary tree and ra-coach binary tree), homogeneous trees (train tree and rt-coach tree) and heterogeneous trees (atrain tree and ra-coach tree). These enrich the new subject "Data Structures for Big Data" by extending the architecture of the data structures r-atrain (atrain) and r-train (train) of Biswas [11, 12, 15] for big data. The natural number r is fixed for a given stack, queue, binary tree or tree, but may differ between different stacks, queues, binary trees or trees. All these big data structures can be well implemented in ADS.
The Big Data Universe needs intensive attention, not only to find solutions to the present problem of explosive expansion, but also in view of the tremendous future expansion. It is a fact [15] that the present 4Ns are lagging behind the present 4Vs. Each day's big data is bigger than the previous day's big data in the big data universe. The data structure atrain and the distributed system ADS can deal with any amount of 4Vs because of their scalability up to any desired extent in both breadth and depth. However, for the future big data of the Big Data Universe, the train/atrain data structures can be further extended by embedding the families of data structures proposed in this paper to create new big data structures, viz. m-train of r-trains, m-atrain of r-atrains, m-train of r-train stacks, m-atrain of r-atrain stacks, m-train of r-train queues, m-atrain of r-atrain queues, m-train of r-train binary trees, m-atrain of r-atrain binary trees, etc., depending upon the 4Vs of future big data. Consequently, the corresponding distributed system EADS ("Extended ADS") of several designs can be created by developing new methods for integrating multiple ADSs into an EADS, and by designing new types of Hybrid Topologies combining Multi-horse Cart Topology, Cycle Topology and other existing classical topologies.
Our next work will be to study these big data structures with sufficient examples, and to introduce the data structures train graph and atrain graph, considering big data in the topologies of communication networks, in particular when a large number of communication networks logically merge to form a single network that works efficiently and appears as a single network to lay users. An important problem for computer scientists is to develop a new but simple big data language, called ADSL (Atrain Distributed System Language), which can easily be used by any lay user at the PC of an ADS to download an unlimited amount of relevant big data of any heterogeneous datatype (if not of a homogeneous datatype) from cyberspace, to upload an unlimited amount of big data of any heterogeneous (or homogeneous) datatype to cyberspace, to store the downloaded big data online in the coaches of the PC/DCs of the ADS in an organized way following the atrain (train) architecture, and to answer any query on big data, be it an arithmetical, statistical, relational or imprecise query.

References

[1] Andrew S. Tanenbaum, Computer Networks (3rd Edition), Pearson Education India.
[2] Ashu M. G. Solo, Multidimensional Matrix Mathematics: Part 1-6, Proceedings of the World Congress on Engineering 2010 Vol III, WCE 2010, June 30 - July 2, 2010, London.
[3] Bashir Alam, Matrix Multiplication using r-train Data Structure, 2013 AASRI Conference on Parallel and Distributed Computing Systems, AASRI (Elsevier) Procedia 5 (2013), doi: /j.aasri
[4] Christian Krattenthaler and Michael Schlosser, A New Multidimensional Matrix Inverse with Applications to Multiple q-series, Discrete Mathematics, Volume 204, Issues 1-3, 6 June 1999.
[5] David Feinleib, Big Data Demystified: How Big Data Is Changing The Way We Live, Love And Learn, The Big Data Group Publisher, LLC San Francisco, USA.
[6] Erik Thomsen, OLAP Solutions: Building Multidimensional Information Systems, 2nd Edition, John Wiley & Sons, USA.
[7] Jeffrey Needham, Disruptive Possibilities: How Big Data Changes Everything, O'Reilly Publisher, Cambridge, 2013.
[8] Joel L Franklin, Matrix Theory, Mineola, N.Y.
[9] Jules J Berman, Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information, Morgan Kaufmann (Elsevier) Publisher, USA.
[10] Phil Simon, Too Big to Ignore: The Business Case for Big Data, John Wiley & Sons, New Jersey.
[11] Ranjit Biswas, Heterogeneous Data Structure R-Atrain, INFORMATION: An International Journal (Japan), Vol. 15 (2), February 2012 (2012 International Information Institute of Japan & USA).
[12] Ranjit Biswas, Heterogeneous Data Structure r-atrain, Chapter 12 in Global Trends in Knowledge Representation and Computational Intelligence by B. K. Tripathy and D. P. Acharjya, IGI Global, USA.
[13] Ranjit Biswas, Region Algebra, Theory of Objects & Theory of Numbers, International Journal of Algebra, Vol. 6 (8), 2012.
[14] Ranjit Biswas, Theory of Solid Matrices & Solid Latrices, Introducing New Data Structures MA, MT: for Big Data, International Journal of Algebra, Vol. 7 (16), 2013.
[15] Ranjit Biswas, Processing of Heterogeneous Big Data in an Atrain Distributed System (ADS) Using the Heterogeneous Data Structure r-atrain, International Journal of Computing and Optimization, Vol. 1 (1), 2014.
[16] Sitarski, Edward, HATs: Hashed array trees, Dr. Dobb's Journal 21(11).
[17] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, Introduction to Algorithms, Second Edition, MIT Press and McGraw-Hill.
[18] Viktor Mayer-Schönberger and Kenneth Cukier, BIG DATA: A Revolution That Will Transform How We Live, Work, and Think, Eamon Dolan/Houghton Mifflin Harcourt Publisher.

Received: August 15, 2014; Published: November 12, 2014
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A REVIEW ON THE USAGE OF OLD AND NEW DATA STRUCTURE ARRAYS, LINKED LIST, STACK,
1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D.
1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D. base address 2. The memory address of fifth element of an array can be calculated
Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C
Tutorial#1 Q 1:- Explain the terms data, elementary item, entity, primary key, domain, attribute and information? Also give examples in support of your answer? Q 2:- What is a Data Type? Differentiate
5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.
1. The advantage of.. is that they solve the problem if sequential storage representation. But disadvantage in that is they are sequential lists. [A] Lists [B] Linked Lists [A] Trees [A] Queues 2. The
DATA STRUCTURES USING C
DATA STRUCTURES USING C QUESTION BANK UNIT I 1. Define data. 2. Define Entity. 3. Define information. 4. Define Array. 5. Define data structure. 6. Give any two applications of data structures. 7. Give
IMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE OPERATORS
Volume 2, No. 3, March 2011 Journal of Global Research in Computer Science RESEARCH PAPER Available Online at www.jgrcs.info IMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE
Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students
Eastern Washington University Department of Computer Science Questionnaire for Prospective Masters in Computer Science Students I. Personal Information Name: Last First M.I. Mailing Address: Permanent
Algorithms and Data Structures
Algorithms and Data Structures Part 2: Data Structures PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering (CiE) Summer Term 2016 Overview general linked lists stacks queues trees 2 2
Physical Data Organization
Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor
Data Structure [Question Bank]
Unit I (Analysis of Algorithms) 1. What are algorithms and how they are useful? 2. Describe the factor on best algorithms depends on? 3. Differentiate: Correct & Incorrect Algorithms? 4. Write short note:
Fast Sequential Summation Algorithms Using Augmented Data Structures
Fast Sequential Summation Algorithms Using Augmented Data Structures Vadim Stadnik [email protected] Abstract This paper provides an introduction to the design of augmented data structures that offer
CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team
CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team Lecture Summary In this lecture, we learned about the ADT Priority Queue. A
Chapter 3: Restricted Structures Page 1
Chapter 3: Restricted Structures Page 1 1 2 3 4 5 6 7 8 9 10 Restricted Structures Chapter 3 Overview Of Restricted Structures The two most commonly used restricted structures are Stack and Queue Both
New Hash Function Construction for Textual and Geometric Data Retrieval
Latest Trends on Computers, Vol., pp.483-489, ISBN 978-96-474-3-4, ISSN 79-45, CSCC conference, Corfu, Greece, New Hash Function Construction for Textual and Geometric Data Retrieval Václav Skala, Jan
Full and Complete Binary Trees
Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full
THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM
THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM Iuon Chang Lin Department of Management Information Systems, National Chung Hsing University, Taiwan, Department of Photonics and Communication Engineering,
1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++
Answer the following 1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++ 2) Which data structure is needed to convert infix notations to postfix notations? Stack 3) The
Load Balancing using Potential Functions for Hierarchical Topologies
Acta Technica Jaurinensis Vol. 4. No. 4. 2012 Load Balancing using Potential Functions for Hierarchical Topologies Molnárka Győző, Varjasi Norbert Széchenyi István University Dept. of Machine Design and
Master of Science in Computer Science College of Computer and Information Science. Course Syllabus
Master of Science in Computer Science College of Computer and Information Science Course Syllabus Course Number/Name: CS 5800 Algorithms (Master Level) Term: Fall 2013, 15 week term, Hybrid Dates: September
Data Structure with C
Subject: Data Structure with C Topic : Tree Tree A tree is a set of nodes that either:is empty or has a designated node, called the root, from which hierarchically descend zero or more subtrees, which
Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students
Eastern Washington University Department of Computer Science Questionnaire for Prospective Masters in Computer Science Students I. Personal Information Name: Last First M.I. Mailing Address: Permanent
Home Page. Data Structures. Title Page. Page 1 of 24. Go Back. Full Screen. Close. Quit
Data Structures Page 1 of 24 A.1. Arrays (Vectors) n-element vector start address + ielementsize 0 +1 +2 +3 +4... +n-1 start address continuous memory block static, if size is known at compile time dynamic,
Eastern Washington University Department of Computer Science. Questionnaire for Prospective Masters in Computer Science Students
Eastern Washington University Department of Computer Science Questionnaire for Prospective Masters in Computer Science Students I. Personal Information Name: Last First M.I. Mailing Address: Permanent
Bachelor of Games and Virtual Worlds (Programming) Subject and Course Summaries
First Semester Development 1A On completion of this subject students will be able to apply basic programming and problem solving skills in a 3 rd generation object-oriented programming language (such as
NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE
www.arpapress.com/volumes/vol13issue3/ijrras_13_3_18.pdf NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE Hebah H. O. Nasereddin Middle East University, P.O. Box: 144378, Code 11814, Amman-Jordan
10CS35: Data Structures Using C
CS35: Data Structures Using C QUESTION BANK REVIEW OF STRUCTURES AND POINTERS, INTRODUCTION TO SPECIAL FEATURES OF C OBJECTIVE: Learn : Usage of structures, unions - a conventional tool for handling a
Catalan Numbers. Thomas A. Dowling, Department of Mathematics, Ohio State Uni- versity.
7 Catalan Numbers Thomas A. Dowling, Department of Mathematics, Ohio State Uni- Author: versity. Prerequisites: The prerequisites for this chapter are recursive definitions, basic counting principles,
The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2).
CHAPTER 5 The Tree Data Model There are many situations in which information has a hierarchical or nested structure like that found in family trees or organization charts. The abstraction that models hierarchical
Multi-dimensional index structures Part I: motivation
Multi-dimensional index structures Part I: motivation 144 Motivation: Data Warehouse A definition A data warehouse is a repository of integrated enterprise data. A data warehouse is used specifically for
ALLIED PAPER : DISCRETE MATHEMATICS (for B.Sc. Computer Technology & B.Sc. Multimedia and Web Technology)
ALLIED PAPER : DISCRETE MATHEMATICS (for B.Sc. Computer Technology & B.Sc. Multimedia and Web Technology) Subject Description: This subject deals with discrete structures like set theory, mathematical
Network (Tree) Topology Inference Based on Prüfer Sequence
Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 [email protected],
Compression algorithm for Bayesian network modeling of binary systems
Compression algorithm for Bayesian network modeling of binary systems I. Tien & A. Der Kiureghian University of California, Berkeley ABSTRACT: A Bayesian network (BN) is a useful tool for analyzing the
Glossary of Object Oriented Terms
Appendix E Glossary of Object Oriented Terms abstract class: A class primarily intended to define an instance, but can not be instantiated without additional methods. abstract data type: An abstraction
Proposal of Dynamic Load Balancing Algorithm in Grid System
www.ijcsi.org 186 Proposal of Dynamic Load Balancing Algorithm in Grid System Sherihan Abu Elenin Faculty of Computers and Information Mansoura University, Egypt Abstract This paper proposed dynamic load
Please consult the Department of Engineering about the Computer Engineering Emphasis.
COMPUTER SCIENCE Computer science is a dynamically growing discipline. ABOUT THE PROGRAM The Department of Computer Science is committed to providing students with a program that includes the basic fundamentals
Statistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
Mining Text Data: An Introduction
Bölüm 10. Metin ve WEB Madenciliği http://ceng.gazi.edu.tr/~ozdemir Mining Text Data: An Introduction Data Mining / Knowledge Discovery Structured Data Multimedia Free Text Hypertext HomeLoan ( Frank Rizzo
CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92.
Name: Email ID: CSE 326, Data Structures Section: Sample Final Exam Instructions: The exam is closed book, closed notes. Unless otherwise stated, N denotes the number of elements in the data structure
Binary Coded Web Access Pattern Tree in Education Domain
Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: [email protected] M. Moorthi
Binary search tree with SIMD bandwidth optimization using SSE
Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous
Polynomial Degree and Lower Bounds in Quantum Complexity: Collision and Element Distinctness with Small Range
THEORY OF COMPUTING, Volume 1 (2005), pp. 37 46 http://theoryofcomputing.org Polynomial Degree and Lower Bounds in Quantum Complexity: Collision and Element Distinctness with Small Range Andris Ambainis
A Novel Binary Particle Swarm Optimization
Proceedings of the 5th Mediterranean Conference on T33- A Novel Binary Particle Swarm Optimization Motaba Ahmadieh Khanesar, Member, IEEE, Mohammad Teshnehlab and Mahdi Aliyari Shoorehdeli K. N. Toosi
7.1 Our Current Model
Chapter 7 The Stack In this chapter we examine what is arguably the most important abstract data type in computer science, the stack. We will see that the stack ADT and its implementation are very simple.
Binary Heaps * * * * * * * / / \ / \ / \ / \ / \ * * * * * * * * * * * / / \ / \ / / \ / \ * * * * * * * * * *
Binary Heaps A binary heap is another data structure. It implements a priority queue. Priority Queue has the following operations: isempty add (with priority) remove (highest priority) peek (at highest
Methods to increase search performance for encrypted databases
Available online at www.sciencedirect.com Procedia Economics and Finance 3 ( 2012 ) 1063 1068 Emerging Markets Queries in Finance and Business Methods to increase search performance for encrypted databases
Row Echelon Form and Reduced Row Echelon Form
These notes closely follow the presentation of the material given in David C Lay s textbook Linear Algebra and its Applications (3rd edition) These notes are intended primarily for in-class presentation
Parallel Compression and Decompression of DNA Sequence Reads in FASTQ Format
, pp.91-100 http://dx.doi.org/10.14257/ijhit.2014.7.4.09 Parallel Compression and Decompression of DNA Sequence Reads in FASTQ Format Jingjing Zheng 1,* and Ting Wang 1, 2 1,* Parallel Software and Computational
Data Warehousing und Data Mining
Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data
CHAPTER 7: The CPU and Memory
CHAPTER 7: The CPU and Memory The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint slides
Atmiya Infotech Pvt. Ltd. Data Structure. By Ajay Raiyani. Yogidham, Kalawad Road, Rajkot. Ph : 572365, 576681 1
Data Structure By Ajay Raiyani Yogidham, Kalawad Road, Rajkot. Ph : 572365, 576681 1 Linked List 4 Singly Linked List...4 Doubly Linked List...7 Explain Doubly Linked list: -...7 Circular Singly Linked
International Journal of Software and Web Sciences (IJSWS) www.iasir.net
International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) ISSN (Print): 2279-0063 ISSN (Online): 2279-0071 International
On the k-path cover problem for cacti
On the k-path cover problem for cacti Zemin Jin and Xueliang Li Center for Combinatorics and LPMC Nankai University Tianjin 300071, P.R. China [email protected], [email protected] Abstract In this paper we
[Refer Slide Time: 05:10]
Principles of Programming Languages Prof: S. Arun Kumar Department of Computer Science and Engineering Indian Institute of Technology Delhi Lecture no 7 Lecture Title: Syntactic Classes Welcome to lecture
Computer Organization
Computer Organization and Architecture Designing for Performance Ninth Edition William Stallings International Edition contributions by R. Mohan National Institute of Technology, Tiruchirappalli PEARSON
Domains and Competencies
Domains and Competencies DOMAIN I TECHNOLOGY APPLICATIONS CORE Standards Assessed: Computer Science 8 12 I VII Competency 001: The computer science teacher knows technology terminology and concepts; the
Linear Programming. March 14, 2014
Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1
Computer Science 281 Binary and Hexadecimal Review
Computer Science 281 Binary and Hexadecimal Review 1 The Binary Number System Computers store everything, both instructions and data, by using many, many transistors, each of which can be in one of two
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
ASSEMBLY PROGRAMMING ON A VIRTUAL COMPUTER
ASSEMBLY PROGRAMMING ON A VIRTUAL COMPUTER Pierre A. von Kaenel Mathematics and Computer Science Department Skidmore College Saratoga Springs, NY 12866 (518) 580-5292 [email protected] ABSTRACT This paper
Oracle8i Spatial: Experiences with Extensible Databases
Oracle8i Spatial: Experiences with Extensible Databases Siva Ravada and Jayant Sharma Spatial Products Division Oracle Corporation One Oracle Drive Nashua NH-03062 {sravada,jsharma}@us.oracle.com 1 Introduction
PES Institute of Technology-BSC QUESTION BANK
PES Institute of Technology-BSC Faculty: Mrs. R.Bharathi CS35: Data Structures Using C QUESTION BANK UNIT I -BASIC CONCEPTS 1. What is an ADT? Briefly explain the categories that classify the functions
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding
Types of Degrees in Bipolar Fuzzy Graphs
pplied Mathematical Sciences, Vol. 7, 2013, no. 98, 4857-4866 HIKRI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.37389 Types of Degrees in Bipolar Fuzzy Graphs Basheer hamed Mohideen Department
136 CHAPTER 4. INDUCTION, GRAPHS AND TREES
136 TER 4. INDUCTION, GRHS ND TREES 4.3 Graphs In this chapter we introduce a fundamental structural idea of discrete mathematics, that of a graph. Many situations in the applications of discrete mathematics
Parallelization: Binary Tree Traversal
By Aaron Weeden and Patrick Royal Shodor Education Foundation, Inc. August 2012 Introduction: According to Moore s law, the number of transistors on a computer chip doubles roughly every two years. First
Binary Heap Algorithms
CS Data Structures and Algorithms Lecture Slides Wednesday, April 5, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks [email protected] 2005 2009 Glenn G. Chappell
BookTOC.txt. 1. Functions, Graphs, and Models. Algebra Toolbox. Sets. The Real Numbers. Inequalities and Intervals on the Real Number Line
College Algebra in Context with Applications for the Managerial, Life, and Social Sciences, 3rd Edition Ronald J. Harshbarger, University of South Carolina - Beaufort Lisa S. Yocco, Georgia Southern University
COURSE TITLE COURSE DESCRIPTION
COURSE TITLE COURSE DESCRIPTION CS-00X COMPUTING EXIT INTERVIEW All graduating students are required to meet with their department chairperson/program director to finalize requirements for degree completion.
Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file?
Files What s it all about? Information being stored about anything important to the business/individual keeping the files. The simple concepts used in the operation of manual files are often a good guide
AP Computer Science AB Syllabus 1
AP Computer Science AB Syllabus 1 Course Resources Java Software Solutions for AP Computer Science, J. Lewis, W. Loftus, and C. Cocking, First Edition, 2004, Prentice Hall. Video: Sorting Out Sorting,
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications Jürgen Primsch, SAP AG July 2011 Why In-Memory? Information at the Speed of Thought Imagine access to business data,
Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R
Binary Search Trees A Generic Tree Nodes in a binary search tree ( B-S-T) are of the form P parent Key A Satellite data L R B C D E F G H I J The B-S-T has a root node which is the only node whose parent
15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
Why? A central concept in Computer Science. Algorithms are ubiquitous.
Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online
A STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti 2, Nidhi Rajak 3
A STUDY OF TASK SCHEDULING IN MULTIPROCESSOR ENVIROMENT Ranjit Rajak 1, C.P.Katti, Nidhi Rajak 1 Department of Computer Science & Applications, Dr.H.S.Gour Central University, Sagar, India, [email protected]
GUJARAT TECHNOLOGICAL UNIVERSITY, AHMEDABAD, GUJARAT. Course Curriculum. DATA STRUCTURES (Code: 3330704)
GUJARAT TECHNOLOGICAL UNIVERSITY, AHMEDABAD, GUJARAT Course Curriculum DATA STRUCTURES (Code: 3330704) Diploma Programme in which this course is offered Semester in which offered Computer Engineering,
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Somesh S Chavadi 1, Dr. Asha T 2 1 PG Student, 2 Professor, Department of Computer Science and Engineering,
A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc
Other architectures Example. Accumulator-based machines A single register, called the accumulator, stores the operand before the operation, and stores the result after the operation. Load x # into acc
DATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment
A Performance Study of Load Balancing Strategies for Approximate String Matching on an MPI Heterogeneous System Environment Panagiotis D. Michailidis and Konstantinos G. Margaritis Parallel and Distributed
Keywords Quantum logic gates, Quantum computing, Logic gate, Quantum computer
Volume 3 Issue 10 October 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com An Introduction
Load Re-Balancing for Distributed File. System with Replication Strategies in Cloud
Contemporary Engineering Sciences, Vol. 8, 2015, no. 10, 447-451 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ces.2015.5263 Load Re-Balancing for Distributed File System with Replication Strategies
Big Data Processing: Past, Present and Future
Big Data Processing: Past, Present and Future Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. [email protected] [email protected] @OrionGM
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
CHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION 1.1 MOTIVATION OF RESEARCH Multicore processors have two or more execution cores (processors) implemented on a single chip having their own set of execution and architectural recourses.
Notes on Factoring. MA 206 Kurt Bryan
The General Approach Notes on Factoring MA 26 Kurt Bryan Suppose I hand you n, a 2 digit integer and tell you that n is composite, with smallest prime factor around 5 digits. Finding a nontrivial factor
On Integer Additive Set-Indexers of Graphs
On Integer Additive Set-Indexers of Graphs arxiv:1312.7672v4 [math.co] 2 Mar 2014 N K Sudev and K A Germina Abstract A set-indexer of a graph G is an injective set-valued function f : V (G) 2 X such that
The Set Data Model CHAPTER 7. 7.1 What This Chapter Is About
CHAPTER 7 The Set Data Model The set is the most fundamental data model of mathematics. Every concept in mathematics, from trees to real numbers, is expressible as a special kind of set. In this book,
A Catalogue of the Steiner Triple Systems of Order 19
A Catalogue of the Steiner Triple Systems of Order 19 Petteri Kaski 1, Patric R. J. Östergård 2, Olli Pottonen 2, and Lasse Kiviluoto 3 1 Helsinki Institute for Information Technology HIIT University of
2) What is the structure of an organization? Explain how IT support at different organizational levels.
(PGDIT 01) Paper - I : BASICS OF INFORMATION TECHNOLOGY 1) What is an information technology? Why you need to know about IT. 2) What is the structure of an organization? Explain how IT support at different
MEng, BSc Applied Computer Science
School of Computing FACULTY OF ENGINEERING MEng, BSc Applied Computer Science Year 1 COMP1212 Computer Processor Effective programming depends on understanding not only how to give a machine instructions
CHAPTER 17: File Management
CHAPTER 17: File Management The Architecture of Computer Hardware, Systems Software & Networking: An Information Technology Approach 4th Edition, Irv Englander John Wiley and Sons 2010 PowerPoint slides
1 Solving LPs: The Simplex Algorithm of George Dantzig
Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.
Memory Systems. Static Random Access Memory (SRAM) Cell
Memory Systems This chapter begins the discussion of memory systems from the implementation of a single bit. The architecture of memory chips is then constructed using arrays of bit implementations coupled
Common Data Structures
Common Data Structures: arrays (single and multiple dimensional), linked lists, stacks, queues, trees, graphs. You should already be familiar with arrays, so they will not be discussed. Trees
