Journal of Convergence Information Technology
Volume 5, Number 7, September 2010

A Study on Secure Data Storage Strategy in Cloud Computing

Danwei Chen, Yanjun He
First Author: College of Computer Technology, Nanjing University of Posts and Telecommunications, chendw@njupt.edu.cn
*Corresponding Author: College of Computer Technology, Nanjing University of Posts and Telecommunications, realmeh@gmail.com
doi: 0.456/jcit.vol5.issue7.3

Abstract

Based on the fundamental theory of equations in algebra, the congruence residue principle in elementary number theory, and Abhishek's online data storage algorithm, we propose a secure data storage strategy for cloud computing. The strategy splits data $d$ into $k$ sections using the data splitting algorithm, ensures high data security by simplifying equation solutions, and at the same time guarantees highly reliable data using the coefficients generated by the splitting algorithm.

Keywords: Cloud computing, Data partitioning, Distributed storage, Security strategy

1. Introduction

Cloud computing mainly provides three kinds of services: IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service) [1]. The major difference between services based on cloud computing and traditional services is that user data is stored not on a local server, but in the distributed storage system of the service supplier. In many cases, however, users (especially business users) have high demands regarding data security and reliability.

In traditional data protection methods, plaintext data is generally stored after encryption. In practical applications, symmetric encryption algorithms such as DES and AES are usually adopted because of their efficiency. Although the data stored on the cloud server are encrypted, such encryption algorithms provide relatively low security. Encrypted data are therefore likely to be vulnerable to attacks [2], and business interests are compromised once the server is invaded.

In this paper, we propose a secure data storage strategy capable of addressing the shortcomings of traditional data protection methods and improving security and reliability in cloud computing.

2. Data security storage strategies

Secure data storage in cloud computing is realized on the basis of a distributed system. After reaching the cloud, data can be randomly stored on any one or more servers. According to the characteristics of this storage mode, each server in the distributed system can be abstracted as a storage node. Suppose there are $m$ servers in the system, written as $S = \{s_1, s_2, \dots, s_m\}$, and suppose the plaintext data set is $d$. The equation-based splitting algorithm is applied to data set $d$ to generate $k$ ($k < m$) data blocks, written as:

$$\{d_1, d_2, \dots, d_k\} = \mathrm{Partition}(d)$$

in which Partition() is the data splitting algorithm described in detail in Section 3 of this paper. The generated data blocks are then stored separately: $k$ servers are randomly chosen out of the $m$ servers, which can be expressed as:

$$\{d_1, d_2, \dots, d_k\} \to \mathrm{map}(S), \quad \text{where } S = \{s_1, s_2, \dots, s_m\}$$

The data restoration process can be expressed as $d \equiv d_1 d_2 \cdots d_k \pmod p$, where $p$ is a large prime number.

The core of the secure storage strategy is its data splitting algorithm, which extends the fundamental theory of equations in algebra, the congruence residue principle in elementary number theory [3], Shamir's key sharing [4], and Abhishek's online data storage algorithm [5,6], and through which split data storage is realized.

The safety of the strategy mainly depends on two aspects. The first is the difficulty of decoding the data splitting algorithm. The second is that, because the storage servers are randomly chosen after data splitting, the encrypted data cannot be completely obtained by attacking one or more servers, which makes decoding even more difficult.

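The placement and restoration steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `place_blocks` and `restore`, the server labels, the in-memory `storage` dictionary standing in for real servers, and the toy prime $p = 101$ are all hypothetical and do not come from the paper.

```python
import random


def place_blocks(blocks, servers, rng=None):
    """Store each block on a distinct, randomly chosen server.

    Returns (index, storage):
      index   maps block number -> server name (the locally kept position index)
      storage maps server name  -> the block held by that server
    """
    rng = rng or random.SystemRandom()
    if len(blocks) >= len(servers):
        raise ValueError("the strategy assumes k < m")
    chosen = rng.sample(servers, len(blocks))  # k distinct servers out of m
    index = dict(enumerate(chosen))
    storage = dict(zip(chosen, blocks))
    return index, storage


def restore(index, storage, p):
    """Restore d as the product of all blocks modulo p."""
    d = 1
    for block_number in sorted(index):
        d = d * storage[index[block_number]] % p
    return d


# Toy example: m = 5 servers, k = 3 blocks whose product is 42 modulo 101.
servers = ["s1", "s2", "s3", "s4", "s5"]
blocks = [3, 7, 2]
index, storage = place_blocks(blocks, servers)
recovered = restore(index, storage, 101)
```

Only the holder of `index` knows which servers carry blocks of a given data set; an attacker who compromises a single server sees one block and no indication of where the others are.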
In addition, the strategy has inherent advantages in fault tolerance compared with traditional data protection methods. In cloud computing, no assumptions can be made about the robustness of any node in the distributed system. Various unexpected factors can result in temporary inaccessibility of some nodes or permanent inaccessibility of data. In such cases, traditional data protection means are often powerless. The secure strategy proposed in this paper ensures that data can be restored even when some nodes fail, which considerably improves system reliability.

3. Core algorithm of the secure strategy

3.1 Data splitting algorithm

In cryptography, it is often more convenient to construct a quotient ring isomorphic to a field of the same structure than to operate on that field directly [7]. We construct a quotient ring isomorphic to the finite field $\mathbb{Z}_p$ (where $p$ is a large prime number), and a congruence equation expressed as:

$$x^k + \sum_{i=1}^{k-1} a_i x^i + d \equiv 0 \pmod p \qquad (1)$$

where $d \in \mathbb{Z}_p$ is the data to be split, $0 \le a_i \le p-1$, and $0 \le d \le p-1$ (Note: $d$ here can be $p-d$). According to the fundamental theorem of algebra, Equation (1) has $k$ roots. Writing these roots as $\{r_1, \dots, r_k\} \subset C$ ($C$ is the set of complex numbers), Equation (1) can be rewritten as:

$$\prod_{i=1}^{k} (x - r_i) \equiv 0 \pmod p \qquad (2)$$

where $1 \le r_i \le p-1$. These $r_i$ are the data blocks generated by the splitting. Equations (1) and (2) show that $d$ is independent of the variable $x$. Therefore, the following can be obtained:

$$\prod_{i=1}^{k} r_i \equiv d \pmod p \qquad (3)$$

Strictly, the constant term of Equation (2) is $(-1)^k \prod_{i=1}^{k} r_i$; for odd $k$ the sign is absorbed by taking $p-d$ in place of $d$ in Equation (1), as noted above.

3.2 High efficiency of the algorithm

The high efficiency of the algorithm is illustrated through two aspects: the data splitting and storage process, and the data restoration process.

In the data splitting and storage process, the splitting algorithm applied to data set $d$ generates $k$ blocks of data $r_1, \dots, r_k$. These data blocks are then stored on $k$ randomly chosen servers. In addition, the coefficients $a_i$ are stored as backup information. The process mainly includes the following operations:

1. $k-1$ numbers $r_1, \dots, r_{k-1}$ are randomly chosen within the finite field $\mathbb{Z}_p$.
2. $r_k = d\,(r_1 r_2 \cdots r_{k-1})^{-1} \bmod p$ is calculated.
3. The coefficients $a_i$ are calculated by constructing the polynomial $p(x)$:

$$p(x) = (x - r_1)(x - r_2) \cdots (x - r_k) \bmod p = x^k + a_{k-1} x^{k-1} + a_{k-2} x^{k-2} + \dots + a_1 x + a_0 \bmod p$$

From the calculation process above, we can infer that generating $k$ blocks of data needs on the order of $k$ modular multiplications, one modular inversion, and up to $k$ polynomial multiplications to expand $p(x)$. The time complexity is therefore a low-order polynomial in $k$ (at most $O(k^2)$ when $p(x)$ is expanded one linear factor at a time) [8].

For data decoding and restoration, the user retrieves the data of each block $R = \{r_1, \dots, r_k\}$ from the relevant servers according to the locally stored data position index, and then obtains the plaintext data by calculating $d \equiv r_1 r_2 \cdots r_k \pmod p$. Clearly, the time complexity of the decoding process is no higher than that of the encryption process, since it needs only $k-1$ modular multiplications. Therefore, the execution efficiency of the algorithm, whether in encryption or decoding, is rather high, and much higher than that of asymmetric encryption algorithms.

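The three splitting operations and the restoration product can be sketched in Python as below. This is a minimal sketch under assumptions, not the authors' implementation: the names `split`, `restore` and `P`, the choice of the Mersenne prime $2^{127}-1$ as the modulus, and the restriction to nonzero $d$ and nonzero random roots (so that the modular inverse exists) are choices made for this example. It uses `pow(x, -1, p)` for the modular inverse, which requires Python 3.8 or later.

```python
import secrets

# Assumed modulus for illustration only: the Mersenne prime 2**127 - 1.
P = 2**127 - 1


def split(d, k, p=P):
    """Split d into k roots r_1..r_k with r_1 * ... * r_k = d (mod p).

    Returns (roots, coeffs), where coeffs[i] is the coefficient of x**i in
    p(x) = (x - r_1)...(x - r_k) mod p, so coeffs[k] == 1.
    """
    if k < 2:
        raise ValueError("need at least two blocks")
    if not 0 < d < p:
        raise ValueError("d must satisfy 0 < d < p in this sketch")

    # Step 1: choose k-1 random nonzero elements of Z_p.
    roots = [secrets.randbelow(p - 1) + 1 for _ in range(k - 1)]

    # Step 2: r_k = d * (r_1 * ... * r_{k-1})^(-1) mod p.
    product = 1
    for r in roots:
        product = product * r % p
    roots.append(d * pow(product, -1, p) % p)

    # Step 3: expand p(x) one linear factor (x - r) at a time.
    coeffs = [1]
    for r in roots:
        expanded = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            expanded[i + 1] = (expanded[i + 1] + c) % p   # c * x
            expanded[i] = (expanded[i] - c * r) % p       # c * (-r)
        coeffs = expanded
    return roots, coeffs


def restore(roots, p=P):
    """Restore d as the product of all k roots modulo p (Equation (3))."""
    d = 1
    for r in roots:
        d = d * r % p
    return d


secret = 123456789
roots, coeffs = split(secret, 5)
recovered = restore(roots)
```

Expanding the product one factor at a time costs $O(k^2)$ coefficient updates, and the constant term `coeffs[0]` equals $(-1)^k d \bmod p$, which is the sign convention behind the note that $d$ can be $p-d$.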
3.3 Security of the algorithm

Theorem 1. If the coefficients $a_i$ ($1 \le i \le k-1$) are randomly chosen and are not all zero at the same time, the probability of recovering the authentic data set $d$ is no greater than $1/p$, even when $k-1$ roots of the equation are known.

Proof: Given data set $d$, the coefficients in Equation (1) are chosen with the following method to ensure that the equation has $k$ roots: $k-1$ roots $r_1, \dots, r_{k-1}$ are randomly chosen in the finite field $\mathbb{Z}_p$. The $k$-th root can be obtained with the following equation:

$$r_k = d\,(r_1 r_2 \cdots r_{k-1})^{-1} \bmod p \qquad (4)$$

As the roots are randomly distributed in $\mathbb{Z}_p$ [9], the probability of obtaining $r_k$ without knowing $d$ is $1/p$. Conversely, the probability of obtaining $d$ without knowing $r_k$ is also $1/p$.

The following explains why the coefficients cannot all be zero at the same time. Suppose $a_1 = a_2 = \dots = a_{k-1} = 0$. Equation (1) then becomes

$$x^k + d \equiv 0 \pmod p \qquad (5)$$

Based on the congruence residue principle in elementary number theory, we can infer that, for Equation (5) to have $k$ roots, the necessary conditions are $\gcd(p-1, k) = k$; $\gcd(d, p) = 1$; and there exists $b \in \mathbb{Z}_p$ such that $d$ is the $k$-th power of $b$. Usually, $p = ks + 1$ is chosen, in which $s \in \mathbb{N}$. In such a case there are also certain requirements on data set $d$, which is unacceptable in practical applications. Even if $d$ satisfies the relevant requirements, the attacker can easily calculate the original data set $d$ by merely determining the data of one block generated by the splitting and the number of blocks. In this case, security is rather poor. Therefore, the algorithm requires that the chosen coefficients cannot all be zero at the same time.

Theorem 2. If the attacker invades a storage node, steals data block $r_i$, and wants to restore data set $d$ with brute-force methods based on $r_i$ by decoding the coefficients of the polynomial $p(x)$ in the finite field $\mathbb{Z}_p$, the required time complexity is $\Omega\!\left(\dfrac{p^{k-1}}{(k-1)!}\right)$.

Proof: Suppose the coefficient set of Equation (1) is $A = \{a_i\}$, in which $0 \le a_i \le p-1$. Each assignment of values to the $a_i$ corresponds to exactly one root set $R = \{r_1, \dots, r_k\}$, and vice versa. To solve for the coefficients of the polynomial $p(x)$, Equation (3) can be used.
Suppose the attacker obtains $r_i$; he then needs to randomly choose $k-1$ numbers (Note: the $k-1$ numbers here can be repeated) from the set $S = \{0, 1, 2, \dots, p-1\}$. Clearly, the number of calculations that the attacker needs can be expressed as the following formula:

$$\binom{p+k-2}{k-1} = \frac{(p+k-2)!}{(k-1)!\,(p-1)!} = \frac{(p+k-2)(p+k-3)\cdots(p+1)\,p\,(p-1)!}{(k-1)!\,(p-1)!} = \frac{(p+k-2)(p+k-3)\cdots(p+1)\,p}{(k-1)!} \ge \frac{p^{k-1}}{(k-1)!}$$

In practical applications, $p \gg k \gg 1$. Such a calculation amount is far larger than the processing ability of mainstream computers and ensures that the data cannot be decoded in current computing environments.

4. Reliability of the secure data storage strategy

One of the advantages distinguishing the proposed secure strategy from traditional data protection is that it provides highly reliable data protection. When splitting plaintext data into $k$ data blocks, we obtain $k$ data blocks $r_1, \dots, r_k$ and $k-1$ coefficients $a_i$ of the equation. These coefficients are stored on the server as backup information [10].

In an actual environment, one or more of the nodes storing data blocks may be inaccessible because of problems with the network or the server itself. Data set $d$ then cannot be restored with Equation (3), which needs all $k$ blocks. In such a case, visiting only one of the nodes is needed. Suppose data block $r_j$ is retrieved from that node. The coefficients $a_i$ are then retrieved from the server holding the backup coefficients. Substituting $r_j$ and the coefficients into Equation (1) gives

$$d \equiv -\left(r_j^{\,k} + \sum_{i=1}^{k-1} a_i r_j^{\,i}\right) \pmod p$$

and plaintext data set $d$ can be obtained (up to the sign convention noted for Equation (1)). Therefore, the strategy provides a solid method for protecting the data stored in the distributed system.

5. A comparison between the secure storage strategy and traditional data protection methods

5.1 Experimental analysis of the security of the strategy

The security of the data splitting algorithm is related to the key length. Furthermore, it also increases exponentially with the number of data blocks. Traditional data protection methods, however, usually adopt symmetric encryption, such as DES [11], the security of which depends merely on the key length. For a computer capable of processing one million instructions per second, with the same key length, the decoding time of the splitting encryption method increases significantly with the number of data blocks, whereas the decoding time of the symmetric encryption algorithm remains almost the same (Figure 1). In practical analysis, the key length of the algorithm is usually determined as 9 bits. The number of data blocks is 6. Its security is 8 times higher than that of traditional methods, and its reliability is 50 times higher.

5.2 Experimental analysis of the reliability of the strategy

The reliability of the secure data storage strategy depends on the backup data coefficients. When one or more nodes cannot be accessed, the secure strategy can ensure that the data will be restored as long as one of the nodes can be accessed. Traditional data storage methods, however, require the data in all the nodes to be retrieved.
Thus, the more blocks the data are split into, the poorer the reliability of traditional data storage. Figure 2 shows that the ratio of the reliability of the splitting storage strategy to that of traditional data protection methods increases exponentially with the number of data blocks. Therefore, the secure storage strategy has tremendous advantages in terms of reliability.

Figure 1. Comparison between the decoding time of splitting encryption and that of traditional encryption.

Figure 2. Analysis of the reliability of splitting storage.

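The two quantities compared in this section can be illustrated with a small sketch: the size of the brute-force search space from Theorem 2, and the restoration of $d$ from a single accessible block plus the backup coefficients. The function names `search_space_size` and `recover_from_one_block`, the layout of the `backup` list, and the toy values ($p = 101$, $k = 3$, roots 3, 7, 2, so that $d = 42$) are assumptions made for this example; the sign handling follows from expanding $\prod (x - r_i)$, whose constant term is $(-1)^k d$.

```python
from math import comb


def search_space_size(p, k):
    """Number of ways to choose k-1 values from {0..p-1} with repetition."""
    return comb(p + k - 2, k - 1)


def recover_from_one_block(r, backup, k, p):
    """Recover d from one root r and the backup coefficients a_1..a_{k-1}.

    backup[i - 1] holds a_i, the coefficient of x**i in
    p(x) = (x - r_1)...(x - r_k) mod p.
    """
    if len(backup) != k - 1:
        raise ValueError("expected k-1 backup coefficients")
    # Evaluate r**k + sum(a_i * r**i); since p(r) = 0 this equals -a_0.
    acc = pow(r, k, p)
    for i, a in enumerate(backup, start=1):
        acc = (acc + a * pow(r, i, p)) % p
    a0 = (-acc) % p
    # The constant term is a_0 = (-1)**k * d, so undo the sign for odd k.
    return a0 if k % 2 == 0 else (-a0) % p


# Toy example: (x-3)(x-7)(x-2) = x^3 - 12x^2 + 41x - 42, taken modulo 101.
p, k = 101, 3
backup = [41, 89]            # a_1 = 41, a_2 = -12 mod 101 = 89
d_from_single_node = recover_from_one_block(7, backup, k, p)
candidates = search_space_size(p, k)
```

With realistic parameters the search space grows roughly like $p^{k-1}/(k-1)!$, while single-block recovery costs only $k$ modular exponentiations. The sketch also makes plain that the backup coefficients together with any one block are sufficient to recover $d$, so the coefficient server has to be protected at least as carefully as the blocks.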
6. Conclusion

Based on relevant principles in algebra and elementary number theory, Shamir's key sharing, and Abhishek's online data storage algorithm, this paper proposes a secure data storage strategy applicable to the distributed system in cloud computing, which addresses various data security problems encountered by service modes based on cloud computing. In terms of data security, the strategy enhances the decoding difficulty tenfold with the increased number of data blocks. In addition, its fault tolerance is higher than that of the single-node storage method by hundreds of times. The secure strategy, however, also has its shortcomings, such as considerable data redundancy. These shortcomings can be taken as directions for improvement in subsequent research.

7. References

[1] J. Rittinghouse, J. Ransome, Cloud Computing: Implementation, Management, and Security, 2009.
[2] Prasanta Gogoi, B. Borah, D. K. Bhattacharyya, Anomaly Detection Analysis of Intrusion Data using Supervised & Unsupervised Approach, Journal of AICIT, AICIT, vol.5, no., pp.95-, 2010.
[3] K.q. Feng, Number Theory and Cryptography, Science Press, China, 2007.
[4] A. Shamir, How to Share a Secret, Communications of the ACM, vol.22, no.11, pp.612-613, 1979.
[5] M.H. Dehkordi, S. Mashhadi, New efficient and practical verifiable multi-secret sharing schemes, Information Sciences, vol.9, no.6 74, 2008.
[6] A. Parakh, S. Kak, Online data storage using implicit security, Information Sciences, vol.79, no. 333 333, 2009.
[7] T. Moon, Error Correction Coding: Mathematical Methods and Algorithms, Wiley, USA, 2005.
[8] A. Aho, J. Hopcroft, J. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, USA, 1974.
[9] S. Kak, A cubic public-key transformation, Circuits, Systems and Signal Processing, vol.6, pp.353 359, 2007.
[10] Anestis A. Toptsis, K-grid: A Structure for Storage and Retrieval of Affective Knowledge, Journal of AICIT, AICIT, vol.4, no., pp.6-30, 2009.
[11] Bruce Schneier, Applied Cryptography, John Wiley & Sons, USA, 1996.