Learning Schemas for Unordered XML

Radu Ciucanu, University of Lille & INRIA, France, radu.ciucanu@inria.fr
Sławek Staworko, University of Lille & INRIA, France, slawomir.staworko@inria.fr

Abstract

We consider unordered XML, where the relative order among siblings is ignored, and we investigate the problem of learning schemas from examples given by the user. We focus on the schema formalisms proposed in [10]: disjunctive multiplicity schemas (DMS) and its restriction, disjunction-free multiplicity schemas (MS). A learning algorithm takes as input a set of XML documents which must satisfy the schema (i.e., positive examples) and a set of XML documents which must not satisfy the schema (i.e., negative examples), and returns a schema consistent with the examples. We investigate a learning framework inspired by Gold [18], where a learning algorithm should be sound i.e., always return a schema consistent with the examples given by the user, and complete i.e., able to produce every schema with a sufficiently rich set of examples. Additionally, the algorithm should be efficient i.e., polynomial in the size of the input. We prove that the DMS are learnable from positive examples only, but they are not learnable when we also allow negative examples. Moreover, we show that the MS are learnable in the presence of positive examples only, and also in the presence of both positive and negative examples. Furthermore, for the learnable cases, the proposed learning algorithms return minimal schemas consistent with the examples.

1. Introduction

When XML is used for document-centric applications, the relative order among the elements is typically important e.g., the relative order of paragraphs and chapters in a book. On the other hand, in case of data-centric XML applications, the order among the elements may be unimportant [1]. In this paper we focus on the latter use case. As an example, take in Figure 1 the XML documents storing information about books. While the order of the elements title, year, author, and editor may differ from one book to another, it has no impact on the semantics of the data stored in this semi-structured database.

A schema for XML is a description of the type of admissible documents, typically defining for every node its content model i.e., the children nodes it must, may, or cannot contain. In this paper we study the problem of learning unordered schemas from document examples given by the user. For instance, consider the three XML documents from Figure 1 and assume that the user wants to obtain a schema which is satisfied by all the three documents.
A desirable solution is a schema which allows a book to have, in any order, exactly one title, optionally one year, and either at least one author or at least one editor.

Studying the theoretical foundations of learning unordered schemas has several practical motivations. A schema serves as a reference for users who do not know yet the structure of the XML document, and attempt to query or modify its contents. If the schema is not given explicitly, it can be learned from document examples and then read by the users. From another point of view, Florescu [14] pointed out the need to automatically infer good-quality schemas and to apply them in the process of data integration. This is clearly a data-centric application, therefore unordered schemas might be more appropriate. Another motivation of learning the unordered schema of a XML collection is query minimization [2] i.e., given a query and a schema, find a smaller yet equivalent query in the presence of the schema. Furthermore, we want to use inferred unordered schemas and optimization techniques to boost the learning algorithms for twig queries [26], which are order-oblivious.

[Figure 1: three book trees built from the labels book, title, year, author, and editor. The recoverable values are the titles "Computational complexity", "Computational learning theory", and "Schema matching and mapping", the years 1994 and 2011, the authors C. Papadimitriou, M. Kearns, and U. Vazirani, and the editors E. Rahm, A. Bonifati, and Z. Bellahsene.]
Figure 1. The XML documents storing information about books.

Previously, schema learning has been studied from positive examples only i.e., documents which must satisfy the schema. For instance, we have already shown a schema learned from the three documents from Figure 1 given as positive examples. However, it is conceivable to find applications where negative examples (i.e., documents that must not satisfy the schema) might be useful. For instance, assume a scenario where the schema of a data-centric XML collection evolves over time and some documents may become obsolete w.r.t. the new schema. A user can employ these documents as negative examples to extract the new schema of the collection. Thus, the schema maintenance [14]
can be done incrementally, with little feedback needed from the user. This kind of application motivates us to investigate the problem of learning unordered schemas when we also allow negative examples.

We focus our research on learning the unordered schema formalisms recently proposed in [10]: the disjunctive multiplicity schemas (DMS) and its restriction, disjunction-free multiplicity schemas (MS). While they employ a user-friendly syntax inspired by DTDs, they define unordered content model only, and, therefore, they are better suited for unordered XML. They also retain much of the expressiveness of DTDs without an increase in computational complexity. Essentially, a DMS is a set of rules associating with each label the possible number of occurrences for all the allowed children labels by using multiplicities: $*$ (0 or more occurrences), $+$ (1 or more), $?$ (0 or 1), $1$ (exactly one occurrence; often omitted for brevity). Additionally, alternatives can be specified using restricted disjunction ("$\mid$") and all the conditions are gathered with unordered concatenation ("$\parallel$"). For example, the following schema is satisfied by the three documents from Figure 1.

$book \rightarrow title \parallel year^? \parallel (author^+ \mid editor^+)$.

This DMS allows a book to have, in any order, exactly one title, optionally one year, and either at least one author or at least one editor. Moreover, this is a minimal schema satisfied by the documents from Figure 1 because it captures the most specific schema satisfied by them. On the other hand, the following schema is also satisfied by the documents from Figure 1, but it is more general:

$book \rightarrow title \parallel year^? \parallel author^* \parallel editor^*$.

This schema allows a book to have, in any order, exactly one title, optionally one year, and any number of authors and editors. It is not minimal because it accepts a book having at the same time authors and editors, unlike the first example of schema. Moreover, the second schema is a MS because it does not use the disjunction operator.

In this paper we address the problem of learning DMS and MS from examples given by the user. We propose a definition of the learnability influenced by computational learning theory [21], in particular by the inference of languages [13, 18]. A learning algorithm takes as input a set of XML documents which must satisfy the schema (i.e., positive examples), and a set of XML documents which must not satisfy the schema (i.e., negative examples). Essentially, a class of schemas is learnable if there exists an algorithm which takes as input a set of examples given by the user and returns a schema which is consistent with the examples.
Moreover, the learning algorithm should be sound i.e., always return a schema consistent with the examples given by the user, complete i.e., able to produce every schema with a sufficiently rich set of examples, and efficient i.e., polynomial in the size of the input.

Our approach is novel in two directions:
- Previous research on schema learning has been done in the context of ordered XML, typically on learning restricted classes of regular expressions as content models of the DTDs. We focus on learning unordered schema formalisms and the results are positive: the DMS and the MS are learnable from positive examples only.
- The learning frameworks investigated before in the literature typically infer a schema using a collection of documents serving as positive examples. We study the impact of negative examples in the process of schema learning. In this case, the learning algorithm should return a schema satisfied by all the positive examples and by none of the negative ones. We show that the MS are learnable in the presence of both positive and negative examples, while the DMS are not.

We summarize our learnability results in Table 1. For the learnable cases, we propose learning algorithms which return a minimal schema consistent with the examples.

Schema formalism    + examples only    + and - examples
DMS                 Yes (Th. 4.4)      No (Th. 6.4)
MS                  Yes (Th. 5.1)      Yes (Th. 6.1)

Table 1. Summary of learnability results.

Related work. The Document Type Definition (DTD), the most widespread XML schema formalism [8, 19], is essentially a set of rules associating with each label a regular expression that defines the admissible sequences of children. Therefore, learning DTDs reduces to learning regular expressions. Gold [18] showed that the entire class of regular languages is not identifiable in the limit. Consequently, research has been done on restricted classes of regular expressions which can be efficiently learnable [24]. Hegewald et al. [20] extended the approach from [24] and proposed a system which infers one-unambiguous regular expressions [11] as the content models of the labels. Garofalakis et al. [17] designed a practical system which infers concise and semantically meaningful DTDs from document examples. Bex et al. [6, 7] proposed learning algorithms for two classes of regular expressions which capture many practical DTDs and are succinct by definition: single occurrence regular expressions (SOREs) and its subclass consisting of chain regular expressions (CHAREs). Bex et al. [5] also studied learning algorithms for the subclass of deterministic regular expressions in which each alphabet symbol occurs at most k times (k-OREs).
More recently, Freydenberger and Kötzing [15] proposed more efficient algorithms for the above mentioned restricted classes of regular expressions. Since the DMS disallow repetitions of symbols among the disjunctions, they can be seen as restricted SOREs interpreted under commutative closure i.e., an unordered collection of children matches a regular expression if there exists an ordering that matches the regular expression in the standard way. The algorithms proposed for the inference of SOREs [7, 15] are typically based on constructing an automaton and then transforming it into an equivalent SORE. Being based on automata techniques, the algorithms for learning SOREs take ordered input, therefore an additional input that the DMS do not have i.e., the order among the labels. For this reason, we cannot reduce learning DMS to learning SOREs. Consequently, we have to investigate new techniques to solve the problem of learning unordered schemas. Moreover, all the existing learning algorithms take into account only positive examples.

We also mention some of the related work on learning schema formalisms more expressive than DTDs. XML Schema, the second most widespread schema formalism [8, 19], allows the content model of an element to depend on the context in which it is used, therefore it is more difficult to learn. Bex et al. [9] proposed efficient algorithms to automatically infer a concise XML Schema describing a given set of XML documents. In a different approach, Chidlovskii [12] used extended context-free grammars to model schemas for
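XML and proposed a schema extraction algorithm.

Organization. This paper is organized as follows. In Section 2 we present preliminary notions. In Section 3 we formally define the learning framework. In Section 4 and Section 5 we present the learnability results for DMS and MS, respectively, when only positive examples are allowed. In Section 6 we discuss the impact of negative examples on learning. Finally, we summarize our results and outline further directions in Section 7.

2. Preliminaries

Throughout this paper we assume an alphabet $\Sigma$ which is a finite set of symbols. We also assume that $\Sigma$ has a total order $<_\Sigma$, that can be tested in constant time.

Trees. We model XML documents with unordered labeled trees. Formally, a tree $t$ is a tuple $(N_t, root_t, lab_t, child_t)$, where $N_t$ is a finite set of nodes, $root_t \in N_t$ is a distinguished root node, $lab_t : N_t \rightarrow \Sigma$ is a labeling function, and $child_t \subseteq N_t \times N_t$ is the parent-child relation. We assume that the relation $child_t$ is acyclic and require every non-root node to have exactly one predecessor in this relation. By $T$ we denote the set of all finite trees. We present an example of tree in Figure 2.

Figure 2. An example of tree.

Unordered words. An unordered word is essentially a multiset of symbols i.e., a function $w : \Sigma \rightarrow \mathbb{N}_0$ mapping symbols from the alphabet to natural numbers, and we call $w(a)$ the number of occurrences of the symbol $a$ in $w$. We denote by $W_\Sigma$ the set containing all the unordered words over the alphabet $\Sigma$. We also write $a \in w$ as a shorthand for $w(a) \neq 0$. An empty word $\varepsilon$ is an unordered word that has 0 occurrences of every symbol i.e., $\varepsilon(a) = 0$ for every $a \in \Sigma$. We often use a simple representation of unordered words, writing each symbol in the alphabet the number of times it occurs in the unordered word. For example, when the alphabet is $\Sigma = \{a, b, c\}$, $w_0 = aaacc$ stands for the function $w_0(a) = 3$, $w_0(b) = 0$, and $w_0(c) = 2$.

The (unordered) concatenation of two unordered words $w_1$ and $w_2$ is defined as the multiset union $w_1 \uplus w_2$ i.e., the function defined as $(w_1 \uplus w_2)(a) = w_1(a) + w_2(a)$ for all $a \in \Sigma$. For instance, $aaacc \uplus abbc = aaaabbccc$. Note that $\varepsilon$ is the identity element of the unordered concatenation: $\varepsilon \uplus w = w \uplus \varepsilon = w$ for every unordered word $w$. Also, given an unordered word $w$, by $w^i$ we denote the concatenation $w \uplus \ldots \uplus w$ ($i$ times).

A language is a set of unordered words. The unordered concatenation of two languages $L_1$ and $L_2$ is a language $L_1 \uplus L_2 = \{w_1 \uplus w_2 \mid w_1 \in L_1, w_2 \in L_2\}$. For instance, if $L_1 = \{a, aac\}$ and $L_2 = \{ac, b, \varepsilon\}$, then $L_1 \uplus L_2 = \{a, ab, aac, aabc, aaacc\}$.

Multiplicity schemas. A multiplicity is an element from the set $\{*, +, ?, 0, 1\}$.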
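We define the function $[\![\cdot]\!]$ mapping multiplicities to sets of natural numbers. More precisely:

$[\![*]\!] = \{0, 1, 2, \ldots\}$, $[\![+]\!] = \{1, 2, \ldots\}$, $[\![?]\!] = \{0, 1\}$, $[\![1]\!] = \{1\}$, $[\![0]\!] = \{0\}$.

Given a symbol $a \in \Sigma$ and a multiplicity $M$, the language of $a^M$, denoted $L(a^M)$, is $\{a^i \mid i \in [\![M]\!]\}$. For example, $L(a^+) = \{a, aa, \ldots\}$, $L(b^0) = \{\varepsilon\}$, and $L(c^?) = \{\varepsilon, c\}$.

A disjunctive multiplicity expression $E$ is:

$E := D_1^{M_1} \parallel \ldots \parallel D_n^{M_n}$,

where for all $1 \le i \le n$, $M_i$ is a multiplicity and each $D_i$ is:

$D_i := a_1^{M'_1} \mid \ldots \mid a_k^{M'_k}$,

where for all $1 \le j \le k$, $M'_j$ is a multiplicity and $a_j \in \Sigma$. Moreover, we require that every symbol $a \in \Sigma$ is present at most once in a disjunctive multiplicity expression. For instance, $(a \mid b) \parallel (c \mid d)$ is a disjunctive multiplicity expression, but $(a \mid b) \parallel (a \mid d)$ is not because $a$ appears twice. A disjunction-free multiplicity expression is an expression which uses no disjunction symbol "$\mid$" i.e., an expression of the form $a_1^{M_1} \parallel \ldots \parallel a_k^{M_k}$, where the $a_i$'s are pairwise distinct symbols in the alphabet and the $M_i$'s are multiplicities (with $1 \le i \le k$). We denote by DME the set of all the disjunctive multiplicity expressions and by ME the set of all the disjunction-free multiplicity expressions.

The language of a disjunctive multiplicity expression is:

$L(a_1^{M_1} \mid \ldots \mid a_k^{M_k}) = L(a_1^{M_1}) \cup \ldots \cup L(a_k^{M_k})$,
$L(D^M) = \{w_1 \uplus \ldots \uplus w_i \mid w_1, \ldots, w_i \in L(D) \wedge i \in [\![M]\!]\}$,
$L(D_1^{M_1} \parallel \ldots \parallel D_n^{M_n}) = L(D_1^{M_1}) \uplus \ldots \uplus L(D_n^{M_n})$.

If an unordered word $w$ belongs to the language of a disjunctive multiplicity expression $E$, we denote it by $w \models E$, and we say that $w$ satisfies $E$. When a symbol $a$ (resp. a disjunctive multiplicity expression $E$) has multiplicity 1, we often write $a$ (resp. $E$) instead of $a^1$ (resp. $E^1$). Moreover, we omit writing symbols and disjunctive multiplicity expressions with multiplicity 0. Take, for instance, $E_0 = a^+ \parallel (b \mid c) \parallel d^?$ and note that both the symbols $b$ and $c$ as well as the disjunction $(b \mid c)$ have an implicit multiplicity 1. The language of $E_0$ is:

$L(E_0) = \{a^i b^j c^k d^l \mid i, j, k, l \in \mathbb{N}_0,\ i \ge 1,\ j + k = 1,\ l \le 1\}$.

Next, we recall the unordered schema formalisms from [10]:

Definition 2.1 A disjunctive multiplicity schema (DMS) is a tuple $S = (root_S, R_S)$, where $root_S \in \Sigma$ is a designated root label and $R_S$ maps symbols in $\Sigma$ to disjunctive multiplicity expressions. By DMS we denote the set of all disjunctive multiplicity schemas. A disjunction-free multiplicity schema (MS) $S = (root_S, R_S)$ is a restriction of the DMS, where $R_S$ maps symbols in $\Sigma$ to disjunction-free multiplicity expressions.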
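The notions of unordered word, unordered concatenation, and disjunction-free multiplicity expression can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: unordered words are stored as `collections.Counter` objects, a disjunction-free expression is a dictionary from symbols to multiplicities, and the helper names (`word`, `concat`, `key`, `concat_languages`, `satisfies_me`) are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import product

# Each multiplicity is mapped to a predicate describing its set of counts.
MULTIPLICITY = {
    "*": lambda n: n >= 0,
    "+": lambda n: n >= 1,
    "?": lambda n: n in (0, 1),
    "1": lambda n: n == 1,
    "0": lambda n: n == 0,
}

def word(s):
    """Build an unordered word from a string such as "aaacc" (order is ignored)."""
    return Counter(s)

def concat(w1, w2):
    """Unordered concatenation: multiset union that adds the occurrence counts."""
    return w1 + w2

def key(w):
    """Canonical string form of an unordered word, so words can be stored in sets."""
    return "".join(sorted(w.elements()))

def concat_languages(l1, l2):
    """Unordered concatenation of two finite languages given as sets of strings."""
    return {key(concat(word(u), word(v))) for u, v in product(l1, l2)}

def satisfies_me(w, expr):
    """Check whether w satisfies a disjunction-free expression {symbol: multiplicity}.

    Symbols that are missing from expr are treated as having multiplicity 0.
    """
    symbols = set(w) | set(expr)
    return all(MULTIPLICITY[expr.get(a, "0")](w[a]) for a in symbols)
```

For example, `satisfies_me(word("abb"), {"a": "1", "b": "+", "c": "?"})` holds, while a word containing a symbol that the expression does not mention is rejected. Disjunctions are deliberately left out of this sketch.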
By MS we denote the set of all disjunction-free multiplicity schemas.

To define satisfiability of a DMS $S$ by a tree $t$ we first define the unordered word $ch_t^n$ of children of a node $n \in N_t$ i.e., $ch_t^n(a) = |\{m \in N_t \mid (n, m) \in child_t \wedge lab_t(m) = a\}|$. Now, a tree $t$ satisfies $S$, in symbols $t \models S$, if $lab_t(root_t) = root_S$ and for any node $n \in N_t$, $ch_t^n \in L(R_S(lab_t(n)))$. By $L(S) \subseteq T$ we denote the set of all the trees satisfying $S$.

In the sequel, we represent a schema $S = (root_S, R_S)$ as a set of rules of the form $a \rightarrow R_S(a)$, for any $a \in \Sigma$. If
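$L(R_S(a)) = \{\varepsilon\}$, then we write $a \rightarrow \varepsilon$ or we simply omit writing such a rule.

Example 2.2 We present schemas $S_1, S_2, S_3, S_4$ illustrating the formalisms defined above. They have the root label $r$ and the rules:

$S_1$: $r \rightarrow a \parallel b^* \parallel c^?$, $a \rightarrow b^?$, $b \rightarrow a^?$, $c \rightarrow b$
$S_2$: $r \rightarrow a \parallel b^+ \parallel c$, $a \rightarrow b^?$, $b \rightarrow a$, $c \rightarrow b$
$S_3$: $r \rightarrow (a \mid b)^+ \parallel c$, $a \rightarrow b^?$, $b \rightarrow a^?$, $c \rightarrow b$
$S_4$: $r \rightarrow (a \mid b \mid c)^*$, $a \rightarrow \varepsilon$, $b \rightarrow a^?$, $c \rightarrow b$

$S_1$ and $S_2$ are MS, while $S_3$ and $S_4$ are DMS. The tree from Figure 2 satisfies only $S_1$ and $S_3$.

Note that there exist DMS such that the smallest tree in their language has a size exponential in the size of the alphabet, as we observe in the following example.

Example 2.3 We consider for $n \ge 1$ the alphabet $\Sigma = \{r, a_1, b_1, \ldots, a_n, b_n\}$ and the DMS $S_5$ having the root label $r$ and the following rules:

$r \rightarrow a_1 \parallel b_1$,
$a_i \rightarrow a_{i+1} \parallel b_{i+1}$ (for $1 \le i \le n-1$),
$b_i \rightarrow a_{i+1} \parallel b_{i+1}$ (for $1 \le i \le n-1$),
$a_n \rightarrow \varepsilon$,
$b_n \rightarrow \varepsilon$.

We present in Figure 3 the unique tree satisfying this schema and we observe that its size is exponential in the size of the alphabet.

Figure 3. The unique tree satisfying the schema $S_5$.

Alternative definition with characterizing triples. Any disjunctive multiplicity expression $E$ can be expressed alternatively by its (characterizing) triple $(C_E, N_E, P_E)$ consisting of the following sets:
- The conflicting pairs of siblings $C_E$ contains pairs of symbols in $\Sigma$ such that $E$ defines no word using both symbols simultaneously: $C_E = \{(a_1, a_2) \in \Sigma \times \Sigma \mid \nexists w \in L(E).\ a_1 \in w \wedge a_2 \in w\}$.
- The extended cardinality map $N_E$ captures for each symbol in the alphabet the possible numbers of its occurrences in the unordered words defined by $E$: $N_E = \{(a, w(a)) \in \Sigma \times \mathbb{N}_0 \mid w \in L(E)\}$.
- The sets of required symbols $P_E$ which captures symbols that must be present in every word; essentially, a set of symbols $X$ belongs to $P_E$ if every word defined by $E$ contains at least one element from $X$: $P_E = \{X \subseteq \Sigma \mid \forall w \in L(E).\ \exists a \in X.\ a \in w\}$.

As an example we take $E_0 = a^+ \parallel (b \mid c) \parallel d^?$. Because $P_E$ is closed under supersets, we list only its minimal elements:

$C_{E_0} = \{(b, c), (c, b)\}$, $P_{E_0} = \{\{a\}, \{b, c\}, \ldots\}$,
$N_{E_0} = \{(b, 0), (b, 1), (c, 0), (c, 1), (d, 0), (d, 1), (a, 1), (a, 2), \ldots\}$.

Two equivalent disjunctive multiplicity expressions yield the same triples and hence $(C_E, N_E, P_E)$ can be viewed as the normal form of a given expression $E$ [10]. Moreover, each set has a compact representation of size polynomial in the size of the alphabet and computable in PTIME. We illustrate them on the same $E_0 = a^+ \parallel (b \mid c) \parallel d^?$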
:
- $C_E^*$ consists of sets of symbols present in $E$ such that any pairwise two of them are conflicting: $C_{E_0}^* = \{\{b, c\}\}$.
- $N_E^*$ is a function mapping symbols to multiplicities such that for any unordered word $w \in L(E)$, and for any symbol $a \in \Sigma$, $w(a) \in [\![N_E^*(a)]\!]$: $N_{E_0}^*(a) = +$, $N_{E_0}^*(b) = N_{E_0}^*(c) = N_{E_0}^*(d) = ?$.
- $P_E^*$ contains only the $\subseteq$-minimal elements of $P_E$: $P_{E_0}^* = \{\{a\}, \{b, c\}\}$.

Also note that we can easily construct a disjunctive multiplicity expression from its characterizing triple. A simple algorithm has to loop over the sets from $C_E^*$ and $P_E^*$ to compute for each label with which other labels it is linked by the disjunction operator. Then, using $N_E^*$, the algorithm associates to each label and each disjunction the correct multiplicity. For example, take the following compact triples:

$C_{E_1}^* = \{\{a, e\}, \{c, d\}\}$, $P_{E_1}^* = \{\{a, e\}, \{b\}\}$,
$N_{E_1}^*(a) = *$, $N_{E_1}^*(b) = 1$, $N_{E_1}^*(c) = N_{E_1}^*(d) = N_{E_1}^*(e) = ?$.

Note that they characterize the expression: $E_1 = (a^+ \mid e) \parallel b \parallel (c^? \mid d^?)$.

We have introduced the alternative definition with characterizing triples because we later propose an algorithm which learns characterizing triples from unordered word examples (Algorithm 1 from Section 4). Then, from this information, the corresponding disjunctive multiplicity expression can be constructed in a straightforward manner.

3. Learning framework

We use a variant of the standard language inference framework [13, 18] adapted to learning disjunctive multiplicity expressions and schemas. A learning setting is a tuple containing the set of concepts that are to be learned, the set of instances of the concepts that are to serve as examples in learning, and the semantics mapping every concept to its set of instances.

Definition 3.1 A learning setting is a tuple $(\mathcal{E}, \mathcal{C}, \mathcal{L})$, where $\mathcal{E}$ is a set of examples, $\mathcal{C}$ is a class of concepts, and $\mathcal{L}$ is a function that maps every concept in $\mathcal{C}$ to the set of all its examples (a subset of $\mathcal{E}$).

For example, the setting for learning disjunctive multiplicity expressions from positive examples is the tuple $(W_\Sigma, DME, L)$ and the setting for learning disjunctive multiplicity schemas from positive examples is $(T, DMS, L)$. We obtain analogously the learning settings for disjunction-free multiplicity expressions and schemas: $(W_\Sigma, ME, L)$ and
$(T, MS, L)$, respectively. The general formulation of the definition allows us to easily define settings for learning from both positive and negative examples, which we present in Section 6.

To define learnable concepts, we fix a learning setting $\mathcal{K} = (\mathcal{E}, \mathcal{C}, \mathcal{L})$ and we introduce some auxiliary notions. A sample is a finite nonempty subset $D$ of $\mathcal{E}$ i.e., a set of examples. A sample $D$ is consistent with a concept $c \in \mathcal{C}$ if $D \subseteq \mathcal{L}(c)$. A learning algorithm is an algorithm that takes a sample and returns a concept in $\mathcal{C}$ or a special value null.

Definition 3.2 A class of concepts $\mathcal{C}$ is learnable in polynomial time and data in the setting $\mathcal{K} = (\mathcal{E}, \mathcal{C}, \mathcal{L})$ if there exists a polynomial learning algorithm learner satisfying the following two conditions:
1. Soundness. For any sample $D$, the algorithm learner($D$) returns a concept consistent with $D$ or a special null value if no such concept exists.
2. Completeness. For any concept $c \in \mathcal{C}$ there exists a sample $CS_c$ such that for every sample $D$ that extends $CS_c$ consistently with $c$ i.e., $CS_c \subseteq D \subseteq \mathcal{L}(c)$, the algorithm learner($D$) returns a concept equivalent to $c$. Furthermore, the cardinality of $CS_c$ is polynomially bounded by the size of the concept.

The sample $CS_c$ is called the characteristic sample for $c$ w.r.t. learner and $\mathcal{K}$. For a learning algorithm there may exist many such samples. The definition requires that one characteristic sample exists. The soundness condition is a natural requirement, but alone it is not sufficient to eliminate trivial learning algorithms. For instance, if we want to learn disjunctive multiplicity expressions from positive examples over the alphabet $\{a_1, \ldots, a_n\}$, an algorithm always returning $a_1^* \parallel \ldots \parallel a_n^*$ is sound. Consequently, we require the algorithm to be complete analogously to how it is done for grammatical language inference [13, 18].

Typically, in the case of polynomial grammatical inference, the size of the characteristic sample is required to be polynomial in the size of the concept to be learned [13], where the size of a sample is the sum of the sizes of the examples that it contains. From the definition of the DMS, since repetitions of symbols are discarded among the disjunctions, the size of a schema is polynomial in the size of the alphabet. Thus, a natural requirement would be that the size of the characteristic sample is polynomially bounded by the size of the alphabet. There exist DMS such that the smallest tree in their language is exponential in the size of the alphabet (cf. Example 2.3).
Because of space restrictions, we have imposed in the definition of learnability that the cardinality (and not the size) of the characteristic sample is polynomially bounded by the size of the concept, hence by the size of the alphabet. However, we are able to obtain characteristic samples of size polynomial in the size of the alphabet by using a compressed representation of the XML trees, for example with directed acyclic graphs [23]. We will provide in the full version of the paper the details about this compression technique and the new definition of the learnability. The algorithms that we propose in this paper transfer without any alteration for the definition using compressed trees.

Additionally to the conditions imposed by the definition of learnability, we are interested in the existence of learning algorithms which return minimal concepts for a given set of examples. It is important to emphasize that we mean minimality in terms of language inclusion. When only positive examples are allowed, a DMS $S$ is a minimal DMS consistent with a set of trees $D$ iff $D \subseteq L(S)$, and, for any $S' \not\equiv S$, if $D \subseteq L(S')$, then $L(S') \not\subseteq L(S)$. We similarly obtain the definition of minimality for learning disjunctive multiplicity expressions. Intuitively, a minimal schema consistent with a set of examples is the most specific schema consistent with them. For example, recall the three XML documents storing information about books from Figure 1. Assume that the user provides the three documents as positive examples to a learning algorithm. The most specific schema consistent with the examples is:

$book \rightarrow title \parallel year^? \parallel (author^+ \mid editor^+)$.

Another possible solution is the schema:

$book \rightarrow title \parallel year^? \parallel author^* \parallel editor^*$.

It is less likely that a user wants to obtain such a schema which allows a book to have at the same time authors and editors. In this case, the most specific schema also corresponds to the natural requirements that one might want to impose on a XML collection storing information about books, in particular a book has either at least one author or at least one editor. Minimality is often perceived as a better fitted learning solution [3-5, 16], and this motivates our requirement for the learning algorithms to return minimal concepts consistent with the examples.

4. Learning DMS from positive examples

The main result of this section is the learnability of the disjunctive multiplicity schemas from positive examples i.e., in the setting $(T, DMS, L)$. We present a learning algorithm that constructs a minimal schema consistent with the input set of trees.
First, we study the problem of learning a disjunctive multiplicity expression from positive examples i.e., in the setting $(W_\Sigma, DME, L)$. We present a learning algorithm that constructs a minimal disjunctive multiplicity expression consistent with the input collection of unordered words. Given a set of unordered words, there may exist many consistent minimal disjunctive multiplicity expressions. In fact, for some sets of positive examples there may be an exponential number of such expressions (cf. the proof of Lemma 6.2). Take in Example 4.1 a sample and two consistent minimal disjunctive multiplicity expressions.

Example 4.1 Consider the alphabet $\Sigma = \{a, b, c, d, e\}$ and the set of unordered words $D = \{aabc, abd, be\}$. Take the following two disjunctive multiplicity expressions:

$E_1 = (a^+ \mid e) \parallel b \parallel (c^? \mid d^?)$, $E_2 = a^* \parallel b \parallel (c \mid d \mid e)$.

Note that $D \subseteq L(E_1)$ and $D \subseteq L(E_2)$. Also note that $L(E_1) \not\subseteq L(E_2)$ (because of $ab$) and $L(E_2) \not\subseteq L(E_1)$ (because of $bc$). On the other hand, we easily observe that both $E_1$ and $E_2$ are minimal disjunctive multiplicity expressions with languages including $D$.

Before we present the learning algorithms, we have to introduce additional notions. First, we define the function min_fit_multiplicity($\cdot$) which, given a set of unordered words $D$ and a label $a \in \Sigma$, computes the multiplicity $M$ such that $\forall w \in D.\ w(a) \in [\![M]\!]$ and there does not exist another multiplicity $M'$ such that $[\![M']\!] \subsetneq [\![M]\!]$ and $\forall w \in D.\ w(a) \in [\![M']\!]$. For example, given the set of unordered
words $D = \{aabc, abd, be\}$, we have: min_fit_multiplicity($D$, $a$) = $*$, min_fit_multiplicity($D$, $b$) = $1$, min_fit_multiplicity($D$, $c$) = $?$.

Next, we introduce the notion of maximal-clique partition of a graph. Given a graph $G = (V, E)$, a maximal-clique partition of $G$ is a graph partition $(V_1, \ldots, V_k)$ such that:
- The subgraph induced in $G$ by any $V_i$ is a clique (with $1 \le i \le k$),
- The subgraph induced in $G$ by the union of any $V_i$ and $V_j$ is not a clique (with $1 \le i \neq j \le k$).

In Figure 4 we present a graph and a maximal-clique partition of it i.e., $\{\{a, e\}, \{b\}, \{c, d\}\}$. Note that the graph from Figure 4 allows one other maximal-clique partition i.e., $\{\{a\}, \{b\}, \{c, d, e\}\}$. On the other hand, $\{\{a\}, \{b\}, \{c, d\}, \{e\}\}$ is not a maximal-clique partition because it contains two sets such that their union induces a clique i.e., $\{a\}$ and $\{e\}$.

Figure 4. A graph and a maximal-clique partition of it. Vertices from the same rectangle belong to the same set.

Unlike the clique problem, which is known to be NP-complete [25], we can partition in PTIME a graph in maximal cliques with a greedy algorithm. In the sequel, we assume that the vertices of the graph are labels from $\Sigma$. For a given graph there may exist many maximal-clique partitions and we use the total order $<_\Sigma$ to propose a deterministic algorithm constructing a maximal-clique partition. The algorithm works as follows: we take the smallest label from $\Sigma$ w.r.t. $<_\Sigma$ and not yet used in a clique, and we iteratively extend it to a maximal clique by adding connected labels. Every time when we have a choice to add a new label to the current clique, we take the smallest label w.r.t. $<_\Sigma$. We repeat this until all the labels are used. This algorithm yields a unique maximal-clique partition. For example, for the graph from Figure 4, we compute the maximal-clique partition marked on the figure i.e., $\{\{a, e\}, \{b\}, \{c, d\}\}$. We additionally define the function max_clique_partition($\cdot$) which takes as input a graph, computes a maximal-clique partition using the greedy algorithm described above and, at the end, for technical reasons, discards the singletons. For example, for the graph from Figure 4, the function max_clique_partition($\cdot$) returns $\{\{a, e\}, \{c, d\}\}$. Clearly, the function max_clique_partition($\cdot$) works in PTIME.

Next, we present Algorithm 1 and we claim that, given a set of unordered words $D$, it computes in polynomial time a disjunctive multiplicity expression $E$ consistent with $D$. Algorithm 1 works in three steps and we illustrate each of them on the sample $D = \{aabc, abd, be\}$ from Example 4.1.
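The two auxiliary functions described above can be sketched in Python as follows. The sketch assumes that unordered words are `collections.Counter` objects, that the total order on labels is the usual string order, and that a graph is given as a collection of vertices plus a set of undirected edges stored as pairs; these representations are assumptions made for illustration only. The sample and graph used in the usage note are likewise chosen for illustration.

```python
from collections import Counter

def min_fit_multiplicity(words, a):
    """Return the smallest multiplicity covering the occurrences of a.

    words is assumed to be a nonempty list of Counter objects.
    """
    counts = {w[a] for w in words}
    if counts == {0}:
        return "0"
    if counts == {1}:
        return "1"
    if counts <= {0, 1}:
        return "?"
    if 0 not in counts:
        return "+"
    return "*"

def max_clique_partition(vertices, edges):
    """Greedy maximal-clique partition; singleton sets are discarded at the end."""
    def adjacent(u, v):
        return (u, v) in edges or (v, u) in edges

    unused = sorted(vertices)          # total order on the labels
    partition = []
    while unused:
        clique = [unused.pop(0)]       # smallest label not yet used
        for v in list(unused):         # candidates in increasing order
            if all(adjacent(v, u) for u in clique):
                clique.append(v)
                unused.remove(v)
        partition.append(clique)
    return [set(c) for c in partition if len(c) > 1]
```

With the words `aabc`, `abd`, `be`, the function `min_fit_multiplicity` yields `*` for `a`, `1` for `b`, and `?` for `c`. On a graph over `a` to `e` with edges `a-e`, `c-d`, `c-e`, `d-e`, the partition function returns the two sets `{a, e}` and `{c, d}`. A vertex rejected while a clique is being built stays rejected as the clique grows, which is why a single ordered pass per clique is enough.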
The first step (lines 1-2) computes the compact representation of the extended cardinality map for each symbol from $\Sigma$, using the function min_fit_multiplicity($\cdot$). We ignore in the sequel the symbols never occurring in words from $D$ (line 3). For the sample from Example 4.1, we infer: $N_E^*(a) = *$, $N_E^*(b) = 1$, $N_E^*(c) = N_E^*(d) = N_E^*(e) = ?$.

Algorithm 1 Learning disjunctive multiplicity expressions from positive examples.
algorithm learner_DME($D$)
Input: A set of unordered words $D = \{w_1, \ldots, w_n\}$
Output: A minimal disjunctive multiplicity expression $E$ consistent with $D$
1: for $a \in \Sigma$ do
2:   let $N_E^*(a)$ = min_fit_multiplicity($D$, $a$)
3: let $\Sigma' = \{a \in \Sigma \mid N_E^*(a) \in \{?, 1, +, *\}\}$
4: let $G = (\Sigma', \{(a, b) \in \Sigma' \times \Sigma' \mid \forall w \in D.\ a \notin w \vee b \notin w\})$
5: let $C_E^*$ = max_clique_partition($G$)
6: let $P_E^* = \{\{a\} \mid N_E^*(a) \in \{1, +\}\} \cup \{X \in C_E^* \mid \forall w \in D.\ \exists a \in X.\ a \in w\}$
7: return $E$ characterized by the triple $(C_E^*, N_E^*, P_E^*)$

The second step of the algorithm (lines 4-5) computes the compact sets of conflicting siblings. First, we construct the graph $G$ having as set of vertices the labels occurring at least once in unordered words from $D$. Two labels are linked by an edge in $G$ if there does not exist an unordered word in $D$ where both of them are present at the same time, in other words the two labels are a candidate pair of conflicting siblings. Next, we apply the function max_clique_partition($\cdot$) on the graph $G$. For the unordered words from Example 4.1 we obtain the graph from Figure 4, and we infer $C_E^* = \{\{a, e\}, \{c, d\}\}$. Note that the maximal-clique partition implies the minimality of the disjunctive multiplicity expression constructed later using the inferred $C_E^*$.

The third step of the algorithm (line 6) computes the $\subseteq$-minimal sets of required symbols $P_E^*$. Each symbol having associated multiplicity $1$ or $+$ belongs to a required set of symbols containing only itself because it is present in all the unordered words from $D$ and we want to learn a minimal concept. Moreover, we add in $P_E^*$ the sets of conflicting siblings inferred at the previous step with the property that one of them is present in any unordered word from $D$, to guarantee the minimality of the inferred language. For the sample from Example 4.1, $\{b\}$ belongs to $P_E^*$. Since from the previous step we have $C_E^* = \{\{a, e\}, \{c, d\}\}$, at this step we have to add $\{a, e\}$ to $P_E^*$ because all the words in the sample contain either $a$ or $e$. On the other hand, we do not add $\{c, d\}$ because the sample contains the word $be$. The inferred $P_E^*$ is $\{\{a, e\}, \{b\}\}$.
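The graph construction of the second step and the computation of the required sets in the third step can be sketched as below. This is a hedged illustration, not the authors' implementation: unordered words are assumed to be `collections.Counter` objects, the clique partition and the multiplicity map are passed in as already computed inputs, and the function names `conflict_graph` and `required_sets` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def conflict_graph(words):
    """Build the graph of candidate conflicts (line 4 of Algorithm 1).

    Vertices are the labels occurring in the sample; two labels are linked
    when no word of the sample contains both of them.
    """
    labels = sorted({a for w in words for a in w if w[a] > 0})
    edges = {(a, b) for a, b in combinations(labels, 2)
             if all(w[a] == 0 or w[b] == 0 for w in words)}
    return labels, edges

def required_sets(words, cliques, multiplicity):
    """Compute the minimal sets of required symbols (line 6 of Algorithm 1).

    cliques is the list of conflicting sets, multiplicity maps each label
    to its inferred multiplicity.
    """
    # Labels with multiplicity 1 or + occur in every word of the sample.
    singles = [{a} for a, m in sorted(multiplicity.items()) if m in ("1", "+")]
    # A conflicting set is required when every word contains one of its labels.
    covered = [set(x) for x in cliques
               if all(any(w[a] > 0 for a in x) for w in words)]
    return singles + covered
```

On the illustrative sample `aabc`, `abd`, `be`, the graph has the edges `a-e`, `c-d`, `c-e`, `d-e`, and with the cliques `{a, e}` and `{c, d}` the required sets are `{b}` and `{a, e}`; the set `{c, d}` is left out because the word `be` contains neither `c` nor `d`.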
Finally, the algorithm returns the disjunctive multiplicity expression characterized by the inferred triple (line 7). For the sample $D$, it returns $E = (a^+ \mid e) \parallel b \parallel (c^? \mid d^?)$.

Note that if at step 2 we take a partition which is not a maximal-clique one, for example $\{\{a\}, \{b\}, \{c, d\}, \{e\}\}$, and we later construct a disjunctive multiplicity expression using it, we get $a^* \parallel b \parallel (c^? \mid d^?) \parallel e^?$, which includes both $E_1$ and $E_2$ from Example 4.1, therefore is not minimal. Also note that at step 3, without $\{a, e\}$ added to $P_E^*$, the resulting schema would accept an unordered word without any $a$ and $e$, so the learned language would not be minimal.

Algorithm 1 is sound and each of its three steps requires polynomial time. Next, we prove the completeness of the algorithm. Given a disjunctive multiplicity expression $E$, we construct in three steps its characteristic sample $CS_E$. At the same time, we illustrate the construction on the disjunctive multiplicity expression $E_1 = (a^+ \mid e) \parallel b \parallel (c^? \mid d^?)$:

1. We take the pairs of symbols which can be found together in an unordered word in $L(E)$. For each of them, we add in $CS_E$ an unordered word containing only
the two symbols. Next, for each symbol occurring in the disjunctions from $E$, we add in $CS_E$ an unordered word containing only one occurrence of that symbol. We also add in $CS_E$ the empty word. For $E_1$ we obtain: $\{ab, ac, ad, bc, bd, be, ce, de, a, b, c, d, e, \varepsilon\}$.

2. We replace each unordered word $w$ obtained at the previous step with $w \uplus w'$, where $w'$ is a minimal unordered word such that $w \uplus w' \in L(E)$. The newly obtained $CS_E$ contains unordered words from $L(E)$. For $E_1$ we obtain: $\{ab, abc, abd, be, bce, bde\}$.

3. For each symbol $a$ from the alphabet such that $N_E^*(a)$ is $+$ or $*$, we randomly take an unordered word $w$ from $CS_E$ and containing $a$ and we add to $CS_E$ the unordered word $w \uplus a$. In the worst case, at this step the number of words in the characteristic sample is doubled, but it remains polynomial in the size of the alphabet. For $E_1$ we obtain: $\{aab, ab, abc, abd, be, bce, bde\}$.

Note that there may exist many equivalent characteristic samples. The first step of the construction implies that the only potential conflicts to be considered in Algorithm 1 are the conflicts implied by the expression. In other words, all the connected components of the graph of potential conflicts from Algorithm 1 are cliques. Thus, there is only one possible maximal-clique partition to be done in the algorithm. Moreover, the second and third steps of the construction ensure that, for any sample consistently extending the characteristic sample, Algorithm 1 infers the correct sets of required symbols and the extended cardinality map, respectively.

We have proposed Algorithm 1, which is a sound and complete algorithm for learning minimal disjunctive multiplicity expressions from unordered words positive examples. Thus, we can state the following result:

Lemma 4.2 The concept class DME is learnable in polynomial time and data from positive examples i.e., in the setting $(W_\Sigma, DME, L)$.

Next, we extend the result for DMS. We propose Algorithm 2, which learns a disjunctive multiplicity schema from a set of trees. We assume w.l.o.g. that all the trees from the sample have as root label the same label $r$. If this assumption is not satisfied, the sample is not consistent. The algorithm infers, for each label $a$ from the alphabet, the minimal disjunctive multiplicity expression consistent with the children of all the nodes labeled $a$ from the trees from the sample.

Algorithm 2 Learning DMS from positive examples.
algorithm learner_DMS($D$)
Input: A set of trees $D = \{t_1, \ldots, t_n\}$ s.t. $lab_{t_i}(root_{t_i}) = r$ (with $1 \le i \le n$)
Output: A minimal DMS $S$ consistent with $D$
1: for $a \in \Sigma$ do
2:   let $D' = \{ch_t^n \mid t \in D.\ n \in N_t.$
$lab_t(n) = a\}$
3:   let $R_S(a)$ = learner_DME($D'$)
4: return $S = (r, R_S)$

Algorithm 2 returns a minimal disjunctive multiplicity schema consistent with the sample because the inferred rule for each label represents a minimal disjunctive multiplicity expression obtained using Algorithm 1. Next, we show that Algorithm 2 is also complete by providing a construction of a characteristic sample of cardinality polynomial in the size of the alphabet. For this purpose, we have to define first two additional notions. Given a DMS $S = (root_S, R_S)$ and a label $a \in \Sigma$, we define the following two trees:
- $min\_tree^{\uparrow}_{(S,a)}$ is a minimal tree satisfying $S$ and containing a node labeled $a$,
- $min\_tree^{\downarrow}_{(S,a)}$ is a minimal tree satisfying $S' = (a, R_S)$. It is equivalent to $min\_tree^{\uparrow}_{(S',a)}$.

We illustrate the two notions defined above in the following example:

Example 4.3 Consider the DMS $S$ having the root label $r$ and the rules:

$r \rightarrow a^* \parallel (b \mid c)$, $a \rightarrow d^?$, $b, c \rightarrow e^+$, $d, e \rightarrow \varepsilon$.

We present in Figure 5 some trees and we explain for each of them how it can be used.

Figure 5. Trees used for Example 4.3. The panels show the trees $min\_tree^{\uparrow}_{(S,x)}$ and $min\_tree^{\downarrow}_{(S,x)}$ for the labels $x$ of the alphabet.

Next, we present the construction of the characteristic sample for learning DMS from positive examples. We take a DMS $S = (root_S, R_S)$ over an alphabet $\Sigma$ and we assume w.l.o.g. that any symbol of the alphabet can be present in at least one tree from $L(S)$. For each $a \in \Sigma$, for each $w \in CS_{R_S(a)}$, we compute a tree $t$ as follows: we generate a tree $min\_tree^{\uparrow}_{(S,a)}$, we take the node labeled by $a$ (let it be $n_a$), and for any $b \in \Sigma$, while $ch_t^{n_a}(b) < w(b)$ we fuse in $n_a$ a copy of $min\_tree^{\downarrow}_{(S,b)}$. We obtain a sample of cardinality polynomially bounded by the size of the alphabet. Given a DMS $S$, there may exist many characteristic samples $CS_S$. Each of them has the property that, if we construct a sample $D$ which extends $CS_S$ consistently with $S$, then learner_DMS($D$) returns $S$. This proves the completeness of Algorithm 2.

We illustrate the construction of the characteristic sample on the schema $S$ from Example 4.3. Recall that we have already presented the trees $min\_tree^{\uparrow}_{(S,a)}$ and $min\_tree^{\downarrow}_{(S,a)}$ for each $a$ from the alphabet. We also construct the characteristic samples for the disjunctive multiplicity expressions from the rules of $S$:

$CS_{R_S(r)} = \{aab, ab, ac, b, c\}$, $CS_{R_S(a)} = \{\varepsilon, d\}$, $CS_{R_S(b)} = CS_{R_S(c)} = \{e, ee\}$, $CS_{R_S(d)} = CS_{R_S(e)} = \{\varepsilon\}$.
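The per-label decomposition used by Algorithm 2 can be sketched as follows. The sketch assumes that a tree is a pair `(label, list_of_subtrees)` and that the expression learner is supplied by the caller as the parameter `learn_expression`, a stand-in for Algorithm 1; both choices, and the function names, are assumptions made for illustration. Labels that never occur in the sample simply get no rule here.

```python
from collections import Counter, defaultdict

def children_words(trees):
    """Collect, for each label, the unordered words of children of its nodes."""
    words = defaultdict(list)
    stack = list(trees)
    while stack:
        label, kids = stack.pop()
        # The unordered word of a node counts the labels of its children.
        words[label].append(Counter(k[0] for k in kids))
        stack.extend(kids)
    return words

def learn_schema(trees, learn_expression):
    """Learn one rule per label; return None when the root labels disagree."""
    roots = {t[0] for t in trees}
    if len(roots) != 1:
        return None  # the sample is not consistent
    rules = {a: learn_expression(ws) for a, ws in children_words(trees).items()}
    return roots.pop(), rules
```

Any function that maps a list of unordered words to an expression can be plugged in as `learn_expression`, which keeps the tree traversal independent of how expressions are represented.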
In Figure 6 we present a characteristic sample $CS_S$ for the DMS $S$ and we explain the purpose of each tree:
- (a), (b), (c), (d), and (e) ensure that there is inferred the correct rule for the root i.e., $R_S(r)$,
- (a) and (f) ensure that there is inferred the correct $R_S(a)$,
- (d) and (g) ensure that there is inferred the correct $R_S(b)$,
- (e) and (h) ensure that there is inferred the correct $R_S(c)$,
- The nodes labeled by $d$ and $e$ never have children in the trees from $CS_S$, so there are inferred the correct rules for $R_S(d)$ and $R_S(e)$.

Figure 6. Characteristic sample for the schema $S$ from Example 4.3.

We have proposed Algorithm 2, which is a sound and complete algorithm for learning disjunctive multiplicity schemas from trees positive examples. Thus, we can state the main result of this section:

Theorem 4.4 The concept class DMS is learnable in polynomial time and data from positive examples i.e., in the setting $(T, DMS, L)$.

5. Learning MS from positive examples

In this section we show that the MS are learnable from positive examples i.e., in the setting $(T, MS, L)$. Recall that the MS allow no disjunction in the rules, in other words they use expressions of the form $a_1^{M_1} \parallel \ldots \parallel a_n^{M_n}$. Due to this very particular form, we can capture a MS $S = (root_S, R_S)$ using a function $\mu : \Sigma \times \Sigma \rightarrow \{0, 1, ?, +, *\}$ obtained directly from the rules of $S$: for any $a \in \Sigma$,

$a \rightarrow a_1^{\mu(a, a_1)} \parallel \ldots \parallel a_n^{\mu(a, a_n)}$.

For example, given the schema $S$ having the root $r$ and the rules:

$r \rightarrow a^+ \parallel b$, $a \rightarrow b^*$, $b \rightarrow a^? \parallel b^?$,

we have:

$\mu(r, a) = +$, $\mu(r, b) = 1$, $\mu(r, r) = 0$,
$\mu(a, a) = 0$, $\mu(a, b) = *$, $\mu(a, r) = 0$,
$\mu(b, a) = ?$, $\mu(b, b) = ?$, $\mu(b, r) = 0$.

Note that given the function $\mu(\cdot)$ we can easily construct the initial $S$. We use this characterization in Algorithm 3, a polynomial and sound algorithm which learns a minimal MS from a set of trees. We assume w.l.o.g. that all the trees from the sample have as root label the same label $r$. If this assumption is not satisfied, the sample is not consistent. The minimality of the algorithm follows from the minimality of the inferred multiplicity for each pair of labels $(a, b)$, using the function min_fit_multiplicity($\cdot$) (cf. Section 4). Moreover, Algorithm 3 is complete. We can easily construct a characteristic sample of cardinality polynomial in the size of the alphabet by using the same steps provided in the previous section, for unordered words and for trees.

Algorithm 3 Learning MS from positive examples.
algorithm learner_MS($D$)
Input: A set of trees $D = \{t_1, \ldots, t_n\}$ s.t. $lab_{t_i}(root_{t_i}) = r$ (with $1 \le i \le n$)
Output: A minimal MS $S$ consistent with $D$
1: for $a \in \Sigma$ do
2:   let $D' = \{ch_t^n \mid t \in D.\ n \in N_t.$
$lab_t(n) = a\}$
3:     for $b \in \Sigma$ do
4:         let $\mu(a, b) = min\_fit\_multiplicity(D', b)$
5: return $S$ having the root label $r$ and captured by $\mu$

We have proposed a sound and complete algorithm which learns a minimal MS consistent with a set of positive examples, so we can state the following result:

Theorem 5.1 The concept class MS is learnable in polynomial time and data from positive examples i.e., in the setting $(T, MS, L)$.

6. Impact of negative examples

In the previous sections, we have considered the settings where the user provides positive examples only. In this section, we allow the user to additionally specify negative examples. The main results of this section are that the MS are learnable in polynomial time and data in the presence of both positive and negative examples, while the DMS are not.

We use two symbols $+$ and $-$ to mark whether an example is positive or negative, and we define:

$W_\Sigma^\pm = W_\Sigma \times \{+, -\}$,
$L^\pm(E) = \{(w, +) \mid w \in L(E)\} \cup \{(w, -) \mid w \in W_\Sigma \setminus L(E)\}$, where $E$ is a disjunctive multiplicity expression,
$T^\pm = T \times \{+, -\}$,
$L^\pm(S) = \{(t, +) \mid t \in L(S)\} \cup \{(t, -) \mid t \in T \setminus L(S)\}$, where $S$ is a disjunctive multiplicity schema.

Formally, the setting for learning disjunctive multiplicity expressions from positive and negative examples is $(W_\Sigma^\pm, DME, L^\pm)$, while for learning DMS from positive and negative examples we have $(T^\pm, DMS, L^\pm)$. We obtain analogously the settings for disjunction-free multiplicity expressions and schemas: $(W_\Sigma^\pm, ME, L^\pm)$ and $(T^\pm, MS, L^\pm)$, respectively.

We study the problem of checking whether there exists a concept consistent with the input sample because any sound learning algorithm needs to return null if and only if there is no such concept. Therefore, consistency checking is an easier problem than learning and its intractability precludes learnability. Formally, given a learning setting $K = (\mathcal{E}, \mathcal{C}, L)$, the $K$-consistency is the following decision problem:

$CONS_K = \{D \subseteq \mathcal{E} \mid \exists c \in \mathcal{C}.\ D \subseteq L(c)\}$.

Note that the consistency checking is trivial when only positive examples are allowed. For instance, if we want to learn disjunctive multiplicity expressions from positive examples over the alphabet $\{a_1, \ldots, a_n\}$, the disjunctive multiplicity expression $a_1^* \,\|\, \ldots \,\|\, a_n^*$ is always consistent with the examples. When we also allow negative examples, the problem becomes more complex, particularly in the case of disjunctive multiplicity expressions and schemas, where this problem is not tractable.
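Algorithm 3 from Section 5 can be sketched in Python as follows. This is only an illustrative sketch under assumptions that are not part of the paper: a tree is encoded as a pair `(label, [children])`, the helper names `child_bags` and `learn_ms` are hypothetical, and `min_fit_multiplicity` is assumed to return the smallest of the multiplicities 0, 1, ?, +, * that covers the observed numbers of occurrences.

```python
from collections import Counter

# Hypothetical encoding (not from the paper): a tree is (label, [child trees]).


def child_bags(tree, bags=None):
    """Group, per label a, the unordered words of children of every a-node."""
    if bags is None:
        bags = {}
    label, children = tree
    bags.setdefault(label, []).append(Counter(c[0] for c in children))
    for child in children:
        child_bags(child, bags)
    return bags


def min_fit_multiplicity(words, b):
    """Smallest multiplicity among 0, 1, ?, +, * covering the counts of b
    in the given unordered words (assumed behaviour of the paper's function)."""
    counts = [w[b] for w in words]
    if not counts or max(counts) == 0:
        return "0"
    lo, hi = min(counts), max(counts)
    if hi == 1:
        return "1" if lo == 1 else "?"
    return "+" if lo >= 1 else "*"


def learn_ms(trees):
    """Return (root label, mu) for a non-empty sample, or None if roots differ."""
    roots = {t[0] for t in trees}
    if len(roots) != 1:
        return None  # the sample is not consistent
    bags = {}
    for t in trees:
        child_bags(t, bags)
    alphabet = sorted(bags)  # labels occurring in the sample
    mu = {(a, b): min_fit_multiplicity(bags[a], b)
          for a in alphabet for b in alphabet}
    return roots.pop(), mu


# Hypothetical sample with two book documents.
sample = [
    ("book", [("title", []), ("year", []), ("author", [])]),
    ("book", [("title", []), ("author", []), ("author", [])]),
]
root, mu = learn_ms(sample)
print(root, mu[("book", "title")], mu[("book", "year")], mu[("book", "author")])
# prints: book 1 ? +
```

The sketch restricts the alphabet to the labels that occur in the sample; labels that never occur would simply receive the multiplicity 0 everywhere.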
First, we show that the consistency checking is tractable for MS. In Section 5, we have proposed Algorithm 3, which learns a minimal MS consistent with a set of positive examples. Note that, given a set of trees, there exists a unique minimal MS consistent with them. The argument is that Algorithm 3 uses the function $min\_fit\_multiplicity(\cdot)$ (cf. Section 4) to infer minimal multiplicities which are unique and sufficient to capture a MS. Thus, the consistency checking becomes trivial for MS: given a sample containing positive and negative examples, there exists a MS consistent with them iff no tree used as negative example satisfies the minimal MS returned by Algorithm 3. Consequently, we easily adapt Algorithm 3 to handle both positive and negative examples and we propose Algorithm 4.

Algorithm 4 Learning MS from positive and negative examples.
algorithm $learner^\pm_{MS}(D)$
Input: A sample $D = \{(t, \alpha) \mid t \in T,\ \alpha \in \{+, -\}\}$
Output: A minimal MS $S$ such that $D \subseteq L^\pm(S)$, or null if no such schema exists
1: let $D' = \{t \in T \mid (t, +) \in D\}$
2: let $S = learner^+_{MS}(D')$
3: if $\exists t \in T.\ (t, -) \in D \wedge t \in L(S)$ then
4:     return null
5: return $S$

Essentially, Algorithm 4 returns the minimal schema consistent with the positive examples iff there is no negative example satisfying it, and otherwise it returns null. Note that Algorithm 4 is sound and works in polynomial time in the size of the input. The completeness of Algorithm 4 follows from the completeness of Algorithm 3. Given a MS $S$, we can construct a characteristic sample $CS_S$ that contains only positive examples, analogously to how it is done for Algorithm 3. We have proposed a polynomial, sound, and complete algorithm which learns a minimal MS from positive and negative examples, so we state the first result of this section:

Theorem 6.1 The concept class MS is learnable in polynomial time and data from positive and negative examples i.e., in the setting $(T^\pm, MS, L^\pm)$.

Next, we prove that the concept class DMS is not learnable in polynomial time and data in the setting $DMS^\pm = (T^\pm, DMS, L^\pm)$. For this purpose, we first show the intractability of learning disjunctive multiplicity expressions from positive and negative examples i.e., in the setting $DME^\pm = (W_\Sigma^\pm, DME, L^\pm)$. We study the complexity of checking the consistency of a set of positive and negative examples and we prove the intractability of $CONS_{DME^\pm}$.
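Intuitively, this follows from the fact that, given a set of unordered words, there may exist an exponential number of minimal consistent disjunctive multiplicity expressions, and we may need to check all of them to decide whether there exist negative examples satisfying them. Formally, we have the following result:

Lemma 6.2 $CONS_{DME^\pm}$ is NP-complete.

Proof We prove the NP-hardness by reduction from 3SAT, which is known to be NP-complete. We take a formula $\varphi$ in 3CNF containing the clauses $c_1, \ldots, c_k$ over the variables $x_1, \ldots, x_n$. We generate a sample $D_\varphi$ over the alphabet $\Sigma = \{t_1, f_1, \ldots, t_n, f_n\}$ such that:

$(t_1 f_1 \ldots t_n f_n, +) \in D_\varphi$,
$(\epsilon, -) \in D_\varphi$,
$(t_i f_i, +), (t_i t_i f_i f_i, -) \in D_\varphi$, for $1 \le i \le n$,
$(w_j, -) \in D_\varphi$, where $w_j = v_{j1} v_{j1} v_{j2} v_{j2} v_{j3} v_{j3}$, for any $j$ such that $1 \le j \le k$, where $x_{j1}, x_{j2}, x_{j3}$ are the literals used in the clause $c_j$ and for any $l$ such that $1 \le l \le 3$, $v_{jl}$ is $t_{jl}$ if $x_{jl}$ is a negative literal in $c_j$, and $f_{jl}$ otherwise.

For example, for the formula $(x_1 \vee \neg x_2 \vee x_3) \wedge (\neg x_1 \vee x_3 \vee \neg x_4)$, we generate the sample:

$(t_1 f_1 t_2 f_2 t_3 f_3 t_4 f_4, +)$, $(t_1 f_1, +)$, $(t_2 f_2, +)$, $(t_3 f_3, +)$, $(t_4 f_4, +)$, $(\epsilon, -)$, $(t_1 t_1 f_1 f_1, -)$, $(t_2 t_2 f_2 f_2, -)$, $(t_3 t_3 f_3 f_3, -)$, $(t_4 t_4 f_4 f_4, -)$, $(f_1 f_1 t_2 t_2 f_3 f_3, -)$, $(t_1 t_1 f_3 f_3 t_4 t_4, -)$.

For a given $\varphi$, a valuation is a function $V : \{x_1, \ldots, x_n\} \to \{true, false\}$. Each of the $2^n$ possible valuations encodes a minimal disjunctive multiplicity expression $E_V$ consistent with the positive examples from $D_\varphi$, constructed as follows:

$E_V = (v_1 \mid \ldots \mid v_n)^+ \,\|\, \bar{v}_1^? \,\|\, \ldots \,\|\, \bar{v}_n^?$,

where, for $1 \le i \le n$, if $V(x_i) = true$ then $v_i = t_i$ and $\bar{v}_i = f_i$. Otherwise, $v_i = f_i$ and $\bar{v}_i = t_i$.

Next, we show that, for any valuation $V$, $V \models \varphi$ iff $E_V$ is consistent with $D_\varphi$. For the only if case, consider a valuation $V$ such that $V \models \varphi$ and take the corresponding expression $E_V = (v_1 \mid \ldots \mid v_n)^+ \,\|\, \bar{v}_1^? \,\|\, \ldots \,\|\, \bar{v}_n^?$. Note that $t_1 f_1 \ldots t_n f_n$ and all the $t_i f_i$'s (with $1 \le i \le n$) satisfy $E_V$, while $\epsilon$ does not satisfy $E_V$. Also note that for $1 \le i \le n$, one symbol between $t_i$ and $f_i$ occurs at least once, while the other occurs at most once, so all the $t_i t_i f_i f_i$'s do not satisfy $E_V$. Assume that there is a $w_j$ (with $1 \le j \le k$) such that $w_j$ satisfies $E_V$, which by construction implies that the clause $c_j$ is not satisfied by the valuation $V$, which is a contradiction. Hence, $w_j$ does not satisfy $E_V$ for any $1 \le j \le k$. Therefore, $E_V$ is consistent with $D_\varphi$.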
For the if case, we assume that $E_V$ is consistent with the sample $D_\varphi$. Since the $w_j$'s (with $1 \le j \le k$) encode the valuations making the clauses $c_j$ false and none of the $w_j$'s satisfies $E_V$, the valuation $V$ encoded in $E_V$ satisfies the formula $\varphi$. The construction of $D_\varphi$ also ensures that if there exists a disjunctive multiplicity expression consistent with $D_\varphi$, it has the form of $E_V$. Therefore, $\varphi \in$ 3SAT iff $D_\varphi \in CONS_{DME^\pm}$.

To prove the membership of $CONS_{DME^\pm}$ in NP, we point out that a Turing machine guesses a disjunctive multiplicity expression $E$, whose size is linear in $|\Sigma|$ since repetitions are discarded among the disjunctions of $E$. Moreover, checking whether $E$ is consistent with the sample can easily be done in polynomial time.

We extend the above result to $CONS_{DMS^\pm}$:

Corollary 6.3 $CONS_{DMS^\pm}$ is NP-complete.

Proof The NP-hardness of $CONS_{DME^\pm}$ implies the NP-hardness of $CONS_{DMS^\pm}$: it is sufficient to consider flat trees all having the same root label. Moreover, to prove the membership of $CONS_{DMS^\pm}$ in NP, a Turing machine guesses a disjunctive multiplicity schema $S$, whose size is polynomial in $|\Sigma|$, and checks whether $S$ is consistent with the sample (which can be done in polynomial time).
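The reduction used in the proof of Lemma 6.2 can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: literals are encoded as pairs `(variable index, is_positive)`, unordered words as `Counter` objects, and the function names are hypothetical. The membership test `satisfies_ev` hard-codes the assumed semantics of an expression of the form $(v_1 \mid \ldots \mid v_n)^+ \,\|\, \bar{v}_1^? \,\|\, \ldots \,\|\, \bar{v}_n^?$: at least one occurrence in total of the symbols $v_i$, and at most one occurrence of each $\bar{v}_i$.

```python
from collections import Counter
from itertools import product

# Hypothetical encoding: a literal is (variable index, is_positive),
# a clause is a tuple of three literals, an unordered word is a Counter.


def build_sample(n, clauses):
    """Return (positive words, negative words) of the sample D_phi."""
    pos = [Counter({f"{s}{i}": 1 for i in range(1, n + 1) for s in "tf"})]
    neg = [Counter()]  # the empty word epsilon
    for i in range(1, n + 1):
        pos.append(Counter({f"t{i}": 1, f"f{i}": 1}))
        neg.append(Counter({f"t{i}": 2, f"f{i}": 2}))
    for clause in clauses:
        w = Counter()
        for var, is_positive in clause:
            # doubled f for a positive literal, doubled t for a negative one
            w[("f" if is_positive else "t") + str(var)] += 2
        neg.append(w)
    return pos, neg


def satisfies_ev(word, valuation):
    """Membership of an unordered word in E_V (assumed semantics)."""
    chosen = 0
    allowed = set()
    for i, value in valuation.items():
        v, vbar = (f"t{i}", f"f{i}") if value else (f"f{i}", f"t{i}")
        allowed.update((v, vbar))
        chosen += word[v]
        if word[vbar] > 1:
            return False
    return chosen >= 1 and set(word) <= allowed


def consistent(valuation, pos, neg):
    """E_V accepts every positive word and rejects every negative word."""
    return (all(satisfies_ev(w, valuation) for w in pos)
            and not any(satisfies_ev(w, valuation) for w in neg))


def holds(valuation, clauses):
    """Standard evaluation of a CNF formula under a valuation."""
    return all(any(valuation[var] == is_positive for var, is_positive in clause)
               for clause in clauses)


# The example formula from the proof:
# (x1 or not x2 or x3) and (not x1 or x3 or not x4).
clauses = [((1, True), (2, False), (3, True)),
           ((1, False), (3, True), (4, False))]
pos, neg = build_sample(4, clauses)
valuations = [dict(zip(range(1, 5), bits))
              for bits in product([True, False], repeat=4)]
agree = all(consistent(V, pos, neg) == holds(V, clauses) for V in valuations)
print(len(pos), len(neg), agree)  # prints: 5 7 True
```

The brute-force loop over all valuations is exponential and serves only to check the equivalence on a tiny instance; it says nothing about how a consistent expression would be found in general.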
Since consistency checking in the presence of positive and negative examples is intractable for DMS, we conclude that:

Theorem 6.4 Unless P = NP, the concept class DMS is not learnable in polynomial time and data from positive and negative examples i.e., in the setting $(T^\pm, DMS, L^\pm)$.

7. Conclusions and future work

We have studied the problem of learning unordered XML schemas from examples given by the user. We have investigated the learnability of DMS and MS in two settings: one allowing positive examples only, and one allowing both positive and negative examples. To the best of our knowledge, no research has been done on learning unordered XML schema formalisms, nor on allowing both positive and negative examples in the process of schema learning. We have proven that the DMS are learnable from positive examples only, and we have shown that they are not learnable from positive and negative examples by using the intractability of the consistency checking. Moreover, we have proven that the MS are learnable in both settings: from positive examples only, and also from positive and negative examples. For all the learnable cases we have proposed learning algorithms that return minimal schemas consistent with the examples.

As future work, we want to use a more specific learnability condition i.e., to require the size (instead of the cardinality) of the characteristic sample to be polynomial in the size of the alphabet. Thus, we will fully adhere to the classical definition of the characteristic sample in the context of grammatical inference [13]. Our preliminary research indicates that we are able to do this by using a compressed representation of the XML documents with directed acyclic graphs [23]. The learning algorithms that we propose in this paper will work without any alteration. Moreover, we would like to extend our learning algorithms to more expressive unordered schemas, for instance schemas which allow numeric occurrences [22] of the form $a^{[n,m]}$ that generalize multiplicities by requiring the presence of at least $n$ and at most $m$ elements. Additionally, we want to use the learning algorithms for unordered schemas to boost the existing learning algorithms for twig queries [26]. For this purpose, we first have to investigate the problem of query minimization [2] in the presence of DMS. Next, we want to propose a twig query learning algorithm which infers the schema of the documents and then uses the schema to improve the quality of the learned twig query.

References

[1] S. Abiteboul, P. Bourhis, and V. Vianu. Highly expressive query languages for unordered data trees. In ICDT, pages 46-60, 2012.
[2] S. Amer-Yahia, S. Cho, L. V. S.
Lakshmanan, and D. Srivastava. Tree pattern query minimization. VLDB J., 11(4):315-331, 2002.
[3] D. Angluin. Inductive inference of formal languages from positive data. Information and Control, 45(2):117-135, 1980.
[4] D. Angluin. Inference of reversible languages. J. ACM, 29(3):741-765, 1982.
[5] G. J. Bex, W. Gelade, F. Neven, and S. Vansummeren. Learning deterministic regular expressions for the inference of schemas from XML data. TWEB, 4(4), 2010.
[6] G. J. Bex, F. Neven, T. Schwentick, and K. Tuyls. Inference of concise DTDs from XML data. In VLDB, pages 115-126, 2006.
[7] G. J. Bex, F. Neven, T. Schwentick, and S. Vansummeren. Inference of concise regular expressions and DTDs. ACM Trans. Database Syst., 35(2), 2010.
[8] G. J. Bex, F. Neven, and J. Van den Bussche. DTDs versus XML Schema: A practical study. In WebDB, pages 79-84, 2004.
[9] G. J. Bex, F. Neven, and S. Vansummeren. Inferring XML schema definitions from XML data. In VLDB, pages 998-1009, 2007.
[10] I. Boneva, R. Ciucanu, and S. Staworko. Simple schemas for unordered XML. In WebDB, 2013. Technical report at http://arxiv.org/abs/1303.4277.
[11] A. Brüggemann-Klein and D. Wood. One-unambiguous regular languages. Inf. Comput., 142(2):182-206, 1998.
[12] B. Chidlovskii. Schema extraction from XML: A grammatical inference approach. In KRDB, 2001.
[13] C. de la Higuera. Characteristic sets for polynomial grammatical inference. Machine Learning, 27(2):125-138, 1997.
[14] D. Florescu. Managing semi-structured data. ACM Queue, 3(8):18-24, 2005.
[15] D. D. Freydenberger and T. Kötzing. Fast learning of restricted regular expressions and DTDs. In ICDT, pages 45-56, 2013.
[16] P. Garcia and E. Vidal. Inference of k-testable languages in the strict sense and application to syntactic pattern recognition. IEEE Trans. Pattern Anal. Mach. Intell., 12(9):920-925, 1990.
[17] M. Garofalakis, A. Gionis, R. Rastogi, S. Seshadri, and K. Shim. XTRACT: Learning document type descriptors from XML document collections. Data Min. Knowl. Discov., 7(1):23-56, 2003.
[18] E. M. Gold. Language identification in the limit. Information and Control, 10(5):447-474, 1967.
[19] S. Grijzenhout and M. Marx. The quality of the XML web. In CIKM, pages 1719-1724, 2011.
[20] J. Hegewald, F. Naumann, and M. Weis. XStruct: Efficient schema extraction from multiple and large XML documents. In ICDE Workshops, page 81, 2006.
[21] M. J. Kearns and U. V. Vazirani.
An introduction to computational learning theory. MIT Press, 1994.
[22] P. Kilpeläinen and R. Tuhkanen. One-unambiguity of regular expressions with numeric occurrence indicators. Inf. Comput., 205(6):890-916, 2007.
[23] M. Lohrey, S. Maneth, and E. Noeth. XML compression via DAGs. In ICDT, pages 69-80, 2013.
[24] J.-K. Min, J.-Y. Ahn, and C.-W. Chung. Efficient extraction of schemas for XML documents. Inf. Process. Lett., 85(1):7-12, 2003.
[25] C. H. Papadimitriou. Computational complexity. Addison-Wesley, 1994.
[26] S. Staworko and P. Wieczorek. Learning twig and path queries. In ICDT, pages 140-154, 2012.