WIS & Engineering Geert-Jan Houben
Contents Web Information System (WIS) Evolution in Web data WIS Engineering Languages for Web data XML (context only!) RDF XML Querying: XQuery (context only!) RDFS SPARQL OWL
Web Information System (WIS)
Information system Exchanges information with object system (= business process) Stores and manages information: data-intensive Requires careful engineering of information exchange Requires careful engineering and modeling of object system Traditionally database-oriented
Web Information System Information System based on Web technology (Web-based, Web-aware, Web-enabled etc.) Web technology can be used as front-end, e.g. application is available on the Web (or Intranet) via a browser Enables easy use and maintenance of (personalized) end-user access Web metaphor is appealing for end-users Requires different techniques for engineering the system's interfaces
Web Information System Web technology can also be used in back-end of information system Organize (connect) the data inside the system using Web technology Use World Wide Web as provider of data (or Intranet) Typically highly volatile information (distributed and heterogeneous) Requires different techniques for engineering the implementation
Examples Real-estate sales Employee databases Museum databases Digital libraries Mail order catalogs Reservation systems Auctions, virtual marketplaces Electronic TV program guides Ref: Special section on Web Information Systems in Communications of the ACM, July 1998, Vol. 41, No. 7
Web-based IS Web-browser as front-end Data repository (data base) as back-end Design: Data (content) structures: ER-modeling Navigation Presentation: layout Implementation: On-line (direct database access) Off-line (generated from database)
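The difference between on-line and off-line implementation can be sketched in a few lines. This is only an illustration under stated assumptions: the `listing` table, its columns, and the sample rows are hypothetical (loosely modeled on the real-estate example), and an in-memory SQLite database stands in for the back-end repository.

```python
import sqlite3
from html import escape

def make_db():
    """Build a small in-memory database standing in for the back-end (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listing (id INTEGER PRIMARY KEY, address TEXT, price INTEGER)")
    conn.executemany(
        "INSERT INTO listing (address, price) VALUES (?, ?)",
        [("1 Main St", 250000), ("2 Oak Ave <rear>", 310000)],
    )
    conn.commit()
    return conn

def render_page(conn):
    """On-line style: query the database and build the HTML at request time."""
    rows = conn.execute("SELECT address, price FROM listing ORDER BY id").fetchall()
    items = "".join(f"<li>{escape(addr)}: {price}</li>" for addr, price in rows)
    return f"<html><body><ul>{items}</ul></body></html>"

def generate_offline(conn):
    """Off-line style: render once and keep the result as a static snapshot."""
    return render_page(conn)

conn = make_db()
snapshot = generate_offline(conn)          # generated before the data changes
conn.execute("INSERT INTO listing (address, price) VALUES (?, ?)", ("3 Elm Rd", 199000))
live = render_page(conn)                   # reflects the current database state
```

The snapshot is cheap to serve but goes stale when the data changes, whereas the on-line page always reflects the current state at the cost of a query per request.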
WebML View A Web-enabled software system whose main purpose is to publish and maintain large amounts of data Interfaces directed to general public Exploratory Browsing-oriented Personalized (1 to 1) Data stored by means of DBMS technology Possibly pre-existing the Web application Normally volatile With severe freshness requirements May be distributed and heterogeneous
OOHDM View (1) Object-Oriented Hypermedia Design Method WWW brought new generation of IS: Navigation through heterogeneous information space Operations querying or affecting that information Introduced hypertext (hypermedia) paradigm Applications are constantly modified, enriched with new services, and new navigation and interface features are added
OOHDM View (2) Web-based application, first good hypermedia applications Traditional (SE) methodologies have no notion of linking: little is said about incorporating hypertext into interface Size and complexity imply systematic approach for evolution and reuse of design knowledge
RMM's View History: graphics designers + programmers Experience: central information architecture and shared, common mechanisms/services helpful for coping with problems of scalability and information anarchy
Nielsen's View On the Web, the only constant is change. A site that works perfectly as long as it stays the same will quickly die. Healthy navigation structure key to success Building interface is also complex, connecting interface objects to rest of application
Evolution
Hypermedia Hypertext + multimedia Information objects: text, images, animations, audio, video Not possible to show everything at once: Layout Timing Navigation Design (generate) presentation for different platforms: WWW, WAP, PDAs, etc. W3C expects that majority of surfers will not be using a PC
Evolution in hypermedia First: standalone special-purpose systems Now: Web-based From authoring to designing to generating From static to dynamic (generated from database query result) From single site to portals (integrated access service) From read-only to interactive and often collaborative (read-write)
Research (Hera) In WIS how to (automatically) generate hypermedia access to the information? Hypermedia access: navigation Personalization, customization: adaptation Information integration: (logistics of) information retrieval from available, heterogeneous sources Interoperability w.r.t. dynamics: for e-business, business rules, service integration Querying and transforming of data and metadata Ref: wwwis.win.tue.nl/~hera
Three Generations of Web
1. HTML written by author: easy, uniform interface; large effort for maintenance; not suited for changing information
2. Automatically generating information: first, using templates (and databases); later, using XML and XSLT transformations
3. Automatic processing of information: explicit metadata (RDF); agreement on meaning (ontologies)
Future developments Semantic Web research aims at even further developments in Web applications From human-readable via machine-readable to machine-processable
Common Syntax: XML HTML: a fixed set of tags complicates the identification of information elements XML allows defining data structures: Tags with freely chosen names No predefined tags; this enables definition, transmission, validation and interpretation of data between applications (and organizations) Freely chosen attributes Simple definition: DTD Extended definition: XML Schema
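A minimal sketch of how freely chosen tags and attributes let two applications exchange data, using Python's standard `xml.etree.ElementTree`. The element names follow the skills example in this deck; the `source` attribute and its value are hypothetical, and no DTD or schema validation is performed here.

```python
import xml.etree.ElementTree as ET

def build_person(name, skills):
    """Sending application: describe a person with self-chosen tags and attributes."""
    person = ET.Element("person", attrib={"source": "hr-db"})  # hypothetical attribute
    ET.SubElement(person, "name").text = name
    for skill in skills:
        ET.SubElement(person, "know-how").text = skill
    return person

# Serialize to text, as it would travel between applications or organizations.
wire = ET.tostring(build_person("peter", ["xml", "rdf"]), encoding="unicode")

# Receiving application: parse the text and interpret elements by their names.
received = ET.fromstring(wire)
know_how = [e.text for e in received.iter("know-how")]
```

Because the tag names carry the structure, the receiver can pick out `name` and `know-how` without knowing how the sender stores the data internally.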
<skills>
  <people>
    <person>
      <name>bob</name>
      <know-how>xml</know-how>
    </person>
    <person>
      <name>peter</name>
      <know-how>xml</know-how>
      <know-how>rdf</know-how>
    </person>
  </people>
  <seminars>
    <seminar>
      <topic>xml</topic>
      <participant>
        <name>karin</name>
        <name>alice</name>
      </participant>
    </seminar>
  </seminars>
</skills>
Querying //person/name[../know-how="XML"] union //seminar[topic="XML"]/participant/name
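The query above can be approximated with Python's standard library. This is a sketch under assumptions: the nesting of the skills document is inferred from the slide, the filtering is done in Python rather than with a full XPath engine, and the comparison is made case-insensitive because the data uses "xml" while the query uses "XML". The function name `names_for` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed structure of the skills document (nesting inferred from the slide).
SKILLS_XML = """
<skills>
  <people>
    <person><name>bob</name><know-how>xml</know-how></person>
    <person><name>peter</name><know-how>xml</know-how><know-how>rdf</know-how></person>
  </people>
  <seminars>
    <seminar>
      <topic>xml</topic>
      <participant><name>karin</name><name>alice</name></participant>
    </seminar>
  </seminars>
</skills>
"""

def names_for(root, keyword):
    """Names of persons with the given know-how, followed by names of
    participants in seminars on that topic (the two halves of the union)."""
    wanted = keyword.lower()
    found = []
    for person in root.iter("person"):
        if any((k.text or "").lower() == wanted for k in person.findall("know-how")):
            found.extend(n.text for n in person.findall("name"))
    for seminar in root.iter("seminar"):
        if any((t.text or "").lower() == wanted for t in seminar.findall("topic")):
            found.extend(n.text for n in seminar.findall("participant/name"))
    return found

root = ET.fromstring(SKILLS_XML)
result = names_for(root, "XML")
```

A strictly case-sensitive XPath comparison against "XML" would match nothing in this data, which is why the sketch normalizes case before comparing.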
Specification of meaning: RDF Resource: denotes an information item, e.g. via a URL Property type: name of a property of a resource Value: value for that property Example: Resource = URL of web page Property type = author Value = John Smith
<?xml:namespace ns = "http://www.w3.org/rdf/rdf/" prefix = "RDF"?>
<?xml:namespace ns = "http://purl.oclc.org/dc/" prefix = "DC"?>
<?xml:namespace ns = "http://person.org/businesscard/" prefix = "CARD"?>
<RDF:RDF>
  <RDF:Description RDF:HREF = "http://uri-of-document-1">
    <DC:Creator RDF:HREF = "#Creator_001"/>
  </RDF:Description>
  <RDF:Description ID="Creator_001">
    <CARD:Name>John Smith</CARD:Name>
    <CARD:Email>smith@home.net</CARD:Email>
    <CARD:Affiliation>Home, Inc.</CARD:Affiliation>
  </RDF:Description>
</RDF:RDF>
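The resource / property type / value model behind this example can be illustrated with plain Python tuples. This is a simplified sketch, not an RDF parser: the `Triple` type and the helper functions are hypothetical, and the statements are transcribed by hand from the business card example.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One statement: a resource, a property type, and a value."""
    resource: str
    property_type: str
    value: str

# Statements transcribed by hand from the business card example.
TRIPLES = [
    Triple("http://uri-of-document-1", "DC:Creator", "#Creator_001"),
    Triple("#Creator_001", "CARD:Name", "John Smith"),
    Triple("#Creator_001", "CARD:Email", "smith@home.net"),
    Triple("#Creator_001", "CARD:Affiliation", "Home, Inc."),
]

def values(triples, resource, property_type):
    """All values of one property type for one resource."""
    return [t.value for t in triples
            if t.resource == resource and t.property_type == property_type]

def creator_names(triples, document):
    """Follow DC:Creator from the document to a resource, then read its CARD:Name."""
    names = []
    for creator in values(triples, document, "DC:Creator"):
        names.extend(values(triples, creator, "CARD:Name"))
    return names

authors = creator_names(TRIPLES, "http://uri-of-document-1")
```

The value of one statement can itself be a resource, which is what lets the document point to a creator description instead of a plain string.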
Meaning: ontologies Ontology = a vocabulary with associated meaning Meaning expressed via relationships between concepts (no relationships with "real world") Possibility to define synonyms, specializations and other relationships Use of same ontology = contract on meaning of words (tags, attributes) Often, industry or domain dependent
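A toy illustration of how synonyms and specializations give a vocabulary its meaning purely through relationships between concepts. The vocabulary, the dictionary names, and the helper functions below are all hypothetical; real ontology languages such as RDFS and OWL are far richer.

```python
# Hypothetical vocabulary: synonym -> preferred term, and term -> broader concept.
SYNONYMS = {"automobile": "car", "lorry": "truck"}
BROADER = {"car": "vehicle", "truck": "vehicle", "vehicle": "product"}

def canonical(term):
    """Map a synonym to its preferred term; other terms map to themselves."""
    return SYNONYMS.get(term, term)

def ancestors(term):
    """Follow the specialization links upward, nearest concept first."""
    seen = []
    current = canonical(term)
    # The second condition guards against accidental cycles in the vocabulary.
    while current in BROADER and BROADER[current] not in seen:
        current = BROADER[current]
        seen.append(current)
    return seen

def is_a(term, concept):
    """True if term equals concept or is a (transitive) specialization of it."""
    target = canonical(concept)
    return canonical(term) == target or target in ancestors(term)
```

Two parties that share these tables agree, for instance, that a message about an "automobile" is also a message about a "vehicle", which is the contract on meaning described above.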
Future of the Web
1. Common syntax (XML)
2. "Specification" of meaning (RDF)
3. "Agreement" on meaning (ontologies)
4. Use logic to derive conclusions. Necessary in e-commerce: how does the supplier's message relate to the customer's knowledge?
5. Goal: trust in communication between Web systems, and hence the possibility to automate using agents
Ref: w3.org
Layer Cake
WIS & Metadata WIS repository (back-end) typically assembled from different heterogeneous sources, e.g. databases, files, WWW To manage (coordinate) data from different sources, metadata helps to organize and structure the data
Metadata Describing the data: Data about data Definitional data providing information about or documentation of data managed in an application Sometimes provided by sources Needed by IS: specifying logistics of data Engineering metadata: Meaning Validity Quality
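One way to picture engineering metadata is as a small catalog record per source, which the information system consults before using the data. The field names, the quality scale from 0.0 to 1.0, and the sample sources below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class SourceMetadata:
    """Data about one data source (all fields are illustrative assumptions)."""
    name: str
    kind: str          # e.g. "database", "file", "www"
    meaning: str       # what the data in this source describes
    valid_until: date  # validity: the data is trusted up to this date
    quality: float     # quality: assumed score between 0.0 and 1.0

def usable_sources(catalog, today, min_quality):
    """Names of sources that are still valid and meet the quality threshold."""
    return [m.name for m in catalog
            if m.valid_until >= today and m.quality >= min_quality]

CATALOG = [
    SourceMetadata("staff-db", "database", "employee records", date(2030, 1, 1), 0.9),
    SourceMetadata("old-export", "file", "employee records", date(2001, 1, 1), 0.9),
    SourceMetadata("partner-site", "www", "seminar listings", date(2030, 1, 1), 0.4),
]

selected = usable_sources(CATALOG, today=date(2025, 1, 1), min_quality=0.5)
```

The data itself is never inspected here; the decision about which sources to use is made entirely from the metadata.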
WIS Engineering
WIS Engineering Methodology Design of WIS requires careful engineering of information exchange between IS and OS Implies engineering of front-end (interface) and back-end (storage & retrieval) Professional applications: from art to engineering well-founded (software) engineering methodologies model-driven