Database Concepts for a new generation of CMS



Similar documents
database abstraction layer database abstraction layers in PHP Lukas Smith BackendMedia

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB

Geodatabase Programming with SQL

What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World

Content Management Systems: Drupal Vs Jahia

A collaborative platform for knowledge management

Database Management System Choices. Introduction To Database Systems CSE 373 Spring 2013

Web development... the server side (of the force)

How To Create A Table In Sql (Ahem)

Saturday, 13 June CouchDB. Apache

CSE 530A Database Management Systems. Introduction. Washington University Fall 2013

Drupal CMS for marketing sites

How to Design and Create Your Own Custom Ext Rep

Content Management Software Drupal : Open Source Software to create library website

What is Drupal, exactly?

The SkySQL Administration Console

A Brief Introduction to MySQL

iservdb The database closest to you IDEAS Institute

Raima Database Manager Version 14.0 In-memory Database Engine

Building Library Website using Drupal

INSTALLING, CONFIGURING, AND DEVELOPING WITH XAMPP

An Open Source NoSQL solution for Internet Access Logs Analysis

Certified PHP/MySQL Web Developer Course

SQL. Short introduction

CMSs, Open Source, Hosted & Cloud-Based Applications

Rapid Application Development of Oracle Web Systems

Typo3_smartsite. Smartsite CMS Release 5 5/24/2006

The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect

Symfony vs. Integrating products when to use a framework

A basic create statement for a simple student table would look like the following.

Database Schema Deployment. Lukas Smith - lukas@liip.ch CodeWorks PHP on the ROAD

DIPLOMA IN WEBDEVELOPMENT

Application Developer Guide

The Open Source CMS. Open Source Java & XML

Oracle Database 10g Express

<Insert Picture Here> Michael Hichwa VP Database Development Tools Stuttgart September 18, 2007 Hamburg September 20, 2007

Relational Database Basics Review

MAGENTO TRAINING PROGRAM

Integrating Big Data into the Computing Curricula

Comparing SQL and NOSQL databases

Bitrix Site Manager 4.1. User Guide

CSCI110 Exercise 4: Database - MySQL

Oracle Application Express MS Access on Steroids

CS 564: DATABASE MANAGEMENT SYSTEMS

IGW+ Certificate. I d e a l G r o u p i n W e b. International professional web design,

SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package Data Federation Administration Tool Guide

Typo3_tridion. SDL Tridion R5 3/21/2008

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?

<Insert Picture Here>

Framework as a master tool in modern web development

100% NO CODING NO DEVELOPING IMMEDIATE BUSINESS -25% -70% UNLIMITED SCALABILITY DEVELOPMENT TIME SOFTWARE STABILITY

SQL Injection. The ability to inject SQL commands into the database engine through an existing application

Document Oriented Database

Introduction. Introduction: Database management system. Introduction: DBS concepts & architecture. Introduction: DBS versus File system

5 Mistakes to Avoid on Your Drupal Website

Beyond The Web Drupal Meets The Desktop (And Mobile) Justin Miller Code Sorcery Workshop, LLC

IBM DB2 XML support. How to Configure the IBM DB2 Support in oxygen

7. Databases and Database Management Systems

Content Manager User Guide Information Technology Web Services

OBJECTS AND DATABASES. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 21

Using Database Metadata and its Semantics to Generate Automatic and Dynamic Web Entry Forms

SQL Databases Course. by Applied Technology Research Center. This course provides training for MySQL, Oracle, SQL Server and PostgreSQL databases.

San Jose State University

Configuration Management in Drupal 8. Andrea Pescetti Nuvole

Using EMC Documentum with Adobe LiveCycle ES

Axway API Gateway. Version 7.4.1

Introduction: Database management system

Database Design Patterns. Winter Lecture 24

NOSQL INTRODUCTION WITH MONGODB AND RUBY GEOFF

Content Management Systems: Drupal Vs Jahia

Modern Web Development From Angle Brackets to Web Sockets

Physical Design. Meeting the needs of the users is the gold standard against which we measure our success in creating a database.

Certified Apache CouchDB Professional VS-1045

Object Relational Database Mapping. Alex Boughton Spring 2011

php tek 2006 in Orlando Florida Lukas Kahwe Smith

CommonSpot Content Server Version 6.2 Release Notes

Hacettepe University Department Of Computer Engineering BBM 471 Database Management Systems Experiment

ORACLE BUSINESS INTELLIGENCE WORKSHOP

SQL Server for developers. murach's TRAINING & REFERENCE. Bryan Syverson. Mike Murach & Associates, Inc. Joel Murach

Advanced Object Oriented Database access using PDO. Marcus Börger

Mul$media im Netz (Online Mul$media) Wintersemester 2014/15. Übung 03 (Nebenfach)

SQL DATA DEFINITION: KEY CONSTRAINTS. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 7

ECG-1615A. How to Integrate IBM Enterprise Content Management Solutions With Microsoft SharePoint and IBM Connections. elinar.com

Apache Lucene. Searching the Web and Everything Else. Daniel Naber Mindquarry GmbH ID 380

Open Source Content Management System for content development: a comparative study

Terms and Definitions for CMS Administrators, Architects, and Developers

Apache Sling A REST-based Web Application Framework Carsten Ziegeler cziegeler@apache.org ApacheCon NA 2014

Multimedia im Netz Online Multimedia Winter semester 2015/16

Kentico 8 Certified Developer Exam Preparation Guide. Kentico 8 Certified Developer Exam Preparation Guide

A Scalable Data Transformation Framework using the Hadoop Ecosystem

Architecture and Mode of Operation

Introduction to NoSQL Databases and MapReduce. Tore Risch Information Technology Uppsala University

Red Hat Enterprise Portal Server: Architecture and Features

Microsoft Dynamics CRM Security Provider Module

Jackalope JCR & PHP. OpenExpo Herbst 2009, Winterthur, 23. September Tobias Ebnöther & Christian Stocker

Agile Development and Schema Evolution. Schema Evolution Agile Development & Databases Case Study on the Herschel Project

Quick start. A project with SpagoBI 3.x

Short notes on webpage programming languages

tibbr Now, the Information Finds You.

Web Development using PHP (WD_PHP) Duration 1.5 months

Transcription:

Drupal - ezpublish -... Young Media Concepts GmbH November 3, 2008

1 Intro 2 Static database schemes On Flexibility 3 ez Publish Content Model 4 Improvements on ez Publish Linking objects Keys 5 Java Content Repository 6 CouchDB

What does a CMS? Store content (Content = highly structured, mostly textual informations) Organize content Organize content relations Present content in different ways Implement Workflows, Permissions Search content Examples of CMS CMS: Drupal, ezpublish, Plone Groupware: egroupware, Horde, Exchange Issue Tracker: Flyspray, OTRS, Bugzilla ERP: Organize documents and workflows

What are not CMS s? OpenOffice, GIMP: Editors Gallery2: Organizes Images, not textual content Mailreaders, RSS reader, Browser: retrieve and present Booking sites: concentrate on workflow, few content more...?

Tables are static Drupal: node, node revisions, comments, poll, book,... egroupware: egw addressbook, egw addressbook extra, egw cal, egw cal extra,... OTRS: ticket, service, faq item,... New modules come with new table definitions.

Schema for a Drupal module f u n c t i o n s u b s c r i p t i o n s s c h e m a ( ) { $schema [ s u b s c r i p t i o n s ] = a r r a y ( f i e l d s => a r r a y ( s i d => a r r a y ( t y p e => s e r i a l, u n s i g n e d => TRUE, not n u l l => TRUE, d i s p width => 10 ), module => a r r a y ( t y p e => v a r c h a r, l e n g t h => 64, not n u l l => FALSE ), f i e l d => a r r a y ( t y p e => v a r c h a r, l e n g t h => 32, not n u l l => FALSE ), v a l u e => a r r a y ( t y p e => v a r c h a r, l e n g t h => 237, not n u l l => FALSE ),... more f i e l d s p r i m a r y key => a r r a y ( s i d ), indexes => a r r a y ( module => a r r a y ( module, f i e l d, v a l u e ), r e c i p i e n t u i d => a r r a y ( r e c i p i e n t u i d ) ), ) ;

Schema for an egroupware module $ p h p g w b a s e l i n e = a r r a y ( egw bookmarks => a r r a y ( f d => a r r a y ( bm id =>a r r a y ( t y p e => auto, n u l l a b l e =>F a l s e ), bm owner =>a r r a y ( t y p e => i n t, p r e c i s i o n =>4, n u l l a b l e =>True ), bm access =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n =>255, n u l l a b l e =>True ), b m u r l =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n =>255, n u l l a b l e =>True ), bm name =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n =>255, n u l l a b l e =>True ), bm desc =>a r r a y ( t y p e => t e x t, n u l l a b l e =>True ), bm keywords =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n =>255, n u l l a b l e =>True ), bm category =>a r r a y ( t y p e => i n t, p r e c i s i o n =>4, n u l l a b l e =>True ), b m r a t i n g =>a r r a y ( t y p e => i n t, p r e c i s i o n =>4, n u l l a b l e =>True ), b m i n f o =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n =>255, n u l l a b l e =>True ), b m v i s i t s =>a r r a y ( t y p e => i n t, p r e c i s i o n =>4, n u l l a b l e =>True ) ), pk => a r r a y ( bm id ), f k => a r r a y ( ), i x => a r r a y ( ), uc => a r r a y ( ) ) ) ;

conclusion Modifying the database scheme in such systems requires programming and is a major surgery.

On Flexibility columns in egw addressbook contact id contact tid contact owner contact private cat id n family n given n middle n prefix n suffix n fn n fileas contact bday org name org unit contact title contact role contact assistent contact room adr one street adr one locality adr one region adr one postalcode adr one countryname contact label adr two street adr two street2 adr two locality adr two region adr two postalcode adr two countryname tel work tel cell tel fax tel assistent tel car tel pager tel home tel fax home tel cell private tel prefer contact email contact email home contact url contact url home contact freebusy uri contact calendar uri contact note contact tz contact geo contact pubkey contact created contact creator contact modified contact modifier contact jpegphoto account id contact etag

On Flexibility problems with static tables only two web addresses possible who needs tel car and tel pager anymore? No field for a client id

On Flexibility custom fields in egroupware Table egw addressbook extra contact id contact owner contact name contact value This is a first step in the direction of the Entity-Attribute-Value Pattern in database design. a a en.wikipedia.org/wiki/entity-attribute-value_model

On Flexibility Entity-Attribute-Value Pattern Entity Attribute Value 1 n family Koch 1 n given Thomas 2 n family Musterman 2 n given Max 2 n prefix Dr. ez Publish uses this pattern in a more advanced version.

Maps Object Orientation into a RDBMS Classes and Objects becomes Content Classes and Content Objects

EAV Pattern - Advantages Flexibility

EAV Pattern - Disadvantages Many slow JOINs necessary Everything is a varchar Specialised tools necessary to browse the DB Hand coding SQL for this pattern is a pain Many build-in database features aren t available referential integrity checks domain integrity checks indexes

Solution: Ask Lukas Kahwe Smith LAMP software architect Liip since 2008 database: MySQL, SQLite, Oracle, PostGreSQL MySQL 5.0 Certified Developer Publications, Speeches and Contributions in the field of databases Release Manager for PHP 5.3 www.pooteeweet.org

Lukas K. Smith Presentation: SQL (un)patterns Split up into separate tables with as many columns as necessary, create DDL on the fly if necessary

One Table for each content class

DDL on the fly How it works Adding an attribute: ALTER TABLE ADD COLUMN Deleting an attribute: ALTER TABLE DROP COLUMN Get the whole object: SELECT *

critique DDL is not transactional save MyISAM also does not have transactions and is used in production You do not change your content classes weekly, but in a staged development process The current implementation in ez Publish is even worse: PHP updates the related attributes until it times out DDL is transactional save in PostgreSQL

critique You must access many tables to get informations on different object types Yes, but most of the time you instantiate the objects anyhow and can get the informations in PHP land Use views

conclusion Databases are powerful tools, but todays PHP applications use only a small subset of the available features.

Linking objects Linking is essential 1... ERP5 also implements some extra features to enhance programming productivity. Perhaps the most interesting is the concept of relationship managers, which are objects responsible for keeping the relationships between pairs of objects. 1 ERP5: Designing for Maximum Adaptability. In: Beautiful Code, S.345

Linking objects Links in egroupware egw links => a r r a y ( f d => a r r a y ( l i n k i d =>a r r a y ( t y p e => auto, n u l l a b l e =>F a l s e ), l i n k a p p 1 =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n => 25, n u l l a b l e =>F a l s e ), l i n k i d 1 =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n => 50, n u l l a b l e =>F a l s e ), l i n k a p p 2 =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n => 25, n u l l a b l e =>F a l s e ), l i n k i d 2 =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n => 50, n u l l a b l e =>F a l s e ), l i n k r e m a r k =>a r r a y ( t y p e => v a r c h a r, p r e c i s i o n => 100 ), l i n k l a s t m o d =>a r r a y ( t y p e => i n t, p r e c i s i o n => 8, n u l l a b l e =>F a l s e ), l i n k o w n e r =>a r r a y ( t y p e => i n t, p r e c i s i o n => 4, n u l l a b l e =>F a l s e )

Linking objects Linking of contents must be a first class priority for the storage layer. Put 1:n links inside the table to reduce joins Individual n:m link tables for every pair of linked tables?

Keys Surrogate Keys vs. Natural Keys Surrogate Keys id int(11) NOT NULL auto increment Natural Keys unique identifiers with meaning: en, de, ro, fr,... postal codes user logins email addresses

Keys On surrogate keys.. They suck. Example surrogate keys give no clue on the object they represent many times you already have a natural unique key - two checks for uniqueness on write are not portable from one system to another (dev - production) Content classes and class attributes in ezpublish have unique string identifiers and surrogate keys.

Keys On natural keys.. They are lovely. give a hint for debugging and avoids wrong numbers often save an id column are portable from one system to another (dev - production) Natural keys may include multiple columns. on speed Don t worry over long natural key columns! Do caching! Use binary instead of text: No colation int already takes four bytes, most natural keys are unique after the first few bytes

Keys links archives.postgresql.org/pgsql-hackers/2006-01/ msg00414.php pooteeweet.org/blog/1015 en.wikipedia.org/wiki/surrogate_key

Java Standard, pushed by Day Software since 2002.

Quick overview A repository has hierarchical workspaces with nodes. Nodes have properties. All content resides in the properties. Nodes are of types. Types dictate kind and number of properties and child nodes. Nodes can have reference properties to other nodes (objectrelation).

Storage engine agnostic Relational Database Object Database (CouchDB, ZOPE-DB) XML (-File/-Database) Filesystem

some free implementations Alfresco CMS (with own JCR) Apache Jackrabbit TYPO3CR (in development) some users Magnolia CMS Nuxeo (France)

Critique - Imposes content tree - Limited set of data types - Still requires predefined schemes + versionable + lockable + nodes are referencable via UUIDs (for linking) + Multiple types per node possible (like interfaces in PHP) + Type inheritance

links www.ibm.com/developerworks/java/library/j-jcr/ www.artima.com/lejava/articles/contentrepository. html en.wikipedia.org/wiki/content_repository_api_for_ Java

The idea Damien Katz wanted to recreate Lotus Notes start 2005 open source license (Apache) See lotusnotessucks.4t.com/

What exactly RESTful API versioning replication written in Erlang for concurrency

RESTfull API GET - select POST - update PUT - insert DELETE - drop

Applications Email, to dos, notes, wikis, forums, CMS, CRM, cataloging, approval workflow, expense tracking & reporting, status reports, asset management, project management, bug/issue tracking...

JSON Documents { i d : 123ABC, r e v : BCCE34, name : Darth Vader, l o c a t i o n : { name : Death S t a r, s t r e e t : Galaxy S t r e e t 4, z i p : 56665 }, f a v o u r i t e c o l o u r : b l a c k, t o o l s : [ helmet, l i g h t s a b r e ] }

Views with Map/Reduce in Javascript Example contact document: { t y p e : c o n t a c t, f i r s t n a m e : Fred, l a s t n a m e : Katz, company : JFK Construction,... }... }

View function View of contact documents indexed by lastname: f u n c t i o n ( doc ){ i f ( doc. t y p e == c o n t a c t ) { map( doc. lastname, doc. f i r s t n a m e ) ; } }

Get View Results GET /db/ view/contacts/by last?key= Katz { t o t a l r o w s : 3, rows : [ { id : 2DA92AAA628CB4156134F36927CF4876, key : Katz, v a l u e : Roseanna }, { id : 4156134F36927CF48762DA92AAA628CB, key : Katz, v a l u e : Fred }, { id : 4F36927CF48762DA92AAA628CB415613, key : Katz, v a l u e : Gwendolyn } ]}

Links www.ibuildings.com/blog/archives/ 1291-Some-thoughts-on-CouchDB.html