Flatten from/to Relational
Jácome Cunha, João Saraiva, Joost Visser
Universidade do Minho; Software Improvement Group
CIC 2007, 22-23 October
Motivation
- Spreadsheets are considered one of the most widely used programming languages in the world
- Their languages and systems lack structured programming features
- Programming in a spreadsheet environment is an error-prone task
- Data manipulation is not as well supported as in other paradigms
Bidirectional transformation
[Diagram: DB and SS (database and spreadsheet), with a transformation in each direction]
- The flattened model of a spreadsheet is mapped to a relational database model
- The transformation in the opposite direction is also shown
What is a spreadsheet?
- cell
- text
- formulas
- graphics
- references to other cells
- incremental
Relational database
A relational database is a structured collection of records distributed over tables. Some concepts:
- Primary key: a set of attributes in a record that uniquely determines all the others
- Functional dependency: a set of attributes X functionally determines a set of attributes Y iff each X value is associated with only one Y value
- Foreign key: a reference to an entry in another table
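The definition of a functional dependency above can be checked directly on a small table. The following is a minimal sketch, not part of the original slides: the function name `holds` and the sample rows are made up for illustration.

```python
from typing import Dict, Hashable, Sequence

Row = Dict[str, Hashable]


def holds(rows: Sequence[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if the functional dependency lhs -> rhs holds in rows.

    The dependency holds when every combination of lhs values is
    associated with only one combination of rhs values.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the
        # stored value afterwards, so a mismatch means x has two images.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample data, invented for this example.
rows = [
    {"company": "Acme", "contacts": "acme@example.org", "location": "Braga"},
    {"company": "Acme", "contacts": "acme@example.org", "location": "Braga"},
    {"company": "Initech", "contacts": "hr@initech.example", "location": "Braga"},
]

print(holds(rows, ["company"], ["contacts", "location"]))  # True
print(holds(rows, ["location"], ["company"]))              # False
```

In this data `company` could serve as a primary key, because it determines all other attributes, while `location` could not.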
Advantages of DB
A database management system (DBMS) makes it possible to:
- insert, remove and update data in the DB
- change the format of the DB
- ensure the integrity of the DB
- back up and replicate the DB
- grant or deny permissions to users
DBMSs: Oracle, Microsoft Access, MySQL, PostgreSQL
The big picture
[Diagram: a spreadsheet table with rows a_11 ... a_1n through a_m1 ... a_mn, of type (A_1 ... A_n); an arrow labelled "extract FDs" leads to a schema Π over two attribute groups, A_i ... A_k and A_j ... A_l; arrows labelled "2LT" and "normalisation" lead to the migrated data, built from a_m ... a_v and a_n ... a_y, of type Π over the attribute groups A_m ... A_v and A_n ... A_t]
Inferring FDs
- We assume that the table contains only data: no code, formulas or other content
- Information about which columns are related is needed; otherwise too many FDs are inferred and the result becomes useless
- The FUN algorithm (Noel Novelli and Rosine Cicchetti) is used to infer the FDs
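To illustrate why the search is restricted to groups of related columns, here is a brute-force sketch of FD inference. It is explicitly not the FUN algorithm, which is considerably more efficient; the names `holds` and `infer_fds`, the `max_lhs` bound and the sample rows are all assumptions made for this example.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (each lhs value has one rhs value)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def infer_fds(rows, columns, max_lhs=2):
    """Brute-force search for minimal FDs with a single attribute on the right.

    Only the given (related) columns are considered, and left-hand sides
    are limited to max_lhs attributes to keep the search small.
    """
    fds = []
    for target in columns:
        others = [c for c in columns if c != target]
        minimal = []
        for size in range(1, max_lhs + 1):
            for lhs in combinations(others, size):
                # Skip left-hand sides that contain an already found one.
                if any(set(m) <= set(lhs) for m in minimal):
                    continue
                if holds(rows, lhs, (target,)):
                    minimal.append(lhs)
        fds.extend((lhs, target) for lhs in minimal)
    return fds


# Hypothetical sample data, invented for this example.
students = [
    {"st_number": 1, "st_name": "Ana", "graduation": "CS"},
    {"st_number": 2, "st_name": "Rui", "graduation": "CS"},
    {"st_number": 3, "st_name": "Ana", "graduation": "Math"},
]

for lhs, target in infer_fds(students, ["st_number", "st_name", "graduation"]):
    print(", ".join(lhs), "->", target)
```

With only three rows, the search also reports `st_name, graduation -> st_number`, a dependency that holds by accident in this small sample. This is the kind of spurious FD that more representative data, or knowledge about which columns are related, helps to rule out.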
Normalisation
- For each group of associated columns, the algorithm yields several FDs
- One of them is chosen and a table is created from it
- The columns not covered by the chosen FDs are collected into a table whose primary key is the set of primary keys of the other tables
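The three steps above can be sketched as a small function that turns the chosen FDs into a schema. This is an illustrative sketch under stated assumptions, not the implementation behind the slides: the function `normalise`, its dictionary-based schema format and the `rest_name` parameter are hypothetical, and the column names are taken from the internship example used later in the talk.

```python
def normalise(columns, chosen_fds, rest_name):
    """Build a schema from one chosen FD per group of related columns.

    chosen_fds maps a table name to (key columns, dependent columns).
    Columns not covered by any chosen FD go into the table rest_name,
    whose primary key is made of the primary keys of the other tables.
    """
    schema = {}
    covered = set()
    combined_key = []
    for name, (key, deps) in chosen_fds.items():
        schema[name] = {"key": list(key), "columns": list(key) + list(deps)}
        covered.update(key, deps)
        combined_key.extend(k for k in key if k not in combined_key)
    rest = [c for c in columns if c not in covered]
    schema[rest_name] = {"key": combined_key, "columns": combined_key + rest}
    return schema


COLUMNS = [
    "company", "contacts", "location",
    "st.number", "st.name", "graduation",
    "salary", "status", "description", "time",
]

# One FD chosen per group of related columns.
CHOSEN = {
    "Company": (["company"], ["contacts", "location"]),
    "Student": (["st.number"], ["st.name", "graduation"]),
}

schema = normalise(COLUMNS, CHOSEN, "Internship")
for table, info in schema.items():
    print(table, info["key"], info["columns"])
```

The remaining columns (salary, status, description, time) end up in the `Internship` table, keyed by `company` and `st.number`.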
Data migration
- Performed by implementing refinement laws in 2LT
[Refinement law relating a type over k, A and B to the list type [k A B], annotated with the functions to and from]
- Abstraction (from) and representation (to) functions are needed to witness the refinement
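The role of the two witness functions can be illustrated with a plain pair of conversion functions. This sketch does not use 2LT or its API; the functions `to` and `from_` (named with a trailing underscore because `from` is a Python keyword), the reduced set of columns and the sample rows are assumptions made for illustration.

```python
def to(rows):
    """Representation: flat rows -> normalised tables (dicts keyed by primary key)."""
    company, student, internship = {}, {}, {}
    for r in rows:
        company[r["company"]] = (r["contacts"], r["location"])
        student[r["st_number"]] = (r["st_name"], r["graduation"])
        internship[(r["company"], r["st_number"])] = (r["salary"], r["status"])
    return company, student, internship


def from_(tables):
    """Abstraction: normalised tables -> flat rows, by following the foreign keys."""
    company, student, internship = tables
    rows = []
    for (c, n), (salary, status) in internship.items():
        contacts, location = company[c]
        name, graduation = student[n]
        rows.append({
            "company": c, "contacts": contacts, "location": location,
            "st_number": n, "st_name": name, "graduation": graduation,
            "salary": salary, "status": status,
        })
    return rows


# Hypothetical sample data, invented for this example.
flat = [
    {"company": "Acme", "contacts": "acme@example.org", "location": "Braga",
     "st_number": 1, "st_name": "Ana", "graduation": "CS",
     "salary": 500, "status": "open"},
    {"company": "Acme", "contacts": "acme@example.org", "location": "Braga",
     "st_number": 2, "st_name": "Rui", "graduation": "CS",
     "salary": 450, "status": "closed"},
]

tables = to(flat)
print(from_(tables) == flat)  # True
```

The round trip returns the original rows only when the FDs hold in the data and no two rows share the same company and student number; otherwise `to` would silently overwrite entries.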
Internships
- There should be enough cases to represent reality
Company's table
company, contacts and location are related:
- company -> contacts, location
- contacts -> company, location
Resulting table: Company (company, contacts, location)
Student's tables
st.number, st.name and graduation are also related:
- st.number -> st.name, graduation
Resulting table: Student (number, name, graduation)
DB model
[Diagram: tables Company (company, contacts, location), Internship (company, number, salary, status, description, time) and Student (number, name, graduation)]
Company (_company, contacts, location)
Student (_st.number, st.name, graduation)
Internship (_company, _st.number, salary, status, description, time)
    foreign key (company) references Company(company)
    foreign key (st.number) references Student(st.number)
(a leading underscore marks a primary-key attribute)
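As a rough illustration of how this model could be expressed in a DBMS, the sketch below creates the three tables in an in-memory SQLite database using Python's standard `sqlite3` module. The column types, the renaming of `st.number` and `st.name` to `st_number` and `st_name`, and the helper `create_db` are assumptions; the slides do not specify them.

```python
import sqlite3

# Column types are guesses; the slides only give the column names.
SCHEMA = """
CREATE TABLE Company (
    company  TEXT PRIMARY KEY,
    contacts TEXT,
    location TEXT
);
CREATE TABLE Student (
    st_number  INTEGER PRIMARY KEY,
    st_name    TEXT,
    graduation TEXT
);
CREATE TABLE Internship (
    company     TEXT REFERENCES Company(company),
    st_number   INTEGER REFERENCES Student(st_number),
    salary      REAL,
    status      TEXT,
    description TEXT,
    time        TEXT,
    PRIMARY KEY (company, st_number)
);
"""


def create_db():
    """Create an in-memory database with the three tables of the model."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = create_db()
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
print([name for (name,) in tables])
```

With foreign keys enabled, inserting an internship that refers to an unknown company or student is rejected, which is one of the integrity guarantees a spreadsheet does not provide.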
The approach starts with a flattened data table (a spreadsheet), determines the FDs automatically and the relational DB schema less automatically, and so allows migration to a well-known and well-supported paradigm.
Future work
- Import spreadsheets (and other formats) into Haskell
- Tune the FD inference and normalisation
- Integrate with the 2LT framework
- Export the new model to SQL or back to a spreadsheet