Hypothetical Indexes towards Self-tuning in PostgreSQL
PGCon 2010

Sérgio Lifschitz
Departamento de Informática
Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio) - Brasil
sergio@inf.puc-rio.br

Brazil, Rio, PUC-Rio

  South America
    2014 World Cup, 2016 Olympics
  Private university
    Top of the Brazilian ranking
    Particularly for Computer Science
  Some projects: Lua language, e-lua, Ginga (HDTV, IPTV) and Self-* with PostgreSQL!

Hypothetical Indexes in PostgreSQL Sérgio Lifschitz DI PUC-Rio PGCon2010

Motivation

  (1) insert into sales (prodnum, s_date, qty, price)
      values (4, current_timestamp, 20, 348);

  (2) select prodnum, s_date, sum(price) as total
      from sales
      where price > 1500000
        and s_date between '2004-01-01' and '2004-01-31'
      group by prodnum, s_date;

  What are the best indexes for an application that uses these commands?

Motivation (cont.)

  DBA and indexes:
    Query performance
    Do indexes really help?
    Selectivity
    Bad for updates?
  Index selection: NP-hard problem
  What-if queries are good!
  What about autonomous or self-* approaches?

Related Work

  SMART (Self Managing and Resource Tuning) project - IBM Almaden
  AutoAdmin - Microsoft Research
  Oracle
  Oracle 10g: Automatic Database Diagnostic Monitor
  SQL Server 2005: Database Tuning Advisor
  DB2: db2advis
  PostgreSQL: Autovacuum + (ongoing?)
  DI PUC-Rio:
    Local self-tuning
    Global self-tuning
    Benefits heuristics
    Non-intrusive approaches

Quick pause

  Tutorial video: hypothetical indexes in PostgreSQL v8
  Reference: www.inf.puc-rio.br/~postgresql

How did we do it?

  Table: pg_index
  Add column: indishypothetical

Module: Catalog

  Metabase or catalog
  Files: pg_attribute and pg_index

Some syntax issues

  Module: Parser
  File: gram.y

Modules mainly changed (Backend)

  PARSER (3 files)
    Syntax and semantics
    Create/Drop hypothetical index
    Explain hypothetical
  CATALOG (3 files)
    Metabase
    indishypothetical?
    Index build / register / drop / reindex
  COMMANDS (8 files)
    Index definition
    Data structure copy
    Relation removal
  OPTIMIZER (4 files)
    Plan
    Estimate index size
  EXECUTOR (3 files)
    Execute plan
    Index open
  NODES (1 file)
    Functions copy
    Planned / Query / Index Stmt
  TCOP (2 files)
    Traffic Cop
    DropStmt (add OBJECT_HYP_INDEX)
    Calls to pg_analyze_and_rewrite
  BOOTSTRAP (1 file)
    Grammar
    Calls to DefineIndex
  UTILS (1 file)
    Plan cache management
    Calls to pg_analyze_and_rewrite with new attribute (hypothetical)

Hypothetical vs Actual Indexes

Only what-if? No!

  Autonomic index tuning
  Self-* research
  Specific question: find a feasible index self-tuning strategy that does not require human intervention

Agents and DBMS

  Agents: adaptability and proactivity
  DBMS: performance and scalability

Index Self-tuning based on Differences

  On-line heuristic
  Evaluate commands as they are submitted
  Estimate alternative indexing solutions: hypothetical indexes
  Adapt index design on-the-fly

Index Self-tuning based on Differences

  Query evaluation strategy:

    Benefit of Hypothetical Index =
        Cost of Query with Actual Indexes - Cost of Query with Hypothetical Indexes;
    Update Accumulated Benefit of Hypothetical Index;
    If (Accumulated Benefit of Hypothetical Index > Cost to Create Hypothetical Index) Then
        Reset Accumulated Benefit of Hypothetical Index;
        Materialize Hypothetical Index;
    End if;

  Updates follow similar rules, but consider index destruction

Architectures

  Agent based self-tuning
    Agent architecture
    Integration architecture
  Pros & cons
    Intrusive, but well delimited
    On-line: no human intervention, but limited time to search the solution space
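
The query evaluation strategy on the "Index Self-tuning based on Differences" slide (benefit as the difference between the query cost with actual indexes and the cost with hypothetical indexes, accumulated until it exceeds the creation cost) can be illustrated with a minimal Python sketch. This is not the patch's code: the class, the function, the index name sales_price_idx and all cost figures are hypothetical, and in practice the two costs would come from optimizer estimates.

```python
from dataclasses import dataclass


@dataclass
class HypotheticalIndex:
    """Bookkeeping for one candidate index (all names here are illustrative)."""
    name: str
    creation_cost: float              # estimated cost to build the real index
    accumulated_benefit: float = 0.0
    materialized: bool = False


def evaluate_query(index, cost_with_actual, cost_with_hypothetical):
    """Apply the strategy to one query; return True if the index is materialized."""
    # Benefit = cost with actual indexes - cost with hypothetical indexes
    benefit = cost_with_actual - cost_with_hypothetical
    index.accumulated_benefit += benefit
    if index.accumulated_benefit > index.creation_cost:
        # Reset the accumulator, then mark the index as materialized
        index.accumulated_benefit = 0.0
        index.materialized = True
        return True
    return False


idx = HypotheticalIndex(name="sales_price_idx", creation_cost=250.0)
# (cost with actual indexes, cost with the hypothetical index): made-up estimates
observed = [(120.0, 20.0), (120.0, 20.0), (120.0, 20.0)]
decisions = [evaluate_query(idx, actual, hypo) for actual, hypo in observed]
print(decisions)  # [False, False, True]
```

Each query here saves 100 cost units, so the accumulated benefit first exceeds the assumed creation cost of 250 on the third query. The sketch covers queries only; the slide notes that updates follow similar rules but also consider index destruction, which is not modeled here.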

Agent based Self-tuning

  Agent architecture (Kendall et al.)
    Layered: ease of construction
    Well delimited points of interaction with the DBMS

Agent based Self-tuning

  Integrating the tuning agent with PostgreSQL

  [Diagram labels: Postmaster; Postgres (Statement Processor, Optimizer); Built-in Agent (Postgres) with Queue and the layers Sensory, Beliefs, Reasoning, Action, plus a Statement Processor; Postgres (Statement Processor, Optimizer); Storage Structures]
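
To make the layered design more concrete, here is a small Python sketch of an agent whose layers carry the names shown on the integration slide (Queue, Sensory, Beliefs, Reasoning, Action). Only the layer names come from the slides. The class, the statement format, and especially the reasoning rule (a plain count threshold) are placeholder assumptions and do not reproduce the benefit heuristic or the actual PostgreSQL integration.

```python
from collections import deque


class TuningAgent:
    """Toy agent with the four layers named on the slide; layer logic is a placeholder."""

    def __init__(self):
        self.queue = deque()   # statements handed over by the statement processor
        self.beliefs = {}      # table name -> number of statements seen so far
        self.actions = []      # actions the agent has decided on

    def sense(self):
        """Sensory layer: take the next captured statement off the queue, if any."""
        return self.queue.popleft() if self.queue else None

    def update_beliefs(self, statement):
        """Beliefs layer: record what the agent knows about the workload."""
        table = statement["table"]
        self.beliefs[table] = self.beliefs.get(table, 0) + 1

    def reason(self, threshold=2):
        """Reasoning layer: pick tables seen at least `threshold` times (assumed rule)."""
        return [t for t, n in self.beliefs.items() if n >= threshold]

    def act(self, tables):
        """Action layer: only records decisions; a real agent would issue DBMS commands."""
        for table in tables:
            action = ("consider_index", table)
            if action not in self.actions:
                self.actions.append(action)

    def step(self):
        """Run one statement through all layers; return False when the queue is empty."""
        statement = self.sense()
        if statement is None:
            return False
        self.update_beliefs(statement)
        self.act(self.reason())
        return True


agent = TuningAgent()
for table in ["sales", "customers", "sales"]:
    agent.queue.append({"table": table})
while agent.step():
    pass
print(agent.actions)  # [('consider_index', 'sales')]
```

Keeping each layer behind its own method mirrors the "well delimited points of interaction" argument: only the sensory and action layers would need to touch the DBMS.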

Current and future work

  Fine tuning of hypothetical indexes
  Self-tuning on-the-fly:
    Hypothetical indexes on DB catalog
    Which are to be created?
  Hypothetical plans feature
  Automatic materialized views
  PostgreSQL patch and tests
    VLDBs and actual DBs
    Versions 8.x and public benchmarks

Self-tuning @ DI PUC-Rio

  DBMS: PostgreSQL (v7 and v8)
  Start: agent-based self-tuning
  Index selection heuristics
  Hypothetical indexes and explain
  Workload capture
  Automatic create, drop and reindex
  Recently: hypothetical plans

Acknowledgments

  Ana Carolina Brito, Andrea Weberling and José Maria Monteiro
  &
  Rogério Costa, Maíra Noronha, Marcos Salles, Anolan Milanes, Eduardo Morelli, Isabel Porto, Carlos Juliano Viana and Renato Mogrovejo

Questions?

  Thank you! Merci beaucoup! Muito obrigado!
  www.inf.puc-rio.br/~postgresql
  sergio@inf.puc-rio.br