ECS 165B: Database System Implementa6on Lecture 2



Similar documents
Lecture 10: Dynamic Memory Allocation 1: Into the jaws of malloc()

Defending Computer Networks Lecture 3: More On Vulnerabili3es. Stuart Staniford Adjunct Professor of Computer Science

The programming language C. sws1 1

The Advantages of Dan Grossman CSE303 Spring 2005, Lecture 25

Memory management. Announcements. Safe user input. Function pointers. Uses of function pointers. Function pointer example

CSC230 Getting Starting in C. Tyler Bletsch

6.088 Intro to C/C++ Day 4: Object-oriented programming in C++ Eunsuk Kang and Jean Yang

How To Write Portable Programs In C

Operating Systems and Networks

WORKSPACE WEB DEVELOPMENT & OUTSOURCING TRAINING CENTER

Crash Course in Java

CS 106 Introduction to Computer Science I

1.00 Lecture 1. Course information Course staff (TA, instructor names on syllabus/faq): 2 instructors, 4 TAs, 2 Lab TAs, graders

The C Programming Language course syllabus associate level

INTRODUCTION TO OBJECTIVE-C CSCI 4448/5448: OBJECT-ORIENTED ANALYSIS & DESIGN LECTURE 12 09/29/2011

UNIX Sockets. COS 461 Precept 1

Lecture 1 Introduction to Android

Programming in Python. Basic information. Teaching. Administration Organisation Contents of the Course. Jarkko Toivonen. Overview of Python

Getting Started with the Internet Communications Engine

Buffer Overflows. Security 2011

GDB Tutorial. A Walkthrough with Examples. CMSC Spring Last modified March 22, GDB Tutorial

Jonathan Worthington Scarborough Linux User Group

Lecture 11 Doubly Linked Lists & Array of Linked Lists. Doubly Linked Lists

Software Engineering Concepts: Testing. Pointers & Dynamic Allocation. CS 311 Data Structures and Algorithms Lecture Slides Monday, September 14, 2009

Habanero Extreme Scale Software Research Project

Object Oriented Software Design II

How to create/avoid memory leak in Java and.net? Venkat Subramaniam

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Welcome! You ve made a wise choice opening a savings account. There s a lot to learn, so let s get going!

Monitoring, Tracing, Debugging (Under Construction)

ONE OF THE MOST JARRING DIFFERENCES BETWEEN A MANAGED language like PHP,

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Operating System Engineering: Fall 2005

We will learn the Python programming language. Why? Because it is easy to learn and many people write programs in Python so we can share.

Java Interview Questions and Answers

Semantic Analysis: Types and Type Checking

CSE 1223: Introduction to Computer Programming in Java Chapter 2 Java Fundamentals

Get the Better of Memory Leaks with Valgrind Whitepaper

COMP 356 Programming Language Structures Notes for Chapter 10 of Concepts of Programming Languages Implementing Subprograms.

Principles of Software Construction: Objects, Design, and Concurrency. Course Introduction. toad. toad Fall School of Computer Science

Tes$ng Web Applica$ons. Willem Visser RW334

Quiz I Solutions MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Department of Electrical Engineering and Computer Science

Compiling Object Oriented Languages. What is an Object-Oriented Programming Language? Implementation: Dynamic Binding

LINKED DATA STRUCTURES

COS 217: Introduction to Programming Systems

Leak Check Version 2.1 for Linux TM

Japan Communication India Skill Development Center

ECE 122. Engineering Problem Solving with Java

Services. Relational. Databases & JDBC. Today. Relational. Databases SQL JDBC. Next Time. Services. Relational. Databases & JDBC. Today.

Japan Communication India Skill Development Center

Tutorial on Socket Programming

Syllabus for CS 134 Java Programming

MPLAB TM C30 Managed PSV Pointers. Beta support included with MPLAB C30 V3.00

Japan Communication India Skill Development Center

C Dynamic Data Structures. University of Texas at Austin CS310H - Computer Organization Spring 2010 Don Fussell

Object Oriented Software Design II

Database 2 Lecture I. Alessandro Artale

Heartbleed. or: I read the news, too. Martin R. Albrecht. Information Security Group, Royal Holloway, University of London

AVRO - SERIALIZATION

02-201: Programming for Scientists

Last Class: Communication in Distributed Systems. Today: Remote Procedure Calls

Instructor: Michael J. May Semester 1 of The course meets 9:00am 11:00am on Sundays. The Targil for the course is 12:00pm 2:00pm on Sundays.

Programming Languages CIS 443

Introduction to Python

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner.

CSCI E 98: Managed Environments for the Execution of Programs

DIABLO VALLEY COLLEGE CATALOG

CS 4604: Introduc0on to Database Management Systems

What is COM/DCOM. Distributed Object Systems 4 COM/DCOM. COM vs Corba 1. COM vs. Corba 2. Multiple inheritance vs multiple interfaces

An Incomplete C++ Primer. University of Wyoming MA 5310

Moving from CS 61A Scheme to CS 61B Java

Embedded Programming in C/C++: Lesson-1: Programming Elements and Programming in C

Storage Classes CS 110B - Rule Storage Classes Page 18-1 \handouts\storclas

Illustration 1: Diagram of program function and data flow

Programming Languages

CS 241 Data Organization Coding Standards

How to write a design document

8.5. <summary> Cppcheck addons Using Cppcheck addons Where to find some Cppcheck addons

Software Engineering Techniques

DNA Data and Program Representation. Alexandre David

Introduction to Socket Programming Part I : TCP Clients, Servers; Host information

Caml Virtual Machine File & data formats Document version: 1.4

The Java Series. Java Essentials I What is Java? Basic Language Constructs. Java Essentials I. What is Java?. Basic Language Constructs Slide 1

Comp151. Definitions & Declarations

COMPARISON OF OBJECT-ORIENTED AND PROCEDURE-BASED COMPUTER LANGUAGES: CASE STUDY OF C++ PROGRAMMING

Security types to the rescue

CS 2112 Spring Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

Object-Oriented Design Lecture 4 CSU 370 Fall 2007 (Pucella) Tuesday, Sep 18, 2007

/* File: blkcopy.c. size_t n

C++ Programming Language

CS 525 Advanced Database Organization - Spring 2013 Mon + Wed 3:15-4:30 PM, Room: Wishnick Hall 113

Hacking Techniques & Intrusion Detection. Ali Al-Shemery arabnix [at] gmail

Introduction to Data Structures

Operating System Overview. Otto J. Anshus

C++ INTERVIEW QUESTIONS

First Java Programs. V. Paúl Pauca. CSC 111D Fall, Department of Computer Science Wake Forest University. Introduction to Computer Science

UBIFS file system. Adrian Hunter (Адриан Хантер) Artem Bityutskiy (Битюцкий Артём)

Coding Rules. Encoding the type of a function into the name (so-called Hungarian notation) is forbidden - it only confuses the programmer.

Comp 411 Principles of Programming Languages Lecture 34 Semantics of OO Languages. Corky Cartwright Swarat Chaudhuri November 30, 20111

The Transport Layer and Implica4ons for Network Monitoring. CS 410/510 Spring 2014

CSC 551: Web Programming. Spring 2004

Transcription:

ECS 165B: Database System Implementa6on Lecture 2 UC Davis, Spring 2011 Por6ons of slides based on earlier ones by Raghu Ramakrishnan, Johannes Gehrke, Jennifer Widom, Bertram Ludaescher, and Michael Gertz.

Announcements No class Friday (NorCal Database Day 2011 @ UCD) hup://dbday.cs.ucdavis.edu Warmup homework assignment (serializa6on and memory management in C++) out later today, due Sunday @ 11:59pm Project starts with forming teams on Monday

Today s Agenda Coding skills: serializa6on and memory management in C++ File and buffer management in a DBMS Reading: Chapter 13 of textbook

Coding Skills: Memory Management in C++ Much of DavisDB project work (star6ng with part 1) will require wri6ng serializa,on and deserializa,on code Students last year had trouble with this, hence we ll cover the basics up front (and get some prac6ce in the warmup homework )

Coding Skills: Memory Management in C++ DBMS is a long- running server process, cannot afford to leak memory Many (most) real applica6ons are like this: web servers, GUI applica6ons, opera6ng system device drivers,... In DavisDB, you ll be required to clean up memory resources properly

What s Wrong With This Code? class Foo { char* str_;... void setstring(char* str) { ;... ~Foo() { str_ = strdup(str);

class Foo { char* str_ = 0;... What s Wrong With This Code? void setstring(const char* str) {... if (str_!= 0) { ~Foo() { free(str_); str_ = strdup(str); if (str_!= 0) { free(str_); str_ = 0;

Basic Idea in Memory Management Allocated memory must be freed when no longer needed Don t have to worry about this (as much) in garbage- collected languages like Java, C#, Python, etc Definitely do have to worry about this in C/C++ To be a good C/C++ programmer, need to become very disciplined about memory management

How to Allocate and Free Memory in C++ Confusingly, there are three ways to allocate memory from the heap in C++: malloc, new, and new[] malloc inherited from C, used by C library rou6nes like strdup new typed, object- oriented version of malloc; used to allocate a single object new[] used to allocate an array of objects Memory must be freed differently depending on how it was allocated: malloc <-> free new <-> delete new[] <-> delete[]

How to Remember to Free Unused Memory Whenever you write a piece of code alloca6ng some memory, you already must be thinking about who will free it, where and when Ownership of memory must be very clearly understood DESCRIPTION The strdup() func6on allocates sufficient memory for a copy of the string s1, does the copy, and returns a pointer to it. The pointer may subsequently be used as an argument to the func6on free(3). With strdup, it is clear that the caller owns the returned memory

So What s Wrong With This Code? struct Point { int x; int y; Point* foo(int x, int y) { Point p; p.x = x; p.y = y; return &p; void bar() { Point* point = foo(1,2); delete point;

So What s Wrong With This Code? struct Point { int x; int y; Point* foo(int x, int y) { Point p; p.x = x; p.y = y; return &p; // uh-oh, returning a pointer to stack- // allocated memory. big no-no. void bar() { Point* point = foo(1,2); delete point;

So What s Wrong With This Code? struct Point { int x; int y; void foo(int x, int y, p->x = x; p->y = y; void bar() { Point point; Point* p) { foo(1,2, &point); struct Point { int x; int y; Point* foo(int x, int y) { Point* p = new Point(); p->x = x; p->y = y; return p; void bar() { Point p = foo(1,2); delete p;

Finding Memory Leaks Very useful and easy- to- use tool on Unix- based plalorms (including CSIF): valgrind Usage: valgrind [op,ons] <prog- and- args> Will tell you if you ve leaked memory, and where it was allocated We ll use this in the warmup homework, and in grading your project components

Smart Pointers: a Useful C++ Idiom for Memory Management Basic idea: instead of working with raw pointer, like Foo* ptr = new Foo(1); ptr = new Foo(2); // oops, leaked the first object work with a wrapped version that automa6cally frees the underlying object when leaving scope std::auto_ptr<foo> ptr = new Foo(1); ptr = new Foo(2); // first object deleted automatically // on assignment

Smart Pointers: Pros and Cons Pros: automates away much of the grunt work of memory management; great idea so long as used uniformly in a code base and all developers involved understand them Cons: no free lunch, s6ll need to understand underlying memory management concepts, plus smart pointers introduce their own complexi6es; bad idea to mix code that uses smart pointers with code that doesn t (very confusing) Since this is a DIY class and you need to understand memory management anyway, we ll forego smart pointers

Serializa6on/Deserializa6on Serializa,on (aka marshalling): conversion of data structure or object into format that can be stored (e.g., in file or memory buffer, or transmiued on network) and deserialized (or unmarshalled) later to recover data structure/object As with memory management and smart pointers, some standard facili6es (e.g., Boost serializa6on library) exist to help with this But in the spirit of DIY (and avoiding going deeply into Boost libraries), we ll do it ourselves in this class

Primi6ve for Serializa6on: memcpy Given a structure Foo, how to serialize Foo into a byte- buffer? struct Foo { int x; float y; f1 = {1, 0.5; char buffer[128]; memcpy(buffer, &f1, sizeof(f1)); How to deserialize? Use memcpy again: struct Foo f2; memcpy(&f2, buffer, sizeof(f2));

Serializa6on Gotchas One problem with preceding approach: deserializing Foo on another machine may not work properly - endianness issues - size of data types (32- bit vs. 64- bit vs....) We ll ignore this issue for DavisDB (single- machine) Second problem: versioning. Suppose a later version of code adds a third field to Foo. What can go wrong? We ll ignore this issue too for DavisDB (single- version) Third problem, this one we do need to worry about for DavisDB: memcpy of the structure only works for plain old data...

What s Wrong Here? struct Bar { ; int x; char* str; char buffer[128]; char str[] = new char[64]; sprintf(str, hello, world! ); struct Bar bar = {1, str; memcpy(buffer, &bar, sizeof(bar)); delete[] str;... memcpy(&bar, buffer, sizeof(bar));

What s Wrong Here? struct Bar { int x; char* str; // structure is not plain old data ; char buffer[128]; char str[] = new char[64]; sprintf(str, hello, world! ); struct Bar bar = {1, str; memcpy(buffer, &bar, sizeof(bar)); delete[] str;... memcpy(&bar, buffer, sizeof(bar)); // bar will have // bad str // pointer

The Fix: Custom Serialize/Deserialize Methods class Bar { int x; char* str = 0; void serialize(char* buffer) { memcpy(buffer, &x, sizeof(x)); int len = strlen(str); memcpy(buffer+sizeof(x), &len, sizeof(len)); memcpy(buffer+sizeof(x)+sizeof(len), str, len*sizeof(char)); static Bar* deserialize(char* buffer) { Bar* bar = new Bar(); memcpy(&(bar->x), buffer, sizeof(bar->x)); int len; memcpy(&len, buffer+sizeof(bar->x), sizeof(len)); bar->str = new char[len+1]; memcpy(bar->str, buffer+sizeof(bar->x)+sizeof(len), len*sizeof(char));... ;

The Fix: Custom Serialize/Deserialize Methods Usage: Bar b; b.x = 1; b.str = hello, world! ; char buffer[128]; b.serialize(buffer);... Bar* c = Bar::deserialize(buffer, &c);... delete c;

The Fix: Custom Serialize/Deserialize Methods Code can be greatly simplified using buffer that automa6cally keeps track of read/write posi6on: i.e., behaves like a stream void serialize(char* buffer) { memcpy(buffer, &x, sizeof(x)); int len = strlen(str); memcpy(buffer+sizeof(x), &len, sizeof(len)); memcpy(buffer+sizeof(x)+sizeof(len), str, len*sizeof(char)); becomes void serialize(bytestream* buffer) { buffer->write(&x, sizeof(x)); int len = strlen(str); buffer->write(&len, sizeof(len)); buffer->write(str, len*sizeof(char)); part of homework will ask you to write such a Bytestream class