Piranha: introduction and implementation details
Advanced Concepts Team, European Space Agency (ESTEC)
Course on Differential Equations and Computer Algebra
Estella, Spain, October 29-30, 2010
Outline
1. A Brief Overview
2. Polynomial Multiplication
3. Benchmarks
4. Future Steps
Piranha in a Nutshell
- An algebraic manipulation framework
- Around 12000 SLOC (Source Lines Of Code)
- Written in C++, object-oriented
- Makes extensive use of existing Free Software tools and libraries (Boost, GMP, Python, ...)
- Multiplatform (GNU/Linux, Windows, BSD)
- Free Software itself
Algebraic Structures for Celestial Mechanics

Polynomials:
$$\sum_{\mathbf{i}} C_{\mathbf{i}} \prod_{j} p_j^{i_j}$$

Fourier series:
$$\sum_{\mathbf{i}} C_{\mathbf{i}} \begin{Bmatrix} \cos \\ \sin \end{Bmatrix} \left( \mathbf{i} \cdot \mathbf{t} \right)$$

Poisson series:
$$\sum_{\mathbf{i},\mathbf{j}} C_{\mathbf{i},\mathbf{j}} \prod_{k} p_k^{j_k} \begin{Bmatrix} \cos \\ \sin \end{Bmatrix} \left( \mathbf{i} \cdot \mathbf{t} \right)$$

Echeloned Poisson series:
$$\sum_{\mathbf{i},\mathbf{j},\mathbf{k}} \frac{C_{\mathbf{i},\mathbf{j},\mathbf{k}} \prod_{m} p_m^{k_m}}{\prod_{\mathbf{l}} \left( \mathbf{l} \cdot \mathbf{d} \right)^{\delta_{\mathbf{j},\mathbf{l}}}} \begin{Bmatrix} \cos \\ \sin \end{Bmatrix} \left( \mathbf{i} \cdot \mathbf{t} \right)$$
The Framework 1/2
Q: Can we manipulate these algebraic structures in a general and unified way?
The Basic Ideas
1. Series are collections of terms
2. Terms are coefficient-key pairs
3. Terms are uniquely identified by their keys: t1 ≡ t2 ⟺ t1.key = t2.key
4. A key can appear at most once in a series, i.e., a series is a set
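A minimal C++ sketch of these ideas (the term/series names and the use of a standard hash map are illustrative assumptions, not Piranha's actual classes):

```cpp
#include <string>
#include <unordered_map>

// Illustrative only: a term is a coefficient-key pair, and the key alone
// identifies the term. A series is then a set of terms indexed by key.
struct term {
    double      cf;  // coefficient
    std::string key; // key, e.g. an encoded exponent vector or trig. argument
};

class series {
    // At most one term per key: the series behaves as a set keyed on 'key'.
    std::unordered_map<std::string, double> m_terms;
public:
    void insert(const term &t) {
        // Inserting a term whose key is already present merges coefficients.
        m_terms[t.key] += t.cf;
    }
    std::size_t size() const { return m_terms.size(); }
};
```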
The Framework 2/2 (diagram)
Object-Oriented and Generic Programming
The C++ Language
- High performance and high-level design are not mutually exclusive
- OO: inheritance, polymorphism, encapsulation, modularity
- Generic programming: type-agnostic classes
- Template meta-programming (aka "modern C++", see Alexandrescu [2001]): OO features with zero overhead, efficient compile-time optimizations and checks
The Bottom Line
A substantial portion of the implementation can be shared among the supported algebraic structures, reducing code duplication to a minimum without sacrificing performance
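A hedged illustration of how generic programming lets one series implementation be shared across structures: the class below is templated on the coefficient and key types, so different manipulators are just different instantiations (all names here are hypothetical, not Piranha's API):

```cpp
#include <map>
#include <vector>

// One generic series class, parametrised on the coefficient and key types.
// The insertion/merging machinery is written only once; each supported
// algebraic structure is a different instantiation.
template <typename Cf, typename Key>
class base_series {
    std::map<Key, Cf> m_terms; // at most one term per key
public:
    void insert(const Key &k, const Cf &c) { m_terms[k] += c; }
    std::size_t size() const { return m_terms.size(); }
};

// Hypothetical instantiations:
using polynomial     = base_series<double, std::vector<unsigned>>; // key = exponent vector
using fourier_series = base_series<double, std::vector<int>>;      // key = trig. multipliers
```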
A Quick SLOC Analysis 1/2
Gregoire & Colbert (Chapront [2003]): Fourier and Poisson series manipulators
- Written in Fortran 90
- Feature set comparable with Piranha's
- 4000 SLOC each (Piranha is 12000 SLOC)
Piranha additionally supports:
- polynomials as top-level series
- multiple representations for keys and numerical coefficients (complex, reals, integers, rationals, arbitrary-size, etc.)
- 12 different manipulators are currently implemented within the framework (other combinations can be trivially added)
A Quick SLOC Analysis 2/2
Piranha's SLOC count divided by directory (chart)
Pyranha
- Python bindings for Piranha
- Uses the Boost.Python library
- Compiled-code performance with the flexibility of an interpreted language
- Python is a real computer language (not an obscure ad-hoc language)
- Many possibilities for extensions
- Interactive graphical environment with IPython, matplotlib and PyQt4
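As a rough illustration of the Boost.Python mechanism used by Pyranha, here is how a C++ class can be exposed to Python; the module and class names are hypothetical, not Pyranha's actual interface:

```cpp
#include <boost/python.hpp>

// Hypothetical C++ class to be exposed (stand-in for a Piranha series type).
struct poly {
    double value;
    double evaluate() const { return value; }
};

// Boost.Python generates the Python module from these declarations; after
// compilation into a shared library, Python code can do:
//   import pyranha_example
//   p = pyranha_example.poly(); p.evaluate()
BOOST_PYTHON_MODULE(pyranha_example)
{
    using namespace boost::python;
    class_<poly>("poly")
        .def_readwrite("value", &poly::value)
        .def("evaluate", &poly::evaluate);
}
```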
Schoolbook multiplication
Given $a(x) = a_1 x + a_0$ and $b(x) = b_1 x + b_0$, compute $a(x)\,b(x)$ as
$$a_0 b_0 + a_0 b_1 x + a_1 b_0 x + a_1 b_1 x^2.$$
Complexity: $O(n^2)$.
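A minimal sketch of schoolbook multiplication for dense univariate polynomials stored as coefficient vectors (illustrative code, not Piranha's implementation):

```cpp
#include <cstddef>
#include <vector>

// Multiply two dense polynomials given as coefficient vectors
// (a[i] is the coefficient of x^i). Every coefficient of 'a' is
// multiplied by every coefficient of 'b': O(n^2) operations.
std::vector<double> schoolbook_mul(const std::vector<double> &a,
                                   const std::vector<double> &b)
{
    std::vector<double> retval(a.size() + b.size() - 1, 0.);
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) {
            retval[i + j] += a[i] * b[j];
        }
    }
    return retval;
}
```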
Asymptotically fast multiplication: Karatsuba
Karatsuba's algorithm: given $a(x) = a_1 x + a_0$ and $b(x) = b_1 x + b_0$, express $a(x)\,b(x)$ as
$$a_0 b_0 + \left[ (a_0 + a_1)(b_0 + b_1) - a_0 b_0 - a_1 b_1 \right] x + a_1 b_1 x^2,$$
with 3 multiplications vs 4 of the classical method.
Complexity: $O\!\left(n^{\log_2 3}\right)$.
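A sketch of the recursive Karatsuba scheme on coefficient vectors (illustrative; a production implementation would fall back to the schoolbook method below a size threshold):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using poly = std::vector<long long>; // poly[i] = coefficient of x^i

poly karatsuba(poly a, poly b)
{
    const std::size_t n = std::max(a.size(), b.size());
    a.resize(n, 0);
    b.resize(n, 0);
    if (n == 1) {
        return poly{a[0] * b[0]};
    }
    const std::size_t m = (n + 1) / 2; // size of the low halves
    poly a0(a.begin(), a.begin() + m), a1(a.begin() + m, a.end());
    poly b0(b.begin(), b.begin() + m), b1(b.begin() + m, b.end());
    // Three recursive multiplications instead of four.
    poly z0 = karatsuba(a0, b0); // a0*b0
    poly z2 = karatsuba(a1, b1); // a1*b1
    poly sa(a0), sb(b0);         // a0+a1 and b0+b1
    for (std::size_t i = 0; i < a1.size(); ++i) sa[i] += a1[i];
    for (std::size_t i = 0; i < b1.size(); ++i) sb[i] += b1[i];
    poly z1 = karatsuba(sa, sb); // (a0+a1)*(b0+b1)
    for (std::size_t i = 0; i < z0.size(); ++i) z1[i] -= z0[i];
    for (std::size_t i = 0; i < z2.size(); ++i) z1[i] -= z2[i];
    // Recombine: z0 + z1*x^m + z2*x^(2m).
    poly res(2 * n - 1, 0);
    for (std::size_t i = 0; i < z0.size(); ++i) res[i] += z0[i];
    for (std::size_t i = 0; i < z1.size(); ++i) res[i + m] += z1[i];
    for (std::size_t i = 0; i < z2.size(); ++i) res[i + 2 * m] += z2[i];
    return res;
}
```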
Asymptotically fast multiplication: FFT
- Convert the polynomials to vectors of coefficients
- Compute the FFT of both vectors
- Pointwise multiplication of the FFTed vectors
- Inverse FFT to recover the result of the multiplication
Complexity: $O(n \log n)$.
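A self-contained sketch of FFT-based multiplication using a textbook recursive Cooley-Tukey transform (illustrative only; a real implementation would rely on an optimized FFT library):

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Recursive radix-2 FFT; 'a' must have power-of-two length.
void fft(std::vector<std::complex<double>> &a, bool inverse)
{
    const std::size_t n = a.size();
    if (n == 1) return;
    std::vector<std::complex<double>> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = a[2 * i];
        odd[i]  = a[2 * i + 1];
    }
    fft(even, inverse);
    fft(odd, inverse);
    const double pi = std::acos(-1.), sign = inverse ? 1. : -1.;
    for (std::size_t k = 0; k < n / 2; ++k) {
        const auto w = std::polar(1., sign * 2. * pi * k / n) * odd[k];
        a[k]         = even[k] + w;
        a[k + n / 2] = even[k] - w;
    }
}

// Multiply two polynomials: transform, pointwise multiply, inverse transform,
// then normalize by the transform length.
std::vector<double> fft_mul(const std::vector<double> &a, const std::vector<double> &b)
{
    std::size_t n = 1;
    while (n < a.size() + b.size()) n *= 2; // pad to a power of two
    std::vector<std::complex<double>> fa(a.begin(), a.end()), fb(b.begin(), b.end());
    fa.resize(n);
    fb.resize(n);
    fft(fa, false);
    fft(fb, false);
    for (std::size_t i = 0; i < n; ++i) fa[i] *= fb[i];
    fft(fa, true);
    std::vector<double> res(a.size() + b.size() - 1);
    for (std::size_t i = 0; i < res.size(); ++i) res[i] = fa[i].real() / n;
    return res;
}
```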
Alas...
Issues: both Karatsuba and FFT
- have a high constant factor in their complexity, which makes them unsuitable for typical problems in Celestial Mechanics
- rely on the assumption that the polynomials being multiplied are dense
- perform poorly on real-world multivariate polynomials
Bottom line: back to schoolbook multiplication.
Kronecker's trick
Idea: code the sets of exponents into integer values. E.g., with three variables and exponents in [0, 3], the vector (z, y, x) is coded as 16z + 4y + x:

z y x   Code
0 0 0   0
0 0 1   1
0 0 2   2
0 0 3   3
0 1 0   4
0 1 1   5
0 1 2   6
0 1 3   7
0 2 0   8
0 2 1   9
0 2 2   10
0 2 3   11
. . .   ...
3 3 3   63

- Maintains lexicographic order
- Homomorphism between exponent vectors in Z^n and Z which preserves addition and subtraction
- Operations on integer vectors are reduced to O(1) complexity
- Codes can be used as perfect hash values or as indices in an array
- Series are encoded on-the-fly during multiplication
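A minimal sketch of the coding and decoding, assuming exponents bounded as in the table above (function names and the fixed base are assumptions for the example):

```cpp
#include <cstdint>
#include <vector>

// Encode an exponent vector into a single integer, mixed-radix style.
// Provided the base is chosen large enough that sums of exponents never
// overflow a digit, addition of codes corresponds to addition of exponent
// vectors, so a monomial product becomes a single integer addition.
std::int64_t kronecker_encode(const std::vector<int> &expo, int base = 4)
{
    std::int64_t code = 0;
    for (int e : expo) {
        code = code * base + e;
    }
    return code;
}

std::vector<int> kronecker_decode(std::int64_t code, std::size_t n_vars, int base = 4)
{
    std::vector<int> expo(n_vars);
    for (std::size_t i = n_vars; i-- > 0;) {
        expo[i] = static_cast<int>(code % base);
        code /= base;
    }
    return expo;
}
```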
Exploiting modern computer designs
- memory hierarchies (to the whiteboard)
- spatial locality of reference
- temporal locality of reference
- prefetcher
- multi-core CPUs: parallelization (multi-thread)
Memory hierarchy (diagram)
Dense multiplication
- use Kronecker exponents directly as indices in an array
- use cache-blocking to promote temporal locality of reference
- monomial ordering is prefetch-friendly
- when applicable, top performance is achieved
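A hedged sketch of the dense strategy, with Kronecker codes used directly as array indices (the flat-array layout and coefficient type are simplifications; Piranha's actual code adds cache-blocking and a prefetch-friendly ordering):

```cpp
#include <cstdint>
#include <vector>

struct kterm {
    std::int64_t code; // Kronecker-encoded exponent vector
    double       cf;   // coefficient
};

// Dense multiplication: since the coding is additive, the code of a product
// term is just the sum of the input codes, and it indexes the output array
// directly. 'max_code' must bound all possible code sums, i.e. the coding
// range must be chosen to accommodate the result.
std::vector<double> dense_mul(const std::vector<kterm> &a, const std::vector<kterm> &b,
                              std::int64_t max_code)
{
    std::vector<double> out(static_cast<std::size_t>(max_code) + 1, 0.);
    for (const auto &ta : a) {
        for (const auto &tb : b) {
            out[static_cast<std::size_t>(ta.code + tb.code)] += ta.cf * tb.cf;
        }
    }
    return out;
}
```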
Memory access patterns: unoptimized vs optimized (figures)
Sparse multiplication
- use Kronecker exponents directly as hash values
- optimized hash table: items stored in sequential and contiguous buckets
- order the input polynomials according to exponent modulo table size
- cache-blocking
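A minimal sketch of the sparse strategy keyed on the Kronecker code; std::unordered_map is only a stand-in for Piranha's optimized contiguous-bucket hash table:

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

struct kterm {
    std::int64_t code; // Kronecker-encoded exponent vector
    double       cf;   // coefficient
};

// Sparse multiplication: the code of each product term is used directly as
// the hash-map key, so identical monomials produced by different term pairs
// are merged automatically.
std::unordered_map<std::int64_t, double> sparse_mul(const std::vector<kterm> &a,
                                                    const std::vector<kterm> &b)
{
    std::unordered_map<std::int64_t, double> out;
    out.reserve(a.size() + b.size()); // rough initial sizing
    for (const auto &ta : a) {
        for (const auto &tb : b) {
            out[ta.code + tb.code] += ta.cf * tb.cf;
        }
    }
    return out;
}
```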
Parallelization
(diagram: the table of term-by-term multiplications split between two processors P1 and P2)
- cache-blocking provides a natural way to avoid contention
- interval arithmetic on the exponents is used to guarantee that the write-in memory areas are disjoint
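A simplified parallel sketch: here each thread accumulates into its own local map and the partial results are merged at the end, which avoids contention but is not Piranha's scheme (Piranha uses cache-blocking and interval arithmetic on the exponents to make the write areas disjoint):

```cpp
#include <cstdint>
#include <thread>
#include <unordered_map>
#include <vector>

struct kterm {
    std::int64_t code; // Kronecker-encoded exponent vector
    double       cf;   // coefficient
};

std::unordered_map<std::int64_t, double>
parallel_sparse_mul(const std::vector<kterm> &a, const std::vector<kterm> &b,
                    unsigned n_threads)
{
    std::vector<std::unordered_map<std::int64_t, double>> partial(n_threads);
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < n_threads; ++t) {
        threads.emplace_back([&, t]() {
            // Each thread processes a contiguous slice of 'a' and writes
            // only into its own map: no contention, no locking.
            const std::size_t begin = a.size() * t / n_threads;
            const std::size_t end   = a.size() * (t + 1) / n_threads;
            for (std::size_t i = begin; i < end; ++i) {
                for (const auto &tb : b) {
                    partial[t][a[i].code + tb.code] += a[i].cf * tb.cf;
                }
            }
        });
    }
    for (auto &th : threads) th.join();
    // Merge the thread-local results.
    std::unordered_map<std::int64_t, double> out;
    for (const auto &p : partial) {
        for (const auto &kv : p) out[kv.first] += kv.second;
    }
    return out;
}
```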
Benchmarks
Fateman's dense benchmark: compute $s \cdot (s + 1)$, with $s = (1 + x + y + z + t)^{30}$.
- 46376 × 46376 = 2 150 733 376 term-by-term multiplications
- Final polynomial length = 635 376
Monagan & Pearce's sparse benchmark: compute $f \cdot g$, with $f = (1 + x + y + 2z^2 + 3t^3 + 5u^5)^{12}$ and $g = (1 + u + t + 2z^2 + 3y^3 + 5x^5)^{12}$.
- 6188 × 6188 = 38 291 344 term-by-term multiplications
- Final polynomial length = 5 821 335
Benchmark results

Test      | Coefficient    | System      | Time   | ccpm
----------|----------------|-------------|--------|------
Fateman   | double         | Core2Quad   | 4.29s  | 4.8
Fateman   | double         | Core2Duo    | 5.62s  | 4.6
Fateman   | double         | PPC64       | 4.96s  | 4.6
Fateman   | double         | Xeon        | 3.73s  | 4.6
Fateman   | double         | Atom        | 20.15s | 15.0
Fateman   | GMP mpz        | Core2Quad   | 67.90s | 75.8
Fateman   | 61-bit integer | SDMP-Core2  | 60.25s | 67.2
Fateman   | 61-bit integer | SDMP-Corei7 | 70.59s | 85.3
ELP       | double         | Core2Quad   | 15.62s | 10.3
MP-sparse | double         | Core2Quad   | 1.71s  | 107.2
MP-sparse | double         | Xeon        | 1.59s  | 110.5
MP-sparse | double         | Corei7      | 1.15s  | 88.0
MP-sparse | 37-bit integer | SDMP-Core2  | 1.86s  | 116.6
MP-sparse | 37-bit integer | SDMP-Corei7 | 1.56s  | 108.4
Benchmark results: parallelization
- dense case: 90% of the maximum theoretical speedup
- sparse case: 70% of the maximum theoretical speedup
- SDMP gets (super)linear speedup in the dense case, but does not scale up in the sparse case
- possible improvements: reduce synchronization barriers, make the algorithm non-deterministic, ...
Future Steps
- Code refactoring and cruft elimination
- Extension of the Python bindings, GUI improvements, etc.
- Documentation
- Create a community