CS Software Engineering for Scientific Computing. Lecture 25: Mixed Language Programming.

2 Different languages
Technical, historical, and cultural differences can lead people to write code you would like to use in a language other than yours. Several options are available when you want to use this code as a third-party library:
- Re-implement the functions you want in your own language.
  - While this sounds ridiculous, it is actually the dominant means of communicating algorithms to this day.
- Translate the code using a tool.
  - f2c is a classic in this category.
  - This creates code that lives in your repo. You branch from the reference implementation.
- Perform inter-language procedure calls, compilation, and linking.
- Implement a client-server or Service Oriented Architecture (SOA) design.

3 Re-implement
Cons
- You'll probably not have as much testing and robustness as the widely used package.
- You might get the routine wrong.
- It is infeasible past a relatively simple level of complexity.
- You miss out on advances others make in the state of the art.
Pros
- If you are re-implementing something from a slower-executing language (Python or Java), it will probably run faster.
- Most tools are happiest when working with a single-language code environment: profilers, build systems, debuggers, revision control, documentation.
- Nothing special is needed to call code written in your own language.
Good practice
- Keep an active regression test system that compares your implementation with the reference package.
- See if somebody else is already maintaining a mixed-language binding for your language and join that team.
Most of this also applies to translation approaches.
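The regression-test advice above can be sketched in a few lines. This is only an illustration under stated assumptions: `my_sqrt` is a hypothetical in-house re-implementation (Newton's method), and Python's `math.sqrt` stands in for the reference package.

```python
import math

def my_sqrt(x: float, tol: float = 1e-12) -> float:
    """Hypothetical in-house re-implementation: Newton's method for sqrt."""
    if x < 0:
        raise ValueError("my_sqrt requires x >= 0")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0   # start at or above the true root
    while True:
        nxt = 0.5 * (guess + x / guess)
        if abs(nxt - guess) <= tol * nxt:
            return nxt
        guess = nxt

def regression_check(samples, rel_tol=1e-9):
    """Return the inputs where our version disagrees with the reference."""
    return [x for x in samples
            if not math.isclose(my_sqrt(x), math.sqrt(x),
                                rel_tol=rel_tol, abs_tol=1e-15)]

failures = regression_check([0.0, 1e-6, 0.5, 1.0, 2.0, 144.0, 1e10])
print("mismatches:", failures)
```

Running a check like this on every commit is what keeps a re-implementation from silently drifting away from the package it was copied from.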

4 Client-Server Models (or SOA)
Problem:
- Code written in language A
- Code written in language B
- One needs an operation to be performed by the other
Solution: a protocol written in language C!
- Example: HTTP (HyperText Transfer Protocol)
  - clients written in any language (browsers, crawlers, etc.)
  - servers written in any language (apache, php, perl, C)
While this model is network-centric in practice, the network is not an essential element.
- I can hook up programs with Unix pipes and STDIN/STDOUT.
For long-term viability and flexibility this is a good way to insulate your development.
- It transfers all discussion of interoperability to just your own user community, as expressed in your transfer protocol.
Communication protocols are much more forgiving than language semantics, but they are also a terror to debug.
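A minimal sketch of the pipe-based variant, assuming a made-up line protocol ("two integers in, their sum out"). The "server" here happens to be a second Python process so that the example is self-contained, but the client only sees text on STDIN/STDOUT, so the server could be written in any language.

```python
import subprocess
import sys

# Stand-in "server": any program, in any language, that reads one request
# per line on STDIN and writes one reply per line on STDOUT.
SERVER_SOURCE = r"""
import sys
for line in sys.stdin:
    a, b = line.split()
    print(int(a) + int(b), flush=True)
"""

def ask_server(requests):
    """Send 'a b' request lines to the child process and collect its replies."""
    payload = "".join(f"{a} {b}\n" for a, b in requests)
    result = subprocess.run(
        [sys.executable, "-c", SERVER_SOURCE],
        input=payload, capture_output=True, text=True, check=True,
    )
    return [int(reply) for reply in result.stdout.split()]

print(ask_server([(1, 2), (20, 22)]))   # [3, 42]
```

The only thing the two sides share is the text protocol, which is exactly the insulation the slide describes.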

5 C++ calling C
Despite the name, C++ is not C. For linking, the main difference is how symbols are named.
- A symbol is the text string used to label a function in an object file.
C compilers are all compatible in their conventions for naming symbols.
- I can compile with different C compilers and usually link the results successfully into a common executable.
C++ compilers have traditionally each had their own proprietary naming conventions. This is easiest to show with an example.

6 f1.cpp vs f1.c

    void f1(int a_a, int a_b, int a_c)
    {
      int temp = a_a;
      double b = a_b;
    }

    > g++ -c -o f1.o f1.cpp
    > gcc -c -o f1c.o f1.c
    > nm f1.o ; nm f1c.o

7 Name Mangling
Output of nm f1.o (compiled as C++):

    s EH_frame1
    T __Z2f1iii
    S __Z2f1iii.eh

Output of nm f1c.o (compiled as C):

    s EH_frame1
    T _f1
    S _f1.eh

Since C does not have overloading or classes, its symbol names do not require any form of mangling. Mangling is the process of making up unique string names for member functions and overloaded functions. You can run nm on the object files for whatever classes and functions you have lying around.
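To make the mangled name less mysterious, here is a toy sketch of how GCC and Clang (which follow the Itanium C++ ABI) encode a plain free function: `_Z`, the length of the name, the name, then one letter per parameter type. The helper `mangle_free_function` is hypothetical and covers only a handful of builtin types; real mangling also handles namespaces, classes, templates, pointers, and more. On macOS, nm shows one additional leading underscore in front of every symbol.

```python
# Single-letter codes the Itanium C++ ABI uses for a few builtin types.
TYPE_CODES = {"void": "v", "char": "c", "int": "i", "long": "l",
              "float": "f", "double": "d"}

def mangle_free_function(name, param_types):
    """Mangle a non-member, non-template function with builtin parameters."""
    # An empty parameter list is encoded as a single 'v' (void).
    params = "".join(TYPE_CODES[t] for t in param_types) or "v"
    return f"_Z{len(name)}{name}{params}"

print(mangle_free_function("f1", ["int", "int", "int"]))  # _Z2f1iii
print(mangle_free_function("f1", ["double"]))             # _Z2f1d
```

Because the parameter types are part of the symbol, two overloads of f1 get two different names, which is what lets the linker tell them apart.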

8 Linking
So what are the consequences of linking your C++ program to C-compiled code?

    #include "f1.h"
    int main()
    {
      f1(1,2,2);
      return 0;
    }

    > g++ -c -o f1test.o f1test.cpp
    > g++ f1test.o f1c.o
    Undefined symbols for architecture x86_64:
      "f1(int, int, int)", referenced from:
          _main in f1test.o

9 nm f1test.o

    s EH_frame1
    U __Z2f1iii
    U ___gxx_personality_v0
    T _main
    S _main.eh

The linker was looking for __Z2f1iii, but f1c.o contains _f1. C++ and C name their functions differently. We can tell C++ to look for the C-named symbol instead.

10 extern "C"

    extern "C" {
    #include "f1.h"
    }
    int main()
    {
      f1(1,2,2);
      return 0;
    }

We can tell the C++ compiler to NOT mangle a function name, but just use the default C naming conventions. Also, I should mention that wrapping declarations in #ifdef, as the portable header on the next slide does, is called conditional compilation.

    s EH_frame1
    U ___gxx_personality_v0
    U _f1
    T _main
    S _main.eh

11 Writing a portable C header: f1.h

    #ifdef __cplusplus
    extern "C" {
    #endif

    void f1(int a_a, int a_b, int a_c);

    #ifdef __cplusplus
    }
    #endif

__cplusplus is a macro that is defined by all C++ compilers, so the extern "C" wrapper is seen only when the header is included from C++.

12 Fortran 2003 C Interoperability
More cumbersome than you would imagine.

    subroutine f1(a, b, c) BIND(C)
      USE ISO_C_BINDING
      integer (C_INT) a, b, c
      return
    end

As with the C++ compiler, I can now tell the Fortran compiler to create a symbol with a C naming convention. To call this function from C++ I would need a declaration:

    extern "C" {
      void f1(int* a, int* b, int* c);
    }

The arguments are pointers because Fortran passes arguments by reference by default.

13 C as the typical default LCD (lowest common denominator)
Most languages provide a means to make their functionality accessible through a C interface. In general, this creates the two-step process shown for Fortran 2003 and C++:
1. The library code is instructed to create a C-named symbol.
2. The calling code is instructed to look for a C-named symbol.
- Then the linker can find everything and hook it all up.
Some languages, like C++, embed their strong typing in their naming convention. Others, like C and F77, do not catch the error of calling a function with the wrong number or type of arguments at link time.
We'll stop here with compiled languages. The approach for the rest is the same, but you will rarely encounter the other compiled languages.
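The same two-step pattern shows up even from a scripting language. As a hedged illustration (ctypes is not covered in this lecture, and the library lookup below assumes a POSIX system), Python's standard `ctypes` module can load a shared library and look up a C-named symbol such as `strlen` directly. Note that the caller has to state the signature itself, because the C symbol name carries no type information.

```python
import ctypes
import ctypes.util

# Locate the C runtime. If find_library returns None, CDLL(None) on POSIX
# falls back to the symbols already loaded into the running process,
# which include the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# The symbol "strlen" says nothing about types, so the caller declares
# the C signature by hand: size_t strlen(const char *).
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"hello"))   # 5
```

Declaring the wrong `argtypes` here would not be caught at load time, which is the same hazard the slide describes for C and F77.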

14 Interpreted languages
Interpreted languages (Java, Python, MATLAB, etc.)
- These are not linkable (no .o objects).
- The procedure is quite different depending on which direction you want to go.
Interpreted languages execute inside a virtual machine.
- It is like an abstraction of a computer, but is in reality just another executing process.
Python calling C++: Extending
Recall that most languages provide a mechanism to create a C binding for their functions.
- Write wrapper code that includes Python.h
  - This includes parsing your string inputs
- Compile to a C binding
- Link into a shared library
- Load the shared library into the Python interpreter with the import statement

15 Simple wrapped function

    #include <Python.h>

    static PyObject *
    toy_system(PyObject *self, PyObject *args)
    {
      const char *command;
      int sts;
      /* Unpack the single string argument passed from Python. */
      if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
      sts = system(command);
      /* Convert the C int back into a Python integer. */
      return Py_BuildValue("i", sts);
    }

16 Not done yet. Also need a vtable (the method table)

    static PyMethodDef ToyMethods[] = {
      {"system", toy_system, METH_VARARGS, "Execute a shell command."},
      {NULL, NULL, 0, NULL}  /* Sentinel */
    };

    // and still not done yet
    static struct PyModuleDef toymodule = {
      PyModuleDef_HEAD_INIT,
      "toy",   /* name of module */
      NULL,    /* module documentation, may be NULL */
      -1,      /* size of per-interpreter state of the module,
                  or -1 if the module keeps state in global variables. */
      ToyMethods
    };

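17 and still not done yet: Init function

    /* PyMODINIT_FUNC declares the return type as PyObject* and, when compiled
       as C++, adds extern "C" so that the interpreter can find the unmangled
       symbol PyInit_toy. */
    PyMODINIT_FUNC PyInit_toy(void)
    {
      PyObject* res = PyModule_Create(&toymodule);
      if (!res) return NULL;
      return res;
    }

So, what do you do with this toy.cpp file?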

18 Compiling and linking a python module

    > g++ -c -fpic -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/ -o toy.o toy.cpp

- -fpic creates Position Independent Code. The generated code refers to functions and data through program-counter-relative or table-based addressing rather than absolute addresses. This makes it relocatable: the library can be loaded at any address.

    > g++ -shared -L/Library/Frameworks/Python.framework/Versions/Current/lib -lpython3.2 -o toy.so toy.o

- -shared means we are building a dynamic or shared library.
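The include path above is specific to one machine and one Python version. As a sketch of how to avoid hard-coding it, the interpreter can report its own header directory and extension suffix through the standard `sysconfig` module. The helper `build_commands` is hypothetical; it only assembles the command lines and does not run them, and platform-specific link flags (such as the -L and -lpython3.2 pair on the slide) are deliberately left out and may still be needed on your system.

```python
import shlex
import sysconfig

def build_commands(source="toy.cpp", module="toy"):
    """Return (compile, link) command lines for a C++ extension module."""
    include_dir = sysconfig.get_paths()["include"]        # where Python.h lives
    ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")   # platform suffix ending in .so or .pyd
    compile_cmd = ["g++", "-c", "-fpic", f"-I{include_dir}",
                   "-o", f"{module}.o", source]
    link_cmd = ["g++", "-shared", "-o", f"{module}{ext_suffix}", f"{module}.o"]
    return compile_cmd, link_cmd

for cmd in build_commands():
    print(shlex.join(cmd))
```

The printed lines can be pasted into a shell or a makefile rule; the point is that the interpreter that will import the module is also the one supplying the paths.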

19 Using a Python Module

    > python
    Python (v3.2.2:137e45f15c0b, Sep , 17:28:59)
    [GCC (Apple Inc. build 5666) (dot 3)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import toy
    >>> toy.system("ls -la");
    total 40
    drwxr-xr-x   5 bvs bvs  170 Nov 21 17:15 .
    drwxr-xr-x 102 bvs bvs 3468 Nov 21 16:39 ..
    -rw-r--r--   1 bvs bvs  831 Nov 21 17:03 toy.cpp
    -rw-r--r--   1 bvs bvs 1872 Nov 21 17:03 toy.o
    -rwxr-xr-x   1 bvs bvs 8748 Nov 21 17:06 toy.so
    0
    >>>

20 Why would we do this?
Python is nice and fun to use. It is good for rapidly prototyping new ideas.
The interpreter can make your code quite slow.
You can link in optimized, compiled code for the performance-critical operations in your program.
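A small, hedged illustration of the gap: the built-in `sum` is implemented in C inside CPython, while `python_sum` below (a name made up for this sketch) does the same work in interpreted bytecode. The timings depend entirely on the machine and Python version, so none are quoted here.

```python
import timeit

def python_sum(values):
    """Add up the values with an explicit interpreted loop."""
    total = 0
    for v in values:
        total += v
    return total

data = list(range(100_000))
assert python_sum(data) == sum(data)   # same answer either way

t_py = timeit.timeit(lambda: python_sum(data), number=20)
t_c = timeit.timeit(lambda: sum(data), number=20)
print(f"interpreted loop: {t_py:.4f} s, builtin sum: {t_c:.4f} s")
```

Moving an inner loop like this into a compiled extension module is the use case the slide has in mind.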

21 Why would we not do this?
- Goodbye debugging
- Goodbye profiling
To get those things back you end up re-implementing your code base in the compiled language.
It is not hopeless to debug Python modules.

22 Debugging Python Modules

    > gdb python
    (gdb) run
    >>> import toy
    Reading symbols for shared libraries. done
    >>>
    Program received signal SIGINT, Interrupt.
    0x00007fff8a3ad932 in select$darwin_extsn ()
    (gdb) break toy_system
    Breakpoint 1 at 0x100669e66
    (gdb) cont
    Continuing.
    toy.system("ls -la");
    Breakpoint 1, 0x e66 in toy_system ()
    (gdb) where
    #0 0x e66 in toy_system ()
    #1 0x b31f4 in PyEval_EvalFrameEx ()
    #2 0x b41ba in PyEval_EvalCodeEx ()
    #3 0x b44cf in PyEval_EvalCode ()
    #4 0x db16e in PyRun_InteractiveOneFlags ()
    #5 0x db43e in PyRun_InteractiveLoopFlags ()
    #6 0x dbc71 in PyRun_AnyFileExFlags ()
    #7 0x f0982 in Py_Main ()
    #8 0x e5f in dyld_stub_strlen ()
    #9 0x d04 in ?? ()

23 C++ calling Python: Embedding

    #include <Python.h>
    int main(int argc, char *argv[])
    {
      Py_Initialize();
      PyRun_SimpleString("from time import ctime\n"
                         "print(ctime())\n");
      Py_Finalize();
      return 0;
    }

    > g++ -I/Library/Frameworks/Python.framework/Versions/3.2/include/python3.2m/ -L/Library/Frameworks/Python.framework/Versions/3.2/lib -lpython3.2 simple.cpp
    > ./a.out
    Mon Nov 21 21:45:

24 To get fancier, you need to use the Py interface
- Catching the return types from functions
  - PyObject: dynamically casting return types to their derived types
- Turning your arguments into strings that get handed through the Python parser
This gets ugly very quickly.
- It also changes syntax as you move through minor Python version numbers (?!?!). Yes, that is a bad thing.
You end up parsing a lot of PyList objects in the raw interface.

25 Tools
SIP and SWIG
- Automate much of the mechanical part of generating wrapper code given existing C or C++ code. You just have to follow certain coding conventions.
- Still not really automatic
- Not all language semantics have an analogue between the languages.
Boost.python
- A set of C++ classes and templates that make

26 Boost.python

    #include <boost/python.hpp>

    char const* greet()
    { return "hello, world"; }

    BOOST_PYTHON_MODULE(hello_ext)
    {
      using namespace boost::python;
      def("greet", greet);
    }

I'll skip the complicated compilation.

    >>> import hello_ext
    >>> print(hello_ext.greet())
    hello, world

27 Java works in a similar way
The two languages are contemporaries, really.
Java has been a bit ambivalent about its C interface.
- Native C interfaces make it very hard to make your program certifiably secure.
- Native C code makes your Java code non-portable and non-webbish.
Still, you have to either provide *everything* in your language, or provide an interface to C.
- Not too many device drivers get written in Java. (Java's support for real-time execution is still pretty new and brittle.)
The Java Native Interface is specified in <jni.h>.
First, we look at extending.

28 HelloWorld.h

    #include <jni.h>
    #ifndef _Included_HelloWorld
    #define _Included_HelloWorld
    extern "C" {
    JNIEXPORT void JNICALL Java_HelloWorld_print(JNIEnv *, jobject);
    }
    #endif

29 HelloWorld.cpp

    #include <jni.h>
    #include <stdio.h>
    #include "HelloWorld.h"

    JNIEXPORT void JNICALL Java_HelloWorld_print(JNIEnv *env, jobject obj)
    {
      printf("Hello World!\n");
      return;
    }

30 HelloWorld.java

    class HelloWorld
    {
      // Implemented in native code; see HelloWorld.cpp.
      private native void print();

      public static void main(String[] args)
      {
        new HelloWorld().print();
      }

      // Load the native library when the class is first loaded.
      static
      { System.loadLibrary("HelloWorld"); }
    }

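31 Building it all and running

    > javac HelloWorld.java
    > g++ -shared HelloWorld.cpp -o libHelloWorld.so
    > java HelloWorld
    Hello World!

The library file name must match the name passed to System.loadLibrary ("HelloWorld" becomes libHelloWorld.so on Linux). Depending on your setup you may also need -fpic, -I flags for the directories containing jni.h, and -Djava.library.path=. when running.
Now, there are also tools to help you get further along:
- javah
  - reads a Java class file and generates a header file stub for you
- SWIG
  - can parse your C/C++ code and generate shadow classes and wrappers to help you with JNI as well
Microsoft thinks they know better, or they just want Java to fail. MS tools to link with Java are notoriously buggy.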

32 Java Embedding

    #include <jni.h>
    #include <stdio.h>

    JNIEnv* create_vm(JavaVM **jvm)
    {
      JNIEnv *env = NULL;
      JavaVMInitArgs vm_args;
      JavaVMOption options;
      /* optionString is a char*, so the string literal needs a cast in C++. */
      options.optionString =
          const_cast<char*>("-Djava.class.path=D:\\Java Src\\TestStruct");
      vm_args.version = JNI_VERSION_1_6;
      vm_args.nOptions = 1;
      vm_args.options = &options;
      vm_args.ignoreUnrecognized = 0;
      int ret = JNI_CreateJavaVM(jvm, (void**)&env, &vm_args);
      if (ret < 0) printf("\nUnable to launch JVM\n");
      return env;
    }

33 What do you do with a JavaVM?
Much the same as you would do with a Python VM:
- Build up strings to pass to Java functions.
- Handle Java objects as return types.
Most useful if you have a large GUI written in Java already set up and working, but want to call it from your code.
- This doesn't come up very often.
Mostly I just wanted to show that interpreted languages have two kinds of interoperability, and that there is a virtual machine.

