Monitoring computer Vision Applications in Cloud Platforms (MOVACP) Sidi Ahmed Mahmoudi and Fabian Lecron University of Mons, Belgium sidi.mahmoudi@umons.ac.be Abstract Nowadays, images and videos have been present everywhere, they can come directly from camera, mobile devices or from other peoples that share their images and videos. The latter are used to present and illustrate different objects in a large number of situations (public areas, airports, hospitals, football games, etc.). This makes from image and video processing algorithms a very important tool used for various domains related to computer vision such as video surveillance, human behavior understanding, medical imaging and database (images and videos) indexation methods. The goal of this project is to develop a cloud application that integrate the mentioned methods using similar image and video processing (OpenCV, OpenGL, ITK, VTK, etc.). The application takes into account the variety of OS (Windows, Linux, Mac) and languages (Java, C++, Python, etc.). Experimentations will be conducted with different situations such as real time event detection and localization in crowd videos, motion tracking, medical image segmentation, 3D image reconstruction from 2D radiographs, 3D analysis of bones, etc. Keywords cloud computing, image and video processing, video surveillance, medical imaging. I. Principal investigators 1. Dr. Sidi Ahmed Mahmoudi from the Faculty of Engineering at the University of Mons. Belgium 2. Dr. Fabian Lecron from the Faculty of Engineering at the University of Mons. Belgium. II. Candidates Mohammed El Adoui, PhD Student, Faculty of Engineering, University of Mons, Belgium. Omar Seddati, PhD Student, numediart Institute,, University of Mons, Belgium. Mohamed Amine Lazouni, PhD, University of Tlemcen, Algeria. Mohammed Amine Larhmam, PhD Student, Faculty of Engineering, University of Mons, Belgium Mohammed Benjelloun, PhD, Faculty of Engineering, University of Mons, Belgium. Said Mahmoudi, PhD, Faculty of Engineering, University of Mons, Belgium. III. Dates Sidi Ahmed Mahmoudi: 4 weeks from 18 July to 12 August. Mohammed El Adoui: 4 weeks from 18 July to 12 August. Mohammed Amine Larhmam: 2 weeks Omar Seddati: 1 week Fabian Lecron : not yet defined Mohammed Benjelloun: not yet defined Said Mahmoudi : not yet defined. Notice that these dates are not definitive. They can be lightly changed during next months. Otherwise, we are actually discussing with other collaborators to joint our group. 1
IV. Project objectives The goal of this project is to develop a cloud application that integrates several computer vision methods related to motion tracking, video-surveillance and medical imaging applications. The latter are generally developed with similar image and video processing libraries (OpenCV 1, OpenGL 2, ITK 3, VTK 4, etc.). These modules are automatically integrated and configured in the cloud application. Within our platform, guests will have access to different computer vision techniques without having to download, install and configure the corresponding software. Each user could select the required application, load its data and retrieve results, with an environment similar to desktop. The main objective of this cloud application is to offer an effective tool for sharing scientific applications, and also to test (or validate) the developed methods with users from the entire world. Notice that the integrated applications are already developed (motion tracking, event detection and localization in real time, medical image segmentation, bone analysis, etc.). Moreover, the cloud platform should allow easily the integration of other methods. The main challenges of our project are : Development of a real time application for event detection and localization in multi-user scenarios 3D Medical image segmentation [1, 2] 3D reconstruction of shape [3], density and microarchitecture from 2D radiographs The above-mentioned applications will be shared within a cloud application so that each user can use the selected method, with an environment similar to desktop. V. Background information The background information of our project is related to the image and video processing applications used within motion analysis, video surveillance and medical 1 OpenCV computer vision library. www.opencv.org 2 www.opengl.org 3 www.itk.org 4 www.vtk.org imaging domains. We can cite several examples on which we are mainly interested such as: 1. Real time event detection and localization in multiuser scenarios [4] 2. Medical image segmentation in MR images [1, 2] 3. 3D image reconstruction from 2D radiographs [3] 4. Bone analysis : bone mineral density (BMD) and microarchitecture computation [5, 6] The above-mentioned applications would be integrated in a cloud platform from which guests and users could access to different computer vision techniques without having to download, install and configure the corresponding software. Actually, we are looking for the cloud platform that would be used for this aim. We can note some platforms such as IBM cloud platform 5, Google cloud platform 6, Amazon cloud computing 7. VI. Detailed technical description The technical description of our work can be presented with four main steps : Selection of the cloud platform: first, we should analyze the existing cloud platforms in order to select a platform that could be well adapted with our above-mentioned applications. Integration of the main image and video processing libraries (OpenCV, OpenGL, ITK, VTK, etc.). These modules are automatically integrated and configured in the cloud application. Securing data transfers to and from our cloud platform. This task is very important for video surveillance and (particularly) medical imaging applications. Integration of the above-mentioned applications so that guests can have access to different computer vision techniques without having to download, install and configure the corresponding software. 5 www.cloud.google.com 6 www.ibm.com 7 www.aws.amazon.com/cloud 2
Each user could select the required application, load its data and retrieve results, with an environment similar to desktop. The main applications that should be integrated within our cloud platform are described below. VI.1 Real time motion tracking and analysis in multi-user scenarios Video processing algorithms present a necessary tool for various domains related to computer vision such as motion tracking, event detection and localization in multi-user scenarios (crowd videos, mobile camera, scenes with noise, etc.). However, the new video standards, especially those in high definitions require more computation since their treatment is applied on large video frames. As result, the current implementations, even running on modern hardware, cannot provide a real-time processing (25 frames per second, fps). Several solutions have been proposed to overcome this constraint, by exploiting graphic processing units (GPUs). Although they exploit GPU platforms, they are not able to provide a real-time processing of high definition video sequences. This application proposes a new framework that enables an efficient exploitation of single and multiple GPUs, in order to achieve real-time processing of Full HD or even 4K video standards. Moreover, the framework includes several GPU based primitive functions related to motion analysis and tracking methods, such as silhouette extraction, contours extraction, corners detection and tracking using optical flow estimation. Based on this framework, we developed several real-time and GPU based video processing applications such as motion detection using moving camera, event detection and event localization. The framework is summarized in Figure 1. More detail about this framework are present in [4]. VI.2 Web based medical image segmentation framework Accurate Vertebra segmentation presents an essential step for automating the diagnosis of many spinal disorders. In case of MR images, this task becomes more challenging due to vertebra complex shape and high variation of soft tissue. In this work, we propose a web based framework for spine curve extraction and vertebra segmentation in T1-weighted MR images. This method is a fast parametrized algorithm based on three steps: 1. Image enhancing 2. Clustering 3. Shape recognition techniques. Figure 2 illustrates an example of this segmentation. VI.3 3D image reconstruction from 2D radiographs Severe cases of spinal deformities such as scoliosis are usually treated,by a surgery where instrumentation (hooks, screws and rods) is installed to the spine to correct deformities. Even if the purpose is to obtain a normal spine curve, the result is often straighter than normal. In this work, we propose a fast statistical reconstruction algorithm based on a general model which can deal with such instrumented spines. To this end, we present the concept of multilevel statistical model where the data are decomposed into a within-group and a between-group component. The reconstruction formulated as an optimization problem which can be solved very fast (few tenths of a second). Fig. 3 illustrates an example of 3D reconstruction based on this approach. The latter is well detailed in [3]. VI.4 Bone analysis: mineral density and microarchitecture computation In order to offer a 3D reconstruction of the bone density and microarchitecture from 2D radiographs, we developed a multiplatform application, implemented within Java. The latter allows extracting and calculating different density and microarchitecture metrics (bone mineral density (BMD), volume fraction, trabecular thickness, trabecular separation, etc). The BMD value was calculated within a Phatom that we know in advance its density values (of several points within the Phantom). A calibration is applied in order to estimate the density of each point within the 3D volume. As result, we obtained a value representing the bone mineral density of the whole 3D volume. Figure 4 shows an example of computing BMD and microarchitecture parameters, which are related to a vertebra region (L1) 3
Figure 1: Multi-GPU based Framework for real time motion tracking and analysis Figure 2: Web based medical image segmentation framework of a human. The figure shows also the 3D volume that illustrates the trabecular thickness and separation. Each color provides information with respect to the thickness value and the distance between neighbor spans. VII. Work plan and implementation schedule We plan to start actually (before joining the workshop) on the research, test and analysis of the existing cloud platforms in order to select a platform that could be well adapted with the above-mentioned applications. The plan that should be followed during the workshop is : 1. Task 1: Setting the cloud platform and integration of the main image and video processing libraries (OpenCV, OpenGL, ITK, VTK, etc.). These modules are automatically integrated and configured in the cloud application. The idea is that each collaborator can integrate its required libraries. This task should be developed during the first week. 2. Task 2: Securing data transfers to and from our cloud platform. This task is very important for video surveillance and (particularly) medical imaging applications. This task should be done during the second week. Notice that we intend to start working on these tasks before joining the workshop. 4
Figure 3: Radiographs of a postoperative patient and the associated reconstruction. Left: Lateral view. Center: Postero-anterior view. Right: 3D reconstruction 3. Task 3: Once the platform is well installed, each collaborator can install its application (described above) so that guests can have access to different computer vision techniques without having to download, install and configure the corresponding software. Each user could select the required application, load its data and retrieve results, with an environment similar to desktop. This task should be done during the two last weeks of the workshop. VIII. Benefits of the research The main benefits of this research is to offer a platform for sharing our scientific contribution related to computer vision domain. Indeed, each collaborator could integrate its application within our cloud platform so that users could test and exploit the required application, load its data and retrieve results, with an environment similar to desktop. As result, guests could exploit the shared applications without having to download, install and configure the corresponding software. With this platform, the scientific researchers could be able to share easily their applications. IX.1 Leader IX. Profile team The project leader is Sidi Ahmed Mahmoudi, which received the graduate engineering degree in computer science from the University of Tlemcen, Algeria, the master degree in multimedia processing from the Faculty of Engineering in Tours, France, and the PhD degree in engineering science from the University of Mons, Belgium, in 2006, 2008, and 2013, respectively. Currently, he is a researcher at the University of Mons, Belgium. His research interests are focused on the efficient exploitation of parallel (GPU) and heterogeneous (multi-cpu/ multi-gpu) architectures for faster processing of high definition images and videos. He also participated in national (ARC-OLIMP, Numdiart, Slowdio, CLEO) projects and European actions (COST IC 805). Sidi Ahmed Mahmoudi is author or co-author of 5 international journals, 3 book chapters and more than 30 conference and workshop papers. IX.2 Staff proposed by the leader Fabian Lecron received the computer science engineering degree from the Faculté Polytechnique de Mons (FPMs), Belgium, and the managementsciences degree from the Facultés Universi- 5
enterface 16. Final project proposal " MOVACP " (a) 3D Visualization (b) BMD and microarchitecture parameters Figure 4: BMD and microarchitecture computation from a 3D bone volume taires Catholiques de Mons (FUCaM), Belgium, respectively in 2008 and 2011. He obtained a Ph.D. degree in applied sciences at the University of Mons (formerly Faculté Polytechnique de Mons) in 2013. He is now postdoctoral researcher at the University of Mons (UMONS), Belgium. His main research areas are computer vision, image processing, collaborative recommendation, and data mining. Mohammed EL ADOUI received the Master degree in computer science (computer graphics and Image processing) from the University of Moulay Abdellah in Fez, Morocco, in 2015, and the basic license degree in Mathematics and Computer science from the Faculty of Science in Oujda, Mo6 rocco in 2013. Currently, he is a Phd student at the University of Mons, Belgium. His Phd that started in December 2015, is focused on Quantifying Tumor Vascular heterogeneity of breast cancer, using Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE_MRI). Mohammed El Amine LAZOUNI received his License degree in Electronic Biomedical from the University of Tlemcen, Algeria in 2008. In 2010 he obtained a master degree in the same field. And the PHD degree in the Biomedical Engineering Laboratory from the University of Tlemcen (Algeria) in 2014. Currently, he is a researcher at the University of Tlemcen, Algeria. His research interests are focused in computer assisted medical
decision support systems, machine learning, data collection and artificial intelligence. Mohammed El Amine LAZOUNI is author or co-author of 2 international journals, more than 15 conference papers and 1 book chapter. Mohamed Amine LARHMAM received the graduate engineering degree in Modeling and Scientific Computing from Mohammadia School of Engineering in Rabat, Morocco in 2010. He worked as a project engineer for 9 months at MCINET, Rabat. Currently, he works as teaching assistant in the computer science department at the faculty of Engineering, University of Mons. He is pursuing a PhD in engineering science in the same department. His researches focus on computer vision and machine learning with applications to Medical Image Analysis and Computer Aided Diagnosis. IX.3 Other researchers needed For our project, we would like to collaborate with researchers interested to the domains of cloud computing, image and video processing and video surveillance. The researchers are invited to develop the could platform and also include their applications if they want to share or validate (apply tests with different data sets) the related results. using a multilevel statistical model. In Medical Image Computing and Computer-Assisted Intervention. MICCAI 2012, volume 7511, pages 446 453. 2012. [4] Sidi Ahmed Mahmoudi and Pierre Manneback. Multi-gpu based event detection and localization using high definition videos. In International Conference on Multimedia Computing and Systems (ICMCS), pages 81 86, 2014. [5] Tristan Whitmarsh, Ludovic Humbert, Luis M. Del Rio Barquero, Silvana Di Gregorio, and Alejandro F. Frangi. 3d reconstruction of the lumbar vertebrae from anteroposterior and lateral dualenergy x-ray absorptiometry. Medical Image Analysis, 17(4):475 487, 2013. [6] T. Whitmarsh, L. Humbert, M. De Craene, L.M. Del Rio Barquero, and A.F. Frangi. Reconstructing the 3d shape and bone mineral density distribution of the proximal femur from dual-energy x-ray absorptiometry. Medical Imaging, IEEE Transactions on, 30(12):2101 2114, 2011. References [1] Sidi Ahmed Mahmoudi, F Lecron, P Manneback, M Benjelloun, and S Mahmoudi. GPU-Based Segmentation of Cervical Vertebra in X-Ray Images. HPCCE Workshop, IEEE International Conference on Cluster Computing, pages 1 8, 2010. [2] Larhmam Amine and et al. A portable multicpu/multi-gpu based vertebra localization in sagittal mr images. International Conference on Image Analysis and Recognition, ICIAR 2014, pages 209 218, 2014. [3] Fabian Lecron, Jonathan Boisvert, Said Mahmoudi, Hubert Labelle, and Mohammed Benjelloun. Fast 3d spine reconstruction of postoperative patients 7