EIVA NaviModel3: Efficient Sonar Data Cleaning
Implementation of the S-CAN Automatic Cleaning Algorithm in EIVA's NaviModel3
Lars Dall, EIVA A/S
EIVA NaviModel3 Data Cleaning
Contents:
- Introduction to NaviModel3
- Cleaning functionalities within NaviModel3, with emphasis on S-CAN automatic cleaning
- Evaluation of the S-CAN efficiency
- Future Developments
- Summary
Abstract:
The development of NaviModel3 has focused on optimizing the post-processing environment, with emphasis on two aspects:
- Optimization of the visual environment, to give the operator better background information for decision making
- Speed optimization and automation of the entire post-processing task
One of the major components of the speed optimization and automation has been the implementation of the S-CAN automatic data cleaning algorithm.
EIVA NaviSuite
Important Features: EIVA NaviModel3
- Unlimited model size, based on the Quad Tree principle
- TRN and TIN geometry types
- TRN models include, by default:
  - Raw points
  - Average, Minimum and Maximum model types
  - Interpolation models
- Cleaning functionalities: automatic and manual methods
- Add-on modules for:
  - Conventional Digital Terrain Modelling
  - Pipeline Inspection
  - Video Integration
  - Online 3D Visualization
  - Catenary and Cablelay Visualizations
The NaviModel3 Flow
NaviModel3 Quad Tree Principle
The Quad Tree principle is a strategy for fast handling and visualisation of large datasets:
- When zoomed out: the model is generalised at a low resolution
- When zoomed in: what is on the screen is shown at high resolution; the rest is clipped away
- NaviModel always uses 32 levels in a single-file database solution
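The level-of-detail idea can be illustrated with a small sketch. This is a generic quadtree, not EIVA's implementation: the class name `QuadNode`, the running mean depth stored per cell, and the `window`/`level` query parameters are all assumptions made for illustration. A zoomed-out view queries a shallow level and gets a few coarse cells; a zoomed-in view queries a deep level, and every cell outside the view window is clipped away without being visited.

```python
class QuadNode:
    """One square cell; keeps a running mean depth of all points beneath it."""

    def __init__(self, x0, y0, size, level=0):
        self.x0, self.y0, self.size, self.level = x0, y0, size, level
        self.count = 0
        self.z_sum = 0.0
        self.children = None  # four sub-cells, created on first use

    def insert(self, x, y, z, max_level):
        """Add a sounding and update every cell on the path down to max_level."""
        self.count += 1
        self.z_sum += z
        if self.level == max_level:
            return
        half = self.size / 2
        if self.children is None:
            self.children = [
                QuadNode(self.x0 + dx * half, self.y0 + dy * half, half, self.level + 1)
                for dy in (0, 1) for dx in (0, 1)
            ]
        ix = 1 if x >= self.x0 + half else 0
        iy = 1 if y >= self.y0 + half else 0
        self.children[iy * 2 + ix].insert(x, y, z, max_level)

    def overlaps(self, xmin, ymin, xmax, ymax):
        return (self.x0 < xmax and self.x0 + self.size > xmin and
                self.y0 < ymax and self.y0 + self.size > ymin)

    def query(self, window, level):
        """Return (x0, y0, size, mean_z) for non-empty cells at `level` in `window`."""
        if self.count == 0 or not self.overlaps(*window):
            return []  # empty, or outside the view: clipped away
        if self.level == level or self.children is None:
            return [(self.x0, self.y0, self.size, self.z_sum / self.count)]
        cells = []
        for child in self.children:
            cells.extend(child.query(window, level))
        return cells


# Hypothetical usage: an 8 x 8 area with three soundings, three levels deep.
root = QuadNode(0, 0, 8)
for x, y, z in [(1, 1, 10.0), (1.5, 1.5, 12.0), (7, 7, 20.0)]:
    root.insert(x, y, z, max_level=3)

zoomed_out = root.query((0, 0, 8, 8), level=0)  # one coarse cell for the whole area
zoomed_in = root.query((0, 0, 2, 2), level=3)   # fine cells inside the window only
```

The demo uses three levels to stay readable; the slides state that NaviModel uses 32.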
NaviModel3 Quad Tree Principle
Models and raw points reside on the hard drive (not in RAM). The effects of the Quad Tree principle can be visualised as follows. Note the high I/O efficiency with respect to both model and raw points:
NaviModel3 Model Types I
The TRN geometry type: TRN models consist of square cells, each divided into four triangles to ensure a smooth transition between cells.
NaviModel3 Model Types II
The TIN geometry type: TIN model (left); principle of Delaunay triangulation (right): the circumscribed circle through the corners of any triangle must not contain other points.
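The Delaunay condition can be checked with the standard "in-circle" determinant test. The sketch below is a generic textbook formulation, not code from NaviModel3; the function name `in_circumcircle` and the 2D tuple representation of points are assumptions.

```python
def in_circumcircle(a, b, c, p):
    """Return True if point p lies strictly inside the circle through a, b and c.

    All points are (x, y) tuples. The triangle may be given in either winding
    order; the sign of the determinant is corrected for orientation.
    """
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # Positive orientation means a, b, c are counter-clockwise.
    orientation = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return det > 0 if orientation > 0 else det < 0


def violates_delaunay(triangle, other_points):
    """A triangle violates the condition if any other point is inside its circle."""
    a, b, c = triangle
    return any(in_circumcircle(a, b, c, p) for p in other_points)
```

A triangulation is Delaunay when `violates_delaunay` is false for every triangle, tested against all points that are not its own corners.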
NaviModel3 Interpolation
IDW interpolation (right), extrapolation (left)

$$A_s = \frac{1}{\sum_{i=1}^{n} \frac{1}{d_i^{2}}} \cdot \sum_{i=1}^{n} \frac{1}{d_i^{2}} \, A_{p_i}$$
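Inverse distance weighting (IDW) with squared distances can be sketched in a few lines. The symbol definitions on the slide did not survive, so the reading used here is an assumption: the result is the interpolated value at the sample position, each known point contributes its value, and the weight is one over the squared horizontal distance to that point. The function name `idw` and the `(x, y, value)` tuple layout are hypothetical.

```python
import math


def idw(sample_xy, points, power=2):
    """Inverse-distance-weighted value at sample_xy.

    points is a list of (x, y, value) tuples. With power=2 the weights are
    1 / d**2, the usual squared-distance form of IDW.
    """
    if not points:
        raise ValueError("at least one known point is required")
    sx, sy = sample_xy
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # sample coincides with a known point
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total
```

Midway between two points the weights are equal and the result is their mean; closer to one point, that point dominates.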
NaviModel3 Cleaning The situation prior to cleaning: DTM view with noise around the pipe, close to a rock-dump area (left) Noisy raw data around the pipe (embedded in the model and superimposed onto the DTM) (right)
NaviModel3 Cleaning II
Manual Point Edit 3D cleaning
NaviModel3 Cleaning III
After cleaning with manual Point Edit 3D cleaning
NaviModel3 Cleaning IV
Semi-automatic Histogram Plane Cleaning
NaviModel3 Cleaning V
NaviModel3 supports the inclusion of dedicated, user-developed plug-ins for cleaning and anti-noise determination:
- The S-CAN (SCALGO Combinatorial Anti Noise) cleaning tool was developed in cooperation with the Center for Massive Data Algorithms (MADALGO) at the University of Aarhus
- The development of the tool focused on automatic cleaning of the massive multibeam point clouds typically associated with pipeline surveys
- S-CAN computes a Noise Score for each data point; the user can then interactively clean parts of the dataset in NaviModel3 by selecting a region of the data and removing points with high noise scores
- The score values are determined in an initial, relatively processing-heavy step
- The subsequent manual step, in which areas are selected and given dedicated threshold values, is designed for efficiency
- The S-CAN plug-in comes in two variants:
  - The Score variant
  - The Components variant
NaviModel3 S-Can Cleaning
The Components Variant:
- Separates the input observations into series of observations in which neighbouring points differ by no more than a maximum threshold
- Each such series of neighbouring points is termed a Surface. A large threshold produces fewer surfaces with high internal noise; a small threshold divides the observations into more surfaces
- The largest surfaces, in terms of population, are listed in sequence in the user interface, for the user to choose which ones to keep
- If the threshold turns out to be unsuitable for the cleaning, the data must be re-indexed with a new threshold
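The grouping step described above resembles connected-component labelling. The sketch below is an illustration under stated assumptions, not the S-CAN algorithm: it treats two points as neighbours when their 3D distance is at most the threshold (the slides do not define the neighbourhood), uses a brute-force pairwise comparison that would not scale to millions of points, and the function name `split_into_surfaces` is hypothetical.

```python
import math
from collections import defaultdict


def split_into_surfaces(points, threshold):
    """Group points into 'surfaces' using union-find.

    points is a list of (x, y, z) tuples. Two points are linked when their
    3D distance is <= threshold (an assumed neighbourhood rule). Returns a
    list of index lists, largest surface first.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(points)):
        groups[find(i)].append(i)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical data: four seabed points, two noise points above them, one outlier.
demo_points = [
    (0, 0, -50.0), (1, 0, -50.1), (2, 0, -49.9), (3, 0, -50.0),
    (1, 0, -45.0), (1.5, 0, -44.8),
    (2, 0, -40.0),
]
surfaces = split_into_surfaces(demo_points, threshold=1.5)
```

With a threshold of 1.5 the demo data separates into three surfaces, and keeping only the largest one removes the noise. Raising the threshold to 6 merges everything into a single noisy surface, while lowering it to 0.5 leaves every point on its own; changing the threshold means running the grouping again.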
NaviModel3 S-Can Cleaning II S-CAN Components automatic cleaning Initial action (left) Cleaning (right)
NaviModel3 S-Can Cleaning III
After cleaning with S-CAN Components automatic cleaning
NaviModel3 S-Can Cleaning IV
The Score Variant:
- Score calculates for all thresholds; this speeds up testing for the best possible threshold value for a given area
- The Score variant is often faster than the Components variant
- It should only be used where a single surface must be determined; a pipe and a seabed can sometimes be regarded as one surface
- Conversely, Components should be used in situations with a larger variety of seabed features
- Most often, combining the two variants yields the best solution: Score is used first, because of its efficiency, and Components is used in the remaining, more complex areas
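Once every point carries a noise score, trying different thresholds in a selected area is cheap, because only a comparison is repeated. The sketch below illustrates that interactive step only; it does not compute noise scores (the slides do not describe how S-CAN does so), and the function names, the rectangular region, and the 0-to-1 score scale are assumptions.

```python
def clean_region(points, scores, region, threshold):
    """Remove points inside `region` whose noise score exceeds `threshold`.

    points: list of (x, y, z) tuples; scores: one precomputed score per point;
    region: (xmin, ymin, xmax, ymax). Points outside the region are kept.
    """
    xmin, ymin, xmax, ymax = region
    kept = []
    for (x, y, z), score in zip(points, scores):
        inside = xmin <= x <= xmax and ymin <= y <= ymax
        if inside and score > threshold:
            continue  # classified as noise in this region
        kept.append((x, y, z))
    return kept


def removed_per_threshold(scores, thresholds):
    """Count how many points each candidate threshold would remove."""
    return {t: sum(s > t for s in scores) for t in thresholds}


# Hypothetical data: scores are made up for illustration.
demo_points = [(0, 0, -50.0), (1, 0, -45.0), (5, 5, -50.0), (6, 5, -30.0)]
demo_scores = [0.1, 0.9, 0.2, 0.95]
cleaned = clean_region(demo_points, demo_scores, region=(0, 0, 2, 2), threshold=0.5)
preview = removed_per_threshold(demo_scores, [0.5, 0.92])
```

Different regions can be given different thresholds by calling `clean_region` once per region, without recomputing the scores.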
NaviModel3 S-Can Cleaning V S-CAN Score automatic cleaning Initial action (left) Cleaning (right)
NaviModel3 S-Can Cleaning VI
After cleaning with S-CAN Score automatic cleaning
Raw SBD file from Reson7125D (42 pipe)
After cleaning: 11.2 million points
NaviModel3 S-CAN Cleaning Performance
Plot legend:
- Blue line: high performance, 64 bit (Score)
- Green line: high performance, 32 bit (Score)
- Red line: medium/low performance, 64 bit (Score)
- Orange line: medium performance, 32 bit (Score)
- Brown line: high performance, 32 bit (Components)
NaviModel3 S-CAN Cleaning Performance II

System                      Break-point              Performance before the break-point   Performance after the break-point
64-bit W7 Laptop (8 GB)     34 (34) million points   6.3 (7.5) million/minute             3.1 (3.8) million/minute
64-bit XP Desktop (8 GB)    34 million points        1.8 million/minute                   0.5 million/minute
32-bit W7 Laptop (3 GB)     12 million points        3.2 million/minute                   0.6 million/minute
32-bit W7 Desktop (3 GB)    14 (14) million points   6.9 (9.4) million/minute             1.7 (2.5) million/minute

Break-point, 64 bit: 1200 m data (60 minutes of observations)
Break-point, 32 bit: 500 m data (25 minutes of observations)

- The break-point is a function of the RAM available to the algorithm: as long as all points to be cleaned can be held in RAM, the algorithm is 4-5 times more efficient than when the swap file has to be used to hold the points.
- For larger datasets it can be beneficial to divide the initial score determination into parts sized to fit the available RAM.
- When performance is of utmost importance, substantial improvements can be achieved by employing a 64-bit operating system with large amounts of RAM on a high-performance computer.
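The advice to divide the score determination into parts can be made concrete with a small calculation. The sketch below is a planning aid built on one assumption that the slides do not confirm: that throughput is constant at the "before" rate up to the break-point and constant at the "after" rate beyond it. Both function names are hypothetical.

```python
import math


def split_into_parts(total_points, break_point):
    """Split a dataset into the fewest near-equal parts that each fit in RAM.

    break_point is the largest number of points that can be processed
    without spilling to the swap file.
    """
    if total_points <= 0:
        return []
    n_parts = math.ceil(total_points / break_point)
    base, extra = divmod(total_points, n_parts)
    return [base + (1 if i < extra else 0) for i in range(n_parts)]


def estimated_minutes(total_points, break_point, rate_before, rate_after):
    """Estimated run time under the assumed piecewise-constant throughput.

    Rates are in points per minute. Points up to the break-point run at
    rate_before; the remainder runs at rate_after.
    """
    in_ram = min(total_points, break_point)
    spilled = max(0, total_points - break_point)
    return in_ram / rate_before + spilled / rate_after


# Figures for the 64-bit W7 laptop from the table; the 50 million point
# dataset size is a made-up example.
parts = split_into_parts(50_000_000, 34_000_000)
one_pass = estimated_minutes(50_000_000, 34_000_000, 6.3e6, 3.1e6)
in_parts = sum(estimated_minutes(p, 34_000_000, 6.3e6, 3.1e6) for p in parts)
```

Under this model, the two-part run stays entirely at the faster rate and therefore finishes sooner than a single pass that spills past the break-point. The estimate ignores any overhead of starting a second run.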
NaviModel3 Efficient Data Cleaning
Summary:
Speed optimization and automation:
- Cleaning functionalities: the automatic cleaning tools, the S-CAN variants, are important contributors to the speed increase, mainly because they require only moderate user involvement and are easy to use
Cleaning efficiency and optimization:
- S-CAN is capable of processing large datasets that exceed the capacity of internal memory. The constant movement of data to and from disk during cleaning does not appear to be a performance bottleneck
- Substantial improvements in S-CAN cleaning performance can be achieved by employing a 64-bit operating system with adequate amounts of RAM on a high-performance computer
Future developments