Implementation and Application of Cloud Computing Environmental Data Management for Los Alamos National Laboratory
Dr. Sean Sandborgh, Locus Technologies
Note: All data shown is publicly available at http://www.intellusnm.com
37-square-mile facility
2,000 feet of elevation change
14 major canyon systems
2,000 archaeological sites
Long legacy of environmental monitoring: 1942 to the present
Routine groundwater sampling since 1948
Earliest recorded sampling date: October 10, 1942
1,223 different analytical parameters (including organics, radionuclides, inorganics, and water quality parameters) are represented in the data
Over 27,000 sampling locations
Over 230,000 field samples collected
Over 10,000,000 analytical records
Over 48,000 field measurements
Over 80,000,000 water level and flow data points
Almost 60,000 laboratory chains of custody sent to labs
Plenty of high-quality environmental data; how can we use it better?
Complexity = Time = Money
[Legacy architecture: new data is replicated into a historical primary environmental database and a separate air environmental database, each with its own custom, parent-database-specific reports; a replicated public website database feeds the public interface (RACER)]
The need to maintain multiple different databases
Expensive, constant replication
The need to replicate reports, or to design report options that could use data from multiple databases
Offline reports requiring the input of subsets of data
No single individual or group understood all of LANL's data management strategy or structures
Substantial costs in third-party validation and internal data editing
Changes needed to be propagated throughout the system
Complicated data management process
Cradle-to-grave sample planning and tracking
Third-party validation of almost 100% of data
Dozens of customized regulatory reports, each created with particular custom tools and tethered to particular data format inputs
No one consistent data flow that applied to ALL environmental data
Databases included Access, SQL Server 2000, Oracle, and Excel spreadsheets
Substantial data standardization was required
Reports included those hardcoded against the above databases, as well as standalone Access reporting modules and heavily macro-driven Excel spreadsheets
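Standardizing across sources like these typically means crosswalking each legacy system's vocabulary onto a single set of valid values. A minimal sketch of that idea in Python; the source names, field names, and lookup values here are hypothetical illustrations, not LANL's or EIM's actual schema:

```python
# Hypothetical crosswalk from each legacy system's valid values
# to one standard parameter vocabulary.
CROSSWALK = {
    "access_legacy": {"PB": "Lead", "GW-LVL": "Water Level"},
    "oracle_legacy": {"LEAD": "Lead", "WLEVEL": "Water Level"},
}

def standardize(record: dict, source: str) -> dict:
    """Return a copy of the record with its parameter name mapped to the
    standard vocabulary; fail loudly on unmapped values so gaps in the
    crosswalk surface during migration instead of polluting the target."""
    mapping = CROSSWALK[source]
    value = record["parameter"]
    if value not in mapping:
        raise ValueError(f"{source}: unmapped parameter {value!r}")
    return {**record, "parameter": mapping[value]}

standardized = standardize({"location": "R-28", "parameter": "PB"}, "access_legacy")
```

Raising on unmapped values (rather than passing them through) reflects the deck's point that standardization has to happen before migration, not after.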
Data structure differences
Differing nomenclature: indices, report-specific fields, valid values, and lookup tables
How to develop a clear scope of work when different environmental groups all use similar features?
How to migrate data while still uploading new data and performing data edits?
How and when to beta test new reports/functionality and perform user acceptance testing?
Side-by-side data management comparisons/warm-up period?
Standardize, standardize, standardize!
Prioritize the nice-to-haves versus the mission-critical functionality
Budget significant time for implementation (be realistic!)
Try to develop as much of the specific scope for each individual contract item as possible before implementation
An as-you-go approach takes far longer and is vastly more complicated
[New architecture: new data flows into a single system that feeds generic custom reports (able to be used by all relevant data) and the public interface (Intellus)]
[Data flow: incoming records land in a holding table, where automated Level 2 validation and verification checks identify bad data and maintain consistency; incorrect data is held back for correction, while correct data passes to the central database system supporting analysis, reporting, visualization, and project management]
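The holding-table pattern above can be sketched in a few lines of Python. The specific checks below (missing location, negative result, missing units) are illustrative placeholders, not EIM's actual Level 2 rule set:

```python
# Sketch of a holding-table pass: run simple Level 2-style checks on
# incoming rows; rows that fail stay behind for correction, rows that
# pass move on to the central database.
def check_row(row: dict) -> list[str]:
    """Return a list of error messages for one row (empty = clean)."""
    errors = []
    if not row.get("location"):
        errors.append("missing location")
    if row.get("result") is not None and row["result"] < 0:
        errors.append("negative result")
    if not row.get("units"):
        errors.append("missing units")
    return errors

def process_holding_table(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split rows into (accepted, rejected); rejected rows carry their errors."""
    accepted, rejected = [], []
    for row in rows:
        errors = check_row(row)
        if errors:
            rejected.append({**row, "errors": errors})
        else:
            accepted.append(row)
    return accepted, rejected
```

Attaching the error list to each rejected row is what lets data owners correct records in place rather than re-submitting blind, which is the point of validating at the door instead of after the fact.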
Complicated data management process, then and now:
Then: Cradle-to-grave sample planning and tracking
Now: Cradle-to-grave planning, with better tracking of ALL field activities
Then: Third-party validation of almost 100% of data
Now: ALL analytical data is automatically Level 2 validated by EIM
Then: Dozens of customized regulatory reports, each created with particular custom tools and tethered to particular data format inputs
Now: Reports are no longer specific to particular platforms, and more features are available to all reports
Then: No one consistent data flow that applied to ALL environmental data
Now: One consistent flow for all data
Before:
Inconsistent data between programs
Expensive validation and synchronization between multiple databases
Reporting highly specialized and unable to be used by any other data
After:
Rigorous Level 2 automatic validation on data entry
Single data structure for all programs
Web-based, always-available data repository
Public website able to harness the analytical power of EIM
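The "single data structure for all programs" idea can be pictured as one record shape shared by soils, water, and air alike, with a program field replacing separate databases. The field names below are illustrative assumptions, not EIM's actual schema:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class AnalyticalRecord:
    # One shared record shape for every program; the program field
    # distinguishes soils/water/air instead of separate databases.
    program: str           # e.g. "water", "soils", "air"
    location_id: str
    sample_date: date
    parameter: str
    result: float
    units: str
    validation_level: int  # e.g. 2 after automated Level 2 checks

rec = AnalyticalRecord("water", "R-28", date(2012, 5, 1), "Lead", 0.8, "ug/L", 2)
```

Because every program writes the same shape, one report, one validation routine, or one public feed can serve all of them, which is what makes the "generic" reports in the new architecture possible.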
[Workflow: sample planning; lab costs and scheduling; field collection; sample fulfillment; sample send and receipt; lab invoice receipt and reconciliation; 3rd-party data validation; auto data validation; regulatory reporting; ad hoc reporting; automatic public data share]
Soils database
Water database
Air database
Storm water reporting
Waste analysis and reporting
Manual DMR reporting
Public data feeds
If you are interested in participating in a live pilot program, please contact us at:
info@locustec.com
+1 650 960 1640
Already migrated over 12,000,000 data records to the cloud.
No double input/output/QA/QC/EDD standard, etc.
The new way of doing business
Opens windows of opportunity to improve other processes
Significant cost reduction, ROI, ROO
Skip the messy installation and maintenance of applications
Access both application and data over the Web
No need for in-house programmers and expensive development
No need to pay consultants to develop and maintain applications
Uniform and standardized processes across the organization and consultants
Easy to switch service providers or bring on additional specialty consultants
One change benefits all users instantly
SaaS shifts the cost of upgrading to the next release from the customer to the vendor. This is huge: upgrades can run into the millions, and having the vendor bear that cost instead of the customer is a dramatic TCO improvement.
Significant cost reduction
Pay for what you use
Gain ownership of data
Convenience and ease of use
Rolling upgrade program
Crowdsourcing (wisdom of the crowd benefits all)
Common platform based on a single database
Replace outdated legacy applications, custom databases, and spreadsheets
Significant cost reduction
Operating process improvements