by Gerrit Muller Buskerud University College e-mail: gaudisite@gmail.com www.gaudisite.nl Abstract The amount of software in many systems increases exponentially. This increase impacts the reliability of these systems. In the source code of software many hidden faults are present. These hidden faults can transform into errors during the system life cycle, due to changes in the system itself or in the context of the system. We will discuss the current trends and potential directions for future solutions of an increasing reliability problem. more/longer testing Distribution This article or presentation is written as part of the Gaudí project. The Gaudí project philosophy is to improve by obtaining frequent feedback. Frequent feedback is pursued by an open creation process. This document is published as intermediate or nearly mature version to get feedback. Further distribution is allowed as long as the document remains complete and unchanged. status: draft within cost&time improve reliability without performance penalty with increasing requirements increasing #suppliers #sites, et cetera more procedures reduce functionality improve way-ofworking reduce code Ideals more standardization improve recovery Trader improve detection improve robustness more formality Tangram more automation Tangram early integration Tangram shared understanding Boderc more feedback Trader tolerant interfaces robust patterns Trader
Figure of Content TM Introduction + ESI + Speaker Exploration + trends + consequences for reliability + potential solutions conclusion Research Projects + Trader + Tangram + Ideals + Boderc 2 Gerrit Muller RSIScontent
Embedded Systems Institute (ESI) Mission To advance industrial innovation and academic excellence in embedded systems engineering (ESE). Vision ESI and its partners create and apply world-class ESE methods. television cardio X-ray system GSM waferstepper 7 Founders: Industry (Philips, ASML, Océ) Universities (Twente, Delft, Eindhoven) Knowledge Institutes (TNO) printer 3 Gerrit Muller RSISwhatIsESI
ESI Research Agenda Embedded Systems Engineering performance reliability evolvability in relation to all other system qualities: security, power, cost, size, et cetera 4 Gerrit Muller RSISresearchAgenda
Industry as Laboratory source of inspiration challenging problems application playground apply new engineering methods research improve hypothesis industry observe results evaluate 5 Gerrit Muller IALAindustryAsLaboratory
Who is Gerrit Muller? CT OSS NM X-ray MR EasyVision manager system engineering Gaudí Philips Medical ASML Philips NatLab ESI 98 99 2 time pressure industrial experience pragmatics sales research reflection cost constraints products evidence exposure lots of people education Buskerud 6 Gerrit Muller FIESAwhoIsGerrit
Gaudi Systems Architecting www.gaudisite.nl deze lezing: www.gaudisite.nl/reliabilityofsoftwareintensivesystemsslides. pdf 7 Gerrit Muller
Exploration of Reliability Introduction + ESI + Speaker Exploration + trends + consequences for reliability + potential solutions conclusion Research Projects + Trader + Tangram + Ideals + Boderc 8 Gerrit Muller RSIScontentExploration
Trends in Embedded Systems fast moving market complex value chains How to survive in innovative domains? fast moving technology increased integration 9 Gerrit Muller ECMAproblem
Increased Team Size required team size historic trendour challenge 995 2 25 2 year Gerrit Muller DYOFteamsize
Number of Faults Propertional With Code Size manyears and LOC (lines of code) per product Mloc Mloc kloc Based on average 3 errors/kloc k typical amount of errors per product 99 995 2 25 Gerrit Muller ASMLproblem
The Hard Reset Syndrome: Power Down Needed! Hard Reset Required: Cell Phone Television PC Beamer Car Coffee Machine DVD player Airbus Pilot announces a flight delay, due to computer problems. A complete reset is required. The flight entertainment system also show a reset: a complete Linux boot. This reboot hangs: server xxx not found 2 Gerrit Muller RSIShardReset
How to Make SW Intensive Systems Reliable more/longer testing within cost&time improve reliability without performance penalty with increasing requirements increasing #suppliers #sites, et cetera more procedures reduce functionality improve way-ofworking reduce code more standardization improve recovery improve detection improve robustness more formality more automation early integration shared understanding more feedback tolerant interfaces robust patterns 3 Gerrit Muller RSISsolutions
Reliability Research Introduction + ESI + Speaker Exploration + trends + consequences for reliability + potential solutions conclusion Research Projects + Trader + Tangram + Ideals + Boderc 4 Gerrit Muller RSIScontentResearch
Example Trader Project Overview Industrial Relevance Trader Objective: Research agenda: Domain: CIP: Partners: Timeline: Television Related Architecture and Design to Enhance Reliability Develop method & tools to ensure reliable consumer electronics products System Reliability Digital TV Philips Semiconductors Philips CE, Tass, and Research DTI, IMEC, TUD, TU/e, UL, UT, ESI 9/24 9/28, 22 Fte Poor reliability has severe business impact Customer expectation of TV reliability is high Little tolerance for technical problems % fault-free design is not achievable High volume market implies high risks if reliability problems occur Low product margin leaves no buffer for service costs Service center costs multiplied by number of complaints Market share reduction likely, i.e. customers buy another brand (On) Time to market is critical Missing fixed shipping gates costs millions of dollars Trader Domain Trends TV complexity increase follows the PC world from Display movies over antenna to Display anything over everything Broadcast ATSC, DVB, ISDB, Analog Terrestrial, Satellite, Cable Connectivity Cameras, JPEG, flash cards, HDD, MP3, Web browser, Ethernet, USB, etc User Perceived Reliability Objective Determine the user -perceived severity of a product failure mode Methodology Create a model considering relevant factors User -perceived loss -of-functionality User -perceived reproducibility Failure -frequency Work -around difficulty User -group characteristics Failure characteristics. Validate model Aspects (all depends on user) Function importance Frequency Reproducibility Solvability Loss of Function / time / Behavior Total: Evaluate and suggest system failure recovery strategies The recovery strategy may not annoy the user even more! 4 4 3 3 4 576 2 4 4 2 5 6 3 (JPEG) 5 5 5 5 5 Gerrit Muller RSIStrader
Increasing Code Size in Televisions 965 979 From: COPA tutorial, Rob van Ommering kb Moore's law 2 99 2 MB 64 kb 6 Gerrit Muller LWAmooresLawRvO
Research: System Awareness to Improve Reliability user input system system output correction awareness monitor customer expectations system failure model 7 Gerrit Muller RSIStraderAwareness
Research: Code Analysis to Improve Reliability Expose product weaknesses at design time Source code analysis Identify hotspots in code Consider impact to user-visible behavior: prioritize warnings Software architecture reliability analysis Techniques to identify failure-prone components Evaluation of architectural alternatives and trade-offs Koala component map 8 Gerrit Muller RSIStraderCodeAnalysis
Quality Degradation Caused by Shit Propagation needed code bad code copy paste modify needed code code not relevant for new function repair code bad code new needed code new bad code 9 Gerrit Muller BLOATshitPropagation
Example of Shit Propagation Class Old: capacity = startcapacity values = int(capacity) size = Class New: capacity = values = int(capacity) size = Class DoubleNew: capacity = values = int(capacity) size = def insert(val): values[size]=val size+= if size>capacity: capacity*=2 relocate(values, capacity) copy paste def insert(val): values[size]=val size+= capacity+= relocate(values, capacity) copy paste def insert(val): values[size]=val size+= capacity+= relocate(values, capacity) def insertblock(v,len): for i= to len: insert(v[i]) 2 Gerrit Muller BLOATshitPropagationExample
Example Tangram Project Integration & Test 2 Gerrit Muller RSIStangram
Moore s Law line width in nm 25 8 3 7 5 997 999 2 23 25 27 22 Gerrit Muller ASMLmooresLaw
Challenge: Exponential Increase innovation performance imaging overlay productivity complexity feedback and adjustments hw and sw components reliability robustness 23 Gerrit Muller ASMLproblemIRC
4 Views on a Waferstepper subsystems control hierarchy laser light source illuminator beam shaping system control coordination light ethernet reticle stage positioning reticle handler input/output reticles laser lens illuminator measurement C&T reticle stage reticle handler wafer stage wafer handler system control coordination measurement alignment, levelling lens projection C&T contanimation, temperature vertic al motio n VME horizontal motio n vertic al motio n VME horizontal motio n wafer stage positioning wafer handler input/output kinematic wafers physics/optics sensor v y reticle slit laser pulse-freq, bw, wavelength,.. illuminator uniformity reticle v x expose step expose t wafer 25 mm/s dynamic exposure through slit NA abberations transmission lens aerial image wafer 24 Gerrit Muller EBMIsystemDiagrams
Research: Test Strategy Test Test Test Test S \ T t t2 t3 Fault state Fault state 2 Fault state 3 Fault state 4 Fault state 5 C 3 Test t4 Test t5 2 2 P % % % % % Dynamic simulation of integration & test approach:. Create model (modules, interfaces, faults, tests) 2. Execute model at any time 3. Balancing based on time, cost and remaining risk. Balancing functionality, quality and time/cost (over 2 % reduction of integration & test time) 25 Gerrit Muller RSIStangramTestStrategy
Example Ideals Project code size reduce by refactoring cross cutting concerns 26 Gerrit Muller RSISideals
Example Boderc Project 3x5E 25 29 27 Gerrit Muller IALAdomain
Boderc Goal multidisciplinary Boderc goal = A specific methodology to predict system performance based on modeling and analyze, discuss, document, and communicate throughput, quality within industrial constraints and restricted design space people, process, project duration, and cost power computing response time 28 Gerrit Muller BS6bodercGoal
Shared Understanding by Modeling 2 3 4 5 6 7 number of details back-of-theenvelop formula based executable system multi-disciplinary mono-disciplinary 29 Gerrit Muller ESICpyramidModels
Many Models Needed to Understand System 6. Kinematic modeling GUI 7. Thermo modeling 8. Control architecture Life Animation vario 9. Virtual printer models. Stepper motors 2 3 4 5 6 7 number of details 7 6' 6 3 8 2 5 4 9 small, simple, goal-driven models shorter cycle time, less cycles shorter product creation lead time 3 Gerrit Muller BS6mdModels
Coverage of Reliability by ESI Projects more/longer testing within cost&time improve reliability more procedures reduce functionality improve way-ofworking more formality more automation early integration Tangram Tangram Tangram without performance penalty with increasing requirements increasing #suppliers #sites, et cetera reduce code more standardization improve recovery improve detection improve robustness Ideals Trader shared understanding more feedback tolerant interfaces robust patterns Boderc Trader Trader 3 Gerrit Muller RSIScoverage
Towards a Conclusion, Some more Trends Introduction + ESI + Speaker Exploration + trends + consequences for reliability + potential solutions conclusion Research Projects + Trader + Tangram + Ideals + Boderc 32 Gerrit Muller RSIScontentConclusion
Applications depend on chain of systems Home Server Network Providers Service Providers Content Providers users infotainment appliance watch video browse photo's calendar and much more... 33 Gerrit Muller CVCproductChain
Interoperability: systems get connected at all levels host acquisition reconstruction display MR scanner other examination rooms workstation storage RIS printer PC beamer radiology department workstation archive HIS LIS PC security admin other clinical departments hospital hospital or other clinical center clinicians at home clinicians away clinical experts suppliers patients away patients at home world 34 Gerrit Muller DYOFscopeOfInteroperability
Multi dimensional interoperability integrating multiple applications cilinical analysis clinical support administrative financial workflow based on multiple media, networks DVD+RW memory stick memory cards bluetooth a/b/g UTMS in multiple languages cultures USA, UK, China, India, Japan, Korea France, Germany Italy, Mexico and multiple standards Dicom HL7 XML delivered by multiple vendors Philips GE Siemens and multiple releases R5 R6.2 R7. 35 Gerrit Muller DYOFmultiInteroperability
Interopearbility Trends and Research Challenges trends (partial) solutions market dynamics interoperability reliability globalization hype waves Moore's law emerging behavior future vs legacy heterogeneous vendors dynamics (continually changing) complexity heterogeneity dynamics #engineers involved standards economical interests latency due to slow process most fundamental solution design patterns in system feedback human-in-the-loop feedback semantic understanding containment gateway... new research challenges! 36 Gerrit Muller RSISinteroperability
Conclusion manyears and LOC (lines of code) per product Mloc Mloc kloc less code less errors/kloc Based on average 3 errors/kloc k reduce impact of hidden faults on system reliability typical amount of errors per product 99 995 2 25 37 Gerrit Muller RSISconclusion