Multiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005



Philip J. Ramsey, Ph.D., Mia L. Stephens, MS, Marie Gaudard, Ph.D.
North Haven Group, http://www.northhavengroup.com/
pramsey@northhavengroup.com, 603-672-5651
mstephens@northhavengroup.com, 207-363-5739
mgaudard@northhavengroup.com, 352-560-0312
Copyright 2005, North Haven Group

Talk Outline
- Introduction
- Desirability Functions
- Multiple Response Optimization
- Desirability Optimization in JMP
- Optimizing an Anodizing Process
- Using Importance Values
- Constrained Mixture Example

Introduction
The challenge in multiple response optimization is to find settings for multiple input variables that achieve desirable performance levels for one or more responses. Designed experiments are often used to model each response as a function of the input variables. Each response of interest has its own predictive model. These predictive models then form the basis for optimization. Often, settings that optimize one response will degrade another response.

Introduction
Sometimes, predictive models for each response are derived from observational studies. The JMP software allows multiple optimization using models derived either from designed experiments or from observational studies. JMP's approach to optimization for multiple responses is based upon the concept of desirability.

Desirability Functions
Desirability appears to have been first proposed as a criterion for response optimization by Harrington (1965) and popularized by Derringer and Suich (1980). The first step in defining a desirability function is to assign values to each response that reflect its desirability. For the $i$th response, we define a function $d_i$ that assumes values between 0 and 1, where:
- 0 indicates a value of the response that is least desirable,
- 1 indicates a value that is most desirable, and
- a value between 0 and 1 indicates the desirability of the associated response.

Desirability Functions
If the objective is to maximize a response, the desirability function might have the following shape (the specific shape depends upon the response optimization goals):
[Figure: desirability (0 to 1) plotted against the response, from Min to Max.]

Desirability Functions
If the objective is to minimize the response, the desirability function may have this shape (again, the exact shape depends upon the response optimization goals):
[Figure: desirability (0 to 1) plotted against the response, from Min to Max.]

Desirability Functions
If the objective is to match a target, the desirability function might have the following shape:
[Figure: desirability (0 to 1) plotted against the response, with the Lower Bound, Target, and Upper Bound marked.]

Desirability Functions
Suppose one wants to maximize the response, where a specified lower bound exists and the specified desirability control points are not linear over the range of the response. A possible desirability function is shown below.
[Figure: desirability (axis marked 0, 0.6, 1) plotted against the response, with the Lowest Acceptable Level, the Most Desirable Region, and the Desired Maximum marked.]
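
As a rough illustration of control points that are not linear over the response range, the sketch below interpolates linearly between user-supplied (response, desirability) pairs. The response values 50, 70, and 100 are hypothetical; only the desirability levels 0, 0.6, and 1 come from the figure. JMP uses smooth piecewise functions, so this straight-line version is a simplification, not JMP's method.

```python
from bisect import bisect_right


def desirability_from_control_points(y, points):
    """Interpolate desirability linearly through (response, desirability) points.

    `points` must be sorted by response value. Outside the covered range the
    desirability of the nearest end point is returned.
    """
    xs = [p[0] for p in points]
    ds = [p[1] for p in points]
    if y <= xs[0]:
        return ds[0]
    if y >= xs[-1]:
        return ds[-1]
    i = bisect_right(xs, y)  # xs[i - 1] <= y < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    d0, d1 = ds[i - 1], ds[i]
    return d0 + (d1 - d0) * (y - x0) / (x1 - x0)


# Hypothetical control points for a "maximize" goal:
# lowest acceptable level 50 (d = 0), start of the most desirable
# region 70 (d = 0.6), desired maximum 100 (d = 1).
control_points = [(50.0, 0.0), (70.0, 0.6), (100.0, 1.0)]

for response in (40.0, 60.0, 85.0, 120.0):
    print(response, desirability_from_control_points(response, control_points))
```

Because desirability climbs to 0.6 over the first 20 response units and needs another 30 units to reach 1, the function rewards getting past the lowest acceptable level more strongly than further gains near the maximum.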

Desirability Functions
Derringer and Suich (1980) proposed forms for the desirability function, based on the particular desirability goal. These functions are not smooth, and the shapes they can take are limited. In contrast, JMP defines the desirability function based upon control points determined by the user, using piecewise smooth functions. These piecewise smooth functions allow greater flexibility in the shapes of the desirability functions and ensure good behavior of the desirability function over the three basic types of optimization (maximize, minimize, and match a target).
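
For comparison, here is a minimal sketch of power-law desirability functions of the kind usually attributed to Derringer and Suich, one for each of the three goals. The function names, argument names, and the numbers in the usage lines are illustrative assumptions; the exponents `r`, `s`, and `t` control curvature. The kinks at the bounds and at the target are what make these forms non-smooth.

```python
def d_maximize(y, low, high, r=1.0):
    """Larger-is-better: 0 at or below `low`, 1 at or above `high`."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** r


def d_minimize(y, low, high, r=1.0):
    """Smaller-is-better: 1 at or below `low`, 0 at or above `high`."""
    if y <= low:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - low)) ** r


def d_target(y, low, target, high, s=1.0, t=1.0):
    """Target-is-best: 1 at `target`, falling to 0 at `low` and `high`."""
    if y <= low or y >= high:
        return 0.0
    if y <= target:
        return ((y - low) / (target - low)) ** s
    return ((high - y) / (high - target)) ** t


# Illustrative values only.
print(d_maximize(5.0, 0.0, 10.0))        # halfway up a linear ramp
print(d_minimize(2.5, 0.0, 10.0))        # close to the preferred low end
print(d_target(15.0, 0.0, 10.0, 20.0))   # halfway between target and upper bound
```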

Multiple Response Optimization
A typical desirability objective function for multiple optimization is based on the geometric mean of the transformed responses $d_i$ (or, equivalently, the average of the natural logs of the desirabilities). For $k$ responses:

$$D^* = \sqrt[k]{d_1 d_2 \cdots d_k},$$

or equivalently,

$$D = \ln(D^*) = \frac{1}{k}\left[\ln(d_1) + \ln(d_2) + \cdots + \ln(d_k)\right].$$

Notice that this form of the objective function treats all $k$ responses with equal weight or importance.
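
A small sketch of the equal-weight objective, computed through the log form. The function name and the sample desirabilities are illustrative assumptions. Treating any zero desirability as an overall value of zero follows directly from the geometric mean.

```python
import math


def overall_desirability(ds):
    """Geometric mean of individual desirabilities, computed via logs."""
    if not ds:
        raise ValueError("at least one desirability is required")
    if any(d <= 0.0 for d in ds):
        # One completely undesirable response drives the product to zero.
        return 0.0
    mean_log = sum(math.log(d) for d in ds) / len(ds)  # this is D
    return math.exp(mean_log)                          # this is D*


# Illustrative desirabilities for three responses.
ds = [0.9, 0.8, 0.5]
direct = (ds[0] * ds[1] * ds[2]) ** (1.0 / 3.0)
print(overall_desirability(ds), direct)  # the two forms agree
```

The geometric mean penalizes one poor response more heavily than an arithmetic mean would, which is why it is a common choice when every response must be at least acceptable.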

Multiple Response Optimization
We define the weight or importance level of the $i$th response as $w_i$. We impose the following constraints on the weights:

$$0 \le w_i \le 1 \quad \text{for } i = 1, 2, \ldots, k, \qquad \sum_{i=1}^{k} w_i = 1.0.$$

The objective functions, incorporating differential weighting, have the form

$$D = w_1 \ln(d_1) + w_2 \ln(d_2) + \cdots + w_k \ln(d_k) \quad \text{or} \quad D^* = \exp[D].$$
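
The weighted objective can be sketched the same way. The function name and the example values are assumptions for illustration; the checks simply enforce the two constraints on the weights stated above.

```python
import math


def weighted_desirability(ds, weights, tol=1e-9):
    """Return D* = exp(sum_i w_i * ln(d_i)) for weights that sum to 1."""
    if len(ds) != len(weights):
        raise ValueError("ds and weights must have the same length")
    if any(w < 0.0 or w > 1.0 for w in weights):
        raise ValueError("each weight must lie between 0 and 1")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    if any(d <= 0.0 and w > 0.0 for d, w in zip(ds, weights)):
        return 0.0
    # Responses with zero weight do not contribute to D.
    log_d = sum(w * math.log(d) for d, w in zip(ds, weights) if w > 0.0)
    return math.exp(log_d)


# Illustrative: one response does well (0.9), the other poorly (0.4).
ds = [0.9, 0.4]
print(weighted_desirability(ds, [0.5, 0.5]))  # equal importance
print(weighted_desirability(ds, [0.8, 0.2]))  # more weight on the better response
```

With equal weights this reduces to the geometric mean; shifting weight toward a response makes the overall value track that response's desirability more closely.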

Desirability Optimization in JMP
The JMP Prediction Profiler is a powerful tool for finding optimum settings for one or more responses of interest. The optimum settings may be minima, maxima, target values, or a combination of these. The user may also specify upper or lower bounds on the responses. The Prediction Profiler may be accessed through the Fit Model platform or directly from the Graph menu. To access the Prediction Profiler directly from the Graph menu, the user must first save the prediction formulas for the responses to the spreadsheet from within the Fit Model platform.

Desirability Optimization in JMP
The Prediction Profiler allows simultaneous optimization on multiple responses, employing a different model for each of the responses. The ability to optimize multiple responses based on a different model for each response is a powerful capability of the JMP software. To perform multiple optimization with the Prediction Profiler:
- Use Fit Model to fit a best model for each of the responses,
- Save each of the Prediction Formulas to the data table, and
- Open the Prediction Profiler from the Graph menu and enter the prediction formulas as the responses to be optimized.

Desirability Optimization in JMP
The models used to perform the optimization can be estimated from observational data or from a designed experiment. The Fit Model platform of JMP, where the predictive models are estimated, does not place any requirements on the source of the data. Therefore, one can develop the predictive models from observational data and then use the Prediction Profiler in JMP to perform the optimization on the responses. However, keep in mind that experimental design data is always preferable to observational data for deriving cause and effect models.

Optimizing an Anodizing Process
A Six Sigma project team is attempting to optimize an anodizing process (oxide surface coating) for an aluminum substrate. For aesthetics, the customer requires that the anodized surfaces be black in color. The process has two stages: anodize (A) and dye (D). Five factors are selected for an experiment:
- Bath Temp (A),
- Anodizing Time (A),
- Acid Concentration (A),
- Dye tank concentration (D), and
- Dye tank pH (D).

Optimizing an Anodizing Process
Due to production requirements, only enough anodizing equipment time is available to perform eight to ten runs. The team elects to perform a resolution III, $2^{5-2}$ fractional factorial with two center points (10 runs). Some of the potential two-way interactions were discounted for technical reasons, reducing the amount of aliasing. The four primary responses are:
- Anodize Thickness,
- L* (lightness of the color),
- a* (redness/greenness of the color), and
- b* (yellowness/blueness of the color).
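
To make the run count concrete, the sketch below builds a coded $2^{5-2}$ design with two center points. The generators used here (x4 = x1·x2 and x5 = x1·x3) are an assumption; the talk does not say which generators the team chose, and the column names x1 to x5 are placeholders for the five factors.

```python
from itertools import product


def fractional_factorial_2_5_2(n_center=2):
    """Return coded runs (x1..x5) for a 2^(5-2) design plus center points.

    Assumed generators: x4 = x1*x2 and x5 = x1*x3. These alias the main
    effects of x4 and x5 with two-way interactions, hence resolution III.
    """
    runs = []
    for x1, x2, x3 in product((-1, 1), repeat=3):  # full factorial in 3 factors
        runs.append((x1, x2, x3, x1 * x2, x1 * x3))
    runs.extend([(0, 0, 0, 0, 0)] * n_center)      # center points
    return runs


design = fractional_factorial_2_5_2()
for run in design:
    print(run)
print(len(design), "runs")
```

Eight factorial runs plus two center points give the 10 runs mentioned above; the center points are what make a lack-of-fit check possible.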

Optimizing an Anodizing Process
To meet customer requirements, the following specifications are set by the engineers (the color parameter targets and ranges were empirically determined from production data):
- Anodize Thickness: 0.9 ± 0.2 microns.
- L*: 10 ± 2.
- a*: 2 ± 2.
- b*: 0 ± 2.
We will use the Prediction Profiler to find anodizing process conditions that simultaneously achieve the four response targets (or at least stay within the specification ranges).

Optimizing an Anodizing Process
Using Fit Model, a separate model was fit to each of the four responses and the Prediction Formulas were saved to the spreadsheet. We recommend saving the Fit Model script to the data table for each model. Recall that, to save the Prediction Formula to the spreadsheet:
- Click on the red diamond at the top of the Fit Model analysis output,
- Select the menu option Save Columns, and
- From Save Columns, select Prediction Formula.
The next slide shows the Fit Model output for the best model for Thickness. Each of the four responses had a separate model.

Optimizing an Anodizing Process
For Anodize Thickness, only factors for the anodize stage could have an influence. This allowed the team to estimate a model with no aliasing. The Fit Model report for the predictive model follows. Notice that the lack of fit test (based on center points) is not significant.

Summary of Fit
RSquare                        0.993264
RSquare Adj                    0.984844
Root Mean Square Error         0.034912
Mean of Response               0.73815
Observations (or Sum Wgts)     10

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio     Prob > F
Model       5     0.71889166        0.143778       117.9630    0.0002*
Error       4     0.00487537        0.001219
C. Total    9     0.72376702

Lack Of Fit
Source         DF    Sum of Squares    Mean Square    F Ratio    Prob > F
Lack Of Fit    3     0.00372337        0.001241       1.0774     0.5936
Pure Error     1     0.00115200        0.001152
Total Error    4     0.00487537
Max RSq        0.9984

Parameter Estimates
Term                                     Estimate     Std Error    t Ratio    Prob>|t|
Intercept                                -1.16418     0.150969     -7.71      0.0015*
Anodize Temp                             0.0164458    0.000823     19.99      <.0001*
Anodize Time                             0.0098187    0.001234     7.95       0.0014*
Acid Conc                                0.0019964    0.000705     2.83       0.0473*
(Anodize Temp-75)*(Acid Conc-187.5)      -0.000364    0.000047     -7.74      0.0015*
(Anodize Time-30)*(Acid Conc-187.5)      0.0005425    7.053e-5     7.69       0.0015*
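
The parameter estimates in this report define the saved prediction formula for thickness. As a sketch, the formula can be written out directly; the function and argument names are illustrative, while the coefficients and the centering constants (75, 30, 187.5) are taken from the report. The coefficients are rounded as printed, so results agree with JMP only to about four decimal places.

```python
def predict_thickness(anodize_temp, anodize_time, acid_conc):
    """Predicted anodize thickness from the reported parameter estimates."""
    return (
        -1.16418
        + 0.0164458 * anodize_temp
        + 0.0098187 * anodize_time
        + 0.0019964 * acid_conc
        - 0.000364 * (anodize_temp - 75.0) * (acid_conc - 187.5)
        + 0.0005425 * (anodize_time - 30.0) * (acid_conc - 187.5)
    )


# Factor settings that the Profiler reports as optimal later in the talk.
print(predict_thickness(66.1975, 40.0, 205.0))
```

Evaluated at Anodize Temp 66.1975, Anodize Time 40, and Acid Conc 205, the formula gives approximately 0.8775, which agrees with the Pred Thickness value of 0.877527 reported by the Profiler.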

Optimizing an Anodizing Process
The four columns furthest to the right in the data table contain the Prediction Formulas saved in the Fit Model platform. In the table to the left are the saved scripts for the models. These Prediction Formula columns can now be used by the Prediction Profiler to perform multiple optimization.

Optimizing an Anodizing Process
Select Profiler from the Graph menu. The Prediction Profiler launch window will allow us to simultaneously optimize the four predicted responses for this experiment.

Optimizing an Anodizing Process
Once in the Profiler report window, one can select the menu items Maximization Options and Maximize Desirability. These set up the multiple optimization of the four responses, based upon the saved prediction formulas.

Optimizing an Anodizing Process
The next slide depicts the results of the multiple optimization using the Prediction Profiler. A calculated desirability of 1.0 would indicate that all response goals were simultaneously achieved. The output indicates that the overall desirability was 0.82, so the targets were not all hit exactly. However, all predicted response levels are well within the specification ranges. The settings for the five factors that achieve the most desirable response values are depicted at the bottom of the output. Note that all four responses are very sensitive to Anodize Temp, based upon the desirability trace in the last row of the output.

Optimizing an Anodizing Process
[Figure: Prediction Profiler output. Rows: Pred Thickness, Pred L*, Pred a*, Pred b*, Desirability. Columns: Anodize Temp, Anodize Time, Acid Conc, Dye pH, Dye Conc, Desirability.]
Predicted responses at the optimum: Pred Thickness 0.877527, Pred L* 9.499911, Pred a* 2.407315, Pred b* -0.57861.
Overall desirability: 0.818468.
Factor settings: Anodize Temp 66.1975, Anodize Time 40, Acid Conc 205, Dye pH 5.323956, Dye Conc 11.65212.
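
As a quick check of the statement that the predicted responses fall within specification, the sketch below compares the Profiler's predicted values with the target ± tolerance specifications set by the engineers. The dictionary layout and function name are illustrative assumptions; the numbers are those reported in the talk.

```python
# (target, half-width) for each response, from the engineering specifications.
specs = {
    "Thickness": (0.9, 0.2),
    "L*": (10.0, 2.0),
    "a*": (2.0, 2.0),
    "b*": (0.0, 2.0),
}

# Predicted responses at the optimized settings, from the Profiler output.
predicted = {
    "Thickness": 0.877527,
    "L*": 9.499911,
    "a*": 2.407315,
    "b*": -0.57861,
}


def tolerance_used(value, target, half_width):
    """Fraction of the allowed deviation consumed (1.0 means at the spec limit)."""
    return abs(value - target) / half_width


usage = {name: tolerance_used(predicted[name], *specs[name]) for name in specs}
for name, fraction in usage.items():
    status = "within spec" if fraction <= 1.0 else "OUT OF SPEC"
    print(f"{name}: {fraction:.0%} of tolerance used ({status})")
```

Every response uses well under half of its allowed deviation from target, which is consistent with an overall desirability below 1.0 but comfortably inside the specification ranges.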

Optimizing an Anodizing Process
Anodizing Experiment Issues
In production, not all of the five factors could be tightly controlled. In fact, from standard deviation estimates, the team concluded that only Anodize Time could be precisely controlled. In the Column Info window for each of the process factors, if the column property Sigma is specified, then JMP will provide propagation of error (POE) bars for the factor in the Profiler. The next slide shows the optimized process settings in the Profiler with the POE bars. The bars represent a ±3σ window of variation in the response caused by the factor variation.

Optimizing an Anodizing Process
[Figure: Prediction Profiler output with POE bars, at the same optimized settings. Rows: Pred Thickness, Pred L*, Pred a*, Pred b*, Desirability. Columns: Anodize Temp, Anodize Time, Acid Conc, Dye pH, Dye Conc, Desirability.]
Predicted responses: Pred Thickness 0.877527, Pred L* 9.499911, Pred a* 2.407315, Pred b* -0.57861.
Overall desirability: 0.818468.
Factor settings: Anodize Temp 66.1975, Anodize Time 40, Acid Conc 205, Dye pH 5.323956, Dye Conc 11.65212.

Optimizing an Anodizing Process
The Profiler contains a Simulator function that can be used to illustrate the POE in the responses due to the factors. The settings for the factors used in the simulation are:

Simulator Factors
Factor          Distribution     Mean         Std Dev
Anodize Temp    Random Normal    66.197497    3
Anodize Time    Fixed            40
Acid Conc       Random Normal    205          1.625
Dye pH          Random Normal    5.3239561    0.1
Dye Conc        Random Normal    11.652119    0.323

Responses: Pred Thickness, Pred L*, Pred a*, Pred b* (all set to No Noise)
N Runs: 1000
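
The same kind of simulation can be sketched outside JMP for the thickness response, using the parameter estimates from the thickness model reported earlier. This is an assumption-laden illustration, not a reproduction of JMP's Simulator: the pairing of standard deviations with factors follows the order in which they appear in the Simulator settings, the factors are drawn independently, the seed is arbitrary, and only thickness is simulated because the talk does not list coefficients for the color models.

```python
import random
import statistics


def predict_thickness(anodize_temp, anodize_time, acid_conc):
    """Predicted thickness from the reported (rounded) parameter estimates."""
    return (
        -1.16418
        + 0.0164458 * anodize_temp
        + 0.0098187 * anodize_time
        + 0.0019964 * acid_conc
        - 0.000364 * (anodize_temp - 75.0) * (acid_conc - 187.5)
        + 0.0005425 * (anodize_time - 30.0) * (acid_conc - 187.5)
    )


def simulate_thickness(n_runs=1000, seed=1):
    """Propagate factor variation through the thickness model by Monte Carlo."""
    rng = random.Random(seed)  # arbitrary seed, for repeatability only
    results = []
    for _ in range(n_runs):
        temp = rng.gauss(66.197497, 3.0)   # Anodize Temp: Random Normal
        acid = rng.gauss(205.0, 1.625)     # Acid Conc: Random Normal
        time = 40.0                        # Anodize Time: Fixed
        results.append(predict_thickness(temp, time, acid))
    return results


sims = simulate_thickness()
print("mean   ", statistics.mean(sims))
print("std dev", statistics.stdev(sims))
```

Dye pH and Dye Conc are omitted because they do not appear in the thickness model. Under these assumptions the simulated standard deviation should come out in the neighborhood of 0.035, the same order as the value JMP reports for Pred Thickness, though individual runs will differ with the random draws.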

Optimizing an Anodizing Process
These simulation results for the responses quantify the anticipated variation due to lack of control of the process factors.
[Figure: Distributions report with histograms of the simulated Pred Thickness, Pred L*, Pred a*, and Pred b*.]

Moments
                  Pred Thickness    Pred L*      Pred a*      Pred b*
Mean              0.8808721         9.4268638    2.3708303    -0.552679
Std Dev           0.0351241         0.9351656    0.4712389    0.3905999
Std Err Mean      0.0011107         0.0295725    0.0149019    0.0123519
upper 95% Mean    0.8830517         9.4848952    2.4000728    -0.528441
lower 95% Mean    0.8786925         9.3688324    2.3415877    -0.576918
N                 1000              1000         1000         1000

Optimizing an Anodizing Process
Further Issues
Since the experiment was a resolution III design, main effects were aliased with two-way interactions. Furthermore, since all five factors were significant for one or more of the responses, it was not possible to resolve the aliases in most cases. No production equipment time was available to perform additional experiments to resolve which interactions might be active. The new process settings recommended by the Prediction Profiler optimization were far different from the current settings.

Optimizing an Anodizing Process
The experimental results suggested that the new conditions would substantially increase yields and produce higher quality coatings. Equipment time was provided to perform two confirming runs at the new process conditions. The two confirming runs achieved 100% yields with very high quality anodize coatings on all parts. The current process had yields in the range of 40% with marginal quality coatings. Although aliases were unresolved, engineers decided to perform no further experimentation, since the suggested factor settings worked extraordinarily well in practice. This is not ideal!

Using Importance Values
Often, multiple responses do not have the same level of importance to experimenters. JMP allows the user to specify weights or importance values for each of the responses. The default is to weight all responses equally. Recall that each weight should be between 0 and 1 and the sum of the weights across the responses should equal 1. The response goals and importance values can be set in the JMP Profiler. They can also be set in the Column Info window as Column Properties (recommended).

Using Importance Values
We revisit the Anodize experiment, but apply different importance values for each response. Note that the previous optimization weighted the four responses equally. The importance values for the four responses are:
- Anodize Thickness: 0.15
- L*: 0.35
- a*: 0.20
- b*: 0.30
The next slide depicts the optimization with the new importance values.

Using Importance Values
[Figure: Prediction Profiler output with the new importance values. Rows: Pred Thickness, Pred L*, Pred a*, Pred b*, Desirability. Columns: Anodize Temp, Anodize Time, Acid Conc, Dye pH, Dye Conc, Desirability.]
Predicted responses at the optimum: Pred Thickness 0.85994, Pred L* 9.617948, Pred a* 2.544555, Pred b* -0.51571.
Overall desirability: 0.807144.
Factor settings: Anodize Temp 64.4519, Anodize Time 40, Acid Conc 205, Dye pH 5.285685, Dye Conc 11.31579.

Constrained Mixture Example
The next example illustrates the use of the Profiler Desirability function to optimize a mixture subject to upper and lower bound constraints on each component. In this example, the data were collected by observation of a process to produce a type of composite material. There were nine mixture components and each had upper and lower bound constraints. There were three responses: Y1 and Y3 were to be maximized, while Y2 was to be minimized.

Constrained Mixture Example
Each of the three responses had a different statistical model. The Profiler analysis shows a most desirable mixture.
[Figure: Prediction Profiler output. Rows: Pred Y1, Pred Y2, Pred Y3, Desirability. Columns: the nine mixture components and Desirability.]
Predicted responses at the optimum: Pred Y1 97.1193, Pred Y2 7.188518, Pred Y3 262.5054.
Overall desirability: 0.964221.
Component settings: A 0.069425, B 0.125, C 0.024, E 0.557834, F 0.00034, G 0.00013, H 0.1675, J 0.00577, D 0.05.

Constrained Mixture Example
Because the data were obtained by observation, as opposed to from a formally designed mixture experiment, there was considerable uncertainty as to how the predicted optimum mixture would actually perform. Scientists consider responses Y1 and Y3 to be inversely related, so considerable skepticism existed as to the validity of the suggested optimum composition. Having no better solution to the problem, several trial batches were created. The responses in all three cases actually exceeded the predicted optimum values from the Profiler.

Constrained Mixture Example
As a result, a multimillion dollar contract was saved and millions in new business were obtained for the improved, unique material.

Summary
Desirability is a popular and proven technique to simultaneously determine optimum settings of input factors that achieve optimum performance levels for one or more responses. The JMP statistical software implements desirability optimization through the Fit Model platform and the Profiler. JMP allows the user to perform the desirability optimization with a different statistical model for each of the responses. The responses can be differentially weighted in terms of importance. Two multiple response case studies were presented where the optimum input factor settings suggested by the Profiler were confirmed, with substantial financial benefits to the businesses.