0263–8762/02/$23.50+0.00

# Institution of Chemical Engineers

www.catchword.com=titles=02638762.htm Trans IChemE, Vol 80, Part A, October 2002

OPTIMIZATION AND VALIDATION OF STEADY-STATE

FLOWSHEET SIMULATION METAMODELS

K. PALMER

1

and M. REALFF

2

1

Department of Industrial and Systems Engineering, University of Southern California, Los Angeles, USA

2

School of Chemical Engineering, Georgia Institute of Technology, Atlanta, USA

T

he use of metamodels all ows the optimization of steady-state owsheet simulations to

proceed while requiring only a small number of solutions to be obtained from the

simulation. This paper presents the techniques needed to optimize the metamodel forms

described by Palmer and Realff. Through a case study, we exemplify characteristics of the

solution process and demonstrate the strength of the metamodeling approach. The operating

cost of an ammonia sy nthesis plant is optimized over six input variables using a total of only

32 solutions of the plant simulation as data for the metamodels.

Keywords: kriging; polynomial models; simulated annealing; canonical analysis; ridge

analysis.

INTRODUCTION

The ef ciency of steady-state owsheet simulations has

made it possible to perform process optimization studies

that historically had been impractical. However, the time

required to obtain a single solution of the owsheet is not

always trivial, and in these cases, the number of simulation

solutions available for an optimization study is severely

limited. Consequently, it becomes advantageous to consider

how to optimize a owsheet while using only a small

number of simulation runs.

In answer to this problem, we suggest use of the meta-

modeling approach proposed by Welch and Sacks

1

. They

describe the creation of approximating functions based upon

data gen erated from an electronic circuit simulation. The

metamodels are then used as a surrogate for the simulation

in optimization studies. Welch an d Sacks give the following

general procedure, which accounts for the possibility of

several output variables from the simulation contributing to

one or more objective functions:

(1) Postulate a model for each output

y

k

as a function of all

inputs x.

(2) Plan an initial experiment of a set of

N

x vectors, and

run the simulation at these input vectors to generate the

quality characteristic outputs.

(3) Use the data to t the metamodels (estimate model

parameters), and build approximating functions

^yy

k

…

x

†

.

These will be used to predict the outputs at new x

vectors. Before proceeding fur ther, the adequacy of the

predictors should be assessed and, if necessary,

improved.

(4) Plot the predicted surfaces. Particu larly when there are

con icting objectives, visualization of the input–output

relationships can help to establish trade-offs. Even if

model assessment indicated that the predictors suffer

large approximation errors, the plots may suggest promis-

ing subregions of the x space for the next experiment.

(5) Use the predictors to estimate the objective function(s),

and perfo rm a tentative optimization.

(6) If necessary, reduce the size of the input space. If Step 3

indicated that a predictor is not accurate enough yet,

reduce the size of the x region to a promising subregion

based on the plots in Step 4 and the tentative optimiza-

tion in Step 5, and return to Step 2 to collect new data

on the subregion. Otherwise continue to Step 7.

(7) Conduct a con rmation experiment to evaluate the

tentative optimum found in Step 5. If the con rmation

experiment demonstrates that the tentative optimum is

unsatisfactory, then we have to return to an earlier step.

Chemical engineers have used approximation modeling

techniques for other purposes in the past; see Palmer

and Realff

2

for a list of examples. However, we believe

that th is is the rst reported use of approximating models as

metamodels for a chemical engineering process owsheet

simulation.

The rst paper in this series, Palmer and Realff

2

, deals

with implementation issues surrounding the data collection

(Step 2) and metamodel building (Step 3) portions of the

procedure given above. With regard to the data collection

step, we emphasize the desirability of unique sample values

at each data collection point, broad dispersion of the data

collection points within the input variable space, and

minimal contribution to met amodel prediction error. We

introduce the Minimum Bias Latin Hypercube Design

(MBLHD) of Palmer and Tsui

3

as an example of a plan

that has these qualities.

773

For the model building step, we emphasize the need to

estimate model parameters from small data sets and the

desire to create metamodels that faithfully reproduce the

input–output characteristics of the simulation. We focus our

attention on two model forms: polynomial and kriging. Both

of these have been used successfully with small data sets in

other applications. Kriging

4

is a modeling technique from

geostatistics that predicts output values by exploiting the

similarity of the corresponding locations in the input spa ce.

Input points that are ‘near’ to each other are assumed to have

similar output values. An additional issue that we raise is the

assertion that improvement in a pproximation error can

be obtained by modeling basic outputs from the simulation

rather than summary measures of process performance. We

investigate the assertion for a speci c example process.

We introduce the ammonia synthesis plant described by

Morari and Grossmann

5

as an example process for investi-

gation. The intent of the optimization problem is to mini-

mize the operating cost of the plant. The op erating cost is

determined by 18 simulation output variables that represent

material and energy usage, or recovery, at various points in

the process. To simplify the task of generating separate

metamodels for the basic simulation outputs, we select a

subset of eight output variables that represent the

most important operating cost components. The eight key

outputs are:

°

hydrogen feed stream H

2

owrate;

°

rst compressor (RCM1) electricity u sage;

°

second compressor (RCM2) electricity usage;

°

reactants preheater (CHX1) steam usage;

°

distillation column (D1) reboiler steam usage;

°

D1 condenser refrigeration usage;

°

waste gas stream H

2

owrate;

°

waste gas stream CH

4

owrate.

The components of the operating cost corresponding to th ese

outputs are totaled to form an objective function, which is

used in lieu of the operating cost during optimizations.

Following the initial data collection, polynomial meta-

models are generated both directly for the o bjective function

value and separately for each of the eight key outputs. The

separate output models are then combined, using the appro-

priate cost factors, to create a second objective function

model. Predictions from the two models are found to be

nearly identical. This is taken as an indication that the

assertion about basic outputs does not hold in the case of

polynomial metamodels and linear objective functions.

Kriging metamodels are also generated from the initial

ammonia plant data, both directly for the objective function

and separat ely for the key outputs. Predictions of the

objective function from the direct model a nd the eight

outputs model are observed to d iffer substantially. It is not

practical to determine which model is more a ccurate by

comparing predictions to actual simulation results over a

large set of test conditions, because o f the lengthy execution

time of the simulation. An alternative approach is to rst

obtain a solution to the operating cost problem using the

metamodel. Then, the approximation error of the metamodel

prediction at the input variable solution point is evaluated

using a single simulation run. At the end of the work by

Palmer and Realff

2

, we leave the optimization and valida-

tion of the kriging models to a future publication.

In this article we present the techniques for optimization

of both the polynomial and kriging metamodels. We also

validate our solutions by evaluating the metamodel approxi-

mation error at the input variable solution points. We follow

the ammonia plant problem through three stages of data

collection, modeling, and tentative optimization to exem-

plify implementation characteristics of Steps 4 to 7 of the

procedure given by Welch and Sacks. The completed

ammonia plant optimization also demonstrates the strength

of the metamode ling approach, as the problem is solved

using a total of only 32 simulation runs to provide data to

generate the metamodels.

The second section presents the methods used to optimize

the polynomial and kriging models. The third section

presents the co mpletion of the ammonia plant example.

Finally, the fourth section contains our concluding remarks

and comments regarding future work.

OPTIMIZATION TECHNIQUES

As discussed earlier, there are several possible choices of

approximating model forms to use for the metamodel. In

this work, we used polynomial and kriging models because

of their ability to be generated from small data sets. The

appropriate optimization technique differs for each type o f

model.

Polynomial Metamodels

The polynomial models used were reduced versions of

second-order functions. The complete second-order poly-

nomial model is written as

^

yy…

x

† ˆ

^

bb

0

‡

X

m

iˆ

1

^

bb

i

x

i

‡

X

m

iˆ

1

^

bb

ii

x

2

i

‡

X

m¡

1

iˆ

1

X

m

jˆi‡

1

^

bb

ij

x

i

x

j

This model form can also be represented in matrix notation

^

yy…

x

† ˆ

^

bb

0

‡

x

0

^

bb

‡

x

0

^

BBx

…

1

†

where

^yy…

x

†

is the output value predicted by the metamodel,

x

ˆ ‰x

1

;

x

2

;

. . .

;

x

m

Š

0

is a column vector of the process

design variables,

^

bb

ˆ ‰

^

bb

1

;

^

bb

2

;

. . .

;

^

bb

m

Š

0

is a column

vector of the linear model parameters, and

^

BB is an

m £ m

symmetric matrix of the second-order model parameters

^

BB

ˆ

^

bb

11

^

bb

12

=

2

¢ ¢ ¢

^

bb

1

m

=

2

^

bb

12

=

2

^

bb

22

¢ ¢ ¢

^

bb

2

m

=

2

.

.

.

.

.

.

.

.

.

^

bb

1

m

=

2

¢ ¢ ¢

^

bb

mm

2

6

6

6

4

3

7

7

7

5

This model form has a rich history in approximation

modeling

6,7

. Below, we summarize the optimization techni-

ques described in these texts.

The optimization of the model given in equation (1) is

accomplished in two steps. The rst step is to perform a

canonical analysis. The canonical analysis begins by calcu-

lating the location of the stationary point

x

s

ˆ ¡

1

2

^

BB

¡

1

^

bb

The nature of the stationary point is then determined by

calculating the eigenvalues of the

^

BB matrix. The eigenvalues

can be read ily obtained using software su ch as Minitab

Trans IChemE, Vol 80, Pa rt A, October 2002

774 PALMER and REALFF

or Matlab. The signs of the eigenvalues indicate the nature

of the stationary point. There are three possibilities:

(1) If all of the eigenvalues are negative, the metamodel

describes the output variable surface as a hill with a

single maximum. The stationary point represents th e

process design values that produce the maximum

predicted output value.

(2) If all of the eigenvalues are positive, the metamodel

describes the output variable surface as a well wit h a

single minimum. The stationary point represents the

process design values that produce the minimum

predicted output value.

(3) If the eigenvalues are of mixed sign, the metamodel

describes the output variable surface as a saddle system

(Figure 1). The stationary point represents the centre of

the saddle’s seat.

In addition to the eigenvalues, the eigenvectors of

^

BB can

also be calculated. This information allows us simply to

describe the output variable surface. The metamodel can

now be written in its canonical form

^yy ˆ ^yy

s

‡

X

m

iˆ

1

l

i

w

2

i

where

^yy

s

is the predicted output value at the stationary point,

l

i

are the eigenvalues, and

w

i

are the canonical variables.

The canonical form is a result of a rede ned coordinate

system in which the axes have been translated, so that the

origin is placed at the stationary point, and rotated, so that

the

w

-axes are the principal axes of the output variable

contour system. The eigenvectors p

i

give the directions of

the new axes. The coordinates

w

i

of the former x point in the

translated and rotated system are

w

i

ˆ

p

0

i

x

1

¡ x

1;

s

x

2

¡ x

2;

s

.

.

.

x

m

¡ x

m

;

s

2

6

6

6

4

3

7

7

7

5

where p

i

is the normalized eigenvector associated with the

i

th eigenvalue. The magnitude of the eigenvalue indicates

the rate that th e output variable changes as one moves away

from the stationary point, along the associated axis.

The results of the canonical analysis determine the need

to perform the second step of the optimization. If the

stationary point is o f the desired nature and is located

within the region from which data was collected, then

there is no need to continue. If the stationary point is not

of the desired nature or it is located outside the explored

region, then a ridge analysis is necessary.

A ridge analysis is a constrained optimization procedure.

The set of feasible solutions is constrained to the set of points

that lie o n a hypersphere of radius

R

from the orig in of the

input variable space. At each iteration of the algorithm, the

point having the maximum (or minimum) predicted output

value, among all points on the hypersphere, is identi ed

x

ˆ ¡

1

2

…

^

BB

¡ m

I

†

¡

1

^

bb

where I is the

m £ m

identity matrix and

m

is a constant that

determines both the radius and the n ature of the solution.

If

m

is greater than the largest eigenvalue of

^

BB, then the

solution is a point of maximum predicted output. If

m

is less

than the smallest eigenvalue of

^

BB, then the solution is a point

of minimum predicted output. The radius is calculated as

R ˆ …

x

0

x

†

1

=

2

The radius is a monotonic function of

m

, within each of its

working regions.

As

m

approaches the largest eigenvalue of

^

BB from above,

the radius increases. An algorithm designed to nd the point

of maximum p redicted output within the explored region

would begin by selecting a

m

value that is far greater than

the largest eigenvalue,

l

…m†

. This would produce a solution

near the origin. In succeeding iterations, the value of

m

would then be incrementally decreased, always maintaining

m

>

l

…m†

. The solutions would move progressively further

away from the origin, indicating the path of steepest ascent.

The algorithm should terminate at the boundary of the

explored region.

Similarly, as

m

approaches the smallest eigenvalue of

^

BB

from below, the radius increases. An algorithm designed to

nd the point of minimum predicted output begins by

selecting a

m

value that is far less than the smallest eigenvalue,

l

…

1

†

. In succeeding iterations, the value of

m

is incrementally

increased, always maintaining

m

<

l

…

1

†

. The solutions move

progressivelyfurther away from the origin, indicatingthe path

of steepest descent. Once again, the algorithm should termi-

nate at the boundary of the explored region.

After th e nal solution has been ident i ed, it must be

validated. There is no guarantee that the metamodel faith-

fully reproduces th e simulation characteristics. The

predicted output value at the in put variable solution point

should be con rmed, to provid e some assurance that the

metamodel is adequate.

Kriging Metamodels

The kriging models use d were of the form

^yy…

x

† ˆ

^

bb ‡

r

0

…

x

†

R

¡

1

…

y

¡

1

^

bb† …

2

†

where the column vector y represents the known output

values. R is an

N £ N

matrix that has elements

R…

x

u

; x

v

†

for

all

u

;

v ˆ

1;

. . .

;

N

de ned as

R…

x

u

; x

v

† ˆ

Y

m

iˆ

1

exp

…¡

^

yy

i

jx

iu

¡ x

iv

j

2

†

which represent correlation s between pairs of points in th e

data collection plan. The vector r

0

…

x

† ˆ ‰R…

x

1

; x

†

;

. . .

;

R…

x

N

; x

†Š

gives the correlations between the data collectionFigure 1. Example of a saddle system surface.

Trans IChemE, Vol 80, Pa rt A, October 2002

OPTIMIZATION AND VALIDATION OF METAMODELS 775

points x

u

and the new input point x. The

^

yy

i

and

^

bb

are

parameters of the model.

The general character of the output variable surface can

be observed by calculating the main effects and interaction

effects for the model. Sacks

et al.

8

provide the necessary

formulae. The main effect of input variable

x

i

is

^mm

i

ˆ

…

^yy…

x

†

Y

h6ˆi

d

x

h

¡ ^mm

0

…

3

†

where

^

mm

0

ˆ

„

^

yy…

x

†

Q

m

hˆ

1

d

x

h

is the average predicted output

over the explored region. The interaction effect o f input

variables

x

i

and

x

j

is

^mm

ij

ˆ

…

^yy…

x

†

Y

h6ˆi

;

j

d

x

h

¡ ^mm

i

¡ ^mm

j

‡ ^mm

0

…

4

†

Since the integral of the correlation function does not have

an analytical solution, these formulae must be numerically

integrated.

The optimization of the model given in equation (2) is

accomplished by u se of a simulated annealing algorithm

9–12

.

Below, we describe the simulated annealing algorithm and

some of its characteristics.

Assuming that the objective is to minimize a function

C…

x

†

, where x is a vector of decision variables, the simulated

annealing algorithm proceeds as follows:

(1) Choose an initial point, x

0

, and determine the corres-

ponding value of the objective function,

C…

x

0

†

.

(2) Choose a small random step, Dx.

(3) At any point, x

k

, calculate the provisional change in the

objective function value D

C ˆ C…

x

k

‡

Dx

† ¡ C…

x

k

†

.

(4) Check the termination criterion. If satis ed, stop.

Otherwise,

(5) For D

C

< 0, set x

k‡

1

ˆ

x

k

‡

Dx and go to Step 2.

(6) For D

C

> 0, generate

W

, a uniform(0, 1) random vari-

able, and evaluate the probability exp

…¡B ¤

D

C†

:

(a) If

W

< exp

…¡B ¤

D

C†

, set x

k‡

1

ˆ

x

k

‡

Dx and go

to Step 2.

(b) If

W

> exp

…¡B ¤

D

C†

, go to Step 2.

Simulated annealing was proposed as an alternative to

gradient descent algorithms. Gradient descent algorithms

were known to fail when applied to nonconvex objective

functions, because of an inability to escape local minima.

The concept of the simulated annealing was to allow the

algorithm to ‘climb out’ of local minima by accepting

detrimental steps, on a probabilistic basis.

Operation of the simulated annealing algorithm depends

upon several characteristics that are controllable by the

investigator:

°

The small random step, Dx, is described as being wit hin

a neighbourhood of the point x

k

. The size of the

neighbourhood should be small enough that only moder-

ate changes in the objective function occur as a result of

movement from one neighbourhood to any other adj a-

cent neighbourhood. The step, x

k

‡

Dx, may be either to

a random point within the neighbourhood or to a

random point along the boundary of the neighbourhood.

For our implementation, the size of the neighbourhood

was de ned as 1=20 of the range for each input variable

and the step was selected as a random point within the

neighbourhood.

°

The value of the parameter

B

is used to control the

probability of accepting a detrimental step. Typically,

the value is adjusted during execution of the algorithm.

At the early stages of the algorithm, the value of

B

is set

small enough that the probability of accepting a detri-

mental step is relatively high. This allows for movement

through the input variable space. As the algorithm

progresses, the value of

B

is increased so that the

probability of moving away from the optimal solution is

reduced. The value of

B

can be adjusted according to the

nearness of the current objective function value to the

optimal objective function value, if known. Alternatively,

the

B

value can be adjusted according to the number of

steps taken. In our case, the initial value of

B

was selected

so that the probability of accepting a detrimental step was

approximately 0.95. This required some preliminary

probing of the surface to estimate the size of D

C

in the

neighbourhood of the initial point. The value of

B

was

increased by a factor of 1.11 at each stage of the

algorithm. The algorithm was allowed to progress through

50 stages.

°

The criterion for terminating the algorithm is usually

selected through a compromise of solution quality and

execution time. If the optimal objective function value is

known, then termination can occur when the current

objective function value comes within a desired margin

of error from the optimal value. If the optimal o bjective

function value is un known, termination can occur either

when a lack of improvement in the objective function

value is detected or when the number of steps taken

reaches a desired limit. In our case, each stage of the

algorithm co nsisted of 10,000 steps. In general, the

simulated annealing algorithm does not guarantee a

globally optimal solution. It is usually necessary to

con rm the optimal solution by executing the algorithm

several times beginning at different initial values, x

0

, or

simply to accept the best solution among several trials.

Process optimization studies follow the pattern of most

scienti c investigations. It is rare that the desired result is

obtained upon completion of an initial effo rt. Ty pically, a

series of progressive improvements in knowledge and tech-

nique are required for success. Th e example of an ammonia

synthesis plant is used to demonstrate this iterative optimi-

zation and validation procedure.

AMMONIA SYNTHESIS PLANT EXAMPLE

In Palmer and Realff

2

, we introduce an example optimi-

zation problem concerning the design of a process that

produces liquid ammonia from hydrogen gas and nitrogen

gas. The goal of the problem is to minimize the operating

cost for the process. Six variables are considered for their

impact upon the operating cost:

°

reaction pressure (

Pressure

, in kPa);

°

mole fraction of recycle stream sent to purge (

Purge

);

°

mole fraction of excess nitrogen in the reactor feed

(

ExcessN

2

);

Trans IChemE, Vol 80, Pa rt A, October 2002

776 PALMER and REALFF

°

temperature of the cooling water available to heat exchan-

gers (

Water

, in K);

°

mole ratio of methane to hydrogen in the hy drogen feed

stream (

Methane

);

°

mole ratio of argon to nitrogen in the nitrogen feed stream

(

Argon

).

The operating ranges for the input variables are shown in

Table 1.

We categorize the input variables into two groupings. The

Pressure

,

Purge

, and

ExcessN

2

variables are calle d control

variables. Control variable settings can be chosen durin g the

design procedure. They represent equipment speci cations

or operating conditions that we are free to manipulate. The

Water

,

Methane

, and

Argon

variables are called noise

variables. Noise variables may have a strong in uence on

process performance, but their values cannot be selected by

the process designers or operators. We include both types of

variables in the study in order to determine whether or not it

is possible to nd settings of the control variables that

minimize the negative in uences of the noise variables.

Initial Results

In Palmer and Realff

2

, we show the development of a

polynomial metamodel based upon data from an initial

study consisting of 1 6 simulation runs. Th e polynomial

metamodel is

objective ˆ

225:15

‡

34:74

x

1

¡

30:72

x

2

‡

22:24

x

4

‡

13:07

x

6

‡

212:44

x

2

1

¡

142:38

x

1

x

2

…

5

†

Each of the input variables in equation (5) is scaled to the

interval

‰¡

0:5;

‡

0:5

Š

as follows:

x

1

ˆ

Pressure ¡

18;000

8000

x

2

ˆ

Purge ¡

3:5

3:0

x

3

ˆ

ExcessN

2

¡

25

50

x

4

ˆ

Water ¡

293

20

x

5

ˆ

Methane ¡

0:8

0:8

x

6

ˆ

Argon ¡

0:2

0:2

The

objective

model, equation (5), predicts the value of

the objective function. The predicted value has units of

millions of dollars, and is related to the annual operating

cost of the process. The operating cost is a linear function of

18 simulation output variables. The objective function is

formulated as a subset of the most important components of

the operating cost, as determined by the magnitude and

variation of their contribution. Th e objective function is

determined by only eight simulation output variables. In

Palmer and Realff

2

, we demonstrate that the objective

function faithfully tracks changes in the operating cost.

The form of the polynomial metamodel offers insight as

to relationships between the control and noise variable s. In

order for the control variables to be of use in mitigating the

negative impacts of the noise variables, interactions between

the control and noise variables must exist. Interactions

appear in polynomial models as cross-product terms. Equa-

tion (5) does not contain a cross-produ ct term involving a

control variable an d a noise variable. The only interaction

term that appears in the model involves the

x

1

and

x

2

variables, which are both control variables. As a result, we

conclude that it is not possible to minimize the n egative

in u ences of the noise variables through selection of control

variable settings.

The lack of terms involving

x

3

and

x

5

indicates that the

percent excess nitrogen and the methane content of the

hydrogen feed do not have an observable impact on opera-

ting cost over this input variable space. As a result, recom-

mendations for the settings of these variables should be

based upon considerations other than operating cost.

The linear terms involving

x

4

and

x

6

indicate that the

cooling water temperature and argon content of the nitrogen

feed do impact the operating cost. The positive coef cients

of these terms indicate that an increase in the measured

value will produce increased co st. However, since these are

noise variables, we can only take note of the trend. We

are not free to select desirable values of the noise variables.

The nominal values of the noise variab les are used t hrough-

out the remaining calculations.

The impact of the

x

1

and

x

2

variables upon the operating

cost involves seco nd-order terms. It is only necessary to

perform the canonical and ridge analyses with respect to

these variables. The eigenvalues of the second-order coef -

cients matrix indicate that the surface is a saddle system.

The ridge analysis g ives the point of minimum predicted

cost within the input variable space. The result is shown in

Table 4. The actual objective function and operating cost

values are found by performing a simulation run at the input

variable solution point.

Before discussing the quality of the solution obtained

from the polynomial metamodels, we consider the optimiza-

tion of the kriging metamodels. Palmer and Realff

2

describes a metamodel of the form sh own in equation (2)

that is generated directly from the objective function values

obtained in the initial study. A second objective function

model is also described. It is generated from eight separate

metamodels o f the form in equation (2), one for each of

the eight simulation output variables that contributes to the

objective function. The cost contribution corresponding to

the predicted value for each output variable is totaled, as

shown in equation (6) to obtain a prediction of the objective

function. The raw data for the eight output variables are

shown in Palmer

13

. The parameter estimates for the kriging

metamodels are shown in Table 2. We continue to consider

both objective function models here because the question of

preference for prediction accuracy remains u nsettled.

objective ˆ

3:0876

£

10

¡

2

^yy

1

‡

7:962

£

10

¡

4

^yy

1

x

5

‡

7:919

£

10

¡

4

^yy

2

‡

7:919

£

10

¡

4

^yy

3

‡

7:070

£

10

¡

2

^

yy

4

‡

5:824

£

10

¡

2

^

yy

5

‡

1:081

£

10

¡

1

^yy

6

¡

1:504

£

10

¡

2

^yy

7

¡

4:678

£

10

¡

2

^yy

8

…

6

†

Table 1. Input variable space for ammonia plant study.

Variable Working range

Reaction pressure 14,000 to 22,000kPa

% of recycle to purge 2 to 5%

ExcessN

2

in reactor feed 0 to 50%

Cooling water temperature 283 to 303 K

CH

4

: H

2

mole ratio in H

2

feed 0.4 : 94.0 to 1.2 : 94.0

Ar : N

2

mole ratio in N

2

feed 0.1 : 99.8 to 0.3 : 99.8

Trans IChemE, Vol 80, Pa rt A, October 2002

OPTIMIZATION AND VALIDATION OF METAMODELS 777

Since the form of the k riging metamodel offers little

insight about the relationships between the variables, it is

useful to calculate the effe cts. For the direct model, the main

effects are found using equation (3). To investigate the main

effects, an effect value is calculated at several locations

within each input variable’s range of values. We decided to

divide each range into 20 equally-sized, co ntiguous

segments and use the midpoint of each segment. Similarly,

the interaction effects are found using equation (4). We

decided to use a grid consisting of 400 locations to study

each interaction. The value of the interaction effect was

calculated at each combination of the 20 locations used for

the main effects of the two variables. For the eight outpu ts

model, the effect valu e was calculated for each of the

separate output variables. The o bjective function effect

was then calculated by totaling the cost contributions

represented by the effect values for the eight outputs.

Table 3 shows that the direct model and the eight outputs

model p roduce similar effect estimates. The relative magni-

tudes of the effect values indicate the degree of impact that

each input variable, or interaction, has upon the objective

function. The maximum and minimum value for each

effect is shown as a summary of the data. Among the

control

£

noise interactions, the

x

1

x

4

effect has the largest

impact. Among the control

£

control interactions, only the

x

1

x

2

effect appears important.

Figure 2 shows the marginal impact of the

x

1

and

x

2

input

variables upon the objective function. The marginal impact

is calculated as the rst term of equation (4), that is, the

main effects and ov erall average are not removed. Figure 3

shows the marginal impact of the

x

1

and

x

4

input variables.

Interactions are observed in these plots as twists in the

wireframe. Figure 2 shows that the shape of the curve

representing the impact of

x

1

changes as the value of

x

2

changes. This is an indication of the importance of the

x

1

x

2

interaction. Conversely, Figure 3 sh ows that the shape of

the

x

1

curve remains nearly unchanged as the

x

4

value

changes. This indicates that the

x

1

x

4

interaction is practi-

cally negligible.

The observations above regarding Figures 2 and 3 demon-

strate the value of a graphic presentation of the results

(Step 4 in the procedure of Welch and Sacks). The numeric

presentation of the results in Table 3 is suf cient to identify

x

1

x

2

and

x

1

x

4

as the most interesting interactions. However,

it is only with the aid of the gures that one observes the

import of the relative magnitudes of the effects.

As with the polynomial model, the effects analysis of the

kriging models indicates that there are no important inter-

actions between the control and noise variables. This means

that it is not possible to in uence th e negative impacts of the

noise variables through the selection of settings for the

control variables. We note that the

x

4

input variable does

have an impact on the objective function. Among the control

variables, the

x

1

x

2

interaction has the greatest impact. The

simulated annealing algorithm is used to identify the control

variable settings th at produce the best predicted objective

function value. The simulation is then run at these settings to

obtain the corresponding actual objective function and

operating cost values. The resu lts are sh own in Table 4.

Table 4 shows that the best predicted objective function

value is obtained from the polynomial metamodel. However,

the kriging metamodel based upon the eight separate output

variable models identi ed the input variable settings that

Table 2. Initial study parameter estimates for kriging models.

^

yy

1

^

yy

2

^

yy

3

^

yy

4

^

yy

5

^

yy

6

^

bb

yˆ

0

: Direct obj func 7.64E‡ 00 5.05E¡01 2.17E¡05 2.04E¡01 2.55E¡04 1.28E¡02 243.441

y

ˆ

1

: Feed H

2

3.73E¡01 7.52E¡01 9.64E¡02 2.99E¡02 3.3 7E¡06 2.50E¡08 5000.27

yˆ

2

: RCM1 8.31E¡02 8.03E¡02 1.18E¡01 4.10E¡05 1.7 3E¡02 1.21E¡43 2070.62

yˆ

3

: RCM2 5.18E¡07 2.25E¡01 6.84E¡01 4.74E¡04 2.0 5E¡01 2.33E¡07 15,959.0

y

ˆ

4

: CHX1 2.18E¡01 1.44E¡01 3.48E¡03 1.55E¡03 2.1 6E¡02 8.47E¡04 387.634

yˆ

5

: D1 reboiler 4.53E‡ 00 2.93E¡01 1.23E¡08 6.41E¡02 1.62E¡02 4.61E¡03 309. 460

y

ˆ

6

: D1 condenser 4.22E‡ 00 2.51E¡01 7.64E¡07 4.58E¡02 1.43E¡02 3.50E¡03 285. 706

y

ˆ

7

: Waste H

2

6.48E¡02 5.67E¡02 8.05E¡03 7.03E¡06 1.2 6E¡04 4.16E¡10 41.0940

yˆ

8

: Waste CH

4

1.49E¡01 9.58E¡04 1.55E¡05 1.19E¡02 9.1 0E¡02 1.71E¡05 39.6433

Table 3. Initial study kriging model effects.

Direct model Eight output models

Min Max Min Max

Main effects

x

1

¡16.60 31.30 ¡15.92 40.64

x

2

¡11.34 9.93 ¡13.77 11.23

x

3

¡0.04 ¡0.04 ¡0.17 0.18

x

4

¡10.19 10.78 ¡8.68 9.13

x

5

¡0.08 ¡0.01 ¡2.05 1.59

x

6

¡0.04 ¡0.04 ¡1.07 1.08

Interaction effects

x

1

x

2

¡23.03 20.48 ¡27.97 27.97

x

1

x

3

¡0.57 1.08 ¡0.54 1.44

x

2

x

3

¡0.39 0.34 ¡0.56 0.56

x

1

x

4

¡6.65 6.49 ¡6.67 6.14

x

1

x

5

¡0.58 1.08 ¡3.32 3.88

x

1

x

6

¡0.57 1.07 ¡1.28 2.30

x

2

x

4

¡0.90 0.91 ¡0.65 0.65

x

2

x

5

¡0.39 0.34 ¡2.35 2.25

x

2

x

6

¡0.39 0.34 ¡0.60 0.55

x

3

x

4

¡0.35 0.37 ¡0.28 0.35

x

3

x

5

¡0.00 0.00 ¡2.14 2.22

x

3

x

6

¡0.00 ¡0.00 ¡0.01 0.07

Figure 2. Marginal impact of pressure and purge (X1X2 interaction) from

stage 1 kriging metamodel.

Trans IChemE, Vol 80, Pa rt A, October 2002

778 PALMER and REALFF

produce th e best actual objective funct ion value. For each of

the three metamodels, there is a substantial difference

between the predicted and actual objective function values.

This indicates that the metamodels may not be suf ciently

accurate representations of the simulation to identify the

optimal operating conditions.

One way to improve the approximation error of a meta-

model is to reduce the range of input variable values over

which the model is intended to apply. This approach relies

on the concept that a simple function may adequately

represent even the most complex function, if done over a

small enough range. When the input range is reduced, it is

assumed that the metamodel shape will more closely resem-

ble the shape of the simulation function, and the approxi-

mation error of the metamodel will improve. After reducing

the size of the input variable space, it is necessary to repeat

the data collection and model generation steps.

Follow-up Studies

Since both the polynomial and kriging metamodels indi-

cate that there are no exploitable interactions between the

control and noise variables, there is little rationale for

continued investigation and modeling of the noise variables.

By removing the noise variables from considerat ion, it is

possible to concentrate data collection plans on the gather-

ing of information regarding the control variables, and thus

reduce the requ ired number of simulation runs. Conse-

quently, the noise variables are set to their nominal values

for the follow-up study.

For the control variables, we note that the

x

1

x

2

interaction

has the greatest impact on the objective function and d ecide

to further reduce the input space along these variables. We

decide to cut the range of each variable to a third of that

used in the initial study. The results shown in Table 4

suggest that the best area for

pressure

is just below

19;000 kPa and the best area for

purge

is near 5%. With

this in mind, we set the input space for the second stage of

data collection to that shown in Table 5.

The data collection plan for the second stage is a

MBLHD for three input variables and e ight simulation

runs. The settings for the control variables, as wel l as the

observed operating cost and objective function values, are

shown in Table 6.

The polynomial metamodel for the objective function is

objective ˆ

237:46

¡

7:6254

x

1

¡

6:262

x

2

1

…

7

†

Each of the input variables used in tting equation (7) is

scaled to the interval

‰¡

0:5;

‡

0:5

Š

as follows

x

1

ˆ

Pressure ¡

18;872:5

2665

;

x

2

ˆ

Purge ¡

4:5

1:0

;

x

3

ˆ

ExcessN

2

¡

25

50

Kriging metamodels of the form shown in equation (2) are

generated both directly for the objective function and

separately for each of the eight key output variabl es. The

parameter estimates are shown in Table 7. Predictions of the

objective function value are ge nerated from the eight sepa-

rate output predictions using equation (6).

The form of the polynomial model indicates that pressure,

x

1

, has the dominant impact on the objective function over

this input space. The canonical analysis shows that the

stationary point is a point of maximum objective function

value. The ridge analysis reveals the point of minimum

predicted objective function value within the input space.

Since neither purge nor excess

N

2

have an important impact

on the objective function here, they are set to their midpoint

values. Table 8 shows the results at the input variable

solution point.

Figure 3. Marginal impact of pressure and water temperature (X1X4

interaction) from stage 1 kriging metamodel.

Table 4. Initial study optimization and veri cation.

Kriging,

Direct

model

Kriging,

Eight output

models Polynomial

Best point

Pressure 18,745 19,080 18,595

Purge 5.00 5.00 5.00

ExcessN

2

0.00 14.65 25.00

Water 302.4 302.4 302.4

Methane 0.800 0.800 0.800

Argon 0.200 0.200 0.200

Predicted

Objective function 229.7 222.3 218.8

Actual

Objective function 240.3 236.1 237.7

Operating cost 239.0 235.0 236.6

Table 5. Input variable space for second stage.

Variable Working range

Reaction pressure 17,540 to 20,205kPa

% of recycle to purge 4 to 5%

Excess N

2

in reactor feed 0 to 50%

Cooling water temperature 302.4K

CH

4

: H

2

mole ratio in H

2

feed 0.8 : 94.0

Ar : N

2

mole ratio in N

2

feed 0.2 : 99.8

Table 6. Data collection plan and results for second stage.

No. Pressure Purge ExcessN

2

Operating

cost

Objective

function

1 17,695 4.81 34.45 239.1 239.9

2 18,035 4.06 28.15 237.8 238.9

3 18,370 4.44 2.95 236.8 238.1

4 18,705 4.69 9.25 237.5 238.8

5 19,040 4.31 40.75 235.3 236.7

6 19,375 4.56 47.05 234.8 235.9

7 19,710 4.94 21.85 233.1 234.3

8 20,050 4.19 15.55 231.4 232.9

Trans IChemE, Vol 80, Pa rt A, October 2002

OPTIMIZATION AND VALIDATION OF METAMODELS 779

The calculated effect estimates from the kriging meta-

models indicate that the

x

1

x

3

interaction is dominant over

the input space. Figure 4 shows the marginal impact of

the pressure,

x

1

, and excess N

2

,

x

3

, input variables. Since the

purge does not have an important impact here, it is set to its

midpoint value. The simulated annealing algorithm is used

to identify the point of minimum objective function value

for each o f the two kriging metamodels. The results are

shown in Table 8.

For all three metamodels, the predicted objective function

value given in Table 8 is much closer to the actual result

than was observed in the initial study, Table 4. This suggests

that size of the input space is now appropriate to produce

models of accep table approximation e rror. However, the

pressure is at the upper limit of the working range selected

for the second stage; and, the models all predict that better

results could be obtained if the pressure were increased

further. Since the working range for the initial study extends

to greater pressures, the second stage result indicates that the

location of the input variable space should be shifted along

this variable.

A third stage of d ata collection and modeling is now

performed. The input variable space is similar to that used

for the second stage. The pressure range is shifted upwards

and is now from 19,335 to 22;000 kPa. The other variables

remain as shown in Table 5. The data collection plan for the

third stage is another MBLHD for three input variables and

eight simulation runs. The settings for the control variables,

as well as the observed op erating cost and objective function

values, are shown in Table 9.

The polynomial metamodel for the objective function is

objective ˆ

228:86

¡

8:136

x

1

‡

18:637

x

2

1

where

x

1

ˆ

Pressure ¡

20;667:5

2665

Kriging metamodels are generated both directly for the

objective function and separately for each of the eight key

output variables. The parameter estimates are shown in

Table 10. As before, equation (6) is used to obtain predic-

tions of the objective function based upon predictions of the

eight key output variables.

The form of the polynomial model once again indicates

that pressure,

x

1

, has the dominant impact on the objective

function over this input space. The canonical analysis for the

third stag e model shows that the stationary point does

produce the minimum predicted objective function valu e.

The purge and excess N

2

are set to their midpoint valu es.

Table 11 shows the result at the predicted optimal lo cation.

The calculated effect estimates from the direct kriging

metamodel indicate that there are no important interaction

effects. Figure 5 shows the marginal impact of each of the

input variables separately. The simulated annealing algo-

rithm is used to nd the recommended pressure value. The

preferred purge and excess N

2

values are selected by

inspection. For the eight output kriging metamodel, the

Table 7. Second stag e parameter estimates for kriging models.

^

yy

1

^

yy

2

^

yy

3

^

bb

yˆ

0

: Direct o bj func 7.67E‡ 00 2.66E¡05 1.17E‡ 00 236.805

y

ˆ

1

: Feed H

2

1.05E¡02 4.43E¡02 3.42E¡02 4991.98

yˆ

2

: RCM1 1.95E‡ 00 1.15E¡03 1.54E‡ 01 4932.13

yˆ

3

: RCM2 1.55E¡04 6.08E¡02 1.06E¡01 19,345.8

y

ˆ

4

: CHX1 9.63E¡01 1.35E¡01 4.95E¡01 272.298

yˆ

5

: D1 reboiler 4.78E¡01 6.02E¡03 5.98E¡02 298.791

y

ˆ

6

: D1 condenser 5.09E¡01 9.80E¡03 6.29E¡02 281.474

y

ˆ

7

: Waste H

2

5.24E¡03 3.00E¡02 1.17E¡02 10.3473

yˆ

8

: Waste CH

4

7.59E¡01 8.54E¡02 1.26E¡01 42.2015

Table 8. Second stage optimization results.

Kriging,

Direct

model

Kriging,

Eight output

models Polynomial

Best Point

Pressure 20,205 20,205 20,205

Purge 4.50 4.51 4.50

ExcessN

2

13.95 0.00 25.00

Predicted

Objective function 232.7 230.7 232.1

Actual

Objective function 233.2 229.9 232.0

Operating cost 231.8 228.5 230.6

Figure 4. Marginal impact of pressure and excess N

2

(X1X3 interaction)

from stage 2 kriging metamodel.

Table 9. Data collection plan and results for third stage.

No. Pressure Purge ExcessN

2

Operating

cost

Objective

function

1 19,490 4.31 21.85 233.9 235.2

2 19,830 4.81 40.75 233.6 234.6

3 20,165 4.56 2.95 228.9 230.3

4 20,500 4.06 34.45 230.0 231.4

5 20,835 4.94 15.55 225.2 226.6

6 21,170 4.44 47.05 226.4 227.7

7 21,505 4.19 9.25 226.8 228.3

8 21,845 4.69 28.15 227.8 229.2

Trans IChemE, Vol 80, Pa rt A, October 2002

780 PALMER and REALFF

calculated effect estimates indicate that there is an interac-

tion effect between pressure,

x

1

, and excess N

2

,

x

3

. Figure 6

shows the margin al impact for the

x

1

x

3

interaction. The

purge,

x

2

, also has a separate impact on the objective

function, similar to that shown in Figure 5. The simu lated

annealing algorithm is used to nd the recommended

settings for pressure and excess

N

2

. The preferred value

for purge is selected by inspection. The results of the

optimization study are shown in Table 11.

For each of the models, the predicted and actual objective

function values are not as close to each o ther as was

observed for the second stage, but are still in substantially

better agree ment than was observed in the initial study. The

smallest approximation error is produced by the eight output

kriging metamodel. This result is suf ciently accurate to

indicate that there is no need to adjust the size of the input

space. For each solution, the control variable settings are

either well within the working range for the third stage or at

a limit of the working range for the initial study. This

indicates that there is no need to adjust the location of the

input space. Next, we consider the quality of these nal

results.

The intent of the optimization is to min imize the operat-

ing cost of the ammonia plant. The solution obtained by the

eight output kriging model has the smallest operating cost

among the metamodels used. Since the eight output kriging

model also produces the smallest approximation error for

the objective function value among the metamodels used, it

is clear that this is the best solution we have obtained.

The right-hand column in Table 11 co ntains the best

result reported by Morari and Grossmann

5

in their design

case study. Their result was obtained by trial and error,

presumably carried out by several student design teams. The

operating cost and objective function value for this best

result are very close to those obtained by the eight output

kriging model solution. The pressure settin g given by the

eight output krig ing model solution is essentially th e same

as that given for the best result. The increased purge rate is

necessitated by the use of excess N

2

, which the Morari and

Grossmann study

5

did not consider.

Table 10. Third stage parameter estimates for kriging models.

^

yy

1

^

yy

2

^

yy

3

^

bb

yˆ

0

: Direct ob j func 4.52E‡00 1.46E¡01 2.97E¡01 231.614

y

ˆ

1

: Feed H

2

1.62E¡03 5.92E¡02 2.39E¡01 4998.71

yˆ

2

: RCM1 1.43E¡02 1.19E¡03 2.60E¡01 6219.05

yˆ

3

: RCM2 5.14E¡05 1.50E¡01 8.28E¡02 14,739.1

y

ˆ

4

: CHX1 8.59E¡01 2.37E¡01 1.89E¡01 250.249

yˆ

5

: D1 reboiler 4.75E‡ 00 1.04E¡01 4.00E¡01 276.442

y

ˆ

6

: D1 condenser 5.62E‡ 00 1.12E¡01 4.57E¡01 261.709

y

ˆ

7

: Waste H

2

3.47E¡02 1.66E¡01 5.82E¡02 25.5922

yˆ

8

: Waste CH

4

1.47E‡ 00 1.06E¡06 2.58E¡01 41.1754

Table 11. Third stage optimization results.

Kriging,

Direct

model

Kriging,

Eight output

models Polynomial

Morari and

Grossmann

5

Best point

Pressure 20,980 21,040 21,250 21,000

Purge 5.00 5.00 4.50 4.00

ExcessN

2

0.00 8.77 25.00 0.00

Predicted

Objective function 226.0 226.1 2 28.0

Actual

Objective function 231.1 227.0 2 30.3 226.7

Operating cost 229.6 225.6 228.9 225.1

Figure 5. Marginal impacts of pressure (solid), purge (d ashed), and excess

N

2

(dotted) from stage 3, direct kriging metamodel.

Figure 6. Marginal impact of pressure and excess N

2

(X1X3 interaction)

from stage 3, eight output kriging metamodel.

Trans IChemE, Vol 80, Pa rt A, October 2002

OPTIMIZATION AND VALIDATION OF METAMODELS 781

SUMMARY

The example of an ammonia synthesis plant has been

used to demonstrate that metamodel ing can be an effective

approach for solving chemical process design optimizat ion

problems. A total of six input variables, three control

variables and three noise variables, were investigated for

their impact on the operating cost of the plant. The method

produced a solution equivalent to the previously reported

best result while using a total of only 32 simulation runs to

collect data for three stages of metamodel generation. At

each stage, the best solution was produced by a model that

combined predictions from separate kriging metamodels of

basic output variables.

The metamodeling approach resembles traditional

Response Surface Methodology (RSM). The data collection

and modeling stages proceed sequentially. The size and

=

or

location of the input variable space is adjusted from stage to

stage. The results from one stage inform the selection of a

re ned space fo r the next stage. The metamodeling

approach differs from RSM in the tools that are used to

explore the deterministic simulation’s input–output relation-

ships. Derivatives of strati ed sampling, such as the Mini-

mum Bias Latin Hypercube Design (MBLHD), are used as

the data collection plans. Simple interpolating functions,

such as the kriging model, are used for the surface model-

ing. Randomized search techniques, such as the simulated

annealing algorithm, are used to optimize the surface model.

There are opportunities to investigate re nements of the

tools. Improvements in metamo del approximation error

could be obtained by use of interpo lating functions with

greater exibility. Data collection plans could be optimized

for use with speci c metamodel forms. The trade-off

between approximation error and th e number of simulation

runs in the data collection plan could be detailed. Improve-

ments in the ef ciency of the optimization algorithm could

be found by considering alternative global optimization

techniques. There is a need to demonstrate the application

of the tools to problems with a larger number of input

variables and problems involving integer input variables.

Also, automation of the process would make it less cumber-

some to implement.

However, the appeal of the metamodeling approach

should be obvious. This technique provides a means to

optimize process designs while requiring only a small

number of simulation runs. The use of computer simulation

in process design can only be expected to increase in the

future. The complexity of processes that we wish to simulate

and optimize will l ikewise increase. As engineers, we

should be receptive to the usefulness of a simplifying

approximation.

REFERENCES

1. Welch, W. J. and Sacks, J., 1991, A system for quality improvement via

computer experiments, Comm Stat Theory Methods, 20 (2): 477–495.

2. Palmer, K. and Realff, M., 2002, Metamodeling approach to optimiza-

tion of steady-state owsheet simulations: Model generation, Trans

IChemE, Part A, Chem Eng Res Des, 80(A7): 760–772.

3. Palmer, K. and Tsui, K.-L., 2001, A minimum bias latin hypercube

design, IEE Trans, 33(9): 793–808.

4. Cressie, N. A. C., 1993, Statistics for Spatial Data (John Wiley & Sons,

Inc., New York, USA).

5. Morari, M. and Grossman, I. E. (eds), 1995, Design of an Ammonia

Synthesis Plant, Volume 2 of CACHE Process Design Case Studies

(CACHE Corp., Austin, USA).

6. Box, G. E. P. and Draper, N. R., 1987, Empirical Model-Building and

Response Surfaces (John Wiley & Sons, New York, USA).

7. Myers, R. H. and Montgomery, D. C., 1995, Response Surface

Methodology: Product and Process Optimization Using Designed

Experiments (John Wiley & Sons, New York, USA).

8. Sacks, J., Welch, W. J., Mitchell, T. J. and Wynn, H. P., 1989, Design and

analysis of computer experiments, Statist Sci, 4: 409–435.

9. Kirkpatrick, S., Gelatt, C. and Vecchi, M., 1983, Optimization by

simulated annealing, Science, 220: 671–680.

10. van Laarhoven, P. J. M. and Aarts, E. H. L., 1987, Simulated Annealing:

Theory and Applications (Reidel, The Netherlands).

11. Painton, L. A. and Diwekar, U. M., 1994, Synthesizing optimal

design con gurations for a Brayton cycle power plant, Comp Chem

Eng, 18(5): 369–381.

12. Bohachevsky, I. O., Johnson, M. E. and Stein, M. L., 1995, Simulated

Annealing and generalizations, in Kalivas, J. H. (ed). Adaption of

Simulated annealing to chemical Optimization Problems, 3–24

(Elsevier Science, Amsterdam, The Netherlands).

13. Palmer, K., 1998, Data Collection Plans and Meta Models for Chemical

Process Flowsheet Simulators, PhD Thesis (Georgia Institute of

Technology, Atlanta, USA).

ACKNOWLEDGEMENT

We wish to thank Russell Barton (Pennsylvania State University) for

providing us with a copy of his FORTRAN code for estimating kriging

model parameters.

ADDRESS

Correspondence concerning this paper should be addressed to Professo r

K. Palmer, Department of Industrial and Systems Engineering, University

of Southern California, Los Angeles, California 90089-0193, USA.

E-mail: kpalmer@usc.edu

The manuscript was received 15 October 2001 and accepted for

publication after revision 5 September 2002.

Trans IChemE, Vol 80, Pa rt A, October 2002

782 PALMER and REALFF