simplebooklet thumbnail

of 0
0263–8762/02/$23.50+0.00
# Institution of Chemical Engineers
www.catchword.com=titles=02638762.htm Trans IChemE, Vol 80, Part A, October 2002
OPTIMIZATION AND VALIDATION OF STEADY-STATE
FLOWSHEET SIMULATION METAMODELS
K. PALMER
1
and M. REALFF
2
1
Department of Industrial and Systems Engineering, University of Southern California, Los Angeles, USA
2
School of Chemical Engineering, Georgia Institute of Technology, Atlanta, USA
T
he use of metamodels all ows the optimization of steady-state owsheet simulations to
proceed while requiring only a small number of solutions to be obtained from the
simulation. This paper presents the techniques needed to optimize the metamodel forms
described by Palmer and Realff. Through a case study, we exemplify characteristics of the
solution process and demonstrate the strength of the metamodeling approach. The operating
cost of an ammonia sy nthesis plant is optimized over six input variables using a total of only
32 solutions of the plant simulation as data for the metamodels.
Keywords: kriging; polynomial models; simulated annealing; canonical analysis; ridge
analysis.
INTRODUCTION
The ef ciency of steady-state owsheet simulations has
made it possible to perform process optimization studies
that historically had been impractical. However, the time
required to obtain a single solution of the owsheet is not
always trivial, and in these cases, the number of simulation
solutions available for an optimization study is severely
limited. Consequently, it becomes advantageous to consider
how to optimize a owsheet while using only a small
number of simulation runs.
In answer to this problem, we suggest use of the meta-
modeling approach proposed by Welch and Sacks
1
. They
describe the creation of approximating functions based upon
data gen erated from an electronic circuit simulation. The
metamodels are then used as a surrogate for the simulation
in optimization studies. Welch an d Sacks give the following
general procedure, which accounts for the possibility of
several output variables from the simulation contributing to
one or more objective functions:
(1) Postulate a model for each output
y
k
as a function of all
inputs x.
(2) Plan an initial experiment of a set of
N
x vectors, and
run the simulation at these input vectors to generate the
quality characteristic outputs.
(3) Use the data to t the metamodels (estimate model
parameters), and build approximating functions
^yy
k
x
.
These will be used to predict the outputs at new x
vectors. Before proceeding fur ther, the adequacy of the
predictors should be assessed and, if necessary,
improved.
(4) Plot the predicted surfaces. Particu larly when there are
con icting objectives, visualization of the input–output
relationships can help to establish trade-offs. Even if
model assessment indicated that the predictors suffer
large approximation errors, the plots may suggest promis-
ing subregions of the x space for the next experiment.
(5) Use the predictors to estimate the objective function(s),
and perfo rm a tentative optimization.
(6) If necessary, reduce the size of the input space. If Step 3
indicated that a predictor is not accurate enough yet,
reduce the size of the x region to a promising subregion
based on the plots in Step 4 and the tentative optimiza-
tion in Step 5, and return to Step 2 to collect new data
on the subregion. Otherwise continue to Step 7.
(7) Conduct a con rmation experiment to evaluate the
tentative optimum found in Step 5. If the con rmation
experiment demonstrates that the tentative optimum is
unsatisfactory, then we have to return to an earlier step.
Chemical engineers have used approximation modeling
techniques for other purposes in the past; see Palmer
and Realff
2
for a list of examples. However, we believe
that th is is the rst reported use of approximating models as
metamodels for a chemical engineering process owsheet
simulation.
The rst paper in this series, Palmer and Realff
2
, deals
with implementation issues surrounding the data collection
(Step 2) and metamodel building (Step 3) portions of the
procedure given above. With regard to the data collection
step, we emphasize the desirability of unique sample values
at each data collection point, broad dispersion of the data
collection points within the input variable space, and
minimal contribution to met amodel prediction error. We
introduce the Minimum Bias Latin Hypercube Design
(MBLHD) of Palmer and Tsui
3
as an example of a plan
that has these qualities.
773
For the model building step, we emphasize the need to
estimate model parameters from small data sets and the
desire to create metamodels that faithfully reproduce the
input–output characteristics of the simulation. We focus our
attention on two model forms: polynomial and kriging. Both
of these have been used successfully with small data sets in
other applications. Kriging
4
is a modeling technique from
geostatistics that predicts output values by exploiting the
similarity of the corresponding locations in the input spa ce.
Input points that are near’ to each other are assumed to have
similar output values. An additional issue that we raise is the
assertion that improvement in a pproximation error can
be obtained by modeling basic outputs from the simulation
rather than summary measures of process performance. We
investigate the assertion for a speci c example process.
We introduce the ammonia synthesis plant described by
Morari and Grossmann
5
as an example process for investi-
gation. The intent of the optimization problem is to mini-
mize the operating cost of the plant. The op erating cost is
determined by 18 simulation output variables that represent
material and energy usage, or recovery, at various points in
the process. To simplify the task of generating separate
metamodels for the basic simulation outputs, we select a
subset of eight output variables that represent the
most important operating cost components. The eight key
outputs are:
°
hydrogen feed stream H
2
owrate;
°
rst compressor (RCM1) electricity u sage;
°
second compressor (RCM2) electricity usage;
°
reactants preheater (CHX1) steam usage;
°
distillation column (D1) reboiler steam usage;
°
D1 condenser refrigeration usage;
°
waste gas stream H
2
owrate;
°
waste gas stream CH
4
owrate.
The components of the operating cost corresponding to th ese
outputs are totaled to form an objective function, which is
used in lieu of the operating cost during optimizations.
Following the initial data collection, polynomial meta-
models are generated both directly for the o bjective function
value and separately for each of the eight key outputs. The
separate output models are then combined, using the appro-
priate cost factors, to create a second objective function
model. Predictions from the two models are found to be
nearly identical. This is taken as an indication that the
assertion about basic outputs does not hold in the case of
polynomial metamodels and linear objective functions.
Kriging metamodels are also generated from the initial
ammonia plant data, both directly for the objective function
and separat ely for the key outputs. Predictions of the
objective function from the direct model a nd the eight
outputs model are observed to d iffer substantially. It is not
practical to determine which model is more a ccurate by
comparing predictions to actual simulation results over a
large set of test conditions, because o f the lengthy execution
time of the simulation. An alternative approach is to rst
obtain a solution to the operating cost problem using the
metamodel. Then, the approximation error of the metamodel
prediction at the input variable solution point is evaluated
using a single simulation run. At the end of the work by
Palmer and Realff
2
, we leave the optimization and valida-
tion of the kriging models to a future publication.
In this article we present the techniques for optimization
of both the polynomial and kriging metamodels. We also
validate our solutions by evaluating the metamodel approxi-
mation error at the input variable solution points. We follow
the ammonia plant problem through three stages of data
collection, modeling, and tentative optimization to exem-
plify implementation characteristics of Steps 4 to 7 of the
procedure given by Welch and Sacks. The completed
ammonia plant optimization also demonstrates the strength
of the metamode ling approach, as the problem is solved
using a total of only 32 simulation runs to provide data to
generate the metamodels.
The second section presents the methods used to optimize
the polynomial and kriging models. The third section
presents the co mpletion of the ammonia plant example.
Finally, the fourth section contains our concluding remarks
and comments regarding future work.
OPTIMIZATION TECHNIQUES
As discussed earlier, there are several possible choices of
approximating model forms to use for the metamodel. In
this work, we used polynomial and kriging models because
of their ability to be generated from small data sets. The
appropriate optimization technique differs for each type o f
model.
Polynomial Metamodels
The polynomial models used were reduced versions of
second-order functions. The complete second-order poly-
nomial model is written as
^
yy
x
ˆ
^
bb
0
X
m
iˆ
1
^
bb
i
x
i
X
m
iˆ
1
^
bb
ii
x
2
i
X
m¡
1
iˆ
1
X
m
jˆi
1
^
bb
ij
x
i
x
j
This model form can also be represented in matrix notation
^
yy
x
ˆ
^
bb
0
x
0
^
bb
x
0
^
BBx
1
where
^yy
x
is the output value predicted by the metamodel,
x
ˆ x
1
;
x
2
;
. . .
;
x
m
Š
0
is a column vector of the process
design variables,
^
bb
ˆ
^
bb
1
;
^
bb
2
;
. . .
;
^
bb
m
Š
0
is a column
vector of the linear model parameters, and
^
BB is an
m £ m
symmetric matrix of the second-order model parameters
^
BB
ˆ
^
bb
11
^
bb
12
=
2
¢ ¢ ¢
^
bb
1
m
=
2
^
bb
12
=
2
^
bb
22
¢ ¢ ¢
^
bb
2
m
=
2
.
.
.
.
.
.
.
.
.
^
bb
1
m
=
2
¢ ¢ ¢
^
bb
mm
2
6
6
6
4
3
7
7
7
5
This model form has a rich history in approximation
modeling
6,7
. Below, we summarize the optimization techni-
ques described in these texts.
The optimization of the model given in equation (1) is
accomplished in two steps. The rst step is to perform a
canonical analysis. The canonical analysis begins by calcu-
lating the location of the stationary point
x
s
ˆ ¡
1
2
^
BB
¡
1
^
bb
The nature of the stationary point is then determined by
calculating the eigenvalues of the
^
BB matrix. The eigenvalues
can be read ily obtained using software su ch as Minitab
Trans IChemE, Vol 80, Pa rt A, October 2002
774 PALMER and REALFF
or Matlab. The signs of the eigenvalues indicate the nature
of the stationary point. There are three possibilities:
(1) If all of the eigenvalues are negative, the metamodel
describes the output variable surface as a hill with a
single maximum. The stationary point represents th e
process design values that produce the maximum
predicted output value.
(2) If all of the eigenvalues are positive, the metamodel
describes the output variable surface as a well wit h a
single minimum. The stationary point represents the
process design values that produce the minimum
predicted output value.
(3) If the eigenvalues are of mixed sign, the metamodel
describes the output variable surface as a saddle system
(Figure 1). The stationary point represents the centre of
the saddle’s seat.
In addition to the eigenvalues, the eigenvectors of
^
BB can
also be calculated. This information allows us simply to
describe the output variable surface. The metamodel can
now be written in its canonical form
^yy ˆ ^yy
s
X
m
iˆ
1
l
i
w
2
i
where
^yy
s
is the predicted output value at the stationary point,
l
i
are the eigenvalues, and
w
i
are the canonical variables.
The canonical form is a result of a rede ned coordinate
system in which the axes have been translated, so that the
origin is placed at the stationary point, and rotated, so that
the
w
-axes are the principal axes of the output variable
contour system. The eigenvectors p
i
give the directions of
the new axes. The coordinates
w
i
of the former x point in the
translated and rotated system are
w
i
ˆ
p
0
i
x
1
¡ x
1;
s
x
2
¡ x
2;
s
.
.
.
x
m
¡ x
m
;
s
2
6
6
6
4
3
7
7
7
5
where p
i
is the normalized eigenvector associated with the
i
th eigenvalue. The magnitude of the eigenvalue indicates
the rate that th e output variable changes as one moves away
from the stationary point, along the associated axis.
The results of the canonical analysis determine the need
to perform the second step of the optimization. If the
stationary point is o f the desired nature and is located
within the region from which data was collected, then
there is no need to continue. If the stationary point is not
of the desired nature or it is located outside the explored
region, then a ridge analysis is necessary.
A ridge analysis is a constrained optimization procedure.
The set of feasible solutions is constrained to the set of points
that lie o n a hypersphere of radius
R
from the orig in of the
input variable space. At each iteration of the algorithm, the
point having the maximum (or minimum) predicted output
value, among all points on the hypersphere, is identi ed
x
ˆ ¡
1
2
^
BB
¡ m
I
¡
1
^
bb
where I is the
m £ m
identity matrix and
m
is a constant that
determines both the radius and the n ature of the solution.
If
m
is greater than the largest eigenvalue of
^
BB, then the
solution is a point of maximum predicted output. If
m
is less
than the smallest eigenvalue of
^
BB, then the solution is a point
of minimum predicted output. The radius is calculated as
R ˆ 
x
0
x
1
=
2
The radius is a monotonic function of
m
, within each of its
working regions.
As
m
approaches the largest eigenvalue of
^
BB from above,
the radius increases. An algorithm designed to  nd the point
of maximum p redicted output within the explored region
would begin by selecting a
m
value that is far greater than
the largest eigenvalue,
l
m
. This would produce a solution
near the origin. In succeeding iterations, the value of
m
would then be incrementally decreased, always maintaining
m
>
l
m
. The solutions would move progressively further
away from the origin, indicating the path of steepest ascent.
The algorithm should terminate at the boundary of the
explored region.
Similarly, as
m
approaches the smallest eigenvalue of
^
BB
from below, the radius increases. An algorithm designed to
 nd the point of minimum predicted output begins by
selecting a
m
value that is far less than the smallest eigenvalue,
l
1
. In succeeding iterations, the value of
m
is incrementally
increased, always maintaining
m
<
l
1
. The solutions move
progressivelyfurther away from the origin, indicatingthe path
of steepest descent. Once again, the algorithm should termi-
nate at the boundary of the explored region.
After th e  nal solution has been ident i ed, it must be
validated. There is no guarantee that the metamodel faith-
fully reproduces th e simulation characteristics. The
predicted output value at the in put variable solution point
should be con rmed, to provid e some assurance that the
metamodel is adequate.
Kriging Metamodels
The kriging models use d were of the form
^yy
x
 ˆ
^
bb 
r
0
x
R
¡
1
y
¡
1
^
bb 
2
where the column vector y represents the known output
values. R is an
N £ N
matrix that has elements
R
x
u
; x
v
for
all
u
;
v ˆ
1;
. . .
;
N
de ned as
R
x
u
; x
v
 ˆ
Y
m
iˆ
1
exp
…¡
^
yy
i
jx
iu
¡ x
iv
j
2
which represent correlation s between pairs of points in th e
data collection plan. The vector r
0
x
 ˆ R
x
1
; x
;
. . .
;
R
x
N
; x
†Š
gives the correlations between the data collectionFigure 1. Example of a saddle system surface.
Trans IChemE, Vol 80, Pa rt A, October 2002
OPTIMIZATION AND VALIDATION OF METAMODELS 775

points x
u
and the new input point x. The
^
yy
i
and
^
bb
are
parameters of the model.
The general character of the output variable surface can
be observed by calculating the main effects and interaction
effects for the model. Sacks
et al.
8
provide the necessary
formulae. The main effect of input variable
x
i
is
^mm
i
ˆ
^yy
x
Y
hi
d
x
h
¡ ^mm
0
3
where
^
mm
0
ˆ
^
yy
x
Q
m
hˆ
1
d
x
h
is the average predicted output
over the explored region. The interaction effect o f input
variables
x
i
and
x
j
is
^mm
ij
ˆ
^yy
x
Y
hi
;
j
d
x
h
¡ ^mm
i
¡ ^mm
j
^mm
0
4
Since the integral of the correlation function does not have
an analytical solution, these formulae must be numerically
integrated.
The optimization of the model given in equation (2) is
accomplished by u se of a simulated annealing algorithm
912
.
Below, we describe the simulated annealing algorithm and
some of its characteristics.
Assuming that the objective is to minimize a function
C
x
, where x is a vector of decision variables, the simulated
annealing algorithm proceeds as follows:
(1) Choose an initial point, x
0
, and determine the corres-
ponding value of the objective function,
C
x
0
.
(2) Choose a small random step, Dx.
(3) At any point, x
k
, calculate the provisional change in the
objective function value D
C ˆ C
x
k
Dx
¡ C
x
k
.
(4) Check the termination criterion. If satis ed, stop.
Otherwise,
(5) For D
C
< 0, set x
k
1
ˆ
x
k
Dx and go to Step 2.
(6) For D
C
> 0, generate
W
, a uniform(0, 1) random vari-
able, and evaluate the probability exp
…¡B ¤
D
C
:
(a) If
W
< exp
…¡B ¤
D
C
, set x
k
1
ˆ
x
k
Dx and go
to Step 2.
(b) If
W
> exp
…¡B ¤
D
C
, go to Step 2.
Simulated annealing was proposed as an alternative to
gradient descent algorithms. Gradient descent algorithms
were known to fail when applied to nonconvex objective
functions, because of an inability to escape local minima.
The concept of the simulated annealing was to allow the
algorithm to ‘climb out’ of local minima by accepting
detrimental steps, on a probabilistic basis.
Operation of the simulated annealing algorithm depends
upon several characteristics that are controllable by the
investigator:
°
The small random step, Dx, is described as being wit hin
a neighbourhood of the point x
k
. The size of the
neighbourhood should be small enough that only moder-
ate changes in the objective function occur as a result of
movement from one neighbourhood to any other adj a-
cent neighbourhood. The step, x
k
Dx, may be either to
a random point within the neighbourhood or to a
random point along the boundary of the neighbourhood.
For our implementation, the size of the neighbourhood
was de ned as 1=20 of the range for each input variable
and the step was selected as a random point within the
neighbourhood.
°
The value of the parameter
B
is used to control the
probability of accepting a detrimental step. Typically,
the value is adjusted during execution of the algorithm.
At the early stages of the algorithm, the value of
B
is set
small enough that the probability of accepting a detri-
mental step is relatively high. This allows for movement
through the input variable space. As the algorithm
progresses, the value of
B
is increased so that the
probability of moving away from the optimal solution is
reduced. The value of
B
can be adjusted according to the
nearness of the current objective function value to the
optimal objective function value, if known. Alternatively,
the
B
value can be adjusted according to the number of
steps taken. In our case, the initial value of
B
was selected
so that the probability of accepting a detrimental step was
approximately 0.95. This required some preliminary
probing of the surface to estimate the size of D
C
in the
neighbourhood of the initial point. The value of
B
was
increased by a factor of 1.11 at each stage of the
algorithm. The algorithm was allowed to progress through
50 stages.
°
The criterion for terminating the algorithm is usually
selected through a compromise of solution quality and
execution time. If the optimal objective function value is
known, then termination can occur when the current
objective function value comes within a desired margin
of error from the optimal value. If the optimal o bjective
function value is un known, termination can occur either
when a lack of improvement in the objective function
value is detected or when the number of steps taken
reaches a desired limit. In our case, each stage of the
algorithm co nsisted of 10,000 steps. In general, the
simulated annealing algorithm does not guarantee a
globally optimal solution. It is usually necessary to
con rm the optimal solution by executing the algorithm
several times beginning at different initial values, x
0
, or
simply to accept the best solution among several trials.
Process optimization studies follow the pattern of most
scienti c investigations. It is rare that the desired result is
obtained upon completion of an initial effo rt. Ty pically, a
series of progressive improvements in knowledge and tech-
nique are required for success. Th e example of an ammonia
synthesis plant is used to demonstrate this iterative optimi-
zation and validation procedure.
AMMONIA SYNTHESIS PLANT EXAMPLE
In Palmer and Realff
2
, we introduce an example optimi-
zation problem concerning the design of a process that
produces liquid ammonia from hydrogen gas and nitrogen
gas. The goal of the problem is to minimize the operating
cost for the process. Six variables are considered for their
impact upon the operating cost:
°
reaction pressure (
Pressure
, in kPa);
°
mole fraction of recycle stream sent to purge (
Purge
);
°
mole fraction of excess nitrogen in the reactor feed
(
ExcessN
2
);
Trans IChemE, Vol 80, Pa rt A, October 2002
776 PALMER and REALFF
°
temperature of the cooling water available to heat exchan-
gers (
Water
, in K);
°
mole ratio of methane to hydrogen in the hy drogen feed
stream (
Methane
);
°
mole ratio of argon to nitrogen in the nitrogen feed stream
(
Argon
).
The operating ranges for the input variables are shown in
Table 1.
We categorize the input variables into two groupings. The
Pressure
,
Purge
, and
ExcessN
2
variables are calle d control
variables. Control variable settings can be chosen durin g the
design procedure. They represent equipment speci cations
or operating conditions that we are free to manipulate. The
Water
,
Methane
, and
Argon
variables are called noise
variables. Noise variables may have a strong in uence on
process performance, but their values cannot be selected by
the process designers or operators. We include both types of
variables in the study in order to determine whether or not it
is possible to nd settings of the control variables that
minimize the negative in uences of the noise variables.
Initial Results
In Palmer and Realff
2
, we show the development of a
polynomial metamodel based upon data from an initial
study consisting of 1 6 simulation runs. Th e polynomial
metamodel is
objective ˆ
225:15
34:74
x
1
¡
30:72
x
2
22:24
x
4
13:07
x
6
212:44
x
2
1
¡
142:38
x
1
x
2
5
Each of the input variables in equation (5) is scaled to the
interval
‰¡
0:5;
0:5
Š
as follows:
x
1
ˆ
Pressure ¡
18;000
8000
x
2
ˆ
Purge ¡
3:5
3:0
x
3
ˆ
ExcessN
2
¡
25
50
x
4
ˆ
Water ¡
293
20
x
5
ˆ
Methane ¡
0:8
0:8
x
6
ˆ
Argon ¡
0:2
0:2
The
objective
model, equation (5), predicts the value of
the objective function. The predicted value has units of
millions of dollars, and is related to the annual operating
cost of the process. The operating cost is a linear function of
18 simulation output variables. The objective function is
formulated as a subset of the most important components of
the operating cost, as determined by the magnitude and
variation of their contribution. Th e objective function is
determined by only eight simulation output variables. In
Palmer and Realff
2
, we demonstrate that the objective
function faithfully tracks changes in the operating cost.
The form of the polynomial metamodel offers insight as
to relationships between the control and noise variable s. In
order for the control variables to be of use in mitigating the
negative impacts of the noise variables, interactions between
the control and noise variables must exist. Interactions
appear in polynomial models as cross-product terms. Equa-
tion (5) does not contain a cross-produ ct term involving a
control variable an d a noise variable. The only interaction
term that appears in the model involves the
x
1
and
x
2
variables, which are both control variables. As a result, we
conclude that it is not possible to minimize the n egative
in u ences of the noise variables through selection of control
variable settings.
The lack of terms involving
x
3
and
x
5
indicates that the
percent excess nitrogen and the methane content of the
hydrogen feed do not have an observable impact on opera-
ting cost over this input variable space. As a result, recom-
mendations for the settings of these variables should be
based upon considerations other than operating cost.
The linear terms involving
x
4
and
x
6
indicate that the
cooling water temperature and argon content of the nitrogen
feed do impact the operating cost. The positive coef cients
of these terms indicate that an increase in the measured
value will produce increased co st. However, since these are
noise variables, we can only take note of the trend. We
are not free to select desirable values of the noise variables.
The nominal values of the noise variab les are used t hrough-
out the remaining calculations.
The impact of the
x
1
and
x
2
variables upon the operating
cost involves seco nd-order terms. It is only necessary to
perform the canonical and ridge analyses with respect to
these variables. The eigenvalues of the second-order coef -
cients matrix indicate that the surface is a saddle system.
The ridge analysis g ives the point of minimum predicted
cost within the input variable space. The result is shown in
Table 4. The actual objective function and operating cost
values are found by performing a simulation run at the input
variable solution point.
Before discussing the quality of the solution obtained
from the polynomial metamodels, we consider the optimiza-
tion of the kriging metamodels. Palmer and Realff
2
describes a metamodel of the form sh own in equation (2)
that is generated directly from the objective function values
obtained in the initial study. A second objective function
model is also described. It is generated from eight separate
metamodels o f the form in equation (2), one for each of
the eight simulation output variables that contributes to the
objective function. The cost contribution corresponding to
the predicted value for each output variable is totaled, as
shown in equation (6) to obtain a prediction of the objective
function. The raw data for the eight output variables are
shown in Palmer
13
. The parameter estimates for the kriging
metamodels are shown in Table 2. We continue to consider
both objective function models here because the question of
preference for prediction accuracy remains u nsettled.
objective ˆ
3:0876
£
10
¡
2
^yy
1
7:962
£
10
¡
4
^yy
1
x
5
7:919
£
10
¡
4
^yy
2
7:919
£
10
¡
4
^yy
3
7:070
£
10
¡
2
^
yy
4
5:824
£
10
¡
2
^
yy
5
1:081
£
10
¡
1
^yy
6
¡
1:504
£
10
¡
2
^yy
7
¡
4:678
£
10
¡
2
^yy
8
6
Table 1. Input variable space for ammonia plant study.
Variable Working range
Reaction pressure 14,000 to 22,000kPa
% of recycle to purge 2 to 5%
ExcessN
2
in reactor feed 0 to 50%
Cooling water temperature 283 to 303 K
CH
4
: H
2
mole ratio in H
2
feed 0.4 : 94.0 to 1.2 : 94.0
Ar : N
2
mole ratio in N
2
feed 0.1 : 99.8 to 0.3 : 99.8
Trans IChemE, Vol 80, Pa rt A, October 2002
OPTIMIZATION AND VALIDATION OF METAMODELS 777
Since the form of the k riging metamodel offers little
insight about the relationships between the variables, it is
useful to calculate the effe cts. For the direct model, the main
effects are found using equation (3). To investigate the main
effects, an effect value is calculated at several locations
within each input variable’s range of values. We decided to
divide each range into 20 equally-sized, co ntiguous
segments and use the midpoint of each segment. Similarly,
the interaction effects are found using equation (4). We
decided to use a grid consisting of 400 locations to study
each interaction. The value of the interaction effect was
calculated at each combination of the 20 locations used for
the main effects of the two variables. For the eight outpu ts
model, the effect valu e was calculated for each of the
separate output variables. The o bjective function effect
was then calculated by totaling the cost contributions
represented by the effect values for the eight outputs.
Table 3 shows that the direct model and the eight outputs
model p roduce similar effect estimates. The relative magni-
tudes of the effect values indicate the degree of impact that
each input variable, or interaction, has upon the objective
function. The maximum and minimum value for each
effect is shown as a summary of the data. Among the
control
£
noise interactions, the
x
1
x
4
effect has the largest
impact. Among the control
£
control interactions, only the
x
1
x
2
effect appears important.
Figure 2 shows the marginal impact of the
x
1
and
x
2
input
variables upon the objective function. The marginal impact
is calculated as the rst term of equation (4), that is, the
main effects and ov erall average are not removed. Figure 3
shows the marginal impact of the
x
1
and
x
4
input variables.
Interactions are observed in these plots as twists in the
wireframe. Figure 2 shows that the shape of the curve
representing the impact of
x
1
changes as the value of
x
2
changes. This is an indication of the importance of the
x
1
x
2
interaction. Conversely, Figure 3 sh ows that the shape of
the
x
1
curve remains nearly unchanged as the
x
4
value
changes. This indicates that the
x
1
x
4
interaction is practi-
cally negligible.
The observations above regarding Figures 2 and 3 demon-
strate the value of a graphic presentation of the results
(Step 4 in the procedure of Welch and Sacks). The numeric
presentation of the results in Table 3 is suf cient to identify
x
1
x
2
and
x
1
x
4
as the most interesting interactions. However,
it is only with the aid of the gures that one observes the
import of the relative magnitudes of the effects.
As with the polynomial model, the effects analysis of the
kriging models indicates that there are no important inter-
actions between the control and noise variables. This means
that it is not possible to in uence th e negative impacts of the
noise variables through the selection of settings for the
control variables. We note that the
x
4
input variable does
have an impact on the objective function. Among the control
variables, the
x
1
x
2
interaction has the greatest impact. The
simulated annealing algorithm is used to identify the control
variable settings th at produce the best predicted objective
function value. The simulation is then run at these settings to
obtain the corresponding actual objective function and
operating cost values. The resu lts are sh own in Table 4.
Table 4 shows that the best predicted objective function
value is obtained from the polynomial metamodel. However,
the kriging metamodel based upon the eight separate output
variable models identi ed the input variable settings that
Table 2. Initial study parameter estimates for kriging models.
^
yy
1
^
yy
2
^
yy
3
^
yy
4
^
yy
5
^
yy
6
^
bb
yˆ
0
: Direct obj func 7.64E 00 5.05E¡01 2.17E¡05 2.04E¡01 2.55E¡04 1.28E¡02 243.441
y
ˆ
1
: Feed H
2
3.73E¡01 7.52E¡01 9.64E¡02 2.99E¡02 3.3 7E¡06 2.50E¡08 5000.27
yˆ
2
: RCM1 8.31E¡02 8.03E¡02 1.18E¡01 4.10E¡05 1.7 3E¡02 1.21E¡43 2070.62
yˆ
3
: RCM2 5.18E¡07 2.25E¡01 6.84E¡01 4.74E¡04 2.0 5E¡01 2.33E¡07 15,959.0
y
ˆ
4
: CHX1 2.18E¡01 1.44E¡01 3.48E¡03 1.55E¡03 2.1 6E¡02 8.47E¡04 387.634
yˆ
5
: D1 reboiler 4.53E 00 2.93E¡01 1.23E¡08 6.41E¡02 1.62E¡02 4.61E¡03 309. 460
y
ˆ
6
: D1 condenser 4.22E 00 2.51E¡01 7.64E¡07 4.58E¡02 1.43E¡02 3.50E¡03 285. 706
y
ˆ
7
: Waste H
2
6.48E¡02 5.67E¡02 8.05E¡03 7.03E¡06 1.2 6E¡04 4.16E¡10 41.0940
yˆ
8
: Waste CH
4
1.49E¡01 9.58E¡04 1.55E¡05 1.19E¡02 9.1 0E¡02 1.71E¡05 39.6433
Table 3. Initial study kriging model effects.
Direct model Eight output models
Min Max Min Max
Main effects
x
1
¡16.60 31.30 ¡15.92 40.64
x
2
¡11.34 9.93 ¡13.77 11.23
x
3
¡0.04 ¡0.04 ¡0.17 0.18
x
4
¡10.19 10.78 ¡8.68 9.13
x
5
¡0.08 ¡0.01 ¡2.05 1.59
x
6
¡0.04 ¡0.04 ¡1.07 1.08
Interaction effects
x
1
x
2
¡23.03 20.48 ¡27.97 27.97
x
1
x
3
¡0.57 1.08 ¡0.54 1.44
x
2
x
3
¡0.39 0.34 ¡0.56 0.56
x
1
x
4
¡6.65 6.49 ¡6.67 6.14
x
1
x
5
¡0.58 1.08 ¡3.32 3.88
x
1
x
6
¡0.57 1.07 ¡1.28 2.30
x
2
x
4
¡0.90 0.91 ¡0.65 0.65
x
2
x
5
¡0.39 0.34 ¡2.35 2.25
x
2
x
6
¡0.39 0.34 ¡0.60 0.55
x
3
x
4
¡0.35 0.37 ¡0.28 0.35
x
3
x
5
¡0.00 0.00 ¡2.14 2.22
x
3
x
6
¡0.00 ¡0.00 ¡0.01 0.07
Figure 2. Marginal impact of pressure and purge (X1X2 interaction) from
stage 1 kriging metamodel.
Trans IChemE, Vol 80, Pa rt A, October 2002
778 PALMER and REALFF
produce th e best actual objective funct ion value. For each of
the three metamodels, there is a substantial difference
between the predicted and actual objective function values.
This indicates that the metamodels may not be suf ciently
accurate representations of the simulation to identify the
optimal operating conditions.
One way to improve the approximation error of a meta-
model is to reduce the range of input variable values over
which the model is intended to apply. This approach relies
on the concept that a simple function may adequately
represent even the most complex function, if done over a
small enough range. When the input range is reduced, it is
assumed that the metamodel shape will more closely resem-
ble the shape of the simulation function, and the approxi-
mation error of the metamodel will improve. After reducing
the size of the input variable space, it is necessary to repeat
the data collection and model generation steps.
Follow-up Studies
Since both the polynomial and kriging metamodels indi-
cate that there are no exploitable interactions between the
control and noise variables, there is little rationale for
continued investigation and modeling of the noise variables.
By removing the noise variables from considerat ion, it is
possible to concentrate data collection plans on the gather-
ing of information regarding the control variables, and thus
reduce the requ ired number of simulation runs. Conse-
quently, the noise variables are set to their nominal values
for the follow-up study.
For the control variables, we note that the
x
1
x
2
interaction
has the greatest impact on the objective function and d ecide
to further reduce the input space along these variables. We
decide to cut the range of each variable to a third of that
used in the initial study. The results shown in Table 4
suggest that the best area for
pressure
is just below
19;000 kPa and the best area for
purge
is near 5%. With
this in mind, we set the input space for the second stage of
data collection to that shown in Table 5.
The data collection plan for the second stage is a
MBLHD for three input variables and e ight simulation
runs. The settings for the control variables, as wel l as the
observed operating cost and objective function values, are
shown in Table 6.
The polynomial metamodel for the objective function is
objective ˆ
237:46
¡
7:6254
x
1
¡
6:262
x
2
1
7
Each of the input variables used in tting equation (7) is
scaled to the interval
‰¡
0:5;
0:5
Š
as follows
x
1
ˆ
Pressure ¡
18;872:5
2665
;
x
2
ˆ
Purge ¡
4:5
1:0
;
x
3
ˆ
ExcessN
2
¡
25
50
Kriging metamodels of the form shown in equation (2) are
generated both directly for the objective function and
separately for each of the eight key output variabl es. The
parameter estimates are shown in Table 7. Predictions of the
objective function value are ge nerated from the eight sepa-
rate output predictions using equation (6).
The form of the polynomial model indicates that pressure,
x
1
, has the dominant impact on the objective function over
this input space. The canonical analysis shows that the
stationary point is a point of maximum objective function
value. The ridge analysis reveals the point of minimum
predicted objective function value within the input space.
Since neither purge nor excess
N
2
have an important impact
on the objective function here, they are set to their midpoint
values. Table 8 shows the results at the input variable
solution point.
Figure 3. Marginal impact of pressure and water temperature (X1X4
interaction) from stage 1 kriging metamodel.
Table 4. Initial study optimization and veri cation.
Kriging,
Direct
model
Kriging,
Eight output
models Polynomial
Best point
Pressure 18,745 19,080 18,595
Purge 5.00 5.00 5.00
ExcessN
2
0.00 14.65 25.00
Water 302.4 302.4 302.4
Methane 0.800 0.800 0.800
Argon 0.200 0.200 0.200
Predicted
Objective function 229.7 222.3 218.8
Actual
Objective function 240.3 236.1 237.7
Operating cost 239.0 235.0 236.6
Table 5. Input variable space for second stage.
Variable Working range
Reaction pressure 17,540 to 20,205kPa
% of recycle to purge 4 to 5%
Excess N
2
in reactor feed 0 to 50%
Cooling water temperature 302.4K
CH
4
: H
2
mole ratio in H
2
feed 0.8 : 94.0
Ar : N
2
mole ratio in N
2
feed 0.2 : 99.8
Table 6. Data collection plan and results for second stage.
No. Pressure Purge ExcessN
2
Operating
cost
Objective
function
1 17,695 4.81 34.45 239.1 239.9
2 18,035 4.06 28.15 237.8 238.9
3 18,370 4.44 2.95 236.8 238.1
4 18,705 4.69 9.25 237.5 238.8
5 19,040 4.31 40.75 235.3 236.7
6 19,375 4.56 47.05 234.8 235.9
7 19,710 4.94 21.85 233.1 234.3
8 20,050 4.19 15.55 231.4 232.9
Trans IChemE, Vol 80, Pa rt A, October 2002
OPTIMIZATION AND VALIDATION OF METAMODELS 779
The calculated effect estimates from the kriging meta-
models indicate that the
x
1
x
3
interaction is dominant over
the input space. Figure 4 shows the marginal impact of
the pressure,
x
1
, and excess N
2
,
x
3
, input variables. Since the
purge does not have an important impact here, it is set to its
midpoint value. The simulated annealing algorithm is used
to identify the point of minimum objective function value
for each o f the two kriging metamodels. The results are
shown in Table 8.
For all three metamodels, the predicted objective function
value given in Table 8 is much closer to the actual result
than was observed in the initial study, Table 4. This suggests
that size of the input space is now appropriate to produce
models of accep table approximation e rror. However, the
pressure is at the upper limit of the working range selected
for the second stage; and, the models all predict that better
results could be obtained if the pressure were increased
further. Since the working range for the initial study extends
to greater pressures, the second stage result indicates that the
location of the input variable space should be shifted along
this variable.
A third stage of d ata collection and modeling is now
performed. The input variable space is similar to that used
for the second stage. The pressure range is shifted upwards
and is now from 19,335 to 22;000 kPa. The other variables
remain as shown in Table 5. The data collection plan for the
third stage is another MBLHD for three input variables and
eight simulation runs. The settings for the control variables,
as well as the observed op erating cost and objective function
values, are shown in Table 9.
The polynomial metamodel for the objective function is
objective ˆ
228:86
¡
8:136
x
1
18:637
x
2
1
where
x
1
ˆ
Pressure ¡
20;667:5
2665
Kriging metamodels are generated both directly for the
objective function and separately for each of the eight key
output variables. The parameter estimates are shown in
Table 10. As before, equation (6) is used to obtain predic-
tions of the objective function based upon predictions of the
eight key output variables.
The form of the polynomial model once again indicates
that pressure,
x
1
, has the dominant impact on the objective
function over this input space. The canonical analysis for the
third stag e model shows that the stationary point does
produce the minimum predicted objective function valu e.
The purge and excess N
2
are set to their midpoint valu es.
Table 11 shows the result at the predicted optimal lo cation.
The calculated effect estimates from the direct kriging
metamodel indicate that there are no important interaction
effects. Figure 5 shows the marginal impact of each of the
input variables separately. The simulated annealing algo-
rithm is used to nd the recommended pressure value. The
preferred purge and excess N
2
values are selected by
inspection. For the eight output kriging metamodel, the
Table 7. Second stag e parameter estimates for kriging models.
^
yy
1
^
yy
2
^
yy
3
^
bb
yˆ
0
: Direct o bj func 7.67E 00 2.66E¡05 1.17E 00 236.805
y
ˆ
1
: Feed H
2
1.05E¡02 4.43E¡02 3.42E¡02 4991.98
yˆ
2
: RCM1 1.95E 00 1.15E¡03 1.54E 01 4932.13
yˆ
3
: RCM2 1.55E¡04 6.08E¡02 1.06E¡01 19,345.8
y
ˆ
4
: CHX1 9.63E¡01 1.35E¡01 4.95E¡01 272.298
yˆ
5
: D1 reboiler 4.78E¡01 6.02E¡03 5.98E¡02 298.791
y
ˆ
6
: D1 condenser 5.09E¡01 9.80E¡03 6.29E¡02 281.474
y
ˆ
7
: Waste H
2
5.24E¡03 3.00E¡02 1.17E¡02 10.3473
yˆ
8
: Waste CH
4
7.59E¡01 8.54E¡02 1.26E¡01 42.2015
Table 8. Second stage optimization results.
Kriging,
Direct
model
Kriging,
Eight output
models Polynomial
Best Point
Pressure 20,205 20,205 20,205
Purge 4.50 4.51 4.50
ExcessN
2
13.95 0.00 25.00
Predicted
Objective function 232.7 230.7 232.1
Actual
Objective function 233.2 229.9 232.0
Operating cost 231.8 228.5 230.6
Figure 4. Marginal impact of pressure and excess N
2
(X1X3 interaction)
from stage 2 kriging metamodel.
Table 9. Data collection plan and results for third stage.
No. Pressure Purge ExcessN
2
Operating
cost
Objective
function
1 19,490 4.31 21.85 233.9 235.2
2 19,830 4.81 40.75 233.6 234.6
3 20,165 4.56 2.95 228.9 230.3
4 20,500 4.06 34.45 230.0 231.4
5 20,835 4.94 15.55 225.2 226.6
6 21,170 4.44 47.05 226.4 227.7
7 21,505 4.19 9.25 226.8 228.3
8 21,845 4.69 28.15 227.8 229.2
Trans IChemE, Vol 80, Pa rt A, October 2002
780 PALMER and REALFF
calculated effect estimates indicate that there is an interac-
tion effect between pressure,
x
1
, and excess N
2
,
x
3
. Figure 6
shows the margin al impact for the
x
1
x
3
interaction. The
purge,
x
2
, also has a separate impact on the objective
function, similar to that shown in Figure 5. The simu lated
annealing algorithm is used to nd the recommended
settings for pressure and excess
N
2
. The preferred value
for purge is selected by inspection. The results of the
optimization study are shown in Table 11.
For each of the models, the predicted and actual objective
function values are not as close to each o ther as was
observed for the second stage, but are still in substantially
better agree ment than was observed in the initial study. The
smallest approximation error is produced by the eight output
kriging metamodel. This result is suf ciently accurate to
indicate that there is no need to adjust the size of the input
space. For each solution, the control variable settings are
either well within the working range for the third stage or at
a limit of the working range for the initial study. This
indicates that there is no need to adjust the location of the
input space. Next, we consider the quality of these nal
results.
The intent of the optimization is to min imize the operat-
ing cost of the ammonia plant. The solution obtained by the
eight output kriging model has the smallest operating cost
among the metamodels used. Since the eight output kriging
model also produces the smallest approximation error for
the objective function value among the metamodels used, it
is clear that this is the best solution we have obtained.
The right-hand column in Table 11 co ntains the best
result reported by Morari and Grossmann
5
in their design
case study. Their result was obtained by trial and error,
presumably carried out by several student design teams. The
operating cost and objective function value for this best
result are very close to those obtained by the eight output
kriging model solution. The pressure settin g given by the
eight output krig ing model solution is essentially th e same
as that given for the best result. The increased purge rate is
necessitated by the use of excess N
2
, which the Morari and
Grossmann study
5
did not consider.
Table 10. Third stage parameter estimates for kriging models.
^
yy
1
^
yy
2
^
yy
3
^
bb
yˆ
0
: Direct ob j func 4.52E00 1.46E¡01 2.97E¡01 231.614
y
ˆ
1
: Feed H
2
1.62E¡03 5.92E¡02 2.39E¡01 4998.71
yˆ
2
: RCM1 1.43E¡02 1.19E¡03 2.60E¡01 6219.05
yˆ
3
: RCM2 5.14E¡05 1.50E¡01 8.28E¡02 14,739.1
y
ˆ
4
: CHX1 8.59E¡01 2.37E¡01 1.89E¡01 250.249
yˆ
5
: D1 reboiler 4.75E 00 1.04E¡01 4.00E¡01 276.442
y
ˆ
6
: D1 condenser 5.62E 00 1.12E¡01 4.57E¡01 261.709
y
ˆ
7
: Waste H
2
3.47E¡02 1.66E¡01 5.82E¡02 25.5922
yˆ
8
: Waste CH
4
1.47E 00 1.06E¡06 2.58E¡01 41.1754
Table 11. Third stage optimization results.
Kriging,
Direct
model
Kriging,
Eight output
models Polynomial
Morari and
Grossmann
5
Best point
Pressure 20,980 21,040 21,250 21,000
Purge 5.00 5.00 4.50 4.00
ExcessN
2
0.00 8.77 25.00 0.00
Predicted
Objective function 226.0 226.1 2 28.0
Actual
Objective function 231.1 227.0 2 30.3 226.7
Operating cost 229.6 225.6 228.9 225.1
Figure 5. Marginal impacts of pressure (solid), purge (d ashed), and excess
N
2
(dotted) from stage 3, direct kriging metamodel.
Figure 6. Marginal impact of pressure and excess N
2
(X1X3 interaction)
from stage 3, eight output kriging metamodel.
Trans IChemE, Vol 80, Pa rt A, October 2002
OPTIMIZATION AND VALIDATION OF METAMODELS 781
SUMMARY
The example of an ammonia synthesis plant has been
used to demonstrate that metamodel ing can be an effective
approach for solving chemical process design optimizat ion
problems. A total of six input variables, three control
variables and three noise variables, were investigated for
their impact on the operating cost of the plant. The method
produced a solution equivalent to the previously reported
best result while using a total of only 32 simulation runs to
collect data for three stages of metamodel generation. At
each stage, the best solution was produced by a model that
combined predictions from separate kriging metamodels of
basic output variables.
The metamodeling approach resembles traditional
Response Surface Methodology (RSM). The data collection
and modeling stages proceed sequentially. The size and
=
or
location of the input variable space is adjusted from stage to
stage. The results from one stage inform the selection of a
re ned space fo r the next stage. The metamodeling
approach differs from RSM in the tools that are used to
explore the deterministic simulations input–output relation-
ships. Derivatives of strati ed sampling, such as the Mini-
mum Bias Latin Hypercube Design (MBLHD), are used as
the data collection plans. Simple interpolating functions,
such as the kriging model, are used for the surface model-
ing. Randomized search techniques, such as the simulated
annealing algorithm, are used to optimize the surface model.
There are opportunities to investigate re nements of the
tools. Improvements in metamo del approximation error
could be obtained by use of interpo lating functions with
greater exibility. Data collection plans could be optimized
for use with speci c metamodel forms. The trade-off
between approximation error and th e number of simulation
runs in the data collection plan could be detailed. Improve-
ments in the ef ciency of the optimization algorithm could
be found by considering alternative global optimization
techniques. There is a need to demonstrate the application
of the tools to problems with a larger number of input
variables and problems involving integer input variables.
Also, automation of the process would make it less cumber-
some to implement.
However, the appeal of the metamodeling approach
should be obvious. This technique provides a means to
optimize process designs while requiring only a small
number of simulation runs. The use of computer simulation
in process design can only be expected to increase in the
future. The complexity of processes that we wish to simulate
and optimize will l ikewise increase. As engineers, we
should be receptive to the usefulness of a simplifying
approximation.
REFERENCES
1. Welch, W. J. and Sacks, J., 1991, A system for quality improvement via
computer experiments, Comm Stat Theory Methods, 20 (2): 477–495.
2. Palmer, K. and Realff, M., 2002, Metamodeling approach to optimiza-
tion of steady-state owsheet simulations: Model generation, Trans
IChemE, Part A, Chem Eng Res Des, 80(A7): 760–772.
3. Palmer, K. and Tsui, K.-L., 2001, A minimum bias latin hypercube
design, IEE Trans, 33(9): 793–808.
4. Cressie, N. A. C., 1993, Statistics for Spatial Data (John Wiley & Sons,
Inc., New York, USA).
5. Morari, M. and Grossman, I. E. (eds), 1995, Design of an Ammonia
Synthesis Plant, Volume 2 of CACHE Process Design Case Studies
(CACHE Corp., Austin, USA).
6. Box, G. E. P. and Draper, N. R., 1987, Empirical Model-Building and
Response Surfaces (John Wiley & Sons, New York, USA).
7. Myers, R. H. and Montgomery, D. C., 1995, Response Surface
Methodology: Product and Process Optimization Using Designed
Experiments (John Wiley & Sons, New York, USA).
8. Sacks, J., Welch, W. J., Mitchell, T. J. and Wynn, H. P., 1989, Design and
analysis of computer experiments, Statist Sci, 4: 409–435.
9. Kirkpatrick, S., Gelatt, C. and Vecchi, M., 1983, Optimization by
simulated annealing, Science, 220: 671–680.
10. van Laarhoven, P. J. M. and Aarts, E. H. L., 1987, Simulated Annealing:
Theory and Applications (Reidel, The Netherlands).
11. Painton, L. A. and Diwekar, U. M., 1994, Synthesizing optimal
design con gurations for a Brayton cycle power plant, Comp Chem
Eng, 18(5): 369–381.
12. Bohachevsky, I. O., Johnson, M. E. and Stein, M. L., 1995, Simulated
Annealing and generalizations, in Kalivas, J. H. (ed). Adaption of
Simulated annealing to chemical Optimization Problems, 324
(Elsevier Science, Amsterdam, The Netherlands).
13. Palmer, K., 1998, Data Collection Plans and Meta Models for Chemical
Process Flowsheet Simulators, PhD Thesis (Georgia Institute of
Technology, Atlanta, USA).
ACKNOWLEDGEMENT
We wish to thank Russell Barton (Pennsylvania State University) for
providing us with a copy of his FORTRAN code for estimating kriging
model parameters.
ADDRESS
Correspondence concerning this paper should be addressed to Professo r
K. Palmer, Department of Industrial and Systems Engineering, University
of Southern California, Los Angeles, California 90089-0193, USA.
E-mail: kpalmer@usc.edu
The manuscript was received 15 October 2001 and accepted for
publication after revision 5 September 2002.
Trans IChemE, Vol 80, Pa rt A, October 2002
782 PALMER and REALFF