Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/ert/includes/Sanitizer.php on line 1470
Creating a configuration file for ERT - Ert

# Creating a configuration file for ERT

## General overview

The enkf application is started with a single argument, which is the name of the configuration file to be used. The enkf configuration file serves several purposes, which are:

• Defining which ECLIPSE model to use, i.e. giving a data, grid and schedule file.
• Defining which observation file to use.
• Defining how to run simulations.
• Defining where to store results.
• Creating a parametrization of the ECLIPSE model.

The configuration file is a plain text file, with one statement per line. The first word on each line is a keyword, which then is followed by a set of arguments that are unique to the particular keyword. Except for the DEFINE keyword, ordering of the keywords is not significant. Similarly to ECLIPSE data files, lines starting with "--" are treated as comments.

The keywords in the enkf configuration file can roughly be divded into two groups:

• Basic required keywords not related to parametrization. I.e., keywords giving the data, grid, schedule and observation file, defining how to run simulations and how to store results. These keywords are described in Basic required keywords.
• Basic optional keywords not related to parametrization. These keywords are described in Basic optional keywords.
• Keywords related to parametrization of the ECLIPSE model. These keywords are described in Parametrization keywords.
• Advanced keywords not related to parametrization. These keywords are described in Advanced optional keywords.

## List of keywords

Keyword name Required by user? Default value Purpose
ADD_FIXED_LENGTH_SCHEDULE_KW No Supporting unknown SCHEDULE keywords.
ANALYSIS_COPY No Create new instance of analysis module.
ANALYSIS_SET_VAR No Set analysis module internal state variable.
ANALYSIS_SELECT No STD_ENKF Select analysis module to use in update.
CASE_TABLE No For running sensitivies you can give the cases descriptive names.
CONTAINER No ...
DATA_FILE Yes Provide an ECLIPSE data file for the problem.
DATA_KW No Replace strings in ECLIPSE .DATA files.
DBASE_TYPE No BLOCK_FS Which 'database' system should be used for storage.
DEFINE No Define keywords with config scope.
DELETE_RUNPATH No Explictly tell ert to delete the runpath when a job is complete.
ECLBASE Yes Define a name for the ECLIPSE simulations.
END_DATE No You can tell ERT how the long the simulation should be - for error checking.
ENKF_ALPHA No 1.50 Parameter controlling outlier behaviour in EnKF algorithm.
ENKF_BOOTSTRAP No FALSE Should we bootstrap the Kalman gain estimate.
ENKF_CROSS_VALIDATION No ...
ENKF_CV_FOLDS No 10 Number of folds used in the Cross-Validation scheme.
ENKF_FORCE_NCOMP No FALSE Should we want to use a specific subspace dimension?
ENKF_KERNEL_REGRESSION No FALSE ....
ENKF_KERNEL_FUNCTION No 1 ....
ENKF_KERNEL_PARAM No 1 ....
ENKF_LOCAL_CV No FALSE Should we estimate the subspace dimension using Cross-Validation.
ENKF_MERGE_OBSERVATIONS No FALSE Should observations from many times be merged together.
ENKF_MODE No STANDARD Which EnKF algorithm should be used.
ENKF_NCOMP No 1 Dimension of the reduced order subspace (If ENKF_FORCE_NCOMP = TRUE)
ENKF_PEN_PRESS No FALSE Should we want to use a penalised PRESS statistic in model selection?
ENKF_RERUN No FALSE Should the simulations be restarted from time zero after each update.
ENKF_SCALING No TRUE Do we want to normalize the data ensemble to have unit variance?
ENKF_SCHED_FILE No Allows fine-grained control of the time steps in the simulation.
ENKF_TRUNCATION No 0.99 Cutoff used on singular value spectrum.
ENSPATH No storage Folder used for storage of simulation results.
FORWARD_MODEL No Add the running of a job to the simulation forward model.
GEN_DATA No Specify a general type of data created/updated by the forward model.
GEN_KW No Add a scalar parameter.
GEN_KW_TAG_FORMAT No <%s> Format used to add keys in the GEN_KW template files.
GEN_KW_EXPORT_FILE No parameter.txt Name of file to export GEN_KW parameters to.
GEN_PARAM No Add a general parameter.
GRID Yes Provide an ECLIPSE grid for the reservoir model.
HISTORY_SOURCE No REFCASE_HISTORY Source used for historical values.
HOST_TYPE No ...
IGNORE_SCHEDULE No ...
IMAGE_TYPE No png The type of the images created when plotting.
IMAGE_VIEWER No /usr/bin/display External program spawned to view images.
INIT_SECTION Yes Initialization code for the reservoir model.
INSTALL_JOB No Install a job for use in a forward model.
ITER_CASE No ITERATED_ENSEMBLE_SMOOTHER%d Case name format - iterated ensemble smoother
ITER_COUNT No 4 Number of iterations - iterated ensemble smoother
ITER_RETRY_COUNT No 4 Number of retries for a iteration - iterated ensemble smoother
JOBNAME No Name used for simulation files. An alternative to ECLBASE.
JOB_SCRIPT No Python script managing the forward model.
KEEP_RUNPATH No Specify realizations that simulations should be kept for.
LOCAL_CONFIG No A file with configuration information for local analysis.
LOG_FILE No log Name of log file
LOG_LEVEL No 1 How much logging?
LSF_QUEUE No normal Name of LSF queue.
LSF_RESOURCES No ...
LSF_SERVER No Set server used when submitting LSF jobs.
MAX_ITER_COUNT No Maximal number of iterations - iterated ensemble smoother.
MAX_RESAMPLE No 1 How many times should ert resample & retry a simulation.
MAX_RUNNING_LOCAL No The maximum number of running jobs when running locally.
MAX_RUNNING_LSF No The maximum number of simultaneous jobs submitted to LSF.
MAX_RUNNING_RSH No The maximum number of running jobs when using RSH queue system.
MAX_RUNTIME No 0 Set the maximum runtime in seconds for a realization.
MAX_SUBMIT No 2 How many times should the queue system retry a simulation.
MIN_REALIZATIONS No 0 Set the number of minimum reservoir realizations to run before long running realizations are stopped. Keyword STOP_LONG_RUNNING must be set to TRUE when MIN_REALIZATIONS are set.
NUM_REALIZATIONS Yes Set the number of reservoir realizations to use.
OBS_CONFIG No File specifying observations with uncertainties.
PLOT_DRIVER No PLPLOT Which plotting system should be used.
PLOT_ERRORBAR No FALSE Should errorbars on observations be plotted?
PLOT_ERRORBAR_MAX No 25 Show error bars if less than this number of observations.
PLOT_HEIGHT No 768 Pixel height of the plots.
PLOT_PATH No plots Path to where the plots are stored.
PLOT_REFCASE No TRUE TRUE (IF you want to plot the listed reference cases) FALSE if not.
PLOT_REFCASE_LIST No Deprecated. Use REFCASE_LIST instead.
PLOT_WIDTH No 1024 Pixel width of the plots.
PRE_CLEAR_RUNPATH No FALSE Should the runpath be cleared before initializing?
QC_PATH No QC ...
QC_WORKFLOW No Name of existing workflow to do QC work.
QUEUE_SYSTEM No System used for running simulation jobs.
REFCASE No (but see HISTORY_SOURCE and SUMMARY) Reference case used for observations and plotting.
REFCASE_LIST No Full path to Eclipse .DATA files containing completed runs (which you can add to plots)
REPORT_CONTEXT No Values for variables used in report templates.
REPORT_GROUP_LIST No Specify list of groups used in report templates.
REPORT_LARGE No FALSE
REPORT_LIST No List of templates used for creating reports.
REPORT_PATH No Reports Directory where final reports are stored.
REPORT_SEARCH_PATH No Search path for report templates.
REPORT_TIMEOUT No 120 Timeout for running LaTeX.
REPORT_WELL_LIST No Specify list of wells used in report templates.
RERUN_PATH No ...
RERUN_START No 0 ...
RFT_CONFIG No Config file specifying wellnames and dates for rft-measurments. Used for plotting. The format has to be name day month year (ex. Q-2FI 02 08 1973), with a new entry on a new line.
RFTPATH No rft Path to where the rft well observations are stored
RSH_COMMAND No Command used for remote shell operations.
RSH_HOST No Remote host used to run forward model.
RUNPATH No simulations/realization%d Directory to run simulations
RUN_TEMPLATE No Install arbitrary files in the runpath directory.
SCHEDULE_FILE Yes Provide an ECLIPSE schedule file for the problem.
SCHEDULE_PREDICTION_FILE No Schedule prediction file.
SELECT_CASE No The current case / default You can tell ert to select a particular case on bootup.
SETENV No You can modify the UNIX environment with SETENV calls.
SINGLE_NODE_UPDATE No FALSE ...
STD_CUTOFF No 1e-6 ...
STOP_LONG_RUNNING No FALSE Stop long running realizations after minimum number of realizations (MIN_REALIZATIONS) have run.
STORE_SEED No File where the random seed used is stored.
SUMMARY No Add summary variables for internalization.
SURFACE No Surface parameter read from RMS IRAP file.
TORQUE_QUEUE No ...
UPDATE_LOG_PATH No update_log Summary of the EnKF update steps are stored in this directory.
UPDATE_PATH No Modify a UNIX path variable like LD_LIBRARY_PATH.
UPDATE_RESULTS No FALSE ...
WORKFLOW_JOB_DIRECTORY No Directory containing workflow jobs.

## Basic required keywords

These keywords must be set to make the enkf function properly.

##### DATA_FILE

This is the name of ECLIPSE data file used to control the simulations. The data file should be prepared according to the guidelines given in Preparing an ECLIPSE reservoir model for use with enkf.

Example:

  -- Load the data file called ECLIPSE.DATA
DATA_FILE ECLIPSE.DATA

##### ECLBASE

The ECLBASE keyword sets the basename used for the ECLIPSE simulations. It can (and should, for your convenience) contain a %d specifier, which will be replaced with the realization numbers when running ECLIPSE. Note that due to limitations in ECLIPSE, the ECLBASE string must be in strictly upper or lower case.

Example:

  -- Use MY_VERY_OWN_OIL_FIELD-0 etc. as basename.
-- When ECLIPSE is running, the %d will be,
-- replaced with realization number, giving:
--
-- MY_VERY_OWN_OIL_FIELD-0
-- MY_VERY_OWN_OIL_FIELD-1
-- MY_VERY_OWN_OIL_FIELD-2
-- ...
-- and so on.
ECLBASE MY_VERY_OWN_OIL_FIELD-%d

##### JOBNAME

As an alternative to the ECLBASE keyword you can use the JOBNAME keyword; in particular in cases where your forward model does not include ECLIPSE at all that makes more sense. If JOBANME is used instead of ECLBASE the same rules of no-mixed-case apply.

##### GRID

This is the name of an existing GRID/EGRID file for your ECLIPSE model. If you had to create a new grid file when preparing your ECLIPSE reservoir model for use with enkf, this should point to the new .EGRID file.

Example:

  -- Load the .EGRID file called MY_GRID.EGRID
GRID MY_GRID.EGRID

##### INIT_SECTION

The INIT_SECTION keyword is used to handle initialization of the ECLIPSE run. See the documentation of the Initialization for more details on why this has to be done. The keyword can be used in two different ways:

• If it is set to the name of an existing file, the contents of this file will be used for the initialization.
• If it is set to the name of a non-existing file, it will be assumed that a file with this name in the simulation folder will be generated when simulations are submitted, either by the enkf application itself, or by some job installed by the user (see INSTALL_JOB). This generated file will then be used by ECLIPSE for initialization.

Example A:

  -- Use the contents of the file parameters/EQUIL.INC for initialization
INIT_SECTION params/EQUIL.INC


Example B:

  -- Use a generated file for the initialization
INIT_SECTION MY_GENERATED_EQUIL_KEYWORD.INC

##### NUM_REALIZATIONS

This is just the size of the ensemble, i.e. the number of realizations/members in the ensemble.

Example:

  -- Use 200 realizations/members
NUM_REALIZATIONS 200

##### SCHEDULE_FILE

This keyword should be the name a text file containing the SCHEDULE section of the ECLIPSE data file. It should be prepared in accordance with the guidelines given in Preparing an ECLIPSE reservoir model for use with enkf. This SCHEDULE section will be used to control the ECLIPSE simulations. You can optionally give a second filename, which is the name of file which will be written into the directories for running ECLIPSE.

Example:

  -- Parse MY_SCHEDULE.SCH, call the generated file ECLIPSE_SCHEDULE.SCH
SCHEDULE_FILE MY_SCHEDULE.SCH ECLIPSE_SCHEDULE.SCH


Observe that the SCHEDULE_FILE keyword is only required when you need ERT to stop and restart your simulations; i.e. when you are using the EnKF algorithm. If you are only using ERT to your simulations; or using smoother update it is recommended to leave the SCHEDULE_FILE keyword out. In that case you must make sure that the ECLIPSE datafile correctly includes the SCHEDULE section.

## Basic optional keywords

These keywords are optional. However, they serve many useful purposes, and it is recommended that you read through this section to get a thorough idea of what's possible to do with the enkf application.

##### DATA_KW

The keyword DATA_KW can be used for inserting strings into placeholders in the ECLIPSE data file. For instance, it can be used to insert include paths.

Example:

  -- Define the alias MY_PATH using DATA_KW. Any instances of <MY_PATH> (yes, with brackets)
-- in the ECLIPSE data file will now be replaced with /mnt/my_own_disk/my_reservoir_model
-- when running the ECLIPSE jobs.
DATA_KW  MY_PATH  /mnt/my_own_disk/my_reservoir_model


The DATA_KW keyword is of course optional. Note also that the enkf has some built in magic strings.

##### DELETE_RUNPATH

When the enkf application is running it creates directories for the ECLIPSE simulations, one for each realization. When the simulations are done, the enkf will load the results into it's internal database. If you are using the enkf application as a convenient way to start many simulations, e.g. using the screening experiment option, the default behavior is to not delete these simulation directories. This behavior can be overridden with the DELETE_RUNPATH keyword, which causes enkf to delete the specified simulation directories. When running the EnKF algorithm, the behavior is the opposite. The keyword KEEP_RUNPATH can then be used to override the default behavoir.

Example A:

  -- Delete simulation directories 0 to 99
DELETE_RUNPATH 0-99


Example B:

  -- Delete simulation directories 0 to 10 as well as 12, 15 and 20.
DELETE_RUNPATH 0 - 10, 12, 15, 20


The DELETE_RUNPATH keyword is optional.

##### ENKF_SCHED_FILE

When the enkf application runs the EnKF algorithm, it will use ECLIPSE to simulate one report step at a time, and do an update after each step. However, in some cases it will be beneficial to turn off the EnKF update for some report steps or to skip some steps completely. The keyword ENKF_SCHED_FILE can point to a file with a more advanced schedule for when to perform the updates. The format of the file pointed to by ENKF_SCHED_FILE should be plain text, with one entry per line. Each line should have the following form:

REPORT_STEP1   REPORT_STEP2   ON|OFF    STRIDE
...


Here REPORT_STEP1 and REPORT_STEP2 are the first and last report steps respectively and ON|OFF determines whether the EnKF analysis should be ON or OFF, the STRIDE argument is optional. If the analysis is ON the stride will default to REPORT_STEP2 minus REPORT_STEP1, thus if you want to perform analysis at each report step set stride equal to 1. Observe that whatever value of stride is used, the integration will always start on REPORT_STEP1 and end on REPORT_STEP2. Example:

0     100   OFF
100   125   ON     5
125   200   ON     1


In this example, the enkf application will do the following:

1. Simulate directly from report step 0 to report step 100. No EnKF update will be performed.
2. From report step 100 to report step 125 it will simulate five report steps at a time, doing EnKF update at report steps 105, 110, 115, 120 and 125.
3. From report step 125 to report step 200 it will simulate one report step at a time, doing EnKF update for every timestep.

The ENKF_SCHED_FILE keyword is optional.

##### END_DATE

When running a set of models from beginning to end ERT does not now in advance how long the simulation is supposed to be, it is therefor impossible beforehand to determine which restart file number should be used as target file, and the procedure used for EnKF runs can not be used to verify that an ECLIPSE simulation has run to the end.

By using the END_DATE keyword you can tell ERT that the simulation should go at least up to the date given by END_DATE, otherwise they will be regarded as failed. The END_DATE does not need to correspond exactly to the end date of the simulation, it must just be set so that all simulations which go to or beyond END_DATE are regarded as successfull.

Example:

END_DATE  10/10/2010


With this END_DATE setting all simulations which have gone to at least 10.th of October 2010 are OK.

##### ENSPATH

The ENSPATH should give the name of a folder that will be used for storage by the enkf application. Note that the contents of this folder is not intended for human inspection. By default, ENSPATH is set to "storage".

Example:

  -- Use internal storage in /mnt/my_big_enkf_disk
ENSPATH /mnt/my_big_enkf_disk


The ENSPATH keyword is optional.

##### SELECT_CASE

By default ert will remember the selected case from the previous run, or select the case "default" if this is the first time you start a project. By using the SELECT_CASE keyword you can tell ert to start up with a particular case. If the requested case does not exist ert will ignore the SELECT_CASE command, the case will not be created automagically.

##### HISTORY_SOURCE

In the observation configuration file you can enter observations with the keyword HISTORY_OBSERVATION; this means that ERT will the observed 'true' values from the model history. Practically the historical values can be fetched either from the SCHEDULE file or from a reference case. What source to use for the historical values can be controlled with the HISTORY_SOURCE keyword. The different possible values for the HISTORY_SOURCE keyword are:

REFCASE_HISTORY
This is the default value for HISTORY_SOURCE, ERT will fetch the historical values from the ***H keywords in the refcase summary, e.g. observations of WGOR:OP_1 is based the WGORH:OP_1 vector from the refcase summary.
REFCASE_SIMULATED
In this case the historical values are based on the simulated values from the refcase, this is mostly relevant when a you want compare with another case which serves as 'the truth'.
SCHEDEULE
Load historical values from the WCONHIST and WCONINJE keywords in the Schedule file.

When setting HISTORY_SOURCE to either REFCASE_SIMULATED or REFCASE_HISTORY you must also set the REFCASE variable to point to the ECLIPSE data file in an existing reference case (should be created with the same schedule file as you are using now).

Example:

-- Use historic data from reference case
HISTORY_SOURCE  REFCASE_HISTORY
REFCASE         /somefolder/ECLIPSE.DATA


The HISTORY_SOURCE keyword is optional.

##### REFCASE

With the REFCASE key you can supply ert with a reference case which can be used for observations (see HISTORY_SOURCE), if you want to use wildcards with the SUMMARY keyword you also must supply a REFCASE keyword. The REFCASE keyword should just point to an existing ECLIPSE data file; ert will then look up and load the corresponding summary results.

Example:

-- The REFCASE keyword points to the datafile of an existing ECLIPSE simulation.
REFCASE   /path/to/somewhere/SIM_01_BASE.DATA

##### INSTALL_JOB

The INSTALL_JOB keyword is used to learn the enkf application how to run external applications and scripts, i.e. defining a job. After a job has been defined with INSTALL_JOB, it can be used with the FORWARD_MODEL keyword. For example, if you have a script which generates relative permeability curves from a set of parameters, it can be added as a job, allowing you to do history matching and sensitivity analysis on the parameters defining the relative permeability curves.

The INSTALL_JOB keyword takes two arguments, a job name and the name of a configuration file for that particular job.

Example:

  -- Define a Lomeland relative permeabilty job.
-- The file jobs/lomeland.txt contains a detailed
-- specification of the job.
INSTALL_JOB LOMELAND jobs/lomeland.txt


The configuration file used to specify an external job is easy to use and very flexible. It is documented in Customizing the simulation workflow in enkf.

The INSTALL_JOB keyword is optional.

##### KEEP_RUNPATH

When the enkf application is running it creates directories for the ECLIPSE simulations, one for each realization. If you are using the enkf application to run the EnKF algorithm, the default behavior is to delete these directories after the simulation results have been internalized. This behavior can be overridden with the KEEP_RUNPATH keyword, which causes enkf to keep the specified simulation directories. When running the enkf application as a convenient way to start many simulations, e.g. using the screening experiment option, the behavior is the opposite, and can be overridden with the DELETE_RUNPATH keyword.

Example:

 -- Keep simulation directories 0 to 15 and 18 and 20
KEEP_RUNPATH 0-15, 18, 20


The KEEP_RUNPATH keyword is optional.

##### OBS_CONFIG

The OBS_CONFIG key should point to a file defining observations and associated uncertainties. The file should be in plain text and formatted according to the guidelines given in Creating an observation file for use with enkf.

Example:

  -- Use the observations in my_observations.txt
OBS_CONFIG my_observations.txt


The OBS_CONFIG keyword is optional, but for your own convenience, it is strongly recommended to provide an observation file.

##### RESULT_PATH

The enkf application will print some simple tabulated results at each report step. The RESULT_PATH keyword should point to a folder where the tabulated results are to be written. It can contain a %d specifier, which will be replaced with the report step by enkf. The default value for RESULT_PATH is "results/step_%d".

Example:

  -- Changing RESULT_PATH
RESULT_PATH my_nice_results/step-%d


The RESULT_PATH keyword is optional.

##### RUNPATH

The RUNPATH keyword should give the name of the folders where the ECLIPSE simulations are executed. It should contain at least one %d specifier, which will be replaced by the realization number when the enkf creates the folders. Optionally, it can contain one more %d specifier, which will be replaced by the iteration number.

By default, RUNPATH is set to "simulations/realization-%d".

Example A:

  -- Giving a RUNPATH with just one %d specifer.
RUNPATH /mnt/my_scratch_disk/realization-%d


Example B:

  -- Giving a RUNPATH with two %d specifers.
RUNPATH /mnt/my_scratch_disk/realization-%d/iteration-%d


The RUNPATH keyword is optional.

##### RUNPATH_FILE

When running workflows based on external scripts it is necessary to 'tell' the external script in some way or another were all the realisations are located in the filesystem. Since the number of realisations can be quite high this will easily overflow the commandline buffer; the solution which is used is therefor to let ert write a reagular file which looks like this:

0   /path/to/realisation0   CASE0
1   /path/to/realisation1   CASE1
...
N   /path/to/realisationN   CASEN


the path to this file can then be passed to the scripts using the magic string <RUNPATH_FILE>. The RUNPATH_FILE will by default be stored as .ert_runpath_list in the same directory as the configuration file, but you can set it to something else with the RUNPATH_FILE key.

## Keywords controlling the simulations

#### MIN_REALIZATIONS

MIN_REALIZATIONS is the minimum number of realizations that must have succeeded for the simulation to be regarded as a success.

MIN_REALIZATIONS can also be used in combination with STOP_LONG_RUNNING, see the documentation for STOP_LONG_RUNNING for a description of this.

Example:

  MIN_REALIZATIONS  20


The MIN_REALIZATIONS key can also be set as a percentage of NUM_REALIZATIONS

  MIN_REALIZATIONS  10%


The MIN_REALIZATIONS key is optional.

#### STOP_LONG_RUNNING

The STOP_LONG_RUNNING key is used in combination with the MIN_REALIZATIONS key to control the runtime of simulations. When STOP_LONG_RUNNING is set to TRUE, MIN_REALIZATIONS is the minimum number of realizations run before the simulation is stopped. After MIN_REALIZATIONS have succeded successfully, the realizatons left are allowed to run for 25% of the average runtime for successfull realizations, and then killed.

Example:

  -- Stop long running realizations after 20 realizations have succeeded
MIN_REALIZATIONS  20
STOP_LONG_RUNNING TRUE


The STOP_LONG_RUNNING key is optional. The MIN_REALIZATIONS key must be set when STOP_LONG_RUNNING is set to TRUE.

#### MAX_RUNTIME

The MAX_RUNTIME keyword is used to control the runtime of simulations. When MAX_RUNTIME is set, a job is only allowed to run for MAX_RUNTIME, given in seconds. A value of 0 means unlimited runtime.

Example:

  -- Let each realizations run for 50 seconds
MAX_RUNTIME 50


The MAX_RUNTIME key is optional.

## Parametrization keywords

The keywords in this section are used to define a parametrization of the ECLIPSE model. I.e., defining which parameters to change in a sensitivity analysis and/or history matching project. For some parameters, it necessary to specify a prior distribution. See Prior distributions available in enkf for a complete list of available priors.

##### FIELD

The FIELD keyword is used to parametrize quantities which have extent over the full grid. Both dynamic properties like pressure, and static properties like porosity, are implemented in terms of FIELD objects. When adding fields in the config file the syntax is a bit different for dynamic fields (typically solution data from ECLIPSE) and parameter fields like permeability and porosity.

Dynamic fields

To add a dynamic field the entry in the configuration file looks like this:

  FIELD   <ID>   DYNAMIC  MIN:X  MAX:Y


In this case ID is not an arbitrary string; it must coincide with the keyword name found in the ECLIPSE restart file, e.g. PRESSURE. Optionally, you can add a minimum and/or a maximum value with MIN:X and MAX:Y.

Example A:

  -- Adding pressure field (unbounded)
FIELD PRESSURE DYNAMIC


Example B:

  -- Adding a bounded water saturation field
FIELD SWAT DYNAMIC MIN:0.2 MAX:0.95


Parameter fields

A parameter field (e.g. porosity or permeability) is defined as follows:

  FIELD  ID PARAMETER   <ECLIPSE_FILE>  INIT_FILES:/path/%d  MIN:X MAX:Y OUTPUT_TRANSFORM:FUNC INIT_TRANSFORM:FUNC


Here ID is again an arbitrary string, ECLIPSE_FILE is the name of the file the enkf will export this field to when running simulations. Note that there should be an IMPORT statement in the ECLIPSE data file corresponding to the name given with ECLIPSE_FILE. INIT_FILES is a filename (with an embedded %d) to load the initial field from. Can be RMS ROFF format, ECLIPSE restart format or ECLIPSE GRDECL format.

The options MIN, MAX, INIT_TRANSFORM and OUTPUT_TRANSFORM are all optional. MIN and MAX are as for dynamic fields. OUTPUT_TRANSFORM is the name of a mathematical function which will be applied to the field before it is exported, and INIT_TRANSFORM is the name of a function which will be applied to the fields when they are loaded. [Just use INIT_TRANSFORM:XXX to get a list of available functions.]

Regarding format of ECLIPSE_FILE: The default format for the parameter fields is binary format of the same type as used in the ECLIPSE restart files. This requires that the ECLIPSE datafile contains an IMPORT statement. The advantage with using a binary format is that the files are smaller, and reading/writing is faster than for plain text files. If you give the ECLIPSE_FILE with the extension .grdecl (arbitrary case), enkf will produce ordinary .grdecl files, which are loaded with an INCLUDE statement. This is probably what most users are used to beforehand - but we recomend the IMPORT form.

General fields

In addition to dynamic and parameter field there is also a general field, where you have fine grained control over input/output. Use of the general field type is only relevant for advanced features. The arguments for the general field type are as follows:

FIELD   ID  GENERAL    FILE_GENERATED_BY_ENKF  FILE_LOADED_BY_ENKF    <OPTIONS>


The OPTIONS argument is the same as for the parameter field.

##### GEN_DATA

The GEN_DATA keyword is used when estimating data types which enkf does not know anything about. GEN_DATA is very similar to GEN_PARAM, but GEN_DATA is used for data which are updated/created by the forward model like e.g. seismic data. In the main configuration file the input for a GEN_DATA instance is as follows:

  GEN_DATA  ID RESULT_FILE:yyy INPUT_FORMAT:xx  REPORT_STEPS:10,20  ECL_FILE:xxx  OUTPUT_FORMAT:xx  INIT_FILES:/path/files%d TEMPLATE:/template_file TEMPLATE_KEY:magic_string


The GEN_DATA keyword has many options; in many cases you can leave many of them off. We therefor list the required and the optional options separately:

###### Required GEN_DATA options
• RESULT_FILE - This if the name the file generated by the forward model and read by ERT. This filename _must_ have a %d as part of the name, that %d will be replaced by report step when loading.
• INPUT_FORMAT - The format of the file written by the forward model (i.e. RESULT_FILE) and read by ERT, valid values are ASCII, BINARY_DOUBLE and BINARY_FLOAT.
• REPORT_STEPS A list of the report step(s) where you expect the forward model to create a result file. I.e. if the forward model should create a result file for report steps 50 and 100 this setting should be: REPORT_STEPS:50,100. If you have observations of this GEN_DATA data the RESTART setting of the corresponding GENERAL_OBSERVATION must match one of the values given by REPORT_STEPS.
###### Optional GEN_DATA options
• ECL_FILE - This is the name of file written by enkf to be read by the forward model.
• OUTPUT_FORMAT - The format of the files written by enkf and read by the forward model, valid values are ASCII, BINARY_DOUBLE, BINARY_FLOAT and ASCII_TEMPLATE. If you use ASCII_TEMPLATE you must also supply values for TEMPLATE and TEMPLATE_KEY.
• INIT_FILES - Format string with '%d' of files to load the initial data from.

Example:

GEN_DATA 4DWOC  INPUT_FORMAT:ASCII   RESULT_FILE:SimulatedWOC%d.txt   REPORT_STEPS:10,100


Here we introduce a GEN_DATA instance with name 4DWOC. When the forward model has run it should create two files with name SimulatedWOC10.txt and SimulatedWOC100.txt. The result files are in ASCII format, ERT will look for these files and load the content. The files should be pure numbers - without any header.

Observe that the GEN_DATA RESULT_FILE setting must have a %d format specifier, that will be replaced with the report step..

##### GEN_KW

The GEN_KW (abbreviation of general keyword) parameter is based on a template file and substitution. In the main config file a GEN_KW instance is defined as follows:

 GEN_KW  ID  my_template.txt  my_eclipse_include.txt  my_priors.txt


Here ID is an (arbitrary) unique string, my_template.txt is the name of a template file, my_eclipse_include.txt is the name of the file which is made for each member based on my_template.txt and my_priors.txt is a file containing a list of parametrized keywords and a prior distribution for each. Note that you must manually edit the ECLIPSE data file so that my_eclipse_include.txt is included.

Let us consider an example where the GEN_KW parameter type is used to estimate pore volume multipliers. We would then declare a GEN_KW instance in the main enkf configuration file:

 GEN_KW PAR_MULTPV multpv_template.txt multpv.txt multpv_priors.txt


In the GRID or EDIT section of the ECLIPSE data file, we would insert the following include statement:

 INCLUDE
'multpv.txt' /


The template file multpv_template.txt would contain some parametrized ECLIPSE statements:

  BOX
1 10 1 30 13 13 /
MULTPV
300*<MULTPV_BOX1> /
ENDBOX

BOX
1 10 1 30 14 14 /
MULTPV
300*<MULTPV_BOX2> /
ENDBOX


Here, <MULTPV_BOX1> and <MULTPV_BOX2> will act as magic strings. Note that the '<' '>' must be present around the magic strings. In this case, the parameter configuration file multpv_priors.txt could look like this:

   MULTPV_BOX2 UNIFORM 0.98 1.03
MULTPV_BOX1 UNIFORM 0.85 1.00


In general, the first keyword on each line in the parameter configuration file defines a key, which when found in the template file enclosed in '<' and '>', is replaced with a value. The rest of the line defines a prior distribution for the key. See Prior distributions available in enkf for a list of available prior distributions.

Example: Using GEN_KW to estimate fault transmissibility multipliers

Previously enkf supported a datatype MULTFLT for estimating fault transmissibility multipliers. This has now been depreceated, as the functionality can be easily achieved with the help of GEN_KW. In th enkf config file:

GEN_KW  MY-FAULTS   MULTFLT.tmpl   MULTFLT.INC   MULTFLT.txt


Here MY-FAULTS is the (arbitrary) key assigned to the fault multiplers, MULTFLT.tmpl is the template file, which can look like this:

MULTFLT
'FAULT1'   <FAULT1>  /
'FAULT2'   <FAULT2>  /
/


and finally the initial distribution of the parameters FAULT1 and FAULT2 are defined in the file MULTFLT.txt:

FAULT1   LOGUNIF   0.00001   0.1
FAULT2   UNIFORM   0.00      1.0


The default use of the GEN_KW keyword is to let the ERT application sample random values for the elements in the GEN_KW instance, but it is also possible to tell ERT to load a precreated set of data files, this can for instance be used as a component in a experimental design based workflow. When using external files to initialize the GEN_KW instances you supply an extra keyword INIT_FILE:/path/to/priors/files%d which tells where the prior files are:

GEN_KW  MY-FAULTS   MULTFLT.tmpl   MULTFLT.INC   MULTFLT.txt    INIT_FILES:priors/multflt/faults%d


In the example above you must prepare files priors/multflt/faults0, priors/multflt/faults1, ... priors/multflt/faultsn which ert will load when you initialize the case. The format of the GEN_KW input files can be of two varieties:

1. The files can be plain ASCII text files with a list of numbers:
1.25
2.67


The numbers will be assigned to parameters in the order found in the MULTFLT.txt file.

2. Alternatively values and keywords can be interleaved as in:
FAULT1 1.25
FAULT2 2.56


in this case the ordering can differ in the init files and the parameter file.

The heritage of the ERT program is based on the EnKF algorithm, and the EnKF algorithm evolves around Gaussian variables - internally the GEN_KW variables are assumed to be samples from the N(0,1) distribution, and the distributions specified in the parameters file are based on transformations starting with a N(0,1) distributed variable. The slightly awkward consequence of this is that to let your sampled values pass through ERT unmodified you must configure the distribution NORMAL 0 1 in the parameter file; alternatively if you do not intend to update the GEN_KW variable you can use the distribution RAW.

##### GEN_PARAM

The GEN_PARAM parameter type is used to estimate parameters which do not really fit into any of the other categories. As an example, consider the following situation:

1. Some external Software (e.g. Cohiba) makes a large vector of random numbers which will serve as input to the forward model. (It is no requirement that the parameter set is large, but if it only consists of a few parameters the GEN_KW type will be easier to use.)
2. We want to update this parameter with enkf.

In the main configuration file the input for a GEN_PARAM instance is as follows:

  GEN_PARAM  ID  ECLIPSE_FILE  INPUT_FORMAT:xx  OUTPUT_FORMAT:xx  INIT_FILES:/path/to/init/files%d (TEMPLATE:/template_file KEY:magic_string)


here ID is the usual unique string identifying this instance and ECLIPSE_FILE is the name of the file which is written into the run directories. The three arguments GEN_PARAM, ID and ECLIPSE_FILE must be the three first arguments. In addition you must have three additional arguments, INPUT_FORMAT, OUTPUT_FORMAT and INIT_FILES. INPUT_FORMAT is the format of the files enkf should load to initialize, and OUTPUT_FORMAT is the format of the files enkf writes for the forward model. The valid values are:

• ASCII - This is just text file with formatted numbers.
• ASCII_TEMPLATE - An plain text file with formatted numbers, and an arbitrary header/footer.
• BINARY_FLOAT - A vector of binary float numbers.
• BINARY_DOUBLE - A vector of binary double numbers.

Regarding the different formats - observe the following:

1. Except the format ASCII_TEMPLATE the files contain no header information.
2. The format ASCII_TEMPLATE can only be used as output format.
3. If you use the output format ASCII_TEMPLATE you must also supply a TEMPLATE:X and KEY:Y option. See documentation of this below.
4. For the binary formats files generated by Fortran can not be used - can easily be supported on request.

Regarding templates: If you use OUTPUT_FORMAT:ASCII_TEMPLATE you must also supply the arguments TEMPLATE:/template/file and KEY:MaGiCKEY. The template file is an arbitrary existing text file, and KEY is a magic string found in this file. When enkf is running the magic string is replaced with parameter data when the ECLIPSE_FILE is written to the directory where the simulation is run from. Consider for example the follwing configuration:

   TEMPLATE:/some/file   KEY:Magic123


The template file can look like this (only the Magic123 is special):

Header line1
============
Magic123
============
Footer line1
Footer line2


When enkf is running the string Magic123 is replaced with parameter values, and the resulting file will look like this:

Header line1
============
1.6723
5.9731
4.8881
.....
============
Footer line1
Footer line2

##### SURFACE

The SURFACE keyword can be used to work with surface from RMS in the irap format. The surface keyword is configured like this:

SURFACE TOP   OUTPUT_FILE:surf.irap   INIT_FILES:Surfaces/surf%d.irap   BASE_SURFACE:Surfaces/surf0.irap


The first argument, TOP in the example above, is the identifier you want to use for this surface in ert. The OUTPUT_FILE key is the name of surface file which ERT will generate for you, INIT_FILES points to a list of files which are used to initialize, and BASE_SURFACE must point to one existing surface file. When loading the surfaces ERT will check that all the headers are compatible. An example of a surface IRAP file is:

-996   511     50.000000     50.000000
444229.9688   457179.9688  6809537.0000  6835037.0000
260      -30.0000   444229.9688  6809537.0000
0     0     0     0     0     0     0
2735.7461    2734.8909    2736.9705    2737.4048    2736.2539    2737.0122
2740.2644    2738.4014    2735.3770    2735.7327    2733.4944    2731.6448
2731.5454    2731.4810    2730.4644    2730.5591    2729.8997    2726.2217
2721.0996    2716.5913    2711.4338    2707.7791    2705.4504    2701.9187
....


The surface data will typically be fed into other programs like Cohiba or RMS. The data can be updated using e.g. the Smoother.

##### Initializing from the FORWARD MODEL

All the parameter types like FIELD,GEN_KW,GEN_PARAM and SURFACE can be initialized from the forward model. To achieve this you just add the setting FORWARD_INIT:True to the configuration. When using forward init the initialization will work like this:

1. The explicit initialization from the case menu, or when you start a simulation, will be ignored.
2. When the FORWARD_MODEL is complete ERT will try to initialize the node based on files created by the forward model. If the init fails the job as a whole will fail.
3. If a node has been initialized, it will not be initialized again if you run again. [Should be possible to force this ....]

When using FORWARD_INIT:True ERT will consider the INIT_FILES setting to find which file to initialize from. If the INIT_FILES setting contains a relative filename, it will be interpreted relativt to the runpath directory. In the example below we assume that RMS has created a file petro.grdecl which contains both the PERMX and the PORO fields in grdecl format; we wish to initialize PERMX and PORO nodes from these files:

FIELD   PORO  PARAMETER    poro.grdecl     INIT_FILES:petro.grdecl  FORWARD_INIT:True
FIELD   PERMX PARAMETER    permx.grdecl    INIT_FILES:petro.grdecl  FORWARD_INIT:True


Observe that forward model has created the file petro.grdecl and the nodes PORO and PERMX create the ECLIPSE input files poro.grdecl and permx.grdecl, to ensure that ECLIPSE finds the input files poro.grdecl and permx.grdecl the forward model should contain a job which will copy/convert petro.grdecl -> (poro.grdecl,permx.grdecl), this job should not overwrite existing versions of permx.grdecl and poro.grdecl. This extra hoops is not strictly needed in all cases, but strongly recommended to ensure that you have control over which data is used, and that everything is consistent in the case where the forward model is run again.

##### SUMMARY

The SUMMARY keyword is used to add variables from the ECLIPSE summary file to the parametrization. The keyword expects a string, which should have the format VAR:WGRNAME. Here, VAR should be a quantity, such as WOPR, WGOR, RPR or GWCT. Moreover, WGRNAME should refer to a well, group or region. If it is a field property, such as FOPT, WGRNAME need not be set to FIELD.

Example:

  -- Using the SUMMARY keyword to add diagnostic variables
SUMMARY WOPR:MY_WELL
SUMMARY RPR:8
SUMMARY F*          -- Use of wildcards requires that you have entered a REFCASE.


The SUMMARY keyword has limited support for '*' wildcards, if your key contains one or more '*' characters all matching variables from the refcase are selected. Observe that if your summary key contains wildcards you must supply a refcase with the REFCASE key - otherwise it will fail hard.

Note: Properties added using the SUMMARY keyword are only diagnostic. I.e., they have no effect on the sensitivity analysis or history match.

## Keywords controlling the EnKF algorithm

##### ENKF_ALPHA

Scaling factor (double) used in outlier detection. Increasing this factor means that more observations will potentially be included in the assimilation. The default value is 1.50.

Including outliers in the EnKF algorithm can dramatically increase the coupling between the ensemble members. It is therefore important to filter out these outlier data prior to data assimilation. An observation, $\textstyle d^o_i$, will be classified as an outlier if

$|d^o_i - \bar{d}_i| > \mathrm{ENKF\_ALPHA} \left(s_{d_i} + \sigma_{d^o_i}\right)$,

where $\textstyle\boldsymbol{d}^o$ is the vector of observed data, $\textstyle\boldsymbol{\bar{d}}$ is the average of the forcasted data ensemble, $\textstyle\boldsymbol{s_{d}}$ is the vector of estimated standard deviations for the forcasted data ensemble, and $\textstyle\boldsymbol{s_{d}^o}$ is the vector standard deviations for the observation error (specified a priori).

##### ENKF_BOOTSTRAP

Boolean specifying if we want to resample the Kalman gain matrix in the update step. The purpose is to avoid that the ensemble covariance collapses. When this keyword is true each ensemble member will be updated based on a Kalman gain matrix estimated from a resampling with replacement of the full ensemble.

In theory and in practice this has worked well when one uses a small number of ensemble members.

##### ENKF_CV_FOLDS

Integer specifying how many folds we should use in the Cross-Validation (CV) scheme. Possible choices are the integers between 2 and the ensemble size (2-fold CV and leave-one-out CV respectively). However, a robust choice for the number of CV-folds is 5 or 10 (depending on the ensemble size).

Example:

 -- Setting the number of CV folds equal to 5
ENKF_CV_FOLDS 5


Requires that the ENKF_LOCAL_CV keyword is set to TRUE

##### ENKF_FORCE_NCOMP

Bool specifying if we want to force the subspace dimension we want to use in the EnKF updating scheme (SVD-based) to a specific integer. This is an alternative to selecting the dimension using ENKF_TRUNCATION or ENKF_LOCAL_CV.

Example:

 -- Setting the the subspace dimension to 2
ENKF_FORCE_NCOMP     TRUE
ENKF_NCOMP              2


##### ENKF_LOCAL_CV

Boolean specifying if we want to select the subspace dimension in the SVD-based EnKF algorithm using Cross-Validation (CV) [1]. This is a more robust alternative to selecting the subspace dimension based on the estimated singular values (See ENKF_TRUNCATION), because the predictive power of the estimated Kalman gain matrix is taken into account.

Example:

 -- Select the subspace dimension using Cross-Validation
ENKF_LOCAL_CV TRUE

##### ENKF_PEN_PRESS

Boolean specifying if we want to select the subspace dimension in the SVD-based EnKF algorithm using Cross-Validation (CV), and a penalised version of the predictive error sum of squares (PRESS) statistic [2]. This is recommended when overfitting is a severe problem (and when the number of ensemble members is small)

Example:

 -- Select the subspace dimension using Cross-Validation
ENKF_LOCAL_CV TRUE

-- Using penalised PRESS statistic
ENKF_PEN_PRESS TRUE


##### ENKF_MODE

The ENKF_MODE keyword is used to select which EnKF algorithm to use. Use the value STANDARD for the original EnKF algorithm, or SQRT for the so-called square root scheme. The default value for ENKF_MODE is STANDARD.

Example A:

  -- Using the square root update
ENKF_MODE SQRT


Example B:

  -- Using the standard update
ENKF_MODE STANDARD


The ENKF_MODE keyword is optional.

##### ENKF_MERGE_OBSERVATIONS

If you use the ENKF_SCHED_FILE option to jump over several dates at a time you can choose whether you want to use all the observations in between, or just the final. If set to TRUE, all observations will be used. If set to FALSE, only the final observation is used. The default value for ENKF_MERGE_OBSERVATIONS is FALSE.

Example:

  -- Merge observations
ENKF_MERGE_OBSERVATIONS TRUE

##### ENKF_NCOMP

Integer specifying the subspace dimension. Requires that ENKF_FORCE_NCOMP is TRUE.

##### ENKF_RERUN

This is a boolean switch - TRUE or FALSE. Should the simulation start from time zero after each update.

##### ENKF_SCALING

This is a boolean switch - TRUE (Default) or FALSE. If TRUE, we scale the data ensemble matrix to unit variance. This is generally recommended because the SVD-based EnKF algorithm is not scale invariant.

##### ENKF_TRUNCATION

Truncation factor for the SVD-based EnKF algorithm (see Evensen, 2007). In this algorithm, the forecasted data will be projected into a low dimensional subspace before assimilation. This can substantially improve on the results obtained with the EnKF, especially if the data ensemble matrix is highly collinear (Sætrom and Omre, 2010). The subspace dimension, p, is selected such that

$\frac{\sum_{i=1}^{p} s_i^2}{\sum_{i=1}^r s_i^2} \geq \mathrm{ENKF\_TRUNCATION},$

where si is the ith singular value of the centered data ensemble matrix and r is the rank of this matrix. This criterion is similar to the explained variance criterion used in Principal Component Analysis (see e.g. Mardia et al. 1979).

The default value of ENKF_TRUNCATION is 0.99. If ensemble collapse is a big problem, a smaller value should be used (e.g 0.90 or smaller). However, this does not guarantee that the problem of ensemble collapse will disappear. Note that setting the truncation factor to 1.00, will recover the Standard-EnKF algorithm if and only if the covariance matrix for the observation errors is proportional to the identity matrix.

##### UPDATE_LOG_PATH

A summary of the data used for updates are stored in this directory.

##### References
• Evensen, G. (2007). "Data Assimilation, the Ensemble Kalman Filter", Springer.
• Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). "Multivariate Analysis", Academic Press.
• Sætrom, J. and Omre, H. (2010). "Ensemble Kalman filtering with shrinkage regression techniques", Computational Geosciences (online first).

## Analysis module

The final EnKF linear algebra is performed in an analysis module. The keywords to load, select and modify the analysis modules are documented here.

The ANALYSIS_LOAD key is the main key to load an analysis module:

ANALYSIS_LOAD ANAME  analysis.so


The first argument ANAME is just an arbitrary unique name which you want to use to refer to the module later. The second argument is the name of the shared library file implementing the module, this can either be an absolute path as /path/to/my/module/ana.so or a relative file name as analysis.so. The module is loaded with dlopen() and the normal shared library search semantics applies.

##### ANALYSIS_SELECT

This command is used to select which analysis module to actually use in the updates:

ANALYSIS_SELECT ANAME


##### ANALYSIS_SET_VAR

The analysis modules can have internal state, like e.g. truncation cutoff values, these values can be manipulated from the config file using the ANALYSIS_SET_VAR keyword:

ANALYSIS_SET_VAR  ANAME  ENKF_TRUNCATION  0.97


To use this you must know which variables the module supports setting this way. If you try to set an unknown variable you will get an error message on stderr.

##### ANALYSIS_COPY

With the ANALYSIS_COPY keyword you can create a new instance of a module. This can be convenient if you want to run the same algorithm with the different settings:

ANALYSIS_LOAD   A1  analysis.so
ANALYISIS_COPY  A1  A2


We load a module analysis.so and assign the name A1; then we copy A1 -> A2. The module A1 and A2 are now 100% identical. We then set the truncation to two different values:

ANALYSIS_SET_VAR A1 ENKF_TRUNCATION 0.95
ANALYSIS_SET_VAR A2 ENKF_TRUNCATION 0.98

##### Developing analysis modules

In the analysis module the update equations are formulated based on familiar matrix expressions, and no knowledge of the innards of the ERT program are required. Some more details of how modules work can be found here modules.txt. In principle a module is 'just' a shared library following some conventions, and if you are sufficiently savy with gcc you can build them manually, but along with the ert installation you should have utility script ert_module which can be used to build a module; just write ert_module without any arguments to get a brief usage description.

The keywords in this section, controls advanced features of the enkf application. Insight in the internals of the enkf application and/or ECLIPSE may be required to fully understand their effect. Moreover, many of these keywords are defined in the site configuration, and thus optional to set for the user, but required when installing the enkf application at a new site.

Real low level fix for some SCHEDULE parsing problems.

The restart files from ECLIPSE are organized by keywords, which are of three different types:

1. Keywords containing the dynamic solution, e.g. pressure and saturations.
2. Keywords containing various types of header information which is needed for a restart.
3. Keywords containing various types of diagnostic information which is not needed for a restart.

Keywords in category 2 and 3 are referred to as static keywords. To be able to restart ECLIPSE, the enkf application has to store the keywords in category 2, whereas keywords in category 3 can safely be dropped. To determine whether a particular keyword is in category 2 or 3 the enkf considers an internal list of keywords. The current list contains the keywords:

  INTEHEAD LOGIHEAD DOUBHEAD IGRP SGRP XGRP ZGRP IWEL SWEL XWEL ZWEL
ICON SCON XCON HIDDEN STARTSOL PRESSURE SWAT SGAS RS RV ENDSOL ICAQNUM ICAQ IAAQ
SCAQNUM SCAQ SAAQ ACAQNUM ACAQ XAAQ
ISEG ILBS ILBR RSEG ISTHW ISTHG


By using ADD_STATIC_KW you can dynamically add to this list. The magic string __ALL__ will add all static keywords. Use of the __ALL__ option is strongly discouraged, as it wastes a lot disk space.

##### DEFINE

With the DEFINE keyword you can define key-value pairs which will be substituted in the rest of the configuration file. The DEFINE keyword expects two arguments: A key and a value to replace for that key. Later instances of the key enclosed in '<' and '>' will be substituted with the value. The value can consist of several strings, in that case they will be joined by one single space.

Example:

-- Define ECLIPSE_PATH and ECLIPSE_BASE
DEFINE  ECLIPSE_PATH  /path/to/eclipse/run
DEFINE  ECLIPSE_BASE  STATF02
DEFINE  KEY           VALUE1       VALUE2 VALUE3            VALUE4

-- Set the GRID in terms of the ECLIPSE_PATH
-- and ECLIPSE_BASE keys.
GRID    <ECLIPSE_PATH>/<ECLIPSE_BASE>.EGRID


Observe that when you refer to the keys later in the config file they must be enclosed in '<' and '>'. Furthermore, a key-value pair must be defined in the config file before it can be used. The final key define above KEY, will be replaced with VALUE1 VALUE2 VALUE3 VALUE4 - i.e. the extra spaces will be discarded.

##### SCHEDULE_PREDICTION_FILE

This is the name of a schedule prediction file. It can contain %d to get different files for different members. Observe that the ECLIPSE datafile should include only one schedule file, even if you are doing predictions.

## Keywords related to running the forward model

##### FORWARD_MODEL

The FORWARD_MODEL keyword is used to define how the simulations are executed. E.g., which version of ECLIPSE to use, which rel.perm script to run, which rock physics model to use etc. Jobs (i.e. programs and scripts) that are to be used in the FORWARD_MODEL keyword must be defined using the INSTALL_JOB keyword. A set of default jobs are available, and by default FORWARD_MODEL takes the value ECLIPSE100.

The FORWARD_MODEL keyword expects a series of keywords, each defined with INSTALL_JOB. The enkf will execute the jobs in sequentially in the order they are entered. Note that the ENKF_SCHED_FILE keyword can be used to change the FORWARD_MODEL for sub-sequences of the run.

Example A:

  -- Suppose that "MY_RELPERM_SCRIPT" has been defined with
-- the INSTALL_JOB keyword. This FORWARD_MODEL will execute
-- "MY_RELPERM_SCRIPT" before ECLIPSE100.
FORWARD_MODEL MY_RELPERM_SCRIPT ECLIPSE100


Example B:

  -- Suppose that "MY_RELPERM_SCRIPT" and "MY_ROCK_PHYSICS_MODEL"
-- has been defined with the INSTALL_JOB keyword.
-- This FORWARD_MODEL will execute "MY_RELPERM_SCRIPT", then
-- "ECLIPSE100" and in the end "MY_ROCK_PHYSICS_MODEL".
FORWARD_MODEL MY_RELPERM_SCRIPT ECLIPSE100 MY_ROCK_PHYSICS_MODEL


For advanced jobs you can pass string arguments to the job using a KEY=VALUE based approach, this is further described in: passing arguments. In available jobs in enkf you can see a list of the jobs which are available.

##### JOB_SCRIPT

Running the forward model from enkf is a multi-level process which can be summarized as follows:

1. A Python module called jobs.py is written and stored in the directory where the forward simulation is run. The jobs.py module contains a list of job-elements, where each element is a Python representation of the code entered when installing the job.
2. The enkf application submits a Python script to the enkf queue system, this script then loads the jobs.py module to find out which programs to run, and how to run them.
3. The job_script starts and monitors the individual jobs in the jobs.py module.

The JOB_SCRIPT variable should point at the Python script which is managing the forward model. This should normally be set in the site wide configuration file.

##### QUEUE_SYSTEM

The keyword QUEUE_SYSTEM can be used to control where the simulation jobs are executed. It can take the values LSF, TORQUE, RSH and LOCAL.

The LSF option will submit jobs to the LSF cluster at your location, and is recommended whenever LSF is available.

The TORQUE option will submit jobs to the TORQUE a torque based system, using the commands qsub, qstat etc., if available.

If you do not have access to LSF or TORQUE you can submit to your local workstation using the LOCAL option and to homemade cluster of workstations using the RSH option. All of the queue systems can be further configured, see separate sections.

Example:

-- Tell ert to use the LSF cluster.
QUEUE_SYSTEM LSF


The QUEUE_SYSTEM keyword is optional, and usually defaults to LSF (this is site dependent).

## Configuring LSF access

The LSF system is the most useful of the queue alternatives, and also the alternative with most options. The most important options are related to how ert should submit jobs to the LSF system. Essentially there are two methods ert can use when submitting jobs to the LSF system:

1. For workstations which have direct access to LSF ert can submit directly with no further configuration. This is preferred solution, but unfortunately not very common.
2. Alternatively ert can issue shell commands to bsub/bjobs/bkill to submit jobs. These shell commands can be issued on the current workstation, or alternatively on a remote workstation using ssh.

The main switch between alternatives 1 and 2 above is the LSF_SERVER option.

##### LSF_SERVER

By using the LSF_SERVER option you essentially tell ert two things about how jobs should be submitted to LSF:

1. You tell ert that jobs should be submitted using shell commands.
2. You tell ert which server should be used when submitting

So when your configuration file has the setting:

LSF_SERVER   be-grid01


ert will use ssh to submit your jobs using shell commands on the server be-grid01. For this to work you must have passwordless ssh to the server be-grid01. If you give the special server name LOCAL ert will submit using shell commands on the current workstation.

##### bsub/bjobs/bkill options

By default ert will use the shell commands bsub,bjobs and bkill to interact with the queue system, i.e. whatever binaries are first in your PATH will be used. For fine grained control of the shell based submission you can tell ert which programs to use:

QUEUE_OPTION   LSF  BJOBS_CMD  /path/to/my/bjobs
QUEUE_OPTION   LSF  BSUB_CMD   /path/to/my/bsub


Example 1

LSF_SERVER    be-grid01
QUEUE_OPTION  LSF     BJOBS_CMD   /path/to/my/bjobs
QUEUE_OPTION  LSF     BSUB_CMD    /path/to/my/bsub


In this example we tell ert to submit jobs from the workstation be-grid01 using custom binaries for bsub and bjobs.

Example 2

LSF_SERVER   LOCAL


In this example we will submit on the current workstation, without using ssh first, and we will use the default bsub and bjobs executables. The remaining LSF options apply irrespective of which method has been used to submit the jobs.

##### LSF_QUEUE

The name of the LSF queue you are running ECLIPSE simulations in.

##### MAX_RUNNING_LSF

The keyword MAX_RUNNING_LSF controls the maximum number of simultaneous jobs submitted to the LSF (Load Sharing Facility) queue when using the LSF option in QUEUE_SYSTEM.

Example:

  -- Submit no more than 30 simultaneous jobs
-- to the LSF cluster.
MAX_RUNNING_LSF 30


## Configuring TORQUE access

The TORQUE system is the only available system on some clusters. The most important options are related to how ert should submit jobs to the TORQUE system.

1. Currently, the TORQUE option only works when the machine you are logged into have direct access to the queue system. ert then submit directly with no further configuration.

The most basic invocation is in other words:

QUEUE_SYSTEM TORQUE

##### qsub/qstat/qdel options

By default ert will use the shell commands qsub,qstat and qdel to interact with the queue system, i.e. whatever binaries are first in your PATH will be used. For fine grained control of the shell based submission you can tell ert which programs to use:

QUEUE_SYSTEM TORQUE
QUEUE_OPTION TORQUE QSUB_CMD /path/to/my/qsub
QUEUE_OPTION TORQUE QSTAT_CMD /path/to/my/qstat
QUEUE_OPTION TORQUE QDEL_CMD /path/to/my/qdel



In this example we tell ert to submit jobs using custom binaries for bsub and bjobs.

##### Name of queue

The name of the TORQUE queue you are running ECLIPSE simulations in.

QUEUE_OPTION TORQUE QUEUE name_of_queue

##### Name of cluster (label)

The name of the TORQUE cluster you are running ECLIPSE simulations in. This might be a label (serveral clusters), or a single one, as in this example baloo.

QUEUE_OPTION TORQUE CLUSTER_LABEL baloo

##### Max running jobs

The queue option MAX_RUNNING controls the maximum number of simultaneous jobs submitted to the queue when using (in this case) the TORQUE option in QUEUE_SYSTEM.

  QUEUE_SYSTEM TORQUE
-- Submit no more than 30 simultaneous jobs
-- to the TORQUE cluster.
QUEUE_OPTION TORQUE MAX_RUNNING 30

##### Queue options controlling number of nodes and CPUs

When using TORQUE, you must specify how many nodes a single job is should to use, and how many CPUs per node. The default setup in ert will use one node and one CPU. These options are called NUM_NODES and NUM_CPUS_PER_NODE.

If the numbers specified is higher than supported by the cluster (i.e. use 32 CPUs, but no node has more than 16), the job will not start.

If you wish to increase these number, the program running (typically ECLIPSE) will usually also have to be told to correspondingly use more processing units (keyword PARALLEL)

  QUEUE_SYSTEM TORQUE
-- Use more nodes and CPUs
-- in the TORQUE cluster per job submitted
-- This should (in theory) allow for 24 processing
-- units to be used by eg. ECLIPSE
QUEUE_OPTION TORQUE NUM_NODES 3
QUEUE_OPTION TORQUE NUM_CPUS_PER_NODE 8

##### Keep output from qsub

Sometimes the error messages from qsub can be useful, if something is seriously wrong with the environment or setup. To keep this output (stored in your home folder), use this:

QUEUE_OPTION TORQUE KEEP_QSUB_OUTPUT 1


## Configuring the LOCAL queue

##### MAX_RUNNING_LOCAL

The keyword MAX_RUNNING_LOCAL controls the maximum number of simultaneous jobs running when using the LOCAL option in QUEUE_SYSTEM. It is strongly recommended to not let MAX_RUNNING_LOCAL exceed the number of processors on the workstation used.

Example:

  -- No more than 3 simultaneous jobs
MAX_RUNNING_LOCAL 3


## Configuring the RSH queue

##### RSH_HOST

You can run the forward model in enkf on workstations using remote-shell commands. To use the RSH queue system you must first set a list of computers which enkf can use for running jobs:

RSH_HOST   computer1:2  computer2:2   large_computer:8


Here you tell enkf that you can run on three different computers: computer1, computer2 and large_computer. The two first computers can accept two jobs from enkf, and the last can take eight jobs. Observe the following when using RSH:

1. You must have passwordless login to the computers listed in RSH_HOST otherwise it will fail hard.
2. enkf will not consider total load on the various computers; if have said it can take two jobs, it will get two jobs, irrespective of the existing load.
##### RSH_COMMAND

This is the name of the executable used to invoke remote shell operations. Will typically be either rsh or ssh. The command given to RSH_COMMAND must either be in PATH or an absolute path.

##### MAX_RUNNING_RSH

The keyword MAX_RUNNING_RSH controls the maximum number of simultaneous jobs running when using the RSH option in QUEUE_SYSTEM. It MAX_RUNNING_RSH exceeds the total capacity defined in RSH_HOST, it will automatically be truncated to that capacity.

Example:

  -- No more than 10 simultaneous jobs
-- running via RSH.
MAX_RUNNING_RSH 10


## Keywords related to plotting

##### IMAGE_VIEWER

The enkf application has some limited plotting capabilities. The plotting is based on creating a graphics file (currently a png file) and then viewing that file with an external application. The current default image viewer is a program called /usr/bin/display, but you can set IMAGE_VIEWER to point to another binary if that is desired. In particular it can be interesting to set as

IMAGE_VIEWER  /d/proj/bg/enkf/bin/noplot.sh


then the plot files will be created, but they will not be flashing in your face (which can be a bit annoying).

##### IMAGE_TYPE

This switch control the type of the plot figures/images created by the PLPLOT plot driver. It is by default set to png which works fine, but you can probably(??) use other popular graphics formats like gif and jpg as well.

##### PLOT_DRIVER

This is the name of the sub system used for creating plots. The default system is called 'PLPLOT' - all the other options regarding plotting are sub options which are only relevant when you are using PLPLOT. In addition to PLPLOT you can chose the value 'TEXT'; this will actually not produce any plots, just textfiles which can be used for plotting with your favorite plotting program. This is particularly relevant if you have some special requirements to the plots.

##### PLOT_ERRORBAR

Should errorbars on the observations be plotted?

##### PLOT_ERRORBAR_MAX

When plotting summary vectors for which observations have been 'installed' with the OBS_CONFIG keyword, ert will plot the observed values. If you have less than PLOT_ERRORBAR_MAX observations ert will use errorbars to show the observed values, otherwise it will use two dashed lines indicating +/- one standard deviation. This option is only meaningful when PLOT_PLOT_ERRORBAR is activated.

To ensure that you always get errorbars you can set PLOT_ERRORBAR_MAX to a very large value, on the other hand setting PLOT_ERRORBAR_MAX to 0 will ensure that ert always plots observation uncertainty using dashed lines of +/- one standard deviation.

The setting here will also affect the output when you are using the TEXT driver to plot.

##### PLOT_HEIGHT

When the PLPLOT driver creates a plot file, it will have the height (in pixels) given by the PLOT_HEIGHT keyword. The default value for PLOT_HEIGHT is 768 pixels.

##### PLOT_REFCASE

Boolean variable which is TRUE if you want to add the refcases to the plots.

Ex.
PLOT_REFCASE TRUE
##### REFCASE_LIST

Provide one or more Eclipse .DATA files for a refcase to be added in the plots. This refcase will be plotted in different colours. The summary files related to the refcase should be in the same folder as the refcase.

Ex.
REFCASE_LIST /path/to/refcase1/file1.DATA /path/to/refcase2/file2.DATA
##### PLOT_PATH

The plotting engine creates 'files' with plots, they are stored in a directory. You can tell what that directory should be. Observe that the current 'casename' will automatically be appended to the plot path.

##### PLOT_WIDTH

When the PLPLOT driver creates a plot file, it will have the width (in pixels) given by the PLOT_WIDTH keyword. The default value for PLOT_WIDTH is 1024 pixels. To create plots of half the size you use:

PLOT_HEIGHT   384
PLOT_WIDTH    512

##### RFT_CONFIG

RFT_CONFIGS argument is a file with the name of the rfts followed by date (day month year) Ex.

RFT_CONFIG  ../models/wells/rft/WELLNAME_AND_RFT_TIME.txt

Where the contents of the file is something like

be-linapp16(inmyr) -/models/wells/rft 34> more WELLNAME_AND_RFT_TIME.txt
A-1HP  06 05 1993
A-9HW  31 07 1993
C-1HP  11 12 2007
C-5HP  21 12 1999
C-6HR  09 11 1999
D-4HP  10 07 2003
K-3HW  09 02 2003
K-6HW  08 11 2002
K-7HW  21 04 2005
D-6HP  22 04 2006

##### RFTPATH

RFTPATHs argument is the path to where the rft-files are located

RFTPATH  ../models/wells/rft/

## Workflows

The Forward Model in ERT runs in the context of a single realization, i.e. there is no communication between the different processes, and collective gather operations must be performed by the ERT core program after the forward model has completed. As an alternative to the forward model ERT has a system with workflows. Using workflows you can automate cumbersome normal ERT processes, and also invoke external programs. The workflows are run serially on the workstation actually running ERT, and should not be used for computationally heavy tasks.

Configuring workflows in ERT consists of two steps: installing the jobs which should be available for ERT to use in workflows, and then subsequently assemble one or more jobs, with arguments, in a workflow.

### Workflow jobs

The workflow jobs are quite similar to the jobs in the forward model, in particular the jobs are described by a configuration file which resembles the one used by the forward model jobs. The workflow jobs can be of two fundamentally different types:

INTERNAL
These jobs invoke a function in the address space of the ERT program itself. The functions are called with the main enkf_main instance as a self argument, and can in principle do anything that ERT can do itself. ERT functions which should be possible to invoke like this must be 'marked as exportable' in the ERT code, but that is a small job. The internal jobs have the following sections in their config file:
INTERNAL  TRUE                     -- The job will call an internal function of the current running ERT instance.
FUNCTION  enkf_main_plot_all       -- Name of the ERT function we are calling; must be marked exportable.
MODULE    /name/of/shared/library  -- Very optional - to load an extra shared library.

EXTERNAL
These jobs invoke an external program/script to do the job, this is very similar to the jobs of the forward model. Context must be passed between the main ERT process and the script through the use of string substitution, in particular the 'magic' key <RUNPATH_FILE> has been introduced for this purpose.
INTERNAL   FALSE                    -- This is the default - not necessary to include.
EXECUTABLE /path/to/a/program       -- Path to a program/script which will be invoked by the job.


In addition to the INTERNAL, FUNCTION, MODULE and EXECUTABLE keys which are used to configure what the job should do there are some keys which can be used to configure the number of arguments and their type. These arguments apply to both internal and external jobs:

MIN_ARG    2                 -- The job should have at least 2 arguments.
MAX_ARG    3                 -- The job should have maximum 3 arguments.
ARG_TYPE   0    INT          -- The first argument should be an integer
ARG_TYPE   1    FLOAT        -- The second argument should be a float value
ARG_TYPE   2    STRING       -- The third argument should be a string - the default.


The MIN_ARG,MAX_ARG and ARG_TYPE arguments are used to validate workflows.

#### Example 1 : Plot variables

-- FILE: PLOT --
INTERNAL  TRUE
FUNCTION  ert_tui_plot_JOB
MIN_ARG   1


This job will use the ERT internal function ert_tui_plot_JOB to plot an ensemble of an arbitrary ERT variable. The job needs at least one argument; there is no upper limit on the number of arguments.

#### Example 2 : Run external script

-- FILE: ECL_HIST --
EXECUTABLE  Script/ecl_hist.py
MIN_ARG     3


This job will invoke the external script Script/ecl_host.py; the script should have at least three commandline arguments. The path to the script, Script/ecl_hist.py is interpreted relative to the location of the configuration file.

Before the jobs can be used in workflows they must be 'loaded' into ERT. This is done with two different ERT keywords:

LOAD_WORKFLOW_JOB     jobConfigFile     JobName


The LOAD_WORKFLOW_JOB keyword will load one workflow. The name of the job is optional, if not provided the job will get name from the configuration file. Alternatively you can use the command WORKFLOW_JOB_DIRECTORY which will load all the jobs in a directory. The command:

WORKFLOW_JOB_DIRECTORY /path/to/jobs


will load all the workflow jobs in the /path/to/jobs directory. Observe that all the files in the /path/to/jobs directory should be job configuration files. The jobs loaded in this way will all get the name of the file as the name of the job.

### Complete Workflows

A workflow is a list of calls to jobs, with additional arguments. The job name should be the first element on each line. Based on the two jobs PLOT and ECL_HIST we can create a small workflow example:

PLOT      WWCT:OP_1   WWCT:OP_3  PRESSURE:10,10,10
PLOT      FGPT        FOPT
ECL_HIST  <RUNPATH_FILE>   <QC_PATH>/<ERTCASE>/wwct_hist   WWCT:OP_1  WWCT:OP_2


In this workflow we create plots of the nodes WWCT:OP_1;WWCT:OP_3,PRESSURE:10,10,10,FGPT and FOPT. The plot job we have created in this example is completely general, if we limited ourselves to ECLIPSE summary variables we could get wildcard support. Then we invoke the ECL_HIST example job to create a histogram. See below for documentation of <RUNPATH_FILE>,<QC_PATH> and <ERTCASE>.

LOAD_WORKFLOW  /path/to/workflow/WFLOW1


The LOAD_WORKFLOW takes the path to a workflow file as the first argument. By default the workflow will be labeled with the filename internally in ERT, but optionally you can supply a second extra argument which will be used as name for the workflow. Alternatively you can load a workflow interactively.

#### Running workflows

Go to workflow menu and type run.

#### Locating the realisations: <RUNPATH_FILE>

Many of the external workflow jobs involve looping over all the realisations in a construction like this:

for each realisation:
// Do something for realisation
summarize()


When running an external job in a workflow there is no direct transfer of information between the main ERT process and the external script. We therefor must have a convention for transfering the information of which realisations we have simulated on, and where they are located in the filesystem. This is done through a file which looks like this:

0   /path/to/real0  CASE_0000
1   /path/to/real1  CASE_0001
...
9   /path/to/real9  CASE_0009


The name and location of this file is available as the magical string <RUNPATH_FILE> and that is typically used as the first argument to external workflow jobs which should iterate over all realisations. The realisations referred to in the <RUNPATH_FILE> are meant to be last simulations you have run; the file is updated every time you run simulations. This implies that it is (currently) not so convenient to alter which directories should be used when running a workflow.

## Built in workflow jobs

ERT comes with a list of default workflow jobs which invoke internal ERT functionality. The internal workflows include:

### Jobs related to case management

#### SELECT_CASE

The job SELECT_CASE can be used to change the currently selected case. The SELECT_CASE job should be used as:

SELECT_CASE  newCase


if the case newCase does not exist it will be created.

#### CREATE_CASE

The job CREATE_CASE can be used to create a new case without selecting it. The CREATE_CASE job should be used as:

CREATE_CASE  newCase


#### INIT_CASE_FROM_EXISTING

The job INIT_CASE_FROM_EXISTING can be used to initialize a case from an existing case. The argument to the workflow should be the name of the workflow you are initializing from; so to initialize the current case from the existing case "oldCase":

INIT_CASE_FROM_EXISTING oldCase


By default the job will initialize the 'current case', but optionally you can give the name of a second case which should be initialized. In this example we will initialize "newCase" from "oldCase":

INIT_CASE_FROM_EXISTING oldCase newCase


When giving the name of a second case as target for the initialization job the 'current' case will not be affected.

### Jobs related to export

#### EXPORT_FIELD

The EXPORT_FIELD workflow job exports field data to roff or grdecl format dependent on the extension of the export file argument.The job takes the following arguments:

1. Field to be exported
2. Filename for export file, must contain %d
3. Report_step
4. State
5. Realization range

The filename must contain a %d. This will be replaced with the realization number.

The state parameter is either FORECAST or ANALYZED, BOTH is not supported.

The realization range parameter is optional. Default is all realizations.

Example use of this job in a workflow:

EXPORT_FIELD PERMZ path_to_export/filename%d.grdecl 0 FORECAST 0,2


#### EXPORT_FIELD_RMS_ROFF

The EXPORT_FIELD_RMS_ROFF workflow job exports field data to roff format. The job takes the following arguments:

1. Field to be exported
2. Filename for export file, must contain %d
3. Report_step
4. State
5. Realization range

The filename must contain a %d. This will be replaced with the realization number.

The state parameter is either FORECAST or ANALYZED, BOTH is not supported.

The realization range parameter is optional. Default is all realizations.

Example uses of this job in a workflow:

EXPORT_FIELD_RMS_ROFF PERMZ path_to_export/filename%d.roff 0 FORECAST
EXPORT_FIELD_RMS_ROFF PERMX path_to_export/filename%d 0 FORECAST 0-5


#### EXPORT_FIELD_ECL_GRDECL

The EXPORT_FIELD_ECL_GRDECL workflow job exports field data to grdecl format. The job takes the following arguments:

1. Field to be exported
2. Filename for export file, must contain %d
3. Report_step
4. State
5. Realization range

The filename must contain a %d. This will be replaced with the realization number.

The state parameter is either FORECAST or ANALYZED, BOTH is not supported.

The realization range parameter is optional. Default is all realizations.

Example uses of this job in a workflow:

EXPORT_FIELD_ECL_GRDECL PERMZ path_to_export/filename%d.grdecl 0 ANALYZED
EXPORT_FIELD_ECL_GRDECL PERMX path_to_export/filename%d 0 ANALYZED 0-5


#### EXPORT_RUNPATH

The EXPORT_RUNPATH workflow job writes the runpath file RUNPATH_FILE for the selected case.

The job can have no arguments, or one can set a range of realizations and a range of iterations as arguments.

Example uses of this job in a workflow:

EXPORT_RUNPATH


With no arguments, entries for all realizations are written to the runpath file. If the runpath supports iterations, entries for all realizations in iter0 are written to the runpath file.

EXPORT_RUNPATH 0-5 | *


A range of realizations and a range of iterations can be given. "|" is used as a delimiter to separate realizations and iterations. "*" can be used to select all realizations or iterations. In the example above, entries for realizations 0-5 for all iterations are written to the runpath file.

### Jobs related to analysis update

#### ANALYSIS_UPDATE

This job will perform a update based on the current case, it is assumed that you have already completed the necessary simulations. By default the job will use all available data in the conditioning and store the updated parameters as the new initial parameters of the current case. However you can use optional argument to control which case the parameters go to, at which report step they are stored and also which report steps are considered when assembling the data. In the simplest form the ANALYSIS_UPDATE job looks like this:

ANALYSIS_UPDATE


In this case the initial parameters in the current case will be updated; using all available data in the conditioning process. In the example below we redirect the updated parameters to the new case NewCase:

ANALYSIS_UPDATE NewCase


Optionally we can decide to update the parameters at a later stage, i.e. for instance at report step 100:

ANALYSIS_UPDATE * 100


The '*' above means that we should update parameters in the current case. Finally we can limit the report steps used for data:

ANALYSIS_UPDATE NewCaseII  0   10,20,30,40,100,120-200


In the last example 10,20,30,40,100,120-200 mean the report steps we are considering when updating. Observe that when we use the first argument to specify a new case the will be created if it does not exist, but not selected.

#### ANALYSIS_ENKF_UPDATE

The ANALYSIS_ENKF_UPDATE job will do an EnKF update at the current report step. The job requires the report step as the first argument:

ANALYSIS_ENKF_UPDATE  10


by default the ENKF_UPDATE will use the observed data at the updatestep, but you can configure it use the report steps you like for data. In the example below the parameters at step 20 will be updated based on the observations at report step 0,5,10,15,16,17,18,19,20:

ANALYSIS_ENKF_UPDATE  20  0,5,10,15-20


The ANALYSIS_ENKF_UPDATE job is a special case of the ANALYSIS_UPDATE job, in principle the same can be achieved with the ENKF_UPDATE job.

### Jobs related to running simulations - including updates

#### RUN_SMOOTHER

The RUN_SMOOTHER job will run a simulation and perform an update. The updated parameters are default stored as the new initial parameters of the current case. Optionally the job can take 1 or 2 parameters. The case to store the updated parameters in can be specified as the first argument. A second argument can be specified to run a simulation with the updated parameters.

Run a simulation and an update. The updated parameters are stored as the new initial parameters of the current case:

RUN_SMOOTHER


Run a simulation and an update. Store the updated parameters in the specified case. This case is created if it does not exist:

RUN_SMOOTHER new_case


Run a simulation and an update. Store the updated parameters in the specified case, then run a simulation on this case:

RUN_SMOOTHER new_case true


Run a simulation and an update. Store the updated parameters in the current case, then run a simulation again. Specify "*" to use the current case:

RUN_SMOOTHER * true


#### RUN_SMOOTHER_WITH_ITER

This is exactly like the RUN_SMOOTHER job, but it has an additional first argumeent iter which can be used to control the iter number in the RUNPATH. When using the RUN_SMOOTHER job the iter number will be defaultetd to zero, and one in the optional rerun.

#### ENSEMBLE_RUN

The ENSEMBLE_RUN job will run a simulation, no update. The job take as optional arguments a range and/or list of which realizations to run.

ENSEMBLE_RUN

ENSEMBLE_RUN 1-5, 8


The LOAD_RESULTS loads result from simulation(s). The job takes as optional arguments a range and/or list of which realizations to load results from. If no realizations are specified, results for all realizations are loaded.

LOAD_RESULTS

LOAD_RESULTS 1-5, 8


In the case of multi iteration jobs, like e.g. the integrated smoother update, the LOAD_RESULTS job will load the results from iter==0. To control which iteration is loaded from you can use the LOAD_RESULTS_ITER job.

The LOAD_RESULTS_ITER job is similar to the LOAD_RESULTS job, but it takes an additional first argument which is the iteration number to load from. This should be used when manually loading results from a multi iteration workflow:

LOAD_RESULTS_ITER

LOAD_RESULTS_ITER 3 1-3, 8-10


Will load the realisations 1,2,3 and 8,9,10 from the fourth iteration (counting starts at zero).

### Jobs for ranking realizations

#### OBSERVATION_RANKING

The OBSERVATION_RANKING job will rank realizations based on the delta between observed and simulated values for selected variables and time steps. The data for selected variables and time steps are summarized for both observed and simulated values, and then the simulated versus observed delta is used for ranking the realizations in increasing order. The job takes a name for the ranking as the first parameter, then the time steps, a "|" character and then variables to rank on. If no time steps and/or no variables are given, all time steps and variables are taken into account.

Rank the realizations on observation/simulation delta value for all WOPR data for time steps 0-20:

OBSERVATION_RANKING Ranking1 0-20 | WOPR:*


Rank the simulations on observation/simulation delta value for all WOPR and WWCT data for time steps 1 and 10-50

OBSERVATION_RANKING Ranking2 1, 10-50 | WOPR:* WWCT:*


Rank the realizations on observation/simulation delta value for WOPR:OP-1 data for all time steps

OBSERVATION_RANKING Ranking3 | WOPR:OP-1


#### DATA_RANKING

The DATA_RANKING job will rank realizations in increasing or decreasing order on selected data value for a selected time step. The job takes as parameters the name of the ranking, the data key to rank on, increasing order and selected time steps. If no time step is given, the default is the last timestep.

Rank the realizations on PORO:1,2,3 on time step 0 in decreasing order

DATA_RANKING Dataranking1 PORO:1,2,3 false 0


#### EXPORT_RANKING

The EXPORT_RANKING job exports ranking results to file. The job takes two parameters; the name of the ranking to export and the file to export to.

EXPORT_RANKING Dataranking1 /tmp/dataranking1.txt


#### INIT_MISFIT_TABLE

Calculating the misfit for all observations and all timesteps can potentially be a bit timeconsuming, the results are therefor cached internally. If you need to force the recalculation of this cache you can use the INIT_MISFIT_TABLE job to initialize the misfit table that is used in observation ranking.

INIT_MISFIT_TABLE


## QC

The QC system is mainly based on workflows.

### QC_WORKFLOW

Name of an existing workflow to do QC work. Will be invoked automatically when a ensemble simulation has been completed, can alternatively be invoked from the QC menu.

## Creating reports

ERT has a limited capability to create pdf reports based on a LaTeX template and the plots you have created. The process for creating reports works like this:

1. You select a report template using the REPORT_LIST keyword.
2. ERT will insantiate a LaTeX report file by using your template, and performing some substitutions. The LaTeX report will be stored in a temporary directory /tmp/latex-XXXXXX.
3. pdflatex is used to compile the latex file into a pdf file

### Format of the template file

The template file should mostly be ordinary LaTeX, but when instantiating ERT will search and replace some strings. The most important are:

$PLOT_CASE This will be replaced with the name of the current case, and that is used to locate the active figures.$WELL_LIST
This will be expanded to a list of well names, REPORT_WELL_LIST below.
$GROUP_LIST This will be expanded to a list of group names, REPORT_GROUP_LIST below.$CONFIG_FILE
The full path of the config file currently in use.
$USER The username of the current user. ### Template loops The template can have a very simple for loop construction. The syntax of the for loop is as follows: {% for x in [a,b,c,d] %} %% Do something with x {% endfor %}  The whole concept is based on regular expressions and is quite picky on the format. Observe the following: 1. CaSe MAttErs - i.e. {% For ... %} with a capital 'F' will not work. 2. The loop variable must start with$ or a letter, followed by an arbitrary number of letters and numbers. I.e. $well, x and ab21 are all valid variable names, whereas 5b, __internal and #var are examples of invalid variable names. 3. The behaviour of the matching in the body depends on whether the variable starts with '$' or not:
1. If the variable starts with '$' embedded substrings will be matched - i.e. WWCT$well will be expanded to e.g. WWCTOP-1.
2. If the variable starts with an alphabet character substrings will not be replaced - i.e. the well in wellname will not be touched.
4. All the spaces (underlined here) in {% for x in [a,b,c,d] %} and {% endfor %} must be present; you can have more spaces if you like.
5. A missing {% endfor %} will be detected with a warning; all other errors will go undected, producing something different from what you wanted, it will probably not even compile.

### Problems

When LaTeX compiling you will get a prompt like this on the screen:

Creating report Reports/<Case>/<Name.pdf>  [Work-path:/tmp/latex-XXXXXX] ......


This means that ERT has created the directory /tmp/latex-XXXXXX and populated that with the file which is compiled. If there are LaTeX problems of some kind you must go to this directory to check out what is wrong, and then fix the source template. When the compilation is finished ERT will print:

Creating report Reports/<Case>/<Name.pdf>  [Work-path:/tmp/latex-XXXXXX] ...... OK??


As indicated by the OK?? it is quite difficult for ERT to assert that the compilation has been successfull, so the pdf file must be opened with e.g. acroread to be certain.

### REPORT_CONTEXT

With the report context word you can define key,value pairs which will be used in a search-and-replace operation in the template. Eg:

REPORT_CONTEXT $FIELD Snorre REPORT_CONTEXT$MODEL "DG-X sensitivity studies"


Here every occurence of $FIELD will be replaced with 'Snorre' and every occurence of$MODEL will be replaced with 'DG-X sensitivity studies'. Observe that the config parser expects that the REPORT_CONTEXT keyword gets two space separated arguments, so quoting with "" is necessary when the value consists of several words. The use of a '$' prefix on the keys above is just a suggestion, and not a rule. ### REPORT_LIST This should be a list of LaTeX templates which you want to use for creating reports. The arguments in this list should either be the path to an existing file, or alternatively the name of a file which can be found in REPORT_SEARCH_PATH. The search order will be to first look directly in the filesystem, and then subsequently go through the paths listed in REPORT_SEARCH_PATH The filename can optionally be followed by a :Name, in that case the created report will be renamed Name.pdf irrespective of the name of the template file. Example: REPORT_SEARCH_PATH /common/report/path REPORT_LIST templates/report.tex /some/absolute/path/report.tex:Report2 well_report.tex:snorre_wells.tex  In the example we specify templates for three different reports: 1. In the relative path templates/report.tex 2. In the absolute path /some/absolute/path/report.tex 3. Assuming there is no file well_report.tex in your current directory ERT will look in the paths specified by REPORT_SEARCH_PATH. Observe the two latter reports will be renamed Report2.pdf and snorre_wells.pdf respectively. ### REPORT_PATH The REPORT_PATH keyword is used to tell ERT where you want to store the finished pdf reports. A subdirectory with the current case name will be appended to this path: REPORT_LIST templates/well_report.tex templates/field_report.tex REPORT_PATH Reports  Assuming the selected case is called prior you will get the reports Reports/prior/well_report.pdf and Reports/prior/field_report.tex. ### REPORT_SEARCH_PATH It is possible to install LaTeX templates for reports in a common location, the REPORT_LIST keyword will then search for the templates in these locations. You can use the REPORT_SEARCH_PATH keyword in your config file, but the most relevant use is in the global site configuration file. REPORT_SEARCH_PATH /common/path/well_reports /common/path/group_reports REPORT_SEARCH_PATH /common/path/field_reports  ### REPORT_WELL_LIST By using the {% for x in [] %} construction in the templates it is possible to have report templates which loop over a list of wells. Unfortunately it is not very interesting to loop over all wells, because ECLIPSE has a limited amount of meta information about the wells, which means that injectors and producers will be treated equally. By using the REPORT_WELL_LIST keyword you can specify which wells you wish to include in the report, these well names will then be assembled into a list which will go into the$WELL_LIST keyword in the report template.

REPORT_WELL_LIST   C*  E1-H       -- Must have supplied a REFCASE for the '*' to work properly
REPORT_WELL_LIST   OP*


These well names are then assembled into a list and replace the symbol $WELL_LIST when creating the report. For this to work the report template should contain a section like: {% for$well in $WELL_LIST %} % Do something with$well
{% endfor %}


### REPORT_GROUP_LIST

This is just like the REPORT_WELL_LIST keyword, but for groups.

## Manipulating the Unix environment

The two keywords SETENV and UPDATE_PATH can be used to manipulate the Unix environment of the ERT process, tha manipulations only apply to the running ERT instance, and are not applied to the shell.

##### SETENV

You can use the SETENV keyword to alter the unix environment enkf is running in. This is probably most relevant for setting up the environment for the external jobs invoked by enkf.

Example:

-- Setting up LSF
SETENV  LSF_BINDIR      /prog/LSF/7.0/linux2.6-glibc2.3-x86_64/bin
SETENV  LSF_LIBDIR      /prog/LSF/7.0/linux2.6-glibc2.3-x86_64/lib
SETENV  LSF_UIDDIR      /prog/LSF/7.0/linux2.6-glibc2.3-x86_64/lib/uid
SETENV  LSF_SERVERDIR   /prog/LSF/7.0/linux2.6-glibc2.3-x86_64/etc
SETENV  LSF_ENVDIR      /prog/LSF/conf


Observe that the SETENV command is not as powerful as the corresponding shell utility. In particular you can not use $VAR to refer to the existing value of an environment variable. To add elements to the PATH variable it is easier to use the UPDATE_PATH keyword. ##### UPDATE_PATH The UPDATE_PATH keyword will prepend a new element to an existing PATH variable. I.e. the config UPDATE_PATH PATH /some/funky/path/bin  will be equivalent to the shell command: setenv PATH /some/funky/path/bin:$PATH

The whole thing is just a workaround because we can not use \$PATH.