The ERT code


The ERT system is built on several libraries, some of which may be of interest for other applications in reservoir modelling and management. The libraries are written in C. Some general principles have been applied quite rigorously throughout the code, and keeping these in mind should make the code easier to understand. Observe that the examples below are meant as documentation of principles and ideas, and are not verbatim pieces of the code. If there is a discrepancy between the documentation here and the code, the code is always right!

General principles

Opaque pointers

To facilitate encapsulation of information, a system of opaque pointers has been implemented throughout. For example, the ECLIPSE library libecl contains an implementation of a grid structure, ecl_grid_type. In brief, it is implemented as follows:

  1. The header file ecl_grid.h contains the declaration typedef struct ecl_grid_struct ecl_grid_type;
  2. The implementation of the ecl_grid_struct is in the file ecl_grid.c.
  3. All files using the ecl_grid functionality must then have #include <ecl_grid.h>, and call accessor functions in ecl_grid.c to manipulate/query the grid object.
  4. All functions operating on the ecl_grid datatype should have a pointer to an ecl_grid_type instance as their first argument.
//Header file: ecl_grid.h
...
typedef struct ecl_grid_struct ecl_grid_type;

ecl_grid_type * ecl_grid_alloc( const char * filename );
void            ecl_grid_free( ecl_grid_type * ecl_grid );
int             ecl_grid_get_nactive( const ecl_grid_type * ecl_grid );
// Implementation: ecl_grid.c


/* 
  Defining the data structure ecl_grid_struct, this definition is only visible inside this file. 
  Functions defined externally can not get at the internal fields of the ecl_grid struct. 
*/

struct ecl_grid_struct {
   int   size;
   int   nx,ny,nz;
   int   nactive;
   .....
};


/**
   Allocator function. Take a grid filename as input, 
   and allocate a grid instance based on this file.
*/
ecl_grid_type * ecl_grid_alloc( const char * filename ) {
   ecl_grid_type * ecl_grid = util_malloc( sizeof * ecl_grid , __func__);
   // Initialize the grid instance ...
   return ecl_grid;
}


/**
   Destructor - free all the resources used by the ecl_grid instance. 
*/
void ecl_grid_free( ecl_grid_type * ecl_grid ) {
  // free all the internal data of the ecl_grid instance. 
  free( ecl_grid );
}


/** 
   Accessor function - get the number of active cells from the grid.
*/
int ecl_grid_get_nactive( const ecl_grid_type * ecl_grid ) {
   return ecl_grid->nactive;
}


// External 'program' using ecl_grid_type.

/**
  This little program reads the name of EGRID/GRID file from the command line,
  loads the grid and then queries the grid how many active cells it has, before 
  freeing the grid again.
*/

#include <stdio.h>
#include <ecl_grid.h>

int main( int argc, char ** argv) {
   const char * grid_file   = argv[1];
   ecl_grid_type * ecl_grid = ecl_grid_alloc( grid_file );
   printf("The grid: %s has %d active cells \n",grid_file , ecl_grid_get_nactive( ecl_grid ));
   ecl_grid_free( ecl_grid );
   return 0;
}

File scope

Regarding files there are roughly two rules:

  1. The struct ecl_grid_struct is implemented in the file ecl_grid.h/ecl_grid.c, i.e. the name of the file corresponds to the basename of the type.
  2. Each file should implement only one exported datatype. Observe that many of the larger structs, e.g. ecl_grid_struct, are based on helper datatypes which are implemented in the same file; these helpers are not visible outside the scope of ecl_grid.c. There are a very few examples of files implementing more than one public datatype.


Naming conventions

All function names should be prefixed with the basename of the struct 'class' they are implementing, i.e. all functions implementing ecl_grid functionality should have a name starting with ecl_grid.

In addition to the prefix convention, there are some other naming principles which have been adhered to:

  1. The function names should describe quite well what the function does, and they are verbose.
  2. For functions where use of a stdlib function is an important part of the function, that stdlib name should be part of the name. I.e. for a function where writing to file with the fwrite() call is an important part, fwrite should be part of the name.
  3. Functions which implement set and get functionality should be named accordingly. Functions which implement set and get functionality of an indexed quantity should have _iget()/_iset() names.
  4. All types should have suffix _type.
  5. The elements in enumerations and macros defined with #define should be in UPPER CASE; everything else should be in lower case.
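
As a rough illustration of the conventions listed above, the following sketch implements a small hypothetical datatype called int_list. None of these names are taken from the ERT code; the type, the macro and all functions are made up for this example and only follow the naming rules.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example type, not part of the ERT libraries. */

#define INT_LIST_DEFAULT_SIZE 16                       /* Macro: upper case. */

typedef enum { INT_LIST_OK       = 0,                  /* Enum elements: upper case. */
               INT_LIST_IO_ERROR = 1 } int_list_status_type;

typedef struct int_list_struct int_list_type;          /* Type: suffix _type. */

struct int_list_struct {
   int   size;
   int * data;
};

/* All functions are prefixed with int_list; the constructor has _alloc in the name. */
int_list_type * int_list_alloc( int size ) {
   int_list_type * list = malloc( sizeof * list );
   assert( list != NULL );
   assert( size >= 0 );
   list->size = size;
   list->data = calloc( size > 0 ? size : 1 , sizeof * list->data );
   assert( list->data != NULL );
   return list;
}

int_list_type * int_list_alloc_default( void ) {
   return int_list_alloc( INT_LIST_DEFAULT_SIZE );
}

void int_list_free( int_list_type * list ) {
   free( list->data );
   free( list );
}

/* Plain get function for a scalar property. */
int int_list_get_size( const int_list_type * list ) {
   return list->size;
}

/* Indexed get and set functions: _iget / _iset. */
int int_list_iget( const int_list_type * list , int index ) {
   assert( index >= 0 && index < list->size );
   return list->data[index];
}

void int_list_iset( int_list_type * list , int index , int value ) {
   assert( index >= 0 && index < list->size );
   list->data[index] = value;
}

/* fwrite() does the real work here, so fwrite is part of the name. */
int_list_status_type int_list_fwrite( const int_list_type * list , FILE * stream ) {
   size_t written = fwrite( list->data , sizeof * list->data , list->size , stream );
   return (written == (size_t) list->size) ? INT_LIST_OK : INT_LIST_IO_ERROR;
}
```

A caller would typically combine these as int_list_alloc(), a few int_list_iset()/int_list_iget() calls, and finally int_list_free().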

Memory treatment

C has rather rudimentary memory management, so it is quite possible to mess this up. The ERT code employs several general principles which should simplify matters, but for more advanced usage you really need to understand memory management. Sorry.

Constructor / allocation function

All datatypes have at least one designated function for allocation with _alloc in the name. When the allocation routine returns, the instance should be in a fully initialized, internally consistent state. Many of the datatypes have several different functions for allocation and initialization; xxx_fread_alloc(), which bootstraps from a file, is a quite common example. Internally, many of the datatypes have static functions with names like xxx_alloc_empty(); these typically initialize parts of the data structure and return a partly initialized object.
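
The constructor pattern can be sketched as follows for a hypothetical datatype called series. The type, the function names and the binary file layout are assumptions made for this illustration and are not taken from the ERT code. Observe also that this sketch returns NULL when the stream is too short, whereas the ERT code would typically abort.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical datatype used only for this illustration. */
typedef struct series_struct series_type;

struct series_struct {
   int      size;
   double * data;
};

/* Internal helper: returns a partly initialized object, hence static. */
static series_type * series_alloc_empty( void ) {
   series_type * series = malloc( sizeof * series );
   assert( series != NULL );
   series->size = 0;
   series->data = NULL;
   return series;
}

/* Public constructor: the returned instance is fully initialized. */
series_type * series_alloc( int size , double init_value ) {
   series_type * series = series_alloc_empty();
   assert( size >= 0 );
   series->size = size;
   series->data = malloc( (size > 0 ? size : 1) * sizeof * series->data );
   assert( series->data != NULL );
   for (int i = 0; i < size; i++)
      series->data[i] = init_value;
   return series;
}

void series_free( series_type * series ) {
   free( series->data );
   free( series );
}

int series_get_size( const series_type * series ) {
   return series->size;
}

double series_iget( const series_type * series , int index ) {
   assert( index >= 0 && index < series->size );
   return series->data[index];
}

/* Alternative constructor: bootstrap from a binary stream laid out as
   <int size><size doubles> (an assumed layout). Returns NULL on a short read. */
series_type * series_fread_alloc( FILE * stream ) {
   int size;
   if (fread( &size , sizeof size , 1 , stream ) != 1 || size < 0)
      return NULL;
   {
      series_type * series = series_alloc( size , 0.0 );
      if (fread( series->data , sizeof * series->data , size , stream ) != (size_t) size) {
         series_free( series );
         return NULL;
      }
      return series;
   }
}
```

Both public constructors go through series_alloc_empty(), so there is only one place where the struct fields get their initial values.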

Destructor / free function

All datatypes have a destructor/free function with _free in the name. When an object has been returned from one of the constructor functions, and manipulated with the member functions, it should always be safe to call the destructor. Most datatypes are fully self-contained and release all their storage when freed, but there are also some datatypes which contain pointers to data they do not own. Such shared references should be clearly documented in the struct definition.
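
A minimal sketch of how owned storage and a shared reference could be documented and handled is given below. The mesh and field types are hypothetical and exist only for this example; the point is the comments in the struct definition and the fact that the destructor leaves the shared object alone.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types, not part of the ERT libraries. */
typedef struct mesh_struct  mesh_type;
typedef struct field_struct field_type;

struct mesh_struct {
   int ncells;
};

struct field_struct {
   const mesh_type * mesh;   /* Shared reference: owned by the caller, NOT freed by field_free(). */
   double          * data;   /* Owned storage: freed by field_free(). */
};

mesh_type * mesh_alloc( int ncells ) {
   mesh_type * mesh = malloc( sizeof * mesh );
   assert( mesh != NULL );
   assert( ncells >= 0 );
   mesh->ncells = ncells;
   return mesh;
}

void mesh_free( mesh_type * mesh ) {
   free( mesh );
}

int mesh_get_ncells( const mesh_type * mesh ) {
   return mesh->ncells;
}

/* The field stores the mesh pointer, but does not take ownership of it. */
field_type * field_alloc( const mesh_type * mesh ) {
   field_type * field = malloc( sizeof * field );
   assert( field != NULL );
   field->mesh = mesh;
   field->data = calloc( mesh->ncells > 0 ? mesh->ncells : 1 , sizeof * field->data );
   assert( field->data != NULL );
   return field;
}

int field_get_size( const field_type * field ) {
   return field->mesh->ncells;
}

void field_free( field_type * field ) {
   free( field->data );      /* Owned: released here. */
   free( field );            /* The mesh is left untouched. */
}
```

With this layout the mesh must outlive every field referring to it, and the code which allocated the mesh is also responsible for calling mesh_free().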

Other memory functions

Many of the datatypes are internally based on e.g. a char * pointer to store their data. If you need to go low-level (at your own risk of course...), many of the datatypes have functions which will return a pointer to the internal storage, e.g.

double_vector_type * vector = double_vector_alloc(0,0);
....
{
   const double * data_ref  = double_vector_get_ptr( vector );
   double       * data_copy = double_vector_data_copy( vector );
   ....
   free( data_copy );  
}
double_vector_free( vector );

Here data_ref is just a "sneak pointer" into the internals of the double_vector, and should not be freed when going out of scope. The data_copy pointer on the other hand represents pristine storage, and should be explicitly freed from the current scope. Unfortunately there are no consistent naming conventions for functions of this type.

Container types can clean up

The container types, hash_type and vector_type can take ownership of the objects installed, and then free the objects when the hash/vector itself is freed. To get this behaviour you should use the xxx_owned_ref() functions when installing in the hash/vector, and supply a destructor. The destructor prototype expects a (void *) input, so a simple wrapper of the destructor taking this form of input is needed. This is demonstrated in the following example:

1.  ecl_grid_type * ecl_grid  = ecl_grid_alloc( grid_file_name );
2.  hash_type     * hash      = hash_alloc();
3.  double        * data      = util_malloc( 1000 * sizeof * data , __func__);
4.  hash_insert_hash_owned_ref( hash , "GRID" , ecl_grid , ecl_grid_free__);
5.  hash_insert_hash_owned_ref( hash , "DATA" , data     , free );
6.  ....
7.  ....
8.  hash_free( hash );

What happens in this example is:

  1. In lines 1-3 an ecl_grid instance and double * pointer are allocated, along with a hash table to store these items in.
  2. In lines 4-5 the ecl_grid and double * pointers are inserted in the hash table. We use the hash_insert_hash_owned_ref() function when inserting, essentially telling the hash structure that it is responsible for freeing these objects when it is freed itself. When freeing the objects the function given as fourth argument should be used, i.e. the grid can be freed with ecl_grid_free__ and the double * pointer can be freed with the ordinary free function.
  3. In line 8 the hash_free() function is called, and everything is freed.

The ecl_grid_free__ function will typically look like this:

void ecl_grid_free__( void * arg ) {
   ecl_grid_type * ecl_grid = (ecl_grid_type *) arg;
   ecl_grid_free( ecl_grid );
}

I.e. we cast the (void *) input argument to the correct type, and then call the standard destructor.
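
To show what happens inside a container which takes ownership, here is a self-contained sketch of a very small pointer list. The ptr_list and well types are hypothetical and much simpler than the real hash_type and vector_type; only the idea of storing a (void *) destructor next to each item is the same.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container, only meant to illustrate the ownership principle. */
typedef void (free_ftype) (void *);

typedef struct {
   void       * data;
   free_ftype * destructor;   /* NULL: the container does not own data. */
} node_type;

typedef struct ptr_list_struct ptr_list_type;

struct ptr_list_struct {
   int         size;
   int         alloc_size;
   node_type * nodes;
};

ptr_list_type * ptr_list_alloc( void ) {
   ptr_list_type * list = malloc( sizeof * list );
   assert( list != NULL );
   list->size       = 0;
   list->alloc_size = 4;
   list->nodes      = malloc( list->alloc_size * sizeof * list->nodes );
   assert( list->nodes != NULL );
   return list;
}

static void ptr_list_append__( ptr_list_type * list , void * data , free_ftype * destructor ) {
   if (list->size == list->alloc_size) {
      list->alloc_size *= 2;
      list->nodes = realloc( list->nodes , list->alloc_size * sizeof * list->nodes );
      assert( list->nodes != NULL );
   }
   list->nodes[list->size].data       = data;
   list->nodes[list->size].destructor = destructor;
   list->size++;
}

/* Plain reference: the caller keeps ownership. */
void ptr_list_append_ref( ptr_list_type * list , void * data ) {
   ptr_list_append__( list , data , NULL );
}

/* Owned reference: the list calls destructor( data ) when it is freed. */
void ptr_list_append_owned_ref( ptr_list_type * list , void * data , free_ftype * destructor ) {
   ptr_list_append__( list , data , destructor );
}

void ptr_list_free( ptr_list_type * list ) {
   for (int i = 0; i < list->size; i++)
      if (list->nodes[i].destructor != NULL)
         list->nodes[i].destructor( list->nodes[i].data );
   free( list->nodes );
   free( list );
}

/* A typed object, its typed destructor and the (void *) wrapper. */
typedef struct { int id; } well_type;

static int wells_freed = 0;   /* Counter, only used to observe the destructor calls. */

well_type * well_alloc( int id ) {
   well_type * well = malloc( sizeof * well );
   assert( well != NULL );
   well->id = id;
   return well;
}

void well_free( well_type * well ) {
   wells_freed++;
   free( well );
}

void well_free__( void * arg ) {
   well_free( (well_type *) arg );
}

int well_get_free_count( void ) {
   return wells_freed;
}
```

Items added with ptr_list_append_owned_ref() are destroyed by ptr_list_free(), while items added with ptr_list_append_ref() must still be freed by the caller.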

Dead Programs Tell No Lies

When something goes wrong, e.g. we try to open a file for reading which does not exist, the program will crash hard with the function util_abort(), which should print which function failed and a backtrace on stderr. This is considered a feature. If you would rather handle your error situations yourself, it should always be possible to check before going into an operation which will abort on failure.
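
The "check first, otherwise abort" principle can be sketched as follows. The functions file_exists(), fopen_or_abort() and count_bytes() are hypothetical stand-ins written for this example; the real ERT helpers may differ in name and signature, and this sketch prints no backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the actual ERT functions. */

/* Query function: lets the caller check before attempting the operation. */
bool file_exists( const char * filename ) {
   FILE * stream = fopen( filename , "r" );
   if (stream == NULL)
      return false;
   fclose( stream );
   return true;
}

/* Fail-hard function: never returns NULL, it aborts with a message instead. */
FILE * fopen_or_abort( const char * filename , const char * mode , const char * caller ) {
   FILE * stream = fopen( filename , mode );
   if (stream == NULL) {
      fprintf( stderr , "%s: failed to open '%s' with mode '%s'\n" , caller , filename , mode );
      abort();
   }
   return stream;
}

/* A caller which prefers to handle the missing file itself: returns -1
   when the file does not exist, otherwise the number of bytes read. */
long count_bytes( const char * filename ) {
   assert( filename != NULL );
   if (!file_exists( filename ))
      return -1;                       /* Handled locally, no abort. */
   {
      FILE * stream = fopen_or_abort( filename , "r" , __func__ );
      long   bytes  = 0;
      while (fgetc( stream ) != EOF)
         bytes++;
      fclose( stream );
      return bytes;
   }
}
```

Code which is happy with the crash-hard behaviour can call fopen_or_abort() directly and skip the check.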

The different libraries

All in all the ERT code consists of several libraries. The dependencies between the libraries are illustrated in the graph below.

Graph.png

The solid boxes illustrate the internally developed ERT libraries, and the solid lines show dependencies. All explicit dependencies are shown as connecting arrows. The dashed boxes illustrate external libraries, with dashed lines connecting them to the internal library which has an explicit dependence on said external library.

libutil

The most important parts of the libutil library are:

As indicated the libutil library is a mix of 'this and that', in particular there is no common namespace for the library. The individual files and datatypes have a consistent naming convention though, i.e. matrix_type is implemented in matrix.c/matrix.h, and all the matrix functions have prefix matrix_. More detailed documentation of libutil.

libconfig

More detailed documentation of libconfig.

libplot

More detailed documentation of libplot.

libecl

More detailed documentation of libecl.

librms

More detailed documentation of librms.

libsched

More detailed documentation of libsched.

libjob_queue

More detailed documentation of libjob_queue.

libenkf

More detailed documentation of libenkf.

Using the code

All the code is installed in the svn repository EnKF in SDP. Here it is documented how you can check out the code from svn to your own computer, and how to compile it. It is possible to make your own programs using the ERT libraries by using the existing installation, and not downloading and compiling your personal copy.

Building from svn

Creating your own programs

The ERT libraries are written in C, and hence the easiest is if you write your own programs in C as well. Some comments on the use of other languages will be at the bottom.

Where are the ERT libraries located

Include and link flags

To compile a C program using the ERT libraries you need essentially two steps:

  1. Include the relevant header files in your source code with:
    #include <ecl_file.h>
    
    and tell the compiler where in the filesystem the header files are located, with the -I/path/to/include/dir command line directive.
  2. Link with the relevant libraries. This is done by adding -lbasename on the link command line, where basename is the basename of the library in question. I.e. to link with libutil and libecl you add the flags:
         -lecl  -lutil
    
    , and tell the compiler where libraries are installed with the -L/path/to/library/dir command line directive.


A small example

The following little program should illustrate the process.

#include <stdio.h>
#include <stdlib.h>
#include <ecl_grid.h>
#include <util.h>

/*
  This little program will load a GRID/EGRID file, and print the number of active cells. 
  The grid file to load is given as command line argument when running the program.
*/

int main(int argc , char ** argv) {
   if (argc != 2)
      util_exit("The program:%s needs exactly one commandline argument. %d Given \n",argv[0] , argc - 1);
   {
      ecl_grid_type * ecl_grid = ecl_grid_fread_alloc( argv[1] );
      printf("The grid:%s has %d active cells \n",argv[1] , ecl_grid_get_nactive( ecl_grid ));
      ecl_grid_free( ecl_grid );
   }
}

We see that this program uses the libraries libutil and libecl. The commands to compile and build this program can be summarized as:

  1. The first step is to compile the source code grid_test.c to an object file grid_test.o. That is done with the following magic string:
    gcc -Wall -std=gnu99 -I/path/to/libutil -I/path/to/libecl -c grid_test.c
    
  2. The second step is to link the object file grid_test.o with the relevant libraries to produce an executable program. That is achieved with the following mumbo-jumbo:
    gcc -L/path/to/lib -o grid_test grid_test.o -lecl -lutil -llapack -lz -lm
    