DEFINITION MODULE DataFrames;
(*******************************************************************
Module DataFrames (DF_Version_2.2)
Copyright (c) 1997-2006 by Andreas Fischlin, Dimitrios Gyalistras
and ETH Zurich.
Purpose Core routines to work with Data Frames
(see also module DatFraAux for more advanced
routines). Data Frames are read from file(s) and
stored in memory for analysis and subsequent
retrieval.
Remarks Current implementation assumes any number of
data frames (spread over any number of files via
so-called file references) can be read into memory
for subsequent retrieval. Actual data are always
specified in a tabular form. The syntax of an
input file is as follows:
DataFrameFile = (FileReference | DataFrame)
{FileReference | DataFrame}.
FileReference = "FILE" "=" fileName ";" [FilterSpecif].
fileName = STRING.
FilterSpecif = "USE" "IF" "FILTER" "=" filterVal ";".
filterVal = LONGINT.
DataFrame = "DATAFRAME" dataFrameIdent ";"
[DataFrameParamList]
"DATA" ":" Table
"END" dataFrameIdent ";".
dataFrameIdent = IDENTIFIER.
DataFrameParamList = { [FilterSpecif] | [ParentOrModelSpecif] |
[RemarkSpecif] | [KeyColumnSpecif] }.
ParentOrModelSpecif = ("PARENT"|"MODEL") "=" parentOrModelID ";".
parentOrModelID = IDENTIFIER | "ANY" | "ALL".
RemarkSpecif = "REMARK" "=" STRING ";".
KeyColumnSpecif = "KEYCOLUMN" "=" keyColumnID ";".
keyColumnID = IDENTIFIER.
Table = TableHeader TableLine {";" TableLine}.
TableHeader = IDENTIFIER {IDENTIFIER} ";".
TableLine = TableEle {TableEle}.
TableEle = (INTEGER | LONGINT | REAL | LONGREAL |
IDENTIFIER | STRING | BOOLEAN).
Syntax of the elementary data types is (regular expression notation):
INTEGER = [+-]?[0-9]+
LONGINT = [+-]?[0-9]+ "D"
REAL = [+-]?[0-9]+ "." [0-9]+ (("E"|"e")[+-]?[0-9]+)?
LONGREAL = [+-]?[0-9]+ "." [0-9]+ (("D"|"d")[+-]?[0-9]+)?
IDENTIFIER = [a-zA-Z] [_a-zA-Z0-9]*
STRING = ('.*')|(".*")
BOOLEAN = "TRUE" | "FALSE"
Comments start with "(*" and close with "*)" and may be nested.
Examples:
DATAFRAME SiteBatch; MODEL = ForClim;
DATA: SiteName; "Bern"; "Davos"; "Bever_S"
END SiteBatch;
or
DATAFRAME ModelParameters;
REMARK = 'For any logistic growth model';
MODEL = ANY;
KEYCOLUMN = Ident;
DATA:
(*===========================================================*)
Ident Descr val min max unit ;
(*-----------------------------------------------------------*)
r 'Relative growth rate' 0.7 0.0 10.0 'd^-1' ;
K 'Carrying capacity' 700.0 0.0 1.0E+38 'g/m^2' ;
(*-----------------------------------------------------------*)
END ModelParameters;
Filter specifications serve as a reading filter to
control conditional recognition of file references or
data frames. Only those file references or data frames,
respectively, are scanned for content and loaded into
memory, whose filter value (see symbol filterVal in EBNF
above) actually fits within the filter range
[fromFilter..toFilter] passed as actual arguments to the
reading routines ReadDataFramesIntoMemory or
LoadDataFrames (see below). Otherwise they are entirely
skipped and the scanning process continues after the
file reference or data frame, respectively. Any file
reference or data frame without a filter specification
(optional symbol FilterSpecif, see EBNF above) will be
treated to have a filter specification like "USE IF
FILTER = 0 (* = defaultFilter *)". To recognize, i.e.
load, all file references and data frames regardless of
their filter specifications, pass the range
[minFilter..maxFilter] to ReadDataFramesIntoMemory or
LoadDataFrames, respectively.
Note, this module does not export an abstract data
type DataFrame. See module DatFraAux instead, which
exports this type and allows to access and operate on
individual data frames via its routines.
NOTE This module uses package DataTables (all data tables
are used in locked mode only).
Programming
o Design
Andreas Fischlin 04/03/1997
Dimitrios Gyalistras 23/07/1997
o Implementation
Andreas Fischlin 04/03/1997
ETH Zurich
Systems Ecology
CHN E 35.1
Universitaetstrasse 16
8092 Zurich
SWITZERLAND
URLs:
<mailto:RAMSES@env.ethz.ch>
<http://www.sysecol.ethz.ch>
<http://www.sysecol.ethz.ch/SimSoftware/RAMSES>
Last revision of definition: 05/06/1998 AF
*******************************************************************)
(********************************)
(*##### Error Handling #####*)
(********************************)
FROM DMLanguage IMPORT userBase;
CONST
dtfOffset = userBase + 200; (* constants used for resCode *)
badLoadFromAnchor = dtfOffset + 0; (* internally used by LoadDataFrames *)
loadUserInterrupted = dtfOffset + 1; (* internally used by LoadDataFrames *)
integerOverflow = dtfOffset + 2; (* internally used by VI,VIVec *)
realOverflow = dtfOffset + 3; (* internally used by VR,VRVec *)
datFraHasNoKeyCol = dtfOffset + 4; (* internally used by GetValDefsKeyColumn *)
subKeyIsRealOrBoolCol = dtfOffset + 5; (* internally used by GetValDefsSubKeyColumn *)
(* Note: Error #'s 500..550 (i.e. userBase + 200..userBase +
250) are reserved for the entire package DataFrames. The
underlying DataTables uses range 600..650. The DataFrames
package uses only module Errors and its modes for error
generation and display. WARNING: In case the mode is
suppressing and deferring error messages, you won't learn
about any errors which have possibly occurred. See also
module Errors. *)
(*************************************************)
(*##### Loading Data Frames Into Memory #####*)
(*************************************************)
TYPE
ReadingFilter = LONGINT;
VAR
defaultFilter: ReadingFilter; (* READ ONLY *)
minFilter: ReadingFilter; (* READ ONLY *)
maxFilter: ReadingFilter; (* READ ONLY *)
PROCEDURE ReadDataFramesIntoMemory (startOnFileWithName: ARRAY OF CHAR;
fromFilter,
tillFilter: ReadingFilter;
VAR resCode: INTEGER);
PROCEDURE LoadDataFrames (VAR startOnFileWithName: ARRAY OF CHAR;
fromFilter, tillFilter: ReadingFilter;
VAR resCode: INTEGER);
(*
Above procedures use the file with name startOnFileWithName
(may contain a path) as an anchor to scan and interprete
data frames. Any value definitions found during the
reading process are transferred to memory and can
subsequently be retrieved by any of the value definition
retrieval routines described below (e.g. VR or GetRVec).
Any file reference or data frame whose filter specification is
out of range [fromFilter..tillFilter] will be ignored and
skipped by the scanning process. Note, file references or data
frames which do not specify its filter explicitely, i.e. those
without the phrase "USE IF FILTER = x" are treated as if they
would have the specification "USE IF FILTER = defaultFilter (* =
0 *)". To load without any filtering pass the range
[minFilter..maxFilter] to above routines.
Any data frames already in memory with the same identifier as
a data frame encountered during the reading process will be
discarded and replaced (overwritten) with the content of the
newly read data frame. This allows for a partial or total
updating of value definitions or loading of additional data
frames etc. See also procedure DropAllDataFrames for freeing
memory and module DatFraAux for more advanced operations on
data frames.
LoadDataFrames is very similar to ReadDataFramesIntoMemory; in
fact, it calls the same core function as ReadDataFramesIntoMemory,
but offers additional features like offering a file opening dialog
in case that startOnFileWithName is the empty string, displays the
progress by showing in a small window which file is corrently
processed, and features an automatic error display in case that the
anchor file can't be accessed properly or ReadDataFramesIntoMemory
returns with resCode<>Errors.allOk. startOnFileWithName is also an
in/out parameter which allows to learn about which file has
actually been opened as anchor. If the file opening fails for
whichever reason, the startOnFileWithName returns untouched.
All currently installed handlers for a loading event (see module
DatFraAux, routine AddAnyDFChangedHandler) will always be called
at the end of ReadDataFramesIntoMemory.
*)
(**********************************************)
(*##### Retrieving Value Definitions #####*)
(**********************************************)
TYPE
ALPHA = ARRAY [0..63] OF CHAR; (* also maximum size of identifiers *)
ValDefType = (undefined, integer, longint, singlereal, doublereal, boolean,
identifier, string);
(*
Types of value definitions supported by data frames.
undefined denotes a value definition which has never been
made, i.e. the associated identifier has never been
found in any data frames by previous calls to
ReadDataFramesIntoMemory (see also procedure GetValDefType).
Note, in contrast to a general string an identifier
follows a particular syntax and starts with a letter and
contains no special characters, i.e. its EBNF is:
identifier ::= letter {letter|digit|"_"}.
*)
VAR
undefInteger: INTEGER; (* READ ONLY; = MIN(INTEGER)+1 *)
undefLongInt: LONGINT; (* READ ONLY; = MIN(LONGINT)+1 *)
undefReal: REAL; (* READ ONLY; = DMConversions.UndefREAL() *)
undefLongReal: LONGREAL; (* READ ONLY; = DMConversions.UndefLONGREAL() *)
undefBoolean: BOOLEAN; (* READ ONLY; = BOOLEAN(ORD(MAX(BOOLEAN))+1) *)
(* WARNING: undefBoolean may be TRUE or FALSE in a
compiler dependent manner. *)
undefString: ARRAY [0..0] OF CHAR; (* READ ONLY; undefString[0] = 0C *)
(*
Most above constants are the defaults also used by DataTables. They
are returned by the value definition retrieval procedures in case the
value definition could not be found or an error occurred, like a number
overflow.
*)
(*--------------------------------*)
(*===== scalar retrieval =====*)
(*--------------------------------*)
PROCEDURE VI (ident: ARRAY OF CHAR): INTEGER;
PROCEDURE VLI (ident: ARRAY OF CHAR): LONGINT;
PROCEDURE VR (ident: ARRAY OF CHAR): REAL;
PROCEDURE VLR (ident: ARRAY OF CHAR): LONGREAL;
PROCEDURE VB (ident: ARRAY OF CHAR): BOOLEAN; (* returns undefBoolean if undefined *)
PROCEDURE Flag (ident: ARRAY OF CHAR): BOOLEAN; (* returns FALSE if undefined *)
PROCEDURE GetVS (ident: ARRAY OF CHAR; VAR s: ARRAY OF CHAR);
PROCEDURE GetVId(ident: ARRAY OF CHAR; VAR id: ARRAY OF CHAR);
(*--------------------------------*)
(*===== vector retrieval =====*)
(*--------------------------------*)
PROCEDURE GetIVec (ident: ARRAY OF CHAR;
VAR ivec: ARRAY OF INTEGER; VAR n: INTEGER);
PROCEDURE GetLIVec(ident: ARRAY OF CHAR;
VAR livec: ARRAY OF LONGINT; VAR n: INTEGER);
PROCEDURE GetRVec (ident: ARRAY OF CHAR;
VAR rvec: ARRAY OF REAL; VAR n: INTEGER);
PROCEDURE GetLRVec(ident: ARRAY OF CHAR;
VAR lrvec: ARRAY OF LONGREAL; VAR n: INTEGER);
PROCEDURE GetBVec(ident: ARRAY OF CHAR;
VAR bvec: ARRAY OF BOOLEAN; VAR n: INTEGER);
PROCEDURE GetSVec(ident: ARRAY OF CHAR;
VAR svec: ARRAY OF ALPHA; VAR n: INTEGER);
PROCEDURE GetIdVec(ident: ARRAY OF CHAR;
VAR idvec: ARRAY OF ALPHA; VAR n: INTEGER);
PROCEDURE GetSs (ident: ARRAY OF CHAR; sepChar: CHAR;
VAR ssvec: ARRAY OF CHAR; VAR n: INTEGER);
PROCEDURE GetIds (ident: ARRAY OF CHAR; sepChar: CHAR;
VAR idvec: ARRAY OF CHAR; VAR n: INTEGER);
(*
Above procedures allow to fetch value definitions as
contained in all data frames currently stored in memory.
Basic methods of retrieving value definitions
---------------------------------------------
Value definitions are given by an identifier consisting of
2 (vector retrieval) or 3 parts (scalar retrieval), where
parts are delimited by ".". Ex.:
"MyModel.Bern.Bucketsize" - scalar retrieval returns single value
"MyModel.Bucketsize" - vector retrieval returns vector of n values
Formation of ident
------------------
The identifier ident denotes a particular value definition,
the procedures return the corresponding value (in case of
vector retrieval additional information on the dimension of
the vector is also returned). In case of scalar retrieval,
the middle part needs not be an identifier, but can be
any string and may even contain a period. Ex.:
good idents for value definitions:
"ForClim.Bern.Beech_species"
"FC_C.St. Gallen.Bucketsize"
bad idents for value definitions:
"ForClim.Bern.Beech specs."
"FC-C.St. Gallen.Bucketsize"
Note: There are auxiliary routines (ConcatIdent and
SplitIdent) available from this module for construction
and analysis of ident.
The first part of ident is always formed exactly the same
way: It is either the identifier of the data frame (see also
routine DatFraAux.GetDataFrameIdent) or it is the parent or
model identifier, respectively. Any data frame may use a so
called parent or model specification (see EBNF-production
ParentOrModelSpecif), hereby using a common identifier for
value definitions. This allows to construct large sets of
value definitions denoted by a common first part from
several data frames. Ex.:
DATAFRAME SiteEdaphics; MODEL = ForClim; Data:
(*-------------------------------------*)
Site Bucketsize SiteID;
(*-------------------------------------*)
"Bern" 30.0 333333 ;
"Bever S" 20.0 222222 ;
"Davos" 25.0 666666 ;
END SiteEdaphics;
DATAFRAME TreeParams; MODEL = ForClim; KEYCOLUMN = SpecIdent; DATA:
(*-------------------------------------------------------------*)
SpecIdent Scientific_name Common_name;
(*-------------------------------------------------------------*)
Fsil "Fagus silvatica L." "European beech" ;
Aalb "Abies alba Mill." "European silver fir" ;
END TreeParams;
Above data frames allow for accessing value definitions,
which all start with an identical first part, i.e. idents
like "ForClim.Site", "ForClim.BucketSize", or
"ForClim.Fsil.Scientific_name" (see also following
explanations).
The remainder parts of ident are constructed in two different
ways, depending on the retrieval method:
Retrieval method Examples of ident
---------------- -------------------------
scalar retrieval "MyModel.ThirdRowEntry.MyColumnHead"
"SiteEdaphics.Bern.BucketSize"
vector retrieval "MyModel.MyColumnHead"
"SiteEdaphics.BucketSize"
IMPORTANT NOTE: Scalar retrieval is only possible if the data
frame uses a keyColumn. Vector retrieval is always possible.
In case of scalar retrieval, the first part of ident is
followed by a '.' and the identifier of a particular column.
Such an ident always denotes an entire vector of values. The
tabulated number of rows defines its dimension n. E.g. the
following data frame
DATAFRAME SiteEdaphics1; DATA:
(*-------------------------------------*)
Site Bucketsize SiteID;
(*-------------------------------------*)
"Bern" 30.0 333333 ;
"Bever S" 20.0 222222 ;
"Davos" 25.0 666666 ;
END SiteEdaphics1;
allows to retrieve all bucket sizes by code similar to this
VAR bs: ARRAY [0..maxNoSites] OF LONGREAL;
...
GetRVec("SiteEdaphics1.Bucketsize",bs,n);
IF n<=maxNoSites THEN
(* use bs *)
ELSE
Errors.Info(...
END(*IF*);
Similarily the corresponding site identifiers could be retrieved by
VAR ss: ARRAY [0..maxNoSites] OF ALPHA;
...
GetSVec("SiteEdaphics1.Site",ss,n);
In case of scalar retrieval (a keyColumn is in use), the
remainder parts of the identifier is built as follows: Again
first part is followed by a '.', but then comes the string
found in the particular row of the column headed by
keyColumn, another '.', plus the identifier of the column.
E.g. the following data frame lists in the key column Site
strings; each of these can be used to denote a particular row.
DATAFRAME SiteEdaphics2; KEYCOLUMN = Site;
DATA:
(*-------------------------------------*)
Site Bucketsize SiteID;
(*-------------------------------------*)
"Basel" 30.0 111111 ;
"Bern" 30.0 333333 ;
"Bever S" 20.0 222222 ;
"St. Gotthard" 20.0 444444 ;
"Bever_N" 15.0 555555 ;
"Davos" 25.0 666666 ;
"Bern" 25.0 333333 ;
END SiteEdaphics2;
Above data frame allows to retrieve individual bucket sizes
(Bucketsize) or site identification numbers (SiteID) as
follows
myBucketSize := VR("SiteEdaphics2.Basel.Bucketsize");
theSiteID := VI("SiteEdaphics2.Davos.SiteID");
or
GetSVec("SiteEdaphics2.Site",ss,n);
IF n<=maxNoSites THEN
FOR i := 0 TO n-1 DO
ConcatIdent("SiteEdaphics2",ss[i],"Bucketsize",id);
myBucketSize[i] := VR(id);
END(*FOR*);
ELSE
Errors.Info(...
END(*IF*);
To allow for model specific retrieval above header of the
data frame would need the additional MODEL clause similar
to this:
DATAFRAME SiteEdaphics2; KEYCOLUMN = Site;
MODEL = ForClimS; DATA:
(*-------------------------------------*)
Site Bucketsize SiteID;
(*-------------------------------------*)
"Basel" 30.0 111111 ;
...
...
allowing for retrievals similar to this
myBucketSize := VR("ForClimS.Basel.Bucketsize");
above retrieval would also work if the headers of data
frame SiteEdaphics2 would contain the reserved word ANY or
ALL, i.e. "MODEL = ANY" or "MODEL = ALL".
Uniqueness of Value Definitions
-------------------------------
ASSERTION: Since the reading process of data frames unifies
any value definitions referring to the same value, the
access of a particular value definition via its identifier
ident and its expected type is unique at all times.
Note, in the above example for data frame SiteEdaphics2,
which contains 2 entries for "Bern", above assertion becomes
relevant. Module DataFrames always unifies all data definitions
encountered by treating any conflicts like redifinitions and
overwriting a particular value definition with the last found
during reading, i.e. VR("SiteEdaphics2.Bern.Bucketsize")
returns in the above example 25.0 and not 30.0. To learn
about the fact of multiple definitions use procedure
DatFraAux.InspectValDef (returned value of nrOfDefs).
Note also, in case the same identifier denotes value
definitions of different types, they can still be retrieved
in a unique way. E.g.
DATAFRAME SiteEdaphics3; MODEL = ForClim;
KEYCOLUMN = Site; DATA:
(*-------------------------------------*)
Site Bucketsize ;
(*-------------------------------------*)
"Bern" 30.0 ;
"Bever S" 20.0 ;
"Davos" 25.0 ;
END SiteEdaphics3;
DATAFRAME SiteEdaphics4; MODEL = ForClim;
KEYCOLUMN = Site; DATA:
(*-------------------------------------*)
Site Bucketsize ;
(*-------------------------------------*)
"Bern" high ;
"Bever S" low ;
"Davos" medium ;
END SiteEdaphics4;
In above example you can either call
VR("ForClim.Bern.Bucketsize") or you can call
GetVS("ForClim.Bern.Bucketsize", bucketSizeCode)
successfully. The result is different and lets you
distinguish between the two types of value definitions.
However, within a particular type, the identifier denotes
uniquely a single value definition without ambiguity.
Important note: Be aware that for the sake of simplicity,
the following types (see ValDefType) are merged and columns
of one of these types is treated as the same:
integer with longint,
singlereal with doublereal,
This means that a real value definition for a given
identifier is uniqe within all present real types, i.e.
regardless wether a particular column is of type singlereal
or doublereal etc. Note also this means that you can
retrieve the very same real value either with VR or with
VLR. If the actual value is too large to fit into a single
real (type REAL), the difference is that you get an overflow
by using VR. This means, that VR vs. VLR, or VI vs. VLI
etc. differ only in their memory requirements for storing
the returned values, but not which original value definition
they are referencing in the data base.
IMPLEMENTATION RESTRICTION: A column which contains only NA
is of no defined type. Thus any retrieval of values from
such a column would always only return undefined values.
Therefore, for the sake of simplicity and efficiency such
columns will be completely ignored and attempts to retrieve
data from them will be treated by this model as if the
column would be completely absent.
Finally note that certain priority rules are observed
during data retrieval to allow for easier assignment of
data to models in the context of modeling and simulation
(clauses "MODEL = ANY" and "MODEL = ALL"). See below type
definition DataFrameSet and routine LastValDefFoundIn.
Vector dimension (in case of vector retrieval)
----------------
In case of vector retrieval n returns the dimension of the
actual vector as stored in the data frame where ident was
found. Usually n is the number of elements actually assigned
to the returned vector variable like rvect, ivect etc. Note
however, if the callee passes a vector of insufficient
length, only HIGH(vect) values will be assigned and no error
message is produced (incomplete retrieval). Only n will
return the actual, larger number of elements contained in the
data frame. To learn about the actual dimension of the
vector data use routine GetVectDim in order to proved a
sufficiently large object for data retrieval. If the value
definition couldn't be found, n is 0.
Type of Value Definitions
-------------------------
Note, a data frame defines the type of values hold in a
particular column by implicit definition. The first row
tabulated defines the type of the data expected in the entire
table (for details see modules DatFraAux and DataTables,
especially the latter which is used by DataFrames to load
data frames). Any subsequent row, which contains an entry in
a particular column which does not match the expected type,
will be treated as an erronous condition. Such errors will
be normally reported during any calls to LoadDataFrames or
ReadDataFramesIntoMemory and no such faulty data frame should
ever be available for retrieval of value definitions.
If the type of the value definition does not match the
expected one as given by the routine like VR or GetRVect,
one of the read only variables undefInteger, undefReal, or
undefBoolean are returned instead; for the string or
identifier types, the emtpy string is returned.
The same is the case for missing values. The latter are
entered in a data frame by using the symbol NA instead
of the value.
IMPLEMENTATION RESTRICTIONS: To allow for an unambiguous
detection of a missing value definition (symbol NA) in case
VI, VB, GetVS, or GetVId (GetIVec, GetBVec, GetSVec,
GetIdVec, GetSS, or GetIds respectively) return undefInteger
or undefBoolean, inspect n or use also GetValDefType to
learn whether the value definition is really undefined or
the data frame happens to contain exactly that value used by
undefInteger or undefReal.
A column which contains only NA is of no defined type. Thus
any retrieval of values from such a column would always only
return undefined values. Therefore, for the sake of
simplicity and efficiency such columns will be completely
ignored and attempts to retrieve data from them will be
treated by this model as if the column would be completely
absent.
GetVS vs. GetVId
----------------
GetVS can retrieve only a string, whereas GetVId can
retrieve only an identifier.
GetSs/GetIds vs. GetSVec/GetIdVec
---------------------------------
GetSs (or GetIds) returns a vector of strings (or
identifiers), but in contrast to GetSVec (or GetIdVec) not
as an open array, but as a single large string, where the
elements of the vector are delimited by the separator
sepChar, typically "|". Use DMStrings.ExtractSubString to
access invidual elements with the following algorithm
GetSs(ident,"|",ssvec,n);
ix := 0; count := 0; ExtractSubString(ix,ssvec,element,"|");
WHILE (element[0]<>0C) DO
INC(count); (* process element count *)
ExtractSubString(ix,ssvec,element,"|");
END(*WHILE*);
IF count<>n THEN
Errors.Info(...
END(*IF*);
*)
(*************************************************************************)
(*##### Construction & Analysis of Value Definition Identifiers #####*)
(*************************************************************************)
PROCEDURE ConcatIdent( dtfParModId, (* data frame (no key column) or
parent or model identifier (with key column) *)
rowId, (* = empty string if vector retrieval *)
colId: ARRAY OF CHAR;
VAR ident: ARRAY OF CHAR);
PROCEDURE SplitIdent ( ident: ARRAY OF CHAR;
VAR dtfParModId, (* data frame (no key column) or
parent or model identifier (with key column) *)
rowId, (* = empty string if vector retrieval *)
colId: ARRAY OF CHAR);
(*
Concatenates, respectively splits, ident from/to its parts
for subsequent uses while calling routines retrieving value
definitions.
*)
PROCEDURE FixIdent(VAR ident: ARRAY OF CHAR);
(*
Utility to fix the identifier 'ident', which has recently
been used to retrieve a value definition according to the
result by LastValDefFoundIn() (from this module). Upon
returning from FixIdent the first part of the ident may
contain ALL or ANY, respectively, instead of the original
model name, depending in which set of data frames the
value definition was actually retrieved. If the value
definition could not be retrieved, the first part contains
"-" instead of the model name.
*)
(********************************************)
(*##### Type of A Value Definition #####*)
(********************************************)
TYPE
ValDefTypes = SET OF ValDefType;
(*
Since value definitions referred to by the same
identifier can have different types, a particular
identifier can be used to retrieve not just a single, but
a set of value definition types (see also above comments
on topic "Uniqueness of Value Definitions" and routine
GetValDefType).
*)
PROCEDURE GetValDefType(ident: ARRAY OF CHAR; VAR fdt: ValDefTypes);
(*
Allows to learn about the type(s) of the value
definitions, without actually having to retrieve the
data. Returns the empty set if the value definition is
currently not stored in memory.
*)
(***********************************************************)
(*##### Origin of Last Retrieved Value Definition #####*)
(***********************************************************)
TYPE
DataFrameSet = (nowhere, inALL, inSpecific, inANY);
(*
Each particular value definitions can be retrieved from 3 basic sets
of data frames.
The first set (inALL) is the set of all data frames with the phrase
"MODEL = ALL" respectively "PARENT = ALL".
The second set (inSpecific) is the set of all data frames with the
phrase "MODEL = " respectively "PARENT =
" or without that phrase but with the data frame
identifier .
E.g. the value definition given by "MyModel.Bern.Bucketsize"
defines as the set inSpecific all data frames with the
phrase "MODEL = MyModel", respectively "PARENT =
MyModel" plus the single data frame with the name
"MyModel" but without the phrase "MODEL = ...".
The third set (inANY) is the set of all data frames with the phrase
"MODEL = ANY" respectively "PARENT = ANY".
While retrieving data the following, fundamental priority rules apply:
1) A value definition present in the set inALL, is retrieved with the
highest priority. The first part of the identifier is ignored but
the remainder parts have to match exactly the value definition.
E.g. the value definition given by "MyModel.Bern.Bucketsize" is
retrieved from the set inALL as soon it contains a data frame
where the key column contains in a row "Bern" and another column
is headed by "Bucketsize". The value definition actually used is
"ALL.Bern.Bucketsize". This is the case regardless whether
the value definition is also present in a present data frame
using "MODEL = MyModel".
2) A value definition present in the set inSpecific, is retrieved when
it can't be retrieved from the set inALL. Then the entire
identifier denoting the value definition must match; in particular
its first part must match exactly used in
phrase "MODEL = ", "PARENT =
", or in "DATAFRAME ".
E.g. the value definition given by "MyModel.Bern.Bucketsize" is
retrieved from the set inSpecific as soon as this set contains a
data frame where the key column contains in a row "Bern" and
another column is headed by "Bucketsize". The value definition
actually used is exactly "MyModel.Bern.Bucketsize". This is the
case regardless whether the value definition is also contained in
a data frame present using "MODEL = ANY".
3) A value definition present in the set inANY, is retrieved when
it can't be retrieved from the set inALL, nor the set inSpecific.
Then again as with the set inALL, the first part of the identifier
is ignored and only the remainder parts have to match the value
definition.
E.g. the value definition given by "MyModel.Bern.Bucketsize" is
retrieved from the set inANY as soon as this set contains a data
frame where the key column contains in a row "Bern" and another
column is headed by "Bucketsize". The value definition actually
used is "ANY.Bern.Bucketsize".
A value definition not present in any of the three sets,
is not defined and will return values undefined.
To learn about the origin of a particular, successfully retrieved
value definition, use procedure LastValDefFoundIn.
In the context of modeling and simulation above sets serve particular
purposes: The set inALL is particularily useful to define common
parameters which are used by several submodels in a structured,
complex model system. The set inSpecific of course, does exactly
the opposite, i.e. it defines parameters which are specific to
individual submodels. The set inANY is useful to define particular
model parameters without having already a specific model in mind.
Ex.:
(*===========================================================*)
(* Model Parameters *)
(*===========================================================*)
DATAFRAME ModelParameters;
REMARK = 'For any logistic growth model';
MODEL = ANY; KEYCOLUMN = Ident;
DATA:
(*===========================================================*)
Ident Descr val min max unit ;
(*-----------------------------------------------------------*)
r 'Relative growth rate' 0.7 0.0 10.0 'd^-1' ;
K 'Carrying capacity' 700.0 0.0 1.0E+38 'g/m^2' ;
(*-----------------------------------------------------------*)
END ModelParameters;
(*===========================================================*)
Using module ModDatAccess it is easy possible to have the tabulated
value definitions be assigned to any logistic growth model,
regardless of its identifier. I.e. retrieving the values
VR("MyModel.r.val"), GetVS("MyModel.r.Descr",descr), or
VR("MyModel.r.min") etc. will always be successful thanks to the
fact the data frame ModelParameters is of model ANY. However, if the
data frame would be DATAFRAME ModelParameters OF MODEL Logistic
retrieving data by VR("MyModel.r.val") would not succeed (For
explanations on the procedures VR, GetVS etc. see their comments).
*)
PROCEDURE LastValDefFoundIn(): DataFrameSet;
(*
Immediately after retrieving a value definition use this
routine to learn about the set of data frames from which
the value definition has been retrieved. See also
explanations on TYPE DataFrameSet (No implementation
restrictions).
*)
PROCEDURE GetLastValDefsDFIdent (VAR lastDataFrameIdent: ARRAY OF CHAR);
(*
Immediately after retrieving a value definition use this
routine to learn about the identifier of the data frame
from which the value definition has been retrieved.
IMPLEMENTATION RESTRICTION: Works only if the data frame
has not been dropped in the meantime.
*)
(******************************************)
(*##### Special Vector Retrieval #####*)
(******************************************)
PROCEDURE GetVectDim (ident: ARRAY OF CHAR; ofType: ValDefType; VAR n: INTEGER);
(*
Allows to learn about the dimension n of the vector
denoted by identifier ident and of type ofType in the
context of vector retrieval without actually having to
retrieve the data. Returns n = 0 if value definition is
currently not present.
*)
PROCEDURE GetValDefsKeyColumn (ident: ARRAY OF CHAR; ofType: ValDefType;
sepChar: CHAR; VAR keys: ARRAY OF CHAR;
VAR n: INTEGER);
(*
Allows to retrieve the key column of the data frame in
which the value definition denoted by identifier ident
and of type ofType is contained. n returns the dimension
of the key column vector (regardless wether all its
content fits within keys or not). Actual keys are
delimited by character sepChar, typically "|". See also
below routine GetValDefsSubKeyColumn.
*)
PROCEDURE GetValDefsSubKeyColumn (ident: ARRAY OF CHAR; ofType: ValDefType;
subKeyIdent: ARRAY OF CHAR;
sepChar: CHAR; VAR subKeys: ARRAY OF CHAR;
VAR n: INTEGER);
(*
Allows to retrieve the column with identifier subKeyIdent
of the data frame in which the value definition denoted by
identifier ident and of type ofType is contained. The
column must be of type integer, long integer, string, or
identifier, or the procedure will return nothing (n=0,
subKeys empty). Use for subKeys not an open array, but a
single large string, where the elements of the vector are
delimited by the separator sepChar, typically "|".
Consequently, in case of an integer column, you have to
convert the strings first to integer values. The ideas of
subkeys is simply that in a data frame some columns beside
the key column may serve to establish a particular order
among the rows within a data frame. Typically you can use
such a column to store indices which help to further
specify the main key to denote particular rows when
multiple value definitions are present in data frames. By
means of vector retrieval it is then possible to retrieve
more than just the last value definition by searching
within the vector denoted by ident in function of the
independent main key and sub keys of any number. E.g.:
DATAFRAME MonitoringPart1; MODEL = ANY; KEYCOLUMN = Ident; DATA:
(*------------------------------------------------------------*)
Ident MonitLev Filing Tabulation ;
(*------------------------------------------------------------*)
grass 3 FALSE FALSE ;
grass 4 FALSE FALSE ;
grassDot 3 TRUE FALSE ;
grassDot 4 TRUE FALSE ;
(*------------------------------------------------------------*)
END MonitoringPart1;
DATAFRAME MonitoringPart2; MODEL = ANY; KEYCOLUMN = Ident; DATA:
(*------------------------------------------------------------*)
Ident MonitLev ScaleMin ScaleMax Graphing ;
(*------------------------------------------------------------*)
grass 3 0.0 1000.0 Y ;
grass 4 0.0 700.0 X ;
grassDot 3 NA NA noG ;
grassDot 4 0.0 500.0 Y ;
(*------------------------------------------------------------*)
END MonitoringPart2;
The two data frames define for the same keys as given by
the identifier listed in the key column 'Ident' for the two
monitoring levels 3 and 4 in the first data frame the
filing and tabulation attributes, in the second data frame
the graphing attributes. Column 'MonitLev' serves here as
a subkey column, since each variable identifier, e.g.
grass, has two value definitions tabulated, one for
monitoring level 3 and the other for level 4. Of course
only vector retrieval can retrieve these multiple
definitions, yet the tabulated data can still be
interpreted fully, since the order as given by the main key
and the subkeys (redundantly present in both data frames)
can still be retrieved and used by calling routines
GetValDefsKeyColumn and GetValDefsSubKeyColumn like this:
GetValDefsKeyColumn("MyModel.Tabulation","|",idents,n);
GetValDefsSubKeyColumn("MyModel.Tabulation",boolean,
"MonitLev", monLevsArr, n);
will return: idents = "grass|grass|grassDot|grassDot"
monLevsArr = "3|4|3|4"
n = 4
Note in above example the calls
GetValDefsKeyColumn("MyModel.Graphing","|",idents,n);
GetValDefsSubKeyColumn("MyModel.Graphing",identifier,
"MonitLev", monLevsArrStr, n);
retrieve the same information as
GetSs("MyModel.Ident","|",idents,n);
GetIVec("MyModel.MonitLev",monLevsArrInt,n);
which is not necessary the same as what
GetValDefsKeyColumn("MyModel.Tabulation","... might
retrieve, in case the key columns or sub key columns
should differ in content from on to the other data frame.
The actual content of the key column and possible subkey
columns and wether they really correctly define the same
ordering among two data frames is fully the users
responsibility. Same order means for instance that
individual components from retrieved vectors can be paired,
e.g. the vectors 'MyModel.Tabulation' with
'MyModel.Graphing'. An analysis of the order defining
vectors can reveal the validity of the assumption before
actually using any retrieved data.
*)
(*******************************************************)
(*##### Dropping Data Frames / Freeing Memory #####*)
(*******************************************************)
PROCEDURE DropAllDataFrames;
(*
Removes all data frames from memory and releases the memory
for other uses. Note, if you call several times
LoadDataFrames respectively ReadDataFramesIntoMemory without
calling DropAllDataFrames inbetween, that data frames tend to
accumulate in memory, using up more and more memory. This is
of course not true, if you reread the same data frame files
starting with the same anchor (and the same reading filers),
because data frames with an identical identifier will be
overwritten with the latest data found in the scanned data
frame files, hereby reusing the memory space efficiently.
Note also, this means that you ought to call procedure
DropAllDataFrames in case you want to make sure that older
value definitions contained in any previously read data frames
are forgotten. In case you wish to forget only about
individual data frames, see the routines provided by module
DatFraAux, which support versatile plus efficient rereading
and management of data frames. All currently installed
handlers for a dropping event (see module DatFraAux, routines
AddAnyDFChangedHandler and AddDFDropHandler) will also be
called.
*)
END DataFrames.