DEFINITION MODULE HashTables;
(*******************************************************************
Module HashTables (Version 1.0)
Copyright (c) 1997-2006 by Andreas Fischlin and ETH Zurich.
Purpose Provides hash tables.
Remarks Any number of hash tables, each up to the
maximum size of 2999 (a prime number)
keys can be stored. Initially only small
hash tables are allocated. Only if
demand requires it, a bigger table is
allocated in a so-called rehashing
algorithm. The data which can be stored
in the internal hash table are only
addresses (information hiding).
Keys can also be grouped into so-called
series, ordered according the sequence of
storing (StoreKey). A series is accessed via
its first key.
Reference: Wirth, N. (1986). Algorithms and
Data Structures.
Programming
o Design
Andreas Fischlin 30/01/1997
o Implementation
Andreas Fischlin 30/01/1997
ETH Zurich
Systems Ecology
CHN E 35.1
Universitaetstrasse 16
8092 Zurich
SWITZERLAND
URLs:
<mailto:RAMSES@env.ethz.ch>
<http://www.sysecol.ethz.ch>
<http://www.sysecol.ethz.ch/SimSoftware/RAMSES>
Last revision of definition: 07/03/1997 AF
*******************************************************************)
FROM SYSTEM IMPORT ADDRESS;
FROM Errors IMPORT userBase;
CONST
hashTableOverFlow = userBase + 250;
unknownHashTable = userBase + 251;
unStoredKey = userBase + 252;
TYPE
HashTable;
Key = INTEGER;
UserEntry = ADDRESS;
DoWithUserEntryProc = PROCEDURE (Key, UserEntry);
VAR
nonExistentHashTable: HashTable; (* read only *)
unknownEntry: UserEntry; (* read only *)
resCode: INTEGER; (* Holds the last result code of any operation.
This variable is offered as read only for efficiency reasons
separate from other routines (except for StoreEntry) in order
to better support efficiency, the main motivation to use hash
tables. If no errors are encountered, resCode = allOk (see
module Errors).*)
PROCEDURE CreateHashTable(VAR ht: HashTable);
(* Creates a new hash table ht *)
PROCEDURE HashTableExists(ht: HashTable): BOOLEAN;
PROCEDURE StoreKey(ht: HashTable; newSeries: BOOLEAN; key: Key);
(*
Stores key k in the hash table. If newSeries is TRUE, a
new series is started. Subsequent calls to StoreKey with
newSeries = FALSE will internally connect all stored
entries into a series until StoreKey is called again with
newSeries = TRUE. The series is denoted by the first key
which was passed while newSeries has been TRUE. To
ignore the series mechanism, simply set newSeries = TRUE
always. If the capacity of the hash table is exceeded,
resCode contains hashTableOverFlow. IMPORTANT NOTE: It
is recommended to inspect resCode after each call to
StoreKey if you store many entries close to the maximum
of 2999 (since Hash may fail unnoticed otherwise).
*)
PROCEDURE DeleteKey(ht: HashTable; key: Key);
(*
Removes an individual key for subsequent reuse, i.e. reverts
the effect of StoreKey. You should discard of associated
user data (see below StoreEntry) before calling DeleteKey.
*)
PROCEDURE Hash(ht: HashTable; k: Key; VAR found: BOOLEAN): INTEGER;
(*
Returns for the hash table ht the hash function value
mapping the key k. The hash function adjusts itself
internally and allocates more memory until the upper
limit of the hash table ([0..2999 = prime) is reached.
If found is TRUE the key k has previously been
successfully stored. IMPLEMENTATION RESTRICTION: Note,
you should NOT use this function before having stored
ALL(!) keys, since you do not learn about the rehashing.
However, attached entries are correctly rehashed (see
below procedure StoreEntry), supporting full dynamic use
of hashing user data kept in heap only. In the latter
case the behavior is similar to an ARRAY OF UserEntry,
but still offers the advantage of mapping a large range
of index values (keys) to a much smaller, still storable
range (the classical principle of hashing). NOTE ALSO:
For efficiency reasons Hash assumes ht references an
existing and fully valid HashTable if ht's value differs
from nonExistentHashTable. Returns unknownEntry if ht =
nonExistentHashTable. NOTE, however, Hash does NOT
generate an error message and will return a WRONG value
in case of hashTableOverFlow!! Thus, hashTableOverFlow
should be checked while storing entries (see also
StoreKey).
*)
(* The following procedures support hashing in conjunction
with some user data kept in heap (or alternatively referenced
by SYSTEM.ADR(userEntry). It is possible to associate
userEntries in a 1:1 relation, referencable via an integer
key. *)
PROCEDURE StoreEntry(ht: HashTable; k: Key; e: UserEntry);
(* Attaches and stores entry e in the internal hash table
denoted by ht for key k, given k has been previously
successfully stored with StoreKey. Typically this mechanism
is used to reference some client data structures via e. This
emulates a data structure of the type ARRAY [Hash(ht,key)] OF
UserEntry. Delete an entry by simply calling
StoreEntry(ht,k,unknownEntry). *)
PROCEDURE Entry(ht: HashTable; key: Key): UserEntry;
(* returns the entry for key. Returns unknownEntry if never
stored. IMPLEMENTATION RESTRICTION: For efficiency reasons
Entry assumes that ht references an existing and fully valid
HashTable, in case ht has a value different from
nonExistentHashTable. *)
(* The following procedures support hashing operating on a
group of keys, a so-called series *)
PROCEDURE DoInSeries(ht: HashTable; fstkey: Key; dwuse: DoWithUserEntryProc);
(* Executes dwuse for every entry stored in the series as
denoted by fstkey *)
PROCEDURE DeleteSeries(ht: HashTable; fstkey: Key; dwuse: DoWithUserEntryProc);
(* Deletes all entries belonging to the series denoted by
fstkey and calls dwuse for every entry in the series.
IMPLEMENTATION RESTRICTION: For efficiency reasons, the entire
series is cleared before the Hash function is fully restored.
Hence, do not use any procedures relying on the Hash function
from within dwuse, i.e. Hash, StoreKey, DeleteKey, StoreEntry,
DeleteEntry, and Entry. Of course, you should also refrain
from calling DiscardHashTable for the table in which you delete
the series. You should only operate on your user entry. If
you wish to call any of above routines from within dwuse, use
DoInseries and call DeleteKey from within dwuse. That's a safe
operation preserving the functionality of the Hash function at
all times. *)
PROCEDURE DiscardHashTable(VAR ht: HashTable);
END HashTables.