Generic Set Data Structure. More...
#include <set.h>
Generic Set Data Structure.
Structure for ordered sets Sets are containers for elements that cannot occur into them more then once. Sets can be travered through iterators, and chronological ordering which elements are inserted into them is preserved when travered. A set can be frozen at time t, meaning that that the set cannot be later changed until the set is destroyed. Freezing a set make that set unchangeble in time, and allows for more efficient operations like set copy. Copying a frozen set has the only effect of incrementing a reference counting for that set, without any need for actually copying the set content. Freezing a set whose content does not need to be changed later on is therefore always a good idea to make set operations and memory usage more efficient.
It is also important to mark that when storing a set in memory (e.g. in memoized operations, like in dependencies hashes) it is needed to freeze the set, otherwise external code might change the set content with side-effect with weird results, as explained below..
Operations like AddMember, RemoveMember, Union, Difference, etc. do not create a new set, instead they modify the set they are applied to. For example given two sets S1 and S2, S1 U S2 (set union) can be obtained by calling
Set_Union(S1, S2)
If S1 is not a frozen set, the result goes to S1 (with side-effect), and no copy is performed. If S1 is frozen, S1 is copied to a new set S1' and then side effect is performed on S1' to add members in S2. All this operation is carried out automatically in a transparent manner, but it is required that operations that modify sets all returns a set that can be different from the set they are applied to. The returned value has to be assigned to a variable. The right set protocol then requires an explicit assignment:
S1 = Set_Union(S1, S2)
To save memory the empty set is represented with a NULL pointer, that is another reason why an explicit assigment is required, and that justify the fact that in general S1 and S1' may be different.
When a set is no longer used, it has to be freed with method ReleaseSet. This either frees the set and the memory it uses, or decreases the set's reference counting.
Reference counting is applied only for frozen sets. When a frozen set is copied its reference counting is incremented. When a frozen set is released, the reference counting is decremented and the set is freed only if its reference counting reaches the value of 0, meaning that there are no longer users of that set.
Notice that in previous operation:
S1 = Set_Union(S1, S2)
If S1 is a frozen set, this is the sequence of actions that are involved:
1. S1 is copied into a temporary set S1' 2. S1' is unioned with S2 (with side-effect on S1') 3. S1 is released (and possibly freed if needed) 4. S1' is returned as a new non-frozen set and assigned to S1.
Pass 3 is remarkable here. Suppose that a set is stored into a permanent memory area (like a cache, a hash, etc.). When storing the set, it has to be frozen and a carefully reference counting has to be takein into account. When looking up previously stored set and returning that set (e.g. in memoizing) is is important to return a copy of the (frozen) set, and explicitly ask the user to release the returned set when no longer used. This prevents previous step 3 to release sets that are still in usage for example inside the cache. For example:
Set_t s1 = some_memoizing_function(); Set_t s2 = Set_AddMember(s1, element); ... Set_ReleaseSet(s2);
Here s2 is different from s1 (as s1 is frozen and AddMember would change it otherwise). Even if function some_memoizing_function requires the user to release returned set, there is no need to release s1 (and in fact you do not have to, or you have a bug). This allows to write second line Set_t s2 = ... as:
s1 = Set_AddMember(s1, element);
At the end you will have only to release s1 (as prescribed by function some_memoizing_function)