Dealing recently with some task (the same that inspired us to implement WeakTable), we were in a position to use a dictionary as a key in another dictionary.

What are the rules for the class to be used as key:

• key should be immutable;
• key should implement a GetHashCode() method;
• key should implement a Equals() method.

The first requirement is usually implemented as a documentation contract like this:

As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value.

The third requirement about equals is trivially implemented as a method:

public bool Equals(IDictionary<K, V> x, IDictionary<K, V> y)
{
if (x == y)
{
return true;
}

if ((x == null) || (y == null) || (x.Count != y.Count))
{
return false;
}

foreach(var entry in x)
{
V value;

if (!y.TryGetValue(entry.Key, out value) ||
!valueComparer.Equals(entry.Value, value))
{
return false;
}
}

return true;
}

But how would you implement hash code?

We argued like this.

1. Let's consider the dictionary as a sparse array of values with only populated items that correspond to key hash codes.

2. Hash code is constructed using some fair algorithm. E.g like that used in java to calculate string's hash code:

n-1
h(s) = SUM (s[i]*p^(n-1-i)) mod m,  where m = 2^31
i=0

In our case:

• n can be arbitrary large int value, so in fact it's 2^32;
• items are enumerated in unknown order;
• there is only limited set of items, so most s[i] are zeros.

As result we cannot use recurrent function to calculate a power p^k mod m. Fortunately one can build fast exponentiation arguing like this:

32/s - 1
p^k = p^  SUM  2^((s*i)*k[i]) mod m,  where s some int: 1, 2, 4, 8, 16, or 32.
i=0

Thus

32/s - 1
p^k = PRODUCT (p^(2^(s*i)))^k[i] mod m
i=0

If s = 1 then k[i] is either 1 or 0 (a bit), and there is 32 different p^(2^i) mod m values, which can be precalculated.

On the other hand, if we select s = 8 we can write the formula as:

p^k = p^k * (p^(2^8))^k * (p^(2^16))^k * (p^(2^24))^k mod m

where k[i] is a 8 bit value (byte).

Precalculating all values p^n, (p^(2^8))^n, (p^(2^16))^n, (p^(2^24))^n for n in 0 to 255 we reach the formula with 4 multiplications and with 1024 precalculated values.

Here is the whole utility to calculate hash factors:

/// <summary>
/// Hash utilities.
/// </summary>
public class Hash
{
/// <summary>
/// Returns a P^value mod 2^31, where P is hash base.
/// </summary>
/// <param name="value">A value to get hash factor for.</param>
/// <returns>A hash factor value.</returns>
public static int GetHashFactor(int value)
{
return factors[(uint)value & 0xff] *
factors[(((uint)value >> 8) & 0xff) | 0x100] *
factors[(((uint)value >> 16) & 0xff) | 0x200] *
factors[(((uint)value >> 24) & 0xff) | 0x300];
}

/// <summary>
/// Initializes hash factors.
/// </summary>
static Hash()
{
var values = new int[4 * 256];
var value = P;
var current = 1;
var i = 0;

do
{
values[i++] = current;
current *= value;
}
while(i < 256);

value = current;
current = 1;

do
{
values[i++] = current;
current *= value;
}
while(i < 512);

value = current;
current = 1;

do
{
values[i++] = current;
current *= value;
}
while(i < 768);

value = current;
current = 1;

do
{
values[i++] = current;
current *= value;
}
while(i < 1024);

factors = values;
}

/// <summary>
/// A base to calculate hash factors.
/// </summary>
public const int P = 1103515245;

/// <summary>
/// Hash factors.
/// </summary>
}

With this API hash code for a dictionary is a trivial operation:

public int GetHashCode(IDictionary<K, V> dictionary)
{
if (dictionary == null)
{
return 0;
}

var result = 0;

foreach(var entry in dictionary)
{
if ((entry.Key == null) || (entry.Value == null))
{
continue;
}

result += Hash.GetHashFactor(keyComparer.GetHashCode(entry.Key)) *
valueComparer.GetHashCode(entry.Value);
}

return result;
}

And finally, here is a reference to a class DictionaryEqualityComparer<K, V>: IEqualityComparer<IDictionary<K, V>> that allows a dictionary to be a key in another dictionary.

Update

We have commited some tests, and have found that with suffiently "good" implementation of GetHashCode() of key or value we achieve results almost of the same quality, as the results of the algorithm we have outlined above with much simpler and straightforward algorithm like this:

public int GetHashCode(IDictionary<K, V> dictionary)
{
if (dictionary == null)
{
return 0;
}

var result = 0;

foreach(var entry in dictionary)
{
if ((entry.Key == null) || (entry.Value == null))
{
continue;
}

var k = entry.Key.GetHashCode();
var v = entry.Value.GetHashCode();

k = (k << 5) + k;
v = (v << (k >> 3)) + v;

result += k ^ v;

//result += Hash.GetHashFactor(keyComparer.GetHashCode(entry.Key)) *
// valueComparer.GetHashCode(entry.Value);
}

return result;
}

It was worth to blog about this just to find out that we have outwitted ourselves, and finally to reach to a trivial hash code implementation for the dictionary.

All comments require the approval of the site owner before being displayed.
 Name E-mail Home page Remember Me Comment (Some html is allowed: a@href@title, b, blockquote@cite, em, i, strike, strong, sub, super, u) where the @ means "attribute." For example, you can use or
. Enter the code shown (prevents robots): Live Comment Preview
Archive
 < January 2022 >
SunMonTueWedThuFriSat
2627282930311
2345678
9101112131415
16171819202122
23242526272829
303112345
Blogroll
Statistics
Total Posts: 380
This Year: 1
This Month: 1
This Week: 0