hash tables and equality definitions
Nov. 18th, 2009 05:20 pmI've just realized that using hash tables properly requires a notion of when two objects have equal value.
How come I never had to worry about this in the past?
---
I want my hash table to encode an equivalence relation (permutations of node labels), automatically dealing with my label-switching problem.
---
Do any of the normal languages/libraries let you pass an equality test, and adapt the hash function accordingly? I imagine you'd need to pass a function that brings the objects into a normal form.
---
I've installed R's hash library, but it seems to be pretty basic.
How come I never had to worry about this in the past?
---
I want my hash table to encode an equivalence relation (permutations of node labels), automatically dealing with my label-switching problem.
---
Do any of the normal languages/libraries let you pass an equality test, and adapt the hash function accordingly? I imagine you'd need to pass a function that brings the objects into a normal form.
---
I've installed R's hash library, but it seems to be pretty basic.
(no subject)
Date: 2009-11-18 10:53 pm (UTC)(no subject)
Date: 2009-11-18 11:07 pm (UTC)(no subject)
Date: 2009-11-19 01:23 am (UTC)(no subject)
Date: 2009-11-19 01:59 am (UTC)make-hash-tableoptionally takes a:testkeyword argument.See also KMP's article.
(no subject)
Date: 2009-11-19 02:02 am (UTC)Thanks!
(no subject)
Date: 2009-11-19 02:04 am (UTC)(no subject)
Date: 2009-11-19 02:36 am (UTC)(no subject)
Date: 2009-11-19 06:13 am (UTC)The problem is I'm not assuming that the hash function implements EQUAL.
And it sounds like we're using "key" in different ways: to me, "key" is the thing on the left in a key-value pair, i.e. you apply the hash function to the key.
Suppose I define EQUALity between trees so that:
[[a b] c] EQUAL [c [b a]]
I don't know how to make them hash to the same thing (other than by converting them to a normal form before hashing)... and what's the use of having a :test argument if EQUAL things hash to different things?
(no subject)
Date: 2009-11-19 06:45 am (UTC)The contract for hash functions typically requires a = b => hash(a) = hash(b). If you're using some complicated equivalence relation on keys then you still need to follow that contract. If that's difficult or slow, you need to restructure your code or figure out if you really wanted a hash table. There's no magic bullet.
(no subject)
Date: 2009-11-19 07:37 am (UTC)(no subject)
Date: 2009-11-19 08:21 am (UTC)(no subject)
Date: 2009-11-19 04:48 pm (UTC)when you look up a key k, you iterate through the bucket table[hash(k)] looking for some key-value pair whose key is equal to k. if you find one, you return v. this is where the equality test comes in, since hash values may overlap -- table[hash(k)] may contain many (k', v') pairs for which k' is not equal to k.
in any case, the complexity guarantees hash tables give you depend upon the hash function being good enough that the buckets don't become too full. for instance, if everything just hashed to 1, everything would go in the same bucket, and hash table lookup would degenerate into linear search.
(no subject)
Date: 2009-11-19 04:50 pm (UTC)(no subject)
Date: 2009-11-20 01:12 pm (UTC)(no subject)
Date: 2009-11-20 08:26 pm (UTC)