universal hash families
DESCRIPTION
Universal Hash Families. Universal hash families. Family of hash functions Finite multiset H of string-valued functions, each h ∈ H having the same nonempty domain A ⊆{0,1} * and range B ⊆{0,1} * , for some constant b - PowerPoint PPT PresentationTRANSCRIPT
Universal Hash Families
2
Universal hash families Family of hash functions
Finite multiset H of string-valued functions, each h ∈ H having the same nonempty domain A {0,1}⊆ * and range B {0,1}⊆ *, for some constant b
[Definition] -almost universal2 (-AU2) & -almost XOR universa
l2 (-AXU2)
A family of hash functions H ={h: A {0,1}b} is -almost universal2 , wri
tten -AU2, if, for all distinct x, x’∈A, Prh ∈ H [h(x)=h(x’)] .
The family of hash functions H is -almost XOR universal2 , written -A
XU2, if, for all distinct x, x’∈A, and for all c {0,1}∈ b, Prh ∈ H [h(x)h(x’)=
c] . =maxxx’{Prh [h(x)=h(x’)]} : collision probability
Principle measure of an AU2 : How small its collision probability is and
how fast one can compute its functions
3
Composition of universal hash families Make the domain of a hash family bigger
H ={h: {0,1}a{0,1}b} Hm ={h: {0,1}am{0,1}bm}
its elements are the same as in H where h(x1 x2… xm), for | xi |=a, is defined by h(x1)|| h(x2)||…|| h(xm)
[Proposition] If H is -AU2, then Hm is -AU2
Make the collision probability smaller H1 ={h: A {0,1}b1}, H2 ={h: A {0,1}b2}
H1 & H2 ={h: A {0,1}b1+ b2}
its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is defined as h1(x)||h2(x)
[Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H1 & H2 is 12-AU2
4
Composition of universal hash families(cont.) Make the image of a hash function shorter
H1 ={h: {0,1}a{0,1}b}, H2 ={h: {0,1}b{0,1}c}
H2 ○ H1 ={h: {0,1}a{0,1}c}
its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is defined as h2(h1(x))
[Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H2 ○ H1 is (1+2)-AU2
Turn an AU2 family H1 and AXU2 family H2 into an AXU2 family H2 ○ H1
[Proposition] Suppose H1 ={h: AB} is 1-AU2 and H2 ={h: BC} is 2-AXU2. Then H2 ○ H1 ={h: AC} is (1+2)-AXU2
5
Related researches Carter-Wegman(1979, 1981)
Efficient authentication code under strongly universal hash functions Key observations
Long messages can be authenticated efficiently using short keys if the number of bits in the authentication tag is increased slightly compared to ‘perfect’ schemes
If a message is hashed to a short authentication tag, weaker properties are sufficient for the first stage of the compression
Under certain conditions, the hash function can remain the same for many plaintexts, provided that hash results is encrypted using a one-time pad
Stinson(1994) Improves the works by Wegman-Carter and establishes an explicit link
between authentication codes and strongly universal hash functions
Johansson-Kabatianskii-Smeets(1996) Establish a relation between authentication codes and codes correcting
independent errors
6
Related researches(cont.) Krawczyk(1994, 1995)
Propose universal hash functions that are linear with respect to bitwise XOR
Makes it easier to reuse the authentication code Encrypt the m-bit hash result for each new message using a one-time pad
Simple and efficient constructions based on polynomials and LFSR
Shoup(1996) Propose and analyze the constructions based on polynomials over finite
fields
Rogaway : Bucket hashing (1995) Halevi-Krawczyk : MMH (1997)
Make optimal used of the multiply and accumulate instruction of the Pentium MMX processor
Black-Halevi-Krawczyk-Krovertz-Rogaway : UMAC (1999) Further improved the performance on high end processors
7
Constructions Bucket hashing
is an -AU introduced by Rogaway Defining the Bucket Hash Family B
word size w(1), parameters n(1), N(3) domain D={0,1}wn, range R={0,1}wN
Let h B and let X=X1 Xn be the string we want to hash, where each |Xi|=w. Then h(X) is defined by the following algorithm. First, for each j{1,, N}, initialize Yj to 0w. Then, for each i{1,, n} and k hi, replace Yk by Yk Xi. When done, set h(X) = Y1||Y2||||YN.
Pseudocode
for j 1 to N do Yj 0w
for i 1 to n do
Yhi1 Yhi1 Xi
Yhi2 Yhi2 Xi
Yhi3 Yhi3 Xi
return Y1||Y2||||YN
8
Constructions(cont.) Bucket Hashing with Small Key Size
N=2s/L Each hash function hB’[w,M,N] is specified by a list of length M
each entry contains L integers in the interval [0, N-1]
L arrays are introduced, each containing N buckets Next, each array is compressed to s/L words, using a fixed primitive
element GF(2s/L) The hash result is equal to the concatenation of the L compressed arrays,
each containing s/L words
9
Constructions(cont.) Hash Family Based on Fast Polynomial Evaluation
is based on polynomial evaluation over a finite field q = 2r, Q = 2m = 2r+s, n = 1+2s, : a linear mapping from GF(Q) onto
GF(q) Q = q0
m, q = q0r , q0 : a prime power
fa(x) =a0 + a1x + + an-1xn-1
x, y, a0, a1, , an-1 GF(Q), z GF(q)
H = {hx,y,z : hx,y,z(a) = hx,y,z(a0, a1, , an-1) = (y fa(x)) + z}
10
Constructions(cont.) Hash Family Using Toeplitz Matrices
Toeplitz matrices are matrices with constant values on the left-to-right diagonals
A Toeplitz matrix of dimension n m can be used to hash messages of length m to hash results of length n by vector-matrix multiplication
The Toeplitz construction uses matrices generated by sequences of length n + m - 1 drawn from -biased distributions
-biased distributions are a tool for replacing truly random sequences by more compact and easier to generate sequences
The lower , the more random the sequence is
Krawczyk proves that the family of hash functions associated with a family of Toeplitz-matrices corresponding to sequences selected from a -biased distribution is -AXU with = 2-n +
11
Constructions(cont.) Evaluation Hash Function
is one of the variants analyzed by Shoup The input (of length tn) : viewed as a polynomial M(x) of degree < t
over GF(2n) The key : a random element GF(2n) the hash result : equal to M() GF(2n) This family of hash functions is -AXU with = t/2n
12
Constructions(cont.) Division Hash Function
represents the input as a polynomial M(x) of degree less than tn over GF(2)
The hash key : a random irreducible polynomial p(x) of degree n over GF(2)
The hash result : m(x) xn mod p(x) This family of hash functions is -AXU with = tn/2n
The total number of irreducible polynomials of degree n is roughly equal to 2n/n
13
Constructions(cont.) MMH(Multilinear Modular Hashing) hashing
consists of a (modified) inner product between message and key modulo a prime p (close to 2w, with w the word length; below w = 32)
is an -AXU2, but with xor replaced by subtraction modulo p The core hash function maps 32 32-bit message words and 32 32-bit key w
ords to a 32-bit result The key size is 1024 bits and = 1.5/ 230
For larger messages, a tree construction can be used the value of and the key length have to be multiplied by the height of the
tree
This algorithm is very fast on the Pentium Pro, which has a multiply and accumulate instruction
On a 32-bit machine, MMH requires only 2 instructions per byte for a 32-bit result
14
Comparing the Hash Functions
15
Comparing the Hash Functions(cont.) Scheme A
the input : divided into 32 blocks of 8 Kbyte each block is hashed using the same bucket hash function with N = 160 results in an intermediate string of 20480 bytes
Scheme B the input : divided into 64 blocks of 4 Kbyte
each block is hashed using the same bucket hash function with short key(s=42, L=6, N=128)
results in an intermediate string of 10752 bytes
Scheme C the input is divided into 64 blocks of 4 Kbyte
each block is hashed using a 331024 Toeplitz matrix, based on a -biased sequence of length 1056 generated using an 88-bit LFSR
The length of the intermediate string is 8448 bytes
16
Comparing the Hash Functions(cont.) Scheme D
the input : hashed twice using the polynomial evaluation hash function with = 2-15 resulting in a combined value of 2-30
W = 5 The performance is slightly key dependent. Therefore an average over a
number of keys has been computed.
Scheme E this is simply the evaluation hash function with t = 32768
the resulting value of is too small However, choosing a smaller value of n that is not a multiple of 32 induces
a performance penalty
Scheme F the input : divided into 2048 blocks of 128 bytes
each block is hashed twice using MMH the length of the intermediate string is 16384 bytes
– It is not possible to obtain a value of closer to 2-32 in an efficient way
Universal Hashing MAC
18
Message authentication based on Universal hashing Message authentication based on Universal hashing
Wegman-Carter approach The parties share a secret key k=(h,P)
– P : infinite random string
– h : function drawn randomly from a strongly universal2 family of hash functions H» H is strongly universal2 if, for all xx’, the random variable h1(x)
||h2(x), for h ∈ H , is uniformly distributed
To authenticate a message x, the sender transmits h(x) xored with the next piece of the pad P
Standard cryptographic technique use of a pseudorandom function family, F
[Theorem] Assume H is -AXU2, and that F is replaced by the truly random function family R of functions. In this case, if an adversary makes q1 queries to the authentication algorithm S and q2 queries to the verification algorithm V, the probability of forging a MAC is at most q2
))()(,()(),( iaiah xhiFixMAC
19
Universal hashing MAC Why Universal hashing MAC?
The speed of a universal hashing MAC depends on the speed of the hashing step and encrypting step
The encryption does not take long hash function compresses messages => the encrypting message is short
The combinatorial properties of the universal hash function family is mathematically proven
needs no “over-design” or “safe margin” the way a cryptographic primitive would
Universal hashing MAC makes for desirable security properties can select a cryptographically conservative design for the encrypting
step can pay with only a minor impact on speed the cryptographic primitive is applied only to the much shorter hashed
image of the message security and efficiency are not conflicting requirements
20
UMAC The UMAC algorithm
species how the message, key, and nonce determine an authentication tag
The sender will need to provide the receiver with the message, nonce, and tag
The receiver can then compute what “should be” the tag for this particular message and
nonce, and see if it matches the received tag
employs a subkey generation process in which the shared (convenient-length) key is mapped into UMAC's internal keys subkey generation is done just once, at the beginning of a communicati
on session during which the key does not change, and so subkey-generation is usually not performance-critical
UMAC depends on a few different parameters
21
UMAC(cont.) An illustrative special case of UMAC
Subkey generation: Using a PRG, map Key to K = K1K2 K1024 and to A
– each Ki : a 32-bit word, |A| = 512
Hashing the message Msg to HM = NHXKey(Msg): Let Len be |Msg| mod 4096, encoded as a 2-byte string Append to Msg the minimum number of 0 bits to make |Msg| divisible by
8 Let Msg = Msg1 || Msg2 || || Msgt where each Msgi is 1024 words excep
t for Msgt, which has between 2 and 1024 words
Let HM = NHK(Msg1) || NHK(Msg2) || || NHK(Msgt) || Len
Computing the authentication tag: The tag is Tag = HMAC-SHA1A(HM || Nonce)
22
UMAC(cont.) Definition of NH
blocksize n 2, wordsize w 1 domain : A = {0, 1}2w {0, 1}2w {0, 1}nw
range : B = {0, 1}2w
a random function in NH[n,w] is given by a random nw-bit string K Uw : {0, , 2w-1}, U2w : {0, , 22w -1}
for integers x, y let (x +w y) denote (x + y) mod 2w
M A and M = M1 Ml , |M1| = = |Ml| = w
K = {0, 1}nw and K = K1 Kn , |K1| = = |Kn| = w
23
UMAC(cont.)
where mi Uw is the number that Mi represents (as an unsigned integer)
where ki Uw is the number that Ki represents (as an unsigned integer)
the right-hand side of the above equation is understood to name the (unique) 2w-bit string which represents (as an unsigned integer) the U2w-valued integer result
24
Compression Function of MD4
25
Compression Function of MD5
26
Compression Function of RIPEMD-160
27
Compression Function of SHA-1
Conclusions