universal hash families

28
Universal Hash Families

Upload: armina

Post on 02-Feb-2016

85 views

Category:

Documents


1 download

DESCRIPTION

Universal Hash Families. Universal hash families. Family of hash functions Finite multiset H of string-valued functions, each h ∈ H having the same nonempty domain A ⊆{0,1} * and range B ⊆{0,1} * , for some constant b - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Universal Hash Families

Universal Hash Families

Page 2: Universal Hash Families

2

Universal hash families Family of hash functions

Finite multiset H of string-valued functions, each h ∈ H having the same nonempty domain A {0,1}⊆ * and range B {0,1}⊆ *, for some constant b

[Definition] -almost universal2 (-AU2) & -almost XOR universa

l2 (-AXU2)

A family of hash functions H ={h: A {0,1}b} is -almost universal2 , wri

tten -AU2, if, for all distinct x, x’∈A, Prh ∈ H [h(x)=h(x’)] .

The family of hash functions H is -almost XOR universal2 , written -A

XU2, if, for all distinct x, x’∈A, and for all c {0,1}∈ b, Prh ∈ H [h(x)h(x’)=

c] . =maxxx’{Prh [h(x)=h(x’)]} : collision probability

Principle measure of an AU2 : How small its collision probability is and

how fast one can compute its functions

Page 3: Universal Hash Families

3

Composition of universal hash families Make the domain of a hash family bigger

H ={h: {0,1}a{0,1}b} Hm ={h: {0,1}am{0,1}bm}

its elements are the same as in H where h(x1 x2… xm), for | xi |=a, is defined by h(x1)|| h(x2)||…|| h(xm)

[Proposition] If H is -AU2, then Hm is -AU2

Make the collision probability smaller H1 ={h: A {0,1}b1}, H2 ={h: A {0,1}b2}

H1 & H2 ={h: A {0,1}b1+ b2}

its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is defined as h1(x)||h2(x)

[Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H1 & H2 is 12-AU2

Page 4: Universal Hash Families

4

Composition of universal hash families(cont.) Make the image of a hash function shorter

H1 ={h: {0,1}a{0,1}b}, H2 ={h: {0,1}b{0,1}c}

H2 ○ H1 ={h: {0,1}a{0,1}c}

its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is defined as h2(h1(x))

[Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H2 ○ H1 is (1+2)-AU2

Turn an AU2 family H1 and AXU2 family H2 into an AXU2 family H2 ○ H1

[Proposition] Suppose H1 ={h: AB} is 1-AU2 and H2 ={h: BC} is 2-AXU2. Then H2 ○ H1 ={h: AC} is (1+2)-AXU2

Page 5: Universal Hash Families

5

Related researches Carter-Wegman(1979, 1981)

Efficient authentication code under strongly universal hash functions Key observations

Long messages can be authenticated efficiently using short keys if the number of bits in the authentication tag is increased slightly compared to ‘perfect’ schemes

If a message is hashed to a short authentication tag, weaker properties are sufficient for the first stage of the compression

Under certain conditions, the hash function can remain the same for many plaintexts, provided that hash results is encrypted using a one-time pad

Stinson(1994) Improves the works by Wegman-Carter and establishes an explicit link

between authentication codes and strongly universal hash functions

Johansson-Kabatianskii-Smeets(1996) Establish a relation between authentication codes and codes correcting

independent errors

Page 6: Universal Hash Families

6

Related researches(cont.) Krawczyk(1994, 1995)

Propose universal hash functions that are linear with respect to bitwise XOR

Makes it easier to reuse the authentication code Encrypt the m-bit hash result for each new message using a one-time pad

Simple and efficient constructions based on polynomials and LFSR

Shoup(1996) Propose and analyze the constructions based on polynomials over finite

fields

Rogaway : Bucket hashing (1995) Halevi-Krawczyk : MMH (1997)

Make optimal used of the multiply and accumulate instruction of the Pentium MMX processor

Black-Halevi-Krawczyk-Krovertz-Rogaway : UMAC (1999) Further improved the performance on high end processors

Page 7: Universal Hash Families

7

Constructions Bucket hashing

is an -AU introduced by Rogaway Defining the Bucket Hash Family B

word size w(1), parameters n(1), N(3) domain D={0,1}wn, range R={0,1}wN

Let h B and let X=X1 Xn be the string we want to hash, where each |Xi|=w. Then h(X) is defined by the following algorithm. First, for each j{1,, N}, initialize Yj to 0w. Then, for each i{1,, n} and k hi, replace Yk by Yk Xi. When done, set h(X) = Y1||Y2||||YN.

Pseudocode

for j 1 to N do Yj 0w

for i 1 to n do

Yhi1 Yhi1 Xi

Yhi2 Yhi2 Xi

Yhi3 Yhi3 Xi

return Y1||Y2||||YN

Page 8: Universal Hash Families

8

Constructions(cont.) Bucket Hashing with Small Key Size

N=2s/L Each hash function hB’[w,M,N] is specified by a list of length M

each entry contains L integers in the interval [0, N-1]

L arrays are introduced, each containing N buckets Next, each array is compressed to s/L words, using a fixed primitive

element GF(2s/L) The hash result is equal to the concatenation of the L compressed arrays,

each containing s/L words

Page 9: Universal Hash Families

9

Constructions(cont.) Hash Family Based on Fast Polynomial Evaluation

is based on polynomial evaluation over a finite field q = 2r, Q = 2m = 2r+s, n = 1+2s, : a linear mapping from GF(Q) onto

GF(q) Q = q0

m, q = q0r , q0 : a prime power

fa(x) =a0 + a1x + + an-1xn-1

x, y, a0, a1, , an-1 GF(Q), z GF(q)

H = {hx,y,z : hx,y,z(a) = hx,y,z(a0, a1, , an-1) = (y fa(x)) + z}

Page 10: Universal Hash Families

10

Constructions(cont.) Hash Family Using Toeplitz Matrices

Toeplitz matrices are matrices with constant values on the left-to-right diagonals

A Toeplitz matrix of dimension n m can be used to hash messages of length m to hash results of length n by vector-matrix multiplication

The Toeplitz construction uses matrices generated by sequences of length n + m - 1 drawn from -biased distributions

-biased distributions are a tool for replacing truly random sequences by more compact and easier to generate sequences

The lower , the more random the sequence is

Krawczyk proves that the family of hash functions associated with a family of Toeplitz-matrices corresponding to sequences selected from a -biased distribution is -AXU with = 2-n +

Page 11: Universal Hash Families

11

Constructions(cont.) Evaluation Hash Function

is one of the variants analyzed by Shoup The input (of length tn) : viewed as a polynomial M(x) of degree < t

over GF(2n) The key : a random element GF(2n) the hash result : equal to M() GF(2n) This family of hash functions is -AXU with = t/2n

Page 12: Universal Hash Families

12

Constructions(cont.) Division Hash Function

represents the input as a polynomial M(x) of degree less than tn over GF(2)

The hash key : a random irreducible polynomial p(x) of degree n over GF(2)

The hash result : m(x) xn mod p(x) This family of hash functions is -AXU with = tn/2n

The total number of irreducible polynomials of degree n is roughly equal to 2n/n

Page 13: Universal Hash Families

13

Constructions(cont.) MMH(Multilinear Modular Hashing) hashing

consists of a (modified) inner product between message and key modulo a prime p (close to 2w, with w the word length; below w = 32)

is an -AXU2, but with xor replaced by subtraction modulo p The core hash function maps 32 32-bit message words and 32 32-bit key w

ords to a 32-bit result The key size is 1024 bits and = 1.5/ 230

For larger messages, a tree construction can be used the value of and the key length have to be multiplied by the height of the

tree

This algorithm is very fast on the Pentium Pro, which has a multiply and accumulate instruction

On a 32-bit machine, MMH requires only 2 instructions per byte for a 32-bit result

Page 14: Universal Hash Families

14

Comparing the Hash Functions

Page 15: Universal Hash Families

15

Comparing the Hash Functions(cont.) Scheme A

the input : divided into 32 blocks of 8 Kbyte each block is hashed using the same bucket hash function with N = 160 results in an intermediate string of 20480 bytes

Scheme B the input : divided into 64 blocks of 4 Kbyte

each block is hashed using the same bucket hash function with short key(s=42, L=6, N=128)

results in an intermediate string of 10752 bytes

Scheme C the input is divided into 64 blocks of 4 Kbyte

each block is hashed using a 331024 Toeplitz matrix, based on a -biased sequence of length 1056 generated using an 88-bit LFSR

The length of the intermediate string is 8448 bytes

Page 16: Universal Hash Families

16

Comparing the Hash Functions(cont.) Scheme D

the input : hashed twice using the polynomial evaluation hash function with = 2-15 resulting in a combined value of 2-30

W = 5 The performance is slightly key dependent. Therefore an average over a

number of keys has been computed.

Scheme E this is simply the evaluation hash function with t = 32768

the resulting value of is too small However, choosing a smaller value of n that is not a multiple of 32 induces

a performance penalty

Scheme F the input : divided into 2048 blocks of 128 bytes

each block is hashed twice using MMH the length of the intermediate string is 16384 bytes

– It is not possible to obtain a value of closer to 2-32 in an efficient way

Page 17: Universal Hash Families

Universal Hashing MAC

Page 18: Universal Hash Families

18

Message authentication based on Universal hashing Message authentication based on Universal hashing

Wegman-Carter approach The parties share a secret key k=(h,P)

– P : infinite random string

– h : function drawn randomly from a strongly universal2 family of hash functions H» H is strongly universal2 if, for all xx’, the random variable h1(x)

||h2(x), for h ∈ H , is uniformly distributed

To authenticate a message x, the sender transmits h(x) xored with the next piece of the pad P

Standard cryptographic technique use of a pseudorandom function family, F

[Theorem] Assume H is -AXU2, and that F is replaced by the truly random function family R of functions. In this case, if an adversary makes q1 queries to the authentication algorithm S and q2 queries to the verification algorithm V, the probability of forging a MAC is at most q2

))()(,()(),( iaiah xhiFixMAC

Page 19: Universal Hash Families

19

Universal hashing MAC Why Universal hashing MAC?

The speed of a universal hashing MAC depends on the speed of the hashing step and encrypting step

The encryption does not take long hash function compresses messages => the encrypting message is short

The combinatorial properties of the universal hash function family is mathematically proven

needs no “over-design” or “safe margin” the way a cryptographic primitive would

Universal hashing MAC makes for desirable security properties can select a cryptographically conservative design for the encrypting

step can pay with only a minor impact on speed the cryptographic primitive is applied only to the much shorter hashed

image of the message security and efficiency are not conflicting requirements

Page 20: Universal Hash Families

20

UMAC The UMAC algorithm

species how the message, key, and nonce determine an authentication tag

The sender will need to provide the receiver with the message, nonce, and tag

The receiver can then compute what “should be” the tag for this particular message and

nonce, and see if it matches the received tag

employs a subkey generation process in which the shared (convenient-length) key is mapped into UMAC's internal keys subkey generation is done just once, at the beginning of a communicati

on session during which the key does not change, and so subkey-generation is usually not performance-critical

UMAC depends on a few different parameters

Page 21: Universal Hash Families

21

UMAC(cont.) An illustrative special case of UMAC

Subkey generation: Using a PRG, map Key to K = K1K2 K1024 and to A

– each Ki : a 32-bit word, |A| = 512

Hashing the message Msg to HM = NHXKey(Msg): Let Len be |Msg| mod 4096, encoded as a 2-byte string Append to Msg the minimum number of 0 bits to make |Msg| divisible by

8 Let Msg = Msg1 || Msg2 || || Msgt where each Msgi is 1024 words excep

t for Msgt, which has between 2 and 1024 words

Let HM = NHK(Msg1) || NHK(Msg2) || || NHK(Msgt) || Len

Computing the authentication tag: The tag is Tag = HMAC-SHA1A(HM || Nonce)

Page 22: Universal Hash Families

22

UMAC(cont.) Definition of NH

blocksize n 2, wordsize w 1 domain : A = {0, 1}2w {0, 1}2w {0, 1}nw

range : B = {0, 1}2w

a random function in NH[n,w] is given by a random nw-bit string K Uw : {0, , 2w-1}, U2w : {0, , 22w -1}

for integers x, y let (x +w y) denote (x + y) mod 2w

M A and M = M1 Ml , |M1| = = |Ml| = w

K = {0, 1}nw and K = K1 Kn , |K1| = = |Kn| = w

Page 23: Universal Hash Families

23

UMAC(cont.)

where mi Uw is the number that Mi represents (as an unsigned integer)

where ki Uw is the number that Ki represents (as an unsigned integer)

the right-hand side of the above equation is understood to name the (unique) 2w-bit string which represents (as an unsigned integer) the U2w-valued integer result

Page 24: Universal Hash Families

24

Compression Function of MD4

Page 25: Universal Hash Families

25

Compression Function of MD5

Page 26: Universal Hash Families

26

Compression Function of RIPEMD-160

Page 27: Universal Hash Families

27

Compression Function of SHA-1

Page 28: Universal Hash Families

Conclusions