
Page 1

Information
Introduction to Information Theory

Basic concepts of information collecting and processing systems
– Review the theory
– Examine examples

Graduate School of Engineering, Department of Mechanical Engineering (工学系研究科・機械工学専攻)

Jean-Jacques Delaunay

[email protected]

Outline

1. What is information?

2. Where does information come from?

3. What is a message?

4. Where does the noise come from?

5. How does noise affect information transmission?

What you need to remember

• Definition of information [bit] is universal

• Measurements are always indirect and

therefore require calibration

• Noise causes random errors

Page 2

International System of Units (abbreviated SI)

SI has not been adopted by Burma, Liberia and the US!

The seven base units of SI:
meter [m], kilogram [kg], second [s], ampere [A], kelvin [K], candela [cd], mole [mol]

Other units are derived from the base units.

1. What is information?

Information

• Information is a measure of order
• Measure of order:
– a universal measure applicable to any structure, any system (physical laws)
– order quantifies the instructions needed to produce a certain organization (binary choices)

Minimum number of binary choices

Compute the information in any given arrangement from the number of choices we must make to arrive at that particular arrangement.

8 symbols of equally probable states:
000 001 010 011 100 101 110 111

[Decision tree: 1st choice, 2nd choice, 3rd choice]

Three correct consecutive choices
=> a message identifying '101' contains 3 bits of information (bit: binary digit)
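This counting argument is easy to check in code. A minimal Python sketch (mine, not from the lecture) halves the candidate set once per yes/no question and compares the number of questions with log2 of the number of symbols:

```python
import math

symbols = [format(i, "03b") for i in range(8)]   # '000' ... '111'
target = "101"

questions = 0
candidates = symbols
while len(candidates) > 1:
    half = len(candidates) // 2
    lower, upper = candidates[:half], candidates[half:]
    candidates = upper if target in upper else lower   # one binary choice
    questions += 1

print(questions)                 # 3 consecutive choices
print(math.log2(len(symbols)))   # I = log2(8) = 3.0 bits
```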

Page 3

I am thinking of one person. How many

questions do you need to ask to find out?

Information definition

For any number of possibilities N, the information I for specifying a member is

I = log2(N) = -log2(1/N) [bit]

• Information is a dimensionless entity
• I ≥ 0: information has to be acquired in order to make the correct choice

Information entropy (Shannon) (1/4)

Uncertainty is related to the logarithm of the number of possible symbols.

Equally likely symbols:
H = log2(N), where N is the number of equally likely symbols

General formula for uncertainty (non-equally likely symbols):
for an infinite string formed from an alphabet of M symbols, symbol i (i = 1, ..., M) having outcome probability p_i with Σ p_i = 1, the averaged uncertainty is

H = -Σ_{i=1}^{M} p_i log2(p_i) [bits per symbol occurrence]

Example

When throwing a fair die, the probability of a 5 is 1/6.

H = -log2(P) = -log2(1/6) = 2.58 bits

When 2 dice are thrown independently, the amount of information associated with (thrown die 1 = 5 AND thrown die 2 = 4) is

P = P1 × P2 = 1/6 × 1/6 = 1/36
H = -log2(P) = 5.17 bits

We can also write H = H1 + H2 = 5.17 bits, where H1 = H2 = 2.58 bits.
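The same arithmetic in Python (a minimal sketch reproducing the numbers above):

```python
import math

p5 = 1 / 6                      # P(die 1 shows a 5)
h1 = -math.log2(p5)             # 2.58 bits

p_joint = (1 / 6) * (1 / 6)     # independent throws: P = 1/36
h_joint = -math.log2(p_joint)   # 5.17 bits

print(round(h1, 2), round(h_joint, 2), round(h1 + h1, 2))   # 2.58 5.17 5.17
```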

Page 4

Information entropy: Shannon uncertainty equation (2/4)

Let M be the number of available symbols, all with the same frequency of occurrence (equally likely symbols).

The probability of a particular symbol appearing is P = 1/M.
The amount of information per symbol is I = log2(1/P) = log2(M).

The total amount of information in a string of length N formed from an alphabet of M symbols is

I = N log2(M) = -N log2(P)

Information entropy: Shannon uncertainty equation (3/4)

Set of symbols x_1, x_2, ..., x_M.
Infinite string formed from the M symbols; symbol x_i appears N_i times in a string of length N.
Probability of symbol x_i: p_i = N_i / N.
Information provided by the symbol x_i: I_i = log2(1/p_i) = -log2(p_i).
Every time x_i appears, it provides I_i of information per symbol occurrence.
The total information contained in the string is I = Σ_i N_i I_i = -Σ_i N_i log2(p_i).

The amount of information per symbol occurrence, averaged over all symbols, is

H = I/N = -Σ_{i=1}^{M} p_i log2(p_i) [bits per symbol], the Shannon uncertainty equation
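The Shannon uncertainty equation translates directly into code. A short Python sketch (the function name is my own):

```python
import math

def shannon_entropy(probs):
    """Average uncertainty H = -sum_i p_i log2(p_i) [bits per symbol]."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.25] * 4))       # equally likely: log2(4) = 2.0 bits
print(shannon_entropy([0.7, 0.2, 0.1]))  # biased alphabet: about 1.157 bits
```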

Information entropy: Rate of information transmission (4/4)

Amount of information that knowledge about the outcome of an event adds to someone's knowledge:

R = H_before - H_after

where H_before = log2(N), N the number of symbols, and H_after is the uncertainty remaining after transmission (transmission error).

Self-information is a decrease in uncertainty H before and after communication.
If there is no noise, the information communicated is H_before.

Information is defined as a decrease in uncertainty.

Information in the physical world

Living matter:
Macromolecules are complex. DNA is made of 4 bases (letters) arranged in specific sequences, with a total of 10^9 bases (length) => 4^(10^9) possibilities. Less likely to form spontaneously.

Nonliving matter:
Mineral world (e.g. NaCl): an array of chloride and sodium atoms. A salt crystal assembles itself from its atoms in solution.

Page 5

DNA replication: I = log2(4^(10^9)) = 2×10^9 bits

NaCl crystal: at T = 0 K, no impurities, I = log2(1) = 0

New York Times: 1 page ≈ 1000 characters, 1 character = 8 bits, I = log2(2^(8×1000)) = 8000 bits

Entropy in statistical mechanics

• Measure of the dispersal of energy ("disorder")
• For a system in thermodynamic equilibrium, with N the number of equivalent ways (microstates) in which the system can be constructed:

Entropy S = k_B ln(N) [J/K], where k_B is the Boltzmann constant, k_B = 1.3806×10⁻²³ J/K

• For a perfect crystal at T = 0 K, N = 1, S = 0
• For a liquid phase, the number of equivalent ways of assembly gets very large

Dissipation of order and information

Smoke introduced into a closed chamber

Molecules are initially concentrated near

the source

=> significant order and information

Structure being lost. System evolves toward more

probable states

Macroscopic structure is lost. Thermodynamic

equilibrium.

=> the most probable state, information content zero,

entropy maximum

[Figure: information content, the displacement of probability with respect to the thermodynamic equilibrium, decreasing with time]

Information and entropy of communication theory

• Entropy: H = log2(N)
• Information: I = -log2(1/N)
• Information is equivalent to entropy

Conservation law: Δ(information) + Δ(entropy) = 0

An increase in entropy implies a decrease in information, and vice versa.

Page 6

Summary

Communication theory: measure of order
n binary questions/choices <=> N possibilities/symbols (N = 2^n)
Entropy [bit]: H = log2(N)
Information [bit]: I = -log2(1/N)

Statistical mechanics: measure of disorder
N = number of equivalent ways a system can be constructed
Entropy [J/K]: S = k_B ln(N)

2. Where does information come from?

Where does information come from?

Information comes from sensors or transducers: sensors/transducers generate responses which can be measured. This measurement creates information.

=> Instrumentation and measurement form the basis of all practical information systems.

Transducers

A transducer is a device that converts one form of energy into another.

Example: an electric motor converts electrical energy into mechanical motion; run in reverse, as a generator, it converts mechanical energy into a voltage.

Transducers are important components in many types of sensor => they convert the physical quantity to be measured into a voltage.

Transducers include sensors and actuators:
- Sensors: devices which transform a stimulus into an electrical signal (e.g. a thermometer)
- Actuators: devices which transform an input signal into motion (e.g. an electric motor)

Page 7

Sensor

A sensor is a device that receives and responds to a signal/stimulus.

Biological sensor: electrochemical character; ion transport in nerve fibers.
Man-made sensor: physical, chemical, biochemical characters; information is processed and transported in electrical form.

[Figure: stimulus from the world/environment -> body/sensor -> (electrical) signal]

Instrument

An instrument senses some form of input and responds by producing an appropriate output. (Examples of instruments: balance, voltmeter, oscilloscope, PC, TV set...)

Important properties:
- Calibration
- Response time
- Range
- Noise

=> Measurement errors
=> The amount of information is limited

Measurement

We do not measure weight directly (indirect indication). We observe:
- the pointer rotates (120 degrees)
- the pan moves down (2 cm)
The compression force from the squeezed spring balances the force from the increased weight.

=> we have to calibrate the balance using known weights
=> we compare the weight of the iron with a set of standard weights

All measurements are comparisons with some defined standard.

Calibration process

Where did the known weights come from?

National standard laboratory (defines what we mean by "1 kg", "1 sec")
-> Primary Reference System
-> Makers (calibrate a system against the Primary Reference System)
-> ...
-> End user (the system was calibrated in the step before)

Traceability: the ability to trace the calibration of a device through a chain of calibrations to a primary standard. Traceability does not guarantee accuracy...

Page 8

[Photo: The Metrology Institute of Japan]

Is calibration of measurement devices always necessary?

Calibration standards are always necessary; however, special techniques may be applied to use uncalibrated devices. For example:
- double weighing measurements (uncalibrated balance; Borda, 18th century)
- reciprocity calibration for microphones (piezoelectric)

Double weighing measurements

M (unknown mass) is measured two times (left and right side). We write the two equilibrium conditions (torque balance):

M g r1 = M1 g r2   (M on the arm of length r1, standard M1 on the arm of length r2)
M2 g r1 = M g r2   (standard M2 on the arm of length r1, M on the arm of length r2)

=> M² = M1 M2, i.e. M = sqrt(M1 × M2)

g: gravitational acceleration.

In this measurement, the instrument is not calibrated. M1 and M2 are used as the weight standards.

Reciprocity calibration (1/2)

The technique exploits the reciprocal nature of transduction mechanisms such as the piezoelectric effect (microphones).

Uncalibrated microphones i, j, and k are used. The microphones are placed facing each other with a well-known acoustic transfer impedance Z_ac between their diaphragms.

Three pairings are measured (I: current measurement, source of sound; U: voltage measurement, response to the pressure wave):

1) mic i -> mic j: Z_i,j = U_j / I_i
2) mic i -> mic k: Z_i,k = U_k / I_i
3) mic j -> mic k: Z_j,k = U_k / I_j

Z_ac: acoustic transfer impedance; Z: electrical transfer impedance; M: microphone sensitivity.

Page 9

Reciprocity calibration (2/2)

Z_i,j = U_j / I_i = M_i Z_ac M_j   (i -> j)
Z_i,k = U_k / I_i = M_i Z_ac M_k   (i -> k)
Z_j,k = U_k / I_j = M_j Z_ac M_k   (j -> k)

The electrical transfer impedance is determined by measuring currents and voltages only.

Z_i,j Z_i,k / Z_j,k = (M_i Z_ac M_j)(M_i Z_ac M_k) / (M_j Z_ac M_k) = M_i² Z_ac

=> M_i = sqrt( (1/Z_ac) × Z_i,j Z_i,k / Z_j,k )
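The last relation can be checked numerically. A Python sketch (the values and variable names are mine): choose sensitivities, build the three electrical transfer impedances, and recover M_i:

```python
import math

def mic_sensitivity_i(z_ij, z_ik, z_jk, z_ac):
    """Reciprocity calibration: M_i = sqrt((1/Z_ac) * Z_ij * Z_ik / Z_jk)."""
    return math.sqrt(z_ij * z_ik / (z_jk * z_ac))

# Synthetic check: pick sensitivities, form the impedances, recover M_i.
m_i, m_j, m_k, z_ac = 2.0, 3.0, 5.0, 0.1
z_ij, z_ik, z_jk = m_i * z_ac * m_j, m_i * z_ac * m_k, m_j * z_ac * m_k
print(mic_sensitivity_i(z_ij, z_ik, z_jk, z_ac))   # 2.0
```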

Accuracy and precision

We can never make perfect measurements

with absolute accuracy or precision:

1) Noise effect (thermal noise, shot noise…)

2) Calibration process

=> The amount of information we can collect

is always finite.

Accuracy and precision

[Figure: four target diagrams: accurate and precise; not accurate but precise; accurate, not precise; not accurate, not precise]

Summary

• Information is collected by instruments which perform some kind of measurement.

• All measurements are comparisons with a set of standards.

• The amount of information we can collect is always finite, limited by the effects of noise, saturation and response time.

Page 10

3. What is a message?

Signal and message

Information source -> [measurement + coding] -> information channel (transport signal) -> [decoding] -> receiver

Telephone example: microphone -> wires -> earpiece

Information is in the form of a signal which carries a message.

Communication requires a code

Before a signal can be used to communicate some specific information in the form of a message, the sender and the receiver must have agreed on the details of how the actual signals are to be used (the code).
=> Distinguish one code symbol from another

Communicating information requires:
- Addresser
- Addressee
- Channel
- Code
- Context
- Message

Code includes Error Detection - Parity bit -

We use redundancy to check for transmission errors.

Original word (8 bit)   Parity bit added (9 bit)   After transmission (9 bit)
10100100                010100100                  010100100
01010101                101010101                  101010101
11101111                011101111                  011101111
11101111                011101111                  011111111  <- transmission error

Parity = 0: odd number of 1s
Parity = 1: even number of 1s

Odd parity bit: the parity bit of each word is set to 1 when the word contains an even number of 1s.
Note that there exist two types of parity bit: even parity and odd parity.
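A minimal Python sketch of the odd-parity convention used above (the function names are mine):

```python
def add_parity(word):
    """Odd parity: prefix 1 when the 8-bit word holds an even number of 1s."""
    ones = word.count("1")
    return ("1" if ones % 2 == 0 else "0") + word

def parity_ok(word9):
    """A 9-bit odd-parity word must contain an odd number of 1s."""
    return word9.count("1") % 2 == 1

sent = add_parity("11101111")                 # '011101111' (seven 1s -> parity 0)
received = "011111111"                        # one bit flipped in transmission
print(parity_ok(sent), parity_ok(received))   # True False -> error detected
```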

Page 11

Conversion into binary numbers

Define a signal range (wide enough to ensure that the signal is inside the chosen range), then sample: each sampled value is indicated by the nearest available binary number.

[Figure: a sampled level converted to a 1-digit binary value {1} and to a 2-digit binary value {10}]

If the signal level moves out of the initial range, the signal is said to have been clipped.

Examples of analog-to-digital conversions

Image sensors:
- Signal amplitude is coded using 12 bits for each pixel
- Acquisition rate is 10 frames per second

Oscilloscope (DAQ card):
- Signal amplitude is coded using 8 bits
- Sample rate is 2 Giga points per second
- Bandwidth 100 MHz

How much information in a message?

Bit: a yes/no question; the minimum possible amount of information.
Ask n questions => n bits of information (coded by 2^n distinct symbols).

Note: asking an extra question doubles the number of symbols, but it doesn't provide twice as much information:
2 bits (2² = 4 symbols) -> 3 bits (2³ = 8 symbols) => 3/2 as much information.

Total amount of information contained in a message:
Information = number of samples × number of bits
Number of samples = measurement duration / sampling time
Number of bits = log2(number of symbols)
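As a worked example of this counting rule (a sketch; the sample rate and bit depth are taken from the oscilloscope on the previous slide, while the 1 ms duration is assumed for illustration):

```python
import math

sample_rate = 2e9          # oscilloscope: 2 Giga points per second
bits = math.log2(2 ** 8)   # 8-bit amplitude coding -> 8 bits per sample
duration = 1e-3            # 1 ms measurement (assumed)

samples = duration * sample_rate      # duration / sampling time
information = samples * bits
print(f"{information:.3e} bits")      # 1.6e7 bits in the message
```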

Page 12

Capacity of an information channel

The amount of information that can be

communicated through a channel is limited

by the properties of that channel:

• Range

• Noise

• Response time (bandwidth)

Effect of noise (1/4)

The amount of information an analog channel can convey is limited by the noise level.

• Channel maximum range 1 V; noise level 1 mV

Measurement with an Analog-to-Digital Converter (ADC), voltage range set to 1 V:
8-bit ADC => 1 V / 2⁸ = 1/256 = 0.0039 V
10-bit ADC => 1 V / 2¹⁰ = 1/1024 = 0.00097 V ≈ noise
> 10-bit ADC => NO extra information is obtained
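The same arithmetic as a small Python sketch (the function name is mine):

```python
def adc_step(v_range, bits):
    """Quantization interval of an ideal ADC: range / 2^bits."""
    return v_range / 2 ** bits

noise = 1e-3    # 1 mV noise level
for bits in (8, 10, 12):
    step = adc_step(1.0, bits)
    verdict = "resolves above the noise" if step > noise else "at or below the noise: no extra information"
    print(f"{bits}-bit ADC: step = {step * 1e3:.3f} mV, {verdict}")
```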

Effect of noise: Dynamic range (2/4)

Definition of the dynamic range: the ratio of the maximum signal amplitude to the minimum amplitude detectable.

Dynamic Range = 10 log10(P_max / P_noise) [decibels]

where P_max is the maximum power of the signal and P_noise is the mean noise power.

A signal has a finite dynamic range
=> the amount of information is limited by the noise level (the effect seen in the previous slide is a consequence of the signal having a finite dynamic range).

Effect of noise: Signal to noise ratio (3/4)

Signal to Noise Ratio (SNR) in decibels:

SNR = 10 log10(P_actual signal / P_noise)

Note: SNR is different from the dynamic range (P_actual signal ≤ P_max).

In terms of rms voltages (V_actual signal: actual signal rms; V_max: maximum signal rms; V_noise: noise rms):

SNR = 10 log10(V²_actual signal / V²_noise) = 20 log10(V_actual signal / V_noise)
Dynamic Range = 10 log10(V²_max / V²_noise) = 20 log10(V_max / V_noise)
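Both decibel quantities as short Python functions (a sketch; the example voltages are illustrative, not from the slides):

```python
import math

def snr_db(v_signal, v_noise):
    """SNR in dB from rms voltages: 20 log10(V_signal / V_noise)."""
    return 20 * math.log10(v_signal / v_noise)

def dynamic_range_db(v_max, v_noise):
    """Dynamic range in dB from rms voltages: 20 log10(V_max / V_noise)."""
    return 20 * math.log10(v_max / v_noise)

print(dynamic_range_db(1.0, 1e-3))  # 60 dB (1 V full scale, 1 mV noise)
print(snr_db(0.1, 1e-3))            # 40 dB (0.1 V actual signal)
```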

Page 13

Effect of noise: Decrease in transmission rate (4/4)

Two equally likely symbols (1 or 0) transmitted every second; H_before = log2(2) = 1 bit.

1) Case without noise (no transmission error):
If we receive a 1: P(1) = 1, P(0) = 0 => H_after = -log2(1) = 0 (and likewise when we receive a 0).
R = H_before - H_after = 1 bit/sec

2) Case with noise:
probability that a 1 is correctly transmitted: 0.99
probability that a 0 is correctly transmitted: 0.99
If we receive a 1: P(1) = 0.99, P(0) = 0.01 (and likewise when we receive a 0), so

H_after = -0.99 log2(0.99) - 0.01 log2(0.01) = 0.081 bit
R = H_before - H_after = 1 - 0.081 = 0.919 bit/sec < 1 bit/sec
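Case 2 can be reproduced directly (a minimal Python sketch of the numbers above):

```python
import math

def binary_entropy(p):
    """H = -p log2(p) - (1 - p) log2(1 - p), in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

h_before = math.log2(2)          # 1 bit per symbol
h_after = binary_entropy(0.99)   # uncertainty left by the noise
rate = h_before - h_after
print(round(h_after, 3), round(rate, 3))   # 0.081 0.919 (bit/sec)
```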

Effect of response time (1/2)

There is a limitation on how quickly the voltage can be changed (the finite response time of any system).

Example: wires take 1 µs to respond to a change.
Sampling at a frequency > 1 MHz does NOT provide extra information.

Maximum signal frequency the wires can carry?
One cycle, one up and one down: 1 µs + 1 µs = 2 µs
=> maximum frequency = 0.5 MHz

Sampling rate = 2 × maximum frequency

In time: signal period T, sampling interval T/2.
In frequency: signal frequency 1/T, sampling frequency 2/T.
Sampling frequency = 2 × maximum signal frequency

Page 14

Bandwidth (2/2)

Bandwidth: the range of frequencies a channel can carry.
Usually bandwidth = maximum frequency (the range of frequencies extends down to dc).

Sampling rate = 2 × bandwidth

Summary

• An information system consists of an information source connected to a receiver through an information channel.
• A set of information is a message, which is sent as a signal using a code made up of symbols.
• The amount of information a channel can carry is limited by its range, the level of noise and its response time.

4. Where does noise come from?

Noise sources

Noise type          Origin                              System               Noise spectrum
Thermal (Johnson)   Thermal agitation                   Electronic systems   Flat (normally distributed)
Shot                Quantization of electrical charge   Electronic systems   Flat (Poisson-distributed)
1/f                 System dynamics                     Chaotic systems      1/f

Page 15

Thermal noise (1/3)

Mobile electrons in a resistor move in a random fashion:
=> at any particular instant there may be more e⁻ near one end of the conductor than the other.

[Figure: resistor with mobile electrons; at time t more e⁻ on one side]

Thermal noise occurs in all systems which are not at absolute zero: it cannot be avoided.

Thermal noise (2/3)

The thermal noise power is P = 4 k_B T B [W].
The rms of the noise voltage across a resistor is

v_noise = sqrt(4 k_B T B R) [V]

where k_B is the Boltzmann constant (1.38×10⁻²³ J/K), T is the resistor's temperature [K], B is the observed bandwidth [Hz], and R is the resistor's resistance [Ω].

Example: the thermal noise across a 1 kΩ resistor at 300 K produces fluctuations

v_noise / sqrt(B) = sqrt(4 × 1.38×10⁻²³ × 300 × 1000) ≈ 4×10⁻⁹ V/sqrt(Hz)
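The same fluctuation figure can be computed in Python (a sketch using the constants above):

```python
import math

K_B = 1.38e-23   # Boltzmann constant [J/K]

def thermal_noise_rms(r_ohm, t_kelvin, b_hz):
    """rms Johnson noise voltage: v = sqrt(4 k_B T B R) [V]."""
    return math.sqrt(4 * K_B * t_kelvin * b_hz * r_ohm)

print(thermal_noise_rms(1e3, 300.0, 1.0))   # ~4.1e-9 V/sqrt(Hz) density
print(thermal_noise_rms(1e3, 300.0, 1e8))   # ~4.1e-5 V over a 100 MHz bandwidth
```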

Noise power observation (3/3)

A noisy resistor R (rms noise voltage v_noise) drives an amplifier of input resistance R_in through a band-pass filter:

i_in = v_noise / (R + R_in);  v_in = R_in i_in = v_noise R_in / (R + R_in)

mean noise power P = v_in i_in = v_noise² R_in / (R + R_in)²

The maximum power is obtained for R_in = R:

P_max = v_noise² / (4R) = k_B T B

Shot noise

Shot noise arises because of the quantization of the electrical charge (e = 1.6×10⁻¹⁹ C).

n_k charges cross the wire's surface in a time t: i_k = n_k e / t.
Repeating the same measurement (k = 1, ..., m):

I = n̄ e / t, where n̄ = (1/m) Σ_{k=1..m} n_k; the individual i_k differ from I.

Each carrier has its own velocity and separation from its neighbors.
Shot noise cannot be avoided.
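A small simulation of the repeated measurement described above (a sketch; modeling the counts as Poisson-like, via a Gaussian approximation with spread sqrt(n), is my assumption, consistent with the Poisson-distributed spectrum in the noise-source table):

```python
import random

E = 1.6e-19      # elementary charge [C]
T = 1e-6         # counting window [s]
N_MEAN = 1e6     # mean carriers per window (illustrative)

random.seed(0)
# Poisson-like counts, approximated as Gaussian with sigma = sqrt(N_MEAN)
counts = [random.gauss(N_MEAN, N_MEAN ** 0.5) for _ in range(10000)]
currents = [n * E / T for n in counts]   # i_k = n_k e / t

i_avg = sum(currents) / len(currents)    # I = n_bar e / t
rms = (sum((i - i_avg) ** 2 for i in currents) / len(currents)) ** 0.5
print(f"I = {i_avg:.3e} A, shot-noise rms = {rms:.3e} A")
```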

Page 16

1/f noise

Thermal noise and shot noise are 'white' noises (they have flat power spectra).

1/f noise: noise power spectrum ∝ 1/f^α, where 0.5 < α < 2.

A 1/f power spectrum is observed in:
- some chaotic systems, such as the Bernoulli map (deterministic chaos)
- Brownian motion (red noise)

[Figure: example of 1/f noise]

Summary

• Thermal noise arises from thermal agitation, and shot noise from the quantization of electrical charge. These types of noise are unavoidable.
• Thermal noise and shot noise have a uniform noise power spectral density (white noise).

5. How does noise affect

information transmission?

Digital transmission over a noisy channel

Noise produces random errors
=> the information received is not always correct.

The message is sent as a stream of binary digits.
Transmission code rules:
Voltage ≥ decision level -> received as a 1
Voltage < decision level -> received as a 0
Decision level = (V_max + V_min) / 2

[Figure: transmitted and received waveforms between V_min and V_max, before and after transmission]

Page 17

Transmission errors

[Figure: distribution of "1" voltages around V_max and of "0" voltages around V_min; voltages that cross the decision level give the wrong result]

Probability that a "1" will be correctly received:

P(v > V_Decision) = 1/(σ sqrt(2π)) ∫_{V_Decision}^{∞} exp(-(v - V_max)²/(2σ²)) dv
                  = 1/sqrt(2π) ∫_{-V_pp/(2σ)}^{∞} exp(-u²/2) du   (change of variables u = (v - V_max)/σ)

where V_pp = V_max - V_min and V_Decision = (V_max + V_min)/2.

Note that for V_pp -> 0, P -> 50% (only two possibilities: "0" or "1"!).

[Figure: probability to correctly receive a "1" versus V_pp/σ (SNR), rising from 0.5 at 0 toward 1 near 8]
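The probability curve can be evaluated with the error function from Python's standard library (a sketch; the Gaussian noise model follows the integral above):

```python
import math

def p_correct_one(v_pp, sigma):
    """P(v > decision level) for a transmitted '1'.

    The '1' distribution is Gaussian around V_max, and the decision level
    sits V_pp/2 below it, so P = Phi(V_pp / (2 sigma)).
    """
    z = v_pp / (2 * sigma)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

for ratio in (0, 2, 4, 6, 8):   # V_pp / sigma, as on the plot's axis
    print(ratio, round(p_correct_one(ratio, 1.0), 5))
# 0 -> 0.5, 2 -> 0.84134, 8 -> 0.99997
```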

Noise can be helpful! Dithering (1/3)

Digital transmission system:
- Quantization interval: range / 2^(number of bits)
- Signal details smaller than the quantization interval are lost

Dithering:
- Superimpose random noise on the signal (the noise fluctuation should be a little larger than the quantization interval)
- Quantize/digitize
- Transmit
- Filter (low-pass filter) to recover the time-averaged information

=> Signal details smaller than the quantization interval can be transmitted, overcoming the distortion caused by quantization (a simulation sketch follows below).

Note: image dithering refers to a different technique: diffusing color information to the nearest pixels.

Page 18

Dithering (2/3)

[Figure: a signal with amplitude smaller than the quantization interval and its quantized version; the detail disappears after quantization]

Dithering (3/3)

[Figure: signal + noise, the quantized result, and the low-pass filtered output; filtering recovers the sub-interval signal]

Signal averaging

• Voltmeter reading in the presence/absence of noise:
– 3-digit voltmeter
– reading in the absence of noise: 5.50 V
– if noise > quantization interval (0.01 V)
=> readings: 5.51, 5.50, 5.50, 5.51, 5.49, etc.
The average of the readings, 5.503 V => better precision!!!
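The voltmeter example as a quick simulation (a sketch; Gaussian noise comparable to the 0.01 V quantization step is assumed):

```python
import random

random.seed(0)
TRUE_V = 5.503   # actual voltage
STEP = 0.01      # quantization interval of a 3-digit voltmeter

readings = [round(random.gauss(TRUE_V, STEP) / STEP) * STEP
            for _ in range(1000)]   # 5.51, 5.50, 5.50, 5.49, ...
print(round(sum(readings) / len(readings), 4))   # close to 5.503
```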