information introduction to information...
TRANSCRIPT
1
InformationIntroduction to
Information Theory
Basic concepts of information collecting and
processing systems
– Review the theory
– Examine examples
工学系研究科・機械工学専攻
Jean-Jacques Delaunay
Outline
1. What is information?
2. Where does information come from?
3. What is a message?
4. Where does the noise come from?
5. How does noise affect information transmission?
What you need to remember
• Definition of information [bit] is universal
• Measurements are always indirect and
therefore require calibration
• Noise causes random errors
2
International system of units
(Abbreviated SI)
SI is not adopted by Burma,
Liberia and the US!
The seven based units of SI:
meter [m], kilogram [kg], second [s], ampere [A],
kelvin [K], candela [cd], mole [mol]
Other units are derived from the based units
1. What is information?
Information
• Information is a measure of order
• Measure of order:
– Universal measure applicable to any structure,
any system (physical laws)
– Order quantifies the instructions needed to
produce a certain organization (binary
choices)
Minimum number of binary choices
Compute the information in any given arrangement
from the number of choices we must make to arrive at
that particular arrangement.
000 001 010 011 100 101 110 111
1st choice
2nd choice
3rd choice
Three correct consecutive choices
=> message identifying ‘101’ contains 3 bits of information
8 symbols of equally probable states
(bit : binary digit)
3
I am thinking of one person. How many
questions do you need to ask to find out?
Information definition
NI
1log2
• Information is a dimensionless entity
• I 0, information has to be acquired in order to make the
correct choice
For any number of possibilities N,
the information I for specifying a member
Information entropy (Shannon)Uncertainty is related to the logarithm of possible symbols
(equally likely symbols)
General formula for uncertainty
(non equally likely symbols)
M
2
1
For an infinite string which is formed from an alphabet of symbols,
the averaged uncertainty is
log ( ) [bits per symbol occurrence]
Symbol
i i
i
M
H p p
i
1
1 whose outcome has probability ; 1M
i i
i
,...,M p p
1/4
)(log2 NH
Where N is the number of equally likely symbols
ExampleWhen throwing a fair die, the probability of 5 is 1/6.
2 2log log 1/ 6 2.58 bitsH P
When 2 dices are thrown independently, the amount of
information associated with
thrown dice 1 = 5 AND thrown dice 2 = 4 is
1 2
2
1/ 6 1/ 6 1/ 36
log 5.17 bits
P P P
H P
We can also write
1 2 1 25.17 where 2.58 bitsH H H H H
4
Information entropy:
Shannon uncertainty equation
2
the number of available symbols; same frequency of occurrence
(equally likely symbols)
1The probability of a particular symbol appearing is
1The amount of information per symbol is log
M
PM
IM
2
2
log
The total amount of information in a string of length
formed from an alphabet of symbols is
log ( )
P
N
M
I N P
2/4 Information entropy:
Shannon uncertainty equation
equationy uncertaintShannon symbol]per [bits )(log
symbols allover averaged ,occurrence symbolper n informatio ofamount The
occurence symbolper n informatio providesit appears, Every time
)(log is stringin containsn informatio totalThe
)(log)(log is symbol by the providedn Informatio
is symbol ofy Probabilit
times appears symbol symbols; thefrom formed string Infinite
,...,, symbols ofSet
1
2
1
2
22
21
M
i
ii
i
ii
M
i
ii
iiiiii
iii
ii
M
ppN
I
N
Ix
pNpIN
pNppNIx
N
Npx
NxMN
xxxM
3/4
Information entropy:
Rate of information transmission
Amount of information that knowledge about the outcome
of an event adds to someone’s knowledge
errorontransmissi
after
Nsymbolsofnumber
before HHR
)(log
2
Self information is a decrease in uncertainty H
before and after communication
If there is no noise, the information communicated is Hbefore
4/4
Information is defined as a decrease in uncertainty
Information in the physical world
Living matter Nonliving matter
slyspontaneou form likely to Less
iespossibilit 4
(length) bases 10 of totalawith
sequences, specificin arranged
(letters) bases 4 of made isDNA
complex - ulesMacromolec
910
9
solutionin atoms
its from itself assembles crystalSalt
atoms sodium and chloride ofArray
NaCl) (e.g. worldMineral
5
DNA replication NaCl crystal New York Times
impurities no K,0at
0)1(log 2
T
I
)4(log910
2Ibits 8 character 1
characters 1000 page 1
2log10008
2
I
Entropy in statistical mechanics
• Measure of the dispersal of energy (“disorder”)
• Measure of “disorder”– For a system in thermodynamic equilibrium, the number N
of equivalent ways (microstates) a system can be constructed:
• For a perfect crystal at T = 0 K, N = 1, S = 0
• For a liquid phase, the number of equivalent ways for assembly gets very large
J/K 10 3806.1 constant,Boltzmann theis where
)ln(Entropy
23-
BB
B
kk
NkS
Dissipation of order and information
Smoke introduced into a closed chamber
Molecules are initially concentrated near
the source
=> significant order and information
Structure being lost. System evolves toward more
probable states
Macroscopic structure is lost. Thermodynamic
equilibrium.
=> the most probable state, information content zero,
entropy maximum
Tim
e
Info
rmation c
onte
nt:
dis
pla
cem
ent
of pro
bability w
ith
resp
ect
to t
he
the
rmo
d.
eq
uilib
.
Information and entropy of
communication theory
• Entropy
• Information
• Information is equivalent to entropy
)(log2 NH
)1
(log2N
I
0 entropy n informatio :lawon Conservati
An increase in entropy implies a decrease in information, and vice versa.
6
Summary
Communication theory: measure of order
n binary questions/choices N possibilities/symbols
N
Nn
2
2
log [bit]Entropy
)1
(log - [bit] nInformatio
Statistical mechanics: measure of disorder
N number of equivalent ways a system can be constructed
)ln( [J/K]Entropy NkB
)2( Nn 2. Where does information
come from?
Where does information comes from?
Information comes from sensors or transducers:
Sensors/transducers generate responses which
can be measured. This measurement creates
information.
=>Instrumentation and measurement form the
basis of all practical information systems
Transducers
A device that converts one form of energy into another.
Example: an electric motor is a transducer that converts mechanical energy into a voltage (or vice versa).
Transducers are important components in many types of sensor => convert the physical quantity to be measured into a voltage.
Transducers include sensors and actuators
-Sensors: Devices which transform a stimulus into an electrical signal (thermometer)
-Actuators: Devices which transform an input signal into motion (electrical motor)
7
SensorA device that receives and responds to a signal/stimulus.
Biological sensor
Electrochemical
character
Ion transport in nerve
fibers
Man-made sensor
Physical, chemical,
biochemical characters
Information is processed
and transported in
electrical form
stimulus
signalWorld
Environment
Body
sensorelectrical
signal
Instrument
An instrument senses some form of input and responds by producing an appropriate output.(Examples of instrument: balance, voltmeter, oscilloscope, PC, TV set…)
Important properties:
-Calibration
-Response time
-Range
-Noise
=> Measurement errors
=> Amount of information
is limited
Measurement
We do not measure weight directly. (indirect indication)
=> we have to calibrate the balance using known weights.
=> we compare the weight of the iron with a set of standard
weights.
Compression force from the
squeezed spring balances the
force from the increased weight.
We observe:
-pointer rotates (120 degrees)
-pan moves down (2 cm)
All measurements are comparisons with
some defined standard.
Calibration process
Where did the known weights come from?
National standard
Laboratory
Primary Reference
System
. . .
End User
System was
calibrated in the
step before
Makers
Calibrate a system
against the Primary
Reference System
Define what we
mean by “1 kg”,
“1 sec”.
Traceability: The ability to trace the calibration of a device through a chain of
calibrations to a primary standard. Traceability does not guarantee accuracy…
8
The Metrology Institute of Japan Calibration of measurement
devices always necessary?
Calibration standards are always necessary,
however special techniques may be applied to use
un-calibrated devices. For example:
- double weighting measurements (un-calibrated
balance, Borda XVIII century)
- reciprocity calibration for microphones
(piezoelectric)
Double weighting measurements
21
21
2
2112
22
221
112
MMM
rrgMMrrgM
grMMgr
grMMgr
1r2r
M M1
1r2r
M2 M
M (unknown mass) is measured two times (left and right side).
We write the two equilibrium conditions (torque balance):
onaccelerati nalgravitatio g
In this measurement, the
instrument is not calibrated.
M1 and M2 are used as the
weight standards.
Reciprocity calibration (1/2)The technique exploits the reciprocal nature of transduction
mechanisms such as the piezoelectric effect (microphones).
Uncalibrated microphones i, j, and k are used. The
microphones are placed facing each other with a well known
acoustic transfer impedance between their diaphragms.
ac impedance transfer acoustic Z
iI jUacZ
imic jmic
iI kUacZ
imic kmic
jIkUacZ
jmickmic
ji
i
j
ji MZMI
UZ ac,
impedance elecrical
1)
2)
3)
wavepressure toresponse
t measuremen voltage
sound of source
t measuremencurrent
U
I
9
Reciprocity calibration (2/2)
,
,
,
j
i j i ac j
i
ki k i ac k
i
kj k j ac k
j
UZ M Z M i j
I
UZ M Z M i k
I
UZ M Z M j k
I
The electrical transfer impedance is determined by measuring
currents and voltages only.
kacjkj
kjac
kiji
i
kacjkj
kjacikiji
MZMZ
MMZ
ZZM
MZMZ
MMZMZZ
,
,,
,
22
,,
kj
kiji
acackj
kiji
iZ
ZZ
ZZZ
ZZM
,
,,
,
,, 1
Accuracy and precision
We can never make perfect measurements
with absolute accuracy or precision:
1) Noise effect (thermal noise, shot noise…)
2) Calibration process
=> The amount of information we can collect
is always finite.
Accuracy and precision
Accurate
and precise
Not accurate
but precise
Not accurate
not preciseAccurate,
not precise
Summary
• Information is collected by instruments which perform some kind of measurement.
• All measurements are comparisons with a set of standards.
• The amount of information we can collect is always finite, limited by the effects of noise, saturation and response time.
10
3. What is a message?
Signal and message
Information channelInformation
Source Receiver
Measurement
+ coding
Transport signal Decoding
microphone wires earpiece
Telephone example:
Information is in the form of a
signal which carries a message.
Communication requires a code
Before a signal can be used to communicate some specific information in the form of a message, the sender and the receiver must have agreed on the details of how the actual signals are to be used (code).=> Distinguish one code symbol from another
Communicate information requires:
Addresser
Addressee
Channel
Code
Context
Message
Code includes Error Detection
-Parity bit-
We use redundancy to check for transmission errors
10100100 010100100 010100100
01010101 101010101 101010101
11101111 011101111 011101111
11101111 011101111 011111111
8 bit word 9 bit word
Parity bit
(Redundant information)
Parity = 0: odd numbers of 1
Parity = 1: even numbers of 1
Note that there exists two types of parity bits: even parity bit and odd parity bit.
Transmission
error
9 bit word
Odd parity bit: the parity bit of
each word can be set to 1 when it
contains an even number of 1.
Original word Parity bit added After transmission
0
1
0
0
0
1
0
1
Before Transmission After Transmission
11
Conversion into binary numbers
Define a signal range
(wide enough to ensure
that the signal is inside
the chosen range)
Sample
1
0
{1}
1
0
{10}
Conversion to
a 1-digit binary
Conversion to
a 2-digit binary
Sampling
Signal level moves out of the initial range: signal is said to have been clipped
Value indicated by the nearest available binary number.
Examples of analog-to-digital conversions
-Signal amplitude is coded using 12
bits for each pixel
-Acquisition rate is 10 frames per
second
Image sensors Oscilloscope (DAQ card)
-Signal amplitude is coded
using 8 bits
-Sample rate is 2 Giga points
per second
-bandwidth 100 MHz
How much information in a message?
Bit:Yes/no question; minimum possible amount of information.
Ask n questions => n bits of information (coded by 2n distinct symbols)
Note: ask an extra question double the number of symbols, but it doesn’t provide twice as much information.
2 bits (22 = 4 symbols) -> 3 bits (23 = 8 symbols)
=> 3/2 as much information
Total amount of information contained in a message:Information = Number of samples number of bits
Number of samples = measurement duration / sampling time
Number of bits = log2(number of symbols)
12
Capacity of an information channel
The amount of information that can be
communicated through a channel is limited
by the properties of that channel:
• Range
• Noise
• Response time (bandwidth)
Effect of noise
The amount of information an analog channel can convey is limited by the noise level.
• Channel maximum range 1 V; noise level 1 mV
Measurement with an Analog to Digital Convertor (ADC):
Set voltage range to 1 V
8-bit ADC => 1 V / 28 = 1/256 = 0.0039 V
10-bit ADC => 1 V / 210 = 1/1024 = 0.00097 V noise
> 10-bit ADC => obtain NO extra information
1/4
Effect of noise: Dynamic range
power. noisemean theis
signal theofpower maximum theis where
[decibels] log10Range Dynamic
noise
max
noise
max
P
P
P
P
2/4
Definition of the Dynamic range: the ratio of maximum
signal amplitude to the minimum amplitude detectable
Signal has a finite dynamic range
=> Amount of information limited by the noise level(the effect seen in the previous slide is a consequence of the signal
having a finite dynamic range)
Effect of noise:
Signal to noise ratio
3/4
noise
signal actual log10SNR
decibelsin (SNR) Ratio Noise toSignal
P
P
Note: SNR is different from the dynamic range (Pactual signal Pmax)
noise
max
2noise
2max
noise
signal actual
2noise
2signal actual
log20log10Range Dynamic
log20log10 SNR
V
V
V
V
V
V
V
V
rms noise
rms signal maximum
rms signal actual
where
noise
max
signal actual
V
V
V
13
Effect of noise:
Decrease in transmission rate
4/4
bit/sec 1 bit/sec 919.0081.01
081.0)01.0(log01.0)99.0(log99.0
1
0.99 is dtransmittecorrectly is 0 ay that probabilit
0.99 is dtransmittecorrectly is 1 ay that probabilit
noise with Case 2)
bit/sec 10,1
error)ion transmiss(no noise without Case 1)
:secondevery ed transmittsymbolslikely equally Two
22after
before
afterbefore
R
H
H
RHH
Before transmission After transmission
1 or 0
H = log2(2) = 1 bit
Without noise
-We receive a 1
P(1) = 1 => H=log2(1)=0
P(0) = 0
-We receive a 0
P(1) = 0
P(0) = 1 => H=log2(1)=0
With noise
-We receive a 1
P(1) = 0.99
P(0) = 0.01
-We receive a 0
P(1) = 0.01
P(0) = 0.99
Effect of response time
Limitation on how quickly the voltage can be changed (finite response time of any system).
Example: wires take 1 s to respond to a change.
Sampling at a frequency > 1 MHz do NOT provide extra information.
Maximum signal frequency the wires can carry?
One cycle, one up and down: 1 s + 1 s = 2 s
=>maximum frequency = 0.5 MHz
Sampling rate = 2 maximum frequency
1/2
Time
Frequency:
Signal frequency 1/T
Sampling frequency 2/T
Time:
Signal period: T
Sampling: T/2
Sampling frequency = 2 maximum signal frequency
T
T/2
signal
14
Bandwidth
Bandwidth: the range of frequencies a
channel can carry.
Usually bandwidth = maximum frequency
(range of frequencies extends to dc)
Sampling rate = 2 Bandwidth
2/2
Summary
• Information system consists of an information source connected to a receiver through an information channel.
• A set of information is a message which is sent as a signal using a code made up of symbols.
• The amount of information a channel can carry is limited by its range, the level of noise and its response time.
4. Where does noise come from?
Noise sources
Noise type Origin System Noise
spectrum
Thermal
(Johnson)
Thermal
agitation
Electronic
systems
Flat
(normally
distributed)
Shot Quantization
of electrical
charge
Electronic
systems
Flat
(Poisson-
distributed)
1/f System
dynamics
Chaotic
systems
1/f
15
Thermal noise
Mobile electrons move in a random fashion:
=> at any particular instant there may be
more e- near one end of the conductor than
the other.
Resistor
mobile electrons
1/3
Thermal noise occurs in all systems which are
not at absolute zero: cannot be avoided.
At time t more
e- on this side
Thermal noise 2/3
][ resistance sresistor' theis
[Hz]bandwidth observed theis
[K] re temperatusresistor' theis
J/K) 10(1.38constant Boltzman theis where
[V] 4
isresistor a accross voltagenoise theof rms the
[W] 4 ispower noise thermalThe
23-
2noise
R
B
T
k
RBTkv
TBkP
B
B
B
HzV/ -104
1000 300 23-101.38 4/2 :nsfluctuatio produces
K 300at resistor kΩ 1 a accross noise thermal:Example
9
Bnoise
v
Noise power
observation
3/3
noisev
R
inR
amplifier
BTkvR
iv
RR
vRR
Riv
iRvRR
vi
noiseinin
in
noise
in
ininin
inininin
noisein
2
max
2
2
4
1
:for obtained ispower maximum
power noisemean
;
ini
inv
Band-
bass filter
rms
volts
Shot noise
Shot noise arises because of the quantization
of the electrical charge (e = 1.610-19 C).
Ii
nmnt
enI
t
eni
i
k
m
i
kkk
kk
mk
1
,1
/1 where
,
t measuremen same Repeat thewire
n charges
cross the
surface in a
time t
Shot noise cannot be avoided.
Each carrier has its own velocity and separation from its neighbors
16
1/f noise
Thermal noise and shot noise are ‘white’ noises (have flat power spectra)
1/f noise :
noise power spectrum 1/f
where 0.5 2
1/f power spectrum observed in:
- Some chaotic systems, such as in the Bernoulli map (deterministic chaos).
- Brownian motion (red noise).
Example of 1/f noise
Summary
• Thermal noise and shot noise arises from
the quantization of electrical charge.
These types of noise are unavoidable.
• Thermal noise and Shot noise have a
uniform noise power spectral density
(white noise)
5. How does noise affect
information transmission?
Digital transmission over a noisy channel
Noise produces random errors
=> Information received is not always correct T
ransm
itte
d
Vmax
Vmin
Receiv
ed
0 a received levelDecision Voltage
1 a received levelDecision Voltage
:rules codeon Transmissi
2 levelDecision
digitsbinary of stream a assent Message
minmax
VVVDecision
Before transmission After transmission
17
Transmission errors
Dis
t. o
f "1
" voltages
Vmax
Dis
t. o
f "0
" voltages
Vmin
Decision level
Voltages that
give the wrong
result.
Decision level
Vmin
Vmax
{1}{0}
"1"!or :"0"iespossibilit only two
%50)(
0for that Note
where
) variablesof (change 2
1
2
1)(
:receivedcorrectly be will"1" ay that Probabilit
minmax
max2
1
2/
2
1
2
2
max
Decision
pp
pp
u
V
Vv
V
Decision
VvP
V
VVV
Vvudue
dveVvP
pp
Decision
0 2 4 6 80.5
0.6
0.7
0.8
0.9
1
Vp-p /
Pro
babili
ty t
o c
orr
ectly r
eceiv
e a
"1"
(SNR)
Noise can be helpful!
DitheringDigital transmission system
-Quantization interval: range/2^bit
-Signal details smaller than the quantization interval are lost
Dithering
-Superimpose random noise on the signal (noise fluctuation should be a little larger than the quantization interval)
-Quantize/digitize
-Transmit
-Filter (low-pass filter), time information
1/3
=> Signal details smaller than the quantization
interval can be transmitted (overcome the
distortion caused by quantization)Image dithering refers to a different technique: diffuse color information to nearest pixels.
18
2/3
-0.5
-0.4
-0.3
-0.2
-0.1
0
0.1
0.2
0.3
0.4
Sig
nal
5
6
7
8
9
10
11
Quantized s
ignal
quantization interval
3/3
Sig
nal +
Nois
eQ
uantized
Filt
ere
d
Signal averaging
• Voltmeter reading in the
presence/absence of noise:
– 3-digit voltmeter
– Reading in the absence of noise : 5.50 V
– If noise > quantization interval 0.01 V
=> readings : 5.51, 5.50, 5.50, 5.51, 5.49, etc.
average of the readings 5.503 V => better
precision!!!