
Inertial and Video Data Integration

ANTOINE DEBLONDE

Master of Science Thesis
Stockholm, Sweden 2011

Master's Thesis in Computer Science (30 ECTS credits)
at the School of Computer Science and Engineering
Royal Institute of Technology, year 2011
Supervisor at CSC was Lars Kjelldahl
Examiner was Lars Kjelldahl
TRITA-CSC-E 2011:044
ISRN-KTH/CSC/E--11/044--SE
ISSN-1653-5715
Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.kth.se/csc

Abstract

As a part of its research projects, the Advanced Studies Department of Thales Training & Simulation is studying mixed reality solutions as an alternative to training simulation. Here, the study focuses on the conception and realization of a new kind of 3D positioning, aiming at a supervision system in mixed reality. The core idea is the fusion of heterogeneous data from 6-axis inertial sensors and a camera, to locate and accurately orient individuals moving in an urban area.
Using inertial data to estimate an object's attitude when visual data are not available is a goal that would have a strong impact in geolocalization or human-computer interfaces. But the most obvious approach, an integration of acceleration over time to get the position, is in fact theoretically a dead end.
Nevertheless, the efficiency and quality of inertial measurement devices have been constantly improving for years, and such devices are becoming more and more affordable.
The aim of this work was to estimate whether, by using professional-quality inertial measurement units and combining several methods of signal processing and data analysis, it is possible to obtain a relevant estimation of an object's attitude, in terms of position and orientation, when nothing more than inertial data are available, provided that an arbitrarily long calibration phase, during which video data were available, occurred beforehand.

Contents

Introduction
Thales Group
Goals & Methods

I Experimental protocol
1 Inertial Measurement Unit technical characteristics
2 Dead reckoning
3 Attitude estimation, theory
  3.1 First scenario
  3.2 Second scenario
4 Application

II Dealing with dead reckoning: State of the Art
5 Problem with temporal integration of acceleration
6 Angular integration
7 Numerical filters
  7.1 Kalman Filter
  7.2 Extended Kalman Filter
  7.3 Classical Linear Numerical Filters - Reminders
    7.3.1 Bessel Filter
    7.3.2 Butterworth Filter
    7.3.3 Chebychev Filter
8 Numerical integration
  8.1 Temporal integration in frequency domain - Reminders

III Implementation
9 Features
  9.1 OpenSceneGraph
  9.2 IMU's data
  9.3 Video tracking with osgART
  9.4 Recording data with OpenCV
  9.5 Signal processing
10 Computing the transformation between IMU and camera coordinate system
11 Overview of the program
12 Reliability of the camera tracker

IV Drift Correction
13 Modeling a moving offset using a linear regression
  13.1 Computing camera-based acceleration
14 Gain Correction
15 Detection of shocks
16 Orientation with the magnetic field
17 Removing the gravitational component of the acceleration using moving average

V Results
18 Integration in spectral domain
19 Testing the stability of the magnetic field
20 Effects of filters
  20.1 Effect of Kalman filter
  20.2 Effect of other linear filters
21 Computing the position
22 Using the IMU for orientation only

Conclusions and Discussions
Bibliography

Introduction

The development of advanced real-time mixed reality solutions for various purposes, such as full-scale military training, requires the parallel development of methods for linking virtual scenes and real ones. Besides cameras, which are among the most widely used devices for such purposes, the constant improvement of accelerometers, together with their lowering prices, allows us to use inertial measurement units as auxiliary devices to provide a connection between virtuality and reality.
An inertial measurement unit (IMU) can provide real-time data about its angular and spatial accelerations in three dimensions, using accelerometers and gyroscopes, with a frequency that can easily reach 100 Hz or more. These accelerations can be processed to obtain the position and orientation of the IMU, since its acceleration is nothing more than the double time derivative of the position. Nevertheless, the double integration of noisy data can lead to a significant drift in the computed position of the IMU: in practice, no method exists that ensures the correctness of the time integration of acceleration in all situations without a drift.
In this internship, we are instead studying a particular case of data processing: the IMU is coupled with a tracking camera, using visual markers to do the positioning in space. The 3D position obtained through acceleration integration is expected to substitute for the camera when the latter fails at tracking the markers. Thus, we consider a small time frame, and movements that can be associated with typical scenarios: a monitoring robot or camera, a human walking in a corridor, a car in a street... The main goal is to achieve positional tracking using only the inertial data, possibly coupled with GPS data provided by a third-party application/device, with an accuracy that can prove useful for CCTV, SLAM (Simultaneous Localization and Mapping) or other industry-oriented applications, during a time frame that gives the camera enough time to catch back on to the markers in realistic situations. The use case will be an internal project of virtualized environment by the company Thales.

Thales Group

Thales, previously known as Thomson-CSF, is a French electronics company, specialized in aerospace, defense, security and information technologies. The company is present in 50 countries, has 68 000 employees, and is one of the world leaders on the market of information systems for security, communication and avionics, both in the civil and military sectors. With an annual revenue of €12.9 billion in 2009, the group is the 9th largest defense contractor in the world. Currently in reorganization, the group has a matrix structure, with entities for both geographical areas and business specialties.

Thales Avionics, BL Training & Simulation

Thales Training & Simulation has been an entity of the Thales Avionics division since the last reorganization of the company. This entity brings together all the units of the group related to the simulation field. The BL develops simulators for airplanes, military vehicles, trucks, power plants, and urban areas.

Goals & Methods

The aim of the internship is the acquisition and processing of the frames emitted by a professional inertial measurement unit, in order to extract its position and attitude in a robust way. The coupling with an image from a camera will make it possible to refine the extrinsic positioning, through matching with a photo-realistic synthesized model.
First, we have to calibrate an inertial measurement unit and develop robust algorithms for pose estimation, using inertial data extracted from the IMU; we then have to integrate these data in a virtual environment, built with the C++ library OpenSceneGraph, in order to obtain a real-time positional mapping. Work progressed on the following features:

- Integration of the Xsens IMU into an OpenSceneGraph program
Since the team's project was built on the OpenSceneGraph library, it was necessary to learn to use this library, and to implement an OpenSceneGraph program able to read and use the data flow from the IMU. This work was greatly facilitated by the availability of the MTi-G driver's source code.

- Integration of a tracker feature using camera data into this program
The IMU was aimed to work as an auxiliary of a camera tracking system (and possibly of a GPS, but this was not part of the subject). Image processing is more deeply studied and documented than inertial processing, and a lot of material is available on the subject. We chose to use a couple of C++ libraries targeted at augmented reality to implement the camera tracking: ARToolkit (augmented reality toolkit) and osgART (ARToolkit for OpenSceneGraph).

- Experimentation on filters
Several data processing methods and filters were tried to increase the usability of the data we got from the IMU and the image tracking, most notably Kalman filters and Infinite Impulse Response filters.

- Experimentation on integration and derivation
Several derivation methods (to get the theoretical acceleration from the position provided by the camera tracker) and integration methods were tested in order to minimize the drift from the integration step.

Part I

Experimental protocol

1 Inertial Measurement Unit technical characteristics

The inertial measurement unit used for the experimental protocol is the MTi-G from Xsens. It provides accelerometers measuring acceleration along the 3 dimensions, 3 gyroscopes measuring angular velocity around the 3 axes, a magnetometer and a barometer, and can make use of an embedded GPS. We won't make use of the GPS features.

Figure 1: The IMU model MTi-G from Xsens

Technical data about the inertial measurement unit (MTi-G from Xsens) are given below:

Angular resolution: 0.05 degrees
Static accuracy (roll/pitch): 0.5 degrees
Static accuracy (yaw): 1.0 degrees
Accelerometer & gyro refresh rate: 120 Hz (calibrated) / 500 Hz (raw)
Numerical accuracy (accelerometer & gyro): 16 bits
Bandwidth: gyro 40 Hz, accelerometer 30 Hz
GPS refresh rate: 4 Hz
GPS accuracy: 2.5 m CEP

Xsens provides, besides its hardware, the source code of the IMU driver, which helps us integrate this code directly into a pose estimation program, and eventually to modify it. This code notably allows setting numerous IMU parameters.
The following data outputs are available, through preprocessor variables of the CMT_OUTPUTMODE kind:

- Temperature

- Calibrated Data

- Orientation

- Position (through GPS)

- Velocity (through GPS)

- GPS PVT (Position, Velocity, Time & Pressure)

- Raw Data (can be used only combined with the GPS PVT output)

The following settings are also available, through preprocessor variables of the CMT_OUTPUTSETTINGS kind (a configuration sketch follows the list):

- time stamp (counter and/or UTC)

- orientation representation (quaternion, Euler angles, matrix)

- activation/deactivation of the acceleration calibration, rotation frequency and magnetometer

- output format (floating point on 4 bytes by default, otherwise fixed point: 12-bit integer / 20-bit fraction in low accuracy, 16-bit integer / 32-bit fraction in high accuracy)

- coordinate system choice (X North Z up / North East Down)

- axis origin calibration settings: LLA WGS84 (position) and XYZ (velocity)
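As an illustration, a minimal configuration of these outputs through the CMT driver sources could look like the sketch below. The class and constant names (xsens::Cmt3, CmtDeviceMode, XRV_OK) are reproduced from the CMT SDK as we recall it and may differ between SDK versions; treat the whole snippet as an assumption rather than as the exact code used in this work.

// Hypothetical sketch: asking the MTi-G for calibrated data and quaternion
// orientation at 100 Hz through the CMT level-3 interface released by Xsens.
#include "cmt3.h"

bool configureImu(xsens::Cmt3& cmt3, const CmtDeviceId deviceId)
{
    // Calibrated inertial data plus the internally computed orientation.
    CmtOutputMode mode = CMT_OUTPUTMODE_CALIB | CMT_OUTPUTMODE_ORIENT;

    // Orientation as a quaternion, sample-counter time stamps.
    CmtOutputSettings settings = CMT_OUTPUTSETTINGS_ORIENTMODE_QUATERNION |
                                 CMT_OUTPUTSETTINGS_TIMESTAMP_SAMPLECNT;

    // 100 Hz stays below the 120 Hz calibrated-data limit quoted above.
    CmtDeviceMode deviceMode(mode, settings, 100);

    return cmt3.setDeviceMode(deviceMode, true, deviceId) == XRV_OK;
}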

Besides raw data from all of its sensors, the MTi-G performs a first processing of the raw data and provides calibrated data, using notably an internally developed extended Kalman filter (XKF). Among these data, we can find the angular orientation of the IMU, directly integrated over time. We will explain in the next part why it is possible to produce a reliable integration of the angular velocity, but not to do the same for the acceleration to obtain the position.

2 Dead reckoning

A typical application of the algorithms we are working on would be the attitude estimation of a mobile on which an Inertial Measurement Unit has been attached. The mobile is also equipped with a camera, which is able to compute the mobile's attitude with respect to the position and orientation of a motionless visual marker, most of the time.
In this case, we would try to compute an attitude estimation of the mobile when the camera, for any reason, loses track of the marker. For this, we will use a dead reckoning method on the mobile's trajectory, meaning that at each time step, we are going to estimate the current position and orientation of the mobile based on the previously known values of its position, orientation, acceleration, speed, and angular speed.

Such an experiment would fit a situation where, for example, the attitude of a robot doing Simultaneous Localization and Mapping is needed.

Another application of the algorithms would be a situation where the camera is motionless, and the marker is rigidly linked with the mobile. Here, the camera is supposed to give the attitude of the mobile in the camera's coordinate system, and the IMU is supposed to provide this attitude when the camera loses the marker.

3 Attitude estimation, theory

The Inertial Measurement Unit provides us with a measure of the acceleration vector

A = \frac{d^2}{dt^2}\left(\overrightarrow{OG}\right)

where G is the center of mass of the IMU, and O is the origin of the Galilean reference frame, the derivative also being taken in the Galilean reference frame. Two things are to be noticed:

- If the acceleration is expressed in the Galilean reference frame, it is measured in the local coordinate system of the IMU, i.e. according to its local X, Y, Z axes at any time t.

- The acceleration given by an accelerometer includes the acceleration from gravity. We must dissociate it from the movement's acceleration.

3.1 First scenario

The marker is motionless, the camera moves with the IMU, and we want to obtain the acceleration of the IMU expressed in the marker's coordinate system. Let us consider that we can obtain the IMU acceleration A, expressed in the local IMU coordinate system R_{IMU}. What we want now is to express this acceleration in a world coordinate system R_{world}. Noting P_{R_{world}\to R_{IMU}} the transformation (rotation) matrix from the world coordinate system to the IMU coordinate system, we then have:

A_{R_{world}} = P^{-1}_{R_{world}\to R_{IMU}} \cdot A_{R_{IMU}}

Now, let us consider that we can obtain the marker orientation M_{marker} in the camera coordinate system R_{camera}. Then, M_{marker} is simply the transformation matrix between R_{camera} and the coordinate system defined by the position and orientation of the marker, R_{marker}:

M_{marker} = P_{R_{camera}\to R_{marker}}

Now, noting P_{R_{IMU}\to R_{camera}} the transformation matrix between the local IMU frame and the local camera frame:

A_{R_{marker}} = P_{R_{camera}\to R_{marker}} \cdot A_{R_{camera}}
             = P_{R_{camera}\to R_{marker}} \cdot P_{R_{IMU}\to R_{camera}} \cdot A_{R_{IMU}}
             = \underbrace{P_{R_{camera}\to R_{marker}} \cdot P_{R_{IMU}\to R_{camera}} \cdot P_{R_{world}\to R_{IMU}}}_{= P_{global}} \cdot A_{R_{world}}

Then, let us consider that R_{world} is in fact the coordinate system defined by the initial position and orientation of the IMU. Since the marker is motionless, its position and orientation with regard to R_{world} are constant; thus, P_{global} = P_{R_{marker}\to R_{IMU}} is constant.

Finally, at any moment t, we have :

A_{R_{marker}}(t) = P_{R_{marker}\to R_{IMU}} \cdot P^{-1}_{R_{world}\to R_{IMU}}(t) \cdot A_{R_{IMU}}(t)

A_{R_{marker}}(t) = P_{global} \cdot P^{-1}_{R_{world}\to R_{IMU}}(t) \cdot A_{R_{IMU}}(t)

Now, noting X the IMU position and V its velocity in the R_{marker} coordinate system, we can write

V_{R_{marker}}(t) = V_{R_{marker}}(t_0) + \int_{t_0}^{t} A_{R_{marker}}(u)\,du

X_{R_{marker}}(t) = X_{R_{marker}}(t_0) + V_{R_{marker}}(t_0)\,(t - t_0) + \int_{t_0}^{t}\left(\int_{t_0}^{v} A_{R_{marker}}(u)\,du\right)dv

X_{R_{marker}}(t) = X_{R_{marker}}(t_0) + V_{R_{marker}}(t_0)\,(t - t_0) + \int_{t_0}^{t}\left(\int_{t_0}^{v} P_{global} \cdot P^{-1}_{R_{world}\to R_{IMU}}(u) \cdot A_{R_{IMU}}(u)\,du\right)dv

Figure 2: Experimental protocol - First scenario

Accelerometer data include gravity, so we must correct for it before integration. We define a gravity vector \vec{g}, whose value in the two motionless coordinate systems R_{world} and R_{marker} is constant. For example, if we define R_{world} = R_{IMU\,initial} by initially positioning the IMU with its Z-axis up, we have

\vec{g}_{world} = (0\ \ 0\ \ 9.81)^T

and, noting \tilde{A} the acceleration of the IMU corrected for gravity, we have

\tilde{A}_{R_{IMU}}(t) = A_{R_{IMU}}(t) - P_{R_{world}\to R_{IMU}} \cdot \vec{g}_{world}

and

X_{R_{marker}}(t) = X_{R_{marker}}(t_0) + V_{R_{marker}}(t_0)\,(t - t_0) + \int_{t_0}^{t}\left(\int_{t_0}^{v} P_{global} \cdot \left(P^{-1}_{R_{world}\to R_{IMU}}(u) \cdot A_{R_{IMU}}(u) - \vec{g}_{world}\right)du\right)dv
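A minimal numerical sketch of this formula is given below: at each IMU sample, the measured acceleration is rotated into the marker frame, gravity is subtracted, and an Euler scheme integrates twice. It is only an illustration of the discrete integration, not the actual implementation; the rotation matrices are assumed orthonormal, so that the inverse is simply the transpose.

// Dead reckoning step for the first scenario, using plain std::array types.
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

static Vec3 mul(const Mat3& M, const Vec3& x)          // y = M * x
{
    Vec3 y{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            y[i] += M[i][j] * x[j];
    return y;
}

static Mat3 transpose(const Mat3& M)                    // inverse of a rotation matrix
{
    Mat3 T{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            T[i][j] = M[j][i];
    return T;
}

struct DeadReckoner {
    Vec3 velocity{};                    // V_Rmarker, initialised from the last camera estimate
    Vec3 position{};                    // X_Rmarker, initialised from the last camera estimate
    Mat3 P_global{};                    // constant, computed while the marker was visible
    Vec3 g_world{{0.0, 0.0, 9.81}};     // gravity in the world frame

    // a_imu: accelerometer output in the IMU frame
    // P_world_to_imu: current rotation world -> IMU, from the integrated gyro orientation
    void step(const Vec3& a_imu, const Mat3& P_world_to_imu, double dt)
    {
        // P^-1_{Rworld->RIMU}(t) * A_RIMU(t) - g_world, then mapped by P_global
        Vec3 a_world = mul(transpose(P_world_to_imu), a_imu);
        Vec3 corrected{};
        for (int i = 0; i < 3; ++i) corrected[i] = a_world[i] - g_world[i];
        Vec3 a_marker = mul(P_global, corrected);

        for (int i = 0; i < 3; ++i) {
            velocity[i] += a_marker[i] * dt;    // first integration
            position[i] += velocity[i] * dt;    // second integration
        }
    }
};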

3.2 Second scenario

Here, the camera is motionless, the marker moves with the IMU, and we want to obtain the acceleration of the IMU expressed in the camera coordinate system. Using the notations of the previous scenario, we have

A_{R_{camera}} = P_{R_{marker}\to R_{camera}} \cdot A_{R_{marker}}
             = P_{R_{marker}\to R_{camera}} \cdot P_{R_{world}\to R_{marker}} \cdot A_{R_{world}}
             = \underbrace{P_{R_{marker}\to R_{camera}} \cdot P_{R_{world}\to R_{marker}}}_{P_{global\,bis}} \cdot P_{R_{IMU}\to R_{world}} \cdot A_{R_{IMU}}

Since the camera is motionless and the coordinate system R_{world} is defined by the initial (and motionless) attitude of the IMU, the transformation P_{global\,bis} is constant. Finally, at any moment t, we have:

A_{R_{camera}}(t) = P_{global\,bis} \cdot P^{-1}_{R_{world}\to R_{IMU}}(t) \cdot A_{R_{IMU}}(t)

and, similarly to the first scenario, we have

X_{R_{camera}}(t) = X_{R_{camera}}(t_0) + V_{R_{camera}}(t_0)\,(t - t_0) + \int_{t_0}^{t}\left(\int_{t_0}^{v} P_{global\,bis} \cdot \left(P^{-1}_{R_{world}\to R_{IMU}}(u) \cdot A_{R_{IMU}}(u) - \vec{g}_{world}\right)du\right)dv

4 Application

In the previous formulas, the following elements can be obtained:

- P_{R_{IMU}\to R_{world}} is simply the rotation matrix defined by the angular orientation provided by the IMU.

- P_{R_{marker}\to R_{camera}} can be provided by an algorithm computing the attitude of the marker in the camera coordinate system, done by common image processing, at a moment when the user is sure that the camera catches the marker.

- P_{global}: since it is constant, all of its components can be computed at a moment where P_{R_{marker}\to R_{camera}} is available.

In the first scenario, P_{R_{world}\to R_{camera}} must be estimated; this can be done easily if the camera and the IMU are rigidly linked (i.e. the matrix is constant). A simple method will be explained further on. In the second scenario, P_{R_{world}\to R_{marker}} must be estimated, which can be slightly more difficult than the previous transformation.

That means that the attitude of the mobile (camera + IMU) can be computed using the formula, provided that a minimal calibration time (during which camera data for the marker were available) was possible.

Figure 3: Experimental protocol - Second scenario

If we want to make comparisons between the acceleration provided by the IMU and the acceleration provided by the camera, we have to compute a reference frame change, between the reference frame of the camera and the IMU's one. Noting \overrightarrow{OC} the position vector of the optical center of the camera (center of the camera's reference frame) in the marker's reference frame (O being the center of the marker), we have, using the twist transformation formula:

\frac{d^2}{dt^2}\left(\overrightarrow{OG}\right)_{/R_{world}} = \frac{d^2}{dt^2}\left(\overrightarrow{OC}\right)_{/R_{world}} + \omega_{/R_{world}} \wedge P_{R_{world}\to R_{IMU}}\overrightarrow{GC} + \omega \wedge \frac{d}{dt}\left(P_{R_{world}\to R_{IMU}}\overrightarrow{GC}\right)_{/R_{world}}

where \omega is the rotation velocity vector of the IMU with regard to its initial reference frame. Since the camera is rigidly linked to the IMU, \overrightarrow{GC} is constant; thus, the last term of the sum is null. Since the length \|\overrightarrow{GC}\| is about 5 cm, in most testing situations where the self-rotations of the IMU are slow, i.e. \|\omega\| \ll 1, the second term of the sum can be neglected.

Part II

Dealing with dead reckoning: State of the Art

5 Problem with temporal integration of acceleration

The previously exposed method is theoretically perfectly correct, but its application in reality, with accelerations provided by a discrete flow of data, is extremely difficult. Let us consider an acceleration signal a[t] and a velocity signal v[t], available at every Δt time step, with a measurement error ε[t] considered as white noise on the acceleration measure. Considering these two signals as discretizations of smooth functions, we have

v[t + \Delta t] = v[t] + a[t + \Delta t]\cdot\Delta t + \varepsilon[t + \Delta t]\cdot\Delta t + o(\Delta t)

and, after N iterations,

v[t + N\Delta t] = v[t] + \sum_{n=1}^{N} a[t + n\Delta t]\cdot\Delta t + \sum_{n=1}^{N} \varepsilon[t + n\Delta t]\cdot\Delta t + \sum_{n=1}^{N} h_n(\Delta t)

where

\forall i > 0,\ h_i = o(\Delta t)

But, while we know that the average \frac{1}{N}\sum_{n=1}^{N}\varepsilon[t + n\Delta t] converges toward 0 since ε is white noise, we have absolutely no idea whether the last two sums of the previous equation converge or even have an upper bound. And another time integration to get the position from the velocity would make the problem even worse. This fact makes any attempt to compute the position of an IMU from its acceleration alone unsuccessful in the long run, at least using an Euler method (directly inspired from the previous equations) or one derived from it.
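The following small, self-contained simulation illustrates the argument: with a zero true acceleration and nothing but white measurement noise, the twice-integrated position keeps drifting. The noise level is an arbitrary example value, not a characterization of the MTi-G.

// Double integration of pure white noise: the velocity performs a random walk
// and the position error grows even faster.
#include <cstdio>
#include <random>

int main()
{
    const double dt = 0.01;       // 100 Hz sampling
    const double sigma = 0.01;    // accelerometer noise, m/s^2 (example value)
    std::mt19937 gen(42);
    std::normal_distribution<double> noise(0.0, sigma);

    double v = 0.0, x = 0.0;
    for (int n = 1; n <= 6000; ++n) {          // one minute of data
        double a = noise(gen);                 // true acceleration is zero
        v += a * dt;                           // first integration
        x += v * dt;                           // second integration
        if (n % 1000 == 0)
            std::printf("t = %4.1f s   v = %+9.5f m/s   x = %+9.5f m\n", n * dt, v, x);
    }
    return 0;
}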

Later tests show that other integration methods, notably integration through the frequency domain, prove inaccurate in the long term too. So, for the rest of this work, we will not try to find a method to keep track of the accelerometer position over the long run, but will instead try to find workarounds that lengthen the duration over which the integration stays accurate, or methods to re-calibrate the position of the IMU using accelerometer (and gyro) data only.

6 Angular integration

The gyroscopes included in an IMU provide angular velocities along the 3 axes of the IMU, which means that in order to obtain the rotation between the current attitude and the reference orientation, we must also integrate the angular velocities over time. Let \vec{\theta}(t) = (\theta_x(t)\ \theta_y(t)\ \theta_z(t)) be the angular vector between the current attitude of the IMU and the reference attitude, and \vec{\omega}(t) = (\omega_x(t)\ \omega_y(t)\ \omega_z(t)) the instantaneous angular speed vector of the IMU; we have

\vec{\theta}(t + \Delta t) = \vec{\theta}(t) + \vec{\omega}(t)\cdot\Delta t + \vec{\varepsilon}(t + \Delta t) + o(\Delta t)

where \vec{\varepsilon} is the measurement error. That means that we are going to have exactly the same problem as with the integration of acceleration. Nonetheless, the problem is slightly smaller, for two reasons:

1. we must integrate only once, which slows down the drift growth;

2. one of the main problems of the acceleration integration is the necessary subtraction of gravity before any further operation. This subtraction requires knowing the orientation of the IMU in a world coordinate system with an especially high accuracy, since a few degrees of error on the orientation can induce up to 1 m.s−2 of error on the acceleration, which is critically bad for movements with a small acceleration (most human movements, in fact). The angular integration does not suffer from this problem, so the angular drift can be as low as 0.5 degrees over a minute (which is the angular drift of the Xsens MTi-G). Over this time length, the acceleration drift would have been far more significant.

7 Numerical filters

The main purpose of filtering is to remove as much measurement noise as possible and, if possible, to smooth the shape of the measurement curves without erasing relevant information. Generally, it means removing high-frequency components of a signal while preserving the low-frequency components, so we must consider low-pass filters.

7.1 Kalman Filter

The Kalman filter is one of the most widely used filters for data fusion. The core idea of this filter is to consider separately, in a system, an estimation of the true value of a noisy measure and the measure itself. To get the estimation of the true value at step t + 1, the algorithm computes a weighted average of the value predicted at step t and of the measured value, using respectively the uncertainty on the predicted value and the measurement noise amplitude as weights.
Theoretically speaking, the Kalman filter is a Bayesian model similar to a hidden Markov model where the state space of the latent variables is continuous and all variables have a Gaussian distribution.
It is important to notice that the Kalman filter is a recursive estimator, which means that in order to estimate the current state, only the previous state and the current measures are required, whereas other common filters, like Finite or Infinite Impulse Response filters, need a history of data.
We consider x_{t_1|t_2} the estimation vector at time t_1 knowing the estimation at time t_2, and P_{t_1|t_2} the error covariance matrix of the state vector at time t_1 knowing the estimation at time t_2. Noting

- F_k the conversion matrix between state k − 1 and state k

- u_k a command vector

- B_k the conversion matrix between the command u_k and the state x_k

- P_{k|k−1} the a priori error covariance matrix of the state vector x_k

- P_{k|k} the a posteriori error covariance matrix of the state vector x_k

- Q_k the process noise covariance matrix at state k

- y_k the process measure at state k

- H_k the conversion matrix between state x_k and measure y_k

- I an identity matrix with appropriate dimensions

We have:

Prediction

A priori state estimation (like the state update in control theory):

x_{k|k-1} = F_k x_{k-1|k-1} + B_k u_{k-1}

A priori error covariance estimation:

P_{k|k-1} = F_k P_{k-1|k-1} F_k^T + Q_k

which can be seen, noting e_{k-1} the estimation error at state k − 1, as

P_{k|k-1} = F_k\, e_{k-1} e_{k-1}^T\, F_k^T + Q_k

Update

Innovation: we compute the difference between the real measure and what the measure should be according to the estimated state:

\tilde{y}_k = y_k - H_k x_{k|k-1}

Innovation covariance:

S_k = H_k P_{k|k-1} H_k^T + R_k

which can be seen as

S_k = \underbrace{(H_k e_{k-1})(H_k e_{k-1})^T}_{\text{error from the estimated state}} + \underbrace{H_k Q_k H_k^T}_{\text{error from the process noise}} + \underbrace{R_k}_{\text{error from the measurement noise}}

Optimal Kalman gain:

K_k = P_{k|k-1} H_k^T S_k^{-1}

A posteriori state estimate:

x_{k|k} = x_{k|k-1} + K_k \tilde{y}_k

A posteriori error covariance estimation:

P_{k|k} = (I - K_k H_k) P_{k|k-1}
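A one-dimensional instance of these equations, where every matrix collapses to a scalar, can be written as below; the numerical values of F, H, Q and R are illustrative placeholders, not the ones used later in this work.

// Scalar Kalman filter: one predict/update cycle per incoming measurement.
struct ScalarKalman {
    double x = 0.0;     // state estimate x_{k|k}
    double P = 1.0;     // error covariance P_{k|k}
    double F = 1.0;     // state transition
    double H = 1.0;     // state-to-measurement conversion
    double Q = 1e-4;    // process noise covariance
    double R = 1e-2;    // measurement noise covariance

    double filter(double y)                    // y: new (noisy) measurement
    {
        // Prediction
        double x_pred = F * x;                 // a priori state estimate
        double P_pred = F * P * F + Q;         // a priori error covariance

        // Update
        double innovation = y - H * x_pred;    // measured minus predicted measure
        double S = H * P_pred * H + R;         // innovation covariance
        double K = P_pred * H / S;             // optimal Kalman gain

        x = x_pred + K * innovation;           // a posteriori state estimate
        P = (1.0 - K * H) * P_pred;            // a posteriori error covariance
        return x;
    }
};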

7.2 Extended Kalman Filter

The Extended Kalman Filter is a variant of the original Kalman filter, where the evolution of the states and of the measures are not only linear functions modeled by matrices, but can be any differentiable functions:

x_k = f(x_{k-1}, u_k, w_k)

y_k = h(x_k, v_k)

What we can infer from the theoretical description of the Kalman filter is that such a filter can be useful (and in fact is extremely efficient) for the state estimation in a system where:

1. even if highly noisy, a measure of the output is available;

2. we know the conversion matrix between the state and the measurement;

3. we know the process noise covariance matrix and the measurement noise covariance matrix.

Thus, the Kalman filter can be used, for example, to interpolate and fuse data, as in dead reckoning.

7.3 Classical Linear Numerical Filters - Reminders

A numerical filter is an algorithm that processes a discrete signal in order to filter it, by various means. The most common numerical filters are linear ones: the output is a linear combination of the previous inputs and outputs. The general form of a linear filter is the following one, noting y[n] the output and x[n] the input:

y[n] = \sum_{k=0}^{N} b_k \cdot x[n-k] - \sum_{k=1}^{M} a_k \cdot y[n-k]

Two families of linear filters exist (a minimal implementation of both is sketched after this list):

- Finite impulse response filters
In these filters, all the coefficients a are equal to zero, which means that the last output depends only on the inputs. The name of this family comes from the fact that the response of one of those filters to an impulse (a Kronecker delta δ[k]) will always stabilize itself to zero. An important property of FIR filters is the fact that the output is simply the convolution of the input with the coefficients b. A very simple example of an FIR filter is the moving average: \forall i,\ 0 \le i \le N,\ b_i = \frac{1}{N};\ \forall i > N,\ b_i = 0.

- Infinite impulse response filters
These filters have an infinite impulse response, which means the output to an impulse will never stabilize itself, even at infinity. They are recursive: the current output depends on both the previous inputs and outputs. IIR filters allow using more data for filtering, thus allowing more subtle processing, but they have the drawback of not being necessarily stable: it can be difficult to fit the coefficients to the input used, to avoid divergence of the outputs.
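The following minimal sketch implements both families sample by sample. The coefficients are supplied by the caller; the moving average of the text corresponds to an FIR filter whose N coefficients are all equal to 1/N.

// FIR: the output is a weighted sum of the last inputs only.
#include <deque>
#include <vector>

class FirFilter {
    std::vector<double> b;              // b_0 .. b_N
    std::deque<double> inputs;          // x[n], x[n-1], ...
public:
    explicit FirFilter(const std::vector<double>& coeffs) : b(coeffs) {}
    double process(double x) {
        inputs.push_front(x);
        if (inputs.size() > b.size()) inputs.pop_back();
        double y = 0.0;
        for (std::size_t k = 0; k < inputs.size(); ++k) y += b[k] * inputs[k];
        return y;
    }
};

// IIR: the output also depends on the previous outputs (recursive form of
// y[n] = sum_k b_k x[n-k] - sum_k a_k y[n-k]).
class IirFilter {
    std::vector<double> b, a;           // a holds a_1 .. a_M (a_0 is taken equal to 1)
    std::deque<double> inputs, outputs;
public:
    IirFilter(const std::vector<double>& bc, const std::vector<double>& ac) : b(bc), a(ac) {}
    double process(double x) {
        inputs.push_front(x);
        if (inputs.size() > b.size()) inputs.pop_back();
        double y = 0.0;
        for (std::size_t k = 0; k < inputs.size(); ++k) y += b[k] * inputs[k];
        for (std::size_t k = 0; k < outputs.size() && k < a.size(); ++k) y -= a[k] * outputs[k];
        outputs.push_front(y);
        if (outputs.size() > a.size()) outputs.pop_back();
        return y;
    }
};

// Example: a 10-point moving average is FirFilter(std::vector<double>(10, 0.1)).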

7.3.1 Bessel Filter

The Bessel low-pass IIR filter of order n is defined by:

- (b_i) = b_0 = \frac{(2n)!}{2^n\, n!}

- (a_k) = \frac{(2n-k)!}{2^{n-k}\, k!\, (n-k)!} \cdot \frac{1}{\omega_0^k}

where ω_0 represents the cut-off frequency. The main interest of the Bessel filter is the preservation of the phase delay across the pass-band, preserving the wave shape of the filtered signal in the pass-band. The main drawback is its low selectivity, which grows slowly with the filter's order.

Figure 4: Illustration of the gain (red) and phase (dark gray) curves of a Bessel filter of 10th order, with a cut-off frequency equal to 15 Hz. The abscissa is the fraction of the acquisition frequency.

7.3.2 Butterworth Filter

The Butterworth low-pass IIR filter of order n is defined by:

- (b_i) = b_0 = G_0 (the gain)

- \sum_{k=1}^{M} a_k \cdot y[n-k] corresponds to the denominator \prod_{k=1}^{M} (p - p_k)/\omega_c, where p_k = \omega_c\, e^{j(2k+n-1)\pi / 2n} and ω_c is the cut-off frequency

The main interest of the Butterworth filter is its maximally flat gain response in the pass-band. Nevertheless, the roll-off (the steepness of the gain beyond the cut-off frequency) is weaker than in other filters, requiring a higher order to implement a good low-pass filter.

7.3.3 Chebychev Filter

The Chebychev filter makes it possible to obtain a pass-band with a steeper roll-off than other filters, but at the cost of a ripple (i.e. oscillations) in the pass-band. It can be categorized into type I (ripple in the pass-band) and type II (ripple in the stop-band).

Figure 5: Illustration of the gain (red) and phase (blue) curves of a Butterworth filter of 10th order, with a cut-off frequency equal to 15 Hz. The abscissa is the fraction of the acquisition frequency.

If we denote by T_n an n-th order Chebychev polynomial of the first kind (defined by the relation T_n(\cos\theta) = \cos(n\theta)\ \forall\theta), then the poles (s_i) of the filter's gain are the solutions of the equation

1 + \varepsilon^2 T_n^2(js) = 0

where ε is the ripple factor, defining the amplitude of the ripple. The transfer function H_n(j\omega) is defined by

H_n(j\omega) = \frac{1}{2^{n-1}\varepsilon} \prod_{m=1}^{n} \frac{1}{(j\omega - s_{pm}^{-})}

where (s_{pm}^{-}) are the poles of the transfer function whose real part is negative, and the filter's gain G_n(\omega) is defined by

G_n(\omega) = |H_n(\omega)| = \frac{1}{\sqrt{1 + \varepsilon^2 T_n^2\left(\frac{\omega}{\omega_0}\right)}}

ω_0 being the cut-off frequency.

8 Numerical integration

The integration of a discrete signal over time is done by approximating the integral operator by a series. We have

\int_{t_0}^{t} f(u)\,du \underset{\Delta t \to 0}{\sim} \sum_{k=0}^{N} f(t_0 + k\Delta t)\,\Delta t

where N = \lfloor (t - t_0)/\Delta t \rfloor. So the numerical integration of acceleration will be written as:

V(t_0 + N\Delta t) - V(t_0) = \int_{t_0}^{t_0 + N\Delta t} A(t)\,dt \simeq \sum_{k=0}^{N} A[t_0 + k\Delta t]\cdot\Delta t

We must keep in mind that a numerical integration works as a low pass filter.

Figure 6: Illustration of the gain (red) and phase (blue) curves of a Chebychev filter of 10th order, with a cut-off frequency equal to 15 Hz and a ripple equal to -1 dB. The abscissa is the fraction of the acquisition frequency.

8.1 Temporal integration in frequency domain - Reminders

If we consider a piecewise smooth function f equal to its Fourier series over an interval [0, T]

f(t) = \sum_{n=-\infty}^{\infty} c_n e^{i\omega_n t}

with c_0 = 0 (i.e. with an average equal to zero), then for t_0 < t < t_0 + T

\int_{t_0}^{t} f(v)\,dv = A_0(t) + \sum_{n=-\infty,\ n\neq 0}^{\infty} \frac{1}{i\omega_n} c_n e^{i\omega_n t}

where

A_0(t) = -\frac{1}{T} \int_{-\frac{t-t_0}{2}}^{\frac{t-t_0}{2}} u f(u)\,du

with T = t − t_0 the acquisition length.

Considering f = a the acceleration vector and the case where c_0 is not necessarily equal to 0, we can write more simply

v(t) = c_0 t + a + \sum_{n=-\infty,\ n\neq 0}^{\infty} \frac{1}{i\omega_n} c_n e^{i\omega_n t}

where

a = v(t_0) - c_0 t_0 - \sum_{n=-\infty,\ n\neq 0}^{\infty} \frac{1}{i\omega_n} c_n e^{i\omega_n t_0}

and

x(t) = \frac{c_0 t^2}{2} + a t + b + \sum_{n=-\infty,\ n\neq 0}^{\infty} \frac{-1}{\omega_n^2} c_n e^{i\omega_n t}

where

b = \frac{c_0 t_0^2}{2} - a t_0 - \sum_{n=-\infty,\ n\neq 0}^{\infty} \frac{-1}{\omega_n^2} c_n e^{i\omega_n t_0}

It is important to keep in mind that a temporal integration in the frequency domain does not avoid the problem raised above about the integration of a discrete acceleration signal. Indeed, what the integration of a discrete signal f[t], t = i·Δt, produces is a discrete signal F[t], t = i·Δt, which satisfies f[t] = F'[t] for all t = i·Δt, but we absolutely cannot tell whether the continuous signals f(t) and F(t) satisfy F'(t) = f(t).
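As an illustration of the mechanism (and not of the actual program, which relied on an FFT routine from the numerical library), the sketch below integrates an acceleration record by dividing each non-zero frequency bin by iω_n and transforming back. The naive O(N²) DFT is used only for clarity, and the c_0 (mean) term, which would integrate into the linear trend c_0·t, is ignored, as in the formulas above.

// Frequency-domain integration of one acceleration axis sampled every dt seconds.
#include <complex>
#include <vector>

std::vector<double> integrateSpectral(const std::vector<double>& accel, double dt)
{
    const std::size_t N = accel.size();
    const double twoPi = 6.283185307179586;
    std::vector<std::complex<double>> c(N);

    // Forward DFT: c_n = (1/N) * sum_k a_k * exp(-i*2*pi*n*k/N)
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t k = 0; k < N; ++k)
            c[n] += accel[k] * std::polar(1.0, -twoPi * double(n) * double(k) / double(N)) / double(N);

    // Divide each non-zero bin by i*omega_n and evaluate the inverse series.
    std::vector<double> velocity(N, 0.0);
    for (std::size_t k = 0; k < N; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t n = 1; n < N; ++n) {
            double fn = (n <= N / 2) ? double(n) : double(n) - double(N);   // signed bin index
            std::complex<double> iwn(0.0, twoPi * fn / (double(N) * dt));   // i*omega_n
            sum += (c[n] / iwn) * std::polar(1.0, twoPi * double(n) * double(k) / double(N));
        }
        velocity[k] = sum.real();
    }
    return velocity;
}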

Part III

Implementation

9 Features

We want to develop an application that displays the position of the rigidly linked IMU+camera assembly with respect to the motionless marker (first scenario) in real time, while the operator moves the IMU in front of the marker, sometimes losing sight of the marker.
To implement the experimental protocol, we used C++ as the main language, under the Win32 API. The graphical display has been done using the OpenSceneGraph library, and the capture of video frames using the OpenCV library. We also needed tools for video tracking and signal processing.

9.1 OpenSceneGraph

The code for the implementation of the experimental protocol has been written in C++, using mostly the OpenSceneGraph library. OpenSceneGraph offers an extra programming layer between a theoretical scene and low-level 3D graphical functions, working on the principle of a scene graph: the scene is represented by a graph of nodes, where each node inherits properties from its parent(s). The whole scene is rendered by an infinite frame loop. Automatic interactions with the scene can be implemented using callbacks added to the scene tree, as in the sketch below.
The program is aimed at rendering the position of the video marker and the IMU, according to the chosen scenario (IMU linked with the marker or with the camera), from an external point of view.
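The sketch below shows this callback mechanism in a minimal form. The OpenSceneGraph class names (osg::MatrixTransform, osg::NodeCallback, osgViewer::Viewer) are the standard ones, but getLatestPose() is a hypothetical placeholder standing in for the data structure shared with the IMU thread; this is not the project's actual code.

// A transform node whose matrix is refreshed every frame by an update callback.
#include <osg/MatrixTransform>
#include <osg/NodeCallback>
#include <osgViewer/Viewer>

static osg::Matrix getLatestPose()
{
    // Placeholder: in the real program this would read the pose shared with the IMU thread.
    return osg::Matrix::identity();
}

class PoseUpdateCallback : public osg::NodeCallback
{
public:
    virtual void operator()(osg::Node* node, osg::NodeVisitor* nv)
    {
        osg::MatrixTransform* xform = static_cast<osg::MatrixTransform*>(node);
        xform->setMatrix(getLatestPose());   // apply the newest pose estimate
        traverse(node, nv);                  // keep visiting the children
    }
};

int main()
{
    osg::ref_ptr<osg::MatrixTransform> imuNode = new osg::MatrixTransform;
    imuNode->setUpdateCallback(new PoseUpdateCallback);
    // ... geometry representing the IMU would be added under imuNode ...

    osgViewer::Viewer viewer;
    viewer.setSceneData(imuNode.get());
    return viewer.run();                     // the infinite frame loop
}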

Figure 7: A scene obtained through OpenSceneGraph, visualized using an open-source GUI for osg called OSGEdit

9.2 IMU's data

The communication between the OSG rendering loop and the IMU is possible through the IMU's driver, whose sources have been released by Xsens. Since the IMU does not send data at the same frequency as the OSG rendering frequency, a separate thread has been implemented to run the IMU's driver, filling a data array at the IMU's frequency.

9.3 Video tracking with osgART

In order to merge inertial data from the IMU with positioning data from a camera, we need to implement the acquisition and processing of camera data. Since image processing is a widely developed research field, and since the main task of the internship was the processing of inertial data and its fusion with camera data, not the processing of camera data themselves, we decided to use pre-existing tools to compute feature localization on a camera image.
The tool chosen to achieve feature localization is the C++ library ARToolkit. The main feature of ARToolkit is the localization of a square marker on a camera picture frame, using simple mathematics to compute the position and orientation of the marker relative to the camera coordinate system. This computation is fast enough to be processed in "real time", which means faster than 30 Hz, the maximal frame rate of a camera compatible with the library.
It is important to notice that the frame rate of a common webcam with a 640 × 480 pixel resolution is 15 Hz, and can reach 30 Hz with a 320 × 200 resolution. One of the main reasons for this relatively low frame rate is the limitations of the USB 2.0 standard.

Figure 8: Tracking a square shape: a typical augmented reality application using ARToolkit/osgART

9.4 Recording data with OpenCV

In order to make comparisons between several algorithms consistent, it was necessary to create a recording process for the data, so that we can try several methods on the same data set. Numerical data can easily be stored in CSV files, but since the video tracking needs pictures, we used the widely used C++ library OpenCV to record and read video frames, and a home-made function to convert OpenCV images into OSG images. So, inside the OSG main rendering loop, a function picks a video frame, from a camera or a recorded video file, and gives it to an osgART callback.

9.5 Signal processing

Since the IMU has a higher frequency than the camera, the processing of the data needs to be done by a function/class working at the same rate as the IMU's thread. Efficient signal processing in C/C++, using the algorithms previously described in the state of the art, needs a matrix implementation and, if possible, FFT (Fast Fourier Transform) and other signal processing functions. Plenty of numerical libraries exist in C++, but not all of them are complete or efficient. We considered two of them:

- IT++, developed by Chalmers University of Technology, under GPL license, has a syntax close to the Matlab/Octave language. It makes extensive use of open source mathematical libraries (BLAS, LAPACK, ATLAS, FFTW...). It provides matrices, FFT and linear system solving, but lacks the Wavelet Transform.

- GSL (GNU Scientific Library), developed by the GNU project, under GPL license, is probably the most developed free toolbox available in C. Although it lacks good wrappers for C++ and its syntax is quite unpractical, it provides matrices, FFT, linear system solving, the Wavelet Transform and further features.

We chose to use IT++, since its syntax made code writing far easier. A minimal implementation based on GSL can be used to try algorithms using Wavelet Transforms.

10 Computing the transformation between IMU and camera coordinate system

In the first scenario, in order to apply the algorithms presented until now, we need a way to get the transformation matrix between the local IMU coordinate system and the camera coordinate system. For this purpose, we will use the fact that, at rest, the IMU records only the gravity vector, which is necessarily oriented downward. During a calibration time, we take several pictures of the motionless marker, and store the acceleration data obtained during the shots. Assuming that the normalized gravity vector is oriented along the negative z-axis of the marker coordinate system,

\frac{\vec{g}}{\|\vec{g}\|} = -\vec{e}_z(R_{marker})

and since \vec{g} = \vec{A} (IMU motionless), we have

\frac{\vec{A}_{IMU}}{\|\vec{A}_{IMU}\|} = P_{R_{IMU}\to R_{camera}} \cdot P_{R_{camera}\to R_{marker}} \cdot (0\ \ 0\ \ {-1})^T

The value of P_{R_{camera}\to R_{marker}} is obtained using the video tracker. So, if we take n pictures of the marker, denoting respectively P_i = P_{R_{camera}\to R_{marker}} and a_i the normalized acceleration at the measurement corresponding to the i-th picture, what we get is a system

(a_1 \cdots a_i \cdots a_n) = X \cdot \left( P_1 \begin{pmatrix}0\\0\\1\end{pmatrix} \cdots P_i \begin{pmatrix}0\\0\\1\end{pmatrix} \cdots P_n \begin{pmatrix}0\\0\\1\end{pmatrix} \right)

where X is a 3 × 3 matrix. To solve this system, we need a set of (at least) three linearly independent pairs of vectors (a_i, P_i(0\ 0\ 1)^T). In order to improve the accuracy of the method, we can use more than three pairs of vectors and solve the system using the least-squares method.
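Stacking the a_i and the p_i = P_i(0 0 1)^T as the columns of two 3 × n matrices A and P, the least-squares solution of A = X·P is X = A·Pᵀ·(P·Pᵀ)⁻¹. A self-contained sketch of this computation (plain arrays and a cofactor inverse of the 3 × 3 normal matrix) is given below; it is an illustration of the method, not the project's code.

// Least-squares estimation of the 3x3 calibration matrix X from vector pairs (a_i, p_i).
#include <array>
#include <vector>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

Mat3 solveCalibration(const std::vector<Vec3>& a,   // normalized IMU accelerations
                      const std::vector<Vec3>& p)   // P_i * (0 0 1)^T from the tracker
{
    Mat3 APt{}, PPt{};                               // A*P^T and P*P^T, both 3x3
    for (std::size_t i = 0; i < a.size(); ++i)
        for (int r = 0; r < 3; ++r)
            for (int c = 0; c < 3; ++c) {
                APt[r][c] += a[i][r] * p[i][c];
                PPt[r][c] += p[i][r] * p[i][c];
            }

    // Cofactor inverse of P*P^T (invertible as soon as three p_i are linearly independent).
    double d = PPt[0][0] * (PPt[1][1]*PPt[2][2] - PPt[1][2]*PPt[2][1])
             - PPt[0][1] * (PPt[1][0]*PPt[2][2] - PPt[1][2]*PPt[2][0])
             + PPt[0][2] * (PPt[1][0]*PPt[2][1] - PPt[1][1]*PPt[2][0]);
    Mat3 inv{};
    inv[0][0] =  (PPt[1][1]*PPt[2][2] - PPt[1][2]*PPt[2][1]) / d;
    inv[0][1] = -(PPt[0][1]*PPt[2][2] - PPt[0][2]*PPt[2][1]) / d;
    inv[0][2] =  (PPt[0][1]*PPt[1][2] - PPt[0][2]*PPt[1][1]) / d;
    inv[1][0] = -(PPt[1][0]*PPt[2][2] - PPt[1][2]*PPt[2][0]) / d;
    inv[1][1] =  (PPt[0][0]*PPt[2][2] - PPt[0][2]*PPt[2][0]) / d;
    inv[1][2] = -(PPt[0][0]*PPt[1][2] - PPt[0][2]*PPt[1][0]) / d;
    inv[2][0] =  (PPt[1][0]*PPt[2][1] - PPt[1][1]*PPt[2][0]) / d;
    inv[2][1] = -(PPt[0][0]*PPt[2][1] - PPt[0][1]*PPt[2][0]) / d;
    inv[2][2] =  (PPt[0][0]*PPt[1][1] - PPt[0][1]*PPt[1][0]) / d;

    Mat3 X{};                                        // X = (A*P^T) * (P*P^T)^{-1}
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            for (int k = 0; k < 3; ++k)
                X[r][c] += APt[r][k] * inv[k][c];
    return X;
}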

Figure 9: An overview of the running program: a window shows the captured (and recorded) camera picture, another window shows the augmentation done by the camera (in red) and the IMU (in green), and the bigger window shows a virtualization of the scene, displaying the movements of the camera and the IMU with regard to the marker

11 Overview of the program

The program in pseudocode:

data : array used to store the last data obtained from the IMU
IMU_node : graphical osg node representing the position and
           orientation of the IMU in the marker coordinate system
camera_node : graphical osg node representing the position and
              orientation of the camera in the marker coordinate system

Thread {
    while (true){
        data <- raw and calibrated data from the IMU
        record(data)

        if (lost_marker){ //if the camera lost the marker
            //the transformation from the initial IMU position to its
            //current position is obtained using the IMU's gyros
            P_imu_to_world_coord <- Update_using(IMU_data)

            //the transformation from the IMU current position to the marker's position
            //is computed, knowing that the transformation from the IMU's initial position
            //to the marker's position is constant and already computed
            P_imu_to_marker = P_imu_to_world_coord * P_world_coord_to_marker

            //we get the IMU's acceleration into the marker coordinate system
            IMU_acceleration_marker = P_imu_to_marker * IMU_acceleration_local

            //we apply the several filtering and integration algorithms on this acceleration
            IMU_attitude = Filter_and_integrate(IMU_acceleration_marker)
        }
    }
}

Main function {
    //we compute the constant transformation between the IMU's initial
    //position and the marker's position
    calibrate(P_world_coord_to_marker)
    launch(Thread)

    while (true){
        image = GetCapture(camera) //we store the picture from the camera

        //the video tracker checks the presence of the marker in the picture
        marker_in_view = IsMarkerInView(camera)

        if (marker_in_view){ //the camera caught the marker
            //osgART gets the orientation and position of the marker
            //from the picture
            marker_attitude = ComputeAttitudeFromImageUsingOsgART(image)

            //the attitude of the camera is updated using the tracker data
            camera_node = Update_using(marker_attitude)

            //same thing for the IMU, since the tracker is more reliable
            //than the accelerometers
            IMU_node = Update_using(marker_attitude)
        } else {
            //we tell the thread to compute the IMU attitude
            Thread.lost_marker = true

            //we get the IMU's attitude from the thread
            IMU_node = Update_using(Thread.IMU_attitude)
        }
    }
}

12 Reliability of the camera tracker

The tracker from osgART provides the position and orientation of a real marker in the camera coordinate system. In most situations, i.e. as soon as the marker can be seen clearly by the camera, the position of the marker provided by the tracker can be considered reliable, with an accuracy that remains stable over time. On the other hand, as soon as the camera cannot see the marker clearly, i.e. when its movement is too fast, or simply when the marker is obscured, the position provided by the tracker is completely unreliable.
However, the orientation of the marker provided by the camera tracker is slightly less accurate than its position, since it depends on the detection of the marker's shape, not on its size or position in the camera picture. This can be observed immediately on a simple tracking test with osgART: while the localization of the marker by the tracker is consistent most of the time, its orientation suffers from jitter.
Now, if we consider the data from the IMU, we have the following facts:

- the accuracy of the localization is rapidly decreasing over time, for the theoretical reasons already explained;

- the accuracy of the orientation is quite good, stable over time, and in fact better than the orientation provided by the camera tracker.

What we can conclude from these facts is the following. The orientation provided by the IMU is, in common use (which means for any kind of movement of both the camera and the IMU, including movements too fast for the camera frequency, jitter, and other tricky movements), the most reliable quantity. This orientation can be used as the reference orientation, after a necessary calibration of the initial orientation by the camera tracker, or combined with the tracker orientation through a Kalman filter or any other fusion algorithm.
The tracker position is consistent most of the time, and a method to detect its inaccuracy would be to match the orientation of the tracker against the one of the IMU. Furthermore, we can use the tracker position in ideal conditions, with no loss of track (slow speed, low jitter), in order to compute a calibration of the IMU positioning algorithm, over a calibration time which can be arbitrarily long.

Part IV

Drift Correction

13 Modeling a moving offset using a linear regression

Several tests with the Xsens IMU have shown that the offsets of the accelerometers are not constant over time. This non-constant drift is quite disturbing, since separating it from a movement-significant acceleration using acceleration data only appears impossible.
A first way to take advantage of camera data would be to use the set of last valid positions provided by the camera tracker in order to compute the drift between the camera-provided and accelerometer-provided positions, and use it to infer the future drift as soon as the accelerometers are the only devices available to infer the position.
The method tested is the following one: in real time, an interpolation of the last N recorded accelerations of the IMU is processed, and in parallel another interpolation of the last N accelerations estimated from the tracker position is done. The time-dependent difference between the two interpolations is used as an estimation of the drift, as soon as the tracker loses the marker.
This method relies on several conditions and hypotheses:

- The number of last recorded accelerations was big enough to infer a consistent interpolation.

- Since the drift comes from very versatile and unpredictable parameters such as temperature and the jitter of the movement, such an approximation will remain valid only over a short period of time.

- A consistent estimation of the acceleration from the tracker position has been processed. Since the tracker and the IMU do not have the same acquisition frequency, the consistency between the "real" acceleration and the one computed from the tracker will depend heavily on the nature of the derivation algorithm applied to the tracker position.

- The chosen interpolation algorithm can run in real time. While a simple linear regression can certainly do so, it will not be the case for more subtle approximations such as Bezier curves or polynomial fittings.

We chose to use a simple linear regression over the acceleration data, to ensure a relative stability of the estimated drift, and to spare computing resources for real time. As a reminder, here are the formulas to compute a linear regression of the kind y = ax + b, with Y = (y_1 \dots y_n), X = (x_1 \dots x_n):

a = \frac{\sum_{i=1}^{n} x_i y_i - \frac{1}{n}\sum_{i=1}^{n} x_i \sum_{j=1}^{n} y_j}{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2} = \frac{\overline{XY} - \overline{X}\,\overline{Y}}{\overline{X^2} - \overline{X}^2} = \frac{Cov[X, Y]}{Var[X]}

b = \overline{Y} - a\overline{X}
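A direct implementation of these formulas is given below. In the drift-correction scheme, x would contain the time stamps of the last N samples and y the corresponding acceleration component (one fit for the IMU acceleration, one for the camera-based acceleration); the difference of the two fitted lines then extrapolates the drift once the marker is lost.

// Simple linear regression y = a*x + b, slope = Cov[X,Y]/Var[X].
#include <cstddef>
#include <vector>

struct Line { double a; double b; };

Line linearRegression(const std::vector<double>& x, const std::vector<double>& y)
{
    const std::size_t n = x.size();
    double sx = 0.0, sy = 0.0, sxy = 0.0, sxx = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sx  += x[i];
        sy  += y[i];
        sxy += x[i] * y[i];
        sxx += x[i] * x[i];
    }
    double a = (sxy - sx * sy / n) / (sxx - sx * sx / n);   // Cov[X,Y] / Var[X]
    double b = (sy - a * sx) / n;                           // mean(Y) - a * mean(X)
    return {a, b};
}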

13.1 Computing camera-based acceleration

In order to make use of the data provided by the camera during a calibration phase, we need to perform some operations on them, since what we can get through the camera video tracker is only the orientation and position of the camera with regard to the marker coordinate system, and not its acceleration. So, in order to model a linear relation between the camera acceleration and the IMU acceleration, we need to compute the camera acceleration in the marker coordinate system, which means we have to differentiate it. A simple Euler derivation, of the kind

f'(t + \Delta t) \simeq \frac{f(t + \Delta t) - f(t)}{\Delta t}

should be sufficient for this, since more exotic derivation methods, like the Runge-Kutta method, would require the continuous definition of the acceleration, not only its values at discrete points.
The main problem lies in the difference of frequency between the camera's acquisition and the IMU's acquisition: since we need the camera's acceleration at the same frequency as the IMU's, we need to interpolate the values of the camera's position at the acquisition points of the IMU where no camera data are available. The form of this interpolation will greatly impact the final form of the camera's computed acceleration. For example, if we simply fill the missing data between two camera acquisitions with the data of the previous points, the curve of the camera's position at the IMU's frequency will have a crenelated shape, which means a derivative with a lot of peaks and zeros.

To avoid that problem, we choose to smooth the curve before every derivation by using a moving average. The range of this moving average has to be determined experimentally, in order to fit the shape of the IMU's acceleration. Since the previous operation implies a large error margin and possibly inaccurate approximations, we can try a derivation in the spectral domain, which is the counterpart of the integration in the spectral domain described above: if we have

f(t) = \sum_{n=-\infty}^{\infty} c_n e^{i\omega_n t}

then

f'(t) = \sum_{n=-\infty,\ n\neq 0}^{\infty} i\omega_n c_n e^{i\omega_n t}

Figure 10: In blue, the position of the camera along the X axis; in green, the speed obtained through Euler derivation. Abscissa: time in seconds; ordinates: m for the position, m.s−1 for the speed.

Figure 11: Same data, over a larger time range

The main advantage of the derivation in the spectral domain is the absence of the time delay implied by the moving average associated with the Euler derivation. The drawback is that it also acts like a low-pass filter. A sketch of the Euler-based processing chain is given below.
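This sketch holds the camera samples at the IMU rate, smooths them with a moving average, then differentiates twice. The window sizes are the example values used in the figures; as discussed above, they have to be tuned experimentally.

// From a camera position signal already repeated at the IMU rate to an acceleration estimate.
#include <vector>

std::vector<double> movingAverage(const std::vector<double>& s, int range)
{
    std::vector<double> out(s.size(), 0.0);
    for (std::size_t i = 0; i < s.size(); ++i) {
        double sum = 0.0;
        int count = 0;
        for (int k = 0; k < range && i >= static_cast<std::size_t>(k); ++k) { sum += s[i - k]; ++count; }
        out[i] = sum / count;               // causal average of the last 'range' samples
    }
    return out;
}

std::vector<double> eulerDerivative(const std::vector<double>& s, double dt)
{
    std::vector<double> out(s.size(), 0.0);
    for (std::size_t i = 1; i < s.size(); ++i) out[i] = (s[i] - s[i - 1]) / dt;
    return out;
}

std::vector<double> cameraAcceleration(const std::vector<double>& heldPosition, double dt)
{
    std::vector<double> speed = eulerDerivative(movingAverage(heldPosition, 10), dt);
    return eulerDerivative(movingAverage(speed, 10), dt);
}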

Figure 12: Same data, same time range, but the position curve has been smoothed using a moving average with a range of 10 data points

14 Gain Correction

Often, the acceleration given by the IMU and the acceleration computed from the camera's position through the video tracker do not have the same amplitude and jitter. A way to correct this and try to make the IMU's acceleration closer to the camera's acceleration (supposed to be closer to the acceleration produced by the real movement) would be to multiply the IMU's acceleration by a gain factor, which would be computed using the camera's acceleration during calibration (or video tracking) time. We could define this gain G[n] as:

G[n] = \frac{\sum_{k=0}^{N} |A_{camera}[n-k]|}{\sum_{k=0}^{N} |A_{IMU}[n-k]|}

and

A_{IMU\,corrected}[n] = G[n_0] \cdot A_{IMU\,raw}[n]

where n_0 is the last data point where camera data were available, and N the range of the gain estimation, supposed to be large enough to ensure that G[n] is more or less constant.

15 Detection of shocks

In mechanics, Newton's law tells us that the derivative of a solid's momentum over time is equal to the sum of the forces applied to this solid:

\frac{d(m \cdot \vec{v})}{dt} = \sum \vec{F}

Figure 13: In blue, the speed of the camera along the X axis obtained using Euler derivation, the position being smoothed beforehand with a moving average of 10 data points. In green, the speed along the X axis obtained using integration in the frequency domain. Abscissa: time in seconds; ordinates: m.s−1.

We can see that using the frequency domain, we avoid the time delay, but we lose some high frequencies.

This equation means that, when a solid with a non-zero momentum collides with an obstacle and becomes motionless, the momentum has been dissipated through the obstacle, and that a violent and brief deceleration of the solid occurs during the collision. Depending on the nature of the obstacle (more or less elastic), this deceleration is more or less quick, and can possibly be measured by an accelerometer. It appears that a 100 Hz acquisition frequency is high enough for an accelerometer to detect shocks of objects from everyday life.
Thus, we can perfectly imagine recognizing a shock by the violence of the deceleration observed by the IMU, which is better expressed by a peak of the acceleration's derivative over time, the jerk (also known as jolt). When the IMU records a peak of jerk, that means that it has encountered an obstacle, or that its momentum has been violently dissipated by a sudden brake or friction. Let us notice that the momentum's dissipation could be localized in the operator's arm if the IMU is in their hand. If the shock is elastic, the momentum will be returned through an acceleration in the opposite direction.
We can imagine a scenario where no shocks are elastic. Given the appearance of the IMU, it seems coherent in most environments. In this case, every shock will imply a total dissipation of momentum, and then a velocity reset to zero. This is consistent with the common movement of a human body, which can be decomposed into a succession of very brief stop-and-go movements. Such a feature will especially help avoid a constant drift of the velocity by regularly resetting it to zero. The main difficulty of its implementation remains the choice of a reasonable threshold for separating real shocks from more common small peaks of jerk.
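A minimal sketch of such a detector is given below: the jerk is approximated by the finite difference of successive acceleration samples, and a shock (hence a velocity reset) is declared when its norm exceeds a threshold, left as a parameter to be tuned as discussed above.

// Jerk-based shock detection on the 3-axis acceleration.
#include <array>
#include <cmath>

class ShockDetector {
    std::array<double, 3> prevAccel{{0.0, 0.0, 0.0}};
    bool hasPrev = false;
public:
    // Returns true when the current sample looks like a (non-elastic) shock,
    // i.e. when the velocity should be reset to zero.
    bool update(const std::array<double, 3>& accel, double dt, double jerkThreshold)
    {
        bool shock = false;
        if (hasPrev) {
            double jerkNorm2 = 0.0;
            for (int i = 0; i < 3; ++i) {
                double j = (accel[i] - prevAccel[i]) / dt;   // finite-difference jerk
                jerkNorm2 += j * j;
            }
            shock = std::sqrt(jerkNorm2) > jerkThreshold;
        }
        prevAccel = accel;
        hasPrev = true;
        return shock;
    }
};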

16 Orientation with the magnetic field

Since the MTi-G from Xsens also provides an embedded magnetometer, the use of the magnetic field could be considered to get the orientation of the IMU. Considering that the orientations of the magnetic field B and of the gravitational field G are constant and non-collinear (we are not on the magnetic north pole), and noting respectively b and g their normalized vectors, we can consider the coordinate system

\left(g,\ \frac{b - g(b \cdot g)}{\|b - g(b \cdot g)\|},\ \frac{g \wedge b}{\|g \wedge b\|}\right)

as constant. It allows us to know the orientation of the IMU as soon as it is motionless, and gives a constraint on the value of the acceleration measured by the IMU, since B ∧ G is constant.

17 Removing the gravitational component of the acceleration using a moving average

As we explained previously, the main difficulty in the integration of acceleration is the necessary removal of the gravitational component from the measured acceleration before integration. The accuracy and reliability of the orientation data obtained from both the camera and the IMU do not allow performing this subtraction without drawbacks: an error of 3 degrees in the orientation estimation can lead to an error of 0.05 g, i.e. 0.5 m.s−2, in a wrong direction, which appears low, but leads to a drift of several meters in a few tens of seconds.

A solution, in a case where the studied movements are small in terms of amplitude or velocity (more precisely with an average acceleration ≪ g), would be to subtract from the acceleration its moving average before integration. Theoretically, it means that the acceleration of gravity represents the main component of the acceleration, and that the acceleration from movements only represents oscillations around this main component. In practice, the implementation of this method requires defining the moving average precisely, in terms of:

- range: too wide, and the average will be too close to zero, or even present a delay that will strengthen the amplitude of the resulting acceleration; too narrow, and the resulting acceleration will be too weak;

- kind: do we choose an arithmetic average? a geometric one? a weighted one?

There is no existing reliable heuristic to help decide these matters, and lots of tests will be needed.

Part V

Results

In the following results, when a large set of data points was required to perform operations, for example FFTs, we used a data set composed of the last 500 acquisition points. This number has been chosen as large as possible, in order to store as much information as possible, but low enough to avoid computational slowdowns and keep a real-time display, especially when FFTs are involved at every frame.
The graphics from this part have been computed using a data acquisition of approximately 30 seconds, with the first scenario (IMU rigidly linked with the camera), the IMU initially lying flat. After the beginning of the acquisition, the IMU+camera assembly is moved approximately 20 centimeters above a marker lying horizontally on a desk, and describes slow and small movements. The camera always keeps the marker in sight, in order to keep reliable data on the real position of the assembly.
Nevertheless, the loss of the marker is simulated in the code itself every 10 seconds, for a duration of 5 seconds each time, giving a succession of 5-second periods with and without the marker in sight, in order to observe the behavior of the integration algorithm when the IMU is autonomous. The marker is in fact always in sight, in order to provide data for comparison afterwards.
The process begins with 5 seconds with a (supposedly) visible marker. Since this is not really the case before the operator puts the camera above the marker, the first 10 seconds of the process can be considered as producing irrelevant data.
The data are displayed according to the frequency of the IMU, which means that for the data recorded at the camera rate, the missing points are filled with the last available data. Unless stated otherwise, the axes mentioned in the graphics are those of the world coordinate system.

The transformation matrices P_{R_{camera}\to R_{IMU}} and P_{global} = P_{R_{marker}\to R_{world}}, obtained using a set of 10 shots of the marker with the corresponding inertial data, are the following ones:

P_{R_{camera}\to R_{IMU}} =
\begin{pmatrix}
-0.002181 & -0.77499 & -0.101103 \\
-1.251344 & 0.038298 & -0.005948 \\
0.014133 & 0.0612 & -1.022968
\end{pmatrix}

P_{global} =
\begin{pmatrix}
-0.634531 & -1.031282 & 0.288631 \\
0.739555 & -0.252833 & 0.330185 \\
-0.089318 & 0.250547 & 0.942597
\end{pmatrix}

We checked that these two matrices have a determinant equal to 1, and that the global transformation approximately preserves the Z-axis (since the initial position of the IMU is Z-upwards, as is the marker position).

Now, we want to approximate the acceleration provided by the IMU and the acceleration computed from the camera with two lines, using linear regressions.

Figure 14: Raw acceleration and moving averages over 500 and 15 data points, along the 3 axes

Figure 15: Position of the camera in the marker coordinate system, along the 3 axes. We can notice that the curves have a crenelated shape, since the acquisition frequency of the camera is lower than the IMU's one.

In the following graphic, the term ’linear approximation’ means the value onthe current point of a linear approximation over the last 500 data points when

31

Figure 16: Smoothed Speed of the camera using a moving average of range30, after smoothing of the position using a moving average of range 10, in themarker coordinate system, over the 3 axis

Figure 17: In blue, smoothed acceleration of the camera using a moving averageof range 10, in the marker coordinate system, compared with the accelerationfrom the IMU in green, over the 3 axis


In the following graphic, the term 'linear approximation' means the value at the current point of a linear approximation over the last 500 data points when the camera tracks the marker, and an estimation using the slope of the linear approximation computed at the last point where the tracker caught the marker. Considering this, those linear approximations will actually be linear every 5 seconds.
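The 'linear approximation' used here is an ordinary least-squares fit over the last available points, whose slope is then reused when the marker is lost. A minimal least-squares sketch, with illustrative names and no claim to match the thesis code, is:

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Ordinary least-squares fit y = slope * t + intercept over the given samples.
    // When the marker is lost, the slope of the last fit can be used to extrapolate.
    // Assumes at least two distinct timestamps.
    std::pair<double, double> linearFit(const std::vector<double>& t,
                                        const std::vector<double>& y)
    {
        const std::size_t n = t.size();
        double st = 0.0, sy = 0.0, stt = 0.0, sty = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            st  += t[i];
            sy  += y[i];
            stt += t[i] * t[i];
            sty += t[i] * y[i];
        }
        const double denom = n * stt - st * st;
        const double slope = (n * sty - st * sy) / denom;
        const double intercept = (sy - slope * st) / n;
        return { slope, intercept };
    }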

18 Integration in spectral domain

It was very difficult to achieve satisfactory results with the integration in the spectral domain: recomputing the spectrum of the last data points at every frame, even with well-optimized routines, is too expensive to achieve real-time integration with sufficient relevance of the results (a naive DFT costs O(n²) operations, and even an FFT in O(n log n) must be repeated at every frame). Nevertheless, we can take a look at the result of an integration in the spectral domain when the program uses prerecorded data. As we can see in the next graphics, the results appear interesting for the first time integration, to obtain velocity, but this velocity is in fact too distorted and too far from the real one to expect significant results in position.
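For reference, integrating once in the spectral domain amounts to dividing every Fourier coefficient by jω and discarding the DC component (the integration constant, i.e. the initial velocity, is unknown). The following self-contained sketch uses a naive O(n²) DFT for clarity; it is illustrative only and not the routine used in this work.

    #include <complex>
    #include <vector>

    // Integrate a uniformly sampled signal once by dividing its spectrum by j*omega.
    // dt is the sampling period. The DC bin is zeroed, so the result is defined up
    // to an additive constant (the unknown initial velocity).
    std::vector<double> integrateSpectral(const std::vector<double>& x, double dt)
    {
        const std::size_t n = x.size();
        const double pi = 3.14159265358979323846;
        std::vector<std::complex<double>> X(n);

        // Forward DFT (naive, O(n^2); an FFT would be used in practice).
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t i = 0; i < n; ++i)
                X[k] += x[i] * std::polar(1.0, -2.0 * pi * double(k) * double(i) / double(n));

        // Divide each bin by j*omega_k, using the signed frequency for bins above n/2.
        for (std::size_t k = 1; k < n; ++k) {
            double f = (k <= n / 2) ? double(k) : double(k) - double(n);
            double omega = 2.0 * pi * f / (double(n) * dt);
            X[k] /= std::complex<double>(0.0, omega);
        }
        X[0] = 0.0;   // drop the DC component

        // Inverse DFT, keeping the real part.
        std::vector<double> y(n, 0.0);
        for (std::size_t i = 0; i < n; ++i) {
            std::complex<double> acc = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                acc += X[k] * std::polar(1.0, 2.0 * pi * double(k) * double(i) / double(n));
            y[i] = acc.real() / double(n);
        }
        return y;
    }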

19 Testing the stability of the magnetic field

In order to test the relative stability of the direction of the magnetic field, we captured the values from the magnetometer and the accelerometers over a set of several different motionless stances of the IMU. The IMU being motionless during each stance, the accelerometers are supposed to be measuring only the gravity field. So, if the direction of the magnetic field is constant, the norm of the cross-product of the normalized magnetic field with the normalized gravity field should be constant:

\frac{d}{dt}\left( \left\| \frac{\vec{B}}{\|\vec{B}\|} \wedge \frac{\vec{G}}{\|\vec{G}\|} \right\| \right) = 0
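The quantity plotted in Figure 19 is simply the norm of the cross product of the two normalized vectors; a small illustrative sketch:

    #include <array>
    #include <cmath>

    using Vec3 = std::array<double, 3>;

    Vec3 normalize(const Vec3& v)
    {
        double n = std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]);
        return { v[0] / n, v[1] / n, v[2] / n };
    }

    // Norm of (B/|B|) x (G/|G|): the sine of the angle between the magnetic field
    // and gravity, which should stay constant over motionless stances.
    double crossNorm(const Vec3& B, const Vec3& G)
    {
        Vec3 b = normalize(B), g = normalize(G);
        Vec3 c = { b[1] * g[2] - b[2] * g[1],
                   b[2] * g[0] - b[0] * g[2],
                   b[0] * g[1] - b[1] * g[0] };
        return std::sqrt(c[0] * c[0] + c[1] * c[1] + c[2] * c[2]);
    }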


Figure 18: Comparison of the speed computed from an integration in the spectral domain, and the speed computed from the video tracker. The two curves are superposed every 5 seconds, due to the testing protocol previously explained (marker visible every 5 seconds)

Figure 19: Norm of the cross-product of the normalized magnetic field with the normalized gravity field over a set of several different motionless stances (abscissa: time in seconds; ordinate scale not relevant)

What we can observe in some tests is that the reliability of the magnetic field's direction is quite low, with variations whose amplitude is far too high for the magnetometer to be more interesting than the gyroscopes for orientation estimation.


20 Effects of filters

20.1 Effect of Kalman filter

We tested a Kalman filter with the following attributes:

• The measurement vector contains the acceleration along the 3 axes: z = (A_X A_Y A_Z)^T

• The state vector is x = (X Y Z V_X V_Y V_Z A_X A_Y A_Z)^T

• The transition matrix between two states is constant and equal to:

F =
\begin{pmatrix}
I_3 & \Delta t \, I_3 & \frac{(\Delta t)^2}{2} I_3 \\
0 & I_3 & \Delta t \, I_3 \\
0 & 0 & I_3
\end{pmatrix}

• The measurement matrix is

H =
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}

• The process noise covariance matrix is a positive-definite matrix which describes how well the state model represents the modeled system. We considered a diagonal matrix, because the errors are independent along the axes. We chose the value Q = 10^{-4} · I_9 for this covariance matrix, according to the literature, since the model is supposed to be accurate.

• The measurement noise covariance matrix is also positive-definite and diagonal, since the measurement errors are independent. It represents the unreliability of the data from the sensors. According to the MTi-G technical documentation, we chose the value R = 10^{-4} · I_3.

• The error covariance matrix represents the error in the estimation of the states. Large values signify that the filter does not estimate the state correctly. Given our filter and according to the literature, we chose the initial value P = I_9.

I_n being the identity matrix of size n.
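The implementation itself is not reproduced here; as an illustration, a filter with the attributes described above could be set up with OpenCV's cv::KalmanFilter (OpenCV being already used in this work for data recording), assuming a fixed sampling period Δt. This is a sketch under those assumptions, not the code actually used:

    #include <opencv2/video/tracking.hpp>

    // Constant-acceleration Kalman filter: state (X Y Z Vx Vy Vz Ax Ay Az),
    // measurement = the three accelerations reported by the IMU.
    cv::KalmanFilter makeAccelerationFilter(double dt)
    {
        cv::KalmanFilter kf(9, 3, 0, CV_64F);

        // F: block upper-triangular transition matrix (position, velocity, acceleration).
        cv::setIdentity(kf.transitionMatrix);
        for (int i = 0; i < 3; ++i) {
            kf.transitionMatrix.at<double>(i, i + 3) = dt;             // X += V * dt
            kf.transitionMatrix.at<double>(i, i + 6) = 0.5 * dt * dt;  // X += A * dt^2 / 2
            kf.transitionMatrix.at<double>(i + 3, i + 6) = dt;         // V += A * dt
        }

        // H: the measurement picks the three acceleration components of the state.
        kf.measurementMatrix = cv::Mat::zeros(3, 9, CV_64F);
        for (int i = 0; i < 3; ++i)
            kf.measurementMatrix.at<double>(i, i + 6) = 1.0;

        cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-4));     // Q = 1e-4 * I9
        cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-4)); // R = 1e-4 * I3
        cv::setIdentity(kf.errorCovPost, cv::Scalar::all(1.0));         // P = I9
        return kf;
    }

    // At every IMU sample: predict, then correct with the measured acceleration.
    //   cv::Mat accel = (cv::Mat_<double>(3, 1) << ax, ay, az);
    //   kf.predict();
    //   cv::Mat state = kf.correct(accel);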


Figure 20: Comparison between the raw acceleration (black) and the acceleration filtered by the Kalman filter (green), over the X-axis of the IMU

Figure 21: Same, zoomed

It appears that the use of a Kalman filter does not significantly modify the results. The reasons for this can be:

• an already low level of measurement error

• the effect of the Xsens firmware, which already implements an embedded Extended Kalman Filter

In all cases, it does not solve the drift problem.

20.2 Effect of other linear filters

Here, we analyzed the impact of low-pass filtering of the acceleration signal on the time integration process.


We tested the Bessel, Butterworth and Chebychev low-pass filters of 10th order. Considering the spectrum of the acceleration during a typical acquisition of several tens of seconds, we can consider that most of the information is contained below 15 Hz.

The coefficients calculated for the different filters are the following:

• Bessel filter:

a = \frac{1}{2.015579946 \cdot 10^4} (1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1)

b = (−0.0017696319, 0.0283358587, −0.0283358587, 0.9364034626, −2.8352616543, 6.0842140836, −9.4233371622, 10.4762753571, −8.0944065927, 3.9876543673)

• Butterworth filter:

a = \frac{1}{3.381982205 \cdot 10^4} (1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1)

b = (−0.0000017942, 0.0000010362, −0.0005277807, −0.0008496638, −0.0210067691, −0.0315935489, −0.2446013233, −0.2604334430, −0.9468182860, −0.5219785634)

• Chebychev filter:

a = \frac{1}{5.486139221 \cdot 10^4} (1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1)

b = (−0.4221883865, 3.6134220321, −14.7212112121, 37.4740537669, −65.8969969935, 83.5976167262, −77.5226953102, 51.9716220111, −24.1769594326, 7.0814702764)

applied to the formula

y[n] = (x[n−10], x[n−9], \dots, x[n]) \cdot a − (y[n−10], y[n−9], \dots, y[n−1]) \cdot b
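This difference equation can be applied sample by sample as in the illustrative sketch below, where a holds the 11 feed-forward coefficients and b the 10 feedback coefficients in the order listed above (this is not the thesis implementation):

    #include <array>
    #include <vector>

    // 10th-order IIR filter in direct form, following
    // y[n] = sum_{k=0..10} a[k] * x[n-10+k] - sum_{k=0..9} b[k] * y[n-10+k].
    // The first 10 output samples are left at zero for simplicity.
    std::vector<double> applyIIR(const std::vector<double>& x,
                                 const std::array<double, 11>& a,
                                 const std::array<double, 10>& b)
    {
        std::vector<double> y(x.size(), 0.0);
        for (std::size_t n = 10; n < x.size(); ++n) {
            double acc = 0.0;
            for (int k = 0; k <= 10; ++k)
                acc += a[k] * x[n - 10 + k];   // feed-forward part
            for (int k = 0; k <= 9; ++k)
                acc -= b[k] * y[n - 10 + k];   // feedback part
            y[n] = acc;
        }
        return y;
    }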


Figure 22: Spectrum of the acceleration along the X-axis of the IMU over a record of 2000 data points at a frequency of 100 Hz. The Nyquist frequency of this acquisition (50 Hz) corresponds to bin 1000 of the FFT, and the chosen cut-off frequency is at bin 300 on the X axis

Figure 23: Comparison of the effects of the different filters on the acceleration (in blue) with the raw acceleration (in green) during a simple set of movements, over the X axis

As we can see, we can estimate that beyond 25-30 Hz, the harmonics contain only noise.

As previously explained, Infinite Impulse Response filters are not necessarily stable. That is the case here, where the Bessel filter rapidly produces a strong divergence. For the two other filters, the response is stable, and we can see that the Butterworth filter has no real delay with respect to the initial input, but can produce some errors (a peak around second 10 for example), whereas the Chebychev filter induces a small delay, but has better smoothing properties.

38

Figure 24: Comparison of the acceleration spectra for each filter and without filter

As we can see, beyond the cut-off frequency, the harmonics of the spectrum have been significantly reduced, compared with the non-filtered spectrum. We notice that the Chebychev filter attenuates the stop-band better.

Figure 25: Comparison of the velocity obtained from the filtered acceleration (in blue) and the velocity estimated from the camera position (in green) during a simple set of movements, over the X axis

It appears that the velocity computed from the filtered acceleration does not differ much from the velocity estimated from the camera position, except for an offset that can have external origins, such as an offset in the raw acceleration. What we are interested in is the shape of the curve and the amplitude of the oscillations. In all cases (except for the Bessel filter, of course), it is hard to see a significant difference.


What we can conclude is:

• the Xsens firmware already does a good job of noise removal

• the remaining noise is not the core reason for the drift in position

Nevertheless, we can consider using these filters to efficiently separate the average acceleration from the oscillating component of the acceleration.

21 Computing the position

We finally tested the integration algorithm by combining the following features:

• Drift reduction by using a linear regression with the last available camera information

• Noise filtering with a Chebychev low-pass filter

• Shock detection and related velocity reduction

• Removing the moving average from the acceleration

The values used are:

• Chebychev filter: cut-off frequency at 15 Hz

• Velocity canceled if the jerk exceeds 25 m·s⁻³ on any axis (in world coordinates); see the sketch after this list

• The moving average is a simple arithmetic average of range 80
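A minimal sketch combining the moving-average removal and the jerk-based shock detection with the values above (illustrative names and structure, not the actual implementation):

    #include <array>
    #include <cmath>
    #include <deque>

    using Vec3 = std::array<double, 3>;

    struct Integrator {
        std::deque<Vec3> window;   // last accelerations (range 80), for the moving average
        Vec3 prevAccel{};          // previous motion acceleration, for the jerk estimate
        Vec3 velocity{};

        // One IMU sample: subtract the moving average (gravity estimate), detect shocks
        // via the jerk, and integrate the velocity. dt is the sampling period.
        void update(const Vec3& rawAccel, double dt)
        {
            window.push_back(rawAccel);
            if (window.size() > 80)
                window.pop_front();

            Vec3 mean{};                       // arithmetic moving average of range 80
            for (const Vec3& a : window)
                for (int i = 0; i < 3; ++i)
                    mean[i] += a[i];
            for (int i = 0; i < 3; ++i)
                mean[i] /= static_cast<double>(window.size());

            Vec3 motion{};                     // acceleration with the average removed
            bool shock = false;
            for (int i = 0; i < 3; ++i) {
                motion[i] = rawAccel[i] - mean[i];
                if (std::fabs(motion[i] - prevAccel[i]) / dt > 25.0)   // jerk > 25 m/s^3
                    shock = true;
            }

            if (shock)
                velocity = Vec3{};             // cancel the velocity on a detected shock
            else
                for (int i = 0; i < 3; ++i)
                    velocity[i] += motion[i] * dt;

            prevAccel = motion;
        }
    };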

In these results, as before, the program catches the marker every 10 seconds for a duration of 5 seconds. The first period of IMU autonomy is not significant since, as the camera does not actually start in front of the marker, the video data are unreliable during the first 5 seconds. Nevertheless, beyond 10 seconds of use, the results are relevant.

22 Using the IMU for orientation only

If the estimation of position with an IMU appears to be very difficult to achieve, the estimation of orientation can prove to be reliable enough to be of industrial interest.

If we consider the problem of estimating the attitude of a marker in a video, we know that the estimation of position is mainly done using the marker's size and position in the picture, while it is its shape that helps determine its orientation. Shape detection being the most subtle part of the process, orientation estimation can turn out to be far less reliable than position estimation in many situations.

If we want to estimate a camera's extrinsic parameters, we must, in most algorithms, make use of markers or points of interest in the scene. Adding extra information about the orientation of the camera, independently of the scene, can greatly improve the accuracy of those algorithms' results.


Figure 26: Comparison of the raw acceleration from the IMU, the corrected (after linear regression with video data, as explained in the previous part) and filtered acceleration, and the acceleration estimated from the video tracker

Figure 27: Comparison of the velocity obtained from the corrected and filtered acceleration, and the velocity estimated from the video tracker, over the X axis



Figure 28: Comparison of the position obtained from the corrected and filtered acceleration, and the position estimated from the video tracker, over the X axis

Figure 29: Same, over the Y axis

We tested whether, after a preliminary estimation of the global transformation between the IMU's original coordinate system and the marker coordinate system (an estimation which needed the orientation data from the video tracker in optimal conditions), the orientation provided by the IMU alone could remain reliable in the long run, in comparison with the orientation given by the tracker in optimal conditions.

The following graphics compare the evolution of the orientation parameters (angular value and coordinates of the rotation vector in world coordinates) of the IMU and of the camera during a period of approximately 30 seconds, with several movements of the whole camera + IMU ensemble.


Figure 30: Same, over the Z axis

Figure 31: Angular value of the orientation, from the camera (blue) and the IMU (green)

It appears clearly that the orientation data from the IMU are reliable in most conditions. The noticeable offset between the angular values of the orientation comes from an error in the estimation of the global transformation between the IMU's original coordinate system and the marker coordinate system, and can be removed by a simple initial calibration of the angle.


Figure 32: X component of the rotation vector, from the camera (blue) and the IMU (green)

Figure 33: Y component of the rotation vector, from the camera (blue) and the IMU (green)

Conclusions and Discussions

The aim of this master thesis was to establish a state of the art on the subject of autonomous real-time position calculation based on inertial measurements, with the potential help of information from a camera.

The double integration over time appears as the main obstacle to the development of robust positioning algorithms, but not necessarily for the theoretical reason of cumulative errors: indeed, the estimation of the angular orientation by integrating angular velocity over time appears to be quite reliable.


Figure 34: Z component of the rotation vector, from the camera (blue) and the IMU (green)

While the accumulation of two time integrations significantly increases the drift in position estimation, the main problem seems to be the necessary removal of the gravitational component from the measured acceleration, which requires an extremely accurate estimation of the IMU's orientation in order to estimate the acceleration due to the real motion.

Considering this, the attempts to reduce the drift in position by using exotic kinds of integration, such as integration in the spectral domain, appear not only unsuccessful (and difficult to adapt to a real-time context), but also irrelevant.

Nevertheless, some heuristic methods seem interesting for performing (or at least improving) the position estimation: on the one hand, shock detection and uniform-motion detection (associated, in almost all real-life motions, with null velocity), and on the other hand, the equalization of the gravitational component of the acceleration with the moving average of the acceleration for small movements, allowing a better estimation of the real acceleration and leading to a better position calculation.

However, shock detection needs a very high-quality IMU, which must notably be able to measure the violent deceleration during a solid shock. Such features are only available in IMUs of professional quality (for example, Nintendo's Wiimote is not able to detect such shocks).

We must of course keep in mind that, the Xsens IMU being of such professional quality, its measurement noise is very low, as the several filter tests concluded, but that noise filtering (notably with the filters studied) will prove very useful when using IMUs of lower quality.

The last point to consider is that, since the calculation of angular position is far more reliable than the calculation of position, we can make use of it to improve the attitude estimation of cameras in real time. That means IMUs can provide information about the orientation of a camera that increases the constraints on the camera's estimated extrinsic parameters, which leads to a faster and more accurate estimation of the camera's extrinsic and intrinsic parameters during a camera calibration procedure.


Finally, the points to remember at the end of this internship are the following:

• Theoretically, we cannot calculate the position or the orientation of an IMU over time without drift, whatever the quality of the measurement

• Nevertheless, angular estimation is reliable over a period of several tens of seconds or even minutes (0.5 degrees of drift per minute in good conditions)

• The main obstacle to obtaining a similar result in position estimation is the separation of the gravitational and motion components of the measured acceleration

• This separation needs an extremely accurate estimation of the IMU's orientation to be useful

• In most cases, the magnetic field cannot be considered accurate enough for orientation estimation

• Other methods can be used to improve the integration over time: detection of shocks and uniform motions

• The equalization of the moving average with the gravitational component of the measured acceleration is promising

• The experimental protocol that has been implemented and tested allows one to easily compute the various transformations between the IMU, camera and marker coordinate systems

• This protocol also shows that the orientation of the camera in a world coordinate system can be obtained easily, and is relatively reliable and coherent with the orientation of the camera that could be provided by a video tracker

Considering these results, the points that deserve further development are:

• A calibration process that would determine how to fit the parameters of the most promising methods described above to real use. Basically, we need to determine:

– for how long the acceleration needs to stay below which threshold to be considered a uniform motion

– which minimal value of jerk characterizes a mechanical shock, and whether we have to develop a specific gain function between the measured jerk and the corresponding velocity reduction

– what kind of moving average should be used as an estimation of the gravitational component of the measured acceleration, and for what kind of movements (for movements with a high amplitude or velocity, small oscillations would be irrelevant in the estimation of the position)

• A method to calculate a camera's parameters by making use of the IMU's orientation data that are already available


Bibliography

Suivi Hybride en présence d'Occultations pour la Réalité Augmentée (Hybrid Tracking in the Presence of Occlusions for Augmented Reality), Madjid Maidi, University of Evry (France), 2007

Trajectory reconstruction by integration of GPS and a swarm of MEMS accelerometers: model and analysis of observability, Christian Spagnol et al., IEEE, 2004

Integration of Accelerometer Data: Filter Analysis and Design Using Riccati Solutions, Daniel M. Murray, IEEE Transactions on Automatic Control, vol. AC-32, no. 2, 1987

Implementing Positioning Algorithms Using Accelerometers, Kurt Seifer and Oscar Camacho, Freescale Semiconductor, 2007

The Complementary Characteristics of GPS and Accelerometers in Monitoring Structural Deformation, Xiaojing Li et al., University of New South Wales and Tokyo Polytechnic University, 2004

Basic Inertial Navigation, Sheryll H. Stovall, Naval Air Warfare Center Weapons Division, 1997

Inertial Navigation - Forty Years of Evolution, A. D. King, GEC Review, 1998

SLAM in O(log n) with the Combined Kalman-Information Filter, C. Cadena, J. Neira, 2010

Improving monocular plane-based SLAM with inertial measures, Fabien Servant, Pascal Houlier, Eric Marchand, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2010

Derivation and Implementation of a full 6D EKF-based Solution to Bearing-Range SLAM, Jose-Luis Blanco, Perception and Mobile Robot Research Group, University of Malaga, 2008

Pedestrian and Vehicular Navigation Under Signal Masking Using Integrated HSGPS and Self-Contained Sensor Technologies, G. Lachapelle, O. Mezentsev, J. Collin, G. MacGougan, 11th World Congress, International Association of Institutes of Navigation, Berlin, 2003

