Mobile Mapping for 3-D City Modeling...

Graduate School Seminar Material, December 10, 2010 (Heisei 22)

Research Trends in Mobile Mapping Systems for 3-D City Modeling

An Introduction on Mobile Mapping Systems for 3-D City Modeling

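Graduate School of Information Science and Technology, Department of Information and Communication Engineering, Ikeuchi Laboratory, Master's course, 1st year, 48106446, 薛亮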

Abstract

Mobile Mapping Systems (MMS) have developed rapidly in recent years. Thanks to increasingly advanced components (e.g., cameras, laser sensors, GPS), they can be used in a wider range of fields, especially for modeling outdoor scenes, as in 3D city modeling. 3D models of cities are usually built from data acquired by aerial or land-based MMS. The data used for modeling come mainly from two sources: range data captured by the laser sensors of the MMS, and passive images captured by its cameras. In this survey paper, we first introduce what an MMS is, what it comprises, and the role of its components in 3D city modeling. We then introduce the two main methods of 3D city modeling using MMS through several project examples.

Keywords

MMS, 3D Modeling, Range sensor, Geometric model,

Geo-referencing, Integrated modeling

1 Introduction

In past years, Mobile Mapping Systems (MMS) were mainly used in airborne survey systems. Nowadays, land-based MMS are more and more widely used in survey work, especially in 3D landscape modeling. Typically the platform of a land-based MMS is a vehicle, but other platforms such as robots, trains, and even people have also been implemented and used. A typical MMS comprises multiple cameras, a laser scanner, a Global Positioning System (GPS) receiver, and an inertial measurement unit (IMU). Fig. 1 shows an example of a typical MMS.

The cameras capture images for reconstructing and modeling 3D scenes; recently developed systems typically use a 360-degree stereo camera instead of a multi-camera rig. The laser scanner acquires a point cloud, which contains more geometric detail than camera images. The GPS provides the position of the vehicle, and the images from the CCD cameras are used to determine the positions of points relative to the vehicle. The IMU increases the accuracy of the MMS by tracking the vehicle pose.

Fig. 1: Typical components of an MMS

Most MMS integrate navigation sensors and algorithms with sensors that can determine the positions of points remotely. All of the sensors are rigidly mounted together on a platform, such as a vehicle. The navigation sensors determine the position and orientation of the platform, and the remote sensors determine the positions of points external to the platform. The sensors used for remote position determination are predominantly photographic and are thus typically referred to as imaging sensors. However, additional sensors such as laser scanners are also used in MMS, so the more general term mapping sensors may also be used for the remote sensors[1].

Nowadays, many government agencies use urban models for development planning as well as for climate, air quality, fire propagation, and public safety studies. Commercial users, including phone, gas, and electric companies, also have an increasing demand for integrating urban models into their own products. Most of these users are primarily interested in models of buildings, terrain, vegetation, and traffic networks[2]. More recently, companies such as Google, Tele Atlas, and NAVTEQ have adopted the technology on a large scale, introducing substantial fleets of mobile mapping vehicles for their imaging and mapping operations. Most of these systems have been used to collect data for 3D street or 3D city modeling[3], for services such as Google Street View and Microsoft Virtual Earth.

To build such models, there are two main methods, corresponding to two components of the MMS: the cameras and the laser sensor. The first is called image-based modeling, or modeling from images, and the second is called range-based modeling, or modeling from range images.

Modeling from images is a classic problem in computer vision and remote sensing. Photogrammetry is a cost-effective means of obtaining large-scale urban models. Photogrammetric techniques use 2D images without any a priori 3D data[2]. Different image sensors lend themselves to modeling systems developed for terrestrial, panoramic, and aerial images[4].

Range-based modeling is also indispensable under certain measurement conditions. Laser sensors directly measure the depth of objects, which provides an ideal data set for urban modeling, and they can be used to track the vehicle's motion and capture 3D urban facade data[4].

This paper is structured as follows. Section 2, the main part of the paper, describes MMS in 3D city modeling in detail by reviewing methods from the literature, covering image-based, range-based, and integrated methods in turn. Section 3 compares the image-based and range-based methods, and Section 4 gives a summary.

2 Mobile Mapping Systems in 3-D City Modeling

2.1 Mathematical Modeling

2.1.1 Geo-Referencing Formula

The strength of Mobile Mapping Systems lies in their ability to directly georeference their mapping sensors. A mapping sensor is georeferenced when its position and orientation relative to a mapping coordinate frame are known. Once georeferenced, the mapping sensor can be used to determine the positions of points external to the platform in the same mapping coordinate frame.

The basis for all direct georeferencing formulas is a seven-parameter conformal transformation in which the coordinates of a point in the MMS imaging sensor's coordinate frame, $r^s_p$, are related to its coordinates in a mapping coordinate frame, $r^m_p$ [13]:

$$r^m_p = r^m_s + \mu^m_s R^m_s r^s_p \qquad (1)$$

In the above equation, $r^m_s$ is the position of the mapping sensor in the mapping coordinate frame, and $\mu^m_s$ and $R^m_s$ are respectively the scale factor and the rotation matrix between the mapping sensor coordinate frame and the mapping coordinate frame. Equation (1) is normally extended to include terms that account for the indirect measurements. The position and orientation of the system with respect to the mapping coordinate frame also change with time, so (1) must be modified to reflect this[13]. The georeferencing formula for a system integrating a mapping sensor with GPS and an IMU is

$$r^m_p = r^m_{GPS}(t) + R^m_{IMU}(t)\left(r^{IMU}_{IMU/s} - r^{IMU}_{IMU/GPS} + \mu^m_s R^{IMU}_s r^s_p\right) \qquad (2)$$

Fig. 2 shows the development of this equation.

It should be noted that the position and orientation are typically determined using a previously integrated GPS and IMU. In this case (2) reduces to

$$r^m_p = r^m_{IMU}(t) + R^m_{IMU}(t)\left(r^{IMU}_{IMU/s} + \mu^m_s R^{IMU}_s r^s_p\right) \qquad (3)$$
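As a minimal sketch of how the GPS/IMU georeferencing formula (2) might be evaluated for a single epoch, the following pure-Python example can be used. The function names (`mat_vec`, `georeference`), the reading of the two offset vectors as lever arms expressed in the IMU frame, and all numeric values are assumptions made for illustration; they are not taken from any of the cited systems.

```python
def mat_vec(R, v):
    """Multiply a 3x3 matrix (given as a list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def georeference(r_gps_m, R_imu_m, r_imu_s, r_imu_gps, mu, R_s_imu, r_p_s):
    """Evaluate the georeferencing formula (2) for one epoch t.

    r_gps_m   : GPS antenna position in the mapping frame at time t
    R_imu_m   : rotation from the IMU frame to the mapping frame at time t
    r_imu_s   : offset IMU -> mapping sensor, in the IMU frame (assumed)
    r_imu_gps : offset IMU -> GPS antenna, in the IMU frame (assumed)
    mu        : scale factor between sensor and mapping frame
    R_s_imu   : rotation from the sensor frame to the IMU frame
    r_p_s     : observed point in the sensor frame
    """
    # Scaled point rotated into the IMU frame: mu * R^IMU_s * r^s_p
    sensor_term = [mu * c for c in mat_vec(R_s_imu, r_p_s)]
    # Bracketed term of (2), still expressed in the IMU frame
    body = [r_imu_s[i] - r_imu_gps[i] + sensor_term[i] for i in range(3)]
    # Rotate into the mapping frame and add the GPS position
    rotated = mat_vec(R_imu_m, body)
    return [r_gps_m[i] + rotated[i] for i in range(3)]


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
YAW_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # 90 deg about Z

point_m = georeference(
    r_gps_m=[100.0, 200.0, 10.0],
    R_imu_m=YAW_90,
    r_imu_s=[1.0, 0.0, 0.0],
    r_imu_gps=[0.0, 0.0, 1.0],
    mu=1.0,
    R_s_imu=IDENTITY,
    r_p_s=[5.0, 0.0, 0.0],
)
print(point_m)  # [100.0, 206.0, 9.0]
```

In this made-up configuration the point lies 5 m ahead of the sensor; after the 90-degree heading rotation it ends up 6 m along the mapping Y-axis from the GPS antenna and 1 m below it.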

2.1.2 Theoretical Background of Camera Model

Fig. 2: Development of the georeferencing formula

The pinhole camera model[11] is used to model the camera. Assuming that the input images have been corrected for radial distortion, the homogeneous 2D projection p of a homogeneous 3D point P is given by the following equation[15],[19]:

$$p = K \, M_{co} \, P \qquad (4)$$

with

$$K = \begin{pmatrix} f/p_x & 0 & u_0 \\ 0 & f/p_y & v_0 \\ 0 & 0 & 1 \end{pmatrix} \qquad (5)$$

where $p_x$ and $p_y$ are the width and height of a pixel, $[u_0\ v_0]^T$ are the image coordinates of the principal point, and $f$ is the focal length. Thus $f_x = f/p_x$ and $f_y = f/p_y$ are the focal length measured in units of pixel width and pixel height. The camera pose $M_{co}$ is defined by the camera's 3×3 orientation matrix R and its 3×1 position vector[15],[19].
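The projection in (4) and (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pose is taken to map world coordinates to camera coordinates as R·P + t (a convention chosen here, since the text does not fix one), the function name `project_pinhole` is hypothetical, and the intrinsic values are invented.

```python
def project_pinhole(f, px, py, u0, v0, R, t, P):
    """Project a 3D point P (world frame) to pixel coordinates (u, v).

    Assumes camera coordinates are obtained as R * P + t and that the
    image has already been corrected for radial distortion.
    """
    # Transform the point into the camera frame
    Xc = [sum(R[i][j] * P[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Focal length expressed in pixel units, as in the matrix K
    fx, fy = f / px, f / py
    # Perspective division followed by the principal-point offset
    u = fx * Xc[0] / Xc[2] + u0
    v = fy * Xc[1] / Xc[2] + v0
    return u, v


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Invented intrinsics: 8 mm lens, 0.01 mm square pixels, 640x480 image
u, v = project_pinhole(8.0, 0.01, 0.01, 320.0, 240.0,
                       IDENTITY, [0.0, 0.0, 0.0], [0.5, -0.25, 5.0])
print(round(u, 3), round(v, 3))
```

With these invented values the focal length is 800 pixels, so a point 5 m in front of the camera projects to approximately (400, 200) pixels.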

2.1.3 Geometric Model of Laser Scanning

As shown in Fig. 3, point O″ is the center of the laser system (LS) and point O is the projection center of the CCD. Point O is taken as the origin of the coordinate system; the positive X-axis coincides with the direction of travel, OO″ is the Y-axis, and the zenith direction is the Z-axis. At a given clock instant, the coordinates of a CCD image point can be written as (x, y, −f), where f is the focal length of the CCD and x is constant at that instant but varies with time. The projection equations between an arbitrary object point P and the corresponding image point are, according to [12]:

$$x = -f\,\frac{a_1(X_p - X_o) + b_1(Y_p - Y_o) + c_1(Z_p - Z_o)}{a_3(X_p - X_o) + b_3(Y_p - Y_o) + c_3(Z_p - Z_o)} \qquad (6)$$

$$y = -f\,\frac{a_2(X_p - X_o) + b_2(Y_p - Y_o) + c_2(Z_p - Z_o)}{a_3(X_p - X_o) + b_3(Y_p - Y_o) + c_3(Z_p - Z_o)} \qquad (7)$$

Fig. 3: Coordinate Sketch Map

Fig. 4: Coordinate transformation

where $a_1, a_2, a_3, b_1, b_2, b_3, c_1, c_2, c_3$ are determined by the attitude parameters of the CCD, and $(X_o, Y_o, Z_o)$ by DGPS.
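The two projection equations (6) and (7) translate directly into code. The sketch below is an illustration only: the function name `collinearity` is hypothetical, the nine coefficients are assumed to be supplied as three rows (a_i, b_i, c_i), and the numbers in the example are invented.

```python
def collinearity(f, rot, cam, pt):
    """Image coordinates (x, y) of object point pt, following (6) and (7).

    f   : focal length of the CCD
    rot : rows (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) (assumed layout)
    cam : projection center (Xo, Yo, Zo)
    pt  : object point (Xp, Yp, Zp)
    """
    # Vector from the projection center to the object point
    d = [pt[i] - cam[i] for i in range(3)]
    num_x = sum(rot[0][i] * d[i] for i in range(3))
    num_y = sum(rot[1][i] * d[i] for i in range(3))
    den = sum(rot[2][i] * d[i] for i in range(3))
    if den == 0:
        raise ValueError("point lies in the plane through the projection center")
    return -f * num_x / den, -f * num_y / den


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Invented example: 50 mm focal length, point 10 m along the negative Z-axis
x_img, y_img = collinearity(0.05, IDENTITY, (0.0, 0.0, 0.0), (1.0, 2.0, -10.0))
print(x_img, y_img)
```

For this configuration the image coordinates come out at roughly 5 mm and 10 mm, which is the expected 1:200 scale of the object offsets.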

As mentioned above, the LS describes objects using distance and angle information. Referring to Fig. 4, and assuming that the distance OO″ is r, the following equations directly transform the LS coordinates $(\rho_1, \theta_1)$ into the coordinate system shown in Fig. 3:

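$$\rho_2 = \sqrt{\rho_1^2 + r^2 - 2\rho_1 r \sin\theta_1} \qquad (8)$$

$$Y_p = \rho_2 \sin\theta_2 = \rho_1 \sin\theta_1 - r \qquad (9)$$

$$Z_p = -\rho_1 \cos\theta_1 = -\rho_2 \cos\theta_2 \qquad (10)$$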

According to (8), (9), and (10), the 3D coordinates of P can be determined, where the X-coordinate varies with time. The texture information extracted from the CCD images can then be pasted onto the DEM generated from the range images. Subsequently, 3D visual models can be built and fundamental measurements, such as profile mapping and stack volumes, can be derived.
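The transformation in (8) to (10) can be checked numerically with a short Python sketch. The function name `ls_to_section` and the sample values are assumptions for illustration; the code simply evaluates the three equations and confirms that they agree with each other.

```python
import math


def ls_to_section(rho1, theta1, r):
    """Convert LS polar coordinates (rho1, theta1) into (Yp, Zp, rho2).

    r is the distance OO'' between the CCD projection center and the
    laser system center; theta1 is in radians.
    """
    y_p = rho1 * math.sin(theta1) - r           # equation (9)
    z_p = -rho1 * math.cos(theta1)              # equation (10)
    rho2 = math.sqrt(rho1 ** 2 + r ** 2
                     - 2.0 * rho1 * r * math.sin(theta1))  # equation (8)
    return y_p, z_p, rho2


# Invented sample: 10 m range at 30 degrees, sensors 1 m apart
y_p, z_p, rho2 = ls_to_section(10.0, math.radians(30.0), 1.0)
print(round(y_p, 4), round(z_p, 4), round(rho2, 4))
```

Because (9) and (10) are the Cartesian components of the vector whose length is given by (8), `math.hypot(y_p, z_p)` should reproduce `rho2` up to rounding, which is a convenient sanity check on the sign conventions.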


Fig. 5: Image-Based Modeling Flow

2.2 Image-Based Methods

Various methods have been proposed for image-based modeling. Research activities in image-based modeling can be classified as follows:

• Approaches that try to obtain a 3D model of the scene from uncalibrated images automatically (also called "shape from video", "VHS to VRML", or "Video-To-3D")[14].

• Approaches that perform a semi-automated 3D reconstruction of the scene from oriented images[15].

• Approaches that perform a fully automated 3D reconstruction of the scene from oriented images[18].

The general procedure of image-based modeling is shown in Fig. 5.

2.3 Range-Based Methods

The purpose of a 3D laser scanner is usually to create a point cloud of geometric samples on the surface of the subject. These points can then be used to extrapolate the shape of the subject (a process called reconstruction). If color information is collected at each point, then the colors on the surface of the subject can also be determined.

3D scanners collect distance information about surfaces within their field of view. The "picture" produced by a 3D scanner describes the distance to a surface at each point in the picture. If a spherical coordinate system is defined in which the scanner is the origin and the vector out from the front of the scanner is φ = 0 and θ = 0, then each point in the picture is associated with a φ and a θ. Together with the distance, which corresponds to the r component, these spherical coordinates fully describe the three-dimensional position of each point in the picture, in a local coordinate system relative to the scanner.
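As a small illustration of how such a range "picture" becomes a point cloud, the sketch below converts (r, φ, θ) samples to local Cartesian coordinates. The axis convention is an assumption made here, since the text only fixes the forward direction: φ is treated as the horizontal angle and θ as the elevation, so that φ = 0, θ = 0 points along the local X-axis. The function names are hypothetical.

```python
import math


def spherical_to_cartesian(r, phi, theta):
    """Convert one range sample to scanner-local (x, y, z).

    Assumed convention: phi is the horizontal angle, theta the elevation,
    and phi = theta = 0 is the forward (X) direction of the scanner.
    """
    x = r * math.cos(theta) * math.cos(phi)
    y = r * math.cos(theta) * math.sin(phi)
    z = r * math.sin(theta)
    return x, y, z


def range_image_to_points(samples):
    """Convert an iterable of (r, phi, theta) samples into a point list."""
    return [spherical_to_cartesian(r, phi, theta) for r, phi, theta in samples]


# Three invented samples: straight ahead, to the side, and straight up
cloud = range_image_to_points([
    (2.0, 0.0, 0.0),
    (1.0, math.pi / 2, 0.0),
    (1.0, 0.0, math.pi / 2),
])
for point in cloud:
    print(tuple(round(c, 6) for c in point))
```

A real scanner would additionally need its own calibration and the pose of the platform to place these local points in a mapping frame.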

In most situations, a single scan will not produce a complete model of the subject. Multiple scans, even hundreds, from many different directions are usually required to obtain information about all sides of the subject. These scans have to be brought into a common reference system, a process that is usually called alignment or registration, and then merged to create a complete model. This whole process, going from the single range map to the whole model, is usually known as the 3D scanning pipeline[19].

There are two main types of range sensors: triangulation-based sensors and those based on the time-of-flight (TOF) principle.

• Triangulation-based sensors: This technique is called triangulation because the laser dot, the camera, and the laser emitter form a triangle. The sensor projects light in a known direction from a known position and measures the direction of the returning light through its detected position. Measurement accuracy depends on the triangle base relative to its height. Because the triangle base is rather short (for practical reasons), such systems have a limited range of less than 10 meters (in fact, most are less than 3 meters)[20].

The length of one side of the triangle, the distance between the camera and the laser emitter, is known. The angle at the laser emitter corner is also known. The angle at the camera corner can be determined from the location of the laser dot in the camera's field of view. These three pieces of information fully determine the shape and size of the triangle and give the location of the laser dot corner of the triangle. Fig. 6 shows the triangulation sensing diagram.

• Sensors based on the time-of-flight principle: Such a sensor measures the delay between emission and detection of the light reflected by the surface, and thus the accuracy does not rapidly deteriorate as the range increases. Time-of-flight sensors can provide measurements in the kilometer range.

A pulsed time-of-flight laser rangefinding device typically consists of a laser pulse transmitter, the necessary optics, two receiver channels, and a time-to-digital converter (TDC), as shown in Fig. 7. The laser pulse transmitter emits a short optical pulse (typically 2 to 20 ns) to an optically visible target, and the transmission event is defined either optically, by detecting a fraction of the pulse, or electrically, from the drive signal of the laser diode. The start pulse is then processed in a receiver channel, which generates a logic-level start pulse for the TDC. In the same way, the optical pulse reflected from the target and collected by the photodetector of the stop receiver channel is processed, and a logic-level stop pulse is generated for the TDC. The TDC uses its time base to convert the time interval to a digital word that represents the distance to the target[20].

Fig. 6: Triangulation Sensing Diagram

Fig. 7: Time-of-Flight Sensing Diagram
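Both ranging principles reduce to short formulas, sketched below in Python. This is an illustration under simplifying assumptions, not a model of any particular sensor: the triangulation case uses the law of sines on the emitter, camera, and laser-dot triangle, and the time-of-flight case uses the vacuum speed of light and halves the round-trip delay. The function names and sample numbers are invented.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s in vacuum (air is slightly slower)


def triangulation_range(baseline, emitter_angle, camera_angle):
    """Perpendicular distance of the laser dot from the baseline.

    baseline      : distance between laser emitter and camera
    emitter_angle : interior angle at the emitter corner (radians)
    camera_angle  : interior angle at the camera corner (radians)
    """
    apex = math.pi - emitter_angle - camera_angle  # angle at the laser dot
    if apex <= 0:
        raise ValueError("angles do not form a triangle")
    # Law of sines: the emitter-to-dot side is opposite the camera angle
    emitter_to_dot = baseline * math.sin(camera_angle) / math.sin(apex)
    # Height of the triangle above the baseline
    return emitter_to_dot * math.sin(emitter_angle)


def tof_range(delay_s):
    """Distance to the target from the measured round-trip delay."""
    return SPEED_OF_LIGHT * delay_s / 2.0


# Invented examples: 20 cm baseline with two 60-degree angles, 1 microsecond delay
print(round(triangulation_range(0.2, math.radians(60), math.radians(60)), 4))
print(round(tof_range(1e-6), 3))
```

The triangulation formula also shows why a short baseline limits the usable range: for a distant dot the apex angle becomes very small, so small errors in the measured camera angle change the computed distance substantially.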

The general procedure of range-based modeling is shown in Fig. 8.

Fig. 8: Range-Based Modeling Flow

2.4 Integrated Methods

Laser scanning can produce the dense 3D point-cloud data required to create high-resolution geometric models, while digital photogrammetry is better suited to producing high-resolution textured 3D models that represent just the main object structure. With laser scanning it is usually necessary to perform multiple scans from different locations to cover every part of the object, and the alignment and integration of the different scans can affect the final accuracy of the 3D model[21].

Generally, the main problems solved in the integrated

methods are as follows[5]:

• Multi-source data acquisition

• Multi-source data registration and fusion

• Modeling and reconstruction

• Visualization and interactive operation

Next we give some examples from the literature that follow these steps.

One example is proposed by Guidi et al. [6], who developed a multi-resolution approach for the reality-based 3D modeling of the entire Roman Forum in Pompeii, Italy.

A top-down methodology was employed in that paper, which starts from traditional aerial images and reaches higher-resolution geometric details through range data and terrestrial images.

In the data acquisition step, they first generate the Digital Surface Model (DSM) using the ETH multi-photo matcher[7]. After cleaning, simplification, and overlap reduction, 36 million points were used for the buildings. To reduce the number of polygons in the final mesh, the IMCompress software was used. The process stops when the maximum 3D distance between the current triangulation and the original model exceeds a tolerance level. Most elements, such as pieces of columns and trabeations, were processed with standard close-range photogrammetry software (PhotoModeler), while for detailed surfaces (ornaments, reliefs, etc.) the multi-photo geometrically constrained ETH matcher[7] was used.

Fig. 9: Example of processing steps and data flow for the integration of photogrammetry and 3D scanning systems

In the data registration and integration step, a set of starting topographic points provided by the Pompeii Superintendence was first used and enriched with a dedicated topographic campaign. Then the two starting scans were acquired from two documented topographic points, and all the other point clouds were aligned to these. The final range model was afterwards roto-translated through the other documented points with a spatial similarity transformation. The resulting point cloud was then used to align each individual photogrammetric model.

Fig. 9 shows the processing steps and data flow for the

integration of photogrammetry and 3D scanning sys-

tems.

Another case is described in [8]. The authors integrate their techniques as follows:

• Construct the basic shape and large regularly shaped details, such as columns and blocks, from high-resolution digital images.

• Use laser scans to obtain fine geometric details, such as sculpted and irregularly shaped surfaces, then integrate these with the basic model created in the previous step.

• Obtain visual details in the geometric model from image textures and reflectance models.

• Use panoramas from aerial images to complete the surroundings and distant landscapes.

• Use the semiautomatic image-based approach to model the entire structure without the fine details and sculpted surfaces.

Fig. 10: Main steps for constructing architectural elements semiautomatically (column and window examples)

Fig. 10 shows the main steps for constructing architectural elements semiautomatically (column and window examples): (a) in multiple images, extract and match seed points and compute their 3D coordinates; (b) in 3D space, reconstruct the object from the seed points; and (c) create a full 3D model and project the new 3D points onto the image for texture mapping.

The next step is to automatically sample points from the range-based model along its perimeter and insert them into the image-based model. Finally, the authors created seven individual models from the digital images and several detailed ones from the scanned smaller regions, and then integrated them into a whole model, as shown in Fig. 11.

Fig. 11: Integrating laser scanned models and image-based models

The third example using an integrated method is also by members of the Institute of Geodesy and Photogrammetry at ETH[9].

They took about 300 images in one day, keeping the camera at its minimum focal length, and acquired point clouds from 50 scans on different days, resulting in a dataset of 30 million points.

In the data processing step, they proceeded as follows:

• Subdivide the processing into 4 different steps.

• Manually select matching points among adjacent images.

• Compute the corresponding relative orientation.

• Add geometric features (points, lines, corners, ...) to improve the level of detail of the resulting model.

• Merge the 4 subdivided projects.

After that, they aligned the range data pair by pair

by using ICP-based global alignment[10].
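To make the idea of ICP-based pairwise alignment concrete, here is a deliberately simplified point-to-point ICP for 2D point sets in pure Python. It is a sketch of the basic iteration described in [10] (match nearest neighbors, solve for the best rigid motion, repeat), not the global alignment procedure used by the authors; the 2D restriction, the brute-force neighbor search, and all names and sample data are assumptions made to keep the example short.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src -> dst."""
    n = len(src)
    cxs = sum(p[0] for p in src) / n
    cys = sum(p[1] for p in src) / n
    cxd = sum(p[0] for p in dst) / n
    cyd = sum(p[1] for p in dst) / n
    s_cos = s_sin = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        xs, ys, xd, yd = xs - cxs, ys - cys, xd - cxd, yd - cyd
        s_cos += xs * xd + ys * yd
        s_sin += xs * yd - ys * xd
    ang = math.atan2(s_sin, s_cos)
    c, s = math.cos(ang), math.sin(ang)
    # Translation that maps the rotated source centroid onto the target centroid
    return ang, cxd - (c * cxs - s * cys), cyd - (s * cxs + c * cys)


def transform(points, ang, tx, ty):
    """Apply a 2D rotation by ang followed by a translation (tx, ty)."""
    c, s = math.cos(ang), math.sin(ang)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(src, dst, iterations=20, tol=1e-12):
    """Align src to dst; returns the transformed copy of src."""
    cur = list(src)
    for _ in range(iterations):
        # Brute-force nearest neighbor in dst for every current point
        matched = [min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in cur]
        ang, tx, ty = best_rigid_2d(cur, matched)
        cur = transform(cur, ang, tx, ty)
        err = sum((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                  for a, b in zip(cur, matched)) / len(cur)
        if err < tol:
            break
    return cur


# Invented data: a small asymmetric shape and a slightly moved copy of it
scan_a = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 5.0)]
scan_b = transform(scan_a, 0.05, 0.2, -0.1)
aligned = icp_2d(scan_a, scan_b)
print(max(math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in zip(aligned, scan_b)))
```

The example only converges this cleanly because the initial misalignment is small compared with the point spacing; with real scans a reasonable initial pose and a faster neighbor search would be needed.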

Then, in the data fusion step, the procedure is as follows:

• Relate the two models to a common reference frame.

• Georeference the globally aligned and reduced point cloud.

• Import the whole photogrammetric model into Rhino as a ".3dm" file, which preserves the texture information provided by the high-resolution digital images.

• Import the separate laser scanning models as ".dwg" files.

• Render the 3D model.

The final integrated model of the building is shown in Fig. 12.

Fig. 12: Final result of the fusion

3 Comparison

Based on the two methods described above, we can summarize the advantages and disadvantages of each.

Image-based Methods:

• Advantages: easy to use, very portable surveying

system, analog or digital imagery, wide availability

of commercial processing/modeling software

• Disadvantages: camera calibration, time con-

suming (semi-automated) measurements, image

resolution[4]

Range-based Methods:

• Advantages: fast acquisition of a huge amount of

3D data, recording of intensity (gray values) and

color data (digital images), high LOD of the data

combined with quite good metric accuracy (de-

pending on the used instrument)

• Disadvantages: data handling, registration, model-

ing, edges, noise[4]

4 Summary

In this paper, we first gave a sketch of Mobile Mapping Systems, then described the main components of an MMS through their mathematical models, and finally listed some modeling methods from the literature that use the cameras or the laser sensor of an MMS. From this discussion, we find the following:

• Many of the problems of converting a measured point cloud into a realistic 3D polygonal model that can satisfy high modeling and visualization demands have not been completely solved.

• Meanwhile, modeling from images suffers from a lack of accuracy, and an object cannot easily be modeled in full without taking a huge number of images.

Therefore, the best way to model an object, especially in 3D city modeling, is to use an integrated method that combines the image data taken by the cameras with the range data taken by the laser sensor of an MMS.

References

[1] Shi, Z.C., "Advanced Mobile Mapping System Development with Integration of Laser Data, Stereo Images and other Sensor Data," 武蔵工業大学環境情報学部紀要, No. 10, February 2009, pp. 24-31.

[2] J. Hu, S. You, and U. Neumann, "Approaches to large-scale urban modeling," IEEE Computer Graphics and Applications, 23(6):62-67, 2003.

[3] Gordon Petrie, "An Introduction to the Technology Mobile Mapping Systems," GeoInformatics, Vol. 13, No. 1, pp. 32-43, 2010.

[4] F. Remondino, A. Guarnieri, A. Vettore, "3D Modeling of Close-Range Objects: Photogrammetry or Laser Scanning," SPIE-IS&T Electronic Imaging, Vol. 5665, pp. 216-225, San Jose (California), USA, January 2005.

[5] Qingquan Li, Bijun Li, Yuguang Li, Jing Chen, "3D Modeling and Visualization Based on Laser Scanning," Geographic Information Sciences, Vol. 6, No. 2, December 2000.

[6] Gabriele Guidi, Fabio Remondino, Michele Russo, Fabio Menna, Alessandro Rizzi, "3D Modeling of Large and Complex Site Using Multi-sensor Integration and Multi-resolution Data," The 9th International Symposium on Virtual Reality, Archaeology and Cultural Heritage VAST (2008).

[7] Remondino, F., El-Hakim, S., Gruen, A., Zhang, L., "Development and performance analysis of image matching for detailed surface reconstruction of heritage objects," IEEE Signal Processing Magazine, 25(4), pp. 56-65, 2008.

[8] Sabry F. El-Hakim, J.-Angelo Beraldin, Michel Picard, Guy Godin, "Detailed 3D Reconstruction of Large-Scale Heritage Sites with Integrated Techniques," IEEE Computer Graphics and Applications, vol. 24, no. 3, pp. 21-29, May/June 2004, doi:10.1109/MCG.2004.1318815.

[9] A. Guarnieri, F. Remondino, A. Vettore, "Digital photogrammetry and TLS data fusion applied to cultural heritage 3D modelling," International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXVI, Part 5, 2006.

[10] Paul J. Besl, Neil D. McKay, "A Method for Registration of 3-D Shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 14, No. 2, February 1992.

[11] Gaël Sourimant, Luce Morin, Kadi Bouatouch, "Gps, Gis and Video fusion for urban modeling," Computer Graphics International, Petropolis, Brazil, May-June 2007.

[12] Wang, Z. Z., "Principles of photogrammetry," Press of WTUSM, 1990.

[13] Albert S. Huang, Matthew Antone, Edwin Olson, Luke Fletcher, David Moore, Seth Teller and John Leonard, "A High-rate, Heterogeneous Data Set From The DARPA Urban Challenge," The International Journal of Robotics Research, 2010, 29: 1595.

[14] A. Akbarzadeh, J.-M. Frahm, P. Mordohai, B. Clipp, C. Engels, D. Gallup, P. Merrell, M. Phelps, S. Sinha, "Towards Urban 3D Reconstruction from Video," Third International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT), 2006 (Invited Paper).

[15] S.F. El-Hakim, "3D Modeling of Complex Environments," Electronic Imaging: Videometrics and Optical Methods for 3D Shape Measurement VII, Vol. 4309 (January 2001).

[16] Ioannis Stamos, Peter K. Allen, "Integration of Range and Image Sensing for Photorealistic 3D Modeling," ICRA 2000: 1435-1440.

[17] Fabio Remondino, "Terrestrial Optical Active Sensors: Theory and Applications," 3D Optical Metrology Unit, Bruno Kessler Foundation (FBK), Trento, Italy.

[18] J-A. Beraldin, "Integration of Laser Scanning and Close-Range Photogrammetry - The Last Decade and Beyond," IAPRS, 35(5), pp. 972-983, Istanbul, Turkey, July 2004.

[19] J.J. Jeon, J.W. Lee, J.Y. Ha, and W.H. Kwon, "Image-Based 3D Modelling: A Review," The Photogrammetric Record 21(115): 269-291 (September 2006).

[20] Ch. Ioannidis, N. Demir, S. Soile, M. Tsakiri, "Combination of Laser Scanner Data and Simple Photogrammetric Procedures for Surface Reconstruction of Monuments," CIPA 2005 XX International Symposium, 26 September - 01 October 2005, Torino, Italy.

[21] Gabriele Guidi, Fabio Remondino, Michele Russo, Fabio Menna, Alessandro Rizzi, Sebastiano Ercoli, "A Multi-Resolution Methodology for the 3D Modeling of Large and Complex Archeological Areas," International Journal of Architectural Computing, Vol. 7(1), pp. 40-55.
