2010. 5. 25 박 한 샘

Soft Computing Lab., Dept. of Computer Science, Yonsei Univ., Korea
Distilling Free-Form Natural Laws from Experimental Data
Michael Schmidt and Hod Lipson, Science, vol. 324, no. 5923, pp. 81-85, April 2009
2010. 5. 25

Page 1: Distilling Free-Form Natural Laws from Experimental Data

Soft Computing Lab., Dept. of Computer Science, Yonsei Univ., Korea

Michael Schmidt and Hod Lipson, Science, vol. 324, no. 5923, pp. 81-85, April 2009

2010. 5. 25 박 한 샘

Page 2: Outline

• Overview of this paper
• Background & Motivation
• Algorithm
• Experiments
• Conclusion

Page 3: Overview of This Paper

• Mining physical systems
  – Capture the angles and angular velocities of a double pendulum over time using motion tracking
  – Search for equations that describe a single natural law relating these variables, without any prior knowledge of physics or geometry
  – The law found turns out to be the double pendulum's Hamiltonian
• The proposed approach is demonstrated
  – Using a simple harmonic oscillator and a chaotic double pendulum

Figure: Actual pendulum, data, and results

Page 4: Symbolic Regression

• Symbolic regression
  – Searches for both the parameters and the form of equations, unlike traditional linear and nonlinear regression methods
• Process (evolutionary computation)
  – Initial expressions are formed by randomly combining mathematical building blocks such as algebraic operators {+, -, ×, /}, analytical functions (for example, sine and cosine), constants, and state variables
  – New equations are formed by recombining previous equations and probabilistically varying their sub-expressions
  – The algorithm retains equations that model the experimental data better than others and abandons unpromising solutions
  – After equations reach a desired level of accuracy, the algorithm terminates, returning a set of equations that are most likely to correspond to the intrinsic mechanisms underlying the observed system

Background
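The evolutionary process on the Symbolic Regression slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it fits y = f(x) on made-up data, uses nested tuples as expression trees, leaves out division, and stops after a fixed number of generations rather than at an accuracy threshold. All function names and parameters (`random_expr`, `vary`, `search`, population size, probabilities) are assumptions chosen for the example.

```python
import math
import random

# Building blocks (assumed subset): binary operators, sine, constants, and the variable "x".
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def random_expr(rng, depth=2):
    """Randomly combine building blocks into a nested-tuple expression tree."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", round(rng.uniform(-2, 2), 1)])
    if rng.random() < 0.2:
        return ("sin", random_expr(rng, depth - 1))
    return (rng.choice(list(OPS)), random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    """Evaluate an expression tree at a single value of x."""
    if isinstance(expr, (int, float)):
        return expr
    if expr == "x":
        return x
    if expr[0] == "sin":
        return math.sin(evaluate(expr[1], x))
    return OPS[expr[0]](evaluate(expr[1], x), evaluate(expr[2], x))

def error(expr, data):
    """Mean absolute error on the data; non-finite results count as infinitely bad."""
    try:
        err = sum(abs(evaluate(expr, x) - y) for x, y in data) / len(data)
    except (ValueError, OverflowError, RecursionError):
        return math.inf
    return err if math.isfinite(err) else math.inf

def random_subtree(expr, rng):
    """Pick a random sub-expression by walking down the tree."""
    while isinstance(expr, tuple) and rng.random() < 0.7:
        expr = expr[rng.randrange(1, len(expr))]
    return expr

def vary(expr, donor, rng):
    """Replace one sub-expression with a fresh one (variation) or a piece of `donor` (recombination)."""
    if not isinstance(expr, tuple) or rng.random() < 0.3:
        return random_expr(rng) if rng.random() < 0.5 else random_subtree(donor, rng)
    i = rng.randrange(1, len(expr))
    return expr[:i] + (vary(expr[i], donor, rng),) + expr[i + 1:]

def search(data, generations=30, size=150, seed=0):
    """Keep the better quarter of the population each generation and refill it with variants."""
    rng = random.Random(seed)
    population = [random_expr(rng) for _ in range(size)]
    for _ in range(generations):
        population.sort(key=lambda e: error(e, data))
        survivors = population[: size // 4]
        children = [vary(rng.choice(survivors), rng.choice(survivors), rng)
                    for _ in range(size - len(survivors))]
        population = survivors + children
    return min(population, key=lambda e: error(e, data))

# Made-up target for illustration: y = x^2 + x sampled on [-2, 2].
data = [(i / 4, (i / 4) ** 2 + i / 4) for i in range(-8, 9)]
best = search(data)
print(best, error(best, data))
```

Because the survivors are carried over unchanged, the best error never gets worse from one generation to the next; whether the exact target expression is recovered depends on the seed and the budget.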

Page 5: Challenge

• It is a major challenge even for human scientists to identify nontrivial relations
• A nontrivial conservation equation should be able to predict connections among derivatives of groups of variables over time, relations that we can also calculate from new experimental data
• One instance of such a metric is the partial derivatives between pairs of variables

Motivation
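The partial-derivative metric from the Challenge slide can be illustrated as follows. The sketch compares the ratio dx/dv estimated from time-series data with the ratio implied by a candidate law f(x, v) = const through implicit differentiation. The data are synthetic (an ideal oscillator with unit mass and stiffness), and the mean-log error, the threshold `eps`, and the function name `derivative_mismatch` are assumptions made for this example rather than details taken from the slides.

```python
import numpy as np

def derivative_mismatch(x, v, dt, df_dx, df_dv, eps=1e-3):
    """Mean log error between the data-derived dx/dv and the ratio implied by f(x, v) = const."""
    dx_dt = np.gradient(x, dt)            # numerical time derivatives from the data
    dv_dt = np.gradient(v, dt)
    fx, fv = df_dx(x, v), df_dv(x, v)     # partial derivatives of the candidate law
    keep = (np.abs(dv_dt) > eps) & (np.abs(fx) > eps)   # skip samples that would divide by ~0
    from_data = dx_dt[keep] / dv_dt[keep]
    from_law = -fv[keep] / fx[keep]       # implicit differentiation of f(x, v) = const
    return float(np.mean(np.log1p(np.abs(from_data - from_law))))

# Synthetic oscillator trajectory (assumed, not measured data).
dt = 0.001
t = np.arange(0.0, 10.0, dt)
x, v = np.cos(t), -np.sin(t)

# Candidate 1: f = x^2 + v^2 (energy-like). Candidate 2: f = x + v (not conserved).
energy_like = derivative_mismatch(x, v, dt, lambda x, v: 2 * x, lambda x, v: 2 * v)
wrong_guess = derivative_mismatch(x, v, dt,
                                  lambda x, v: np.ones_like(x),
                                  lambda x, v: np.ones_like(v))
print(energy_like, wrong_guess)
```

The energy-like candidate predicts the derivative ratio seen in the data, so its mismatch is close to zero, while the incorrect candidate scores much worse.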

Page 6: Algorithm to Detect Conservation Laws

• One can control the type of law found, to an extent, by choosing which variables to provide to the algorithm
• If we provide velocities, the algorithm is biased to find energy laws
• If we additionally supply accelerations, the algorithm is biased to find force identities and equations of motion
• Given other types of variables, the algorithm may find other, possibly previously unknown, analytical laws

Method

Page 7: Data Collection

• The paper collected data from typical systems: an air-track oscillator and a double pendulum
• Motion-tracking cameras and software were used
  – Infrared markers are placed on the experimental device
  – Its dynamics are captured
  – Motion-tracking software produces time-series data of 3-dimensional Euclidean position coordinates for each infrared marker

Experiments
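Marker positions have to be turned into state variables such as angles and angular velocities before the search can use them. The sketch below shows one way this could be done for a single pendulum arm. It assumes planar motion in the x-y plane with y pointing up, a known pivot position, and a fixed sampling interval; the function name `pendulum_state` and the synthetic marker track are hypothetical.

```python
import numpy as np

def pendulum_state(marker_xyz, pivot_xyz, dt):
    """Angle from the downward vertical and angular velocity from a 3-D marker track."""
    rel = np.asarray(marker_xyz, dtype=float) - np.asarray(pivot_xyz, dtype=float)
    theta = np.unwrap(np.arctan2(rel[:, 0], -rel[:, 1]))   # unwrap removes 2*pi jumps
    omega = np.gradient(theta, dt)                          # finite-difference angular velocity
    return theta, omega

# Synthetic marker track for illustration: arm length 0.3, swinging as 0.5*sin(2t).
dt = 0.01
t = np.arange(0.0, 5.0, dt)
true_theta = 0.5 * np.sin(2 * t)
markers = 0.3 * np.column_stack([np.sin(true_theta), -np.cos(true_theta), np.zeros_like(t)])

theta, omega = pendulum_state(markers, pivot_xyz=[0.0, 0.0, 0.0], dt=dt)
print(theta[:3], omega[:3])
```

With real measurements the finite differences would amplify noise, so some smoothing before differentiation would likely be needed; that step is omitted here.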

Page 8: Setting

• Two configurations of the air track
  – Two-spring single-mass
    • Minimal noise
  – Three-spring double-mass
    • Considerable noise
• Two configurations of a pendulum
  – A single pendulum
  – A double pendulum
    • Higher measurement noise

Experiments

Page 9: Summary of Laws Inferred

Experiments

Page 10: Summary of Laws Inferred

• Given position and velocity data over time
  – The algorithm converged on the energy laws of each system (Hamiltonian and Lagrangian equations)
• Given acceleration data as well
  – It produced the differential equations of motion corresponding to Newton's second law for the harmonic oscillator and pendulum systems
• Given only position data for the pendulum
  – The algorithm converged on the equation of a circle, indicating that the pendulum is confined to a circle
• In the absence of appropriate building blocks, the algorithm developed approximations
  – For example, eliminating cosine but not sine drove the algorithm to converge on the equality cos(θ) = sin(θ + π/2) or more complex equivalences

One can control the type of law

Experiments

Page 11: Accuracy/Complexity Tradeoff

• Consider the relationship between equation complexity and accuracy
  – At one extreme: extremely complex equations with near-perfect accuracy
    • Taylor series, neural networks, and Fourier series
  – At the other: simple, single-parameter models with baseline accuracy
• The Pareto front for the double pendulum
  – The equation at the cliff corresponds to the exact energy conservation law
  – A dramatic jump in accuracy means that the equation captures some significant relationship of the system

Experiments
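The accuracy/complexity tradeoff can be made concrete with a small Pareto-front computation. The candidate list below is entirely made up for illustration (names, complexities, and errors are hypothetical), and "cliff" is interpreted here as the front member with the largest error drop relative to its simpler neighbour, which is one plausible reading of the slide rather than the paper's definition.

```python
def pareto_front(candidates):
    """Candidates not dominated in (complexity, error), ordered by increasing complexity."""
    front = []
    best_error = float("inf")
    for name, complexity, err in sorted(candidates, key=lambda c: (c[1], c[2])):
        if err < best_error:              # strictly more accurate than everything simpler
            front.append((name, complexity, err))
            best_error = err
    return front

def cliff(front):
    """Front member with the largest error drop relative to the next-simpler member."""
    drops = [(prev[2] - cur[2], cur) for prev, cur in zip(front, front[1:])]
    return max(drops, key=lambda d: d[0])[1]

# Hypothetical (name, complexity, error) triples.
candidates = [
    ("constant", 1, 0.90),
    ("a*x", 3, 0.80),
    ("a*x + b", 5, 0.85),          # dominated: more complex and less accurate than "a*x"
    ("energy law", 9, 0.10),
    ("long series", 25, 0.08),
    ("longer series", 30, 0.09),   # dominated by "long series"
]

front = pareto_front(candidates)
print([name for name, _, _ in front])
print(cliff(front)[0])
```

Choosing the equation at the cliff favours the simplest expression that explains most of the data, instead of the most accurate one overall.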

Page 12: Time to Detect Solutions

• The computation time increases with the dimensionality (number of variables), the complexity of the law equation, and noise
  – In the worst case, the time to converge on the law equations
    • Depends exponentially on the complexity of the law expression itself, and
    • Depends roughly quadratically on the system dimensionality
    • The bootstrapped double pendulum is an exception
  – In a 32-core implementation, the time required ranged from a few minutes (the harmonic oscillator) to 30 hours (the double pendulum)
  – Noise substantially reduces the ability to find accurate law equations
    • It either simply requires more time to compute, or
    • It obscures the law equation entirely, depending on the noise strength

Experiments

Page 13: Bootstrapping

• Bootstrapping reduced the search time from 30 to 40 hours of computation to 7 to 8 hours
• It uses the terms from simpler systems as a seed
• We can guess that bootstrapping may be critical for detecting laws in higher-order systems that are veiled in complexity

Experiments
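Seeding a search with terms from a simpler system might look like the sketch below. It takes a law already found for a single pendulum, renames its variables once per arm of the double pendulum, and collects every sub-expression as a starting term. The expression encoding (nested tuples), the placeholder law with its arbitrary coefficient, the variable names `t1`, `w1`, `t2`, `w2`, and the helper names are all assumptions for illustration; the slides do not say how the seeding was implemented.

```python
def subterms(expr):
    """Yield every sub-expression of a nested-tuple expression tree."""
    yield expr
    if isinstance(expr, tuple):
        for child in expr[1:]:
            yield from subterms(child)

def rename(expr, mapping):
    """Rename variables in an expression tree, e.g. the angle "t" to "t1"."""
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(rename(child, mapping) for child in expr[1:])
    return mapping.get(expr, expr)

def bootstrap_seeds(simple_law, mappings):
    """Compound terms of the simpler law, rewritten for each part of the larger system."""
    seeds = []
    for mapping in mappings:
        for term in subterms(rename(simple_law, mapping)):
            if isinstance(term, tuple) and term not in seeds:
                seeds.append(term)
    return seeds

# Placeholder single-pendulum law: w*w + (-2.0)*cos(t); the coefficient is arbitrary.
single_pendulum = ("+", ("*", "w", "w"), ("*", -2.0, ("cos", "t")))
arms = [{"t": "t1", "w": "w1"}, {"t": "t2", "w": "w2"}]

seeds = bootstrap_seeds(single_pendulum, arms)
for term in seeds:
    print(term)
```

These seed terms would be added to the otherwise random initial population, so that the search starts from pieces that already proved useful on the simpler system.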

Page 14: Conclusion

• Summary
  – This paper demonstrated the discovery of physical laws directly from experimentally captured data with the use of a computational search
  – The search was used to detect nonlinear energy conservation laws, Newtonian force laws, geometric invariants, and system manifolds
• Discussion
  – The concise analytical expressions that were found
    • Are amenable to human interpretation and
    • Help to reveal the physics underlying the observed phenomenon
  – This process will not diminish the role of future scientists, but will help them focus on interesting phenomena more rapidly