matchmaker, matchmaker, make me a match: migration of populations via marriages in the past

58
Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past The First Wednesday Multidisciplinary Forum @ (November 5, 2014) SHL (이상훈), R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014). Sang Hoon Lee (이상훈) Department of Energy Science (에너지과학과) Sungkyunkwan University (성균관대학교) [email protected] http://sites.google.com/site/lshlj82

Upload: sang-hoon-lee

Post on 08-Jul-2015

311 views

Category:

Science


3 download

DESCRIPTION

Matchmaker, Matchmaker, Make Me a Match: 
Migration of Populations via Marriages in the Past @ The First Wednesday Multidisciplinary Forum, Korea Advanced Institute of Science and Technology (KAIST), November 5, 2014

TRANSCRIPT

Page 1: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

The First Wednesday Multidisciplinary Forum @ (November 5, 2014)

SHL (이상훈), R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Sang Hoon Lee (이상훈) Department of Energy Science (에너지과학과)

Sungkyunkwan University (성균관대학교) [email protected]

http://sites.google.com/site/lshlj82

Page 2: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

from https://twitter.com/AcademicsSay/status/524571824492150784

Page 3: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

from https://twitter.com/AcademicsSay/status/524571824492150784

Page 4: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Kim from Gyeongju (경주 김) ≠ Kim from Gimhae (김해 김)

Korean family names are subdivided into 본관 named after their geographical origins

Page 5: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Kim from Gyeongju (경주 김) ≠ Kim from Gimhae (김해 김)

Korean family names are subdivided into 본관 named after their geographical origins

Page 6: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Tales of two Korean researchers’ origins

Sang Hoon Lee (이상훈)Beom Jun Kim (김범준)Kim from Gimhae (김해 김)

Lee from Hakseong (학성 이)

Page 7: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Tales of two Korean researchers’ origins

Sang Hoon Lee (이상훈)Beom Jun Kim (김범준)Kim from Gimhae (김해 김)

Lee from Hakseong (학성 이)

Page 8: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Tales of two Korean researchers’ origins

Sang Hoon Lee (이상훈)Beom Jun Kim (김범준)Kim from Gimhae (김해 김)

Lee from Hakseong (학성 이)

Page 9: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Tales of two Korean researchers’ origins

Sang Hoon Lee (이상훈)Beom Jun Kim (김범준)Kim from Gimhae (김해 김)

Lee from Hakseong (학성 이)

1

수로왕

가락국 수로왕 駕 洛 國 首 露 王

가락국의 초대 국왕

본명 김수로(金 首 露 )

재위 42년 ~ 199년

왕후 허황옥(許 黃 玉 )

모후 정견모주

전임자 (초대 군주)

다음 왕 거등왕

김수로왕의 무덤

수로왕(首 露 王 , 42년?[1] ~ 199년, 재위: 42년 ~ 199년) 또는 김수로(金 首 露 )는 가락국 시조이자 김해 김씨의 시조이다. 일명 수릉(首 陵 ), 뇌실청예(惱 室 靑 裔 ) 등으로 불리기도 한다.

신라 유리왕 19년(42년) 가락국 북쪽 귀지봉(또는 구지봉, 龜 旨 峰 )에 하늘로부터 떨어진 6개의 금란(金 卵 )이 모두 변하여 6가야국의 왕이 되었다고 하는데 김수로도 그 가운데 하나로, 김해 김씨의 시조이다. 수장(首 長 ) 구도간(九 刀 干 )들이 왕으로 추대하였으며, 나라를 세워 가락국이라 하였다.

삼국사기와 삼국유사 가락국기에 의하면 하늘에서 내려온 6개의 황금 알 중 가장 먼저 깨어난 9척(약 2m[2])의 소년이 수로왕이 되었다고 하나, 신라의 최치원은 그가 정견모주라는 여성의 아들이었다고 한다. 전설에 의하면 150세 이상을 생존했다고 하나 이는 신빙성이 낮다.

생애 《 삼국사기》 와 《 삼국유사》 가락국기에 의하면 변한의 구야국에는 주민들이 각 촌락별로 나뉘어 생활하고 있었다. 그런데 42년 3월 부족장을 기다리는 구야국의 지도자들에게 "너희의 왕을 내려 보낸다"는 계시와 함께, "거북아 거북아 머리를 내놓아라. 그러지 않으면 구워서 먹으리"라는 노래를 부르라는 소리가 하늘에서 들려왔다. 하늘의 계시를 들은 부족장들은 가락국의 9간(干 ) 이하 수백 명이 김해의 구지봉(龜 旨 峰 )에 올라가 하늘에 제사를 지내고 춤을 추면서 하늘에서 들려온 말대로 "거북아 거북아 머리를 내놓아라. 그러지 않으면 구워서 먹으리"라고 구지가(龜 旨 歌 )를 불렀다.부족 주민들의 수가 늘어나 노래소리가 커지자 하늘에서 빛이 나더니 곧 붉은 보자기에 싸인 금빛 상자가 내려오고, 그 안에 둥근 황금색 알 여섯 개가 들어있었다. 12일 후 이들 알에서 사내아이들이 태어났는데, 그 가운데 키가 9척이며 제일 알에서 깨어난 아이가 수로였다. 부족장들은 그를 6가야 중 수도이자 영토가 넓은 가락국의 왕으로 추대하여 주민들은 그를 가락국의 왕으로 받들었고, 또한 나머지 아이들도 각각 5가야의 왕이 되었다고 한다.

신라의 학자 최치원의 석이정전(釋 利 貞 傳 )에 따르면 가야산의 여신 정견모주(正 見 母 主 )가 하늘의 신(神 )인 이비가(夷 毗 訶 )에 감응하여 두 아들을 낳았는데 한 명은 뇌질주일(惱 室 朱 日 )이었고, 다른 한 명은 뇌질청예(惱 窒 靑 裔 )였다. 금관가야의 시조가 된 뇌질청예(김수로왕)는 여신 정견모주를 닮아 얼굴이 희고 갸름했으며 대가야의 시조가 된 뇌질주일(이진아시왕)은 이비가를 닮아 얼굴이 해와 같이 둥글고 붉었다고 한다. 이는 금관가야의 시조인 수로왕이 맏형이었다고 한 ≪가락국기≫에 전하는 금관가야 중심의 형제설화와는 대비되는 것이다.

(1373) 1

이예 (1373년)이예(李 藝 , 1373년 ~ 1445년)는 조선 초의 무신, 외교관이다. 본관은 학성 이씨(鶴 城 ). 아호는 학파(鶴 坡 ). 시호는 충숙(忠 肅 )이며, 학성 이씨의 시조이다. 중인 계급에 속하는 아전으로 관리 생활을 시작했다.이후 정2품에 해당되는 관직인 동지중추원사의 자리까지 오르기도 했으며, 유고로 《 학파실기(鶴 坡 實 紀 )》가 있다.

이예는 울산의 향리(蔚 山 群 吏 )이며, 직책은 기관(記 官 )[1]이었다.

—조선왕조실록

생애 1397년 1월 31일 3천명의 왜구들이 울주포[2]에 침입하여 군수 이은(李 殷 ) 등을 사로잡아 돌아갔다.[3] 다른 관리들이 모두 도망가 숨은데 비해 이예는 자진해 군수를 따라가 끝까지 보필해 해적들을 감복시켰다. 후일 조선에서 파견한 통신사의 중재로 1397년 2월 이예는 군수와 함께 무사히 조선으로 돌아왔다.조정에서는 이예의 충성을 가상히 여겨 아전의 역(役 )을 면제하고 벼슬을 주었다. 이 사건을 계기로 이예는 중인 계층의 아전 신분에서 벗어나 사대부 양반으로서의 전문 외교관의 길을 걷게 된다. 8세 때 해적에게 잡혀간 어머니를 찾기 위해 조정에 청해 1400년 회례사(回 禮 使 ) 윤명(尹 銘 )의 수행원으로 대마도에 갔으나 찾지 못했다.위키백과:출처 밝히기

처음으로 사절의 책임을 맡은 것은 1401년(태종1년)으로, 보빙사로 일기도에 파견되었다. 1406년 일본 회례관(日 本 回 禮 官 )으로 파견되어, 납치되었던 남녀 70여 명을 데리고 돌아왔다.[4] 1416년 1월 27일 유구국[5]으로 가서, 왜에 의해 포로가 되었다가 유구국으로 팔려간 백성을 데려오기 위해 유구국으로 파견되었다.[6] 그는 유구국에서 44인을 데리고, 같은 해 7월 23일 귀국하였다. [7]

1418년 4월 24일 태종 18년 대마도 수호 종정무가 사망하자, 조의 사절로 대마도에 파견되어 쌀, 콩, 종이를 부의하여 그의 충성을 후사하였다. 종정무는 치세 기산 도적을 금제하여, 변경을 침범하지 못하게 했다는 이유로 특별히 이예를 파견한 것이다. [8]

“ 모르는 사람은 보낼 수 없어서, 이에 그대를 명하여 보내는 것이니, 귀찮다 생각하지 말라. ” —《 세종실록》 , 세종8년에 통신사로 일본으로 떠나는 이예에게 임금이 갓과 신을 하사하며 당부하는 말

1443년 세종 25년 왜적이 변방에 도적질하여 사람과 물건을 약탈해 갔으므로 나라에서 사람을 보내서 찾아오려 하니, 이예가 자청하여 대마도 체찰사(對 馬 島 體 察 使 )로 파견되었다. 이것이 마지막 사행(使 行 )이었다.28세 나던 1400년에서 71세 나던 1443년까지 44년간 40여회 일본[9]에 임금의 사절로 파견되었다. 그 중 조선왕조실록에 기록되어 사행(使 行 )의 내용이 구체적으로 알려진 것만 해도 13회에 달한다[10].조선왕조실록에는 44년간의 사행에서 이예가 쇄환해온 조선인 포로의 수가 667명이라고 기록되어 있다.

Page 10: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

family/origin population density as a percentage of total population density

Really?

from the census data (in 1985 and 2000—the only two years when the individual 본관’s regional data is available) showing the list of populations of 본관 for each administrative regions of South Korea, provided by Korean Statistical Information Service (KOSIS): http://kosis.kr/

Page 11: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

family/origin population density as a percentage of total population density

Really?

from the census data (in 1985 and 2000—the only two years when the individual 본관’s regional data is available) showing the list of populations of 본관 for each administrative regions of South Korea, provided by Korean Statistical Information Service (KOSIS): http://kosis.kr/

“ergodic” vs “non-ergodic” 본관 ...

Page 12: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

족보 in Korean culturefamily tree of (mostly) male descendants

spouses’ 본관 and birth year: “by-product” data sets

Page 13: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

족보 in Korean culturefamily tree of (mostly) male descendants

spouses’ 본관 and birth year: “by-product” data sets

10

10-9

10-6

10-3

100

100 102 104 106 108

P(k

)

k

1900-19901600-1630

Census

Figure 6. Comparison between the actual family books (points) and RGFpredictions (lines).

1

1.1

1.2

1.3

102 104 106 108

γ

M

Figure 7. Power-law exponent as a function of M . The solid line was obtainedby analyzing the family books from 1900–1990, and the dotted line is connectedwith the census data in 2000.

Mtot = 165 020 women each having one family name out of Ntot = 194 and where kmax = 32 316are named Kim. These three numbers Mtot, Ntot and kmax uniquely determine PM(k) within theRGF model, as explained in section 3 [6]. The middle full curve in figure 6 gives the predictedsize distribution and the pluses denote the actual (binned) data points. The agreement betweenthe RGF prediction and the data is very good, in particular in view of the fact that the predictionis based solely on the three numbers Mtot, Ntot and kmax. The prediction for the exponent � in(1) is � = 1.12. As explained in section 3, the RGF model allows you to predict how the PM(k)for either a smaller m < M or larger m > M . Figure 7 displays the predicted change of � whenstarting from the data given for the period 1900–1990. The solid curve gives the change whenm is decreasing and the dotted line when it is increasing. The fact that � changes with the sizeof the data set is a fundamental feature of the RGF model and distinguishes it from the usualgrowth models, which in general give scale-invariant and hence size-independent � [6]. The leftfull curve in figure 6 is the prediction for the 1600–1630 data only using the data for 1900–1990and the number m = 384, which is the number of women getting married into the ten familiesduring the period 1600–1630. The actual name-frequency distribution for these women is givenby the crosses and the agreement is again quite good. In particular, note that the data are indeedconsistent with the slightly steeper slope for smaller k caused by a slightly larger � = 1.22(compare figure 7). The rightmost curve in figure 6, in the same way, gives the prediction basedon only using the data for the married women in 1900–1990 and the total population size in theyear 2000 given by m = 4.6 ⇥ 107. The census data from the year 2000 are also plotted and theagreement between the prediction and the data is again very good. This time the � = 1.07 is

New Journal of Physics 13 (2011) 073036 (http://www.njp.org/)

S. K. Baek (백승기), P. Minnhagen, and B. J. Kim (김범준), New J. Phys. 13, 073036 (2011) and references therein.

Page 14: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

족보 in Korean culturefamily tree of (mostly) male descendants

spouses’ 본관 and birth year: “by-product” data sets

10

10-9

10-6

10-3

100

100 102 104 106 108

P(k

)

k

1900-19901600-1630

Census

Figure 6. Comparison between the actual family books (points) and RGFpredictions (lines).

1

1.1

1.2

1.3

102 104 106 108

γ

M

Figure 7. Power-law exponent as a function of M . The solid line was obtainedby analyzing the family books from 1900–1990, and the dotted line is connectedwith the census data in 2000.

Mtot = 165 020 women each having one family name out of Ntot = 194 and where kmax = 32 316are named Kim. These three numbers Mtot, Ntot and kmax uniquely determine PM(k) within theRGF model, as explained in section 3 [6]. The middle full curve in figure 6 gives the predictedsize distribution and the pluses denote the actual (binned) data points. The agreement betweenthe RGF prediction and the data is very good, in particular in view of the fact that the predictionis based solely on the three numbers Mtot, Ntot and kmax. The prediction for the exponent � in(1) is � = 1.12. As explained in section 3, the RGF model allows you to predict how the PM(k)for either a smaller m < M or larger m > M . Figure 7 displays the predicted change of � whenstarting from the data given for the period 1900–1990. The solid curve gives the change whenm is decreasing and the dotted line when it is increasing. The fact that � changes with the sizeof the data set is a fundamental feature of the RGF model and distinguishes it from the usualgrowth models, which in general give scale-invariant and hence size-independent � [6]. The leftfull curve in figure 6 is the prediction for the 1600–1630 data only using the data for 1900–1990and the number m = 384, which is the number of women getting married into the ten familiesduring the period 1600–1630. The actual name-frequency distribution for these women is givenby the crosses and the agreement is again quite good. In particular, note that the data are indeedconsistent with the slightly steeper slope for smaller k caused by a slightly larger � = 1.22(compare figure 7). The rightmost curve in figure 6, in the same way, gives the prediction basedon only using the data for the married women in 1900–1990 and the total population size in theyear 2000 given by m = 4.6 ⇥ 107. The census data from the year 2000 are also plotted and theagreement between the prediction and the data is again very good. This time the � = 1.07 is

New Journal of Physics 13 (2011) 073036 (http://www.njp.org/)

S. K. Baek (백승기), P. Minnhagen, and B. J. Kim (김범준), New J. Phys. 13, 073036 (2011) and references therein.

geographical information from 본관: honestly, too valuable pieces of information to overlook (and just be satisfied with the distributions as if we don’t know anything about them), don’t you think? :)

Page 15: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Marriage record in 족보 as a proxy of human migration due to marriage

Page 16: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Marriage record in 족보 as a proxy of human migration due to marriage

Page 17: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Studies on human migration patterns: two types of generative models

where mi is the population at i and rij is the distance between i and j

J. Q. Stewart and W. Warntz, J. Regional Sci. 1, 99 (1958);W.-S. Jung (정우성), F. Wang, and H. E. Stanley, EPL 81, 48005 (2008).

(i ! j) where Ri is the total population outgoing from i,sij is the population in the circle of radius rij centered at i(excluding the source i and destination j’s population),and N is the total population

F. Simini, M. C. González, A. Maritan, and A.-L. Barabási, Nature 484, 96 (2012); A. P. Masucci, J. Serras, A. Johansson, and M. Batty, Phys. Rev. E 88, 022812 (2013).

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ijcontrolling the effect of geography (or lack thereof)gravity model flux

radiation model flux

Page 18: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

In practice . . .

이(李) 진주 12636이(李) 진해 490이(李) 차성 2173이(李) 창녕 4132이(李) 창원 719이(李) 천안 2297이(李) 철성 4871이(李) 청도 983이(李) 청송 814이(李) 청안 13549이(李) 청양 743이(李) 청주 34756이(李) 청평 656이(李) 청해 12002이(李) 춘천 372이(李) 충남 995이(李) 충주 4022이(李) 칠성 1209이(李) 태안 4084이(李) 태원 670이(李) 토산 491이(李) 통진 227이(李) 평산 3394이(李) 평양 579이(李) 평창 65945이(李) 평택 1099이(李) 평해 205이(李) 풍천 731이(李) 하동 1052이(李) 하빈 15058이(李) 하산 828이(李) 하음 431이(李) 학성 20964이(李) 한산 136615이(李) 한성 1342이(李) 한양 1096이(李) 함경 792이(李) 함안 37597이(李) 함양 1851이(李) 함평 125419이(李) 함풍 6413이(李) 함흥 786이(李) 합천 115462이(李) 해남 878이(李) 해령 650이(李) 해주 2064이(李) 행주 870이(李) 헌양 786이(李) 홍성 1163이(李) 홍주 14897이(李) 홍천 566이(李) 화산 1775이(李) 화평 1498이(李) 황주 291

the census data (in 1985 and 2000) showing the list of populations of 본관 for each administrative regions of South Korea, provided by Korean Statistical Information Service (KOSIS): http://kosis.kr/

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

Page 19: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

In practice . . .

이(李) 진주 12636이(李) 진해 490이(李) 차성 2173이(李) 창녕 4132이(李) 창원 719이(李) 천안 2297이(李) 철성 4871이(李) 청도 983이(李) 청송 814이(李) 청안 13549이(李) 청양 743이(李) 청주 34756이(李) 청평 656이(李) 청해 12002이(李) 춘천 372이(李) 충남 995이(李) 충주 4022이(李) 칠성 1209이(李) 태안 4084이(李) 태원 670이(李) 토산 491이(李) 통진 227이(李) 평산 3394이(李) 평양 579이(李) 평창 65945이(李) 평택 1099이(李) 평해 205이(李) 풍천 731이(李) 하동 1052이(李) 하빈 15058이(李) 하산 828이(李) 하음 431이(李) 학성 20964이(李) 한산 136615이(李) 한성 1342이(李) 한양 1096이(李) 함경 792이(李) 함안 37597이(李) 함양 1851이(李) 함평 125419이(李) 함풍 6413이(李) 함흥 786이(李) 합천 115462이(李) 해남 878이(李) 해령 650이(李) 해주 2064이(李) 행주 870이(李) 헌양 786이(李) 홍성 1163이(李) 홍주 14897이(李) 홍천 566이(李) 화산 1775이(李) 화평 1498이(李) 황주 291

the census data (in 1985 and 2000) showing the list of populations of 본관 for each administrative regions of South Korea, provided by Korean Statistical Information Service (KOSIS): http://kosis.kr/

population from 2000 census data

marriage “flux” in 족보

distance between two population centroids in 2000

brides’ 본관 index i

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

Page 20: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

Gravity & radiation models

linear fit: marriage flux Ni / Gij

linear fit: marriage flux Ni / Rij

↵ ⇡ 1.0749 and � ⇡ �0.0349! ↵ = 1 and � = 0 (to include the case i = j)

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

Page 21: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

No dependency on the distance and ? “population-product model” (the 본관 of the 족보 is ergodic, so the husbands could be almost everywhere in the country, which made the geographic factors irrelevant)

/ m↵i

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

Gravity & radiation models

linear fit: marriage flux Ni / Gij

linear fit: marriage flux Ni / Rij

↵ ⇡ 1.0749 and � ⇡ �0.0349! ↵ = 1 and � = 0 (to include the case i = j)

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

Page 22: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

No dependency on the distance and ? “population-product model” (the 본관 of the 족보 is ergodic, so the husbands could be almost everywhere in the country, which made the geographic factors irrelevant)

/ m↵i

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

Gravity & radiation models

linear fit: marriage flux Ni / Gij

linear fit: marriage flux Ni / Rij

↵ ⇡ 1.0749 and � ⇡ �0.0349! ↵ = 1 and � = 0 (to include the case i = j)

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

outlier: women from the 본관 of 족보 are severely underrepresented. In fact, marrying a spouse from the same 본관 had been considered as a taboo in Korean culture and even illegal in South Korea until 2005!

Page 23: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

No dependency on the distance and ? “population-product model” (the 본관 of the 족보 is ergodic, so the husbands could be almost everywhere in the country, which made the geographic factors irrelevant)

/ m↵i

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

Gravity & radiation models

linear fit: marriage flux Ni / Gij

linear fit: marriage flux Ni / Rij

↵ ⇡ 1.0749 and � ⇡ �0.0349! ↵ = 1 and � = 0 (to include the case i = j)

grooms’ 본관 index j: identical (constant) for a single 족보 by definition

radiation model: flux Rij =Ri

1�mi/N

mimj

(mi + sij)(mi +mj + sij)

gravity model: flux Gij = Am↵

i m�j

r�ij

outlier: women from the 본관 of 족보 are severely underrepresented. In fact, marrying a spouse from the same 본관 had been considered as a taboo in Korean culture and even illegal in South Korea until 2005!

a simple measure of ergodicity: number of different regions occupied by 본관

“ergodic” vs “non-ergodic” 본관

Page 24: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

“ergodic” vs “non-ergodic” 본관

6

0 2000

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(cl

an

s) (a)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(cl

an

s) (c)

0 2000

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(b)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(d)

FIG. 3. Distribution of the number of di↵erent administrative regionsoccupied by clans. (a) Probability distribution of the number of dif-ferent administrative regions occupied by a Korean clan in the year2000. (b) Probability distribution of the number of di↵erent admin-istrative regions occupied by the clan of a Korean individual selecteduniformly at random in the year 2000. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions. Note thatthe rightmost bar has a height of 0.17 but has been truncated for vi-sual presentation. (c) Probability distribution of radii of gyration (inkm) for clans in 2000. (d) Probability distribution of radii of gyra-tion (in km) for clans of a Korean individual selected uniformly atrandom in 2000. The di↵erence between this panel and the previ-ous one arises from the fact that clans with larger populations tendto occupy more administrative regions. Solid curves are kernel den-sity estimates (from Matlab R2011a’s ksdensity function with aGaussian smoothing kernel of width 5).

have jokbo are fairly ergodic, so the variables associated withthe j indices (i.e., the grooms) in Eqs. (1) and (2) have alreadylost much of their geographical precision, which is consistentboth with the values ↵ = 0 and � = 0 (the population prod-uct model). Again see the scatter plots in Fig. 2, in which wecolor each clan according to the number of di↵erent admin-istrative regions that it occupies. Note that the three di↵erentergodicity diagnostics are only weakly correlated (see Fig. I2).

Our observations of clan bimodality for Korea contrastsharply with our observations for family names in theCzech republic, where most family names appear to be non-ergodic [25] (see Fig. I3). One possible explanation of theubiquity of ergodic Korean names is the historical fact thatmany families from the lower social classes adopted (or evenpurchased) names of noble clans from the upper classes nearthe end of the Joseon dynasty (19th–20th centuries) [20, 52].At the time, Korean society was very unstable, and this pro-cess might have, in essence, introduced a preferential growthof ergodic names.

In Fig. 4, we show the distribution of the di↵usion constants

−5 200

0.7

diffusion constant (km2/year)

pro

ba

bili

ty d

istr

ibu

tion

FIG. 4. Distribution of estimated di↵usion constants (in km2/year)computed using 1985 and 2000 census data and Eq. (3). Thesolid curve is a kernel density estimate (from Matlab R2011a’sksdensity function with default smoothing). See the Appendixfor details of the calculation of di↵usion constants.

that we computed by fitting to Eq. (3). Some of the values arenegative, which presumably arises from finite-size e↵ects inergodic clans as well as basic limitations in estimating di↵u-sion constants using only a pair of nearby years. In Fig. I4,we show the correlations between the di↵usion constants andother measures.

C. Convection in Addition to Di↵usion as Another Mechanismfor Migration

The assumption that human populations simply di↵use is agross oversimplification of reality. We will thus consider theintriguing (but still grossly oversimplified) possibility of si-multaneous di↵usive and convective (bulk) transport. In thepast century, a dramatic movement from rural to urban areashas caused Seoul’s population to increase by a factor of morethan 50, tremendously outpacing Korea’s population growthas a whole [53]. This suggests the presence of a strong at-tractor or “sink” for the bulk flow of population into Seoul, ashas been discussed in rural-urban labor migration studies [54].The density-equalizing population cartogram [55] in Fig. I5clearly demonstrates the rapid growth of Seoul and its sur-roundings between 1970 and 2010.

If convection (i.e., bulk flow) directed towards Seoul hasindeed occurred throughout Korea while clans were simulta-neously di↵using from their points of origin, then one oughtto be able to detect a signature of such a flow. In Fig. 5(a),we show what we believe is such a signature: we observethat the fraction of ergodic clans increases with the distancebetween Seoul and a clan’s place of origin. This would be un-expected for a purely di↵usive system or, indeed, in any othersimple model that excludes convective transport. By allow-ing for bulk flow, we expect to observe that a clan’s mem-bers preferentially occupy territory in the flow path that islocated geographically between the clan’s starting point andSeoul. For clans that start closer to Seoul, this path is short;for those that start farther away, the longer flow path ought

Distribution of administrative regions occupied by 본관

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

ergodic

non-ergodic

Page 25: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

“ergodic” vs “non-ergodic” 본관

6

0 2000

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(cl

an

s) (a)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(cl

an

s) (c)

0 2000

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(b)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(d)

FIG. 3. Distribution of the number of di↵erent administrative regionsoccupied by clans. (a) Probability distribution of the number of dif-ferent administrative regions occupied by a Korean clan in the year2000. (b) Probability distribution of the number of di↵erent admin-istrative regions occupied by the clan of a Korean individual selecteduniformly at random in the year 2000. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions. Note thatthe rightmost bar has a height of 0.17 but has been truncated for vi-sual presentation. (c) Probability distribution of radii of gyration (inkm) for clans in 2000. (d) Probability distribution of radii of gyra-tion (in km) for clans of a Korean individual selected uniformly atrandom in 2000. The di↵erence between this panel and the previ-ous one arises from the fact that clans with larger populations tendto occupy more administrative regions. Solid curves are kernel den-sity estimates (from Matlab R2011a’s ksdensity function with aGaussian smoothing kernel of width 5).

have jokbo are fairly ergodic, so the variables associated withthe j indices (i.e., the grooms) in Eqs. (1) and (2) have alreadylost much of their geographical precision, which is consistentboth with the values ↵ = 0 and � = 0 (the population prod-uct model). Again see the scatter plots in Fig. 2, in which wecolor each clan according to the number of di↵erent admin-istrative regions that it occupies. Note that the three di↵erentergodicity diagnostics are only weakly correlated (see Fig. I2).

Our observations of clan bimodality for Korea contrastsharply with our observations for family names in theCzech republic, where most family names appear to be non-ergodic [25] (see Fig. I3). One possible explanation of theubiquity of ergodic Korean names is the historical fact thatmany families from the lower social classes adopted (or evenpurchased) names of noble clans from the upper classes nearthe end of the Joseon dynasty (19th–20th centuries) [20, 52].At the time, Korean society was very unstable, and this pro-cess might have, in essence, introduced a preferential growthof ergodic names.

In Fig. 4, we show the distribution of the di↵usion constants

−5 200

0.7

diffusion constant (km2/year)

pro

ba

bili

ty d

istr

ibu

tion

FIG. 4. Distribution of estimated di↵usion constants (in km2/year)computed using 1985 and 2000 census data and Eq. (3). Thesolid curve is a kernel density estimate (from Matlab R2011a’sksdensity function with default smoothing). See the Appendixfor details of the calculation of di↵usion constants.

that we computed by fitting to Eq. (3). Some of the values arenegative, which presumably arises from finite-size e↵ects inergodic clans as well as basic limitations in estimating di↵u-sion constants using only a pair of nearby years. In Fig. I4,we show the correlations between the di↵usion constants andother measures.

C. Convection in Addition to Di↵usion as Another Mechanismfor Migration

The assumption that human populations simply di↵use is agross oversimplification of reality. We will thus consider theintriguing (but still grossly oversimplified) possibility of si-multaneous di↵usive and convective (bulk) transport. In thepast century, a dramatic movement from rural to urban areashas caused Seoul’s population to increase by a factor of morethan 50, tremendously outpacing Korea’s population growthas a whole [53]. This suggests the presence of a strong at-tractor or “sink” for the bulk flow of population into Seoul, ashas been discussed in rural-urban labor migration studies [54].The density-equalizing population cartogram [55] in Fig. I5clearly demonstrates the rapid growth of Seoul and its sur-roundings between 1970 and 2010.

If convection (i.e., bulk flow) directed towards Seoul hasindeed occurred throughout Korea while clans were simulta-neously di↵using from their points of origin, then one oughtto be able to detect a signature of such a flow. In Fig. 5(a),we show what we believe is such a signature: we observethat the fraction of ergodic clans increases with the distancebetween Seoul and a clan’s place of origin. This would be un-expected for a purely di↵usive system or, indeed, in any othersimple model that excludes convective transport. By allow-ing for bulk flow, we expect to observe that a clan’s mem-bers preferentially occupy territory in the flow path that islocated geographically between the clan’s starting point andSeoul. For clans that start closer to Seoul, this path is short;for those that start farther away, the longer flow path ought

Distribution of administrative regions occupied by 본관

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clanin 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan (“Hakseong” is the old name of the city) on the right. In this map, we use the2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the correspondingcentroid in 2000 using the gravitational-model flux with parameters ↵ = 1 and � = 0. Wecompute the line using a linear regression to find the fitting parameter, aG ⇡ 4.2(2)⇥10�10

with a 95% confidence interval, to satisfy the expression Ni = aGGij , where Gij is thegravity-model flux andNi is the total number of entries from clan i in the jokbo. (b) The sameclan entries compared to the radiation model. We compute the line using a linear regression tofind the fitting parameter, aR ⇡ 0.049(2), to satisfy the expression Ni = aRRij , whereRij is the radiation-model flux and Ni is the total number of entries from clan i in the jokbo.In both panels, we color the points using the number of administrative regions occupied by thecorresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on bothpanels correspond to the clan corresponding to jokbo 1 (the case i = j).

6 www.pnas.org/cgi/doi/10.1073/pnas. Footline Author

ergodic

non-ergodic

19

Surnames, Geoforum 42, 506 (2011).[25] J. Novotný and J. A. Cheshire, The Surname Space of the Czech

Republic: Examining Population Structure by Network Anal-ysis of Spatial Co-Occurrence of Surnames, PLOS ONE 7,e48568 (2012).

[26] An original jokbo usually includes more details about clans, butwe only use information that we mention explicitly.

[27] Google Maps, https://developers.google.com/maps/.

[28] In fact, it is known that clans that originated in the southernpart of Korea have been more abundant in recent Korean history(including the period that is spanned by our data set) [20].

[29] Korean Statistical Information Service, http://kosis.kr/ (Korean version) and http://kosis.kr/eng/ (En-glish version).

[30] Statistical Geographic Information Service (:üx>⇢t�o�&Ò⌦ò–"fq�€º in Korean), http://sgis.kostat.go.kr/ (Ko-rean version; no English version available).

[31] J. P. Sethna, Statistical Mechanics: Entropy, Order Parame-

0 2000

500

number of regions occupied

dis

tance

move

d (

km) (a)

0 2500

500

radius of gyration (km)

dis

tance

move

d (

km) (b)

0 2000

250

number of regions occupied

rad. of gyr

atio

n (

km) (c)

FIG. I2. Scatter plots of (a) the distance between the clan-originlocation and the population centroid versus the number of admin-istrative regions, (b) the distance between the clan-origin locationand the population centroid versus the radius of gyration, and (c)the radius of gyration versus the number of administrative regions.The corresponding Pearson correlation values are (a) r ⇡ 0.20 (from3 481 clans that include all of the required information; the p-valueis p ⇡ 5.7 ⇥ 10�34), (b) r ⇡ 0.066 (from 3 481 clans that includeall of the required information; the p-value p ⇡ 9.4 ⇥ 10�5), and (c)r ⇡ �0.26 (from 3 481 clans; the p-value is p ⇡ 1.8⇥10�53). Note thatcorrelations over limited ranges may be di↵erent and significant: forexample, in panel (c), the two metrics for ergodicity are significantlypositively correlated when rg < 50 km (r ⇡ 0.39, p ⇡ 2.3 ⇥ 10�4).

ters and Complexity (Oxford University Press, Oxford, UnitedKingdom, 2006)

[32] J. van Lith, Ergodic Theory, Interpretations of Probability andthe Foundations of Statistical Mechanics, Stud. Hist. Philos.Sci. 32, 581 (2001).

[33] Because we only have ten family books, our data allows onlyten distinct possibilities for “clan j” (the family into which awoman marries).

[34] S. Cho, Chapter 8. Traditional Korean Culture in Cultural andEthnic Diversity: A Guide for Genetics Professionals eds FisherNL (The Johns Hopkins University Press, Maryland, UnitedStates, 1996).

[35] F. Simini, M. C. González, A. Maritan, and A.-L. Barabási, AUniversal Model for Mobility and Migration Patterns, Nature(London) 484, 96 (2012).

[36] A. P. Masucci, J. Serras, A. Johansson, and M. Batty, Grav-ity versus Radiation Models: On the Importance of Scale andHeterogeneity in Commuting Flows, Phys. Rev. E 88, 022812

0 2060

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(cl

an

s) (a)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(cl

an

s) (c)

0 2060

0.02

number of regions occupied

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(b)

0 2500

0.02

radius of gyration (km)

pro

ba

bili

ty d

ist.

(in

div

idu

als

)

(d)

FIG. I3. (a) Probability distribution of the number of di↵erent admin-istrative regions occupied a Czech family name in 2009. Note thatthe leftmost two bars have heights of 0.17 and 0.03 but have beentruncated for visual presentation. (This data was initially analysedin Ref. [25].) (b) Probability distribution of the number of di↵erentadministrative regions occupied by the clan of a Czech individualselected uniformly at random in 2009. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions (c) Probabil-ity distribution of radii of gyration (in km) of Czech family namesin 2009. Note that the leftmost bar has a height of 0.11 but has beentruncated for visual presentation. (d) Probability distribution of radiiof gyration (in km) of Czech family names of a Czech individualselected uniformly at random in 2009. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions. Observethat the distributions in panels (a) and (b) are starkly di↵erent fromthe distributions in panels (a) and (b) from Fig. 3. Solid curves arekernel density estimates (from Matlab R2011a’s ksdensity func-tion with a Gaussian smoothing kernel of width 5).

surnames in Czech Republic

non-ergodic

data from J. Novotný and J. A. Cheshire, PLOS ONE 7, e48568 (2012).

then interpreted as a result of the subsequent resettlement andindustrialization led immigration into these areas, but also of somestate policies that have contributed to the spatial concentrations(and often also segregations) of Roma minority groups [23].

An intriguing exception to these explanations is a typical Czechsurname Vlcek that can also be found in Figure 6 because of itssignificant revealed relatedness with Wolf. From all of thesurnames considered, the name Wolf has been found as thenearest neighbour of Vlcek, with the 56.7% probability that one ofthese surnames concentrates in the region where another one isconcentrated. The high co-occurrence of these two surnames inthe identical regions seems to be attributable to their commonmeaning – Vlcek literally means ‘‘small Wolf’’ in the Czechlanguage. The bi-lingual naming practices or secular nametransformations taking place in these historically multi-ethnicregions (German and Czech) offers the most likely explanation forsuch commonalities.

Analysis of co-occurrence in municipalitiesIn the second stage of our analysis we examined the co-

occurrence of Czech surnames at the finest spatial level of 6,244municipalities. We began with the calculation of the pairwiseindices of revealed relatedness (Di,j,mun) among 5,660 surnamesselected on the basis of the highest revealed relatedness at moreaggregate spatial level. This sample of surnames covers almost ahalf of the Czech male population. Given the significantly highernumber of spatial units considered for this second stage of ouranalysis, the values of Di,j,mun are generally lower than Di,j,reg in thefirst stage which focused on co-occurrence in 206 micro-regionsonly. At the same time, the size distribution of these second stageresults is even more skewed to the right; the maximum Di,j,mun

(from the total of more than 32 million of observations)corresponds to 0.687, while only 0.011% of all observationsexceed 50% of the maximum value. These differences between thefirst and second stage results are understandable and go hand inhand with the expectation that the surname network based on themunicipality level calculations will be more fragmented.

This has been confirmed by the fact that a majority of the mostsignificant Di,j,mun proximity observations occur among relatively

rare surnames that are typically concentrated in a few nearbymunicipalities. This is especially the case of Silesian surnames thataccount for almost all Di,j,mun observations at the very top of thedistribution of results. As such, in order to get a reasonablenetwork representation, we again had to impose some restrictionsin relation to the minimal size of surnames shown as nodes and thestrength of links between them. After applying the criteria from theprevious section, we found the frequency of at least 150 bearersand the links determined by Di,j,mun .0.23 to be optimal. Thesurname network based on these parameters and generated by aweighted force-directed algorithm is depicted in Figure 7 (Fig-ure S5 depicts a high resolution version with labels of individualsurnames and the size of nodes scaled by their population size).

In general, the second stage or municipality level surnamenetwork has reproduced the macro-division of the Czech surnamespace identified in the first stage and described above. Theproportions between the sizes of the main clusters are howeverdifferent with the previously mentioned dominance of the densegroup of Silesian surnames (B2). Regarding Moravian surnames,again the commonality of verb-derived surnames emerges, as theyform the majority of names in the B1 area of the network. TheBohemian part of the surname space (A) is structured into threemain groups of surnames. The A1 cluster comprises some of themost frequent surnames and those prevalent across most ofBohemian regions, whilst the separation from the secondarycluster (A2) is hardly discernible. By contrast, two other core areasare well recognizable and represent northern and easternBohemian names (A3) more specifically and surnames concen-trated mainly in municipalities in the north-west and west ofBohemia (A4).

The general congruence in macro-structure of the surnamenetworks constructed here and in the first stage of our analysis isan important finding (generally similar macro-structure was alsofound when the Ji,j,reg and Ji,j,mun were considered instead of theDi,j,reg and Di,j,mun, respectively). However, the main value of thissecond stage municipality level exercise should be seen inindividual details uncovered with respect to local parts of thesurname network. A number of interesting examples of pairs ofsurnames that have been found as potentially closely related,

Figure 4. Spatial concentration of individual communities of high degree surnames. Individual maps show regional variation in thepercentage of high degree surnames from particular core communities (A1, A2, A3, A4, B1, B2) as listed in Table 5 concentrated in a given region. Forexample, if the percentage of high degree surnames for A1 (the upper left map) corresponds to 100, then all the surnames listed in the A1 group inTable 5 are concentrated in a given region (that is, all of them satisfy LQi,r .1 for the region in question).doi:10.1371/journal.pone.0048568.g004

The Surname Space of the Czech Republic

PLOS ONE | www.plosone.org 8 October 2012 | Volume 7 | Issue 10 | e48568

Page 26: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

“clan density anomaly”:

�i(rk, t) = ci(rk, t)�mi(t)

N(t)⇢(rk, t) ,

where rk is the two-dimensional coordinate vector of region k,ci(rk, t) is clan i’s population density in region k at time t,mi(t) is the total population of clan i in the country at time t,N(t) is the total population (all clans) in the country at time t,and ⇢(rk, t) is the total population density (all clans) in region k at time t.

본관

본관

본관

본관

본관

Page 27: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

“clan density anomaly”:

�i(rk, t) = ci(rk, t)�mi(t)

N(t)⇢(rk, t) ,

where rk is the two-dimensional coordinate vector of region k,ci(rk, t) is clan i’s population density in region k at time t,mi(t) is the total population of clan i in the country at time t,N(t) is the total population (all clans) in the country at time t,and ⇢(rk, t) is the total population density (all clans) in region k at time t.

본관

본관

clan i’s radius of gyration of anomaly:

rgi =

s1

�i,tot(t)

X

k

[krk � ri,C(t)k2�i(rk, t)Ak] ,

where �i,tot(t) =P

k �i(rk, t)Ak,ri,C(t) = [1/�i,tot(t)]

Pk rk�i(rk, t)Ak,

and Ak is the area of region k.

본관

본관

본관

본관

Page 28: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

assuming a simple diffusive process over time: measuring the diffusion constants by numerically applying the partial differential equation of ϕi above at the two endpoints (“boundary conditions”) in time

assuming that the spatial variation of (mi/N)⇢ is much smaller than that of ci

a very simple di↵usion model described by Fick’s second law:

the flux of clan members

~Ji / ~r�i

(individuals move preferentially away from high concentrations of their family),

then

@ci@t

=

~r · ~Ji / r2�i

(assuming nor spatial variation in the constant of proportionality), or

@�i

@t= Dir2�i,

with the di↵usion constant Di with dimensions [length

2/time].

본관

Page 29: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

assuming a simple diffusive process over time: measuring the diffusion constants by numerically applying the partial differential equation of ϕi above at the two endpoints (“boundary conditions”) in time

assuming that the spatial variation of (mi/N)⇢ is much smaller than that of ci

a very simple di↵usion model described by Fick’s second law:

the flux of clan members

~Ji / ~r�i

(individuals move preferentially away from high concentrations of their family),

then

@ci@t

=

~r · ~Ji / r2�i

(assuming nor spatial variation in the constant of proportionality), or

@�i

@t= Dir2�i,

with the di↵usion constant Di with dimensions [length

2/time].

본관 density anomaly of Cho from Haman (함안 조)

3

FIG

.3.Example

ofclananomaly

picture

forclan1,1985vs.

2000,usingDanny’s

anomaly

defi

nition.Circleis

centeredat

CM

positionwithradiussetto

radiusofgyration.Heredi↵usivee↵

ects

are

visible.

FIG

.4.Example

ofclananomaly

picture

forKim

from

Gim

hae,

1985vs.

2000,usingDanny’s

anomaly

defi

nition.Circleis

centeredatCM

positionwithradiussetto

radiusofgyration.(N

ote

positionandsize

ofcircle

are

reasonable.)

large �

small �

본관

Page 30: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

More sophisticated measure of ergodicity & simple diffusion dynamics

assuming a simple diffusive process over time: measuring the diffusion constants by numerically applying the partial differential equation of ϕi above at the two endpoints (“boundary conditions”) in time

assuming that the spatial variation of (mi/N)⇢ is much smaller than that of ci

a very simple di↵usion model described by Fick’s second law:

the flux of clan members

~Ji / ~r�i

(individuals move preferentially away from high concentrations of their family),

then

@ci@t

=

~r · ~Ji / r2�i

(assuming nor spatial variation in the constant of proportionality), or

@�i

@t= Dir2�i,

with the di↵usion constant Di with dimensions [length

2/time].

본관 density anomaly of Cho from Haman (함안 조)

3

FIG

.3.Example

ofclananomaly

picture

forclan1,1985vs.

2000,usingDanny’s

anomaly

defi

nition.Circleis

centeredat

CM

positionwithradiussetto

radiusofgyration.Heredi↵usivee↵

ects

are

visible.

FIG

.4.Example

ofclananomaly

picture

forKim

from

Gim

hae,

1985vs.

2000,usingDanny’s

anomaly

defi

nition.Circleis

centeredatCM

positionwithradiussetto

radiusofgyration.(N

ote

positionandsize

ofcircle

are

reasonable.)

large �

small �6

0 2000

0.02

number of regions occupied

pro

babili

ty d

ist. (

clans) (a)

0 2500

0.02

radius of gyration (km)

pro

babili

ty d

ist. (

clans) (c)

0 2000

0.02

number of regions occupied

pro

babili

ty d

ist. (

indiv

iduals

)

(b)

0 2500

0.02

radius of gyration (km)

pro

babili

ty d

ist. (

indiv

iduals

)

(d)

FIG. 3. Distribution of the number of di↵erent administrative regionsoccupied by clans. (a) Probability distribution of the number of dif-ferent administrative regions occupied by a Korean clan in the year2000. (b) Probability distribution of the number of di↵erent admin-istrative regions occupied by the clan of a Korean individual selecteduniformly at random in the year 2000. The di↵erence between thispanel and the previous one arises from the fact that clans with largerpopulations tend to occupy more administrative regions. Note thatthe rightmost bar has a height of 0.17 but has been truncated for vi-sual presentation. (c) Probability distribution of radii of gyration (inkm) for clans in 2000. (d) Probability distribution of radii of gyra-tion (in km) for clans of a Korean individual selected uniformly atrandom in 2000. The di↵erence between this panel and the previ-ous one arises from the fact that clans with larger populations tendto occupy more administrative regions. Solid curves are kernel den-sity estimates (from Matlab R2011a’s ksdensity function with aGaussian smoothing kernel of width 5).

have jokbo are fairly ergodic, so the variables associated withthe j indices (i.e., the grooms) in Eqs. (1) and (2) have alreadylost much of their geographical precision, which is consistentboth with the values ↵ = 0 and � = 0 (the population prod-uct model). Again see the scatter plots in Fig. 2, in which wecolor each clan according to the number of di↵erent admin-istrative regions that it occupies. Note that the three di↵erentergodicity diagnostics are only weakly correlated (see Fig. I2).

Our observations of clan bimodality for Korea contrastsharply with our observations for family names in theCzech republic, where most family names appear to be non-ergodic [25] (see Fig. I3). One possible explanation of theubiquity of ergodic Korean names is the historical fact thatmany families from the lower social classes adopted (or evenpurchased) names of noble clans from the upper classes nearthe end of the Joseon dynasty (19th–20th centuries) [20, 52].At the time, Korean society was very unstable, and this pro-cess might have, in essence, introduced a preferential growthof ergodic names.

In Fig. 4, we show the distribution of the di↵usion constants

−5 200

0.7

diffusion constant (km2/year)

pro

babili

ty d

istr

ibutio

n

FIG. 4. Distribution of estimated di↵usion constants (in km2/year)computed using 1985 and 2000 census data and Eq. (3). Thesolid curve is a kernel density estimate (from Matlab R2011a’sksdensity function with default smoothing). See the Appendixfor details of the calculation of di↵usion constants.

that we computed by fitting to Eq. (3). Some of the values arenegative, which presumably arises from finite-size e↵ects inergodic clans as well as basic limitations in estimating di↵u-sion constants using only a pair of nearby years. In Fig. I4,we show the correlations between the di↵usion constants andother measures.

C. Convection in Addition to Di↵usion as Another Mechanismfor Migration

The assumption that human populations simply di↵use is agross oversimplification of reality. We will thus consider theintriguing (but still grossly oversimplified) possibility of si-multaneous di↵usive and convective (bulk) transport. In thepast century, a dramatic movement from rural to urban areashas caused Seoul’s population to increase by a factor of morethan 50, tremendously outpacing Korea’s population growthas a whole [53]. This suggests the presence of a strong at-tractor or “sink” for the bulk flow of population into Seoul, ashas been discussed in rural-urban labor migration studies [54].The density-equalizing population cartogram [55] in Fig. I5clearly demonstrates the rapid growth of Seoul and its sur-roundings between 1970 and 2010.

If convection (i.e., bulk flow) directed towards Seoul hasindeed occurred throughout Korea while clans were simulta-neously di↵using from their points of origin, then one oughtto be able to detect a signature of such a flow. In Fig. 5(a),we show what we believe is such a signature: we observethat the fraction of ergodic clans increases with the distancebetween Seoul and a clan’s place of origin. This would be un-expected for a purely di↵usive system or, indeed, in any othersimple model that excludes convective transport. By allow-ing for bulk flow, we expect to observe that a clan’s mem-bers preferentially occupy territory in the flow path that islocated geographically between the clan’s starting point andSeoul. For clans that start closer to Seoul, this path is short;for those that start farther away, the longer flow path ought

distribution of estimated diffusion constants

본관

Page 31: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

본관본관

Page 32: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

diffusion part?

본관본관

Page 33: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

diffusion part?

본관본관

Page 34: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

convection part?diffusion part?

본관본관

Page 35: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

convection part?diffusion part?

cannot be expected in a purely diffusive system!

본관본관

Page 36: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distancehave also existed for a longer time and hence have undergonedi↵usion longer; we thus also expect such clans to be more er-godic. This is consistent with our observations in Fig. 5(b) fordistances less than about 150 km, but it is di�cult to use thesame logic to explain our observations for distances greaterthan 150 km. However, if one assumes that long-distancemoves are more likely to arise from convective e↵ects thanfrom di↵usive ones, then our observations for both short andlong distances become understandable: the fraction of movesfrom bulk-flow e↵ects like resettlement or transplantation islarger for long-distance moves, and they become increasingly

dominant as the distance approaches 325 km (roughly the sizeof the Korean peninsula). We speculate that the clans thatmoved farther than 150 km are likely to be ones that origi-nated in the most remote areas of Korea, or even outside ofKorea, and that they have only relatively recently been trans-planted to major Korean population centers, from which theyhave had little time to spread. This observation is necessarilyspeculative because the age of a clan is not easy to determine:the first entry in a jokbo (see Table A-I for our ten jokbo) couldhave resulted from the invention of characters or printing de-vices rather than from the true birth of a clan [20].

Ultimately, our data are insu�cient to definitively ac-cept or reject the hypothesis of human di↵usion. However,as our analysis demonstrates, our data are consistent withthe theory of simultaneous human “di↵usion” and “convec-tion.” Furthermore, our analysis suggests that if the hypoth-esis of pure di↵usion is correct, then our estimated di↵u-sion constants indicate a possible time scale for relaxation toa dynamic equilibrium and thus for mixing in human soci-eties. In mainland South Korea, it would take approximately(100000 km2)/(1.5 km2/year) ⇡ 67000 years for purely dif-fusive mixing to produce a well-mixed society. A convectiveprocess thus appears to be playing the important role of pro-moting human interaction by accelerating mixing in the pop-ulation. In spite of such limitations, we try to estimate andquantify the centrality of Seoul in the context of a popula-tion flow network model and find other distinctive characteris-tics of the population flow patterns of ergodic and non-ergodicclans. For details, see Appendix H.

V. CONCLUSIONS

The long history of detailed record-keeping in Korean cul-ture provides an unusual opportunity for quantitative researchon historical human mobility and migration, and our inves-tigation strongly suggests that both “di↵usive” and “convec-tive” patterns have played important roles in establishing thecurrent distribution of clans in Korea.

By studying the geographical locations of clan origins injokbo (Korean family books), we have quantified the extent of“ergodicity” of Korean clans as reflected in time series of mar-riage snapshots. This illustrates the need to investigate the dis-tribution of individual clans in more details. Additionally, bycomparing our results from Korean clans to those from Czechfamilies, we have also demonstrated that these ideas can giveinsightful indications of di↵erent mobility and migration pat-terns in di↵erent cultures. Our ergodicity analysis using mod-ern census data clearly illustrates that there are both ergodicand non-ergodic clans, and we have used these results to sug-gest two di↵erent mechanisms for human migration on longtime scales. Many processes involve the attractiveness of thecenter versus di↵usion away from it, so this type of modelingframework can be applied to a lot of data sets, we believe.

A noteworthy feature of our analysis is that we used bothdata with high temporal resolution but low spatial resolution(jokbo data) and data with high spatial resolution but low tem-poral resolution (census data). This allowed us to consider

“diffusion” vs “convection” of human migration?

convection part?diffusion part?

cannot be expected in a purely diffusive system!

본관본관

22

33

34

35

36

37

38

39

125 126 127 128 129 130 131

latit

ude

longitude

(a) 1970

SeoulBusanInchonDaeguDaejon

GeonggiGangwon

UlsanChungbuk

ChungnamJeonbuk

JeonnamGwangju

GyeongbukGyeongnam

Jeju 33

34

35

36

37

38

39

125 126 127 128 129 130 131

latit

ude

longitude

(b) 2010

SeoulBusanInchonDaeguDaejon

GeonggiGangwon

UlsanChungbuk

ChungnamJeonbuk

JeonnamGwangju

GyeongbukGyeongnam

Jeju

FIG. I5. Density-equalizing population cartograms [55] for South Korea using population data from (a) 1970 and (b) 2010 censuses [29]. Thecoordinates are longitude on the horizontal axis and latitude on the vertical axis. The growth of the Seoul metropolitan area over the past 40years is clearly visible (compare this to a regular map of South Korea, e.g., Fig. 1).

distance from clan originlocation to Seoul (km)

radiu

s of gyr

atio

n (

km)

(a)

0 300110

140

distance from clan origin locationto current clan centroid (km)

radiu

s of gyr

atio

n (

km)

(b)

0 300110

140

FIG. I6. (a) The radius of gyration rg versus the distance to Seoul.The correlation between the variables is not statistically significant(r ⇡ 0.18 with the p-value p ⇡ 0.6. (b) The radius of gyration rgversus the distance between location of clan origin and the present-day centroid. The correlation between the variables is significantlypositive up to 170 km (r ⇡ 0.86, p ⇡ 0.01) and is significantly neg-ative (r ⇡ �0.96, p ⇡ 0.005) for larger distances. For all the panels,we estimate rg separately in each of 11 equally-sized bins over thedisplayed range of distances. The gray regions give 95% confidenceintervals.

본관본관

Page 37: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Summary and Outlook

• Human migration (long time scale) throughout history by combining two complementary data sets:

Korean 족보 (good temporal resolution) +modern census data (good spatial resolution)

• marriage pattern (short time scale: “snapshots”) based on 족보: describable by simple human migration models?

• the “ergodicity” (a result of migration) of 본관

• diffusive vs convective migration

• More data (more 족보, census, other countries’ data) → better model/prediction

SHL (이상훈), R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Page 38: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Hawoong [ Jeong (from Dongrae)]: [(동래)정]하웅 (족보 data), Josef Novotný (Czech family name data)

Acknowledgments

Mason Porter(University of Oxford)

Beom Jun [Kim (from Gimhae)]: [(김해)김]범준 (Sungkyunkwan University)

Danny Abrams(Northwestern University)

Robyn Ffrancon(University of Gothenburg)

SHL (이상훈), R. Ffrancon, D. M. Abrams, B. J. Kim (김범준), and M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Page 39: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

introduced in . . .

Page 40: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Sang  Hoon  Lee/Sungkyunkwan  University

Matchmaker,  Matchmaker,  Make  Me  a  Match:  Migration  of  Populations  via  Marriages  in  the  Past

Sang  Hoon  Lee  ( ),  Robyn  Ffrancon,  Daniel  M.  Abrams,  Beom  Jun  Kim  ( ),  and  Mason  A.  PorterPhys.  Rev.  X  4,  041009  (2014)Published  October  16,  2014

Synopsis:  Wedding  Registries  Reveal  MigrationPaths

Patterns  in  human  movement  have  been  studied  using  traffic  data,  mobile  phone  records,  and  even  dollar  billcirculation.  A  new  investigation  uses  centuries-­old  Korean  family  books,  called  jokbos  (  in  Korean),  as  arecord  of  emigration  by  female  brides.  The  analysis  of  this  movement,  as  well  as  more  recent  census  data,shows  that  both  diffusion-­  and  convection-­like  phenomena  may  play  a  role  in  mixing  different  populations.

Understanding  human  mobility  is  important  for  improving  city  planning  and  for  responding  to  outbreaks  ofdisease.  Previous  modeling  work  has  suggested  that  the  statistical  patterns  in  human  movement  resemblecertain  physical  phenomena,  such  as  the  diffusion  of  molecules  in  liquids.  However,  these  patterns  are  drawnprimarily  from  short-­term  (a  day  to  at  most  a  year)  tracking  data,  so  it’s  unclear  whether  the  same  models  applyto  longer-­term  migrations  that  affect  cultural  and  genetic  dissemination.

In  Korea,  marriages  have  traditionally  involved  the  relocation  of  the  bride  to  her  groom’s  home.  Sang  Hoon  Leeof  Sungkyunkwan  University,  Korea,  and  his  colleagues  analyzed    of  these  marriages  catalogued  inten  jokbos  that  date  as  far  back  as  the   th  century.  The  books  don’t  provide  specific  locations,  so  theresearchers  inferred  migration  paths  from  the  geographic  regions  associated  with  each  bride  and  groom’s  clannames.  Because  the  bride  migration  rate  between  regions  was  primarily  dependent  on  clan  populationdensities,  the  team  tried  to  explain  migration  with  a  model  in  which  Koreans  moved  randomly  away  (diffused)from  high  concentrations  of  their  own  clan.  However,  the  estimated  diffusion  rate  was  too  slow  to  explain  howwell  certain  clans  have  spread  throughout  the  country.  The  team  concluded  that  a  directed  convective-­like  flowtowards  the  capital  city  Seoul  enhanced  the  mixing  rate.  According  to  the  authors,  the  combination  of  diffusionand  convection  may  explain  other  human  flow  patterns.

This  research  is  published  in  Physical  Review  X.

–Michael  Schirber

ISSN  1943-­2879.  Use  of  the  American  Physical  Society  websites  and  journals  implies  that  the  user  has  readand  agrees  to  our  Terms  and  Conditions  and  any  applicable  Subscription  Agreement.

200 00013

introduced in . . .

Page 41: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Sang  Hoon  Lee/Sungkyunkwan  University

Matchmaker,  Matchmaker,  Make  Me  a  Match:  Migration  of  Populations  via  Marriages  in  the  Past

Sang  Hoon  Lee  ( ),  Robyn  Ffrancon,  Daniel  M.  Abrams,  Beom  Jun  Kim  ( ),  and  Mason  A.  PorterPhys.  Rev.  X  4,  041009  (2014)Published  October  16,  2014

Synopsis:  Wedding  Registries  Reveal  MigrationPaths

Patterns  in  human  movement  have  been  studied  using  traffic  data,  mobile  phone  records,  and  even  dollar  billcirculation.  A  new  investigation  uses  centuries-­old  Korean  family  books,  called  jokbos  (  in  Korean),  as  arecord  of  emigration  by  female  brides.  The  analysis  of  this  movement,  as  well  as  more  recent  census  data,shows  that  both  diffusion-­  and  convection-­like  phenomena  may  play  a  role  in  mixing  different  populations.

Understanding  human  mobility  is  important  for  improving  city  planning  and  for  responding  to  outbreaks  ofdisease.  Previous  modeling  work  has  suggested  that  the  statistical  patterns  in  human  movement  resemblecertain  physical  phenomena,  such  as  the  diffusion  of  molecules  in  liquids.  However,  these  patterns  are  drawnprimarily  from  short-­term  (a  day  to  at  most  a  year)  tracking  data,  so  it’s  unclear  whether  the  same  models  applyto  longer-­term  migrations  that  affect  cultural  and  genetic  dissemination.

In  Korea,  marriages  have  traditionally  involved  the  relocation  of  the  bride  to  her  groom’s  home.  Sang  Hoon  Leeof  Sungkyunkwan  University,  Korea,  and  his  colleagues  analyzed    of  these  marriages  catalogued  inten  jokbos  that  date  as  far  back  as  the   th  century.  The  books  don’t  provide  specific  locations,  so  theresearchers  inferred  migration  paths  from  the  geographic  regions  associated  with  each  bride  and  groom’s  clannames.  Because  the  bride  migration  rate  between  regions  was  primarily  dependent  on  clan  populationdensities,  the  team  tried  to  explain  migration  with  a  model  in  which  Koreans  moved  randomly  away  (diffused)from  high  concentrations  of  their  own  clan.  However,  the  estimated  diffusion  rate  was  too  slow  to  explain  howwell  certain  clans  have  spread  throughout  the  country.  The  team  concluded  that  a  directed  convective-­like  flowtowards  the  capital  city  Seoul  enhanced  the  mixing  rate.  According  to  the  authors,  the  combination  of  diffusionand  convection  may  explain  other  human  flow  patterns.

This  research  is  published  in  Physical  Review  X.

–Michael  Schirber

ISSN  1943-­2879.  Use  of  the  American  Physical  Society  websites  and  journals  implies  that  the  user  has  readand  agrees  to  our  Terms  and  Conditions  and  any  applicable  Subscription  Agreement.

200 00013

Home

How wedding registries reveal migration paths

Oxford Mathematics Professor Mason Porter and formerpostdoctoral student Sang Hoon Lee, nowof Sungkyunkwan University in Korea, have found a newway of analysing population mix. In the past patterns inhuman movement have been studied using traffic data,mobile phone records, and even dollar bill circulation.Their new investigation uses centuries-old Korean familybooks, called jokbos ( in Korean), as a record ofemigration by female brides. The analysis of thismovement, as well as more recent census data, showsthat both diffusion- and convection-like phenomena mayplay a role in mixing different populations.

Please contact us for feedback and comments about this page. Last update on 17 October 2014 - 12:58.

About Us

Study Here

Research

People

Events

Members

Admin

introduced in . . .

Page 42: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Sang  Hoon  Lee/Sungkyunkwan  University

Matchmaker,  Matchmaker,  Make  Me  a  Match:  Migration  of  Populations  via  Marriages  in  the  Past

Sang  Hoon  Lee  ( ),  Robyn  Ffrancon,  Daniel  M.  Abrams,  Beom  Jun  Kim  ( ),  and  Mason  A.  PorterPhys.  Rev.  X  4,  041009  (2014)Published  October  16,  2014

Synopsis:  Wedding  Registries  Reveal  MigrationPaths

Patterns  in  human  movement  have  been  studied  using  traffic  data,  mobile  phone  records,  and  even  dollar  billcirculation.  A  new  investigation  uses  centuries-­old  Korean  family  books,  called  jokbos  (  in  Korean),  as  arecord  of  emigration  by  female  brides.  The  analysis  of  this  movement,  as  well  as  more  recent  census  data,shows  that  both  diffusion-­  and  convection-­like  phenomena  may  play  a  role  in  mixing  different  populations.

Understanding  human  mobility  is  important  for  improving  city  planning  and  for  responding  to  outbreaks  ofdisease.  Previous  modeling  work  has  suggested  that  the  statistical  patterns  in  human  movement  resemblecertain  physical  phenomena,  such  as  the  diffusion  of  molecules  in  liquids.  However,  these  patterns  are  drawnprimarily  from  short-­term  (a  day  to  at  most  a  year)  tracking  data,  so  it’s  unclear  whether  the  same  models  applyto  longer-­term  migrations  that  affect  cultural  and  genetic  dissemination.

In  Korea,  marriages  have  traditionally  involved  the  relocation  of  the  bride  to  her  groom’s  home.  Sang  Hoon  Leeof  Sungkyunkwan  University,  Korea,  and  his  colleagues  analyzed    of  these  marriages  catalogued  inten  jokbos  that  date  as  far  back  as  the   th  century.  The  books  don’t  provide  specific  locations,  so  theresearchers  inferred  migration  paths  from  the  geographic  regions  associated  with  each  bride  and  groom’s  clannames.  Because  the  bride  migration  rate  between  regions  was  primarily  dependent  on  clan  populationdensities,  the  team  tried  to  explain  migration  with  a  model  in  which  Koreans  moved  randomly  away  (diffused)from  high  concentrations  of  their  own  clan.  However,  the  estimated  diffusion  rate  was  too  slow  to  explain  howwell  certain  clans  have  spread  throughout  the  country.  The  team  concluded  that  a  directed  convective-­like  flowtowards  the  capital  city  Seoul  enhanced  the  mixing  rate.  According  to  the  authors,  the  combination  of  diffusionand  convection  may  explain  other  human  flow  patterns.

This  research  is  published  in  Physical  Review  X.

–Michael  Schirber

ISSN  1943-­2879.  Use  of  the  American  Physical  Society  websites  and  journals  implies  that  the  user  has  readand  agrees  to  our  Terms  and  Conditions  and  any  applicable  Subscription  Agreement.

200 00013

Home

How wedding registries reveal migration paths

Oxford Mathematics Professor Mason Porter and formerpostdoctoral student Sang Hoon Lee, nowof Sungkyunkwan University in Korea, have found a newway of analysing population mix. In the past patterns inhuman movement have been studied using traffic data,mobile phone records, and even dollar bill circulation.Their new investigation uses centuries-old Korean familybooks, called jokbos ( in Korean), as a record ofemigration by female brides. The analysis of thismovement, as well as more recent census data, showsthat both diffusion- and convection-like phenomena mayplay a role in mixing different populations.

Please contact us for feedback and comments about this page. Last update on 17 October 2014 - 12:58.

About Us

Study Here

Research

People

Events

Members

Admin

introduced in . . .

Page 43: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Sang  Hoon  Lee/Sungkyunkwan  University

Matchmaker,  Matchmaker,  Make  Me  a  Match:  Migration  of  Populations  via  Marriages  in  the  Past

Sang  Hoon  Lee  ( ),  Robyn  Ffrancon,  Daniel  M.  Abrams,  Beom  Jun  Kim  ( ),  and  Mason  A.  PorterPhys.  Rev.  X  4,  041009  (2014)Published  October  16,  2014

Synopsis:  Wedding  Registries  Reveal  MigrationPaths

Patterns  in  human  movement  have  been  studied  using  traffic  data,  mobile  phone  records,  and  even  dollar  billcirculation.  A  new  investigation  uses  centuries-­old  Korean  family  books,  called  jokbos  (  in  Korean),  as  arecord  of  emigration  by  female  brides.  The  analysis  of  this  movement,  as  well  as  more  recent  census  data,shows  that  both  diffusion-­  and  convection-­like  phenomena  may  play  a  role  in  mixing  different  populations.

Understanding  human  mobility  is  important  for  improving  city  planning  and  for  responding  to  outbreaks  ofdisease.  Previous  modeling  work  has  suggested  that  the  statistical  patterns  in  human  movement  resemblecertain  physical  phenomena,  such  as  the  diffusion  of  molecules  in  liquids.  However,  these  patterns  are  drawnprimarily  from  short-­term  (a  day  to  at  most  a  year)  tracking  data,  so  it’s  unclear  whether  the  same  models  applyto  longer-­term  migrations  that  affect  cultural  and  genetic  dissemination.

In  Korea,  marriages  have  traditionally  involved  the  relocation  of  the  bride  to  her  groom’s  home.  Sang  Hoon  Leeof  Sungkyunkwan  University,  Korea,  and  his  colleagues  analyzed    of  these  marriages  catalogued  inten  jokbos  that  date  as  far  back  as  the   th  century.  The  books  don’t  provide  specific  locations,  so  theresearchers  inferred  migration  paths  from  the  geographic  regions  associated  with  each  bride  and  groom’s  clannames.  Because  the  bride  migration  rate  between  regions  was  primarily  dependent  on  clan  populationdensities,  the  team  tried  to  explain  migration  with  a  model  in  which  Koreans  moved  randomly  away  (diffused)from  high  concentrations  of  their  own  clan.  However,  the  estimated  diffusion  rate  was  too  slow  to  explain  howwell  certain  clans  have  spread  throughout  the  country.  The  team  concluded  that  a  directed  convective-­like  flowtowards  the  capital  city  Seoul  enhanced  the  mixing  rate.  According  to  the  authors,  the  combination  of  diffusionand  convection  may  explain  other  human  flow  patterns.

This  research  is  published  in  Physical  Review  X.

–Michael  Schirber

ISSN  1943-­2879.  Use  of  the  American  Physical  Society  websites  and  journals  implies  that  the  user  has  readand  agrees  to  our  Terms  and  Conditions  and  any  applicable  Subscription  Agreement.

200 00013

Home

How wedding registries reveal migration paths

Oxford Mathematics Professor Mason Porter and formerpostdoctoral student Sang Hoon Lee, nowof Sungkyunkwan University in Korea, have found a newway of analysing population mix. In the past patterns inhuman movement have been studied using traffic data,mobile phone records, and even dollar bill circulation.Their new investigation uses centuries-old Korean familybooks, called jokbos ( in Korean), as a record ofemigration by female brides. The analysis of thismovement, as well as more recent census data, showsthat both diffusion- and convection-like phenomena mayplay a role in mixing different populations.

Please contact us for feedback and comments about this page. Last update on 17 October 2014 - 12:58.

About Us

Study Here

Research

People

Events

Members

Admin

introduced in . . .

Page 44: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Seriously, “this is physics?!?”

Page 45: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Seriously, “this is physics?!?”

Page 46: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

statistical physics: microscopic properties → interactions → macroscopic properties

regular or completely random “networks”

magnet gasmicroscopic property (N/S already at the atomic scale)

macroscopic property (magnetism)

microscopic property (kinetic energy of gas molecules)

macroscopic property (pressure)

Page 47: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

partially random or “complex” networks

regular or completely random “networks”

Page 48: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

partially random or “complex” networks

regular or completely random “networks”

Page 49: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

partially random or “complex” networks

regular or completely random “networks”

Page 50: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

partially random or “complex” networks

regular or completely random “networks”

Page 51: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

“물리학자들은 다른 사람들의 학문을 침범하기에 더 없이 적합한 사람들이다. 물론 대단히 똑똑한 탓도 있지만 일반적으로 연

구대상에 대해 그다지 까다롭게 굴지 않기 때문이다. 물리학자들은 스스로를 아카데미라는 정글의 제왕쯤으로 생각하는 경향

이 있고, 자신들의 방법이 일반의 수준보다 높다고 여기면서 자신들의 영토를 물샐틈없이 수호한다. 하지만 그들의 또 다른 자

아는 하이에나에 비견될 만한 것이어서, 쓸모가 있을 것 같으면 생각이나 기법을 기꺼이 빌려오고 남들이 풀지 못했던 문제의

뿌리를 뽑으며 즐거워한다. 이런 태도는 약삭빨라 보일 수도 있지만 이전까지 물리학이 제외되어 있던 영역에 그들이 등장하면

서 위대한 발견이나 자극으로 이어지는 경우가 많다. 수학자들이 가끔 비슷한 행동을 하기는 해도 새로운 문제의 냄새를 맡고

흥분한 굶주린 물리학자들처럼 맹렬하게 덤벼들지는 않는다…”

“Physicists, it turns out, are almost perfectly suited to invading other people’s disciplines, being not only extremely clever but also generally much less fussy than most about the problems they choose to study. Physicists tend to see themselves as the lords of the academic jungle, loftily regarding their own methods as above the ken of anybody else and jealously guarding their own terrain. But their alter egos are closer to scavengers, happy to borrow ideas and techniques from anywhere if they seem like they might be useful, and delighted to stomp all over someone else’s problem. As irritating as this attitude can be to everybody else, the arrival of the physicists into a previously non-physics area of research often presages a period of great discovery and excitement. Mathematicians do the same thing occasionally, but no one descends with such fury and in so great a number as a pack of hungry physicists, adrenalized by the scent of a new problem . . .” quoted from Duncan J. Watts “Six Degrees”

Page 52: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

Book recommendation

Page 53: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past
Page 54: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past
Page 55: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

http://scienceon.hani.co.kr/206375 2014. 10. 29. 12:30족보 빅데이터 연구, 국제물리학 저널에 실려

7/8페이지http://scienceon.hani.co.kr/?act=dispMediaPrintArticle&document_srl=206375

리나라의 성씨 분포를 살필 수 있다는 것, 그리고 그 결과 우리나라의 독특한 성씨 분포가 최소한

500여 년 전에도 비슷했다는 것을 주요 결론으로 제시했습니다. [3] 또다른 논문은 공저자인 스

웨덴의 민하겐 교수님의 통계적 모형을 이용해 과거 우리나라 성씨 분포를 이해하려는 시도였고

여기에서 수백 년 동안 그 통계적 특성이 변하지 않았음을 이용해 더 과거로 돌아가면 서기 500

년 즈음에는 김씨가 1만 명 정도였을 것으로 추론된다는 흥미로운 결과도 얻었습니다."

 [1] http://journals.aps.org/pre/abstract/10.1103/PhysRevE.76.046113

 [2] http://statphys.skku.ac.kr/Papers/jkps_kiet1.pdf

 [3] http://iopscience.iop.org/1367-2630/13/7/073036/

물리학자가 왜 족보 연구를 하게 되었는지요?

 "저와 이상훈 박사가 연구하는 통계물리학 분야에서는 우리사회에서 벌어지는 현상의 거시적

패턴을 이해하고자 하는 다양한 시도들이 있습니다. 예를 들어 소득 분포 패턴이나, 지진 크기의

분포, 또 주가 낙폭의 분포 등이 모두 다 통계물리학자들의 관심사 중 하나이죠. 10여 년 전부터

우리나라 성씨 분포의 패턴에 관심을 갖게 됐습니다. 어떻게 하면 연구에 사용할 자료를 구할 수

있을까 하다가 몇 도시의 전화번호부, 그리고 당시 몸담고 있던 학교의 출석부도 구해 성씨 분포

를 연구해 보기도 했습니다. 그러다 문득 만약에 전산화한 족보가 있다면 과거에 그 집안에 시집

온 여자들의 성씨를 이용해 수백년 간의 성씨 분포도 연구할 수 있을 것이라는 아이디어가 생겼

던 겁니다. 사실 제가 했던 연구는 족보 연구라고 하기에는 무리가 있고, 족보 자료를 활용한 성

씨, 본관 연구라고 해야 맞습니다."

이번 연구팀이 다국적인데 어떻게 구성됐는지 궁금합니다. 우리 족보 기록이 해외 연구자의 눈에

는 어떻게 비치는지요?

 "이번 연구에선 이 박사와 옥스포드대학의 포터 교수, 두 분이 허브 역할을 했습니다. 이번 연

구는 저의 성씨 분포 연구를 알고 있던 이 박사님이 제가 정리해 갖고 있던 족보 자료를 이용해

공동연구를 하자고 먼저 제안해 이뤄졌습니다. 함께 연구한 외국인들에게 우리나라 족보 자료는

정말로 신기하게 보이는 것 같습니다. 특히, 우리 선조의 기록 문화가 잘 발달해 무려 몇백 년 전

에 만들어진 족보가 여전히 집안에서 대대로 이어지고 있다는 것이 놀라운 모양입니다. 외국의

경우에는 교회에 개인의 출생에 대한 오래 전 기록이 남아 있긴 하지만, 우리나라 족보처럼 한 집

안의 가계도가 아주 오랜 동안 꾸준히 기록되어 보전되는 예는 거의 없는 것으로 알고 있습니다."

■ 논문 초록 ■ 논문 초록

 사람의 이동성 연구는 기본적으로 중요하며 또한 크나큰 잠재적 가치를 지닌다. 예를 들어, 그

것은 효율적인 도시 계획을 촉진하며 유행병에 직면했을 때 방제 전략을 개선하는 데에 사용될

수 있다. 은행권의 흐름, 이동전화 기록, 수송 데이터와 같은 풍부한 데이터가 새로 발견되면 현

대인의 이동성에 나타난 특징을 파악하려는 시도가 폭발적으로 늘어나곤 한다. 불행하게도, 비교

할 만한 역사적 데이터의 부족 때문에 과거의 인구 이동성 패턴을 연구하는 데에는 더 큰 어려움

이 있다. 이 연구논문에서, 우리는 사람들의 장기적 이주에 대한 분석을 제시한다. 그런 장기 이

김범준 교수님

Page 56: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

http://scienceon.hani.co.kr/206375 2014. 10. 29. 12:30족보 빅데이터 연구, 국제물리학 저널에 실려

7/8페이지http://scienceon.hani.co.kr/?act=dispMediaPrintArticle&document_srl=206375

리나라의 성씨 분포를 살필 수 있다는 것, 그리고 그 결과 우리나라의 독특한 성씨 분포가 최소한

500여 년 전에도 비슷했다는 것을 주요 결론으로 제시했습니다. [3] 또다른 논문은 공저자인 스

웨덴의 민하겐 교수님의 통계적 모형을 이용해 과거 우리나라 성씨 분포를 이해하려는 시도였고

여기에서 수백 년 동안 그 통계적 특성이 변하지 않았음을 이용해 더 과거로 돌아가면 서기 500

년 즈음에는 김씨가 1만 명 정도였을 것으로 추론된다는 흥미로운 결과도 얻었습니다."

 [1] http://journals.aps.org/pre/abstract/10.1103/PhysRevE.76.046113

 [2] http://statphys.skku.ac.kr/Papers/jkps_kiet1.pdf

 [3] http://iopscience.iop.org/1367-2630/13/7/073036/

물리학자가 왜 족보 연구를 하게 되었는지요?

 "저와 이상훈 박사가 연구하는 통계물리학 분야에서는 우리사회에서 벌어지는 현상의 거시적

패턴을 이해하고자 하는 다양한 시도들이 있습니다. 예를 들어 소득 분포 패턴이나, 지진 크기의

분포, 또 주가 낙폭의 분포 등이 모두 다 통계물리학자들의 관심사 중 하나이죠. 10여 년 전부터

우리나라 성씨 분포의 패턴에 관심을 갖게 됐습니다. 어떻게 하면 연구에 사용할 자료를 구할 수

있을까 하다가 몇 도시의 전화번호부, 그리고 당시 몸담고 있던 학교의 출석부도 구해 성씨 분포

를 연구해 보기도 했습니다. 그러다 문득 만약에 전산화한 족보가 있다면 과거에 그 집안에 시집

온 여자들의 성씨를 이용해 수백년 간의 성씨 분포도 연구할 수 있을 것이라는 아이디어가 생겼

던 겁니다. 사실 제가 했던 연구는 족보 연구라고 하기에는 무리가 있고, 족보 자료를 활용한 성

씨, 본관 연구라고 해야 맞습니다."

이번 연구팀이 다국적인데 어떻게 구성됐는지 궁금합니다. 우리 족보 기록이 해외 연구자의 눈에

는 어떻게 비치는지요?

 "이번 연구에선 이 박사와 옥스포드대학의 포터 교수, 두 분이 허브 역할을 했습니다. 이번 연

구는 저의 성씨 분포 연구를 알고 있던 이 박사님이 제가 정리해 갖고 있던 족보 자료를 이용해

공동연구를 하자고 먼저 제안해 이뤄졌습니다. 함께 연구한 외국인들에게 우리나라 족보 자료는

정말로 신기하게 보이는 것 같습니다. 특히, 우리 선조의 기록 문화가 잘 발달해 무려 몇백 년 전

에 만들어진 족보가 여전히 집안에서 대대로 이어지고 있다는 것이 놀라운 모양입니다. 외국의

경우에는 교회에 개인의 출생에 대한 오래 전 기록이 남아 있긴 하지만, 우리나라 족보처럼 한 집

안의 가계도가 아주 오랜 동안 꾸준히 기록되어 보전되는 예는 거의 없는 것으로 알고 있습니다."

■ 논문 초록 ■ 논문 초록

 사람의 이동성 연구는 기본적으로 중요하며 또한 크나큰 잠재적 가치를 지닌다. 예를 들어, 그

것은 효율적인 도시 계획을 촉진하며 유행병에 직면했을 때 방제 전략을 개선하는 데에 사용될

수 있다. 은행권의 흐름, 이동전화 기록, 수송 데이터와 같은 풍부한 데이터가 새로 발견되면 현

대인의 이동성에 나타난 특징을 파악하려는 시도가 폭발적으로 늘어나곤 한다. 불행하게도, 비교

할 만한 역사적 데이터의 부족 때문에 과거의 인구 이동성 패턴을 연구하는 데에는 더 큰 어려움

이 있다. 이 연구논문에서, 우리는 사람들의 장기적 이주에 대한 분석을 제시한다. 그런 장기 이

김범준 교수님

Page 57: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

http://scienceon.hani.co.kr/206375 2014. 10. 29. 12:30족보 빅데이터 연구, 국제물리학 저널에 실려

7/8페이지http://scienceon.hani.co.kr/?act=dispMediaPrintArticle&document_srl=206375

리나라의 성씨 분포를 살필 수 있다는 것, 그리고 그 결과 우리나라의 독특한 성씨 분포가 최소한

500여 년 전에도 비슷했다는 것을 주요 결론으로 제시했습니다. [3] 또다른 논문은 공저자인 스

웨덴의 민하겐 교수님의 통계적 모형을 이용해 과거 우리나라 성씨 분포를 이해하려는 시도였고

여기에서 수백 년 동안 그 통계적 특성이 변하지 않았음을 이용해 더 과거로 돌아가면 서기 500

년 즈음에는 김씨가 1만 명 정도였을 것으로 추론된다는 흥미로운 결과도 얻었습니다."

 [1] http://journals.aps.org/pre/abstract/10.1103/PhysRevE.76.046113

 [2] http://statphys.skku.ac.kr/Papers/jkps_kiet1.pdf

 [3] http://iopscience.iop.org/1367-2630/13/7/073036/

물리학자가 왜 족보 연구를 하게 되었는지요?

 "저와 이상훈 박사가 연구하는 통계물리학 분야에서는 우리사회에서 벌어지는 현상의 거시적

패턴을 이해하고자 하는 다양한 시도들이 있습니다. 예를 들어 소득 분포 패턴이나, 지진 크기의

분포, 또 주가 낙폭의 분포 등이 모두 다 통계물리학자들의 관심사 중 하나이죠. 10여 년 전부터

우리나라 성씨 분포의 패턴에 관심을 갖게 됐습니다. 어떻게 하면 연구에 사용할 자료를 구할 수

있을까 하다가 몇 도시의 전화번호부, 그리고 당시 몸담고 있던 학교의 출석부도 구해 성씨 분포

를 연구해 보기도 했습니다. 그러다 문득 만약에 전산화한 족보가 있다면 과거에 그 집안에 시집

온 여자들의 성씨를 이용해 수백년 간의 성씨 분포도 연구할 수 있을 것이라는 아이디어가 생겼

던 겁니다. 사실 제가 했던 연구는 족보 연구라고 하기에는 무리가 있고, 족보 자료를 활용한 성

씨, 본관 연구라고 해야 맞습니다."

이번 연구팀이 다국적인데 어떻게 구성됐는지 궁금합니다. 우리 족보 기록이 해외 연구자의 눈에

는 어떻게 비치는지요?

 "이번 연구에선 이 박사와 옥스포드대학의 포터 교수, 두 분이 허브 역할을 했습니다. 이번 연

구는 저의 성씨 분포 연구를 알고 있던 이 박사님이 제가 정리해 갖고 있던 족보 자료를 이용해

공동연구를 하자고 먼저 제안해 이뤄졌습니다. 함께 연구한 외국인들에게 우리나라 족보 자료는

정말로 신기하게 보이는 것 같습니다. 특히, 우리 선조의 기록 문화가 잘 발달해 무려 몇백 년 전

에 만들어진 족보가 여전히 집안에서 대대로 이어지고 있다는 것이 놀라운 모양입니다. 외국의

경우에는 교회에 개인의 출생에 대한 오래 전 기록이 남아 있긴 하지만, 우리나라 족보처럼 한 집

안의 가계도가 아주 오랜 동안 꾸준히 기록되어 보전되는 예는 거의 없는 것으로 알고 있습니다."

■ 논문 초록 ■ 논문 초록

 사람의 이동성 연구는 기본적으로 중요하며 또한 크나큰 잠재적 가치를 지닌다. 예를 들어, 그

것은 효율적인 도시 계획을 촉진하며 유행병에 직면했을 때 방제 전략을 개선하는 데에 사용될

수 있다. 은행권의 흐름, 이동전화 기록, 수송 데이터와 같은 풍부한 데이터가 새로 발견되면 현

대인의 이동성에 나타난 특징을 파악하려는 시도가 폭발적으로 늘어나곤 한다. 불행하게도, 비교

할 만한 역사적 데이터의 부족 때문에 과거의 인구 이동성 패턴을 연구하는 데에는 더 큰 어려움

이 있다. 이 연구논문에서, 우리는 사람들의 장기적 이주에 대한 분석을 제시한다. 그런 장기 이

김범준 교수님

Page 58: Matchmaker, Matchmaker, Make Me a Match: Migration of Populations via Marriages in the Past

https://sites.google.com/site/lshlj82/ for other works of mine

Thank you for your attention!!!Any question? Ask now, or contact me anytime later.

[email protected]