A study of marriage and population migration by clan (bon-gwan), based on jokbo (genealogy books) and census data

Page 1

Marriage and population migration by clan, based on jokbo (genealogy books) and census data

2014 Korean Society for Complex Systems Fall Conference, 29 November 2014

Sang Hoon Lee, R. Ffrancon, D. M. Abrams, Beom Jun Kim, M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Sang Hoon Lee, Department of Energy Science, Sungkyunkwan University, [email protected]

http://sites.google.com/site/lshlj82

Pages 2-3

https://twitter.com/AcademicsSay/status/524571824492150784

Pages 4-5

Gyeongju Kim ≠ Gimhae Kim

Korean family names (姓氏) are subdivided by bon-gwan (本貫), which indicates a clan's place of geographic origin.

Page 6

Hakseong Lee (鶴城 李) ≠ Jeonju Lee (全州 李) ≠ Pungdeok Lee (豊德 李)

Sang Hoon Lee (이상훈), Eun Lee (이은), Sungmin Lee (이성민), Mi Jin Lee (이민진)

Collaborators of Prof. Petter Holme

Pages 7-10

The two Korean researchers who took part in the study

Prof. Beom Jun Kim: Gimhae Kim clan. Sang Hoon Lee: Hakseong Lee clan.

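King Suro (수로왕)

King Suro of Garak (駕洛國 首露王)
First king of Garak
Personal name: Kim Suro (金首露)
Reign: 42 to 199
Queen: Heo Hwang-ok (許黃玉)
Mother: Jeonggyeon Moju
Predecessor: none (first monarch)
Successor: King Geodeung
Image: tomb of King Suro

King Suro (首露王, 42?[1] to 199; reigned 42 to 199), also called Kim Suro (金首露), was the founder of Garak and the progenitor of the Gimhae Kim clan. He is also known as Sureung (首陵) and Noejil Cheongye (惱窒靑裔).

In the 19th year of King Yuri of Silla (42 CE), six golden eggs (金卵) are said to have fallen from heaven onto Gwijibong (also Gujibong, 龜旨峰), north of Garak, and each became the king of one of the six Gaya states. Kim Suro was one of them, and he is the progenitor of the Gimhae Kim clan. The chieftains (首長), the nine gan (九刀干), chose him as king, and he founded a state and named it Garak.

According to the Samguk Sagi and the Garakgukgi chapter of the Samguk Yusa, the first to hatch from the six golden eggs that came down from heaven, a boy 9 cheok tall (about 2 m[2]), became King Suro. The Silla scholar Choe Chiwon, however, wrote that he was the son of a woman named Jeonggyeon Moju. Legend says that he lived more than 150 years, but this is not credible.

Life

According to the Samguk Sagi and the Garakgukgi chapter of the Samguk Yusa, the people of Guya in Byeonhan lived divided into separate villages. In the third month of the year 42, the leaders of Guya, who were waiting for a chief, heard a voice from heaven announcing "I am sending down your king" and telling them to sing "Turtle, turtle, stick out your head. If you do not, we will roast you and eat you." Having heard this, the nine gan (干) of Garak and several hundred followers climbed Gujibong (龜旨峰) in Gimhae, offered rites to heaven, danced, and sang those words, the Gujiga (龜旨歌). As more people joined and the singing grew louder, a light shone from heaven and a golden chest wrapped in red cloth came down; inside were six round golden eggs. Twelve days later boys hatched from the eggs, and the one who was 9 cheok tall and hatched first was Suro. The chieftains made him king of Garak, the capital and the largest of the six Gaya states, and the people honored him as their king. The other boys became the kings of the remaining five Gaya states.

According to the Seogijeongjeon (釋利貞傳) by the Silla scholar Choe Chiwon, Jeonggyeon Moju (正見母主), the goddess of Mount Gaya, was moved by the heavenly god Ibiga (夷毗訶) and bore two sons, Noejil Juil (惱窒朱日) and Noejil Cheongye (惱窒靑裔). Noejil Cheongye (King Suro), who founded Geumgwan Gaya, took after the goddess and had a pale, slender face. Noejil Juil (King Ijinasi), who founded Daegaya, took after Ibiga and had a face as round and red as the sun. This contrasts with the legend of the brothers in the Garakgukgi, which centers on Geumgwan Gaya and makes King Suro the eldest.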

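Yi Ye (1373)

Yi Ye (李藝, 1373 to 1445) was a military official and diplomat of early Joseon. His bon-gwan was Hakseong (鶴城), his pen name was Hakpa (鶴坡), and his posthumous title was Chungsuk (忠肅). He is the progenitor of the Hakseong Lee clan. He began his career as an ajeon, a local clerk belonging to the jungin (middle) class. He later rose as high as Dongji Jungchuwonsa, a post of senior second rank (정2품), and left a posthumous collection, the Hakpa Silgi (鶴坡實紀).

"Yi Ye was a local clerk of Ulsan (蔚山群吏), and his post was gigwan (記官)[1]." (Annals of the Joseon Dynasty)

Life

On 31 January 1397, three thousand Japanese pirates raided Uljupo[2] and carried off the county magistrate Yi Eun (李殷) and others.[3] While the other officials all fled and hid, Yi Ye volunteered to follow the magistrate and attended him to the end, which won over the pirates. Through the mediation of an envoy later sent by Joseon, Yi Ye returned safely to Joseon with the magistrate in February 1397. The court commended his loyalty, exempted him from his clerk's service (役), and gave him an official post. With this incident Yi Ye left the jungin clerk class and set out on the path of a professional diplomat in the yangban gentry. To look for his mother, who had been abducted by pirates when he was eight, he petitioned the court and went to Tsushima in 1400 in the retinue of the reciprocal envoy (回禮使) Yun Myeong (尹銘), but he did not find her.

He first led a mission in 1401 (the first year of King Taejong), when he was sent to Iki Island as bobingsa (return-courtesy envoy). In 1406 he was sent as reciprocal envoy to Japan (日本回禮官) and brought back some 70 abducted men and women.[4] On 27 January 1416 he was dispatched to the Ryukyu Kingdom[5] to bring back Koreans who had been captured by the Japanese and sold there.[6] He returned on 23 July of the same year with 44 people.[7]

On 24 April 1418 (the 18th year of Taejong), after the death of Jong Jeongmu, the governor of Tsushima, he was sent to Tsushima as condolence envoy and presented rice, beans, and paper as funeral gifts in generous recognition of the governor's loyalty. Yi Ye was sent specially because Jong Jeongmu, during his rule, had restrained the pirates and kept them from raiding the border.[8]

"I cannot send someone who does not know the matter, so I order you to go; do not think it a burden." (Sejong Sillok: the king's words to Yi Ye, on presenting him with a hat and shoes as he left for Japan as envoy in the 8th year of Sejong)

In 1443 (the 25th year of Sejong), Japanese raiders plundered the border and carried off people and goods. When the state decided to send someone to recover them, Yi Ye volunteered and was dispatched as inspector for Tsushima (對馬島體察使). This was his last mission (使行). Over the 44 years from 1400, when he was 28, to 1443, when he was 71, he was sent to Japan[9] as a royal envoy more than 40 times. Of these, 13 missions are recorded in the Annals of the Joseon Dynasty in enough detail for their content to be known[10]. The Annals record that Yi Ye repatriated 667 Korean captives over those 44 years.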

Pages 11-12

Is that really so?

Fig. 1. Examples of (a) ergodic and (b) non-ergodic clans. We color the regions of South Korea based on the percentage of the population made up of members of the clan in 2000. We use arrows to indicate the origins of the two clans: Gimhae on the left and Ulsan ("Hakseong" is the old name of the city) on the right. In this map, we use the 2010 administrative boundaries [28]. (See the SI for discussions of data sets and data cleaning.)

Fig. 2. (a) Scatter plot of the number of clan entries in jokbo 1 versus the corresponding centroid in 2000 using the gravitational-model flux with parameters $\alpha = 1$ and $\gamma = 0$. We compute the line using a linear regression to find the fitting parameter, $a_G \approx 4.2(2) \times 10^{-10}$ with a 95% confidence interval, to satisfy the expression $N_i = a_G G_{ij}$, where $G_{ij}$ is the gravity-model flux and $N_i$ is the total number of entries from clan $i$ in the jokbo. (b) The same clan entries compared to the radiation model. We compute the line using a linear regression to find the fitting parameter, $a_R \approx 0.049(2)$, to satisfy the expression $N_i = a_R R_{ij}$, where $R_{ij}$ is the radiation-model flux and $N_i$ is the total number of entries from clan $i$ in the jokbo. In both panels, we color the points using the number of administrative regions occupied by the corresponding clans, which we show in Figs. 4(a) and (b). The red markers (outliers) on both panels correspond to the clan corresponding to jokbo 1 (the case $i = j$).

[Map panels: share of each administrative region's population that belongs to the clan, plotted against longitude and latitude. Left: Gimhae Kim; right: Hakseong Lee. The clan origins, Gimhae and Ulsan, are marked.]

Distribution of each clan over administrative regions in the Population and Housing Census (1985 and 2000, the only two years in which the regional distribution of bon-gwan was recorded). Source: Statistics Korea, http://kosis.kr/

"Ergodic" (nationwide) vs "non-ergodic" (local) clans

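Page 13

Jokbo (族譜): a book that systematically sets out a family's lineage and blood relations, centered on the paternal line, in an easily understood form (Korean Wikipedia)

Bon-gwan and birth year of the daughters-in-law: "auxiliary" data?!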

Page 14

[Figure 6 plot: $P(k)$ versus $k$ on logarithmic axes; series labeled 1600-1630, 1900-1990, and Census.]

Figure 6. Comparison between the actual family books (points) and RGF predictions (lines).

[Figure 7 plot: $\gamma$ versus $M$.]

Figure 7. Power-law exponent as a function of $M$. The solid line was obtained by analyzing the family books from 1900–1990, and the dotted line is connected with the census data in 2000.

[...] $M_{\mathrm{tot}} = 165\,020$ women each having one family name out of $N_{\mathrm{tot}} = 194$ and where $k_{\max} = 32\,316$ are named Kim. These three numbers $M_{\mathrm{tot}}$, $N_{\mathrm{tot}}$ and $k_{\max}$ uniquely determine $P_M(k)$ within the RGF model, as explained in section 3 [6]. The middle full curve in figure 6 gives the predicted size distribution and the pluses denote the actual (binned) data points. The agreement between the RGF prediction and the data is very good, in particular in view of the fact that the prediction is based solely on the three numbers $M_{\mathrm{tot}}$, $N_{\mathrm{tot}}$ and $k_{\max}$. The prediction for the exponent $\gamma$ in (1) is $\gamma = 1.12$. As explained in section 3, the RGF model allows you to predict how the $P_M(k)$ for either a smaller $m < M$ or larger $m > M$. Figure 7 displays the predicted change of $\gamma$ when starting from the data given for the period 1900–1990. The solid curve gives the change when $m$ is decreasing and the dotted line when it is increasing. The fact that $\gamma$ changes with the size of the data set is a fundamental feature of the RGF model and distinguishes it from the usual growth models, which in general give scale-invariant and hence size-independent $\gamma$ [6]. The left full curve in figure 6 is the prediction for the 1600–1630 data only using the data for 1900–1990 and the number $m = 384$, which is the number of women getting married into the ten families during the period 1600–1630. The actual name-frequency distribution for these women is given by the crosses and the agreement is again quite good. In particular, note that the data are indeed consistent with the slightly steeper slope for smaller $k$ caused by a slightly larger $\gamma = 1.22$ (compare figure 7). The rightmost curve in figure 6, in the same way, gives the prediction based on only using the data for the married women in 1900–1990 and the total population size in the year 2000 given by $m = 4.6 \times 10^{7}$. The census data from the year 2000 are also plotted and the agreement between the prediction and the data is again very good. This time the $\gamma = 1.07$ is [...]

New Journal of Physics 13 (2011) 073036 (http://www.njp.org/)

Seung Ki Baek, P. Minnhagen, Beom Jun Kim, New J. Phys. 13, 073036 (2011) and references therein.

Page 15

The real motivation for this study: the bon-gwan and birth years of the daughters-in-law carry geographic and temporal information. Isn't that data far too valuable to look only at its frequency distribution (how many there are of each) and move on? :)

Pages 16-17

Inferring past population movement through marriage from the marriage records in jokbo

Page 18

Two models that describe human migration

Gravity model flux:

$$G_{ij} = A \, \frac{m_i^{\alpha} \, m_j^{\beta}}{r_{ij}^{\gamma}}$$

where $m_i$ is the population at $i$, $r_{ij}$ is the distance between $i$ and $j$, and the exponent $\gamma$ sets how strongly geography affects the flux.

J. Q. Stewart, W. Warntz, J. Regional Sci. 1, 99 (1958); Woo-Sung Jung, F. Wang, H. E. Stanley, EPL 81, 48005 (2008).

Radiation model flux ($i \to j$):

$$R_{ij} = \frac{R_i}{1 - m_i/N} \, \frac{m_i \, m_j}{(m_i + s_{ij})(m_i + m_j + s_{ij})}$$

where $R_i$ is the total population leaving $i$, $s_{ij}$ is the population inside the circle of radius $r_{ij}$ centered at $i$ (excluding the populations of the source $i$ and the destination $j$), and $N$ is the total population.

F. Simini, M. C. González, A. Maritan, A.-L. Barabási, Nature 484, 96 (2012); A. P. Masucci, J. Serras, A. Johansson, M. Batty, Phys. Rev. E 88, 022812 (2013).
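As a rough illustration of the two flux formulas on this slide, the following Python sketch evaluates them for given inputs. The function names, the default exponents, and the example numbers are assumptions made for illustration; they are not taken from the talk or the paper.

```python
def gravity_flux(m_i, m_j, r_ij, A=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Gravity-model flux G_ij = A * m_i^alpha * m_j^beta / r_ij^gamma.

    The default exponents are placeholders, not fitted values.
    """
    if gamma == 0:
        # No distance dependence; this also keeps the case r_ij == 0 (i == j) well defined.
        return A * m_i ** alpha * m_j ** beta
    return A * m_i ** alpha * m_j ** beta / r_ij ** gamma


def radiation_flux(R_i, m_i, m_j, s_ij, N):
    """Radiation-model flux with the finite-size factor 1 / (1 - m_i / N).

    R_i: total population leaving i; s_ij: population inside the circle of
    radius r_ij centered at i, excluding m_i and m_j; N: total population.
    """
    return (R_i / (1.0 - m_i / N)) * (m_i * m_j) / ((m_i + s_ij) * (m_i + m_j + s_ij))


# Hypothetical numbers, only to show the call pattern.
example_gravity = gravity_flux(m_i=100, m_j=200, r_ij=10.0)
example_radiation = radiation_flux(R_i=10, m_i=100, m_j=100, s_ij=0, N=1000)
```

Handling `gamma == 0` separately means the flux can still be evaluated when source and destination coincide and the distance is zero.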

Pages 19-20

Comparison with jokbo data

Year-2000 census data (excerpt for the surname 이 (李, Lee); each entry gives the bon-gwan followed by its population count):

진주 12636, 진해 490, 차성 2173, 창녕 4132, 창원 719, 천안 2297,
철성 4871, 청도 983, 청송 814, 청안 13549, 청양 743, 청주 34756,
청평 656, 청해 12002, 춘천 372, 충남 995, 충주 4022, 칠성 1209,
태안 4084, 태원 670, 토산 491, 통진 227, 평산 3394, 평양 579,
평창 65945, 평택 1099, 평해 205, 풍천 731, 하동 1052, 하빈 15058,
하산 828, 하음 431, 학성 20964, 한산 136615, 한성 1342, 한양 1096,
함경 792, 함안 37597, 함양 1851, 함평 125419, 함풍 6413, 함흥 786,
합천 115462, 해남 878, 해령 650, 해주 2064, 행주 870, 헌양 786,
홍성 1163, 홍주 14897, 홍천 566, 화산 1775, 화평 1498, 황주 291

Number of marriages recorded in the jokbo

Distance between the centroids of the two clans in the 2000 census

Bride's clan: $i$

Groom's clan: $j$ (the same for every entry in a single jokbo)

Radiation model flux: $R_{ij} = \frac{R_i}{1 - m_i/N} \, \frac{m_i \, m_j}{(m_i + s_{ij})(m_i + m_j + s_{ij})}$

Gravity model flux: $G_{ij} = A \, \frac{m_i^{\alpha} \, m_j^{\beta}}{r_{ij}^{\gamma}}$

Distribution of each clan over administrative regions in the Population and Housing Census (1985 and 2000). Source: Statistics Korea, http://kosis.kr/
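The distance between two clan centroids could be computed along the following lines. This is only a sketch: it assumes that each clan's census record is available as (latitude, longitude, population) triples per administrative region, that a population-weighted average of coordinates is an adequate centroid for an area the size of South Korea, and that great-circle distance is the intended distance. None of these choices is confirmed by the slides, and the function names are hypothetical.

```python
import math


def centroid(regions):
    """Population-weighted mean position of (lat_deg, lon_deg, population) triples."""
    total = sum(pop for _, _, pop in regions)
    lat = sum(la * pop for la, _, pop in regions) / total
    lon = sum(lo * pop for _, lo, pop in regions) / total
    return lat, lon


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (lat_deg, lon_deg) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


# Hypothetical regional counts for two clans (not census values).
clan_i = [(35.2, 128.9, 300), (37.5, 127.0, 100)]
clan_j = [(35.5, 129.3, 50)]
r_ij = haversine_km(centroid(clan_i), centroid(clan_j))
```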

Page 21

Results of applying the gravity and radiation models

Linear fit (gravity model): marriage flux $N_i \propto G_{ij}$, where $G_{ij}$ is the marriage flux predicted by the gravity model and $N_i$ is the number of marriages that actually appear in the jokbo.

Linear fit (radiation model): marriage flux $N_i \propto R_{ij}$, where $R_{ij}$ is the marriage flux predicted by the radiation model.

The fitted gravity exponents are $\alpha \approx 1.0749$ and $\gamma \approx -0.0349$, so we use $\alpha = 1$ and $\gamma = 0$ (to include the case $i = j$).

Groom's clan $j$: the same for every entry in a single jokbo (a constant).

Radiation model flux: $R_{ij} = \frac{R_i}{1 - m_i/N} \, \frac{m_i \, m_j}{(m_i + s_{ij})(m_i + m_j + s_{ij})}$

Gravity model flux: $G_{ij} = A \, \frac{m_i^{\alpha} \, m_j^{\beta}}{r_{ij}^{\gamma}}$
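The linear fits on this slide have the form $N_i = a \, G_{ij}$ (or $N_i = a \, R_{ij}$) with a single coefficient. A minimal least-squares sketch for such a one-parameter fit is given below. It assumes a line through the origin, which the form of the expression suggests but which the slides do not state, and the function name is hypothetical.

```python
def fit_through_origin(flux, counts):
    """Least-squares slope a for counts ≈ a * flux, with no intercept.

    Minimizing sum((counts_k - a * flux_k)^2) over a gives
    a = sum(flux_k * counts_k) / sum(flux_k^2).
    """
    sxx = sum(x * x for x in flux)
    if sxx == 0:
        raise ValueError("all model fluxes are zero; slope is undefined")
    return sum(x * y for x, y in zip(flux, counts)) / sxx


# Hypothetical model fluxes and jokbo counts, only to show the call pattern.
a = fit_through_origin([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```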

Page 22

Results of applying the gravity and radiation models (continued)

With $\gamma = 0$ the flux does not depend on distance, and $N_i \propto m_i^{\alpha}$. Why? [If the clan that the jokbo belongs to is "ergodic" (nationwide), a groom can be anywhere in the country, so marriages appear to take place regardless of distance!]

Page 23

Results of applying the gravity and radiation models (continued)

Number of brides from the same clan as the groom (the case $i = j$): the data show the deep-rooted custom of prohibiting marriage between people of the same surname and the same bon-gwan (the legal provision was removed in 2005).

Page 24

Results of applying the gravity and radiation models (continued)

A simple quantity that measures ergodicity (locality): the number of administrative regions in which a clan has at least one member.

"Ergodic" (nationwide) vs "non-ergodic" (local) clans
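The ergodicity measure named on this slide, the number of administrative regions in which a clan has at least one member, is simple to compute once the census counts are tabulated. A minimal sketch, assuming a hypothetical dictionary that maps region names to the clan's population there:

```python
def regions_occupied(counts_by_region):
    """Number of administrative regions where the clan has at least one member."""
    return sum(1 for count in counts_by_region.values() if count > 0)


# Hypothetical counts, not census values.
nationwide_clan = {"region A": 1200, "region B": 800, "region C": 15}
local_clan = {"region A": 0, "region B": 430, "region C": 0}
spread = (regions_occupied(nationwide_clan), regions_occupied(local_clan))
```

A clan present in nearly every region would count as "ergodic" in the sense used here, and a clan concentrated in a few regions as "non-ergodic".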

Page 25

"Ergodic" (nationwide) vs "non-ergodic" (local) clans

[Figure 3 panels: probability distribution of the number of regions occupied, over (a) clans and (b) individuals; probability distribution of the radius of gyration (km), over (c) clans and (d) individuals.]

FIG. 3. Distribution of the number of different administrative regions occupied by clans. (a) Probability distribution of the number of different administrative regions occupied by a Korean clan in the year 2000. (b) Probability distribution of the number of different administrative regions occupied by the clan of a Korean individual selected uniformly at random in the year 2000. The difference between this panel and the previous one arises from the fact that clans with larger populations tend to occupy more administrative regions. Note that the rightmost bar has a height of 0.17 but has been truncated for visual presentation. (c) Probability distribution of radii of gyration (in km) for clans in 2000. (d) Probability distribution of radii of gyration (in km) for clans of a Korean individual selected uniformly at random in 2000. The difference between this panel and the previous one arises from the fact that clans with larger populations tend to occupy more administrative regions. Solid curves are kernel density estimates (from Matlab R2011a's ksdensity function with a Gaussian smoothing kernel of width 5).
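Panels (c) and (d) use the radius of gyration of a clan. The excerpt does not give its definition, so the sketch below assumes the usual one: the population-weighted root-mean-square distance of a clan's members from the clan centroid. It also assumes that region positions are already given as planar coordinates in km. The function name and input layout are hypothetical.

```python
import math


def radius_of_gyration(regions):
    """Population-weighted RMS distance (km) from the centroid.

    regions: iterable of (x_km, y_km, population) triples in planar coordinates.
    """
    regions = list(regions)
    total = sum(pop for _, _, pop in regions)
    cx = sum(x * pop for x, _, pop in regions) / total
    cy = sum(y * pop for _, y, pop in regions) / total
    mean_sq = sum(pop * ((x - cx) ** 2 + (y - cy) ** 2) for x, y, pop in regions) / total
    return math.sqrt(mean_sq)


# Two equally populated regions 2 km apart: the centroid lies midway between them.
r_g = radius_of_gyration([(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)])
```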

[...] have jokbo are fairly ergodic, so the variables associated with the $j$ indices (i.e., the grooms) in Eqs. (1) and (2) have already lost much of their geographical precision, which is consistent both with the values $\alpha = 0$ and [symbol illegible] $= 0$ (the population product model). Again see the scatter plots in Fig. 2, in which we color each clan according to the number of different administrative regions that it occupies. Note that the three different ergodicity diagnostics are only weakly correlated (see Fig. I2).

Our observations of clan bimodality for Korea contrast sharply with our observations for family names in the Czech Republic, where most family names appear to be non-ergodic [25] (see Fig. I3). One possible explanation of the ubiquity of ergodic Korean names is the historical fact that many families from the lower social classes adopted (or even purchased) names of noble clans from the upper classes near the end of the Joseon dynasty (19th–20th centuries) [20, 52]. At the time, Korean society was very unstable, and this process might have, in essence, introduced a preferential growth of ergodic names.

In Fig. 4, we show the distribution of the diffusion constants that we computed by fitting to Eq. (3). Some of the values are negative, which presumably arises from finite-size effects in ergodic clans as well as basic limitations in estimating diffusion constants using only a pair of nearby years. In Fig. I4, we show the correlations between the diffusion constants and other measures.

[Figure 4 plot: probability distribution of the diffusion constant (km²/year).]

FIG. 4. Distribution of estimated diffusion constants (in km²/year) computed using 1985 and 2000 census data and Eq. (3). The solid curve is a kernel density estimate (from Matlab R2011a's ksdensity function with default smoothing). See the Appendix for details of the calculation of diffusion constants.

C. Convection in Addition to Diffusion as Another Mechanism for Migration

The assumption that human populations simply diffuse is a gross oversimplification of reality. We will thus consider the intriguing (but still grossly oversimplified) possibility of simultaneous diffusive and convective (bulk) transport. In the past century, a dramatic movement from rural to urban areas has caused Seoul's population to increase by a factor of more than 50, tremendously outpacing Korea's population growth as a whole [53]. This suggests the presence of a strong attractor or "sink" for the bulk flow of population into Seoul, as has been discussed in rural-urban labor migration studies [54]. The density-equalizing population cartogram [55] in Fig. I5 clearly demonstrates the rapid growth of Seoul and its surroundings between 1970 and 2010.

If convection (i.e., bulk flow) directed towards Seoul has indeed occurred throughout Korea while clans were simultaneously diffusing from their points of origin, then one ought to be able to detect a signature of such a flow. In Fig. 5(a), we show what we believe is such a signature: we observe that the fraction of ergodic clans increases with the distance between Seoul and a clan's place of origin. This would be unexpected for a purely diffusive system or, indeed, in any other simple model that excludes convective transport. By allowing for bulk flow, we expect to observe that a clan's members preferentially occupy territory in the flow path that is located geographically between the clan's starting point and Seoul. For clans that start closer to Seoul, this path is short; for those that start farther away, the longer flow path ought [...]

Distribution of the number of administrative regions in which a clan has at least one member

Plot labels: number of administrative regions occupied by the clan; probability density (by clan); non-ergodic (local); ergodic (nationwide)

Page 26: 족보, 인구조사 자료를 기반으로 한 본관별 혼인과 인구 이동에 대한 연구

“에르고딕”(전국구) vs “비에르고딕”(지역구) 본관

[Figure 3: four histogram panels, each with a vertical axis from 0 to 0.02. (a) Probability distribution (clans) versus number of regions occupied (0 to 200). (b) Probability distribution (individuals) versus number of regions occupied (0 to 200). (c) Probability distribution (clans) versus radius of gyration (km, 0 to 250). (d) Probability distribution (individuals) versus radius of gyration (km, 0 to 250).]

FIG. 3. Distribution of the number of different administrative regions occupied by clans. (a) Probability distribution of the number of different administrative regions occupied by a Korean clan in the year 2000. (b) Probability distribution of the number of different administrative regions occupied by the clan of a Korean individual selected uniformly at random in the year 2000. The difference between this panel and the previous one arises from the fact that clans with larger populations tend to occupy more administrative regions. Note that the rightmost bar has a height of 0.17 but has been truncated for visual presentation. (c) Probability distribution of radii of gyration (in km) for clans in 2000. (d) Probability distribution of radii of gyration (in km) for clans of a Korean individual selected uniformly at random in 2000. The difference between this panel and the previous one arises from the fact that clans with larger populations tend to occupy more administrative regions. Solid curves are kernel density estimates (from Matlab R2011a’s ksdensity function with a Gaussian smoothing kernel of width 5).
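The two ergodicity diagnostics in the Fig. 3 caption can be illustrated with a minimal sketch. It assumes that the radius of gyration is the population-weighted root-mean-square distance of a clan's members from the clan centroid and that region positions are planar coordinates in km. The function name `clan_spread` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def clan_spread(populations, coords_km):
    """Return (regions occupied, centroid, radius of gyration in km) for one clan.

    Assumption: the radius of gyration is the population-weighted RMS distance
    from the population centroid, with planar (x, y) coordinates in km.
    """
    populations = np.asarray(populations, dtype=float)
    coords_km = np.asarray(coords_km, dtype=float)
    # A region counts as occupied if at least one clan member lives there.
    n_occupied = int(np.count_nonzero(populations > 0))
    total = populations.sum()
    centroid = (populations[:, None] * coords_km).sum(axis=0) / total
    sq_dist = ((coords_km - centroid) ** 2).sum(axis=1)
    r_g = float(np.sqrt((populations * sq_dist).sum() / total))
    return n_occupied, centroid, r_g

# Hypothetical toy data: four regions on a 10 km square, one of them empty.
pops = [100, 100, 0, 200]
xy = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
n, c, rg = clan_spread(pops, xy)
print(n, c, round(rg, 2))
```

Applying such a function to every clan and histogramming the results would give distributions of the kind shown in panels (a) and (c); weighting each clan by its population would correspond to panels (b) and (d).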

have jokbo are fairly ergodic, so the variables associated with the $j$ indices (i.e., the grooms) in Eqs. (1) and (2) have already lost much of their geographical precision, which is consistent both with the values $\alpha = 0$ and $\beta = 0$ (the population product model). Again see the scatter plots in Fig. 2, in which we color each clan according to the number of different administrative regions that it occupies. Note that the three different ergodicity diagnostics are only weakly correlated (see Fig. I2).

Our observations of clan bimodality for Korea contrast sharply with our observations for family names in the Czech Republic, where most family names appear to be non-ergodic [25] (see Fig. I3). One possible explanation of the ubiquity of ergodic Korean names is the historical fact that many families from the lower social classes adopted (or even purchased) names of noble clans from the upper classes near the end of the Joseon dynasty (19th–20th centuries) [20, 52]. At the time, Korean society was very unstable, and this process might have, in essence, introduced a preferential growth of ergodic names.

In Fig. 4, we show the distribution of the diffusion constants that we computed by fitting to Eq. (3). Some of the values are negative, which presumably arises from finite-size effects in ergodic clans as well as basic limitations in estimating diffusion constants using only a pair of nearby years. In Fig. I4, we show the correlations between the diffusion constants and other measures.

[Figure 4: histogram. Horizontal axis: diffusion constant (km$^2$/year), from −5 to 20. Vertical axis: probability distribution, from 0 to 0.7.]

FIG. 4. Distribution of estimated diffusion constants (in km$^2$/year) computed using 1985 and 2000 census data and Eq. (3). The solid curve is a kernel density estimate (from Matlab R2011a’s ksdensity function with default smoothing). See the Appendix for details of the calculation of diffusion constants.

C. Convection in Addition to Diffusion as Another Mechanism for Migration

The assumption that human populations simply diffuse is a gross oversimplification of reality. We will thus consider the intriguing (but still grossly oversimplified) possibility of simultaneous diffusive and convective (bulk) transport. In the past century, a dramatic movement from rural to urban areas has caused Seoul’s population to increase by a factor of more than 50, tremendously outpacing Korea’s population growth as a whole [53]. This suggests the presence of a strong attractor or “sink” for the bulk flow of population into Seoul, as has been discussed in rural-urban labor migration studies [54]. The density-equalizing population cartogram [55] in Fig. I5 clearly demonstrates the rapid growth of Seoul and its surroundings between 1970 and 2010.

If convection (i.e., bulk flow) directed towards Seoul has indeed occurred throughout Korea while clans were simultaneously diffusing from their points of origin, then one ought to be able to detect a signature of such a flow. In Fig. 5(a), we show what we believe is such a signature: we observe that the fraction of ergodic clans increases with the distance between Seoul and a clan’s place of origin. This would be unexpected for a purely diffusive system or, indeed, in any other simple model that excludes convective transport. By allowing for bulk flow, we expect to observe that a clan’s members preferentially occupy territory in the flow path that is located geographically between the clan’s starting point and Seoul. For clans that start closer to Seoul, this path is short; for those that start farther away, the longer flow path ought


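Non-ergodic (local)

Number of administrative regions occupied by a clan

Probability density (per clan)

Number of administrative regions occupied by a surname

Probability density (per surname)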

[Figure I2: three scatter plots. (a) Distance moved (km, 0 to 500) versus number of regions occupied (0 to 200). (b) Distance moved (km, 0 to 500) versus radius of gyration (km, 0 to 250). (c) Radius of gyration (km, 0 to 250) versus number of regions occupied (0 to 200).]

FIG. I2. Scatter plots of (a) the distance between the clan-origin location and the population centroid versus the number of administrative regions, (b) the distance between the clan-origin location and the population centroid versus the radius of gyration, and (c) the radius of gyration versus the number of administrative regions. The corresponding Pearson correlation values are (a) $r \approx 0.20$ (from 3481 clans that include all of the required information; the p-value is $p \approx 5.7 \times 10^{-34}$), (b) $r \approx 0.066$ (from 3481 clans that include all of the required information; the p-value is $p \approx 9.4 \times 10^{-5}$), and (c) $r \approx -0.26$ (from 3481 clans; the p-value is $p \approx 1.8 \times 10^{-53}$). Note that correlations over limited ranges may be different and significant: for example, in panel (c), the two metrics for ergodicity are significantly positively correlated when $r_g < 50$ km ($r \approx 0.39$, $p \approx 2.3 \times 10^{-4}$).

Surnames, Geoforum 42, 506 (2011).

[25] J. Novotný and J. A. Cheshire, The Surname Space of the Czech Republic: Examining Population Structure by Network Analysis of Spatial Co-Occurrence of Surnames, PLOS ONE 7, e48568 (2012).

[26] An original jokbo usually includes more details about clans, but we only use information that we mention explicitly.

[27] Google Maps, https://developers.google.com/maps/.

[28] In fact, it is known that clans that originated in the southern part of Korea have been more abundant in recent Korean history (including the period that is spanned by our data set) [20].

[29] Korean Statistical Information Service, http://kosis.kr/ (Korean version) and http://kosis.kr/eng/ (English version).

[30] Statistical Geographic Information Service (통계지리정보서비스 in Korean), http://sgis.kostat.go.kr/ (Korean version; no English version available).

[31] J. P. Sethna, Statistical Mechanics: Entropy, Order Parameters and Complexity (Oxford University Press, Oxford, United Kingdom, 2006).

[32] J. van Lith, Ergodic Theory, Interpretations of Probability and the Foundations of Statistical Mechanics, Stud. Hist. Philos. Sci. 32, 581 (2001).

[33] Because we only have ten family books, our data allows only ten distinct possibilities for “clan j” (the family into which a woman marries).

[34] S. Cho, Chapter 8. Traditional Korean Culture in Cultural and Ethnic Diversity: A Guide for Genetics Professionals eds Fisher NL (The Johns Hopkins University Press, Maryland, United States, 1996).

[35] F. Simini, M. C. González, A. Maritan, and A.-L. Barabási, A Universal Model for Mobility and Migration Patterns, Nature (London) 484, 96 (2012).

[36] A. P. Masucci, J. Serras, A. Johansson, and M. Batty, Gravity versus Radiation Models: On the Importance of Scale and Heterogeneity in Commuting Flows, Phys. Rev. E 88, 022812

[Figure I3: four histogram panels, each with a vertical axis from 0 to 0.02. (a) Probability distribution (clans) versus number of regions occupied (0 to 206). (b) Probability distribution (individuals) versus number of regions occupied (0 to 206). (c) Probability distribution (clans) versus radius of gyration (km, 0 to 250). (d) Probability distribution (individuals) versus radius of gyration (km, 0 to 250).]

FIG. I3. (a) Probability distribution of the number of different administrative regions occupied by a Czech family name in 2009. Note that the leftmost two bars have heights of 0.17 and 0.03 but have been truncated for visual presentation. (This data was initially analysed in Ref. [25].) (b) Probability distribution of the number of different administrative regions occupied by the clan of a Czech individual selected uniformly at random in 2009. The difference between this panel and the previous one arises from the fact that clans with larger populations tend to occupy more administrative regions. (c) Probability distribution of radii of gyration (in km) of Czech family names in 2009. Note that the leftmost bar has a height of 0.11 but has been truncated for visual presentation. (d) Probability distribution of radii of gyration (in km) of Czech family names of a Czech individual selected uniformly at random in 2009. The difference between this panel and the previous one arises from the fact that clans with larger populations tend to occupy more administrative regions. Observe that the distributions in panels (a) and (b) are starkly different from the distributions in panels (a) and (b) from Fig. 3. Solid curves are kernel density estimates (from Matlab R2011a’s ksdensity function with a Gaussian smoothing kernel of width 5).

Distribution of Czech surnames

Non-ergodic (local)

Data source: J. Novotný and J. A. Cheshire, PLOS ONE 7, e48568 (2012).

then interpreted as a result of the subsequent resettlement and industrialization led immigration into these areas, but also of some state policies that have contributed to the spatial concentrations (and often also segregations) of Roma minority groups [23].

An intriguing exception to these explanations is a typical Czech surname Vlcek that can also be found in Figure 6 because of its significant revealed relatedness with Wolf. From all of the surnames considered, the name Wolf has been found as the nearest neighbour of Vlcek, with the 56.7% probability that one of these surnames concentrates in the region where another one is concentrated. The high co-occurrence of these two surnames in the identical regions seems to be attributable to their common meaning: Vlcek literally means “small Wolf” in the Czech language. The bi-lingual naming practices or secular name transformations taking place in these historically multi-ethnic regions (German and Czech) offers the most likely explanation for such commonalities.

Analysis of co-occurrence in municipalities

In the second stage of our analysis we examined the co-occurrence of Czech surnames at the finest spatial level of 6,244 municipalities. We began with the calculation of the pairwise indices of revealed relatedness ($D_{i,j,mun}$) among 5,660 surnames selected on the basis of the highest revealed relatedness at more aggregate spatial level. This sample of surnames covers almost a half of the Czech male population. Given the significantly higher number of spatial units considered for this second stage of our analysis, the values of $D_{i,j,mun}$ are generally lower than $D_{i,j,reg}$ in the first stage which focused on co-occurrence in 206 micro-regions only. At the same time, the size distribution of these second stage results is even more skewed to the right; the maximum $D_{i,j,mun}$ (from the total of more than 32 million of observations) corresponds to 0.687, while only 0.011% of all observations exceed 50% of the maximum value. These differences between the first and second stage results are understandable and go hand in hand with the expectation that the surname network based on the municipality level calculations will be more fragmented.

This has been confirmed by the fact that a majority of the most significant $D_{i,j,mun}$ proximity observations occur among relatively rare surnames that are typically concentrated in a few nearby municipalities. This is especially the case of Silesian surnames that account for almost all $D_{i,j,mun}$ observations at the very top of the distribution of results. As such, in order to get a reasonable network representation, we again had to impose some restrictions in relation to the minimal size of surnames shown as nodes and the strength of links between them. After applying the criteria from the previous section, we found the frequency of at least 150 bearers and the links determined by $D_{i,j,mun} > 0.23$ to be optimal. The surname network based on these parameters and generated by a weighted force-directed algorithm is depicted in Figure 7 (Figure S5 depicts a high resolution version with labels of individual surnames and the size of nodes scaled by their population size).

In general, the second stage or municipality level surname network has reproduced the macro-division of the Czech surname space identified in the first stage and described above. The proportions between the sizes of the main clusters are however different with the previously mentioned dominance of the dense group of Silesian surnames (B2). Regarding Moravian surnames, again the commonality of verb-derived surnames emerges, as they form the majority of names in the B1 area of the network. The Bohemian part of the surname space (A) is structured into three main groups of surnames. The A1 cluster comprises some of the most frequent surnames and those prevalent across most of Bohemian regions, whilst the separation from the secondary cluster (A2) is hardly discernible. By contrast, two other core areas are well recognizable and represent northern and eastern Bohemian names (A3) more specifically and surnames concentrated mainly in municipalities in the north-west and west of Bohemia (A4).

The general congruence in macro-structure of the surname networks constructed here and in the first stage of our analysis is an important finding (generally similar macro-structure was also found when the $J_{i,j,reg}$ and $J_{i,j,mun}$ were considered instead of the $D_{i,j,reg}$ and $D_{i,j,mun}$, respectively). However, the main value of this second stage municipality level exercise should be seen in individual details uncovered with respect to local parts of the surname network. A number of interesting examples of pairs of surnames that have been found as potentially closely related,

Figure 4. Spatial concentration of individual communities of high degree surnames. Individual maps show regional variation in the percentage of high degree surnames from particular core communities (A1, A2, A3, A4, B1, B2) as listed in Table 5 concentrated in a given region. For example, if the percentage of high degree surnames for A1 (the upper left map) corresponds to 100, then all the surnames listed in the A1 group in Table 5 are concentrated in a given region (that is, all of them satisfy $LQ_{i,r} > 1$ for the region in question).
doi:10.1371/journal.pone.0048568.g004


Ergodic (nationwide)

Page 27: A Study of Clan-Level Marriage and Population Migration Based on Jokbo and Census Data

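A simple diffusion model for clan populations

We assume that clans diffuse freely from regions of high density to regions of low density, take the 1985 and 2000 census data as boundary conditions, and solve the differential equation numerically to estimate the diffusion constants.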

a very simple diffusion model described by Fick’s second law:

the flux of clan members

$$\vec{J}_i \propto \vec{\nabla} \phi_i$$

(individuals move preferentially away from high concentrations of their family),

then

$$\frac{\partial c_i}{\partial t} = \vec{\nabla} \cdot \vec{J}_i \propto \nabla^2 \phi_i$$

(assuming no spatial variation in the constant of proportionality), or

$$\frac{\partial \phi_i}{\partial t} = D_i \nabla^2 \phi_i,$$

with the diffusion constant $D_i$ with dimensions [length$^2$/time].
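As an illustration of how a diffusion constant could be extracted from two snapshots, the following is a minimal sketch and not the authors' fitting procedure, which is described in the paper's Appendix. It assumes a field $\phi_i$ sampled on a regular square grid, uses a five-point finite-difference Laplacian, and finds the least-squares value of $D_i$ in $\partial \phi_i / \partial t = D_i \nabla^2 \phi_i$. All function names, the synthetic Gaussian test field, the 5 km grid spacing, and the value 1.5 km$^2$/year used to generate the data are assumptions made for this example.

```python
import numpy as np

def gaussian_anomaly(x, y, D, t):
    """Point-source solution of d(phi)/dt = D * laplacian(phi) in two dimensions."""
    return np.exp(-(x**2 + y**2) / (4.0 * D * t)) / (4.0 * np.pi * D * t)

def laplacian(field, h):
    """Five-point finite-difference Laplacian on the interior of a square grid."""
    return (field[2:, 1:-1] + field[:-2, 1:-1]
            + field[1:-1, 2:] + field[1:-1, :-2]
            - 4.0 * field[1:-1, 1:-1]) / h**2

def estimate_diffusion_constant(phi_a, phi_b, dt, h):
    """Least-squares D such that (phi_b - phi_a)/dt ~ D * (mean Laplacian)."""
    dphi_dt = (phi_b - phi_a)[1:-1, 1:-1] / dt
    lap = 0.5 * (laplacian(phi_a, h) + laplacian(phi_b, h))
    return float((dphi_dt * lap).sum() / (lap * lap).sum())

# Synthetic snapshots 15 years apart (standing in for 1985 and 2000).
h = 5.0                                   # grid spacing in km (assumed)
axis = np.arange(-150.0, 150.0 + h, h)
X, Y = np.meshgrid(axis, axis)
D_true = 1.5                              # km^2/year, used only to generate data
phi_1985 = gaussian_anomaly(X, Y, D_true, 100.0)
phi_2000 = gaussian_anomaly(X, Y, D_true, 115.0)
D_est = estimate_diffusion_constant(phi_1985, phi_2000, dt=15.0, h=h)
print(f"estimated D = {D_est:.2f} km^2/year")
```

With real census data the field is noisy and defined on irregular administrative regions, so an estimate of this kind can come out negative, which is in line with the negative values reported for Fig. 4.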

Page 28: A Study of Clan-Level Marriage and Population Migration Based on Jokbo and Census Data


Predicted change over time in the distribution of the Jo clan from Haman

FIG. 3. Example of clan anomaly picture for clan 1, 1985 vs. 2000, using Danny’s anomaly definition. Circle is centered at CM position with radius set to radius of gyration. Here diffusive effects are visible.

FIG. 4. Example of clan anomaly picture for Kim from Gimhae, 1985 vs. 2000, using Danny’s anomaly definition. Circle is centered at CM position with radius set to radius of gyration. (Note position and size of circle are reasonable.)

[Color scale: large $\phi$ to small $\phi$]

Page 29: A Study of Clan-Level Marriage and Population Migration Based on Jokbo and Census Data


Distribution of predicted diffusion constants

Diffusion constant (km$^2$/year)

Probability density

Page 30: A Study of Clan-Level Marriage and Population Migration Based on Jokbo and Census Data

[Figure 5: two panels. (a) Fraction ergodic (0.3 to 0.7) versus distance from clan origin location to Seoul (km, 0 to 325). (b) Fraction ergodic (0 to 0.7) versus distance from clan origin location to current clan centroid (km, 0 to 325).]

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a) Fraction of ergodic clans versus distance to Seoul. The correlation between the variables is positive and statistically significant. (The Pearson correlation coefficient is $r \approx 0.83$, and the p-value is $p \approx 0.0017$.) For the purpose of this calculation, we call a clan “ergodic” if it is present in at least 150 administrative regions. We estimate this fraction separately in each of 11 equally-sized bins over the displayed range of distances. The gray regions give 95% confidence intervals. (b) Fraction of ergodic clans versus the distance between location of clan origin and the present-day centroid. We measure ergodicity as in the left panel, and we estimate the fraction separately for each range of binned distances. (We use the same bins as in the left panel.) The correlation between the variables is significantly positive up to 150 km ($r \approx 0.94$, $p \approx 0.0098$) and is significantly negative ($r \approx -0.98$, $p \approx 2.4 \times 10^{-4}$) for larger distances.

to contribute to an increased number of administrative regions occupied and hence to a greater aggregate ergodicity. We plot the fraction of ergodic clans versus the distance a clan has moved (which we estimate by calculating distances between clan-origin locations and the corresponding modern clan centroids) in Fig. 5(b). This further supports our claim that both convective and diffusive transport have occurred. In addition to the fraction of ergodic clans, we also compared the radii of gyration $r_g$ to the distance to Seoul and to the distance between location of clan origin and the present-day centroid, as shown in Fig. I6, and the latter shows the same tendency as the fraction of clans, which further supports our claim. We speculate that the absence of statistical significance of the correlation between $r_g$ and the distance to Seoul is caused by the sampling problem, where many small clans are not included because their centroids cannot be determined (see Appendix B).
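The binned statistic described in the Fig. 5 caption can be sketched as follows. The threshold of 150 administrative regions and the 11 bins over 0 to 325 km come from the caption; interpreting “equally-sized” as equal-width bins, the function names, and the toy data are assumptions made for this illustration. The sketch reports only the Pearson coefficient and does not compute p-values or confidence intervals.

```python
import numpy as np

def fraction_ergodic_by_distance(distance_km, regions_occupied, n_bins=11,
                                 max_distance_km=325.0, threshold=150):
    """Fraction of ergodic clans in each equal-width distance bin."""
    distance_km = np.asarray(distance_km, dtype=float)
    # A clan is called ergodic if it is present in at least `threshold` regions.
    is_ergodic = np.asarray(regions_occupied) >= threshold
    edges = np.linspace(0.0, max_distance_km, n_bins + 1)
    # np.digitize returns 1..n_bins inside the range; shift to 0-based indices.
    bin_index = np.clip(np.digitize(distance_km, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    fractions = np.full(n_bins, np.nan)   # NaN marks bins with no clans
    for b in range(n_bins):
        in_bin = bin_index == b
        if in_bin.any():
            fractions[b] = is_ergodic[in_bin].mean()
    return centers, fractions

def pearson_r(x, y):
    """Pearson correlation over the bins that contain at least one clan."""
    ok = ~np.isnan(y)
    return float(np.corrcoef(x[ok], y[ok])[0, 1])

# Hypothetical toy data: six clans with a distance (km) and a region count each.
distances = [10, 20, 100, 110, 300, 310]
regions = [10, 20, 200, 10, 200, 180]
centers, fractions = fraction_ergodic_by_distance(distances, regions)
r = pearson_r(centers, fractions)
print(fractions, round(r, 3))
```

Restricting the correlation to bins below or above 150 km, as in panel (b), only requires masking `centers` before calling `pearson_r`.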

We assume that clans that have moved a greater distance have also existed for a longer time and hence have undergone diffusion longer; we thus also expect such clans to be more ergodic. This is consistent with our observations in Fig. 5(b) for distances less than about 150 km, but it is difficult to use the same logic to explain our observations for distances greater than 150 km. However, if one assumes that long-distance moves are more likely to arise from convective effects than from diffusive ones, then our observations for both short and long distances become understandable: the fraction of moves from bulk-flow effects like resettlement or transplantation is larger for long-distance moves, and they become increasingly dominant as the distance approaches 325 km (roughly the size of the Korean peninsula). We speculate that the clans that moved farther than 150 km are likely to be ones that originated in the most remote areas of Korea, or even outside of Korea, and that they have only relatively recently been transplanted to major Korean population centers, from which they have had little time to spread. This observation is necessarily speculative because the age of a clan is not easy to determine: the first entry in a jokbo (see Table A-I for our ten jokbo) could have resulted from the invention of characters or printing devices rather than from the true birth of a clan [20].

Ultimately, our data are insufficient to definitively accept or reject the hypothesis of human diffusion. However, as our analysis demonstrates, our data are consistent with the theory of simultaneous human “diffusion” and “convection.” Furthermore, our analysis suggests that if the hypothesis of pure diffusion is correct, then our estimated diffusion constants indicate a possible time scale for relaxation to a dynamic equilibrium and thus for mixing in human societies. In mainland South Korea, it would take approximately (100000 km$^2$)/(1.5 km$^2$/year) $\approx$ 67000 years for purely diffusive mixing to produce a well-mixed society. A convective process thus appears to be playing the important role of promoting human interaction by accelerating mixing in the population. In spite of such limitations, we try to estimate and quantify the centrality of Seoul in the context of a population flow network model and find other distinctive characteristics of the population flow patterns of ergodic and non-ergodic clans. For details, see Appendix H.
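The time-scale estimate in the preceding paragraph is a dimensional argument: an area divided by a diffusion constant has units of time. The short sketch below reproduces that arithmetic with the two numbers quoted in the text; the function name is hypothetical.

```python
def diffusive_mixing_time(area_km2, diffusion_constant_km2_per_year):
    """Order-of-magnitude mixing time: area divided by the diffusion constant."""
    return area_km2 / diffusion_constant_km2_per_year

# Values quoted in the text: about 100000 km^2 and 1.5 km^2/year.
years = diffusive_mixing_time(100_000.0, 1.5)
print(f"about {years:.0f} years")
```

Because the estimate scales inversely with the diffusion constant, doubling the assumed constant would halve the mixing time, which would still be far longer than recorded Korean history.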

V. CONCLUSIONS

The long history of detailed record-keeping in Korean culture provides an unusual opportunity for quantitative research on historical human mobility and migration, and our investigation strongly suggests that both “diffusive” and “convective” patterns have played important roles in establishing the current distribution of clans in Korea.

By studying the geographical locations of clan origins in jokbo (Korean family books), we have quantified the extent of “ergodicity” of Korean clans as reflected in time series of marriage snapshots. This illustrates the need to investigate the distribution of individual clans in more detail. Additionally, by comparing our results from Korean clans to those from Czech families, we have also demonstrated that these ideas can give insightful indications of different mobility and migration patterns in different cultures. Our ergodicity analysis using modern census data clearly illustrates that there are both ergodic and non-ergodic clans, and we have used these results to suggest two different mechanisms for human migration on long time scales. Many processes involve the attractiveness of the center versus diffusion away from it, so we believe that this type of modeling framework can be applied to many data sets.

A noteworthy feature of our analysis is that we used both data with high temporal resolution but low spatial resolution (jokbo data) and data with high spatial resolution but low temporal resolution (census data). This allowed us to consider

“Diffusive” and “convective” population movement?

Distance from the clan's place of origin to its current centroid (km)

Fraction of clans that are ergodic (nationwide)

Page 31: A Study of Clan-Level Marriage and Population Migration Based on Jokbo and Census Data

7

distance from clan originlocation to Seoul (km)

fract

ion e

rgodic

(a)

0 3250.3

0.7

distance from clan origin locationto current clan centroid (km)

fract

ion e

rgodic

(b)

0 3250

0.7

FIG. 5. Fraction of ergodic clans and distance scales of clans. (a)Fraction of ergodic clans versus distance to Seoul. The correlationbetween the variables is positive and statistically significant. (ThePearson correlation coe�cient is r ⇡ 0.83, and the p-value is p ⇡0.0017.) For the purpose of this calculation, we call a clan “ergodic”if it is present in at least 150 administrative regions. We estimate thisfraction separately in each of 11 equally-sized bins over the displayedrange of distances. The gray regions give 95% confidence intervals.(b) Fraction of ergodic clans versus the distance between location ofclan origin and the present-day centroid. We measure ergodicity as inthe left panel, and we estimate the fraction separately for each rangeof binned distances. (We use the same bins as in the left panel.) Thecorrelation between the variables is significantly positive up to 150km (r ⇡ 0.94, p ⇡ 0.0098) and is significantly negative (r ⇡ �0.98,p ⇡ 2.4 ⇥ 10�4) for larger distances.

to contribute to an increased number of administrative regionsoccupied and hence to a greater aggregate ergodicity. We plotthe fraction of ergodic clans versus the distance a clan hasmoved (which we estimate by calculating distances betweenclan-origin locations and the corresponding modern clan cen-troids) in Fig. 5(b). This further supports our claim that bothconvective and di↵usive transport have occurred. In additionto the fraction of ergodic clans, we also compared the radii ofgyration rg to the distance to Seoul and to the distance betweenlocation of clan origin and the present-day centroid, as shownin Fig. I6, and the latter shows the same tendency as the frac-tion of clans, which further supports our claim. We speculatethat the absence of statistical significance of the correlationbetween rg and the distance to Seoul is caused by the sam-pling problem, where many small clans are not included be-cause their centroids cannot be determined (see Appendix B).

We assume that clans that have moved a greater distance have also existed for a longer time and hence have undergone diffusion longer; we thus also expect such clans to be more ergodic. This is consistent with our observations in Fig. 5(b) for distances less than about 150 km, but it is difficult to use the same logic to explain our observations for distances greater than 150 km. However, if one assumes that long-distance moves are more likely to arise from convective effects than from diffusive ones, then our observations for both short and long distances become understandable: the fraction of moves from bulk-flow effects like resettlement or transplantation is larger for long-distance moves, and they become increasingly dominant as the distance approaches 325 km (roughly the size of the Korean peninsula). We speculate that the clans that moved farther than 150 km are likely to be ones that originated in the most remote areas of Korea, or even outside of Korea, and that they have only relatively recently been transplanted to major Korean population centers, from which they have had little time to spread. This observation is necessarily speculative because the age of a clan is not easy to determine: the first entry in a jokbo (see Table A-I for our ten jokbo) could have resulted from the invention of characters or printing devices rather than from the true birth of a clan [20].

Ultimately, our data are insufficient to definitively accept or reject the hypothesis of human diffusion. However, as our analysis demonstrates, our data are consistent with the theory of simultaneous human “diffusion” and “convection.” Furthermore, our analysis suggests that if the hypothesis of pure diffusion is correct, then our estimated diffusion constants indicate a possible time scale for relaxation to a dynamic equilibrium and thus for mixing in human societies. In mainland South Korea, it would take approximately (100 000 km²)/(1.5 km²/year) ≈ 67 000 years for purely diffusive mixing to produce a well-mixed society. A convective process thus appears to be playing the important role of promoting human interaction by accelerating mixing in the population. In spite of such limitations, we try to estimate and quantify the centrality of Seoul in the context of a population flow network model and find other distinctive characteristics of the population flow patterns of ergodic and non-ergodic clans. For details, see Appendix H.
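The order-of-magnitude estimate quoted above is simply an area divided by a diffusion constant. A minimal sketch of that arithmetic, using the two numbers from the text (the function name is hypothetical, and the estimate ignores geometry and any numerical prefactor):

```python
def diffusive_mixing_time_years(area_km2, diffusion_constant_km2_per_year):
    """Rough time for diffusion to cover an area: T ~ A / D."""
    if diffusion_constant_km2_per_year <= 0:
        raise ValueError("diffusion constant must be positive")
    return area_km2 / diffusion_constant_km2_per_year


area = 100_000.0  # km^2, mainland South Korea (value quoted in the text)
D = 1.5           # km^2/year, diffusion constant quoted in the text
t = diffusive_mixing_time_years(area, D)
print(f"{t:.0f} years")  # 66667 years, i.e. roughly 67 000 years
```

Because the time scales inversely with D, doubling the diffusion constant would halve the estimate, which is why even a generous D still gives tens of thousands of years.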

V. CONCLUSIONS

The long history of detailed record-keeping in Korean culture provides an unusual opportunity for quantitative research on historical human mobility and migration, and our investigation strongly suggests that both “diffusive” and “convective” patterns have played important roles in establishing the current distribution of clans in Korea.

By studying the geographical locations of clan origins in jokbo (Korean family books), we have quantified the extent of “ergodicity” of Korean clans as reflected in time series of marriage snapshots. This illustrates the need to investigate the distribution of individual clans in more detail. Additionally, by comparing our results from Korean clans to those from Czech families, we have also demonstrated that these ideas can give insightful indications of different mobility and migration patterns in different cultures. Our ergodicity analysis using modern census data clearly illustrates that there are both ergodic and non-ergodic clans, and we have used these results to suggest two different mechanisms for human migration on long time scales. Many processes involve the attractiveness of the center versus diffusion away from it, so we believe that this type of modeling framework can be applied to many data sets.

A noteworthy feature of our analysis is that we used both data with high temporal resolution but low spatial resolution (jokbo data) and data with high spatial resolution but low temporal resolution (census data). This allowed us to consider

“Diffusive” and “convective” population movement?

Diffusive movement?

Distance from clan origin location to current centroid (km)

Fraction of clans that are ergodic (nationwide)

Page 32: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data


“Diffusive” and “convective” population movement?

Diffusive movement?

Distance from clan origin location to current centroid (km)

Fraction of clans that are ergodic (nationwide)

Page 33: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data


“Diffusive” and “convective” population movement?

Convective movement? Diffusive movement?

Distance from clan origin location to current centroid (km)

Fraction of clans that are ergodic (nationwide)

Page 34: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data


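“Diffusive” and “convective” population movement?

Convective movement? Diffusive movement?

Purely diffusive movement alone cannot explain this!

Distance from clan origin location to current centroid (km)

Fraction of clans that are ergodic (nationwide)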

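Page 35: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data

Purely diffusive movement alone cannot explain this!

Imjin War (Japanese invasions of Korea, 1592-1598)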

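Page 36: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data

Purely diffusive movement alone cannot explain this!

Jeongmyo and Byeongja Horan (Manchu invasions of Korea, 1627-1637)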

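Page 37: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data

Purely diffusive movement alone cannot explain this!

Japanese colonial period; industrialization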

1950.6.25-1953.7.27.-…

1930 1945 1970 1990

Page 38: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data

in-degree distribution as a result of the given fraction p of edges that are redirected uniformly at random, except for the central node (i.e., Seoul) [83]. Therefore, the emergence of a second-largest hub comparable in size to the largest hub (Seoul) is extremely unlikely. We illustrate one instance of such a rewired network in Fig. 15(c), and the MRS for all clans that we constructed from empirical data differs significantly from the null-model network (see Table V as well).

It is also instructive to examine the population-flow networks for individual clans. As with prior discussions, we will use Kim from Gimhae as an example of an ergodic clan and Lee from Hakseong as an example of a nonergodic clan (see Fig. 1).

When we consider the population-flow network for the clan Kim from Gimhae [by using N_i(k, t) with i corresponding to Kim from Gimhae in Eq. (H3)], we obtain a result that is qualitatively similar to what we obtained when using all clans: an abundance of edges terminating in Seoul. See Figs. 14(b) and 16(a), and Table V. By contrast, we find that two different locations “attract” the population for Lee from Hakseong. Following the general trend in the population, one area is the Gyeonggi Province in the northwestern part of South Korea that surrounds the Seoul area. (The name “Gyeonggi” means “the area surrounding the capital” in Korean, and it is often construed to be essentially an “extended Seoul.”) The other area is Ulsan/Busan in the southeastern part of South Korea (where the clan origin is located). See Figs. 14(c) and 16(b), and Table V. As one can see from Fig. 14(c), the Seoul region is not special for this clan. Therefore, we see that this young, nonergodic clan has a different mobility pattern from the stabilized, ergodic clans that follow the general trend in population flow.

[Figure 21: two cartogram panels, (a) 1970 and (b) 2010. Horizontal axis: longitude (125 to 131); vertical axis: latitude (33 to 39). Labeled regions: Seoul, Busan, Inchon, Daegu, Daejon, Geonggi, Gangwon, Ulsan, Chungbuk, Chungnam, Jeonbuk, Jeonnam, Gwangju, Gyeongbuk, Gyeongnam, Jeju.]

FIG. 21. Density-equalizing population cartograms [63] for South Korea using population data from (a) 1970 and (b) 2010 censuses [38]. The coordinates are longitude on the horizontal axis and latitude on the vertical axis. The growth of the Seoul metropolitan area over the past 40 years is clearly visible. (Compare this figure to a regular map of South Korea, such as the one in Fig. 1 in the main text.)

[Figure 22: (a) radius of gyration (km) versus distance from clan origin location to Seoul (km); (b) radius of gyration (km) versus distance from clan origin location to current clan centroid (km).]

FIG. 22. Radii of gyration and distance scales of clans. (For this figure, we use the 3120 clans that are present in both the 1985 and 2000 censuses and for which we could determine the origin location.) (a) Radius of gyration r_g versus distance to Seoul. The Pearson correlation between the variables is not statistically significant (r ≈ 0.18; the p-value is p ≈ 0.6). (b) Radius of gyration r_g versus distance between the clan origin location and the present-day centroid. The Pearson correlation between the diagnostics is positive and statistically significant up to 170 km (r ≈ 0.86, p ≈ 0.01) and is negative and significant for larger distances (r ≈ −0.96, p ≈ 0.005). For each of the panels, we estimate r_g separately in each of 11 equally-sized bins for the displayed range of distances. The gray regions give 95% confidence intervals.
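The excerpt does not spell out how the radius of gyration is computed, so the following is only a sketch under an assumed, commonly used definition: the population-weighted root-mean-square distance of a clan's members from the clan's population-weighted centroid. The planar (x, y) coordinates in km, the function names, and the toy numbers are all hypothetical.

```python
import math


def weighted_centroid(points_km, populations):
    """Population-weighted centroid of region positions given as (x, y) in km."""
    total = sum(populations)
    cx = sum(p * x for (x, _), p in zip(points_km, populations)) / total
    cy = sum(p * y for (_, y), p in zip(points_km, populations)) / total
    return cx, cy


def radius_of_gyration(points_km, populations):
    """Assumed definition: sqrt of the weighted mean squared distance to the centroid."""
    cx, cy = weighted_centroid(points_km, populations)
    total = sum(populations)
    mean_sq = sum(
        p * ((x - cx) ** 2 + (y - cy) ** 2)
        for (x, y), p in zip(points_km, populations)
    ) / total
    return math.sqrt(mean_sq)


# Toy clan: two regions 100 km apart with equal numbers of members.
regions = [(0.0, 0.0), (100.0, 0.0)]
members = [500, 500]
rg = radius_of_gyration(regions, members)
print(weighted_centroid(regions, members), rg)  # centroid (50.0, 0.0), r_g = 50.0
```

A clan concentrated in a single region has r_g = 0 under this definition, while a clan spread evenly over the country has an r_g comparable to the size of the country.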

LEE et al. PHYS. REV. X 4, 041009 (2014)


Cartograms drawn so that area is proportional to population: 1970 vs. 2010

Page 39: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data

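Summary & outlook for future research

• Analyzed historical (long-term) human migration by combining two complementary data sources: jokbo (good temporal resolution) + the Population and Housing Census (good spatial resolution)

• Marriage patterns recorded in jokbo: can they be described by a simple population-migration model?

• “Ergodicity” of clans (nationwide vs. regional): a consequence of migration

• Diffusive vs. convective migration

• More data (more jokbo, cases from other countries, etc. …) → better models/predictions

Sang Hoon Lee, R. Ffrancon, D. M. Abrams, Beom Jun Kim, M. A. Porter, Phys. Rev. X 4, 041009 (2014).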

Page 40: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data

Hawoong Jeong [Jeong from Dongnae] (jokbo data), Josef Novotný (Czech surname data)

Collaborators & acknowledgments

Mason Porter (University of Oxford)

Beom Jun Kim [Kim from Gimhae] (Sungkyunkwan University)

Danny Abrams (Northwestern University)

Robyn Ffrancon (University of Gothenburg)

Sang Hoon Lee, R. Ffrancon, D. M. Abrams, Beom Jun Kim, M. A. Porter, Phys. Rev. X 4, 041009 (2014).

Page 41: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data


Page 42: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data

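Source: http://scienceon.hani.co.kr/206375 (2014. 10. 29. 12:30), "Jokbo big-data study published in an international physics journal" (print view, page 7/8: http://scienceon.hani.co.kr/?act=dispMediaPrintArticle&document_srl=206375)

… that Korea's surname distribution can be examined, and that, as a result, Korea's distinctive surname distribution was already similar at least some 500 years ago. These were presented as the main conclusions. [3] Another paper was an attempt to understand Korea's past surname distribution using the statistical model of our coauthor, Professor Minnhagen of Sweden. Using the fact that these statistical properties had not changed for several hundred years, we also obtained the interesting result that, going further back in time, there would have been about 10,000 people with the surname Kim around AD 500."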

 [1] http://journals.aps.org/pre/abstract/10.1103/PhysRevE.76.046113

 [2] http://statphys.skku.ac.kr/Papers/jkps_kiet1.pdf

 [3] http://iopscience.iop.org/1367-2630/13/7/073036/

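How did a physicist come to study jokbo?

 "In statistical physics, the field in which Dr. Sang Hoon Lee and I work, there are various attempts to understand the macroscopic patterns of phenomena that occur in our society. For example, the pattern of the income distribution, the distribution of earthquake magnitudes, and the distribution of stock-price drops are all among the interests of statistical physicists. About ten years ago I became interested in the pattern of Korea's surname distribution. Wondering how to obtain data for the research, I got hold of the telephone directories of a few cities, and even the attendance rolls of the school where I was working at the time, and studied surname distributions with them. Then it suddenly occurred to me that, if a digitized jokbo were available, the surnames of the women who married into that family in the past could be used to study the surname distribution over several hundred years. In fact, it is a stretch to call my work jokbo research; it is more accurate to call it research on surnames and clans (bon-gwan) that makes use of jokbo data."

The research team this time is multinational; how was it put together? And how do our jokbo records look to researchers abroad?

 "In this study, Dr. Lee and Professor Porter of the University of Oxford served as the hubs. The study came about when Dr. Lee, who knew of my work on surname distributions, first proposed a collaboration using the jokbo data that I had compiled. To the foreign collaborators, Korean jokbo data seem truly remarkable. In particular, they seem amazed that our ancestors' culture of record-keeping was so well developed that jokbo made as long as several hundred years ago are still handed down within families from generation to generation. Abroad, old records of individual births do survive in churches, but as far as I know there are almost no cases in which a single family's genealogy has been steadily recorded and preserved over such a long time, as in Korean jokbo."

■ Paper abstract

 The study of human mobility is of fundamental importance and also has great potential value. For example, it can be used to facilitate efficient city planning and to improve prevention strategies when faced with epidemics. Newly found rich sources of data, such as banknote flows, mobile phone records, and transportation data, have led to an explosion of attempts to characterize modern human mobility. Unfortunately, the lack of comparable historical data makes it much more difficult to study patterns of human mobility in the past. In this paper, we present an analysis of long-term human migration. Such long-term …

Prof. Beom Jun Kim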

Page 43: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data


Page 44: A study of marriage and population migration by clan (bon-gwan), based on jokbo and census data

http://scienceon.hani.co.kr/206375 2014. 10. 29. 12:30족보 빅데이터 연구, 국제물리학 저널에 실려

7/8페이지http://scienceon.hani.co.kr/?act=dispMediaPrintArticle&document_srl=206375

리나라의 성씨 분포를 살필 수 있다는 것, 그리고 그 결과 우리나라의 독특한 성씨 분포가 최소한

500여 년 전에도 비슷했다는 것을 주요 결론으로 제시했습니다. [3] 또다른 논문은 공저자인 스

웨덴의 민하겐 교수님의 통계적 모형을 이용해 과거 우리나라 성씨 분포를 이해하려는 시도였고

여기에서 수백 년 동안 그 통계적 특성이 변하지 않았음을 이용해 더 과거로 돌아가면 서기 500

년 즈음에는 김씨가 1만 명 정도였을 것으로 추론된다는 흥미로운 결과도 얻었습니다."

 [1] http://journals.aps.org/pre/abstract/10.1103/PhysRevE.76.046113

 [2] http://statphys.skku.ac.kr/Papers/jkps_kiet1.pdf

 [3] http://iopscience.iop.org/1367-2630/13/7/073036/

물리학자가 왜 족보 연구를 하게 되었는지요?

 "저와 이상훈 박사가 연구하는 통계물리학 분야에서는 우리사회에서 벌어지는 현상의 거시적

패턴을 이해하고자 하는 다양한 시도들이 있습니다. 예를 들어 소득 분포 패턴이나, 지진 크기의

분포, 또 주가 낙폭의 분포 등이 모두 다 통계물리학자들의 관심사 중 하나이죠. 10여 년 전부터

우리나라 성씨 분포의 패턴에 관심을 갖게 됐습니다. 어떻게 하면 연구에 사용할 자료를 구할 수

있을까 하다가 몇 도시의 전화번호부, 그리고 당시 몸담고 있던 학교의 출석부도 구해 성씨 분포

를 연구해 보기도 했습니다. 그러다 문득 만약에 전산화한 족보가 있다면 과거에 그 집안에 시집

온 여자들의 성씨를 이용해 수백년 간의 성씨 분포도 연구할 수 있을 것이라는 아이디어가 생겼

던 겁니다. 사실 제가 했던 연구는 족보 연구라고 하기에는 무리가 있고, 족보 자료를 활용한 성

씨, 본관 연구라고 해야 맞습니다."

이번 연구팀이 다국적인데 어떻게 구성됐는지 궁금합니다. 우리 족보 기록이 해외 연구자의 눈에

는 어떻게 비치는지요?

 "이번 연구에선 이 박사와 옥스포드대학의 포터 교수, 두 분이 허브 역할을 했습니다. 이번 연

구는 저의 성씨 분포 연구를 알고 있던 이 박사님이 제가 정리해 갖고 있던 족보 자료를 이용해

공동연구를 하자고 먼저 제안해 이뤄졌습니다. 함께 연구한 외국인들에게 우리나라 족보 자료는

정말로 신기하게 보이는 것 같습니다. 특히, 우리 선조의 기록 문화가 잘 발달해 무려 몇백 년 전

에 만들어진 족보가 여전히 집안에서 대대로 이어지고 있다는 것이 놀라운 모양입니다. 외국의

경우에는 교회에 개인의 출생에 대한 오래 전 기록이 남아 있긴 하지만, 우리나라 족보처럼 한 집

안의 가계도가 아주 오랜 동안 꾸준히 기록되어 보전되는 예는 거의 없는 것으로 알고 있습니다."

■ 논문 초록 ■ 논문 초록

 사람의 이동성 연구는 기본적으로 중요하며 또한 크나큰 잠재적 가치를 지닌다. 예를 들어, 그

것은 효율적인 도시 계획을 촉진하며 유행병에 직면했을 때 방제 전략을 개선하는 데에 사용될

수 있다. 은행권의 흐름, 이동전화 기록, 수송 데이터와 같은 풍부한 데이터가 새로 발견되면 현

대인의 이동성에 나타난 특징을 파악하려는 시도가 폭발적으로 늘어나곤 한다. 불행하게도, 비교

할 만한 역사적 데이터의 부족 때문에 과거의 인구 이동성 패턴을 연구하는 데에는 더 큰 어려움

이 있다. 이 연구논문에서, 우리는 사람들의 장기적 이주에 대한 분석을 제시한다. 그런 장기 이

김범준 교수님

Page 45: A Study of Marriage and Population Movement by Clan Origin (Bon-gwan), Based on Genealogy and Census Data

If you are curious about my other research topics: https://sites.google.com/site/lshlj82/

Thank you for listening!!!

Any questions? Ask now, or … [email protected]

Slides (PDF file): http://www.slideshare.net/lshlj82/ss-41233108