{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Pantheon Project\n", "\n", "An online tool created by César Hidalgo.\n", "\n", "- Hidalgo is now a professor at the MIT Media Lab.\n", " - He once said: \"Truly famous people are also well known outside their own fields.\"\n", "- A person is as famous as the number of languages in which their Wikipedia page is available.\n", "\n", "To be included in Pantheon, a person's fame must cross national and language barriers: they must have Wikipedia pages in at least 25 languages.\n", "\n", "This single requirement narrows the field from all minor or lesser-known celebrities down to 11,341 people, each distinctive and full of charm." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yu, A. Z., et al. (2016). Pantheon 1.0, a manually verified dataset of globally famous biographies. Scientific Data 3:150075. doi: 10.1038/sdata.2015.75\n", "\n", "https://pantheon.world/data/datasets\n", "\n", "- pantheon.tsv\n", "\n", "A tab-delimited file containing one row of data per person in the Pantheon 1.0 dataset.\n", "\n", "- wikilangs.tsv\n", "\n", "A tab-delimited file listing the Wikipedia language editions in which each biography has a presence.\n", "\n", "- pageviews_2008-2013.tsv\n", "\n", "A file containing the monthly pageview data for each individual, for all the Wikipedia language editions in which they have a presence.\n", "\n", "Please refer to the methods section for more information on how this data was created. For detailed descriptions of these datasets, please refer to our data descriptor paper." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Jara-Figueroa, C., Yu, A.Z. and Hidalgo, C.A., 2015. The medium is the memory: how communication technologies shape what we remember. arXiv preprint arXiv:1512.05020.\n", "\n", "- Yu, A.Z., Ronen, S., Hu, K., Lu, T. and Hidalgo, C.A., 2016. Pantheon 1.0, a manually verified dataset of globally famous biographies. Scientific Data, 3.\n", "\n", "- Ronen, S., Gonçalves, B., Hu, K.Z., Vespignani, A., Pinker, S. and Hidalgo, C.A., 2014. Links that speak: The global language network and its association with global fame. Proceedings of the National Academy of Sciences, 111(52), pp.E5616-E5622.\n", "\n", "- Cesar A. Hidalgo and Ali Almossawi. 
\"The Data-Visualization Revolution.\" Scientific American. March 2014.\n", "\n", "- Hidalgo, C. A. \"The Last 20 Inches: Data’s Treacherous Journey from the Screen to the Mind.\" MIT Technology Review. March 2014.\n", "\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2021-05-14T11:21:38.436796Z", "start_time": "2021-05-14T11:21:37.859943Z" } }, "outputs": [], "source": [ "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns" ] }, { "cell_type": "code", "execution_count": 136, "metadata": { "ExecuteTime": { "end_time": "2021-05-14T14:07:05.156786Z", "start_time": "2021-05-14T14:07:04.525056Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " | id | \n", "wd_id | \n", "wp_id | \n", "slug | \n", "name | \n", "occupation | \n", "prob_ratio | \n", "gender | \n", "alive | \n", "... | \n", "deathdate | \n", "deathyear | \n", "bplace_geacron_name | \n", "dplace_geacron_name | \n", "is_group | \n", "l_ | \n", "age | \n", "non_en_page_views | \n", "coefficient_of_variation | \n", "hpi | \n", "|
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "18934 | \n", "Q9458 | \n", "18934 | \n", "Muhammad | \n", "Muhammad | \n", "RELIGIOUS FIGURE | \n", "0.0 | \n", "M | \n", "NaN | \n", "False | \n", "... | \n", "0632-06-08 | \n", "632.0 | \n", "Mecca | \n", "NaN | \n", "False | \n", "27.918400 | \n", "1450.0 | \n", "5160422.0 | \n", "3.199355 | \n", "100.000000 | \n", "
1 | \n", "17414699 | \n", "Q720 | \n", "17414699 | \n", "Genghis_Khan | \n", "Genghis Khan | \n", "MILITARY PERSONNEL | \n", "0.0 | \n", "M | \n", "NaN | \n", "False | \n", "... | \n", "1227-08-18 | \n", "1227.0 | \n", "NaN | \n", "NaN | \n", "False | \n", "25.843621 | \n", "858.0 | \n", "3249211.0 | \n", "2.753641 | \n", "97.723669 | \n", "
2 | \n", "18079 | \n", "Q762 | \n", "18079 | \n", "Leonardo_da_Vinci | \n", "Leonardo da Vinci | \n", "INVENTOR | \n", "0.0 | \n", "M | \n", "NaN | \n", "False | \n", "... | \n", "1519-05-02 | \n", "1519.0 | \n", "NaN | \n", "NaN | \n", "False | \n", "17.545406 | \n", "568.0 | \n", "5362406.0 | \n", "4.796629 | \n", "97.460691 | \n", "
3 | \n", "14627 | \n", "Q935 | \n", "14627 | \n", "Isaac_Newton | \n", "Isaac Newton | \n", "PHYSICIST | \n", "0.0 | \n", "M | \n", "NaN | \n", "False | \n", "... | \n", "1727-03-31 | \n", "1726.0 | \n", "NaN | \n", "NaN | \n", "False | \n", "21.608920 | \n", "378.0 | \n", "3431331.0 | \n", "4.632474 | \n", "96.836567 | \n", "
4 | \n", "17914 | \n", "Q255 | \n", "17914 | \n", "Ludwig_van_Beethoven | \n", "Ludwig van Beethoven | \n", "COMPOSER | \n", "0.0 | \n", "M | \n", "NaN | \n", "False | \n", "... | \n", "1827-03-26 | \n", "1827.0 | \n", "NaN | \n", "Austria | \n", "False | \n", "19.796430 | \n", "250.0 | \n", "5179518.0 | \n", "3.926626 | \n", "96.583969 | \n", "
5 rows × 34 columns
\n", "