Erdos Numbers Archive
=========================================================================

We are pleased to announce a source of information for research
mathematicians and others interested in the issue of collaboration in
mathematical research -- a fairly comprehensive list of certain
co-authorships. These lists can provide fun, as well as a vehicle for
more serious studies of the dynamics involved and a "real-life" fairly
large graph for combinatorialists to study.

Files containing these data are available via anonymous ftp and the
World Wide Web; details for obtaining them are given below. (This
posting is the README file.) The files will be updated about once a
year to reflect corrections and additional information as it becomes
available. The current version is 97.10, dated February 1, 1997, and
intended to be fairly complete through the end of 1996.

Most practicing mathematicians are familiar with the definition of
one's Erdos number [long Hungarian umlaut over the "o" suppressed for
readability of this message]. Paul Erdos, the late widely-traveled and
incredibly prolific Hungarian mathematician of the highest caliber,
wrote hundreds of mathematical research papers in many different areas,
many in collaboration with others. His Erdos number is 0. His
co-authors have Erdos number 1. People other than Erdos who have
written a joint paper with someone with Erdos number 1 but not with
Erdos have Erdos number 2, and so on. If there is no chain of
co-authorships connecting someone with Erdos, then that person's Erdos
number is said to be infinite.

In graph-theoretic terms, the collaboration graph C has all
mathematicians as its vertices; the vertex p is Paul Erdos. There is an
edge between u and v if u and v have published at least one mathematics
article together.
(We will adopt the most liberal interpretation here, and allow any
number of other co-authors to be involved; for example, a six-author
paper is responsible for 15 edges in this graph, one for each pair of
authors. Other approaches would include using hypergraphs or
multigraphs or multihypergraphs.) The Erdos number of v, then, is the
distance in C from v to p, that is, the length of a shortest path
between them. The set of all mathematicians with a finite Erdos number
is called the Erdos component of C. It has been conjectured that the
Erdos component contains almost all present-day publishing
mathematicians (and has a not very large diameter), but perhaps not
some famous names from the past, such as Gauss. Clearly, any two people
with a finite Erdos number can be connected by a string of
co-authorships of length at most the sum of their Erdos numbers.

While there has been much informal discussion of the properties of the
collaboration graph [see, for example, "On Properties of a Well-Known
Graph, or, What is Your Ramsey Number?" by Tom Odda (alias for Ron
Graham) in Topics in Graph Theory (New York, 1977), pp. 166-172], there
had been no comprehensive set of data gathered prior to our work. As we
compiled our lists, it became evident why this is so. For one thing,
the database is quite large. For another, until fairly recently, most
of the information was not available electronically. Even more of an
obstacle, however, is the serious problem of identity -- determining
whom a given character string (such as "J. Smith") really represents.

Further information is contained in our joint paper, "On a Portion of
the Well-Known Collaboration Graph", Congressus Numerantium 108 (1995)
129-131, and Grossman's paper, "Paul Erdos: The Master of
Collaboration", in The Mathematics of Paul Erdos (R. Graham and
J. Nesetril, eds., Springer, 1997). This 2-volume collection also has
an updated list of Erdos's publications (numbering over 1400).

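The graph-theoretic definition above can be tried out on a small
example. The following is only a minimal sketch in Python, not the
procedure used to build our lists: the function names build_graph and
erdos_numbers and the toy author labels are assumptions made for
illustration. Each paper contributes one edge per pair of its authors
(the "most liberal interpretation"), and a breadth-first search from
the root vertex yields the distances.

```python
from collections import deque
from itertools import combinations


def build_graph(papers):
    """Return an adjacency map; each paper links every pair of its authors."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for u, v in combinations(set(authors), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def erdos_numbers(graph, root):
    """Breadth-first search from root; unreachable authors are omitted."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical toy data, not taken from the real lists.
papers = [
    ["ERDOS", "A"],
    ["A", "B", "C"],
    ["C", "D"],
    ["X", "Y"],  # a pair with no chain of co-authorships to ERDOS
]
graph = build_graph(papers)
numbers = erdos_numbers(graph, "ERDOS")

# A single six-author paper, to count the edges it contributes.
six = build_graph([["P1", "P2", "P3", "P4", "P5", "P6"]])
edge_count = sum(len(nbrs) for nbrs in six.values()) // 2
```

Authors missing from the returned dictionary (X and Y in the toy data)
are the ones whose Erdos number is infinite in the sense defined above.
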
We provide six lists in ASCII format:

Erdos0 is a list of the (currently 472) persons with Erdos number 1,
one name per line, single-spaced, last name first, in alphabetical
order, ALL CAPS. The name occupies the first 40 characters of each line
(including trailing blanks if necessary). The rest of each line
contains the year this person's first joint paper with Paul Erdos was
published. If they have published more than one joint paper, then the
number of joint papers is also given.

Erdos0d is similar to Erdos0, except that the date comes first and the
list is sorted by year of first joint publication (alphabetical within
the same year).

Erdos0p is similar to Erdos0d, except that it is sorted by the number
of joint papers and contains only those 188 people with more than one
joint paper with Erdos. Secondary sort is by year of first paper, most
recent first.

Erdos1 contains the same information as Erdos0, together with a list of
each author's collaborators following his or her name. (The number of
joint papers is given by a number following a colon after the year of
first paper, if it exceeds one.) These co-authors are listed one per
line, single-spaced, each indented by a tab, last name first, in
alphabetical order; those who have Erdos number 1 are in ALL CAPS, and
those who have Erdos number 2 are in Normal Capitalization. A blank
line follows each such sublist.

Erdos2 is a kind of inverse of Erdos1. It is an alphabetical list of
the (currently 5016) people with Erdos number 2, left-justified, each
followed by a sublist of his or her co-authors with Erdos number 1
(each line indented by a tab). The capitalization convention explained
above is maintained. Note that only those co-authors with Erdos number
1 are listed for these people.

ErdosA is simply a list of all persons with Erdos number less than or
equal to 2, in alphabetical order, one per line, with the same
capitalization convention (with Paul Erdos listed in spaced caps, as
well).

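For readers who want to process the files by program, here is a rough
sketch in Python of a reader for the Erdos0 and Erdos1 layouts
described above. It is an illustration under stated assumptions, not a
reference parser: the function names are hypothetical, the sample
records are invented, and the code assumes that whatever follows
column 40 holds one or two integers (the year, then an optional count
of joint papers), however they are separated.

```python
import re


def parse_erdos0_line(line):
    """Split one Erdos0-style line into (name, first_year, joint_papers)."""
    name = line[:40].rstrip()
    # Assumption: the remainder holds one or two integers (year, then an
    # optional paper count), whatever punctuation separates them.
    fields = [int(n) for n in re.findall(r"\d+", line[40:])]
    year = fields[0]
    count = fields[1] if len(fields) > 1 else 1
    return name, year, count


def parse_erdos1(text):
    """Map each Erdos-number-1 author to (year, count, [co-authors])."""
    records = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None  # a blank line closes a sublist
        elif line.startswith("\t"):
            if current is not None:
                records[current][2].append(line.strip())
        else:
            name, year, count = parse_erdos0_line(line)
            records[name] = (year, count, [])
            current = name
    return records


def erdos_number_from_case(name):
    """For sublist entries: ALL CAPS means number 1, otherwise number 2."""
    return 1 if name == name.upper() else 2


# Invented sample records in the Erdos1 layout (not real data).
sample = "\n".join([
    "DOE, JANE".ljust(40) + "1985: 3",
    "\tROE, RICHARD",
    "\tSmith, Pat",
    "",
    "ROE, RICHARD".ljust(40) + "1990",
    "\tDOE, JANE",
])
records = parse_erdos1(sample)
```

The capitalization check is meant only for co-author sublists; the
entry for Paul Erdos himself in ErdosA (spaced caps) would need
separate handling.
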
*** To obtain these files, use anonymous ftp to
*** vela.acs.oakland.edu. Login as anonymous and send
*** your e-mail address as password.
*** The files are in the directory pub/math/erdos.
*** Alternatively, if you have enough memory available, they
*** can be viewed or downloaded from the Erdos Number Project
*** Home Page on the World Wide Web. The URL for this is
*** http://www.oakland.edu/~grossman/erdoshp.html

Here are the procedures, rules, conventions, and assumptions we used in
creating these lists.

In most cases, our source is Mathematical Reviews (MR). Secondary
sources include Zentralblatt and the hypertext bibliography project in
theoretical computer science (its URL is
http://theory.lcs.mit.edu/%7Edmjones/hbp/). In some cases, obituary
articles in mathematical journals have been used, or similar sources.

Our criterion for inclusion of an edge between vertices u and v is some
mathematical collaboration between them resulting in a published work.
Any number of additional co-authors is permitted. Not normally included
are joint editorships, introductions to books written by others,
technical reports, problem sessions, problems posed or solved in
problem sections of journals, seminars, very elementary textbooks,
books on history, memorial or other tributes, biography, translations,
bibliographies, or popular works.

Pseudonyms (such as Mutt and G. W. Peck) are usually taken at face
value, as if they were real people. When MR lists two people with the
same name using superscripts, we follow this convention, using a
hyphen, as in Liu, Zhen Hong-1. (Indeed, there are actually two Paul
Erdos's, the other being a physicist who has published mathematical
papers. "Our" Paul is Paul Erdos-1 to MR. Also, one must not confuse
Paul Erdos with Peter L. Erdos, who sometimes publishes under
P. L. Erdos; he has Erdos number 2.) We have tried to include as full a
name as possible in all cases.
As for spelling, all accents are ignored and omitted, but apostrophes
and hyphens are included.

There are bound to be mistakes in our data. We urgently request people
who know of mistakes to report them to us so that the errors can be
corrected in subsequent versions. Please tell us of incorrect or
incomplete names (we want as full a name for each individual as
possible), co-authorships we have missed, and entries that should be
modified or deleted, including those caused by confusion over distinct
people with the same or similar names or initials. Conversely, note
that names that identify the (known to us) same person are identical in
these lists; if you have information that, say, Jones, Albert is the
same person as Jones, A., then please bring it to our attention, since
we do not know this and are assuming that they are separate people.
When sending us information, please provide complete citations or other
documentation.

SEND ALL CORRECTIONS AND CORRESPONDENCE TO:

     Professor Jerrold W. Grossman
     Department of Mathematical Sciences
     Oakland University
     Rochester, MI 48309-4401
     voice: (810) 370-3443
     FAX: (810) 370-4184
     e-mail: grossman@oakland.edu
     web: http://www.oakland.edu/~grossman/

As a corollary to our work, we issue a plea to authors: please use as
complete and consistent a name as possible when you publish a paper.
Too many people have too many similar names and initials, and confusion
reigns!

Finally, let us suggest a few uses for these lists. Many of them
require that the lists be downloaded and scanned electronically with a
word processor or editor.

One obvious thing to do is to compute your own Erdos number. If you are
on the list, there is no problem. If not, then perhaps one of your
co-authors is on the list, giving you an Erdos number of 3 (it cannot
be smaller, since everyone with Erdos number 2 should already be
listed). Otherwise, you can look in electronic versions of MR or other
databases and compile a list of the co-authors of your co-authors, and
repeat the process until you find a name on the list.
If you have been thorough, then you will have an exact value for your
Erdos number. For example, Andrew Wiles has Erdos number at most 3,
because he is a co-author of Barry C. Mazur, who has written with
ANDREW J. GRANVILLE, who has written with Erdos. (Warning: your number
might decrease over time, especially if you or your colleagues write
more papers!)

A more casual thing to do is simply to read through Erdos1, noting the
wide range of collaboration that exists (we were surprised by its
extent). For example, Paul Erdos is not the only person presented here
with more than 100 co-authors. Paul Erdos has made contributions in
many different areas of mathematics; and by the time you go one or two
more levels down the tree, essentially all areas of mathematics are
represented (as well as computer science, physics, and other natural
and social sciences).

Finally, we offer our data as a fairly large graph on which to test
algorithms, in the spirit of Donald Knuth's The Stanford GraphBase
(Addison-Wesley, 1993). (For this purpose it is probably best for now
to restrict oneself just to the people with Erdos number 1, because our
data do not show co-authorships between people with Erdos number 2.)
Perhaps there is not as much intrigue in the relationships shown here
as in, say, Professor Knuth's graph of encounters between characters in
Tolstoy's Anna Karenina (or maybe there is . . .), but connectivity,
covering, clique, or other analyses may yield some interesting
insights. We would be interested in hearing of any results you obtain.

Jerrold W. Grossman               Patrick Ion
Oakland University                Mathematical Reviews
Rochester, Michigan               Ann Arbor, Michigan
grossman@oakland.edu              ion@math.ams.org

initial version: May 25, 1995
latest revision: February 1, 1997