Erdos Numbers update

Erdos Numbers Archive

=========================================================================

We are pleased to announce a source of information for research
mathematicians and others interested in the issue of collaboration in
mathematical research -- a fairly comprehensive list of certain
co-authorships.  These lists can be fun to browse; they also offer a
vehicle for more serious studies of the dynamics involved and a fairly
large "real-life" graph for combinatorialists to study.  Files containing these data
are available via anonymous ftp and the World Wide Web; details for
obtaining them are given below.  (This posting is the README file.)  The
files will be updated about once a year to reflect corrections and
additional information as it becomes available.  The current version is 
97.10, dated February 1, 1997, and intended to be fairly complete through 
the end of 1996.

Most practicing mathematicians are familiar with the definition of one's
Erdos number [long Hungarian umlaut over the "o" suppressed for
readability of this message].  Paul Erdos, the late widely-traveled and
incredibly prolific Hungarian mathematician of the highest caliber, wrote
hundreds of mathematical research papers in many different areas, many in
collaboration with others.  His Erdos number is 0.  His co-authors have
Erdos number 1.  People other than Erdos who have written a joint paper
with someone with Erdos number 1 but not with Erdos have Erdos number 2,
and so on.  If there is no chain of co-authorships connecting someone with
Erdos, then that person's Erdos number is said to be infinite.  In
graph-theoretic terms, the collaboration graph C has all mathematicians as
its vertices; the vertex p is Paul Erdos.  There is an edge between u and
v if u and v have published at least one mathematics article together. (We
will adopt the most liberal interpretation here, and allow any number of
other co-authors to be involved; for example, a six-author paper is
responsible for 15 edges in this graph, one for each pair of authors.
Other approaches would include using hypergraphs or multigraphs or
multihypergraphs.)  The Erdos number of v, then, is the distance in C from
v to p, that is, the length of a shortest path.  The set of all mathematicians with a
finite Erdos number is called the Erdos component of C.  It has been
conjectured that the Erdos component contains almost all present-day
publishing mathematicians (and has a not very large diameter), but perhaps
not some famous names from the past, such as Gauss.  Clearly, any two
people with a finite Erdos number can be connected by a string of
co-authorships, of length at most the sum of their Erdos numbers. 
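
The definition above amounts to a breadth-first search from p.  The
following Python sketch is an illustration only, not a description of how
the lists were built.  It assumes the input is a list of papers, each given
as a list of author names; the function names, the root label "ERDOS", and
the sample data are all hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(papers):
    """Return an adjacency map with one edge per pair of co-authors."""
    graph = {}
    for authors in papers:
        for name in authors:
            graph.setdefault(name, set())
        # Most liberal interpretation: every pair on a paper gets an edge.
        for u, v in combinations(set(authors), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def erdos_numbers(graph, root="ERDOS"):
    """Breadth-first search from root; unreached vertices are 'infinite'."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical sample data, not drawn from the real lists.
papers = [
    ["ERDOS", "A"],
    ["A", "B", "C"],
    ["C", "D"],
    ["X", "Y"],  # no chain of co-authorships to ERDOS
]
numbers = erdos_numbers(build_graph(papers))

# A six-author paper contributes one edge per pair of authors.
six = build_graph([["a", "b", "c", "d", "e", "f"]])
edge_count = sum(len(nbrs) for nbrs in six.values()) // 2
```

In this sketch, names absent from the returned dictionary (here X and Y)
are the ones with infinite Erdos number.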

While there has been much informal discussion of the properties of the
collaboration graph [see, for example, "On Properties of a Well-Known
Graph, or, What is Your Ramsey Number?" by Tom Odda (alias for Ron Graham)
in Topics in Graph Theory (New York, 1977), pp. 166-172], there had been
no comprehensive set of data gathered prior to our work.  As we compiled
our lists, it became evident why this is so.  For one thing, the database
is quite large.  For another, until fairly recently, most of the
information has not been available electronically.  Even more of an
obstacle, however, is the serious problem of identity -- determining whom
a given character string (such as "J. Smith") really represents. 

Further information is contained in our joint paper, "On a Portion of the
Well-Known Collaboration Graph", Congressus Numerantium 108 (1995) 
129-131, and Grossman's paper, "Paul Erdos: The Master of Collaboration",
in The Mathematics of Paul Erdos (R. Graham and J. Nesetril, eds.,
Springer, 1997).  This 2-volume collection also has an updated list of
Erdos's publications (numbering over 1400).

We provide six lists in ASCII format:

Erdos0 is a list of the (currently 472) persons with Erdos number 1, one
name per line, single-spaced, last name first, in alphabetical order, ALL
CAPS.  The name occupies the first 40 characters of each line (including
trailing blanks if necessary).  The rest of each line contains the year
this person's first joint paper with Paul Erdos was published.  If they
have published more than one joint paper, then the number of joint papers
is also given. 

Erdos0d is similar to Erdos0, except that the date comes first and the
list is sorted by year of first joint publication (alphabetical within the
same year). 

Erdos0p is similar to Erdos0d, except that it is sorted by the number of
joint papers and contains only those 188 people with more than one joint
paper with Erdos.  Secondary sort is by year of first paper, most recent
first. 

Erdos1 contains the same information as Erdos0, together with a list of
each author's collaborators following his or her name.  (The number of
joint papers is given by a number following a colon after the year of
first paper, if it exceeds one.)  These co-authors are listed one per line,
single-spaced, each indented by a tab, last name first, in alphabetical
order; those who have Erdos number 1 are in ALL CAPS, and those who have
Erdos number 2 are in Normal Capitalization.  A blank line follows each
such sublist. 
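
For readers who want to process Erdos1 by program, here is a minimal
Python sketch of a parser.  It rests on assumptions that should be checked
against the actual file: that the text after the 40-character name field is
exactly a four-digit year, optionally followed by a colon and the number of
joint papers, and that any line not fitting this layout (such as header
text) can be skipped.  The function name and sample names are hypothetical.

```python
import re

# Assumed layout of the field after the name: "1975" or "1975: 3".
HEADER = re.compile(r"^(\d{4})(?::\s*(\d+))?$")


def parse_erdos1(lines):
    """Map each Erdos-number-1 name to (first_year, papers, coauthors)."""
    records = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            current = None  # a blank line closes the current sublist
        elif line.startswith("\t"):
            if current is not None:
                records[current][2].append(line.strip())
        else:
            name = line[:40].rstrip()  # name field: first 40 characters
            match = HEADER.match(line[40:].strip())
            if match is None:
                current = None  # not a record header in the assumed layout
                continue
            year = int(match.group(1))
            count = int(match.group(2)) if match.group(2) else 1
            records[name] = (year, count, [])
            current = name
    return records


# Hypothetical sample in the assumed layout.
sample = [
    "DOE, JANE".ljust(40) + "1975: 3",
    "\tROE, RICHARD",
    "\tSmith, Pat",
    "",
    "ROE, RICHARD".ljust(40) + "1980",
    "\tDOE, JANE",
    "",
]
records = parse_erdos1(sample)
```

With a downloaded copy, one would pass an open file object in place of
the sample list.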

Erdos2 is a kind of inverse of Erdos1.  It is an alphabetical list of the
(currently 5016) people with Erdos number 2, left-justified, each followed
by a sublist of his or her co-authors with Erdos number 1 (each line
indented by a tab).  The capitalization convention explained above is
maintained.  Note that only those co-authors with Erdos number 1 are
listed for these people.
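
The sense in which Erdos2 is an inverse of Erdos1 can be sketched in
Python as follows.  This is an illustration under assumptions, not the
procedure used to produce the files: the input is a hypothetical dictionary
from each Erdos-number-1 name to its list of co-authors, and the ALL CAPS
convention is tested with str.isupper, which may misjudge unusual names.

```python
def invert(erdos1):
    """Build an Erdos2-style map from an Erdos1-style map."""
    erdos2 = {}
    for author, coauthors in erdos1.items():
        for name in coauthors:
            if not name.isupper():  # Normal Capitalization: Erdos number 2
                erdos2.setdefault(name, []).append(author)
    # Alphabetical order for both the people and their sublists.
    return {name: sorted(ones) for name, ones in sorted(erdos2.items())}


# Hypothetical sample data.
erdos1 = {
    "DOE, JANE": ["ROE, RICHARD", "Smith, Pat"],
    "ROE, RICHARD": ["DOE, JANE", "Smith, Pat", "Jones, Lee"],
}
erdos2 = invert(erdos1)
```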

ErdosA is simply a list of all persons with Erdos number less than or
equal to 2, in alphabetical order, one per line, with the same
capitalization convention (with Paul Erdos listed in spaced caps, as
well). 

***   To obtain these files, use anonymous ftp to
***   vela.acs.oakland.edu.  Login as anonymous and send
***   your e-mail address as password.
***   The files are in the directory pub/math/erdos.

***   Alternatively, if you have enough memory available, they
***   can be viewed or downloaded from the Erdos Number Project
***   Home Page on the World Wide Web.  The URL for this is
***   http://www.oakland.edu/~grossman/erdoshp.html

Here are the procedures, rules, conventions, and assumptions we used in
creating these lists.  In most cases, our source is Mathematical Reviews
(MR).  Secondary sources include Zentralblatt and the hypertext
bibliography project in theoretical computer science (its URL is
http://theory.lcs.mit.edu/%7Edmjones/hbp/).  In some cases, obituary
articles in mathematical journals have been used, or similar sources.  Our
criterion for inclusion of an edge between vertices u and v is some
mathematical collaboration between them resulting in a published work. 
Any number of additional co-authors is permitted.  Not normally included
are joint editorships, introductions to books written by others, technical
reports, problem sessions, problems posed or solved in problem sections of
journals, seminars, very elementary textbooks, books on history, memorial
or other tributes, biography, translations, bibliographies, or popular
works.  Pseudonyms (such as Mutt and G. W. Peck) are usually taken at face
value, as if they were real people.  When MR lists two people with the
same name using superscripts, we follow this convention, using a hyphen,
as in Liu, Zhen Hong-1.  (Indeed, there are actually two Paul Erdos's, the
other being a physicist who has published mathematical papers.  "Our" Paul
is Paul Erdos-1 to MR.  Also, one must not confuse Paul Erdos with Peter
L. Erdos, who sometimes publishes under P. L. Erdos; he has Erdos number
2.)  We have tried to include as full a name as possible in all cases.  As
for spelling, all accents are ignored and omitted, but apostrophes and
hyphens are included. 

There are bound to be mistakes in our data.  We urgently request people
who know of mistakes to report them to us so that the errors can be
corrected in subsequent versions.  Please tell us of incorrect or
incomplete names (we want as full a name for each individual as
possible), co-authorships we have missed, entries that should be modified
or deleted, including those caused by confusion over distinct people with
the same or similar names or initials.  Conversely, whenever we know that
two names refer to the same person, they appear identically in these lists; if
you have information that, say, Jones, Albert is the same person as Jones,
A., then please bring it to our attention, since we do not know this and
are assuming that they are separate people.  When sending us information,
please provide complete citations or other documentation. 

SEND ALL CORRECTIONS AND CORRESPONDENCE TO:
        Professor Jerrold W. Grossman
        Department of Mathematical Sciences
        Oakland University
        Rochester, MI  48309-4401

        voice:        (810) 370-3443
        FAX:          (810) 370-4184
        e-mail:       grossman@oakland.edu
        web:          http://www.oakland.edu/~grossman/

As a corollary to our work, we issue a plea to authors:  please use as
complete and consistent a name as possible when you publish a paper.  Too
many people have too many similar names and initials, and confusion
reigns! 

Finally, let us suggest a few uses for these lists.  Many of them require
that the lists be downloaded and scanned electronically with a word
processor or editor. 

One obvious thing to do is to compute your own Erdos number.  If you are
on the list, there is no problem.  If not, then perhaps one of your
co-authors is on the list, giving you an Erdos number of 3.  Otherwise,
you can look in electronic versions of MR or other databases and compile a
list of the co-authors of your co-authors, and repeat the process until
you find a name on the list.  If you have been thorough, then you will
have an exact value for your Erdos number.  For example, Andrew Wiles has
Erdos number at most 3, because he is a co-author of Barry C. Mazur, who
has written with ANDREW J. GRANVILLE, who has written with Erdos. 
(Warning:  your number might decrease over time, especially if you or 
your colleagues write more papers!)
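
The search just described can be sketched in Python.  The sketch assumes
two inputs that are not part of our files: a dictionary `listed` giving the
Erdos number (0, 1, or 2) of each name on the list, and a function
`coauthors_of` that returns a person's co-authors, which you would have to
compile yourself from MR or another database.  All names in the code are
hypothetical, and the toy data hold only the chain quoted above.  If the
co-author information is incomplete, the result is only an upper bound.

```python
def erdos_number_via_list(person, coauthors_of, listed):
    """Search outward from person, level by level, until the list is hit."""
    if person in listed:
        return listed[person]
    seen = {person}
    frontier = [person]
    depth = 0
    while frontier:
        depth += 1
        best = None
        next_frontier = []
        for u in frontier:
            for v in coauthors_of(u):
                if v in listed:
                    total = depth + listed[v]
                    best = total if best is None else min(best, total)
                elif v not in seen:
                    seen.add(v)
                    next_frontier.append(v)
        if best is not None:
            return best  # smallest total found at the first level with a hit
        frontier = next_frontier
    return None  # no chain found in the available data


# Toy data containing only the chain mentioned in the text.
listed = {
    "ERDOS, PAUL": 0,
    "GRANVILLE, ANDREW J.": 1,
    "Mazur, Barry C.": 2,
}
coauthors = {"Wiles, Andrew": {"Mazur, Barry C."}}


def lookup(name):
    return coauthors.get(name, set())


wiles = erdos_number_via_list("Wiles, Andrew", lookup, listed)
unknown = erdos_number_via_list("Nobody, N.", lookup, listed)
```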

A more casual thing to do is simply to read through Erdos1, noting the
wide range of collaboration that exists (we were surprised by its extent). 
For example, Paul Erdos is not the only person presented here with more
than 100 co-authors.  Paul Erdos has made contributions in many different
areas of mathematics; and by the time you go one or two more levels down
the tree, essentially all areas of mathematics are represented (as well as
computer science, physics, and other natural and social sciences). 

Finally, we offer our data as a fairly large graph on which to test
algorithms, in the spirit of Donald Knuth's The Stanford GraphBase
(Addison-Wesley, 1993).  (For this purpose it is probably best for now to
restrict oneself just to the people with Erdos number 1, because our data
do not show co-authorships between people with Erdos number 2.)  Perhaps
there is not as much intrigue in the relationships shown here as in, say,
Professor Knuth's graph of encounters between characters in Tolstoy's Anna
Karenina (or maybe there is . . .), but connectivity, covering, clique,
or other analyses may yield some interesting insights.  We would be
interested in hearing of any results you obtain. 
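
As one possible starting point for such analyses, the Python sketch below
restricts an Erdos1-style map to the people with Erdos number 1 and finds
the connected components of that subgraph.  Erdos himself is left out,
since he is adjacent to everyone in it.  The input dictionary and all names
are hypothetical; this is a sketch, not a recommended method.

```python
def induced_on_number_one(erdos1):
    """Keep only edges whose two endpoints both have Erdos number 1."""
    graph = {author: set() for author in erdos1}
    for author, coauthors in erdos1.items():
        for name in coauthors:
            if name in erdos1:  # the co-author also has Erdos number 1
                graph[author].add(name)
                graph[name].add(author)
    return graph


def components(adjacency):
    """Return the connected components as sorted lists of names."""
    seen = set()
    result = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        result.append(sorted(component))
    return result


# Hypothetical sample data.
erdos1 = {
    "A": ["B", "Smith, Pat"],
    "B": ["A"],
    "C": ["Jones, Lee"],
}
parts = components(induced_on_number_one(erdos1))
```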

Jerrold W. Grossman                            Patrick Ion
Oakland University                             Mathematical Reviews
Rochester, Michigan                            Ann Arbor, Michigan
grossman@oakland.edu                           ion@math.ams.org

initial version:  May 25, 1995
latest revision:  February 1, 1997
Last modified: February 3, 1997