On the evolution of some random graphs

by

Phil Pollett

Department of Mathematics

The University of Queensland

**A RANDOM GRAPH**

**The construction.**
A random (undirected) graph with *n* vertices is constructed in the
following way: pairs of vertices are selected one at a time in such a
way that each pair has the same probability of being selected on any
given occasion, and each selection is made independently of previous
selections. If the vertex pair {*x*, *y*} is selected, then an edge
is constructed which connects *x* and *y*.

**Are multiple edges possible?**
In my model, *yes!* For example, if the vertex pair {*x*, *y*} were to be selected *k* times, there would be *k* edges connecting
*x* and *y*: a *multiple edge* contributing
*cycles of length 2*.
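The construction can be sketched in a few lines of Python. This is only an illustration: the function names `sample_pairs` and `multiple_edges` are hypothetical, and the sketch assumes that a "pair" always consists of two distinct vertices (so there are no self-loops).

```python
import random
from collections import Counter

def sample_pairs(n, m, rng):
    """Select m vertex pairs uniformly at random, independently, with replacement."""
    pairs = []
    for _ in range(m):
        x, y = rng.sample(range(n), 2)        # two distinct vertices (assumption: no self-loops)
        pairs.append((min(x, y), max(x, y)))  # store the unordered pair {x, y}
    return pairs

def multiple_edges(pairs):
    """Return {pair: k} for every pair that was selected k >= 2 times."""
    counts = Counter(pairs)
    return {pair: k for pair, k in counts.items() if k >= 2}

rng = random.Random(1)
pairs = sample_pairs(n=6, m=12, rng=rng)
print(multiple_edges(pairs))  # pairs that form multiple edges in this sample
```

Because the selections are made with replacement, a repeated pair is recorded once per selection, which is exactly what produces a multiple edge.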

**ASYMPTOTIC BEHAVIOUR**

Suppose that *m* edges have been selected. We shall be concerned with the
behaviour of the graph in the limit as *n* and *m* become large, but in
such a way that *m*=*O*(*n*).

**The problem.**
Our problem is to determine the limiting probability that the graph is
acyclic.
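For finite *n* and *m*, this probability can be estimated by simulation. The sketch below is a Monte Carlo illustration only, not part of the analysis: the names `is_acyclic` and `estimate_acyclic_probability` are hypothetical, pairs are assumed to consist of two distinct vertices, and a union-find structure is used to detect the first edge that closes a cycle.

```python
import random

def is_acyclic(n, edges):
    """Union-find test: an edge joining two vertices already connected closes a cycle."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for x, y in edges:
        rx, ry = find(x), find(y)
        if rx == ry:
            return False  # also catches a repeated pair (cycle of length 2)
        parent[rx] = ry
    return True

def estimate_acyclic_probability(n, m, trials, seed=0):
    """Fraction of simulated graphs (n vertices, m selected pairs) with no cycle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        edges = [tuple(rng.sample(range(n), 2)) for _ in range(m)]
        hits += is_acyclic(n, edges)
    return hits / trials

print(estimate_acyclic_probability(n=200, m=80, trials=300))
```

The estimate is subject to sampling error and to finite-size effects, so it should be read as a rough check on the limiting value rather than a substitute for it.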

**Motivation.**
Havas and Majewski present an algorithm for
*minimal perfect hashing*
(used for memory-efficient storage and fast retrieval of items from
static sets) based on this random graph.
Their algorithm is optimal when the graph is acyclic.

**WHY ACYCLIC?**

Consider a set *W* of *m* words (or keys). Every
*bijection* *h*: *W* → *I*, where *I* = {0, 1, ..., *m*-1},
is called a *minimal perfect hash function.*
HM find hash functions of the form

$$h(w) = \bigl(g(f_1(w)) + g(f_2(w))\bigr) \bmod m,$$

where $f_1$ and $f_2$ map keys to integers (they identify the pair of vertices
of the graph corresponding to the edge *w*) and
*g* maps integers to *I*.

Given $f_1$ and $f_2$,
can *g* be chosen so that *h* is a bijection?

If the graph is acyclic then, yes, it is easy to construct
*g* from *h*. Traverse the graph: if vertex *w* is reached from
vertex *u* then set

$$g(w) = \bigl(h(e) - g(u)\bigr) \bmod m,$$

where *e*=(*u*,*w*).
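The traversal can be sketched as follows, assuming the update rule *g*(*w*) = (*h*(*e*) - *g*(*u*)) mod *m* and that the root of each component is given the value 0 (a free choice). The function name `assign_g` and its arguments are hypothetical: `edges[i]` is the vertex pair of the *i*-th key and `h_values[i]` is the hash value that key should receive.

```python
def assign_g(n, edges, h_values, m):
    """Return a list g with (g[u] + g[w]) % m == h_values[i] for every edge i,
    or None if the graph contains a cycle."""
    adj = [[] for _ in range(n)]
    for i, (u, w) in enumerate(edges):
        adj[u].append((w, i))
        adj[w].append((u, i))

    g = [None] * n
    used = [False] * len(edges)
    for root in range(n):
        if g[root] is not None:
            continue
        g[root] = 0                      # arbitrary starting value for this component
        stack = [root]
        while stack:
            u = stack.pop()
            for w, i in adj[u]:
                if used[i]:
                    continue             # edge already traversed from its other end
                used[i] = True
                if g[w] is not None:
                    return None          # a new edge reaches a visited vertex: cycle
                g[w] = (h_values[i] - g[u]) % m
                stack.append(w)
    return g

edges = [(0, 1), (1, 2), (3, 4)]         # an acyclic graph on 5 vertices
g = assign_g(5, edges, h_values=[0, 1, 2], m=3)
print(g)
```

In an acyclic graph each vertex is reached by exactly one traversed edge, so every value of *g* is set once and never has to satisfy a second, possibly conflicting, constraint.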

**EFFICIENCY**

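HM's algorithm generates $f_1$ and $f_2$ at random until an acyclic graph is found:

$$f_j(w) = \Bigl(\sum_{i=1}^{|w|} T_j(i, w[i])\Bigr) \bmod n, \qquad j = 1, 2,$$

where $T_1$ and $T_2$ are tables of random integers and *w*[*i*]
denotes the *i*-th character (an integer) of key *w*.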

The efficiency of the algorithm is determined by the probability *p* that the graph is acyclic: the expected number of iterations needed to find an acyclic graph will be 1/*p* (typically between 2 and 3).
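A short derivation of the expected iteration count, under the assumption that successive attempts are independent and each produces an acyclic graph with the same probability *p*: the number of attempts *N* is then geometric, so

$$\Pr(N = k) = (1-p)^{k-1}p, \qquad \mathrm{E}[N] = \sum_{k \ge 1} k(1-p)^{k-1}p = \frac{1}{p}.$$

An expected count between 2 and 3 therefore corresponds to *p* between 1/3 and 1/2.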

**EVALUATING *p***

**Theorem.**
If *n* and
*m* tend to ∞ in such a way that *m* ~ *cn*, where *c* is a positive
constant, the limiting probability *p* that the graph is acyclic is
given by

$$p = \begin{cases} e^{c}\sqrt{1-2c}, & c < 1/2, \\ 0, & c \ge 1/2. \end{cases}$$

*Proof.* On request. It uses results from [HM] and
Erdős and Rényi.

**SKETCH PROOF**

Let be the number of cycles of length *k*
and let . Following [HM] write

Now let , so that and

ER show that the distribution of is asymptotically Poisson: in particular,

It follows that

So, formally,

and hence

By Fatou's Lemma, we always have

from which it follows immediately that

this argument is valid even if the sum in (2) is divergent. We deduce immediately that if *c* ≥ 1/2, *p* = 0.

When *c*<1/2, we have and

From Markov's inequality we have and so By Lemma 2 of [HM], we have, for each fixed , that as . In particular, for each , the sequence is bounded above by . It follows that is bounded above by . Further, since ,

Thus, by Dominated Convergence, we have

and, hence, .

**THE FIVE STAGES OF EVOLUTION**

**PRIMORDIAL STEW: m(n)=o(n)**

If *m*(*n*) = *o*(*n*), then (with limiting probability 1)
*all components are trees*.

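Trees of order *k* appear when *m* reaches order $n^{(k-2)/(k-1)}$.
In particular, if $m \sim \rho\, n^{(k-2)/(k-1)}$, the number of trees of order *k* has a (limiting)
Poisson distribution with mean $\lambda$,
where

$$\lambda = \frac{(2\rho)^{k-1} k^{k-2}}{k!}.$$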

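Finally, if $m / n^{(k-2)/(k-1)} \to \infty$,
the number of trees of order *k* is asymptotically
normally distributed with mean and variance equal to

$$M_n = n\,\frac{k^{k-2}}{k!}\left(\frac{2m}{n}\right)^{k-1} e^{-2km/n}.$$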

To be precise, . This result holds in the next two stages of evolution; we only require .

**SPOOKY: m(n) ~ cn, where 0 < c < 1/2**

*Cycles of all orders start to appear:*
the number of cycles of order *k* has a (limiting) Poisson
distribution with mean $(2c)^k/(2k)$.

Furthermore, with limiting probability 1, all components are either
trees or consist of exactly one cycle (*k* vertices and *k* edges), the
latter having a Poisson distribution with mean

where *k* is the order of the cycle.

The largest component is a tree; it has

vertices (with probability tending to 1).

**A MONSTER APPEARS: m(n) ~ cn, where c ≥ 1/2**

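When *m*(*n*) ~ *n*/2 (*c*=1/2), the largest component has (with
probability tending to 1) of order $n^{2/3}$ vertices.
When *m*(*n*) ~ *cn* with *c*>1/2, *a giant component appears:*
the largest component in the graph has
*G*(*c*) *n* vertices, where
*G*(*c*)=1-*X*(*c*)/2*c* and

$$X(c) = \sum_{k=1}^{\infty} \frac{k^{k-1}}{k!}\left(2c\,e^{-2c}\right)^{k}.$$

Note that *G*(1/2)=0 and *G*(*c*) → 1 as *c* → ∞.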
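The giant-component fraction *G*(*c*) is easy to evaluate numerically. The sketch below assumes that *X*(*c*) is the Erdős–Rényi series Σ *k*^(*k*-1) (2*c* e^(-2*c*))^*k* / *k*!, and it uses the standard fact that this series equals the smallest positive root *x* of *x* e^(-*x*) = 2*c* e^(-2*c*). The function names and the fixed-point iteration are illustrative choices, not taken from the talk.

```python
import math

def X(c, tol=1e-13, max_iter=100_000):
    """Smallest positive root of x * exp(-x) = 2c * exp(-2c), by fixed-point iteration."""
    a = 2.0 * c * math.exp(-2.0 * c)
    x = 0.0
    for _ in range(max_iter):
        x_new = a * math.exp(x)   # increases monotonically to the smallest fixed point
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x                      # convergence is slow only near c = 1/2

def G(c):
    """Limiting fraction of vertices in the largest component (assumed form 1 - X(c)/(2c))."""
    return 1.0 - X(c) / (2.0 * c)

for c in (0.25, 0.5, 0.75, 1.0, 2.0):
    print(c, round(G(c), 4))
```

For *c* < 1/2 the smallest root is *x* = 2*c*, so the computed *G*(*c*) is zero up to rounding; it becomes positive only beyond *c* = 1/2 and approaches 1 as *c* grows.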

Almost all the other vertices belong to trees:
the total number of vertices belonging to trees
is almost surely *n*(1-*G*(*c*))+*o*(*n*).

For *c*>1/2, the expected number of components in the graph
is asymptotically

$$\frac{n}{2c}\left(X(c) - \frac{X(c)^2}{2}\right).$$

**CONNECTEDNESS: m(n) ~ cn log n, where c ≤ 1/2**

*The graph is becoming connected:* if

then (with probability tending to 1) there are only trees of order
outside the giant component, the limiting distribution of the
number of trees of order *l* being Poisson with mean
.
For example (*k*=1), if, for a real constant *y*,

$$m(n) = \tfrac{1}{2}\, n \log n + y\,n + o(n),$$

there are (almost surely) only isolated vertices outside the giant component, the number of these having a limiting Poisson distribution with mean $e^{-2y}$. And the chance that the graph is indeed connected tends to $\exp(-e^{-2y})$ (which itself tends to 1 as *y* grows).

**ASYMPTOTIC REGULARITY: m(n) ~ ω(n) n log n, where ω(n) → ∞**

*The whole graph becomes regular:*
with probability tending to 1,
the graph becomes connected and the degrees
of all vertices are asymptotically equal.