On the evolution of some random graphs

by

Phil Pollett

Department of Mathematics
The University of Queensland


A RANDOM GRAPH

The construction. A random (undirected) graph with n vertices is constructed in the following way: pairs of vertices are selected one at a time in such a way that each pair has the same probability of being selected on any given occasion, and each selection is made independently of previous selections. If the vertex pair (x, y) is selected, then an edge is constructed which connects x and y.

Are multiple edges possible? In my model, yes! For example, if the vertex pair (x, y) were to be selected k times, there would be k edges connecting x and y: a multiple edge contributing $\binom{k}{2}$ cycles of length 2.
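The construction can be simulated directly. The following is a minimal sketch, not part of the original talk: the function names `random_multigraph` and `is_acyclic` are illustrative, pairs are assumed to consist of two distinct vertices, and acyclicity is tested with a standard union-find structure (a repeated pair counts as a cycle of length 2).

```python
import random


def random_multigraph(n, m, rng):
    """Select m vertex pairs independently; every unordered pair is equally likely."""
    edges = []
    for _ in range(m):
        # Two distinct vertices chosen uniformly at random (assumption: no loops).
        x, y = rng.sample(range(n), 2)
        edges.append((x, y))
    return edges


def is_acyclic(n, edges):
    """Union-find test: an edge joining two already-connected vertices closes a cycle."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for x, y in edges:
        rx, ry = find(x), find(y)
        if rx == ry:
            return False  # also catches a pair selected twice (cycle of length 2)
        parent[rx] = ry
    return True
```

Calling `is_acyclic(n, random_multigraph(n, m, random.Random(seed)))` repeatedly gives a Monte Carlo estimate of the probability that the graph is acyclic for given n and m.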


ASYMPTOTIC BEHAVIOUR

Suppose that m edges have been selected. We shall be concerned with the behaviour of the graph in the limit as n and m become large, but in such a way that m=O(n).

The problem. Our problem is to determine the limiting probability that the graph is acyclic.

Motivation. Havas and Majewski [HM] present an algorithm for minimal perfect hashing (used for memory-efficient storage and fast retrieval of items from static sets) based on this random graph. Their algorithm is optimal when the graph is acyclic.


WHY ACYCLIC?

Consider a set W of m words (or keys). Every bijection $h : W \to I$, where $I = \{0, 1, \ldots, m-1\}$, is called a minimal perfect hash function. HM find hash functions of the form

\[ h(w) = \bigl( g(f_1(w)) + g(f_2(w)) \bigr) \bmod m. \]

Here $f_1$ and $f_2$ map keys to integers (they identify the pair of vertices of the graph corresponding to the edge w) and g maps integers to I.

Given $f_1$ and $f_2$, can g be chosen so that h is a bijection?

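If the graph is acyclic then, yes, it is easy to construct g from the required values of h. Traverse the graph: if vertex w is reached from vertex u then set

\[ g(w) = \bigl( h(e) - g(u) \bigr) \bmod m, \]

where e=(u,w) and h(e) is the hash value required of the key corresponding to the edge e.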
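The traversal can be sketched in code. This is an illustration only, under two assumptions that are not spelled out in the text above: the hash has the form h(w) = (g(f_1(w)) + g(f_2(w))) mod m, and the key whose edge is `edges[i]` is required to hash to i. The function name `assign_g` is hypothetical.

```python
def assign_g(n, edges, m):
    """edges[i] = (u, v) is the vertex pair of the key that should hash to i.

    Returns a list g with (g[u] + g[v]) % m == i for every edge,
    or None if the traversal meets a cycle.
    """
    adj = [[] for _ in range(n)]
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    g = [None] * n
    for root in range(n):
        if g[root] is not None:
            continue
        g[root] = 0               # free choice at the root of each tree
        stack = [(root, -1)]      # (vertex, index of the edge used to reach it)
        while stack:
            u, via = stack.pop()
            for w, i in adj[u]:
                if i == via:
                    continue      # do not walk back along the arriving edge
                if g[w] is not None:
                    return None   # w was reached before: the graph has a cycle
                g[w] = (i - g[u]) % m
                stack.append((w, i))
    return g
```

On a forest every vertex is assigned exactly once, so each edge constraint is satisfied by construction; on a graph with a cycle the constraints around the cycle may conflict, which is why the sketch gives up in that case.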


EFFICIENCY

HM's algorithm generates $f_1$ and $f_2$ at random until an acyclic graph is found:

\[ f_j(w) = \Bigl( \sum_{i=1}^{|w|} T_j(i, w[i]) \Bigr) \bmod n, \qquad j = 1, 2, \]

where $T_1$ and $T_2$ are tables of random integers and w[i] denotes the i-th character (an integer) of key w.

The efficiency of the algorithm is determined by the probability $p_{n,m}$ that the graph is acyclic: the expected number of iterations needed to find an acyclic graph will be $1/p_{n,m}$ (typically between 2 and 3).


EVALUATING $p_{n,m}$

Theorem. If n and m tend to $\infty$ in such a way that $m \sim cn$, where c is a positive constant, the limiting probability p that the graph is acyclic is given by

\[ p = \begin{cases} e^{c}\sqrt{1-2c} & \text{if } c < 1/2, \\ 0 & \text{if } c \ge 1/2. \end{cases} \]

Proof. On request. It uses results from [HM] and Erdős and Rényi (ER).
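To get a feel for the numbers, the limit can be evaluated directly. The sketch below assumes the closed form p = e^c sqrt(1-2c) for c < 1/2 and p = 0 otherwise, which is what one obtains by summing Poisson means (2c)^k/(2k) over cycle lengths k >= 2; the function names are illustrative, and the reciprocal 1/p is read as the expected number of random trials before an acyclic graph is found.

```python
import math


def limiting_acyclic_probability(c):
    """Assumed closed form: e^c * sqrt(1 - 2c) for c < 1/2, and 0 for c >= 1/2."""
    if c <= 0:
        raise ValueError("c must be positive")
    if c >= 0.5:
        return 0.0
    return math.exp(c) * math.sqrt(1.0 - 2.0 * c)


def expected_iterations(c):
    """Mean number of independent trials until the first acyclic graph (geometric law)."""
    p = limiting_acyclic_probability(c)
    return math.inf if p == 0.0 else 1.0 / p
```

For c a little below 1/2 (for example c = 0.45) the expected number of trials lies between 2 and 3, and it grows without bound as c approaches 1/2.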


SKETCH PROOF

Let tex2html_wrap_inline614 be the number of cycles of length k and let tex2html_wrap_inline618 . Following [HM] write

displaymath620

Now let tex2html_wrap_inline622 , so that tex2html_wrap_inline624 and

displaymath626

ER show that the distribution of tex2html_wrap_inline614 is asymptotically Poisson: in particular,

displaymath630


It follows that

displaymath632

So, formally,

displaymath634

and hence

  equation192

By Fatou's Lemma, we always have

displaymath636

from which it follows immediately that

displaymath638

this argument is valid even if the sum in (2) is divergent. We deduce immediately that if $c \ge 1/2$, then $p = 0$.


When c<1/2, we have tex2html_wrap_inline646 and

displaymath648

From Markov's inequality we have tex2html_wrap_inline650 and so tex2html_wrap_inline652 By Lemma 2 of [HM], we have, for each fixed tex2html_wrap_inline654 , that tex2html_wrap_inline656 as tex2html_wrap_inline658 . In particular, for each tex2html_wrap_inline654 , the sequence tex2html_wrap_inline662 is bounded above by tex2html_wrap_inline664 . It follows that tex2html_wrap_inline666 is bounded above by tex2html_wrap_inline668 . Further, since tex2html_wrap_inline670 ,

displaymath672

Thus, by Dominated Convergence, we have

displaymath674

and, hence, $p = e^{c}\sqrt{1-2c}$.


THE FIVE STAGES OF EVOLUTION

PRIMORDIAL STEW: m(n)=o(n)

If $m(n) = o(n)$, then (with limiting probability 1) all components are trees.

Trees of order k appear when m reaches order $n^{(k-2)/(k-1)}$. In particular, $\tau_k$, the number of trees of order k, has a (limiting) Poisson distribution with mean $\lambda$, where

\[ \lambda = \frac{(2\rho)^{k-1} k^{k-2}}{k!}, \qquad \rho = \lim_{n \to \infty} \frac{m(n)}{n^{(k-2)/(k-1)}}. \]

Finally, if $m(n)/n^{(k-2)/(k-1)} \to \infty$, the number of trees of order k is asymptotically normally distributed with mean and variance equal to

\[ M_n = n \, \frac{k^{k-2}}{k!} \left( \frac{2m}{n} \right)^{k-1} e^{-2km/n}. \]

To be precise, $(\tau_k - M_n)/\sqrt{M_n}$ converges in distribution to $N(0,1)$. This result holds in the next two stages of evolution; we only require $m(n) - \frac{1}{2k} n \log n - \frac{k-1}{2k} n \log\log n \to -\infty$.


SPOOKY: $m(n) \sim cn$, where 0<c<1/2

Cycles of all orders start to appear: $\gamma_k$, the number of cycles of order k, has a (limiting) Poisson distribution with mean $(2c)^k/(2k)$.

Furthermore, with limiting probability 1, all components are either trees or consist of exactly one cycle (k vertices and k edges), the latter having a Poisson distribution with mean

displaymath720

where k is the order of the cycle.

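The largest component is a tree; it has approximately

\[ \frac{1}{\alpha} \left( \log n - \frac{5}{2} \log\log n \right) \]

vertices (with probability tending to 1), where $\alpha = 2c - 1 - \log 2c$.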


A MONSTER APPEARS: $m(n) \sim cn$, where $c \ge 1/2$

When $m(n) \sim n/2$ (c=1/2), the largest component has (with probability tending to 1) of the order of $n^{2/3}$ vertices. When $m(n) \sim cn$ with c>1/2, a giant component appears: the largest component in the graph has G(c) n vertices, where G(c) = 1 - X(c)/(2c) and

\[ X(c) = \sum_{k=1}^{\infty} \frac{k^{k-1}}{k!} \left( 2c \, e^{-2c} \right)^{k}. \]

Note that G(1/2)=0 and $G(c) \to 1$ as $c \to \infty$.
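The size of the giant component is easy to evaluate numerically. The sketch below assumes the classical Erdős–Rényi series X(c) = sum over k >= 1 of (k^(k-1)/k!) (2c e^(-2c))^k and simply truncates it; the truncation length `terms` is an arbitrary choice, and terms are computed through logarithms to avoid overflow in k^(k-1) and k!.

```python
import math


def X(c, terms=2000):
    """Truncated series for X(c); converges geometrically when c is away from 1/2."""
    t = 2.0 * c * math.exp(-2.0 * c)
    total = 0.0
    for k in range(1, terms + 1):
        # log of k^(k-1) / k! * t^k, evaluated term by term
        log_term = (k - 1) * math.log(k) - math.lgamma(k + 1) + k * math.log(t)
        total += math.exp(log_term)
    return total


def G(c, terms=2000):
    """Asymptotic fraction of vertices in the giant component, G(c) = 1 - X(c)/(2c)."""
    return 1.0 - X(c, terms) / (2.0 * c)
```

For instance, `G(1.0)` is roughly 0.797 and `G(2.0)` roughly 0.980. Close to c = 1/2 the terms decay only like k^(-3/2), so many more terms are needed there before the truncated sum reproduces G(1/2) = 0.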

Almost all the other vertices belong to trees: the total number of vertices belonging to trees is almost surely n(1-G(c))+o(n).

For c>1/2, the expected number of components in the graph is asymptotically

\[ \frac{n}{2c} \left( X(c) - \frac{X(c)^2}{2} \right). \]


CONNECTEDNESS: $m(n) \sim cn \log n$, where $c \le 1/2$

The graph is becoming connected: if

\[ m(n) = \frac{1}{2k} n \log n + \frac{k-1}{2k} n \log\log n + yn + o(n), \]

then (with probability tending to 1) there are only trees of order $\le k$ outside the giant component, the limiting distribution of the number of trees of order k being Poisson with mean $e^{-2ky}/(k \cdot k!)$. For example (k=1), if

\[ m(n) = \frac{1}{2} n \log n + yn + o(n), \]

there are (almost surely) only isolated vertices outside the giant component, the number of these having a limiting Poisson distribution with mean $e^{-2y}$. The chance that the graph is indeed connected then tends to $\exp(-e^{-2y})$ (which itself tends to 1 as y grows).


ASYMPTOTIC REGULARITY: $m(n) \sim \omega(n) \, n \log n$, where $\omega(n) \to \infty$

The whole graph becomes asymptotically regular: with probability tending to 1, the graph is connected and the degrees of all vertices are asymptotically equal.