Quick start using `graph-tool`
==============================

The :mod:`graph_tool` module provides a :class:`~graph_tool.Graph` class and
several algorithms that operate on it. The internals of this class, and of most
algorithms, are written in C++ for performance.

The module must, of course, be imported before it can be used. The package is
subdivided into several sub-modules. To import everything from all of them, one
can do:

.. doctest::

   >>> from graph_tool.all import *

In the following, it will always be assumed that the previous line was run.

Creating and manipulating graphs
--------------------------------

An empty graph can be created by instantiating a :class:`~graph_tool.Graph`
class:

.. doctest::

   >>> g = Graph()

By default, newly created graphs are always directed. To construct undirected
graphs, one must pass the ``directed`` parameter:

.. doctest::

   >>> ug = Graph(directed=False)

A graph can always be switched on-the-fly from directed to undirected (and vice
versa) with the :meth:`~graph_tool.Graph.set_directed` method. The
"directedness" of the graph can be queried with the
:meth:`~graph_tool.Graph.is_directed` method:

.. doctest::

   >>> ug = Graph()
   >>> ug.set_directed(False)
   >>> assert not ug.is_directed()

A graph can also be created from another graph, in which case the entire graph
(and its internal property maps, see :ref:`sec_property_maps`) is copied:

.. doctest::

   >>> g1 = Graph()
   >>> # ... populate g1 ...
   >>> g2 = Graph(g1)                 # g1 and g2 are copies

Once a graph is created, it can be populated with vertices and edges. A vertex
can be added with the :meth:`~graph_tool.Graph.add_vertex` method,

.. doctest::

   >>> v = g.add_vertex()

which returns an instance of a :class:`~graph_tool.Vertex` class, also called a
*vertex descriptor*. The :meth:`~graph_tool.Graph.add_vertex` method also
accepts an optional parameter which specifies the number of vertices to
create. If this value is greater than 1, it returns a list of vertices:

.. doctest::

   >>> vlist = g.add_vertex(10)
   >>> print len(vlist)
   10

Each vertex has a unique index, numbered from 0 to N-1, where N is the
number of vertices. This index can be obtained by using the
:attr:`~graph_tool.Graph.vertex_index` attribute of the graph (which is a
*property map*, see :ref:`sec_property_maps`), or by converting the vertex
descriptor to an ``int``:

.. doctest::

   >>> v = g.add_vertex()
   >>> print g.vertex_index[v], int(v)
   11 11

There is no need to keep vertex descriptors around in order to access the
vertices at a later point: the descriptor of a vertex with a given index can be
obtained with the :meth:`~graph_tool.Graph.vertex` method:

.. doctest::

   >>> print g.vertex(8)
   8

Another option is to iterate through the vertices, as described in section
:ref:`sec_iteration`.

Once we have some vertices in the graph, we can create some edges between them
with the :meth:`~graph_tool.Graph.add_edge` method, which returns an edge
descriptor (an instance of the :class:`~graph_tool.Edge` class).

.. doctest::

   >>> v1 = g.add_vertex()
   >>> v2 = g.add_vertex()
   >>> e = g.add_edge(v1, v2)

Edges also have a unique index, which is given by the
:attr:`~graph_tool.Graph.edge_index` property:

.. doctest::

   >>> print g.edge_index[e]
   0

Unlike vertex indexes, edge indexes are not guaranteed to be contiguous in any
range, but they are always unique.

Both vertex and edge descriptors have methods which query associated information,
such as :meth:`~graph_tool.Vertex.in_degree`,
:meth:`~graph_tool.Vertex.out_degree`, :meth:`~graph_tool.Edge.source` and
:meth:`~graph_tool.Edge.target`:

.. doctest::

   >>> v1 = g.add_vertex()
   >>> v2 = g.add_vertex()
   >>> e = g.add_edge(v1, v2)
   >>> print v1.out_degree(), v2.in_degree()
   1 1
   >>> assert(e.source() == v1 and e.target() == v2)

Edges and vertices can also be removed at any time with the
:meth:`~graph_tool.Graph.remove_vertex` and
:meth:`~graph_tool.Graph.remove_edge` methods:

.. doctest::

   >>> e = g.add_edge(g.vertex(0), g.vertex(1))
   >>> g.remove_edge(e)                # e no longer exists
   >>> g.remove_vertex(g.vertex(1))    # the second vertex is also gone

.. _sec_iteration:

Iterating over vertices and edges
+++++++++++++++++++++++++++++++++

Algorithms must often iterate through the vertices and edges of the graph, or
through the out-edges, in-edges and neighbours of a vertex. The
:class:`~graph_tool.Graph` and :class:`~graph_tool.Vertex` classes provide the
necessary iterators for doing so. The iterators always give back edge or vertex
descriptors.

In order to iterate through the vertices or edges of the graph, the
:meth:`~graph_tool.Graph.vertices` and :meth:`~graph_tool.Graph.edges` methods should be used, as such:

.. doctest::

   for v in g.vertices():
       print v
   for e in g.edges():
       print e

The code above will print the vertices and edges of the graph in the order they
are found.

The out- and in-edges of a vertex, as well as its out- and in-neighbours, can be
iterated through with the :meth:`~graph_tool.Vertex.out_edges`,
:meth:`~graph_tool.Vertex.in_edges`, :meth:`~graph_tool.Vertex.out_neighbours`
and :meth:`~graph_tool.Vertex.in_neighbours` methods, respectively:

.. doctest::

   from itertools import izip
   for v in g.vertices():
       for e in v.out_edges():
           print e
       for w in v.out_neighbours():
           print w

       # the order of the out-edges and out-neighbours always matches
       for e, w in izip(v.out_edges(), v.out_neighbours()):
           assert e.target() == w

.. warning::

   You should never remove vertices or edges while iterating over them, since
   this invalidates the iterators. If you plan to remove vertices or edges
   during iteration, you must first store them somewhere (such as in a list) and
   remove them only after the iteration has finished. The behaviour of an
   invalidated iterator is undefined.
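
Deferring removal until the iteration has finished can be illustrated without
``graph-tool`` at all. The following is a minimal sketch in plain Python; the
adjacency dictionary ``adj`` and the helper ``remove_low_degree`` are
hypothetical stand-ins for a graph, not part of the library:

.. code-block:: python

   def remove_low_degree(adj, min_degree):
       """Remove every vertex with fewer than ``min_degree`` neighbours."""
       # First pass: only read the container while iterating over it.
       doomed = [v for v, nbrs in adj.items() if len(nbrs) < min_degree]
       # Second pass: the iteration is over, so mutation is safe.
       for v in doomed:
           for w in adj.pop(v):
               if w in adj:
                   adj[w].discard(v)
       return doomed

   # undirected toy graph: vertex -> set of neighbours
   adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
   removed = remove_low_degree(adj, 2)

The same two-pass shape applies to vertex or edge descriptors: collect them in
a list during the loop, and call the removal methods only afterwards.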

.. _sec_property_maps:

Property maps
-------------

Property maps are a way of associating additional information to the vertices,
edges or to the graph itself. There are thus three types of property maps:
vertex, edge and graph. All of them are instances of the same class,
:class:`~graph_tool.PropertyMap`. Each property map has an associated *value
type*, which must be chosen from the predefined set:

.. tabularcolumns:: |l|l|

.. table::

    =======================     ================
     Type name                  Aliases
    =======================     ================
    ``bool``                    ``uint8_t``
    ``int32_t``                 ``int``   
    ``int64_t``                 ``long``  
    ``double``                  ``float``
    ``long double``             
    ``string``                  
    ``vector<bool>``            ``vector<uint8_t>``
    ``vector<int32_t>``         ``vector<int>``
    ``vector<int64_t>``         ``vector<long>``
    ``vector<double>``          ``vector<float>``
    ``vector<long double>``
    ``vector<string>``
    ``python::object``          ``object`` 
    =======================     ================

New property maps can be created for a given graph by calling the
:meth:`~graph_tool.Graph.new_vertex_property`,
:meth:`~graph_tool.Graph.new_edge_property` or
:meth:`~graph_tool.Graph.new_graph_property` method, one for each map type. The
values are then accessed by vertex or edge descriptors, or the graph itself, as
such:

.. doctest::

    from itertools import izip
    from numpy.random import randint

    g = Graph()
    g.add_vertex(100)
    # insert some random links
    for s,t in izip(randint(0, 100, 100), randint(0, 100, 100)):
        g.add_edge(g.vertex(s), g.vertex(t))
    
    vprop_double = g.new_vertex_property("double")
    vprop_vint = g.new_vertex_property("vector<int>")

    eprop_dict = g.new_edge_property("object")

    gprop_bool = g.new_graph_property("bool")

    vprop_double[g.vertex(10)] = 3.1416

    vprop_vint[g.vertex(40)] = [1, 3, 42, 54]
    
    eprop_dict[g.edges().next()] = {"foo":"bar", "gnu":42}

    gprop_bool[g] = True

Property maps with scalar value types can also be accessed as a numpy
:class:`~numpy.ndarray`, with the :meth:`~graph_tool.PropertyMap.get_array`
method, i.e.,

.. doctest::

    from numpy.random import random

    # this assigns random values to the properties
    vprop_double.get_array()[:] = random(g.num_vertices())
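
As a mental model, and only that, a scalar vertex property map can be pictured
as an array with one slot per vertex, addressed by the vertex index. The sketch
below uses the standard-library ``array`` module; the class
``VertexPropertySketch`` is hypothetical and says nothing about how
:class:`~graph_tool.PropertyMap` is actually implemented:

.. code-block:: python

   from array import array

   class VertexPropertySketch(object):
       """Toy stand-in for a scalar ("double") vertex property map."""

       def __init__(self, num_vertices):
           # One double-precision slot per vertex, initialised to zero.
           self.a = array("d", [0.0] * num_vertices)

       def __getitem__(self, index):
           return self.a[index]

       def __setitem__(self, index, value):
           self.a[index] = value

   prop = VertexPropertySketch(5)
   prop[2] = 3.1416                     # access by a single vertex index
   single = prop[2]
   prop.a[:] = array("d", [1.0] * 5)    # overwrite all values at once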

Internal property maps
++++++++++++++++++++++

Any created property map can be made "internal" to the respective graph. This
means that it will be copied and saved to a file together with the
graph. Properties are internalized by including them in the graph's
dictionary-like attributes :attr:`~graph_tool.Graph.vertex_properties`,
:attr:`~graph_tool.Graph.edge_properties` or
:attr:`~graph_tool.Graph.graph_properties`. When inserted in the graph, the
property maps must have a unique name (among those of the same type):

.. doctest::

    >>> eprop = g.new_edge_property("string")
    >>> g.edge_properties["some name"] = eprop
    >>> g.list_properties()
    some name      (edge)    (type: string)


Graph I/O
---------

Graphs can be saved and loaded in two formats: `graphml
<http://graphml.graphdrawing.org/>`_ and `dot
<http://www.graphviz.org/doc/info/lang.html>`_. Graphml is the default and
preferred format. The dot format is also supported, but since it contains no
type information, all properties are read back as strings and must be
converted by hand. Therefore you should always use graphml, except when
interfacing with other software that expects the dot format.

A graph can be saved to or loaded from a file with the
:meth:`~graph_tool.Graph.save` and :meth:`~graph_tool.Graph.load` methods, which
take either a file name or a file-like object. A graph can also be loaded from
disk with the :func:`~graph_tool.load_graph` function, as such:

.. doctest::

    g = Graph()
    #  ... fill the graph ...
    g.save("my_graph.xml.gz")    
    g2 = load_graph("my_graph.xml.gz")
    # g and g2 should be copies of each other

Graph objects can also be pickled with the :mod:`pickle` module.


An Example: Building a Price Network
------------------------------------

A Price network is the first known model of a "scale-free" graph, invented in
1976 by `de Solla Price
<http://en.wikipedia.org/wiki/Derek_J._de_Solla_Price>`_. It is defined
dynamically: at each time step a new vertex is added to the graph and connected
to an existing vertex, chosen with probability proportional to its in-degree.
The following program implements this construction method using ``graph-tool``.

.. literalinclude:: price.py
   :linenos:

The following is what should happen when the program is run.

.. testcode::
   :hide:

   from price import *
   clf()

.. testoutput::

    vertex: 36063 in-degree: 0 out-degree: 1 age: 36063
    vertex: 9075 in-degree: 4 out-degree: 1 age: 9075
    vertex: 5967 in-degree: 3 out-degree: 1 age: 5967
    vertex: 1113 in-degree: 7 out-degree: 1 age: 1113
    vertex: 25 in-degree: 84 out-degree: 1 age: 25
    vertex: 10 in-degree: 541 out-degree: 1 age: 10
    vertex: 5 in-degree: 140 out-degree: 1 age: 5
    vertex: 2 in-degree: 459 out-degree: 1 age: 2
    vertex: 1 in-degree: 520 out-degree: 1 age: 1
    vertex: 0 in-degree: 210 out-degree: 0 age: 0
    Nowhere else to go... We found the main hub!

This is the degree distribution, with 100000 nodes. If you want to really see a
power law, try to increase the number of vertices to something like :math:`10^6`
or :math:`10^7`.

.. figure:: deg-hist.png
   :align: center

   In-degree distribution of a Price network with 100000 nodes.
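
The growth rule described at the beginning of this section can also be sketched
in plain Python, independently of ``price.py``. The sketch makes one assumption
that the description does not state: every vertex gets a constant
attractiveness of 1 on top of its in-degree, since otherwise vertices with
in-degree zero could never receive their first edge. The function
``price_network`` and its parameters are hypothetical names:

.. code-block:: python

   import random

   def price_network(n, seed=42):
       """Grow a Price network with ``n`` vertices; return (edges, in_degree)."""
       rng = random.Random(seed)
       edges = []                # (source, target) pairs
       in_degree = [0] * n
       # Vertex v appears in ``urn`` in_degree[v] + 1 times, so a uniform draw
       # picks v with probability proportional to in_degree[v] + 1.
       urn = [0]
       for v in range(1, n):
           target = rng.choice(urn)
           edges.append((v, target))
           in_degree[target] += 1
           urn.append(target)    # one more ticket for the chosen target
           urn.append(v)         # baseline ticket for the new vertex
       return edges, in_degree

   edges, in_degree = price_network(1000)

Every vertex except vertex 0 ends up with out-degree 1, and every edge points
from a newer vertex to an older one.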

We can draw the graph to see some other features of its topology. For that we
use the :func:`~graph_tool.draw.graph_draw` function.

.. testcode::

   g = load_graph("price.xml.gz")
   g.remove_vertex_if(lambda v: g.vertex_index[v] >= 1000)
   graph_draw(g, size=(10,10), layout="arf", output="price.png")

.. figure:: price.png
   :align: center

   First 1000 nodes of a Price network.

Graph filtering
---------------

One of the most useful features of ``graph-tool`` is the "on-the-fly" filtering
of edges and/or vertices. Filtering means the temporary masking of
vertices/edges, which are not really removed and can easily be
recovered. Vertices or edges which are to be filtered should be marked with a
:class:`~graph_tool.PropertyMap` with value type ``bool``, which is then set
with the :meth:`~graph_tool.Graph.set_vertex_filter` or
:meth:`~graph_tool.Graph.set_edge_filter` method. By default, vertices or edges
with value "1" are *kept* in the graph, and those with value "0" are filtered
out. This behaviour can be modified with the ``inverted`` parameter of the
respective method. All manipulation functions and algorithms will work as if
the marked edges or vertices were removed from the graph, with minimum overhead.
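
The masking semantics can be illustrated with a small stand-alone sketch. It is
written in plain Python and does not use ``graph-tool``; the function
``visible_edges`` is a hypothetical helper, and only the name of its
``inverted`` parameter is borrowed from the description above:

.. code-block:: python

   def visible_edges(edges, mask, inverted=False):
       """Return the edges that pass the filter; ``edges`` is left untouched."""
       # By default an edge is kept when its mask value is true ("1");
       # with ``inverted=True`` the meaning of the mask is flipped.
       return [e for e, keep in zip(edges, mask) if bool(keep) != inverted]

   edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
   mask = [1, 0, 1, 1]

   kept = visible_edges(edges, mask)
   dropped = visible_edges(edges, mask, inverted=True)

Because the edge list itself is never modified, dropping the mask recovers the
full graph, which is the property that distinguishes filtering from removal.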

Here is an example which obtains the minimum spanning tree of a graph, using
edge filtering.

.. testcode::
   :hide:

   seed(42)

.. testcode::

   g = random_graph(100, lambda: (poisson(4), poisson(4)))
   tree = min_spanning_tree(g)
   graph_draw(g, size=(8,8), ecolor=tree, output="min_tree.png")

The ``tree`` property map has a bool type, with value "1" if the edge belongs to
the tree, and "0" otherwise. Below is an image of the original graph, with the
marked edges.

.. figure:: min_tree.png
   :align: center

We can now filter out the edges which don't belong to the minimum spanning tree.

.. testcode::

    g.set_edge_filter(tree)
    graph_draw(g, size=(8,8), layout="arf", output="min_tree_filtered.png")

This is how the graph looks when filtered:

.. figure:: min_tree_filtered.png
   :align: center

Everything should work transparently on the filtered graph, simply as if the
masked edges were removed.

.. testcode::

    pr = pagerank(g)
    print pr.a

Which outputs the following.

.. testoutput::

    [ 0.25333333  0.2         0.2         0.2         0.2         0.79733333
      0.49482667  0.52853333  0.44        0.36        0.2         0.42455467
      0.2         0.552       0.28        0.36        0.36        0.2         0.2
      0.25333333  1.89578667  1.06077099  0.2         0.2         1.26709333
      0.36853333  0.2         0.2         0.2         0.488       0.49333333
      0.28        0.2         0.2         0.2         0.648       0.2
      0.29173333  0.28        0.56138667  0.42455467  0.2         0.36        0.504
      0.2         1.17173333  0.2         0.28        0.36        0.488       0.52
      0.2         0.2         0.44        0.648       0.2         0.6704      0.2
      0.36        0.2         0.2         0.2         1.15701333  0.2         0.344
      0.2         0.36        0.55733333  0.2         0.344       0.28        0.2
      0.2         0.424       0.36        0.73333333  0.36853333  0.29173333
      1.07596373  0.36        0.408       1.33386667  0.25333333  0.2         0.2
      0.2         0.2         0.2         1.24533333  0.45173333  0.28        0.2
      0.344       0.2         0.2         1.19626667  0.2         0.632       0.2
      0.2       ]

The original graph can be recovered by setting the edge filter to ``None``.

.. testcode::

    g.set_edge_filter(None)
    pr = pagerank(g)
    print pr.a

Which outputs the following.

.. testoutput::

    [ 0.62729462  0.76981859  2.49409878  0.6482403   0.74615387  1.05405443
      0.61325462  0.89427918  0.71954456  1.30133216  0.2826627   0.77604271
      0.25073123  0.86915196  1.14884858  0.2826627   1.10094496  0.57026726
      0.6198043   0.76768522  1.52240328  0.41022172  1.17159772  0.95765161
      0.83490887  1.2136575   1.41449882  1.5489521   0.66412068  0.7352214
      1.21037608  0.64396361  0.87802656  0.31938462  0.78743109  1.67050184
      0.41200881  0.73928389  0.36523029  0.87377465  2.47043781  0.30561659
      0.93662203  0.86383309  1.21911903  1.80271636  0.2         1.03872561
      1.4359001   1.81688914  1.68310565  0.25073123  0.52549083  1.188486
      1.31594365  0.2         1.52498274  0.66120137  0.66025516  0.63644263
      0.26686166  0.88481433  1.34522024  0.31707021  1.06448852  0.51983431
      0.96831557  1.29751162  0.60525803  1.44864461  0.86032791  0.8863202
      0.44530184  0.97948075  1.5064464   1.34553188  1.23884369  0.91887273
      0.89110859  1.08966816  1.11685236  1.4889228   1.29937733  0.2
      1.37848879  0.50230514  0.60896565  0.65921635  0.98165444  0.71947832
      0.56083022  0.604076    0.48384859  0.34872367  0.5166419   1.52940485
      1.40411236  0.99922722  0.98348377  1.04335144]

Everything works in an analogous fashion with vertex filtering.

.. note::

    It is important to emphasize that the filtering functionality does not add
    any overhead when the graph is not being filtered. In this case, the
    algorithms run just as fast as if the filtering functionality didn't exist.

The graph can also have its edges reversed with the
:meth:`~graph_tool.Graph.set_reversed` method. This is also an :math:`O(1)`
operation, which does not really modify the graph.
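
One way to picture why such an operation can be :math:`O(1)` is a flag that
changes how edges are read instead of rewriting them. The class
``ReversibleEdges`` below is a hypothetical, plain-Python illustration of that
idea and makes no claim about the library's internals:

.. code-block:: python

   class ReversibleEdges(object):
       """Edge list whose direction can be flipped without touching the data."""

       def __init__(self, edges):
           self._edges = list(edges)   # stored (source, target) pairs
           self._reversed = False

       def set_reversed(self, is_reversed):
           # Constant time: only a flag changes, no edge is rewritten.
           self._reversed = bool(is_reversed)

       def __iter__(self):
           for s, t in self._edges:
               yield (t, s) if self._reversed else (s, t)

   view = ReversibleEdges([(0, 1), (1, 2)])
   view.set_reversed(True)
   flipped = list(view)
   view.set_reversed(False)
   restored = list(view)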

As mentioned previously, the directedness of the graph can also be changed
"on-the-fly" with the :meth:`~graph_tool.Graph.set_directed` method.