Commit 557588b1 authored by Tiago Peixoto

Update documentation

parent c4c7e675
@@ -19,47 +19,3 @@ job_clang_amd64:
- cd doc; python3 /usr/bin/sphinx-build -b doctest . build *.rst
tags:
- amd64
# job_gcc_py2_amd64:
# script:
# - ./autogen.sh
# - ./configure CXX="ccache g++" PYTHON=python2 --prefix=$PWD/install --with-python-module-path=$PWD/install/site-packages
# - CCACHE_BASEDIR=$PWD make $MAKEOPTS
# - make install
# - export PYTHONPATH=$PWD/install/site-packages
# - cd doc; python2 /usr/bin/sphinx-build2 -b doctest . build *.rst demos/inference/inference.rst
# tags:
# - amd64
# job_clang_py2_amd64:
# script:
# - ./autogen.sh
# - ./configure CXX="ccache clang++" PYTHON=python2 --prefix=$PWD/install --with-python-module-path=$PWD/install/site-packages
# - CCACHE_BASEDIR=$PWD make $MAKEOPTS
# - make install
# - export PYTHONPATH=$PWD/install/site-packages
# - cd doc; python2 /usr/bin/sphinx-build2 -b doctest . build *.rst
# tags:
# - amd64
job_gcc_amd64_nosh:
script:
- ./autogen.sh
- ./configure CXX="ccache g++" PYTHON=python3 --prefix=$PWD/install --with-python-module-path=$PWD/install/site-packages --disable-sparsehash
- CCACHE_BASEDIR=$PWD make $MAKEOPTS
- make install
- export PYTHONPATH=$PWD/install/site-packages
- cd doc; python3 /usr/bin/sphinx-build -b doctest . build *.rst
tags:
- amd64
job_clang_amd64_nosh:
script:
- ./autogen.sh
- ./configure CXX="ccache clang++" PYTHON=python3 --prefix=$PWD/install --with-python-module-path=$PWD/install/site-packages --disable-sparsehash
- CCACHE_BASEDIR=$PWD make $MAKEOPTS
- make install
- export PYTHONPATH=$PWD/install/site-packages
- cd doc; python3 /usr/bin/sphinx-build -b doctest . build *.rst
tags:
- amd64
CXX=g++
CXXFLAGS=-O3 -fopenmp -std=gnu++17 -Wall -fPIC `pkg-config --cflags --libs graph-tool-py3.7` -shared
CXXFLAGS=-O3 -fopenmp -std=gnu++17 -Wall -fPIC `pkg-config --cflags --libs graph-tool-py3.8` -shared
ALL: libkcore.so
libkcore.so: kcore.hh kcore.cc
......
@@ -162,7 +162,7 @@ In graph-tool iteration can also be done in a more convenient way using
}
Both approaches above are equivalent. Graph-tool also provides
``edges_range()``, ``out_edges_range()``, ``in_edges_range()``, etc. in
``edges_range()``, ``out_edges_range()``, ``in_edges_range()``, etc., in
an analogous fashion.
Extracting specific property maps
......
@@ -119,11 +119,15 @@ the weights, as follows:
state.draw(edge_color=g.ep.weight, ecmap=(matplotlib.cm.inferno, .6),
eorder=g.ep.weight, edge_pen_width=gt.prop_to_size(g.ep.weight, 1, 4, power=1),
edge_gradient=[], output="moreno-train-wsbm.svg")
edge_gradient=[], output="moreno-train-wsbm.pdf")
.. figure:: moreno-train-wsbm.*
.. testcleanup:: weighted-model
conv_png("moreno-train-wsbm.pdf")
.. figure:: moreno-train-wsbm.png
:align: center
:width: 350px
:width: 450px
Best fit of the Binomial-weighted degree-corrected SBM for a network
of terror suspects, using the strength of connection as edge
@@ -154,7 +158,7 @@ follows:
os.chdir("demos/inference")
except FileNotFoundError:
pass
gt.seed_rng(44)
gt.seed_rng(44 + 4)
.. testcode:: food-web
@@ -169,11 +173,15 @@ follows:
state.draw(edge_color=gt.prop_to_size(g.ep.weight, power=1, log=True), ecmap=(matplotlib.cm.inferno, .6),
eorder=g.ep.weight, edge_pen_width=gt.prop_to_size(g.ep.weight, 1, 4, power=1, log=True),
edge_gradient=[], output="foodweb-wsbm.svg")
edge_gradient=[], output="foodweb-wsbm.pdf")
.. testcleanup:: food-web
.. figure:: foodweb-wsbm.*
conv_png("foodweb-wsbm.pdf")
.. figure:: foodweb-wsbm.png
:align: center
:width: 350px
:width: 450px
Best fit of the exponential-weighted degree-corrected SBM for a food
web, using the energy flow as edge covariates (indicated by the edge
@@ -204,11 +212,15 @@ can fit this alternative model simply by using the transformed weights:
state_ln.draw(edge_color=gt.prop_to_size(g.ep.weight, power=1, log=True), ecmap=(matplotlib.cm.inferno, .6),
eorder=g.ep.weight, edge_pen_width=gt.prop_to_size(g.ep.weight, 1, 4, power=1, log=True),
edge_gradient=[], output="foodweb-wsbm-lognormal.svg")
edge_gradient=[], output="foodweb-wsbm-lognormal.pdf")
.. testcleanup:: food-web
.. figure:: foodweb-wsbm-lognormal.*
conv_png("foodweb-wsbm-lognormal.pdf")
.. figure:: foodweb-wsbm-lognormal.png
:align: center
:width: 350px
:width: 450px
Best fit of the log-normal-weighted degree-corrected SBM for a food
web, using the energy flow as edge covariates (indicated by the edge
@@ -245,9 +257,9 @@ Therefore, we can compute the posterior odds ratio between both models as:
.. testoutput:: food-web
:options: +NORMALIZE_WHITESPACE
ln Λ: -24.246389...
ln Λ: -16.814693...
A value of :math:`\Lambda \approx \mathrm{e}^{-24} \approx 10^{-10}` in
A value of :math:`\Lambda \approx \mathrm{e}^{-17} \approx 10^{-7}` in
favor of the exponential model indicates that the log-normal model does not
provide a better fit for this particular data. Based on this, we
conclude that the exponential model should be preferred in this case.
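For reference, such a ratio follows directly from the description lengths
of the two fits; a minimal sketch, assuming ``state`` and ``state_ln``
hold the exponential and log-normal fits from the preceding examples::

    # The description length is the negative log-probability, so the
    # difference of entropies gives the posterior log odds ratio;
    # negative values favor the exponential model.
    print("ln Λ:", state.entropy() - state_ln.entropy())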
......
@@ -21,7 +21,7 @@ network of American football teams, which we load from the
os.chdir("demos/inference")
except FileNotFoundError:
pass
gt.seed_rng(8)
gt.seed_rng(12)
.. testcode:: football
@@ -32,7 +32,7 @@ which yields
.. testoutput:: football
<Graph object, undirected, with 115 vertices and 613 edges at 0x...>
<Graph object, undirected, with 115 vertices and 613 edges, 4 internal vertex properties, 2 internal graph properties, at 0x...>
we then fit the degree-corrected model by calling
@@ -134,7 +134,7 @@ illustrate its use with the neural network of the `C. elegans
.. testsetup:: celegans
gt.seed_rng(51)
gt.seed_rng(52)
.. testcode:: celegans
@@ -145,7 +145,7 @@ which has 297 vertices and 2359 edges.
.. testoutput:: celegans
<Graph object, directed, with 297 vertices and 2359 edges at 0x...>
<Graph object, directed, with 297 vertices and 2359 edges, 2 internal vertex properties, 1 internal edge property, 2 internal graph properties, at 0x...>
A hierarchical fit of the degree-corrected model is performed as follows.
@@ -161,10 +161,16 @@ clustering using the
.. testcode:: celegans
state.draw(output="celegans-hsbm-fit.svg")
state.draw(output="celegans-hsbm-fit.pdf")
.. figure:: celegans-hsbm-fit.*
.. testcleanup:: celegans
conv_png("celegans-hsbm-fit.pdf")
.. figure:: celegans-hsbm-fit.png
:align: center
:width: 80%
Most likely hierarchical partition of the neural network of
the *C. elegans* worm according to the nested degree-corrected SBM.
@@ -186,10 +192,10 @@ which shows the number of nodes and groups in all levels:
.. testoutput:: celegans
l: 0, N: 297, B: 16
l: 1, N: 16, B: 8
l: 2, N: 8, B: 3
l: 3, N: 3, B: 1
l: 0, N: 297, B: 19
l: 1, N: 19, B: 6
l: 2, N: 6, B: 2
l: 3, N: 2, B: 1
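Output of this form is produced by the state's summary method; a minimal
sketch, assuming ``state`` is the
:class:`~graph_tool.inference.nested_blockmodel.NestedBlockState` fitted
above::

    state.print_summary()  # prints one line per hierarchical level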
The hierarchical levels themselves are represented by individual
:class:`~graph_tool.inference.blockmodel.BlockState` instances obtained via the
@@ -203,10 +209,10 @@ The hierarchical levels themselves are represented by individual
.. testoutput:: celegans
<BlockState object with 16 blocks (16 nonempty), degree-corrected, for graph <Graph object, directed, with 297 vertices and 2359 edges at 0x...>, at 0x...>
<BlockState object with 8 blocks (8 nonempty), for graph <Graph object, directed, with 16 vertices and 134 edges at 0x...>, at 0x...>
<BlockState object with 3 blocks (3 nonempty), for graph <Graph object, directed, with 8 vertices and 50 edges at 0x...>, at 0x...>
<BlockState object with 1 blocks (1 nonempty), for graph <Graph object, directed, with 3 vertices and 8 edges at 0x...>, at 0x...>
<BlockState object with 19 blocks (19 nonempty), degree-corrected, for graph <Graph object, directed, with 297 vertices and 2359 edges, 2 internal vertex properties, 1 internal edge property, 2 internal graph properties, at 0x...>, at 0x...>
<BlockState object with 6 blocks (6 nonempty), for graph <Graph object, directed, with 19 vertices and 176 edges, 2 internal vertex properties, 1 internal edge property, at 0x...>, at 0x...>
<BlockState object with 2 blocks (2 nonempty), for graph <Graph object, directed, with 6 vertices and 30 edges, 2 internal vertex properties, 1 internal edge property, at 0x...>, at 0x...>
<BlockState object with 1 blocks (1 nonempty), for graph <Graph object, directed, with 2 vertices and 4 edges, 2 internal vertex properties, 1 internal edge property, at 0x...>, at 0x...>
This means that we can inspect the hierarchical partition just as before:
@@ -221,6 +227,54 @@ This means that we can inspect the hierarchical partition just as before:
.. testoutput:: celegans
5
2
1
0
Trade-off between memory usage and computation time
+++++++++++++++++++++++++++++++++++++++++++++++++++
The agglomerative algorithm behind
:func:`~graph_tool.inference.minimize.minimize_blockmodel_dl` and
:func:`~graph_tool.inference.minimize.minimize_nested_blockmodel_dl` has
a log-linear complexity in the size of the network, but it makes several
copies of the internal blockmodel state, which can become a problem for
very large networks (i.e. tens of millions of edges or more). An
alternative is to use a greedy algorithm based on a merge-split MCMC
with zero temperature [peixoto-merge-split-2020]_, which requires a
single global state, and thus can reduce memory usage. This is achieved
by following the instructions in Sec. :ref:`sampling`, while setting the
inverse temperature parameter ``beta`` to infinity. For example, an
equivalent to the above minimization for the *C. elegans* network is the
following:
.. testcode:: celegans-mcmc
g = gt.collection.data["celegansneural"]
state = gt.NestedBlockState(g)
for i in range(1000): # this should be sufficiently large
    state.multiflip_mcmc_sweep(beta=np.inf, niter=10)
Whenever possible, this procedure should be repeated several times, and
the result with the smallest description length (obtained via the
:meth:`~graph_tool.inference.blockmodel.BlockState.entropy` method)
should be chosen; a sketch of this strategy is given after the example
below. Better results still can be obtained, at the expense of a longer
computation time, by using the
:func:`~graph_tool.inference.mcmc.mcmc_anneal` function, which
implements `simulated annealing
<https://en.wikipedia.org/wiki/Simulated_annealing>`_:
.. testcode:: celegans-mcmc-anneal
g = gt.collection.data["celegansneural"]
state = gt.NestedBlockState(g)
gt.mcmc_anneal(state, beta_range=(1, 10), niter=1000, mcmc_equilibrate_args=dict(force_niter=10))
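As noted above, the zero-temperature approach benefits from several
independent restarts, keeping the fit with the smallest description
length. A minimal sketch of this strategy (the number of restarts and
sweeps below are illustrative choices, not prescribed values)::

    import numpy as np

    g = gt.collection.data["celegansneural"]

    best_state, best_dl = None, np.inf
    for attempt in range(10):            # independent restarts
        state = gt.NestedBlockState(g)
        for i in range(1000):            # zero-temperature merge-split sweeps
            state.multiflip_mcmc_sweep(beta=np.inf, niter=10)
        dl = state.entropy()             # description length of this run
        if dl < best_dl:                 # keep the best fit found so far
            best_state, best_dl = state, dl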
Any of the above methods should give results similar to those of the
previous algorithms, while requiring less memory. The quality of the
results will vary depending on the data, so experimentation is
recommended.
\ No newline at end of file
@@ -154,8 +154,8 @@ evidence efficiently, as we show below, using
.. testoutput:: model-evidence
Model evidence for deg_corr = True: -579.300446... (mean field), -832.245049... (Bethe)
Model evidence for deg_corr = False: -597.211055... (mean field), -728.704043... (Bethe)
Model evidence for deg_corr = True: -383.7870425700103 (mean field), -1053.691486781025 (Bethe)
Model evidence for deg_corr = False: -364.121582158287 (mean field), -945.9676904776013 (Bethe)
If we consider the more accurate approximation, the outcome shows a
preference for the non-degree-corrected model.
@@ -219,8 +219,8 @@ approach for the same network, using the nested model.
.. testoutput:: model-evidence
Model evidence for deg_corr = True: -503.011203... (mean field), -728.064049... (Bethe)
Model evidence for deg_corr = False: -518.022407... (mean field), -633.507167... (Bethe)
Model evidence for deg_corr = True: -189.190149... (mean field), -844.518064... (Bethe)
Model evidence for deg_corr = False: -177.192565... (mean field), -727.585051... (Bethe)
The results are similar: If we consider the most accurate approximation,
the non-degree-corrected model possesses the largest evidence. Note also
......
@@ -25,8 +25,8 @@ we have
.. testoutput:: model-selection
:options: +NORMALIZE_WHITESPACE
Non-degree-corrected DL: 8470.198169...
Degree-corrected DL: 8273.840277...
Non-degree-corrected DL: 8553.474528...
Degree-corrected DL: 8266.554118...
Since it yields the smallest description length, the degree-corrected
fit should be preferred. The statistical significance of the choice can
@@ -52,13 +52,13 @@ fits. In our particular case, we have
.. testoutput:: model-selection
:options: +NORMALIZE_WHITESPACE
ln Λ: -196.357892...
ln Λ: -286.92041...
The precise threshold that should be used to decide when to `reject a
hypothesis <https://en.wikipedia.org/wiki/Hypothesis_testing>`_ is
subjective and context-dependent, but the value above implies that the
particular degree-corrected fit is around :math:`\mathrm{e}^{196}
\approx 10^{85}` times more likely than the non-degree corrected one,
particular degree-corrected fit is around :math:`\mathrm{e}^{287}
\approx 10^{124}` times more likely than the non-degree-corrected one,
and hence it can be safely concluded that it provides a substantially
better fit.
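A minimal sketch of how this ratio is formed, assuming ``state_ndc`` and
``state_dc`` are the non-degree-corrected and degree-corrected fits
obtained as in the text (the variable names here are illustrative)::

    dl_ndc = state_ndc.entropy()  # description length, non-degree-corrected
    dl_dc = state_dc.entropy()    # description length, degree-corrected
    # Posterior log odds ratio; negative values favor the degree-corrected fit.
    print("ln Λ:", dl_dc - dl_ndc)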
@@ -80,11 +80,11 @@ example, for the American football network above, we have:
.. testoutput:: model-selection
:options: +NORMALIZE_WHITESPACE
Non-degree-corrected DL: 1733.525685...
Non-degree-corrected DL: 1738.138494...
Degree-corrected DL: 1780.576716...
ln Λ: -47.051031...
ln Λ: -42.4382219...
Hence, with a posterior odds ratio of :math:`\Lambda \approx \mathrm{e}^{-47} \approx
10^{-20}` in favor of the non-degree-corrected model, we conclude that the
Hence, with a posterior odds ratio of :math:`\Lambda \approx \mathrm{e}^{-42} \approx
10^{-18}` in favor of the non-degree-corrected model, we conclude that the
degree-corrected variant is an unnecessarily complex description for
this network.
@@ -142,8 +142,8 @@ above).
.. testoutput:: missing-edges
likelihood-ratio for (101, 102): 0.37...
likelihood-ratio for (17, 56): 0.62...
likelihood-ratio for (101, 102): 0.36...
likelihood-ratio for (17, 56): 0.63...
From this we can conclude that edge :math:`(17, 56)` is more likely
than :math:`(101, 102)` to be a missing edge.
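The ratios themselves are normalized posterior probabilities of each
candidate; a small sketch of the normalization step, where ``lp1`` and
``lp2`` stand for hypothetical log-probabilities of the two candidate
edges collected during sampling::

    import numpy as np

    def likelihood_ratio(lp1, lp2):
        # Numerically stable exp(lp1) / (exp(lp1) + exp(lp2)).
        m = max(lp1, lp2)
        p1, p2 = np.exp(lp1 - m), np.exp(lp2 - m)
        return p1 / (p1 + p2)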
......
@@ -132,8 +132,8 @@ simple example, using
os.chdir("demos/inference")
except FileNotFoundError:
pass
np.random.seed(42)
gt.seed_rng(44)
np.random.seed(43)
gt.seed_rng(45)
.. testcode:: measured
@@ -151,14 +151,11 @@ simple example, using
n[e] = 2 # pretend we have measured non-edge (15, 73) twice,
x[e] = 1 # but observed it as an edge once.
bs = [g.get_vertices()] + [zeros(1)] * 5 # initial hierarchical partition
# We initialize MeasuredBlockState, assuming that each non-edge has
# been measured only once (as opposed to twice for the observed
# edges), as specified by the 'n_default' and 'x_default' parameters.
state = gt.MeasuredBlockState(g, n=n, n_default=1, x=x, x_default=0,
state_args=dict(bs=bs))
state = gt.MeasuredBlockState(g, n=n, n_default=1, x=x, x_default=0)
# We will first equilibrate the Markov chain
gt.mcmc_equilibrate(state, wait=1000, mcmc_args=dict(niter=10))
@@ -191,9 +188,9 @@ Which yields the following output:
.. testoutput:: measured
Posterior probability of edge (11, 36): 0.742674...
Posterior probability of non-edge (15, 73): 0.020702...
Estimated average local clustering: 0.572384 ± 0.003466...
Posterior probability of edge (11, 36): 0.768976...
Posterior probability of non-edge (15, 73): 0.039203...
Estimated average local clustering: 0.571939 ± 0.003534...
We have a successful reconstruction, where both ambiguous adjacency
matrix entries are correctly recovered. The value for the average
@@ -287,8 +284,7 @@ with uniform error rates, as we see with the same example:
.. testcode:: measured
state = gt.MixedMeasuredBlockState(g, n=n, n_default=1, x=x, x_default=0,
state_args=dict(bs=bs))
state = gt.MixedMeasuredBlockState(g, n=n, n_default=1, x=x, x_default=0)
# We will first equilibrate the Markov chain
gt.mcmc_equilibrate(state, wait=1000, mcmc_args=dict(niter=10))
@@ -312,9 +308,9 @@ Which yields:
.. testoutput:: measured
Posterior probability of edge (11, 36): 0.839483...
Posterior probability of non-edge (15, 73): 0.061106...
Estimated average local clustering: 0.572476 ± 0.004370...
Posterior probability of edge (11, 36): 0.631563...
Posterior probability of non-edge (15, 73): 0.022402...
Estimated average local clustering: 0.570065 ± 0.007145...
The results are very similar to the ones obtained with the uniform model
in this case, but can be quite different in situations where a large
@@ -401,15 +397,13 @@ inference:
e = g.add_edge(15, 73)
q[e] = .5 # ambiguous spurious edge
bs = [g.get_vertices()] + [zeros(1)] * 5 # initial hierarchical partition
# We initialize UncertainBlockState, assuming that each non-edge
# has an uncertainty of q_default, chosen to preserve the expected
# density of the original network:
q_default = (E - q.a.sum()) / ((N * (N - 1))/2 - E)
state = gt.UncertainBlockState(g, q=q, q_default=q_default, state_args=dict(bs=bs))
state = gt.UncertainBlockState(g, q=q, q_default=q_default)
# We will first equilibrate the Markov chain
gt.mcmc_equilibrate(state, wait=2000, mcmc_args=dict(niter=10))
@@ -441,9 +435,9 @@ The above yields the output:
.. testoutput:: uncertain
Posterior probability of edge (11, 36): 0.891889...
Posterior probability of non-edge (15, 73): 0.026002...
Estimated average local clustering: 0.544468 ± 0.021562...
Posterior probability of edge (11, 36): 0.775777...
Posterior probability of non-edge (15, 73): 0.010001...
Estimated average local clustering: 0.523013 ± 0.017268...
The reconstruction is accurate, despite the two ambiguous entries having
the same measurement probability. The reconstructed network is visualized below.
@@ -488,4 +482,74 @@ the same measurement probability. The reconstructed network is visualized below.
reconstruction. The pie fractions on the nodes correspond to the
probability of being in the group associated with the respective color.
Latent Poisson multigraphs
++++++++++++++++++++++++++
Even in situations where measurement errors can be neglected, it can
still be useful to assume a given network is the outcome of a "hidden"
multigraph model, i.e. more than one edge between nodes is allowed, but
then its multiedges are "erased" by transforming them into simple
edges. In this way, it is possible to construct generative models that
can better handle situations where the underlying network possesses
heterogeneous density, such as strong community structure and broad
degree distributions [peixoto-latent-2020]_. This can be incorporated
into the scheme of Eq. :eq:`posterior-reconstruction` by considering
the data to be the observed simple graph,
:math:`\boldsymbol{\mathcal{D}} = \boldsymbol G`. We proceed in the same
way as in the previous reconstruction scenarios, but using instead
:class:`~graph_tool.inference.uncertain_blockmodel.LatentMultigraphBlockState`.
For example, in the following we will obtain the community structure and
latent multiedges of a network of political books:
.. testcode:: erased-poisson
g = gt.collection.data["polbooks"]
state = gt.LatentMultigraphBlockState(g)
# We will first equilibrate the Markov chain
gt.mcmc_equilibrate(state, wait=100, mcmc_args=dict(niter=10))
# Now we collect the marginals for exactly 100,000 sweeps, at
# intervals of 10 sweeps:
u = None # marginal posterior multigraph
pv = None # marginal posterior group membership probabilities
def collect_marginals(s):
    global pv, u
    u = s.collect_marginal_multigraph(u)
    bstate = state.get_block_state()
    b = gt.perfect_prop_hash([bstate.levels[0].b])[0]
    pv = bstate.levels[0].collect_vertex_marginals(pv, b=b)
gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
callback=collect_marginals)
# compute average multiplicities
ew = u.new_ep("double")
w = u.ep.w
wcount = u.ep.wcount
for e in u.edges():
    ew[e] = (wcount[e].a * w[e].a).sum() / wcount[e].a.sum()
bstate = state.get_block_state()
bstate = bstate.levels[0].copy(g=u)
pv = u.own_property(pv)
bstate.draw(pos=u.own_property(g.vp.pos), vertex_shape="pie", vertex_pie_fractions=pv,
edge_pen_width=gt.prop_to_size(ew, .1, 8, power=1), edge_gradient=None,
output="polbooks-erased-poisson.svg")
.. figure:: polbooks-erased-poisson.*
:align: center
:width: 450px
Reconstructed latent Poisson degree-corrected SBM for a network of
political books, showing the marginal mean edge multiplicities as
line thickness. The pie fractions on the nodes correspond to the
probability of being in the group associated with the respective color.
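Beyond the visualization above, the marginal multiplicities can also be
inspected directly; a small sketch, using the ``u`` and ``ew`` computed
in the example::

    # List the five edges with the largest posterior mean multiplicity.
    top = sorted(u.edges(), key=lambda e: ew[e], reverse=True)
    for e in top[:5]:
        print(int(e.source()), int(e.target()), ew[e])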
.. include:: _reconstruction_dynamics.rst
@@ -111,7 +111,7 @@ epidemic process.
# Create reconstruction state
rstate = gt.EpidemicsBlockState(u, s=ss, beta=None, r=1e-6, global_beta=.1,
state_args=dict(B=1), nested=False, aE=g.num_edges())
nested=False, aE=g.num_edges())
# Now we collect the marginals for exactly 100,000 sweeps, at
# intervals of 10 sweeps:
@@ -143,8 +143,8 @@ The reconstruction can accurately recover the hidden network and the infection p
.. testoutput:: dynamics
Posterior similarity: 0.978091...
Inferred infection probability: 0.68809 ± 0.054207
Posterior similarity: 0.990587...
Inferred infection probability: 0.692324 ± 0.0496223
The figure below shows the reconstructed network and the inferred community structure.
......
@@ -27,62 +27,44 @@ large networks.
In order to perform such moves, one needs again to operate with
:class:`~graph_tool.inference.blockmodel.BlockState` or
:class:`~graph_tool.inference.nested_blockmodel.NestedBlockState` instances, and calling
their :meth:`~graph_tool.inference.blockmodel.BlockState.mcmc_sweep` methods. For
example, the following will perform 1000 sweeps of the algorithm with
the network of characters in the novel Les Misérables, starting from a
random partition into 20 groups
:class:`~graph_tool.inference.nested_blockmodel.NestedBlockState`
instances, and call either their
:meth:`~graph_tool.inference.blockmodel.BlockState.mcmc_sweep` or
:meth:`~graph_tool.inference.blockmodel.BlockState.multiflip_mcmc_sweep`
methods. The former implements a simpler MCMC where a single node is
moved at a time, while the latter is a more efficient variant that
performs merges and splits [peixoto-merge-split-2020]_, and should in
general be preferred.
For example, the following will perform 1000 sweeps of the algorithm
with the network of characters in the novel Les Misérables, starting
from a partition into a single group
.. testcode:: model-averaging
g = gt.collection.data["lesmis"]
state = gt.BlockState(g, B=20) # This automatically initializes the state
# with a random partition into B=20
# nonempty groups; The user could
# also pass an arbitrary initial
# partition using the 'b' parameter.
state = gt.BlockState(g, B=1) # This automatically initializes the state with a partition
# into one group. The user could also pass a higher number
# to start with a random partition of a given size, or pass a
# specific initial partition using the 'b' parameter.
# Now we run 1,000 sweeps of the MCMC. Note that the number of groups
# is allowed to change, so it will eventually move from the initial
# value of B=20 to whatever is most appropriate for the data.
# value of B=1 to whatever is most appropriate for the data.
dS, nattempts, nmoves = state.mcmc_sweep(niter=1000)
dS, nattempts, nmoves = state.multiflip_mcmc_sweep(niter=1000)
print("Change in description length:", dS)
print("Number of accepted vertex moves:", nmoves)
.. testoutput:: model-averaging
Change in description length: -353.848032...
Number of accepted vertex moves: 37490
Change in description length: -71.707346...
Number of accepted vertex moves: 44055
.. note::
Starting from a random partition is rarely the best option, since it
may take a long time for it to equilibrate. It was done above simply
as an illustration on how to initialize
:class:`~graph_tool.inference.blockmodel.BlockState` by hand. Instead, a much
better option in practice is to start from an approximation to the
"ground state" obtained with
:func:`~graph_tool.inference.minimize.minimize_blockmodel_dl`, e.g.
.. testcode:: model-averaging
state = gt.minimize_blockmodel_dl(g)
state = state.copy(B=g.num_vertices())
dS, nattempts, nmoves = state.mcmc_sweep(niter=1000)
print("Change in description length:", dS)
print("Number of accepted vertex moves:", nmoves)
.. testoutput:: model-averaging
Change in description length: 31.622518...
Number of accepted vertex moves: 43152
Although the above is sufficient to implement model averaging, there is a
convenience function called
Although the above is sufficient to implement sampling from the
posterior, there is a convenience function called
:func:`~graph_tool.inference.mcmc.mcmc_equilibrate` that is intended to
simplify the detection of equilibration, by keeping track of the maximum
and minimum values of description length encountered and how many sweeps
@@ -93,58 +75,7 @@ have been made without a "record breaking" event. For example,
# We will accept equilibration if 10 sweeps are completed without a
# record breaking event, 2 consecutive times.
gt.mcmc_equilibrate(state, wait=10, nbreaks=2, mcmc_args=dict(niter=10), verbose=True)
will output:
.. testoutput:: model-averaging
:options: +NORMALIZE_WHITESPACE