Commit c0b1794f by Tiago Peixoto

### inference: update cookbook and docstrings

parent 100f01d4
```diff
@@ -34,7 +34,7 @@ which yields

-we then fit the degree-corrected model by calling
+We then fit the degree-corrected model by calling:

 .. testcode:: football
 ...
```
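The `testcode` body is elided in this hunk; for reference, the fitting call it refers to typically looks like the sketch below (an editor's gloss, not part of the commit — the way `deg_corr` is passed has changed across graph-tool releases, shown here via `state_args` as in recent versions):

```python
import graph_tool.all as gt

# American college football network used throughout this section
g = gt.collection.data["football"]

# Fit the degree-corrected SBM by minimizing the description length
state = gt.minimize_blockmodel_dl(g, state_args=dict(deg_corr=True))

print("Degree-corrected DL:", state.entropy())
```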
```diff
@@ -52,129 +52,65 @@

 The computation of the posterior entropy :math:`\mathcal{S}`, however,
 is significantly more difficult, since it involves measuring the
 precise value of :math:`q(\boldsymbol b)`. A direct "brute force"
 computation of :math:`\mathcal{S}` is implemented via
 :meth:`~graph_tool.inference.blockmodel.BlockState.collect_partition_histogram`
 and :func:`~graph_tool.inference.blockmodel.microstate_entropy`,
 however this is only feasible for very small networks. For larger
 networks, we are forced to perform approximations.
-The simplest is a "mean field" one, where we assume the posterior
-factorizes as
-
-.. math::
-
-   q(\boldsymbol b) \approx \prod_i{q_i(b_i)}
-
-where
-
-.. math::
-
-   q_i(r) = P(b_i = r | \boldsymbol A)
-
-is the marginal group membership distribution of node :math:`i`. This
-yields an entropy value given by
-
-.. math::
-
-   S \approx -\sum_i\sum_rq_i(r)\ln q_i(r).
-
-This approximation should be seen as an upper bound, since any existing
-correlation between the nodes (which are ignored here) will yield
-smaller entropy values.
-
-A more accurate assumption is called the Bethe approximation
-[mezard-information-2009]_, and takes into account the correlation
-between adjacent nodes in the network,
-
-.. math::
-
-   q(\boldsymbol b) \approx \prod_{i<j}q_{ij}(b_i,b_j)^{A_{ij}}\prod_iq_i(b_i)^{1-k_i}
-
-where :math:`A_{ij}` is the adjacency matrix, :math:`k_i` is the degree
-of node :math:`i`, and
-
-.. math::
-
-   q_{ij}(r, s) = P(b_i = r, b_j = s|\boldsymbol A)
-
-is the joint group membership distribution of nodes :math:`i` and
-:math:`j` (a.k.a. the edge marginals). This yields an entropy value
-given by
-
-.. math::
-
-   S \approx -\sum_{i<j}A_{ij}\sum_{rs}q_{ij}(r,s)\ln q_{ij}(r,s)
-             -\sum_i(1-k_i)\sum_rq_i(r)\ln q_i(r).
-
-Note that for the nested model, at the higher levels :math:`l>0` only
-the mean-field approximation is applicable, since the adjacency matrix
-of the higher layers is not constant.
-We show below the
+When using the nested model, the approach is entirely analogous. We
+show below the
 approach for the same network, using the nested model.
 ...
@@ -182,49 +118,50 @@ approach for the same network, using the nested model.

    g = gt.collection.data["lesmis"]
-   nL = 10

    for deg_corr in [True, False]:
-       state = gt.minimize_nested_blockmodel_dl(g, deg_corr=deg_corr) # Initialize the Markov
-                                                                      # chain from the "ground
-                                                                      # state"
-
-       bs = state.get_bs()                     # Get hierarchical partition.
-       bs += [np.zeros(1)] * (nL - len(bs))    # Augment it to L = 10 with
-                                               # single-group levels.
-
-       state = state.copy(bs=bs, sampling=True)
-
-       dls = []                               # description length history
-       vm = [None] * len(state.get_levels())  # vertex marginals
-       em = None                              # edge marginals
-
-       def collect_marginals(s):
-           global vm, em
-           levels = s.get_levels()
-           vm = [sl.collect_vertex_marginals(vm[l])
-                 for l, sl in enumerate(levels)]
-           em = levels[0].collect_edge_marginals(em)
-           dls.append(s.entropy())
-
-       # Now we collect the marginal distributions for exactly 200,000 sweeps
-       gt.mcmc_equilibrate(state, force_niter=20000, mcmc_args=dict(niter=10),
-                           callback=collect_marginals)
-
-       S_mf = [gt.mf_entropy(sl.g, vm[l]) for l, sl in enumerate(state.get_levels())]
-       S_bethe = gt.bethe_entropy(g, em)[0]
-       L = -mean(dls)
-
-       print("Model evidence for deg_corr = %s:" % deg_corr,
-             L + sum(S_mf), "(mean field),",
-             L + S_bethe + sum(S_mf[1:]), "(Bethe)")
+       state = gt.NestedBlockState(g, deg_corr=deg_corr)
+
+       # Equilibrate
+       gt.mcmc_equilibrate(state, force_niter=1000, mcmc_args=dict(niter=10))
+
+       dls = []   # description length history
+       bs = []    # partitions
+
+       def collect_partitions(s):
+           global bs, dls
+           bs.append(s.get_state())
+           dls.append(s.entropy())
+
+       # Now we collect 2000 partitions; but the larger this is, the
+       # more accurate will be the calculation
+       gt.mcmc_equilibrate(state, force_niter=2000, mcmc_args=dict(niter=10),
+                           callback=collect_partitions)
+
+       # Infer partition modes
+       pmode = gt.ModeClusterState(bs, nested=True)
+
+       # Minimize the mode state itself
+       gt.mcmc_equilibrate(pmode, wait=1, mcmc_args=dict(niter=1, beta=np.inf))
+
+       # Posterior entropy
+       H = pmode.posterior_entropy()
+
+       # log(B!) term
+       logB = mean([sum(gammaln(len(np.unique(bl)) + 1) for bl in b) for b in bs])
+
+       # Evidence
+       L = -mean(dls) + logB + H
+
+       print(f"Model log-evidence for deg_corr = {deg_corr}: {L}")

 .. testoutput:: model-evidence

-   Model evidence for deg_corr = True: -183.999497... (mean field), -840.108164... (Bethe)
-   Model evidence for deg_corr = False: -187.344952... (mean field), -738.797625... (Bethe)
+   Model log-evidence for deg_corr = True: -660.331060...
+   Model log-evidence for deg_corr = False: -657.839574...

-The results are similar: If we consider the most accurate
-approximation, the non-degree-corrected model possesses the largest
-evidence. Note also that we observe a better evidence for the nested
-models themselves, when comparing to the evidences for the non-nested
-model --- which is not quite surprising, since the non-nested model is
-a special case of the nested one.
+The results are similar: The non-degree-corrected model possesses the
+largest evidence. Note also that we observe a better evidence for the
+nested models themselves, when comparing to the evidences for the
+non-nested model --- which is not quite surprising, since the
+non-nested model is a special case of the nested one.
```
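For readers cross-checking the replacement code: the quantity printed in the new loop is an estimate of the model log-evidence, assembled according to the identity below (an editor's gloss in the section's own notation; `dls` stores the description length $\Sigma = -\ln P(\boldsymbol A, \boldsymbol b)$):

$$
\ln P(\boldsymbol A) = \sum_{\boldsymbol b} q(\boldsymbol b) \ln P(\boldsymbol A, \boldsymbol b) + \mathcal{S} \approx -\left<\Sigma\right> + \left<\ln B!\right> + H.
$$

Here $H$ is the entropy returned by `pmode.posterior_entropy()`, and $\left<\ln B!\right>$ is the `logB` term (summed over hierarchy levels in the code), which — on our reading of the API — restores the $B!$ equivalent labeled partitions that the label-invariant mode clustering collapses together.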
```diff
@@ -25,8 +25,8 @@ we have

 .. testoutput:: model-selection
    :options: +NORMALIZE_WHITESPACE

-   Non-degree-corrected DL: 8520.825480...
-   Degree-corrected DL:     8227.987410...
+   Non-degree-corrected DL: 8553.474528...
+   Degree-corrected DL:     8266.554118...

 Since it yields the smallest description length, the degree-corrected
 fit should be preferred. The statistical significance of the choice can
 ...
@@ -52,7 +52,7 @@ fits. In our particular case, we have

 .. testoutput:: model-selection
    :options: +NORMALIZE_WHITESPACE

-   ln Λ: -292.838070...
+   ln Λ: -286.920410...

 The precise threshold that should be used to decide when to reject a
 hypothesis [#]_ is
 ...
@@ -80,11 +80,11 @@ example, for the American football network above, we have:

 .. testoutput:: model-selection
    :options: +NORMALIZE_WHITESPACE

-   Non-degree-corrected DL: 1757.843826...
-   Degree-corrected DL:     1809.861996...
-   ln Λ:                    -52.018170...
+   Non-degree-corrected DL: 1738.138494...
+   Degree-corrected DL:     1780.576716...
+   ln Λ:                    -42.438221...

-Hence, with a posterior odds ratio of :math:`\Lambda \approx \mathrm{e}^{-52} \approx
-10^{-22}` in favor of the non-degree-corrected model, we conclude that the
+Hence, with a posterior odds ratio of :math:`\Lambda \approx \mathrm{e}^{-42} \approx
+10^{-19}` in favor of the non-degree-corrected model, we conclude that the
 degree-corrected variant is an unnecessarily complex description for
 this network.
```
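Numbers of this kind can be reproduced from two independent fits, as in the sketch below (an editor's gloss, not part of the commit; the `state_args` keyword follows recent graph-tool releases, and since the minimization is stochastic the exact values will fluctuate between runs):

```python
import graph_tool.all as gt

g = gt.collection.data["football"]

# Fit both model variants by minimizing the description length (DL)
state_nd = gt.minimize_blockmodel_dl(g, state_args=dict(deg_corr=False))
state_dc = gt.minimize_blockmodel_dl(g, state_args=dict(deg_corr=True))

S_nd = state_nd.entropy()
S_dc = state_dc.entropy()

print("Non-degree-corrected DL:", S_nd)
print("Degree-corrected DL:    ", S_dc)

# Posterior log-odds of the degree-corrected fit relative to the
# non-degree-corrected one; negative values disfavor degree correction
print("ln Λ:", S_nd - S_dc)
```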
```diff
@@ -164,15 +164,14 @@ simple example, using

    # intervals of 10 sweeps:

    u = None              # marginal posterior edge probabilities
-   pv = None             # marginal posterior group membership probabilities
+   bs = []               # partitions
    cs = []               # average local clustering coefficient

    def collect_marginals(s):
-       global pv, u, cs
+       global u, bs, cs
        u = s.collect_marginal(u)
        bstate = s.get_block_state()
-       b = gt.perfect_prop_hash([bstate.levels[0].b])[0]
-       pv = bstate.levels[0].collect_vertex_marginals(pv, b=b)
+       bs.append(bstate.levels[0].b.a.copy())
        cs.append(gt.local_clustering(s.get_graph()).fa.mean())

    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
 ...
@@ -223,7 +222,10 @@ reconstructed network:

    bstate = state.get_block_state()
    bstate = bstate.levels[0].copy(g=u)
-   pv = u.own_property(pv)
+
+   # Disambiguate partitions and obtain marginals
+   pmode = gt.PartitionModeState(bs, converge=True)
+   pv = pmode.get_marginal(u)

    edash = u.new_ep("vector<double>")
    edash[u.edge(15, 73)] = [.1, .1, 0]
    bstate.draw(pos=u.own_property(g.vp.pos), vertex_shape="pie",
                vertex_pie_fractions=pv,
 ...
@@ -293,7 +295,7 @@ with uniform error rates, as we see with the same example:

    # intervals of 10 sweeps:

    u = None              # marginal posterior edge probabilities
-   pv = None             # marginal posterior group membership probabilities
+   bs = []               # partitions
    cs = []               # average local clustering coefficient

    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
 ...
@@ -412,15 +414,14 @@ inference:

    # intervals of 10 sweeps:

    u = None              # marginal posterior edge probabilities
-   pv = None             # marginal posterior group membership probabilities
+   bs = []               # partitions
    cs = []               # average local clustering coefficient

    def collect_marginals(s):
-       global pv, u, cs
+       global bs, u, cs
        u = s.collect_marginal(u)
        bstate = s.get_block_state()
-       b = gt.perfect_prop_hash([bstate.levels[0].b])[0]
-       pv = bstate.levels[0].collect_vertex_marginals(pv, b=b)
+       bs.append(bstate.levels[0].b.a.copy())
        cs.append(gt.local_clustering(s.get_graph()).fa.mean())

    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
 ...
@@ -465,7 +466,11 @@ the same measurement probability. The reconstructed network is visualized below.

    bstate = state.get_block_state()
    bstate = bstate.levels[0].copy(g=u)
-   pv = u.own_property(pv)
+
+   # Disambiguate partitions and obtain marginals
+   pmode = gt.PartitionModeState(bs, converge=True)
+   pv = pmode.get_marginal(u)
+
    bstate.draw(pos=u.own_property(g.vp.pos), vertex_shape="pie",
                vertex_pie_fractions=pv, edge_color=ecolor, edge_dash_style=edash,
                edge_gradient=None,
                output="lesmis-uncertain-reconstruction-marginals.svg")
 ...
@@ -516,14 +521,13 @@ latent multiedges of a network of political books:

    # intervals of 10 sweeps:

    u = None              # marginal posterior multigraph
-   pv = None             # marginal posterior group membership probabilities
+   bs = []               # partitions

    def collect_marginals(s):
-       global pv, u
+       global bs, u
        u = s.collect_marginal_multigraph(u)
        bstate = state.get_block_state()
-       b = gt.perfect_prop_hash([bstate.levels[0].b])[0]
-       pv = bstate.levels[0].collect_vertex_marginals(pv, b=b)
+       bs.append(bstate.levels[0].b.a.copy())

    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
                        callback=collect_marginals)
 ...
```
```diff
@@ -538,7 +542,11 @@ latent multiedges of a network of political books:

    bstate = state.get_block_state()
    bstate = bstate.levels[0].copy(g=u)
-   pv = u.own_property(pv)
+
+   # Disambiguate partitions and obtain marginals
+   pmode = gt.PartitionModeState(bs, converge=True)
+   pv = pmode.get_marginal(u)
+
    bstate.draw(pos=u.own_property(g.vp.pos), vertex_shape="pie",
                vertex_pie_fractions=pv,
                edge_pen_width=gt.prop_to_size(ew, .1, 8, power=1),
                edge_gradient=None, output="polbooks-erased-poisson.svg")
 ...
```
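Every reconstruction hunk above swaps the old `collect_vertex_marginals`/`perfect_prop_hash` pattern for the same three-step recipe: collect raw partitions, align their group labels, extract marginals. In isolation the recipe looks like this (an editor's minimal sketch on a small graph, not part of the commit):

```python
import graph_tool.all as gt

g = gt.collection.data["lesmis"]
state = gt.BlockState(g)

bs = []  # partitions collected during sampling
gt.mcmc_equilibrate(state, force_niter=100, mcmc_args=dict(niter=10),
                    callback=lambda s: bs.append(s.b.a.copy()))

# Align group labels across samples and extract the node marginals
pmode = gt.PartitionModeState(bs, converge=True)
pv = pmode.get_marginal(g)  # vector-valued vertex property map,
                            # usable directly as vertex_pie_fractions
```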
```diff
@@ -117,14 +117,13 @@ epidemic process.

    # intervals of 10 sweeps:

    gm = None
-   bm = None
+   bs = []
    betas = []

    def collect_marginals(s):
-       global gm, bm
+       global gm, bs
        gm = s.collect_marginal(gm)
-       b = gt.perfect_prop_hash([s.bstate.b])[0]
-       bm = s.bstate.collect_vertex_marginals(bm, b=b)
+       bs.append(s.bstate.b.a.copy())
        betas.append(s.params["global_beta"])

    gt.mcmc_equilibrate(rstate, force_niter=10000, mcmc_args=dict(niter=10, xstep=0),
 ...
@@ -132,9 +131,13 @@ epidemic process.

    print("Posterior similarity: ", gt.similarity(g, gm, g.new_ep("double", 1), gm.ep.eprob))
    print("Inferred infection probability: %g ± %g" % (mean(betas), std(betas)))

+   # Disambiguate partitions and obtain marginals
+   pmode = gt.PartitionModeState(bs, converge=True)
+   pv = pmode.get_marginal(gm)
+
    gt.graph_draw(gm, gm.own_property(g.vp.pos), vertex_shape="pie", vertex_color="black",
-                 vertex_pie_fractions=gm.own_property(bm), vertex_pen_width=1,
+                 vertex_pie_fractions=pv, vertex_pen_width=1,
                  edge_pen_width=gt.prop_to_size(gm.ep.eprob, 0, 5),
                  eorder=gm.ep.eprob, output="dolphins-posterior.svg")
 ...
```
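The snippet reports the infection rate as mean ± standard deviation over the collected `betas`; since these are posterior samples, other summaries apply equally well, e.g. a credible interval (an editor's sketch with a stand-in sample array, since the real `betas` list comes from the callback above):

```python
import numpy as np

# Stand-in for the `betas` list collected by the callback above
betas = np.random.beta(2, 50, size=1000)

# Posterior median and a 95% credible interval
lo, med, hi = np.percentile(betas, [2.5, 50, 97.5])
print(f"Inferred infection probability: {med:.4g} [{lo:.4g}, {hi:.4g}]")
```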
```diff
@@ -81,37 +81,41 @@

 Note that the value of ``wait`` above was made purposefully low so that
 the output would not be overly long. The most appropriate value requires
 experimentation, but a typically good value is ``wait=1000``.

 The function :func:`~graph_tool.inference.mcmc.mcmc_equilibrate` accepts
 a ``callback`` argument that takes an optional function to be invoked
 after each call to
 :meth:`~graph_tool.inference.blockmodel.BlockState.multiflip_mcmc_sweep`.
 This function should accept a single parameter which will contain the
 actual :class:`~graph_tool.inference.blockmodel.BlockState` instance. We
 will use this in the example below to collect the posterior vertex
 marginals (via
-:class:`~graph_tool.inference.blockmodel.BlockState.collect_vertex_marginals`),
-i.e. the posterior probability that a node belongs to a given group:
+:class:`~graph_tool.inference.partition_modes.PartitionModeState`, which
+disambiguates group labels [peixoto-revealing-2020]_), i.e. the
+posterior probability that a node belongs to a given group:

 .. testcode:: model-averaging

    # We will first equilibrate the Markov chain
    gt.mcmc_equilibrate(state, wait=1000, mcmc_args=dict(niter=10))

-   pv = None
-
-   def collect_marginals(s):
-      global pv
-      b = gt.perfect_prop_hash([s.b])[0]
-      pv = s.collect_vertex_marginals(pv, b=b)
+   bs = [] # collect some partitions
+
+   def collect_partitions(s):
+      global bs
+      bs.append(s.b.a.copy())

-   # Now we collect the marginals for exactly 100,000 sweeps, at
-   # intervals of 10 sweeps:
+   # Now we collect partitions for exactly 100,000 sweeps, at intervals
+   # of 10 sweeps:
    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
-                       callback=collect_marginals)
+                       callback=collect_partitions)

+   # Disambiguate partitions and obtain marginals
+   pmode = gt.PartitionModeState(bs, converge=True)
+   pv = pmode.get_marginal(g)
+
    # Now the node marginals are stored in property map pv. We can
    # visualize them as pie charts on the nodes:
    state.draw(pos=g.vp.pos, vertex_shape="pie", vertex_pie_fractions=pv,
-              edge_gradient=None, output="lesmis-sbm-marginals.svg")
+              output="lesmis-sbm-marginals.svg")

 .. figure:: lesmis-sbm-marginals.*
    :align: center
 ...
@@ -135,8 +139,8 @@ itself, as follows.

       B = s.get_nonempty_B()
       h[B] += 1

-   # Now we collect the marginals for exactly 100,000 sweeps, at
-   # intervals of 10 sweeps:
+   # Now we collect partitions for exactly 100,000 sweeps, at intervals
+   # of 10 sweeps:
    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
                        callback=collect_num_groups)
 ...
@@ -194,7 +198,6 @@ network as above.

    Change in description length: -73.716766...
    Number of accepted vertex moves: 366160
-
 .. warning:: When using
 ...
@@ -212,28 +215,34 @@

 Similarly to the non-nested case, we can use
 :func:`~graph_tool.inference.mcmc.mcmc_equilibrate` to do most of the
 boring work, and we can now obtain vertex marginals on all hierarchical
 levels:

 .. testcode:: nested-model-averaging

    # We will first equilibrate the Markov chain
    gt.mcmc_equilibrate(state, wait=1000, mcmc_args=dict(niter=10))

-   pv = [None] * len(state.get_levels())
-
-   def collect_marginals(s):
-       global pv
-       bs = [gt.perfect_prop_hash([s.b])[0] for s in state.get_levels()]
-       pv = [s.collect_vertex_marginals(pv[l], b=bs[l])
-             for l, s in enumerate(s.get_levels())]
+   # collect nested partitions
+   bs = []
+
+   def collect_partitions(s):
+       global bs
+       bs.append(s.get_bs())

    # Now we collect the marginals for exactly 100,000 sweeps
    gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
-                       callback=collect_marginals)
+                       callback=collect_partitions)

-   # Now the node marginals for all levels are stored in property map
-   # list pv. We can visualize the first level as pie charts on the nodes:
-   state_0 = state.get_levels()[0]
-   state_0.draw(pos=g.vp.pos, vertex_shape="pie", vertex_pie_fractions=pv[0],
-                edge_gradient=None, output="lesmis-nested-sbm-marginals.svg")
+   # Disambiguate partitions and obtain marginals
+   pmode = gt.PartitionModeState(bs, nested=True, converge=True)
+   pv = pmode.get_marginal(g)
+
+   # Get consensus estimate
+   bs = pmode.get_max_nested()
+   state = state.copy(bs=bs)
+
+   # We can visualize the marginals as pie charts on the nodes:
+   state.draw(vertex_shape="pie", vertex_pie_fractions=pv,
+              output="lesmis-nested-sbm-marginals.svg")

 .. figure:: lesmis-nested-sbm-marginals.*
    :align: center
 ...
@@ -316,3 +325,79 @@ distribution.

       :width: 200px
    .. image:: lesmis-partition-sample-9.svg
       :width: 200px
+
+Characterizing the posterior distribution
++++++++++++++++++++++++++++++++++++++++++
+
+The posterior distribution of partitions can have an elaborate
+structure, containing multiple possible explanations for the data. In
+order to summarize it, we can infer the modes of the distribution using
+:class:`~graph_tool.inference.partition_modes.ModeClusterState`, as
+described in [peixoto-revealing-2020]_. This amounts to identifying
+clusters of partitions that are very similar to each other, but
+sufficiently different from those that belong to other clusters.
+Collectively, such "modes" represent the different stories that the
+data is telling us through the model. Here is an example using again
+the Les Misérables network:
+
+.. testcode:: partition-modes
+
+   g = gt.collection.data["lesmis"]
+
+   state = gt.NestedBlockState(g)
+
+   # Equilibration
+   gt.mcmc_equilibrate(state, force_niter=1000, mcmc_args=dict(niter=10))
+
+   bs = []
+
+   def collect_partitions(s):
+       global bs
+       bs.append(s.get_bs())
+
+   # We will collect only 1000 partitions. For more accurate results,
+   # this number should be increased.
+   gt.mcmc_equilibrate(state, force_niter=1000, mcmc_args=dict(niter=10),
+                       callback=collect_partitions)
+
+   # Infer partition modes
+   pmode = gt.ModeClusterState(bs, nested=True)
+
+   # Minimize the mode state itself
+   gt.mcmc_equilibrate(pmode, wait=1, mcmc_args=dict(niter=1, beta=np.inf))
+
+   # Get inferred modes
+   modes = pmode.get_modes()
+
+   for i, mode in enumerate(modes):
+       b = mode.get_max_nested()    # mode's maximum
+       pv = mode.get_marginal(g)    # mode's marginal distribution
+
+       print(f"Mode {i} with size {mode.get_M()/len(bs)}")
+
+       state = state.copy(bs=b)
+       state.draw(vertex_shape="pie", vertex_pie_fractions=pv,
+                  output="lesmis-partition-mode-%i.svg" % i)
+
+Running the above code gives us the relative size of each mode,
+corresponding to their collective posterior probability.
+
+.. testoutput:: partition-modes
+
+   Mode 0 with size 0.389389...
```
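A closing gloss (the editor's, not the commit's): the "size" printed for mode $k$ is the fraction of collected partitions assigned to it,

$$
\hat\sigma_k = \frac{M_k}{\sum_{k'} M_{k'}},
$$

which is what `mode.get_M()/len(bs)` computes, and which estimates the collective posterior probability of that mode; in the output above, mode 0 accounts for roughly 39% of the posterior mass.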