_reconstruction_dynamics.rst 6.31 KB
 Tiago Peixoto committed May 26, 2019 1 2 3 4 5 .. _reconstruction_dynamics: Reconstruction from dynamics ++++++++++++++++++++++++++++  Tiago Peixoto committed May 26, 2019 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 In some cases direct measurements of the edges of a network are either impossible to be done, or can be done only at significant experimental cost. In such situations, we are required to infer the network of interactions from the observed functional behavior [peixoto-network-2019]_. In graph-tool this can be done for epidemic spreading (via :class:~graph_tool.inference.uncertain_blockmodel.EpidemicsBlockState), for the kinetic Ising model (via :class:~graph_tool.inference.uncertain_blockmodel.IsingGlauberBlockState and :class:~graph_tool.inference.uncertain_blockmodel.CIsingGlauberBlockState), and for the equilibrium Ising model (via :class:~graph_tool.inference.uncertain_blockmodel.PseudoIsingBlockState and :class:~graph_tool.inference.uncertain_blockmodel.PseudoCIsingBlockState). We consider the general reconstruction framework outlined above, where the observed data :math:\mathcal{D} in Eq. :eq:posterior-reconstruction are a time-series generated by some of the supported processes. Just like before, the posterior distribution includes not only the adjacency matrix, but also the parameters of the dynamical model and of the SBM that is used as a prior.  Tiago Peixoto committed May 26, 2019 27   Tiago Peixoto committed May 26, 2019 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 For example, in the case of a SIS epidemics, where :math:\sigma_i(t)=1 means node :math:i is infected at time :math:t, or :math:0 otherwise, the likelihood for a time-series :math:\boldsymbol\sigma is .. math:: P(\boldsymbol\sigma|\boldsymbol A,\boldsymbol\beta,\gamma)=\prod_t\prod_iP(\sigma_i(t)|\boldsymbol\sigma(t-1)), where .. math:: P(\sigma_i(t)|\boldsymbol\sigma(t-1)) = f(e^{m_i(t-1)}, \sigma_i(t))^{1-\sigma_i(t-1)} \times f(\gamma,\sigma_i(t))^{\sigma_i(t-1)} is the transition probability for node :math:i at time :math:t, with :math:f(p,\sigma) = (1-p)^{\sigma}p^{1-\sigma}, and where .. math:: m_i(t) = \sum_jA_{ij}\ln(1-\beta_{ij})\sigma_j(t) is the contribution from all neighbors of node :math:i to its infection probability at time :math:t. In the equations above the value :math:\beta_{ij} is the probability of an infection via an existing edge :math:(i,j), and :math:\gamma is the :math:1\to 0 recovery probability. With these additional parameters, the full posterior distribution for the reconstruction becomes .. math:: P(\boldsymbol A,\boldsymbol b,\boldsymbol\beta|\boldsymbol\sigma) = \frac{P(\boldsymbol\sigma|\boldsymbol A,\boldsymbol b,\gamma)P(\boldsymbol A|\boldsymbol b)P(\boldsymbol b)P(\boldsymbol\beta)}{P(\boldsymbol\sigma|\gamma)}. Since :math:\beta_{ij}\in[0,1] we use the uniform prior :math:P(\boldsymbol\beta)=1. Note also that the recovery probably :math:\gamma plays no role on the reconstruction algorithm, since its term in the likelihood does not involve :math:\boldsymbol A (and hence, gets cancelled out in the denominator :math:P(\boldsymbol\sigma|\gamma)=P(\gamma|\boldsymbol\sigma)P(\boldsymbol\sigma)/P(\gamma)). This means the above posterior only depends on the infection events :math:0\to 1, and thus is also valid without any modifications to all epidemic variants SI, SIR, SEIR, etc, since the infection events occur with the same probability for all these models. In the example below is shown how to perform reconstruction from an epidemic process.  Tiago Peixoto committed May 26, 2019 76 77 78 79 80 81 82 83 84 85 86 87 .. testsetup:: dynamics import os try: os.chdir("demos/inference") except FileNotFoundError: pass np.random.seed(42) gt.seed_rng(44) .. testcode:: dynamics  Tiago Peixoto committed May 26, 2019 88 89  # We will first simulate the dynamics with a given network g = gt.collection.data["dolphins"]  Tiago Peixoto committed May 26, 2019 90   Tiago Peixoto committed May 26, 2019 91 92 93  # The algorithm accepts multiple independent time-series for the # reconstruction. We will generate 100 SI cascades starting from a # random node each time, and uniform infection probability 0.7.  Tiago Peixoto committed May 26, 2019 94   Tiago Peixoto committed May 26, 2019 95 96 97 98 99 100 101 102 103 104 105  ss = [] for i in range(100): si_state = gt.SIState(g, beta=.7) s = [si_state.get_state().copy()] for j in range(10): si_state.iterate_sync() s.append(si_state.get_state().copy()) # Each time series should be represented as a single vector-valued # vertex property map with the states for each note at each time. s = gt.group_vector_property(s) ss.append(s)  Tiago Peixoto committed May 26, 2019 106   Tiago Peixoto committed May 26, 2019 107 108 109 110  # Prepare the initial state of the reconstruction as an empty graph u = g.copy() u.clear_edges() ss = [u.own_property(s) for s in ss] # time series properties need to be 'owned' by graph u  Tiago Peixoto committed May 26, 2019 111   Tiago Peixoto committed May 26, 2019 112 113 114 115 116 117 118  # Create reconstruction state rstate = gt.EpidemicsBlockState(u, s=ss, beta=None, r=1e-6, global_beta=.1, state_args=dict(B=1), nested=False, aE=g.num_edges()) # Now we collect the marginals for exactly 100,000 sweeps, at # intervals of 10 sweeps:  Tiago Peixoto committed May 26, 2019 119 120  gm = None bm = None  Tiago Peixoto committed May 26, 2019 121 122 123 124 125 126  betas = [] def collect_marginals(s): global gm, bm gm = s.collect_marginal(gm) b = gt.perfect_prop_hash([s.bstate.b])[0]  Tiago Peixoto committed May 26, 2019 127  bm = s.bstate.collect_vertex_marginals(bm, b=b)  Tiago Peixoto committed May 26, 2019 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144  betas.append(s.params["global_beta"]) gt.mcmc_equilibrate(rstate, force_niter=10000, mcmc_args=dict(niter=10, xstep=0), callback=collect_marginals) print("Posterior similarity: ", gt.similarity(g, gm, g.new_ep("double", 1), gm.ep.eprob)) print("Inferred infection probability: %g ± %g" % (mean(betas), std(betas))) gt.graph_draw(gm, gm.own_property(g.vp.pos), vertex_shape="pie", vertex_color="black", vertex_pie_fractions=gm.own_property(bm), vertex_pen_width=1, edge_pen_width=gt.prop_to_size(gm.ep.eprob, 0, 5), eorder=gm.ep.eprob, output="dolphins-posterior.svg") The reconstruction can accurately recover the hidden network and the infection probability: .. testoutput:: dynamics  Tiago Peixoto committed May 26, 2019 145   Tiago Peixoto committed May 26, 2019 146 147  Posterior similarity: 0.978108... Inferred infection probability: 0.687213 ± 0.054126  Tiago Peixoto committed May 26, 2019 148   Tiago Peixoto committed May 26, 2019 149 150 151 152 153 The figure below shows the reconstructed network and the inferred community structure. .. figure:: dolphins-posterior.* :align: center :width: 400px  Tiago Peixoto committed May 26, 2019 154   Tiago Peixoto committed May 26, 2019 155 156 157 158 159  Reconstructed network of associations between 62 dolphins, from the dynamics of a SI epidemic model, using the degree-corrected SBM as a latent prior. The edge thickness corresponds to the marginal posterior probability of each edge, and the node pie charts to the marginal posterior distribution of the node partition.