max_dist argument of shortest_distance sometimes excludes points closer than max_dist
Are you running the latest
- I didn't bother with the git version as there were no relevant commits made since 2.44
graph_tool.topology.shortest_distance, I found that the
max_dist argument does not produce the expected behavior. Specifically, vertices that are within the desired distance can be erroneously marked as infinitely far away (meaning they were found to be further than the desired distance.
I'm running python 3.9.10 on Ubuntu 20.04 LTS and have installed graph-tool 2.44 through conda-forge.
I've created a minimal example by generating a random graph with random edge distances. I then ask for shortest distances within successively larger balls and print the number of expected non-inf-valued distances returned along with the actual number returned by
import graph_tool.all as gt import numpy as np # set random seed for repeatability rng = np.random.default_rng(0) # build random graph with edge distances num_vertices = 10 graph = gt.random_graph(num_vertices, lambda: rng.integers(2, 5), False) graph.edge_properties.dist = graph.new_edge_property( 'double', rng.exponential(size=graph.num_edges())) # use gt.shortest_distances to compute shortest distance to all other vertices # without an max_dist constraint shortest_distances = gt.shortest_distance(graph, 0, weights=graph.edge_properties.dist).a[1:] sorted_shortest_distances = np.sort(shortest_distances) # print sorted shortest distances print("Sorted distances from vertex 0") print(sorted_shortest_distances) print() # iteratively set max_dist the distance of increasingly further vertices # "nth closest vertex" and "number of non-inf distances returned" should # remain equal print("nth closest vertex | number of non-inf distances returned | max_dist supplied") for it, d in enumerate(sorted_shortest_distances): not_inf = (~np.isinf(gt.shortest_distance(graph, 0, graph.get_vertices()[1:], graph.edge_properties.dist, max_dist=d))).sum() print(it + 1, not_inf, d)
which produces the following output
Sorted distances from vertex 0 [0.67358295 0.75530136 0.84893303 0.85022078 3.49036893 5.7594636 6.90797386 6.98047155 8.04987185] nth closest vertex | number of non-inf distances returned | max_dist supplied 1 1 0.6735829526672319 2 2 0.7553013578253913 3 3 0.8489330295445371 4 3 0.85022077987936 5 5 3.4903689317429576 6 6 5.759463596009866 7 7 6.907973860321933 8 8 6.980471545241825 9 8 8.049871847409143 Process finished with exit code 0
Note the 4th call to
shortest_distance which only returns 3 instead of the expected 4 non-inf-valued distances. You can change the value of
num_vertices in the code block above to see how this plays out on much larger graphs where the effect becomes much more common.