max_dist argument of shortest_distance sometimes excludes points closer than max_dist
-
Are you running the latest graph-tool
version?- I didn't bother with the git version as there were no relevant commits made since 2.44
In using graph_tool.topology.shortest_distance
, I found that the max_dist
argument does not produce the expected behavior. Specifically, vertices that are within the desired distance can be erroneously marked as infinitely far away (meaning they were found to be further than the desired distance.
I'm running python 3.9.10 on Ubuntu 20.04 LTS and have installed graph-tool 2.44 through conda-forge.
I've created a minimal example by generating a random graph with random edge distances. I then ask for shortest distances within successively larger balls and print the number of expected non-inf-valued distances returned along with the actual number returned by shortest_distance
.
import graph_tool.all as gt
import numpy as np
# set random seed for repeatability
rng = np.random.default_rng(0)
# build random graph with edge distances
num_vertices = 10
graph = gt.random_graph(num_vertices, lambda: rng.integers(2, 5), False)
graph.edge_properties.dist = graph.new_edge_property(
'double', rng.exponential(size=graph.num_edges()))
# use gt.shortest_distances to compute shortest distance to all other vertices
# without an max_dist constraint
shortest_distances = gt.shortest_distance(graph, 0, weights=graph.edge_properties.dist).a[1:]
sorted_shortest_distances = np.sort(shortest_distances)
# print sorted shortest distances
print("Sorted distances from vertex 0")
print(sorted_shortest_distances)
print()
# iteratively set max_dist the distance of increasingly further vertices
# "nth closest vertex" and "number of non-inf distances returned" should
# remain equal
print("nth closest vertex | number of non-inf distances returned | max_dist supplied")
for it, d in enumerate(sorted_shortest_distances):
not_inf = (~np.isinf(gt.shortest_distance(graph, 0, graph.get_vertices()[1:],
graph.edge_properties.dist, max_dist=d))).sum()
print(it + 1, not_inf, d)
which produces the following output
Sorted distances from vertex 0
[0.67358295 0.75530136 0.84893303 0.85022078 3.49036893 5.7594636
6.90797386 6.98047155 8.04987185]
nth closest vertex | number of non-inf distances returned | max_dist supplied
1 1 0.6735829526672319
2 2 0.7553013578253913
3 3 0.8489330295445371
4 3 0.85022077987936
5 5 3.4903689317429576
6 6 5.759463596009866
7 7 6.907973860321933
8 8 6.980471545241825
9 8 8.049871847409143
Process finished with exit code 0
Note the 4th call to shortest_distance
which only returns 3 instead of the expected 4 non-inf-valued distances. You can change the value of num_vertices
in the code block above to see how this plays out on much larger graphs where the effect becomes much more common.