topology.shortest_distance with "max_dist" and "reached" arguments returns array with vertices with distance greater than "max_dist"
Summary
In the "shortest_distance" routine there is the option to return an array with the reached vertices by setting the flag/attribute "return_reached" to True. Using this routine with a max_dist value, the returned array with the reached vertices not only includes the vertices in fact reached within the max_dist limit, but also some neighbouring vertices that probably were visited during the Dijkstra steps but exceeded the limit. I believe this behaviour is confusing and it may lead to erratic code/bugs for those who are not expecting it. I propose either:
- Change the routine as to return only the vertices that are in fact reached within max_dist;
- Change this flag to something like "return_visited", as to hint that the returned vertices may not in fact be reached within the expected distance limit;
- Add a warning in the documentation regarding this behaviour.
System information
- Operating system: Linux Pop-OS 20.04 LTS x86-64
- Python version/interpreter: miniconda 3.7.12
Example code
from graph_tool import Graph
from graph_tool.topology import shortest_distance
if __name__ == '__main__':
"""In this example we have a 4 vertices/ 3 edges graph with the following topology:
(2) <----- (1) ----> (3) ----> (4)
where each edge has weight=5.
Shortest distance is calculated from vertex (1).
"""
# Create graphs
g = Graph(directed=True)
# Add vertices
g.add_vertex()
g.add_vertex()
g.add_vertex()
g.add_vertex()
# Edge weight property
w = g.new_edge_property("float")
# Add edges
e1 = g.add_edge(1, 2)
w[e1] = 5
e2 = g.add_edge(1, 3)
w[e2] = 5
e3 = g.add_edge(3, 4)
w[e3] = 5
# Tests:
# Case 1: max_dist = 1, so only the source should be reached.
dist, reached = shortest_distance(g, source=g.vertex(1), weights=w, max_dist=1, return_reached=True)
print(f"\nCase 1 (source=1, max_dist=1): \n\t- Reached: {reached}\n\t- Dist.: {dist.a}")
# Result: reached has vertex 1 (the only reached), but also the neighbours 2 and 3 which have dist == inf
# Case 2: max_dist = 6, so 1,2,3 should be reached but not 4
dist, reached = shortest_distance(g, source=g.vertex(1), weights=w, max_dist=6, return_reached=True)
print(f"\nCase 2 (source=1, max_dist=6): \n\t- Reached: {reached}\n\t- Dist.: {dist.a}")
# Result: reached has all vertices, despite 1 -> 4 having total distance of 10 (path 1 -> 3 -> 4, two edges with w=5)