topology.shortest_distance with "max_dist" and "reached" arguments returns array with vertices with distance greater than "max_dist"

Summary

In the "shortest_distance" routine there is the option to return an array with the reached vertices by setting the flag/attribute "return_reached" to True. Using this routine with a max_dist value, the returned array with the reached vertices not only includes the vertices in fact reached within the max_dist limit, but also some neighbouring vertices that probably were visited during the Dijkstra steps but exceeded the limit. I believe this behaviour is confusing and it may lead to erratic code/bugs for those who are not expecting it. I propose either:

  1. Change the routine as to return only the vertices that are in fact reached within max_dist;
  2. Change this flag to something like "return_visited", as to hint that the returned vertices may not in fact be reached within the expected distance limit;
  3. Add a warning in the documentation regarding this behaviour.

System information

  • Operating system: Linux Pop-OS 20.04 LTS x86-64
  • Python version/interpreter: miniconda 3.7.12

Example code

from graph_tool import Graph
from graph_tool.topology import shortest_distance

if __name__ == '__main__':
    """In this example we have a 4 vertices/ 3 edges graph with the following topology:

    (2) <----- (1) ----> (3) ----> (4)
    
     where each edge has weight=5.
     Shortest distance is calculated from vertex (1).
    """

    # Create graphs
    g = Graph(directed=True)

    # Add vertices
    g.add_vertex()
    g.add_vertex()
    g.add_vertex()
    g.add_vertex()

    # Edge weight property
    w = g.new_edge_property("float")

    # Add edges
    e1 = g.add_edge(1, 2)
    w[e1] = 5
    e2 = g.add_edge(1, 3)
    w[e2] = 5
    e3 = g.add_edge(3, 4)
    w[e3] = 5

    # Tests:

    # Case 1: max_dist = 1, so only the source should be reached.
    dist, reached = shortest_distance(g, source=g.vertex(1), weights=w, max_dist=1, return_reached=True)
    print(f"\nCase 1 (source=1, max_dist=1): \n\t- Reached: {reached}\n\t- Dist.: {dist.a}")
    # Result: reached has vertex 1 (the only reached), but also the neighbours 2 and 3 which have dist == inf

    # Case 2: max_dist = 6, so 1,2,3 should be reached but not 4
    dist, reached = shortest_distance(g, source=g.vertex(1), weights=w, max_dist=6, return_reached=True)
    print(f"\nCase 2 (source=1, max_dist=6): \n\t- Reached: {reached}\n\t- Dist.: {dist.a}")
    # Result: reached has all vertices, despite 1 -> 4 having total distance of 10 (path 1 -> 3 -> 4, two edges with w=5)
Edited by Tiago Peixoto