GraphView.edge() does not give consistent results after applying edge filter
I've noticed an issue with the GraphView.edge() method: it does not give consistent results after applying an efilt lambda to hide some parallel edges in a directed graph. In particular, there is a case where it returns a non-empty list with all_edges=True and None with all_edges=False.
Unfortunately, I can only provide a large reproducer, because the method seems to work fine on trivial graphs. The following example uses a dependency graph of all Arch Linux packages (a totally non-statistical and even non-scientific application):
import os.path
import requests
from tqdm import tqdm
import graph_tool.all as gt
def download(url, filepath):
    # Streaming, so we can iterate over the response.
    response = requests.get(url, stream=True)
    # Sizes in bytes.
    total_size = int(response.headers.get("content-length", 0))
    block_size = 1024
    with tqdm(total=total_size, unit="B", unit_scale=True) as progress_bar:
        with open(filepath, "wb") as file:
            for data in response.iter_content(block_size):
                progress_bar.update(len(data))
                file.write(data)
    # A known size that does not match the bytes received means a truncated download.
    if total_size != 0 and progress_bar.n != total_size:
        raise RuntimeError("Could not download file")
filename = "arch-pkgs.graphml"
if not os.path.exists(filename):
    url = f"https://pkgbuild.com/~lahwaacz/{filename}"
    download(url, filename)
graph = gt.Graph()
graph.load(filename)
print(graph)
pkgnames = graph.vertex_properties["pkgname"]
dependency_types = graph.edge_properties["dependency_type"]
# get the vertex corresponding to the "python" package and print its direct dependencies
v = graph.vertex(gt.GraphView(graph, vfilt=lambda v: pkgnames[v] == "python").get_vertices()[0])
for e in v.out_edges():
    print(e, dependency_types[e], pkgnames[e.target()])
# use only depends and makedepends
g = gt.GraphView(graph, efilt=lambda e: dependency_types[e] in {"depends", "makedepends"})
# test edge method
e = g.edge(11127, 13442, add_missing=False, all_edges=True)
print(e)
e = g.edge(11127, 13442, add_missing=False, all_edges=False)
print(e)
Output:
<Graph object, directed, with 14629 vertices and 150308 edges, 1 internal vertex property, 1 internal edge property, at 0x73afa6034ec0>
(11127, 682) depends bzip2
(11127, 1811) depends expat
(11127, 2466) depends gdbm
(11127, 5815) depends libffi
(11127, 6074) depends libnsl
(11127, 6627) depends libxcrypt
(11127, 8192) depends openssl
(11127, 14593) depends zlib-ng-compat
(11127, 13633) depends tzdata
(11127, 7520) depends mpdecimal
(11127, 11029) optdepends python-setuptools
(11127, 10564) optdepends python-pip
(11127, 10565) optdepends python-pipx
(11127, 12794) optdepends sqlite
(11127, 14471) optdepends xz
(11127, 13442) optdepends tk
(11127, 13442) makedepends tk
(11127, 12794) makedepends sqlite
(11127, 549) makedepends bluez-libs
(11127, 6790) makedepends llvm
(11127, 2467) makedepends gdb
(11127, 14379) makedepends xorg-server-xvfb
(11127, 13585) makedepends ttf-liberation
[<Edge object with source '11127' and target '13442' at 0x73afa5ffd750>]
None
I would expect the last g.edge() call to return the same object that is wrapped in a list in the previous result.
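As a possible stopgap, one could take the first element of the all_edges=True result instead of calling g.edge() with all_edges=False. This is only a sketch: first_edge is a hypothetical helper name, not part of graph-tool, and it assumes that the list returned with all_edges=True is the correct one.

```python
def first_edge(g, source, target):
    """Return one edge from source to target in g, or None if there is none.

    Hypothetical helper, not part of graph-tool. It relies only on
    g.edge(..., all_edges=True), assuming that result is the correct one.
    """
    edges = g.edge(source, target, all_edges=True)
    # With all_edges=True the result is a (possibly empty) sequence of edges.
    return edges[0] if len(edges) > 0 else None
```

With the filtered view from the reproducer, this would be called as first_edge(g, 11127, 13442).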
This reproducer is for the python package, but the same issue appears for many other packages that have the same dependency in makedepends and optdepends.
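To check which vertex pairs are affected, the two lookups can be compared pair by pair. The following is a minimal sketch; find_inconsistent_pairs is a hypothetical helper name, and it uses nothing but g.edge() with the all_edges parameter shown above.

```python
def find_inconsistent_pairs(g, pairs):
    """Return the (source, target) pairs for which the two lookups disagree.

    Hypothetical helper, not part of graph-tool.
    """
    inconsistent = []
    for source, target in pairs:
        all_found = g.edge(source, target, all_edges=True)
        one_found = g.edge(source, target, all_edges=False)
        # The lookups agree only if both find an edge or both find nothing.
        if (len(all_found) > 0) != (one_found is not None):
            inconsistent.append((source, target))
    return inconsistent
```

For example, find_inconsistent_pairs(g, [(11127, 13442), (11127, 12794)]) would test the tk and sqlite dependencies of python, both of which appear in optdepends and makedepends in the output above.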