Copying a graph_tool.Graph into a new one causes an enormous slowdown
In a program, I have this code snippet:
mapping = graph.new_vp("int64_t", vals=numpy.arange(graph.num_vertices()))
g = Graph(graph, prune=True, vorder=mapping)
graph is a graph_tool.GraphView (based on multiple vertex and edge filters).
The whole code can be found here: https://github.com/luhsra/ara/blob/soso/ara/steps/create_abbs.py#L147
When I run the (whole) program, it needs around 2s to execute. The program consists of at least several thousand LOC. Now, when I run the program in parallel with very similar input (in particular, the graph object mentioned above is identical; I use meson test --suite ... to start the program 3 times in parallel), its runtime increases to ~40s.
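To reproduce the comparison outside of meson, something like the following could be used. It is only a sketch: time_parallel_runs is a hypothetical helper name, and the command shown is a placeholder that has to be replaced with the real program invocation.

```python
import subprocess
import sys
import time


def time_parallel_runs(cmd, copies=3, env=None):
    """Start `copies` instances of `cmd` at once.

    Returns (wall_clock_seconds, list_of_exit_codes).
    """
    start = time.perf_counter()
    # Launch all processes first so they really run concurrently.
    procs = [subprocess.Popen(cmd, env=env) for _ in range(copies)]
    # Then wait for every one of them to finish.
    codes = [p.wait() for p in procs]
    return time.perf_counter() - start, codes


if __name__ == "__main__":
    # Placeholder command (assumption): substitute the actual program call.
    cmd = [sys.executable, "-c", "pass"]
    for n in (1, 3):
        elapsed, codes = time_parallel_runs(cmd, copies=n)
        print(f"{n} parallel run(s): {elapsed:.2f}s, exit codes {codes}")
```

Comparing the wall-clock time for one copy against three copies shows whether the slowdown depends on the processes running at the same time.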
After debugging and profiling, I found that replacing the above lines with:
mapping = graph.new_vp("int64_t", vals=numpy.arange(graph.num_vertices()))
g = Graph()
reduces that runtime to ~8s.
The profiler says that 60-80% of the program's runtime is spent in the native copy operation in graph-tool/__init__.py, line 1789:
# The actual copying of the graph and property maps
self.__graph = libcore.GraphInterface(gv.__graph, False,
                                      vprops,
                                      eprops,
                                      _prop("v", gv, vorder))
The profiling indicates that a lot of time is spent spinning in libgomp (GCC's OpenMP runtime) at gomp_team_barrier_wait_end and gomp_simple_barrier_wait, but this happens only when multiple processes are running in parallel.
I have no idea why another process would interfere here (the processes do not communicate!).
Do you have an idea what might be wrong here?
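One diagnostic I could still try is limiting OpenMP per process. The sketch below assumes that the barrier waits come from OpenMP worker threads of the parallel processes competing for the same cores, which the profile suggests but does not prove. OMP_NUM_THREADS and OMP_WAIT_POLICY are standard OpenMP environment variables; single_threaded_omp_env is a hypothetical helper name.

```python
import os


def single_threaded_omp_env(base=None, threads=1):
    """Return a copy of the environment with OpenMP restricted.

    OMP_NUM_THREADS caps the number of OpenMP threads per process;
    OMP_WAIT_POLICY=passive asks waiting threads not to busy-wait.
    """
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(threads)
    env["OMP_WAIT_POLICY"] = "passive"
    return env
```

The result can be passed as env to subprocess.run or subprocess.Popen when starting the program. If the parallel runtime then drops back towards the single-run time, thread oversubscription is a likely cause.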