8000 `all_node_cuts()` performance degrades much faster than predicted by Kanevsky's minimum cut algorithm · Issue #7994 · networkx/networkx · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

all_node_cuts() performance degrades much faster than predicted by Kanevsky's minimum cut algorithm #7994

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Cerebus opened this issue Apr 23, 2025 · 6 comments

Comments

@Cerebus
Copy link
Cerebus commented Apr 23, 2025

NetworkX vs igraph. igraph implements Kanevsky's too.

import timeit

import igraph as ig
import networkx as nx


def bigO(g):
    return (2 ** nx.node_connectivity(g)) * (len(g) ** 3)


def runit(i):
    g = nx.grid_2d_graph(i, i)
    O = bigO(g)
    h = ig.Graph.from_networkx(g)
    t = timeit.timeit(lambda: list(nx.all_node_cuts(g)), number=1)
    t2 = timeit.timeit(lambda: ig.GraphBase.minimum_size_separators(h), number=1)
    return dict(i=i, k=nx.node_connectivity(g), n=len(g), O=O, nx_t=t, ig_t=t2)


if __name__ == "__main__":
    results = []
    for i in range(2, 12):
        r = runit(i)
        results.append(r)
        print(r)

Results dataframe w/ percent change calculated for nx_t and ig_t (* 100, b/c humans). Rate of growth is the interesting bit, b/c NetworkX is 100% Python, and igraph is C w/ a python binding. Raw data attached results.json.

    i  k    n        O        nx_t      ig_t  nx_t_pct_chg  ig_t_pct_chg
0   2  2    4      256    0.004447  0.000238           NaN           NaN
1   3  2    9     2916    0.033335  0.000569    649.551704    138.733327
2   4  2   16    16384    0.101915  0.001271    205.725524    123.537768
3   5  2   25    62500    0.143263  0.002748     40.571573    116.194569
4   6  2   36   186624    0.198543  0.006185     38.586098    125.084930
5   7  2   49   470596    0.431762  0.011818    117.465543     91.083902
6   8  2   64  1048576    1.275459  0.022733    195.407745     92.349527
7   9  2   81  2125764    6.076753  0.039182    376.436524     72.357797
8  10  2  100  4000000   32.342393  0.066873    432.231473     70.674632
9  11  2  121  7086244  180.278205  0.109162    457.405282     63.236688

Clearly these implementations are following different growth curves. Kanevsky's algorithm should be growing at $\mathcal{O}(2^{k} n^3)$. NetworkX's growth becomes geometric (or worse).

The cause seems to be for some graph sizes, the number of members of the antichains(L) set explodes---billions of members, in some case---but I have not yet figured out why that happens.

Originally posted by @Cerebus in #7989 (comment)

@Cerebus
Copy link
Author
Cerebus commented Apr 23, 2025

I think this is a bug:

>>> import networkx as nx
>>> import igraph as ig
>>> g = nx.Graph([(1, 2), (2, 3)])
>>> list(nx.all_node_cuts(g))
[{2}]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[[1]]
>>> g = nx.Graph([(1, 2)])
>>> list(nx.all_node_cuts(g))
[{1}, {2}]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[]

There's no vertex cut between neighboring nodes.

@dschult

@Cerebus
Copy link
Author
Cerebus commented Apr 23, 2025

Also, this seems weird:

>>> g = nx.Graph()
>>> g.add_node(1)
>>> list(nx.all_node_cuts(g))
[set()]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[]

@dschult
Copy link
Member
dschult commented Apr 24, 2025

There have been a number of corner case problems with the all_node_cuts function over the years: #6558, #3039, #1976

The PR #3039 represented a substantial change that corrected "multiple incorrect implementation in the original code " and also slowed down the code by quite a bit.

The two cases you mention above (node_cuts for very small graphs) seem like corner cases, but could be indicative of mistakes in the larger implementation. They erode confidence in the rest of the code too. Looks like we need a deep dive into this algorithm and implementation.

@amcandio
Copy link
Contributor

I'm not able to reproduce this issue. This test passes:

def test_repro():
    G = nx.Graph([(1, 2)])
    assert nx.density(G) == 1
    assert list(nx.all_node_cuts(G)) == []

all_node_cuts should exit early here:

    # Address some corner cases first.
    # For complete Graphs

    if nx.density(G) == 1:
        yield from ()
        return

What value do you get when you call nx.density(G)?

@dschult
Copy link
Member
dschult commented Apr 26, 2025

I think the first comment above (with maybe bugs) runs OK on my machine -- as also reported by @amcandio

>>> import networkx as nx
>>> import igraph as ig
>>> g = nx.Graph([(1, 2), (2, 3)])
>>> list(nx.all_node_cuts(g))
[{2}]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[[1]]
>>> g = nx.Graph([(1, 2)])
>>> list(nx.all_node_cuts(g))
[]
>>> ig.GraphBase.minimum_size_separators(ig.Graph.from_networkx(g))
[]

I think the result [{2}] vs [[1]] actually agrees between nx and ig because, IIUC, they index their nodes by integers starting at 0. So the nx "2" is an ig "1". and a set vs a list is an API difference I assume...

I do see the result in the following comment that a one node graph reports an empty set as a node cut. I'm not sure what a return value of an empty set would even mean -- I guess you might get that when the original graph is not connected... definitely strange...

@amcandio
Copy link
Contributor
amcandio commented Apr 26, 2025

I put up a property-based test:

def _random_connected_graph(n, prob, seed):
    return nx.compose(
        nx.random_labeled_tree(n, seed=seed),
        nx.erdos_renyi_graph(n, prob, seed=seed),
    )

@settings(print_blob=True)
@given(
    graph_st(
        _random_connected_graph,
        n=st.integers(5, 10),
        prob=st.floats(0, 1)
    )
)
def test_random_connected_graph(G):
    for flow_func in flow_funcs:
        kwargs = {"flow_func": flow_func}
        edge_cut = nx.minimum_edge_cut(G, **kwargs)
        H = G.copy()
        H.remove_edges_from(edge_cut)
        assert not nx.is_connected(H)
        node_cut = nx.minimum_node_cut(G, **kwargs)
        H = G.copy()
        H.remove_nodes_from(node_cut)
        assert not nx.is_connected(H), f"Failed for {flow_func.__name__} with {G}"

Hypothesis is able to find counter-examples. We seem to be strugling with complete graphs:

>           assert not nx.is_connected(H), f"Failed for {flow_func.__name__} with {G}"
E           AssertionError: Failed for boykov_kolmogorov with Graph with 5 nodes and 10 edges
E           assert not True
E            +  where True = <function is_connected at 0x7f969999f9c0>(<networkx.classes.graph.Graph object at 0x7f9699dd5850>)
E            +    where <function is_connected at 0x7f969999f9c0> = nx.is_connected
E           Falsifying example: test_random_connected_graph(
E               G=Graph()
E               G.add_node(0)
E               G.add_node(1)
E               G.add_node(2)
E               G.add_node(3)
E               G.add_node(4)
E               G.add_edge(0, 1)
E               G.add_edge(0, 2)
E               G.add_edge(0, 3)
E               G.add_edge(0, 4)
E               G.add_edge(1, 2)
E               G.add_edge(1, 3)
E               G.add_edge(1, 4)
E               G.add_edge(2, 3)
E               G.add_edge(2, 4)
E               G.add_edge(3, 4)
E               ,
E           )
E           Explanation:
E               These lines were always and only run by failing examples:
E                   /home/amcandio/workspace/networkx/networkx/algorithms/connectivity/cuts.py:446
E                   /home/amcandio/workspace/networkx/networkx/algorithms/connectivity/cuts.py:447
E                   /home/amcandio/workspace/networkx/networkx/algorithms/connectivity/cuts.py:610
E                   /home/amcandio/workspace/networkx/networkx/classes/graph.py:436
E                   /home/amcandio/workspace/networkx/networkx/classes/reportviews.py:532
E           
E           You can reproduce this example by temporarily adding @reproduce_failure('6.131.9', b'AEEFKD/wAAAAAAAAQQBBAEEA') as a decorator on your test case

For this graph we detect {1, 2, 3, 4} as cut, which is wrong

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants
0