Running with Python 3.x #1

jolasman · 2018-11-20T17:42:24Z

Hi! :) I changed some of the code to run under Python 3; however, I have some issues.
I cannot find a working library that implements the FP-growth algorithm. I tried the pyspark one and the FP-growth one. With pyspark, I get Spark connection errors after a few runs: it worked at first, but then it blew up. The second one cannot handle my dataset because it runs out of memory.

Btw, after I fixed some dict problems with iteritems() and has_key(), the mineFPtree function raises an error that I do not understand:

    bigL = [v[0] for v in sorted(headerTable.items(), key=lambda p: p[1])] # (sort header table)
    AttributeError: 'NoneType' object has no attribute 'items'

Any thoughts?

Thanks in advance

Inger-Chao · 2020-03-11T09:37:12Z

The error happens because the createFPtree method returns None for headerTable:

    for k in list(headerTable.keys()):
        if headerTable[k] < minSup:
            del (headerTable[k])  # delete items that do not meet the minimum support
    freqItemSet = set(headerTable.keys())  # frequent items that meet the minimum support
    if len(freqItemSet) == 0:
        return None, None

Every headerTable[k] entry was deleted because none reached minSup, so the method returned None in place of headerTable.
The author set n = 20000 in the demo, which may be too large for your dataset. I decreased the value of n, and the demo then worked on my dataset.
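The failure mode described above can be sketched as follows. This is a minimal illustration, not the repository's code: `prune_header_table` and `sorted_items` are hypothetical helper names, and the header table is assumed to map each item directly to an integer support count.

```python
def prune_header_table(header_table, min_sup):
    """Keep items whose count is at least min_sup; return None if none survive.

    Hypothetical helper: assumes header_table maps item -> integer count.
    """
    kept = {item: count for item, count in header_table.items() if count >= min_sup}
    return kept or None


def sorted_items(header_table):
    """Return items in ascending order of count, or [] when the table is None."""
    if header_table is None:
        # Guard: without this check, calling .items() raises AttributeError.
        return []
    return [item for item, _ in sorted(header_table.items(), key=lambda p: p[1])]


counts = {"a": 5, "b": 2, "c": 9}
print(sorted_items(prune_header_table(counts, 3)))      # ['a', 'c']
print(sorted_items(prune_header_table(counts, 20000)))  # []
```

With a threshold larger than every count, pruning leaves nothing, so the caller has to either check for None as shown here or lower the threshold.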

WissenY · 2020-04-24T02:10:46Z

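I would like to ask the author to explain how, with a support count of 100000, this ran in 13 seconds on a Mac (as your Chinese blog post says). After I ported your code to Python 3.7, it still took more than ten minutes on an 8th-generation i7 with 16 GB of RAM.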
