fuzzystring is slow. Less than 1% as fast as FuzzySharp · Issue #20 · kdjones/fuzzystring · GitHub
fuzzystring is slow. Less than 1% as fast as FuzzySharp #20
Open
@achalk

Description

Test Results and Code for fuzzystring (FuzzySharp code below)

On: NOTICE.20P.TXT

[image]
    // Needs System and System.Collections.Generic in scope, plus the fuzzystring library.
    public static int DoSearch(string needle, List<string> haystack)
    {
        // Choose which algorithms should weigh in for the comparison
        List<FuzzyStringComparisonOptions> options = new List<FuzzyStringComparisonOptions>();
        options.Add(FuzzyStringComparisonOptions.UseLevenshteinDistance);

        // Choose the relative strength of the comparison - is it almost exactly equal? or is it just close?
        FuzzyStringComparisonTolerance tolerance = FuzzyStringComparisonTolerance.Weak;

        int counter = 0;
        bool found;

        var watch = new System.Diagnostics.Stopwatch();
        watch.Start();

        foreach (var word in haystack)
        {
            // ApproximatelyEquals
            found = word.ApproximatelyEquals(needle, options, tolerance);
            if (found)
            {
                Console.WriteLine($"word: {word}. count: {++counter}");
            }
        }
        watch.Stop();
        Console.WriteLine($"Execution Time: {watch.ElapsedMilliseconds} ms");

        return counter;
    }

The input file is read in with the following:

    public static List<string> ParseFile(string FQFN)
    {
        List<string> words = null;
        if (File.Exists(FQFN))
        {
            string readText = File.ReadAllText(FQFN);
            string[] substrings = readText.Split(" ", StringSplitOptions.RemoveEmptyEntries);  // Split the string into substrings
            words = new List<string>(substrings);
        }

        return (words);
    }

FuzzySharp code below:

public static int DoSearch(string needle, List<string> haystack, int limit = 20000, int cutoff = 100)
{
    var watch = new System.Diagnostics.Stopwatch();
    watch.Start();

    var results = Process.ExtractAll(needle, haystack, cutoff: cutoff);
    int count = 0;
    foreach (var result in results)
    {
        Console.WriteLine($" Count: {++count}. Score: {result.Score}: Value:{result.Value}");
    }
    watch.Stop();
    Console.WriteLine($"Execution Time: {watch.ElapsedMilliseconds} ms");

    return (count);
}
