GitHub - go-andiamo/splitter at v1.2.3

Splitter

Overview

Go package for splitting strings (aware of enclosing braces and quotes)

The problem with the standard Go strings.Split is that it ignores enclosing braces and quotes in the string being split: a separator that appears inside braces or quotes should not cause a split.

Take for example a string representing a slice of comma separated strings...

    str := `"aaa","bbb","this, for sanity, should not be split"`

running strings.Split on that...

package main

import "strings"

func main() {
    str := `"aaa","bbb","this, for sanity, should not be split"`
    parts := strings.Split(str, `,`)
    println(len(parts))
}

would yield 5 instead of the desired 3.
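
To make the miscount concrete, the following standard-library-only sketch prints each piece that strings.Split produces for the same input. The helper name naiveParts is hypothetical and used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveParts (hypothetical helper) splits on every comma and pays no
// attention to quotes.
func naiveParts(s string) []string {
	return strings.Split(s, ",")
}

func main() {
	str := `"aaa","bbb","this, for sanity, should not be split"`
	// The third quoted field contains two commas, so it is cut into
	// three pieces, giving five parts in total.
	for i, p := range naiveParts(str) {
		fmt.Printf("[%d] %s\n", i, p)
	}
}
```

The quoted third field ends up spread across parts 2, 3 and 4, which is exactly the behaviour an enclosure-aware splitter avoids.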

However, with splitter, the result would be different...

package main

import "github.com/go-andiamo/splitter"

func main() {
    commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotes)

    str := `"aaa","bbb","this, for sanity, should not be split"`
    parts, _ := commaSplitter.Split(str)
    println(len(parts))
}

which yields the desired 3.

Note: The variadic arguments after the separator argument are the 'enclosures' (e.g. quotes, brackets, etc.) to take into consideration.

While splitting, all specified enclosures are checked for balance.
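
As a rough illustration of what a balance check involves, here is a standard-library-only sketch using a stack of expected closing characters. It is not the package's implementation; checkBalanced and its pairs parameter are hypothetical names, and how Splitter itself reports an imbalance is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// checkBalanced (hypothetical helper) returns an error when the brackets
// in s are not properly nested. pairs maps each opening rune to its
// closing rune.
func checkBalanced(s string, pairs map[rune]rune) error {
	closers := make(map[rune]bool, len(pairs))
	for _, c := range pairs {
		closers[c] = true
	}
	var stack []rune // closing runes still expected, innermost last
	for _, r := range s {
		if c, ok := pairs[r]; ok {
			stack = append(stack, c)
			continue
		}
		if closers[r] {
			if len(stack) == 0 || stack[len(stack)-1] != r {
				return errors.New("unbalanced: unexpected " + string(r))
			}
			stack = stack[:len(stack)-1]
		}
	}
	if len(stack) > 0 {
		return errors.New("unbalanced: missing " + string(stack[len(stack)-1]))
	}
	return nil
}

func main() {
	pairs := map[rune]rune{'(': ')', '[': ']', '{': '}'}
	fmt.Println(checkBalanced("a(b[c]{d})", pairs)) // <nil>
	fmt.Println(checkBalanced("a(b]c", pairs))      // unbalanced: unexpected ]
}
```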

Installation

To install Splitter, use go get:

go get github.com/go-andiamo/splitter

To update Splitter to the latest version, run:

go get -u github.com/go-andiamo/splitter

Enclosures

Enclosures tell the splitter which start/end sequences the separator is to be ignored within. An enclosure can be one of two types: quotes or brackets.

Quote-type enclosures differ from bracket types only in that the end quote can optionally be 'escaped' within the quoted sequence.
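
The effect of an escaped end quote can be sketched with the standard library alone. The function below is a conceptual illustration, not the package's code: splitQuoted and its parameters are hypothetical, an escape of 0 means "no escaping", and passing the quote character itself as the escape models the doubled form.

```go
package main

import "fmt"

// splitQuoted (hypothetical helper) splits s on sep, ignoring separators
// inside quote...quote runs. If escape is non-zero, an escape rune directly
// followed by the quote rune inside a quoted run does not end that run.
func splitQuoted(s string, sep, quote, escape rune) []string {
	rs := []rune(s)
	var parts []string
	start := 0
	inQuote := false
	for i := 0; i < len(rs); i++ {
		r := rs[i]
		switch {
		case inQuote && escape != 0 && r == escape && i+1 < len(rs) && rs[i+1] == quote:
			i++ // skip the escaped end quote; we remain inside the quotes
		case r == quote:
			inQuote = !inQuote
		case r == sep && !inQuote:
			parts = append(parts, string(rs[start:i]))
			start = i + 1
		}
	}
	return append(parts, string(rs[start:]))
}

func main() {
	// No escaping.
	fmt.Println(len(splitQuoted(`"aaa","bbb","x, y"`, ',', '"', 0))) // 3
	// Backslash-escaped end quote.
	fmt.Println(len(splitQuoted(`"a,b","c \"d, e\""`, ',', '"', '\\'))) // 2
	// Doubled end quote.
	fmt.Println(len(splitQuoted(`"a","say ""hi, there"""`, ',', '"', '"'))) // 2
}
```

Brackets need no such handling because their start and end characters differ, so nesting depth alone tells the splitter where an enclosure ends.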

The Splitter provides many pre-defined enclosures:

| Var Name | Type | Start - End | Escaped end |
|---|---|---|---|
| DoubleQuotes | Quote | " " | none |
| DoubleQuotesBackSlashEscaped | Quote | " " | \" |
| DoubleQuotesDoubleEscaped | Quote | " " | "" |
| SingleQuotes | Quote | ' ' | none |
| SingleQuotesBackSlashEscaped | Quote | ' ' | \' |
| SingleQuotesDoubleEscaped | Quote | ' ' | '' |
| SingleInvertedQuotes | Quote | ` ` | none |
| SingleInvertedQuotesBackSlashEscaped | Quote | ` ` | \` |
| SingleInvertedQuotesDoubleEscaped | Quote | ` ` | `` |
| SinglePointingAngleQuotes | Quote | ‹ › | none |
| SinglePointingAngleQuotesBackSlashEscaped | Quote | ‹ › | \› |
| DoublePointingAngleQuotes | Quote | « » | none |
| LeftRightDoubleDoubleQuotes | Quote | | none |
| LeftRightDoubleSingleQuotes | Quote | | none |
| LeftRightDoublePrimeQuotes | Quote | | none |
| SingleLowHigh9Quotes | Quote | | none |
| DoubleLowHigh9Quotes | Quote | | none |
| Parenthesis | Brackets | ( ) | n/a |
| CurlyBrackets | Brackets | { } | n/a |
| SquareBrackets | Brackets | [ ] | n/a |
| LtGtAngleBrackets | Brackets | < > | n/a |
| LeftRightPointingAngleBrackets | Brackets | | n/a |
| SubscriptParenthesis | Brackets | | n/a |
| SuperscriptParenthesis | Brackets | | n/a |
| SmallParenthesis | Brackets | | n/a |
| SmallCurlyBrackets | Brackets | | n/a |
| DoubleParenthesis | Brackets | | n/a |
| MathWhiteSquareBrackets | Brackets | | n/a |
| MathAngleBrackets | Brackets | | n/a |
| MathDoubleAngleBrackets | Brackets | | n/a |
| MathWhiteTortoiseShellBrackets | Brackets | | n/a |
| MathFlattenedParenthesis | Brackets | | n/a |
| OrnateParenthesis | Brackets | ﴿ | n/a |
| AngleBrackets | Brackets | | n/a |
| DoubleAngleBrackets | Brackets | | n/a |
| FullWidthParenthesis | Brackets | | n/a |
| FullWidthSquareBrackets | Brackets | | n/a |
| FullWidthCurlyBrackets | Brackets | | n/a |
| SubstitutionBrackets | Brackets | | n/a |
| SubstitutionQuotes | Quote | | none |
| DottedSubstitutionBrackets | Brackets | | n/a |
| DottedSubstitutionQuotes | Quote | | none |
| TranspositionBrackets | Brackets | | n/a |
| TranspositionQuotes | Quote | | none |
| RaisedOmissionBrackets | Brackets | | n/a |
| RaisedOmissionQuotes | Quote | | none |
| LowParaphraseBrackets | Brackets | | n/a |
| LowParaphraseQuotes | Quote | | none |
| SquareWithQuillBrackets | Brackets | | n/a |
| WhiteParenthesis | Brackets | | n/a |
| WhiteCurlyBrackets | Brackets | | n/a |
| WhiteSquareBrackets | Brackets | | n/a |
| WhiteLenticularBrackets | Brackets | | n/a |
| WhiteTortoiseShellBrackets | Brackets | | n/a |
| FullWidthWhiteParenthesis | Brackets | | n/a |
| BlackTortoiseShellBrackets | Brackets | | n/a |
| BlackLenticularBrackets | Brackets | | n/a |
| PointingCurvedAngleBrackets | Brackets | | n/a |
| TortoiseShellBrackets | Brackets | | n/a |
| SmallTortoiseShellBrackets | Brackets | | n/a |
| ZNotationImageBrackets | Brackets | | n/a |
| ZNotationBindingBrackets | Brackets | | n/a |
| MediumOrnamentalParenthesis | Brackets | | n/a |
| LightOrnamentalTortoiseShellBrackets | Brackets | | n/a |
| MediumOrnamentalFlattenedParenthesis | Brackets | | n/a |
| MediumOrnamentalPointingAngleBrackets | Brackets | | n/a |
| MediumOrnamentalCurlyBrackets | Brackets | | n/a |
| HeavyOrnamentalPointingAngleQuotes | Quote | | none |
| HeavyOrnamentalPointingAngleBrackets | Brackets | | n/a |

Quote enclosures with escaping

Quotes within quotes can be handled by using an enclosure that specifies how escaping works. For example, the following uses backslash (\) prefixed escaping...

package main

import "github.com/go-andiamo/splitter"

func main() {
    commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotesBackSlashEscaped)

    str := `"aaa","bbb","this, for sanity, \"should\" not be split"`
    parts, _ := commaSplitter.Split(str)
    println(len(parts))
}

Or with double escaping...

package main

import "github.com/go-andiamo/splitter"

func main() {
    commaSplitter, _ := splitter.NewSplitter(',', splitter.DoubleQuotesDoubleEscaped)

    str := `"aaa","bbb","this, for sanity, """"should,,,,"" not be split"`
    parts, _ := commaSplitter.Split(str)
    println(len(parts))
}

The separator is ignored wherever it occurs inside quotes or brackets...

package main

import (
    "fmt"
    "github.com/go-andiamo/splitter"
)

func main() {
    encs := []*splitter.Enclosure{
        splitter.Parenthesis, splitter.SquareBrackets, splitter.CurlyBrackets,
        splitter.DoubleQuotesDoubleEscaped, splitter.SingleQuotesDoubleEscaped,
    }
    commaSplitter, _ := splitter.NewSplitter(',', encs...)

    str := `do(not,)split,'don''t,split,this',[,{,(a,"this has "" quotes")}]`
    parts, _ := commaSplitter.Split(str)
    println(len(parts))
    for i, pt := range parts {
        fmt.Printf("\t[%d]%s\n", i, pt)
    }
}

Options

Options define behaviours that are to be carried out on each found part during splitting.

An option, by virtue of its return args from .Apply(), can do one of three things:

  1. return a modified string of what is to be added to the split parts
  2. return a false to indicate that the split part is not to be added to the split result
  3. return an error to indicate that the split part is unacceptable (and cease further splitting - the error is returned from the Split method)
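
The three outcomes can be modelled with a plain function type. This is a simplified sketch under stated assumptions, not the package's actual option interface: partOption, trim, dropEmpty, rejectEmpty and splitWith are hypothetical names, and the real .Apply() signature is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// partOption is a hypothetical, simplified option. It returns the
// (possibly modified) part, whether to keep it, and an optional error.
type partOption func(part string) (string, bool, error)

// Outcome 1: return a modified string.
func trim(part string) (string, bool, error) { return strings.TrimSpace(part), true, nil }

// Outcome 2: return false so the part is left out of the result.
func dropEmpty(part string) (string, bool, error) { return part, part != "", nil }

// Outcome 3: return an error, which stops splitting.
func rejectEmpty(part string) (string, bool, error) {
	if part == "" {
		return "", false, errors.New("empty part not allowed")
	}
	return part, true, nil
}

// splitWith splits s on sep and runs every option, in order, on each part.
func splitWith(s, sep string, opts ...partOption) ([]string, error) {
	var out []string
	for _, part := range strings.Split(s, sep) {
		keep := true
		var err error
		for _, opt := range opts {
			if part, keep, err = opt(part); err != nil {
				return nil, err // unacceptable part: cease splitting
			}
			if !keep {
				break
			}
		}
		if keep {
			out = append(out, part)
		}
	}
	return out, nil
}

func main() {
	parts, err := splitWith("/ a / /c/", "/", trim, dropEmpty)
	fmt.Printf("%q %v\n", parts, err) // ["a" "c"] <nil>

	_, err = splitWith("/ a / /c/", "/", trim, rejectEmpty)
	fmt.Println(err) // empty part not allowed
}
```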

Options can be added directly to the Splitter using the .AddDefaultOptions() method. These options are applied on every call to the splitter's .Split() method.

Options can also be specified when calling the splitter's .Split() method. These options apply only to that call, and run after any options already specified on the splitter.
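
The ordering described above (default options first, then per-call options, with per-call options affecting only that call) can be sketched as follows. The types and functions here (option, miniSplitter, trimSpaces, ignoreEmpties) are hypothetical stand-ins written against the standard library, not the package's own types.

```go
package main

import (
	"fmt"
	"strings"
)

// option is a hypothetical, simplified option: new text, keep flag, error.
type option func(part string) (string, bool, error)

// miniSplitter is an illustrative stand-in, not the package's Splitter.
type miniSplitter struct {
	sep      string
	defaults []option // applied on every split call
}

// split runs the default options first, then the per-call options.
func (m miniSplitter) split(s string, callOpts ...option) ([]string, error) {
	opts := append(append([]option{}, m.defaults...), callOpts...)
	var out []string
	for _, part := range strings.Split(s, m.sep) {
		keep := true
		var err error
		for _, opt := range opts {
			if part, keep, err = opt(part); err != nil {
				return nil, err
			}
			if !keep {
				break
			}
		}
		if keep {
			out = append(out, part)
		}
	}
	return out, nil
}

func trimSpaces(p string) (string, bool, error)    { return strings.TrimSpace(p), true, nil }
func ignoreEmpties(p string) (string, bool, error) { return p, p != "", nil }

func main() {
	m := miniSplitter{sep: "/", defaults: []option{trimSpaces}}

	// Per-call option runs after the default trim, so " " becomes "" and is dropped.
	withCallOpt, _ := m.split("a/ /c", ignoreEmpties)
	fmt.Printf("%q\n", withCallOpt) // ["a" "c"]

	// The per-call option does not carry over to later calls.
	defaultsOnly, _ := m.split("a/ /c")
	fmt.Printf("%q\n", defaultsOnly) // ["a" "" "c"]
}
```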

Option Examples

1. Stripping empty parts

package main

import (
    "fmt"
    "github.com/go-andiamo/splitter"
)

func main() {
    s := splitter.MustCreateSplitter('/').
        AddDefaultOptions(splitter.IgnoreEmpties)

    parts, _ := s.Split(`/a//c/`)
    println(len(parts))
    fmt.Printf("%+v", parts)
}

2. Stripping empty first/last parts

package main

import (
    "fmt"
    "github.com/go-andiamo/splitter"
)

func main() {
    s := splitter.MustCreateSplitter('/').
        AddDefaultOptions(splitter.IgnoreEmptyFirst, splitter.IgnoreEmptyLast)

    parts, _ := s.Split(`/a//c/`)
    println(len(parts))
    fmt.Printf("%+v\n", parts)

    parts, _ = s.Split(`a//c/`)
    println(len(parts))
    fmt.Printf("%+v\n", parts)

    parts, _ = s.Split(`/a//c`)
    println(len(parts))
    fmt.Printf("%+v\n", parts)
}

3. Trimming parts

package main

import (
    "fmt"
    "github.com/go-andiamo/splitter"
)

func main() {
    s := splitter.MustCreateSplitter('/').
        AddDefaultOptions(splitter.TrimSpaces)

    parts, _ := s.Split(`/a/b/c/`)
    println(len(parts))
    fmt.Printf("%+v\n", parts)

    parts, _ = s.Split(`  / a /b / c/    `)
    println(len(parts))
    fmt.Printf("%+v\n", parts)

    parts, _ = s.Split(`/   a   /   b   /   c   /`)
    println(len(parts))
    fmt.Printf("%+v\n", parts)
}

4. Trimming spaces (and removing empties)

package main

import (
    "fmt"
    "github.com/go-andiamo/splitter"
)

func main() {
    s := splitter.MustCreateSplitter('/').
        AddDefaultOptions(splitter.TrimSpaces, splitter.IgnoreEmpties)

    parts, _ := s.Split(`/a/  /c/`)
    println(len(parts))
    fmt.Printf("%+v\n", parts)

    parts, _ = s.Split(`  / a // c/    `)
    println(len(parts))
    fmt.Printf("%+v\n", parts)

    parts, _ = s.Split(`/   a   /      /   c   /`)
    println(len(parts))
    fmt.Printf("%+v\n", parts)
}

5. Error for empties found

package main

import (
    "fmt"
    "github.com/go-andiamo/splitter"
)

func main() {
    s := splitter.MustCreateSplitter('/').
        AddDefaultOptions(splitter.TrimSpaces, splitter.NoEmpties)

    if parts, err := s.Split(`/a/  /c/`); err != nil {
        println(err.Error())
    } else {
        println(len(parts))
        fmt.Printf("%+v\n", parts)
    }

    if parts, err := s.Split(`  / a // c/    `); err != nil {
        println(err.Error())
    } else {
        println(len(parts))
        fmt.Printf("%+v\n", parts)
    }

    if parts, err := s.Split(`/   a   /      /   c   /`); err != nil {
        println(err.Error())
    } else {
        println(len(parts))
        fmt.Printf("%+v\n", parts)
    }

    if parts, err := s.Split(` a / b/c `); err != nil {
        println(err.Error())
    } else {
        println(len(parts))
        fmt.Printf("%+v\n", parts)
    }
}
