Paligo/xoz: read-only succinct XML with jumping to tags

Xoz

Xoz is a read-only XML library with the following features:

  • small representation of the XML in memory.

  • fast iteration over named tags.

  • fast text search.

Implementation notes

    To store XML compactly, Xoz uses succinct data structures. I wrote a gentle introduction to succinct data structures on my blog.

    For instance, an 86 MB XML file takes up only about 102 MB of memory, roughly 1.2 times the size of the file itself.

    We make use of:

    • vers - for the succinct tree implementation using the Balanced Parentheses technique. This is backed by its RsVec, a bit vector that supports rank and select. We also use its wavelet matrix implementation for connecting tags to trees.

    • fm-index - for fast search over compressed text. It builds on vers for both rank/select and the wavelet matrix. At the time of writing, this is not integrated yet.

    • sucds - right now still used for arrays that use the minimal number of bits, but we aim to build this on top of vers.
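
To give a feel for the Balanced Parentheses technique and the rank/select operations mentioned above, here is a minimal sketch in plain Rust. It is not Xoz's code and does not use the vers API: `Node`, `encode_bp`, `rank1`, `select1`, `find_close`, `first_child` and `next_sibling` are hypothetical names made up for this illustration, the bit vector is a plain `Vec<bool>`, every operation is a naive linear scan, and a plain `Vec` of tag names stands in for the wavelet matrix. A real succinct implementation would answer the same queries in constant or near-constant time over a compact bit vector.

```rust
/// A toy element tree standing in for parsed XML (hypothetical type).
pub struct Node {
    pub name: &'static str,
    pub children: Vec<Node>,
}

/// Encode the tree shape as balanced parentheses in preorder:
/// `true` is an open paren, `false` a close paren. Tag names are
/// pushed in the same preorder, so the n-th open paren owns names[n].
pub fn encode_bp(node: &Node, bits: &mut Vec<bool>, names: &mut Vec<&'static str>) {
    bits.push(true);
    names.push(node.name);
    for child in &node.children {
        encode_bp(child, bits, names);
    }
    bits.push(false);
}

/// rank1(i): number of open parens strictly before position i.
/// For an open paren at position i this is its preorder number.
pub fn rank1(bits: &[bool], i: usize) -> usize {
    bits.iter().take(i).filter(|&&b| b).count()
}

/// select1(k): position of the k-th open paren (0-based), if any.
pub fn select1(bits: &[bool], k: usize) -> Option<usize> {
    let mut seen = 0;
    for (i, &b) in bits.iter().enumerate() {
        if b {
            if seen == k {
                return Some(i);
            }
            seen += 1;
        }
    }
    None
}

/// Position of the close paren matching the open paren at `open`.
pub fn find_close(bits: &[bool], open: usize) -> Option<usize> {
    if !*bits.get(open)? {
        return None; // not an open paren
    }
    let mut depth = 0usize;
    for (i, &b) in bits.iter().enumerate().skip(open) {
        if b {
            depth += 1;
        } else {
            depth -= 1;
        }
        if depth == 0 {
            return Some(i);
        }
    }
    None
}

/// First child: the next bit is an open paren, or there is no child.
pub fn first_child(bits: &[bool], open: usize) -> Option<usize> {
    if *bits.get(open + 1)? { Some(open + 1) } else { None }
}

/// Next sibling: the bit after our matching close paren, if it opens.
pub fn next_sibling(bits: &[bool], open: usize) -> Option<usize> {
    let close = find_close(bits, open)?;
    if *bits.get(close + 1)? { Some(close + 1) } else { None }
}
```

For the document `<doc><a><b/></a><c/></doc>`, `encode_bp` produces the bits `((())())` and the names `doc, a, b, c`. Starting from the root at position 0, `first_child` gives position 1 (`a`), `next_sibling` of that gives position 5, and `names[rank1(&bits, 5)]` is `c`. The tree shape costs two bits per node, which is where the memory savings come from.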
