Tags · vuamitom/goose

2.1.6

version 2.1.6

Dec 12, 2011
7d8c2c6
zip
tar.gz

2.1.4

Version 2.1.4

Nov 2, 2011
da32e90
zip
tar.gz

2.1.2

version 2.1.2

Sep 29, 2011
3164a4a
zip
tar.gz

2.0.2

upping to version 2.0.2

Aug 30, 2011
0c56de5
zip
tar.gz

2.0.1

MINOR: RE-enabling Additional Data Extraction. Upping to version 2.0.1

Aug 30, 2011
eae069a
zip
tar.gz

1.4.1-FINALJAVA

Final release of the Java version

Aug 21, 2011
a344b5d
zip
tar.gz

1.4.1

Resolving goofy Maven issue. It required a new version to fully update.

Jun 14, 2011
1f222a2
zip
tar.gz

1.4.0

Major: DefaultOutputFormatter#getFormattedText now unescapes HTML including all HTML Entities

Minor: I have begun to convert the usage of DefaultOutputFormatter so that you only use a single method: getFormattedText(Element topNode)

Bug fixes:
  * clean by class name was too restrictive and removed actual content elements, modified the list of names to only remove classes
    that end in "meta" instead of just containing the word "meta"

  * Modified DefaultDocumentCleaner#cleanBadTags to only select from within the body element to avoid removing it.

  * Added a helper method for removing nodes to handle cases where the node's parentNode is null (already removed). This was previously
    throwing an IllegalArgumentException from within jSoup and thus failing the extraction.
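
The first bug fix above (matching class names that end in "meta" rather than merely contain it) can be illustrated with a minimal sketch. This is not the goose source: the class `MetaClassFilter`, its method names, and the sample class names are all hypothetical, and the actual cleaner may use a different matching mechanism.

```java
public class MetaClassFilter {
    // Hypothetical stand-in for the old rule described in the note:
    // any class name containing "meta" marks the element for removal.
    public static boolean removedByOldRule(String className) {
        return className.contains("meta");
    }

    // Hypothetical stand-in for the new rule: only class names that
    // end in "meta" mark the element for removal.
    public static boolean removedByNewRule(String className) {
        return className.endsWith("meta");
    }

    public static void main(String[] args) {
        // Sample class names are invented for illustration only.
        String[] names = {"post-meta", "metadata-body", "article"};
        for (String name : names) {
            System.out.println(name
                    + " old=" + removedByOldRule(name)
                    + " new=" + removedByNewRule(name));
        }
    }
}
```

Under these assumptions, a content element with a class such as "metadata-body" would have been removed by the old rule but is kept by the new one, while "post-meta" is removed by both.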

Jun 13, 2011
765927a
zip
tar.gz

1.3.14

Version 1.3.14

Jun 9, 2011
0085a9c
zip
tar.gz

1.3.13

upping to version 1.3.13 that contains a minor fix to tag extraction

May 20, 2011
b2df435
zip
tar.gz
