MarkdownRenderer not rendering HTML as Markdown · Issue #367 · commonmark/commonmark-java · GitHub
MarkdownRenderer not rendering HTML as Markdown #367

Closed
the-red-herring opened this issue Mar 21, 2025 · 2 comments

Comments

@the-red-herring
the-red-herring commented Mar 21, 2025

Hi!
I have been trying for a while to convert a string containing user text, which may well contain some HTML that I would like to convert to Markdown. This is how I have set it up:

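        import java.util.List;
        import org.commonmark.Extension;
        import org.commonmark.ext.gfm.tables.TablesExtension;
        import org.commonmark.node.Node;
        import org.commonmark.parser.Parser;
        import org.commonmark.renderer.markdown.MarkdownRenderer;

        // Use the same extensions (GFM tables here) for parsing and rendering.
        List<Extension> extensions = List.of(TablesExtension.create());
        Parser parser = Parser.builder()
                .extensions(extensions)
                .build();
        MarkdownRenderer renderer = MarkdownRenderer.builder()
                .extensions(extensions)
                .build();

        // Parse the input as Markdown, then render the resulting AST back to Markdown.
        Node document = parser.parse(stringToConvert);

        return renderer.render(document);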

A simple example of stringToConvert is:

<ul><li>asdf</li><li>asdf</li><li>asdf</li><li>asdf</li></ul>

The returned string from the renderer.render(document) is:

<ul><li>asdf</li><li>asdf</li><li>asdf</li><li>asdf</li></ul>

This is what I was expecting to be returned though:

* asdf
* asdf
* asdf
* asdf

Apologies if I am missing something obvious here. The Parser returns a Node that, on the surface, looks correct, so I suspect I am either doing something wrong or doing something that is not supported. I can get the MarkdownRenderer example from the docs working as it should, but what I am doing above is slightly different from that example. I don't see an obvious reason why it shouldn't work, though.

(Also see what the reference implementation does: https://spec.commonmark.org/dingus/)

@robinst
Collaborator
robinst commented Mar 23, 2025

The reason for this is that commonmark-java's Parser is a Markdown parser, not an HTML parser. Markdown allows some HTML to be embedded, but when rendered back to HTML or Markdown, it's mostly just passed through. There is no actual semantic meaning to that HTML for the parser/renderer.

If your input is always HTML, you will want to use an HTML parser, and then convert from the HTML representation (ul, li, etc.) to the commonmark-java representation such as BulletList, ListItem, etc. Rendering that with MarkdownRenderer will then work as you expect.
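
A rough sketch of the "parse the HTML first, then map its elements" idea, using only the JDK so it runs without extra dependencies. Several things here are assumptions and not part of the thread: the class name HtmlListToMarkdown and the method convert are hypothetical, the input is assumed to be a well-formed, flat <ul> fragment so the JDK's XML parser can stand in for a real HTML parser, and the Markdown text is written out directly rather than by building BulletList and ListItem nodes for MarkdownRenderer. Real-world HTML is usually not well-formed XML, so a dedicated HTML parser would be needed in practice.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class HtmlListToMarkdown {

    // Hypothetical helper: turns a well-formed, flat <ul>...</ul> fragment into Markdown bullets.
    public static String convert(String html) {
        Document doc;
        try {
            // Assumption: the fragment is valid XML, so the JDK XML parser can read it.
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(html)));
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed markup", e);
        }

        Element root = doc.getDocumentElement();
        if (!"ul".equals(root.getTagName())) {
            throw new IllegalArgumentException("expected a <ul> root element");
        }

        // Map each direct <li> child to one "* item" line; other children are ignored.
        StringBuilder markdown = new StringBuilder();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE && "li".equals(child.getNodeName())) {
                markdown.append("* ").append(child.getTextContent().trim()).append('\n');
            }
        }
        return markdown.toString();
    }

    public static void main(String[] args) {
        // Prints four "* asdf" lines for the example input from the issue.
        System.out.print(convert("<ul><li>asdf</li><li>asdf</li><li>asdf</li><li>asdf</li></ul>"));
    }
}
```

With commonmark-java on the classpath, the loop body is where a ListItem (wrapping a Paragraph and Text) would be appended to a BulletList instead of appending to the StringBuilder, leaving the actual Markdown output to MarkdownRenderer.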

@robinst robinst closed this as completed Mar 23, 2025
@robinst robinst removed the bug label Mar 23, 2025
@the-red-herring
Author

Hi @robinst,

I appreciate the answer, thank you.
