Implement a Mallard Reader. by MathieuDuponchelle · Pull Request #2700 · jgm/pandoc · GitHub

Implement a Mallard Reader. #2700


Open. MathieuDuponchelle wants to merge 1 commit into base: main

MathieuDuponchelle
Contributor

See http://projectmallard.org for more information. Mallard is
basically a simplified DocBook with the added notion of links
(http://projectmallard.org/1.0/mal_links).

The documentation maintainers of developer.gnome.org
might consider migrating to CommonMark. That's why I wrote this
reader, which is basically a copy-pasted and trimmed-down version
of the DocBook one.

See https://github.com/GNOME/gnome-devel-docs for a decent corpus
of mallard pages.

The coverage is not total for this reader, but it's already useful
as is and I'll certainly get back to it pretty soon.

I'm really not sure about the handling of the <links> node; I'd be interested in other ideas.

Thanks again for that awesome piece of software :)

See http://projectmallard.org for more information. Mallard is
basically a dumbed-down DocBook with the added notion of links
(http://projectmallard.org/1.0/mal_links), whose usefulness
is very debatable.

The documentation maintainers of developer.gnome.org
might consider migrating to CommonMark. That's why I wrote this
reader, which is basically a copy-pasted and trimmed-down version
of the DocBook one.

See https://github.com/GNOME/gnome-devel-docs for a decent corpus
of mallard pages.

The coverage is not total for this reader, but it's already useful
as is and I'll certainly get back to it pretty soon.
@jgm
Owner
jgm commented Feb 6, 2016

I'd rather avoid code duplication. If mallard really is
a subset of docbook with a small addition (mal_links), then
I wonder whether it would make more sense to implement
it as a variant of the existing docbook reader?

That's how we handle the difference between html and html5,
or plain and markdown, for example.

Commit Summary

* Implement a Mallard Reader.

File Changes

* M pandoc.cabal (1)
* M src/Text/Pandoc.hs (3)
* A src/Text/Pandoc/Readers/Mallard.hs (306)

@MathieuDuponchelle
Contributor Author

Well, the thing is that it isn't really a subset. For example, where DocBook has para / informalpara / formalpara, Mallard has p. If you look at the list of block elements here -> https://github.com/jgm/pandoc/pull/2700/files#diff-3765f376fcbd39161286f22a5375facfR129 you'll see that it's similar but not identical. There are also some tiny differences in parsing certain things, and these differences piling up make factoring out the shared code less obvious than it could be.
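
To make the "similar but not identical" point concrete, here is a minimal Haskell sketch that compares two sets of block element names. The lists are small illustrative subsets picked for this example, not the lists from the patch or the full element sets of either format, and the names docbookBlocks, mallardBlocks, sharedBlocks and mallardOnly are hypothetical.

```haskell
import Data.List (intersect, (\\))

-- Illustrative subset of DocBook block element names (assumption: not exhaustive).
docbookBlocks :: [String]
docbookBlocks =
  ["para", "formalpara", "simpara", "programlisting", "screen", "note", "example"]

-- Illustrative subset of Mallard block element names (assumption: not exhaustive).
mallardBlocks :: [String]
mallardBlocks = ["p", "code", "screen", "note", "example"]

-- Names present in both lists; candidates for a single shared code path.
sharedBlocks :: [String]
sharedBlocks = docbookBlocks `intersect` mallardBlocks

-- Names present only in the Mallard list; these would need their own cases.
mallardOnly :: [String]
mallardOnly = mallardBlocks \\ docbookBlocks

main :: IO ()
main = do
  putStrLn ("shared:       " ++ show sharedBlocks)
  putStrLn ("mallard only: " ++ show mallardOnly)
```

With these subsets, screen, note and example end up shared, while p and code are Mallard-only. A shared name does not guarantee identical parsing rules, which is the other half of the difficulty described above.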

Also please note that it's my second time writing Haskell, and everything still seems a bit mysterious to me (<$> oO).

I think what we should do is have you take a closer look at the differences between both readers and decide what's worth sharing. I'll be happy to do that work, because I'm fairly sure the solutions I'd come up with on my own will not be exactly the cleanest ones. I don't mind waiting, as there's obviously no risk of conflicts here. Up to you :)

@MathieuDuponchelle
Contributor Author

Also, do you hang out on any IRC channels? I've got a lot of silly questions to ask you about cmark. I'm digging into the code right now to find the "parsing extensions should go there" sign :)

@jgm
Owner
jgm commented Feb 7, 2016

+++ Mathieu Duponchelle [Feb 06 16 14:11 ]:

Also do you hang out on some irc channels ? I've got a lot of silly
questions to ask you about cmark, I'm digging into the code right now
to find the "parsing extensions should go there" sign :)

I don't, no. Email is the best way for me.

@jgm
Owner
jgm commented Feb 7, 2016 via email

@MathieuDuponchelle
Contributor Author

Cool, thanks. However, I've given this a bit more thought, and an alternate approach could be to just merge this and review / improve it when there's time, as:

  • The code is completely self-contained
  • The reader is already useful in its current state

Note that I don't really need this upstream, as I only need it at "porting-time", but I wouldn't like it to just get forgotten.

I can't promise I'll stick around for doing the factorization work, but that's quite likely :)

Your call anyway !

@MathieuDuponchelle
Contributor Author

The code is completely self-contained

Correction: it's not, but the "code-path dependencies" it introduces are purely in the Reader -> Pandoc direction. I'm not sure how best to express that, but the net result is that removing or updating it will not require any changes elsewhere. It would even be nearly possible to revert the patch without conflicts at any point in Pandoc's future history (nearly, because the cabal file and the import of Mallard might conflict, but that's really a non-issue).

@jgm
Owner
jgm commented May 10, 2016

I've merged this into my mallard branch.
But it's not ready for master. Lots of elements aren't supported (e.g. tables), and there are no real tests (I added a stub).

@MathieuDuponchelle
Contributor Author

Heh, that's nice :) I pretty much forgot about that request. Regarding tables, for example: indeed, my target was CommonMark, which has no syntax for them at the moment. Did you think about code sharing between this reader and the DocBook reader? Do you think it is practical?

@jgm
Owner
jgm commented May 10, 2016

The CommonMark writer will output raw HTML for tables (currently).

Code sharing: yes, there's too much duplication for my tastes. I think the way to do this well would be to make the mallard reader a "variant" of the docbook reader. That is, the DocBook module could provide a function readMallard that sets a field in DBState indicating that the mallard variant is to be used. Then for elements like p that exist in mallard but not DocBook, we could simply check that this field is set, and for minor differences in handling of other fields, we could also check this. I think it would be okay just to leave DocBook elements that have no Mallard counterparts as they are -- after all, it isn't our intent to validate the documents, just to read them, and a Mallard document shouldn't contain these elements in the first place.
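
As a rough illustration of the "variant" idea, the sketch below uses one block parser for both formats and a field in the reader state to enable the Mallard-only elements. It is not pandoc's actual code: ReaderState is a simplified stand-in for DBState, and Variant, Element, Block, parseBlock and readWith are hypothetical names made up for this example. Only the names readMallard and DBState come from the comment above.

```haskell
import Data.Maybe (mapMaybe)

-- Which input format is being read (hypothetical type).
data Variant = DocBook | Mallard deriving (Eq, Show)

-- Simplified stand-in for DBState: only the variant field is modelled.
newtype ReaderState = ReaderState { stVariant :: Variant }

-- Simplified stand-in for pandoc's Block type.
data Block = Para String | CodeBlock String deriving (Eq, Show)

-- An XML element reduced to its tag name and text content.
data Element = Element { elName :: String, elText :: String }

-- One parser for both formats. DocBook names are always accepted;
-- Mallard-only names are accepted only when the variant field is set.
parseBlock :: ReaderState -> Element -> Maybe Block
parseBlock st (Element name txt) =
  case name of
    "para"             -> Just (Para txt)
    "programlisting"   -> Just (CodeBlock txt)
    "screen"           -> Just (CodeBlock txt)
    "p"    | isMallard -> Just (Para txt)
    "code" | isMallard -> Just (CodeBlock txt)
    _                  -> Nothing
  where
    isMallard = stVariant st == Mallard

-- Run the shared parser over a list of elements, skipping unknown ones.
readWith :: Variant -> [Element] -> [Block]
readWith v = mapMaybe (parseBlock (ReaderState v))

-- The two entry points differ only in the variant they set.
readDocBook, readMallard :: [Element] -> [Block]
readDocBook = readWith DocBook
readMallard = readWith Mallard

main :: IO ()
main = do
  let doc = [Element "p" "Hello", Element "screen" "$ ls"]
  print (readMallard doc)
  print (readDocBook doc)
```

In this sketch readMallard turns both elements into blocks, while readDocBook skips the p element and keeps only the screen block. DocBook-only names such as para stay accepted in Mallard mode, matching the suggestion to leave elements without a Mallard counterpart as they are.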

@tarleb
Collaborator
tarleb commented May 11, 2021

This seems to have stalled out. I assume that the branch would require a lot of work to bring it up to date. Should we keep this issue open, close it, or maybe reopen as a new issue to give it a clean slate?

Mallard/Ducktype is stable now, with v1.1 published some years ago. Further development appears to have faded, judging from the GitHub repo and mailing list, but the GNOME project continues to actively use it.
