JATS validation fails if footnotes include block quotes · Issue #5570 · jgm/pandoc · GitHub
JATS validation fails if footnotes include block quotes #5570
Closed
@coryschires

Description

Background

The JATS spec does not allow you to have block quotes (i.e. <disp-quote>) inside footnotes (i.e. <fn>).

Frankly, I think this is an odd and unreasonable restriction. Some authors – and some disciplines, such as legal scholarship – make heavy use of footnotes. As far as I know, there's nothing fundamentally wrong with placing a block quote inside a footnote.

Problem

I am encountering real-world examples of this problem, so I need some sort of workaround.

To be clear, here's an example of invalid JATS:

<fn>
  <p> ... </p>
  <disp-quote> ... </disp-quote>
</fn>

This JATS XML would fail validation with the error: Element fn content does not follow the DTD, expecting (label? , p+), got (p disp-quote)
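
For a quick local check before uploading to the validator, the content model in that error message can be approximated with a few lines of Python. This is only a sketch, not a DTD validator: the function name `invalid_footnotes` is made up for illustration, and it assumes the `<fn>`, `<label>` and `<p>` elements carry no XML namespace.

```python
import xml.etree.ElementTree as ET


def invalid_footnotes(jats_xml: str):
    """Return (fn_index, offending_tag) pairs for <fn> children that break (label?, p+).

    Only flags children that are not <p>; it does not check that at least
    one <p> is present, so it is weaker than real DTD validation.
    """
    root = ET.fromstring(jats_xml)
    problems = []
    for i, fn in enumerate(root.iter("fn")):
        children = list(fn)
        # An optional <label> is allowed, but only as the first child.
        if children and children[0].tag == "label":
            children = children[1:]
        for child in children:
            if child.tag != "p":
                problems.append((i, child.tag))
    return problems


sample = "<article><fn><p>text</p><disp-quote><p>quoted</p></disp-quote></fn></article>"
print(invalid_footnotes(sample))  # [(0, 'disp-quote')]
```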

Steps to recreate

Pandoc version: 2.7.2
Files: jats_example.zip

To reproduce the issue described above

  1. pandoc -s --metadata-file metadata.json --to jats example.md -o output.xml
  2. Validate output.xml using the PMC online validation tool: https://www.ncbi.nlm.nih.gov/pmc/tools/xmlchecker
  3. The validation tool will display the errors described above

Solution

After examining the JATS spec, I think I have a solid workaround: wrap the <disp-quote> element in a <p> element. This keeps the JATS valid while only minimally changing the semantic meaning. Unless it's too much work, it would be nice if the wrapper could also carry a specific-use attribute.

So in practice, we would convert this:

<fn>
  <p> ... </p>
  <disp-quote> ... </disp-quote>
</fn>

Into this:

<fn>
  <p> ... </p>
  <p specific-use="wrapper">
    <disp-quote> ... </disp-quote>
  </p>
</fn>

For sure, this is a little weird. From a semantic (or even just commonsense) standpoint, it doesn't make sense to have a block quote inside a paragraph. But this is allowed / valid in JATS. In fact, here's a proof of concept demonstrating that it's valid / okay to wrap a <disp-quote> in a <p> tag. You can download this file and run it through the PMC Validator to confirm.

Finally, in case it's unclear, I only want to wrap <disp-quote> when nested inside <fn> (i.e. I don't want to wrap all <disp-quote>).
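
To make the intended transformation concrete, here is a minimal sketch that post-processes the XML pandoc has already written, rather than changing the JATS writer or using a Lua filter. It is an illustration under assumptions, not a proposed implementation: the function name `wrap_footnote_quotes` is hypothetical, and the elements are assumed to have no XML namespace.

```python
import xml.etree.ElementTree as ET


def wrap_footnote_quotes(jats_xml: str) -> str:
    """Wrap each <disp-quote> that is a direct child of <fn> in <p specific-use="wrapper">."""
    root = ET.fromstring(jats_xml)
    # Materialize the list first so the tree is not modified while iterating it.
    for fn in list(root.iter("fn")):
        for index, child in enumerate(list(fn)):
            if child.tag != "disp-quote":
                continue
            wrapper = ET.Element("p", {"specific-use": "wrapper"})
            # Keep any whitespace that followed the quote outside the wrapper.
            wrapper.tail = child.tail
            child.tail = None
            fn.remove(child)
            wrapper.append(child)
            fn.insert(index, wrapper)  # same position the quote occupied
    return ET.tostring(root, encoding="unicode")


source = (
    "<article><body><disp-quote><p>body quote</p></disp-quote></body>"
    "<fn><p>note</p><disp-quote><p>footnote quote</p></disp-quote></fn></article>"
)
print(wrap_footnote_quotes(source))
```

The `<disp-quote>` under `<body>` is left untouched; only the one inside `<fn>` is wrapped. Note that ElementTree discards the DOCTYPE and XML declaration when it serializes, so a real pipeline would have to add them back before validating.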

Questions

First, I can't decide if this fix should be made directly in the core JATS writer or only in my code (e.g. using a custom filter). Personally, I am leaning toward the core JATS writer because, imo, the JATS writer should strive to produce valid JATS and thus everyone would benefit from this fix. However, I can also imagine y'all feeling like this problem is too specific and should be solved in the client's code rather than Pandoc. And, of course, it's really not my decision to make. So... Let me know what y'all think.

Second, if y'all think this should be solved in the client's code, then I could use some help writing a Lua filter for this use case. I have successfully written some basic Lua filters in the past, but this problem is proving trickier than I expected. It seems that Pandoc's AST expects a paragraph to contain a list of inline elements, but I'm trying to nest a block quote, which is itself a block element. Anyway, for whatever reason, it's not working as expected, so any advice would be very much appreciated.
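
The type mismatch described above shows up in pandoc's JSON AST (the output of `pandoc -t json`): the content of a `Para` is a list of inlines, while the content of a `Note` or a `BlockQuote` is a list of blocks. The sketch below uses a hand-written fragment in that shape, not output from the attached example files, and a hypothetical helper `notes_with_block_quotes` that only locates the affected footnotes. It does not attempt the rewrite itself.

```python
def notes_with_block_quotes(node, found=None):
    """Collect Note nodes whose direct block content includes a BlockQuote."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("t") == "Note" and any(
            block.get("t") == "BlockQuote" for block in node["c"]
        ):
            found.append(node)
        for value in node.values():
            notes_with_block_quotes(value, found)
    elif isinstance(node, list):
        for item in node:
            notes_with_block_quotes(item, found)
    return found


# Hand-written fragment: a paragraph whose footnote holds a Para and a BlockQuote.
paragraph = {
    "t": "Para",
    "c": [  # Para content: inlines only (Str, Note, ...)
        {"t": "Str", "c": "Claim."},
        {
            "t": "Note",
            "c": [  # Note content: blocks
                {"t": "Para", "c": [{"t": "Str", "c": "Intro."}]},
                {
                    "t": "BlockQuote",
                    "c": [{"t": "Para", "c": [{"t": "Str", "c": "Quoted."}]}],
                },
            ],
        },
    ],
}

hits = notes_with_block_quotes(paragraph)
print(len(hits))  # 1
```

Because a `Para` can only hold inlines, a `BlockQuote` cannot simply be moved into one at the AST level, which is consistent with the difficulty reported here.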

Thanks again for maintaining Pandoc! It's an amazing tool!
