Description
Explain the problem.
The Markdown writers should alternate the delimiters of nested emphasis and strong emphasis so that they do not output incorrect formatting.
I encountered nested <i> tags in the wild (they appear to be relatively common on Wikipedia), and I noticed that nested italics are rendered as strong emphasis instead of nested emphasis:
$ echo '<i><i>A</i></i>' | ./pandoc --from html --to gfm --trace
[trace] Parsed [Plain [Emph [Emph [Str "A"]]]] at line 1
**A**
$ echo '<strong>A</strong>' | ./pandoc --from html --to gfm --trace
[trace] Parsed [Plain [Strong [Str "A"]]] at line 1
**A**
I would instead expect that <i><i>A</i></i> be converted to _*A*_ or *_A_*. This syntax is treated as nested emphasis by both the CommonMark specification (https://spec.commonmark.org/0.31.2/#emphasis-and-strong-emphasis) and Pandoc's own Markdown reader:
$ echo '*A B _*C*_*' | ./pandoc --from gfm --to native
[ Para
    [ Emph
        [ Str "A"
        , Space
        , Str "B"
        , Space
        , Emph [ Emph [ Str "C" ] ]
        ]
    ]
]
$ echo '*A B _*C*_*' | ./pandoc --from gfm --to gfm
*A B **C***
This issue is similar to #9521, but that bug report asks for the formatting to be dropped. I am instead asking that Pandoc not try to "clean" the formatting here and simply write Markdown that it can itself read back in. In addition, #9521 is tagged with format:HTML and reader, whereas this issue should instead be tagged format:Markdown and writer.
I'm not sure how one should handle nested intraword emphasis, but given the limitations of Markdown, it might be best to consider that impossible to write without problems.
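To make the request concrete, here is a minimal sketch of the alternation idea in Python. It is not Pandoc's writer code: the tuple-based mini-AST and the `render` function are hypothetical names made up for illustration. It picks `_` whenever the enclosing emphasis used `*`, and `*` otherwise, and it deliberately ignores the intraword case mentioned above.

```python
# Hypothetical mini-AST (NOT Pandoc's API):
#   ("Str", text), ("Space",), ("Emph", [children]), ("Strong", [children])

def render(nodes, parent_char=None):
    """Render inline nodes, alternating '*' and '_' between nesting levels."""
    out = []
    for node in nodes:
        kind = node[0]
        if kind == "Str":
            out.append(node[1])
        elif kind == "Space":
            out.append(" ")
        elif kind in ("Emph", "Strong"):
            # Use the character the enclosing emphasis did not use.
            char = "_" if parent_char == "*" else "*"
            # Emphasis takes one delimiter character, strong emphasis two.
            delim = char * (1 if kind == "Emph" else 2)
            out.append(delim + render(node[1], char) + delim)
        else:
            raise ValueError(f"unsupported node kind: {kind}")
    return "".join(out)


# <i><i>A</i></i> as parsed in the trace above
nested = [("Emph", [("Emph", [("Str", "A")])])]
print(render(nested))  # *_A_*

# The AST that the gfm reader produced for '*A B _*C*_*'
round_trip = [("Emph", [("Str", "A"), ("Space",), ("Str", "B"), ("Space",),
                        ("Emph", [("Emph", [("Str", "C")])])])]
print(render(round_trip))  # *A B _*C*_*
```

With this scheme the second example writes back the same string that was read in, which is the round-trip behaviour I am asking for.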
Pandoc version?
macOS on Apple Silicon (albeit an x86_64 executable running under Rosetta 2)
pandoc 3.6.3-nightly-2025-02-24
Features: +server +lua
Scripting engine: Lua 5.4