[Docx Reader] Only honour the last seen paragraph style #5767
🚧⚠️ 🚨 Do not merge 🚨 ⚠️ 🚧
This PR is mostly intended for discussion related to #5738. The issue is that Word doesn't allow more than one paragraph style to be applied to a single paragraph. In my tests, both LibreOffice and Word (2013 and 2019) ignored every style defined on the paragraph except the last one.
This PR basically makes the Docx Reader do the same. The main question is: do we actually want that? It does simplify the code a tiny bit, but because the Docx Writer emits multiple styles per paragraph in some cases, round-tripping through docx will sometimes create weird inconsistencies. Arguably this is not a big issue, since opening and saving the Writer's output with Word leads to the same weird inconsistencies.
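To make the "last style wins" rule concrete, here is a minimal Python sketch (not the Reader's actual Haskell code). It assumes the extra styles show up as repeated `w:pStyle` children of a paragraph's `w:pPr`; the sample XML, the style ids, and the helper name `last_paragraph_style` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical paragraph carrying two paragraph styles (illustrative ids).
SAMPLE = f"""<w:p xmlns:w="{W_NS}">
  <w:pPr>
    <w:pStyle w:val="Definition"/>
    <w:pStyle w:val="SourceCode"/>
  </w:pPr>
  <w:r><w:t>example</w:t></w:r>
</w:p>"""


def last_paragraph_style(paragraph_xml):
    """Return the id of the last w:pStyle in the paragraph, or None."""
    para = ET.fromstring(paragraph_xml)
    # findall returns matches in document order, so the last entry is
    # the last style seen on the paragraph.
    styles = para.findall(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
    if not styles:
        return None
    return styles[-1].get(f"{{{W_NS}}}val")


print(last_paragraph_style(SAMPLE))
```

With this rule, the `Definition` style on the sample paragraph is silently dropped, which is the round-tripping inconsistency described above.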
One test is failing. I chose not to update it. Technically, that test is not testing for anything particularly meaningful: only pandoc can produce a paragraph that has both `Definition` and `Source Code` styles, and such documents won't survive being edited by Word or LibreOffice. So the thing that test is supposed to test for is, well, frankly, not a real thing under a particular definition of "real". So I'm not sure how to update that test correctly, provided we decide this should be merged.

/cc @jkr
/cc @conklech (I believe you said something about actually round-tripping through docx in your workflow, so your input here might be enlightening)