-
-
Notifications
You must be signed in to change notification settings - Fork 402
Fixed assembly of 920120-no-backtracking.data #2333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed assembly of 920120-no-backtracking.data #2333
Conversation
Just to have it in one place, the target of this rule is:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the intention on this rule to get a collective view into this one.
First, and not related, the 920120.data looks like the regexp is not "disassembled". So it is a bit difficult to compare.
Regarding the new one, looks like aacute
was left out in line 37. It should be &[aeEiIoOuUyY]acute
. I'm going to review all these html metas so see if we are missing others, just in case. BTW, it is out also in the original no backtracking data file....
Other than that, my test for just file
works as it is not matching now. Will create a new issue to track changes in 920120 in general.
Not sure it helps, but I've properly disassembled 920120. |
This reverts commit 55f522e.
Just learned something new: lookbehind assertions must be of fixed length. That even means that you can't use certain grouping patterns, so we can't use regexp-assemble.py to assemble the individual branches inside of the lookbehind assertion. |
You're right about |
Also, take a look at the linked issue. |
@theseion With the extended html entities, and using case ignore this is what I get:
What do you think? |
Meeting decision January: @franbuehler volunteered to review this P,R but just saw that @fzipi is already reviewing it... |
Responding to the question in chat: the line |
@fzipi I've updated the data files and the rule comment with your suggestion. I noticed that you've added a couple more entities and I was wondering about the rational. The rule comment says that "common" entities are allowed but I'm not sure what would be considered common. I'd also include |
@fzipi pint |
@theseion I just added all the entities that could be used in a file name.... probably adding |
Improved comments in data files for 920120
@fzipi Can you run your test once more on the latest commit? |
Yes, all tests look good using the regex generated with no-backtracking data:
Tests:
|
I've added tests for all of the entities (including variants). Tests are green. From my point of view this is ready merge. @fzipi? |
Agreed, merging. |
No description provided.