Description
While testing the new schematron file for EAC 2.0, I noticed that the regex borrowed from the EAD3 schematron has a small bug. For example, the following value is valid according to the EAD3 schematron:
US-oclc-12345678901
However, that is a fake 19-character code, which should NOT be valid. That same 19-character code is, correctly, invalid in both EAD2002 and EAC-CPF 1.0.
I am going to recreate that pattern for EAC 2.0 by essentially following the EAD2002 model, which does validate the country code when one is present. Since we are validating country codes elsewhere, it seems like we should do that here as well, rather than just matching any two uppercase letters. Anyhow, here's the current EAD3 regex:
(^([A-Z]{2})|([a-zA-Z]{1})|([a-zA-Z]{3,4}))(-[a-zA-Z0-9:/-]{1,11})$
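Whereas that should probably be the following (though NOT tested), with the ^ anchor moved outside the first group so that it applies to all three prefix alternatives rather than only the first: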
^(([A-Z]{2})|([a-zA-Z]{1})|([a-zA-Z]{3,4}))(-[a-zA-Z0-9:/-]{1,11})$
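A minimal sketch of the difference between the two patterns, written in Python rather than Schematron. It assumes the Schematron applies the pattern with unanchored, search-style matching (as XPath `matches()` does), which `re.search` approximates; the helper names `accepts` and `matched_text` are hypothetical and only for illustration.

```python
import re

# Current EAD3 pattern: the ^ anchor sits inside the first alternative only.
EAD3_PATTERN = r"(^([A-Z]{2})|([a-zA-Z]{1})|([a-zA-Z]{3,4}))(-[a-zA-Z0-9:/-]{1,11})$"

# Proposed pattern: the ^ anchor applies to all three prefix alternatives.
PROPOSED_PATTERN = r"^(([A-Z]{2})|([a-zA-Z]{1})|([a-zA-Z]{3,4}))(-[a-zA-Z0-9:/-]{1,11})$"


def accepts(pattern, value):
    """Return True if the pattern matches anywhere in the value (search semantics)."""
    return re.search(pattern, value) is not None


def matched_text(pattern, value):
    """Return the substring that matched, or None if there was no match."""
    match = re.search(pattern, value)
    return match.group(0) if match else None


if __name__ == "__main__":
    fake_code = "US-oclc-12345678901"
    print(accepts(EAD3_PATTERN, fake_code))       # True
    print(matched_text(EAD3_PATTERN, fake_code))  # oclc-12345678901
    print(accepts(PROPOSED_PATTERN, fake_code))   # False
```

Under that assumption, the current pattern accepts the fake code because the unanchored third alternative matches from `oclc` to the end of the string, ignoring the leading `US-`. The proposed pattern rejects it, but still accepts `XX-1`, since neither pattern checks the country code itself.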
To decide:
Should we:
- update the regex as is, so that invalid codes of up to 19 characters no longer validate (the maximum length is 16 characters)?
- update the regex to ensure that a country code, when present, is also valid (as was done with EAD2002, and will be done in the new approach)?
- ignore this bug altogether (apart from documenting it), since it likely does not affect anyone?
Another example: right now, the following is also valid in EAD3:
XX-1
Whereas that same fake code is correctly not valid in EAD2002 (though it is in EAC-CPF 1.0, which switched to a pure regex validation).
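The country-code option could be sketched roughly as follows, again in Python rather than Schematron. This is only an illustration of the idea, not the EAD2002 implementation: the function name `is_valid_agency_code` is hypothetical, and `SAMPLE_COUNTRY_CODES` is a deliberately tiny stand-in for the full ISO 3166-1 alpha-2 list.

```python
import re

# Structural pattern with the start anchor applied to the whole prefix group.
AGENCY_CODE_SHAPE = re.compile(
    r"^(([A-Z]{2})|([a-zA-Z]{1})|([a-zA-Z]{3,4}))(-[a-zA-Z0-9:/-]{1,11})$"
)

# Illustrative subset only (an assumption for this sketch); a real check
# would use the complete ISO 3166-1 alpha-2 code list.
SAMPLE_COUNTRY_CODES = frozenset({"DE", "FR", "GB", "US"})


def is_valid_agency_code(value, country_codes=SAMPLE_COUNTRY_CODES):
    """Check the overall shape, then the country code when one is present."""
    match = AGENCY_CODE_SHAPE.search(value)
    if match is None:
        return False
    # Group 2 is set only when the prefix is exactly two uppercase letters.
    country = match.group(2)
    return country is None or country in country_codes


if __name__ == "__main__":
    for candidate in ["US-1", "XX-1", "oclc-1", "US-oclc-12345678901"]:
        print(candidate, is_valid_agency_code(candidate))
```

With this sample list, `XX-1` and the 19-character fake code are rejected, while `US-1` and a non-country prefix such as `oclc-1` still pass, because the country check only applies when the prefix is two uppercase letters.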