Description
We have encountered data files from Europe in which a comma is used as the decimal separator (rather than the period that is common in the US). We have not found a place in the EML schema to record this, so this issue records that request. Comma decimal separators are also common in other parts of the world: https://i.redd.it/omgfapht3qn51.png
A comma decimal separator is not always interpreted correctly by data-import packages (e.g., pandas), although most provide a way to specify it in the import statement (e.g., pd.read_csv(file_name, sep=';', decimal=',')). EML metadata can be used to aid importing data tables, and so could populate that statement. Most likely, an optional field named decimalSeparator would suffice.
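As a rough illustration of how metadata could populate the import statement, here is a minimal sketch. It assumes the metadata has already been extracted into a plain dictionary; the dictionary itself and the decimalSeparator key are hypothetical (decimalSeparator is the field proposed here, not an existing EML element), and the sample data is made up for the example.

```python
import io

import pandas as pd

# Hypothetical metadata, as it might look after being read from an EML record.
# "decimalSeparator" is the field proposed in this issue; it does not exist in EML today.
metadata = {"fieldDelimiter": ";", "decimalSeparator": ","}

# Made-up sample table: semicolon-delimited, comma as the decimal separator.
raw = "date;temperature\n2021-03-28;20,27\n2021-03-29;19,5\n"

# Import driven by the metadata: the temperature column is parsed as floats.
df = pd.read_csv(
    io.StringIO(raw),
    sep=metadata["fieldDelimiter"],
    decimal=metadata["decimalSeparator"],
)

# Import without the decimal hint: the temperature column stays as text.
naive = pd.read_csv(io.StringIO(raw), sep=metadata["fieldDelimiter"])

print(df.dtypes)
print(naive.dtypes)
```

Without the decimal argument, pandas has no way of knowing that "20,27" is a number, so the column is left as strings and has to be converted by hand afterwards.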
We agree that it would be almost impossible to interpret a table that used commas as both the field separator and the decimal separator without differentiating them somehow. Therefore, it is likely that a best practice would be to not construct a table this way. We have not explored the effect of using the literalCharacter field, for example ‘2021-03-28; 20\,27’.
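To make the ambiguity concrete, the following sketch uses Python's standard csv module on a made-up comma-delimited row. Treating a backslash literalCharacter as equivalent to the csv module's escapechar is an assumption made only for illustration; we have not verified that EML-aware tools interpret literalCharacter this way.

```python
import csv

# Made-up row where the comma is both field separator and decimal separator.
ambiguous = "2021-03-28,20,27"
# The same row with the decimal comma escaped by a backslash.
escaped = "2021-03-28,20\\,27"

# Without any hint, the value 20,27 is split into two separate fields.
plain_fields = next(csv.reader([ambiguous]))

# Assumption: a backslash literalCharacter behaves like csv's escapechar,
# so the escaped comma is kept as part of the field.
escaped_fields = next(csv.reader([escaped], escapechar="\\"))

# The decimal comma still has to be converted before the value is numeric.
value = float(escaped_fields[1].replace(",", "."))

print(plain_fields)
print(escaped_fields)
print(value)
```

Even when the escape character keeps the field intact, the reader still needs to know that the comma inside it is a decimal separator, which is the information a decimalSeparator field would carry.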