Description
I hope this is the correct place to put this. I'm trying to add the protocol element to EML and it's causing EML::eml_validate() to produce errors. I can't for the life of me see where the error is when I check my document against the documentation models (see image below). If someone could point me to an example of valid EML with a protocol, that would help; I haven't found any yet (i.e. 95% chance this is me missing something simple).
My questions are:
- Does the eml_validate() function and/or the schema behind it have an error regarding protocol?
- Does the eml.ecoinformatics.org/schema/ documentation have an error regarding protocol?
- If not, what am I doing wrong and how do I fix it?
Minimally valid EML with just the protocol element (so the problem isn't the protocol element itself):
<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="EXAMPLE title" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd" system="unknown">
  <protocol>
    <title>test protocol title</title>
    <creator>
      <individualName>
        <surName>test</surName>
      </individualName>
    </creator>
    <distribution>
      <online>
        <url>https://doi.org/10.57830/xxxxxxx</url>
      </online>
    </distribution>
  </protocol>
</eml:eml>
Minimally valid EML with just a dataset element (so the dataset component is not the problem):
<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="Example: title" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd" system="unknown">
  <dataset>
    <alternateIdentifier>doi: https://doi.org/10.57830/xxxxxxx</alternateIdentifier>
    <title>EXAMPLE: title</title>
    <creator>
      <individualName>
        <surName>EXAMPLE</surName>
      </individualName>
    </creator>
    <pubDate>2022-11-11</pubDate>
    <abstract>
      <para>The abstract goes here</para>
    </abstract>
    <intellectualRights>
      <para>This data package is released to the "public domain" under Creative Commons CC0 1.0 "No Rights Reserved" (see: https://creativecommons.org/publicdomain/zero/1.0/). It is considered professional etiquette to provide attribution of the original work if this data package is shared in whole or by individual components. A generic citation is provided for this data package on the website https://portal.edirepository.org (herein "website") in the summary metadata page. Communication (and collaboration) with the creators of this data package is recommended to prevent duplicate research or publication. This data package (and its components) is made available "as is" and with no warranty of accuracy or fitness for use. The creators of this data package and the website shall not be liable for any damages resulting from misinterpretation or misuse of the data package or its components. Periodic updates of this data package may be available from the website. Thank you.
      </para>
    </intellectualRights>
    <maintenance>
      <description>complete</description>
    </maintenance>
    <contact>
      <individualName>
        <surName>EXAMPLE</surName>
      </individualName>
    </contact>
    <dataTable>
      <entityName>Example Intercept Observations</entityName>
      <entityDescription>just some example data</entityDescription>
      <physical>
        <objectName>Example_Data_Cleaned.csv</objectName>
        <size unit="bytes">191995</size>
        <authentication method="MD5">d2f8fe468e393c41c6dccf30bab1a91a</authentication>
        <dataFormat>
          <textFormat>
            <numHeaderLines>1</numHeaderLines>
            <recordDelimiter>\n</recordDelimiter>
            <attributeOrientation>column</attributeOrientation>
            <simpleDelimited>
              <fieldDelimiter>,</fieldDelimiter>
            </simpleDelimited>
          </textFormat>
        </dataFormat>
      </physical>
      <attributeList>
        <attribute>
          <attributeName>scientificName</attributeName>
          <attributeDefinition>The full scientific name for the observed species according to the Guide to the Vascular Plants of Florida, Second Edition published by Richard P. Wunderlin and Bruce F. Hansen in 2003. University Press of Florida.</attributeDefinition>
          <storageType>string</storageType>
          <measurementScale>
            <nominal>
              <nonNumericDomain>
                <textDomain>
                  <definition>The full scientific name for the observed species according to the Guide to the Vascular Plants of Florida, Second Edition published by Richard P. Wunderlin and Bruce F. Hansen in 2003. University Press of Florida.</definition>
                </textDomain>
              </nonNumericDomain>
            </nominal>
          </measurementScale>
        </attribute>
      </attributeList>
      <numberOfRecords>500</numberOfRecords>
    </dataTable>
  </dataset>
</eml:eml>
But if I put dataset and protocol together in a single document, the EML is invalid:
<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="EXAMPLE title" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd" system="unknown">
  <dataset>
    <alternateIdentifier>doi: https://doi.org/10.57830/xxxxxxx</alternateIdentifier>
    <title>EXAMPLE: title</title>
    <creator>
      <individualName>
        <surName>EXAMPLE</surName>
      </individualName>
    </creator>
    <pubDate>2022-11-11</pubDate>
    <abstract>
      <para>The abstract goes here</para>
    </abstract>
    <intellectualRights>
      <para>This data package is released to the "public domain" under Creative Commons CC0 1.0 "No Rights Reserved" (see: https://creativecommons.org/publicdomain/zero/1.0/). It is considered professional etiquette to provide attribution of the original work if this data package is shared in whole or by individual components. A generic citation is provided for this data package on the website https://portal.edirepository.org (herein "website") in the summary metadata page. Communication (and collaboration) with the creators of this data package is recommended to prevent duplicate research or publication. This data package (and its components) is made available "as is" and with no warranty of accuracy or fitness for use. The creators of this data package and the website shall not be liable for any damages resulting from misinterpretation or misuse of the data package or its components. Periodic updates of this data package may be available from the website. Thank you.
      </para>
    </intellectualRights>
    <maintenance>
      <description>complete</description>
    </maintenance>
    <contact>
      <individualName>
        <surName>EXAMPLE</surName>
      </individualName>
    </contact>
    <dataTable>
      <entityName>Example Intercept Observations</entityName>
      <entityDescription>just some example data</entityDescription>
      <physical>
        <objectName>Example_Data_Cleaned.csv</objectName>
        <size unit="bytes">191995</size>
        <authentication method="MD5">d2f8fe468e393c41c6dccf30bab1a91a</authentication>
        <dataFormat>
          <textFormat>
            <numHeaderLines>1</numHeaderLines>
            <recordDelimiter>\n</recordDelimiter>
            <attributeOrientation>column</attributeOrientation>
            <simpleDelimited>
              <fieldDelimiter>,</fieldDelimiter>
            </simpleDelimited>
          </textFormat>
        </dataFormat>
      </physical>
      <attributeList>
        <attribute>
          <attributeName>scientificName</attributeName>
          <attributeDefinition>The full scientific name for the observed species according to the Guide to the Vascular Plants of Florida, Second Edition published by Richard P. Wunderlin and Bruce F. Hansen in 2003. University Press of Florida.</attributeDefinition>
          <storageType>string</storageType>
          <measurementScale>
            <nominal>
              <nonNumericDomain>
                <textDomain>
                  <definition>The full scientific name for the observed species according to the Guide to the Vascular Plants of Florida, Second Edition published by Richard P. Wunderlin and Bruce F. Hansen in 2003. University Press of Florida.</definition>
                </textDomain>
              </nonNumericDomain>
            </nominal>
          </measurementScale>
        </attribute>
      </attributeList>
      <numberOfRecords>500</numberOfRecords>
    </dataTable>
  </dataset>
  <protocol>
    <title>test protocol title</title>
    <creator>
      <individualName>
        <surName>test</surName>
      </individualName>
    </creator>
    <distribution>
      <online>
        <url>https://doi.org/10.57830/xxxxxxx</url>
      </online>
    </distribution>
  </protocol>
</eml:eml>
Specifically, I get the error:
EML::eml_validate(metadata)
[1] FALSE
attr(,"errors")
[1] "Element 'protocol': This element is not expected. Expected is one of ( annotations, additionalMetadata )."
From what I can tell, neither annotations nor additionalMetadata is a required element. Furthermore, when I do have additionalMetadata elements, I get this same error. If I remove the protocol element and have just the dataset and additionalMetadata elements, the EML is valid (so there is no problem with the additionalMetadata elements; I'm just not including those examples here for the sake of brevity).
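For anyone trying to reproduce this, here is a minimal sketch of the check. It assumes the combined dataset and protocol document above has been saved to disk; the file name combined.xml is hypothetical. It also assumes eml_validate() is given a file path, whereas the call shown earlier passed an object named metadata.

```r
library(EML)

# Hypothetical file name: the combined dataset + protocol document shown
# above, saved to disk.
path <- "combined.xml"

# Assumption: eml_validate() is given a file path here; the call shown
# earlier passed an object named `metadata` instead.
result <- eml_validate(path)

# The result is a logical; when it is FALSE, the schema messages are
# attached as the "errors" attribute (as in the output above).
if (!result) {
  print(attr(result, "errors"))
}
```

Pointing the same sketch at the dataset-only or protocol-only document should make it easy to confirm which combination triggers the message.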