Including protocol causes validation issues #350
Open
@RobLBaker

I hope this is the correct place to post this. I'm trying to add the protocol element to an EML document, and it causes EML::eml_validate() to report errors. I can't for the life of me see where the error is when I compare my document against the schema documentation (see image below). If someone could point me to an example of valid EML that includes a protocol, that would help; I haven't found one yet (i.e. there's a 95% chance this is me missing something simple).

My questions are:

  1. Does the eml_validate() function and/or the schema behind it have an error regarding protocol?
  2. Does the eml.ecoinformatics.org/schema/ documentation have an error regarding protocol?
  3. If not, what am I doing wrong and how do I fix it?

Minimally valid EML with just the protocol element (so the protocol element itself isn't the problem):

<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="EXAMPLE title" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd" system="unknown">
  <protocol>
    <title>test protocol title</title>
    <creator>
      <individualName>
        <surName>test</surName>
      </individualName>
    </creator>
    <distribution>
      <online>
        <url>https://doi.org/10.57830/xxxxxxx</url>
      </online>
    </distribution>
  </protocol>
</eml:eml>

Minimally valid EML with just a dataset element (so the dataset element itself isn't the problem):

<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="Example: title" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd" system="unknown">
    <dataset>
        <alternateIdentifier>doi: https://doi.org/10.57830/xxxxxxx</alternateIdentifier>
        <title>EXAMPLE: title</title>
        <creator>
            <individualName>
                <surName>EXAMPLE</surName>
            </individualName>
        </creator>
        <pubDate>2022-11-11</pubDate>
        <abstract>
            <para>The abstract goes here</para>
        </abstract>
        <intellectualRights>
            <para>This data package is released to the "public domain" under Creative Commons CC0 1.0 "No Rights Reserved" (see: https://creativecommons.org/publicdomain/zero/1.0/). It is considered professional etiquette to provide attribution of the original work if this data package is shared in whole or by individual components. A generic citation is provided for this data package on the website https://portal.edirepository.org (herein "website") in the summary metadata page. Communication (and collaboration) with the creators of this data package is recommended to prevent duplicate research or publication. This data package (and its components) is made available "as is" and with no warranty of accuracy or fitness for use. The creators of this data package and the website shall not be liable for any damages resulting from misinterpretation or misuse of the data package or its components. Periodic updates of this data package may be available from the website. Thank you.
            </para>
        </intellectualRights>
        <maintenance>
            <description>complete</description>
        </maintenance>
        <contact>
            <individualName>
                <surName>EXAMPLE</surName>
            </individualName>
        </contact>    
        <dataTable>
            <entityName>Example Intercept Observations</entityName>
            <entityDescription>just some example data</entityDescription>
            <physical>
                <objectName>Example_Data_Cleaned.csv</objectName>
                <size unit="bytes">191995</size>
                <authentication method="MD5">d2f8fe468e393c41c6dccf30bab1a91a</authentication>
                <dataFormat>
                    <textFormat>
                        <numHeaderLines>1</numHeaderLines>
                        <recordDelimiter>\n</recordDelimiter>
                        <attributeOrientation>column</attributeOrientation>
                        <simpleDelimited>
                            <fieldDelimiter>,</fieldDelimiter>
                        </simpleDelimited>
                    </textFormat>
                </dataFormat>
            </physical>
            <attributeList>
                <attribute>
                    <attributeName>scientificName</attributeName>
                    <attributeDefinition>The full scientific name for the observed species according to the Guide to the Vascular Plants of Florida, Second Edition published by Richard P. Wunderlin and Bruce F. Hansen in 2003. University Press of Florida.</attributeDefinition>
                    <storageType>string</storageType>
                    <measurementScale>
                        <nominal>
                            <nonNumericDomain>
                                <textDomain>
                                    <definition>The full scientific name for the observed species according to the Guide to the Vascular Plants of Florida, Second Edition published by Richard P. Wunderlin and Bruce F. Hansen in 2003. University Press of Florida.</definition>
                                </textDomain>
                            </nonNumericDomain>
                        </nominal>
                    </measurementScale>
                </attribute>
            </attributeList>
            <numberOfRecords>500</numberOfRecords>
        </dataTable>
    </dataset>
</eml:eml>

But if I put dataset and protocol together in a single document the EML is invalid:

<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="EXAMPLE title" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd" system="unknown">
  <dataset>
    <alternateIdentifier>doi: https://doi.org/10.57830/xxxxxxx</alternateIdentifier>
    <title>EXAMPLE: title</title>
    <creator>
      <individualName>
        <surName>EXAMPLE</surName>
      </individualName>
    </creator>
    <pubDate>2022-11-11</pubDate>
    <abstract>
      <para>The abstract goes here</para>
    </abstract>
    <intellectualRights>
      <para>This data package is released to the "public domain" under Creative Commons CC0 1.0 "No Rights Reserved" (see: https://creativecommons.org/publicdomain/zero/1.0/). It is considered professional etiquette to provide attribution of the original work if this data package is shared in whole or by individual components. A generic citation is provided for this data package on the website https://portal.edirepository.org (herein "website") in the summary metadata page. Communication (and collaboration) with the creators of this data package is recommended to prevent duplicate research or publication. This data package (and its components) is made available "as is" and with no warranty of accuracy or fitness for use. The creators of this data package and the website shall not be liable for any damages resulting from misinterpretation or misuse of the data package or its components. Periodic updates of this data package may be available from the website. Thank you.
      </para>
    </intellectualRights>
    <maintenance>
      <description>complete</description>
    </maintenance>
    <contact>
      <individualName>
        <surName>EXAMPLE</surName>
      </individualName>
    </contact>    
    <dataTable>
      <entityName>Example Intercept Observations</entityName>
      <entityDescription>just some example data</entityDescription>
      <physical>
        <objectName>Example_Data_Cleaned.csv</objectName>
        <size unit="bytes">191995</size>
        <authentication method="MD5">d2f8fe468e393c41c6dccf30bab1a91a</authentication>
        <dataFormat>
          <textFormat>
            <numHeaderLines>1</numHeaderLines>
            <recordDelimiter>\n</recordDelimiter>
            <attributeOrientation>column</attributeOrientation>
            <simpleDelimited>
              <fieldDelimiter>,</fieldDelimiter>
            </simpleDelimited>
          </textFormat>
        </dataFormat>
      </physical>
      <attributeList>
        <attribute>
          <attributeName>scientificName</attributeName>
          <attributeDefinition>The full scientific name for the observed species according to the Guide to the Vascular Plants of Florida, Second Edition published by Richard P. Wunderlin and Bruce F. Hansen in 2003. University Press of Florida.</attributeDefinition>
          <storageType>string</storageType>
          <measurementScale>
            <nominal>
              <nonNumericDomain>
                <textDomain>
                  <definition>The full scientific name for the observed species according to the Guide to the Vascular Plants of Florida, Second Edition published by Richard P. Wunderlin and Bruce F. Hansen in 2003. University Press of Florida.</definition>
                </textDomain>
              </nonNumericDomain>
            </nominal>
          </measurementScale>
        </attribute>
      </attributeList>
      <numberOfRecords>500</numberOfRecords>
    </dataTable>
  </dataset>
  <protocol>
    <title>test protocol title</title>
    <creator>
      <individualName>
        <surName>test</surName>
      </individualName>
    </creator>
    <distribution>
      <online>
        <url>https://doi.org/10.57830/xxxxxxx</url>
      </online>
    </distribution>
  </protocol>
</eml:eml>

Specifically, I get the error:

EML::eml_validate(metadata)
[1] FALSE
attr(,"errors")
[1] "Element 'protocol': This element is not expected. Expected is one of ( annotations, additionalMetadata )."

From what I can tell, neither annotations nor additionalMetadata is a required element. Furthermore, I get the same error when the document does include additionalMetadata elements alongside dataset and protocol. If I remove the protocol element and keep just the dataset and additionalMetadata elements, the EML is valid (so the additionalMetadata is not the problem; I'm leaving those examples out for the sake of brevity).
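
One possible reading of the error message, offered as an assumption rather than a confirmed diagnosis: the root eml element may accept only one of its top-level modules (dataset, citation, software, or protocol), which would explain why the validator expects only annotations or additionalMetadata after dataset. If that is the case, a protocol that documents a dataset would need to be nested inside the dataset, for example under methods/methodStep. The sketch below shows that layout. It has not been run through EML::eml_validate(), and the position of methods (after contact, before dataTable) and the placeholder description text are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="EXAMPLE title" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd" system="unknown">
  <dataset>
    <title>EXAMPLE: title</title>
    <creator>
      <individualName>
        <surName>EXAMPLE</surName>
      </individualName>
    </creator>
    <contact>
      <individualName>
        <surName>EXAMPLE</surName>
      </individualName>
    </contact>
    <!-- Assumption: methods belongs after contact and before any dataTable -->
    <methods>
      <methodStep>
        <!-- Placeholder text; a description is assumed to be required here -->
        <description>
          <para>placeholder description of the method step</para>
        </description>
        <!-- Same protocol content as the standalone example above -->
        <protocol>
          <title>test protocol title</title>
          <creator>
            <individualName>
              <surName>test</surName>
            </individualName>
          </creator>
          <distribution>
            <online>
              <url>https://doi.org/10.57830/xxxxxxx</url>
            </online>
          </distribution>
        </protocol>
      </methodStep>
    </methods>
  </dataset>
</eml:eml>
```

If this layout validates, it would suggest the answer to questions 1 and 2 is "no" and that the issue is where protocol is placed in the document.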

image
