Description
While working on the ingestion of ABCDEFG data we had some issues ingesting data coming from several BioCase instances.
In ABCD 2.06, which most instances use, RecordBasis is a controlled vocabulary consisting of these elements:
"PreservedSpecimen","LivingSpecimen","FossileSpecimen","FossilSpecimen","OtherSpecimen","HumanObservation","MachineObservation","DrawingOrPhotograph","MultimediaObject","AbsenceObservation"
However, the EFG XML contains other RecordBasis values, resulting in XML which does not validate against the ABCD schema.
This gives validation errors such as:
cvc-enumeration-valid: Value 'MineralSpecimen' is not facet-valid with respect to enumeration '[PreservedSpecimen, LivingSpecimen, FossileSpecimen, OtherSpecimen, HumanObservation, MachineObservation, DrawingOrPhotograph, MultimediaObject]'. It must be a value from the enumeration.
BioCase instances producing invalid data is something to be avoided. However, there are already applications that depend on these (invalid) types being in the RecordBasis. For example, GeoCase uses this to populate the Specimen Type.
If I look at the 3.0 version of ABCD, there is now some room to include geological specimens in the ABCD standard with the new RecordBasis type MineralSpecimen. However, among the BioCase instances providing EFG data we have already noticed a de facto standard for these types, which are also used in GeoCase. We have seen the following types in use:
"Unspecified", "RockSpecimen", "MineralSpecimen", "MeteoriteSpecimen"
As these types are already semi-standardized and actively used within both the BioCase EFG instances and the GeoCase portal, I would propose to include the other types in the ABCD standard as well.
Additionally, I would like to propose that data is validated before it is exchanged, so we are sure that all ABCD(EFG) data communicated complies with the data standard.
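To illustrate the kind of check I have in mind, here is a minimal Python sketch that flags RecordBasis values outside the 2.06 vocabulary quoted above. It is only an illustration and not a replacement for full XSD validation. The function name `find_invalid_record_basis` and the sample XML are hypothetical, the sample is heavily simplified and not a complete ABCD document, and I am assuming the RecordBasis element lives in the namespace given by the schema URL.

```python
import xml.etree.ElementTree as ET

# Assumption: RecordBasis is in the namespace of the ABCD 2.06 schema URL.
ABCD_NS = "http://www.tdwg.org/schemas/abcd/2.06"

# The controlled vocabulary as quoted in this issue for ABCD 2.06.
ALLOWED_RECORD_BASIS = {
    "PreservedSpecimen", "LivingSpecimen", "FossileSpecimen",
    "FossilSpecimen", "OtherSpecimen", "HumanObservation",
    "MachineObservation", "DrawingOrPhotograph", "MultimediaObject",
    "AbsenceObservation",
}


def find_invalid_record_basis(xml_text):
    """Return every RecordBasis value that is not in the vocabulary."""
    root = ET.fromstring(xml_text)
    invalid = []
    for element in root.iter(f"{{{ABCD_NS}}}RecordBasis"):
        value = (element.text or "").strip()
        if value not in ALLOWED_RECORD_BASIS:
            invalid.append(value)
    return invalid


# Hypothetical, simplified sample; a real response has many more elements.
SAMPLE = """<abcd:DataSets xmlns:abcd="http://www.tdwg.org/schemas/abcd/2.06">
  <abcd:Unit><abcd:RecordBasis>PreservedSpecimen</abcd:RecordBasis></abcd:Unit>
  <abcd:Unit><abcd:RecordBasis>RockSpecimen</abcd:RecordBasis></abcd:Unit>
</abcd:DataSets>"""

print(find_invalid_record_basis(SAMPLE))  # ['RockSpecimen']
```

A provider or aggregator could run a check like this (or, better, a full schema validation) before publishing or ingesting a response, and report the offending values instead of silently passing them on.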
I am interested in what others think about this subject.
For an example of an ABCDEFG record from TalTech which uses RockSpecimen, see:
https://bc.geocollections.info/querytool/raw.cgi?dsa=sarv&filter=(inst=Department%20of%20Geology,%20TalTech)AND(col=GIT)AND(cat=374-5)&schema=http://www.tdwg.org/schemas/abcd/2.06&wrapper_url=https://bc.geocollections.info/pywrapper.cgi?dsa=sarv