Description
I'm creating this issue for discussions related to a data model for Plant-Pollinator Interactions (PPI).
We have discussed many options for fitting PPI data into DwC-Archives, along with their advantages and disadvantages. Here I summarize what has been discussed, so that we can keep track of the discussion and other people can also take part in it.
Data Model
- Event Core Model: uses the dwc:Event class as the core in DwC-As to represent interactions. An interaction is thus a dwc:Event with spatial (dwc:Location) and temporal information (e.g. dwc:eventDate, dwc:eventTime).
- Resource Relationship: uses the dwc:ResourceRelationship extension, and the core class can be dwc:Occurrence or dwc:Taxon. The terms in the dwc:ResourceRelationship class are used to link dwc:Occurrence or dwc:Taxon records in order to represent an interaction between organisms/species.
- Interaction extension: create a new DwC extension called Interaction extension (or PPI extension), which allows specifying the interaction partner of a core dwc:Occurrence or dwc:Taxon.
Event Core Model
How it works
The same dwc:eventID is used to link two dwc:Occurrence or dwc:Taxon records, representing an interaction between the linked resources. The DwC-A contains a dwc:Event class as core and dwc:Occurrence and/or dwc:Taxon as extensions (see figure below).
The dwc:MeasurementOrFact class is then used with the Plant Pollinator Interactions Vocabulary to describe characteristics of the interacting organisms/species and of the interactions themselves.
Issues:
1. dwc:Event does not provide a term that can be used to specify the type of an interaction (e.g. visitsFlowersOf, pollinates).
2. No direction can be given for the recorded interaction. Given two occurrences/taxa that share the same dwc:eventID, we don't know which one is the subject and which one is the object of the interaction (who visits whom? who pollinates whom?).
Issue 1) can be resolved by defining a term interactionType in the Plant Pollinator Interactions Vocabulary and using dwc:MeasurementOrFact linked to the dwc:Event, but we still do not have the direction of the interaction.
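To make the direction problem concrete, here is a minimal Python sketch. The rows are hypothetical and only mimic an occurrence extension of an Event-core archive; the function name `participants_by_event` is mine, not part of any DwC tooling. Grouping by dwc:eventID yields only an unordered set of participants.

```python
from collections import defaultdict

# Hypothetical rows mimicking the occurrence extension of an Event-core archive.
occurrences = [
    {"eventID": "eventID_1", "occurrenceID": "occ_1", "scientificName": "Xylocopa frontalis"},
    {"eventID": "eventID_1", "occurrenceID": "occ_2", "scientificName": "Passiflora edulis"},
]


def participants_by_event(rows):
    """Group occurrence IDs by the eventID they share."""
    groups = defaultdict(set)
    for row in rows:
        groups[row["eventID"]].add(row["occurrenceID"])
    return dict(groups)


interactions = participants_by_event(occurrences)
# Each event maps to an unordered set: nothing says which occurrence
# is the subject (visitor) and which is the object (visited plant).
print(interactions)
```

Any subject/object ordering would have to come from information outside this model, for example a convention or an extra measurement.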
Resource Relationship
This looks like the more "natural" way to use DwC to specify a relationship (aka interaction) between two or more resources.
The dwc:ResourceRelationship class solves the issues of the previous data model: dwc:relationshipOfResource gives the interaction type, and the direction of an interaction is given by the terms dwc:resourceID and dwc:relatedResourceID.
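As an illustration, a ResourceRelationship row can be read as a directed triple. The sketch below uses hypothetical data and helper names (`Interaction`, `to_interaction`); it assumes dwc:resourceID is the subject and dwc:relatedResourceID is the object.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    subject: str   # resolved from dwc:resourceID
    relation: str  # dwc:relationshipOfResource
    related: str   # resolved from dwc:relatedResourceID


# Hypothetical lookup from occurrenceID to scientificName (the core file).
names = {"occ_1": "Xylocopa frontalis", "occ_2": "Passiflora edulis"}

# Hypothetical ResourceRelationship extension rows.
relationships = [
    {
        "resourceRelationshipID": "resrelat_1",
        "resourceID": "occ_1",
        "relatedResourceID": "occ_2",
        "relationshipOfResource": "visitsFlowersOf",
    },
]


def to_interaction(row, names):
    """Turn one relationship row into a directed (subject, relation, related) triple."""
    return Interaction(
        names[row["resourceID"]],
        row["relationshipOfResource"],
        names[row["relatedResourceID"]],
    )


triples = [to_interaction(row, names) for row in relationships]
print(triples[0])
```

Swapping resourceID and relatedResourceID would reverse the statement, which is exactly the information the Event core model cannot express.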
The new term dwc:relationshipOfResourceID (tdwg/dwc#283 (comment)) will allow the adoption of a URI (instead of a literal value), so a term from a vocabulary or ontology (e.g. RO) can be used here.
Some criticisms have been made about the complexity of using the dwc:ResourceRelationship class, but I don't see it as much different from (or more complex than) using dwc:Event as core, since in any scenario we will need a relational model to capture interactions.
The dwc:MeasurementOrFact class is used in the same way as in the Event model to specify the characteristics of occurrences/taxa, but the interaction data is linked directly to the dwc:Occurrence/dwc:Taxon, since **Extended Measurement Or Fact does not define a term for dwc:resourceRelationshipID** (as opposed to dwc:occurrenceID). The alternative here is to define a new class Interaction (next approach).
Interaction Measurement Or Fact extension
Since we cannot link dwc:MeasurementOrFact to a dwc:ResourceRelationship, we discussed the creation of a new extension, Interaction. The idea is similar to the Extended Measurement Or Fact extension, but instead of defining the term dwc:occurrenceID as part of the extension, it will define dwc:resourceRelationshipID.
With that approach the star schema can be expanded to a snowflake schema, and the relationships (aka interactions) could have some measurements and facts directly attached to them.
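A rough Python sketch of what that snowflake join could look like follows. The data and the helper `facts_for_relationship` are hypothetical; the only assumption taken from the proposal is that a measurement row may carry a resourceRelationshipID.

```python
# Hypothetical ResourceRelationship rows, keyed by resourceRelationshipID.
relationships = {
    "resrelat_1": {
        "resourceID": "occ_1",
        "relatedResourceID": "occ_2",
        "relationshipOfResource": "visitsFlowersOf",
    },
}

# Hypothetical Interaction MoF rows: a fact belongs either to an occurrence
# only (resourceRelationshipID is None) or to a relationship.
measurements = [
    {"occurrenceID": "occ_2", "resourceRelationshipID": None,
     "measurementType": "flowerColor", "measurementValue": "purple"},
    {"occurrenceID": "occ_1", "resourceRelationshipID": "resrelat_1",
     "measurementType": "resourceCollected", "measurementValue": "nectar"},
]


def facts_for_relationship(rel_id, rows):
    """Collect the measurements attached directly to one relationship."""
    return {
        row["measurementType"]: row["measurementValue"]
        for row in rows
        if row["resourceRelationshipID"] == rel_id
    }


# Second-level join: core -> relationship -> measurements.
interaction = dict(
    relationships["resrelat_1"],
    facts=facts_for_relationship("resrelat_1", measurements),
)
print(interaction)
```

Here flowerColor stays a property of the plant occurrence, while resourceCollected becomes a property of the interaction itself.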
Data sample
Event Data Model
Core: interactions.csv:
eventID | eventDate | locality |
---|---|---|
eventID_1 | 2021-01-01 10:00:00 | São Paulo |
Occurrences extension: occurrences.csv:
eventID | occurrenceID | sex | scientificName |
---|---|---|---|
eventID_1 | occ_1 | female | Xylocopa frontalis |
eventID_1 | occ_2 | hermaphrodite | Passiflora edulis |
Extended MoF extension: emof.csv:
eventID | occurrenceID | measurementType | measurementValue |
---|---|---|---|
eventID_1 | occ_2 | flowerColor | purple |
eventID_1 | | resourceCollected | nectar |
ResourceRelationship Data Model
Core: occurrences.csv:
occurrenceID | sex | scientificName | eventDate | locality |
---|---|---|---|---|
occ_1 | female | Xylocopa frontalis | 2021-01-01 10:00:00 | São Paulo |
occ_2 | hermaphrodite | Passiflora edulis | 2021-01-01 10:00:00 | São Paulo |
ResourceRelationship extension: resrelat.csv:
occurrenceID | resourceRelationshipID | resourceID | relatedResourceID | relationshipOfResource |
---|---|---|---|---|
occ_1 | resrelat_1 | occ_1 | occ_2 | visitsFlowersOf |
MeasurementOrFact extension: mof.csv:
occurrenceID | measurementType | measurementValue |
---|---|---|
occ_2 | flowerColor | purple |
occ_1 | resourceCollected | nectar |
Interaction extension Data Model
Core: occurrences.csv:
occurrenceID | sex | scientificName |
---|---|---|
occ_1 | female | Xylocopa frontalis |
occ_2 | hermaphrodite | Passiflora edulis |
ResourceRelationship extension: resrelat.csv:
occurrenceID | resourceRelationshipID | resourceID | relatedResourceID | relationshipOfResource | relationshipEstablishedDate |
---|---|---|---|---|---|
occ_1 | resrelat_1 | occ_1 | occ_2 | visitsFlowersOf | 2021-01-01 10:00:00 |
Interaction Measurement Or Fact Extension: interactions-mof.csv:
occurrenceID | resourceRelationshipID | measurementType | measurementValue |
---|---|---|---|
occ_2 | | flowerColor | purple |
occ_1 | resrelat_1 | resourceCollected | nectar |
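
One practical consequence of the extra level is that the measurement file now depends on the relationship file. Below is a minimal sketch of a referential check, assuming comma-separated files with the columns shown above; the CSV text and the function names are illustrative, not part of any existing validator.

```python
import csv
import io

# Illustrative file contents, using the same columns as the tables above.
RESRELAT_CSV = """occurrenceID,resourceRelationshipID,resourceID,relatedResourceID,relationshipOfResource
occ_1,resrelat_1,occ_1,occ_2,visitsFlowersOf
"""

MOF_CSV = """occurrenceID,resourceRelationshipID,measurementType,measurementValue
occ_2,,flowerColor,purple
occ_1,resrelat_1,resourceCollected,nectar
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def dangling_relationship_ids(mof_rows, rel_rows):
    """Return relationship IDs used by measurements but absent from the relationship file."""
    known = {row["resourceRelationshipID"] for row in rel_rows}
    used = {row["resourceRelationshipID"] for row in mof_rows if row["resourceRelationshipID"]}
    return sorted(used - known)


dangling = dangling_relationship_ids(read_rows(MOF_CSV), read_rows(RESRELAT_CSV))
print(dangling)
```

Rows with an empty resourceRelationshipID are skipped, since they describe an occurrence rather than an interaction.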
Other options
An Interaction extension can extend dwc:ResourceRelationship to attach geographic information directly to the interactions (dwc:Location class), along with many other terms that are relevant for characterizing an interaction.
Questions
- What is the "best" model for sharing plant-pollinator interactions?
- What are other advantages/disadvantages of each model?
- Could someone provide examples that fit each of the models, and explain why each one is the best for its purpose?
So, I would like to know the opinion of others about it.
thanks.