GitHub - king8w/ecocomDP: A dataset design pattern and R package for ecological community data.
forked from sokole/ecocomDP



ecocomDP: A dataset design pattern for ecological community data to facilitate synthesis and reuse

With the almost forty-year existence of the Long-term Ecological Research (LTER) Network and other networks worldwide, a diverse array of long-term observational and experimental data is becoming increasingly available. A number of data repositories are making the data accessible, with the intent that accompanying detailed metadata allow meaningful reuse, repurposing, and integration with other data.

However, in synthesis research the largest time investment is still in discovering, cleaning, and combining primary datasets until all data are completely understood and converted to a similar format. There are two approaches to achieving this data regularity: a) prescribe the format before data collection starts, or b) convert primary data into a flexible intermediate format for reuse. Prescribed formats have rarely been successful because of the wide range of ecosystems represented, the original research questions that drive collection, and varying sampling and analysis methods. Hence, we took the second approach: define a flexible intermediate data model and convert primary data to it. In the context of the Environmental Data Initiative’s data repository, this allows us to maintain the original dataset, which is most convenient for describing and answering the original research questions (and does not interfere with depositors' existing practices), add a conversion script to reformat the data into the intermediate format, and make the harmonized intermediate available to synthesis research. This pre-harmonization step may be accomplished by data managers because the dataset still contains all original information, without the aggregation, filtering, or cleaning necessary for targeted research questions. Although the data are still distributed across distinct datasets, they can easily be discovered, aggregated, and converted further into other formats for specific secondary uses.

Figure: Abstract view of dataset levels. A flexible intermediate (L1, middle) lies between datasets of primary observations (L0, left) and the aggregated views used by synthesis projects. If datasets are in a recognized format, EDI can create tools for some basic functions.
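To make the idea of a conversion script concrete, here is a minimal base R sketch that reshapes a small primary (L0) table into a long, observation-style table. The input table, its column names, and the output column names (`location_id`, `datetime`, `taxon_id`, `variable_name`, `value`, `observation_id`) are assumptions chosen for illustration. This is not the package's own conversion code, and a real conversion would follow the ecocomDP table specifications.

```r
# Hypothetical primary (L0) table: one row per site visit, one column per species.
primary <- data.frame(
  site   = c("A", "A", "B"),
  date   = as.Date(c("2020-06-01", "2020-07-01", "2020-06-01")),
  sp_one = c(3, 0, 5),
  sp_two = c(1, 2, 0)
)

# Columns that hold species counts (illustrative names).
species_cols <- c("sp_one", "sp_two")

# Stack the species columns into long form: one row per site, date, and taxon.
# Nothing is aggregated or filtered, so zero counts are kept as recorded.
long <- data.frame(
  location_id   = rep(primary$site, times = length(species_cols)),
  datetime      = rep(primary$date, times = length(species_cols)),
  taxon_id      = rep(species_cols, each = nrow(primary)),
  variable_name = "count",
  value         = unlist(primary[species_cols], use.names = FALSE)
)

# Give every observation a simple sequential identifier.
long$observation_id <- seq_len(nrow(long))

print(long)
```

The long form keeps every original value, which matches the goal of leaving aggregation and filtering to the synthesis project that later uses the data.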

Contents

Create ecocomDP data

Use ecocomDP data

PostgreSQL

A PostgreSQL implementation of ecocomDP can be found here.

R package

The R package helps create, validate, document, archive, discover, and use ecocomDP data.

# Install from GitHub
remotes::install_github("EDIorg/ecocomDP")

Running the tests

Tests are implemented with the testthat R package and are organized under /tests/testthat.
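One common way to run a testthat suite for a package under development is shown below. It assumes that the devtools package is installed and that the R working directory is the root of a local clone of this repository. Neither assumption comes from the original instructions.

```r
# Assumes devtools is installed and the working directory is the package root.
# devtools::test() loads the package source and runs the files in tests/testthat.
devtools::test()
```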

Contributing

Community contributions are welcome. Please refer to our code of conduct and contributing guidelines when submitting pull requests.

Versioning

This project follows the semantic versioning specification.

Authors

Several people have contributed to this project. We welcome you to join us.
