Interesting workflow:
- Start with a species list
- Clean names with taxize and scrubr
- Get occurrence data with rgbif
- Clean occurrence data with rgbif, scrubr, or CoordinateCleaner
- Check the data with assertr
R code example:
# read in species list
spp <- read.csv("spp_list.txt", header = TRUE,
                stringsAsFactors = FALSE)$bad
# resolve names: fix misspellings
spp2 <- taxize::gnr_resolve(spp, data_source_ids = 11,
                            canonical = TRUE)$matched_name2
# fetch GBIF occurrence data
dat <- rgbif::occ_data(scientificName = spp2, limit = 300)
# remove data with issues: COUNTRY_MISMATCH & COORDINATE_ROUNDED
dat <- rgbif::occ_issues(dat, -cum, -cdround)
# make a single data.frame
dat <- dplyr::bind_rows(lapply(dat, "[[", "data"))
# remove records with incomplete lat/lon data
dat <- scrubr::coord_incomplete(dat)
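The example above stops before the assertr step listed in the workflow. A minimal sketch of what that check could look like follows. The `occ` data.frame and its values are a hypothetical stand-in for the cleaned `dat`, and the column names `scientificName`, `decimalLatitude` and `decimalLongitude` are assumed to match what rgbif returns; adjust them if your data differ.

```r
library(assertr)

# Hypothetical stand-in for the cleaned `dat` data.frame from the example above
occ <- data.frame(
  scientificName   = c("Quercus robur", "Quercus robur", "Fagus sylvatica"),
  decimalLatitude  = c(48.11, 51.51, 45.76),
  decimalLongitude = c(-1.68, -0.13, 4.84),
  stringsAsFactors = FALSE
)

# Each check stops with an error if it fails,
# and returns the data so the next check can run if it passes

# coordinates must not be missing
checked <- assert(occ, not_na, decimalLatitude, decimalLongitude)
# latitude must lie within [-90, 90]
checked <- assert(checked, within_bounds(-90, 90), decimalLatitude)
# longitude must lie within [-180, 180]
checked <- assert(checked, within_bounds(-180, 180), decimalLongitude)
# every record must carry a non-empty species name
checked <- verify(checked, nchar(scientificName) > 0)
```

Because a failed assertion raises an error, a check like this can sit at the end of the cleaning steps and stop the workflow before bad records reach any analysis.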
So adding taxize and scrubr as Galaxy-E tools could be of interest.