Tags: S4M8/intel-extractor
Scraping improvements (#4)

* wip/curious script
  Update scraping strategy to target the API directly instead of scrolling the DOM. Add functions to verify that the org exists. Determine the number of pages to be scraped for citizen dossiers. Add all parsed citizen URLs to an array.
* wip/add sqlite database for citizen data storage and export
  Add Sequelize package. Create Citizen model with required fields. Add database initialization on startup.
* update scraping method and csv structure
* wip/updated csv structure
* various improvements
* cleanup
* feat/updated scraping method and csv structure
* chore/update version
* cleanup and formatting
* fix/reduce org name to only SID; update csv to remove mainOrg from affiliation list
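Two of the steps named in this commit message, determining the number of pages to scrape and removing mainOrg from the affiliation list, can be illustrated with a minimal sketch. This is not the repository's code: the function names `pageCount` and `affiliationsWithoutMain`, the parameters, and the idea that the API reports a total member count and a fixed page size are all assumptions made for illustration.

```typescript
// Hypothetical helpers: names, parameters, and data shapes are assumptions,
// not taken from the intel-extractor source.

/**
 * Number of pages needed to cover `totalMembers` entries
 * when each page holds `pageSize` entries.
 */
function pageCount(totalMembers: number, pageSize: number): number {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError("pageSize must be a positive integer");
  }
  if (totalMembers <= 0) {
    return 0; // nothing to scrape
  }
  return Math.ceil(totalMembers / pageSize);
}

/**
 * Return the affiliation SIDs with the main org's SID removed,
 * so the main org is not repeated in the affiliation column of the CSV.
 * Comparison is an exact string match (an assumption).
 */
function affiliationsWithoutMain(mainOrgSid: string, sids: string[]): string[] {
  return sids.filter((sid) => sid !== mainOrgSid);
}
```

Under these assumptions, `pageCount(65, 32)` yields 3, and `affiliationsWithoutMain("ABC", ["ABC", "XYZ"])` yields `["XYZ"]`. Keeping both helpers free of network and database access makes them easy to check separately from the scraper itself.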