This is a pair of documents used to calculate statistics for the papers posted on Engineering Archive (engrXiv). The data is collected using the PHP script (modified from here). The Jupyter notebook here is used for the plots at: https://blog.engrxiv.org/stats/
The PHP script uses the OSF API (https://developer.osf.io/) to gather a list of papers in engrXiv so that we can compute stats.
You will need an Authorization Token (https://developer.osf.io/#tag/Authentication) to make requests.
We use the PHP Curl Class library to handle our GET requests: https://github.com/php-curl-class/php-curl-class
You can pass the token to the script by appending -p$osf_token to the end of your php command, where $osf_token is your OSF authorization token.
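For readers who would rather experiment from Python than PHP, the sketch below shows roughly what an authenticated request for engrXiv preprints might look like. It is not the PHP script's code: the endpoint https://api.osf.io/v2/preprints/, the filter[provider] query parameter, the provider id engrxiv, and the function name build_preprint_request are all assumptions for illustration, so check them against the OSF API documentation before relying on them. The request is only built here, not sent.

```python
import urllib.parse
import urllib.request

# Assumption: OSF API v2 preprints endpoint; verify against the OSF API docs.
API_BASE = "https://api.osf.io/v2/preprints/"


def build_preprint_request(osf_token, provider="engrxiv", page=1):
    """Build (but do not send) an authenticated GET request for one page of preprints.

    `provider` and the `filter[provider]` parameter are assumptions, not taken
    from the PHP script in this project.
    """
    query = urllib.parse.urlencode({"filter[provider]": provider, "page": page})
    request = urllib.request.Request(f"{API_BASE}?{query}")
    # The token is sent as a Bearer token in the Authorization header.
    request.add_header("Authorization", f"Bearer {osf_token}")
    return request


if __name__ == "__main__":
    # Placeholder token; replace with your own OSF authorization token.
    req = build_preprint_request("YOUR_TOKEN_HERE")
    print(req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the actual GET; keeping the build step separate makes it easy to inspect the URL and headers first.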
The Jupyter notebook is just a small bit of Python to plot the results that are stored in the CSV file generated by the PHP script. The current data file used in the notebook is hosted here: https://osf.io/ns9yr/
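As a rough illustration of the kind of aggregation such a notebook could perform before plotting, the sketch below counts preprints per month from a CSV file. The column name date_published and the function name count_by_month are hypothetical, since the actual columns of the data file are not described here; the dates are assumed to be ISO 8601 strings. Plotting is left out so the example needs only the standard library.

```python
import csv
import io
from collections import Counter


def count_by_month(csv_file, date_column="date_published"):
    """Count rows per YYYY-MM, reading ISO 8601 dates from `date_column`.

    `date_column` is a hypothetical column name; adjust it to the real CSV.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        stamp = (row.get(date_column) or "").strip()
        # Rows with a missing or too-short date are skipped.
        if len(stamp) >= 7:
            counts[stamp[:7]] += 1
    # Sort by month so the result is ready to plot in chronological order.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up sample rows, only to show the expected input shape.
    sample = io.StringIO(
        "guid,date_published\n"
        "aaaaa,2017-01-05T10:00:00\n"
        "bbbbb,2017-02-01T00:00:00\n"
    )
    print(count_by_month(sample))
```

The returned dictionary maps each month to a count, which can be passed to any plotting library as x and y values.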
The bash script download_engrxiv.sh uses the engrxiv-papers.csv file, which is created by the PHP script; an occasionally updated version is available on OSF. The CSV file is read using csvtool, and the primary file for each preprint is then downloaded using wget. Each file is saved as GUID.pdf.
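The same read-then-download flow can be sketched in Python for systems without csvtool or wget. This is not a port of download_engrxiv.sh: the column names guid and download_url and the function names plan_downloads and download_all are hypothetical, because the layout of engrxiv-papers.csv is not described here. Only the GUID.pdf naming rule is taken from the description above.

```python
import csv
import io
import urllib.request
from pathlib import Path


def plan_downloads(csv_file, out_dir=".", guid_column="guid", url_column="download_url"):
    """Return (url, target_path) pairs, one per CSV row with both fields set.

    The column names are hypothetical; adjust them to the real CSV header.
    """
    jobs = []
    for row in csv.DictReader(csv_file):
        guid = (row.get(guid_column) or "").strip()
        url = (row.get(url_column) or "").strip()
        if guid and url:
            # Each file is saved as GUID.pdf.
            jobs.append((url, Path(out_dir) / f"{guid}.pdf"))
    return jobs


def download_all(jobs):
    """Fetch every planned file; this performs real network requests."""
    for url, target in jobs:
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(target))


if __name__ == "__main__":
    # Made-up row, only to show the planning step without touching the network.
    sample = io.StringIO("guid,download_url\nabc12,https://example.org/abc12\n")
    for url, target in plan_downloads(sample, out_dir="pdfs"):
        print(url, "->", target)
```

Splitting planning from downloading lets you review the list of target files before any request is made.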