GitHub - nwehner/engrXiv-print-count: Updating preprint server source


engrXiv-print-count

This repository holds a pair of files used to calculate statistics on the documents posted on Engineering Archive (engrXiv). The data is collected using the PHP script (modified from here). The Jupyter notebook here is used on: https://blog.engrxiv.org/stats/

To use the PHP script

The script uses the OSF API (https://developer.osf.io/) to gather a list of the papers in engrXiv, from which the statistics are then calculated.

You will need an Authorization Token (https://developer.osf.io/#tag/Authentication) to make requests.

We use the PHP Curl Class library to handle our GET requests: https://github.com/php-curl-class/php-curl-class

You can pass the token to the script by appending -p$osf_token to the end of your php command, where $osf_token is your OSF authorization token.
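For readers who want to see the shape of the request outside PHP, here is a minimal Python sketch of the same idea: page through the OSF preprint listing with a token. It is not the repository's script. The endpoint URL, the `filter[provider]=engrxiv` query, the Bearer header, and the `links.next` pagination field are assumptions drawn from the public OSF API and should be checked against https://developer.osf.io/. The names `FIRST_PAGE`, `fetch_json` and `iter_preprints` are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint and filter; verify against https://developer.osf.io/
FIRST_PAGE = "https://api.osf.io/v2/preprints/?filter[provider]=engrxiv"


def fetch_json(url, token):
    """GET one page of results, sending the token as a Bearer credential (assumed scheme)."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_preprints(first_url, fetch_page):
    """Yield every record in "data", following links.next until it is empty."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("data", [])
        url = (page.get("links") or {}).get("next")
```

With a real token, `iter_preprints(FIRST_PAGE, lambda url: fetch_json(url, token))` would walk the whole listing. Passing the page fetcher in as an argument keeps the pagination logic testable without network access.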

The Jupyter Notebook

The Jupyter notebook is just a small bit of Python to plot the results that are stored in the CSV file generated by the PHP script. The current data file used in the notebook is hosted here: https://osf.io/ns9yr/
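As an illustration of the kind of processing such a notebook might do, the sketch below counts preprints per month from CSV text and draws a bar chart. It is not the notebook's actual code. The column name `date_published` and the ISO 8601 timestamp format are assumptions about the CSV layout, and `monthly_counts` and `plot_counts` are hypothetical names. Plotting needs matplotlib, which is imported only when `plot_counts` is called.

```python
import csv
import io
from collections import Counter


def monthly_counts(csv_text, date_column="date_published"):
    """Count rows per YYYY-MM, reading ISO 8601 timestamps from date_column (assumed name)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[date_column][:7] for row in reader if row.get(date_column))
    return dict(sorted(counts.items()))


def plot_counts(counts):
    """Draw a bar chart of the monthly counts (requires matplotlib)."""
    import matplotlib.pyplot as plt

    plt.bar(list(counts), list(counts.values()))
    plt.xticks(rotation=90)
    plt.ylabel("Preprints posted")
    plt.tight_layout()
    plt.show()
```

Calling `plot_counts(monthly_counts(open("engrxiv-papers.csv").read()))` would chart a local copy of the data file, provided the assumed column exists.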


Download all engrXiv files

The bash script download_engrxiv.sh uses the engrxiv-papers.csv file, which is created by the PHP script. An occasionally updated version of this file is available on OSF. The script reads the CSV file using csvtool and then downloads the primary file for each preprint using wget. Each file is saved as GUID.pdf.
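The same read-then-download loop can be sketched in Python for readers without csvtool or wget. This is a stand-in for the bash script, not a translation of it. The column names `guid` and `download_url` are assumptions about engrxiv-papers.csv, and `plan_downloads` and `download_all` are hypothetical names.

```python
import csv
import io
import urllib.request
from pathlib import Path


def plan_downloads(csv_text, guid_column="guid", url_column="download_url"):
    """Return (url, filename) pairs, naming each file GUID.pdf. Column names are assumed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row[url_column], f"{row[guid_column]}.pdf")
        for row in reader
        if row.get(url_column)  # skip rows with no file URL
    ]


def download_all(csv_path, out_dir="."):
    """Fetch every planned file into out_dir, creating the directory if needed."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    csv_text = Path(csv_path).read_text(encoding="utf-8")
    for url, name in plan_downloads(csv_text):
        urllib.request.urlretrieve(url, target / name)
```

Separating the planning step from the network step makes it possible to inspect the list of files before downloading anything, for example with `print(plan_downloads(Path("engrxiv-papers.csv").read_text()))`.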
