NOTE: This is a user-maintained extension of the pymed project which was archived in 2020. Some bugs in pymed
are fixed here. This package can be installed via pip install pymed-paperscraper
since I forked it to support paperscraper
.
PyMed is a Python library that provides access to PubMed through the PubMed API.
The PubMed API is not very well documented and querying it in a performant way is too complicated and time consuming for researchers. This wrapper provides access to the API in a consistent, readable and performant way.
This library takes care of the following for you:
- Querying the PubMed database (with the standard PubMed query language)
- Batching of requests for better performance
- Parsing and cleaning of the retrieved articles
For full (working) examples have a look at the examples/
folder in this repository. In essence you only need to import the PubMed
class, instantiate it, and use it to query:
from pymed_paperscraper import PubMed
pubmed = PubMed(tool="MyTool", email="my@email.address")
results = pubmed.query("Some query", max_results=500)
Bugfixes compared to archived pymed
:
- Article IDs are correctly extracted
pymed#22
- Automatic retries if API is unresponsive/overloaded. Support for
max_tries
inPubMed
class.
The original documentation of the PubMed API can be found here: PubMed Central. PubMed Central kindly requests you to:
- Do not make concurrent requests, even at off-peak times; and
- Include two parameters that help to identify your service or application to our servers
- tool should be the name of the application, as a string value with no internal spaces, and
- email should be the e-mail address of the maintainer of the tool, and should be a valid e-mail address.
If you use pymed_paperscraper
in your work, please cite:
(Citation follows)