Script update-metrics using service scopus does not update all publications · Issue #511 · 4Science/DSpace · GitHub

Script update-metrics using service scopus does not update all publications #511

Open
jorgeltd opened this issue Apr 28, 2025 · 0 comments
jorgeltd commented Apr 28, 2025

Describe the bug
DSpace-CRIS Version: 2024.02.00

Issue Description
Running the script update-metrics --service scopus does not add metrics for all publications inside CRIS.
In my instance, with 866 publications having either a DOI or a Scopus ID, only 442 publications get their metrics updated.

Steps to Reproduce

  1. Add a considerable number of publications (e.g., 100) with the metadata dc.identifier.doi or dc.identifier.scopus, corresponding to publications indexed in Scopus. Identifiers should be unique and each document should have a citation count greater than 0.
  2. Run the script update-metrics --service scopus.

Expected Behavior
All publications with a valid identifier should have their metrics updated, respecting the --limit parameter of the script.

Related Work
Similarly to issue #508, there appears to be a problem with how the item iterator interacts with the committing of results.
In the function updateMetric of UpdateScopusMetrics.java, there is a while block starting at line 88 whose final action is to commit the metrics obtained for the current batch of items.
This commit seems to alter the item iterator, so that fewer items are processed than should be updated.

    @Override
    public long updateMetric(Context context, Iterator<Item> itemIterator, String param) {
        long updatedItems = 0;
        long foundItems = 0;
        long apiCalls = 0;
        logsCache = new ArrayList<>();
        try {
            while (itemIterator.hasNext()) {
                Map<String, Item> queryMap = new HashMap<>();
                List<Item> itemList = new ArrayList<>();
                for (int i = 0; i < fetchSize && itemIterator.hasNext(); i++) {
                    Item item = itemIterator.next();
                    logAndCache("Adding item with uuid: " + item.getID());
                    setLastImportMetadataValue(context, item);
                    itemList.add(item);
                }
                foundItems += itemList.size();
                String id = this.generateQuery(queryMap, itemList);
                logAndCache("Getting scopus metrics for " + id);
                updatedItems +=
                    scopusProvider.getScopusList(this.generateQuery(queryMap, itemList))
                        .stream()
                        .filter(Objects::nonNull)
                        .map(scopusMetric -> this.updateScopusMetrics(
                                context,
                                this.findItem(queryMap, scopusMetric),
                                scopusMetric
                            )
                        )
                        .filter(BooleanUtils::isTrue)
                        .count();
                apiCalls++;
                context.commit();
            }
        } catch (SQLException e) {
            logAndCacheError("Error while updating scopus' metrics", e);
        } finally {
            logAndCache("Found and fetched " + foundItems + " with " + apiCalls + " api calls!");
        }
        logsCache.addAll(scopusProvider.getLogs());
        return updatedItems;
    }
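
The exact mechanism by which the commit affects the iterator is not established in this report. As an illustration only, the following toy model shows one way a commit inside the loop could shrink an iteration: it assumes the iterator pages through a live query by offset, and that each commit removes the already processed items from that query's results. Every name here (PagedCommitSketch, ToyStore, PagedIterator, process) is hypothetical and none of it is DSpace code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Toy model only: none of these types exist in DSpace. */
class PagedCommitSketch {

    /** Hypothetical store whose "pending" query only reflects committed changes. */
    static class ToyStore {
        private final boolean[] done;                        // committed state per item id
        private final List<Integer> uncommitted = new ArrayList<>();

        ToyStore(int size) { done = new boolean[size]; }

        void markDone(int id) { uncommitted.add(id); }       // buffered until commit()

        void commit() {
            for (int id : uncommitted) done[id] = true;
            uncommitted.clear();
        }

        /** Live query: ids still pending, skipping the first `offset` matches. */
        List<Integer> pendingPage(int offset, int pageSize) {
            List<Integer> page = new ArrayList<>();
            int matchIndex = 0;
            for (int id = 0; id < done.length && page.size() < pageSize; id++) {
                if (done[id]) continue;
                if (matchIndex++ >= offset) page.add(id);
            }
            return page;
        }
    }

    /** Iterator that re-runs the live query for every page, advancing an offset. */
    static class PagedIterator implements Iterator<Integer> {
        private final ToyStore store;
        private final int pageSize;
        private List<Integer> page = new ArrayList<>();
        private int offset = 0;
        private int pos = 0;

        PagedIterator(ToyStore store, int pageSize) {
            this.store = store;
            this.pageSize = pageSize;
        }

        @Override
        public boolean hasNext() {
            if (pos >= page.size()) {                        // current page exhausted
                page = store.pendingPage(offset, pageSize);
                offset += page.size();
                pos = 0;
            }
            return pos < page.size();
        }

        @Override
        public Integer next() {
            if (!hasNext()) throw new NoSuchElementException();
            return page.get(pos++);
        }
    }

    /** Same loop shape as the quoted method; returns how many items were visited. */
    static int process(int total, int batchSize, boolean commitInsideLoop) {
        ToyStore store = new ToyStore(total);
        Iterator<Integer> items = new PagedIterator(store, batchSize);
        int processed = 0;
        while (items.hasNext()) {
            for (int i = 0; i < batchSize && items.hasNext(); i++) {
                store.markDone(items.next());
                processed++;
            }
            if (commitInsideLoop) store.commit();            // shifts the live query results
        }
        store.commit();                                      // final commit in both variants
        return processed;
    }

    public static void main(String[] args) {
        System.out.println("commit inside loop: " + process(20, 5, true));
        System.out.println("commit after loop:  " + process(20, 5, false));
    }
}
```

In this toy model, 20 items in batches of 5 yield 10 visited items when committing inside the loop and 20 when committing once at the end, because each commit shifts the remaining results under a fixed offset. The roughly one-half ratio resembles the 442 of 866 reported above, but that resemblance alone does not confirm that DSpace-CRIS behaves this way.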

I moved the commit call outside the while block on my instance, and that seems to fix the problem.
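
The shape of that workaround can be sketched in isolation as follows. This is a minimal, generic sketch and not the actual patch: BatchRunner, runThenCommitOnce, batchHandler and commitAction are hypothetical names, and commitAction stands in for context.commit() (which in the real method can throw SQLException, something a plain Runnable does not model).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Sketch only: BatchRunner and its parameters are hypothetical, not DSpace API. */
class BatchRunner {

    /**
     * Drains the iterator in batches of at most fetchSize items, passes each
     * batch to batchHandler, and runs commitAction exactly once after the
     * iterator is exhausted. Returns the number of items taken from the iterator.
     */
    static <T> long runThenCommitOnce(Iterator<T> items,
                                      int fetchSize,
                                      Consumer<List<T>> batchHandler,
                                      Runnable commitAction) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        long found = 0;
        while (items.hasNext()) {
            List<T> batch = new ArrayList<>();
            for (int i = 0; i < fetchSize && items.hasNext(); i++) {
                batch.add(items.next());
            }
            found += batch.size();
            batchHandler.accept(batch);      // e.g. one API call per batch
        }
        commitAction.run();                  // single commit, after the loop
        return found;
    }

    public static void main(String[] args) {
        long found = runThenCommitOnce(
            List.of("a", "b", "c", "d", "e", "f", "g").iterator(),
            3,
            batch -> System.out.println("batch of " + batch.size()),
            () -> System.out.println("commit"));
        System.out.println("found " + found);
    }
}
```

The trade-off is that all changes are held in one transaction until the end, so a failure late in the run leaves nothing committed, and a large run keeps more state pending at once.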

jorgeltd added the bug label Apr 28, 2025