This repository was archived by the owner on Nov 1, 2023. It is now read-only.
feature: unarchive files, add support for online files #45
Merged
Codecov Report
@@            Coverage Diff             @@
##             main      #45      +/-   ##
==========================================
+ Coverage   86.10%   87.40%   +1.30%
==========================================
  Files           8       10       +2
  Lines         511      548      +37
==========================================
+ Hits          440      479      +39
+ Misses         71       69       -2
pabloarosado approved these changes on Oct 17, 2022
Thank you for the detailed PR! A few minor concerns:
- Now that we have `datautils.io.json` and `datautils.io.archive` and others, we may end up not knowing where things actually are when attempting to load a utils function (like `load_json`). So maybe we should import all utils functions in `datautils.io.__init__` (or even `datautils.__init__`, given that the risk of duplicate function names is low) so that they are always easily accessible?
- I think your initial concern was to avoid having interdependencies among modules. But I don't see how this has changed: now `io.archive` imports `decorators`, and `decorators` imports `web`. In any case, I don't think we should worry about this.

Feel free to merge!
This was referenced Oct 17, 2022
Overview
Additions:
- New module `owid.datautils.decorators`.
- `decompress_file` extended so it can handle `.tar.gz` and `.tar.bz2`. Improves changes from PR #42.
New module `owid.datautils.decorators`
This PR creates a new module, `owid.datautils.decorators`. In general, decorators can come in handy to enhance the capabilities of functions. You can read more about them in this guide.

The first decorator is `enable_file_download`, which adds the ability to read or process a file directly from a URL.

Suppose you have the following function, which reads a LOCAL file and processes it.

Now, imagine that you want this function to be able to download a remote file (hosted at a certain URL) and then apply the same processing. You can now do this with the decorator, by adding the following on top of the function declaration:
@enable_file_download("path")
In this case, the decorator argument is the name of the argument of `process_file` that contains the input path to the file (in this case, a URL).

So, all together:
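As a minimal sketch of how the pieces described above could fit together (this is not the actual `owid.datautils` implementation): the body of `process_file`, the temporary-file handling, and the set of URL schemes treated as remote are all assumptions made for illustration. Only the decorator name and the `"path"` argument come from the description.

```python
import functools
import inspect
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen

# Assumption: these schemes are treated as "remote"; anything else is a local path.
REMOTE_SCHEMES = ("http", "https", "ftp", "file")


def enable_file_download(path_arg_name):
    """Hypothetical sketch: let a function that reads a local file also accept a URL."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Find the value passed for the path argument, positional or keyword.
            bound = signature.bind(*args, **kwargs)
            path = str(bound.arguments[path_arg_name])
            if urlparse(path).scheme not in REMOTE_SCHEMES:
                # Local path: call the function unchanged.
                return func(*args, **kwargs)
            with tempfile.TemporaryDirectory() as tmp_dir:
                # Download to a temporary file, then hand its path to the function.
                local_path = Path(tmp_dir) / "download"
                with urlopen(path) as response:
                    local_path.write_bytes(response.read())
                bound.arguments[path_arg_name] = str(local_path)
                return func(*bound.args, **bound.kwargs)

        return wrapper

    return decorator


@enable_file_download("path")
def process_file(path):
    # Hypothetical processing step: count the lines in the file.
    with open(path) as f:
        return len(f.read().splitlines())
```

With this sketch, `process_file` behaves the same whether `path` is a local file path or a URL; the download only happens in the second case, and the temporary copy is removed once the function returns.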
`.tar.gz` and `.tar.bz2`
Now tar files should also be supported. I have used the standard library package `tarfile`. Tests have been added, too.
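For illustration, a minimal sketch of tar extraction with the standard library `tarfile` package. The helper name `decompress_tar` and its signature are hypothetical and do not reflect the actual `decompress_file` code in this PR.

```python
import tarfile
from pathlib import Path


def decompress_tar(input_file, output_folder):
    """Hypothetical helper: extract a .tar.gz or .tar.bz2 archive into a folder."""
    input_file = Path(input_file)
    output_folder = Path(output_folder)

    if not tarfile.is_tarfile(input_file):
        raise ValueError(f"Not a tar archive: {input_file}")

    output_folder.mkdir(parents=True, exist_ok=True)
    # Mode "r:*" lets tarfile detect gzip or bz2 compression transparently.
    with tarfile.open(input_file, "r:*") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; "data" rejects
            # unsafe members such as absolute paths or links outside the target.
            archive.extractall(output_folder, filter="data")
        else:
            archive.extractall(output_folder)
```

Opening with `"r:*"` means one code path covers both compression formats, so the file extension does not need to be inspected.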