Add HdfsFlagTarget #2559
update current fork of Luigi
Could you merge master for the flake8 fix? (It was recently merged.)
test/contrib/hdfs_test.py
Outdated
@@ -386,6 +386,12 @@ def create_target(self, format=None):
        target.remove(skip_trash=True)
        return target

    def create_flag_target(self, format=None):
The method name here is confusing. You aren't actually creating the flag target, just the class object needed to later create the flag.
I'm not sure what you mean here. This method does create an HdfsFlagTarget, and the helper method is modeled on the similar method create_target at line 385.
But in any case I don't think the helper method is necessary (especially if it is only used in that one test). It can be removed and the FlagTarget created in the test.
* upstream-master:
  Make Worker parameter task_process_context an OptionalParameter (spotify#2468) (spotify#2574)
  Version 2.8.0
  Implement configurable CORS.
  Add HdfsFlagTarget (spotify#2559)
  Fix HdfsAtomicWriteDirPipe.close() when using snakebite and the file do not exist. (spotify#2549)
  Small fix to logging in contrib/ecs.py (spotify#2556)
  [ImgBot] optimizes images (spotify#2555)
  Add CopyToTable task for MySQL (spotify#2553)
  Make capture_output non-positional in ExternalProgramTask (spotify#2547)
  Add Movio to list of Luigi users (spotify#2551)
  Interpolate environment variables in .cfg config files (spotify#2527)
  Fix ReadTheDocs build (spotify#2546)
Description
Added an HdfsFlagTarget modeled on the S3FlagTarget class:
https://github.com/spotify/luigi/blob/master/luigi/contrib/s3.py#L680

Motivation and Context
The problem this solves is the same as the one S3FlagTarget solves. We'll have Spark jobs output to directories and use the _SUCCESS semaphore to check for job completion.

Have you tested this? If so, how?
This is a fix we've used for a pipeline, and it runs 😸. It is also nearly identical to the S3FlagTarget, which also works. I included unit tests.
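
For readers unfamiliar with the flag-file pattern mentioned under Motivation and Context, here is a rough sketch of the idea. It is not the code in this PR: `LocalFlagTarget` and `demo` are hypothetical names, and the local filesystem stands in for HDFS so the snippet runs without a cluster. The only point it illustrates is that an output directory is treated as complete once the `_SUCCESS` file appears in it, not merely when the directory exists.

```python
import tempfile
from pathlib import Path


class LocalFlagTarget:
    """Hypothetical stand-in for a flag target (not Luigi's implementation).

    The directory is considered complete only when the flag file is present.
    """

    def __init__(self, path, flag="_SUCCESS"):
        self.path = Path(path)
        self.flag = flag

    def exists(self):
        # Check for the semaphore file rather than for the directory itself.
        return (self.path / self.flag).is_file()


def demo():
    """Return the target's state before and after the flag file is written."""
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp) / "job_output"
        out_dir.mkdir()
        # A job that is still running may already have written partial output.
        (out_dir / "part-00000").write_text("partial data\n")

        target = LocalFlagTarget(out_dir)
        before = target.exists()

        # The job writes the flag file as its final step.
        (out_dir / "_SUCCESS").touch()
        after = target.exists()
        return before, after


if __name__ == "__main__":
    print(demo())
```

Checking the flag instead of the directory matters because a job can create the output directory and write part files long before it finishes, so a plain directory-existence check could report completion too early.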