TEZ-4604: tez-mapreduce does not delete files under staging directory #395
base: master
@@ -623,6 +623,7 @@ public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
try {
dagAMConf.set(TezConfiguration.TEZ_AM_STAGING_DIR,
jobSubmitDir);
dagAMConf.setBoolean(TezConfiguration.TEZ_AM_STAGING_BASE_DIR_CLEANUP, true);
YARNRunner is glue code between a MapReduce job and Tez; it implements ClientProtocol, so the client code that calls YARNRunner lives in Apache Hadoop. ClientProtocol doesn't have an API to declare that a specific job has completed, so resolving this issue on the client side would require adding new APIs to Apache Hadoop. That's why I added a new param and handled the issue on the Apache Tez side.
I'm not confident that this approach is the best. I'd appreciate it if someone could give me a better idea.
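To make the idea concrete, here is a minimal, self-contained sketch of flag-gated cleanup: the submitter sets a boolean in the configuration, and the side that knows when the job has finished deletes the staging directory only if that flag is set. Everything in it is an assumption for illustration: the class `StagingCleanupSketch`, the method `cleanupIfRequested`, and the key string are hypothetical, a plain `Map` stands in for the Hadoop `Configuration`, and `java.nio.file` stands in for the Hadoop `FileSystem` API. It is not the code in this PR.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StagingCleanupSketch {
    // Hypothetical key, for illustration only. The PR uses the constant
    // TezConfiguration.TEZ_AM_STAGING_BASE_DIR_CLEANUP; its string value is not shown here.
    static final String CLEANUP_FLAG = "sketch.am.staging.base-dir.cleanup";

    /**
     * Recursively deletes stagingDir when the flag is "true".
     * Returns true only if a deletion was performed.
     */
    static boolean cleanupIfRequested(Map<String, String> conf, Path stagingDir)
            throws IOException {
        boolean enabled = Boolean.parseBoolean(conf.getOrDefault(CLEANUP_FLAG, "false"));
        if (!enabled || !Files.exists(stagingDir)) {
            return false;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(stagingDir)) {
            // Reverse order puts children before their parent directories.
            ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path staging = Files.createTempDirectory("staging");
        Files.createFile(staging.resolve("job.xml"));

        // Submitter side: opt in to cleanup, analogous to the setBoolean call in the diff.
        Map<String, String> conf = new HashMap<>();
        conf.put(CLEANUP_FLAG, "true");

        // Job-completion side: honor the flag.
        boolean deleted = cleanupIfRequested(conf, staging);
        System.out.println("deleted=" + deleted + ", exists=" + Files.exists(staging));
    }
}
```

The design point is that the submitter never needs a "job completed" callback: it only records its intent in the configuration, and the component that observes completion acts on it.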
💔 -1 overall
This message was automatically generated.
https://issues.apache.org/jira/browse/TEZ-4604