-
Notifications
You must be signed in to change notification settings - Fork 9.1k
YARN-11358. [Federation] Add FederationInterceptor#allow-partial-result config. #5056
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💔 -1 overall
This message was automatically generated. |
We have something similar with RBF. |
Thanks a lot for your suggestion, I refer to RBF's code, I named this optional parameter |
🎊 +1 overall
This message was automatically generated. |
@goiri Can you help review this pr? Thank you very much! |
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
Outdated
Show resolved
Hide resolved
@@ -2024,9 +2030,15 @@ private <R> Map<SubClusterInfo, R> invokeConcurrent(Collection<SubClusterInfo> c | |||
Map<SubClusterInfo, R> results = new HashMap<>(); | |||
|
|||
// Send the requests in parallel | |||
CompletionService<R> compSvc = new ExecutorCompletionService<>(this.threadpool); | |||
CompletionService<Pair<R, Exception>> compSvc = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix it.
if (response != null) { | ||
results.put(clusterId, response); | ||
} | ||
Exception exception = pair.getRight(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the API seems to be getValue()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix it.
|
||
Configuration conf = this.getConf(); | ||
// Enable strict mode | ||
conf.setBoolean(YarnConfiguration.ROUTER_INTERCEPTOR_ALLOW_PARTIAL_RESULT_ENABLED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix it.
@@ -50,5 +50,4 @@ protected void registerBadSubCluster(SubClusterId badSC) { | |||
badSC); | |||
interceptor.setRunning(false); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix it.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@goiri Can you help review this pr again? Thank you very much! |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
setupCluster(Arrays.asList(bad2)); | ||
|
||
LambdaTestUtils.intercept(YarnRuntimeException.class, "RM is stopped", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep the old test and add a new one with the partial enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for reviewing the code, I will add new unit tests.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
...ter/src/main/java/org/apache/hadoop/yarn/server/router/webapp/FederationInterceptorREST.java
Show resolved
Hide resolved
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
Outdated
Show resolved
Hide resolved
/** Router Interceptor Allow Partial Result Enable. **/ | ||
public static final String ROUTER_INTERCEPTOR_ALLOW_PARTIAL_RESULT_ENABLED = | ||
ROUTER_PREFIX + "interceptor.allow-partial-result.enable"; | ||
public static final boolean DEFAULT_ROUTER_INTERCEPTOR_ALLOW_PARTIAL_RESULT_ENABLED = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the other comments, I think this needs to be false
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@@ -2117,7 +2128,7 @@ private <R> Map<SubClusterInfo, R> invokeConcurrent(Collection<SubClusterInfo> c | |||
throw new YarnRuntimeException(e.getCause().getMessage(), e); | |||
} | |||
}); | |||
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have a warning here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for helping to review the code, I will fix it.
🎊 +1 overall
This message was automatically generated. |
@goiri Thank you very much for helping to review the code! |
JIRA: YARN-11358. [Federation] Add FederationInterceptor#allow-partial-result config.
When we use Federation, we will have multiple sub-clusters, and some interfaces need to call all clusters to get results,
such as getNodes, getNodeToLabels, etc.
Q1 : if the sub-cluster is unavailable when we call the interface, should the interface return the result?
The current code is to return the result. If the result is returned, the result information is inaccurate;
if the result is not returned, it is not user-friendly.
Plan to add a new configuration
yarn.router.interceptor.allow-partial-result.enable
If
true
, we will ignore the exception of some subClusters during the calling process, and return result.If
false
, if an exception occurs in a subCluster during the calling process, an exception will be thrown directly.