Updates Spark configuration heuristic severity calculations #229
Conversation
@@ -128,18 +135,29 @@ object ConfigurationHeuristic {
  lazy val serializer: Option[String] = getProperty(SPARK_SERIALIZER_KEY)

  lazy val serializerSeverity: Severity = serializer match {
-   case None => Severity.NONE
+   case None => Severity.MODERATE
    case Some(`serializerIfNonNullRecommendation`) => Severity.NONE
I know this is not part of your PR, but it looks suspicious nonetheless: IIUC the third branch is unreachable because the pattern Some(`foo`) is equivalent to Some(_) modulo variable name.
scala> Some(42) match {
| case Some(`foo`) => "first"
| case Some(_) => "second"
| }
res1: String = first
I don't know if that's true; otherwise the unit tests would fail. I think it all depends on the value of foo. I tried the code locally and got the following outputs:
scala> val foo = 50
foo: Int = 50
scala> Some(42) match {
| case Some(`foo`) => "first"
| case Some(_) => "second"
| }
res1: String = second
scala> val foo = 42
foo: Int = 42
scala> Some(42) match {
| case Some(`foo`) => "first"
| case Some(_) => "second"
| }
res2: String = first
I am going to leave this code unchanged.
Oh, evil Scala...
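(Editorial aside for later readers: the backticked name is Scala's stable-identifier pattern, so that branch matches only when the wrapped value equals the val of that name, which is why the third branch is reachable. A minimal standalone sketch of the semantics, with string severities standing in for the heuristic's Severity type; this is not the dr-elephant code itself:)
// Standalone illustration: the backticks make this a stable-identifier pattern,
// so the middle branch matches only when the value equals the val in scope.
object StableIdentifierSketch extends App {
  val serializerIfNonNullRecommendation = "org.apache.spark.serializer.KryoSerializer"

  def serializerSeverity(serializer: Option[String]): String = serializer match {
    case None => "MODERATE"                                   // property not set
    case Some(`serializerIfNonNullRecommendation`) => "NONE"  // exactly the recommended value
    case Some(_) => "MODERATE"                                // set to something else
  }

  println(serializerSeverity(None))                                                // MODERATE
  println(serializerSeverity(Some("org.apache.spark.serializer.KryoSerializer")))  // NONE
  println(serializerSeverity(Some("org.apache.spark.serializer.JavaSerializer")))  // MODERATE
}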
    case Some(_) => DEFAULT_SERIALIZER_IF_NON_NULL_SEVERITY_IF_RECOMMENDATION_UNMET
  }

  lazy val isDynamicAllocationEnabled: Option[Boolean] = getProperty(SPARK_DYNAMIC_ALLOCATION_ENABLED).map(_.toBoolean)
Is there a reason for not applying defaults at the declaration site? I might be wrong, but it looks like in the following match statement you treat None as false for both properties.
Here's a rewritten match:
lazy val shuffleAndDynamicAllocationSeverity = (isDynamicAllocationEnabled, isShuffleServiceEnabled) match {
case (_, Some(true)) | (None, Some(true)) => Severity.NONE
case (Some(true), Some(false)) | (Some(true), None) => Severity.SEVERE
case (Some(false), Some(false)) => Severity.MODERATE
case (Some(false), None) => Severity.MODERATE
case (None, Some(false)) => Severity.MODERATE
case (None, None) => Severity.MODERATE
}
The first case is essentially
case (_, true) => Severity.None
the next bunch is
case (true, false) => Severity.SEVERE
and finally
case (false, false) => Severity.MODERATE
Thanks for the suggestion of using a default value instead of None. It simplified the pattern matching.
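(A hedged sketch of what applying the defaults at the declaration site could look like. The property keys are the standard Spark ones, but the lookup, constant names, and severity values are simplified placeholders, so this is not the code that landed in the PR:)
// Hypothetical simplification sketched from the discussion above: default both
// flags to false when the property is absent, so the match works on plain Booleans.
object ShuffleAndDynamicAllocationSketch {
  // Stand-in for the heuristic's property lookup.
  def getProperty(key: String): Option[String] = sys.props.get(key)

  lazy val isDynamicAllocationEnabled: Boolean =
    getProperty("spark.dynamicAllocation.enabled").exists(_.toBoolean)
  lazy val isShuffleServiceEnabled: Boolean =
    getProperty("spark.shuffle.service.enabled").exists(_.toBoolean)

  // Dynamic allocation relies on the external shuffle service to keep shuffle
  // files available when executors are removed, hence the harsher rating below.
  lazy val shuffleAndDynamicAllocationSeverity: String =
    (isDynamicAllocationEnabled, isShuffleServiceEnabled) match {
      case (_, true)      => "NONE"      // shuffle service enabled
      case (true, false)  => "SEVERE"    // dynamic allocation without the shuffle service
      case (false, false) => "MODERATE"  // suggest enabling the shuffle service
    }
}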
  lazy val isDynamicAllocationEnabled: Option[Boolean] = getProperty(SPARK_DYNAMIC_ALLOCATION_ENABLED).map(_.toBoolean)
  lazy val isShuffleServiceEnabled: Option[Boolean] = getProperty(SPARK_SHUFFLE_SERVICE_ENABLED).map(_.toBoolean)

  lazy val shuffleAndDynamicAllocationSeverity = (isDynamicAllocationEnabled, isShuffleServiceEnabled) match {
I think it could be useful to have a brief comment here outlining the rationale. The reader could of course trace it back to that commit, but a comment is more reader-friendly in my opinion.
Done.
I don't have an opinion on dynamic allocation/shuffle service, but the Kryo heuristic looks very useful 👍.
Thanks @superbobry for the review.
@@ -77,6 +74,14 @@ class ConfigurationHeuristic(private val heuristicConfigurationData: HeuristicCo
  new HeuristicResultDetails(
Add recommendations to the heuristic details. If multiple of them fail, we take the max severity, and the user has to fix one issue before finding out there is more to fix.
@shankar37 I did not understand this comment. Where should I add recommendations to the heuristic details?
For all the configs that we are checking, we should add a key-value pair detail whenever we find that a check does not yield NONE.
thanks
shankar
@shankar37 I am sorry, I still don't understand your comment. Can you please give an example?
Thank you.
@shankar37 Can you please provide an example here? I am not able to understand your comment. Thank you.
Sorry, missed your reply. An example would be
if (serializer check fails) {
  HeuristicResult.add(new HeuristicResultDetails(
    "Serializer",
    "KyroSerializer Not Enabled."))
}
if (dynamic allocation check fails) {
  HeuristicResult.add(new HeuristicResultDetails(
    "DynamicAllocation without Shuffle",
    "DynamicAllocation is enabled but."))
}
I don't understand why we need to have an if statement here; we don't have it in any other case. I am already adding details like the following:
new HeuristicResultDetails(
  SPARK_DYNAMIC_ALLOCATION_ENABLED,
  formatProperty(evaluator.isDynamicAllocationEnabled.map(_.toString))
),
new HeuristicResultDetails(
  SPARK_SHUFFLE_SERVICE_ENABLED,
  formatProperty(evaluator.isShuffleServiceEnabled.map(_.toString))
)
Here is the scenario. If I have a job that fails both the dynamic allocation check and the KryoSerializer check, you will take the max of the severities and show that. Let's say that is the dynamic allocation check. Once that is fixed, they will rerun the job and find it failing the KryoSerializer check now. I am asking to make it obvious to the user that they have multiple issues to fix within this one heuristic.
@shankar37 I think I got it. Thank you very much. I've updated the PR. Please have a look.
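(In case it helps later readers, one hedged way to implement the request is shown below. The signatures, message strings, and severity values are simplified placeholders; the real HeuristicResultDetails constructor and Severity type in dr-elephant differ:)
// Minimal sketch of the idea: emit a human-readable detail for every check whose
// severity is worse than NONE, so the user sees all problems in this heuristic at
// once rather than only the one with the maximum severity.
def recommendationDetails(serializerSeverity: String,
                          shuffleAndDynamicAllocationSeverity: String): Seq[(String, String)] = {
  val details = Seq.newBuilder[(String, String)]
  if (serializerSeverity != "NONE")
    details += ("Serializer" ->
      "KryoSerializer is not enabled; consider setting spark.serializer to org.apache.spark.serializer.KryoSerializer")
  if (shuffleAndDynamicAllocationSeverity != "NONE")
    details += ("Shuffle service" ->
      "Consider setting spark.shuffle.service.enabled to true, especially when dynamic allocation is enabled")
  details.result()
}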
+1 LGTM
Thanks @shkhrgpt.
As discussed in the document, this updates the Spark configuration heuristic. The main changes are based on the following logic:
Serializer: Suggest using the Kryo serializer because of its higher performance compared to the Java serializer. Moderate severity if Kryo is not used. The config key spark.serializer should be set to org.apache.spark.serializer.KryoSerializer.
Shuffle Service: If not set -> Moderate -> Suggest configuring the shuffle service so that jobs can survive an executor failure. spark.shuffle.service.enabled should be set to true.
If Spark dynamic allocation is on and no shuffle service is configured, then Severe severity.
Also updates unit tests.
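(For reference, a hedged example of a job configuration that the updated heuristic should rate as NONE on all three counts. These are the standard Spark keys; whether the settings suit a particular cluster is out of scope, and the snippet assumes spark-core is on the classpath:)
import org.apache.spark.SparkConf

// Example configuration satisfying the serializer, shuffle service, and
// dynamic allocation checks described above.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")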
@shankar37 @akshayrai @superbobry