Modify osu! standardised scoring to introduce a combo exponent by Zyfarok · Pull Request #24166 · ppy/osu · GitHub
Modify osu! standardised scoring to introduce a combo exponent #24166


Merged (20 commits) on Dec 12, 2023

Conversation

@Zyfarok (Contributor) commented Jul 9, 2023

This PR introduces a new scoring formula, similar to ScoreV2 but re-balanced with 3 changes:

  • A square-root is applied to the combos (in the ComboScore)
  • The Combo/Acc split is 50/50 (500k/500k)
  • The exponent on accuracy is reduced to 5
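Taken together, the three changes sketch out a total-score formula along these lines (a hedged Python sketch based on the bullet points above; the exact merged constants and the placement of the accuracy multiplier should be checked against the PR's final diff):

```python
def standardised_total(combo_progress, accuracy, accuracy_progress, bonus_portion):
    """Sketch of the proposed osu! standardised total score.

    combo_progress and accuracy_progress are normalised to [0, 1]; the
    500k/500k split and the accuracy exponent of 5 come from the bullet
    points above. The accuracy multiplier on the combo term is an
    assumption based on the later discussion in this thread.
    """
    combo_score = 500000 * accuracy * combo_progress
    acc_score = 500000 * accuracy ** 5 * accuracy_progress
    return combo_score + acc_score + bonus_portion

# A perfect play (SS, no bonus) reaches the 1M base score.
assert standardised_total(1.0, 1.0, 1.0, 0) == 1000000.0
```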

For players, this basically sums up to:

  • Reduced "choke" penalty (first few misses cost less), while keeping the impact of misses high at smaller combos
  • A bigger score range for high-misscount low-accuracy plays (200k in v2 maps to approximately 450k in this formula)

You can get an idea of what this gives in this spreadsheet, which compares with v1 and v2 on "worst case miss-spread" examples:
https://docs.google.com/spreadsheets/d/1g-hel7ywUdZFQGpaC5UVeURBYLuPhTS6rTnI1-ImoTI/edit#gid=0

The formula was tuned from the results of this survey:
https://docs.google.com/document/d/1Jzdwt3-QMQ_zBEB2wNzP9TX1ZusRusvj2kGdEMX4N3g/edit?tab=t.0

This PR also implements the conversion of stable's score v1 scores to the new formula, using some maths to best approximate the combos of a play from the data available.

(Note: this comment was edited to summarize this PR's changes and remove useless details)

@@ -293,7 +293,7 @@ protected sealed override void RevertResultInternal(JudgementResult result)

protected virtual double GetBonusScoreChange(JudgementResult result) => Judgement.ToNumericResult(result.Type);

-protected virtual double GetComboScoreChange(JudgementResult result) => Judgement.ToNumericResult(result.Type) * (1 + result.ComboAfterJudgement / 10d);
+protected virtual double GetComboScoreChange(JudgementResult result) => Judgement.ToNumericResult(result.Type) * Math.Pow(result.ComboAfterJudgement, 0.5);
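To see what this diff changes in practice, here is a small Python comparison of the old linear combo weight against the new square-root weight (a sketch of the two expressions only, not of the surrounding processor):

```python
import math

def combo_weight_old(combo_after_judgement):
    # previous default: weight grows linearly with combo
    return 1 + combo_after_judgement / 10

def combo_weight_new(combo_after_judgement):
    # proposed default: weight grows with the square root of combo
    return math.sqrt(combo_after_judgement)

# At combo 100 the weights are comparable (11 vs 10), but by combo 900
# the linear weight has grown to 91 while the sqrt weight is only 30,
# which is what flattens the reward for very long combos.
print(combo_weight_old(900), combo_weight_new(900))  # → 91.0 30.0
```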
Member:

Both of these functions get overridden by the ruleset-specific processors, except for GetComboScoreChange in OsuScoreProcessor. Is this intended for just osu!, or for all rulesets?

@Zyfarok (Contributor Author) Jul 10, 2023:

Good catch: I didn't know ComputeTotalScore was overridden in OsuScoreProcessor, so my bad.
The change to ComputeTotalScore was affecting the Scoring test scene though, which is why I assumed it was affecting osu!, but apparently that's because the scene uses the raw ScoreProcessor. (Guess I'll fix that test scene in another PR.)

The PR is still a draft. The changes were originally intended for osu! only, with the potential to benefit other game modes too. However, taking a deeper look at the code of the other game modes, they already have an alternative solution (logarithm of combo) and require different accuracy exponents. Thus I believe they do not need this change, but it could be a good idea to ask experts from those rulesets whether they think this new alternative would also work better for them.

One question this brings up: does it make sense to override ComputeTotalScore/GetComboScoreChange in OsuScoreProcessor? If yes, what should the default implementation in ScoreProcessor be, given that all other rulesets already override it? Wouldn't it make more sense to have ScoreProcessor be abstract, and thus ComputeTotalScore and GetComboScoreChange too?

Contributor:

The intention for it not being abstract is to make it easier for third party ruleset developers to get to a running state without having to worry about how score is implemented. So, in a similar vein, the implementation there(/here) should be something that leads to good values in a general case.
For example, ask the question of: if the taiko implementation were to use this implementation - would it behave better with the previous version or this new one?

OsuScoreProcessor is a bit of a weird one. I wouldn't be against removing the overrides there.

@Zyfarok (Contributor Author) Jul 15, 2023:

I just noticed your answer here.
The taiko, mania and catch scoring would definitely be better with square-root combos than with raw combo, as the former is closer to their current scoring implementations (sqrt(combo) is closer to log(combo) than raw combo is).
Thus we can indeed expect future game modes to benefit from this change too. But based on your argument, might it make more sense to have log(combo) as the default, since all other game modes use the logarithm?
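The claim that sqrt(combo) sits between raw combo and log(combo) is easy to check numerically (illustrative Python, not code from any ruleset):

```python
import math

# Growth factor of each weighting when combo increases 10x (from 100 to 1000):
raw_growth = 1000 / 100                          # 10.0
sqrt_growth = math.sqrt(1000) / math.sqrt(100)   # ~3.16
log_growth = math.log(1000) / math.log(100)      # 1.5

# sqrt grows much more slowly than raw combo, but still faster than log,
# so it sits between the two as stated above.
assert log_growth < sqrt_growth < raw_growth
```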

Member:

Honestly, if we're just looking to get things moving, then the base implementation should not be changed (and your change should be moved to OsuScoreProcessor so as not to break other rulesets).

But that said, I don't know how much this matters in such an early state. At worst it's going to cause differences in local user scores for users playing non-official rulesets.

@smoogipoo thoughts?

Member:

On second thought, maybe it's just fine to make this a breaking change.

@smoogipoo (Contributor) Aug 22, 2023:

I asked above: "would [taiko] behave better with the previous version or this new one?".

The response was "[taiko] scoring would definitely be better with the square-root combos".

Based on that I'm inclined to believe the new method would be better as a base implementation, if all of our rulesets would benefit from this.

@Zyfarok (Contributor Author) Aug 22, 2023:

Yes. Taiko, mania and catch override this anyway, and square-root combos are closer to what those three rulesets already use (log of combo plus a hard cap), so it seems safe to assume this is a better starting base for future rulesets.

What might make sense is to have taiko's implementation as the default, though, with OsuScoreProcessor overriding it, since I think mania and catch are closer to taiko's combo scaling. I suppose it's not worth spending too much time on that decision though?

@peppy (Member) commented Jul 12, 2023

I'm going to hold off taking action on this until after the next release (later this week). @smoogipoo, interested in your thoughts on rolling this out for user testing (a separate isolated build? a setting?), if at all.

I have a formula for legacy score conversion but it still needs to be implemented.

Keep in mind we need both legacy (score v1) to proposed-lazer and current-lazer to proposed-lazer conversions. It would probably be helpful if you show the formula here, just so we can confirm it's going to work.

@smoogipoo (Contributor) commented Jul 12, 2023

@peppy I'm not entirely sure how to test these changes out myself.

  • Do we allow scores to be submitted with it / to have leaderboards?
  • Do we want SV1 scores to be converted to this right away, as opposed to the current implementation?
  • If we don't want SV1 scores converted right away, do we want to potentially re-run the import a second time with this change (i.e. run it once with the current standardised scoring and then again later with this proposal)?
  • If we don't run the import a second time, then conversion may be sv1 -> sv2 -> sv3. Is there data loss involved there?
  • In any case, we need a process that converts from the current standardised scoring to this proposal.

@smoogipoo (Contributor):

Please provide the formula for converting legacy scores. It all needs to be done in bulk.

@Zyfarok (Contributor Author) commented Jul 13, 2023

We've discussed the conversion algorithm with @WitherFlower, but I just haven't written it down in mathematical form yet. I'll try to provide the formula tomorrow if I have the time (worst case scenario, this weekend).

And yes, there might be some loss in converting in an "sv1 -> sv2 -> sv3" fashion instead of "sv1 -> sv3", but it should be quite small, so it shouldn't matter much if it's temporary.
Also, after investigating a bit more over the last few days, "alternative 2" on the spreadsheet might be a better choice (it's closer to how acc is handled in v2), so it seems to be a safer starting point and I'll adjust the acc scaling to follow that instead.

Question: should I implement GetComboScoreChange and ComputeTotalScore in the OsuScoreProcessor class, or remove the current overrides to have them only in ScoreProcessor? And if I put them in OsuScoreProcessor, do I revert the changes in ScoreProcessor back to the original v2 (using git rebase to clean history, or simple revert commits)?

@smoogipoo (Contributor):

Let’s remove the OsuScoreProcessor overrides.

@Zyfarok (Contributor Author) commented Jul 15, 2023

I changed the accuracy exponents to "alternative 2" (something in between alternative 1 and ScoreV2's accuracy, as you can see on the spreadsheet above), as we noticed that alternative 1 reduces the impact of accuracy on low-combo scores too much. (I did it with a git rebase to avoid creating useless commits.)

Most importantly, I implemented the conversion from score v1 to score v3. The code compiles, but I suppose other changes are necessary (bumping the score version to trigger recalculation and such).

Expect some graphs from @WitherFlower showcasing the conversion accuracy soon.

@WitherFlower (Contributor) commented Jul 15, 2023

The conversion formula was tested using a python script that simulates random miss positions in a map to compute the ComboScore values for scorev1 and scorev3.

Please note that the goal of this script was only to simulate the combo part; as such it does not take accuracy into account. Because of this, error values will be closer to 0 in actual applications.

In the graphs included below you can see:

  • the estimations compared to the expected value (the red line)
  • the absolute error (ComboScore difference on the Y axis)
  • the relative error (0.01 = 1% on the Y axis)

Each blue dot is a score that was simulated and then estimated by the script. There are 60,000 scores in total: 1,000 scores for each miss count ranging from 1 to 60, on a 1,000-combo map.
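For reference, the core of such a simulation can be sketched in a few lines of Python (a hypothetical reconstruction of the idea described above, not the actual script used for the graphs):

```python
import random

def combo_runs(total_objects, miss_count, rng):
    """Place misses uniformly at random among the objects and return the
    lengths of the unbroken combo runs between them."""
    misses = sorted(rng.sample(range(total_objects), miss_count))
    runs, prev = [], -1
    for m in misses:
        runs.append(m - prev - 1)  # hits between the previous miss and this one
        prev = m
    runs.append(total_objects - 1 - prev)  # hits after the last miss
    return [r for r in runs if r > 0]

rng = random.Random(1)
runs = combo_runs(1000, 5, rng)
# 5 misses on a 1000-object map always leave 995 hit objects in total.
assert sum(runs) == 995
```

Each per-run combo value can then be fed through the ScoreV1 and ScoreV3 combo formulas to produce one simulated data point.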

Lower Bound: [graphs: comparison to ScoreV3 combo portion, absolute error, relative error]

Upper Bound: [graphs: comparison to ScoreV3 combo portion, absolute error, relative error]

Final Estimation: [graphs: comparison to ScoreV3 combo portion, absolute error, relative error]

The following formula, which combines the upper and lower bounds into the final estimation, was manually tuned to be as close as possible to the expected value while trying not to overestimate certain scores, to be safe. As it stands, roughly 99% of scores have a negative error, to minimise overestimation.

https://github.com/Zyfarok/osu/blob/a97915180f37a87ac5716acdb0bfcafffe9e6452/osu.Game/Database/StandardisedScoreMigrationTools.cs#L309-L312

It might be possible to improve it even further, but it should already be good enough as is.

P.S. Can someone please help me embed the code from the link above? Why does GitHub not work as you would expect it to...

@Zyfarok (Contributor Author) commented Jul 15, 2023

Things left to do:

  • Consider applying changes to the value of slider-bonus (affects conversion) and slider-ends (doesn't affect conversion) as was discussed during the meeting (as a separate PR ?)
  • Discuss / evaluate if a different mix of lower and upper estimate is more appropriate, depending on the objective (trade-off between conversion accuracy and probability of overestimation)
  • Apply the other changes required for the conversion to work properly (+1 to score version ?)
  • Add conversion of v2, including current lazer scores (should be pretty easy)

@peppy peppy changed the title Scorev3 - Modify v2 to introduce a combo exponent Modify standardised scoring to introduce a combo exponent Jul 16, 2023
@peppy peppy self-assigned this Aug 21, 2023
@peppy (Member) commented Aug 21, 2023

I'm going to self-assign this, pending either merge/deployment or additional testing via a test scene that more people can be involved in.

@peppy (Member) commented Aug 22, 2023

> I've noticed that RevertResultInternal can introduce errors due to floating-point precision. We might want to open a separate issue for this?

Yes, please open an issue for this with as much detail as you can provide.

@peppy peppy self-requested a review August 22, 2023 06:52
@@ -317,8 +317,8 @@ private void updateScore()

protected virtual double ComputeTotalScore(double comboProgress, double accuracyProgress, double bonusPortion)
{
-    return 700000 * comboProgress +
-           300000 * Math.Pow(Accuracy.Value, 10) * accuracyProgress +
+    return 700000 * Accuracy.Value * comboProgress +
Member:

Having accuracy multiplied in here is going to take me a while to get my head around and accept on a conceptual level.

What happens once we show the individual score portions on the results screen? How do we tell a user that the combo score can be less than max even with a full combo?

This may call for making this algorithm slightly more complex in order to keep these two pieces isolated.

Member:

You can factor out one Accuracy.Value from both terms like this:

return Accuracy.Value *
        (700000 * comboProgress + 300000 * Math.Pow(Accuracy.Value, 7) * accuracyProgress)
        + bonusPortion;

Now you have the combo and accuracy portions isolated, but you just multiply by the accuracy at the end (before bonus), which can be conveyed quite easily, I think.
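The factored form is algebraically identical to writing the accuracy multiplier into each term, which a quick numeric check confirms (Python sketch; the exponent 8 on the unfactored accuracy term is inferred from the acc^7 inside the parentheses above):

```python
def total_unfactored(acc, combo_progress, acc_progress, bonus):
    # accuracy multiplied into each term separately
    return 700000 * acc * combo_progress + 300000 * acc ** 8 * acc_progress + bonus

def total_factored(acc, combo_progress, acc_progress, bonus):
    # one Accuracy.Value factored out of both terms, bonus added last
    return acc * (700000 * combo_progress + 300000 * acc ** 7 * acc_progress) + bonus

# The two forms agree for any inputs, up to floating-point rounding.
for acc in (0.7, 0.95, 1.0):
    a = total_unfactored(acc, 0.8, 0.9, 50000)
    b = total_factored(acc, 0.8, 0.9, 50000)
    assert abs(a - b) < 1e-6
```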

Member:

That doesn't help when displaying to a user.

To recap, I definitely want to be able to show at the results screen something like this:

[screenshot: a mocked-up results screen showing a separate combo/accuracy score breakdown]

So the user can better understand the score breakdown.

@Zyfarok (Contributor Author) Aug 22, 2023:

I don't know what's best. Indeed, I don't think factoring out one Accuracy.Value as @Walavouchey suggested makes it any simpler.

First, note that, just like in score v2 (and v1), comboProgress is 1 only if you SS (an FC is not enough), because it accounts for the value of each judgement, thus already taking some form of accuracy into account (which is good).
Multiplying the final comboProgress by the accuracy was done to give more importance to accuracy on low-acc scores (currently an issue in score v2), bringing a 70% acc FC down to ~360k compared to ~500k in v2.

I see 4 options:

  • Keep the formula as is, and work on a way to better explain things. (Rename ComboScore to MixedScore? I don't know.)
  • Keep the end results the same, but have ComboScore split into a "ComboScore" without the accuracy multiplier and an "AccComboMalus" whose value is simply the difference:
    AccComboMalus = 700000 * comboProgress * (Accuracy.Value - 1) (which is negative)
  • Change the formula to split the score into 3 parts instead of 2: ComboScore, AccScore, and MixedScore, where ComboScore has no acc multiplier but MixedScore has one and would look something like comboProgress * Pow(Accuracy.Value, 3). The best proportions of the three (300k/300k/400k? 333k each?) and the acc exponents on AccScore and MixedScore would have to be studied again to agree on good values.
  • Simply revert that part of the change and go back to v2's acc scaling (the result being "square-root combo with original acc scaling of score v2" in the spreadsheet). This is the simplest, but then it wouldn't address the above-mentioned issue with low-acc scores. Since the low-acc-score issue could be considered an edge case, I would personally find this slightly worse but still acceptable.

Which one do you prefer?

Member:

> I don't know what's best. I don't think factoring out one Accuracy.Value as @Walavouchey suggested makes it any simpler.

Agree

> Which one do you prefer?

The only valid one, if we want to aim for the acc/combo split display on the results screen, is the last option, IMO. That said, if I'm the only one who wants to see this information on the results screen and the majority opinion is that it's not useful, I'm willing to go with option 1 and do away with that concept altogether.

@ppy/team-client interested in other opinions here. and players that may happen to be watching this issue.

Contributor:

Note regarding score farming balance: I've re-run the calculations and updated the spreadsheet I sent in #23763, and while the results are worse than the previous iteration, I'd argue they are still acceptable, so that isn't really a concern anymore.

@Zyfarok (Contributor Author) Aug 24, 2023:

Also, about:

> No one has complained until now about this scenario.

v2 has almost only been used in tournaments (and tournament training) until now.
Low-acc FCs in the region of 70% rarely happen in tournaments in practice, and when they do, the match is rarely close, due to v2's ComboScore scaling; I believe the issue has thus been overshadowed. This issue is the conclusion I arrived at when I asked a small sample (about 10 people) of the tournament community how much score a low-acc FC should give, and how it should compare to other scores.

If you want to gather more opinions on the question, then this is the kind of poll we should send to the community of experienced players: we define some scores and ask "should A or B win". I can prepare a list of questions for such a poll if you think that would help.

@peppy (Member) Aug 29, 2023:

I think a starting point would be showing which leaderboards have accuracy below 70% in the top 500.

But I'm fine with "should A or B win" examples, posted here first.

@monochrome22 Aug 29, 2023:

Personally, I see this low-acc issue the same way as changing the combo-accuracy ratio in ScoreV2. Currently, an 80% FC would beat a 99% play with 4 misses (a miss every 20% of the map). With the acc scaling changes, the 99% would win instead. So once again it comes down to the question: which is more important, combo or accuracy? I think BTMC made a good point in one of the community meetings when he said there will always be debate in the community over which is more important. Personally, I'm fine with either score winning, though I feel opinions would vary greatly from player to player. If a decision is made based on player feedback, then many players should be asked their opinion (more than 10), and if there is a clear majority who think the 99% score should win, then the change can be made.
Also, I don't understand how a 70-80% FC would be an issue on leaderboards. If someone is able to FC with 80% and beat a 4-miss 99%, then surely there would be many players who can FC with 95%+, and the leaderboard would be filled with them, right?

I know twitter polls tend to be troublesome, but it might work well here to see if there's a clear majority, and it doesn't require technical knowledge.
"Which score should win in osu!standard?"
"80% 1000x/1000x FC"
"99% 200x/1000x 4miss"
"Either is fine"

@peppy (Member) Aug 29, 2023:

Not polling on Twitter, because no one can agree on anything and the results are always wrong to someone.

Will rely on expert opinions instead.

/// <summary>
/// The bonus portion of the legacy (ScoreV1) total score.
/// </summary>
public int LegacyBonusScore { get; set; }
Member:

I don't know the implications that come with this (is it required that we write these out to the database?). Would want @smoogipoo to chime in here.

@smoogipoo (Contributor) Aug 22, 2023:

That depends on the implementation. If it's required for score conversion, then yes, it will need to be written to the database, as that's the only thing osu-queue-score-statistics knows about.

Doing so is simple: just change the ToDatabaseAttributes and FromDatabaseAttributes methods and add a row to osu_difficulty_attribs. Whether you want to add 2 new properties alongside the 3 existing ones that facilitate the current sv2 conversion is up to your discretion, as is whether the current ones are still required by the new conversion in this PR.

As a general rule, I've tried to keep these attributes as lean as possible, purely for database size concerns, and I would suggest the same be done here. But again, you're best placed to say whether this is actually a concern.

Which also brings me to: is LegacyMaxCombo required in the first place? The difficulty calculators are pretty much legacy-only right now, so doesn't this duplicate MaxCombo? And if not, why? Because that would be an issue.

@Zyfarok (Contributor Author) Aug 22, 2023:

For LegacyBonusScore:

The problem when converting legacy scores is that we don't know how much of the score comes from bonus and how much comes from something else. Thus there are three approaches:

  • Assume the minimum possible bonus
  • Assume the maximum possible bonus
  • Assume the same ratio of bonus as in the best possible score on the map.

The previous algorithm (by smoogi?) assumed the 1st option (minimum bonus) everywhere. I went with the 3rd approach for the ComboScore, and the 1st for the BonusScore, to reduce the chance of overestimating scores with low combo and high spinner bonus while still keeping the conversion close in most cases. Alternatively, the worst-case scenario can be computed using the 2nd approach for ComboScore and the 1st for BonusScore, but the problem is that you could then end up with a ComboScore of 0 after conversion in some extreme scenarios, even though you had comboed a bit.
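As an illustration of the third approach, the ratio assumption can be sketched like this (hypothetical Python with made-up names, not the actual migration code):

```python
def ratio_based_estimate(legacy_total, max_legacy_total, max_portion):
    """Approach 3 from the list above: assume the play earned a given
    portion of the score (e.g. bonus) in the same proportion as the
    best possible score on the map does."""
    return legacy_total * (max_portion / max_legacy_total)

# A play worth half of the map's maximum total is assumed to carry
# half of the maximum bonus.
assert ratio_based_estimate(500000, 1000000, 40000) == 20000.0
```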

For LegacyMaxCombo:

lazer's MaxCombo is not the same as the legacy one (unless we match stable and give combo on slider ends? I don't know if there's any other difference), but I have no idea whether the "MaxCombo" in DifficultyAttributes is supposed to be the legacy one or the new one.

@smoogipoo (Contributor) Aug 22, 2023:

> but I have no idea if the "MaxCombo" in DifficultyAttributes is supposed to be the legacy one or the new one though.

Have you tried it? When it's relevant (i.e. storing to the database), diffcalc attaches the classic mod:

public IEnumerable<DifficultyAttributes> CalculateAllLegacyCombinations(CancellationToken cancellationToken = default)
{
    var rulesetInstance = ruleset.CreateInstance();

    foreach (var combination in CreateDifficultyAdjustmentModCombinations())
    {
        Mod classicMod = rulesetInstance.CreateMod<ModClassic>();

        var finalCombination = ModUtils.FlattenMod(combination);
        if (classicMod != null)
            finalCombination = finalCombination.Append(classicMod);

        yield return Calculate(finalCombination.ToArray(), cancellationToken);
    }
}

... which sets this to false:

public override Judgement CreateJudgement() => OnlyJudgeNestedObjects ? new OsuIgnoreJudgement() : new OsuJudgement();

... which gives sliders a combo-affecting judgement (effectively the slider tail).

@Zyfarok (Contributor Author) Aug 22, 2023:

> Have you tried it?

Nope. What I meant is that I expected MaxCombo to be the lazer one (without the classic mod) when I wrote this PR (especially since other legacy fields have "Legacy" in the name), but I might have been completely wrong indeed.

@smoogipoo (Contributor) Aug 24, 2023:

As for LegacyBonusScore, I take it that the "ratio" version that I have only remains because it's required for the other rulesets?

@Zyfarok (Contributor Author) Aug 24, 2023:

I don't remember; either that, or I wasn't sure at first whether I would use it and then forgot about it. It might be simpler to just have a LegacyBonusScore and a NewBonusScore (or whatever you name it) if the ratio is needed, but I wanted to minimise the changes.

@smoogipoo (Contributor) Sep 4, 2023:

Since it's recently come to my(/our?) attention that the size of this table in production is 117GB, this is definitely something to consider.

I will be working on improving the situation for score conversion, please don't worry about it for the time being.

Member:

Are these legacy attributes required to be part of this PR?

@bdach bdach changed the title Modify standardised scoring to introduce a combo exponent Modify osu! standardised scoring to introduce a combo exponent Nov 24, 2023
@bdach (Collaborator) commented Nov 24, 2023

I've pushed changes that attempt to make the migration code more palatable. @Zyfarok please double-check that the comments all make sense and that I haven't broken the logic along the way.

I've run this against a database I had of about 7k old scores. There appear to be no conversions that I would consider anomalous.

@peppy @smoogipoo At this point I'm done with this branch and see no further work to be done on my end, but I'm not merging this without one further approval.

bdach previously approved these changes Nov 24, 2023

// Compute approximate upper estimate new score for that play.
// This time, divide the remaining combo among remaining objects equally to achieve longest possible combo lengths.
// There is no rigorous proof that doing this will yield a correct upper bound, but it seems to work out in practice.
Contributor Author:

I'm good with your changes.
My only comment would be that I can provide a proof for this if you really want.

Collaborator:

That would be nice if you have one, just for my peace of mind.

@Zyfarok (Contributor Author) commented Nov 24, 2023

I'll try to PR the AccScore subtraction later today; it would need to be merged before this PR to avoid recalculating the conversion twice.

@smoogipoo (Contributor):
!diffcalc
GENERATORS=score

github-actions bot commented Nov 24, 2023

@Zyfarok (Contributor Author) commented Nov 24, 2023

@smoogipoo sorry, I wasn't able to split this into a different pull request, as it was too intertwined with the changes in this one, unless you want me to do a messy merge/rebase in this branch.

It's less than 10 lines changed, though.

@Zyfarok (Contributor Author) commented Nov 24, 2023

We're good to go from my side.

The only topic that could still be discussed is what should be done regarding the lower-bound/upper-bound mix. There theoretically exists a score where the perfect estimation is the lower bound. The question is: what should we do to ensure that converted scores are beatable by a play in lazer?

As shown in the comment above (#24166 (comment), which was for the original formula parameters with 700k/300k and all, but should otherwise remain the same), the current mix of lower bound and upper bound has very low error while still underestimating in most cases, but it can overestimate, even if usually by a small margin (<2%). If you want to be extra safe, we could change the mix, or apply a bound like 1.5 * lowerbound on the value (depending on the classic mod multiplier?) to make sure it's impossible (or almost impossible) to get a bigger ComboScore than you would have had in lazer without the mod. Ideally we could do a statistical analysis with real scores, or a probabilistic analysis, but I don't think it's worth the time, especially since the current conversion is guaranteed to be almost perfect for FCs and single-break scores, which dominate most leaderboards.
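The "extra safe" variant described here, blending the two bounds and then capping at a multiple of the lower bound, can be sketched as follows (hypothetical Python; the 0.5 weight and 1.5 cap are illustrative values from this comment, not the merged parameters):

```python
def capped_combo_estimate(lower, upper, weight=0.5, cap_factor=1.5):
    """Blend the lower and upper combo-score bounds, then cap the result
    so extreme cases cannot overshoot the lower bound by too much."""
    blended = lower + (upper - lower) * weight
    return min(blended, cap_factor * lower)

# When the bounds are far apart, the cap engages and limits overestimation.
assert capped_combo_estimate(100000, 400000) == 150000.0
# When they are close, the plain blend is returned unchanged.
assert capped_combo_estimate(100000, 110000) == 105000.0
```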

@smoogipoo (Contributor) commented Nov 29, 2023

> The question is: what should we do to ensure that converted scores are beatable by a play in lazer?

Rather than a hard requirement that every score should be beatable, it's more a suggestion to guide implementations. It's the reason why, for example, in the old conversion code I preferred to underestimate the added bonus.

We may end up adding a flat negative score multiplier to the classic mod, but that's unrelated to the suggestion.

@Zyfarok (Contributor Author) commented Nov 29, 2023

Then what we have now should be good. I believe players from the feedback server want to test this ASAP, and getting feedback on it is really important.
I'll try to provide the proof that bdach requested, but I need time that I don't have right now (I have a paper deadline coming up next week), and it shouldn't be needed for this PR to be accepted (it only concerns the upper-bound estimate anyway, and he only requested it for his peace of mind).

@smoogipoo (Contributor):

I know we discussed this above, but the answer wasn't quite clear: is the intention for other rulesets to also use this method for conversion in the future, or will they remain as is?

I ask because it influences the storage of the attributes. For example, I don't believe the osu! max combo can change, but it is mod-dependent in mania.

github-actions bot commented Dec 5, 2023

@smoogipoo (Contributor) commented Dec 5, 2023

As seen in the spreadsheet above, there are a few edge cases that look weird, such as scores exceeding 3M total score. It would be good to go through and see why this is happening, as well as checking whether a few of the other less-than-1M scores behave as expected.

You can test with the output added in ppy/osu-tools#189 (now merged)

@Zyfarok (Contributor Author) commented Dec 11, 2023

> I know we discussed this before above, but the answer wasn't quite clear. Is the intention for other rulesets to also use this method for conversion in the future, or will they remain as is?

@smoogipoo which part are you talking about? Estimating combo to improve conversion? I'm not as knowledgeable when it comes to other rulesets, so I'm not sure, but as far as I know:

  • They are much more acc-focused, so they do not care as much about converting the combo part perfectly.
  • They also already use better score formulas in stable, so there's less difference between their legacy and new ComboScore.

> As seen in the spreadsheet above, there's a few edge cases that look weird, such as scores exceeding 3M total score.

I'll have a look when I can. For the 3M ones, one possibility is that the positions of the 100s and/or sliders mess up the assumptions the conversion code makes, but I wouldn't expect such a difference in the end result.

@smoogipoo (Contributor):

> For the 3M ones

I've already looked at that case, and have started going through other scores on the sheet.

@smoogipoo (Contributor):

Without having looked into the calculations too deeply, I think this is fine to go in as is so it can be deployed ASAP.
