Praetor77

WN8 breakthrough


Greetings gentlemen. I have come up with a way to implement the two point system (discussed in the WN development group, and recently with Garbad and Echelon in the "problem with WN8" thread) for the expected values of each tank, without adding much complexity to the formula or much calculation time.

 

The way WN8 is currently implemented, it uses a single value for each stat for each tank: the "expected" values a player must achieve to reach a rating of 1565 WN8. A player's stats are compared to the expected stats for the tanks he has played in his WOT career: stat/expected_stat = rSTAT. So if you do twice the expected stat, your rSTAT is 2. Also, WN8 is designed so that players with 1.5 times the expected stats have a WN8 of around 3000, and 900 WN8 (average player) corresponds to 0.75 times the expected stats.

 

This relationship between rSTATs and WN8 score was derived using Eureqa, a program which iteratively solves equations and looks for optimum solutions, to determine what weight should be assigned to each rSTAT and determine the final WN8 formula.

 

The problem with this is that not all tanks follow this overall distribution. On some tanks where highly skilled players can especially shine, or tanks that have received major nerfs (for example Hellcat, or 3601), high-level players outperform expected values by a factor larger than 1.5.

 

There were two possible extreme solutions:

 

1- Fit the curve using the "average" players. As a result, anyone playing these tanks moderately well would receive huge WN8 scores, allowing WN8 farming by playing these tanks.

2- Fit the curve using the top players. As a result, top players playing Hellcat or 3601 (for example) would receive approximately the same WN8 scores as they would receive for playing any other tank. The problem with this approach is that the average player would receive a lower WN8 than expected for playing these tanks, even though they still played as an average player.

 

 

The solution to this was always to use a curve instead of a linear function (or use several linear functions). However, no one was able to come up with an implementation of a two-value expected table which wouldn't over-complicate things.

 

 

 

Today I came up with a solution that I think should work, without getting too complex.

 

Previously WN8 was:

 

Step 1

rDAMAGE = avgDmg / expDmg
rSPOT = avgSpot / expSpot
rFRAG = avgFrag / expFrag
rDEF = avgDef / expDef
rWIN = avgWinRate / expWinRate

 

Step 2

rWINc = max(0, (rWIN - 0.71) / (1 - 0.71) )
rDAMAGEc = max(0, (rDAMAGE - 0.22) / (1 - 0.22) )
rFRAGc = max(0, min(rDAMAGEc + 0.2, (rFRAG - 0.12) / (1 - 0.12)))
rSPOTc = max(0, min(rDAMAGEc + 0.1, (rSPOT - 0.38) / (1 - 0.38)))
rDEFc = max(0, min(rDAMAGEc + 0.1, (rDEF - 0.10) / (1 - 0.10)))

 

Step 3

WN8 = 980*rDAMAGEc + 210*rDAMAGEc*rFRAGc + 155*rFRAGc*rSPOTc + 75*rDEFc*rFRAGc + 145*min(1.8, rWINc)
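As a rough check on the three steps above, here is a minimal Python sketch of Steps 2 and 3. The function name `wn8` and its argument names are mine, not part of any official tool. It takes the Step 1 ratios as inputs and should land near the 900, 1565 and 3000 figures mentioned earlier when every ratio is 0.75, 1.0 and 1.5 respectively.

```python
def wn8(r_damage, r_spot, r_frag, r_def, r_win):
    """Apply Steps 2 and 3 of the quoted WN8 formula to the Step 1 ratios."""
    # Step 2: rescale each ratio so 1.0 stays 1.0 and the floor value maps to 0.
    r_win_c = max(0.0, (r_win - 0.71) / (1 - 0.71))
    r_damage_c = max(0.0, (r_damage - 0.22) / (1 - 0.22))
    # Frags, spots and defence are capped relative to the damage term.
    r_frag_c = max(0.0, min(r_damage_c + 0.2, (r_frag - 0.12) / (1 - 0.12)))
    r_spot_c = max(0.0, min(r_damage_c + 0.1, (r_spot - 0.38) / (1 - 0.38)))
    r_def_c = max(0.0, min(r_damage_c + 0.1, (r_def - 0.10) / (1 - 0.10)))
    # Step 3: weighted sum, with the win-rate term capped at 1.8.
    return (980 * r_damage_c
            + 210 * r_damage_c * r_frag_c
            + 155 * r_frag_c * r_spot_c
            + 75 * r_def_c * r_frag_c
            + 145 * min(1.8, r_win_c))


# Same ratio on every stat, purely for illustration.
for ratio in (0.75, 1.0, 1.5):
    print(ratio, round(wn8(ratio, ratio, ratio, ratio, ratio)))
```

Real players never have identical ratios on all five stats, so this only illustrates the scale, not any individual's rating.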

 

 

 

I propose simply modifying the first step, by having two expected values for each stat, for each tank. Let's say (for now; I will analyze what the optimum values would be) one value for 1565 WN8 and another for 2320 WN8. Let's call them expSTAT1 and expSTAT2. So, theoretically, avgSTAT/expSTAT1 should be 1 for a 1565 WN8 player and avgSTAT/expSTAT2 should be 1 for a 2320 WN8 player:

 

Step 1 (new)

 

For each rSTAT, rDAMAGE, rSPOT, rFRAG, rDEF and rWIN:

 

rSTAT = min(1, avgSTAT/expSTAT1) + max(0, [(avgSTAT-expSTAT1)/(expSTAT2-expSTAT1)]/4)
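A small Python sketch of the new Step 1 as written above. The function name `two_point_rstat` and the sample numbers are made up for illustration; no real expected-value table is implied. It also makes the role of the divisor visible: a player who exactly meets expSTAT2 gets rSTAT = 1 + 1/4 = 1.25.

```python
def two_point_rstat(avg_stat, exp_stat1, exp_stat2):
    """Two-point rSTAT: plain ratio up to expSTAT1, gentler slope above it."""
    if exp_stat2 <= exp_stat1:
        # Assumption: the 2320-level expectation is strictly above the 1565-level one.
        raise ValueError("expSTAT2 must be greater than expSTAT1")
    base = min(1.0, avg_stat / exp_stat1)
    bonus = max(0.0, (avg_stat - exp_stat1) / (exp_stat2 - exp_stat1) / 4)
    return base + bonus


# Hypothetical expected damage values: 1000 (1565 level) and 1500 (2320 level).
for avg in (500, 1000, 1500, 2000):
    print(avg, two_point_rstat(avg, 1000, 1500))
```

By my arithmetic, feeding rSTAT = 1.25 for every stat into the unchanged Steps 2 and 3 gives roughly 2310, which is close to the 2320 target and would explain the choice of 4.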

 

 

This way, the WN8 equation hardly changes for players below 1565 WN8: their avgSTATs should be equal to or lower than expSTAT1, so the whole second term drops out and their rSTAT = avgSTAT/expSTAT1, just like before. The change is that for players with avgSTATs above expSTAT1, their stats are compared not only to expSTAT1 but also to expSTAT2, which is a much better and fairer measure of how they perform. Likewise, players below 1565 WN8 will be compared more fairly, since their stats will no longer be measured against expectations that are unrealistic for them because superunicums abuse certain tanks' capabilities or OP periods.

 

 

 

 

So, to sum it up, we would simply need to come up with expected values tables with expected stats for a 1565 WN8 player and a 2320 WN8 player, and replace step 1 in the calculation, swapping a division for a slightly more complex formula (not that much more complex), and the rest remains exactly the same. I think this kills two birds with one stone, by increasing WN8 accuracy and also decreasing the complexity and difficulty of coming up with expected values, since now we don't need to try and represent many different kinds of players in a single number.

It shouldn't be too complicated to apply this same method using perhaps three values (900, 1565 and 2320), but first I will explore the two value option and see how it performs.
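One way the three-value idea could look, as a sketch only: treat each expected value as an anchor with a target rSTAT (0.75 for 900 WN8 and 1.0 for 1565 as stated earlier, 1.25 for 2320 as implied by the /4), and interpolate linearly between anchors. The function name `piecewise_rstat`, the line from the origin below the first anchor, and the extrapolation past the last anchor are all my assumptions. With only the two upper anchors it reduces to the two-point formula.

```python
def piecewise_rstat(avg_stat, anchors):
    """Piecewise-linear rSTAT through (expected_value, target_rstat) anchors.

    Assumptions: the curve starts at the origin, and beyond the last anchor
    it keeps the slope of the final segment.
    """
    points = [(0.0, 0.0)] + sorted(anchors)
    for (x0, _), (x1, _) in zip(points, points[1:]):
        if x1 <= x0:
            raise ValueError("expected values must be positive and strictly increasing")
    # Find the segment containing avg_stat and interpolate within it.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if avg_stat <= x1:
            return y0 + (avg_stat - x0) * (y1 - y0) / (x1 - x0)
    # Above the top anchor: extend the last segment.
    (x0, y0), (x1, y1) = points[-2], points[-1]
    return y0 + (avg_stat - x0) * (y1 - y0) / (x1 - x0)


# Hypothetical expected damage at the 900, 1565 and 2320 WN8 levels.
three_point = [(700, 0.75), (1000, 1.0), (1500, 1.25)]
for avg in (350, 700, 850, 1500, 2000):
    print(avg, piecewise_rstat(avg, three_point))
```

Whether a third anchor is worth the extra table upkeep is exactly what would need testing against real data.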

 

Do you guys think this is worth implementing into WN8? Do you think we should call it WN9, WN8b, or simply maintain the name, since the methodology is pretty much the same?

Link to post
Share on other sites

I kind of understand what you are saying, but I am not as versed in statistics as you, could you give an example or two how it would change things?

 

 

The way I understand it is that it would stretch and contract the WN8 scale slightly so the difference between tanks with high and low skill floors and ceilings respectively would be less dramatic, is that correct?

Link to post
Share on other sites

Keep the moniker of "WN8" please since it's a variation on the same theme.

 

I'm not really sure it's necessary, though.  It gives slightly more granularity while doubling the upkeep of the expected values and most likely doesn't change the overall picture of someone's profile.

Link to post
Share on other sites

Is the sample size big enough for all tanks to include expSTAT2 - that might cause a problem with certain tanks finding enough data points for unicum level?!

This is my only concern, I am working on the data as we speak to determine this and see what comes out.

I kind of understand what you are saying, but I am not as versed in statistics as you, could you give an example or two how it would change things?

The way I understand it is that it would stretch and contract the WN8 scale slightly so the difference between tanks with high and low skill floors and ceilings respectively would be less dramatic, is that correct?

Pretty much.

Keep the moniker of "WN8" please since it's a variation on the same theme.

I'm not really sure it's necessary, though. It gives slightly more granularity while doubling the upkeep of the expected values and most likely doesn't change the overall picture of someone's profile.

On paper, it should be more accurate, and more farm-proof than current WN8. Also, people who play a LOT of games in low tiers at a so-so level will have their performance compared with so-so players, and not with superunicum 5 crew skill sealclubbers.

We will see how it plays out. Doing some testing atm...

Link to post
Share on other sites

this sounds like a good idea.  however, what about people in the 1800-2k range?  which are they being compared to?  for the name, I vote calling it WN8b

Link to post
Share on other sites

Sounds like a promising step forward. I think it should be called something like WN8.1, since it's not really a new rating just a refinement of the current one. I also think that some of the really high skill tanks (lights) need three points as a minimum for defining the curve, but you are right that it should be tested first with two.

Link to post
Share on other sites

If it dragged a lot of folks out of their current brackets (I'd probably go from blue to green as I like "OP" tanks) then there would be much wailing and gnashing of teeth. That said, in principle and not that my opinion matters for much... I'd say yes.

Link to post
Share on other sites

Hello,

 

Please do correct me if I am wrong...

 

The way I understand your proposal is the following:

 

Right now, all players are compared to an "average" player's expected value.

 

With your proposal, average players will still only be compared to an "average" expected value, because the formula cancels expStat2 at this point, but good players will be compared to an average player, with added handicap calculated from a "good" expected value.

 

The end effect of this would be almost nothing for players below the "good" expected values, but if I understand correctly, a player with an average WN8 of 3000 on all his tanks, but an average of 4000 in a WFE100 or a Hellcat will see his 4000 score lowered closer to his "normal" 3000 value for his other tanks.

 

Assuming the purpose of the WN rating is to estimate a player's skill level, and that it makes sense that a player's skill is relatively stable (as opposed to tank performance, which is variable), this is a good idea.

 

However, just like current WN8, the results would be arbitrary and dependent on pre-selected expStat numbers.

 

The only way I can think of to fix that issue is to use a curve system to generate dynamic "expStat" values for each tank, rather than fixed numbers, but that would be a bit more complicated and require some serious CPU muscle...

Link to post
Share on other sites

personally I don't like the idea - what's wrong with rewarding people for very good performance in lights for example?

And punishing wannabe stat padders in KV1-S and Hellcat - oh yeahhh!!! Why should these mediocre players get a free pass? If the tank allows for high performance - your problem if you don't manage it

 

 

Also, people who play a LOT of games in low tiers at a so-so level will have their performance compared with so-so players, and not with superunicum 5 crew skill sealclubbers.

 

That's really bad imo - these people deserve to be compared with Superunicums.

Link to post
Share on other sites

personally I don't like the idea - what's wrong with rewarding people for very good performance in lights for example?

And punishing wannabe stat padders in KV1-S and Hellcat - oh yeahhh!!! Why should these mediocre players get a free pass? If the tank allows for high performance - your problem if you don't manage it

 

 

 

That's really bad imo - these people deserve to be compared with Superunicums.

 

I didn't think this would stop punishing KV1S/Hellcat stat padders. If I understand Praetor this should make it slightly easier to achieve an 'average' result in a KV1S/Hellcat but still difficult to get a Unicum result.  Basically it will stop making average performances in OP tanks get below average WN8 levels while at the same time keep slightly above average performances from getting huge Wn values.  I could certainly be wrong.   

 

I think it should still be considered Wn8.  Methodology isn't changing enough to justify another entry.

Link to post
Share on other sites

 

 

rSTAT = max(1, avgSTAT/expSTAT1) + min (0, [(avgSTAT-expSTAT1)/(expSTAT2-expSTAT1)]/4)

Pretty sure the min should be a max. This version penalizes people with less than expStat1.

 

 

 

The solution to this was always to use a curve instead of a linear function (or use several linear functions). However, no one was able to come up with an implementation of a two-value expected table which wouldn't over-complicate things.

I don't get it. Your proposal is piecewise linear with two components. Why is it not over-complicated now?

Link to post
Share on other sites

Humm thinking? :verysmug:

 

Spartan96 - sorry for the down vote, it was supposed to be an upvote! I understand your concern. I am at 1499 WN7 overall - just below blue - a goal for which I worked for a long time - now with WN8 I am back to green - hurts my head!

Link to post
Share on other sites

Pretty sure the min should be a max. This version penalizes people with less than expStat1.

 

Shouldn't it be that way? I mean if you are not getting expected - should you not be penalized?

Link to post
Share on other sites

If it dragged a lot of folks out of their current brackets (I'd probably go from blue to green as I like "OP" tanks) then there would be much wailing and gnashing of teeth. That said, in principle and not that my opinion matters for much... I'd say yes.

It's more likely to work the other way around.

 

Hello,

 

Please do correct me if I am wrong...

 

The way I understand your proposal is the following:

 

Right now, all players are compared to an "average" player's expected value.

 

With your proposal, average players will still only be compared to an "average" expected value, because the formula cancels expStat2 at this point, but good players will be compared to an average player, with added handicap calculated from a "good" expected value.

 

The end effect of this would be almost nothing for players below the "good" expected values, but if I understand correctly, a player with an average WN8 of 3000 on all his tanks, but an average of 4000 in a WFE100 or a Hellcat will see his 4000 score lowered closer to his "normal" 3000 value for his other tanks.

 

Assuming the purpose of the WN rating is to estimate a player's skill level, and that it makes sense that a player's skill is relatively stable (as opposed to tank performance, which is variable), this is a good idea.

 

However, just like current WN8, the results would be arbitrary and dependent on pre-selected expStat numbers.

 

The only way I can think of to fix that issue is to use a curve system to generate dynamic "expStat" values for each tank, rather than fixed numbers, but that would be a bit more complicated and require some serious CPU muscle...

The expStat numbers are derived from vbaddict data; they are not arbitrary numbers.

 

personally I don't like the idea - what's wrong with rewarding people for very good performance in lights for example?

And punishing wannabe stat padders in KV1-S and Hellcat - oh yeahhh!!! Why should these mediocre players get a free pass? If the tank allows for high performance - your problem if you don't manage it

 

 

 

That's really bad imo - these people deserve to be compared with Superunicums.

The idea is to reward very good performance in light tanks the same as any other tank, not more, not less.

 

I didn't think this would stop punishing KV1S/Hellcat stat padders. If I understand Praetor this should make it slightly easier to achieve an 'average' result in a KV1S/Hellcat but still difficult to get a Unicum result.  Basically it will stop making average performances in OP tanks get below average WN8 levels while at the same time keep slightly above average performances from getting huge Wn values.  I could certainly be wrong.   

 

I think it should still be considered Wn8.  Methodology isn't changing enough to justify another entry.

Exactly.

 

 

Pretty sure the min should be a max. This version penalizes people with less than expStat1.

 

I don't get it. Your proposal is piecewise linear with two components. Why is it not over-complicated now?

You are correct. The min should be a max, and the max should be a min... fixing.

 

...isn't this breakthrough the same thing you spent 40+ pages arguing with me about why it wasn't needed?

I spent 30+ pages arguing that no one had come up with a proper way to implement it, and that using a single value system produced pretty decent results, not that the two point system wasn't an improvement over the single-point system. The overall change will be pretty small IMHO, but with this implementation the added complexity is very low, and probably worth it. Right from the start I said it would obviously be a more accurate system, I just couldn't come up with a proper way to implement it, and now I sat down and figured it out.

Link to post
Share on other sites

So when I am pulling 3k dpg and 2.8 kills with my Tiger-H that won't qualify as superunicum?

 

Sorry nope

 

Or 4.6k and 2.2 kills in my Waffent IV

 

 

What you need to do is recognize that a player with 2200 dpg 1.75 kills and 2.0 average spots with 75+ wins in a WZ-132 isn't padding shit..that's playing amazing.  An example of a WN8 "padding" tank.  If you can perform 3x better than 99% of players you are earning the purple.  

 

As of now the only people I see with 3500+ Wn8 are pretty damn good at the game.

 

When does the bar stop being raised?  What qualifies as super Unicum?  I can list some top 10 players for you if you want to see what qualifies.  Nip100, Tornade, Sela, Heavytwenty, CanadianImpact, Kewei, CrimsonCorsair, Carbonward, Veril, Camador..these are the kind of people you need to base your platinum numbers on, and work backwards from there.

Link to post
Share on other sites

Having crunched the (vast amounts of) data for Praetor, my advice would be to simplify the system, not add more complexity. Instead of providing expected values for two levels of player (average and unicum), we need to stick to plan A and see if we can achieve a workable result by putting the benchmark in the middle of those two and just sticking with one set of values. As 1565 WN8 is approximately the 54% WR player, it seems we have the benchmark in the right place already, and so it's hard to see the problem we are trying to solve.

 

A lot of discussion on this seems to revolve around whether unicum A is better than unicum B and whether WN8 reflects that accurately. Sorry to be blunt about this, but statistically, there are so few unicums in the game that you are almost irrelevant in data terms for validating a universal rating like WN8. It really doesn't matter to the 99% of non-unicums whether your WN8 is 2900 or 3500. What matters is that the WN8 scale works well for the vast majority of players. Those on the extremities of the bell curve - bots and unicums - should NOT be a driver in this.

 

To move this ahead we need to get a revised set of expected values published to correct the well-publicized problems that have been observed with the original set. As soon as you open the door to changing the formula, many others will walk through it with their ideas about what WN8 should be and we'll have another 95 pages on what to do next.

 

The only thing wrong with the WN8 formula right now is the 'philosophical' addition of a rWINc term into WN8 - which is completely unnecessary; if it was added to limit the WN8 given to super unicums because of their huge winrate - the 99% doesn't care.

 

As long as we base the expected values on the performance of the players whose winrates are 53.0 - 54.9% we have it right. Suggest we do that, and see how it works.
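To make the suggestion concrete, here is a very rough sketch of deriving one expected value per tank from players inside a win-rate band. The row layout (`tank`, `account_wr`, `avg_dmg`), the function name `expected_from_band`, and the use of a plain mean are all assumptions for illustration; the actual data pipeline is not described in this thread.

```python
from statistics import mean


def expected_from_band(rows, stat, low=53.0, high=54.9):
    """Average a per-tank stat over players whose win rate falls in [low, high]."""
    by_tank = {}
    for row in rows:
        # Keep only players inside the benchmark win-rate band.
        if low <= row["account_wr"] <= high:
            by_tank.setdefault(row["tank"], []).append(row[stat])
    return {tank: mean(values) for tank, values in by_tank.items()}


# Made-up sample rows, one per (player, tank) pair.
rows = [
    {"tank": "Hellcat", "account_wr": 53.5, "avg_dmg": 900.0},
    {"tank": "Hellcat", "account_wr": 54.2, "avg_dmg": 1100.0},
    {"tank": "Hellcat", "account_wr": 61.0, "avg_dmg": 1800.0},  # outside the band
]
print(expected_from_band(rows, "avg_dmg"))
```

Tanks with few players in the band would need a minimum sample size before their average could be trusted.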

Link to post
Share on other sites

So when I am pulling 3k dpg and 2.8 kills with my Tiger-H that won't qualify as superunicum?

 

Sorry nope

 

Or 4.6k and 2.2 kills in my Waffent IV

 

 

What you need to do is recognize that a player with 2200 dpg 1.75 kills and 2.0 average spots with 75+ wins in a WZ-132 isn't padding shit..that's playing amazing.  An example of a WN8 "padding" tank.  If you can perform 3x better than 99% of players you are earning the purple.  

 

As of now the only people I see with 3500+ Wn8 are pretty damn good at the game.

This is fundamentally flawed.  WN8 is based around the idea that purple = 1.5x good.  The problem is on some tanks purples are more like 2x good.  This fixes that.  If you still actually are one of the best at driving a tank, your rating won't change.  What it will do is hurt people who are only actually good in a tank but it happens to be a tank that doesn't fit the 1.5x curve (ie, waffle, hellcat, etc).

Link to post
Share on other sites

This is fundamentally flawed.  WN8 is based around the idea that purple = 1.5x good.  The problem is on some tanks purples are more like 2x good.  This fixes that.  If you still actually are one of the best at driving a tank, your rating won't change.  What it will do is hurt people who are only actually good in a tank but it happens to be a tank that doesn't fit the 1.5x curve (ie, waffle, hellcat, etc).

 

I'll use the T-54 since I presume that it has quite an inflated expected value from all the good player abuse. I'm not really sure what this proposed change is supposed to do.

 

            frags   damage   spots   def       WR
expected    1.1     1568     1.90    0.95      55.23
mine        2.11    2615     1.5     unknown   72.22
 
Don't have my def stat cause I'm at work and don't have my wotstats data to pull it out from. At any rate, I exceed the expected values other than spots by a good amount, and my damage is more than 1.5x. So, does this mean that my T-54 WN8 will go up or down if the two-point system goes in? I'm just trying to understand the point of it. If the T-54 has higher than usually expected values and I still meet the purple requirement of 1.5x, does that mean my WN8 score should actually be higher than it is? Or is the change supposed to do the opposite, and pull my numbers down since most purples could probably get 2x expected in the T-54, to which I underperform?
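For reference, the Step 1 ratios for the T-54 numbers quoted above can be worked out directly. This is only a sketch: the def value is unknown, so a full WN8 cannot be computed, and no expSTAT2 values have been published in this thread, so it cannot say whether the score would move up or down under the two-point system.

```python
# Values copied from the post above; def is omitted because it is unknown.
expected = {"frags": 1.1, "damage": 1568, "spots": 1.90, "wr": 55.23}
mine = {"frags": 2.11, "damage": 2615, "spots": 1.5, "wr": 72.22}

# Current single-point Step 1: rSTAT = avgSTAT / expSTAT.
ratios = {stat: mine[stat] / expected[stat] for stat in expected}

for stat, ratio in ratios.items():
    print(f"r{stat.upper()}: {ratio:.2f}")
```

Frags and damage come out well above 1.5, win rate is around 1.3, and spots are below 1.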
Link to post
Share on other sites

 

When does the bar stop being raised?  What qualifies as super Unicum?  I can list some top 10 players for you if you want to see what qualifies.  Nip100, Tornade, Sela, Heavytwenty, CanadianImpact, Kewei, CrimsonCorsair, Carbonward, Veril, Camador..these are the kind of people you need to base your platinum numbers on, and work backwards from there.

 

Exactly, it seems that in order to have a product that has some value, the community needs to come up with a standardized testing system. We need to define what is good using players that don't platoon or play in tank companies or clan wars. Win % is king, and the only reason it is not used as a direct method of evaluating skill is that it can be padded. We simply need to remove that factor by using players we know play solo and testing various systems against that control. There have to be many players from all skill ranges that only play solo. The whole system needs to be built from observations made from these players.

 

We NEED a control group!!!!!!

Link to post
Share on other sites

 

 
Don't have my def stat cause I'm at work and don't have my wotstats data to pull it out from. At any rate, I exceed the expected values other than spots by a good amount, and my damage is more than 1.5x. So, does this mean that my T-54 WN8 will go up or down if the two-point system goes in? I'm just trying to understand the point of it. If the T-54 has higher than usually expected values and I still meet the purple requirement of 1.5x, does that mean my WN8 score should actually be higher than it is? Or is the change supposed to do the opposite, and pull my numbers down since most purples could probably get 2x expected in the T-54, to which I underperform?

 

 

In this context, high skill cap  or tanks that have wide ranging results include things like:

 

Fast = Bads suicide, goods clean up

Gold rounds matter = bads use rounds that are hard to pen with, goods/pay to win use gold rounds and pen

Positioning matters= bads still get pen, goods get bounces

Great gun dep= bads go to same spot as usual, goods find new spots and pwn

Poor gun dep= bads go to same spots as usual, goods find new locations and can get shots

Sucky tank= bads/passerbys play the tank without gear,guns, or love (or just free exp), goods trick out their lee and find left hand turns.

 

I would say the t54 only partially qualifies.

 

You might see a slight reduction in your rating.

Link to post
Share on other sites

We NEED a control group!!!!!!

And where will you find that?  No one can agree who the top players are, what they should be measured by, or even compare them apples to apples (everyone spams different amounts of gold/tc/platoons/etc).  To compound the problem, kewei and I are the only people to prove dark purple solopub win rates with a large sample...and we got it with substantially different stats profiles.

 

TL;DR -- WN8 will never tell you the top elite of the elite; but it can be pretty damn good at screening the padding greens from the actual blues.

Link to post
Share on other sites
