bjshnog

⟪WN8⟫ Development / Resources


Why do TC padders get purple win rate, when they can't play for shit?

 

There can never be a perfect system. Right now it's pretty easy to spot abuse.

 

If they are all so fucking retarded then how come they win all the time? They must be doing something right, teamplay > dmg whoring.


We know that WG has separated some of the stats for TCs and CWs, so maybe now we have to request that they separate them completely, in the same way that they have done with TBs. That way, WN* won't be affected by teams specifically created to win or pad WR. After that, we must try to create a way in which (a transformed version of) WR can be included in WN*, to make up for stats lost by platooning with high end players (and I think I may have found a strategy for this). At the moment, I think that platooning and TCs are actually the reason why Eureqa decided that the best WN8 equation needed to stretch at the end (using multiplication of terms), and it may be incorrect.


If they are all so fucking retarded then how come they win all the time? They must be doing something right, teamplay > dmg whoring.

There is also the assertion that generally speaking those farming wins in TCs / platoons need to be good enough to actually be in those TC/platoon capable of win farming.

 

---

 

On a side note - any word on WN9? Thoughts on making it based on the extended data available post 8.8? One would expect a more complete data set would lead to a more accurate representation of a player's influence.


There is also the assertion that generally speaking those farming wins in TCs / platoons need to be good enough to actually be in those TC/platoon capable of win farming.

 

---

 

On a side note - any word on WN9? Thoughts on making it based on the extended data available post 8.8? One would expect a more complete data set would lead to a more accurate representation of a player's influence.

 

Here's the problem: it is not a more complete dataset by any means. These stats have only been tracked since the 8.8 update, so nothing earlier exists. This means that new player accounts have a full dataset for their time playing, but players who have been playing for 3 years or so would have data in those categories for only about their most recent 15-20% of battles.

Also, I thought it was only available through dossier files, but since XVM can actually show you 0.8.8+ stats of other players, that may not be the case.


Here's the problem: it is not a more complete dataset by any means. These stats have only been tracked since the 8.8 update, so nothing earlier exists. This means that new player accounts have a full dataset for their time playing, but players who have been playing for 3 years or so would have data in those categories for only about their most recent 15-20% of battles.

Also, I thought it was only available through dossier files, but since XVM can actually show you 0.8.8+ stats of other players, that may not be the case.

I think you misunderstood - more complete as in more types of data tracked. Think of it as similar to a more accurate 'recent'. As to availability of data, we'd need someone more familiar with the API to confirm, but it is my understanding that more data is now available post 8.8.


I think you misunderstood - more complete as in more types of data tracked. Think of it as similar to a more accurate 'recent'. As to availability of data, we'd need someone more familiar with the API to confirm, but it is my understanding that more data is now available post 8.8.

 

It could indeed be used for recent WN... It would take up quite a bit more space, though, and stat sites would take time to catch up.

 

EDIT: Then again, I don't know what stats are tracked. If it's just spotted/tracking damage, then it probably won't have much use.


It could indeed be used for recent WN... It would take up quite a bit more space, though, and stat sites would take time to catch up.

 

EDIT: Then again, I don't know what stats are tracked. If it's just spotted/tracking damage, then it probably won't have much use.

I seem to recall reading that spotted/tracking damage had a reasonable correlation to winning (one of the recent WN8 threads on here) - granted not at the same level as other metrics, but not insignificant all the same.


There is also the assertion that generally speaking those farming wins in TCs / platoons need to be good enough to actually be in those TC/platoon capable of win farming.

 

...

 

 

good old Hami

 

http://tanks.noobmeter.com/player/eu/Hami/500513914/

 

 

still the best argument why WR has no business in any rating

 

For those of you from SEA/NA where t8 companies are a rarity and t10s some kind of myth - on the EU and RU servers we have quite a few of these players. Hami is just the most prominent one on the EU server. He is an above avg player (looking at server avg), maybe 1400-1500 WN7 unpadded if he played solo random. But nowhere near unicum status.

 

WN, as far as I understood it, was always aimed at measuring single players, not some kind of group effort. That's why it doesn't make any sense to me to incl. WR in WN as long as there aren't pure solo random stats available.


Meh.

If you exclude win rates and spotting damage, then why bother with making rating formulas? Why not just use DPG per tank and call it GG?


Meh.

If you exclude win rates and spotting damage, then why bother with making rating formulas? Why not just use DPG per tank and call it GG?

 Because DPG isn't good enough.


Meh.

If you exclude win rates and spotting damage, then why bother with making rating formulas? Why not just use DPG per tank and call it GG?

 

Winrate is the thing that WN is attempting to correlate to.  Including X in a formula meant to correlate to X is bad stats, but is done because there are some unquantifiable elements and people are too lazy to use both WN8 and WR at the same time to form a picture of a player.


Funny guy is funny.

Show me a player that has lower dmg/tank than me, but higher WN8 ratings because he is better in other aspects of the game.

 

Damage per tank has such an influence on WN8 that all other stats are practically irrelevant.

 

 

good old Hami

 

http://tanks.noobmeter.com/player/eu/Hami/500513914/

 

 

still the best argument why WR has no business in any rating

 

For those of you from SEA/NA where t8 companies are a rarity and t10s some kind of myth - on the EU and RU servers we have quite a few of these players. Hami is just the most prominent one on the EU server. He is an above avg player (looking at server avg), maybe 1400-1500 WN7 unpadded if he played solo random. But nowhere near unicum status.

 

WN, as far as I understood it, was always aimed at measuring single players, not some kind of group effort. That's why it doesn't make any sense to me to incl. WR in WN as long as there aren't pure solo random stats available.

 

WN8 is also invalid in Hami's case.


Funny guy is funny.

Show me a player that has lower dmg/tank than me, but higher WN8 ratings because he is better in other aspects of the game.

 

Damage per tank has such an influence on WN8 that all other stats are practically irrelevant.

That's because it is the most important factor in winning games as preator explains here:

 

I think this stuff belongs in this thread, so posting old and new analysis on this matter here.

 

TIER 9 TANKS - 500k games

 

I filtered the data from Phalynx, as it was a HUGE dataset, and picked 500k battles at random from tier 9 tanks (500k rows, although many tanks were in the same battles, does that make sense?). I realized normalizing scale is not necessary for inputting the data to Eureqa since the logistic function used to predict models for binary classification already squashes the result of each variable to a (0, 1) range. Xelos, if you can do something similar with your variables before doing the logit regression that would be dandy! Otherwise, your resulting coefficients have to at least be multiplied by the mean for each variable, right?

 

Then I plugged all of that into Eureqa and ran 1.5e10 formula evaluations using AUC error and fitting damage, track_damage, radio_damage, kills, defense, spots, teamdamage to a binary win formula using a logistic function. (I did not use cap: it would be a strong predictor, but only because you can only get cap points while winning. Causality is completely reversed for this stat, so there is no sense in including it.)

 

Some preliminary results:

 

The most accurate solution using 1 degree of freedom is logistic(damage_dealt), which has a .384 AUC error ((FP+FN)/(P+N)). To simplify, I will say that it correctly predicts the outcome of about 72.3% of games, which is 22.3 percentage points above random, since a random variable should be able to predict 50% of games correctly. So:

 

damage_dealt: 72.3%

damage_dealt+kills: 73.7%

damage_dealt+kills+damage_assisted_radio: 74.6%

damage_dealt+kills+spots: 73.9%

 

As you can see, damage is by far the most defining variable.
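
A minimal sketch of what a one-variable model like logistic(damage_dealt) does, for anyone who has not used Eureqa. Everything here is an assumption for illustration: the rows are made up, the `scale` and `offset` coefficients are hypothetical rather than fitted, and plain accuracy at a 0.5 threshold stands in for the AUC error used in the post.

```python
import math


def logistic(x):
    """Squash any real value into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_win(damage, scale, offset):
    """Predict a win when logistic(scale * damage + offset) is above 0.5."""
    return logistic(scale * damage + offset) > 0.5


def accuracy(rows, scale, offset):
    """Fraction of (damage, won) rows whose outcome is predicted correctly."""
    hits = sum(predict_win(dmg, scale, offset) == won for dmg, won in rows)
    return hits / len(rows)


# Made-up rows purely for illustration: (damage_dealt, battle_won).
rows = [(2400, True), (300, False), (1800, True), (900, True),
        (1200, False), (2100, True), (600, False), (1500, False)]

# Hypothetical coefficients: the decision boundary sits at 1350 damage.
acc = accuracy(rows, scale=0.002, offset=-2.7)
print(f"accuracy: {acc:.1%}, above random: {acc - 0.5:+.1%}")
```

With these made-up rows the model gets 6 of 8 battles right, so 75%, or 25 points above a coin flip. The 72.3% in the post is the same kind of number, just measured on the real 500k-row sample.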

 

 

Another thing I tried is evaluating df=1 models, switching the variable, and measuring AUC.

damage_dealt: 72.3% (still a poor predictor as far as AUC values go, though of course, this is due to the other 14 players in the match)

damage_assisted_radio: 66.2%

damage_assisted_track: 54% (fails miserably)

kills: 65.0%

 

As you can see, every stat is quite poor at predicting wins on its own except for damage_dealt.

 

Also, comparing how well the current variables in WN8 predict wins and how that would improve if we could use spotted_damage (even though we can't):

 

damage_dealt+kills+spots: 73.9%

damage_dealt+kills+damage_assisted_radio: 74.6%

 

That is a very small gain. Also, spots "fill in" for spotted damage pretty adequately.
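
To put the "very small gain" in numbers, here is a short sketch that only redoes the arithmetic on the percentages quoted above. The dictionary keys and the `gain_over` helper are names made up for this example; the rates themselves are copied from the post.

```python
# Single-battle prediction rates quoted in the post (tier 9 sample), in percent.
rates = {
    "damage_dealt": 72.3,
    "damage_dealt+kills": 73.7,
    "damage_dealt+kills+spots": 73.9,
    "damage_dealt+kills+damage_assisted_radio": 74.6,
}
RANDOM_BASELINE = 50.0  # a coin flip predicts half of all battles


def gain_over(base, model):
    """Percentage points gained by `model` relative to `base`."""
    return round(rates[model] - rates[base], 1)


# How far each model sits above random guessing.
above_random = {name: round(rate - RANDOM_BASELINE, 1)
                for name, rate in rates.items()}

for name, points in above_random.items():
    print(f"{name}: +{points} points over random")

# Swapping spots for radio-assisted (spotted) damage.
print(gain_over("damage_dealt+kills+spots",
                "damage_dealt+kills+damage_assisted_radio"))
```

Swapping spots for spotted damage is worth 0.7 points, against the 22.3 points that damage alone already gives over random.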


 

That's because it is the most important factor in winning games as preator explains here:
 

 

So damage alone gives 72.3%, while adding the other factors gives 73.9%. A huge gain there, isn't it?


That's because it is the most important factor in winning games as preator explains here:

 

 

They correlate so closely that it's, imo, safe to just use damage; and by doing that save all the time and effort needed for making these fancy formulas and invest it into something that actually matters.


Were numbers ever looked at by tank type? E.g. I think most are on board with a TD's primary contribution being damage / kills. It would be interesting to see if the strength of the correlations shifted for roles where damage was less central to the role, e.g. at tier 8 comparing the TDs to the lights.


 

 

 

So damage alone gives 72.3%, while adding the other factors gives 73.9%. A huge gain there, isn't it?

 

 

 

They correlate so closely that it's, imo, safe to just use damage; and by doing that save all the time and effort needed for making these fancy formulas and invest it into something that actually matters.

 

These are actually very interesting concerns, which will allow me to explain WN8 a lot better, I think.

 

That was true for a binary classification system for single games. For overall account stats, the added benefit of the other stats in the formula is much, much larger.

For account-wide stats, if we translate WN8 into a WR scale using a linear function and do the same with rDMG, the 95% confidence interval for each player is +-1.84% winrate for WN8 and +-2.75% winrate for rDMG. That is a HUGE increase in accuracy: the rDMG interval is about 50% wider.
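
A quick check of the "about 50%" figure, using only the two interval half-widths quoted above. The variable names are made up for this sketch, and which direction the percentage was meant in is an assumption, so both readings are shown.

```python
# 95% confidence interval half-widths quoted in the post, in winrate points.
wn8_ci = 1.84
rdmg_ci = 2.75

# Reading 1: how much wider the rDMG interval is than the WN8 interval.
widening = rdmg_ci / wn8_ci - 1

# Reading 2: how much narrower the WN8 interval is than the rDMG interval.
narrowing = 1 - wn8_ci / rdmg_ci

print(f"rDMG interval is {widening:.1%} wider than WN8's")
print(f"WN8 interval is {narrowing:.1%} narrower than rDMG's")
```

The first reading gives roughly 49.5%, which matches the "about 50%" in the post. The second reading gives roughly 33%, so the figure depends on which interval you take as the baseline.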

 

 

Also, using more factors makes the formula more robust, and more likely to properly measure outliers in damage (players who do less damage but more of other stuff, as well as the opposite: players who do more damage than expected but less of everything else). All this combined is why Eureqa determines that the best formula is the one where rDMG accounts for about 55-65% of the total score.


Were numbers ever looked at by tank type? E.g. I think most are on board with a TD's primary contribution being damage / kills. It would be interesting to see if the strength of the correlations shifted for roles where damage was less central to the role, e.g. at tier 8 comparing the TDs to the lights.

 

For the single-game data, I analyzed lights vs non-lights, and the results were pretty similar regarding how influential damage is to winning. Spotted damage, however, is quite a lot less influential in non-LTs. That being said, we only have aggregate data to calculate WN8 from, so we can't have separate curves for each tank type for rDMG and such, but this information should already be included in the expected stats table.

 

Also, I calculated how large the error in WN8 is for aggregate stats vs single-tank stats, and it is, IMHO, negligible:

 

 

While performing the needed analysis for implementing the new two-value WN8, I measured the error between average WN8 per-tank, weighted by battles, and the normal WN8 calculated with aggregate stats.

 

The average error is 2.15% of the player's WN8, while 95% of players have an error under 6.3% of their total WN8. 99% of players have errors under 9.2% of their total WN8.
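
The comparison described above can be sketched roughly like this. It is only an illustration under assumptions: the per-tank numbers and the aggregate value are hypothetical, the function names are made up, and dividing the gap by the aggregate WN8 is a guess at what "% of the player's WN8" means.

```python
def battle_weighted_wn8(per_tank):
    """Average per-tank WN8, weighting each tank by its battle count."""
    total_battles = sum(battles for battles, _ in per_tank)
    return sum(battles * wn8 for battles, wn8 in per_tank) / total_battles


def relative_error(per_tank_avg, aggregate_wn8):
    """Gap between the two values as a fraction of the aggregate WN8."""
    return abs(per_tank_avg - aggregate_wn8) / aggregate_wn8


# Hypothetical player: (battles in the tank, WN8 in that tank).
per_tank = [(500, 1200.0), (300, 1500.0), (200, 1000.0)]

# Hypothetical WN8 computed from the same player's aggregate account stats.
aggregate = 1230.0

weighted = battle_weighted_wn8(per_tank)
err = relative_error(weighted, aggregate)
print(f"weighted per-tank WN8: {weighted:.0f}, error: {err:.2%}")
```

For this made-up player the weighted per-tank value is 1250 against an aggregate of 1230, an error of about 1.6%, which would sit below the 2.15% average reported in the post.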


If they are all so fucking retarded then how come they win all the time? They must be doing something right, teamplay > dmg whoring.

Me playing solo = ~62% W/R with no TS and yolo-ing

Me playing TB = 85%/90% W/R with no TS and yolo-ing

Goooo Teeeemplay


Spotted damage however, is quite a lot less influential in non-LTs.

So using the numbers for non-LTs in 9s, 'quite a lot less influential' is about 66.2% as a standalone. How much closer to other variables was spotted dmg for the LTs? Do you have those tier 8 numbers for comparison?


So using the numbers for non-LTs in 9s, 'quite a lot less influential' is about 66.2% as a standalone. How much closer to other variables was spotted dmg for the LTs? Do you have those tier 8 numbers for comparison?

 

I really didn't do an exhaustive analysis, because I really have better things to do with my time. I think Xelos is doing an in-depth analysis of the spotted damage data.

 

Since historical spotted damage will never be available, I saw no sense in going into a detailed analysis for something that is never going to happen. I merely wanted to show that we are not missing out on that much due to not having spotted damage available (especially not for account-wide WN8 measurement). Also, as I said above, it's a whole different picture for single-battle outcome prediction as opposed to account-wide measurement. 73% prediction power in single games for DMG alone translates to about +-2.75% WN8 winrate error, while adding kills, def and spots reduces that to +-1.84%. One could only speculate what spotting damage would reduce that to. Extrapolating, my guesstimate would be about +-1.4% to +-1.7%.



Fair enough on the time required. I think the community as a whole is very appreciative of the effort you guys put in to continually refine and improve these metrics.

Any thoughts then on having a 'post 8.8' refined recent category?

