RichardNixon

WN9 candidate prototype


That's just a scale thing. WN8 has a much higher zero point and non-linear scaling, so bad players get higher WN9 than WN8, while good players get lower WN9 than WN8. You were bad when you played your low tiers, and so they go up.

A lot of less-popular low tier tanks were underrated in WN8, so it was possible to get very high WN8 values in them with vaguely competent play. Hence my 3400 WN8 in the Pz IIIA. Similarly, Kuroialty currently has 2070 WN9 (a little over unicum) and 3190 WN8 (over superunicum).

That's a rather unfair dig at me, calling my unicum-level play "vaguely competent" and singling me out as some remarkable beneficiary of low tier WN8 expected values when several other unicums have already shown drops in rating that look no different from my own.  It's also flat-out wrong - expected values for every low tier tank are skewed much higher than expected values at higher tiers, which you could have noticed by just glancing at stats like expected win rates and kills, or you could have taken a page out of Crab's book and just RTFW:

Also, the per-tank expected values table was compared with the table used for the PR rating by Noobmeter, and a table of top 1%/100 players of each tank for the RU server kindly provided by Seriych (similar to what was in the service record with XVM for 8.6 and older). Most expected damage values are pretty close to Noobmeter's (from tier 3-8), and if you multiply those damages by 1.5 (to see what a 2400 WN8 player would need to get on a tank), you get unicum values, which are quite close to Seriych's values for top player numbers from RU. Also, using this approach resulted in numbers for low tier tanks that are obviously high for the new player, but that isn't really an issue since average players only have 3% or less of their total games in tier 1. This conveniently also functions as a control against seal-clubbing your way into a high rating. It means you can still club tier 1 players, you just have to actually be good at it! No longer will averaging 1.7 kills/game (a good value in tier 10) at tier 1 make you appear good. This isn't because the WN* team has any bias against folks who play low tiers, but simply that we wish to identify player skill irrespective of tier played (see Why WN8? for the reasoning).

Together with the way WN8 scales (and likely how WN9 scales too), this is why you'll never see someone like me with a >4k recent, not because I don't carry the relative skill, but because you and every other stat maker hold people like me to much greater standards than you apply to your mid and high tier brethren.  Just because nobody can find a consistent way to shit all over a way to play the game that you don't appreciate and you can't handle doesn't mean that my performance above all others is without skill.

 

That's a rather unfair dig at me, calling my unicum-level play "vaguely competent" and singling me out as some remarkable beneficiary of low tier WN8 expected values when several other unicums have already shown drops in rating that look no different from my own.  It's also flat-out wrong - expected values for every low tier tank are skewed much higher than expected values at higher tiers, which you could have noticed by just glancing at stats like expected win rates and kills, or you could have taken a page out of Crab's book and just RTFW:

Together with the way WN8 scales (and likely how WN9 scales too), this is why you'll never see someone like me with a >4k recent, not because I don't carry the relative skill, but because you and every other stat maker hold people like me to much greater standards than you apply to your mid and high tier brethren.  Just because nobody can find a consistent way to shit all over a way to play the game that you don't appreciate and you can't handle doesn't mean that my performance above all others is without skill.

because beating newcomers is no skill and is perhaps even despicable, especially with multi crew skills and multi million equipment...


because beating newcomers is no skill and is perhaps even despicable, especially with multi crew skills and multi million equipment...

This non-argument is so old and boring to listen to.

  • High tier players encounter stock tanks and players who haven't driven their tank more than 10 times all the time.
  • High tier players beat on these stock tanks and relatively new players with multi skill crews and multi million equipment.
  • Skilled players platooning together in higher tiers to have numbers advantage in small encounters is common and used to inflate performance.
  • WG implements special matchmaking to filter out newbs but does nothing to stop fail-your-way-to-the-top fodder from playing in high tiers.
  • Low tier tanks are far more capable of impacting games on an individual level, no matter the skill level of the tanker.
  • Low tier cheapness, abundance, and short MM queue make the supposedly easy death of bad players far more inconsequential and easy to recover from than failing in a higher tier tank.

Pull the 130mm barrel out of your ass and come at me with something interesting and original to refute.


Guys, stop it already.

This thread is about WN9 and not whether playing low tiers is a good thing or a bad thing. We have been here before, so don't start again. Warnings and ROs will be given if this continues.

I understand that low tiers are part of the game and therefore also of the creation of WN9. But you are not discussing the implications of WN9 for low tiers, only how much ass sealclubbers do (not) suck. So stop.

 

Also, using this approach resulted in numbers for low tier tanks that are obviously high for the new player

Nice selective bolding there.

You don't seem to be trying to understand the expected values principle. There's no explicit adjustment for low tiers: It's simply based on how players perform in their tanks relative to their other recent tanks.

The main problem with low-tier expected values is that some tanks are played with very different crew skills than others. Popular sealclubbing tanks like the Cruiser III are often played with good crew skills, while unpopular tanks like the Pz IIIA are rarely played to 100%. The WN9 expected values use earned XP to adjust the input data, so the expected values for popular low-tier tanks actually dropped from WN8.

Your WN9 fell relative to your WN8 for the same reason that T-62A players took a hit. It's not because you played low tier tanks, but because you mostly played tanks with high skill scaling, which as a general rule means fast tanks.

 

Of course, after I put a ton of work into the crew skill adjustment method, WG completely broke tiers 1-3 by introducing the newbie matchmaking split. The problem here is that players with <2500 battles are playing a much easier game than players with 2500+ battles, and so it's not really legitimate to have a single value per tank. I'm currently using pre-9.8 expected values for tiers 1-3, which is harsh on 2500+ battle players and generous to <2500 battle players. They're fine for historical data, however.

BTW, I was a sealclubber (although mostly tiers 3-6 rather than 2), so your rant at the end is mistargeted.


@RichardNixon, is WN9 actually ready to be implemented or does it still need some tweaks?

I had a last shot at persuading the API guys to add better spotting damage data. Not sure if it worked or not: They're going to "discuss" it at least. If we do get the data then it'll be another ~4-6 weeks to add it into the metric because I'll need to collect new data for expected values.

Other than that, I need to get my head together sufficiently to check for regression dilution issues in the WN9 slope parameter. Might be an issue, might not be. Needs a lot of testing & maths.

I did at least finish a test for whether platoon-padding positively or negatively affects damage. The result was fairly weak, but it seems that platooning with equal-skilled players gives you the best damage output, regardless of skill, which suggests that social factors are relatively important. Anyway, there was no significant difference between solo vs equal-skill platoon damage, so there's probably no need for a platoon correction.

I've also just finished collecting unbiased tanks/stats data for EU and RU, which I'm going to use for the final version of the tier averages.


 Anyway, there was no significant difference between solo vs equal-skill platoon damage, so there's probably no need for a platoon correction.

This is interesting, is this globally, across all skill levels? Because I could see the very best players possibly 'stealing' each other's potential damage, but only for the very best.

It seems like "platooning makes your DMG go down" is 'common wisdom' around here.


This is interesting, is this globally, across all skill levels? Because I could see the very best players possibly 'stealing' each other's potential damage, but only for the very best.

A correction: I should have said there was no significant difference between solo vs equal-skill 2-man platoon damage.

The data's too thin at the edges to say much, but the indications are that damage does drop off for triple platoons and low-tier platoons. For this purpose, player skill is tier relative, so a blue platoon at tier 6 may be padding harder than a unicum platoon at tier 10.

Of course, there's no indication of causality. It's quite possible that platoon-padding naturally diminishes damage but that people (on average) play better in platoons, or vice versa.


Is there a somewhat-easy way to alter the script such that it highlights tanks where the delta between the various methods of computing the WN9 score is above some threshold (say, 5% of the highest of the scores or something)?  Can be done manually, but might be an easy way to identify weirdness both in the metric and in my own per-tank performance...
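The check asked for here could be sketched roughly as follows. This is not the actual script from the WN9 page: the function name `flag_divergent_tanks`, the input layout (tank name mapped to a dict of per-method scores) and the sample numbers are all assumptions for illustration. It flags a tank when the gap between its highest and lowest score exceeds 5% of the highest.

```python
def flag_divergent_tanks(scores_by_tank, threshold=0.05):
    """Return {tank: relative_spread} for tanks whose per-method scores
    differ by more than `threshold` times the highest of those scores.

    scores_by_tank: {tank_name: {method_name: score}} (hypothetical layout).
    """
    flagged = {}
    for tank, scores in scores_by_tank.items():
        values = list(scores.values())
        if len(values) < 2:
            continue  # nothing to compare against
        highest, lowest = max(values), min(values)
        if highest <= 0:
            continue  # relative spread is not meaningful here
        spread = highest - lowest
        if spread > threshold * highest:
            flagged[tank] = spread / highest
    return flagged


# Made-up numbers, only to show the shape of the input and output.
example = {
    "T1E6": {"wn9c1": 1500.0, "wn9c2": 1540.0},
    "T7 Combat Car": {"wn9c1": 900.0, "wn9c2": 1200.0},
}
flagged = flag_divergent_tanks(example)
for tank, rel in flagged.items():
    print(f"{tank}: methods differ by {rel:.0%} of the highest score")
```

Using the highest score as the reference follows the wording of the question; a fixed absolute threshold may behave better for tanks with very low or negative values.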


Do we have an updated color chart to go with the preliminary calculations?

^^This.


Nixon, if you post server wide account data with WN9 (like you did with WN8 last month, preferably for the NA server because it's easier to work with and all the servers give close enough to average data), I can whip up a color chart based on the last time we updated the WN8 version. I understand you're a bit leery about how that was done, but it would at least be close enough to give a general estimate until you can put out a version you like.


IB4 Yellow :oscar:

 

RN - You said about 4-6 weeks in test before a release candidate.  How close are we to test?

Edited by Itchi

My T1E6 has adjusted win rate of.. 101.4%??

Edit: And -5.6 WN9c1 on my T7 Combat Car

Edited by WhiteTiger

Nixon? You have the tools to calculate server wide WN9 a lot faster than me.

Could you post the full formula for your current version of WN9 if you're not going to post server wide data? I could dig through the JavaScript on your webpage but it would suck to calculate all that data and have something wrong or for an older version.


Nixon? You have the tools to calculate server wide WN9 a lot faster than me.

Could you post the full formula for your current version of WN9 if you're not going to post server wide data? I could dig through the JavaScript on your webpage but it would suck to calculate all that data and have something wrong or for an older version.

I put up a full version a couple of pages back for Never to implement, although apparently he had other priorities. The formula's unlikely to change unless WG provide raw assisted damage, although the tier averages and tank scaling will probably get a final pass.

Some stat distribution lists I did last month. Overall on the left, recent on the right:

https://docs.google.com/spreadsheets/d/1w7AHGqbBvnTv0v9hxn-ltOVxj5kK0WtFGQjmUjsY-cs/edit?usp=sharing

Of course, for stat distributions the scaling is critical, and I'm undecided on it. No-one else seems to have any opinion on it either. Some points:

  • If it's not going to be used as an achievement (overall) metric, there seems little point in using 4 digits.
  • If it's too close to WN8, then people will be confused about why their value rose/dropped.
  • If it's near a 0-100 scale then it'll be confused with a percentile scale.
  • GD will probably hate it regardless so maybe it doesn't matter.

Other notes:

Regression dilution

The regression dilution check came up clean. I got the regressions the right way around (not sure whether by accident or design), and the average games per player over the interval have minimal dependence on the tank played.

Tier averages

I now have true per-tier random-sampled interval averages for EU and RU. They're quite fun:

https://docs.google.com/spreadsheets/d/1YYKND2eV8nyw45mOu80hkW2Lr-mjaZnebH4Ypn0XRL8/edit?usp=sharing

RU damage outputs are somewhat lower at tiers 5+, which is probably down to increased arty counts. RU also has lower survival rates but higher high-tier cap rates.

Note that the draw rates and win rates don't quite add up within the same tier: Tier 1-4 winrates come out slightly low, and tier 9-10 winrates come out slightly high. This is probably because not all games have balanced tiers, and mismatches are more likely to be in the higher-tier player's favour.
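To make the "don't quite add up" point concrete: if every battle put equal numbers of same-tier players on both sides, each tier's average win rate would have to equal (1 - draw rate) / 2, since every non-drawn battle produces as many winners as losers. The sketch below measures the deviation from that baseline. The function names and the sample rates are hypothetical; they are not taken from the linked spreadsheet.

```python
def balanced_win_rate(draw_rate):
    """Win rate a tier would show if all of its battles were tier-balanced."""
    return (1.0 - draw_rate) / 2.0


def win_rate_excess(win_rate, draw_rate):
    """Observed win rate minus the balanced baseline.

    Negative: the tier wins less often than balance implies.
    Positive: the tier wins more often than balance implies.
    """
    return win_rate - balanced_win_rate(draw_rate)


# Hypothetical per-tier averages, for illustration only.
low_tier = win_rate_excess(win_rate=0.488, draw_rate=0.012)
high_tier = win_rate_excess(win_rate=0.497, draw_rate=0.014)
print(f"low tier excess:  {low_tier:+.3f}")
print(f"high tier excess: {high_tier:+.3f}")
```

A negative excess at low tiers and a positive one at high tiers would match the pattern described above, where tier mismatches tend to favour the higher-tier player.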

Per-tank normalization

I have a plan for normalizing the per-tank values. I'm going to choose a list of popular tanks from tier 5 to 10, and then normalize all the results so that the chosen tanks have an average intercept of 0 and an average scale of 1. The results probably won't differ much from the current method but they'll be easier to replicate with different datasets.
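One simple reading of this plan is sketched below: subtract the reference tanks' mean intercept from every intercept and divide every scale by the reference tanks' mean scale. The exact way the intercept and scale enter the WN9 formula is not given in this thread, so treating them as independent is an assumption, and the function name, data layout, choice of reference tanks and numbers are all hypothetical.

```python
from statistics import mean


def normalize_params(params, reference_tanks):
    """Shift and rescale per-tank parameters against a reference set.

    params: {tank_name: (intercept, scale)} (hypothetical layout).
    After the call, the reference tanks have a mean intercept of 0
    and a mean scale of 1.
    """
    refs = [params[tank] for tank in reference_tanks]
    mean_intercept = mean(intercept for intercept, _ in refs)
    mean_scale = mean(scale for _, scale in refs)
    return {
        tank: (intercept - mean_intercept, scale / mean_scale)
        for tank, (intercept, scale) in params.items()
    }


# Made-up parameters; the reference list stands in for "popular tier 5-10 tanks".
raw = {
    "T-34": (0.10, 0.9),
    "IS-3": (-0.05, 1.1),
    "T-62A": (0.25, 1.3),
    "Pz IIIA": (0.40, 0.6),  # not a reference tank, but normalized the same way
}
reference = ["T-34", "IS-3", "T-62A"]
normalized = normalize_params(raw, reference)
```

Because the reference list is fixed, rerunning this on a different dataset gives values on the same footing, which is the replication benefit mentioned above.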


I put up a full version a couple of pages back for Never to implement, although apparently he had other priorities. The formula's unlikely to change unless WG provide raw assisted damage, although the tier averages and tank scaling will probably get a final pass.

Hmm, I thought I had read every reply, must have skipped a page during my vacation. I found it though, thanks, and sry about that.

 

Edit: Here's a colorful scale for everyone with Nixon's WN9 values rounded up and down to 50's to go along with the current WN8 scale.

 

FAIXntL.png

HTML for the first table in case anybody wants to easily edit it.

<table class="colorRef">
				<thead>
					<tr>
						<td>Win rate</td>
						<td>WGP Rating</td>
						<td>WN8 Rating</td>
						<td>WN9 Rating</td>
					</tr>
				</thead>
				<tbody>
					<tr>
						<td style="background:#930D0D;">Under 46%</td>
						<td style="background:#930D0D;">Under 2500</td>
						<td style="background:#930D0D;">Under 300</td>
						<td style="background:#930D0D;">Under 600</td>
						
					</tr>
					<tr>
						<td style="background:#CD3333;">46%</td>
						<td style="background:#CD3333;">2500 to 2999</td>
						<td style="background:#CD3333;">300 to 449</td>
						<td style="background:#CD3333;">600 to 749</td>
						
					</tr>
					<tr>
						<td style="background:#CC7A00;">47%</td>
						<td style="background:#CC7A00;">3000 to 3499</td>
						<td style="background:#CC7A00;">450 to 649</td>
						<td style="background:#CC7A00;">750 to 899</td>
						
					</tr>
					<tr>
						<td style="background:#CCB800;">48% to 49%</td>
						<td style="background:#CCB800;">3500 to 4499</td>
						<td style="background:#CCB800;">650 to 899</td>
						<td style="background:#CCB800;">900 to 1099</td>
						
					</tr>
					<tr>
						<td style="background:#849B24;">50% to 51%</td>
						<td style="background:#849B24;">4500 to 5499</td>
						<td style="background:#849B24;">900 to 1199</td>
						<td style="background:#849B24;">1100 to 1299</td>
						
					</tr>
					<tr>
						<td style="background:#4D7326;">52% to 53%</td>
						<td style="background:#4D7326;">5500 to 6999</td>
						<td style="background:#4D7326;">1200 to 1599</td>
						<td style="background:#4D7326;">1300 to 1499</td>
						
					</tr>
					<tr>
						<td style="background:#4099BF;">54% to 55%</td>
						<td style="background:#4099BF;">7000 to 7999</td>
						<td style="background:#4099BF;">1600 to 1999</td>
						<td style="background:#4099BF;">1500 to 1749</td>
						
					</tr>
					<tr>
						<td style="background:#3972C6;">56% to 59%</td>
						<td style="background:#3972C6;">8000 to 8999</td>
						<td style="background:#3972C6;">2000 to 2449</td>
						<td style="background:#3972C6;">1750 to 1949</td>
						
					</tr>
					<tr>
						<td style="background:#793DB6;">60% to 64%</td>
						<td style="background:#793DB6;">9000 to 9999</td>
						<td style="background:#793DB6;">2450 to 2899</td>
						<td style="background:#793DB6;">1950 to 2199</td>
						
					</tr>
					<tr>
						<td style="background:#401070;">65% +</td>
						<td style="background:#401070;">10000 +</td>
						<td style="background:#401070;">2900 +</td>
						<td style="background:#401070;">2200 +</td>
						
					</tr>
				</tbody>
			</table>
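For anyone who wants the WN9 column as a lookup rather than a table, here is a minimal sketch. The thresholds and hex colors are copied from the WN9 column of the table above; the names `WN9_LOWER_BOUNDS`, `WN9_COLORS` and `wn9_color` are made up for this example, and the scale itself is only the provisional one from this post.

```python
from bisect import bisect_right

# Lower bound of each band above the lowest one, from the WN9 column.
WN9_LOWER_BOUNDS = [600, 750, 900, 1100, 1300, 1500, 1750, 1950, 2200]

# One color per band, lowest band first (ten bands for nine boundaries).
WN9_COLORS = [
    "#930D0D",  # under 600
    "#CD3333",  # 600 to 749
    "#CC7A00",  # 750 to 899
    "#CCB800",  # 900 to 1099
    "#849B24",  # 1100 to 1299
    "#4D7326",  # 1300 to 1499
    "#4099BF",  # 1500 to 1749
    "#3972C6",  # 1750 to 1949
    "#793DB6",  # 1950 to 2199
    "#401070",  # 2200 and above
]


def wn9_color(rating):
    """Return the hex color of the band that contains `rating`."""
    return WN9_COLORS[bisect_right(WN9_LOWER_BOUNDS, rating)]


print(wn9_color(1698))
```

`bisect_right` counts how many lower bounds are less than or equal to the rating, which is exactly the index of the band, so a value sitting on a boundary such as 600 lands in the higher band.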

 

Battles: 148
Adjusted WR: 57.8
WN9 c2: 1698

 

Hmm... I never thought that me being teal would be possible... Guess I'm ok with that!


Per-tank normalization

I have a plan for normalizing the per-tank values. I'm going to choose a list of popular tanks from tier 5 to 10, and then normalize all the results so that the chosen tanks have an average intercept of 0 and an average scale of 1. The results probably won't differ much from the current method but they'll be easier to replicate with different datasets.

I took a look at the params file you posted online, showing the slopes and intercepts for every tank. Are the slopes and intercepts only applied for the wn9c2 method, or are they used for the wn9c1 version as well? The reason I mention it is the issue of sustainment: if the new rating requires action every time a new tank is created, or periodically when tanks are buffed and nerfed, then you are never done, and with every update you do, the seagulls will fly in and crap on you if they didn't like what you did to their favorite tank.

Lesson learned is that if it's 'good enough' to go with a version of wn9 that doesn't use per tank stats, but instead is based on expected values that are kind of tier averages, do it. You'll be glad you did...

Having said that, if you decide to stick with the 'per tank' version you will need to publish a script which would allow a third party to calculate the slopes and intercepts from a known datasource; if you don't do that, some poor sucker will have to try and figure that out 6 months after you've vanished, to keep WN9 alive.

Great work RN, really looking forward to seeing this roll out and WN8 go away


Simply my opinion, I just don't see the point in a new metric when most people stay in the exact same area. The point would be to eliminate a calculated portion of really early games for a more accurate representation. I'm sorry, I apologize, I know it's not about me of course, just kinda disappointed my tier 10 wn9c2 average is at 2k and I'm barely over 1750 overall.

Edited by kariverson

Simply my opinion, I just don't see the point in a new metric when most people stay in the exact same area. The point would be to eliminate a calculated portion of really early games for a more accurate representation. I'm sorry, I apologize, I know it's not about me of course, just kinda disappointed my tier 10 wn9c2 average is at 2k and I'm barely over 1750 overall.

WN8 gets padded a shit ton, which makes it totally irrelevant, especially for recent stats. WN9 will fix this.


Would it be valuable to use position on team as part of wn9? So even if you don't win but are in the top say 3 you do better in a wn9 sense than if you are in the bottom 3. Even though you had a shit team and you lost.

Does the API even make available position in team?
