Praetor77

WN8 breakthrough


Not sure about the expected 7201 values; they'll probably need to be pulled down to E100-esque values to make sense. I don't reasonably see how it's expected for a bad E-100 clone to do so much more damage, but I've been wrong before. Seems like an oddity to me; the other CW reward tank (M60) has pretty reasonable expectations.

 

Yeah, it seems high to me too. However, that's what the numbers show. Nevertheless, all the new high-tier tanks have slightly "high-looking" values: STB-1, Type 61, STA-1, 7201. The same goes for the Conq GC.

 

As I said, as a solution we can change these values to the same numbers as similar tanks, or we can leave them as is. Currently, the 7201 is the highest WN8-receiving tier 10 tank on vbaddict. The new exp1 is 13% higher than before, but the exp2 is 40% higher than before. These numbers will obviously drop as people play the tank more, which should level it out. People are getting crazy numbers in this tank though (you and me included). Don't know if it's the tank itself, the changing meta influencing the numbers, or what.

 

A fully automated expected stat procedure is highly appealing to me. These tanks show odd numbers, but they are also quite amply represented. The number of players and games isn't so low that it prevents accurate exp stat calculation, as happens with other tanks (like the T7 Combat Car), so maybe this is really due to a combination of the uniqueness of the playerbase of those tanks and the changes in meta over time. The worst thing that happens if we leave the "high" numbers is that some superunicums get slightly reduced WN8 numbers, so not really a big deal IMHO.

Link to post

I compared the exp1 damage values in the download you posted above with the current expected damage values for the 'live' single-parameter system. They are all lower than the current values; on average the new exp1 is just 65% of the current one. That's a huge reduction.

 

This is a surprise, as the analysis we did on the current expected values showed that some values (especially the WTE-100 damage) needed to be significantly higher. Overall, the analysis I gave you recommended an average rise of 4% for the damage values. I can't see how the exp1 damage values could therefore be set lower than the current ones. Unless I'm misreading your new formula, a player who has rSTATs up to 1.0 would get a massive increase in WN8 (for damage at least) using those new numbers.

 

However, I really think it would be best to table work on a new formula until such time as you have managed to publish a new set of expected values for the current system using all the analysis we did. Fix the expected values for the system we have, then we will all have a clearer view of the problem we are trying to solve with a new system.

Link to post

Tbh I only look at recent WN to compare myself to others. Sure, I might have a shit overall, but if I'm not plateauing and the other person is, then I feel better about how I continue to improve at the game. 60D and recent performance in an upward climb toward purple or deep purple, and maintaining it, is what I look at.

Link to post

These exp1 values are supposed to fit 900 WN8 players; the old ones were supposed to fit 1565 WN8 players. To the right of the new exp1 values, I compared them to 0.75 of the old expected values (which was supposed to fit 900 WN8 players).
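To make that comparison concrete, here is a minimal sketch in Python. It assumes the comparison is simply new exp1 divided by 0.75 times the old expected value; the function name and the numbers are made up for illustration and are not taken from the actual table.

```python
def ratio_to_scaled_old(new_exp1: float, old_exp: float, scale: float = 0.75) -> float:
    """Ratio of a new exp1 value to the old expected value scaled down
    to the 900-WN8 level (scale = 0.75, as described in the post)."""
    if old_exp <= 0 or scale <= 0:
        raise ValueError("old_exp and scale must be positive")
    return new_exp1 / (scale * old_exp)


# Hypothetical numbers, NOT from the real expected-value table.
old_expected_damage = 2000.0
new_exp1_damage = 1300.0

# Comparing against the unscaled old value makes exp1 look like a big cut...
raw_ratio = new_exp1_damage / old_expected_damage
# ...while comparing against 0.75 * old is the like-for-like comparison.
scaled_ratio = ratio_to_scaled_old(new_exp1_damage, old_expected_damage)

print(f"raw ratio: {raw_ratio:.3f}, ratio vs 0.75 * old: {scaled_ratio:.3f}")
```

A ratio near 1.0 in the second comparison would mean the new exp1 agrees with what the old table implied for a 900 WN8 player.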

 

 

Also, I directly abandoned the old search for improved expected values, since I realized it is impossible to properly fit the whole population of rSTAT distributions into one expected value. Any improvement we could come up with would be a very small one, since a single linear equation describes the distributions of rSTATs amongst players of varying WN8 quite poorly. It wouldn't get rid of the current problems with high tier tanks, light tanks, etc. I am using the same database we put together, but now determining two expected values instead of one, which works MUCH better IMHO, spitting out values that make a lot of sense on both ends of the spectrum (average and unicum).

Link to post

Again, that shows your lack of understanding.  Guess which of us is better educated?

 

 

If you are knowledgeable of statistical analysis please throw your hat into the ring. I'm an economist who does statistical analysis in the real world and this is... work.

Link to post

If your knowledgeable of statistical analysis please throw your hat into the ring. I'm an economist who does statistical analysis in the real world and this is... work.

 

I second this. I was thinking it before, but didn't post because I had a million other thoughts happening at the same time.

Link to post

I still think there should be two ratings - WN8 style estimate rating so it's useful regarding the unwashed masses and one that uses only vbaddict data with all stats to promote uploading your dossier etc.

 

Then make the estimate on the slightly low side, anyone that cares will have ADU in no time.

Link to post

If your knowledgeable of statistical analysis please throw your hat into the ring. I'm an economist who does statistical analysis in the real world and this is... work.

 

Well, economists aren't exactly known for accurate application of statistics.

Link to post

No bitching intended, I appreciate your work a lot!

I know the difference is not THAT big, but still... Average Joe 900 is supposed to have the highest T10 medium tank DPG... in his... Leopard 1 (alongside the STB-1)?

 

In general, many if not most of the less played (not yet grinded by the masses) and/or more "difficult" tanks seem to have slightly too high requirements for a WN900. WT@PZ IV needing 200-250 more damage than most T9 TDs to get there? Again: for a WN900 player? SEEMS not very realistic to me.

Link to post

I have lately seen several players with terrible WN8 who have decent DPG in their German TDs. Dunno, it's what the data says... Anyway, as I said, this requires at least one more round of WN8 calculation and exp table re-calculation.

Link to post

Thank you

Link to post

If your knowledgeable of statistical analysis please throw you're hat into the ring. I'm an economist who does statistical analysis in the real world and this is... work.

fixed?

Link to post

sorry for bed english

 

when were you when statistic dies?

 

i was sat at home eating smegma butter when reed form

 

'statistic is kill'

 

'no'

Link to post

AHAHAHAHA

Link to post

Not sure about the expected 7201 values; they'll probably need to be pulled down to E100-esque values to make sense. I don't reasonably see how it's expected for a bad E-100 clone to do so much more damage, but I've been wrong before. Seems like an oddity to me; the other CW reward tank (M60) has pretty reasonable expectations.

The 7201 specifically has roughly 40k entries, whereas the E100 has 639k (?, I think). With only 300 of them issued... some folks sold them... most folks play them once a day, if that. Most of the people with VKs have or have had E100s and know how to deal with the long reload, the angles, etc. If the unwashed masses had VKs, the number would go WAY down, because most of them would not have a clue about it and would play piss poorly. The reload is 2.XX seconds longer than the E100's and the hit points are lower, so yeah, it likely needs to come down a bit to a more normalized number.

Link to post

Remember that (sadly) history also plays a hand in how the expected numbers come out. E100 used to play tier 6 tanks. The HP available in those matches was WAY lower than what it is today. Not to mention the changing meta, and also the very unique playerset who play the VK. 

 

 

I would really like more feedback on the table, is it better overall? Is it worse? What do we do with the "weird" values? 

Link to post

As for historical balance changes:

 

Wouldn't it make sense to base the expected values on recent data only, to make recent WN8 as accurate as possible? I'd say that recent stats trump lifetime stats in relative importance.

 

I understand that the API doesn't provide time stamps, but a lot of stats sites do calculate WN8 based on recent results. And that's what most people (?) will be looking at.

Link to post


I commented, but now I'm confused by your response; please either post the entire formula that these new exp values are applied to, or link to the post that mentions it. All the ones I found say nothing about changing the baseline rSTAT value to align with the 900 WN8 player. If that is the case, it would require a change to the parameters in the WN8 equation. If so, please (re)post.

Link to post

Thinking about the 7201, it occurred to me that perhaps part of what is factoring into tank ratings is not about the tank or the driver.

 

Tanks that are new or rare are relatively unknown to the enemy team. The enemy might not know the tank's strengths and weaknesses.

This may have a real impact on the performance of the tank.

 

I am sure far more people know how to play vs an E100 compared to a 7201.

If a new tank popped into existence that had the exact same properties as the E100, but looked a little bit different, I guarantee it would have substantially better stats until the enemy drivers learned about its properties. This would never happen if the tank were sufficiently rare.

Link to post

As for historical balance changes:

 

Wouldn't it make sense to base the expected values on recent data only, to make recent WN8 as accurate as possible? I'd say that recent stats trump lifetime stats in relative importance.

 

I understand that the API doesn't provide time stamps, but a lot of stats sites do calculate WN8 based on recent results. And that's what most people (?) will be looking at.

This is a nice discussion, although there is no real answer or way to measure only recent stats. Let me explain.

 

1- This data is based on people's dossiers, which is why it is as large as it is (around 128 million games). However, for each tank, it only contains aggregate data. We don't know if the guy played the tank yesterday or 2 years ago. To be able to filter only recent data, Phalynx would need to keep a database of dossiers, and only give me the recent games for each tank for each player. This would shrink our database by a factor of hundreds, making the sample size unfit for calculating expected values for most tanks.

2- Using only recent values would be bad at measuring people with a lot of games, since for each player we only have aggregate stats to calculate WN8 with.

 

 

 

I commented, but now I'm confused by your response; please either post the entire formula that these new exp values are applied to, or link to the post that mentions it. All the ones I found say nothing about changing the baseline rSTAT value to align with the 900 WN8 player. If that is the case, it would require a change to the parameters in the WN8 equation. If so, please (re)post.

 

Ummm, nope nope nope. Looks like I am terrible at explaining things. The baseline values are the ones we used in the rSTATc formula (which I now call exp0):

rWINc = max(0, (rWIN - 0.71) / (1 - 0.71) )

rDAMAGEc = max(0, (rDAMAGE - 0.22) / (1 - 0.22) )

rFRAGc = max(0, min(rDAMAGEc + 0.2, (rFRAG - 0.12) / (1 - 0.12)))

rSPOTc = max(0, min(rDAMAGEc + 0.1, (rSPOT - 0.38) / (1 - 0.38)))

rDEFc = max(0, min(rDAMAGEc + 0.1, (rDEF - 0.10) / (1 - 0.10)))

 

So, .1 for rDEF, .22 for rDAMAGE, etc.
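For anyone who wants to play with those baselines, here is a minimal Python sketch of the rSTATc normalization exactly as written above. It assumes each rSTAT input is already the ratio of the player's stat to the expected stat; the function name and dictionary keys are only illustrative.

```python
def normalize_rstats(r_win, r_damage, r_frag, r_spot, r_def):
    """Apply the baseline (exp0) offsets and the damage-linked caps
    from the rSTATc formulas quoted in the post."""
    r_win_c = max(0.0, (r_win - 0.71) / (1 - 0.71))
    r_damage_c = max(0.0, (r_damage - 0.22) / (1 - 0.22))
    # Frags, spots and defense are capped relative to normalized damage,
    # so they cannot run far ahead of rDAMAGEc.
    r_frag_c = max(0.0, min(r_damage_c + 0.2, (r_frag - 0.12) / (1 - 0.12)))
    r_spot_c = max(0.0, min(r_damage_c + 0.1, (r_spot - 0.38) / (1 - 0.38)))
    r_def_c = max(0.0, min(r_damage_c + 0.1, (r_def - 0.10) / (1 - 0.10)))
    return {
        "rWINc": r_win_c,
        "rDAMAGEc": r_damage_c,
        "rFRAGc": r_frag_c,
        "rSPOTc": r_spot_c,
        "rDEFc": r_def_c,
    }


# A player exactly at the expected values on every stat.
at_expected = normalize_rstats(1.0, 1.0, 1.0, 1.0, 1.0)
# A player at the damage baseline: the caps pull the other stats down.
low_damage = normalize_rstats(1.0, 0.22, 1.0, 1.0, 1.0)

print(at_expected)
print(low_damage)
```

The second case shows why the caps matter: with rDAMAGE at the 0.22 baseline, frags, spots and defense are limited to 0.2, 0.1 and 0.1 no matter how high the raw ratios are.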

 

For calculating exp0 numbers with the new tables, I need to re-calculate the WN8 of the 115k player dataset and regress with the new formula to see what the baseline values are.

Before that, I need to come up with a new rSTAT formula based on the two-value system. I posted my first version of what it would look like a few posts back. In that formula I now consider rSTAT for a WN8 900 player to be 1 instead of 0.75, but the math is quite simple if we wanted to keep it as it was before. If we change to rSTAT(900) = 1 and rSTAT(2350) = 2, then obviously the weights in the final formula will change.

Then I need to use Eureqa to re-assign the weights to each rSTAT in coming up with the final formula. Then, reevaluate the baselines. Then re-evaluate the expected stat table.
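The actual two-value formula is in an earlier post and is not reproduced here, so the following Python sketch is only one possible reading of it: a straight line through the two anchor points, with rSTAT = 1 at exp1 (the 900 WN8 level) and rSTAT = 2 at exp2 (the 2350 WN8 level). The function name, the linear extrapolation outside the anchors, and the example numbers are all assumptions.

```python
def two_point_rstat(stat, exp1, exp2, r_low=1.0, r_high=2.0):
    """Map a raw per-tank stat onto an rSTAT scale using two expected values.

    exp1 is the expected value for a 900 WN8 player (mapped to r_low),
    exp2 is the expected value for a 2350 WN8 player (mapped to r_high).
    Linear interpolation and extrapolation is an assumption of this sketch,
    not the formula from the thread.
    """
    if exp2 <= exp1:
        raise ValueError("exp2 must be greater than exp1")
    return r_low + (r_high - r_low) * (stat - exp1) / (exp2 - exp1)


# Hypothetical expected damage values for one tank.
exp1_damage, exp2_damage = 1000.0, 2000.0

for avg_damage in (1000.0, 1500.0, 2000.0):
    print(avg_damage, two_point_rstat(avg_damage, exp1_damage, exp2_damage))
```

Keeping the old convention would just mean passing a different `r_low` (for example 0.75), which is why the weights in the final formula would have to be refitted if the anchors change.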

 

That will give us a final table which should be similar to this one, but after all is said and done, updating the table in the future should be child's play, since it will be 100% automatic, except for new tanks.

 

In the meantime, I am asking for players to please take a look at the table, and critically evaluate the numbers based on WoT common sense and knowledge.

 

 

Thinking about the 7201, it occurred to me that perhaps part of what is factoring into tank ratings is not about the tank or the driver.

 

Tanks that are new or rare are relatively unknown to the enemy team. The enemy might not know the tank's strengths and weaknesses.

This may have a real impact on the performance of the tank.

 

I am sure far more people know how to play vs an E100 compared to a 7201.

If a new tank popped into existence that had the exact same properties as the E100, but looked a little bit different, I guarantee it would have substantially better stats until the enemy drivers learned about its properties. This would never happen if the tank were sufficiently rare.

 

Yeah, although it is simple speculation, it makes sense as another factor that could be skewing the 7201 expected stats.

Link to post

Seems like there should be a uniform procedure in place for handling tanks that need some form of adjustment.

 

 

Examples of some possible ideas that might make sense:

 

3 months before a new tank gets a non-probationary rating (to minimize the freak-outs, since people should know the rating is not permanent in the least).

New tanks always get a probationary rating halfway between the closest two tanks. Arbitrary and inaccurate, but not open to bias.

A decaying method of moving a tank's rating from its pre-change numbers to its post-change numbers.

(The Waffen 100's rating should decay to its new performance rating much faster than the Foch 155's.)

That is a bit elaborate, but something along those lines.
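A rough Python sketch of the last two ideas, just to show how little machinery they would need. Everything here is an assumption for illustration: the function names, the use of a plain midpoint for the probationary value, the exponential (half-life) form of the decay, and counting the decay in games rather than days.

```python
def probationary_expected(neighbor_a, neighbor_b):
    """Probationary expected value for a new tank: the midpoint between
    the expected values of the two closest existing tanks."""
    return (neighbor_a + neighbor_b) / 2.0


def decayed_expected(old_value, new_value, games_since_change, half_life_games):
    """Blend the pre-change expected value into the post-change one.

    The weight on the old value halves every `half_life_games` games
    recorded after the balance change (hypothetical decay rule).
    """
    if half_life_games <= 0:
        raise ValueError("half_life_games must be positive")
    weight_old = 0.5 ** (games_since_change / half_life_games)
    return weight_old * old_value + (1.0 - weight_old) * new_value


# Hypothetical numbers only.
print(probationary_expected(1800.0, 2000.0))
for games in (0, 1000, 5000):
    print(games, decayed_expected(2000.0, 1600.0, games, half_life_games=1000))
```

A tank whose performance changed sharply would get a short half-life and a mildly changed one a longer half-life, which is one way to express the "decay much faster" idea above.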

 

 

 

 

I see this system dealing with varied ranges of skill caps on tanks, but how does this method help with the WoT nerf cycle on tanks?

The M4 HEAT spam days are very different from its post-nerf days.

How can you do any better than being equally inaccurate for both sets of eras for that tank?

 

Even though this type of error is smaller in magnitude than WN7's documented tier errors, it benefits players with more longevity in the game (particularly people like me who would only play an OP tank) in a reasonably consistent way.

 

Errors with a consistent bent are worse than errors that are more random in nature it seems to me.

 

Perhaps a penalty for all players who have more than 2768 games played that slides up with however many games I play in a day.

Link to post

@Garbad: How about creating your own rating?

 

@Praetor:

 

edit

 

 

That aside, I took a short look (2 min) and the numbers for some T8 heavies compared to meds (unicum numbers) look off. ~1600 DpB for the Löwe or KT is easy, but 1350 for the T44 is too high, especially considering that I play my T44 at ~60% WR solo random without gold spam and achieve this with 1300 DpB, while I need 1900 DpB in said T8 heavies to achieve between 57-59% WR. All random solo. You need 2K+ DpB to reach 60%+ WR solo random in a T8 heavy on the EU server.

 

 

Edit2:

 

Numbers for some artis (old mid/high tier) at unicum level are too low: ~1650 DpB for T7 (S51 or GW-P) will make all old-time mediocre arty players look too good. Just compare that with the expected values used atm for WN8 vers. 14.

Link to post

Thx Folter, I will keep that in mind. Let's see if/how those values change with another iteration of WN8 calculation and exp stat re-calculation.

Link to post
