bjshnog

⟪WN8⟫ Development / Resources


I've read countless posts about VBE, but have yet to see one graph or model to substantiate it.

I don't think the idea of pushing out a WN9 in a week is practical or necessary.

WN8 has its faults, but we are beginning to understand what they are. I think over the next few weeks the best use of our time is to post some graphs and models illustrating its faults and testing possible fixes for them.


 

If I had the technical ability to do the analysis necessary for VBE, then I am certain it would be convincing, just as the correlation between WN8 and rWINc was convincing to Praetor (yet we have now realised flaws, despite the improvement). Also, remember that there was nothing to substantiate WN8 either -- I had an idea and suggested it, Praetor liked it, he did the analysis and brought out a system. If anyone could do the calculations/analysis for VBE, then we'd likely get over a huge speed bump. Another thing you need to remember is that your original methodology that you clung to so strongly in your expected value update thread was flawed, and that could be seen directly from the concept, yet you took so long to create a different automated system it was ridiculous. If anything, that suggests you can't visualize or imagine a statistical concept without actually going through all the calculations first. I can, and that is how I know WN8 as a system is so flawed. It's probably to do with the amount of time and effort put in.

 

Well, the one week rating I mentioned wouldn't quite be WN9; I didn't mean to suggest that we quickly come up with two sets of expected values for all 4 main stats and shove it into a rating just to get it out quickly. I was thinking more along the lines of "DKR" (Damage/Kill Rating). In fact, there is only a small subset of the player base who would care about such a thing (and some of them don't play anymore). I merely suggested it as a minor side project for use by damage padders.

 

The thing about WN8's flaws is that a lot of people on this forum see exactly why it's wrong just from the numbers it outputs, and some of them can see it in their mind exactly the same way I can and do. We know why it's flawed, few of us know what cannot be fixed in such a simple construct, and even fewer of us know how to fix it. As a concept, VBE is strictly better than WN8, and would be more useful in every case. WN9 is very unlikely to beat it, and as such, should be a second priority.

 

 

And yes, I realise I haven't given you graphs, models or hard proof yet. It would be a circular argument to bring that up again. The best I can do is try to create some diagrams to explain the concept visually (which I said I'd do in the VBE thread).


The easiest and most promising way to improve WN8, as far as I understand it, is this:

 

 

 


Here's a relatively simple way of fixing skill scaling:
 
1. Generate expected values as usual, centred on 1565.
2. Go back through your player database, calculating per-tank WN8 values with the new expected values.
3. Throw out tanks below 50 games and then the bottom 50% of tanks for each player, as usual.
4. Calculate a "recent WN8" based on the remaining tanks.
5. Throw out any players below 2500 recent WN8.
6. Average the tanks of the remainder, and the WN8 of the players of that tank.
7. Normalize the ratio between the average tank WN8 and average player WN8 to give you a scale factor per point of WN8 from 1565: scalefactor = ((tankWN8 - 1565) / (playerWN8 - 1565))
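
Steps 3 to 7 could be sketched roughly as below. This is only an illustration under several assumptions that the steps above don't settle: the data layout (a dict of players, each with a list of per-tank records), the names `keep_top_half` and `scale_factors`, an unweighted mean for the "recent WN8", and rounding up when a player has an odd number of eligible tanks.

```python
from collections import defaultdict

CENTRE = 1565  # centre point from step 1


def keep_top_half(tanks, min_battles=50):
    """Step 3: drop tanks under min_battles, then drop the bottom half by WN8."""
    eligible = [t for t in tanks if t["battles"] >= min_battles]
    eligible.sort(key=lambda t: t["wn8"], reverse=True)
    # Assumption: with an odd count, the middle tank is kept.
    return eligible[: (len(eligible) + 1) // 2]


def scale_factors(players, min_recent=2500):
    """Steps 4-7: per-tank scale factor relative to the players of that tank."""
    tank_sum = defaultdict(float)    # sum of per-tank WN8
    player_sum = defaultdict(float)  # sum of the players' recent WN8
    count = defaultdict(int)
    for tanks in players.values():
        kept = keep_top_half(tanks)
        if not kept:
            continue
        # Step 4: "recent WN8" (assumed here to be an unweighted mean).
        recent = sum(t["wn8"] for t in kept) / len(kept)
        if recent < min_recent:  # step 5
            continue
        for t in kept:  # step 6: accumulate both averages per tank
            tank_sum[t["tank"]] += t["wn8"]
            player_sum[t["tank"]] += recent
            count[t["tank"]] += 1
    # Step 7: ratio of distances from the centre point.
    return {
        tank: (tank_sum[tank] / count[tank] - CENTRE)
        / (player_sum[tank] / count[tank] - CENTRE)
        for tank in count
    }
```

Since every surviving player has a recent WN8 of at least 2500, the denominator never reaches zero.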
 
Final results should be a bit like this:
 
https://docs.google.com/spreadsheets/d/1AII2UgMjkNopmIOC8WE-CUIuI2GS9Kax-qr-tqoxVrM/edit?usp=sharing
 
Mine's a bit distorted because I'm mixing Gryphon's expected values with my own database, but you get the idea. Fast tanks have scale factors above 1.0, while slow tanks have scale factors below 1.0. Low tiers are mostly garbage results due to lack of data, but you can guess or substitute 1.0 as appropriate.
 
Once you've got that, you add a final step to the WN8 calculation:
 
1. Sum battles*scalefactor over the player's tanks. Divide by total battles to give scaleAvg.
2. Adjust with the following formula:
 
scaledWN8 = 1565 + ((wn8 - 1565) / scaleAvg)
 
So for example, a 3000 WN8 player who's only played the Maus will get 3057 scaledWN8, and a 3000 WN8 player who's only played the T62A will get 2884 scaledWN8.
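
As a rough sketch of that final step: the function names and the `(battles, scalefactor)` pair layout are made up for illustration, and the two scale factors in the usage lines are back-solved from the Maus and T62A example above, not taken from the spreadsheet.

```python
CENTRE = 1565  # same centre point as the expected values


def scale_avg(tanks):
    """Battle-weighted average scale factor over (battles, scalefactor) pairs."""
    total_battles = sum(battles for battles, _ in tanks)
    return sum(battles * factor for battles, factor in tanks) / total_battles


def scaled_wn8(wn8, tanks):
    """scaledWN8 = 1565 + ((wn8 - 1565) / scaleAvg)"""
    return CENTRE + (wn8 - CENTRE) / scale_avg(tanks)


# Assumed factors, chosen so the results match the example figures.
slow_only = scaled_wn8(3000, [(500, 0.9618)])  # about 3057
fast_only = scaled_wn8(3000, [(500, 1.0879)])  # about 2884
```

A player at exactly 1565 is left unchanged whatever they play, since only the distance from the centre gets rescaled.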

 

 

Plus removing the WR term, of course.

Maybe call that WN8b. It shouldn't take much time and is a step forward.


 

While it balances the high and low end of ratings a little bit, a lot of tanks that are rated way too heavily by damage (scouts) are still being rated by damage the exact same amount. It doesn't really fix it, but it is one step forward (although you kind of have to take two steps back after it and another three steps forward).


 

What you have in mind with WN9 or VBE will be a totally new rating, as I understand it. That will take a lot of time and the result will be "uncertain". What RichardNixon proposed won't fix all of WN8's failings, but it will improve it with minimal effort.

 

After that we have all the time in the world to either work on VBE or develop something new.


 

For now, it's probably worth implementing RichardNixon's idea, as it is basically complete already.


Another way of getting to where RN is going is to just use the linear models I generated for every R stat for every tank. They each have a slope and an intercept. They were primarily used to identify the corrections needed to expected values to get the line to go through (1, 1), but there is no reason they couldn't be used to scale the score. If you like, I can supply some values later for you to play with.


Using the current expected values, with dataset filtered exactly how it was done for the update process, the link has a zipfile with a plot of every tank's WN8 vs user_WN8, and also a spreadsheet showing for all tanks the intercept and slope of the line of best fit.

 

For those interested in how WN8 scales on a per tank basis, you have a lot to look at....!

 

https://www.dropbox.com/s/9hdxilbvpgkb5zg/WN8vsUserWN8_Scaling_Info.zip
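
For anyone who wants to reproduce a slope and intercept for a single tank from their own data, an ordinary least-squares fit is one way to get a line of best fit. This is only a sketch, not necessarily how the spreadsheet was produced; `fit_line` is a made-up name and the sample points are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented sample: user WN8 on x, the same users' WN8 in one tank on y.
user_wn8 = [1000, 2000, 3000]
tank_wn8 = [900, 2100, 3300]
slope, intercept = fit_line(user_wn8, tank_wn8)
```

A slope above 1 would mean the tank's WN8 rises faster than the player's overall WN8, and a slope below 1 the opposite.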


Since this is an update at least equivalent to adding a tier penalty to WN6 to create WN7, you guys should move this into and post your ideas/results/information there.


 

I'm not sure that I agree, or I may misunderstand.

 

It looks like the method is scaling the actual WN8 outcomes, to create a situation where the WN8 for each tank scales roughly equally, which is the stated intention of WN8.  There might be problems with the approach or method, but if I understand it correctly it's an adjustment based on the observed effect not matching intended effect (that WN8 in certain tanks is far higher than WN8 in other tanks).


 

This is adding a modifier for tanks with low or high relative scaling. WN7 was reducing the scores of sealclubbers, which affected a minority of the player base. I am fairly certain that this change is more significant.


I guess that now WN8 is in sustainment mode it would be best to open up new threads to discuss potential new improved systems, whether they be evolutionary or revolutionary.


 

:( nothing in link.  I'd really like to look at that data.


I'll put it back tonight.

 

It's back up.


I forgot to mention, it's a 50 MB file, so please don't download it unless you really want to study 350 WN8 graphs...


When calculating the new rating (the scaled WN8, or WN9 -- may I remind you that is what Praetor called it as well, although calculating it in a slightly different way), do you just average the coefficients from the linear equations over tanks played and then transform inversely in some way?

 

Please excuse my terminology; I've completely forgotten how to use it.

 

EDIT: If this is the direction the next rating system is going to go in, let's move it .


So this WN8b (WN9)... are we going to have it implemented, at least for testing? It looks to be just about finished.

 

Seriously, it's long overdue.
