Gryphon_

WN8 Expected Values Update


WN8 was intended to be updated every patch or so. In discussion elsewhere it was decided that, as Praetor isn't around and didn't finish the '2 point system', we had better go ahead and update the expected values for the current '1-point system'.

 

This thread is not for discussion on changes to WN8, it is just to peer review potential updates to the Expected Values table.

 

I worked with Praetor crunching numbers for WN8. The data used is from vbaddict, thanks to Phalynx. The dataset is filtered as follows:

 

> 2500 battles per account

> 50 battles for any user in any tank

user global winrate is between 51.0 and 56.9%
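For anyone wanting to reproduce the filter, here is a minimal sketch in R on made-up rows. It assumes the column names that the script further down selects from the vbaddict export (`battles`, `overall_battles`, `overall_winrate`), that winrate is stored in percent, and that the 51.0 and 56.9 bounds are inclusive. None of that is confirmed by the dataset itself, so treat it as an illustration only.

```r
# Toy per-tank rows; values are invented for illustration.
stats <- data.frame(
  userid          = c(1, 1, 2, 3, 4),
  battles         = c(120, 40, 300, 75, 90),         # battles in this tank
  overall_battles = c(5000, 5000, 1800, 9000, 4000), # battles on the account
  overall_winrate = c(53.2, 53.2, 55.0, 49.5, 56.9)  # account winrate, percent (assumed)
)

# The three filters from the list above, applied row by row.
keep <- stats$overall_battles > 2500 &
        stats$battles > 50 &
        stats$overall_winrate >= 51.0 &
        stats$overall_winrate <= 56.9

filtered <- stats[keep, ]
print(filtered)
```

Only the rows for users 1 (the 120-battle tank) and 4 survive: the others fail the per-tank, per-account, or winrate filter respectively.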

 

The methodology I originally proposed - in the first 10 pages of this thread - has been abandoned, but over the next 10 pages of forum help I developed a much better one that very closely mirrors the guidance on the wnefficiency.net wiki.

 

The method has been scripted in R, and the new values produced by that script are close to the current ones, but correct some problems in mid tiers and high tiers. A very few tanks that didn't have enough data have been given new 'plug number' values from a similar tank - the 'nearest neighbor' method.

 

The values we ended up with are here

 

The R script I wrote to generate the new expected values automatically is here:

 

#WN8 Expected Values Calculator by Gryphon
 
#load data from csv file on HDD
dataMaster <- read.csv("xxxxxxxxxx.csv")
any(is.na(dataMaster))
head(dataMaster)
nrow(dataMaster)
 
#apply 50 battle filter
userTankStats <- dataMaster[dataMaster$battles > 50,]
userTankStats$damage_dealt <- as.double(userTankStats$damage_dealt)
userTankStats <- userTankStats[,c("server", "userid", "compDescr","title", "type", "tier", "countryid", "battles","victories","damage_dealt","frags","spotted","defence_points","overall_battles","overall_winrate")]
any(is.na(userTankStats))
 
# number of battles in dataset
sum(userTankStats$battles)
 
#calc actuals
userTankStats$aWIN <- 100*userTankStats$victories/userTankStats$battles
userTankStats$aDAMAGE <- userTankStats$damage_dealt/userTankStats$battles
userTankStats$aFRAG <- userTankStats$frags/userTankStats$battles
userTankStats$aSPOT <- userTankStats$spotted/userTankStats$battles
userTankStats$aDEF <- userTankStats$defence_points/userTankStats$battles
any(is.na(userTankStats))
 
#load current expected values from csv file on HDD
expectedValues <- read.csv("~/R/WN8/expected_values_current.csv")
#expectedValues$compDescr <- as.factor(expectedValues$compDescr)
any(is.na(expectedValues))
 
# add the expected values data to the user data
userTankStats <- merge(x=userTankStats, y=expectedValues, all=FALSE )
# fix chars that upset file naming
userTankStats$title <- chartr("*/", "_-", userTankStats$title)
any(is.na(userTankStats))
 
# calculate the user rSTATS
userTankStats$rWIN <- userTankStats$aWIN/userTankStats$eWIN
userTankStats$rDAMAGE <- userTankStats$aDAMAGE/userTankStats$eDAMAGE
userTankStats$rFRAG <- userTankStats$aFRAG/userTankStats$eFRAG
userTankStats$rSPOT <- userTankStats$aSPOT/userTankStats$eSPOT
userTankStats$rDEF <- userTankStats$aDEF/userTankStats$eDEF
userTankStats$rWINproduct <- userTankStats$rWIN * userTankStats$battles
userTankStats$rDAMAGEproduct <- userTankStats$rDAMAGE * userTankStats$battles
userTankStats$rFRAGproduct <- userTankStats$rFRAG * userTankStats$battles
userTankStats$rSPOTproduct <- userTankStats$rSPOT * userTankStats$battles
userTankStats$rDEFproduct <- userTankStats$rDEF * userTankStats$battles
any(is.na(userTankStats))
 
# calculate the user rSTATc's
userTankStats$rWINc <- pmax(0,(userTankStats$rWIN - 0.71)/(1 - 0.71))
userTankStats$rDAMAGEc <- pmax(0,(userTankStats$rDAMAGE - 0.22)/(1 - 0.22))
userTankStats$rFRAGc <- pmax(0,pmin(userTankStats$rDAMAGEc + 0.2,((userTankStats$rFRAG - 0.12)/(1 - 0.12))))
userTankStats$rSPOTc <- pmax(0,pmin(userTankStats$rDAMAGEc + 0.1,((userTankStats$rSPOT - 0.38)/(1 - 0.38))))
userTankStats$rDEFc <- pmax(0,pmin(userTankStats$rDAMAGEc + 0.1,((userTankStats$rDEF - 0.10)/(1 - 0.10))))
userTankStats$rWINcproduct <- userTankStats$rWINc * userTankStats$battles
userTankStats$rDAMAGEcproduct <- userTankStats$rDAMAGEc * userTankStats$battles
userTankStats$rFRAGcproduct <- userTankStats$rFRAGc * userTankStats$battles
userTankStats$rSPOTcproduct <- userTankStats$rSPOTc * userTankStats$battles
userTankStats$rDEFcproduct <- userTankStats$rDEFc * userTankStats$battles
any(is.na(userTankStats))
 
# calculate the user WN8 per tank 
userTankStats$WN8 <- with(userTankStats, 980*rDAMAGEc + 210*rDAMAGEc*rFRAGc + 155*rFRAGc*rSPOTc + 75*rDEFc*rFRAGc + 145*pmin(1.8,rWINc))
userTankStats$WN8product <- userTankStats$battles * userTankStats$WN8
any(is.na(userTankStats))
 
# for each user, filter out the tanks whose WN8 is below the median WN8 of that user's tanks
require(plyr)
median.userTankStatsWN8 <- ddply(userTankStats, .(userid), summarise, median_WN8 = median(WN8, na.rm=TRUE))
userTankStatsFiltered <- merge(userTankStats, median.userTankStatsWN8, all=FALSE)
userTankStatsFiltered <- userTankStatsFiltered[userTankStatsFiltered$WN8 >= userTankStatsFiltered$median_WN8,]
nrow(userTankStatsFiltered)
any(is.na(userTankStatsFiltered))
rm(median.userTankStatsWN8)
 
#calculate the user account WN8, rSTATs, and rSTATc's
userAccountStats <- aggregate(cbind(WN8product,rWINproduct, rDAMAGEproduct, rFRAGproduct, rSPOTproduct, rDEFproduct, rWINcproduct, rDAMAGEcproduct, rFRAGcproduct, rSPOTcproduct, rDEFcproduct, battles) ~userid, userTankStatsFiltered, sum)
userAccountStats$user_WN8 <- userAccountStats$WN8product / userAccountStats$battles
userAccountStats$user_rWIN <- userAccountStats$rWINproduct / userAccountStats$battles
userAccountStats$user_rDAMAGE <- userAccountStats$rDAMAGEproduct / userAccountStats$battles
userAccountStats$user_rFRAG <- userAccountStats$rFRAGproduct / userAccountStats$battles
userAccountStats$user_rSPOT <- userAccountStats$rSPOTproduct / userAccountStats$battles
userAccountStats$user_rDEF <- userAccountStats$rDEFproduct / userAccountStats$battles
userAccountStats$user_rWINc <- userAccountStats$rWINcproduct / userAccountStats$battles
userAccountStats$user_rDAMAGEc <- userAccountStats$rDAMAGEcproduct / userAccountStats$battles
userAccountStats$user_rFRAGc <- userAccountStats$rFRAGcproduct / userAccountStats$battles
userAccountStats$user_rSPOTc <- userAccountStats$rSPOTcproduct / userAccountStats$battles
userAccountStats$user_rDEFc <- userAccountStats$rDEFcproduct / userAccountStats$battles
 
userAccountStats <- userAccountStats[,c("userid",  "user_WN8", "user_rWIN", "user_rDAMAGE", "user_rFRAG", "user_rSPOT", "user_rDEF", "user_rWINc", "user_rDAMAGEc", "user_rFRAGc", "user_rSPOTc", "user_rDEFc")]
any(is.na(userAccountStats))
 
#merge back
userTankStatsFiltered <- merge(userTankStatsFiltered, userAccountStats, all = FALSE)
any(is.na(userTankStatsFiltered))
 
# model for each tank the rDAMAGE per user, plot vs user overall rDAMAGE
 
# create table of compDescr and title as index for the loop
require(plyr)
listOfTanks <- ddply(userTankStatsFiltered, c("compDescr", "title"), summarize, total_battles = sum(battles))
any(is.na(listOfTanks))
 
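# loop to do linear regression for each rSTAT vs user account rSTAT, derive corrected expected values
# intercept + slope is the fitted per-tank rSTAT at user account rSTAT = 1; it is used as the correction factor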
newExpectedValues <- expectedValues
for (i in listOfTanks$compDescr){
    sample <- userTankStatsFiltered[userTankStatsFiltered$compDescr == i,]
    rDAMAGEmodel <- lm(rDAMAGE ~ user_rDAMAGE, data=sample)
    rDAMAGEcorrection <- rDAMAGEmodel$coef[[1]] + rDAMAGEmodel$coef[[2]]
    eDAMAGE_new <- round(rDAMAGEcorrection * expectedValues$eDAMAGE[expectedValues$compDescr == i], 2)
    newExpectedValues$eDAMAGE[newExpectedValues$compDescr == i] <- eDAMAGE_new
    rFRAGmodel <- lm(rFRAG ~ user_rFRAG, data=sample)
    rFRAGcorrection <- rFRAGmodel$coef[[1]] + rFRAGmodel$coef[[2]]
    eFRAG_new <- round(rFRAGcorrection * expectedValues$eFRAG[expectedValues$compDescr == i], 2)
    newExpectedValues$eFRAG[newExpectedValues$compDescr == i] <- eFRAG_new
    rSPOTmodel <- lm(rSPOT ~ user_rSPOT, data=sample)
    rSPOTcorrection <- rSPOTmodel$coef[[1]] + rSPOTmodel$coef[[2]]
    eSPOT_new <- round(rSPOTcorrection * expectedValues$eSPOT[expectedValues$compDescr == i], 2)
    newExpectedValues$eSPOT[newExpectedValues$compDescr == i] <- eSPOT_new
    rDEFmodel <- lm(rDEF ~ user_rDEF, data=sample)
    rDEFcorrection <- rDEFmodel$coef[[1]] + rDEFmodel$coef[[2]]
    eDEF_new <- round(rDEFcorrection * expectedValues$eDEF[expectedValues$compDescr == i], 2)
    newExpectedValues$eDEF[newExpectedValues$compDescr == i] <- eDEF_new
    rWINmodel <- lm(rWIN ~ user_rWIN, data=sample)
    rWINcorrection <- rWINmodel$coef[[1]] + rWINmodel$coef[[2]]
    eWIN_new <- round(rWINcorrection * expectedValues$eWIN[expectedValues$compDescr == i], 2)
    newExpectedValues$eWIN[newExpectedValues$compDescr == i] <- eWIN_new
}
any(is.na(newExpectedValues))
 
#load tank details from WG API
require(rjson)
url <- 'https://api.worldoftanks.com/wot/encyclopedia/tanks/?application_id=demo'
tanksAPIList <- fromJSON(file=url, method='C')
require(plyr)
tanks.df <- do.call("rbind.fill", lapply(tanksAPIList$data, as.data.frame))
tanks.df <- rename(tanks.df, c("tank_id" = "compDescr"))
tanks.df <- rename(tanks.df, c("name_i18n" = "title"))
tanks.df <- rename(tanks.df, c("type_i18n" = "type"))
tanks.df <- rename(tanks.df, c("level" = "tier"))
tanks.df$compDescr <- as.integer(tanks.df$compDescr)
 
# add tank details to results, and export as 'expected_values_<date>.csv'
newExpectedValues <- merge(newExpectedValues, tanks.df, all=FALSE)
newExpectedValues <- newExpectedValues[,c("compDescr",  "title", "tier", "nation", "type", "eFRAG", "eDAMAGE", "eSPOT", "eDEF",  "eWIN")]
any(is.na(newExpectedValues))
# Sys.Date() already returns a Date; paste0() avoids the spaces paste() would insert into the filename
date <- Sys.Date()
expected_value_filename <- paste0("~/R/WN8/expected_values_", date, ".csv")
write.csv(x=newExpectedValues,file=expected_value_filename ,row.names = FALSE)
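To make the correction step in the loop easier to review, here is a small worked sketch on invented numbers. It shows that adding the intercept and slope of the fit is the same as asking the model for the per-tank rDAMAGE of a player whose account rDAMAGE is exactly 1, and that this factor then scales the old expected value. The toy data and the old eDAMAGE of 1000 are hypothetical.

```r
# Toy sample for one tank: the per-tank ratio sits 0.1 above the account ratio.
toy <- data.frame(
  user_rDAMAGE = c(0.8, 1.0, 1.2, 1.4),  # account-level rDAMAGE per user
  rDAMAGE      = c(0.9, 1.1, 1.3, 1.5)   # same users' rDAMAGE in this tank
)

fit <- lm(rDAMAGE ~ user_rDAMAGE, data = toy)

# What the script does: intercept + slope.
correction_sum <- sum(coef(fit))

# Equivalent formulation: fitted value at user_rDAMAGE = 1.
correction_pred <- unname(predict(fit, newdata = data.frame(user_rDAMAGE = 1)))

old_eDAMAGE <- 1000                                  # hypothetical current value
new_eDAMAGE <- round(correction_pred * old_eDAMAGE, 2)

print(c(correction_sum, correction_pred, new_eDAMAGE))
```

In this toy case the factor comes out at about 1.1, so the expected damage would move from 1000 to 1100: players at account rDAMAGE 1 are overperforming in this tank, so its expected value goes up.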

 

Many of the links in the next 25+ pages are dead, to avoid having you read stuff that is now OBE (overtaken by events). The discussion on the final method and the results is from

 

Thank you for your patience - WN8 lives on, and will be updated at least every 3 months, and sooner if there are new tanks


ELC expected damage and frags have gone down significantly...


IS and IS-2 expected values have also split apart compared to the previous expected stats.


Any chance of a tank by tank expected value comparison?  I suppose I can do one while bored at work tomorrow if one does not already exist.



Not going to do them all, but here's some of tier 2:

 

bw7gaCy.png

 

Left is expected values I had before, right is the new stuff.

 

It seems that the expected values in general went down across every stat.  Only 10 tanks have new expected values for frags that put them in the same range of the old expected values, only 9 for damage.  Spots went down, defense went way down.  Would be interesting to see if this was a trend across other tiers.
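If someone wants to do the tank-by-tank comparison for every tier rather than by eye, a rough sketch in R could look like the following. The two small tables are invented stand-ins for the old and new expected-value CSVs; only the column names (`compDescr`, `eDAMAGE`, `eFRAG`) are taken from the script in the first post.

```r
# Hypothetical old and new expected values for three tanks.
old_ev <- data.frame(compDescr = c(1, 2, 3),
                     eDAMAGE   = c(100, 200, 300),
                     eFRAG     = c(1.0, 0.8, 0.9))
new_ev <- data.frame(compDescr = c(1, 2, 3),
                     eDAMAGE   = c(90, 220, 300),
                     eFRAG     = c(0.9, 0.8, 1.0))

# Join on the tank id; shared columns get _old / _new suffixes.
cmp <- merge(old_ev, new_ev, by = "compDescr", suffixes = c("_old", "_new"))

# Percentage change in expected damage and absolute change in expected frags.
cmp$eDAMAGE_pct <- round(100 * (cmp$eDAMAGE_new - cmp$eDAMAGE_old) / cmp$eDAMAGE_old, 1)
cmp$eFRAG_diff  <- cmp$eFRAG_new - cmp$eFRAG_old

# Biggest reductions first.
print(cmp[order(cmp$eDAMAGE_pct), ])
```

Merging the tank details (tier, type) onto `cmp` and aggregating `eDAMAGE_pct` by tier would show whether the downward trend holds outside tier 2.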


Your methodology is significantly different from Praetor's, as you can see below. Just glancing through the tanks whose numbers I'm familiar with, a lot of tanks have had their expected values go up quite significantly, even tanks that aren't new or haven't been nerfed/buffed recently. On the other hand, tanks like the Hellcat/ELC have had their expected values go down.

 

From the wn8 wiki:

Because WN8 was a per-tank rating, we needed data per tank, which as always is not available via the WG web API. We turned to Phalynx of vBAddict.net, who kindly handed over his database of 17k dossiers. The database was filtered for players with less than 1000 games played, and tanks that were played for less than 50 games. From this database we determined, using linear regression the stats to be expected on each tank for a median ability player. For each tank/player combination, we calculated playerWN8alpha and tankWN8alpha. WN8alpha was approximately WN7 in formulation, basically a means to measure per tank effectiveness. Afterwards, we filtered to the 50% of players who play that tank, who perform well ON THAT TANK, not overall. This incorporated a good mix of high win-rate and low win-rate players. We posit that using the top half of players in a given tank is a good way to compare tanks to each other, since they can squeeze out every last ounce of performance a tank has to offer. Otherwise, at the low end, you would be comparing tanks based on the performance of players who don't know basic mechanics, or how to properly use a given tank. That being said, I use the top 50% of players to do the linear regression, because simply using the top player values would be biased and not generalizable to the entire population.

To check that expected stats for each tank were balanced, we looked at the tankWN8/accountWN8 ratio. We checked that the players with top 10% tankWN8/accountWN8 corresponded to about 1.15 for all the tanks in the game.

When a tank had a lower ratio, for example, we lowered the expected values used to regress with the top 50% of players, and then checked what the top 10% ratio was. This took several iterations of recalculating tankWN8 and playerWN8 until a balance was reached, and tankWN8/accountWN8 was about 1.15.
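The balance check described in the wiki text could be sketched in R roughly as follows. This is a guess at the mechanics, not Praetor's actual code: it reads "top 10%" as the 90th percentile of tankWN8/accountWN8 per tank, and the 0.05 tolerance around 1.15 is a made-up threshold. The tank ids and WN8 figures are invented.

```r
# Toy data: five players on each of two hypothetical tanks.
ratios <- data.frame(
  compDescr  = rep(c(101, 202), each = 5),
  tankWN8    = c(1000, 1100, 1200, 1300, 1400,
                  950, 1000, 1050, 1100, 1200),
  accountWN8 = rep(1000, 10)
)
ratios$ratio <- ratios$tankWN8 / ratios$accountWN8

# 90th percentile of the ratio per tank (assumed reading of "top 10%").
top10 <- aggregate(ratio ~ compDescr, data = ratios,
                   FUN = function(r) unname(quantile(r, 0.9)))

# Flag tanks whose top-10% ratio is away from the 1.15 target.
target    <- 1.15
tolerance <- 0.05   # hypothetical
top10$needs_adjustment <- abs(top10$ratio - target) > tolerance

print(top10)
```

Here tank 101 comes out at 1.36 and would be flagged, while tank 202 at 1.16 is within tolerance. Following the wiki text, a ratio below target means lowering that tank's expected values and iterating; a ratio above target would presumably mean raising them.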

 

 

I'm assuming a lot of this is because Praetor tweaked a lot of the lower tier tanks' expected values up.

Like I said, his main methodology is also very different.


Praetor did some tweaking with the low tiers so that seal clubbing isn't rewarded, but I don't know what exactly. Is there a site where we could check the changes to low tier vehicles in the last patches? If, for example, there weren't any changes to the Loltraktor since 0.8.8, we shouldn't fiddle around with the numbers but keep the old ones.

 

Edit:

 

Just checked maybe ~15 tanks/artis ... something is wonky here. No way a 1565 Player manages these numbers.

 

- high tier artis are totally off, they would make sense (with much good will) for early 2011 not half a year past 0.8.6. 

- t10 heavies ranging from 1950 - 2500 DpB ...

- KT got nerfed (less agile, worse frontal armor, lower engine HP, -10m viewrange) but DpB increased by ~100 points? o.O

 

 

edit2:

 

Always keep in mind that since ~0.8.6 we have had a massive increase in high tier TDs "stealing" the available HP from other tank classes and bringing less HP into the game than same tier heavies, and still we have a massive increase in damage output by t10 heavies? Something doesn't fit here.


There's a lot of odd creep in the tier X med sets: 62A at 1999, 140 at 2200.

 

IS-6 creep at 1603, from 1415

 

the previous dataset started at 50.1,
 

 

 The database was filtered for players with less than 1000 games played, and tanks that were played for less than 50 games

 

 

 

OK, I got to procrastinating, so I made this to help people look things over: sortable, searchable, and with a comparison.

 

https://docs.google.com/spreadsheets/d/1NIsXTsk-cRqgolmrqdUUrmfqsAuCY1JB2P6uIOU-hAI/edit?usp=sharing


There's definitely something wonky going on here, the expected DPG for the IS-4 managed to creep up by more than 300 points, and I doubt the meta has changed so favourably for that tank...


Any need to reinvent the wheel?  

 

IMO leave it alone until the "Balancing patch" that 9.2 is supposed to be.

 

You will just need to redo it then anyway.  This will also give it time to see how the TD camo nerf changes the meta, if at all.  That is a huge wildcard, as the days of 4-5 arty, 7-8 heavies, 1 TD, and a couple of mediums might return.  Or not; we won't know until it has been active a couple of weeks.


Ok, compiled my comparison list.  Eliminated some of the rare premiums, simply because I didn't want to update my locally stored expected values table.

 

Some notes:

 

SU-85I gets massively increased expectations.  Nearly a half kill, nearly 5% win rate, 132 damage, .31 spot and .46 def.

 

STB-1 gets a huge exdmg increase of 708, and a nearly 9 point WR increase!  T10 meds in general have much higher expected damage rates.

 

MT-25 is expected to get over 1 more spot.  A-20 is expected to get over 1 less.

 

Sexton II's xdmg was cut by more than half.  

 

Sherman got big reductions across the board.  Those of us who used it during its golden age will get a big boost.  Ditto for the VK 36.01.

 

EVERYTHING at tier 10 has increases across the board.  Waffle, GWE, T92 and CGC spot less.  263 and 113 defend less.  Every other stat on every other 10 is up.

 

Similar story at 9.  Damage up everywhere, frags even or increased, WR increased for all but 4.  

 

At tiers 4 and below, the trend reverses sharply.  xvalues all down.

 

 

 

I have to say, I am greatly opposed to the changes as they currently stand.  They severely punish top tier play, while rewarding seal clubbing.  Without serious adjustment, I think these changes would invalidate WN8.

 

 

https://docs.google.com/spreadsheets/d/1CROBvAOl3EWWU83uxzYfN3dEJIPK9HjZOoMB09onVsI/edit?usp=sharing


Isn't this basically saying that using people with better win rates is thus reflecting on an entirely different dataset?  I would think this is still an improvement, but I would think that the old and the new data aren't comparable at all unless you stick to the exact same target winrate which would introduce a separate set of mistakes and problems.  

 

TLDR: to me it's an improvement in concept, but it's still not entirely accurate as people can find low WN8 expected values and wn8 farm among other issues. Effectively it can still be gamed, as a metric.

 

Also, has anyone looked into vbaddict's battle rating as an alternative? I'm aware of the WN9 planning. I don't want to threadjack, I just don't know if people are willing to consider it.



I have to say, I am greatly opposed to the changes as they currently stand.  They severely punish top tier play, while rewarding seal clubbing.  Without serious adjustment, I think these changes would invalidate WN8.

A possible theory for this is that because WN8 DID punish seal clubbing so hard, people have indeed stuck to higher tier play. So tier 8-10 play has gotten better since people were rewarded for playing it, and thus expected values increased. It will indeed present a very sharp change if these values are implemented, and will probably have a very negative impact on players who mostly play the higher tiers. Someone who *used* to seal club might in fact get an increase.


This really depends on what your goal is. It has always been pretty apparent to me that the authors were interested in rewarding high tier play and penalizing low tier play. My best vehicles are mostly my higher tier vehicles, even though the general level of competition is better. Your analysis seems to confirm that feeling. As such I suspect you will get a huge amount of push back as you will absolutely wreck a lot of people's stats. I support the change but fully expect to be in the minority on this one.


I'd be OK with an incremental increase, perhaps applying 25% of the change in expected damage output for higher tiers [ (new expected - old expected) * .25 ] (same for kills, WR, etc.). It will still be a change, but not a terribly dramatic one. I also feel we need more players included in the dataset, perhaps 10k? Doing this in small steps might account for trends within the game. Another possible thing to consider: there were no new tank lines in this last patch, so perhaps there were fewer low tiers to play as players moved back into older, more established lines?
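As a quick illustration of that incremental idea, here is a tiny R sketch. It assumes the bracketed expression is the increment added on top of the old expected value, which is one reading of the suggestion and not something anyone has agreed on. The function name `blend_expected` is made up; the IS-6 figures are the 1415 and 1603 quoted earlier in the thread.

```r
# Move only a fraction of the way from the old expected value to the new one.
blend_expected <- function(old, new, step = 0.25) {
  old + step * (new - old)
}

# IS-6 expected damage: old 1415, proposed 1603.
is6_blended <- blend_expected(1415, 1603)
print(is6_blended)
```

With a 25% step the IS-6 would move to 1462 rather than jumping straight to 1603, and the same function works on whole columns of the expected values table.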


My M5 Stuart has a 6250 WN in the new system...

My Cunningham, MT-25, Pz II, Luchs, Pz IV and Sherman all gain over 1k.

My T-50 triples, from ~1500 to over 4700.

Looks like my biggest drops are my waffles.  ~500 on the RHM, ~700 on the 4 and waffle grande.  

Most of my 10s drop 3-400 points, while my low tiers all jump.

Overall rating only changes by around 5 points.

 

I stick with my earlier assessment.  This update encourages seal clubbing, while discouraging high tier play.  I think that would be anathema to the spirit of WNX. 


Punishing high tiers and rewarding low tiers seems like a horrible idea.  I have friends that are red/orange WN8 that refuse to play above tier 6 because they lose money like crazy in any high tier games (one damaging shot and dead most games).  These people would get a huge stat boost to around the yellow/almost green range.  Someone that can only do one damaging shot at tier 8 before they die should not be rated yellow, much less green.


Cooking the numbers is not the way to encourage high tier play. I think a better approach would be to not include low tier vehicles in the ratings. Make it clear it is solely for measuring high tier play. Otherwise it is always compromising the impartial and repeatable nature of the measurement.


I don't see that as cooking the numbers at all.  There is an extremely noticeable skill gap between tiers 8-10 and tiers 1-6 (7 seems like a mix, not sure where I'd put it).  What WN8 tries to do is take into consideration the difference between difficulty in tier.  It might not be perfect, but grading it on a linear scale would be an incredible failure compared to the weighted curve of today's WN8.


Waiting till 0.9.2 won't solve anything. The numbers that came up now don't make much sense in many cases. We've got to figure out what the reason for this is, otherwise we'll have the same problem with 0.9.2.


I get that... But I'd think the effort would be better placed on the 'rebalance' patch.

 

I agree the numbers don't make much sense. Where is Praetor?

