Why can't R use the same language as MATLAB? FML


I also managed to crash it trying to sort that huge dataset. I am assuming that tankid is defined underneath countryid when it comes to tank type?

No. Ignore the tankid field in the vbaddict data; the real WG tank ID is called compDescr (or IDNum in some datasets).

So if I sort by that I get all the data for the same tank?

Correct - for example, 13825 = T-62A
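
For what it's worth, a rough sketch of pulling out every row for one tank by compDescr. The toy data frame and its numbers are made up for illustration (99999 is not a real ID); only the column names and 13825 = T-62A come from this thread.

```r
# Toy stand-in for the dossier data; rows and values are invented
toy <- data.frame(
  userid    = c(1, 2, 3, 4),
  compDescr = c(13825, 99999, 13825, 99999),  # 99999 is a placeholder ID
  battles   = c(120, 80, 60, 200),
  stringsAsFactors = FALSE
)

# Keep only the rows for one tank (13825 = T-62A)
t62a <- toy[toy$compDescr == 13825, ]

# Or sort the whole table so rows for the same tank sit together
toySorted <- toy[order(toy$compDescr), ]
```

Subsetting is usually cheaper than sorting the whole thing, so it may be the safer option on the huge dataset.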


+ the code below produces those damage plots for all the tier tens

#dossier data was already imported from the csv file; rename it to the name used in every script

dataMaster <- vbaddict_dossiers_2014.12.11

#apply 50 battle filter (keep rows with more than 50 battles)
userTankStats <- dataMaster
userTankStats <- userTankStats[userTankStats$battles > 50,]
userTankStats$damage_dealt <- as.double(userTankStats$damage_dealt)
userTankStats <- userTankStats[, c("server", "userid", "compDescr", "title", "type", "tier", "countryid",
                                   "battles", "victories", "damage_dealt", "frags", "spotted",
                                   "defence_points", "capture_points", "survived",
                                   "overall_battles", "overall_winrate")]

# number of battles in dataset

#calc average stats
userTankStats$aWIN <- 100*userTankStats$victories/userTankStats$battles
userTankStats$aDAMAGE <- userTankStats$damage_dealt/userTankStats$battles
userTankStats$aFRAG <- userTankStats$frags/userTankStats$battles
userTankStats$aSPOT <- userTankStats$spotted/userTankStats$battles
userTankStats$aCAP <- userTankStats$capture_points/userTankStats$battles
userTankStats$aDEF <- userTankStats$defence_points/userTankStats$battles

#plot winrate vs average damage for all tier tens (needs ggplot2 loaded)
library(ggplot2)
w <- ggplot(userTankStats[userTankStats$tier == 10,], aes(x = aDAMAGE, y = aWIN))
w + geom_point(alpha = 0.05) + facet_wrap(~title)

(The first few lines just rename a specific dataset to the name I use for a dataset in every script)
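
If the dossier data is still sitting in a CSV rather than already loaded in the workspace, something like the sketch below should get it into dataMaster. The file here is a throwaway temp file with made-up rows so the snippet runs on its own; csvPath is a hypothetical name, and in practice it would point at the real export.

```r
# Write a tiny made-up CSV to a temp file so this runs on its own;
# in practice csvPath would point at the real dossier export
csvPath <- tempfile(fileext = ".csv")
writeLines(c("userid,compDescr,battles,victories,damage_dealt",
             "1,13825,120,66,240000",
             "2,13825,40,18,60000"), csvPath)

# Read it in under the name used in the rest of the script
dataMaster <- read.csv(csvPath, stringsAsFactors = FALSE)

# Same steps as the script: filter on battles, then per-battle averages
userTankStats <- dataMaster[dataMaster$battles > 50, ]
userTankStats$aWIN <- 100 * userTankStats$victories / userTankStats$battles
userTankStats$aDAMAGE <- userTankStats$damage_dealt / userTankStats$battles
```

Only the first made-up row survives the filter here, since the second has fewer than 50 battles.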

