Android25

WNR - Discussion


1 hour ago, Android25 said:

Team performance factor ruins it. You get a coefficient based on the amount of damage your team caused to the enemy team. That's why in games where you do great but the rest of the team absolutely blows it and gets almost no damage, your stellar loss game is still only a few hundred xp. I could look into charting it but I suspect it will just be too full of noise.

Here are all the factors http://wiki.wargaming.net/en/Battle_Mechanics#Tank_Experience_and_Credits

Oh yeah I forgot about that, and it gives XP based on what the enemy team did as well. Damn

6 minutes ago, Assassin7 said:

Oh yeah I forgot about that, and it gives XP based on what the enemy team did as well. Damn

 

1 hour ago, Android25 said:

Team performance factor ruins it. You get a coefficient based on the amount of damage your team caused to the enemy team. That's why in games where you do great but the rest of the team absolutely blows it and gets almost no damage, your stellar loss game is still only a few hundred xp. I could look into charting it but I suspect it will just be too full of noise.

Here are all the factors http://wiki.wargaming.net/en/Battle_Mechanics#Tank_Experience_and_Credits

I would give it a try. XP has its flaws, but it has a very important feature: it punishes CHAI sniping. I think this can't be done with any other parameter.

Maybe something like XP*(dmg+assist) wouldn't be a bad idea. 


I don't see why damage assist less track damage wouldn't be far superior to xp when it comes to chai sniping. You still get xp when you do a lot of damage, but if you're straight up using your whole team to spot for you then you get zero assisted damage. Something more along the lines of x*sqrt(dmg*radio) + (x-y)*sqrt(dmg*track) where x and y are positive, radio is spotting assist, and track is track assist.
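A minimal Python sketch of the term proposed above, to make its behaviour concrete. The post only says x and y are positive, so the values x=1.0 and y=0.5 are placeholders, and the function name and the sample per-battle averages are hypothetical.

```python
import math

def assist_term(dmg, radio, track, x=1.0, y=0.5):
    """x*sqrt(dmg*radio) + (x-y)*sqrt(dmg*track) on per-battle averages.

    dmg   - average damage dealt
    radio - average spotting (radio) assist
    track - average track assist
    x, y  - placeholder coefficients, not values from the thread
    """
    return x * math.sqrt(dmg * radio) + (x - y) * math.sqrt(dmg * track)

# Made-up numbers: a player who only shoots targets the team spots for them.
sniper = assist_term(dmg=2000, radio=0, track=0)

# Made-up numbers: same damage, but the player also spots and tracks.
active = assist_term(dmg=2000, radio=800, track=200)

print(sniper, round(active, 1))
```

With these inputs the first player scores 0 and the second about 1581. The term is zero whenever both assist values are zero, no matter how the damage itself was earned.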

2 hours ago, Android25 said:

I don't see why damage assist less track damage wouldn't be far superior to xp when it comes to chai sniping. You still get xp when you do a lot of damage, but if you're straight up using your whole team to spot for you then you get zero assisted damage. Something more along the lines of x*sqrt(dmg*radio) + (x-y)*sqrt(dmg*track) where x and y are positive, radio is spotting assist, and track is track assist.

OTOH when you hold a flank by yourself, get a sh1tton of dmg as you singlehandedly kill everyone trying to push, but you still get zero assist as either there is nobody to support you OR there are allies to support you but they suck balls and shoot rocks instead OR you simply take all the damage for some unknown reason... This scenario would be severely hurt by ^formula^ as it would be counted exactly as chai sniping.

In an ideal world, the API would sort your dmg into "own spot" and "team spot", which would make everything so much easier. In the real world, WG and its support for those things suck massively.


First of all, great job on starting this project Android25.  In the original thread there was discussion around the lack of continuity and interest in maintaining these projects and I was going to suggest collaborating with the XVM team as they appear to have been extremely consistent in their support of the tool over many years.  Including setting up an ad-supported website and maintaining databases for years with little to no downtime.  If you are willing I'd suggest open-sourcing your work on github or similar much like XVM has done, this may encourage others to get involved or at least learn from your code.

Comments/questions:

  • So the WG API shows spotting, track, and stun damage per-account but not per-tank?  Have they published any reasoning behind this?  It would seem this could make any new stats formula far more simple to calculate.  I wonder if instead of working on the extra math in figuring this data out by using big data mining/cross referencing (referred to here as multiple linear regression I think?) if it would be worth while to contact the WG dev department and request this be added to the API.  They used to have programs where WG devs worked with API-users in the past.
  • In the original post it was mentioned that the data mining portion would be collecting NA/EU/RU but ASIA was not mentioned.  For completeness and community inclusion sake I would strongly recommend collecting data for the ASIA cluster as well.  Their daily peak users are about the same as NA West+East combined and are a meaningful market and part of this community.
7 hours ago, Android25 said:

Average assisted spotted and track damage per game since the release of 8.8 are both in the API. It's not listed per tank, so you would have to take it incrementally (hence recent) and use enormous amounts of data to multi-linear regress the expected assist damage per tank.

 

I take it back; average xp per battle without premium is not something you can get from the API. It's also one of the least useful stats :/

What do you think about linking average assisted damage with average tier? It's just a thought, but does it make any sense?

Also there should be a little penalty for all accounts below 12k battles (the closer the account is to that point, the lower the penalty).

22 hours ago, kurcatovium said:

OTOH when you hold a flank by yourself, get a sh1tton of dmg as you singlehandedly kill everyone trying to push, but you still get zero assist as either there is nobody to support you OR there are allies to support you but they suck balls and shoot rocks instead OR you simply take all the damage for some unknown reason... This scenario would be severely hurt by ^formula^ as it would be counted exactly as chai sniping.

In an ideal world, the API would sort your dmg into "own spot" and "team spot", which would make everything so much easier. In the real world, WG and its support for those things suck massively.

It's true that the API doesn't give us enough information and that knowing even which damage you were doing within 200m would be better than just having damage as a lone number. When I mine for tank data I think I'll save blocked damage or damage received and see if those correlate at all with winning. Maybe some sort of z*sqrt(damage*received) or z*sqrt((received+blocked)/damage) or z*sqrt((received+blocked)*assisted) modifier.

Edit: The more I think about this the less sure I am about it... There's just really no way to sort out damage taken from trading well vs just getting your tank trashed at the end of the game. Tanking factor may be something to look at but it really only applies to well armored tanks, the ability to determine anything from it with any lightly armored tanks is nothing. There are just situations that the numbers have no way of quantifying and that's why win rate is considered.

20 hours ago, bounceplink said:

First of all, great job on starting this project Android25.  In the original thread there was discussion around the lack of continuity and interest in maintaining these projects and I was going to suggest collaborating with the XVM team as they appear to have been extremely consistent in their support of the tool over many years.  Including setting up an ad-supported website and maintaining databases for years with little to no downtime.  If you are willing I'd suggest open-sourcing your work on github or similar much like XVM has done, this may encourage others to get involved or at least learn from your code.

Comments/questions:

  • So the WG API shows spotting, track, and stun damage per-account but not per-tank?  Have they published any reasoning behind this?  It would seem this could make any new stats formula far more simple to calculate.  I wonder if instead of working on the extra math in figuring this data out by using big data mining/cross referencing (referred to here as multiple linear regression I think?) if it would be worth while to contact the WG dev department and request this be added to the API.  They used to have programs where WG devs worked with API-users in the past.
  • In the original post it was mentioned that the data mining portion would be collecting NA/EU/RU but ASIA was not mentioned.  For completeness and community inclusion sake I would strongly recommend collecting data for the ASIA cluster as well.  Their daily peak users are about the same as NA West+East combined and are a meaningful market and part of this community.

I will have to work with XVM at least a little bit if I want them to incorporate the rating into their mod. It wouldn't be as simple to calculate and store as overall data so they would likely have to make API calls to the central WNR database for scores.

1) WG has never published a reason as to why they don't make assisted damage per tank available. They have added per tank functions even as recently as 9.0 but have never added spotting despite requests from other WN contributors trying to get them to do so. At this point they may simply not be releasing it to spite those trying to make performance ratings citing "hostile environments" created by them.

2) I've already collected ID and last played data for the entire SEA server. Because the secondary objective of the WNR server would be to store recent data on every player playing the game, I see no reason why I would leave SEA out of the calculations. I'm even partially interested in offering overall expected values as well as per server expected values. Anyone could see how they were performing in general or specific to their server. I worry though that NA and SEA don't have enough recent data to create expected spotting data for all tanks.

18 hours ago, MacusFlash said:

What do you think about linking average assisted damage with average tier? It's just a thought, but does it make any sense?

Also there should be a little penalty for all accounts below 12k battles (the closer the account is to that point, the lower the penalty).

I think linking it per class would make more sense. LT's for example should certainly be assisting more than Arty (at least spotting damage). But I don't see what would be wrong with just per tank expected assist.

Recent only stats completely remove any incentive to reroll. Why start a new account when you can just keep all your tanks and get a fresh new number every month (or every 3-4 months if I decide on a 6 month weighted rolling average)?

I plan on using a 1k battle cutoff, plus WGPR to sort players initially into top 50 and bottom 50 percent categories when regressing expected values. WGPR punishes for low battle count, so those with lower overall battle count will already be excluded from the creation of expected values. I see no purpose to punish people for being new though.


Punishing people for being new... well, I don't think it's completely off, as there are not many categories where you can say it's really punishing innocents.

  1. Reroll - I've never understood why anyone would do so, but people's tastes vary and numbers are everything for some, so... Reroll and its "unfair" advantage of having better starting conditions is probably the primary reason why battle count could/should have some weight to it.
  2. New account "being good"
    1. Second account - similar to reroll above, not quite as "unfair" as with new account you start pretty much fresh (no premiums, equipment etc.), yet you definitely are NOT new to the game. Why should you benefit from this in terms of rating?
    2. Miracle child - well, some people understand faster, have good friends that teach them this, that and whatnot, their learning curve is out of this world and they simply get used to game concepts more easily. Those would be the real "victims" of penalties. But how many of those do you know?
  3. New account "being bad" - this is your average Joe that plays for fun. Whether he's just realizing his childhood dream of driving a tank, enjoying graphics and benchmarking his PC or checking ideas of how to paint his brand new plastic model of M4A3E8, he certainly can't be a victim as he doesn't care for numbers anyway. And once he starts to care, he'll probably be out of "penalty zone" anyway...

It also very much depends on where the line would be. Is it 500 battles? 1k battles? 5k battles? 10k battles? I think for a start everyone should look back at their own beginnings. Looking back at my start in the game, I was really, really bad. There was nobody in my neighbourhood who played the game and could tell me what to do, so I just drove tanks, enjoyed "big guns" and that was cool. But that was cool only for a few hundred battles. Then I got pissed off that I had like 40-45% WR and wanted it to be 50%, because "I'm not that bad". I didn't want to hinder my team so I went to the WoT wiki and checked some tanks I had problems with. I played the Hetzer with the howitzer having no idea what ammo types do, so the main problem was obviously the KV. It did not take that much effort to learn what AP and HE are and that it's a bad idea to shoot AP at a KV... And from that moment I slowly started to work on my skills and improve my WR every day. It was maybe around the 1k battles mark? I'm not saying I'm good now, there is always so much to improve, but the thing is there needs to be willpower to do so. And IMO everyone who doesn't get better (a constantly improving WR/WN/whatever metric you choose) after like 5k battles doesn't have it.


I really like the idea of Data Mining for a new metric. There are some things I would like to mention:

-have you thought of implementing something like distributed computing into your program? As pulling API requests probably does not need that much computing power. If you split it up to several PCs from people which participate you could drastically speed up the data gathering process thus giving you even more detailed data. 

-having pretty accurate data from all players it would probably be possible to calculate how a player performed over an amount of time. If I remember it correctly you are only able to pull out overall account data. But if we can compare 2 "checkpoints" to each other we could calculate what a player played on average to get to his stats. (For example, we have 100 battles on a tank in the first dataset with 500 average damage and 105 battles with 510 average damage in the second. There should be a way to calculate what stats were needed in the 5 battles in between to produce this change.)

-if you create a rating like this, taking into account active accounts for the last 30 days, the rating would only be viable for recent stats (30 days or so). Therefore it would probably be best to always calculate the expecteds over the span of 2 game patches, so that they reflect the current meta plus the previous patch, which might still be in the recents for some players. Then we would have a metric always covering the current meta. This would make it viable for comparing recent games played (which is what most people are looking into anyways)
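The checkpoint example above works out directly once averages are turned back into totals. A minimal Python sketch, with a hypothetical function name:

```python
def interval_average(battles_old, avg_old, battles_new, avg_new):
    """Average per battle over the battles played between two checkpoints."""
    played = battles_new - battles_old
    if played <= 0:
        raise ValueError("no battles played between the two checkpoints")
    # Convert each running average back into a running total, then subtract.
    total_old = battles_old * avg_old
    total_new = battles_new * avg_new
    return (total_new - total_old) / played

# 100 battles at 500 average damage, then 105 battles at 510 average damage.
recent = interval_average(100, 500, 105, 510)
print(recent)  # 710.0
```

So the 5 battles in between averaged 710 damage: (105*510 - 100*500) / 5.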

And just out of interest how far are you with coding or is there any way we could help you?

On 5/10/2017 at 4:30 AM, kurcatovium said:

Punishing people for being new... well, I don't think it's completely off, as there are not many categories where you can say it's really punishing innocents.

  1. Reroll - I've never understood why anyone would do so, but people's tastes vary and numbers are everything for some, so... Reroll and its "unfair" advantage of having better starting conditions is probably the primary reason why battle count could/should have some weight to it.
  2. New account "being good"
    1. Second account - similar to reroll above, not quite as "unfair" as with new account you start pretty much fresh (no premiums, equipment etc.), yet you definitely are NOT new to the game. Why should you benefit from this in terms of rating?
    2. Miracle child - well, some people understand faster, have good friends that teach them this, that and whatnot, their learning curve is out of this world and they simply get used to game concepts more easily. Those would be the real "victims" of penalties. But how many of those do you know?
  3. New account "being bad" - this is your average Joe that plays for fun. Whether he's just realizing his childhood dream of driving a tank, enjoying graphics and benchmarking his PC or checking ideas of how to paint his brand new plastic model of M4A3E8, he certainly can't be a victim as he doesn't care for numbers anyway. And once he starts to care, he'll probably be out of "penalty zone" anyway...

It also very much depends on where the line would be. Is it 500 battles? 1k battles? 5k battles? 10k battles? I think for a start everyone should look back at their own beginnings. Looking back at my start in the game, I was really, really bad. There was nobody in my neighbourhood who played the game and could tell me what to do, so I just drove tanks, enjoyed "big guns" and that was cool. But that was cool only for a few hundred battles. Then I got pissed off that I had like 40-45% WR and wanted it to be 50%, because "I'm not that bad". I didn't want to hinder my team so I went to the WoT wiki and checked some tanks I had problems with. I played the Hetzer with the howitzer having no idea what ammo types do, so the main problem was obviously the KV. It did not take that much effort to learn what AP and HE are and that it's a bad idea to shoot AP at a KV... And from that moment I slowly started to work on my skills and improve my WR every day. It was maybe around the 1k battles mark? I'm not saying I'm good now, there is always so much to improve, but the thing is there needs to be willpower to do so. And IMO everyone who doesn't get better (a constantly improving WR/WN/whatever metric you choose) after like 5k battles doesn't have it.

Your stats would only be based on the games you've played in the last 15 to 30 days though, so there really wouldn't be any such thing as re-rolling. Theoretically you get to "reroll" every month with this type of recent stat. If it was a weighted 6 month rolling average, you might be able to boost your score a little by starting over but it would literally be pointless because if you gave the old account 3 or 4 months you would drop any old bad games anyway. It would be the same with all the other cases as well. Since the metric will only be based off 30 day change data I could see putting a cutoff on accounts that haven't played at least 50 games in this time period, or a weighted scale for the first 100 games, but other than that, it wouldn't make sense for a recent stat to punish you by the number of overall games you've played.

 

On 5/10/2017 at 5:30 AM, binmaa10 said:

I really like the idea of Data Mining for a new metric. There are some things I would like to mention:

-have you thought of implementing something like distributed computing into your program? As pulling API requests probably does not need that much computing power. If you split it up to several PCs from people which participate you could drastically speed up the data gathering process thus giving you even more detailed data. 

-having pretty accurate data from all players it would probably be possible to calculate how a player performed over an amount of time. If I remember it correctly you are only able to pull out overall account data. But if we can compare 2 "checkpoints" to each other we could calculate what a player played on average to get to his stats. (For example, we have 100 battles on a tank in the first dataset with 500 average damage and 105 battles with 510 average damage in the second. There should be a way to calculate what stats were needed in the 5 battles in between to produce this change.)

-if you create a rating like this, taking into account active accounts for the last 30 days, the rating would only be viable for recent stats (30 days or so). Therefore it would probably be best to always calculate the expecteds over the span of 2 game patches, so that they reflect the current meta plus the previous patch, which might still be in the recents for some players. Then we would have a metric always covering the current meta. This would make it viable for comparing recent games played (which is what most people are looking into anyways)

And just out of interest how far are you with coding or is there any way we could help you?

- The problem isn't so much speed of data collection as the actual data collection only takes around 15 days, which is half of the 30 day change period, meaning the server would have 15 extra days to calculate expected values and WNR for each player and simply rest without making calls to the API. While cutting down the time it takes to gather the data would make stats update more quickly, it would also require the server to hold more data points at one time. The purpose of this main server is more to calculate expected values than it is to update and store stats. Because more people are likely to play more games in 30 days than, say 7, it makes more sense to calculate expected values on a 30 day clock.

- You would have perfectly accurate data for all players, and the secondary objective of the server is to calculate how players are playing over a given period of time (30 days). The only time you have to calculate differences of averages is when you are given an average stat like average damage blocked or tanking factor (neither of which do I think will be in the end formula). All other stats are provided as total values, not average values, and change values are very easy to calculate by taking new data and subtracting old data. It's already done by wotlabs and just about every other stat tracking site to calculate personal change data in 1, 7, 30, and 60 day intervals.

- Expected values for all tanks would be calculated with the 30 day change values the server has for all active players every 30 days, or, if it ends up being viable, every single day with the new 30 day data that has come in that day, plus the latest 30 day values of every other player on the server. This would make the expected values change much more smoothly and make it so new tanks would have accurate expected values within a few days of release. It wouldn't need to consider the time from x number of patches because the only data that would be run through the formula would already be being used to calculate the expected values.

At the moment I have yet to code an initial data pulling program to get my first batch of values from active players on the server. I plan on doing that tonight or tomorrow as time permits. Then I'll need to design a proper database to hold all this data, as for now I'm tossing it into text files to be added to a database later. Considering I'll have at least 15 days between finishing the first data mine and starting the next, I'm not too worried about translating all my text files into a database by then. As far as help goes, just throwing new ideas for me or others to mull over is incredibly helpful. Even when I'm busy writing up why one idea wouldn't work or is already going to be implemented, it sparks a lot of new ideas for me to work with.


I do not think that updating the expected values on a daily basis makes much sense. If the expecteds change every day, then the player stats would also increase/decrease slightly every day. Of course you will not have the big drops you are having right now with the WN8 updates. But updating it every day would make it hard for players to track their stats/improvement or to compare stats that aren't refreshed every day.

As said I would at least go for 30 day values, maybe even longer. Because if there are no patches in between the updates I doubt that there will be many changes to the overall server stats anyways. The only thing that needs to be checked is that, if there is a patch, the data for calculation is big enough to contain pre and post patch data. Otherwise buffs or nerfs would seriously affect the recents.

Let's imagine the Maus gets nerfed back to its old form with the next patch:
If you do not cover both patches, people who only drove the Maus pre patch will have incredibly high recents until their Maus battles drop out.

6 hours ago, Android25 said:

Your stats would only be based on the games you've played in the last 15 to 30 days though, so there really wouldn't be any such thing as re-rolling. Theoretically you get to "reroll" every month with this type of recent stat. If it was a weighted 6 month rolling average, you might be able to boost your score a little by starting over but it would literally be pointless because if you gave the old account 3 or 4 months you would drop any old bad games anyway. It would be the same with all the other cases as well. Since the metric will only be based off 30 day change data I could see putting a cutoff on accounts that haven't played at least 50 games in this time period, or a weighted scale for the first 100 games, but other than that, it wouldn't make sense for a recent stat to punish you by the number of overall games you've played.

Yeah, my bad I didn't really think about this metric in particular. When it's all about x-day recent, it doesn't make sense, obviously. I was too biased by old WN7/8 in mind...

As far as other ideas go: are you sure a strict 30 day period is a good idea? I think it would make sense to bind the data pull to patch releases, similar to what binmaa says. The pro is that changes to values are to be expected more or less only when tanks get buffed/nerfed, which happens with patches (wow, strong logic there :-D ). The con is that it would require more admin interaction, as I presume fetching would need to be started manually when it's patch time. It would surely even out with a 30 day interval too, but it would take more time to do so IMO.


 

4 hours ago, kurcatovium said:

Yeah, my bad I didn't really think about this metric in particular. When it's all about x-day recent, it doesn't make sense, obviously. I was too biased by old WN7/8 in mind...

As far as other ideas go: are you sure a strict 30 day period is a good idea? I think it would make sense to bind the data pull to patch releases, similar to what binmaa says. The pro is that changes to values are to be expected more or less only when tanks get buffed/nerfed, which happens with patches (wow, strong logic there :-D ). The con is that it would require more admin interaction, as I presume fetching would need to be started manually when it's patch time. It would surely even out with a 30 day interval too, but it would take more time to do so IMO.

The beauty of an automated system is you don't have to worry about this ever again.  If the system is constantly rolling through the list of active players grabbing data the expected values can be calculated daily (or even instantly with every new fetch) so the latest data will always be very close to the average.  When a new patch is released with a tank buff/nerf the collected data starting from that moment will begin to reflect the new performance of the tank and the expected value will immediately start to change.

There are plenty of ways to automatically check if a new patch is released but I can't think of any automated methods to check for tank buffs/nerfs, that would require manual intervention (unless you pull the patch notes and parse there, but the format isn't standardized).

The current method from WN8 means there's a jarring change when a new version of the expected tables is released where all of your past performance in a tank is judged based on the new expected values, even if you haven't played that tank since the buff/nerf.  IMO an automated system that constantly recalculates expected values providing a gradual change is far superior.


 

19 hours ago, binmaa10 said:

I do not think that updating the expected values on a daily basis makes much sense. If the expecteds change every day, then the player stats would also increase/decrease slightly every day. Of course you will not have the big drops you are having right now with the WN8 updates. But updating it every day would make it hard for players to track their stats/improvement or to compare stats that aren't refreshed every day.

As said I would at least go for 30 day values, maybe even longer. Because if there are no patches in between the updates I doubt that there will be many changes to the overall server stats anyways. The only thing that needs to be checked is that, if there is a patch, the data for calculation is big enough to contain pre and post patch data. Otherwise buffs or nerfs would seriously affect the recents.

Let's imagine the Maus gets nerfed back to its old form with the next patch:
If you do not cover both patches, people who only drove the Maus pre patch will have incredibly high recents until their Maus battles drop out.

In addition to what @bounceplink said, the other nice thing about a hands off automated system calculating expected values based on only recent values is that your expected values ALWAYS reflect the stats that are going to be fed through the formula.

Here's my idea on how this would work:

Initially the WNR server would take WGPR (which is unrelated to WN but matches it almost linearly until about WN8 1800, which less than 5% of the server is above) and group players together into several groups where all players in each group have a score within some x value of each other, maybe 500. You take these groups and divide the players in each of them by 30, then create 30 "bins" with 1/30th of the players from each group. This will give you 30 bins with each bin having a distribution of player skill closely matching the server. Do this for each server so you have 4 sets of 30 bins, one set for each server. Each day the WNR server gets tank data from one bin in each set. Players are permanently attached to their bins, and any new players added to the server, either by being a new account or an old account just becoming active again, are evenly dispersed into each bin. With this process each bin will be a "close enough" representation of the server average. Every 180 days the server would look at the average stats of each bin and make sure none of them drift too far out of line, shuffling accounts around and re-collecting their data at appropriate times to make sure their data is still 30 days old (I don't think too many changes would ever need to be made but having a check for it is important).
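A rough Python sketch of the initial binning step for one server, assuming each player is a pair of account ID and rating (WGPR in this thread). The function name, the round-robin dealing and the synthetic ratings are illustrative choices, not something already built; adding newly active accounts and the 180 day drift check are not covered.

```python
import random
from collections import Counter, defaultdict

def assign_bins(players, n_bins=30, band_width=500, seed=0):
    """Spread each rating band evenly over n_bins bins.

    players: iterable of (account_id, rating) pairs.
    Returns a dict mapping account_id -> bin index in 0..n_bins-1.
    """
    rng = random.Random(seed)
    bands = defaultdict(list)
    for account_id, rating in players:
        # Players whose ratings fall in the same band_width-wide band share a group.
        bands[rating // band_width].append(account_id)

    assignment = {}
    next_bin = 0  # carried across bands so leftovers do not pile up in bin 0
    for band in sorted(bands):
        members = bands[band]
        rng.shuffle(members)  # avoid any ordering bias inside a band
        for account_id in members:
            assignment[account_id] = next_bin
            next_bin = (next_bin + 1) % n_bins
    return assignment

# Synthetic accounts with made-up ratings, purely for illustration.
players = [(account_id, 1000 + (account_id * 37) % 9000) for account_id in range(3000)]
bins = assign_bins(players)
bin_sizes = Counter(bins.values())
print(len(bin_sizes), min(bin_sizes.values()), max(bin_sizes.values()))
```

Because members of a band are dealt to consecutive bins, every bin receives the same share of each band to within one player, which is what keeps each bin's skill distribution close to the server's.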

Each day expected values are calculated, but because each day the players checked would be a fairly accurate representation of each server, the values would only change very slowly unless a tank was modified, in which case it would begin to gradually move towards its new expected value over 30 days.

WNR for each player would be calculated on whatever day their bin gets pulled every 30 days. WNR would not only be a representation of the player's last 30 days, but a double weighted rolling average of the last 180 days. That's 6 sets of data. The first weight in the average would be directly by battles played. The second weight would be by age of the set. The oldest set would be weighted the least while the newest set would be weighted the most, from 20% to 120% incremented by 20%, so that the newest data would weigh 6 times more than the oldest data.
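A small Python sketch of the double weighting described above. It assumes each 30 day set has already been reduced to one score plus its battle count, and that the two weights are simply multiplied; the post does not spell out either detail, so treat both as assumptions. The names are hypothetical.

```python
# Age weights for the six 30 day sets, oldest first: 20% up to 120%.
AGE_WEIGHTS = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]

def rolling_wnr(sets):
    """sets: six (score, battles) pairs, oldest first.

    Each set is weighted by battles played times its age weight.
    Returns None if no battles were played in the whole window.
    """
    if len(sets) != len(AGE_WEIGHTS):
        raise ValueError("expected exactly six 30 day sets")
    weights = [battles * age for (_, battles), age in zip(sets, AGE_WEIGHTS)]
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * score for w, (score, _) in zip(weights, sets)) / total

# Made-up example: 100 battles at 600 six months ago, 100 battles at 1200 this month.
example = rolling_wnr([(600, 100), (0, 0), (0, 0), (0, 0), (0, 0), (1200, 100)])
print(round(example, 1))
```

With equal battle counts the newest set counts 6 times as much as the oldest, so the result here is (20*600 + 120*1200) / 140, about 1114.3, rather than the plain average of 900.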

Players could still check out their daily progress, possibly even on the WNR website. The expected value changes would be so minuscule that it would be impossible to tell the difference daily between movement due to performance or movement due to change in the values chart. The expected values would flow very smoothly based less on the micro changes like individual player skill, but on macro changes like modifications to maps, additions of new tanks that counter others better, changes directly to tanks stats, modifications to mechanics, addition of new equipment (like in the next patch) etc.

In your example where a tank suddenly changes, the accounts that are to be updated on the first day would have at most 1 day's worth of playing that tank in its nerfed state, but up to 29 days of playing it in its buffed state. The expected values for the tank would be slightly lower because of the one day of people playing it in its nerfed form, but it should actually relate well: 1/30th of their games would be in the nerfed tank and the stats for that tank would be 1/30th of the way normalized. The same holds for each following day, up to the accounts that potentially have 1 day of playing the tank while it was buffed and 29 days of it being nerfed, by which point the expected values will be 29/30ths of the way normalized. It doesn't work quite as elegantly as this because people probably won't play the tank every day, but it massively limits the potential for gaming the system, because the most you could possibly do would be to play the tank like crazy while it's buffed and ride the next 30 days with a slightly increased WNR (remember that WNR would be a weighted average), and then every month the weight of that boost would be lessened.

Link to post
Share on other sites

It would be interesting to see what the difference is in values server to server as different servers seem to get into different map metas at different times. 

Also, I always thought base exp weighted against tier is one of the better metrics, because it rewards shooting higher tiers and being close to the enemy, which generally means you're actively contributing to the battle, and it doesn't punish spotting. And generally, if you win more, you're doing something right and your base exp would be higher accordingly. Even if a player gets carried in platoons or pads in tier 4 pref tanks, a quick look in the service record would make it obvious. XVM (not that I use it. Fuck XVM.) can show average tier + win rates anyway, which is enough to make a decision on how good the player is.
Although you can still do the same with WN8 now.

 

Anyway, good luck in implementing the automated WNR. Guess I'll wait till we get some actual numbers to make a decision on if I think it's good or not.

Link to post
Share on other sites

[Image: zu9kjJw.png]

Data collection has begun... Finally got around to writing the initial data pulling program.

The program still needs to be optimized and probably made a bit more readable, and as you can see it's just dumping it to a text file for now... The goal is to have a database before the next round of data pulling

Unfortunately I'm currently looking at a rate of about 1.6 accounts per second which means the entire active playerbase will take some 64 days... with Russia taking 3/4 of that. So I think I'll collect NA and maybe SEA and then multithread the program as soon as I have the chance. Based on how I wrote it, it will be very easy to multithread. It's finals week though and I doubt I'll have the time until after Thursday.
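For a sense of scale, the two figures above can be turned into an account count. This is only back-of-the-envelope arithmetic in Python from the 1.6 accounts per second and 64 days quoted in the post; the function names are made up.

```python
SECONDS_PER_DAY = 86_400

def accounts_covered(rate_per_second, days):
    """How many accounts a pull at a steady rate gets through in the given days."""
    return rate_per_second * SECONDS_PER_DAY * days

def days_needed(accounts, rate_per_second):
    """How many days a pull of `accounts` takes at a steady rate."""
    return accounts / (rate_per_second * SECONDS_PER_DAY)

# 1.6 accounts/s for 64 days implies roughly 8.8 million active accounts,
# about three quarters of them on RU going by the estimate above.
implied_accounts = accounts_covered(1.6, 64)
```

Since the time scales inversely with the rate, doubling the pull rate halves the number of days.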

As you can see I'm taking Base_WGPR, which is WGPR without the games played modifier.

Which, for programming languages without atanh, is [Image: iUkF5NR.png]

or the simplified [Image: zBBahTf.png]

(The different capitalization in battles doesn't mean anything, they're all overall battles, I just wrote them differently without thinking about it)

Link to post
Share on other sites

Unexpected case that broke my initial mining...

This player, who had over 40k battles and last week displayed in the API as having played in the last 60 days, has since had their account wiped. Because I was only looking at recent accounts, I had nothing to catch a Unix timestamp of 0 (never played).

[Image: hEtKV0h.png]

Fixed the system to discard these players when collecting data but that's been my first annoying bump.
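A check along those lines could look like this in Python (the actual program is in Java, so this is only an illustration). The `last_battle_time` key and the dict shape are assumptions about what the API returns for an account, not taken from the post.

```python
import time

SIXTY_DAYS = 60 * 24 * 60 * 60  # activity window in seconds

def is_active(account, now=None):
    """True if the account has a real last-battle timestamp inside the window.

    account is a hypothetical dict parsed from the API response;
    "last_battle_time" is an assumed field name holding a Unix timestamp.
    """
    now = time.time() if now is None else now
    last_battle = account.get("last_battle_time") or 0
    if last_battle == 0:
        # Wiped or never-played accounts report 0, so discard them outright.
        return False
    return now - last_battle <= SIXTY_DAYS
```

Treating a missing or null field the same as 0 covers the case where a wiped account comes back with no timestamp at all.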

Link to post
Share on other sites

I ran into some issues collecting my data and decided to get a proper JSON parser as well as clean up the code a little. As long as I was modifying code I decided I may as well write the multithreading as well.

I went with 8 threads and switched my API application over to a server key instead of a mobile one (allowing for 20 calls per second instead of 10). At 8 threads the program pulls 800 accounts every ~1:05 minutes, or about 12.5 accounts per second. Now instead of 64 days the entire 4 server active player pull will take between 7 and 8 days. I could up the thread count but the program already pulls a constant 350KB/s of my internet, so I'd rather not; plus returns start to diminish once you go over the logical cores on the server and my server only has 8.

The really sad thing is that only 11% of the return information for the API tank calls is actually tank data. The rest is formatting. Now of course you need formatting, so assuming they keep JSON (for some reason), that's 24% of the return information. Which means that their horrendously long JSON tags take up 65% of the return data... They could likely cut their total API bandwidth in half if they had just picked much shorter tag names with documentation of what each of them means.
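The effect of tag length is easy to measure on a small scale. The sketch below is Python and uses three of the long tag names from the API; the values and the short replacement names are invented for the example, so the sizes say nothing about the real responses.

```python
import json

def key_overhead(record, short_names):
    """Return (bytes with long keys, bytes with short keys) for compact JSON."""
    long_form = json.dumps(record, separators=(",", ":"))
    renamed = {short_names[key]: value for key, value in record.items()}
    short_form = json.dumps(renamed, separators=(",", ":"))
    return len(long_form), len(short_form)

# Made-up values; only the tag names come from the API.
record = {
    "battles_on_stunning_vehicles": 0,
    "dropped_capture_points": 12,
    "stun_assisted_damage": 0,
}
# Hypothetical short names that documentation would have to explain.
short_names = {
    "battles_on_stunning_vehicles": "bsv",
    "dropped_capture_points": "dcp",
    "stun_assisted_damage": "sad",
}
long_size, short_size = key_overhead(record, short_names)
```

Comparing `long_size` with `short_size` shows how much of each record is spent on names, since the values and punctuation are identical in both forms.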

Some examples of way overly long JSON tag names:

"battles_on_stunning_vehicles"

"dropped_capture_points"

"stun_assisted_damage"
Link to post
Share on other sites

All tank data on NA, EU, and SEA from players that played a game in the last 60 days (as of about 2 weeks ago) has been pulled.

The file sizes are 877MB, 4.203GB, and 580MB respectively.

I started pulling RU data 12 minutes ago and assume the file size will be about 3.2 times larger than EU... so about 13.4GB.

 

Edit: The RU API seems considerably slower. The same program that took 1:05 to collect 800 players' tank data on NA is taking 1:45 on RU.

Link to post
Share on other sites
1 hour ago, CompeAnansi said:

What language are you going to be doing this in? I'd be happy to help if you're doing it in python.

It's in Java for now. I haven't even truly compiled it, just running it through Eclipse until I get all the bugs worked out... I'm shocked that the API can return so many unexpected results but I think I have them all handled now.

I should be writing it in one of the C languages because I'll be running it on a Windows server but Java is just so easy.

Link to post
Share on other sites
16 hours ago, moogleslam said:

You are all my heroes :)

Mine too. I have no clue what, or how they are doing these metrics, data minings and shiz. But I'm somewhat impressed.....

Link to post
Share on other sites
On 2017-5-20 at 0:38 PM, Android25 said:

It's in Java for now. I haven't even truly compiled it, just running it through Eclipse until I get all the bugs worked out... I'm shocked that the API can return so many unexpected results but I think I have them all handled now.

I should be writing it in one of the C languages because I'll be running it on a Windows server but Java is just so easy.

Have you tried multi-threading it with each thread running its own token, taken from a list of known tokens?

That could speed things up a bit

Link to post
Share on other sites
1 minute ago, Gryphon_ said:

Have you tried multi-threading it with each thread running its own token, taken from a list of known tokens?

That could speed things up a bit

My biggest fear with this is that they might see that as me trying to circumvent the "max requests per second" rule, which is 20/second for a server. At the moment I'm not even remotely breaking any of the API rules, even though I'm making over 8 million requests in a span of about 10 days.

I could make the program go slightly faster by increasing the number of threads, as I calculate I'm only making about 12.5 requests per second at any given time. However, initially my server will only have 8 logical cores, so overloading on threads (especially because they all need to complete before any can start again) would quickly lead to diminishing returns.
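One way to stay under a per-second cap no matter how many threads are running is a shared limiter that hands out evenly spaced time slots. This is a Python sketch, not the Java program itself; `RateLimiter`, `pull_all` and the `fetch` callback are hypothetical names, and only the 20 per second cap and 8 threads come from the posts.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

class RateLimiter:
    """Space calls from all threads at least 1/rate seconds apart."""

    def __init__(self, rate):
        self.interval = 1.0 / rate
        self.lock = threading.Lock()
        self.next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free slot under the lock, then sleep outside it.
        with self.lock:
            now = time.monotonic()
            slot = max(self.next_slot, now)
            self.next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)

def pull_all(account_ids, fetch, rate=20, threads=8):
    """Call fetch(account_id) for every id, roughly rate-limited across threads."""
    limiter = RateLimiter(rate)

    def task(account_id):
        limiter.wait()
        return fetch(account_id)

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(task, account_ids))
```

With a limiter like this the thread count only needs to be high enough to hide network latency, because the request rate is capped separately from the number of workers.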

From what I've seen, I can pull the entire active playerbase in less than 12 days, so I could even run recents every 15 days instead of every 30.

Link to post
Share on other sites
