Mathematics Contributor
About Android25

  • Birthday 06/02/1994

Profile Information

  • Interests
    Computer Programming/Building, Gaming, Working on Computer Science bachelor's degree at UW Milwaukee.

16,544 profile views
  1. Daily is somewhat useful for immediate data after a patch, I guess (plus showing people how they're doing daily), but ~115,000 players really isn't a lot to generate expected values from. How are those accounts chosen, exactly? I assume it's just people in clans that are searched or specific players who are searched for, in which case the data would be biased slightly upwards (although it was biased in that direction when data was taken from vbaddict as well).

    I'm still taking 30-day interval data (hoping to eventually automate it down to 14 days) of all active players worldwide (as of 3 months ago; I hope to eventually get it down to worldwide active players monthly), which is ~8.7 million accounts. The nice thing about so much data is that you can somewhat accurately linearly regress expected assisted values for each tank from the account overall averages. I'm on the 3rd interval now. I expect to eventually send the data through Eureqa and come up with a WN9-like formula to use as a 14-day recent rating with a 3-month weighted average (much more about that in the WNR topic), but I just don't have time this summer. I expect to get something working maybe mid-September or October. I would share my data, but each run is a little over 22GB.

    The other thing to remember is that expected values aren't the same as the true average values on each tank. They're linearized by the upper 50% of players per tank, then adjusted if the top 10% of players in that tank fall too much above or below the top 10% of players overall.
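
As a rough illustration of the regression idea in the post above, the sketch below fits a straight line from account-wide averages to per-tank values, using only the upper half of players on that tank. It is written in Python for brevity (the collection program is described elsewhere as Java). The function names and sample numbers are made up, "upper 50%" is assumed to mean at or above the median tank value, and the top-10% adjustment is left out, so the actual expected-value procedure may well differ.

```python
from statistics import mean, median

def upper_half(pairs):
    """Keep (account_avg, tank_avg) pairs whose tank value is at or above the median."""
    cutoff = median(tank for _, tank in pairs)
    return [(acct, tank) for acct, tank in pairs if tank >= cutoff]

def fit_line(pairs):
    """Ordinary least squares fit of tank_avg = slope * account_avg + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct account averages")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up sample: (account-wide assisted damage, assisted damage on one tank)
sample = [(100, 150), (200, 260), (300, 370), (400, 480), (500, 590), (600, 700)]
slope, intercept = fit_line(upper_half(sample))
```

With a fitted line like this, an expected value for the tank could be read off for any account-wide average, which is roughly what "linearly regress expected values from the account overall averages" suggests.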
  2. Starting Collection 2. When this finishes in ~11 days, I'll have 30-day change values across all active accounts on all servers.
  3. All of the data has been pulled. It finished a few hours ago. I'm on vacation until the 9th, though, so I'll stop by this topic and answer stuff then. I'll be collecting the second run of data a month from the 17th. I don't plan on having to tweak any tanks because I should have plenty of data on all of them. Plus, the system is supposed to be fully automated. You can still pad low tiers easily with WN8. I think that the signature image/square thing will make it obvious which tiers are giving you your score.
  4. You just need to platoon wi... Oh right, they ruined that.
  5. Unfortunately, there's nothing in the API that can give you this type of data; you're limited to games you play or that people share or upload. You might be able to look at an uploading website, but the data would be skewed towards people who actually want to upload their replays.
  6. My biggest fear with this is that they might see that as me trying to circumvent the "max requests per second" rule, which is 20/second for a server. At the moment I'm not even remotely breaking any of the API rules, even though I'm making over 8 million requests in a span of about 10 days. I could make the program go slightly faster by increasing the number of threads, as I calculate I'm only making about 12.5 requests per second at any given time. However, initially my server will only have 8 logical cores, so overloading on threads (especially because they all need to complete before any can start again) would quickly lead to diminishing returns. From what I've seen, I can pull the entire active playerbase in less than 12 days, so I could even run recents every 15 days instead of every 30.
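
One simple way to stay under a per-second request cap like the one described above is to space out request start times. The sketch below is only an illustration in Python: `MinIntervalLimiter` is a hypothetical helper, not part of any API client, and the clock and sleep functions are injectable so the spacing can be checked without actually waiting.

```python
import time

class MinIntervalLimiter:
    """Spaces out calls so the average rate stays at or below max_per_second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second  # minimum gap between call starts
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Each worker would call `wait()` on a shared limiter before issuing a request; with several threads the call would need a lock around it, which is omitted here. Built with `MinIntervalLimiter(12.5)`, the gap between requests is 0.08 seconds, comfortably inside a 20/second cap.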
  7. Besides the fact that his sample size is far too small (150 games for HT's and 90 games for TD's, wtf is that), I think he's drawing the wrong conclusions from the data. He claims that all tanks should have a 20% chance of being top tier, which would make sense in a perfect system, as 3/15 is 20%. However, we know that tier 10 and tier 8 are the most heavily played tiers in the game at any given time. I think you would be very hard pressed to find any given time where the proportion of tier 6 tanks was 2.33x that of tier 8 tanks, or even the proportion of tier 7 tanks 1.67x that of tier 8 tanks. According to what WG has claimed, the system attempts to make 3/5/7 games as often as it possibly can; when it is unable to produce this scenario, it starts creating 5/10 games and then just whatever it can throw together that follows the tank MM rules in order to find you a game quickly. The simple fact is that there aren't enough tier 6's or 7's playing the game for the system to make as many 3/5/7 games with tier 8 on the top as it can with tier 9 or 10 on the top. Because tier 8 is one of the most played tiers and, unlike tier 10, can be bottom tier, it just makes sense statistically that tier 8 in general would have a higher probability of being the majority of the team, and with the MM's current attempt to make the majority of the tanks lower tier, tier 8 tanks tend to not be top tier.

    The other conclusion he draws is that HT's are being treated differently than TD's because, based on his data, it's obvious that TD's appear as the top tier more often. I think that this also has to do with uneven distributions within the MM itself rather than WG specifically punishing heavy tanks. Just as more players tend to play tier 10 and tier 8 than other tiers, more players tend to play HT's than they do TD's. The patch notes specifically state that "The matchmaker considers the number of light tanks and tank destroyers." That generally means that the number of TD's on either side should be +/- 1 of each other, but tier usually isn't accounted for, and that would be where most of the skewing is going on. Any TD that gets on one side is almost guaranteed to pull another TD into the match to put onto the other side. I think it's likely that TD's tend to get into the more prime 3/5/7 games (which initially would be pretty evenly distributed until the MM runs out of tier 6 and 7 tanks) because there are fewer of them than the flood of HT's. Then, once all the prime spots are taken, they still have a better chance of getting into the 5/10 MM (which would favor tier 9 and tier 8 vs tier 8 and tier 7 based on tier distribution, especially later when all the tier 7 tanks are used up) before all the rest of the HT's and just a few TD's are simply dumped into whatever the MM can come up with (which is why in his data HT's got single-tier MM more than twice as often as TD's did per game).

    TLDR: Tier 8 HT's get shafted more often than Tier 8 TD's not because WG did something to specifically punish tier 8 HT's, but because there are generally more of them than any other specific class/tier in the MM queue at any given time, and so they tend to get thrown into whatever the MM can come up with rather than WG's "fair distribution".
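
The ratios in the argument above can be worked out directly from the 3/5/7 template. This is just the arithmetic, sketched in Python, under the assumption that the queue would need tiers in the same proportion as one 3/5/7 team with tier 8 on top; the variable names are illustrative.

```python
from fractions import Fraction

# Tanks per tier on one team in a 3/5/7 match with tier 8 on top.
template = {8: 3, 7: 5, 6: 7}
team_size = sum(template.values())  # 15 tanks per team

top_share = Fraction(template[8], team_size)    # 3/15: share of top-tier slots
t6_per_t8 = Fraction(template[6], template[8])  # 7/3: tier 6 tanks per tier 8 tank
t7_per_t8 = Fraction(template[7], template[8])  # 5/3: tier 7 tanks per tier 8 tank

print(f"top-tier share: {float(top_share):.0%}")
print(f"tier 6 per tier 8: {float(t6_per_t8):.2f}")
print(f"tier 7 per tier 8: {float(t7_per_t8):.2f}")
```

So keeping every tier 8 on top of a 3/5/7 game would take roughly 2.33 tier 6 tanks and 1.67 tier 7 tanks in the queue for each tier 8 tank.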
  8. When people say "But you don't need 32GB of RAM"


    2. ZXrage


      What about EU, though...

    3. BlackAdder


      People are stupid. A modern OS is smart enough to use RAM for cache even if you're not using all of it.

      32GB FTW

    4. Android25


      @ZXrage EU won't fit either; Excel's row limit is just over one million (1,048,576) per sheet. Data from all servers will work fine once the data goes into a database. When I worked with EU data in Excel years ago, I had to spread it out across multiple sheets to make it fit.
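
Spreading rows across multiple sheets, as described above, comes down to chunking the data by the per-sheet row limit. A minimal Python sketch follows; `sheet_chunks` is a hypothetical helper that assumes one header row per sheet, and it only splits the rows, leaving the actual spreadsheet writing out.

```python
EXCEL_MAX_ROWS = 1_048_576  # per-worksheet row limit in .xlsx workbooks

def sheet_chunks(rows, header_rows=1, max_rows=EXCEL_MAX_ROWS):
    """Yield consecutive slices of rows, each small enough for one sheet."""
    per_sheet = max_rows - header_rows  # data rows left after the header
    if per_sheet <= 0:
        raise ValueError("max_rows must be larger than header_rows")
    for start in range(0, len(rows), per_sheet):
        yield rows[start:start + per_sheet]
```

For example, 2.5 million data rows would come out as three chunks, one per sheet.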

  9. What exactly do you consider "comprehensive data across all clusters"? Because for WNR I'm pulling all tank data from all servers of all players who have played in the last 60 days. That's a little over 8.7 million players, 6.2 million of which are coming from the RU server. The mining has been pulling (optimally) 12.5 accounts per second for the past 7 days (with only a few spots of downtime to fix some program errors) and I've got about 2.7 million accounts left to pull. The expected optimum time to pull all the data was about 8 days, but the RU server is under very heavy load around 4am eastern and the collection average goes down to almost half of optimal during that time due to the number of retries for 504 errors. If I actually tried to pull account data from every player who has account data, that would be 46.2 million accounts and my optimal expected mining time would be about 43 days, which isn't practical.
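
The time estimates in the post above follow from the account counts and the optimal pull rate. A quick Python check (`mining_days` is an illustrative name; it assumes a constant rate with no downtime or retries):

```python
SECONDS_PER_DAY = 86_400

def mining_days(accounts, accounts_per_second):
    """Days needed to pull `accounts` at a constant rate."""
    return accounts / accounts_per_second / SECONDS_PER_DAY

active = mining_days(8_700_000, 12.5)     # players active in the last 60 days
everyone = mining_days(46_200_000, 12.5)  # every account with data

print(f"active players: {active:.1f} days")
print(f"all accounts:   {everyone:.1f} days")
```

That gives roughly 8 days for the active playerbase and roughly 43 days for every account, matching the figures quoted. Periods where the rate drops to about half of optimal would push the real total above the 8-day figure.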
  10. A VPN would help only assuming that you can connect to it before your ISP routes you the same way. Usually this only happens when a node (or several) owned by the ISP is down. It's cheaper for the ISP to route traffic to other nodes it controls, or to someone it has a deal with, than to try to go through another provider's node. Usually a call to the ISP can at least give you info on why your connection is going everywhere. They are almost certainly aware of the routing. Your connection is controlled wholly by your ISP until it connects to a new network (like a VPN), so you have to be able to get from your computer through your ISP to the VPN without getting routed all over the map or it won't make any difference. The problem is you don't really know where your ISP's node is down or backed up. It would be better to try a free VPN that offers to route through the same location as a paid VPN before going straight to a paid VPN. Also, many paid VPNs offer connection tests before you have to pay for them.
  11. It's in Java for now. I haven't even truly compiled it; I'm just running it through Eclipse until I get all the bugs worked out... I'm shocked that the API can return so many unexpected results, but I think I have them all handled now. I should be writing it in one of the C languages because I'll be running it on a Windows server, but Java is just so easy.
  12. All tank data on NA, EU, and SEA from players who played a game in the last 60 days (as of about 2 weeks ago) has been pulled. The file sizes are 877MB, 4.203GB, and 580MB respectively. I started pulling RU data 12 minutes ago and assume the file size will be about 3.2 times larger than EU... so about 13.4GB. Edit: The RU API seems considerably slower. The same program that took 1:05 to collect 800 players' tank data on NA is taking 1:45 on RU.
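
The size estimate and the NA vs RU speed comparison above can be reproduced with a few lines of arithmetic. This Python sketch uses only the numbers quoted in the post; the 3.2x factor is the post's own assumption, and the helper name is illustrative.

```python
eu_size_gb = 4.203
ru_size_gb = eu_size_gb * 3.2  # assumed RU/EU size ratio from the post

def players_per_second(players, minutes, seconds):
    """Average collection rate over a timed run."""
    return players / (minutes * 60 + seconds)

na_rate = players_per_second(800, 1, 5)   # 800 players in 1:05 on NA
ru_rate = players_per_second(800, 1, 45)  # 800 players in 1:45 on RU
slowdown = na_rate / ru_rate              # how many times slower RU is

print(f"estimated RU file size: {ru_size_gb:.1f} GB")
print(f"NA: {na_rate:.1f}/s, RU: {ru_rate:.1f}/s, RU slower by {slowdown:.2f}x")
```

On these numbers RU collects at a little under two thirds of the NA rate.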
  13. Could someone on EU do me a favor and look at the in-game service record of Amphibiios?

    It's the only account I've found while doing API stuff that has battles but no tanks. It was breaking my program until I added an exception for it.

    A screenshot of the main service record and then one of their tank list (even if it's empty) would be most helpful.

    1. Epic
    2. Android25


      Thank you much!

      Strange that it is showing tanks in game but not on the website or through the API...
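
An account that reports battles but no tanks is the kind of record a collector has to tolerate rather than crash on. Below is a minimal Python sketch of that sort of guard. The record shape and the field names `battles` and `tanks` are hypothetical and not taken from the actual API, and this is not the author's Java code.

```python
def tanks_for(record):
    """Return the tank list for one account record, or [] if it is missing."""
    if not isinstance(record, dict):
        return []
    tanks = record.get("tanks")  # hypothetical field name
    return tanks if isinstance(tanks, list) else []

def summarize(record):
    """Summarize a record and flag accounts with battles but no tanks."""
    tanks = tanks_for(record)
    battles = record.get("battles", 0) if isinstance(record, dict) else 0
    return {
        "battles": battles,
        "tank_count": len(tanks),
        "needs_review": battles > 0 and not tanks,
    }
```

Flagging such accounts instead of silently dropping them keeps a list of oddities that can be checked by hand later, much like the screenshot request above.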

  14. No, I agree, unicums tend to be able to abuse just about everything in the game... including light tanks with incorrect WN8 values ;)