
SteelRonin

Verified Tanker [NA]
  • Content Count

    7

Reputation Activity

  1. Upvote
    SteelRonin reacted to SolarflareGrenraven in Hi again, noob here   
    Hi Steel,
    It kinda depends what sort of info you are looking for really, are you new to the game or just new to the forums?
    If you are new to the game and are looking for tips on how to improve, the mechanics / vehicles sections will give a good insight into the game.
    Alternatively, if you are looking for a complete beginner's guide to the basics, it's something I was planning on doing on my YouTube channel soon.
    Anyway, welcome whichever, and if there's anything I can help with let me know.

    Regards.

    Solar
  2. Upvote
    SteelRonin reacted to SaintLaurentius in Hi again, noob here   
    Welcome! Serious-face enabled discussions is where it's at.
  3. Upvote
    SteelRonin reacted to RichardNixon in WN9 candidate prototype   
    Introduction
    This is a WN9 candidate based on an idea of bjshnog's from a few months back, shortly before I quit forums to finish another project. WG are still ignoring requests to add assisted damage to the API, so it's not ideal, but I think it's still a sufficient improvement to be worth posting.
    The general idea is that instead of using a full set of expected values for each tank, you have a single set per matchmaking type, or in a simplified version, per tier. Per-tank adjustment is only added at the last stage. The principle (for what it's worth) is that the value of a point of damage, frag or spot towards winning the game is the same regardless of which tank you're driving.
    The practical differences are small for most tanks, but there are significant effects on the two problem classes: scouts and arty. For scouts, spots are naturally better rewarded, while damage-padding is less rewarded. For arty, the big random chunk of WN8 provided by spots vanishes.
    Improvements from other WN9 candidates were also included: True-recent expected values, per-tank skill-scaling and a linearised formula. These account for the vast majority of the advantages over WN8.

    Method
    Part 1: Expected values
    I based the initial expected values set on EU tier averages, although there's nothing special about those. With this method, you can effectively implement different formulas depending on tier and class by varying the expected values. For example, if frags are relatively important in the mid tiers, you can drop the expected frags value for mid tiers. I was lazy and just shifted scouts by a tier.

    Part 2: Formula
    I did a fair amount of formula testing with my custom multiple linear regression solver. To avoid repeating WN8's primary mistake, platoon-padded players were filtered out using a model based on topgun, BIA and CC medals. Each component was automatically pre-linearised to avoid tracking the slight non-linearity in contribution vs winrate.
    Rather than chucking all the data into one test, I ran separate tests on various tiers and classes, with the following highlights:
    • Damage totally dominates frags at high tiers, but frags and damage have roughly equal importance at tier 6. This is probably because high tier tanks see a far smaller hp variance than lower tiers.
    • Cap points have a strong influence at high tiers and for scouts, although not for mid tiers. I'm not going to advocate including them, but it does suggest that the pathological fear of the cap circle you see in high tier games is counter-productive.
    • Scouts really like frags*spots, and to a lesser degree dmg*spots. The correlation is actually quite good these days (R² ≈ 0.83), not far below mediums.
    • Mediums also like frags*spots more than heavies do, but the difference isn't huge.
    • Artillery just wants damage & frags, nothing else. Correlation is pretty bad, probably because artillery had a big performance dive since the missions started.
    • Defense tends to come and go, and it's never large. Frags*defence is generally preferred to plain defence.
    • Parameter caps are generally counterproductive as long as the components are roughly linearised.

    Now, formulas aren't magical. You can mix and match various components and the correlation doesn't change much as long as they're roughly linearised. There are also reasons to prefer a formula with an inferior overall correlation:
    • Damage is far more accurate than other parameters over short runs. Important if you want a good per-tank stat.
    • Trivially-paddable parameters should be avoided, especially if they may harm the game. Obvious examples are cap points and survival rate (which is generally negative anyway), and it's another reason to lessen the use of frags.
    • You may want the formula to be more accurate for players who care about the result. Hence the formula may prioritise accuracy for high tiers and medium tanks at the expense of mid tiers and artillery.

    I compromised on 0.7*rDmg + 0.25*sqrt(rFrag*rSpotC) + 0.05*sqrt(rFrag*rDef), although there are likely benefits from class-splitting the expected values for scouts and arty. rDef should probably be capped and scaled to prevent it blowing up for tanks with a handful of games.
    Single battle formulas (as used by mods) shouldn't have multiplications in them because they make the result too noisy. Something like 0.65*rDmg + 0.12*rFrag + 0.15*rSpotC + 0.03*rDef would be fine.
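
For anyone who wants to play with the numbers, the two formulas quoted above can be sketched in a few lines of Python. This is only an illustration, not the code behind the stats site. It assumes each r-value is the usual WN8-style ratio of a player's average to the expected value for the tier or matchmaking type, which the post implies but doesn't spell out. The function names are made up for this example, and the post doesn't define the "C" in rSpotC, so it is simply passed through as given.

```python
import math


def ratio(actual, expected):
    """Player average divided by the expected value (assumed WN8-style ratio)."""
    return actual / expected if expected > 0 else 0.0


def wn9_account_blend(r_dmg, r_frag, r_spot_c, r_def):
    """Account-level blend quoted in the post, before any per-tank adjustment."""
    return (0.7 * r_dmg
            + 0.25 * math.sqrt(r_frag * r_spot_c)
            + 0.05 * math.sqrt(r_frag * r_def))


def wn9_single_battle(r_dmg, r_frag, r_spot_c, r_def):
    """Additive variant suggested for single battles (no multiplied terms)."""
    return 0.65 * r_dmg + 0.12 * r_frag + 0.15 * r_spot_c + 0.03 * r_def


# Hypothetical player: 1500 average damage against an expected 1200, and so on.
r_dmg = ratio(1500, 1200)
r_frag = ratio(1.1, 1.0)
r_spot_c = ratio(1.0, 1.25)
r_def = ratio(0.5, 0.5)
print(wn9_account_blend(r_dmg, r_frag, r_spot_c, r_def))
print(wn9_single_battle(r_dmg, r_frag, r_spot_c, r_def))
```

With every ratio at exactly 1.0 the account blend returns 1.0, since its weights sum to one. The single-battle weights as quoted sum to 0.95, so that variant returns 0.95 for the same input.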

    Part 3: Per-tank adjustment
    Only two values are used per tank, an offset and a scale. They're not easily comparable cross-tier, because they also encode tier difficulty variations. Within a tier, tanks with a higher offset work better for bad players, while tanks with a higher scale work better for good players. If both values are low, then that tank is simply bad.
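
The post doesn't say how the offset and scale are actually applied, so treat the following as a guess that happens to fit the description rather than the real adjustment. It assumes a tank's raw result behaves roughly like offset + scale * skill, which is consistent with a high offset helping weak players and a high scale helping strong ones. The function name and all the example numbers are hypothetical.

```python
def tank_adjusted(raw, offset, scale):
    """Invert the ASSUMED linear model raw = offset + scale * skill.

    This is an illustrative guess, not the formula used in the post.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return (raw - offset) / scale


# Two made-up tanks in the same tier.
forgiving = {"offset": 0.30, "scale": 0.80}   # high offset: kind to bad players
rewarding = {"offset": 0.10, "scale": 1.10}   # high scale: kind to good players

for skill in (0.5, 1.5):
    # Raw result each tank would produce under the assumed model.
    raw_forgiving = forgiving["offset"] + forgiving["scale"] * skill
    raw_rewarding = rewarding["offset"] + rewarding["scale"] * skill
    print(skill, round(raw_forgiving, 3), round(raw_rewarding, 3))
```

Under this assumed model the low-skill player gets the higher raw result in the high-offset tank and the high-skill player gets it in the high-scale tank, and `tank_adjusted` maps both back to the same underlying value.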
    The values are generated using a regression method from a 6-week sample of 300k high-activity EU players. Earned XP values are used to exclude stock tanks and to perform some mild crew-skill & player learning adjustment. Non-random battles are filtered to an extent, with a reduction to around 1% of all battles. Most tanks from tier 5 upwards had sufficient data, although a handful had values substituted manually: One version each of the KV-220 and Pz V/IV, plus the IS-5 and ISU-122C.
    Various issues with the per-tank adjustment:
    • Tiers 1-3 (especially 1) are much more difficult since the newbie protection was introduced. Older sealclubbing performances will be overrated. There is an option of using older data for the low tiers instead.
    • Arty have been doing relatively badly since the missions, so arty played previously will come out relatively high.
    • As this is based purely on recent data, performances in pre-nerf tanks will be overrated.
    • The T-55A and Object 260 are probably overrated, due to players switching back from completing missions to padding damage.

    Testing
    To test how good various metrics are as overall metrics, I used the following method:
    • Throw away accounts with significant platoon-padding.
    • Calculate the result of each metric for each account.
    • Place the results in bins split by tank-adjusted winrate.
    • Calculate the standard deviation of the results in each bin.
    • Scale the results for each metric so that they're roughly comparable.

    Lower standard deviations mean that players with similar "skill" have less numerical variation in that metric. Tank-adjusted winrate is used for "skill", because it's probably the best account metric for solo players. When scaling the metrics, I left the zero points alone, so to get an idea of relative error you need two graphs:

    Notes:
    • WN8 and especially WG-PR should gain an advantage here from using winrate in their formulas (and proxies such as survival rate and base XP), at the expense of accuracy in platoon-padding cases. The vast majority of high-skill players are somewhat platoon-padded.
    • I used true battle-weighted WN8 and WN9 rather than the usual fudge, because all the other metrics involved are battle-weighted. This may give both metrics an unfair advantage over WG-PR: I didn't check which weighting method resulted in higher error.
    • Using rWin (WN8 component) instead of tank-adjusted winrate made little difference to the relative position of the metrics, although it helped WN8 slightly at the middle of the skill range. All errors increased with rWin, because it's an inferior metric.
    • Zero points are optional. I chose zero-means-zero for WN9 to retain linearity for player comparisons, even though players who do nothing don't really exist. WN8 uses something like zero-means-average-bot.
    • It wasn't possible to get accurate top-end values for WG-PR because you can't calculate it for a subset of an account.

    Observations:
    • WN8 is an increasingly poor metric with increasing skill, as expected. On absolute error it cheats at the low end, because the non-linear terms and zero-point caps drag the bad players closer together.
    • TEFF is terrible in the middle, but better than WN8 at the top end, as expected.
    • WG-PR runs pretty close to WN8 on relative error, except at the low end where it's the single best metric. This is probably because the bottom of the graph is heavily populated with deep campers, and so access to spotting damage and direct use of winrate are big advantages.
    • The WN9 candidate is a much bigger improvement than I expected.

    Ideally I'd also do a per-tank error comparison, but because I changed my data collection methods a few times I don't have sufficient data for it. Eyeballing recent tanks on various accounts suggests that the per-tank error is very low compared to WN8.
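
The binning test described under Testing is simple enough to sketch. The version below is a minimal illustration and not the script used for the graphs. It assumes platoon-padded accounts have already been filtered out upstream, takes winrate as a percentage, and uses a 1-point bin width, none of which the post specifies. It also leaves out the step of scaling the metrics against each other, since the post doesn't say how that was done. The function name and sample data are made up.

```python
import math
from collections import defaultdict
from statistics import pstdev


def spread_by_winrate_bin(accounts, bin_width=1.0):
    """Standard deviation of a metric within each tank-adjusted winrate bin.

    accounts: iterable of (winrate_percent, metric_value) pairs.
    Returns {bin_lower_edge: population standard deviation}.
    """
    bins = defaultdict(list)
    for winrate, value in accounts:
        bins[math.floor(winrate / bin_width)].append(value)
    # Bins with a single account have no meaningful spread, so skip them.
    return {index * bin_width: pstdev(values)
            for index, values in sorted(bins.items())
            if len(values) >= 2}


# Hypothetical (winrate %, metric) pairs for one metric.
sample = [(48.2, 900), (48.7, 1100), (52.1, 1500), (52.9, 1500), (60.4, 2400)]
print(spread_by_winrate_bin(sample))
```

Running the same function once per metric and comparing the per-bin values gives the kind of comparison described above: a lower value in a bin means accounts of similar winrate get more similar scores from that metric.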

    Implementation
    I threw together some terrible noob javascript for an experimental WoT stats site, which you can access here. Have a look through your per-tank results for anything weird:
    http://jaj22.org.uk/wotstats.html
    Feel free to dig out the javascript and data file if you want it. The code's kinda hidden amongst a pile of other metrics, so I'll do a readable version eventually.
    The account WN9c2 values only use account/info and account/tanks, so they have essentially the same overhead as WN8. The tier and tank values use tanks/stats instead. Ideally the tier & tank values would only use random data, but the API's so broken that you end up with a lot of negative junk, so currently they include clan, team & stronghold battles as usual.
     
    Further work
    • Tweak the expected values to improve some tank classes.
    • Make a decision on what to do with tiers 1-3.
    • Cap & scale defense to prevent it blowing up on low game counts.
  4. Upvote
    SteelRonin got a reaction from ViktorKitov in Hi :3   
    I like purple and blue colours