Ravenmaster

Effect of Platoons on Personal Statistics aka Preliminary Stat-Padding Optimizer

11 posts in this topic

I copied and pasted this post from the WoT Forums in the hopes that I could obtain some positive feedback and collaborators from a more scientifically-minded community.

 

 

1. Introduction
Many WoT players have opinions about the effects of platooning or Tank Companies on personal stats such as Winrate, Survival Rate, Efficiency, WN6 or Performance Rating.  I believe most people think that platooning in general will have a positive effect on Winrate and a negative effect on Efficiency (or any other individual performance calculation).  (I just want to clarify here that I have no interest in knowing what your personal opinion is about these effects, so please don’t spam this thread by posting them.)
In order to get some real evidence, not based on personal experience or speculation, I designed an experiment which relates Player Skill, Platoon Size, and Communication to Winrate, Survival Rate, and Efficiency.
Due to the maximum post size, I have to break this into parts. I will explain the entire experimental process, although it’s a bit technical at times. If you don’t know anything about experimental design and/or don’t care, I suggest you skip down to section 7.2 (Preliminary Stat-Padding Optimizer).

2. Objectives

  1. Determine the effect of platooning on personal statistics.
  2. Determine what part of that effect depends on teamwork and what part depends only on having other players of a certain skill level on your team.
  3. Find equations that show WoT players how to improve their stats only based on the way in which they enter a battle, and not on their tactics within the battle.
  4. Generate interest within the WoT community in order to perform more robust, more complete experiments in the future.



3. Experimental Design



3.1 Factors (Independent Variables)      
                                
Three factors were taken into account for the purpose of this experiment:    
      
sknqyRu.jpg                        
                                                  
  1. Platoon: The number of people in the platoon                                                                                    
  2. Communication: The tools used for communicating with the platoon and the rest of the team. No Communication means that no chat or hotkeys such as “Help!” or “Follow me!” are allowed.                                                    
  3. Skill: Skill for each player is based on Efficiency, not because I think it’s the best measure of individual skill, but because World of Tanks Statistics (WoT Statistics) was used to gather data in this experiment, and it shows Efficiency directly.  The Efficiency used to determine the Skill level of each player was the average between their overall Efficiency, and their Efficiency in the tank they would be using for the experiment. All players within the same platoon needed to fall into the same Skill range.

              

3.2 Responses (Dependent Variables)
WoT Statistics was used to keep track of the results of each match. Three of these results were saved into Excel spreadsheets:

  1. Winrate                                (WR)
  2. Efficiency                             (Eff)
  3. Survival Rate                      (SR)

Besides these three direct responses, 3 more relative responses were calculated by subtracting the results of each sample from the average results from the same player over all samples:

  1. Relative Winrate              (Rel WR)
  2. Relative Efficiency           (Rel Eff)
  3. Relative Survival Rate     (Rel SR)
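
As a rough illustration of how the relative responses could be computed, here is a minimal Python sketch. The record layout (`player`, `WR`, `Eff`, `SR`) and the demo numbers are made up for illustration and are not the experiment's data. The sign convention (sample minus the player's own average) is an assumption; flip it if the original spreadsheets used the opposite direction.

```python
from statistics import mean

METRICS = ("WR", "Eff", "SR")

def relative_responses(samples):
    """samples: list of dicts with keys 'player', 'WR', 'Eff', 'SR'.

    Returns copies of the samples with 'Rel WR', 'Rel Eff' and 'Rel SR' added.
    """
    # Group each player's samples so we can average over all of them.
    by_player = {}
    for s in samples:
        by_player.setdefault(s["player"], []).append(s)

    averages = {
        player: {m: mean(row[m] for row in rows) for m in METRICS}
        for player, rows in by_player.items()
    }

    out = []
    for s in samples:
        row = dict(s)
        for m in METRICS:
            # ASSUMED sign: sample result minus the player's own average.
            row["Rel " + m] = s[m] - averages[s["player"]][m]
        out.append(row)
    return out

# Made-up numbers, purely for illustration.
demo = [
    {"player": "A", "WR": 60.0, "Eff": 900.0, "SR": 30.0},
    {"player": "A", "WR": 40.0, "Eff": 1100.0, "SR": 50.0},
]
result = relative_responses(demo)
```

With this convention, a player's relative responses always sum to zero over their own samples, which removes the player-to-player baseline from the comparison.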

3.3 Design Matrix

A Box-Behnken experimental design matrix was created, with 2 replicates of each sample.
rWiOLSB.png
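
For readers unfamiliar with Box-Behnken designs, the sketch below builds the standard three-factor version in coded levels (-1, 0, +1). The mapping from coded levels to the actual factor levels and the choice of two centre points per replicate are assumptions, not taken from the design matrix image. With two centre points, each replicate has 14 runs, which does match the per-player sample counts described in section 4.1.

```python
from itertools import combinations, product

def box_behnken_3(n_center=2):
    """Three-factor Box-Behnken design in coded levels.

    Each pair of factors is run at (+/-1, +/-1) while the third stays at 0,
    and n_center centre points (0, 0, 0) are appended.
    """
    runs = []
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0, 0, 0]
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0, 0, 0)] * n_center)
    return runs

# ASSUMED mapping from coded levels to the factor levels used in the post.
LEVELS = {
    "Platoon": {-1: 1, 0: 2, 1: 3},
    "Communication": {-1: "none", 0: "keyboard", 1: "voice"},
    "Skill": {-1: "bad", 0: "average", 1: "good"},
}

design = box_behnken_3()
named = [
    {name: LEVELS[name][code] for name, code in zip(LEVELS, run)}
    for run in design
]
```

Note that the design never combines the extreme levels of all three factors at once, which keeps the number of samples well below the 27 of a full 3x3x3 factorial.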



4. Experimental Procedure


 

4.1 Recruitment

Because one of the factors to be taken into account is skill, it was necessary to select players based on their Efficiency.  Six players, two from each skill level, were chosen to participate in the experiment. Each Good or Bad player performed 4 samples, while the Average players performed 6 (see the design matrix).  A special Excel spreadsheet was made for each participant with instructions and a format for capturing data.

4.2 Description of a Sample

Each sample in the experiment consisted of playing 10-20 battles with the appropriate Platoon and Communication levels, taking into account the following considerations:

  1. The sample should be run until 10 battles are completed. If a sample of 10 battles ends with fewer than 2 victories or fewer than 2 losses, the players should continue playing and recording data until at least 2 victories and 2 losses are obtained or they accumulate 20 battles.
  2. All battles for all samples done by the same player must be performed with the same tank, crew, modules, equipment, consumables, ammunition and camouflage.
  3. All battles of the same sample must be done with the same platoonmates, and they must be using the same tanks.
  4. The results of any battles in which one platoonmate is severely affected by connectivity or other technical issues should not be recorded.
  5. All platoonmates must be within the same Skill range.
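
The stopping rule in consideration 1 can be written as a small check. This is only a sketch: the function name and the 'W'/'L' encoding are hypothetical, and it assumes any other outcome (such as a draw) counts as a battle but as neither a victory nor a loss.

```python
def sample_complete(outcomes):
    """Return True once a sample has enough battles under the 10-20 rule.

    outcomes: list of strings, 'W' for a victory and 'L' for a loss.
    Anything else (e.g. 'D' for a draw) is ASSUMED to count as a battle only.
    """
    n = len(outcomes)
    if n < 10:
        return False          # always play at least 10 battles
    if n >= 20:
        return True           # hard cap at 20 battles
    wins = outcomes.count("W")
    losses = outcomes.count("L")
    # Between 10 and 19 battles: stop only with at least 2 wins and 2 losses.
    return wins >= 2 and losses >= 2
```

For example, a run of 10 straight victories does not end the sample, while 8 victories and 2 losses does.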

 

5. Results



The final results of each sample, which are the average Winrate, Efficiency and Survival Rate over the course of 10-20 battles, are shown in the following table:

iWy7Evs.png



6. Analysis


For each response, a number of regression models were tested to determine the relationship between the factors and responses.  The models used took into account the linear, quadratic and interactive effects.  The following table shows the coefficients and P-values of the best models found.                                                                              

HWWtfnZ.png
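
For reference, a model with linear, quadratic and interaction effects of three factors has the general second-order form below. This is the textbook form only. The symbols are assumptions (x_1 = Platoon, x_2 = Communication, x_3 = Skill), and the best models in the table presumably keep only a subset of these terms.

```latex
y = \beta_0
  + \sum_{i=1}^{3} \beta_i x_i
  + \sum_{i=1}^{3} \beta_{ii} x_i^2
  + \sum_{1 \le i < j \le 3} \beta_{ij} x_i x_j
  + \varepsilon
```

Here y is one of the six responses, the \beta_i are the linear effects, the \beta_{ii} are the quadratic effects, the \beta_{ij} are the interactions, and \varepsilon is the residual error.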
 

6.1 Regression Models

Using the coefficients from the table above, we have the following equations to predict each response based on the factors, where:

KUGsVe0.png
 

6.2 Table of Predictions

Using the aforementioned regression models, the following table was created which shows the expected outcomes for each possible combination of Platoon, Communication and Skill.
                                        
ezK5TGX.png
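
To show how a table like this can be generated from a fitted model, here is a minimal sketch that evaluates a second-order model over all 27 combinations of the three factors. The coefficients are PLACEHOLDERS chosen only to make the example run; they are not the coefficients from this experiment. The factor levels (Platoon 1-3, Communication 0-2, Skill 1-3) and the use of natural rather than coded units are also assumptions.

```python
from itertools import product

def predict(coef, platoon, comm, skill):
    """Evaluate a polynomial model given as {term: coefficient}.

    Terms are written like 'P', 'C*C' or 'P*S', with P = Platoon,
    C = Communication and S = Skill. 'const' is the intercept.
    """
    x = {"P": platoon, "C": comm, "S": skill}
    y = coef.get("const", 0.0)
    for term, b in coef.items():
        if term == "const":
            continue
        value = 1.0
        for factor in term.split("*"):
            value *= x[factor]
        y += b * value
    return y

# PLACEHOLDER coefficients, NOT the values from the experiment.
coef = {"const": 50.0, "S": 5.0, "C": -8.0, "C*C": 4.0, "P*S": 1.0}

# One prediction for every combination of Platoon, Communication and Skill.
grid = {
    (p, c, s): predict(coef, p, c, s)
    for p, c, s in product((1, 2, 3), (0, 1, 2), (1, 2, 3))
}
```

With a negative linear and a positive quadratic Communication term, as in these placeholder values, the middle communication level comes out lowest, which is the kind of parabolic behaviour a quadratic term produces.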

 

7. Conclusions

7.1 Observations



In the cases of Winrate, Efficiency, Survival Rate and Relative Winrate, we can conclude with over 99.9% certainty that at least one factor affects the response. In the case of Relative Survival Rate, there is still good evidence that the factors affect the response. There is not enough evidence to suggest that Relative Efficiency is significantly affected by these three factors.
This means that by playing in a platoon of a certain size or communicating with their team in a certain way, a player can significantly affect their Winrate and Survival Rate, but their personal achievements (or at least their Efficiency) do not change enough to distinguish between a real improvement and the natural fluctuation from one battle to the next.
It is interesting to note that Communication is the only factor that consistently appears with a significant quadratic term.  This means its effect is parabolic rather than linear. Based on the Table of Predictions, one can see that the central level of communication almost always gives the worst results.  This means that it is better to ignore the chat completely than to read it and type.  Speaking may be better or worse than not communicating, depending on other factors.  In my experience, this result makes a lot of sense.  When typing, one can lose critical seconds if an enemy suddenly appears, can crash into allies causing damage, death or annoyance to either or both, or could simply fail to arrive at the scene of the battle.  Reading messages can also have negative effects, since it is much more common for people to criticize or insult you than to say anything useful.
Another noteworthy observation is that Efficiency depends only on Skill, and that the predicted Efficiency levels of each Skill level fall within the appropriate ranges (except for Good players, who were a little bit above their range).  This means that it is very likely that players with a certain Efficiency will continue playing with a similar Efficiency.
Finally, it is interesting to note that Winrate and Survival Rate are very closely related. Where one increases, the other does too.  Obviously it is much easier to contribute to your team’s success if you are alive.



7.2 Preliminary Stat-Padding Optimizer


First of all, I want to define what I mean by the word “Stat-Padding,” as it may be different from your own personal definition. Stat-padding to me has neither a positive nor negative connotation. It simply means playing in a certain way in order to increase certain personal statistics.
By this definition, anyone who plays without paying any attention to statistics is not a stat-padder. Anyone who tries to increase their Winrate, Survival Rate, Hitrate, Damage per game, or any other stat or who hunts for specific medals is a stat-padder.  Anyone who brags about their stats or judges people based on stats is probably also a stat-padder.
The purpose of a Stat-Padding Optimizer is so that players may choose which stat they want to pad and find out how to do it.  There are many people who, for example, are happy with their Winrate, Survival Rate, Average Exp per Game and Efficiency, but try as they might, can’t increase their WN6. A Stat-Padding Optimizer would help them to focus on that stat.
For this experiment, we focus only on Winrate, Efficiency and Survival Rate, in terms of Platoon, Communication and Skill. Since each player cannot adjust their Skill without months of practice and certain innate ability, I’ve divided the Table of Predictions into three parts, one for each Skill level.

The following table is for players with under 800 Efficiency:  
oCEKg4q.png                                    
According to this table, the best thing a bad player can do is play alone and ignore the chat.  The second best thing they can do is to play in a platoon of three, but have good voice communication.  In case they only have one other friend online with whom they want to platoon, they should ignore them and the rest of the team. The worst thing a bad player can do is platoon with two other bad players and not talk to them.

The following table is for players with 800-1200 Efficiency:      
GopElt6.png                                
The results for average players are similar to those for bad players. The best thing an average player can do for their winrate is play alone and ignore chat.  If they want to have a better survival rate, they should play in a platoon of 3 with voice communication. In case they want to play in a platoon of two, voice communication is recommended. If you are an average player and don’t have a microphone, platooning will likely have detrimental effects on your stats.

The following table is for players with over 1200 Efficiency:        
Qbm6MxE.png                              
Good players are the only ones who can really increase their stats by platooning.  The best thing a good player can do is platoon with two other good players and talk to them.  The second best is to platoon with one other good player and talk. If playing solo, a good player should ignore chat, just like everyone else.  Playing alone and typing will greatly decrease Survival, while playing in a platoon of 3 but not talking will reduce Winrate.

In general, the most important conclusions to be drawn from this experiment are:
  • Never platoon without a microphone.
  • Typing gets you killed.
  • You have to be good before you can stat-pad effectively.




7.3 Sources of Error

  1. The sample size of 10 battles is too small. We knew this from the beginning, but it’s hard enough to convince people to play 60 consecutive battles in the same tank, and this is only a preliminary analysis.
  2. In the experimental design, the sample “Platoon 1, Communication 2, Skill 2” makes no sense. You can’t communicate with your team by voice if you are playing Solo. This sample was taken as “Platoon 1, Communication 1, Skill 2”, meaning that the player was communicating with their team by keyboard.
  3. Efficiency fluctuates enormously from one battle to another. It is not strange for a player with 600 Efficiency to gain over 3000 in one battle, and one can gain over 16000 simply by resetting the cap in the last seconds.  This is part of the reason why there is no notable change in Efficiency based on Communication and Platoon levels.  More consistent stats such as hitrate would probably give more satisfactory results.
  4. Almost all the samples had spectacular results for Winrates.  The regression models predict winrates up to 98.7%, which is beyond anyone’s reach over a large number of battles. Bad players play solo without communication all the time and definitely don’t get 74.4% winrates. Since the experimenters had such amazing results, the predictions are skewed to show higher responses than we can really expect. Nevertheless, I’m pretty sure these tendencies do exist, just not to the magnitude expressed in the table of predictions.
  5. When almost all the samples had been done, the 8.6 patch came out, so some samples were done in 8.5 and others in 8.6.  This almost certainly added error to the results, considering the fundamental changes in gameplay mechanics that came with the new patch.
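
To put a number on the first source of error, the sketch below computes an approximate 95% confidence interval for a win rate using the Wilson score interval. This is a standard textbook formula swapped in for illustration, not something used in the analysis above, and it treats battles as independent with a fixed win probability, which is itself an assumption.

```python
from math import sqrt

def wilson_interval(wins, battles, z=1.96):
    """Approximate Wilson score interval for a win rate.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if battles <= 0:
        raise ValueError("battles must be positive")
    p = wins / battles
    denom = 1 + z * z / battles
    centre = (p + z * z / (2 * battles)) / denom
    half = z * sqrt(p * (1 - p) / battles + z * z / (4 * battles * battles)) / denom
    return centre - half, centre + half

low_10, high_10 = wilson_interval(7, 10)      # 7 wins in a 10-battle sample
low_100, high_100 = wilson_interval(70, 100)  # same win rate, 100 battles
```

For 7 wins in 10 battles the interval runs from roughly 40% to 89%, so a single 10-battle sample cannot distinguish a bad result from an excellent one. The interval for 100 battles at the same win rate is much narrower.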


7.4 Future Projects

This experiment was done with a small group of players over a limited time, but the results do give us some useful information.  With more participants, more time, and larger sample sizes, we could create much more robust experiments.  Here are a few suggestions:

  1. A stat-padding optimizer that takes into account all the outputs given by WoT Statistics, and all the personal performance calculations that can be derived from them.  Possible new factors include Vehicle Class, Tier, Use of Premium Shells, Platooning with players of other Skill levels, or overall tactics (like playing aggressively or defensively).
  2. A credit optimizer could also be established by taking into account net credit gain depending on whether or not you use consumables and premium ammo at different tiers.
  3. Experiments could also be done on individual tanks to determine which gun gives better results or how much free experience is gained when grinding a new vehicle depending on whether it was played from stock or all the top parts were immediately bought with free exp.




8. Request from the Community

    The purpose of posting these results on the WoT forum is to obtain feedback from the more scientifically-minded WoT community, regarding which experiments would give the most useful results and how to design them.
I am also looking for players of all skill levels who are interested in participating in further experiments. You can either respond here or PM me.
I am open to constructive criticism, inquiries and revealing more details about any part of the process. Having said that, I know this forum is filled with trolls, so if you don’t want me to permanently ignore you, please don’t do any of the following:

 

  1. Tell me your personal opinions on platooning and/or stat-padding, unless you are doing so as a preliminary to suggest an experiment.
  2. Criticize the experiment unless you read the whole thing and both understand it and know what you’re talking about. (A hint: if you don’t at least have a university degree in some field related to math, science, or engineering, you probably don’t know what you’re talking about.)
  3. Make fun of the wording I used.
  4. Claim that I said something I didn’t or that I didn’t say something I did.

Finally, I’d like to thank everyone who participated in this experiment. Their names will remain anonymous, but they know who they are.
-Ravenmaster


The fact you're using efficiency instead of WN7 already has me doubting the legitimacy of this. It was already shown to be a poor metric years ago, and replaced by many stat-tracking sites with WN7/8. 

 

The sample sizes are way too small as well. 

 

Some examples of how badly skewed the results can be. 

 

My first 30 T57 games had 3400 DPG, a few hundred games later, 4450. 

My first 30 T110E4 games had 6000~ DPG, a few hundred games later, 4850. 

 

 



Yes, I am well aware that Eff is not the best measure of personal skill, but the ranges I used were so enormous anyway, that I don't think it makes much difference.

 

Also, I agree about small sample sizes.

 

Let me clarify right now that this was a preliminary experiment done with very few volunteers over a limited time.  The purpose is not to give conclusive results, but to generate interest and recruit participants both for experimental design and data collection.


So if I were to participate in your experiment, what exactly would I need to record?

I could do:

• Platoon: A platoon of 3

• Communication: Keyboard communication

• Skill:

It seems that I somehow exceed your skill scale though..?

I'm interested because I rarely speak while platooning. It doesn't really seem that typing gets me killed either.


A future experiment would not take into account the same factors.  I think the communication one is not really necessary. I'd much rather take into account factors such as tier and tank class.  Skill ranges can be adjusted.  If platoon were a factor, you'd need to do platoons of 3, 2 and solo.  Some people express interest in multi-skill platoons, which bad players would gladly sign up for and unicums would not.  In any event, I don't think that's a valid stat-padding technique because if you're bad and not participating in the experiment, it is very unlikely that good or better players will platoon with you anyway.

 

Anyway, are you actually interested in data collection, or were you just joking?


For skilled players

while playing in a platoon of 3 but not talking will reduce Winrate.

 

Me, Kewei, Verilogus, Banzai, and geraldjack used to platoon all the time without any communication device and almost never type much in chat.



Those results basically apply to the 6 players who participated in the experiment.  If they are a good representation of the entire WoT community, then these results can be applied to everyone. 

 

Of course, you're talking about some of the top players on the server. They are outside the range of Skill level 3, so they'd have different results.  If you're interested in a Stat-Padding Optimizer which takes into account your skill level and that of your friends, I'd welcome your participation.


I think this is a very interesting start, however, I do believe there are a couple things you should do to simplify the experiment if you plan on taking things further. I am not a statistician, do not have a degree in math (I failed calc miserably in high school and college), but I know enough about armchair statistics, economics, and game theory that I can understand the difficulties of capturing differences between variables in such a dynamic environment. I think your conclusions from the limited data set show an inkling of what a larger sample size will get you, however, the sample size is so small and the number of variables so large that the variance on your results is going to fall into an unacceptable range.

 

Currently, you are attempting to account for three variables, and trying to suss out the difference between all three.  I think if the experiment is to be continued, you will achieve better results if you focus on a smaller set of variables, and move from there.  I actually really like the dynamic of comparing solo to duo to trio, and comparing the differences in performance over time with different skill levels.

 

In order to make the results more meaningful, I think you would be better off aiming for fewer variables and a much higher count of trials - as you noted in the OP, this is not going to be easy, and some of the results are going to merely display the obvious (Bads platooning hurts them, averages will give mixed to slightly positive results, goods will show a marked increase in WR, and unicums would show a drastic increase in WR).  I do feel that your categorizations are not correctly placed, too.

 

My recommendations are as such to reduce the number of variables:
 

Have every single person in the study use the same tank.  It should be a tank that provides a more stable MM environment, too, so one group doesn't get boned while another group gets to roll over and crush everyone.  So, IMO, the tank that should be used in the study should be a preferred MM one that many people own - my recommendation would be the IS-6. With T8 preferred MM, it gives the players a strong tank that is guaranteed to be in an environment where it can be effective.

 

The rationale for that is several things:

 

The tier 8/9 environment is not 90% bad players, so the mediocrity of average players will be less mixed, i.e., average players can ROFLstomp in tier 2s, or get squished in tier 10s, and everything in between.  Comparing the results of average players in T18s with good players using IS-7s is just adding unnecessary variables.  Furthermore, the players at tier 8/9 have more experience and skill than those at low tiers, and the average player faced in that environment is orange/yellow/green, rather than the sea of red at low tiers.  In other words, you're comparing players in one of the most commonly played environments in the game, and eliminating variables in the process. It will let average players be average, bad players be bad, and great players be great.

 

So, there is one variable that you did not control for that needs to be eliminated somehow.

 

 

Next:

 

Forget communication for now.  I appreciate the concept, however, you should simply mandate that platoons use TS (Kitty TS should be fine, we have tons of space).  Again, this is for simplification purposes, and another study can be done looking at multiple variables once you get the basics down here.

 

Also, forget multi-skill platoons for now.  Why?  It's a complex variable.

 

So, IMO, you should begin by looking for one thing, and one thing only: how much platooning solo/2/3 affects win rate in a consistent gameplay environment, given platoons of variable skill levels.

 

Once you establish that, you will have a real baseline to compare to.  I would recommend requiring the use of WOT Statistics, and show a screenshot of recent battles with WN7 enabled in the results for all players involved.

 

Then, you take all of the "bad, solo" groupings, and put them in one pile. You can have someone play 20 games, but that is still an inadequate sample size to be included as a data point.  So, put it in one pile with all the other "solo, bad" results.  In order to pool them effectively, you will want tighter skill groupings.  I would recommend using last 1000 battle WN7 scores, and aim to keep them +- 100 for each skill range you choose.  With tier 8 as the target, you should probably aim for something like 600-800 (bad), 1000-1200 (average), and 1400-1600 (good), and you may want to consider an 1800-2000 bracket too.

 

Again, the tighter groupings of WN7 (slightly higher than yours) are meant to normalize for the more experienced environment that the test takes place in.  Most people in tier 8-9 battles have several thousand battles under their belt - and an average player with 5k games is much better than an average player amongst those with 2k games.

 

Pooling all of the results, rather than looking at an individual sample of 10 or 20 games, will enable you to more easily take in results from players.  If a platoon plays ten games together for your study, that is useless on its own, but useful as part of a larger pool (which is made more useful by selecting for tank and WN7 much more carefully).
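
A minimal sketch of this pooling idea follows. The WN7 brackets are the ones suggested above; the record layout, the function names, and the choice to drop players who fall in the gaps between brackets are assumptions made for illustration, and the demo reports are made-up numbers.

```python
# WN7 brackets as suggested above; the fourth has no name in the post,
# so its range is used as a label.
BRACKETS = {
    "bad": (600, 800),
    "average": (1000, 1200),
    "good": (1400, 1600),
    "1800-2000": (1800, 2000),
}

def bracket_of(wn7):
    """Return the bracket name for a WN7 score, or None if it is in a gap."""
    for name, (low, high) in BRACKETS.items():
        if low <= wn7 <= high:
            return name
    return None

def pool(reports):
    """Sum battles and wins per (skill bracket, platoon size) pile.

    reports: iterable of dicts with 'wn7', 'platoon_size', 'battles', 'wins'.
    """
    pooled = {}
    for r in reports:
        bracket = bracket_of(r["wn7"])
        if bracket is None:
            continue  # ASSUMED: players between brackets are excluded
        key = (bracket, r["platoon_size"])
        total = pooled.setdefault(key, {"battles": 0, "wins": 0})
        total["battles"] += r["battles"]
        total["wins"] += r["wins"]
    return pooled

# Made-up reports, purely for illustration.
demo_reports = [
    {"wn7": 700, "platoon_size": 1, "battles": 20, "wins": 9},
    {"wn7": 650, "platoon_size": 1, "battles": 10, "wins": 4},
    {"wn7": 900, "platoon_size": 1, "battles": 20, "wins": 10},
    {"wn7": 1500, "platoon_size": 3, "battles": 15, "wins": 10},
]
piles = pool(demo_reports)
```

Each pile then accumulates battles from many short sessions, so the win rate of a pile rests on far more games than any single 10 or 20 game sample.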

 

 

From that point, you can then run another series with the same basic controls but modifying one of the variables (IS-6, no communication), and you will have a much more significant sample size to compare against, as you will have a more homogeneous data set.


Anyway, are you actually interested in data collection, or were you just joking?

Well, this depends on what data I would need to collect, a question that I asked earlier as well.


I'm keeping tabs in a thread on the EU forums. Battle results are posted there too, but it's hardly anything scientific. I'm doing it more to get some pressure behind me doing well, since I hit a ceiling in performance recently.

 

The sample size so far is too small to draw conclusions from, but the results are surprising nonetheless...

 


Total Stats:

Total games played: 259, won 169 (65.3%)
Solo games played: 170, won 117 (68.8%)
Platooned games played: 90, won 52 (57.8%)
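
One rough way to check whether the solo and platooned win rates above really differ is a two-proportion z-test. The sketch below is a standard textbook test added for illustration, not something from the linked thread, and it assumes independent battles. It uses the solo and platooned counts as posted, even though they sum to 260 rather than the stated total of 259.

```python
from math import erf, sqrt

def two_proportion_z(wins1, games1, wins2, games2):
    """Two-sided z-test for the difference between two win rates."""
    p1 = wins1 / games1
    p2 = wins2 / games2
    pooled = (wins1 + wins2) / (games1 + games2)
    se = sqrt(pooled * (1 - pooled) * (1 / games1 + 1 / games2))
    z = (p1 - p2) / se
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value

# Solo: 117 wins in 170 games. Platooned: 52 wins in 90 games.
z, p_value = two_proportion_z(117, 170, 52, 90)
```

This gives a z of about 1.8 and a two-sided p-value of roughly 0.08, which fits the comment that the sample is still too small to draw conclusions from.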

 

http://forum.worldoftanks.eu/index.php?/topic/269935-the-influence-of-platooning/


Having platooned with Servios a few times, I can tell you the lack of voice chat is not an issue; most purples can read a map and the behavior of other tankers quite well and can usually tell when to push or hold.  I think the last time I ran with kilgor and Servios we went 13 and 2, each of us pulling over 2.5k dpg in T8s for the night.  Basically, from what I've seen of late, the pubs are now so bad that a platoon of purples can easily pull 80%/2K+ wn7 on the night even in a less than optimal platoon build.
