EchelonIII

EchIII Import #5: Why Streaks Happen


Let's do an experiment.

Let's say you flip a coin with a 50% chance of landing heads.

You flip the coin X times.

What is the longest streak of consecutive heads (or tails) you can possibly hope to achieve?

My answer is below:

The longest streak of heads (or tails) you can expect, out of a series of X throws, is approximately log2(X)

More on Streaks

Here's why.

The chance of a streak of N consecutive heads is (0.5)^N. This is because the odds of flipping four heads in a row, for example, would be (0.5 x 0.5 x 0.5 x 0.5), or 0.0625.

Now look at a streak of N consecutive heads: it is bound to start SOMEWHERE.

So, in a series of X flips, we have a certain number of starting points. If there are G starting points for a series, the average number of times a streak of N will occur is therefore G x (0.5)^N.

For example, if there were 100 start points, one would expect about three sequences of five or more heads, because 100 x 0.03125 is 3.125. Counting tails as well, we would expect about six streaks of five on average: three of heads and three more of tails.

What if we had a streak of 10 in a row out of 100 starting points (i.e. 109 flips)?

The average number of streaks of ten heads in a row out of 100 start points is 100 x (0.5)^10, which is about 0.1 (0.098, to be precise). Counting tails as well, the odds of having a streak of ten heads or tails in a row would be double that: about 0.2.

Let's see where this gets us.

If three players each flipped a coin 109 times (109 flips means 100 start points for a streak of ten), the odds of at least ONE of them getting ten heads or tails in a row are about 50%.

So, how do we find what the longest expected streak is?

In general, when X x (0.5)^N is greater than 1, we can expect at least one streak of length N to occur; when X x (0.5)^N is less than 1, we can expect it NOT to occur. The crossover point between the two is approximately N = log2(X), which gives us the longest expected streak.

Note that X is the number of STARTING points, not the number of FLIPS.

Let's use an example: if I flipped a coin 32 times, I would have 28 possible starting points for a streak of 5, 27 start points for a streak of 6, and 26 for a streak of 7.

On average, I can expect about 0.4 streaks of 7 (heads or tails) to occur, meaning such a streak is unlikely, but I can expect about 2 streaks of 5 to occur.
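To sanity-check the arithmetic, here's a quick Python sketch of the counting above (the helper name and the printed numbers are just for illustration, not part of the original maths):

```python
import math

def expected_streaks(start_points, length, p=0.5):
    """Expected number of runs of `length` consecutive results,
    each occurring with probability p, among `start_points` possible starts."""
    return start_points * p ** length

# 32 flips: 28 start points for a run of 5, 26 start points for a run of 7
print(expected_streaks(28, 5) * 2)   # ~1.75, i.e. about 2 runs of 5 heads OR tails
print(expected_streaks(26, 7) * 2)   # ~0.41, so a run of 7 is unlikely

# The break-even point X * (0.5)^N = 1 gives the longest expected streak
print(math.log2(32))                 # 5.0
```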


So what do we have here?

Let's assume a player determines his games purely by luck.

He plays 1000 games.

His longest expected streak would be about 10 games, and he can expect to have about two such streaks.
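If you'd rather simulate than trust the formula, here's a rough Monte Carlo sketch of the 1000-game case (assuming a flat 50% win chance per game; the helper name is just for illustration):

```python
import random

def longest_streak(n_games, win_chance=0.5):
    """Length of the longest run of identical results in a series of n_games."""
    best = current = 0
    last = None
    for _ in range(n_games):
        result = random.random() < win_chance
        current = current + 1 if result == last else 1
        last = result
        best = max(best, current)
    return best

# Average longest streak over many simulated 1000-game careers
trials = [longest_streak(1000) for _ in range(2000)]
print(sum(trials) / len(trials))   # typically lands near 10, matching log2(1000)
```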


The Best Part:

This is if the player has a 50% chance of winning; good players drag their odds of losing streaks down, and bad players drag them up.

For the purposes of this experiment, we shall ignore draws.

Let's say we have five players, each playing 1009 games (for 1000 start points). Let's calculate the number of ten-game losing streaks each can expect (for losing streaks alone, we do NOT double the number).

A 40%er would expect to have SIX such losing streaks

A 45%er would expect to have just 2.5 such losing streaks

A 50%er, as said earlier, could expect to have just ONE such streak

A 55%er is unlikely to have such a streak: he could expect just 0.34 of a streak, meaning that even after about TWO such cycles of 1000 games, the expected number of ten-loss streaks would only be about 0.7.

A 60%er could expect just 0.1 of such a streak; he would have only about a 50% chance of seeing one even after SIX cycles.
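Here's a small sketch that reproduces those numbers, assuming a flat loss chance per game, 1000 start points, and no draws (the helper name is just for illustration):

```python
def expected_loss_streaks(win_rate, start_points=1000, length=10):
    """Expected number of `length`-game losing streaks for a given win rate."""
    loss_rate = 1 - win_rate
    return start_points * loss_rate ** length

for wr in (0.40, 0.45, 0.50, 0.55, 0.60):
    print(f"{wr:.0%} player: {expected_loss_streaks(wr):.2f} expected ten-game losing streaks")
# 40%: ~6.05, 45%: ~2.53, 50%: ~0.98, 55%: ~0.34, 60%: ~0.10
```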

All this, of course, is theoretical and doesn't account for factors like playing form, but it's interesting to know. A bad player on bad form can easily expect losing streaks longer than that, and even a good 60% player is certainly not exempt.

 

 

 


Echelon, how much do you think a player's tendency toward frustration and trying to force wins on a bad day influences the likelihood of loss streaks?

 

Am I, and others, correct in thinking that an average player not taking a break when getting 3-4 losses in a row leads to bad loss streaks? Or is that just confirmation bias?


Warning: math will be involved.

 

I see it all the time

"Evidence MM is rigged"

"Evidence RNG is biased"

"Evidence WN7 favors xxx"

"Game hates me I lost 10 in a row...."

 

How can you tell if the game "hates" you? What can an average person expect from random events?

 

Now, an intelligent person will say this game isn't random and there is skill involved. This isn't really addressed to those people. This is addressed to the people who feel MM or RNG is biased against them, and to the kind of "proof" they need to establish those claims.

 

Let's start off by coin flipping, i.e. examining Crazy's stats. My win rate is very close to 50% (save your comments, I admit I am a scrub). Assuming all other things equal, in my roughly 10k battle lifetime, what can I expect my biggest win or loss streak to be? Well, since I am at 50/50, we can model this on a coin toss. IF my win rate is truly random due to MM/RNG etc., at a 50/50 win rate, what would be evidence of me being cheated by the system? Or, put another way, what kind of "streak", win or lose, could I expect to see if every game was truly 50/50?

 

The answer? About 13! That means I could expect a streak of losing 13 battles in a row at some point in my battle lifetime. So is 14 proof? No. It is just less probable.

 

Details:



 

The streak you can expect is log2(10,000), or the logarithm in base 2 of 10,000.

 

But wait:   I'm a unicum...I have a 70% win rate!!  We can do that too!  The answer is between 7 and 8 losses in a row during the course of a 10,000 game career.

 



 

The actual formula is log base (1/loss chance) of (battles), i.e. the logarithm, in base 1/(loss rate), of your number of battles.
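In code, that formula is just a one-liner (a minimal sketch of the same calculation):

```python
import math

def longest_expected_losing_streak(battles, win_rate):
    """log base (1/loss chance) of battles: the N where battles * loss_rate**N ~= 1."""
    loss_rate = 1 - win_rate
    return math.log(battles) / math.log(1 / loss_rate)

print(longest_expected_losing_streak(10_000, 0.50))  # ~13.3, the "about 13" above
print(longest_expected_losing_streak(10_000, 0.70))  # ~7.6, between 7 and 8 for a unicum
```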

 

 

The bottom line is that 10 games of data proves NOTHING; 10 measurements of the RNG is NOTHING. When compiling statistics you need THOUSANDS of data points, AND STREAKS CAN HAPPEN! You cannot pick and choose your data. Humans deal poorly with streaks. When comparing statistics there are established methods such as the t-test and ANOVA. When was the last time you saw someone on the forums quote a p-value?
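To make the p-value idea concrete, here is roughly what one looks like for the usual "MM is rigged because I'm below 50%" claim, using a plain normal-approximation test of an observed win rate against 50% (a sketch for illustration only; the numbers are made up and scipy is assumed for the normal CDF):

```python
import math
from scipy.stats import norm

def winrate_p_value(wins, battles, p0=0.5):
    """Two-sided z-test: is the observed win rate consistent with a true rate of p0?"""
    p_hat = wins / battles
    se = math.sqrt(p0 * (1 - p0) / battles)
    z = (p_hat - p0) / se
    return 2 * (1 - norm.cdf(abs(z)))

print(winrate_p_value(480, 1000))    # ~0.21: 48% over 1,000 games proves nothing
print(winrate_p_value(4800, 10000))  # ~0.00006: the same 48% over 10,000 games is a real deviation from 50/50
```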

 

Never heard of a p-value? A t-test? Anyone with basic statistics has...

 

Hopefully this can "Arm" you with a basic knowledge that

(A) Streaks can and do happen in "random" events, and they are bigger than our intuition would guess.

(B) There are established methods for comparing random events. IF someone is making an argument with statistics and can't quote things like a p-value, or is comparing small changes with a sample size < 1000, they are most likely winging it.

 

Other things to look for: the "alpha", or confidence level. 95% is standard.

Watch for overzealous use of "standard deviation": it makes people sound smart but has very little to do with comparing distributions. It is also really only of use for "bell curves".


Wasn't there a formula to calculate the minimum sample size needed to do certain hypothesis testing? Could be relevant, but at this point I just have a faint recollection of its existence....


Wasn't there a formula to calculate the minimum sample size needed to do certain hypothesis testing? Could be relevant, but at this point I just have a faint recollection of its existence....

 

Power Analysis. Very important for study design. 

http://www.ats.ucla.edu/stat/seminars/Intro_power/
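As a rough illustration of the idea, here's the textbook sample-size formula for comparing two proportions (a sketch only, not taken from the linked seminar; scipy is assumed for the normal quantiles):

```python
import math
from scipy.stats import norm

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size needed to detect a difference between p1 and p2."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Battles needed per account just to tell a 50% win rate apart from a 52% one
print(sample_size_two_proportions(0.50, 0.52))   # on the order of 9,800 battles each
```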

 

And I merged Crazy's topic into this one. 


Wasn't there a formula to calculate the minimum sample size needed to do certain hypothesis testing? Could be relevant, but at this point I just have a faint recollection of its existence....

(Sorry, the above thread was prior to my arrival -- I got tired of the "rigged" threads lately on the main forums.)

 

Yes (as crab linked). It is also important to establish your "confidence" intervals ahead of time, as they directly impact your needed sample size.

 

Confidence intervals are somewhat confusing, but two points matter here. First, you need to understand that the purpose of the various tests is to prove or disprove a hypothesis.

 

The "standard hypothesis" is essentially that the 2 populations are matched.

 

Type-I error is when the populations actually match but your statistics say they don't (aka a false positive).

 

Type-II error is when the populations don't match but your statistics say they do (a false negative).

 

There are cases where you may err on the side of caution. For example, judicially we really try to clamp down on false positives: our Type-I error is very, very low, but at the expense of Type-II errors (i.e., better a guilty person goes free than an innocent person goes to jail).

 

For blood tests, we try to eliminate false negatives: it's better to have a false positive for pregnancy than to assume you aren't pregnant when you really are.

 

Why I say "ahead of time" is one of the ways to "fudge" your answer is to change your confidence intervals after the fact.  You can actually "Raise" your confidence levels to get things to match when they shoudln't


"But suppose you throw a coin enough times...suppose one day, it lands on its edge."


 


Well thought out and explained... however... theory meets reality.

 

I have almost a 60% winrate and 24k games, so you would expect 2 such streaks for me, and OMFG, you're a wizard! I have had an 11-game and a 12-game losing streak. :D
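For what it's worth, plugging roughly those numbers into the log formula from earlier (a quick back-of-the-envelope check, ignoring draws) gives a longest expected losing streak of about 11, which lines up with the 11- and 12-game streaks:

```python
import math

# Longest losing streak to expect with ~60% win rate over 24,000 battles
battles, win_rate = 24_000, 0.60
print(math.log(battles) / math.log(1 / (1 - win_rate)))   # ~11.0
```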



 

I believe you meant "confidence level" instead of "interval". The confidence level refers to the probability of a false positive, while a confidence interval is an interval that, on average (when the experiment is repeated many times), captures the true parameter value. These terms are not interchangeable. In many commonly seen tests there are relationships between "rejection regions" and "confidence intervals"; however, that's not always true.

 

Ex. Suppose we are interested in the average win % of WoT players. I flip a coin: if it lands on heads I accept the hypothesis that the average win % is 50%, and if it lands on tails I accept the hypothesis that it is greater than 50%. This is a 0.5-level test, a crappy test, but a test nonetheless. A 50% confidence interval, on the other hand, has to be dictated by the observed mean win % from the sample, such as the plus-or-minus three standard deviations thing.
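For contrast, here's what an actual confidence interval for a win rate looks like under the usual normal approximation (a minimal sketch with made-up numbers; scipy is assumed for the normal quantile):

```python
import math
from scipy.stats import norm

def winrate_confidence_interval(wins, battles, level=0.95):
    """Normal-approximation confidence interval for a true win rate."""
    p_hat = wins / battles
    z = norm.ppf(1 - (1 - level) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / battles)
    return p_hat - margin, p_hat + margin

print(winrate_confidence_interval(520, 1000))    # roughly (0.489, 0.551)
print(winrate_confidence_interval(5200, 10000))  # roughly (0.510, 0.530): more games, tighter interval
```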

 

Edit: Sorry about being a Stat Nazi.


I think it also deserves a mention that loss streaks cause loss streaks. I know for a fact that when I play poorly and lose games I should win, I start to adjust my playing, which causes me to play even worse. This is not just about being tired, frustrated, or unfocused; I think lack of skill can also cause it. When you do not understand or see your mistakes, you may start to adjust the wrong things as the losses keep coming, in which case you just make things worse.


No apology necessary. As you point out, there is a possibility for confusion. Clarity is always welcome from my point of view.

 

 


 


Echelon, how much do you think a player's tendency toward frustration and trying to force wins on a bad day influences the likelihood of loss streaks?

 

Am I, and others, correct in thinking that an average player not taking a break when getting 3-4 losses in a row leads to bad loss streaks? Or is that just confirmation bias?

 

Form matters, but a player's results at a point in time are going to be based on his skill at that point in time (I've had days where I play like a 50%er, and days where I play like a 65%er).

 

A dip in form for a superunicum may mean *only* winning 55% of his games.

 

A dip in form for me means I'm below 50% for the session.

 

Three to four losses in a row are pretty common, actually; I did the technical maths about it here.
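As a rough check, here's a quick simulation sketch of an ordinary 30-game session, assuming a flat 50% win chance per game (the numbers are illustrative, not the maths from the linked post):

```python
import random

def has_losing_run(n_games, run_length, win_chance=0.5):
    """True if a session of n_games contains a losing run of at least run_length."""
    current = 0
    for _ in range(n_games):
        current = 0 if random.random() < win_chance else current + 1
        if current >= run_length:
            return True
    return False

sessions = 20_000
for run in (3, 4):
    hits = sum(has_losing_run(30, run) for _ in range(sessions))
    print(f"P(at least one {run}-loss run in a 30-game session) ~ {hits / sessions:.2f}")
# comes out around 0.9 for 3-loss runs and 0.65 for 4-loss runs
```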

 


I'd also assume that running doubles promotes some losses -- you will typically be one-and-done with the tanks that you are good with, while you end up playing more games with the tanks you struggle with.

 

Wonder if the opposite strategy, play until you lose, works better (loop back to the start once you've played all of the tanks).


This isn't science, it's pseudo-sci-talk. The ramblings of a person obsessed with the colour purple.

 

So perhaps you'd like to back your statement up and show us why the math doesn't hold?


So perhaps you'd like to back your statement up and show us why the math doesn't hold?

 

Well if you are purple enough you can't lose; if you can't lose... no losing streaks.


So perhaps you'd like to back your statement up and show us why the math doesn't hold?

 

Wait, isn't that up to you? Moreover, trying to apply maths to a game scenario in which every little variable changes with every new game is more than stupid - as I said, you're too obsessed with being purple.

 

As I said, I've had losing streaks while playing tanks that can't carry and I've had winning streaks in stronk tonks. Your maths are hypothetical and you're trying to apply by analogy something which is 'simple' (flipping a coin) to something as complex as World of Tanks. Reminder that every game, 95% of the time, 29 out of 30 players in that match will not be the same as the previous match. Trying to explain streaks in such absolute terms without accounting for individual player skill and tank used is stupid.

 

Or, in short, just someone obsessed with the colour purple trying to come up with something innovative.

 

Besides which, the fact that after three years, people still insist on considering W/R as an objective way to determine skills or chances of winning is pretty ludicrous.

