Adding Infinite Numbers: A Probabilistic Approach to Mathematical Philosophy

19 min read · Oct 31, 2021

It’s really hard for someone who loves mathematics to get bored. That’s because mathematicians can play fun games when boredom threatens. For instance, they can take a quarter out of their pocket, flip it until it lands heads, and try to figure out how long the game takes. They can even improve the game by giving themselves a point for each flip. Let’s suppose that you flip the coin and get tails. In that case, you earn a point and keep flipping until you get heads, and that final flip earns a point too. So, to earn 10 points, your first nine flips should be tails, and your last flip should be heads. Or, to earn 3 points, you should get tails, tails, heads (TTH), in that order. From now on, I will refer to tails as T and heads as H.

Theoretical and experimental probability: Coin flipping

Of course, we are assuming that the coin we are using is fair, which means that heads and tails are equally likely, with a probability of 50% each. However, this is only an assumption: we are told that the coin is fair, but we haven’t proved it. So how can we be sure? In other words, how can we prove that the coin we are betting on is fair?

First of all, we can never tell whether a coin is fair or biased just by looking at it. Any coin might be slightly lumpy, with a non-uniform distribution of weight; in fact, a spinning coin tends to fall toward the heavier side more often. For instance, a flipped American quarter is pretty close to fair, but it is slightly biased toward whichever side was facing up when the coin was thrown into the air. Some mathematicians have estimated the ratio at about 51/49 rather than 50/50, which reportedly comes in handy for some prisoners in America who know this fact and use it to earn money in coin-flipping games.

We might try standing the coin on its edge, reasoning that a biased coin could not stay upright and would topple. However, if the imbalance in weight is very small, the coin can still stand upright.

A funny scene from the Turkish movie, Korkusuz Korkak

Therefore, we might try to decide whether the coin is fair or biased by throwing it into the air. But can we settle its fairness by flipping it only once or twice? If tossing the coin two times is not enough to reach a conclusion, how many times do we need to flip it? Incidentally, we are not tossing different coins simultaneously. We are throwing one coin more than once. For instance, if we are planning to flip the coin five times, we throw it into the air five separate times.

The only way to judge this approach is to try it, so let’s flip the coin more than once and observe. Our intuition says that if the coin is fair, then getting one heads and one tails should be enough to confirm its fairness. If we throw a coin twice, the possible outcomes are TT, HH, TH, and HT, and for a fair coin the probability of each outcome is 1/4. So the probability of getting one tails and one heads is 2/4, or 1/2. This doesn’t prove much, since our chance of reaching the right conclusion is only 1/2: if the coin is fair but we get two heads (HH) or two tails (TT), we would think that the coin is biased. The reverse is also true. If the coin is biased and we flip it two times, we could still get one heads and one tails, and we might think that the coin is fair.

Unfortunately, the method of throwing a coin two times doesn’t work either.

Let’s try to flip the coin four times. If the coin is fair, there are 16 possible outcomes, and the probability of each outcome is 1/16.

TTTT HHHH TTTH HHHT TTHT HHTH THTT HTHH HTTT THHH TTHH HHTT THHT HTTH THTH HTHT

This time, to decide that the coin is fair after four flips, we need to get two heads and two tails. Only six of the 16 outcomes contain two heads and two tails. So this time around, the chance that we’ll be correct is 6/16, or 3/8, whereas with two flips it was 1/2. As you can see, the more we flip the coin, the less likely we are to see exactly as many heads as tails, and so the less likely we are to conclude that the coin is fair. If we flip a fair coin 1000 times, the chance of getting exactly 500 tails and 500 heads is only about 2.5%.
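The counting above can be checked with a few lines of Python. This is only a sketch that assumes a perfectly fair coin; the function names `enumerate_outcomes` and `balanced_probability` are made up for this illustration.

```python
from itertools import product
from math import comb

def enumerate_outcomes(n_flips):
    """List every possible H/T sequence for n_flips tosses."""
    return ["".join(seq) for seq in product("HT", repeat=n_flips)]

def balanced_probability(n_flips):
    """Chance that a fair coin shows exactly as many heads as tails."""
    if n_flips % 2:
        return 0.0  # an odd number of flips can never split evenly
    return comb(n_flips, n_flips // 2) / 2 ** n_flips

four_flips = enumerate_outcomes(4)
balanced = [o for o in four_flips if o.count("H") == 2]
print(len(four_flips), len(balanced))  # 16 outcomes, 6 of them balanced

for n in (2, 4, 1000):
    print(n, balanced_probability(n))
```

The probability of a perfectly even split keeps shrinking as the number of flips grows, which is exactly why "equal heads and tails" is a poor test of fairness.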

So this method does not give us the answer either.

If you are a physicist, you might think that the center of gravity of a coin would tell us whether it is fair or biased. However, this is also not the right way to answer our question, because the coin wears a little every time we flip it, and its center of gravity changes. Don’t forget that even a minuscule change affects the center of gravity. Besides, even if the center of gravity is in the middle of the coin, how can we find the exact midpoint of the coin to begin with?

Casinos are known to have rules that hang on the walls for everyone to see, much like the unamendable articles of a constitution. First of all, they state what a fair casino die should ideally be: transparent, symmetrical, and homogeneous. By design, one face of a die is drilled with a single pip and another with six. That’s why the pips are filled with material of the same density as the rest of the die, so that no face ends up lighter than another. Secondly, dice are made with neat, sharp corners to ensure that they tumble chaotically when they bounce on a table. If the edges were curved, a die would keep rolling like a soccer ball instead of tumbling. Finally, dice should be replaced with brand new ones almost every day, because even a negligible amount of wear could have a significant effect on the randomness of the outcome.

In summary, theoretical thinking does not help us reach a conclusion here. In real life, we could never establish whether a coin or a die is fair or biased simply by flipping or rolling it, which means that a fair coin or die is a purely cognitive concept. Every die or coin is biased: the fair die or coin doesn’t exist. Even if we used computers to produce dice, they would still wear whenever we rolled them and become biased.

Let’s give another example. Let’s say you have to conduct an election survey in a country with 50 million voters, from which you need to choose 1000 people. When selecting the people for the survey, dozens of factors should be considered, such as gender, age, job, income, and place of residence. It could happen that all of the 1000 people selected say they will vote for party A, or that none of them do. Mathematically, this is possible. However, we have never seen that kind of result in real life, because if party A has any meaningful support among 50 million voters, the probability of a well-chosen sample giving it 0 votes is extremely low.

For example, if 1000 people participated in our survey and 450 of them chose party A, we would expect party A to get 45% of the votes in the election. However, it is almost impossible for party A to get exactly 45%. It might get 45.00000000001% of the votes, which is different from 45%. That is why we cannot expect an exact result from asking 1000 people out of 50 million. Then how do survey companies survive? If I were the founder of a survey company and wanted to be 100 percent certain of my prediction, I would have to say that party A will get between 0% and 100% of the votes, which is certainly true but useless. A real survey company instead estimates “between 40 and 50 percent” rather than saying “45 percent”, which makes the prediction far more likely to hold.

In other words, survey companies don’t report a single exact figure; they attach a margin of error to their data, which is computed from the standard deviation. When they say that party A will probably get between 40% and 50% of the votes, the probability that their prediction holds increases considerably.
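To make the idea of a range concrete, here is a rough sketch of how such an interval could be computed for the 450-out-of-1000 example. It assumes a simple random sample and the usual normal approximation with z = 1.96 (a 95% confidence level); neither assumption comes from any real survey company, and `survey_interval` is a name invented for this illustration.

```python
from math import sqrt

def survey_interval(votes_for, sample_size, z=1.96):
    """Approximate confidence interval for a vote share.

    Assumes a simple random sample and the normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = votes_for / sample_size                      # observed share, e.g. 0.45
    std_error = sqrt(p_hat * (1 - p_hat) / sample_size)  # standard deviation of the estimate
    margin = z * std_error
    return p_hat - margin, p_hat + margin

low, high = survey_interval(450, 1000)
print(f"Party A: between {low:.1%} and {high:.1%}")
```

Under these assumptions the range comes out a little narrower than "between 40 and 50 percent"; quoting the wider range simply makes the prediction even safer.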

In mathematics, such calculations and deviations are handled with a remarkable mathematical object called the Gaussian distribution, or normal distribution. I met the Gaussian distribution for the first time in a museum. There was a toy called a Galton Board, with thousands of pins arranged in a particular geometrical pattern on a black surface. When you dropped marbles onto the top pin, they bounced their way down to the bottom. When a marble hit a pin, it bounced either left or right with an equal probability of 50%.

The Galton Board: 3000 steel balls fall through 12 levels of branching paths and always end up matching a bell curve distribution. | Source: Quanta Magazine

Additionally, each marble followed a different path because of this unpredictability. Watching the random moves of the marbles was very satisfying. But what made the Galton Board unique was that whenever I dropped thousands of marbles, I got almost the same shape, with only minimal differences! I couldn’t predict the final location of any single marble, but I could predict the overall shape we would get. It was called the normal distribution, and it was “supercalifragilisticexpialidocious.” We were dropping marbles randomly, the marbles were bouncing randomly, but we were always getting the same shape!
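If you don’t have a museum nearby, a Galton Board is easy to imitate on a computer. The sketch below uses the 3000 balls and 12 levels mentioned in the caption above and assumes every bounce is an independent 50/50 choice; the function name `galton_board` and the fixed random seed are just choices made for this illustration.

```python
import random
from collections import Counter

def galton_board(n_balls=3000, n_levels=12, seed=0):
    """Drop n_balls through n_levels of pins and count the balls per bin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    bins = Counter()
    for _ in range(n_balls):
        # At each pin the ball goes right (counts as 1) or left (counts as 0).
        position = sum(rng.random() < 0.5 for _ in range(n_levels))
        bins[position] += 1
    return [bins[k] for k in range(n_levels + 1)]

counts = galton_board()
for k, count in enumerate(counts):
    print(f"{k:2d} {'#' * (count // 10)}")  # crude text histogram
```

Changing the seed changes the path of every single ball, yet the printed histogram keeps the same bell-like outline.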

Of course, there is a mathematical law behind the incident, and it is surprisingly related to the human body. We know that every human’s nose, ears, eyes, and mouth have a particular place on their face. Yes, some of us have a smoothly-curved nose; others have arches, but the locations of our eyes and ears are apparent. If there were no normal distribution in nature, all people would look like the characters in Picasso’s works, with our legs and arms placed in disparate locations on our bodies.

Examples of Picasso’s works: Left: Woman sitting in an armchair | Right: Sitting woman

There is a beautiful project idea that you can do with your child or students!

Anyway, let’s go back to our topic. We were planning to flip a coin in the air until we got heads. We also wanted to play this game as much as we could and predict its end.

Of course, the game could end on the very first flip if we got heads, or it could end in 5 or 32 moves. The number of moves varies from person to person: I could get heads after ten throws while you could get it after three. Or if someone is very unlucky, they could get heads on their first throw. So at first glance, it seems we can have neither an average number of throws nor any expectation.

To do mathematics, however, we do need rational expectations. Where there are no expectations, there is no science.

For example, is seeing snow in the middle of summer possible? Yes, it is. But most people’s expectations would be the opposite. If someone comes and tells you that there will be no snow on August 23, 2020, you won’t say, “Oh, you are a genius! How did you know that?”

Putting everything aside, we need to think about the end of this game: will it end, or not? Mathematically speaking, our game can last forever if the coin we flip consistently results in tails. It is possible! That’s why we have to figure out whether there is an end to this game or not! Let’s say that our math teacher gives us homework, in which we have to flip a coin and always get tails. We give it a try at home: we flip a coin and get heads, and we are immediately done. But maybe if we played another turn, we could keep going for a few more rounds before we get heads. So we restart and flip the coin again, and we get TTTTTTTTTTH respectively; the game is still over. No problem! We can try again. So we embark on a journey where we always get tails (just assuming). But how can we play forever? Death is inevitable, and so everybody dies! As you can see, we can only guess or talk in theory. In real life, getting tails forever is impossible!

To conclude from everything we talked about, it is indeed more logical to calculate the average number of coin flips it takes to get heads. If we approach the question intuitively, we can reach some immediate conclusions. For example, the average number of flips must be at least 1 because we must flip at least once. Besides, if our game ends with a single flip, that is, if our first flip lands heads, luck is involved. We will need to eliminate this luck factor by playing the game many times. Lastly, the average is unlikely to be as large as eight, because getting seven tails in a row before the first heads (TTTTTTTH) rarely happens.

How do we find the average, by the way? For example, if we get heads in the third flip, we will gain 3 points. Then if we get heads in the first throw, we will gain 1 point. Then the average will be (3+1)/2 = 2. We will continue playing the game many times just like this, and we will find the average by collecting the points received and dividing the resulting number by the number of times we played the game.
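The averaging procedure just described can be sketched in Python as follows. It assumes a fair coin simulated with a pseudo-random generator; `play_once` and `average_score` are names chosen for this illustration.

```python
import random

def play_once(rng):
    """Flip until the first heads; every flip, including the last, earns a point."""
    points = 0
    while True:
        points += 1
        if rng.random() < 0.5:  # heads with probability 1/2 ends the game
            return points

def average_score(n_games, seed=0):
    """Play n_games games and return t/n, the total points over the number of games."""
    rng = random.Random(seed)
    total = sum(play_once(rng) for _ in range(n_games))
    return total / n_games

for n in (2, 100, 10_000, 1_000_000):
    print(n, average_score(n))
```

With only a handful of games the average jumps around from run to run; with many games it settles down, which is the "luck factor" fading away.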

Now, how about playing this game one billion times? Well, that is practically impossible: an average life of 80 years is only about 2.5 billion seconds. In practice, such a thing is not feasible. Thus, we don’t know whether a given game will end or not, but at least we have learned that we can play this game an infinite number of times in our minds.

The first coin we toss will land heads or tails. If it is heads, we will get a point, and the game will be over. If it is tails, we will get a point, and the game will continue. There are only two possibilities: heads or tails. In this fashion, we will now play the game n times and write down all the scores we have received. Let’s call the total of these scores t. Then our expectation will be t over n (t/n). But let me repeat: this t/n will be a purely philosophical concept, as it has no use in practice. Let’s say that the true expectation for a fair coin were five, and our measured average t/n were 5.01. In this case, we could not say whether the coin is biased or not. We are looking for the absolute truth.

Now let’s imagine that we have played this game a billion times. About half of those games, 500 million, most likely ended in one move, so they give us 500 million points. About 250 million games finished in two moves, giving us another 500 million points. About 125 million games were completed in three moves, giving us 125 million x 3 points. If we write this as a formula:

t/n = (500 million x 1 + 250 million x 2 + 125 million x 3 + …) / 1 billion

When simplified;

t/n = [1/2¹ x 1] + [1/2² x 2] + [1/2³ x 3] + … (up to a billion)

As I mentioned above, we had to eliminate the luck factor. Now that we have played this game a billion times, we have reduced the luck factor to almost zero. But we still couldn’t make it to zero, not completely. To eliminate the luck factor, we have to take the number of games we will play (n) up to infinity. So the expectation must be;

t/n = [1/2¹ x 1] + [1/2² x 2] + [1/2³ x 3] + … to infinity.
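Before worrying about what "to infinity" means, we can at least watch what happens when we stop the sum after finitely many terms. The sketch below uses exact fractions so that no rounding sneaks in; `partial_sum` is a name invented for this illustration.

```python
from fractions import Fraction

def partial_sum(n_terms):
    """Exact value of 1/2 x 1 + 1/2^2 x 2 + ... + 1/2^n x n."""
    return sum(Fraction(k, 2 ** k) for k in range(1, n_terms + 1))

for n in (1, 2, 3, 10, 20, 50):
    s = partial_sum(n)
    # Print the partial sum and how far it still is from 2.
    print(n, float(s), float(2 - s))
```

The partial sums creep upward toward 2 without ever passing it: the gap after n terms is exactly (n + 2)/2ⁿ, which shrinks very quickly.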

The expression above is an infinite sum. Have you ever added infinitely many numbers in your life? In fact, we have often used infinite sums without knowing what they meant.

We have all heard about the number pi (3.1415926…). Pi goes on forever without repeating itself and is full of surprises; calculating Pi is exciting, too!

We can find Pi by dividing the circumference of any circle by its diameter. However, this is not as easy as it seems. For example, we can never get an exact value by picking up a rope, wrapping it around the wheel of our bike, and measuring the length. What we can do is find an approximate value. For example, let’s say we measure the circumference of the bike’s wheel as 7-something. So what is that something? Let it be 7.45. Then what? Our eyes are not sharp enough to read the next digit. We may need to look at the rope we use to measure the wheel’s circumference under a microscope.

For fieldwork, it may be enough to take pi as 3.14. More demanding work needs more digits, though fewer than one might guess: NASA’s Jet Propulsion Laboratory has said it uses about 15 decimal places of Pi for interplanetary navigation, and roughly 40 digits would be enough to compute the circumference of the observable universe to within the width of a hydrogen atom. In a nutshell, Pi is a cognitive construct: there is no such thing as a perfect circle. A line might be one-dimensional by definition, but if we draw a line on the board now, it will be three-dimensional. A line goes to infinity as well, but our drawings are finite. Thus, the number Pi is purely cognitive.

When we write each digit of Pi as a separate term and add the terms, we get the number pi. If pi is 3.1415926…, we can write it as:

3 + 1/10 + 4/10² + 1/10³ + 5/10⁴ + …

 3
 0.1
 0.04
 0.001
 0.0005
 0.00009
 0.000002
 0.0000006
 ..........
+__________________
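The column addition above can be reproduced digit by digit. This small sketch only uses the eight digits quoted in the text (3.1415926) and exact fractions; `digit_partial_sums` is a name made up for this illustration.

```python
from fractions import Fraction

PI_DIGITS = "31415926"  # the digits quoted in the text: 3.1415926

def digit_partial_sums(digits):
    """Add digit/10^place one term at a time, starting from the far left."""
    total = Fraction(0)
    sums = []
    for place, digit in enumerate(digits):
        total += Fraction(int(digit), 10 ** place)  # 3, then 1/10, then 4/100, ...
        sums.append(total)
    return sums

for s in digit_partial_sums(PI_DIGITS):
    print(float(s))
```

Each extra term nudges the running total a little closer to Pi, and because every term is smaller than the last decimal place already fixed, the earlier digits never change.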

When we add numbers with infinitely many digits, we have to follow a different path. We usually start adding from the rightmost digit, but since Pi has no rightmost digit, we need to begin adding from the far left. Let’s make up another example of an infinite sum.

2.357111317192329313741434753... is a number built by writing the primes one after another, and it never ends.

1.234567891011121314151617181920... is a number built by writing the natural numbers one after another, and it goes on forever.

Now let’s try adding these numbers. If they had finitely many digits, addition would be easy, but unfortunately, their digits go on forever. We have to calculate differently.

 2.357111317192329313741434753...
 1.234567891011121314151617181920...
+__________________________________________

The integer part looks like it will be 3, but we cannot be sure yet. The total is definitely greater than 3. Also, because the first number is less than 3 and the second number is less than 2, their sum is definitely less than 5. So the total is 3-something or 4-something.

In other words, the sum is greater than three but less than 5.

If we go one step further, the first number is greater than 2.3, the other is greater than 1.2, so the total is greater than 3.5. The first number is less than 2.4, and the second number is less than 1.3. So the total is less than 3.7. We should be a little closer to the absolute answer.

So, the sum is greater than 3.5 but less than 3.7.

We will go one more digit to get a little closer to that. The first number is greater than 2.35, and the second number is greater than 1.23, so the total is greater than 3.58. Also, the first number is less than 2.36, and the second number is less than 1.24, so the total is less than 3.60.

Now, the sum is greater than 3.58 but less than 3.60.
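The squeezing procedure can be automated. The sketch below only uses the first few digits of each number (2.357 and 1.234), cuts them off after a chosen number of decimal places, and reports the lower and upper limits; `truncate` and `sum_bounds` are names invented for this illustration.

```python
from decimal import Decimal

A_PREFIX = Decimal("2.357")  # leading digits of the first number
B_PREFIX = Decimal("1.234")  # leading digits of the second number

def truncate(x, places):
    """Cut a positive decimal off after the given number of decimal places."""
    step = Decimal(10) ** -places
    return (x // step) * step

def sum_bounds(a, b, places):
    """Lower and upper limits for a + b when only `places` decimals are known."""
    step = Decimal(10) ** -places
    low = truncate(a, places) + truncate(b, places)
    high = low + 2 * step  # each number is less than its truncation plus one step
    return low, high

for places in range(4):
    low, high = sum_bounds(A_PREFIX, B_PREFIX, places)
    print(places, low, high)
```

The first three rows reproduce the ranges found by hand (3 to 5, then 3.5 to 3.7, then 3.58 to 3.60), and every additional known digit makes the gap ten times narrower.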

To generalize from this example: we tried to add two numbers with infinitely many digits, a and b, and we didn’t know how to find a + b directly. So we first added finite decimals smaller than a and b and found a lower bound. Then we added finite decimals greater than a and b and found an upper bound. We realized that a + b lies somewhere between the lower bound and the upper bound. As long as there is a gap between the two bounds, we cannot know the exact value of a + b. Instead, we establish that the number exists within a range, and we narrow that range by bringing the lower and upper bounds closer to each other.

Logic works exactly as demonstrated here. As seen, we did not know what the exact total was. We only thought we knew what the real total was, and we repeated simple additions.

Let’s go back to the equation above. Now we want to figure out the infinite sum

t/n = [1/2¹ x 1] + [1/2² x 2] + [1/2³ x 3] + … to infinity.

However, we don’t know what it means to add infinitely many numbers. In the previous example, we were adding two numbers. Now we will try to add infinitely many.

- With probability 1/2, heads comes on the first flip (H), and we get 1 point.
- With probability 1/4, heads first comes on the second flip (TH), and we get 2 points.
- With probability 1/2³, heads first comes on the third flip (TTH), and we get 3 points.
- …

As you can see, this process can go on forever.

Let’s call this sum of points, weighted by their probabilities, b. There is also one more possibility to consider: tails could come up on every toss, TTTTTTTTT…, so that the game never ends. Is this probable? Since all the probabilities must add up to 1, we can write an equation like this:

(1/2 + 1/4 + 1/8 + 1/16 + …) + p = 1

The part in parentheses on the left side is the probability that the game ends, and p is the probability that it never ends. If we can find the value of p, we will know whether we can count on the game ending.

If you look at the fractions in the parentheses, you will notice a fascinating pattern. Let’s try to go from point A to point B, which are 1 meter apart. Our rule is to cover half of the remaining distance each time. So let’s go half the way (1/2) first, then half of the rest (1/4), then half of the rest (1/8), and so on. Now let’s try adding the distances we’ve covered.

1/2 + 1/4 + 1/8 + 1/16 + …

Then,

1/2 = 0.5
1/4 = 0.25
1/8 = 0.125
1/16 = 0.0625
1/32 = 0.03125
1/64 = 0.015625
1/128 = 0.0078125
1/256 = 0.00390625
1/512 = 0.001953125
1/1024 = 0.0009765625
1/2048 = 0.00048828125
.......................
........................
.........................

How do we add infinitely many numbers? Ordinary addition starts from the far right, but since there is no “far right” in an infinite decimal expansion, we have to start from the far left again.

Could it be that every number is greater than 0 and less than 1, but the total is greater than 1? It cannot, because this infinite sum plus p has to equal 1, and p is a probability, so it cannot be negative. If the leftmost digits alone added up to 1, the total would certainly be greater than 1, but it must not exceed 1. So this total will be 0-something.

Let’s add the first digits after the decimal point: 5 + 2 + 1 = 8. So the total is greater than 0.8. But after a while, that digit will become 9. How do we know this? When you add the second digits after the decimal point, you get 5 + 2 + 6 + 3 + 1 = 17, so a 1 is carried over and the 8 becomes a 9. As we keep adding the following digits, the carries turn every digit of the total into 9.

The total will be 0.99999999…. Nines will go on forever.
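Here is a small sketch that adds the halves exactly instead of digit by digit. The sum of the first n fractions is the probability that the game has ended within n flips, and whatever is left over is the chance that it is still going; `ending_probability` is a name chosen for this illustration.

```python
from fractions import Fraction

def ending_probability(max_flips):
    """Probability that heads has appeared within the first max_flips flips."""
    return sum(Fraction(1, 2 ** k) for k in range(1, max_flips + 1))

for n in (1, 2, 3, 10, 20):
    ended = ending_probability(n)
    still_playing = 1 - ended  # chance of n tails in a row
    print(n, float(ended), still_playing)
```

The leftover after n flips is exactly 1/2ⁿ, so the running total fills up with nines while the remainder shrinks toward nothing.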

Now we will subtract the sum we found from 1 to find p.

1 – 0.9999999999... = p

Then,

1
0.999
-_______
0.001

And

1
0.9999
-_______
0.0001

When we do this subtraction with infinitely many nines, it seems like we should put a 1 at the end of the result. Let’s call the result of this subtraction a. If we multiply a by 10,

a = 0.0000000000………… 1
10a = 0.000000000………… 1 again. The exact same thing. Thus;

10a = a
9a = 0
a = 0.

Actually, what we just did is not legitimate. If there were a number like 0.000000…………… 1, with infinitely many zeros before the 1, we would not know how to multiply it by 10.

Fun fact: the vast majority of people think that the number 0.9999999999… is less than 1, but that number actually equals 1. The mistake stems from not knowing what an infinite sum means.

0.9999999999… is right under 1’s very nose. We cannot say that it is close to 1 because 0.9999999… is also close to 2. That’s why we say that it’s right under 1’s very nose.

0.9999999… by definition is equal to 1.

So, if 0.999999... + p = 1, then p = 0.
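For readers who want the "by definition" spelled out, one standard way to write it uses limits of finite sums. This is a sketch in the usual textbook notation, not something taken from the lecture itself:

```latex
% A decimal with n nines is a finite sum, and it falls short of 1 by exactly 1/10^n:
0.\underbrace{99\ldots9}_{n} \;=\; \sum_{k=1}^{n} \frac{9}{10^{k}} \;=\; 1 - \frac{1}{10^{n}}

% The infinite decimal is defined as the limit of these finite sums:
0.999\ldots \;=\; \lim_{n\to\infty}\left(1 - \frac{1}{10^{n}}\right) \;=\; 1

% The same reasoning applied to the halves gives the probability of never ending:
p \;=\; 1 - \lim_{n\to\infty}\sum_{k=1}^{n}\frac{1}{2^{k}}
  \;=\; 1 - \lim_{n\to\infty}\left(1 - \frac{1}{2^{n}}\right) \;=\; 0
```

In both cases nothing is ever "added infinitely many times"; the infinite sum is simply the number that the finite sums get arbitrarily close to.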

This game is over. Of course, theoretically. In practice, this makes no sense.

This article is based on one of Ali Nesin’s lectures on YouTube.
