
Poisson Distribution and Football Betting

How to make money with the expected goals (xG) model, trusted by Wenger, Guardiola, Nagelsmann and dozens of other world-famous coaches.

Why you need to use the xG model

Predicting the number of goals from shot counts alone is a mistake. This season, teams converted 586 of 5447 shots into goals, but shot quality differs from team to team: one team needs an average of 6 shots per goal, another more than 14.

By January 2018, 161 matches of the English Premier League season had ended with a winner rather than a draw:

  • in 106 of them (65.8%), the team with more shots won,
  • in 123 of them (76.4%), the team that came out ahead on xG won.

Forecasting Methods for Expected Goals

To estimate the number of goals and predict the winner of a match, you need:

  • the expected possession,
  • shot statistics,
  • xG model data.

For clarity, take the match Liverpool - Manchester City.

Find the expected possession split

On average, Liverpool have the ball for 60% of playing time and the Citizens for 72%. From these values, we assume that in the upcoming match possession will be split 56 to 44 in favor of City.
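The article does not spell out how the 56/44 split is obtained. One simple rule that reproduces it is to average a team's usual share with the share its opponent usually concedes. The sketch below assumes that rule; the function name `expected_possession` is illustrative and this is not necessarily the author's method.

```python
def expected_possession(own_avg: float, opponent_avg: float) -> float:
    """Average a team's usual share (%) with the share its opponent usually concedes (%).

    This averaging rule is an assumption chosen because it reproduces 56/44.
    """
    conceded_by_opponent = 100.0 - opponent_avg
    return (own_avg + conceded_by_opponent) / 2.0


liverpool_avg = 60.0  # Liverpool's average possession, % (from the article)
city_avg = 72.0       # Manchester City's average possession, % (from the article)

city_share = expected_possession(city_avg, liverpool_avg)
liverpool_share = expected_possession(liverpool_avg, city_avg)
print(f"City {city_share:.0f}% / Liverpool {liverpool_share:.0f}%")
```

The two shares always add up to 100%, so only one of them needs to be computed in practice.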

Determine the expected number of shots

  • On average, Jurgen Klopp's team takes 17.7 shots per game (with 60% possession). With 44% possession (against a notional average opponent), 13.0 shots are expected from the Reds,
  • Manchester City's opponents take an average of 6.2 shots per game when City have 72% possession; with City at 56%, that becomes 9.8,
  • the expected number of Liverpool shots in the upcoming match is the average of the two: (13 + 9.8) / 2 = 11.4.

Calculated the same way for the visitors, 12.8 shots are expected from City.
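The shot estimate above can be sketched as follows, assuming that shots scale linearly with the share of possession. That assumption reproduces the 13.0 figure exactly and gives about 9.7 for the second figure, against the 9.8 quoted above, so the author may have used slightly different inputs. The helper `scale_shots` is a hypothetical name.

```python
def scale_shots(shots_per_game: float, usual_possession: float, expected_possession: float) -> float:
    """Scale a shot average in proportion to possession (an assumed rule)."""
    return shots_per_game * expected_possession / usual_possession


# Liverpool take 17.7 shots per game with 60% possession; 44% is expected here.
liverpool_attack = scale_shots(17.7, 60.0, 44.0)

# City's opponents take 6.2 shots per game with 28% possession (City have 72%);
# Liverpool are expected to have 44%.
city_defence = scale_shots(6.2, 28.0, 44.0)

# Average the attacking and defensive views of the same quantity.
liverpool_expected_shots = (liverpool_attack + city_defence) / 2.0
print(round(liverpool_attack, 1), round(city_defence, 1), round(liverpool_expected_shots, 1))
```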

Calculate xG per shot for both teams from the expected goals model

  • this season in the Premier League, Liverpool have taken 389 shots with a total of 47.5 xG (0.122 xG / shot),
  • City have faced 136 shots, which at 14.8 xGA comes to 0.108 xG / shot,
  • taking the average of 0.122 and 0.108, we expect 0.115 goals from every Liverpool shot in the upcoming match. Calculated the same way, the figure for City is 0.141.
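As a rough check of the arithmetic above, the sketch below divides total xG by the number of shots and averages the attacking and defensive figures. The variable names are illustrative. Note that 14.8 / 136 is closer to 0.109 than to the 0.108 quoted, which does not change the rounded average.

```python
def xg_per_shot(total_xg: float, shots: int) -> float:
    """Average shot quality: expected goals per shot."""
    return total_xg / shots


liverpool_for = xg_per_shot(47.5, 389)   # quality of Liverpool's own shots
city_against = xg_per_shot(14.8, 136)    # quality of shots City allow

# Expected quality of a Liverpool shot against City: the mean of both views.
liverpool_quality = (liverpool_for + city_against) / 2.0
print(round(liverpool_for, 3), round(liverpool_quality, 3))
```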

Calculate the expected goals

  • Liverpool: 11.4 x 0.115 = 1.31 xG,
  • Manchester City: 12.8 x 0.141 = 1.81 xG,
  • expected total: 3.12; share of goals: 42% / 58%.
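The last step is a pair of multiplications, sketched below with the rounded inputs from the text. With these inputs City's product comes to about 1.80 rather than 1.81, presumably because the article multiplied unrounded values; the total and the 42/58 split are unaffected.

```python
# Expected shots and xG per shot, taken from the steps above.
liverpool_xg = 11.4 * 0.115
city_xg = 12.8 * 0.141

total_xg = liverpool_xg + city_xg
liverpool_share = liverpool_xg / total_xg  # Liverpool's share of the expected goals

print(f"Liverpool {liverpool_xg:.2f} xG, City {city_xg:.2f} xG, total {total_xg:.2f}")
print(f"Share of goals: {liverpool_share:.0%} / {1 - liverpool_share:.0%}")
```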

Note: these calculations use all matches of the current season. Other approaches can additionally take into account:

  • home and away matches,
  • current team form (the last X matches), etc.

Prematch Betting

How you use the resulting xG figures is limited only by your imagination and the bookmaker's markets:

  • match winner and handicap,
  • team totals and match totals,
  • both teams to score,
  • correct score.

The figures for Liverpool - Manchester City say that, leaving the draw aside, the hosts will win with a probability of 42%. Compare this with bookmakers' odds. For example, the licensed bookmaker “1xBid” offers 2.48 on a home win with a zero handicap, P1 (0), which implies a probability of about 40%. That differs from the expected goals model by about 5% in relative terms. Without taking additional factors into account, this is a significant discrepancy that can be turned into profit.
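The comparison with the bookmaker can be sketched as below. The implied probability is simply the reciprocal of the decimal odds. The expected value formula ignores the bookmaker's margin and treats the 42% as the probability conditional on the match not ending in a draw, which is an assumption about how a zero-handicap bet settles. The function names are illustrative.

```python
def implied_probability(decimal_odds: float) -> float:
    """Probability implied by decimal odds, ignoring the bookmaker's margin."""
    return 1.0 / decimal_odds


def expected_value(model_probability: float, decimal_odds: float) -> float:
    """Expected profit per unit staked if the model probability is correct."""
    return model_probability * decimal_odds - 1.0


odds = 2.48      # bookmaker's price for P1 (0), from the article
model_p = 0.42   # model probability of a home win, draws excluded

book_p = implied_probability(odds)
ev = expected_value(model_p, odds)
print(f"implied {book_p:.1%}, model {model_p:.0%}, expected value {ev:+.1%} per unit")
```

A positive expected value only signals a possible edge; it is as reliable as the model probability that feeds it.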

Live betting

The xG approach is also particularly effective for live betting on totals.

Example: according to our calculations, the match at Anfield will average about 0.035 goals per minute (3.1 / 90). So if the score is still 0-0 after half an hour, about two more goals are expected, and the chances of the match finishing under 2 or over 2 are roughly equal. If the odds on under 2.0 and over 2.0 then differ noticeably, it is worth considering a bet on the larger of them.
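A minimal sketch of this live calculation follows, assuming that goals arrive at a constant rate and that the number of remaining goals is Poisson distributed. Both are simplifying assumptions, and stoppage time is ignored. With a total line of 2.0, exactly two goals means the stake is returned.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k events when mu events are expected."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


goals_per_minute = 3.1 / 90   # from the article
minutes_left = 60             # score still 0-0 after half an hour

mu_remaining = goals_per_minute * minutes_left

p_under = poisson_pmf(0, mu_remaining) + poisson_pmf(1, mu_remaining)  # 0 or 1 goal
p_push = poisson_pmf(2, mu_remaining)                                  # exactly 2 goals
p_over = 1.0 - p_under - p_push                                        # 3 or more goals

print(f"expected goals left: {mu_remaining:.2f}")
print(f"under {p_under:.1%}, refund {p_push:.1%}, over {p_over:.1%}")
```

Under these assumptions the over and under chances come out in the same range, with a sizeable chance of a refund, which is broadly in line with the "roughly equal" reading in the text.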

Long-term bets

Using xG calculations, you can determine the expected number of points for a particular team and evaluate its chances, for example, for the championship.

Expected goals models are effective, but they do not account for, among other things:

  • chances that end without a shot (missing the ball a meter from an empty net is rated at 0 xG),
  • several shots in one attack (for a saved penalty followed by a shot from the rebound, the xG model counts roughly 0.75 twice); in this case it is fairer to use the law of total probability.
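The second point can be illustrated with a short sketch. Instead of adding the xG of every shot in one attack, the shots are chained with the law of total probability: a later shot only happens if the earlier ones were not scored. The 0.75 values follow the article's example; the function name is hypothetical.

```python
def chance_of_goal(shot_xgs):
    """Probability that an attack ends in a goal, given the xG of each shot in order."""
    p_goal = 0.0
    for xg in shot_xgs:
        # A later shot only counts if every earlier one failed.
        p_goal += (1.0 - p_goal) * xg
    return p_goal


penalty_and_rebound = [0.75, 0.75]  # the article's example values

naive_sum = sum(penalty_and_rebound)             # what a plain xG sum reports
combined = chance_of_goal(penalty_and_rebound)   # at most one goal per attack

print(naive_sum, combined)
```

The plain sum exceeds one goal for a single attack, while the chained probability stays below 1.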

However, situations that the model counts incorrectly are rare, so over a long run these inaccuracies can be neglected.

xG models are not ideal, but combined with traditional statistics they are currently the foundation for high-quality football forecasts.

Where to get statistics

  • traditional (shots on goal, on target, possession, etc.): whoscored.com, myscore.com
  • advanced (according to the expected goals model, xG): understat.com

Vyacheslav Levitsky

Time and again I am convinced that, however much betting theories are praised, no one has come up with a better method than analyzing the odds line. So studying these tables is simply a waste of time.

I haven't noticed much difference in my bets. Whether I studied the teams' average performance or switched to this approach, the profit from my bets stayed at the same level. So all of this is pointless busywork.

The xG model, as the author noted at the beginning of the article, is effective for football coaches. They will therefore do everything to reduce the opposing team's ability to create chances. So I think this is just the same average statistics, only dressed up in cleverer language.

Everything seems clear, but when I start calculating on my own, it turns out that every match should have goals. Unfortunately, they do not always come. The reason is hard to understand; I think it is better to develop intuition.

The method is very easy to understand; the main thing is to find the tables. Although, if you are not lazy, all the calculations can be done with the data available on the MyScore site. Most importantly, the method really helps to find undervalued outcomes and gain a significant edge over the line.

Applying the method to bets

For pre-match bets, the main outcome, totals and handicap markets are suitable. For example, you can back "both teams to score" with a fair degree of confidence, or take a chance and bet on a win for the Citizens.

In live betting, you can hunt for an evenly priced total. For example, the calculations show that at least two goals will be scored in the match. That is a good reason to back over 2 goals around the 15th to 20th minute, provided the score is still 0-0.

The expected goals model also helps far-sighted bettors understand which teams to expect high-scoring games from during the season, and which ones a dull game.

Summary

Today, many resources offer tables of xG data. Bettors favor understat.com and whoscored.com, so manual calculations are kept to a minimum. As for the error of this method, it is small. The approach therefore belongs in the toolkit of every professional bettor.

How to predict matches using xG models

To determine the likely winner of a match and estimate the number of goals, the bettor needs to:

• Estimate the expected share of possession for each team.
• Calculate the average shot statistics.
• Bring in the xG model data.

A match between Manchester City and Liverpool serves as the example.

Following the plan, we first estimate each team's possession. The clubs' statistics show that Manchester City have the ball for about 70% of playing time on average, and Klopp's players for about 60%. In the upcoming game this may well translate into roughly 55 to 45 in favor of the "Citizens".

The expected number of shots is worked out in three steps:

1. With the expected share of possession, Liverpool take no more than 14 shots per game.
2. Manchester City allow no more than 10 shots at their own goal per match.
3. The approximate number of shots for Klopp's side is then (10 + 14) / 2 = 12.

Doing the same with City's figures gives 13 shots.

Next, the "xG / shot" indicator is calculated:

• The Merseysiders create 47.5 xG from 390 shots, or 0.122 xG / shot.
• City concede 0.108 xG / shot (14.8 xG from 136 shots).
• The average of 0.122 and 0.108 gives 0.115 for Liverpool; for Manchester City the same calculation gives 0.141.

Now only the expected number of goals for each team remains:

• For the Citizens: 1.81 xG (12.8 shots, rounded above to 13, * 0.141)
• For the Reds: 1.31 xG (11.4 shots, rounded above to 12, * 0.115)

The model's approximate total is 3.12 goals.

Important! For a more accurate value, you can use home-only or away-only statistics. Our calculations are based on all games played in the season.

The effectiveness of the xG model in football betting

Predicting the exact number of goals in a match from the raw shot counts alone is a mistake. For example, in the 2017/18 season of the English Premier League, about 590 goals came from roughly 5500 shots. Shot quality varied: some clubs needed only 4-5 dangerous chances to score, while others had to create 15-20.

More importantly, by around the middle of the season, 161 matches had produced a winner:

1. In 66% of these games, the win went to the club that took more shots.
2. In 77% of these games, the win went to the club with the better xG figures.

The expected goals model therefore looks 11 percentage points more accurate than the conventional forecasting method.

Advanced Betting Statistics

Football professionals such as Guardiola, Wenger and Nagelsmann have long built their clubs' tactics using xG tables. This advanced statistic, the expected goals (xG) model, helps to understand how the dangerous chances created in a match translate into goals scored. Should a bettor adopt this approach, and is it possible to make money with it?

Poisson formula

In his textbook on probability theory, B. V. Gnedenko gives the following statement of Poisson's theorem.
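For reference, and without reproducing Gnedenko's exact wording, Poisson's theorem is usually stated as follows. If the number of trials n grows while the success probability p_n shrinks so that n p_n tends to µ, the binomial probabilities converge to the Poisson ones. Here k is the number of events and µ is their expected number, the same µ that is used for expected goals further on.

$$
\lim_{n \to \infty} \binom{n}{k} p_n^{k} (1 - p_n)^{n-k} = \frac{\mu^{k} e^{-\mu}}{k!}, \qquad n p_n \to \mu, \quad k = 0, 1, 2, \dots
$$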



In simple terms, this formula lets us calculate the probability distribution for events that:

  • occur per unit of time or per unit of area,
  • are discrete and can be counted,
  • are independent of each other,
  • have no upper limit, at least in theory.

There are plenty of examples from everyday life: the number of people struck by lightning or hurt in accidents, the number of advertising calls and messages, the number of elevator breakdowns per year, the number of raisins in an Easter cake, and so on. We are interested in examples from sport.

Football statistics

It turns out that the number of goals scored in a football match is described very well by the Poisson distribution. The diagram on the left shows the number of goals scored by each team in each of the 380 matches of the 2008-2009 Spanish championship. The diagram on the right shows the theoretical values calculated from the model.



The diagrams are very similar, therefore there is reason to believe that the Poisson model can explain the ratio of goals scored by the team during the match.

Now a little about sports betting. Anyone who visits a bookmaker can see the odds on the board. Here is what the odds for some English Premier League matches look like for this coming Monday.


England - Premier League          1       x       2
Manchester United - Sunderland    1.25    6.80    15.00
Chelsea - Bournemouth             1.39    5.10    10.50
Hull - Manchester City            10.50   5.80    1.35

1 - victory of the first team
x - draw
2 - victory of the second team.

A quick look at the league table is enough to see why the odds, and the possible payout, on a Sunderland win over Manchester United are so high. For each ruble staked you would get 15 rubles back, whereas a Manchester United win would return only 1.25 rubles.

Now let's try to calculate our own odds for the game Manchester United vs Manchester City, which is scheduled for February 26th, 2017. Although the two clubs have similar names, note that Manchester City (MC) play at home and Manchester United (MU) away.

Before using the Poisson formula, you need to calculate the value of µ. For this you need the average number of goals scored and conceded by each side, relative to the average for all teams.

µ MU - expected goals scored by Manchester United
µ MC - expected goals scored by Manchester City

  • µ MU = MU attack ✕ MC defense ✕ average number of goals scored in away matches
  • µ MC = MC attack ✕ MU defense ✕ average number of goals scored at home.

For the entire 2015/2016 season, there were 380 football games in which teams scored 567 goals (1.49 per game) at home and 459 away (1.20 per game).

Next, we calculate the attack and defense coefficients for Manchester United and for Manchester City.



In away matches, Manchester United scored an average of 1.15 goals per game and conceded 1.36.

At home, Manchester City scored an average of 2.47 goals per game and conceded 1.10.

Now we express the same values relative to the league averages. Manchester United's relative attack away from home is 0.958, and Manchester City's relative defense at home is 0.915.

Manchester United's relative defense away from home is 0.917, and Manchester City's relative attack at home is 1.657.

Finally, multiplying all intermediate mean values, we find μ.

µ MU = 1.06
µ MC = 2.27
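The multiplication behind these two values can be sketched as follows. The relative attack and defense coefficients are taken as quoted in the text; the league averages are recomputed from the goal totals rather than from the rounded 1.49 and 1.20, which is what makes the results round to the figures above. Variable names are illustrative.

```python
matches = 380
home_avg = 567 / matches   # league average goals per game at home
away_avg = 459 / matches   # league average goals per game away

# Relative strengths as quoted in the text.
mu_attack_away = 0.958     # Manchester United attack, away
mc_defense_home = 0.915    # Manchester City defense, at home
mc_attack_home = 1.657     # Manchester City attack, at home
mu_defense_away = 0.917    # Manchester United defense, away

# Expected goals for each side in this fixture.
mu_united = mu_attack_away * mc_defense_home * away_avg
mu_city = mc_attack_home * mu_defense_away * home_avg

print(round(mu_united, 2), round(mu_city, 2))
```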

Goals by Poisson

Now we can calculate the probabilities for the different outcomes of this meeting. In R, this is very simple, but you can cut corners and use a statistical online calculator.

So, the probabilities are distributed as follows.


Goals                0         1         2         3         4         5
Manchester United    34.663%   36.725%   19.454%   6.870%    1.819%    0.385%
Manchester City      10.345%   23.469%   26.622%   20.131%   11.417%   5.180%

The probability that Manchester United will not score a single goal is 34.663%, and for Manchester City it is 10.345%. The probability of a 0:0 result is their product, 3.586%. The same multiplication gives the matrix of all results from 0:0 to 5:5.
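The text mentions R and online calculators; the same table can be approximated with a few lines of Python. The sketch below uses the rounded values µ = 1.06 and µ = 2.27, so the last digits differ slightly from the table, which appears to be based on unrounded µ.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k goals when mu goals are expected."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


mu_united = 1.06  # rounded value from the text
mu_city = 2.27    # rounded value from the text

united_row = [poisson_pmf(k, mu_united) for k in range(6)]
city_row = [poisson_pmf(k, mu_city) for k in range(6)]

print("Goals              " + "".join(f"{k:>9d}" for k in range(6)))
print("Manchester United  " + "".join(f"{p:>9.3%}" for p in united_row))
print("Manchester City    " + "".join(f"{p:>9.3%}" for p in city_row))

# Independent scoring is assumed, so a scoreline is the product of two entries.
p_nil_nil = united_row[0] * city_row[0]
print(f"P(0:0) = {p_nil_nil:.3%}")
```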

Now let's calculate the probability of victory for each side and the probability of a draw, and finally settle on the odds. Let's start with the draw. We multiply the probability vectors for MU and MC and sum the diagonal of the resulting matrix. The resulting odds are 5.264.

The chances of an MC victory equal the sum of the probabilities of all scores 1:0, 2:0, ... 5:0, 2:1, 3:1, ... and so on up to 5:4. The odds are 1.619.

Manchester United's chances of victory are smaller, so the odds and the potential payout are larger: 5.191.


Calculated odds                        1       x       2
Manchester City - Manchester United    1.620   5.264   5.192
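The outcome probabilities and the corresponding odds without margin can be sketched as below, assuming the two teams score independently. The result depends on where the score grid is cut off and on how the leftover probability is handled, so apart from the draw the figures will not match the table above exactly. The grid here runs to 10 goals per team; that limit and the function names are assumptions.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k goals when mu goals are expected."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


def outcome_probabilities(mu_home: float, mu_away: float, max_goals: int = 10):
    """Return (home win, draw, away win) summed over a truncated score grid."""
    home = [poisson_pmf(k, mu_home) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, mu_away) for k in range(max_goals + 1)]
    home_win = draw = away_win = 0.0
    for h, p_h in enumerate(home):
        for a, p_a in enumerate(away):
            p = p_h * p_a  # probability of the scoreline h:a
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Manchester City at home (mu = 2.27), Manchester United away (mu = 1.06).
home_win, draw, away_win = outcome_probabilities(2.27, 1.06)
for label, p in (("1", home_win), ("x", draw), ("2", away_win)):
    print(f"{label}: probability {p:.3f}, odds without margin {1 / p:.3f}")
```

A bookmaker's actual prices would be lower than these, because a margin is built into every market.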

Of course, the Poisson model is quite simple and ignores many factors and circumstances: a new player, a new coach, the importance of the match, the situation at the club, and so on. Nevertheless, Elihu Feustel manages to earn millions from betting with mathematical algorithms.

Model creator

As a rule, this is a football statistician who is deeply immersed in the world of numbers. And, as a rule, such people base all their conclusions on the circumstances in which a shot was taken. That is, they have statistics on the situations in which a player scores more often and those in which he scores less often. Roughly speaking, if a player shoots from 11 meters inside the penalty area, he has a good chance of scoring. If he decides to shoot from the center of the field, his chances are much lower (although such goals have not been rare lately).

In short, competent people are responsible for these coefficients, and they can be trusted.

What do we get as a result?

As a result, we get a parameter that lets us look beyond the score and the other statistics of the match. From it we can understand how dangerous each team was, how many chances it created, and which of them were really good and which were so-so.

In other words, this indicator can tell us a lot about the match and the team's current form.

Where do analysts get the information they need?

There is a company called Opta that digitizes football matches in as much detail as possible. Working with it directly is expensive. However, there are popular sites that share some of the information, for example whoscored.

This picture is very similar to the one above: it is a map of shots on goal. You can open it by clicking on "Match center" and then selecting "Chalkboard". In essence it is the same map, only without the danger coefficients.

Why should xG indicators be used in the analysis?

Every week, hundreds of matches are played in the football world. The popular championships alone account for about 50-60 of them. Realistically, you can watch between 1 and 5 of them live, and about 10 more as fast-forwarded recordings. And that is if you are passionate and have nothing else to do.

In reality, I am sure most matches are followed in passing on live-score services. Or they are watched live, but then most people are caught up in the match itself (or in placing bets on it) rather than in analysis.

In short, independent in-depth analysis is expensive in terms of time, and it is far from certain that it will pay off in the long run.

This is where the xG map comes to the rescue. Glancing at the picture takes a minute; a thoughtful look takes 3-4 minutes. You save a lot of time and still get fairly complete information.

I suggest an experiment. Take any match you like, watch the highlights of the dangerous chances and rate them on a scale that suits you. Then find the xG map for that match and compare it with your impressions. If the match was interesting and the teams created a lot of chances, their xG figures will be quite high. If the match was boring, the xG figures will be low.

“Okay, if a lot of goals were scored in the match, then there were a lot of moments.”

Not always. It is exactly for such cases that you need to "look beyond the statistics". Take Lyon this year as an example. Despite losing many players, the team kept scoring a lot of goals. That was a little surprising. However, after watching the highlights and looking at the xG maps, everything fell into place: the team converts everything it creates.

Here, for example, is the xG map of the Lyon - Bordeaux match.

Do not be lazy, watch the highlights of the match. I think it will become clear to you that 3-3 was a very high score for it.

In many cases, the teams' xG figures and the number of goals scored match. But not always, and it is precisely because of such cases that the xG figures are worth analyzing.

The xG map helps to determine how deserved the result was.

In football it is not so rare for one of the teams to, as they say, "get away with it": the team gets a win or a draw, but the result owes more to luck.

I don't think it needs a long explanation why it is worth separating "clean" results (which followed from the logic of the game) from "dirty" ones (which came about through a combination of circumstances), and then using this knowledge.

The xG map helps in analyzing your bets after matches.

Something bettors do not pay enough attention to is a cold analysis of their own bets. No one is immune from error. However, to build a sound long-term record you need to avoid repeating bad bets and to stick to your line on other picks, even if they lost that day.

If your bet did not win, analyze it. Perhaps your choice was good and it simply did not go your way: the team failed to convert its chances. Looking at the match's xG map is the fastest way to get some food for analysis.

Why are xG indicators not the ultimate truth?

Watching some of the communities that have learned what xG is and how it helps in analyzing matches, I noticed that people are beginning to overestimate the significance of these indicators and to see nothing but the numbers. Now I will explain why the xG map is no holy grail.

The number of chances may be greater than the number of shots.

It is assumed that if a player is in a good position, he will shoot. However, this does not always happen. For example, it is not uncommon for a player to miss the ball or to be stopped by a foul. Such chances are not counted by xG. Some models also do not count own goals, which is not entirely correct.

Models cannot always accurately assess the danger of a chance.

Many models try to account for factors other than the shooting position. On whoscored, for example, there is a parameter called "Big chance". It raises the rating, but the shooting position still affects the coefficient.

I recently looked at the map of the Liverpool - Arsenal match. Liverpool created many chances and won. I would single out the third goal, scored by Salah. It was a one-on-one, and Salah could also have passed to a teammate next to him. For me that is almost a 100% chance. tegen11 rated it quite highly, but not at the maximum.

How well chances are converted depends on the championship and the skill of the players.

It is simple. The xG of Barcelona and the xG of Levante cannot be valued equally. The skill of Barcelona's players is much greater than that of Levante's. Messi, for one, does not need clear-cut chances to score goals. Strong players can convert chances that are far from obvious, and top players convert well in general, so they do not need many of them.

That is why it amuses me when all the teams in one championship are compared in the same way, and people talk, for example, about the good luck of some and the bad luck of others.

Creating a good chance is not enough: you need players who can convert it, as well as players who can score from a half-chance. The goalkeeper's skill also has a strong influence. I am sure that comparing De Gea, who can save a lot, with, say, Bournemouth's goalkeeper is meaningless.

If a weaker team visits a stronger one and is lucky enough to score, it is hard to expect the feast to continue. Most teams will then simply defend. Naturally, the number of chances created can vary greatly, although this follows from the logic of the game.

And from these figures alone it is impossible to see how good a team is in combination play, how it looks overall, or how it moves the ball.

Let's take some championship and try to evaluate the attack and defense of each team based on statistics. We get a certain coefficient, which we can later use in the analysis of matches.

What is the result?

This time we examined one of the methods for rating teams by working with a body of statistics. The method is quite simple but informative, because it gives you a number from which to start the analysis of a match. In this post I considered only football and goals, but the same approach can be used for other indicators (corners, cards, shots, offsides, etc.) and applied to other sports (hockey, basketball, volleyball).

We also covered the important nuances of statistics that you should not forget when working with them.

It is important to understand that the bookmaker is not asleep and also works with statistics. Your own work should therefore take into account nuances that the bookmaker may have missed. Sometimes it is also useful, with the necessary statistical calculations in hand, to recognize that the odds have been "drawn" purely from statistical data, without considering additional factors that may affect the course of the match.