In this analysis, we aim to identify arbitrage opportunities in head-to-head (H2H) betting markets. Arbitrage betting, also known as sure betting, involves placing bets on all possible outcomes of an event across different bookmakers to guarantee a profit regardless of the result. This is possible because different bookmakers offer different odds for the same event. Here, we’ll use the oddsapiR package to fetch the latest odds data (Gilani 2022) as well as the tidyverse, stringr and DT packages for data manipulation and analysis.
Before we begin, let’s load the required R packages for this analysis:
library(oddsapiR)
library(tidyverse)
library(stringr)
library(DT)
library(knitr)
library(kableExtra)
To access the odds data, you’ll need an API key from the Odds API. You can obtain a unique API key by visiting the-odds-api.com.
#Sys.setenv(ODDS_API_KEY = "XXXX-YOUR-API-KEY-HERE-XXXX")
Sys.setenv(ODDS_API_KEY = "27e7870c95b4633d21404d4bff34b320")
The following command will tell you how many requests you have remaining and how many requests you have used so far this month:
datatable(toa_requests())
Now, let’s check which sports currently have active markets (not including outright markets):
active_sports <- toa_sports() %>%
filter(active == TRUE, has_outrights == FALSE) %>% select(key, title, active, has_outrights)
Here is a table of the active sports currently available:
datatable(active_sports, options = list(scrollX = TRUE, pageLength = 5))
Next, we will extract head-to-head odds for all active markets in Australia. We’ll create a dataframe to store all sports odds and loop through each sport to fetch the odds data.
Odds will be scraped from the following Australian bookmakers:
#Create an initial dataframe where we will store all sports odds
sports_odds <- data.frame()
#Loop through each sport to fetch the odds data
for(i in 1:nrow(active_sports)){
sports_odds_i <- tryCatch({toa_sports_odds(
sport_key = active_sports$key[i],
regions = 'au',
markets = 'h2h',
odds_format = 'decimal',
date_format = 'iso')},
error = function(i) NA)
#If there is a market for the sport in Australia proceed
if(length(sports_odds_i) > 1){
#Add sport name column to the table
sports_odds_i <- sports_odds_i %>% mutate(sport = active_sports$key[i])
#Bind rows of the sport's odds to the initial odds dataframe
sports_odds <- bind_rows(sports_odds, sports_odds_i)
}
}
#Change the event commence time to a datetime index
sports_odds$commence_time <- as_datetime(sports_odds$commence_time)
#Filter so that only matches starting sometime in the future are shown (removes live matches)
sports_odds <- sports_odds %>% filter(commence_time > now(tzone = "zulu"))
#Convert to commence_time to AEST
sports_odds$commence_time <- sports_odds$commence_time + hours(10)
To identify head-to-head arbitrage opportunities, we will filter out lay bets, account for commissions, and only keep the best odds for each outcome of a unique match. We will also note the bookmakers offering the best odds.
Finally, we will calculate the arbitrage percentage for each match. We can find arbitrage opportunities by summing the inverse of the best odds for all possible outcomes and identifying where this sum is less than 1 (or 100%).
best_h2h_odds <- sports_odds %>%
#Filter out lay bets from Betfair
filter(market_key != "h2h_lay") %>%
#Add commission variable, which is a % of the profit that Betfair will take from the winnings.
#From what I've observed, Betfair takes 10% from NRL matches and 5% from all other sporting events
mutate(commission = ifelse(bookmaker == "Betfair", ifelse(sport == "rugbyleague_nrl", 10, 5), 0)) %>%
#Add a converted odds variable which adjusts odds to their odds once commission is taken into account
mutate(converted_odds = ifelse(market_key == "h2h_lay",
1 + (1-commission/100)/(outcomes_price-1),
1 + (1-commission/100)*(outcomes_price-1))) %>%
#Only keep the best odds for each outcome of a unique match
group_by(id, outcomes_name, home_team, away_team, sport) %>%
filter(converted_odds == max(converted_odds)) %>%
#Make a note of the bookmakers that offer the best odds. I will note a maximum of 4
#A lot of the time more than 1 bookmaker will offer the best possible odds you can find
group_by(id, outcomes_name, home_team, away_team, converted_odds, sport, commence_time) %>%
summarise(
bookmaker1 = sort(bookmaker)[1],
bookmaker2 = sort(bookmaker)[2],
bookmaker3 = sort(bookmaker)[3],
bookmaker4 = sort(bookmaker)[4]
)
#Change NA entries to "" so that the bookmaker columns can be combined
best_h2h_odds$bookmaker2[is.na(best_h2h_odds$bookmaker2)] <- ""
best_h2h_odds$bookmaker3[is.na(best_h2h_odds$bookmaker3)] <- ""
best_h2h_odds$bookmaker4[is.na(best_h2h_odds$bookmaker4)] <- ""
#Combine all bookmaker columns into a single column
best_h2h_odds$bookmakers <- paste(best_h2h_odds$bookmaker1,
best_h2h_odds$bookmaker2,
best_h2h_odds$bookmaker3,
best_h2h_odds$bookmaker4)
#Remove bookmaker1, bookmaker2, bookmaker3, bookmaker4 columns as they are now redundant
best_h2h_odds <- subset(best_h2h_odds, select = -c(bookmaker1,bookmaker2,bookmaker3,bookmaker4))
#Identify the market % of the best odds. Anything under 100% represents an arbitrage opportunity
market_percentage <- best_h2h_odds %>%
group_by(sport, id, commence_time) %>%
summarise(
market = sum(100/converted_odds)) %>%
arrange(market)
#Merge market_percentage with best_h2h_odds
best_h2h_odds <- merge(market_percentage,
best_h2h_odds,
by = c("sport", "id", "commence_time")) %>%
arrange(market) %>%
select(-id)
The resulting table will show us the matches with potential arbitrage opportunities, including the best odds for each outcome, the bookmakers offering these odds, and the calculated arbitrage percentage.
#Showcase the best markets, arranged from lowest market % to highest.
datatable(best_h2h_odds, options = list(scrollX = TRUE, pageLength = 10))
You can also find different arbitrage opportunities with a standard back bet, and a lay bet on Betfair
Similar to how we identified arbitrage opportunities for To identify arbitrage opportunities with standard back and lay bets, we will account for commissions and convert the odds accordingly. We’ll then keep the best odds for each outcome of a unique match and note the bookmakers offering these odds. Finally, we’ll calculate the market percentage for each match. Arbitrage opportunities arise when the sum of the inverse of the best odds for all possible outcomes is less than 1 (or 100%).
best_back_lay_odds <- sports_odds %>%
#Add commission variable, which is a % of the profit that Betfair will take from the winnings.
#From what I've observed, Betfair takes 10% from NRL matches and 5% from all other sporting events
mutate(commission = ifelse(bookmaker == "Betfair", ifelse(sport == "rugbyleague_nrl", 10, 5), 0)) %>%
#Add a converted odds variable which adjusts odds to their odds once commission is taken into account
mutate(converted_odds = ifelse(market_key == "h2h_lay",
1 + (1-commission/100)/(outcomes_price-1),
1 + (1-commission/100)*(outcomes_price-1))) %>%
#Only keep the best back odds and best lay odds for each outcome of a unique match
group_by(id, outcomes_name, home_team, away_team, market_key, sport, commence_time) %>%
#Make a note of the bookmakers that offer the best odds. I will note a maximum of 4
#A lot of the time more than 1 bookmaker will offer the best possible back odds you can find.
filter(converted_odds == max(converted_odds)) %>%
group_by(id, outcomes_name, home_team, away_team, converted_odds, market_key, sport, commence_time, outcomes_price) %>%
summarise(
bookmaker1 = sort(bookmaker)[1],
bookmaker2 = sort(bookmaker)[2],
bookmaker3 = sort(bookmaker)[3],
bookmaker4 = sort(bookmaker)[4]
)
#Change NA entries to "" so that the bookmaker columns can be combined
best_back_lay_odds$bookmaker2[is.na(best_back_lay_odds$bookmaker2)] <- ""
best_back_lay_odds$bookmaker3[is.na(best_back_lay_odds$bookmaker3)] <- ""
best_back_lay_odds$bookmaker4[is.na(best_back_lay_odds$bookmaker4)] <- ""
#Combine all bookmaker columns into a single column
best_back_lay_odds$bookmakers <- paste(best_back_lay_odds$bookmaker1,
best_back_lay_odds$bookmaker2,
best_back_lay_odds$bookmaker3,
best_back_lay_odds$bookmaker4)
#Remove bookmaker1, bookmaker2, bookmaker3, bookmaker4 columns as they are now redundant
best_back_lay_odds <- subset(best_back_lay_odds, select = -c(bookmaker1,bookmaker2,bookmaker3,bookmaker4))
#Identify if there is a pair of back odds and lay odds for each outcome in a unique match
back_and_lay_pair <- best_back_lay_odds %>% group_by(id, outcomes_name) %>%
summarise(events = n()) #events will equal 2 if there is a pair, and 1 if there isn't a pair
#Merge best_back_lay_odds with back_and_lay_pair
best_back_lay_odds <- merge(best_back_lay_odds, back_and_lay_pair, by = c("id", "outcomes_name")) %>%
#And filter so that it only includes outcomes that have a pair of back odds and lay odds
filter(events == 2) %>%
select(-events)
#Identify the market % of the best odds. Anything under 100% represents an arbitrage opportunity
market_percentage <- best_back_lay_odds %>%
group_by(sport, id, outcomes_name, commence_time) %>%
summarise(
market = sum(100/converted_odds)) %>%
arrange(market)
#Merge market_percentage with best_back_lay_odds
best_back_lay_odds <- merge(market_percentage,
best_back_lay_odds,
by = c("sport", "id", "outcomes_name", "commence_time")) %>%
arrange(market) %>%
filter(!sport %in% c("icehockey_sweden_hockey_league", "icehockey_nhl")) %>%
select(-id)
The resulting table will show us the matches with potential arbitrage opportunities, including the best odds for each outcome, the bookmakers offering these odds, and the calculated arbitrage percentage.
#Showcase the best markets, arranged from lowest market % to highest.
datatable(best_back_lay_odds, options = list(scrollX = TRUE, pageLength = 10))
In this analysis, we have successfully identified potential arbitrage
opportunities in H2H betting markets by using the oddsapiR,
tidyverse, stringr and DT
packages in R. We have demonstrated how to fetch the latest odds data,
filter and manipulate the data to find the best odds, and calculate
arbitrage percentages. By following this approach, betters can
potentially guarantee a profit regardless of the outcome of an event by
taking advantage of different odds offered by various bookmakers. This
methodology can be further extended and refined to enhance the accuracy
and efficiency of identifying profitable arbitrage opportunities.