Predicting Hotel Booking Cancellation using papAI

The hotel industry is one of the most fascinating markets for artificial intelligence-based prediction of booking cancellations. It demands a high level of data analysis and management from many stakeholders, including clients, suppliers, management and employees. This can be attributed to the nature of the business, in which hotels have varying degrees of flexibility in meeting demand fluctuations caused by multiple factors, including seasonality, weather conditions and competition. In this article, we walk through a modeling approach based on classification techniques to predict hotel booking cancellations using the papAI platform.

Predicting hotel booking cancellation is a problem that many hotels and travel companies face. It is important for hotels to be able to accurately predict whether or not a booking will be cancelled, as this allows them to better manage their inventory and optimize their revenue. There are several factors that can influence the likelihood of a booking being cancelled, including the type of booking (e.g. individual or group), the length of stay, the location of the hotel, and the season in which the booking was made. Other factors that may influence cancellation rates include the availability of alternative accommodation options, the price of the hotel room, and the overall satisfaction of the guest.

One way in which the papAI platform can address this use case is through predictive analytics. Predictive analytics is the use of machine learning algorithms to analyse historical data and make predictions about future events. In this context, it may involve analysing data such as booking patterns, occupancy rates and room rates.

Overview of the hospitality industry

It is difficult to predict the exact rate of hotel booking cancellations for 2022, as it will depend on various factors such as the state of the global economy, travel restrictions and individual customer behavior. However, some general trends can be identified from past data and current conditions.

The latest 2022 research shows that cancellation rates have risen rapidly, with 1 in 5 (20%) hotel bookings cancelled. It also notes that couples have a higher cancellation rate than families (66.7% and 11% respectively), which it reports as the highest rate of cancellation in the world (Avvio research).

Factors that may affect the rate of hotel booking cancellations in 2022 include the ongoing COVID-19 pandemic and related travel restrictions. While the rollout of vaccines is expected to lead to an increase in travel demand, it is possible that some individuals may still be hesitant to travel due to health concerns. This could lead to an increase in cancellations. Additionally, economic uncertainties may also impact the rate of hotel booking cancellations. If consumers are facing financial difficulties, they may be more likely to cancel their travel plans.

Why is predicting the cancellation of hotel bookings so important in the current climate?

Predicting hotel booking cancellations using the papAI platform offers a number of benefits for hotel owners and managers. Some of the key benefits include:

  • Improved efficiency: The papAI platform allows for the automatic processing of large amounts of data, making the process of predicting cancellations much faster and more accurate. This can help hotel managers make quicker and more informed decisions about how to allocate rooms and resources.
  • Increased revenue: By accurately predicting cancellations, hotels can better plan for any potential gaps in bookings and work to fill those gaps with new reservations. This can help increase overall revenue for the hotel. 
  • Enhanced customer satisfaction: By being able to predict cancellations and have a plan in place to fill those gaps, hotels can offer a more consistent and reliable experience for their guests. This can lead to increased customer satisfaction and loyalty. 
  • Reduced costs: Predicting cancellations can help hotels avoid overbooking, which can lead to costly last-minute adjustments and lost revenue. It can also help hotels save on costs related to preparing and cleaning rooms that may have gone unused due to cancellations. 
  • Improved decision-making: The papAI platform can provide hotels with valuable insights and data-driven recommendations for how to optimize their operations and better serve their customers. This can help hotel managers make more informed and strategic decisions about how to best run their business.

"At a time when explainable artificial intelligence (XAI) plays an important part in public discussion, the high interpretability of the papAI platform ensures full disclosure and transparency on the results obtained with AI, identifies biases and guarantees the protection of your data management through a localized implementation."
Jean-marc BRIQUET
Global Sales Director


The dataset contains booking information between 2015 and 2017 for two different hotels: one Resort Hotel and one City Hotel. Both hotels are located in Portugal: the first in the Algarve region and the second in the city of Lisbon. The dataset has 119,390 data points and 32 features, such as arrival_month, the number of adults, the average daily rate and the country of origin.

1- Exploring and analysing the data

papAI’s data visualization module lets you build 2D, 3D or even cartographic visualizations in order to assess and analyse what the dataset has to offer. In this case, the graph represents the average daily rate per person per month.

We can see that prices are higher in summer than in winter because of the high demand in that period, with the highest average in August at 72 € per person, compared with the lowest average in November at 30 € per person.
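The aggregation behind such a chart can be sketched in a few lines of plain Python. This is only an illustration, not papAI's implementation: the toy bookings are invented, the field names (`arrival_month`, `adr`, `adults`, `children`) are assumptions, and "per person" is assumed to mean the daily rate divided by the number of guests on the booking.

```python
from collections import defaultdict

# Toy bookings, invented for illustration only (not rows from the real dataset).
# Field names are assumptions loosely based on the features named in the article.
bookings = [
    {"arrival_month": "August", "adr": 150.0, "adults": 2, "children": 0},
    {"arrival_month": "August", "adr": 210.0, "adults": 2, "children": 1},
    {"arrival_month": "November", "adr": 60.0, "adults": 2, "children": 0},
    {"arrival_month": "November", "adr": 45.0, "adults": 1, "children": 0},
    {"arrival_month": "November", "adr": 80.0, "adults": 0, "children": 0},
]


def adr_per_person_by_month(rows):
    """Return {month: mean of (adr / guests)} over bookings with at least one guest."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        guests = row["adults"] + row["children"]
        if guests == 0:
            continue  # skip bookings without guests to avoid dividing by zero
        totals[row["arrival_month"]] += row["adr"] / guests
        counts[row["arrival_month"]] += 1
    return {month: totals[month] / counts[month] for month in totals}


monthly = adr_per_person_by_month(bookings)
for month, value in monthly.items():
    print(f"{month}: {value:.2f} EUR per person")
```

Bookings with zero guests are skipped here; whether to drop or correct such rows is a cleaning decision that depends on the data.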

2- Preprocessing the data

Before going forward with the ML step, a cleaning step is required to obtain a dataset that is ready for model training and prediction. Here, we extract the bookings from the city hotel and from the resort hotel separately and apply small cleaning steps, such as dropping null values or columns of no importance, before splitting the data into training and testing datasets to build our model.

We dropped some columns that are irrelevant to our use case and dropped some null values to keep the dataset as clean and robust as possible for the ML part.
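As a rough sketch of these cleaning steps outside the platform, the following plain-Python example filters one hotel, drops a column, removes rows with missing values and splits the result. The rows are invented, and the column names (including the dropped `company` column), the 80/20 ratio and the helper names are assumptions made for the illustration.

```python
import random

# Toy rows invented for illustration; None marks a missing value.
# Column names are assumptions, not the exact schema used in papAI.
rows = [
    {"hotel": "City Hotel", "lead_time": 30, "adr": 95.0, "company": None, "is_canceled": 0},
    {"hotel": "City Hotel", "lead_time": 210, "adr": 80.0, "company": None, "is_canceled": 1},
    {"hotel": "City Hotel", "lead_time": None, "adr": 120.0, "company": "C1", "is_canceled": 0},
    {"hotel": "City Hotel", "lead_time": 12, "adr": 110.0, "company": None, "is_canceled": 0},
    {"hotel": "City Hotel", "lead_time": 95, "adr": 70.0, "company": "C2", "is_canceled": 1},
    {"hotel": "City Hotel", "lead_time": 3, "adr": 130.0, "company": None, "is_canceled": 0},
    {"hotel": "Resort Hotel", "lead_time": 60, "adr": 150.0, "company": None, "is_canceled": 0},
    {"hotel": "Resort Hotel", "lead_time": 300, "adr": 65.0, "company": None, "is_canceled": 1},
]


def preprocess(all_rows, hotel, drop_columns):
    """Keep one hotel, drop unwanted columns, then drop rows that still contain nulls."""
    kept = []
    for row in all_rows:
        if row["hotel"] != hotel:
            continue
        cleaned = {key: value for key, value in row.items() if key not in drop_columns}
        if any(value is None for value in cleaned.values()):
            continue
        kept.append(cleaned)
    return kept


def train_test_split(data, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data and return (train, test)."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


city = preprocess(rows, "City Hotel", drop_columns={"company"})
train, test = train_test_split(city)
print(len(city), "clean city bookings:", len(train), "for training,", len(test), "for testing")
```

Dropping the mostly empty column before removing null rows matters: doing it the other way round would discard nearly every booking.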

3- Building the model

After cleaning your dataset and separating it into training and testing datasets, the type of problem must be identified. In our case, the use case is predicting hotel cancellations, which means we are dealing with a classification problem. As a result, using the AutoML module, we specify the type of problem we want to solve as well as the target we want to predict (the column “is canceled”).

papAI’s built-in ML algorithms offer a multitude of classification models that are easy to configure, and you can run several models to compare their results and select the one that suits you best.
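The idea of training several candidate classifiers and comparing them on held-out data can be illustrated with a deliberately tiny example. This is not papAI's AutoML: the two model types here (a majority-class baseline and a one-level decision tree, or "stump", on a single categorical feature) are stand-ins chosen for brevity, and the data and feature names are invented.

```python
from collections import Counter, defaultdict

# Toy data invented for illustration: (deposit_type, market_segment, is_canceled).
train = [
    ("No Deposit", "Online", 0), ("No Deposit", "Online", 0),
    ("No Deposit", "Online", 1), ("Non Refund", "Groups", 1),
    ("Non Refund", "Groups", 1), ("No Deposit", "Direct", 0),
    ("No Deposit", "Direct", 0),
]
test = [
    ("No Deposit", "Online", 0), ("Non Refund", "Groups", 1),
    ("No Deposit", "Groups", 0), ("Non Refund", "Direct", 1),
]


def fit_majority(rows):
    """Baseline: always predict the most common label seen in training."""
    majority = Counter(label for *_, label in rows).most_common(1)[0][0]
    return lambda features: majority


def fit_stump(rows, index):
    """One-level decision tree: most common label per category of one feature."""
    by_value = defaultdict(Counter)
    for *features, label in rows:
        by_value[features[index]][label] += 1
    fallback = Counter(label for *_, label in rows).most_common(1)[0][0]
    table = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
    return lambda features: table.get(features[index], fallback)


def accuracy(model, rows):
    """Share of rows whose predicted label equals the true label."""
    return sum(model(features) == label for *features, label in rows) / len(rows)


candidates = {
    "majority baseline": fit_majority(train),
    "stump on deposit_type": fit_stump(train, index=0),
    "stump on market_segment": fit_stump(train, index=1),
}
scores = {name: accuracy(model, test) for name, model in candidates.items()}
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: accuracy {score:.2f}")
print("selected:", best)
```

Every candidate is fitted on the training rows only and scored on the same test rows, which is what makes the comparison fair.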


4- Evaluating the model

When the training is finished, a dashboard of metrics is displayed for each model to help you understand and evaluate its performance and choose the right one for the use case. For this Decision Tree, we have a great evaluation score and a confusion matrix that shows how well the model’s predictions match the true values. However, before choosing it, we need to understand how the model works.
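For readers less familiar with these metrics, here is a minimal sketch of how a binary confusion matrix and the usual scores derived from it are computed. The labels and predictions are made up for the example (1 = canceled, 0 = not canceled) and are not results from the article's model.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = canceled)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, predicted in zip(y_true, y_pred):
        if truth == 1 and predicted == 1:
            counts["tp"] += 1
        elif truth == 0 and predicted == 1:
            counts["fp"] += 1
        elif truth == 0 and predicted == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def classification_metrics(cm):
    """Derive accuracy, precision, recall and F1 from the four counts."""
    tp, fp, tn, fn = cm["tp"], cm["fp"], cm["tn"], cm["fn"]
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
scores = classification_metrics(cm)
print(cm)
print(scores)
```

For cancellations, recall (the share of actual cancellations that were caught) and precision (the share of predicted cancellations that were real) often matter more than raw accuracy, because the two kinds of error have different costs for a hotel.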

5- Interpreting the model

To go deeper into the model, papAI integrates an explainability module that lets you not only evaluate the model but also understand how the prediction is calculated and how the features influence the model’s decision making. Here we can see, for example, that the type of deposit greatly influences the prediction, depending on whether the guest makes no deposit or pays the full price of the stay.
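The article does not say which explainability method papAI uses. One common, model-agnostic way to measure how much a feature matters is permutation importance: shuffle one column and see how much the accuracy drops. The sketch below uses that swapped-in technique with a hypothetical rule-based `model` and invented bookings. The labels are generated from the same rule so that the baseline accuracy is exactly 1.0.

```python
import random

# Toy bookings invented for illustration.
rows = [
    {"deposit_type": "No Deposit", "lead_time": 10, "adults": 2},
    {"deposit_type": "Non Refund", "lead_time": 40, "adults": 1},
    {"deposit_type": "No Deposit", "lead_time": 250, "adults": 2},
    {"deposit_type": "Non Refund", "lead_time": 300, "adults": 3},
    {"deposit_type": "No Deposit", "lead_time": 90, "adults": 1},
    {"deposit_type": "No Deposit", "lead_time": 5, "adults": 2},
]


def model(row):
    """Hypothetical stand-in for a trained classifier (1 = predicted cancellation)."""
    return int(row["deposit_type"] == "Non Refund" or row["lead_time"] > 200)


# Toy labels produced by the rule itself, so the unshuffled accuracy is 1.0.
labels = [model(row) for row in rows]


def accuracy(data, y):
    """Share of rows where the model's prediction equals the label."""
    return sum(model(row) == truth for row, truth in zip(data, y)) / len(y)


def permutation_importance(data, y, feature, repeats=50, seed=0):
    """Average accuracy drop when the values of one feature are shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(data, y)
    drops = []
    for _ in range(repeats):
        values = [row[feature] for row in data]
        rng.shuffle(values)
        shuffled = [{**row, feature: value} for row, value in zip(data, values)]
        drops.append(baseline - accuracy(shuffled, y))
    return sum(drops) / repeats


importances = {
    feature: permutation_importance(rows, labels, feature)
    for feature in ("deposit_type", "lead_time", "adults")
}
for feature, drop in importances.items():
    print(f"{feature}: mean accuracy drop {drop:.3f}")
```

A feature the model never reads (`adults` in this toy rule) gets an importance of exactly zero, because shuffling it cannot change any prediction.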

6- Applying the prediction to another set

After looking through the explainability module and choosing the right model, you can promote it to apply predictions to other datasets with the same schema as the one used for training. A new dataset is created to compare the predicted value with the real value.
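A minimal sketch of this scoring step follows, assuming a hypothetical `predict_cancel` function in place of the promoted model and an assumed two-column training schema. It checks that each incoming row has the columns the model needs, then writes the prediction next to the real value so the two can be compared.

```python
# Columns the model was trained on (an assumption made for this illustration).
TRAINING_SCHEMA = ("deposit_type", "lead_time")


def predict_cancel(row):
    """Hypothetical stand-in for the promoted model (1 = predicted cancellation)."""
    return int(row["deposit_type"] == "Non Refund" or row["lead_time"] > 200)


def apply_model(rows, schema, model):
    """Return copies of the rows with an added 'predicted_is_canceled' column."""
    scored = []
    for row in rows:
        missing = [column for column in schema if column not in row]
        if missing:
            raise ValueError(f"row is missing columns required by the model: {missing}")
        scored.append({**row, "predicted_is_canceled": model(row)})
    return scored


# Invented bookings that were not used for training.
new_bookings = [
    {"deposit_type": "No Deposit", "lead_time": 20, "is_canceled": 0},
    {"deposit_type": "Non Refund", "lead_time": 15, "is_canceled": 1},
    {"deposit_type": "No Deposit", "lead_time": 260, "is_canceled": 0},
]

scored = apply_model(new_bookings, TRAINING_SCHEMA, predict_cancel)
agreement = sum(row["predicted_is_canceled"] == row["is_canceled"] for row in scored)
print(f"{agreement} of {len(scored)} predictions match the real value")
```

Failing loudly on a schema mismatch is a design choice: silently scoring rows with missing columns would produce predictions that look valid but are not.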

In addition, you can look through each individual prediction and get more details on the decision-making process through our local interpretability module.

For this booking, we can see how the value of each feature influences the probability and therefore the predicted value. Furthermore, through our counterfactuals module, you can change some of the true values to test hypotheses about what could influence whether a guest cancels a booking.
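Both ideas, per-feature contributions for one booking and a "what if" change to one of its values, can be illustrated with a small logistic scoring function. Everything in this sketch is an assumption made for the example: the coefficients are invented rather than learned, the feature names are hypothetical, and papAI's own modules may work quite differently.

```python
import math

# Hypothetical logistic-regression parameters, chosen only for illustration.
INTERCEPT = -1.0
COEFFICIENTS = {
    "lead_time": 0.01,              # per day between booking and arrival
    "non_refundable_deposit": 2.0,  # 1 if the deposit is non-refundable, else 0
    "special_requests": -0.5,       # per special request
}


def contributions(booking):
    """Local explanation: each feature's additive contribution to the log-odds."""
    return {name: COEFFICIENTS[name] * booking[name] for name in COEFFICIENTS}


def cancel_probability(booking):
    """Logistic function applied to the intercept plus all feature contributions."""
    z = INTERCEPT + sum(contributions(booking).values())
    return 1.0 / (1.0 + math.exp(-z))


booking = {"lead_time": 100, "non_refundable_deposit": 1, "special_requests": 0}

# Counterfactual: the same booking, but with a different type of deposit.
what_if = {**booking, "non_refundable_deposit": 0}

p_original = cancel_probability(booking)
p_what_if = cancel_probability(what_if)

print("contributions:", contributions(booking))
print(f"original probability of cancellation: {p_original:.3f}")
print(f"counterfactual probability:           {p_what_if:.3f}")
```

Changing one input while holding the others fixed shows how sensitive the prediction is to that input under the model. It does not by itself prove that changing the deposit policy would change real guest behaviour.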

Predicting hotel booking cancellations using the papAI platform can greatly benefit the hotel industry by enabling hotels to accurately forecast demand and optimise their resources. This can lead to increased revenue and profitability for the hotel, as well as improved guest satisfaction by ensuring that there are enough rooms available for guests. In addition, thanks to the high level of explainability (XAI) of the results on our platform, hotels can also better understand the factors that lead to cancellations. This information can be used to implement strategies to reduce cancellations and improve the overall customer experience. For example, hotels may be able to offer incentives or promotions to encourage guests to keep their reservations, or they may be able to identify and resolve issues that may be causing cancellations.
