Airport Passenger Flow Forecasting using papAI

Using data from the airport terminal and the airport operations center, passengers can be tracked from several different angles. In this use case, the papAI platform is trained to predict passenger flow in the airport in order to improve the efficiency of airport operations.

Predicting passenger flows in an airport is necessary for efficient and effective resource management. The papAI platform offers a solution that supports various techniques, including regression, classification, clustering, and time series forecasting. This use case presents a methodology in which passenger flow data collected at an airport over a period of time is analyzed with the last of these techniques, time series forecasting, and yields successful predictions.

What are the benefits of passenger flow forecasting for airports?

  • Improve gate assignments: Gate assignment is a common problem at airports. An airport may have many gates available without making full use of them, because it lacks proper tools to improve its assignment process: there is no way to find out which routes are the slowest and then optimize gate assignments accordingly. With the papAI platform, we provide a tool that not only tracks the current status of each flight but also suggests new strategies to optimize gate assignments.
  • Reduce empty parking and increase revenue at airports: Empty parking is a recurring problem at airports, caused by poor management and forecasting of passenger flows. Many travelers are not prepared to pay the price of parking and prefer to wait until the last moment to go there. This results in mass arrivals and high costs for airport operators, which have to provide more parking spaces than are normally used in order to absorb the rush of traffic.
  • Enhanced airport staff management: One of the main benefits of passenger flow prediction is knowing how many passengers will pass through a particular airport or check-in counter. This tells you how many staff are needed and where to place them, which reduces downtime and increases efficiency.
  • More effective investment planning: Forecasting passenger flows allows airports and managing authorities to better locate and plan investments, which in turn benefits passengers through better service, shorter waiting times, and efficient rest services.
  • Decrease congestion at security checkpoints: Passenger flow forecasting makes the work of security checkpoint agents easier by improving their scheduling, which allows passengers to be handled more smoothly.

1 - Context

The purpose of this use case is to study monthly passenger traffic at an airport and forecast it for the following years. To do so, we took the database of San Francisco International Airport. Located south of San Francisco's city centre, this airport is an important air transport platform that serves domestic and international flights, including transpacific flights. It is one of the main hubs of United Airlines, which generates a relatively large share of the airport's traffic.

This database contains the data of all aircraft landings from 2002 to 2018. The only limitation of this data is that the recording period ends in 2018, which means that the subsequent period of reduced activity, which certainly affected the number of flights, is not taken into account. We used this database to predict the monthly passenger flow for the following years.


2 - Data analysis and preparation

a) Data analysis

Before launching a machine learning model, we produced some visualizations of our database to get familiar with the subject. The papAI platform can generate 2D, 3D, or map visualizations; maps require the latitude and longitude of each data point.

  • First, looking at the total number of passengers (in thousands) at San Francisco airport over time, we see that traffic generally increases from year to year.
  • Second, if we plot the number of passengers per period, we see a clear seasonal pattern with a period of about one year. Traffic peaks around summer and is lowest in winter, at the beginning of the year.
  • Third, with a 3D display of the number of passengers over time for each airline, we can see the average number of passengers per airline each year and calculate each airline's share of total traffic. The ranking is: 1) United Airlines, 2) SkyWest Airlines, 3) American Airlines, 4) Virgin America, 5) Delta Airlines.
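The three observations above (yearly trend, seasonal profile, airline share) come from papAI's visual tools, but the underlying aggregations are simple. The following is a minimal sketch using only the Python standard library. The record layout, the airline names, and all numbers are made up for illustration and are not the San Francisco data.

```python
from collections import defaultdict

# Hypothetical toy records: (year, month, airline, passengers in thousands).
# These values are invented for illustration only.
records = [
    (2016, 1, "Airline A", 100.0),
    (2016, 7, "Airline A", 160.0),
    (2016, 1, "Airline B", 50.0),
    (2016, 7, "Airline B", 90.0),
    (2017, 1, "Airline A", 110.0),
    (2017, 7, "Airline A", 170.0),
    (2017, 1, "Airline B", 60.0),
    (2017, 7, "Airline B", 100.0),
]


def yearly_totals(rows):
    """Total passengers per year (shows the long-term trend)."""
    totals = defaultdict(float)
    for year, _month, _airline, passengers in rows:
        totals[year] += passengers
    return dict(totals)


def monthly_profile(rows):
    """Average total traffic per calendar month across years (seasonality)."""
    per_year_month = defaultdict(float)
    for year, month, _airline, passengers in rows:
        per_year_month[(year, month)] += passengers
    by_month = defaultdict(list)
    for (_year, month), total in per_year_month.items():
        by_month[month].append(total)
    return {month: sum(v) / len(v) for month, v in by_month.items()}


def airline_share(rows):
    """Fraction of total traffic carried by each airline."""
    totals = defaultdict(float)
    for _year, _month, airline, passengers in rows:
        totals[airline] += passengers
    grand_total = sum(totals.values())
    return {airline: t / grand_total for airline, t in totals.items()}


print(yearly_totals(records))
print(monthly_profile(records))
print(airline_share(records))
```

On real data, a rising sequence of yearly totals corresponds to the first observation, a monthly profile that is higher in summer than in winter to the second, and the sorted shares to the airline ranking.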

b) Data cleaning

To study a time series case, we first need to change the type of the Activity_Period column to timestamp, which is the format papAI needs in order to use the Time series cleaning module. Once the column types are set, we have to resample our table so that all data points are separated by a single, regular interval. papAI recommends resampling frequencies, and we notice that the most frequent interval in our data is 31 days.

After choosing the date interval of our data, we choose the target we want to predict, which is the number of passengers, and we set the aggregation method to take the average number of passengers over each month.
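For readers who want to see what this cleaning step amounts to outside the platform, here is a minimal standard-library sketch. It assumes, purely for illustration, that Activity_Period is stored as a "YYYYMM" string; the helper names and the sample rows are hypothetical and this is not papAI's implementation.

```python
from datetime import date, datetime
from statistics import mean


def parse_period(activity_period):
    """Convert an assumed 'YYYYMM' string into the first day of that month."""
    return datetime.strptime(activity_period, "%Y%m").date()


def resample_monthly(rows):
    """Group rows by month and average the passenger counts in each month."""
    buckets = {}
    for activity_period, passengers in rows:
        buckets.setdefault(parse_period(activity_period), []).append(passengers)
    return {month: mean(values) for month, values in sorted(buckets.items())}


def month_range(start, end):
    """Yield the first day of every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def regular_series(monthly):
    """Lay the monthly averages on a gap-free monthly index (None = missing)."""
    months = list(monthly)
    return [(m, monthly.get(m)) for m in month_range(months[0], months[-1])]


# Invented sample rows: (Activity_Period, passenger count).
rows = [("201801", 120.0), ("201801", 80.0), ("201802", 90.0), ("201804", 110.0)]
series = regular_series(resample_monthly(rows))
for month, value in series:
    print(month.isoformat(), value)
```

The month with no rows (March in this toy input) shows up as `None`, which makes gaps explicit before any model is trained.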

3 - Training Data

After the cleaning step, your dataset is ready for training any ML model. With our AutoML module, you can simply create multiple model experiments from built-in scalable ML algorithms with just a few clicks. You can build your own pipeline to create the best possible model with no-code capabilities, which is useful for practitioners or non-data experts.

The first step is to choose the number of data points on which the model will train. We chose 123, which is the full size of our dataset, so the model trains on the entire dataset and predicts 80 points into the future (FUTURE), which represents a little over six and a half years (80 months).

The Time series forecasting module (univariate) offers 15 machine learning models, including ARIMA Forecaster, block RNN, FFT, Prophet, LSTM, … These models are parameterized by default with the values of SKLEARN.

After running several models at the same time, we obtain a ranking of the results. The best result was obtained with PROPHET, shown in dark green, and the lowest result with the Regular ML Models. We promote the best model and use it to make the prediction.
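To illustrate the idea of ranking several forecasters on the same data, here is a small sketch. It does not use papAI or any of the models listed above: the three baseline forecasters, the mean absolute error metric, and the synthetic series are all stand-ins chosen for illustration.

```python
def naive_last(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(train, horizon, season=12):
    """Repeat the values observed one season (12 months) earlier."""
    return [train[-season + (i % season)] for i in range(horizon)]


def historical_mean(train, horizon):
    """Repeat the average of the training data."""
    average = sum(train) / len(train)
    return [average] * horizon


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rank_models(series, horizon, models):
    """Hold out the last `horizon` points and sort models by error (best first)."""
    train, test = series[:-horizon], series[-horizon:]
    scores = {name: mae(test, fn(train, horizon)) for name, fn in models.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Synthetic monthly series: four years with an upward trend and a summer peak.
pattern = [0, 1, 3, 5, 8, 10, 12, 11, 7, 4, 2, 1]
series = [100 + 2 * year + pattern[m] for year in range(4) for m in range(12)]

models = {
    "naive_last": naive_last,
    "seasonal_naive": seasonal_naive,
    "historical_mean": historical_mean,
}
ranking = rank_models(series, horizon=12, models=models)
for name, score in ranking:
    print(f"{name}: MAE = {score:.2f}")
```

On this synthetic series the seasonal baseline ranks first because the data has a strong yearly pattern, which is the same reason seasonality-aware models tend to do well on monthly passenger traffic.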

4 - Analyse & Understand the model

To evaluate the quality of the trained model, papAI integrates an applicability module for each experiment created. In our case, we can look at the backtest, that is, the values predicted for years in which the number of passengers was already known. The result is clearly satisfactory: the orange points overlap with the blue points, while the green points represent the forecast.
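A backtest of this kind can be sketched as a walk-forward loop: each historical point is predicted using only the data that precedes it, and the predictions are then compared with the known values. The forecaster below is a simple seasonal baseline used as a stand-in, not the Prophet model, and the series is synthetic.

```python
def seasonal_naive_one_step(history, season=12):
    """Predict the next value as the value observed one season earlier."""
    return history[-season]


def backtest(series, start, forecaster):
    """Walk forward from index `start`, predicting each point from its past only."""
    return [forecaster(series[:t]) for t in range(start, len(series))]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Synthetic monthly series: four years with an upward trend and a summer peak.
pattern = [0, 1, 3, 5, 8, 10, 12, 11, 7, 4, 2, 1]
series = [100 + 2 * year + pattern[m] for year in range(4) for m in range(12)]

start = 24  # keep the first two years as initial history
predictions = backtest(series, start, seasonal_naive_one_step)
actual = series[start:]
print(f"Backtest MAPE: {mape(actual, predictions):.2f}%")
```

A low error on the backtest period is what the overlapping points in the plot express visually: the model reproduces values it was not shown at prediction time.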

5 - Predict on the testing set

The result is a set of values, generated with papAI, that represent the number of passengers for each of the following months.

6 - Visualizing the output
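Each predicted value belongs to a specific future month. As a minimal sketch of how a forecast table could be laid out, the snippet below pairs 80 forecast values with the months that follow the last observation. The last observed month and the placeholder values are assumptions made for illustration; real values would come from the promoted model.

```python
from datetime import date


def future_months(last_observed, horizon):
    """Return the first day of each of the `horizon` months after `last_observed`."""
    index = last_observed.year * 12 + (last_observed.month - 1)
    months = []
    for step in range(1, horizon + 1):
        year, month0 = divmod(index + step, 12)
        months.append(date(year, month0 + 1, 1))
    return months


last_observed = date(2018, 12, 1)   # hypothetical last month in the history
horizon = 80                        # number of future points, as in the use case
forecast_values = [0.0] * horizon   # placeholder for the model's predictions

table = list(zip(future_months(last_observed, horizon), forecast_values))
print(table[0][0].isoformat(), "to", table[-1][0].isoformat())
```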

To see how the values progress over time, the visualization below shows the existing values in orange and the predicted values in blue.

The airport business is characterized by high competition and increasing pressure on costs. This makes it essential to operate efficiently, without compromising safety. The development of passenger demand forecasts is a key tool to achieve that. It helps improve operational efficiency through better resource planning and can also be used in investment decisions, such as deciding whether or not to build an extra stand or gate area.


Interested in discovering papAI?

Our commercial team is at your disposal to answer any questions.
