F1 Tyre Life Prediction Project

This project analyzes Formula 1 race data to predict tyre life using machine learning models such as LightGBM, Random Forest, and Linear Regression. It includes data pipelines for session extraction, feature engineering, and model training. The codebase supports data cleaning, feature selection, and model evaluation, with a focus on tyre compound analysis and stint strategies. For more of the project story, see www.linkedin.com/in/dіana-antoniuk-067b28362

Deployed project: https://antoniukdin34.pythonanywhere.com/

Demo info: The demo showcases the best-performing model by plotting its predictions for the entire test set, so you can see how the model behaves on real data. Most predictions are accurate, with a few outliers visible in the last graph. The first three graphs show the data I gathered for the project. I chose this approach to make the results and model behavior clear and understandable.

How It Was Built

Read up on F1 strategy and talked to people on LinkedIn to define the problem.
Tested the FastF1 API with a single race (Monaco 2022) to get familiar with the data.
Built a dataset covering 2022–2025, tracking every lap and tyre change.
Tried out several models; Random Forest worked best for predicting tyre stints.
Switched from one-hot encoding (separate columns for Soft/Medium/Hard) to numeric encoding (Soft=1, Medium=2, Hard=3).
Cleaned up the data, normalized features, and removed outliers (like a tyre stint of only 3 laps).
Noticed the model was focusing on irrelevant features (like year), so simplified the feature set.
Ended up with a model using just three features: circuit length, compound, and tyre life.
Normalized per track, since “Soft” tyres at Monaco aren’t the same as “Soft” at Monza.
Used scatter plots to spot and remove outliers, since some tyre lives just didn’t fit the trend.
Dealt with data collisions (same features, different results) by refining the dataset.
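
The data-preparation steps above (numeric compound encoding, dropping very short stints, per-track normalization, and resolving collisions) can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the field names (track, compound, tyre_life), the min_laps threshold, dividing by the per-track mean, and taking the median of colliding rows are all assumptions, and the sample rows are made-up numbers.

```python
from collections import defaultdict
from statistics import mean, median

# Ordinal codes as described in the list above (Soft=1, Medium=2, Hard=3).
COMPOUND_CODES = {"SOFT": 1, "MEDIUM": 2, "HARD": 3}


def encode_compound(name):
    """Map a compound name to its numeric code (case-insensitive)."""
    return COMPOUND_CODES[name.upper()]


def drop_short_stints(rows, min_laps=4):
    """Drop stints shorter than min_laps (the threshold is an assumption)."""
    return [row for row in rows if row["tyre_life"] >= min_laps]


def normalize_per_track(rows):
    """Add tyre_life_rel: tyre life divided by the mean tyre life at that track."""
    by_track = defaultdict(list)
    for row in rows:
        by_track[row["track"]].append(row["tyre_life"])
    track_mean = {track: mean(lives) for track, lives in by_track.items()}
    return [
        {**row, "tyre_life_rel": row["tyre_life"] / track_mean[row["track"]]}
        for row in rows
    ]


def resolve_collisions(rows, keys=("track", "compound")):
    """Collapse rows with identical feature values into one row (median target)."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row["tyre_life"])
    return [dict(zip(keys, key), tyre_life=median(lives)) for key, lives in groups.items()]


# Made-up sample rows, for illustration only.
rows = [
    {"track": "Monaco", "compound": encode_compound("Soft"), "tyre_life": 30},
    {"track": "Monaco", "compound": encode_compound("Soft"), "tyre_life": 20},
    {"track": "Monaco", "compound": encode_compound("Soft"), "tyre_life": 3},
    {"track": "Monza", "compound": encode_compound("Soft"), "tyre_life": 15},
    {"track": "Monza", "compound": encode_compound("Hard"), "tyre_life": 35},
]

cleaned = drop_short_stints(rows)
normalized = normalize_per_track(cleaned)
merged = resolve_collisions(cleaned)
```

Here the two remaining Monaco Soft rows share the same features but have different tyre lives, so resolve_collisions merges them into a single row. Other strategies (keeping the most recent row, or adding a distinguishing feature) would be equally valid ways of "refining the dataset".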

Main Features

Automated extraction of F1 session data using FastF1
Data cleaning, normalization, and feature engineering scripts
Model training with LightGBM, Random Forest, and Linear Regression
Tools for encoding categorical features and handling outliers
Visualization of feature importances and tyre compound statistics
Utilities for filtering and transforming CSV datasets
Modular pipeline for experimenting with different features and models
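
As a rough sketch of what session extraction with FastF1 can look like, the snippet below loads one session's laps and counts laps per stint. It is not the project's pipeline code: the function names load_race_laps and stint_lengths are hypothetical, and the choice of columns is an assumption about what a tyre-life dataset needs. fastf1 is imported inside the loader so the counting helper can be used without it.

```python
from collections import Counter


def load_race_laps(year, event, session_type="R"):
    """Load lap records for one session via FastF1 (requires network access)."""
    import fastf1  # imported lazily; only this function needs it

    session = fastf1.get_session(year, event, session_type)
    # Skip telemetry, weather, and messages to keep the download small.
    session.load(telemetry=False, weather=False, messages=False)
    columns = ["Driver", "Stint", "Compound", "TyreLife", "LapNumber"]
    return session.laps[columns].to_dict("records")


def stint_lengths(laps):
    """Count laps per (driver, stint, compound) from an iterable of lap records."""
    counts = Counter((lap["Driver"], lap["Stint"], lap["Compound"]) for lap in laps)
    return dict(counts)
```

For example, stint_lengths(load_race_laps(2022, "Monaco")) would give one entry per driver stint for the race used in the first experiment. Laps with missing stint or compound values are not filtered here and would need handling in a real pipeline.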

Directory Structure

Dataset_Preparation/: Scripts and CSVs for data cleaning and preparation
Model_Training/: Model training scripts and baseline models
README.md: Project instructions and overview

Getting Started

Create a new virtual environment named 'venv'

python -m venv venv

Allow running local scripts for this session (PowerShell; required for the activation script)

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process

Activate the virtual environment (PowerShell)

.\venv\Scripts\Activate.ps1

(Alternative) Activate the virtual environment (Command Prompt)

venv\Scripts\activate

Install the required packages (fastf1, scikit-learn, seaborn) inside the virtual environment

pip install fastf1
pip install scikit-learn
pip install seaborn

(Alternative) Install all dependencies from requirements.txt

pip install -r requirements.txt

--- Data Set Preparation ---

To generate the all_years_sessions.csv file with all session data, run the batch pipeline script:

python batch_pipeline.py

The generated file all_years_sessions.csv will be saved in the Dataset_Preparation directory.
