I recently got into board climbing, and I enjoy climbing on the TB2 and the Kilter Board. I've been climbing on the 12x12ft boards available at my local gym, and I never expected the phrase "it hurts so good" to be so apt. I've already done an in-depth analysis of TB2 data here, and have decided to mirror that analysis with the available Kilter Board data.

Setup and Reproducibility
Part I — Data Analysis (Notebooks 01–03)
Part II — Predictive Modelling (Notebooks 04–06)
Using the Trained Model

Overview

This project analyzes ~300,000 climbs on the Kilter Board in order to do the following.

Understand hold usage patterns and difficulty distributions

Quantify empirical hold difficulty scores

Predict climb grades from hold positions and board angle

Climbing grades are inherently subjective. Different climbers use different beta, setters have different grading standards, and difficulty depends on factors not always captured in data. Moreover, on the boards, the displayed grade for any specific climb is based on user input.

Using a Kilter Board dataset, this project combines:

SQL-based data exploration
statistical analysis and visualization
feature engineering
machine learning and deep learning

The project is intentionally structured in two parts:

Part I — Data Analysis
Part II — Predictive Modelling

Project Structure

data/        # processed datasets and feature tables
images/      # saved visualizations used in README and analysis
models/      # trained models and scalers
notebooks/   # full pipeline (01–06)
scripts/     # utility + prediction scripts
sql/         # SQL exploration
README.md

Setup and Reproducibility

Requirements

pip install -r requirements.txt

Retrieving the Data

The utility BoardLib is used for interacting with climbing board APIs, and works with all Aurora Climbing boards. We'll work with the Kilter Board. I downloaded Kilter data as kilter.db, and I also downloaded the images.

# install boardlib (also in requirements.txt)
pip install boardlib

# download the database
boardlib database kilter data/kilter.db

# download the images
# this puts the images into images/product_sizes_layouts_sets
boardlib images kilter data/kilter.db images

Note. I downloaded the database in March 2026, and the data was last updated on 2026-01-22. There is no user data in this database. The image I use to overlay the heatmaps on is images/kilter-original-16x12_compose.png. It is just the two following images put together:

images/product_sizes_layouts_sets/original-16x12-bolt-ons-v2.png
images/product_sizes_layouts_sets/original-16x12-screw-ons-v2.png
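
Once the database is downloaded, a quick sanity check is to list its tables. The following is a minimal sketch using Python's built-in sqlite3 module; the helper name list_tables is hypothetical and not part of this repository, and the path in the commented usage assumes the download location used above.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all user tables in an SQLite database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Usage (assumes the database was downloaded to data/kilter.db):
# conn = sqlite3.connect("data/kilter.db")
# print(list_tables(conn))
```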

Running the project

Go to your working directory and run notebooks in order:

01 -> 02 -> 03 -> 04 -> 05 -> 06

Note:

Notebook 03 generates the hold difficulty tables
Notebook 04 generates the feature matrix
Notebook 05 trains the classical models
Notebook 06 trains the neural network

Part I — Data Analysis (Notebooks 01–03)

This section focuses on understanding the data, identifying patterns, and forming hypotheses. Note up front that we don't have any user data. We can still infer some user trends from climb features such as fa_at (when the climb was first ascended) and ascensionist_count (how many people have logged an ascent) in the climbs and climb_stats tables, but that's about it.

1. Data Overview and Climbing Statistics

There are about 30 tables in this database, about half of which contain useful information. See sql/01_data_exploration.sql for the full exploration of tables. We examine many climbing statistics, starting off with grade distribution.

Grade distribution is skewed toward mid-range climbs
Extreme difficulties are relatively rare
Multiple entries per climb reflect angle variations

2. Climbing Popularity and Temporal Patterns

Beyond structural analysis, we can also study how board-climbers behave over time (despite the lack of user data).

General uptrend in popularity over the years, both in terms of first ascents and unique setters

3. Angle vs Difficulty

Wall angle is one of the strongest predictors of difficulty
Steeper climbs tend to be harder
Significant variability remains within each angle
Things tend to stabilize past 50 degrees
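
The observations above come from grouping climbs by angle and looking at the mean and spread of difficulty within each group. Here is a minimal sketch of that kind of summary in plain Python. The function name and the sample values are made up for illustration; they are not taken from the dataset or the notebooks.

```python
from collections import defaultdict
from statistics import mean, pstdev


def difficulty_by_angle(rows):
    """Summarize difficulty per angle from (angle, difficulty) pairs."""
    grouped = defaultdict(list)
    for angle, difficulty in rows:
        grouped[angle].append(difficulty)
    return {
        angle: {"n": len(vals), "mean": mean(vals), "spread": pstdev(vals)}
        for angle, vals in sorted(grouped.items())
    }


# Made-up sample rows, for illustration only
sample = [(20, 14.0), (20, 16.0), (40, 18.0), (40, 20.0), (40, 22.0), (60, 21.0)]
for angle, stats in difficulty_by_angle(sample).items():
    print(angle, stats)
```

The spread column is what shows the variability that remains within each angle.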

4. Board Structure and Hold Usage

Hold usage is highly non-uniform
Certain board regions are heavily overrepresented
Spatial structure plays a key role in difficulty
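
Hold usage can be counted directly from the frames strings. The sketch below assumes the p<placement>r<role> token format seen in the examples later in this README, and counts each placement at most once per climb. The function name hold_usage and the two short frames strings are hypothetical.

```python
import re
from collections import Counter

# One token per hold: p<placement id>r<role id>
FRAME_TOKEN = re.compile(r"p(\d+)r(\d+)")


def hold_usage(frames_list):
    """Count in how many climbs each placement id appears."""
    counts = Counter()
    for frames in frames_list:
        # Use a set so a placement is counted once per climb
        placements = {int(p) for p, _ in FRAME_TOKEN.findall(frames)}
        counts.update(placements)
    return counts


# Two made-up climbs sharing placement 2
usage = hold_usage(["p1r12p2r13", "p2r13p3r14"])
print(usage.most_common(1))
```

Mapping these counts back onto hold coordinates is what produces the heatmaps.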

5. Hold Difficulty Estimation

Hold difficulty is estimated from climb data
We average the difficulty of the climbs each hold appears in, per role and per angle, with Bayesian smoothing

Key technique: Bayesian smoothing

Raw averages are noisy due to uneven usage. To stabilize estimates:

frequently used holds retain their empirical difficulty
rarely used holds are pulled toward the global mean

This significantly improves downstream feature quality.
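
One common form of this kind of smoothing is a weighted average of the hold's own mean and the global mean, where the weight on the hold's mean grows with its usage count n: smoothed = (n * hold_mean + m * global_mean) / (n + m). The sketch below implements that form. The prior strength m and the numbers are assumptions for illustration; the exact formula and constants used in Notebook 03 may differ.

```python
def smoothed_mean(hold_sum, hold_count, global_mean, prior_strength=20.0):
    """Shrink a hold's mean difficulty toward the global mean.

    hold_sum / hold_count is the raw mean; prior_strength (m) acts like
    m pseudo-observations at the global mean. The value 20 is an assumption.
    """
    return (hold_sum + prior_strength * global_mean) / (hold_count + prior_strength)


global_mean = 18.0  # made-up value

# Frequently used hold: 1000 climbs with raw mean 22 stays close to 22
popular = smoothed_mean(22.0 * 1000, 1000, global_mean)

# Rarely used hold: 2 climbs with raw mean 26 is pulled toward 18
rare = smoothed_mean(26.0 * 2, 2, global_mean)

print(round(popular, 2), round(rare, 2))
```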

6. Many more

There are many other statistics; see notebooks 01 (climbing statistics), 02 (climbing hold statistics), and 03 (hold difficulty). Included are:

Time-Date analysis based on fa_at. We include month, day of week, and time analysis based on first ascent log data. Winter months are the most popular, and Tuesday is the most popular day of the week.
Distribution of climbs per angle, with 40 degrees being the most common.
Distribution of climb quality, along with the relationship between quality & angle + grade.
"Match" vs "No Match" analysis (whether or not you can match your hands on a hold). "No match" climbs are fewer, but they are harder and have more ascensionists.
Prolific statistics: most popular routes & setters
Per-Angle, Per-Grade hold frequency & difficulty analyses
more!
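
As an illustration of the time-date analysis, here is a minimal sketch that counts first ascents per day of the week. It assumes fa_at values are ISO-like timestamp strings such as 2024-01-02 18:30:00; the function name and the sample timestamps are made up.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_counts(timestamps):
    """Count timestamps per day of the week (assumes ISO-like strings)."""
    counts = Counter()
    for ts in timestamps:
        # datetime.weekday() returns 0 for Monday through 6 for Sunday
        counts[DAY_NAMES[datetime.fromisoformat(ts).weekday()]] += 1
    return counts


# Made-up timestamps, for illustration only
counts = weekday_counts([
    "2024-01-02 18:30:00",
    "2024-01-09 19:05:12",
    "2024-01-06 10:00:00",
])
print(counts.most_common())
```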

Part II — Predictive Modelling (Notebooks 04–06)

This section focuses on building predictive models and evaluating performance. We will build features from the angle and frames of a climb (the frames feature of a climb tells us which hold to use and which role it plays).
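
To make the frames format concrete, here is a minimal sketch of a parser. It assumes each hold is encoded as p<placement id>r<role id>, which matches the example strings used later in this README. The function names are hypothetical and this is not the parser used in scripts/predict.py.

```python
import re
from collections import defaultdict

# One token per hold: p<placement id>r<role id>
FRAME_TOKEN = re.compile(r"p(\d+)r(\d+)")


def parse_frames(frames):
    """Split a frames string into (placement_id, role_id) integer pairs."""
    return [(int(p), int(r)) for p, r in FRAME_TOKEN.findall(frames)]


def holds_by_role(frames):
    """Group placement ids by role id."""
    grouped = defaultdict(list)
    for placement, role in parse_frames(frames):
        grouped[role].append(placement)
    return dict(grouped)


frames = "p1131r15p1168r12p1169r12p1237r13p1287r13p1300r13p1385r14"
print(parse_frames(frames)[:2])
print(holds_by_role(frames))
```

Joining the placement ids against the board layout tables then yields the coordinates that the features are built from.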

7. Feature Engineering

Features are constructed at the climb level using only structural and geometric information derived from the climb definition (angle and frames).

We explicitly avoid using hold-difficulty-derived features in the predictive models to prevent target leakage.

Feature categories include:

Geometry — spatial footprint of the climb (height, spread, convex hull)
Movement — reach distances and spatial relationships between holds
Density — how tightly or sparsely holds are arranged
Symmetry — left/right balance and distribution
Path structure — approximations of movement flow and efficiency
Normalized position — relative positioning on the board
Interaction features — simple nonlinear combinations (e.g., angle × hold count)

This results in a leakage-free feature set that better reflects the physical structure of climbing.

Category	Description	Examples
Geometry	Shape and size of climb	bbox_area, range_x, range_y
Movement	Reach and movement structure	mean_hand_reach, path_efficiency
Density	Hold spacing and compactness	hold_density, holds_per_vertical_foot
Symmetry	Left/right balance	symmetry_score, left_ratio
Path	Approximate movement trajectory	path_length_vertical
Position	Relative board positioning	mean_y_normalized, start_height_normalized
Distribution	Vertical distribution of holds	y_q75, y_iqr
Interaction	Nonlinear feature combinations	angle_squared, angle_x_holds
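
A few of these features are simple enough to sketch directly from hold coordinates. The definitions below are my best guess at what the feature names mean (for example, hold_density as holds per unit of bounding-box area); the exact definitions used in Notebook 04 are in data/04_climb_features/feature_explanations.txt and may differ. The coordinates are made up.

```python
def geometry_features(holds, angle):
    """Compute a handful of geometry and interaction features.

    holds: list of (x, y) coordinates, one per hold
    angle: wall angle in degrees
    """
    xs = [x for x, _ in holds]
    ys = [y for _, y in holds]
    range_x = max(xs) - min(xs)
    range_y = max(ys) - min(ys)
    bbox_area = range_x * range_y
    return {
        "range_x": range_x,
        "range_y": range_y,
        "bbox_area": bbox_area,
        # Assumed definition: holds per unit of bounding-box area
        "hold_density": len(holds) / bbox_area if bbox_area else 0.0,
        "angle_squared": angle ** 2,
        "angle_x_holds": angle * len(holds),
    }


# Made-up coordinates, for illustration only
features = geometry_features([(0, 0), (4, 0), (4, 3)], angle=40)
print(features)
```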

Important design decision

The dataset is restricted to climbs with angle ≤ 55° to reduce variability and improve consistency (see Angle vs Difficulty, where the average climb grade seems to stabilize or even drop above 50°).

Important: Leakage and Feature Design

Earlier iterations of this project included features derived from hold difficulty scores (computed from climb grades). While these features slightly improved predictive performance, they introduce a form of target leakage if computed globally.

In this version of the project:

Hold difficulty scores are still computed in Notebook 03 for exploratory analysis
Predictive models (Notebooks 04–06) use only leakage-free features
No feature is derived from the target variable (display_difficulty)

This allows the model to learn from the structure of climbs themselves, rather than from aggregated statistics of the labels.

Note: Hold-difficulty-based features can still be valid in a production setting if computed strictly from historical (training) data, similar to target encoding techniques.
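
As a sketch of what "computed strictly from training data" could look like: fit the per-hold means on the training climbs only, then use those fixed values to encode any other climb, falling back to the training global mean for holds never seen in training. The function names and the toy data are hypothetical, and this is not code from the notebooks.

```python
from collections import defaultdict
from statistics import mean


def fit_hold_means(train_climbs):
    """Learn per-hold mean difficulty from training climbs only.

    train_climbs: list of (hold_ids, difficulty) pairs
    """
    per_hold = defaultdict(list)
    for holds, difficulty in train_climbs:
        for hold in holds:
            per_hold[hold].append(difficulty)
    global_mean = mean(d for _, d in train_climbs)
    return {h: mean(v) for h, v in per_hold.items()}, global_mean


def encode_climb(holds, hold_means, global_mean):
    """Average the learned hold scores; unseen holds get the global mean."""
    return mean(hold_means.get(h, global_mean) for h in holds)


# Toy training set; the climb being encoded is not part of it
train = [([1, 2], 10.0), ([2, 3], 20.0)]
hold_means, global_mean = fit_hold_means(train)
feature = encode_climb([1, 4], hold_means, global_mean)
print(hold_means, global_mean, feature)
```

The key point is that the labels of the climbs being predicted never enter fit_hold_means.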

8. Feature Relationships

Here are some relationships between features and difficulty:

higher angles allow for harder difficulties
distance between holds seems to relate to difficulty
geometric and structural features capture non-trivial climbing patterns

We have a full feature list in data/04_climb_features/feature_list.txt. Explanations are available in data/04_climb_features/feature_explanations.txt.

9. Predictive Models

Models tested:

Linear Regression
Ridge
Lasso
Random Forest
Gradient Boosting
Neural Networks

Feature importance

Key drivers:

wall angle
reach-based features (e.g., mean/max hand reach)
spatial density and distribution
geometric structure of the climb

This confirms that difficulty is strongly tied to spatial arrangement and movement constraints, rather than just individual hold properties.

10. Model Performance

Results (in terms of V-grade)

The Random Forest (RF) and neural network (NN) models performed similarly.

~70% within ±1 V-grade (~36% within ±1 difficulty score)
~90% within ±2 V-grades (~65% within ±2 difficulty scores)

In earlier experiments, we were able to achieve ~83% within one V-grade and ~96% within two. However, that setup used the hold difficulties from Notebook 03, which are derived from climbing grades, creating leakage. The current result is more realistic and more independent: the model relies purely on spatial and structural information, without access to hold-based information or beta.

This demonstrates that a substantial portion of climbing difficulty can be attributed to geometry and movement constraints.

Interpretation

Models capture meaningful trends
Exact prediction is difficult due to:
- subjective grading
- missing beta (movement sequences)
- climber variability

Results Summary

Metric	Performance
Within ±1 V-grade	~70%
Within ±2 V-grades	~90%

The model can still predict subgrades (e.g., V3 contains 6a and 6a+), but it is not as accurate.

Metric	Performance
Within ±1 difficulty-grade	~36%
Within ±2 difficulty-grades	~65%
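
The gap between the two tables comes from the V-grade scale being coarser than the difficulty score: several difficulty scores map to one V-grade, so a prediction that misses by two scores can still be within one V-grade. The sketch below illustrates this with a partial mapping. The entries for 16 and 17 follow the V3 example above; the other entries, the function name, and the sample predictions are assumptions for illustration only.

```python
# Partial difficulty score -> V-grade mapping.
# 16 and 17 -> V3 follow the example above; the rest are assumptions.
V_GRADE = {15: 2, 16: 3, 17: 3, 18: 4, 19: 4}


def fraction_within(true, pred, k):
    """Fraction of predictions whose absolute error is at most k."""
    hits = sum(abs(t - p) <= k for t, p in zip(true, pred))
    return hits / len(true)


# Made-up true and predicted difficulty scores
true_scores = [16, 17, 18, 15]
pred_scores = [18, 19, 16, 16]

score_acc = fraction_within(true_scores, pred_scores, 1)
v_acc = fraction_within(
    [V_GRADE[s] for s in true_scores],
    [V_GRADE[s] for s in pred_scores],
    1,
)
print(score_acc, v_acc)
```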

Limitations

No explicit movement / beta information
Grading inconsistency
No climber-specific features
Dataset noise

Future Work

Unified grade prediction across boards
Test other models
Better spatial features
GUI to create a climb and instantly show its predicted difficulty

Using the Trained Model

Load model in Python

import joblib

model = joblib.load('models/random_forest_tuned.pkl')

Predict from feature matrix

import pandas as pd

df = pd.read_csv('data/04_climb_features/climb_features.csv')
X = df.drop(columns=['climb_uuid', 'display_difficulty'])

predictions = model.predict(X)

Model files

models/random_forest_tuned.pkl — trained Random Forest

Using the Prediction Script

The repository includes a prediction script that can estimate climb difficulty directly from:

wall angle
frames string
optional metadata such as is_nomatch and description

The script reconstructs the engineered feature vector used during training, applies the selected model, and returns:

predicted numeric difficulty
rounded display difficulty
mapped boulder grade

Supported models

The script supports the following trained models:

random_forest — default and recommended
linear
ridge
lasso
nn — alias for the best neural network checkpoint
nn_best

Single-climb prediction

Example:

python scripts/predict.py --angle 35 --frames 'p1084r15p1146r12p1163r12p1206r15p1214r13p1231r13p1236r13p1242r15p1256r13p1270r13p1307r13p1324r13p1361r13p1395r14' --model random_forest

Example output:

{
    'predicted_numeric': 16.4272248633235,
    'predicted_display_difficulty': 16,
    'predicted_boulder_grade': '6a/V3',
    'model': 'random_forest'
}

You can also use the neural network:

python scripts/predict.py --angle 40 --frames 'p1084r15p1094r12p1163r12p1231r13p1236r13p1256r13p1270r13p1324r13p1361r13p1395r14p1498r15p1499r15' --model nn

Batch prediction from CSV

The same script can run predictions for an entire CSV file.

Required columns

angle
frames

Optional columns

is_nomatch
description

Example input CSV

angle,frames,is_nomatch,description
40,p1131r15p1168r12p1169r12p1237r13p1287r13p1300r13p1385r14,0,
35,p1171r15p1208r15p1239r12p1289r12p1302r13p1353r13p1384r14p1389r15,1,no matching
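
Before running a batch, it can help to check that a CSV has the required columns. The following is a minimal sketch using only the standard library. The function load_climbs and its defaults for the optional columns are hypothetical; scripts/predict.py does its own parsing, which may behave differently.

```python
import csv
import io

REQUIRED_COLUMNS = {"angle", "frames"}


def load_climbs(csv_text):
    """Parse CSV text into row dicts, checking the required columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    rows = []
    for row in reader:
        rows.append({
            "angle": int(row["angle"]),
            "frames": row["frames"],
            # Optional columns: assumed defaults of 0 and empty string
            "is_nomatch": int(row.get("is_nomatch") or 0),
            "description": row.get("description") or "",
        })
    return rows


sample_csv = (
    "angle,frames,is_nomatch,description\n"
    "40,p1131r15p1168r12p1169r12p1237r13p1287r13p1300r13p1385r14,0,\n"
    "35,p1171r15p1208r15p1239r12p1289r12p1302r13p1353r13p1384r14p1389r15,1,no matching\n"
)
rows = load_climbs(sample_csv)
print(len(rows), rows[1]["description"])
```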

Run batch prediction

python scripts/predict.py --input_csv data/new_climbs.csv --output_csv data/new_climbs_with_predictions.csv --model random_forest

This appends prediction columns to the original CSV, including:

predicted_numeric
predicted_display_difficulty
predicted_boulder_grade
model

Evaluate predictions on labeled data

If your CSV also contains a true difficulty column named display_difficulty, the script can compute simple evaluation metrics:

python scripts/predict.py --input_csv data/test_climbs.csv --output_csv data/test_preds.csv --model random_forest --evaluate

Reported metrics include:

mean absolute error
RMSE
fraction within ±1 grade
fraction within ±2 grades
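
For reference, these metrics can be computed as in the minimal sketch below. The function name evaluate and the sample values are made up, and the sketch measures the tolerance in whatever unit the inputs use; whether the script's ±1 and ±2 refer to difficulty scores or V-grades is not specified here.

```python
import math


def evaluate(true, pred):
    """Compute MAE, RMSE, and the fraction of predictions within 1 and 2."""
    errors = [p - t for t, p in zip(true, pred)]
    n = len(errors)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "within_1": sum(abs(e) <= 1 for e in errors) / n,
        "within_2": sum(abs(e) <= 2 for e in errors) / n,
    }


# Made-up values: errors are 0, 1, 2, and 3
metrics = evaluate([16, 18, 20, 22], [16, 19, 22, 25])
print(metrics)
```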

Python usage

You can also call the prediction function directly:

from scripts.predict import predict

result = predict(
    angle=40,
    frames="p1131r15p1168r12p1169r12p1237r13p1287r13p1300r13p1385r14",
    model_name="random_forest"
)

print(result)

Notes

random_forest is the recommended default model for practical use.
Linear, ridge, lasso, and neural network models are included for comparison.
The prediction pipeline depends on the same engineered features used during model training, so the script internally reconstructs these from raw route input.
The neural network checkpoints are loaded from saved PyTorch state dictionaries using the architecture defined in the project.

License

This project is licensed under the MIT License. See the LICENSE file for details.

The project is for educational purposes. Climb data belongs to Kilter.


Kilter Board: Predicting Climbing Route Difficulty from Board Data