deep learning notebook
.gitignore (vendored, new file, 22 lines)
@@ -0,0 +1,22 @@
__pycache__/
.ipynb_checkpoints/
*.pyc
.env
.venv/
venv/
.DS_Store
.vscode/

data/*.csv
data/*.parquet
data/*.db
data/03_hold_difficulty/*.csv
data/04_climb_features/*.csv
data/05_predictive_modelling/*.csv
data/06_deep_learning/*.csv
data/06_deep_learning/*.npy
models/*.pth
models/*.pkl
models/*.csv

BLOG.md
BLOG.md
@@ -62,7 +62,7 @@ The utility [`BoardLib`](https://github.com/lemeryfertitta/BoardLib) is used for
 We'll work with the Tension Board 2. I downloaded TB2 data as `tb2.db`, and I also downloaded the images.
 
 ```bash
-# install boardlib
+# install boardlib (also in requirements.txt)
 pip install boardlib
 
 # download the database
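Once `tb2.db` is downloaded, it can help to confirm what tables BoardLib actually wrote before querying it. A minimal sketch using only the standard library; `list_tables` is a hypothetical helper name, and no particular schema is assumed:

```python
import sqlite3

def list_tables(db_path):
    """Return the names of all tables in a SQLite database file."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]

# e.g. list_tables("tb2.db") after the download step above
```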
data/05_predictive_modelling/model_summary.txt (new file, 35 lines)
@@ -0,0 +1,35 @@
### Model Performance Summary

| Model | MAE | RMSE | R² | Within ±1 | Within ±2 | Exact V | Within ±1 V |
|-------|-----|------|----|-----------|-----------|---------|-------------|
| Linear Regression | 1.467 | 1.882 | 0.782 | 42.6% | 73.3% | 34.9% | 79.4% |
| Ridge Regression | 1.467 | 1.882 | 0.782 | 42.6% | 73.3% | 34.9% | 79.4% |
| Lasso Regression | 1.475 | 1.891 | 0.780 | 42.2% | 73.0% | 34.6% | 79.3% |
| Random Forest (Tuned) | 1.325 | 1.718 | 0.818 | 47.0% | 77.7% | 38.6% | 83.0% |

### Key Findings

1. **Tree-based models remain strongest on this structured feature set.**
   - Random Forest (Tuned) achieves the best overall balance of MAE, RMSE, and grouped V-grade performance.
   - Linear models remain useful baselines but leave clear nonlinear signal unexplained.

2. **Fine-grained difficulty prediction is meaningfully harder than grouped grade prediction.**
   - On the held-out test set, the best model is within ±1 fine-grained difficulty score 47.0% of the time.
   - The same model is within ±1 grouped V-grade 83.0% of the time.

3. **This gap is expected and informative.**
   - Small numeric errors often stay inside the same or adjacent V-grade buckets.
   - The model captures broad difficulty bands more reliably than exact score distinctions.

4. **The project's main predictive takeaway is practical rather than perfect.**
   - The models are not exact grade replicators.
   - They are reasonably strong at placing climbs into the correct neighborhood of difficulty.

### Portfolio Interpretation

From a modelling perspective, this project shows:

- feature engineering grounded in domain structure,
- comparison of linear and nonlinear models,
- honest evaluation on a held-out test set,
- and the ability to translate raw regression performance into climbing-relevant grouped V-grade metrics.
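The "Within ±k" style columns in the summary above can be recomputed from raw predictions in a few lines. A hedged sketch; `within_k_accuracy` and the arrays below are illustrative inventions, not the project's actual code or data:

```python
def within_k_accuracy(y_true, y_pred, k=1):
    """Fraction of predictions whose rounded value lands within +/- k of the true grade."""
    hits = sum(abs(round(p) - t) <= k for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Illustrative values only (not the project's data).
y_true = [10, 12, 15, 18, 20]
y_pred = [10.4, 13.6, 14.8, 18.9, 22.5]

exact = within_k_accuracy(y_true, y_pred, k=0)         # rounded prediction equals true grade
plus_minus_1 = within_k_accuracy(y_true, y_pred, k=1)  # rounded prediction within one grade
```

The same helper covers the grouped V-grade columns if `y_true`/`y_pred` are first mapped into grouped V-grade buckets.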
data/06_deep_learning/neural_network_summary.txt (new file, 33 lines)
@@ -0,0 +1,33 @@
### Neural Network Model Summary

**Architecture:**
- Input: 119 features
- Hidden layers: [256, 128, 64]
- Dropout rate: 0.2
- Total parameters: 72,833

**Training:**
- Optimizer: Adam (lr=0.001)
- Early stopping: 25 epochs patience
- Best epoch: 121

**Test Set Performance:**
- MAE: 1.270
- RMSE: 1.643
- R²: 0.834
- Accuracy within ±1 grade: 49.0%
- Accuracy within ±2 grades: 80.2%
- Exact grouped V-grade accuracy: 39.2%
- Accuracy within ±1 V-grade: 84.3%
- Accuracy within ±2 V-grades: 96.8%

**Key Findings:**

1. The neural network is competitive, but not clearly stronger than the best tree-based baseline.
2. Fine-grained score prediction remains harder than grouped grade prediction.
3. The grouped V-grade metrics show that the model captures broader difficulty bands more reliably than exact score labels.
4. This makes the neural network useful as a comparison model, and potentially valuable in an ensemble.

**Portfolio Interpretation:**

This deep learning notebook extends the classical modelling pipeline by testing whether a neural architecture can improve prediction quality on engineered climbing features. The main result is not that deep learning wins outright, but that it provides a meaningful benchmark and helps clarify where model complexity does and does not add value.
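The stated layer sizes match the reported 72,833 parameters only if each hidden Linear layer is also followed by an affine BatchNorm1d (an assumption — the summary does not name a normalization layer; dropout itself adds no parameters). A quick sanity check in plain Python:

```python
def mlp_param_count(in_features, hidden, out_features=1, batchnorm=True):
    """Count trainable parameters in an MLP of Linear (+ optional BatchNorm1d) layers."""
    sizes = [in_features] + list(hidden) + [out_features]
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out   # Linear: weight matrix + bias vector
    if batchnorm:
        total += sum(2 * h for h in hidden)   # BatchNorm1d: affine weight + bias per hidden layer
    return total

# 119 inputs, hidden layers [256, 128, 64], one regression output
print(mlp_param_count(119, [256, 128, 64]))  # 72833, matching the summary
```

Without the BatchNorm assumption the count is 71,937, so the extra 896 parameters are exactly 2 × (256 + 128 + 64).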
[6 image files changed (binary); sizes before → after: 56 → 55 KiB, 32 → 37 KiB, 100 → 100 KiB, 467 → 473 KiB, 108 → 104 KiB, 57 → 57 KiB]
models/feature_names.txt (new file, 119 lines)
@@ -0,0 +1,119 @@
angle
total_holds
hand_holds
foot_holds
start_holds
finish_holds
middle_holds
is_nomatch
mean_x
mean_y
std_x
std_y
range_x
range_y
min_y
max_y
start_height
start_height_min
start_height_max
finish_height
finish_height_min
finish_height_max
height_gained
height_gained_start_finish
bbox_area
bbox_aspect_ratio
bbox_normalized_area
hold_density
holds_per_vertical_foot
left_holds
right_holds
left_ratio
symmetry_score
hand_left_ratio
hand_symmetry
upper_holds
lower_holds
upper_ratio
max_hand_reach
min_hand_reach
mean_hand_reach
std_hand_reach
hand_spread_x
hand_spread_y
max_foot_spread
mean_foot_spread
foot_spread_x
foot_spread_y
max_hand_to_foot
min_hand_to_foot
mean_hand_to_foot
std_hand_to_foot
mean_hold_difficulty
max_hold_difficulty
min_hold_difficulty
std_hold_difficulty
median_hold_difficulty
difficulty_range
mean_hand_difficulty
max_hand_difficulty
std_hand_difficulty
mean_foot_difficulty
max_foot_difficulty
std_foot_difficulty
start_difficulty
finish_difficulty
hand_foot_ratio
movement_density
hold_com_x
hold_com_y
weighted_difficulty
convex_hull_area
convex_hull_perimeter
hull_area_to_bbox_ratio
min_nn_distance
mean_nn_distance
max_nn_distance
std_nn_distance
mean_neighbors_12in
max_neighbors_12in
clustering_ratio
path_length_vertical
path_efficiency
difficulty_gradient
lower_region_difficulty
middle_region_difficulty
upper_region_difficulty
difficulty_progression
max_difficulty_jump
mean_difficulty_jump
difficulty_weighted_reach
max_weighted_reach
mean_x_normalized
mean_y_normalized
std_x_normalized
std_y_normalized
start_height_normalized
finish_height_normalized
start_offset_from_typical
finish_offset_from_typical
mean_y_relative_to_start
max_y_relative_to_start
spread_x_normalized
spread_y_normalized
bbox_coverage_x
bbox_coverage_y
y_q25
y_q50
y_q75
y_iqr
holds_bottom_quartile
holds_top_quartile
angle_x_holds
angle_x_difficulty
angle_squared
difficulty_x_height
difficulty_x_density
complexity_score
hull_area_x_difficulty
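Many of the names above are simple geometric aggregates of hold positions. A hedged sketch of a few of them, using hypothetical `(x, y)` hold coordinates; the project's real formulas and units may differ:

```python
def basic_geometry_features(holds):
    """Compute a few of the listed geometric features from (x, y) hold positions."""
    xs = [x for x, _ in holds]
    ys = [y for _, y in holds]
    range_x = max(xs) - min(xs)
    range_y = max(ys) - min(ys)
    bbox_area = range_x * range_y
    return {
        "total_holds": len(holds),
        "mean_x": sum(xs) / len(xs),
        "mean_y": sum(ys) / len(ys),
        "range_x": range_x,
        "range_y": range_y,
        "bbox_area": bbox_area,
        "bbox_aspect_ratio": range_x / range_y if range_y else 0.0,
        "hold_density": len(holds) / bbox_area if bbox_area else 0.0,
    }

# Hypothetical hold layout for illustration
feats = basic_geometry_features([(0, 0), (4, 2), (2, 8), (6, 10)])
```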