deep learning notebook

2026-03-26 21:07:12 -04:00
parent 09454ba38b
commit 7e09141766
13 changed files with 9259 additions and 43 deletions
@@ -0,0 +1,22 @@
 __pycache__/
 .ipynb_checkpoints/
 *.pyc
 .env
 .venv/
 venv/
 .DS_Store
 .vscode/
 data/*.csv
 data/*.parquet
 data/*.db
 data/03_hold_difficulty/*.csv
 data/04_climb_features/*.csv
 data/05_predictive_modelling/*.csv
 data/06_deep_learning/*.csv
 data/06_deep_learning/*.npy
 models/*.pth
 models/*.pkl
 models/*.csv
 BLOG.md
@@ -62,7 +62,7 @@ The utility [`BoardLib`](https://github.com/lemeryfertitta/BoardLib) is used for
 We'll work with the Tension Board 2. I downloaded TB2 data as `tb2.db`, and I also downloaded the images.
 ```bash
-# install boardlib
+# install boardlib (also in requirements.txt)
 pip install boardlib
 # download the database
@@ -0,0 +1,35 @@
 ### Model Performance Summary
 | Model | MAE | RMSE | R² | Within ±1 score | Within ±2 score | Exact V-grade | Within ±1 V-grade |
 |-------|-----|------|----|-----------|-----------|---------|-------------|
 | Linear Regression | 1.467 | 1.882 | 0.782 | 42.6% | 73.3% | 34.9% | 79.4% |
 | Ridge Regression | 1.467 | 1.882 | 0.782 | 42.6% | 73.3% | 34.9% | 79.4% |
 | Lasso Regression | 1.475 | 1.891 | 0.780 | 42.2% | 73.0% | 34.6% | 79.3% |
 | Random Forest (Tuned) | 1.325 | 1.718 | 0.818 | 47.0% | 77.7% | 38.6% | 83.0% |
 ### Key Findings
 1. **Tree-based models remain strongest on this structured feature set.**
   - Random Forest (Tuned) achieves the best overall balance of MAE, RMSE, and grouped V-grade performance.
   - Linear models remain useful baselines but leave clear nonlinear signal unexplained.
 2. **Fine-grained difficulty prediction is meaningfully harder than grouped grade prediction.**
   - On the held-out test set, the best model is within ±1 fine-grained difficulty score 47.0% of the time.
   - The same model is within ±1 grouped V-grade 83.0% of the time.
 3. **This gap is expected and informative.**
   - Small numeric errors often stay inside the same or adjacent V-grade buckets.
   - The model captures broad difficulty bands more reliably than exact score distinctions.
 4. **The project’s main predictive takeaway is practical rather than perfect.**
   - The models are not exact grade replicators.
   - They are reasonably strong at placing climbs into the correct neighborhood of difficulty.
 ### Portfolio Interpretation
 From a modelling perspective, this project shows:
 - feature engineering grounded in domain structure,
 - comparison of linear and nonlinear models,
 - honest evaluation on a held-out test set,
 - and the ability to translate raw regression performance into climbing-relevant grouped V-grade metrics.
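
The table columns above can be reproduced with a few lines of plain Python. This is a minimal sketch, not the notebook's actual evaluation code: the function name `regression_report` and the optional `group` callable are hypothetical, and it assumes "within ±k" means the unrounded absolute error is at most k. The mapping from fine-grained score to V-grade is not shown in this commit, so it is left as a caller-supplied function.

```python
import math


def regression_report(y_true, y_pred, tolerances=(1, 2), group=None):
    """Return MAE, RMSE, R², and the share of predictions within each tolerance.

    `group` is a hypothetical callable (e.g. score -> V-grade) applied to both
    the targets and the predictions before the tolerance check.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    report = {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R² is undefined when every target is identical.
        "r2": 1 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }

    # Tolerance metrics are computed on grouped values when `group` is given.
    if group is not None:
        errors = [group(p) - group(t) for t, p in zip(y_true, y_pred)]
    for tol in tolerances:
        report[f"within_{tol}"] = sum(abs(e) <= tol for e in errors) / n
    return report


# Toy values for illustration only; these are not project data.
demo = regression_report([10, 12, 14, 16], [10.5, 14, 13, 19])
print(demo)
```

Passing a grouping function that maps neighbouring scores to the same bucket makes the tolerance check coarser, which is why grouped accuracy comes out higher than fine-grained accuracy for the same predictions.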
@@ -0,0 +1,33 @@
 ### Neural Network Model Summary
 **Architecture:**
 - Input: 119 features
 - Hidden layers: [256, 128, 64]
 - Dropout rate: 0.2
 - Total parameters: 72,833
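
The parameter total can be checked by hand. The sketch below assumes a fully connected network with a single regression output and a batch-normalisation layer (learnable scale and shift) after each hidden layer. The notebook code is not part of this diff, so that layout is an assumption, but it is the one under which the arithmetic matches the stated 72,833. The helper `mlp_parameter_count` is hypothetical. Dropout contributes no parameters.

```python
def mlp_parameter_count(n_inputs, hidden, n_outputs=1, batch_norm=True):
    """Count trainable parameters in a fully connected network."""
    total = 0
    fan_in = n_inputs
    for width in hidden:
        total += fan_in * width + width  # dense weights + biases
        if batch_norm:
            total += 2 * width  # batch-norm scale and shift (assumed layout)
        fan_in = width
    total += fan_in * n_outputs + n_outputs  # output layer
    return total


with_bn = mlp_parameter_count(119, [256, 128, 64])
without_bn = mlp_parameter_count(119, [256, 128, 64], batch_norm=False)
print(with_bn)     # 72833
print(without_bn)  # 71937
```

The 896-parameter difference is exactly 2 × (256 + 128 + 64), which is what one scale and one shift per hidden unit would add.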
 **Training:**
 - Optimizer: Adam (lr=0.001)
 - Early stopping: patience of 25 epochs
 - Best epoch: 121
 **Test Set Performance:**
 - MAE: 1.270
 - RMSE: 1.643
 - R²: 0.834
 - Accuracy within ±1 fine-grained difficulty score: 49.0%
 - Accuracy within ±2 fine-grained difficulty scores: 80.2%
 - Exact grouped V-grade accuracy: 39.2%
 - Accuracy within ±1 V-grade: 84.3%
 - Accuracy within ±2 V-grades: 96.8%
 **Key Findings:**
 1. The neural network is competitive with the tuned Random Forest baseline (MAE 1.270 vs. 1.325, R² 0.834 vs. 0.818), but the margin is too small to call it clearly stronger.
 2. Fine-grained score prediction remains harder than grouped grade prediction.
 3. The grouped V-grade metrics show that the model captures broader difficulty bands more reliably than exact score labels.
 4. This makes the neural network useful as a comparison model, and potentially valuable in an ensemble.
 **Portfolio Interpretation:**
 This deep learning notebook extends the classical modelling pipeline by testing whether a neural architecture can improve prediction quality on engineered climbing features.
 The main result is not that deep learning wins outright, but that it provides a meaningful benchmark and helps clarify where model complexity does and does not add value.
@@ -0,0 +1,119 @@
 angle
 total_holds
 hand_holds
 foot_holds
 start_holds
 finish_holds
 middle_holds
 is_nomatch
 mean_x
 mean_y
 std_x
 std_y
 range_x
 range_y
 min_y
 max_y
 start_height
 start_height_min
 start_height_max
 finish_height
 finish_height_min
 finish_height_max
 height_gained
 height_gained_start_finish
 bbox_area
 bbox_aspect_ratio
 bbox_normalized_area
 hold_density
 holds_per_vertical_foot
 left_holds
 right_holds
 left_ratio
 symmetry_score
 hand_left_ratio
 hand_symmetry
 upper_holds
 lower_holds
 upper_ratio
 max_hand_reach
 min_hand_reach
 mean_hand_reach
 std_hand_reach
 hand_spread_x
 hand_spread_y
 max_foot_spread
 mean_foot_spread
 foot_spread_x
 foot_spread_y
 max_hand_to_foot
 min_hand_to_foot
 mean_hand_to_foot
 std_hand_to_foot
 mean_hold_difficulty
 max_hold_difficulty
 min_hold_difficulty
 std_hold_difficulty
 median_hold_difficulty
 difficulty_range
 mean_hand_difficulty
 max_hand_difficulty
 std_hand_difficulty
 mean_foot_difficulty
 max_foot_difficulty
 std_foot_difficulty
 start_difficulty
 finish_difficulty
 hand_foot_ratio
 movement_density
 hold_com_x
 hold_com_y
 weighted_difficulty
 convex_hull_area
 convex_hull_perimeter
 hull_area_to_bbox_ratio
 min_nn_distance
 mean_nn_distance
 max_nn_distance
 std_nn_distance
 mean_neighbors_12in
 max_neighbors_12in
 clustering_ratio
 path_length_vertical
 path_efficiency
 difficulty_gradient
 lower_region_difficulty
 middle_region_difficulty
 upper_region_difficulty
 difficulty_progression
 max_difficulty_jump
 mean_difficulty_jump
 difficulty_weighted_reach
 max_weighted_reach
 mean_x_normalized
 mean_y_normalized
 std_x_normalized
 std_y_normalized
 start_height_normalized
 finish_height_normalized
 start_offset_from_typical
 finish_offset_from_typical
 mean_y_relative_to_start
 max_y_relative_to_start
 spread_x_normalized
 spread_y_normalized
 bbox_coverage_x
 bbox_coverage_y
 y_q25
 y_q50
 y_q75
 y_iqr
 holds_bottom_quartile
 holds_top_quartile
 angle_x_holds
 angle_x_difficulty
 angle_squared
 difficulty_x_height
 difficulty_x_density
 complexity_score
 hull_area_x_difficulty