fixed leakage

2026-03-28 16:03:04 -04:00
parent 880272aaf5
commit 3ab9b77bb7
36 changed files with 2296 additions and 681 deletions
@@ -1,4 +1,5 @@
 angle
+angle_squared
 total_holds
 hand_holds
 foot_holds
@@ -6,7 +7,6 @@ start_holds
 finish_holds
 middle_holds
 is_nomatch
-mean_x
 mean_y
 std_x
 std_y
@@ -14,107 +14,36 @@ range_x
 range_y
 min_y
 max_y
-start_height
-start_height_min
-start_height_max
-finish_height
-finish_height_min
-finish_height_max
 height_gained
 height_gained_start_finish
 bbox_area
-bbox_aspect_ratio
-bbox_normalized_area
 hold_density
 holds_per_vertical_foot
-left_holds
-right_holds
 left_ratio
 symmetry_score
-hand_left_ratio
-hand_symmetry
-upper_holds
-lower_holds
 upper_ratio
-max_hand_reach
-min_hand_reach
 mean_hand_reach
+max_hand_reach
 std_hand_reach
 hand_spread_x
 hand_spread_y
-max_foot_spread
-mean_foot_spread
-foot_spread_x
-foot_spread_y
-max_hand_to_foot
 min_hand_to_foot
 mean_hand_to_foot
 std_hand_to_foot
-mean_hold_difficulty
-max_hold_difficulty
-min_hold_difficulty
-std_hold_difficulty
-median_hold_difficulty
-difficulty_range
-mean_hand_difficulty
-max_hand_difficulty
-std_hand_difficulty
-mean_foot_difficulty
-max_foot_difficulty
-std_foot_difficulty
-start_difficulty
-finish_difficulty
-hand_foot_ratio
-movement_density
-hold_com_x
-hold_com_y
-weighted_difficulty
 convex_hull_area
-convex_hull_perimeter
 hull_area_to_bbox_ratio
-min_nn_distance
-mean_nn_distance
-max_nn_distance
-std_nn_distance
-mean_neighbors_12in
-max_neighbors_12in
-clustering_ratio
+mean_pairwise_distance
+std_pairwise_distance
 path_length_vertical
 path_efficiency
-difficulty_gradient
-lower_region_difficulty
-middle_region_difficulty
-upper_region_difficulty
-difficulty_progression
-max_difficulty_jump
-mean_difficulty_jump
-difficulty_weighted_reach
-max_weighted_reach
-mean_x_normalized
 mean_y_normalized
-std_x_normalized
-std_y_normalized
 start_height_normalized
 finish_height_normalized
-start_offset_from_typical
-finish_offset_from_typical
 mean_y_relative_to_start
-max_y_relative_to_start
 spread_x_normalized
 spread_y_normalized
-bbox_coverage_x
-bbox_coverage_y
-y_q25
-y_q50
 y_q75
 y_iqr
-holds_bottom_quartile
-holds_top_quartile
+complexity_score
 display_difficulty
 angle_x_holds
-angle_x_difficulty
-angle_squared
-difficulty_x_height
-difficulty_x_density
-complexity_score
-hull_area_x_difficulty
@@ -0,0 +1,35 @@
+
+### Model Performance Summary
+
+| Model | MAE | RMSE | R² | Within ±1 score | Within ±2 score | Exact V-grade | Within ±1 V-grade |
+|-------|-----|------|----|-----------------|-----------------|---------------|-------------------|
+| Linear Regression | 2.088 | 2.670 | 0.560 | 30.1% | 55.9% | 25.9% | 64.8% |
+| Ridge Regression | 2.088 | 2.670 | 0.560 | 30.0% | 55.9% | 25.9% | 64.8% |
+| Lasso Regression | 2.089 | 2.672 | 0.559 | 29.9% | 55.9% | 25.9% | 64.8% |
+| Random Forest (Tuned) | 1.846 | 2.375 | 0.652 | 34.8% | 62.4% | 29.6% | 69.7% |
+
+### Key Findings
+
+1. **The tree-based model is strongest on this structured feature set.**
+   - Random Forest (Tuned) has the best MAE (1.846), RMSE (2.375), and R² (0.652), and the highest within-±1 V-grade rate (69.7%).
+   - The linear models remain useful baselines, but their R² of about 0.56 leaves nonlinear signal unexplained.
+
+2. **Fine-grained difficulty prediction is meaningfully harder than grouped grade prediction.**
+   - On the held-out test set, the best model lands within ±1 of the fine-grained difficulty score 34.8% of the time.
+   - The same model lands within ±1 of the grouped V-grade 69.7% of the time.
+
+3. **This gap is expected and informative.**
+   - Small numeric errors often stay inside the same or adjacent V-grade buckets.
+   - The model captures broad difficulty bands more reliably than exact score distinctions.
+
+4. **The project’s main predictive takeaway is practical usefulness, not exact grading.**
+   - The models do not reproduce exact grades: the best one matches the grouped V-grade exactly only 29.6% of the time.
+   - They do place climbs in the right neighborhood of difficulty: the best one is within one V-grade 69.7% of the time.
+
+### Portfolio Interpretation
+
+From a modelling perspective, this project shows:
+- feature engineering grounded in domain structure,
+- comparison of linear and nonlinear models,
+- honest evaluation on a held-out test set,
+- and the ability to translate raw regression performance into climbing-relevant grouped V-grade metrics.
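
The gap between the fine-grained and grouped metrics in Key Findings 2 and 3 can be illustrated with a small sketch. This is not the project's evaluation code. The function names, the toy data, and above all the `to_v_grade` bucketing (three difficulty points per V-grade) are assumptions made for illustration, and the real score-to-grade mapping may differ.

```python
from typing import Callable, Optional, Sequence


def to_v_grade(score: float) -> int:
    """HYPOTHETICAL bucketing: every 3 fine-grained points form one V-grade.

    The project's actual score-to-V-grade mapping is not shown in this commit.
    """
    return int(score // 3)


def within_tolerance(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    tol: float,
    transform: Optional[Callable[[float], float]] = None,
) -> float:
    """Fraction of predictions whose absolute error is at most `tol`.

    If `transform` is given, both series are mapped through it first,
    e.g. to compare grouped grades instead of raw scores.
    """
    if transform is not None:
        y_true = [transform(v) for v in y_true]
        y_pred = [transform(v) for v in y_pred]
    hits = sum(abs(t - p) <= tol for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Toy data (made up): fine-grained difficulty scores and model predictions.
y_true = [10.0, 14.0, 17.0, 20.0, 23.0]
y_pred = [12.5, 13.4, 19.2, 20.9, 26.8]

score_within_1 = within_tolerance(y_true, y_pred, tol=1)
grade_exact = within_tolerance(y_true, y_pred, tol=0, transform=to_v_grade)
grade_within_1 = within_tolerance(y_true, y_pred, tol=1, transform=to_v_grade)
```

On this toy data only two of five predictions fall within one score point, yet all five fall within one grouped grade. Errors of two or three points often stay inside the same or an adjacent bucket, which is the effect the summary table reports.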