{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "f301146a",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Kilter Board: Hold Difficulty Analysis\n",
|
||
"\n",
"We continue our hold analysis, now focusing solely on computing the difficulty of each hold.\n",
|
||
"\n",
|
||
"Recall some of the following findings.\n",
|
||
"\n",
"- The Kilter Board Original has `layout_id` 1, and has two sets: bolt holes and screw holes. These have `set_id` 1 and 20 respectively.\n",
"- The `frames` feature of a climb determines the climb: it looks something like `p3r4p29r2p59r1p65r2p75r3p89r2p157r4p158r4`. Each substring `pXrY` gives the placement (`placement_id=X`) and the role (`placement_role_id=Y`), i.e. whether the hold is a start, finish, foot, or middle hold. The role also tells us which color to use if we plot the climb against the board.\n",
"- The `holes` table tells us where each `placement_id` sits in the (x, y) coordinate system. It also gives the ID of each hole's mirror image, which lets us recover the `placement_id` of the mirrored hold.\n",
|
||
"\n",
|
||
"## Output\n",
|
||
"\n",
"The final products are hold-level difficulty scores saved to CSV files. For each placement, a score is the average difficulty of the climbs that use that hold. Scores are computed per angle, per role, and in aggregate. A Bayesian smoothing step shrinks noisy estimates for rarely used holds toward the global mean.\n",
|
||
"\n",
|
||
"## Notebook Structure\n",
|
||
"\n",
|
||
"1. [Setup and Imports](#setup-and-imports)\n",
|
||
"2. [Hold Usage DataFrame](#hold-usage-dataframe)\n",
|
||
"3. [Difficulty Score](#difficulty-score)\n",
|
||
"4. [Visualization](#visualization)\n",
|
||
"5. [Conclusion](#conclusion)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6e17c7da",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Setup and Imports"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "2cd8a53a",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Setup and Imports\n",
|
||
"==================================\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"\n",
|
||
"# Imports\n",
|
||
"import pandas as pd\n",
|
||
"import matplotlib.pyplot as plt\n",
|
||
"import seaborn as sns\n",
|
||
"import numpy as np\n",
|
||
"import matplotlib.patches as mpatches\n",
|
||
"\n",
|
||
"import sqlite3\n",
|
||
"\n",
|
||
"import os\n",
|
||
"\n",
|
||
"import re\n",
|
||
"from collections import defaultdict\n",
|
||
"\n",
|
||
"from PIL import Image\n",
|
||
"\n",
|
||
"# Set some display options\n",
|
||
"pd.set_option('display.max_columns', None)\n",
|
||
"pd.set_option('display.max_rows', 100)\n",
|
||
"\n",
|
||
"# Set style\n",
|
||
"palette=['steelblue', 'coral', 'seagreen'] #(for multi-bar graphs)\n",
|
||
"\n",
|
||
"# Set board image for some visual analysis\n",
|
||
"board_img = Image.open('../images/kilter-original-16x12_compose.png')\n",
|
||
"\n",
|
||
"# Connect to the database\n",
|
||
"DB_PATH=\"../data/kilter.db\"\n",
|
||
"conn = sqlite3.connect(DB_PATH)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "c9da4ef8",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Query our data from the DB\n",
|
||
"==================================\n",
|
||
"\n",
"We restrict to `layout_id=1`, the Kilter Board Original.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# Query climbs data\n",
|
||
"climbs_query = \"\"\"\n",
|
||
"SELECT\n",
|
||
" c.uuid,\n",
|
||
" c.name AS climb_name,\n",
|
||
" c.setter_username,\n",
|
||
" c.layout_id AS layout_id,\n",
|
||
" c.description,\n",
|
||
" c.is_nomatch,\n",
|
||
" c.is_listed,\n",
|
||
" l.name AS layout_name,\n",
|
||
" p.name AS board_name,\n",
|
||
" c.frames,\n",
|
||
" cs.angle,\n",
|
||
" cs.display_difficulty,\n",
|
||
" dg.boulder_name AS boulder_grade,\n",
|
||
" cs.ascensionist_count,\n",
|
||
" cs.quality_average,\n",
|
||
" cs.fa_at\n",
|
||
"FROM climbs c\n",
|
||
"JOIN layouts l ON c.layout_id = l.id\n",
|
||
"JOIN products p ON l.product_id = p.id\n",
|
||
"JOIN climb_stats cs ON c.uuid = cs.climb_uuid\n",
|
||
"JOIN difficulty_grades dg ON ROUND(cs.display_difficulty) = dg.difficulty\n",
|
||
"WHERE cs.display_difficulty IS NOT NULL AND c.is_listed=1 AND c.layout_id=1 AND cs.fa_at > '2016-01-01'\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# Query information about placements (and their mirrors)\n",
|
||
"placements_query = \"\"\"\n",
|
||
"SELECT\n",
|
||
" p.id AS placement_id,\n",
|
||
" h.x,\n",
|
||
" h.y,\n",
|
||
" p.default_placement_role_id AS default_role_id,\n",
|
||
" p.set_id AS set_id,\n",
|
||
" s.name AS set_name\n",
|
||
"FROM placements p\n",
|
||
"JOIN holes h ON p.hole_id = h.id\n",
|
||
"JOIN sets s ON p.set_id = s.id\n",
"WHERE p.layout_id = 1 AND h.y <= 156\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# Load it into a DataFrame\n",
|
||
"df_climbs = pd.read_sql_query(climbs_query, conn)\n",
|
||
"df_placements = pd.read_sql_query(placements_query, conn)\n",
|
||
"\n",
|
||
"# Save placements csv in data (for other things later on)\n",
|
||
"df_placements.to_csv('../data/placements.csv')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "336687a9",
|
||
"metadata": {},
|
||
"source": [
"Let's take a look at `df_placements`, which holds each placement's coordinates, default role, and set."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "b2f74d89",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"display(df_placements)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "1a4a5612",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Role definitions\n",
|
||
"ROLE_DEFINITIONS = {\n",
|
||
" 'start': 12,\n",
|
||
" 'middle': 13,\n",
|
||
" 'finish': 14,\n",
|
||
" 'foot': 15\n",
|
||
"}\n",
|
||
"\n",
|
||
"HAND_ROLES = ['start', 'middle', 'finish']\n",
|
||
"FOOT_ROLES = ['foot']\n",
|
||
"ROLE_TYPES = ['start', 'middle', 'finish', 'hand', 'foot']\n",
|
||
"\n",
|
||
"MATERIAL_PALETTE = {'Wood': '#8B4513', 'Plastic': '#4169E1'}\n",
|
||
"\n",
"def get_role_type(role_id):\n",
"    \"\"\"Map role_id to role_type string.\"\"\"\n",
"    for role_type, rid in ROLE_DEFINITIONS.items():\n",
"        if role_id == rid:\n",
"            return role_type\n",
"    return 'unknown'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "b395dd64",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Placement Data\n",
|
||
"# Build placement_coordinates dict\n",
|
||
"placement_coordinates = {\n",
|
||
" row['placement_id']: (row['x'], row['y'])\n",
|
||
" for _, row in df_placements.iterrows()\n",
|
||
"}"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "3fee6f6b",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"get_role_type(15)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "51e0bd84",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"## Boundary conditions\n",
|
||
"x_min, x_max = -24, 168\n",
|
||
"y_min, y_max = 0, 156"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "8b8d9abd",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Hold Usage DataFrame"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "85f7ac83",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Hold Usage DataFrame\n",
|
||
"==================================\n",
|
||
"\n",
|
||
"Explodes climb frames into individual hold usages.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"records = []\n",
|
||
"\n",
"for _, row in df_climbs.iterrows():\n",
"    frames = row['frames']\n",
"    if not isinstance(frames, str):\n",
"        continue\n",
"\n",
"    # Each pXrY token is one hold: X = placement_id, Y = placement_role_id\n",
"    matches = re.findall(r'p(\\d+)r(\\d+)', frames)\n",
"\n",
"    for p_str, r_str in matches:\n",
"        role_type = get_role_type(int(r_str))\n",
"        records.append({\n",
"            'placement_id': int(p_str),\n",
"            'role_id': int(r_str),\n",
"            'role_type': role_type,\n",
"            'is_hand': role_type in HAND_ROLES,\n",
"            'is_foot': role_type in FOOT_ROLES,\n",
"            'difficulty': row['display_difficulty'],\n",
"            'angle': row['angle'],\n",
"            'climb_uuid': row['uuid']\n",
"        })\n",
|
||
"\n",
|
||
"df_hold_usage = pd.DataFrame(records)\n",
|
||
"\n",
|
||
"print(f\"Built hold usage DataFrame: {len(df_hold_usage):,} records\")\n",
|
||
"print(f\"Unique placements: {df_hold_usage['placement_id'].nunique():,}\")\n",
|
||
"print(f\"Unique angles: {sorted(df_hold_usage['angle'].unique())}\")\n",
|
||
"\n",
|
||
"print(\"\\nRecords by role type:\")\n",
|
||
"display(df_hold_usage['role_type'].value_counts().to_frame('count'))\n",
|
||
"\n",
|
||
"print(f\"\\nHand usages: {df_hold_usage['is_hand'].sum():,}\")\n",
|
||
"print(f\"Foot usages: {df_hold_usage['is_foot'].sum():,}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "38df6453",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Difficulty Score"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "107b223f",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Bayesian Smoothing of Hold Difficulty\n",
|
||
"\n",
|
||
"Raw hold difficulty estimates can be unstable for rarely used holds. To reduce\n",
|
||
"noise, we apply Bayesian smoothing, shrinking hold-level averages toward the\n",
|
||
"global mean difficulty. Frequently used holds remain close to their empirical\n",
|
||
"means, while sparse holds are pulled more strongly toward the overall average.\n"
|
||
]
|
||
},
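{
"cell_type": "markdown",
"id": "smoothing-formula-sketch",
"metadata": {},
"source": [
"As a sketch of what the smoothing computes (the symbols are our own shorthand and do not appear in the code): for a hold $h$ with $n_h$ usages and empirical mean difficulty $\\bar{d}_h$, a global mean $\\mu$, and a prior strength $m$, the smoothed score is\n",
"\n",
"$$\\hat{d}_h = \\frac{n_h \\bar{d}_h + m \\mu}{n_h + m} = w_h \\bar{d}_h + (1 - w_h) \\mu, \\qquad w_h = \\frac{n_h}{n_h + m}.$$\n",
"\n",
"A worked example with made-up numbers (not taken from the data): take $\\mu = 20$ and $m = 20$. A hold with $n_h = 5$ and $\\bar{d}_h = 26$ gets $(5 \\cdot 26 + 20 \\cdot 20) / 25 = 21.2$, while a hold with the same mean but $n_h = 200$ gets $(200 \\cdot 26 + 20 \\cdot 20) / 220 \\approx 25.45$. The sparse hold is pulled most of the way to the global mean; the heavily used hold barely moves."
]
},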
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "f9a4e3c9",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Bayesian Smoothing\n",
|
||
"==================================\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"SMOOTHING_M = 20\n",
|
||
"\n",
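"def bayesian_smooth(mean_col, count_col, global_mean, m=SMOOTHING_M):\n",
"    \"\"\"\n",
"    Bayesian smoothing toward the global mean.\n",
"\n",
"    Returns (count * mean + m * global_mean) / (count + m), where m acts as\n",
"    a pseudo-count: a group with count == m lands halfway between its own\n",
"    mean and the global mean.\n",
"    \"\"\"\n",
"    return (count_col * mean_col + m * global_mean) / (count_col + m)\n",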
|
||
"\n",
|
||
"GLOBAL_DIFFICULTY_MEAN = df_hold_usage['difficulty'].mean()\n",
|
||
"print(f\"Global difficulty mean: {GLOBAL_DIFFICULTY_MEAN:.3f}\")\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d54c005d",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Raw Difficulty Score"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "7547d6dd",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\n",
|
||
"\"\"\"\n",
|
||
"==================================\n",
"Raw difficulty score (averaged & smoothed)\n",
|
||
"==================================\n",
|
||
"\n",
|
||
"\n",
|
||
"Average difficulty of all climbs that use this hold, plus a Bayesian-smoothed\n",
|
||
"version that is more stable for low-usage holds.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"raw_scores = df_hold_usage.groupby('placement_id').agg(\n",
|
||
" raw_difficulty=('difficulty', 'mean'),\n",
|
||
" usage_count=('climb_uuid', 'count'),\n",
|
||
" climbs_count=('climb_uuid', 'nunique')\n",
|
||
")\n",
|
||
"\n",
|
||
"raw_scores['raw_difficulty_smoothed'] = bayesian_smooth(\n",
|
||
" raw_scores['raw_difficulty'],\n",
|
||
" raw_scores['usage_count'],\n",
|
||
" GLOBAL_DIFFICULTY_MEAN\n",
|
||
")\n",
|
||
"\n",
|
||
"raw_scores = raw_scores.round(2)\n",
|
||
"\n",
|
||
"print(\"### Top 10 Hardest Holds (Raw)\\n\")\n",
|
||
"display(raw_scores.sort_values('raw_difficulty', ascending=False).head(10))\n",
|
||
"\n",
|
||
"print(\"\\n### Top 10 Easiest Holds (Raw)\\n\")\n",
|
||
"display(raw_scores.sort_values('raw_difficulty', ascending=True).head(10))\n",
|
||
"\n",
|
||
"print(\"\\n### Example of Raw vs Smoothed Difficulty\\n\")\n",
|
||
"display(raw_scores[['raw_difficulty', 'raw_difficulty_smoothed', 'usage_count']].head(10))\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "df819708",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Per-Angle Difficulty Score"
|
||
]
|
||
},
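{
"cell_type": "markdown",
"id": "angle-weighted-formula-sketch",
"metadata": {},
"source": [
"One way to read the angle-weighted score computed in the next cell, in our own notation (these symbols are not names from the code): if hold $h$ is used $n_{h,a}$ times at angle $a$, and $\\hat{d}_{h,a}$ is its Bayesian-smoothed mean difficulty at that angle, then\n",
"\n",
"$$D_h = \\frac{\\sum_a n_{h,a} \\hat{d}_{h,a}}{\\sum_a n_{h,a}}.$$\n",
"\n",
"With illustrative numbers (not from the data): 30 usages at 40° with a smoothed mean of 22 and 10 usages at 50° with a smoothed mean of 26 give $(30 \\cdot 22 + 10 \\cdot 26) / 40 = 23$. Angles where the hold is rarely used contribute little, both because their weight $n_{h,a}$ is small and because their smoothed mean has already been pulled toward the global mean."
]
},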
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "13a2d53f",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Per-Angle Difficulty Score\n",
|
||
"==================================\n",
|
||
"\n",
|
||
"Computes difficulty score per angle, then aggregates with weighting.\n",
|
||
"Uses Bayesian-smoothed per-angle difficulty throughout.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# Calculate per-angle scores\n",
|
||
"angle_scores = df_hold_usage.groupby(['placement_id', 'angle']).agg(\n",
|
||
" avg_difficulty=('difficulty', 'mean'),\n",
|
||
" usage_count=('climb_uuid', 'count')\n",
|
||
").reset_index()\n",
|
||
"\n",
|
||
"# Apply Bayesian smoothing\n",
|
||
"angle_scores['avg_difficulty_smoothed'] = bayesian_smooth(\n",
|
||
" angle_scores['avg_difficulty'],\n",
|
||
" angle_scores['usage_count'],\n",
|
||
" GLOBAL_DIFFICULTY_MEAN\n",
|
||
")\n",
|
||
"\n",
|
||
"# Pivot to see angles side-by-side\n",
|
||
"angle_pivot = angle_scores.pivot_table(\n",
|
||
" index='placement_id',\n",
|
||
" columns='angle',\n",
|
||
" values='avg_difficulty_smoothed',\n",
|
||
" aggfunc='mean'\n",
|
||
")\n",
|
||
"angle_pivot.columns = [f'diff_{int(col)}deg' for col in angle_pivot.columns]\n",
|
||
"\n",
|
||
"# Calculate weighted average using the smoothed per-angle values\n",
|
||
"weighted_scores = []\n",
|
||
"\n",
"for pid in angle_scores['placement_id'].unique():\n",
"    df_pid = angle_scores[angle_scores['placement_id'] == pid].copy()\n",
"\n",
"    total_count = df_pid['usage_count'].sum()\n",
"    weighted_diff = (\n",
"        df_pid['avg_difficulty_smoothed'] * df_pid['usage_count']\n",
"    ).sum() / total_count\n",
"\n",
"    weighted_scores.append({\n",
"        'placement_id': pid,\n",
"        'angle_weighted_difficulty': weighted_diff,\n",
"        'angles_used': len(df_pid),\n",
"        'min_angle': int(df_pid['angle'].min()),\n",
"        'max_angle': int(df_pid['angle'].max()),\n",
"        'angle_range': int(df_pid['angle'].max() - df_pid['angle'].min())\n",
"    })\n",
|
||
"\n",
|
||
"df_angle_scores = pd.DataFrame(weighted_scores).set_index('placement_id')\n",
|
||
"\n",
|
||
"print(\"### Per-Angle Difficulty Analysis (Sample)\\n\")\n",
|
||
"display(angle_pivot.join(df_angle_scores).head(15))\n",
|
||
"\n",
"print(\"\\nAngles used per hold:\")\n",
|
||
"print(df_angle_scores['angles_used'].describe())\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "2164c4fe",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Per-Role Difficulty Score"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "f6c9dd60",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Per-Role Difficulty Score\n",
|
||
"==================================\n",
|
||
"\n",
|
||
"Individual roles (start, middle, finish, foot) AND aggregate (hand).\n",
|
||
"All exported difficulty values are Bayesian-smoothed.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# Individual role scores\n",
|
||
"role_scores = df_hold_usage.groupby(['placement_id', 'role_type']).agg(\n",
|
||
" avg_difficulty=('difficulty', 'mean'),\n",
|
||
" usage_count=('climb_uuid', 'count')\n",
|
||
").reset_index()\n",
|
||
"\n",
|
||
"# Apply Bayesian smoothing\n",
|
||
"role_scores['avg_difficulty_smoothed'] = bayesian_smooth(\n",
|
||
" role_scores['avg_difficulty'],\n",
|
||
" role_scores['usage_count'],\n",
|
||
" GLOBAL_DIFFICULTY_MEAN\n",
|
||
")\n",
|
||
"\n",
|
||
"# Pivot for individual roles\n",
|
||
"role_pivot = role_scores.pivot_table(\n",
|
||
" index='placement_id',\n",
|
||
" columns='role_type',\n",
|
||
" values='avg_difficulty_smoothed',\n",
|
||
" aggfunc='mean'\n",
|
||
")\n",
|
||
"role_pivot.columns = [f'diff_as_{col}' for col in role_pivot.columns]\n",
|
||
"\n",
|
||
"# Usage counts per individual role\n",
|
||
"role_counts = role_scores.pivot_table(\n",
|
||
" index='placement_id',\n",
|
||
" columns='role_type',\n",
|
||
" values='usage_count',\n",
|
||
" aggfunc='sum',\n",
|
||
" fill_value=0\n",
|
||
")\n",
|
||
"role_counts.columns = [f'uses_as_{col}' for col in role_counts.columns]\n",
|
||
"\n",
|
||
"# Aggregate hand difficulty\n",
|
||
"hand_usage = df_hold_usage[df_hold_usage['is_hand']].groupby('placement_id').agg(\n",
|
||
" diff_as_hand_raw=('difficulty', 'mean'),\n",
|
||
" uses_as_hand=('climb_uuid', 'count')\n",
|
||
")\n",
|
||
"\n",
|
||
"hand_usage['diff_as_hand'] = bayesian_smooth(\n",
|
||
" hand_usage['diff_as_hand_raw'],\n",
|
||
" hand_usage['uses_as_hand'],\n",
|
||
" GLOBAL_DIFFICULTY_MEAN\n",
|
||
")\n",
|
||
"\n",
|
||
"hand_usage = hand_usage[['diff_as_hand', 'uses_as_hand']]\n",
|
||
"\n",
|
||
"# Combine role tables\n",
|
||
"df_role_analysis = role_pivot.join(role_counts).join(hand_usage).round(2)\n",
|
||
"\n",
|
||
"cols_order = [\n",
|
||
" 'diff_as_start', 'uses_as_start',\n",
|
||
" 'diff_as_middle', 'uses_as_middle',\n",
|
||
" 'diff_as_finish', 'uses_as_finish',\n",
|
||
" 'diff_as_hand', 'uses_as_hand',\n",
|
||
" 'diff_as_foot', 'uses_as_foot'\n",
|
||
"]\n",
|
||
"cols_order = [c for c in cols_order if c in df_role_analysis.columns]\n",
|
||
"df_role_analysis = df_role_analysis[cols_order]\n",
|
||
"\n",
|
||
"print(\"### Role-Specific Difficulty Scores (Sample)\\n\")\n",
|
||
"display(df_role_analysis.head(15))\n",
|
||
"\n",
|
||
"print(\"\\n### Holds Used as Both Hand and Foot\\n\")\n",
|
||
"dual_use = df_role_analysis[\n",
|
||
" df_role_analysis['diff_as_hand'].notna() &\n",
|
||
" df_role_analysis['diff_as_foot'].notna()\n",
|
||
"].copy()\n",
|
||
"\n",
"if len(dual_use) > 0:\n",
"    dual_use['hand_minus_foot'] = dual_use['diff_as_hand'] - dual_use['diff_as_foot']\n",
"    display(\n",
"        dual_use[['diff_as_hand', 'diff_as_foot', 'hand_minus_foot']]\n",
"        .sort_values('hand_minus_foot', ascending=False)\n",
"        .head(15)\n",
"    )\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6f0635f6",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Per-Role Per-Angle Difficulty Score"
|
||
]
|
||
},
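{
"cell_type": "markdown",
"id": "shrinkage-weight-sketch",
"metadata": {},
"source": [
"The finer the grouping, the fewer usages fall into each (placement, role, angle) cell, so these scores are shrunk more strongly than the placement-level ones. As a rough guide, writing $n$ for a cell's usage count and $m$ for the prior strength (our own shorthand, assuming the default `SMOOTHING_M = 20`), the weight placed on the cell's own mean is\n",
"\n",
"$$w = \\frac{n}{n + m}, \\qquad n = 5 \\Rightarrow w = 0.2, \\quad n = 20 \\Rightarrow w = 0.5, \\quad n = 80 \\Rightarrow w = 0.8.$$\n",
"\n",
"The remaining weight $1 - w$ goes to the global mean, so a cell with only a handful of usages mostly reports the global mean rather than its own average."
]
},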
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "2ff53ab4",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Per-Role Per-Angle Difficulty Score\n",
|
||
"==================================\n",
|
||
"\n",
|
||
"\n",
|
||
"Granular scores: placement_id × role_type × angle\n",
|
||
"Includes both individual roles AND aggregate hand.\n",
|
||
"All downstream tables use the smoothed difficulty values.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# Individual roles per angle\n",
|
||
"role_angle_scores = df_hold_usage.groupby(['placement_id', 'role_type', 'angle']).agg(\n",
|
||
" avg_difficulty=('difficulty', 'mean'),\n",
|
||
" usage_count=('climb_uuid', 'count')\n",
|
||
").reset_index()\n",
|
||
"\n",
|
||
"role_angle_scores['avg_difficulty_smoothed'] = bayesian_smooth(\n",
|
||
" role_angle_scores['avg_difficulty'],\n",
|
||
" role_angle_scores['usage_count'],\n",
|
||
" GLOBAL_DIFFICULTY_MEAN\n",
|
||
")\n",
|
||
"\n",
|
||
"# Aggregate hand per angle\n",
|
||
"hand_angle_scores = df_hold_usage[df_hold_usage['is_hand']].groupby(['placement_id', 'angle']).agg(\n",
|
||
" avg_difficulty=('difficulty', 'mean'),\n",
|
||
" usage_count=('climb_uuid', 'count')\n",
|
||
").reset_index()\n",
|
||
"\n",
|
||
"hand_angle_scores['avg_difficulty_smoothed'] = bayesian_smooth(\n",
|
||
" hand_angle_scores['avg_difficulty'],\n",
|
||
" hand_angle_scores['usage_count'],\n",
|
||
" GLOBAL_DIFFICULTY_MEAN\n",
|
||
")\n",
|
||
"hand_angle_scores['role_type'] = 'hand'\n",
|
||
"\n",
|
||
"# Combine all\n",
|
||
"df_role_angle = pd.concat([role_angle_scores, hand_angle_scores], ignore_index=True)\n",
|
||
"\n",
|
||
"print(f\"Total role-angle records: {len(df_role_angle):,}\")\n",
|
||
"print(\"\\nBreakdown by role_type:\")\n",
|
||
"display(df_role_angle.groupby('role_type').size().to_frame('count'))\n",
|
||
"\n",
|
||
"print(\"\\n### Per-Role Per-Angle Difficulty Scores (Sample)\\n\")\n",
|
||
"display(df_role_angle.head(20))\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "75ed3028",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Creating Tables"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "5b324cd0",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Role-Specific Tables\n",
|
||
"==================================\n",
|
||
"\n",
|
||
"Tables for: start, middle, finish, hand, foot\n",
|
||
"Each with per-angle columns and overall average.\n",
|
||
"Uses Bayesian-smoothed role-angle difficulty values.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"angles = sorted(df_hold_usage['angle'].unique())\n",
|
||
"role_tables = {}\n",
|
||
"\n",
"for role in ROLE_TYPES:\n",
"    df_role = df_role_angle[df_role_angle['role_type'] == role].copy()\n",
"\n",
"    if df_role.empty:\n",
"        print(f\"No data for role: {role}\")\n",
"        continue\n",
"\n",
"    pivot = df_role.pivot_table(\n",
"        index='placement_id',\n",
"        columns='angle',\n",
"        values='avg_difficulty_smoothed',\n",
"        aggfunc='mean'\n",
"    )\n",
"    pivot.columns = [f'{role}_diff_{int(col)}deg' for col in pivot.columns]\n",
"    pivot[f'{role}_overall_avg'] = pivot.mean(axis=1).round(2)\n",
"\n",
"    usage_pivot = df_role.pivot_table(\n",
"        index='placement_id',\n",
"        columns='angle',\n",
"        values='usage_count',\n",
"        aggfunc='sum',\n",
"        fill_value=0\n",
"    )\n",
"    usage_pivot.columns = [f'{role}_uses_{int(col)}deg' for col in usage_pivot.columns]\n",
"    pivot[f'{role}_total_uses'] = usage_pivot.sum(axis=1).astype(int)\n",
"\n",
"    role_tables[role] = pivot.join(usage_pivot)\n",
"\n",
"    print(f\"\\n### {role.upper()} Difficulty by Angle\\n\")\n",
"    display(role_tables[role].head(8))\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "37428cb9",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Combined Table for Modelling\n",
|
||
"==================================\n",
|
||
"\n",
|
||
"Build a single placement-level table used downstream in feature\n",
|
||
"engineering. The smoothed overall difficulty is exposed under the simple\n",
|
||
"name `overall_difficulty`, while the raw version is retained as\n",
|
||
"`overall_difficulty_raw` for reference.\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# Start with placement info\n",
|
||
"df_model_features = df_placements[['placement_id', 'x', 'y', 'set_name', 'default_role_id']].copy()\n",
|
||
"df_model_features = df_model_features.set_index('placement_id')\n",
|
||
"df_model_features = df_model_features.rename(columns={\n",
|
||
" 'set_name': 'material',\n",
|
||
" 'default_role_id': 'default_role'\n",
|
||
"})\n",
|
||
"\n",
|
||
"# Add raw + smoothed overall scores\n",
|
||
"df_model_features = df_model_features.join(\n",
|
||
" raw_scores[['raw_difficulty', 'raw_difficulty_smoothed', 'usage_count', 'climbs_count']],\n",
|
||
" how='left'\n",
|
||
")\n",
|
||
"\n",
|
||
"# Add angle scores\n",
|
||
"df_model_features = df_model_features.join(\n",
|
||
" df_angle_scores[['angle_weighted_difficulty', 'angles_used', 'min_angle', 'max_angle', 'angle_range']],\n",
|
||
" how='left'\n",
|
||
")\n",
|
||
"\n",
|
||
"# Add per-role tables\n",
"for role in ROLE_TYPES:\n",
"    if role in role_tables:\n",
"        df_model_features = df_model_features.join(role_tables[role], how='left')\n",
"\n",
"# Add aggregate hand / foot scores if missing\n",
"extra_role_cols = [c for c in ['diff_as_hand', 'uses_as_hand', 'diff_as_foot', 'uses_as_foot'] if c in df_role_analysis.columns]\n",
"missing_extra_cols = [c for c in extra_role_cols if c not in df_model_features.columns]\n",
"if missing_extra_cols:\n",
"    df_model_features = df_model_features.join(df_role_analysis[missing_extra_cols], how='left')\n",
|
||
"\n",
|
||
"# Rename for clarity\n",
|
||
"df_model_features = df_model_features.rename(columns={\n",
|
||
" 'raw_difficulty': 'overall_difficulty_raw',\n",
|
||
" 'raw_difficulty_smoothed': 'overall_difficulty'\n",
|
||
"})\n",
|
||
"\n",
"print(\"### Combined Model Features Table\\n\")\n",
|
||
"display(df_model_features.head(10))\n",
|
||
"print(f\"\\nShape: {df_model_features.shape}\")\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "fa443f22",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Visualization"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "706305f9",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Visualization: difficulty heatmaps\n",
|
||
"==================================\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"os.makedirs('../images/03_hold_difficulty', exist_ok=True)\n",
|
||
"\n",
"def plot_difficulty_heatmap(score_column='overall_difficulty', title_suffix=\"\", save=True):\n",
"    \"\"\"Plot hold difficulty scores on the board.\"\"\"\n",
"\n",
"    fig, ax = plt.subplots(figsize=(17, 12))\n",
"    ax.imshow(board_img, extent=[x_min, x_max, y_min, y_max], aspect='auto')\n",
"\n",
"    df_plot = df_model_features[df_model_features['x'].notna()].copy()\n",
"\n",
"    if score_column not in df_plot.columns:\n",
"        print(f\"Column '{score_column}' not found\")\n",
"        plt.close()\n",
"        return\n",
"\n",
"    df_plot = df_plot[df_plot[score_column].notna()]\n",
"\n",
"    if df_plot.empty:\n",
"        print(f\"No data for '{score_column}'\")\n",
"        plt.close()\n",
"        return\n",
"\n",
"    # Marker size scales with how often the hold is used\n",
"    max_usage = df_plot['usage_count'].max()\n",
"    size_scale = 20 + 150 * (df_plot['usage_count'] / max_usage)\n",
"\n",
"    scatter = ax.scatter(\n",
"        df_plot['x'],\n",
"        df_plot['y'],\n",
"        c=df_plot[score_column],\n",
"        s=size_scale,\n",
"        cmap='seismic',\n",
"        alpha=0.85,\n",
"        edgecolors='black',\n",
"        linewidths=0.5\n",
"    )\n",
"\n",
"    ax.set_xlabel('X Position (inches)', fontsize=12)\n",
"    ax.set_ylabel('Y Position (inches)', fontsize=12)\n",
"    ax.set_title(f'Hold Difficulty: {score_column} {title_suffix}', fontsize=14)\n",
"\n",
"    cbar = plt.colorbar(scatter, ax=ax, shrink=0.5)\n",
"    cbar.set_label('Difficulty')\n",
"\n",
"    plt.tight_layout()\n",
"\n",
"    if save:\n",
"        safe_name = score_column.replace('/', '_')\n",
"        plt.savefig(f'../images/03_hold_difficulty/difficulty_heatmap_{safe_name}.png', dpi=150, bbox_inches='tight')\n",
"\n",
"    plt.show()\n",
"\n",
"\n",
"# Plot main scores (overall_difficulty is the Bayesian-smoothed average)\n",
"plot_difficulty_heatmap('overall_difficulty', \"(Smoothed Average)\")\n",
|
||
"plot_difficulty_heatmap('angle_weighted_difficulty', \"(Angle-Weighted)\")\n",
|
||
"\n",
|
||
"# Plot role scores\n",
|
||
"plot_difficulty_heatmap('hand_overall_avg', \"(Hand)\")\n",
|
||
"plot_difficulty_heatmap('foot_overall_avg', \"(Foot)\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "3eb840ec",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Visualization: per-role per-angle heatmaps\n",
|
||
"==================================\n",
|
||
"\"\"\"\n",
|
||
"\n",
"def plot_role_angle_heatmap(role_type='hand', angle=40):\n",
"    \"\"\"Plot difficulty scores for a specific role and angle.\"\"\"\n",
"\n",
"    df_role = df_role_angle[\n",
"        (df_role_angle['role_type'] == role_type) &\n",
"        (df_role_angle['angle'] == angle)\n",
"    ].copy()\n",
"\n",
"    if df_role.empty:\n",
"        print(f\"No data for {role_type} at {angle}°\")\n",
"        return\n",
"\n",
"    df_role['x'] = df_role['placement_id'].map(lambda p: placement_coordinates.get(p, (None, None))[0])\n",
"    df_role['y'] = df_role['placement_id'].map(lambda p: placement_coordinates.get(p, (None, None))[1])\n",
"    df_role = df_role.dropna(subset=['x', 'y'])\n",
"\n",
"    fig, ax = plt.subplots(figsize=(17, 12))\n",
"    ax.imshow(board_img, extent=[x_min, x_max, y_min, y_max], aspect='auto')\n",
"\n",
"    scatter = ax.scatter(\n",
"        df_role['x'],\n",
"        df_role['y'],\n",
"        c=df_role['avg_difficulty_smoothed'],\n",
"        s=100,\n",
"        cmap='seismic',\n",
"        alpha=0.85,\n",
"        edgecolors='black',\n",
"        linewidths=0.5\n",
"    )\n",
"\n",
"    ax.set_xlabel('X Position (inches)', fontsize=12)\n",
"    ax.set_ylabel('Y Position (inches)', fontsize=12)\n",
"    ax.set_title(f'{role_type.capitalize()} Hold Difficulty at {angle}°', fontsize=14)\n",
"\n",
"    cbar = plt.colorbar(scatter, ax=ax, shrink=0.5)\n",
"    cbar.set_label('Difficulty')\n",
"\n",
"    plt.tight_layout()\n",
"    plt.savefig(f'../images/03_hold_difficulty/difficulty_{role_type}_{angle}deg.png', dpi=150, bbox_inches='tight')\n",
"    plt.show()\n",
"\n",
"\n",
"# Plot for common angles\n",
"common_angles = [30, 40, 45, 50]\n",
"\n",
"for role in ['hand', 'foot']:\n",
"    for angle in common_angles:\n",
"        print(f\"\\n{role.capitalize()} at {angle}°:\")\n",
"        plot_role_angle_heatmap(role, angle)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "44c53251",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Conclusion"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "b4f1431c",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Summary Statistics\n",
|
||
"==================================\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"# Material comparison\n",
|
||
"print(\"### Difficulty by Material\\n\")\n",
|
||
"material_diff = df_model_features.groupby('material').agg(\n",
|
||
" count=('overall_difficulty', 'count'),\n",
|
||
" avg_difficulty=('overall_difficulty', 'mean'),\n",
|
||
" median_difficulty=('overall_difficulty', 'median'),\n",
|
||
" avg_hand=('hand_overall_avg', 'mean'),\n",
|
||
" avg_foot=('foot_overall_avg', 'mean'),\n",
|
||
" avg_usage=('usage_count', 'mean')\n",
|
||
").round(2)\n",
|
||
"\n",
|
||
"display(material_diff)\n",
|
||
"\n",
|
||
"# Default role comparison\n",
|
||
"print(\"\\n### Difficulty by Default Role\\n\")\n",
|
||
"role_diff = df_model_features.groupby('default_role').agg(\n",
|
||
" count=('overall_difficulty', 'count'),\n",
|
||
" avg_difficulty=('overall_difficulty', 'mean'),\n",
|
||
" avg_hand=('hand_overall_avg', 'mean'),\n",
|
||
" avg_foot=('foot_overall_avg', 'mean'),\n",
|
||
" avg_usage=('usage_count', 'mean')\n",
|
||
").round(2)\n",
|
||
"\n",
|
||
"display(role_diff)\n",
|
||
"\n",
|
||
"# Correlation\n",
"if 'hand_overall_avg' in df_model_features.columns and 'foot_overall_avg' in df_model_features.columns:\n",
"    valid = df_model_features.dropna(subset=['hand_overall_avg', 'foot_overall_avg'])\n",
"    if len(valid) > 0:\n",
"        corr = valid['hand_overall_avg'].corr(valid['foot_overall_avg'])\n",
"        print(f\"\\nCorrelation (hand vs foot difficulty): {corr:.3f}\")\n",
"        print(f\"(based on {len(valid)} holds used as both)\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "8d333751",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"\"\"\"\n",
|
||
"==================================\n",
|
||
"Save to files\n",
|
||
"==================================\n",
|
||
"\"\"\"\n",
|
||
"\n",
|
||
"import os\n",
|
||
"os.makedirs('../data/03_hold_difficulty', exist_ok=True)\n",
|
||
"\n",
|
||
"# Main features table\n",
|
||
"df_model_features.to_csv('../data/03_hold_difficulty/hold_difficulty_scores.csv')\n",
|
||
"\n",
|
||
"# Full pivot for modeling\n",
|
||
"pivot_value_col = 'avg_difficulty_smoothed' if 'avg_difficulty_smoothed' in df_role_angle.columns else 'avg_difficulty'\n",
|
||
"\n",
|
||
"pivot_full = df_role_angle.pivot_table(\n",
|
||
" index='placement_id',\n",
|
||
" columns=['role_type', 'angle'],\n",
|
||
" values=pivot_value_col,\n",
|
||
" aggfunc='mean'\n",
|
||
")\n",
|
||
"pivot_full.columns = [f'diff_{role}_{int(angle)}deg' for role, angle in pivot_full.columns]\n",
|
||
"pivot_full.to_csv('../data/03_hold_difficulty/hold_role_angle_difficulty_scores.csv')\n",
|
||
"\n",
|
||
"# Per-role tables\n",
"for role in ROLE_TYPES:\n",
"    if role in role_tables:\n",
"        role_tables[role].to_csv(f'../data/03_hold_difficulty/hold_{role}_difficulty_by_angle.csv')\n",
|
||
"\n",
|
||
"# Detailed records\n",
|
||
"df_role_angle.to_csv('../data/03_hold_difficulty/hold_role_angle_detailed.csv', index=False)\n",
|
||
"\n",
|
||
"print(\"Saved files to ../data/03_hold_difficulty/:\")\n",
|
||
"print(\" - hold_difficulty_scores.csv (main table)\")\n",
|
||
"print(\" - hold_role_angle_difficulty_scores.csv (full pivot)\")\n",
"for role in ROLE_TYPES:\n",
"    if role in role_tables:\n",
"        print(f\" - hold_{role}_difficulty_by_angle.csv\")\n",
|
||
"print(\" - hold_role_angle_detailed.csv (detailed records)\")\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "443a6779",
|
||
"metadata": {},
|
||
"source": [
|
||
"\n",
|
||
"## Tables produced:\n",
|
||
"\n",
"1. `df_model_features` - Main feature table for downstream modeling\n",
"   - One row per `placement_id`\n",
"   - Includes metadata, overall scores, angle-level summaries, and role-specific scores\n",
"   - `overall_difficulty` is the Bayesian-smoothed overall score\n",
"   - `overall_difficulty_raw` is retained only as a reference column\n",
"\n",
"2. `df_role_angle` - Detailed records for visualization / export\n",
"   - One row per (`placement_id`, `role_type`, `angle`) combination\n",
"   - Covers the individual roles plus the aggregate `hand` role, with raw and smoothed averages\n",
"\n",
"3. `role_tables[role]` - Per-role tables\n",
"   - start, middle, finish, hand, foot\n",
"   - each with per-angle columns, overall averages, and usage counts"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.14.3"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|