2026-05-22 minor update

2026-05-22 14:22:54 -04:00
parent 1089793165
commit ecc4723013
6 changed files with 20 additions and 5 deletions
@@ -9,7 +9,7 @@
    "\n",
    "## What is tokenization and why does it matter?\n",
    "\n",
-    "In natural language processing, **tokenization** is the process of converting raw text into a sequence of discrete symbols (tokens) that a model can process. For example, the sentence \"I love climbing\" might be tokenized as `[\"I\", \" love\", \" climbing\"]` using a subword tokenizer like BPE.\n",
+    "In natural language processing, **tokenization** is the process of converting raw text into a sequence of discrete symbols (tokens) that a model can process. For example, the sentence \"I climb rocks\" might be tokenized as `[\"I\", \" climb\", \" rocks\"]` using a subword tokenizer like BPE.\n",
    "\n",
    "For climbing board routes, we face an analogous problem: how do we convert a climb — which is fundamentally a *set of holds at specific positions with specific roles* — into a sequence of tokens that a transformer can learn from?\n",
    "\n",