From 894ea72db9daaf4c2c6b3304d47809dd5c8e2ee2 Mon Sep 17 00:00:00 2001 From: Pawel Date: Wed, 27 May 2026 09:01:22 -0400 Subject: [PATCH] Fix typos and math errors in notebooks --- README.md | 11 ++++++----- bibliography.md | 6 +++--- notebooks/02_qr_svd.ipynb | 22 ++++++++++------------ notebooks/04_pca.ipynb | 8 ++++---- requirements.txt | 3 ++- 5 files changed, 25 insertions(+), 25 deletions(-) diff --git a/README.md b/README.md index fe8e7df..d3baaed 100644 --- a/README.md +++ b/README.md @@ -4,9 +4,9 @@ A practical, linear-algebra-first introduction to data science. This repository demonstrates how core linear algebra concepts -- least squares, matrix decompositions, and spectral methods -- directly power modern data science and machine learning workflows. We finish off with a mini-project involving image denoising using the truncated SVD. -Rather than treating data science as a collection of tools, this project builds everything from first principles and connects theory to implementation through jupyter notebooks. +Rather than treating data science as a collection of tools, this project builds everything from first principles and connects theory to implementation through Jupyter notebooks. -The compiled notebooks in this project can be viewed as a single webpage on my [website](https://pawelsarkowicz.xyz/posts/ds_for_la). Note that if you view in the notebooks in Gitlab/Github, they have a tendency to not render the latex properly. +The compiled notebooks in this project can be viewed as a single webpage on my [website](https://pawelsarkowicz.xyz/posts/ds_for_la). Note that if you view the notebooks in GitLab/GitHub, they have a tendency to not render the LaTeX properly. ## Structure @@ -31,6 +31,7 @@ Each notebook is self-contained and moves from theory to implementation to visua * **Matplotlib** -- visualization * **Pillow** -- imaging library * **scikit-learn** -- machine learning utilities +* **scikit-image** -- image quality metrics ## How to Run @@ -38,7 +39,7 @@ Each notebook is self-contained and moves from theory to implementation to visua git clone https://gitlab.com/psark/ds-for-la.git cd ds-for-la -pip install requirements.txt +pip install -r requirements.txt jupyter notebook ``` @@ -137,7 +138,7 @@ For color images, this is applied independently to each channel (R, G, B). * Regularization connects directly to linear algebra: * Ridge shifts singular values, improving condition number - * Lasso exploits $L^1$ geometry to product sparse solutions + * Lasso exploits $L^1$ geometry to produce sparse solutions * Gradient descent convergence is governed by singular value structure * Condition number determines learning rate stability @@ -164,4 +165,4 @@ This project is part of a broader effort to translate a background in pure mathe # License This project is licensed under the MIT License. -See the [`LICENSE`](./LICENSE) file for details. \ No newline at end of file +See the [`LICENSE`](./LICENSE) file for details. diff --git a/bibliography.md b/bibliography.md index 23c2291..2e3b1ae 100644 --- a/bibliography.md +++ b/bibliography.md @@ -30,7 +30,7 @@ - numpy.shape: https://numpy.org/doc/stable/reference/generated/numpy.shape.html (Return the shape of an array.) - numpy.polyfit: https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html (Least squares polynomial fit.) - numpy.mean: https://numpy.org/doc/stable/reference/generated/numpy.mean.html (Compute the arithmetic mean along the specified axis.) -- numyp.poly1d: https://numpy.org/doc/stable/reference/generated/numpy.poly1d.html (A one-dimensional polynomial class.) +- numpy.poly1d: https://numpy.org/doc/stable/reference/generated/numpy.poly1d.html (A one-dimensional polynomial class.) - numpy.set_printoptions: https://numpy.org/doc/stable/reference/generated/numpy.set_printoptions.html (These options determine the way floating point numbers, arrays and other NumPy objects are displayed.) - numpy.finfo: https://numpy.org/doc/stable/reference/generated/numpy.finfo.html (Machine limits for floating point types.) - numpy.logspace: https://numpy.org/doc/stable/reference/generated/numpy.logspace.html (Return numbers spaced evenly on a log scale.) @@ -53,7 +53,7 @@ - numpy.random.uniform: https://numpy.org/doc/stable/reference/random/generated/numpy.random.uniform.html (Draw samples from a uniform distribution.) #### numpy.linalg (https://numpy.org/doc/stable/reference/routines.linalg.html) -- numpy.linalg.qr: https://numpy.org/doc/stable/reference/generated/numpy.linalg.qr.html (Compute the qr factorization of a matrix. +- numpy.linalg.qr: https://numpy.org/doc/stable/reference/generated/numpy.linalg.qr.html (Compute the qr factorization of a matrix.) - numpy.linalg.svd: https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html (Singular Value Decomposition.) - numpy.linalg.solve: https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html (Solve a linear matrix equation, or system of linear scalar equations.) - numpy.linalg.lstsq: https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html (Return the least-squares solution to a linear matrix equation.) @@ -139,7 +139,7 @@ - sklearn.model_selection.cross_val_score: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html (Evaluate a score by cross-validation.) - sklearn.model_selection.KFold: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html (K-Fold cross-validator.) -#### sklearn.metrices +#### sklearn.metrics - sklearn.metrics.mean_squared_error: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html (Mean squared error regression loss.) - sklearn.metrics.r2_score: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html (R^2 (coefficient of determination) regression score function.) diff --git a/notebooks/02_qr_svd.ipynb b/notebooks/02_qr_svd.ipynb index 369b1bf..e1a943f 100644 --- a/notebooks/02_qr_svd.ipynb +++ b/notebooks/02_qr_svd.ipynb @@ -6,17 +6,17 @@ "source": [ "# QR Decompositions\n", "\n", - "QR decompositions are a powerful tool in linear algebra and data science for several reasons. They provide a way to decompose a matrix into an orthogonal matrix $Q$ aand an upper triangular matrix $R$, which can simplify many computations and analyses.\n", + "QR decompositions are a powerful tool in linear algebra and data science for several reasons. They provide a way to decompose a matrix into an orthogonal matrix $Q$ and an upper triangular matrix $R$, which can simplify many computations and analyses.\n", "\n", - "> **Theorem**: Let $A$ is an $m \\times n$ matrix with linearly independent columns ($m \\geq n$ in this case), then $A$ can be decomposed as $A = QR$ where $Q$ is an $m \\times n$ matrix whose columns form an orthonormal basis for Col($A$) and $R$ is an $n \\times n$ upper-triangular invertible matrix with positive entries on the diagonal.\n", + "> **Theorem**: Let $A$ be an $m \\times n$ matrix with linearly independent columns ($m \\geq n$ in this case), then $A$ can be decomposed as $A = QR$ where $Q$ is an $m \\times n$ matrix whose columns form an orthonormal basis for Col($A$) and $R$ is an $n \\times n$ upper-triangular invertible matrix with positive entries on the diagonal.\n", "\n", - "In the literature, sometimes the QR decomposition is phrased as follows: any $m \\times n$ matrix $A$ can also be written as $A = QR$ where $Q$ is an $m \\times m$ orthogonal matrix ($Q^T = Q^{-1}$), and $R$ is an $m \\times n$ upper-triangular matrix. One follows from the other by playing around with some matrix equations. Indeed, suppose that $A = Q_1R_1$ is a decomposition as above (that is, $Q_1$ is $m \\times n$ and $R_1$ is $n \\times n$). Use can use the Gram-Schmidt procedure to extend the columns of $Q_1$ to an orthonormal basis for all of $\\mathbb{R}^m$, and put the remaining vectors in a $(m - n) \\times n$ matrix $Q_2$. Then\n", + "In the literature, sometimes the QR decomposition is phrased as follows: any $m \\times n$ matrix $A$ can also be written as $A = QR$ where $Q$ is an $m \\times m$ orthogonal matrix ($Q^T = Q^{-1}$), and $R$ is an $m \\times n$ upper-triangular matrix. One follows from the other by playing around with some matrix equations. Indeed, suppose that $A = Q_1R_1$ is a decomposition as above (that is, $Q_1$ is $m \\times n$ and $R_1$ is $n \\times n$). We can use the Gram-Schmidt procedure to extend the columns of $Q_1$ to an orthonormal basis for all of $\\mathbb{R}^m$, and put the remaining vectors in an $m \\times (m-n)$ matrix $Q_2$. Then\n", "\n", "$$ A = Q_1R_1 = \\begin{bmatrix} Q_1 & Q_2 \\end{bmatrix}\\begin{bmatrix} R_1 \\\\ 0 \\end{bmatrix}. $$\n", "\n", "The left matrix is an $m \\times m$ orthogonal matrix and the right matrix is $m \\times n$ upper triangular. Moreover, the decomposition provides orthonormal bases for both the column space of $A$ and the perp of the column space of $A$; $Q_1$ will consist of an orthonormal basis for the column space of $A$ and $Q_2$ will consist of an orthonormal basis for the perp of the column space of $A$. \n", "\n", - "However, we will often want to use the decomposition when $Q$ is $m \\times n$, $R$ is $n \\times n$, and the columns of $Q$ form an orthonormal basis for the column space of $A$. For example, the python function `numpy.linalg.qr` give QR decompositions this way (again, assuming that the columns of $A$ are linearly independent, so $m \\geq n$).\n", + "However, we will often want to use the decomposition when $Q$ is $m \\times n$, $R$ is $n \\times n$, and the columns of $Q$ form an orthonormal basis for the column space of $A$. For example, the Python function `numpy.linalg.qr` gives QR decompositions this way (again, assuming that the columns of $A$ are linearly independent, so $m \\geq n$).\n", "\n", "> **Key take-away**. The QR decomposition provides an orthonormal basis for the column space of $A$. If $A$ has rank $k$, then the first $k$ columns of $Q$ will form a basis for the column space of $A$. \n", "\n", @@ -254,7 +254,7 @@ "\n", "> **Example**. Working with the matrix\n", "> $$ A = \\begin{bmatrix} 1 & 0 & 0 \\\\ 1 & 1 & 0 \\\\ 1 & 1 & 1 \\\\ 1 & 1 & 1 \\end{bmatrix}, $$\n", - "> the projection onto the column space if given by\n", + "> the projection onto the column space is given by\n", "> $$ QQ^T = \\begin{bmatrix} 1 \\\\ & 1 \\\\ & & \\frac{1}{2} & \\frac{1}{2} \\\\ & & \\frac{1}{2} & \\frac{1}{2} \\end{bmatrix}. $$\n", "> This is a well-understood projection: it is the direct sum of the identity on $\\mathbb{R}^2$ and the projection onto the line $y = x$ in $\\mathbb{R}^2$.\n", "\n", @@ -299,11 +299,9 @@ ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "id": "d26b49a6", "metadata": {}, - "outputs": [], "source": [ "array([[1.00000000e+00, 2.89687929e-17, 2.89687929e-17, 2.89687929e-17],\n", " [2.89687929e-17, 1.00000000e+00, 7.07349921e-17, 7.07349921e-17],\n", @@ -351,7 +349,7 @@ "> $$ P = A(A^TA)^{-1}A^T. $$\n", "> Indeed, let $b \\in \\mathbb{R}^n$ and let $x_0 \\in \\mathbb{R}^p$ be a solution to the normal equations\n", "> $$ A^TAx_0 = A^Tb. $$\n", - "> Then $x_0 = (A^TA)^{-1}A^Tb$ and so $Ax_0 = A(A^TA^{-1})A^Tb$ is the (unique!) vector in the column space of $A$ which is closest to $b$, i.e., the projection of $b$ onto the column space of $A$.\n", + "> Then $x_0 = (A^TA)^{-1}A^Tb$ and so $Ax_0 = A(A^TA)^{-1}A^Tb$ is the (unique!) vector in the column space of $A$ which is closest to $b$, i.e., the projection of $b$ onto the column space of $A$.\n", "> However, taking transposes, multiplying, and inverting is not what we would like to do numerically. " ] }, @@ -503,7 +501,7 @@ "> and note that\n", "> $$ A = U_rDV_r^T. $$\n", "> We call this the reduced singular value decomposition of $A$. Note that $D$ is invertible, and its inverse is simply\n", - "> $$ D = \\begin{bmatrix} \\sigma_1^{-1} \\\\ & \\sigma_2^{-1} \\\\ & & \\ddots \\\\ & & & \\sigma_r^{-1} \\end{bmatrix}. $$\n", + "> $$ D^{-1} = \\begin{bmatrix} \\sigma_1^{-1} \\\\ & \\sigma_2^{-1} \\\\ & & \\ddots \\\\ & & & \\sigma_r^{-1} \\end{bmatrix}. $$\n", "> The **pseudoinverse** (or **Moore-Penrose inverse**) of $A$ is the matrix\n", "> $$ A^+ = V_rD^{-1}U_r^T. $$\n", "\n", @@ -656,7 +654,7 @@ "> $$ \\kappa(A) = 1 \\text{ and } \\kappa(B) = 10^6. $$\n", "> Inverting $X_2$ includes dividing by $\\frac{1}{10^6}$, which will amplify errors by $10^6$.\n", "\n", - "Let's look our main example in python by using `numpy.linalg.cond`. \n", + "Let's look at our main example in Python by using `numpy.linalg.cond`. \n", "\n" ] }, @@ -680,7 +678,7 @@ "# Create a pandas DataFrame\n", "df = pd.DataFrame(data)\n", "\n", - "# Create out matrix X\n", + "# Create our matrix X\n", "X = df[['Square ft', 'Bedrooms']].to_numpy()\n", "\n", "# Check the condition number\n", diff --git a/notebooks/04_pca.ipynb b/notebooks/04_pca.ipynb index e6fe488..ad0f68d 100644 --- a/notebooks/04_pca.ipynb +++ b/notebooks/04_pca.ipynb @@ -52,7 +52,7 @@ "# Create a pandas DataFrame\n", "df = pd.DataFrame(data)\n", "\n", - "# Create out matrix X\n", + "# Create our matrix X\n", "X = df.to_numpy()\n" ] }, @@ -155,8 +155,8 @@ "\n", "# Create our rank-1 approximation\n", "sigma1 = S[0]\n", - "u1 = U[:, [0]]\t\t#shape (2,2)\n", - "v1T = Vh[[0], :]\t\t#shape (3,3)\n", + "u1 = U[:, [0]]\t\t#shape (2,1)\n", + "v1T = Vh[[0], :]\t\t#shape (1,3)\n", "A1 = sigma1 * (u1 @ v1T)\n", "\n", "# Take norms and view errors\n", @@ -275,7 +275,7 @@ "# Create a pandas DataFrame\n", "df = pd.DataFrame(data)\n", "\n", - "# Create out matrix X\n", + "# Create our matrix X\n", "X = df.to_numpy()\n", "\n", "# Get our vector of means\n", diff --git a/requirements.txt b/requirements.txt index 0106600..57ba2c0 100644 --- a/requirements.txt +++ b/requirements.txt @@ -2,4 +2,5 @@ pandas numpy matplotlib pillow -scikit-image \ No newline at end of file +scikit-learn +scikit-image