Fix typos and math errors in notebooks
11
README.md
@@ -4,9 +4,9 @@ A practical, linear-algebra-first introduction to data science.
|
|||||||
|
|
||||||
This repository demonstrates how core linear algebra concepts -- least squares, matrix decompositions, and spectral methods -- directly power modern data science and machine learning workflows. We finish off with a mini-project involving image denoising using the truncated SVD.
|
This repository demonstrates how core linear algebra concepts -- least squares, matrix decompositions, and spectral methods -- directly power modern data science and machine learning workflows. We finish off with a mini-project involving image denoising using the truncated SVD.
|
||||||
|
|
||||||
Rather than treating data science as a collection of tools, this project builds everything from first principles and connects theory to implementation through jupyter notebooks.
|
Rather than treating data science as a collection of tools, this project builds everything from first principles and connects theory to implementation through Jupyter notebooks.
|
||||||
|
|
||||||
The compiled notebooks in this project can be viewed as a single webpage on my [website](https://pawelsarkowicz.xyz/posts/ds_for_la). Note that if you view in the notebooks in Gitlab/Github, they have a tendency to not render the latex properly.
|
The compiled notebooks in this project can be viewed as a single webpage on my [website](https://pawelsarkowicz.xyz/posts/ds_for_la). Note that if you view the notebooks in GitLab/GitHub, they have a tendency to not render the LaTeX properly.
|
||||||
|
|
||||||
|
|
||||||
## Structure
|
## Structure
|
||||||
@@ -31,6 +31,7 @@ Each notebook is self-contained and moves from theory to implementation to visua
|
|||||||
* **Matplotlib** -- visualization
|
* **Matplotlib** -- visualization
|
||||||
* **Pillow** -- imaging library
|
* **Pillow** -- imaging library
|
||||||
* **scikit-learn** -- machine learning utilities
|
* **scikit-learn** -- machine learning utilities
|
||||||
|
* **scikit-image** -- image quality metrics
|
||||||
|
|
||||||
## How to Run
|
## How to Run
|
||||||
|
|
||||||
@@ -38,7 +39,7 @@ Each notebook is self-contained and moves from theory to implementation to visua
|
|||||||
git clone https://gitlab.com/psark/ds-for-la.git
|
git clone https://gitlab.com/psark/ds-for-la.git
|
||||||
cd ds-for-la
|
cd ds-for-la
|
||||||
|
|
||||||
pip install requirements.txt
|
pip install -r requirements.txt
|
||||||
|
|
||||||
jupyter notebook
|
jupyter notebook
|
||||||
```
|
```
|
||||||
@@ -137,7 +138,7 @@ For color images, this is applied independently to each channel (R, G, B).
|
|||||||
|
|
||||||
* Regularization connects directly to linear algebra:
|
* Regularization connects directly to linear algebra:
|
||||||
* Ridge shifts singular values, improving condition number
|
* Ridge shifts singular values, improving condition number
|
||||||
* Lasso exploits $L^1$ geometry to product sparse solutions
|
* Lasso exploits $L^1$ geometry to produce sparse solutions
|
||||||
|
|
||||||
* Gradient descent convergence is governed by singular value structure
|
* Gradient descent convergence is governed by singular value structure
|
||||||
* Condition number determines learning rate stability
|
* Condition number determines learning rate stability
|
||||||
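The README bullet on ridge regression can be checked numerically. The sketch below is only an illustration: the matrix `X` and the penalty `lam` are made-up values, not taken from the notebooks. It relies on the fact that the eigenvalues of $X^TX + \lambda I$ are $\sigma_i^2 + \lambda$, so the condition number drops from $\sigma_1^2/\sigma_n^2$ to $(\sigma_1^2 + \lambda)/(\sigma_n^2 + \lambda)$.

```python
import numpy as np

# Hypothetical design matrix with two nearly collinear columns
# (illustrative values, not from the notebooks).
X = np.array([[1.0, 1.000],
              [1.0, 1.001],
              [1.0, 0.999],
              [1.0, 1.002]])
lam = 1e-2  # illustrative ridge penalty

# Singular values of X, in descending order
s = np.linalg.svd(X, compute_uv=False)

# Condition numbers of the plain and ridge-regularized normal matrices
cond_plain = np.linalg.cond(X.T @ X)
cond_ridge = np.linalg.cond(X.T @ X + lam * np.eye(X.shape[1]))

# Prediction from the shifted spectrum: (sigma_1^2 + lam) / (sigma_n^2 + lam)
predicted = (s[0] ** 2 + lam) / (s[-1] ** 2 + lam)

print(f"cond(X^T X)         = {cond_plain:.3e}")
print(f"cond(X^T X + lam I) = {cond_ridge:.3e}")
print(f"predicted           = {predicted:.3e}")
```

The larger `lam` is relative to the smallest $\sigma_i^2$, the stronger the effect, at the cost of more bias in the fitted coefficients.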
@@ -164,4 +165,4 @@ This project is part of a broader effort to translate a background in pure mathe
|
|||||||
# License
|
# License
|
||||||
|
|
||||||
This project is licensed under the MIT License.
|
This project is licensed under the MIT License.
|
||||||
See the [`LICENSE`](./LICENSE) file for details.
|
See the [`LICENSE`](./LICENSE) file for details.
|
||||||
|
|||||||
@@ -30,7 +30,7 @@
|
|||||||
- numpy.shape: https://numpy.org/doc/stable/reference/generated/numpy.shape.html (Return the shape of an array.)
|
- numpy.shape: https://numpy.org/doc/stable/reference/generated/numpy.shape.html (Return the shape of an array.)
|
||||||
- numpy.polyfit: https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html (Least squares polynomial fit.)
|
- numpy.polyfit: https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html (Least squares polynomial fit.)
|
||||||
- numpy.mean: https://numpy.org/doc/stable/reference/generated/numpy.mean.html (Compute the arithmetic mean along the specified axis.)
|
- numpy.mean: https://numpy.org/doc/stable/reference/generated/numpy.mean.html (Compute the arithmetic mean along the specified axis.)
|
||||||
- numyp.poly1d: https://numpy.org/doc/stable/reference/generated/numpy.poly1d.html (A one-dimensional polynomial class.)
|
- numpy.poly1d: https://numpy.org/doc/stable/reference/generated/numpy.poly1d.html (A one-dimensional polynomial class.)
|
||||||
- numpy.set_printoptions: https://numpy.org/doc/stable/reference/generated/numpy.set_printoptions.html (These options determine the way floating point numbers, arrays and other NumPy objects are displayed.)
|
- numpy.set_printoptions: https://numpy.org/doc/stable/reference/generated/numpy.set_printoptions.html (These options determine the way floating point numbers, arrays and other NumPy objects are displayed.)
|
||||||
- numpy.finfo: https://numpy.org/doc/stable/reference/generated/numpy.finfo.html (Machine limits for floating point types.)
|
- numpy.finfo: https://numpy.org/doc/stable/reference/generated/numpy.finfo.html (Machine limits for floating point types.)
|
||||||
- numpy.logspace: https://numpy.org/doc/stable/reference/generated/numpy.logspace.html (Return numbers spaced evenly on a log scale.)
|
- numpy.logspace: https://numpy.org/doc/stable/reference/generated/numpy.logspace.html (Return numbers spaced evenly on a log scale.)
|
||||||
@@ -53,7 +53,7 @@
|
|||||||
- numpy.random.uniform: https://numpy.org/doc/stable/reference/random/generated/numpy.random.uniform.html (Draw samples from a uniform distribution.)
|
- numpy.random.uniform: https://numpy.org/doc/stable/reference/random/generated/numpy.random.uniform.html (Draw samples from a uniform distribution.)
|
||||||
|
|
||||||
#### numpy.linalg (https://numpy.org/doc/stable/reference/routines.linalg.html)
|
#### numpy.linalg (https://numpy.org/doc/stable/reference/routines.linalg.html)
|
||||||
- numpy.linalg.qr: https://numpy.org/doc/stable/reference/generated/numpy.linalg.qr.html (Compute the qr factorization of a matrix.
|
- numpy.linalg.qr: https://numpy.org/doc/stable/reference/generated/numpy.linalg.qr.html (Compute the qr factorization of a matrix.)
|
||||||
- numpy.linalg.svd: https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html (Singular Value Decomposition.)
|
- numpy.linalg.svd: https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html (Singular Value Decomposition.)
|
||||||
- numpy.linalg.solve: https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html (Solve a linear matrix equation, or system of linear scalar equations.)
|
- numpy.linalg.solve: https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html (Solve a linear matrix equation, or system of linear scalar equations.)
|
||||||
- numpy.linalg.lstsq: https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html (Return the least-squares solution to a linear matrix equation.)
|
- numpy.linalg.lstsq: https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html (Return the least-squares solution to a linear matrix equation.)
|
||||||
@@ -139,7 +139,7 @@
|
|||||||
- sklearn.model_selection.cross_val_score: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html (Evaluate a score by cross-validation.)
|
- sklearn.model_selection.cross_val_score: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html (Evaluate a score by cross-validation.)
|
||||||
- sklearn.model_selection.KFold: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html (K-Fold cross-validator.)
|
- sklearn.model_selection.KFold: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html (K-Fold cross-validator.)
|
||||||
|
|
||||||
#### sklearn.metrices
|
#### sklearn.metrics
|
||||||
- sklearn.metrics.mean_squared_error: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html (Mean squared error regression loss.)
|
- sklearn.metrics.mean_squared_error: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html (Mean squared error regression loss.)
|
||||||
- sklearn.metrics.r2_score: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html (R^2 (coefficient of determination) regression score function.)
|
- sklearn.metrics.r2_score: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html (R^2 (coefficient of determination) regression score function.)
|
||||||
|
|
||||||
|
|||||||
@@ -6,17 +6,17 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"# QR Decompositions\n",
|
"# QR Decompositions\n",
|
||||||
"\n",
|
"\n",
|
||||||
"QR decompositions are a powerful tool in linear algebra and data science for several reasons. They provide a way to decompose a matrix into an orthogonal matrix $Q$ aand an upper triangular matrix $R$, which can simplify many computations and analyses.\n",
|
"QR decompositions are a powerful tool in linear algebra and data science for several reasons. They provide a way to decompose a matrix into an orthogonal matrix $Q$ and an upper triangular matrix $R$, which can simplify many computations and analyses.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"> **Theorem**: Let $A$ is an $m \\times n$ matrix with linearly independent columns ($m \\geq n$ in this case), then $A$ can be decomposed as $A = QR$ where $Q$ is an $m \\times n$ matrix whose columns form an orthonormal basis for Col($A$) and $R$ is an $n \\times n$ upper-triangular invertible matrix with positive entries on the diagonal.\n",
|
"> **Theorem**: Let $A$ be an $m \\times n$ matrix with linearly independent columns ($m \\geq n$ in this case), then $A$ can be decomposed as $A = QR$ where $Q$ is an $m \\times n$ matrix whose columns form an orthonormal basis for Col($A$) and $R$ is an $n \\times n$ upper-triangular invertible matrix with positive entries on the diagonal.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"In the literature, sometimes the QR decomposition is phrased as follows: any $m \\times n$ matrix $A$ can also be written as $A = QR$ where $Q$ is an $m \\times m$ orthogonal matrix ($Q^T = Q^{-1}$), and $R$ is an $m \\times n$ upper-triangular matrix. One follows from the other by playing around with some matrix equations. Indeed, suppose that $A = Q_1R_1$ is a decomposition as above (that is, $Q_1$ is $m \\times n$ and $R_1$ is $n \\times n$). Use can use the Gram-Schmidt procedure to extend the columns of $Q_1$ to an orthonormal basis for all of $\\mathbb{R}^m$, and put the remaining vectors in a $(m - n) \\times n$ matrix $Q_2$. Then\n",
|
"In the literature, sometimes the QR decomposition is phrased as follows: any $m \\times n$ matrix $A$ can also be written as $A = QR$ where $Q$ is an $m \\times m$ orthogonal matrix ($Q^T = Q^{-1}$), and $R$ is an $m \\times n$ upper-triangular matrix. One follows from the other by playing around with some matrix equations. Indeed, suppose that $A = Q_1R_1$ is a decomposition as above (that is, $Q_1$ is $m \\times n$ and $R_1$ is $n \\times n$). We can use the Gram-Schmidt procedure to extend the columns of $Q_1$ to an orthonormal basis for all of $\\mathbb{R}^m$, and put the remaining vectors in an $m \\times (m-n)$ matrix $Q_2$. Then\n",
|
||||||
"\n",
|
"\n",
|
||||||
"$$ A = Q_1R_1 = \\begin{bmatrix} Q_1 & Q_2 \\end{bmatrix}\\begin{bmatrix} R_1 \\\\ 0 \\end{bmatrix}. $$\n",
|
"$$ A = Q_1R_1 = \\begin{bmatrix} Q_1 & Q_2 \\end{bmatrix}\\begin{bmatrix} R_1 \\\\ 0 \\end{bmatrix}. $$\n",
|
||||||
"\n",
|
"\n",
|
||||||
"The left matrix is an $m \\times m$ orthogonal matrix and the right matrix is $m \\times n$ upper triangular. Moreover, the decomposition provides orthonormal bases for both the column space of $A$ and the perp of the column space of $A$; $Q_1$ will consist of an orthonormal basis for the column space of $A$ and $Q_2$ will consist of an orthonormal basis for the perp of the column space of $A$. \n",
|
"The left matrix is an $m \\times m$ orthogonal matrix and the right matrix is $m \\times n$ upper triangular. Moreover, the decomposition provides orthonormal bases for both the column space of $A$ and the perp of the column space of $A$; $Q_1$ will consist of an orthonormal basis for the column space of $A$ and $Q_2$ will consist of an orthonormal basis for the perp of the column space of $A$. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"However, we will often want to use the decomposition when $Q$ is $m \\times n$, $R$ is $n \\times n$, and the columns of $Q$ form an orthonormal basis for the column space of $A$. For example, the python function `numpy.linalg.qr` give QR decompositions this way (again, assuming that the columns of $A$ are linearly independent, so $m \\geq n$).\n",
|
"However, we will often want to use the decomposition when $Q$ is $m \\times n$, $R$ is $n \\times n$, and the columns of $Q$ form an orthonormal basis for the column space of $A$. For example, the Python function `numpy.linalg.qr` gives QR decompositions this way (again, assuming that the columns of $A$ are linearly independent, so $m \\geq n$).\n",
|
||||||
"\n",
|
"\n",
|
||||||
"> **Key take-away**. The QR decomposition provides an orthonormal basis for the column space of $A$. If $A$ has rank $k$, then the first $k$ columns of $Q$ will form a basis for the column space of $A$. \n",
|
"> **Key take-away**. The QR decomposition provides an orthonormal basis for the column space of $A$. If $A$ has rank $k$, then the first $k$ columns of $Q$ will form a basis for the column space of $A$. \n",
|
||||||
"\n",
|
"\n",
|
||||||
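The two phrasings of the QR decomposition in this hunk (the $m \times n$ / $n \times n$ form and the $m \times m$ / $m \times n$ form) correspond to the `mode="reduced"` and `mode="complete"` options of `numpy.linalg.qr`. A minimal sketch, borrowing the $4 \times 3$ matrix from the projection example in the same notebook; note that NumPy does not promise the positive-diagonal sign convention stated in the theorem.

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0]])
m, n = A.shape

# Reduced form: Q1 is m x n, R1 is n x n
Q1, R1 = np.linalg.qr(A, mode="reduced")

# Complete form: Q is m x m orthogonal, R is m x n with zero rows at the bottom
Q, R = np.linalg.qr(A, mode="complete")

# The trailing m - n columns of Q play the role of Q_2:
# an orthonormal basis for the orthogonal complement of Col(A)
Q2 = Q[:, n:]

print(Q1.shape, R1.shape)        # (4, 3) (3, 3)
print(Q.shape, R.shape)          # (4, 4) (4, 3)
print(np.allclose(Q1 @ R1, A))   # both factorizations reproduce A
print(np.allclose(Q @ R, A))
print(np.allclose(A.T @ Q2, 0))  # Q2 is orthogonal to every column of A
```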
@@ -254,7 +254,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"> **Example**. Working with the matrix\n",
|
"> **Example**. Working with the matrix\n",
|
||||||
"> $$ A = \\begin{bmatrix} 1 & 0 & 0 \\\\ 1 & 1 & 0 \\\\ 1 & 1 & 1 \\\\ 1 & 1 & 1 \\end{bmatrix}, $$\n",
|
"> $$ A = \\begin{bmatrix} 1 & 0 & 0 \\\\ 1 & 1 & 0 \\\\ 1 & 1 & 1 \\\\ 1 & 1 & 1 \\end{bmatrix}, $$\n",
|
||||||
"> the projection onto the column space if given by\n",
|
"> the projection onto the column space is given by\n",
|
||||||
"> $$ QQ^T = \\begin{bmatrix} 1 \\\\ & 1 \\\\ & & \\frac{1}{2} & \\frac{1}{2} \\\\ & & \\frac{1}{2} & \\frac{1}{2} \\end{bmatrix}. $$\n",
|
"> $$ QQ^T = \\begin{bmatrix} 1 \\\\ & 1 \\\\ & & \\frac{1}{2} & \\frac{1}{2} \\\\ & & \\frac{1}{2} & \\frac{1}{2} \\end{bmatrix}. $$\n",
|
||||||
"> This is a well-understood projection: it is the direct sum of the identity on $\\mathbb{R}^2$ and the projection onto the line $y = x$ in $\\mathbb{R}^2$.\n",
|
"> This is a well-understood projection: it is the direct sum of the identity on $\\mathbb{R}^2$ and the projection onto the line $y = x$ in $\\mathbb{R}^2$.\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -299,11 +299,9 @@
|
|||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "markdown",
|
||||||
"execution_count": null,
|
|
||||||
"id": "d26b49a6",
|
"id": "d26b49a6",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
"source": [
|
||||||
"array([[1.00000000e+00, 2.89687929e-17, 2.89687929e-17, 2.89687929e-17],\n",
|
"array([[1.00000000e+00, 2.89687929e-17, 2.89687929e-17, 2.89687929e-17],\n",
|
||||||
" [2.89687929e-17, 1.00000000e+00, 7.07349921e-17, 7.07349921e-17],\n",
|
" [2.89687929e-17, 1.00000000e+00, 7.07349921e-17, 7.07349921e-17],\n",
|
||||||
@@ -351,7 +349,7 @@
|
|||||||
"> $$ P = A(A^TA)^{-1}A^T. $$\n",
|
"> $$ P = A(A^TA)^{-1}A^T. $$\n",
|
||||||
"> Indeed, let $b \\in \\mathbb{R}^n$ and let $x_0 \\in \\mathbb{R}^p$ be a solution to the normal equations\n",
|
"> Indeed, let $b \\in \\mathbb{R}^n$ and let $x_0 \\in \\mathbb{R}^p$ be a solution to the normal equations\n",
|
||||||
"> $$ A^TAx_0 = A^Tb. $$\n",
|
"> $$ A^TAx_0 = A^Tb. $$\n",
|
||||||
"> Then $x_0 = (A^TA)^{-1}A^Tb$ and so $Ax_0 = A(A^TA^{-1})A^Tb$ is the (unique!) vector in the column space of $A$ which is closest to $b$, i.e., the projection of $b$ onto the column space of $A$.\n",
|
"> Then $x_0 = (A^TA)^{-1}A^Tb$ and so $Ax_0 = A(A^TA)^{-1}A^Tb$ is the (unique!) vector in the column space of $A$ which is closest to $b$, i.e., the projection of $b$ onto the column space of $A$.\n",
|
||||||
"> However, taking transposes, multiplying, and inverting is not what we would like to do numerically. "
|
"> However, taking transposes, multiplying, and inverting is not what we would like to do numerically. "
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
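The corrected formula $Ax_0 = A(A^TA)^{-1}A^Tb$ can be compared against the QR route $P = QQ^T$ on the notebook's own example matrix. This is a sketch rather than the notebook's code: it uses `numpy.linalg.solve` in place of an explicit inverse, and the variable names are chosen here for illustration.

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0]])

# Normal-equations form: P = A (A^T A)^{-1} A^T,
# computed by solving (A^T A) Z = A^T instead of inverting
P_normal = A @ np.linalg.solve(A.T @ A, A.T)

# QR form: the columns of Q are an orthonormal basis of Col(A), so P = Q Q^T
Q, _ = np.linalg.qr(A)
P_qr = Q @ Q.T

# Closed form from the example: identity on R^2 plus projection onto y = x
expected = np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 0.5, 0.5],
                     [0.0, 0.0, 0.5, 0.5]])

print(np.allclose(P_normal, P_qr))
print(np.allclose(P_qr, expected))
```

Both routes agree up to rounding on this well-conditioned matrix; the QR version avoids forming $A^TA$, which squares the condition number.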
@@ -503,7 +501,7 @@
|
|||||||
"> and note that\n",
|
"> and note that\n",
|
||||||
"> $$ A = U_rDV_r^T. $$\n",
|
"> $$ A = U_rDV_r^T. $$\n",
|
||||||
"> We call this the reduced singular value decomposition of $A$. Note that $D$ is invertible, and its inverse is simply\n",
|
"> We call this the reduced singular value decomposition of $A$. Note that $D$ is invertible, and its inverse is simply\n",
|
||||||
"> $$ D = \\begin{bmatrix} \\sigma_1^{-1} \\\\ & \\sigma_2^{-1} \\\\ & & \\ddots \\\\ & & & \\sigma_r^{-1} \\end{bmatrix}. $$\n",
|
"> $$ D^{-1} = \\begin{bmatrix} \\sigma_1^{-1} \\\\ & \\sigma_2^{-1} \\\\ & & \\ddots \\\\ & & & \\sigma_r^{-1} \\end{bmatrix}. $$\n",
|
||||||
"> The **pseudoinverse** (or **Moore-Penrose inverse**) of $A$ is the matrix\n",
|
"> The **pseudoinverse** (or **Moore-Penrose inverse**) of $A$ is the matrix\n",
|
||||||
"> $$ A^+ = V_rD^{-1}U_r^T. $$\n",
|
"> $$ A^+ = V_rD^{-1}U_r^T. $$\n",
|
||||||
"\n",
|
"\n",
|
||||||
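With the $D^{-1}$ fix in place, the pseudoinverse formula $A^+ = V_rD^{-1}U_r^T$ can be assembled directly from the reduced SVD. The sketch below assumes a small rank-1 matrix and a relative cutoff of `1e-10` for deciding which singular values count as nonzero; both are illustrative choices, not taken from the notebook.

```python
import numpy as np

# Hypothetical rank-1 matrix (second column is twice the first)
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

U, s, Vh = np.linalg.svd(A, full_matrices=False)

# Keep only singular values above a relative cutoff (assumed tolerance)
tol = 1e-10 * s[0]
r = int(np.sum(s > tol))

U_r = U[:, :r]               # m x r
V_r = Vh[:r, :].T            # n x r
D_inv = np.diag(1.0 / s[:r]) # r x r, inverse of D

# A^+ = V_r D^{-1} U_r^T
A_plus = V_r @ D_inv @ U_r.T

# Compare with NumPy's pseudoinverse using the same relative cutoff
print(r)
print(np.allclose(A_plus, np.linalg.pinv(A, rcond=1e-10)))
```

Inverting only the first $r$ singular values is what keeps the formula well defined when $A$ is rank deficient.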
@@ -656,7 +654,7 @@
|
|||||||
"> $$ \\kappa(A) = 1 \\text{ and } \\kappa(B) = 10^6. $$\n",
|
"> $$ \\kappa(A) = 1 \\text{ and } \\kappa(B) = 10^6. $$\n",
|
||||||
"> Inverting $X_2$ includes dividing by $\\frac{1}{10^6}$, which will amplify errors by $10^6$.\n",
|
"> Inverting $X_2$ includes dividing by $\\frac{1}{10^6}$, which will amplify errors by $10^6$.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Let's look our main example in python by using `numpy.linalg.cond`. \n",
|
"Let's look at our main example in Python by using `numpy.linalg.cond`. \n",
|
||||||
"\n"
|
"\n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
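The values $\kappa(A) = 1$ and $\kappa(B) = 10^6$ quoted in this hunk can be reproduced with `numpy.linalg.cond`. The definitions of $A$ and $B$ fall outside the hunk, so the matrices below are assumptions chosen to give those condition numbers, not the notebook's actual matrices.

```python
import numpy as np

# Assumed stand-ins for the matrices in the example
A = np.eye(2)               # all singular values equal 1
B = np.diag([1.0, 1e-6])    # singular values 1 and 1e-6

for name, M in (("A", A), ("B", B)):
    s = np.linalg.svd(M, compute_uv=False)
    # The 2-norm condition number is the ratio of extreme singular values
    print(name, np.linalg.cond(M), s[0] / s[-1])
```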
@@ -680,7 +678,7 @@
|
|||||||
"# Create a pandas DataFrame\n",
|
"# Create a pandas DataFrame\n",
|
||||||
"df = pd.DataFrame(data)\n",
|
"df = pd.DataFrame(data)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Create out matrix X\n",
|
"# Create our matrix X\n",
|
||||||
"X = df[['Square ft', 'Bedrooms']].to_numpy()\n",
|
"X = df[['Square ft', 'Bedrooms']].to_numpy()\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Check the condition number\n",
|
"# Check the condition number\n",
|
||||||
|
|||||||
@@ -52,7 +52,7 @@
|
|||||||
"# Create a pandas DataFrame\n",
|
"# Create a pandas DataFrame\n",
|
||||||
"df = pd.DataFrame(data)\n",
|
"df = pd.DataFrame(data)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Create out matrix X\n",
|
"# Create our matrix X\n",
|
||||||
"X = df.to_numpy()\n"
|
"X = df.to_numpy()\n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -155,8 +155,8 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"# Create our rank-1 approximation\n",
|
"# Create our rank-1 approximation\n",
|
||||||
"sigma1 = S[0]\n",
|
"sigma1 = S[0]\n",
|
||||||
"u1 = U[:, [0]]\t\t#shape (2,2)\n",
|
"u1 = U[:, [0]]\t\t#shape (2,1)\n",
|
||||||
"v1T = Vh[[0], :]\t\t#shape (3,3)\n",
|
"v1T = Vh[[0], :]\t\t#shape (1,3)\n",
|
||||||
"A1 = sigma1 * (u1 @ v1T)\n",
|
"A1 = sigma1 * (u1 @ v1T)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Take norms and view errors\n",
|
"# Take norms and view errors\n",
|
||||||
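The corrected shape comments in this hunk, `(2,1)` for `u1` and `(1,3)` for `v1T`, can be confirmed on any $2 \times 3$ matrix. The sketch below uses a made-up matrix `A` (the notebook's data is not shown in the hunk) and also checks that the spectral-norm error of the rank-1 approximation equals the second singular value.

```python
import numpy as np

# Hypothetical 2 x 3 matrix, for illustration only
A = np.array([[3.0, 1.0, 2.0],
              [1.0, 4.0, 0.0]])

U, S, Vh = np.linalg.svd(A)   # U: (2, 2), S: (2,), Vh: (3, 3)

# Indexing with a list keeps the result two-dimensional
sigma1 = S[0]
u1 = U[:, [0]]      # shape (2, 1)
v1T = Vh[[0], :]    # shape (1, 3)
A1 = sigma1 * (u1 @ v1T)

# Errors of the rank-1 approximation
err_2 = np.linalg.norm(A - A1, 2)
err_fro = np.linalg.norm(A - A1, "fro")

print(u1.shape, v1T.shape)    # (2, 1) (1, 3)
print(np.isclose(err_2, S[1]))
```

Because this `A` has only two singular values, the residual `A - A1` is itself rank 1, so the Frobenius error also equals `S[1]` here; that coincidence does not hold for higher-rank matrices.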
@@ -275,7 +275,7 @@
|
|||||||
"# Create a pandas DataFrame\n",
|
"# Create a pandas DataFrame\n",
|
||||||
"df = pd.DataFrame(data)\n",
|
"df = pd.DataFrame(data)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Create out matrix X\n",
|
"# Create our matrix X\n",
|
||||||
"X = df.to_numpy()\n",
|
"X = df.to_numpy()\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Get our vector of means\n",
|
"# Get our vector of means\n",
|
||||||
|
|||||||
@@ -2,4 +2,5 @@ pandas
|
|||||||
numpy
|
numpy
|
||||||
matplotlib
|
matplotlib
|
||||||
pillow
|
pillow
|
||||||
scikit-image
|
scikit-learn
|
||||||
|
scikit-image
|
||||||
|
|||||||