ST 503 – Problem Set 2

Using the real data set that you found in Problem Set 1 for your course project, propose a statistical model whose population features may be appropriate to estimate from your data. For example, if your data set includes housing prices in the US along with a variety of other housing-related covariates, then you might be interested in estimating regression parameters that explain the variation in housing prices. Specifically, if $Y$ denotes the price of a given house, $X_1$ is the square footage of the house, $X_2$ is the city where the house is located, etc., then perhaps you could formulate the linear model
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + U,$
where $U$ is a random error variable, and $\beta_0, \beta_1, \beta_2, \ldots$ are the coefficient parameters (i.e., the population features) that you will estimate with your real data.
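As a rough illustration of the kind of model described above, the following R sketch fits a linear model to simulated data. The variable names (`sqft`, `city`, `price`) and all numeric values are hypothetical stand-ins, not part of any real data set; your own project data and covariates will differ.

```r
# Simulated stand-in for a housing data set; every name and value here is hypothetical.
set.seed(503)
n <- 200
sqft <- runif(n, 800, 3500)                                  # X_1: square footage
city <- factor(sample(c("A", "B", "C"), n, replace = TRUE))  # X_2: city (categorical)
u <- rnorm(n, sd = 20000)                                    # U: random error
price <- 50000 + 120 * sqft +
  30000 * (city == "B") + 60000 * (city == "C") + u          # Y: price

# Least-squares estimates of the coefficients; lm() expands the factor
# `city` into dummy variables with city "A" as the baseline.
fit <- lm(price ~ sqft + city)
beta_hat <- coef(fit)
print(beta_hat)
```

Because the data are simulated, the estimates can be compared with the coefficients used to generate `price`; with real data the population values are unknown.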
Let $A \in \mathbb{R}^{p \times p}$ be symmetric. Use the spectral decomposition of $A$ to show that
$\sup_{x \in \mathbb{R}^p \setminus \{0\}} \frac{x'Ax}{x'x} = \lambda_{\max},$
where $\lambda_{\max}$ is the largest eigenvalue of $A$. Observe that this is a special case of the Courant-Fischer theorem (see https://en.wikipedia.org/wiki/Min-max_theorem).
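Before writing the proof, it can help to check the claim numerically. The R sketch below is only a sanity check on one randomly generated symmetric matrix, not a substitute for the argument via the spectral decomposition; the helper name `rayleigh` is an illustrative choice.

```r
set.seed(1)
p <- 5
M <- matrix(rnorm(p * p), p, p)
A <- (M + t(M)) / 2                     # symmetric by construction

eig <- eigen(A, symmetric = TRUE)       # spectral decomposition A = Q diag(lambda) Q'
lambda_max <- eig$values[1]             # eigen() sorts eigenvalues in decreasing order
q1 <- eig$vectors[, 1]                  # eigenvector paired with lambda_max

# Quotient x'Ax / x'x for a nonzero vector x
rayleigh <- function(x, A) as.numeric(crossprod(x, A %*% x) / crossprod(x))

r_top <- rayleigh(q1, A)                              # value at the leading eigenvector
r_rand <- replicate(1000, rayleigh(rnorm(p), A))      # values at random directions

print(c(lambda_max = lambda_max, at_q1 = r_top, max_random = max(r_rand)))
```

The quotient at the leading eigenvector matches $\lambda_{\max}$ up to rounding, and none of the random directions exceeds it.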
Construct an $n \times n$ matrix $A$ such that
$\lambda_{\max}(A) \neq \sup_{v \neq 0} \frac{v'Av}{v'v},$
where $\lambda_{\max}(\cdot)$ denotes the maximum eigenvalue of its argument. Why does your counterexample not violate the Courant-Fischer theorem?
Write an R function that takes as input $(v, U)$, where $v$ is an $n$-dimensional column vector and $U$ is an $n \times p$ matrix. Your function should do the following:
1. Determine whether the columns of $U$ are linearly independent. If they are not, then your function should determine a basis for the column space of $U$. (Hint: in the latter case, use the fact that $\text{col}(U) = \text{col}(UU')$ for any matrix $U \in \mathbb{R}^{n \times p}$, where $\text{col}(\cdot)$ denotes the column space of a matrix argument.)
2. Using the basis for the column space of $U$ from part 1, determine whether $v$ is in the column space. If so, determine the coefficient vector $a$ that expresses $v$ as a linear combination of the basis vectors for the column space of $U$.
Your function should return the basis from part 1 along with the coefficient vector $a$ from part 2, if it exists. If $a$ does not exist, then return a warning message in place of $a$. In either case, you will always be able to return the basis from part 1.
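One possible shape for such a function is sketched below, under stated assumptions: the function name `col_space_coef`, the tolerance `tol`, and the names of the returned list elements are all illustrative choices, and $U$ is assumed to have at least one nonzero column. It follows the hint by taking the eigenvectors of $UU'$ with nonzero eigenvalues as a basis when the columns are dependent; other approaches (for example, a pivoted QR decomposition) are equally reasonable.

```r
# Sketch only: `col_space_coef`, `tol`, and the list element names are hypothetical.
col_space_coef <- function(v, U, tol = 1e-8) {
  U <- as.matrix(U)
  v <- as.numeric(v)
  stopifnot(length(v) == nrow(U))

  # Spectral decomposition of the symmetric n x n matrix UU'; col(U) = col(UU').
  eig <- eigen(tcrossprod(U), symmetric = TRUE)
  keep <- eig$values > tol * max(eig$values)   # eigenvalues treated as nonzero
  independent <- (sum(keep) == ncol(U))        # rank(U) equals the number of columns?

  # Part 1: the columns of U if they are independent, otherwise the
  # eigenvectors of UU' with nonzero eigenvalues (an orthonormal basis).
  basis <- if (independent) U else eig$vectors[, keep, drop = FALSE]

  # Part 2: least-squares coefficients; v is in col(U) exactly when the
  # residual is zero (here: zero up to the numerical tolerance).
  a <- qr.solve(basis, v)
  resid <- v - as.numeric(basis %*% a)
  in_col <- sqrt(sum(resid^2)) <= tol * max(1, sqrt(sum(v^2)))
  if (!in_col) {
    a <- "Warning: v is not in the column space of U, so no coefficient vector exists."
  }
  list(independent = independent, basis = basis, a = a)
}

# Usage on a small example where the third column is the sum of the first two
U <- cbind(c(1, 0, 0), c(0, 1, 0), c(1, 1, 0))
res1 <- col_space_coef(c(2, 3, 0), U)   # v lies in col(U)
res2 <- col_space_coef(c(0, 0, 1), U)   # v does not lie in col(U)
```

Deciding which eigenvalues count as zero requires a tolerance because of floating-point rounding; the relative threshold used here is one common convention, not the only defensible one.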
Obtain a .jpg file of an image. You can use a personal .jpg file, or download a .jpg file from the internet – keep it professionally appropriate.
1. In your homework script file, load your .jpg file, and plot the original image, to scale, in a base R plot.
2. Change the pixel colors in your image file by adding a small perturbation to each RGB value, and plot the new image, to scale, in a base R plot.
3. Plot the original image in grayscale, to scale, in a base R plot.
4. Obtain the $U$, $D$, and $V$ matrices in the SVD of your grayscale image. Verify that the grayscale image matrix can be reconstructed from its SVD.
5. Plot the ordered singular values as points in a base R plot, and note the index $k$ at which the ordered singular values become “close” to zero. In what sense can “close” to zero be meaningfully quantified?
6. Reconstruct your grayscale image matrix using a low-rank (or dimension-reduced) approximation, starting with only the two largest singular values and successively adding singular values one at a time to improve the approximation. At each iteration, plot the image, to scale, in a base R plot, and save the plot as a new page in a .pdf file. See the lecture code for an example of how to do this.
7. In the successive low-rank matrix approximations of your grayscale image, at what index $k$ would you say that adding another singular value to the approximation has a negligible visual effect on the image quality? Using the dimension-reduced SVD with only $k$ singular values, calculate the memory savings relative to storing the full grayscale matrix (i.e., how many fewer numbers must be stored in computer memory when using the low-rank approximation?).
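The SVD steps above can be sketched in R on a synthetic matrix, as shown below. This is not the lecture code: the matrix `G` is a made-up stand-in for a grayscale image with values in $[0, 1]$, the helper names `low_rank` and `plot_gray` are hypothetical, and loading a real .jpg and converting it to grayscale are not shown.

```r
# Synthetic 60 x 80 "grayscale image"; a stand-in for a real photo.
set.seed(2)
m <- 60; n <- 80
G <- outer(seq(0, 1, length.out = m), seq(0, 1, length.out = n),
           function(y, x) 0.4 + 0.3 * sin(6 * y) * cos(4 * x) + 0.2 * y * x)
G <- G + matrix(rnorm(m * n, sd = 0.01), m, n)     # small noise

# SVD: G = U D V'; check that the factors reproduce G.
s <- svd(G)
recon_err <- max(abs(G - s$u %*% diag(s$d) %*% t(s$v)))

# Rank-k approximation built from the k largest singular values
low_rank <- function(s, k) {
  s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], nrow = k) %*%
    t(s$v[, 1:k, drop = FALSE])
}

# Plot a matrix as a grayscale image; asp = 1 keeps pixels square ("to scale").
plot_gray <- function(M, main = "") {
  M[] <- pmin(pmax(M, 0), 1)          # approximations can leave [0, 1]; clip them
  plot(c(0, ncol(M)), c(0, nrow(M)), type = "n", asp = 1,
       axes = FALSE, xlab = "", ylab = "", main = main)
  rasterImage(as.raster(M), 0, 0, ncol(M), nrow(M))
}

# One page per rank in a single .pdf file (written to a temporary location here)
pdf_file <- tempfile(fileext = ".pdf")
pdf(pdf_file)
for (k in 2:10) plot_gray(low_rank(s, k), main = paste("rank", k))
invisible(dev.off())

# Storage count for a chosen k: k columns of U, k columns of V, k singular values
k <- 5
full_storage <- m * n
lowrank_storage <- k * (m + n + 1)
saved <- full_storage - lowrank_storage
rel_err <- norm(G - low_rank(s, k), "F") / norm(G, "F")
print(c(full = full_storage, low_rank = lowrank_storage, saved = saved, rel_err = rel_err))
```

One way to quantify "close to zero" is on a relative scale, for example `s$d / s$d[1]` or the relative Frobenius error `rel_err`, since the absolute size of the singular values depends on the scale of the pixel values.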