plot PCA of transpose of matrix #496

vivekbhr · 2017-03-24T11:02:28Z

The plotPCA issue (#477, samples not being projected properly) is fixed in R by running the PCA on the transpose of the matrix (scaling/centering is not required). But matplotlib.mlab.PCA doesn't accept a transposed matrix. We need to fix this.

dpryan79 · 2017-03-28T08:41:50Z

@vivekbhr Do you still have the numpy file for this?

vivekbhr · 2017-03-28T12:00:22Z

@dpryan79 Yes, I can share it with you.

dpryan79 · 2017-03-28T21:12:54Z

It looks like the following works (m is a numpy matrix with nrows > ncols):

U, s, V = np.linalg.svd(m.T, full_matrices=False)
PCs = np.dot(m.T, V.T)

As far as I can tell, that's part of what prcomp() does internally.

dpryan79 · 2017-03-29T06:42:23Z

I'm not sure SVD is really returning equivalent results to what R is doing in this case. I tried the above code on a play dataset and got reasonable results, but I only got nonsense on real data. This may well turn into a "can't implement without rewriting parts of numpy". I'll remove this from the 2.5 milestone, since I don't think it'll happen for that.

fidelram · 2017-03-29T07:12:21Z

For a different PCA implementation we will need to use sklearn or statsmodels (or find out how they do the PCA).

dpryan79 · 2017-03-29T10:22:06Z

I tried sklearn briefly but didn't get much better results.

dpryan79 · 2017-07-11T11:57:08Z

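Here's the Python code that seems to work correctly (m is a matrix with nrows > ncols):

import numpy as np

# Center each row of m, then transpose so that samples are in rows
m2 = (m.T - np.mean(m, axis=1))
U, s, V = np.linalg.svd(m2, full_matrices=False, compute_uv=True)
V = V.T  # columns of V are now the principal axes
PCs = np.dot(m2, V)  # project the centered samples onto those axes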

Each column of PCs is a principal component, with rows as samples. This matches what R does.
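The snippet above can be wrapped into a self-contained function and checked on a small matrix. This is only a sketch: the function name `pca_scores` and the random test matrix are made up for illustration and are not part of deepTools.

```python
import numpy as np


def pca_scores(m):
    """Project samples (columns of m) onto the principal components.

    m is a 2-D array with features in rows and samples in columns
    (nrows > ncols). Hypothetical helper name, for illustration only.
    """
    m = np.asarray(m, dtype=float)
    # Subtract each row's mean, then put samples in rows
    m2 = m.T - np.mean(m, axis=1)
    U, s, Vt = np.linalg.svd(m2, full_matrices=False)
    # Rows of Vt are the principal axes; project the centered samples onto them
    return np.dot(m2, Vt.T), s


rng = np.random.default_rng(0)
m = rng.normal(size=(50, 4))  # 50 rows, 4 samples (made-up data)
PCs, s = pca_scores(m)

print(PCs.shape)  # (4, 4): one row per sample, one column per component
# The projection equals U * s, so column k of PCs has Euclidean norm s[k]
print(np.allclose(np.linalg.norm(PCs, axis=0), s))
```

Because the data are centered before the SVD, each column of `PCs` also averages to zero across samples.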

dpryan79 · 2017-07-21T14:04:38Z

@vivekbhr There's now a betterPCA branch, which adds the --transpose, --ntop, and --PCs options. --transpose will produce the PCA on the transposed matrix. As in R, the projection of the samples on the PCs is then plotted rather than the weights/loadings. --ntop specifies how many of the top N most variable rows to use for the PCA (again, exactly as in R). --PCs specifies which components to plot. The default is 1 2, but you can specify whichever components you want. This should then produce exactly the same output as you'd get with prcomp(foo, scale=T, center=T) in R.
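As a rough illustration of what these options describe, the sketch below keeps the N most variable rows, centers and scales them, and returns the sample projections for the requested components. It is not the code in the betterPCA branch: the function `top_variable_pca` and its parameters `ntop` and `pcs` are hypothetical names that mirror the command-line options, and the data are random.

```python
import numpy as np


def top_variable_pca(m, ntop=1000, pcs=(1, 2)):
    """PCA of samples (columns of m) using the ntop most variable rows.

    Hypothetical helper; pcs holds 1-based component numbers.
    Assumes the rows that are kept have non-zero variance.
    """
    m = np.asarray(m, dtype=float)
    # Rank rows by sample variance (n - 1 denominator) and keep the top ntop
    row_var = m.var(axis=1, ddof=1)
    keep = np.argsort(row_var)[::-1][:ntop]
    # Transpose so that samples are rows and the kept features are columns
    x = m[keep, :].T
    # Center and scale each feature, analogous to center=T, scale=T
    x = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(x, full_matrices=False)
    scores = np.dot(x, Vt.T)
    return scores[:, [p - 1 for p in pcs]]


rng = np.random.default_rng(1)
m = rng.normal(size=(200, 6))  # 200 rows, 6 samples (made-up data)
xy = top_variable_pca(m, ntop=50, pcs=(1, 2))
print(xy.shape)  # (6, 2): x/y coordinates for each of the 6 samples
```

Passing `pcs=(2, 3)` would return the second and third components instead, which is the kind of choice the --PCs option exposes.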

dpryan79 · 2017-07-24T14:06:10Z

I seem to now be getting the same results as prcomp, which makes me happy. Even the scaling that matplotlib was doing was suboptimal (it wasn't using Bessel's correction). This is all implemented in the develop branch now and will be included in the 2.6 release.
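For reference, Bessel's correction means dividing by n - 1 rather than n when estimating a variance from a sample, which is what R's sd() and var() do. In NumPy the divisor is controlled by the `ddof` argument. A minimal sketch with made-up numbers:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

sd_population = np.std(x)      # divides by n (ddof=0, NumPy's default)
sd_sample = np.std(x, ddof=1)  # divides by n - 1 (Bessel's correction)

print(sd_population)  # 2.0
print(sd_sample)      # slightly larger, by a factor of sqrt(n / (n - 1))
```

The two estimates differ only by a constant factor, so scaling with one or the other changes the magnitude of the scaled values rather than their ordering, but the numbers will not match prcomp unless the n - 1 form is used.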

vivekbhr added this to the 2.5.0 milestone Mar 24, 2017

dpryan79 removed this from the 2.5.0 milestone Mar 29, 2017

dpryan79 added the enhancement label Mar 29, 2017

dpryan79 added this to the 2.6.0 milestone May 4, 2017

dpryan79 added a commit that referenced this issue Jul 21, 2017

implement #496

961ae5b

dpryan79 self-assigned this Jul 21, 2017

dpryan79 closed this as completed Jul 24, 2017
