<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407</link>
    <title>MATLAB Central Newsreader - SVD with Missing Values</title>
    <description>Feed for thread: SVD with Missing Values</description>
    <language>en-us</language>
    <copyright>&#169; 1994-2012 by MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>MathWorks</title>
      <url>http://www.mathworks.com/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Thu, 04 Dec 2008 04:06:04 -0500</pubDate>
      <title>SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#614949</link>
      <author>Samuel </author>
      <description>In MATLAB, what is the best way to handle a singular value decomposition where k is much less than m or n (MxN matrix) for a data set with many missing values, such that the missing values have a minimal effect on the decomposition? Thanks.</description>
    </item>
    <item>
      <pubDate>Thu, 04 Dec 2008 11:12:26 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615011</link>
      <author>BHUPALA</author>
      <description>On Dec 4, 9:06 am, &quot;Samuel &quot; &amp;lt;sdods...@jhu.edu&amp;gt; wrote:&lt;br&gt;
&amp;gt; In MATLAB, what is the best way to handle a singular value decomposition where k is much less than m or n (MxN matrix) for a data set with many missing values, such that the missing values have a minimal effect on the decomposition? Thanks.&lt;br&gt;
&lt;br&gt;
Try using svd(X,0) or svd(X,'econ'), which will give you the economy-size&lt;br&gt;
decomposition.&lt;br&gt;
&lt;br&gt;
bhupala</description>
    </item>
    <item>
      <pubDate>Fri, 05 Dec 2008 01:39:02 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615166</link>
      <author>Samuel </author>
      <description>BHUPALA &amp;lt;bhupala@gmail.com&amp;gt; wrote in message &amp;lt;07963786-05dd-4dde-8381-fb56604d2ea8@k36g2000pri.googlegroups.com&amp;gt;...&lt;br&gt;
&amp;gt; Try using svd(X,0) or svd(X,'econ'), which will give you the economy-size&lt;br&gt;
&amp;gt; decomposition.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; bhupala&lt;br&gt;
&lt;br&gt;
For reference, the matrix is 241x241, so an economy SVD won't do the trick. I'm looking for an alternative to mean imputation for the missing values. Is there any method that lets me treat the missing values as true unknowns (NaN values)?</description>
    </item>
    <item>
      <pubDate>Fri, 05 Dec 2008 14:20:20 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615244</link>
      <author>Peter Perkins</author>
      <description>Samuel wrote:&lt;br&gt;
&amp;gt; In MATLAB, what is the best way to handle a singular value decomposition where k is much less than m or n (MxN matrix) for a data set with many missing values, such that the missing values have a minimal effect on the decomposition? Thanks.&lt;br&gt;
&lt;br&gt;
Samuel, SVD is an algorithm in computational linear algebra.  Many statistical models/methods use SVD as a computational tool, but it is not a statistical model per se.  You're asking about missing data, which is a statistical issue.  It's impossible to give advice about statistical issues without knowing what you're really doing, statistically.  It may be Principal Components Analysis, it may be something else entirely.</description>
    </item>
    <item>
      <pubDate>Fri, 05 Dec 2008 16:23:02 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615288</link>
      <author>Samuel </author>
      <description>Peter Perkins &amp;lt;Peter.PerkinsRemoveThis@mathworks.com&amp;gt; wrote in message &amp;lt;ghbdb4$2sf$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; Samuel, SVD is an algorithm in computational linear algebra.  Many statistical models/methods use SVD as a computational tool, but it is not a statistical model per se.  You're asking about missing data, which is a statistical issue.  It's impossible to give advice about statistical issues without knowing what you're really doing, statistically.  It may be Principal Components Analysis, it may be something else entirely.&lt;br&gt;
&lt;br&gt;
Thanks Peter for the response. What I am doing is most analogous to the Netflix Competition, albeit with a much smaller, less sparse matrix. I have users and their ratings of products, with missing values for the products they have not rated. I am then trying to predict the missing ratings using a thin SVD approach. If I use a thin SVD (svds) in MATLAB, I have to use a numerical imputation for the missing values, which I think will disrupt the results when I multiply U*S*V' to generate the predictions (if I am understanding that correctly). I would really appreciate any thoughts on how to handle the missing values better for prediction purposes, or on whether I am missing the point altogether. Thanks.</description>
    </item>
    <item>
      <pubDate>Fri, 05 Dec 2008 16:37:02 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615290</link>
      <author>Johan Carlson</author>
      <description>&quot;Samuel &quot; &amp;lt;sdodson2@jhu.edu&amp;gt; wrote in message &amp;lt;ghbkh6$ski$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; Thanks Peter for the response. What I am doing is most analogous to the Netflix Competition, albeit with a much smaller, less sparse matrix. I have users and their ratings of products, with missing values for the products they have not rated. I am then trying to predict the missing ratings using a thin SVD approach. If I use a thin SVD (svds) in MATLAB, I have to use a numerical imputation for the missing values, which I think will disrupt the results when I multiply U*S*V' to generate the predictions (if I am understanding that correctly). I would really appreciate any thoughts on how to handle the missing values better for prediction purposes, or on whether I am missing the point altogether. Thanks.&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
Well, computing the SVD without replacing the missing values is, as far as I know, not possible, since it is really a matrix factorization.&lt;br&gt;
&lt;br&gt;
So the question then becomes: what should you replace the missing values with? The answer is not that easy, and it is indeed more of a statistical question than a numerical one. Some statistics packages replace missing data with the column means, which may or may not be a good idea. Another approach would be to use the existing data to predict the missing values, using for example cross-validation. This is sometimes done in Principal Component Analysis (which, numerically speaking, is really nothing but an SVD of the mean-centered data). Some variants of the NIPALS algorithm can handle this. I suggest you do a literature search on NIPALS and &quot;missing data&quot; and see what you come up with. Please let us know how it works out!&lt;br&gt;
&lt;br&gt;
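As a rough illustration of the NIPALS idea with missing data, here is a minimal one-component sketch. It is written in plain Python only to keep it self-contained; it is not MATLAB code and not a library routine. The function name nipals_rank1, the use of None to mark missing entries, and the toy matrix are all assumptions made for this illustration. Each update sums over the observed entries only, so nothing has to be imputed beforehand.&lt;br&gt;

```python
import math

def nipals_rank1(X, iters=100):
    """Fit one NIPALS-style component to X, skipping entries marked None.

    Returns (t, p): row scores and unit-length column loadings, so that
    t[i] * p[j] approximates X[i][j], including at the missing positions.
    Assumes every row and every column has at least one observed entry.
    """
    m, n = len(X), len(X[0])
    t = [1.0] * m  # arbitrary nonzero starting scores
    p = [0.0] * n
    for _ in range(iters):
        # Loadings: least-squares fit of each column on t, observed rows only.
        for j in range(n):
            rows = [i for i in range(m) if X[i][j] is not None]
            p[j] = sum(X[i][j] * t[i] for i in rows) / sum(t[i] ** 2 for i in rows)
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        # Scores: least-squares fit of each row on p, observed columns only.
        for i in range(m):
            cols = [j for j in range(n) if X[i][j] is not None]
            t[i] = sum(X[i][j] * p[j] for j in cols) / sum(p[j] ** 2 for j in cols)
    return t, p

# Toy data (hypothetical): an exactly rank-1 matrix with two entries removed.
X = [[1.0, 2.0, None], [2.0, 4.0, 8.0], [None, 6.0, 12.0]]
t, p = nipals_rank1(X)
```

On this toy matrix the products t[0] * p[2] and t[2] * p[0] serve as predictions for the two missing entries. Real ratings data will not be exactly rank 1, so treat this as a sketch of the mechanics only.&lt;br&gt;
&lt;br&gt;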
/JC</description>
    </item>
    <item>
      <pubDate>Fri, 05 Dec 2008 17:37:02 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615306</link>
      <author>Samuel </author>
      <description>&quot;Johan Carlson&quot; &amp;lt;Johan.E.Carlson@gmail.com&amp;gt; wrote in message &amp;lt;ghblbd$ahl$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; Well, computing the SVD without replacing the missing values is, as far as I know, not possible, since it is really a matrix factorization.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; So the question then becomes: what should you replace the missing values with? The answer is not that easy, and it is indeed more of a statistical question than a numerical one. Some statistics packages replace missing data with the column means, which may or may not be a good idea. Another approach would be to use the existing data to predict the missing values, using for example cross-validation. This is sometimes done in Principal Component Analysis (which, numerically speaking, is really nothing but an SVD of the mean-centered data). Some variants of the NIPALS algorithm can handle this. I suggest you do a literature search on NIPALS and &quot;missing data&quot; and see what you come up with. Please let us know how it works out!&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; /JC&lt;br&gt;
&lt;br&gt;
Thanks for your thoughts; they've been really helpful. I read a bit about NIPALS, and it led me to think that nearest-neighbor interpolation might be helpful for replacing the missing values. I'm not sure yet, though, as I'm very unfamiliar with the concept. I was also under the impression that some form of the Lanczos method could handle missing values, but I'm very uncertain how to implement that, or whether I should.</description>
    </item>
    <item>
      <pubDate>Fri, 05 Dec 2008 18:00:20 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615312</link>
      <author>Bruno Luong</author>
      <description>&quot;Samuel &quot; &amp;lt;sdodson2@jhu.edu&amp;gt; wrote in message &amp;lt;ghboru$60u$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&lt;br&gt;
&amp;gt; I was also under the impression that some form of the Lanczos method could handle missing values, but I'm very uncertain how to implement that, or whether I should.&lt;br&gt;
&lt;br&gt;
No, Lanczos is simply a specific algorithm for computing the eigenspaces of a symmetric matrix, and it can be used to compute the SVD (because the SVD of M is closely related to the eigenspaces of M'*M and M*M'). No more, no less. It is more suitable for sparse matrices.&lt;br&gt;
&lt;br&gt;
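The link between the SVD of M and the eigenvalues of M'*M can be checked on a small case. The sketch below is plain Python and uses ordinary power iteration, deliberately not Lanczos, which is more involved. The function name top_singular_value and the 2x2 test matrix are assumptions made for this illustration.&lt;br&gt;

```python
import math

def top_singular_value(M, iters=100):
    """Estimate the largest singular value of M from the top eigenvalue of M'*M."""
    n = len(M[0])
    # A = M' * M is symmetric; its eigenvalues are the squared singular values of M.
    A = [[sum(row[i] * row[j] for row in M) for j in range(n)] for i in range(n)]
    v = [1.0] * n  # starting vector, assumed not orthogonal to the top eigenvector
    for _ in range(iters):
        # One power-iteration step: multiply by A, then renormalize.
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' * A * v is the eigenvalue estimate for unit-length v.
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(v[i] * Av[i] for i in range(n))
    return math.sqrt(lam)

# Hypothetical test matrix: M'*M = [[25, 20], [20, 25]], with eigenvalues 45 and 5.
M = [[3.0, 0.0], [4.0, 5.0]]
sigma = top_singular_value(M)
```

For this M the estimate should agree with sqrt(45), the square root of the largest eigenvalue of M'*M. Nothing in this computation knows about missing entries, which is the point being made here.&lt;br&gt;
&lt;br&gt;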
As Peter said, those are linear algebra *tools*.&lt;br&gt;
&lt;br&gt;
Bruno </description>
    </item>
    <item>
      <pubDate>Fri, 05 Dec 2008 18:46:02 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615330</link>
      <author>Johan Carlson</author>
      <description>&quot;Bruno Luong&quot; &amp;lt;b.luong@fogale.findmycountry&amp;gt; wrote in message &amp;lt;ghbq7k$o5r$1@fred.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; No, Lanczos is simply a specific algorithm for computing the eigenspaces of a symmetric matrix, and it can be used to compute the SVD (because the SVD of M is closely related to the eigenspaces of M'*M and M*M'). No more, no less. It is more suitable for sparse matrices.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; As Peter said, those are linear algebra *tools*.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Bruno &lt;br&gt;
&lt;br&gt;
Depending on what type of problem you're dealing with, cross-validation might also be an option, for instance if you're trying to find A in a model like&lt;br&gt;
AX = Y&lt;br&gt;
where you're missing data in either X or Y. If you remove the rows and columns of X or Y where data is missing, you can try to predict the missing values using what's left of your system. For prediction you could use either a principal components approach (pseudo-inverse) or PLS regression (which is somewhat messier, but predicts better when some parts of X and Y are correlated and other parts of X or Y aren't).&lt;br&gt;
&lt;br&gt;
/JC</description>
    </item>
    <item>
      <pubDate>Tue, 09 Dec 2008 15:19:21 -0500</pubDate>
      <title>Re: SVD with Missing Values</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/240407#615932</link>
      <author>Peter Perkins</author>
      <description>Samuel wrote:&lt;br&gt;
&lt;br&gt;
&amp;gt; Thanks Peter for the response. What I am doing is most analogous to the Netflix Competition, albeit with a much smaller, less sparse matrix. I have users and their ratings of products, with missing values for the products they have not rated. I am then trying to predict the missing ratings using a thin SVD approach. If I use a thin SVD (svds) in MATLAB, I have to use a numerical imputation for the missing values, which I think will disrupt the results when I multiply U*S*V' to generate the predictions (if I am understanding that correctly). I would really appreciate any thoughts on how to handle the missing values better for prediction purposes, or on whether I am missing the point altogether. Thanks.&lt;br&gt;
&lt;br&gt;
Samuel, I am not even remotely up on this subject, and may be completely misunderstanding what you've said.  But it seems to me that what you describe is predicting the missing values in a matrix using an SVD of that same matrix after its missing values have somehow been filled in.  And if it's like the Netflix case, most entries in that matrix are missing.  This would indeed seem to depend crucially on the way you fill in those values.&lt;br&gt;
&lt;br&gt;
I suspect (do a Google search on &quot;missing svd netflix&quot;) that rather than thinking in terms of a single imputation and prediction, people use something like the EM algorithm to do what you describe iteratively until convergence.  Presumably the big question would be &quot;What's the M step?&quot;.  I don't know enough about this area to help.</description>
    </item>
  </channel>
</rss>

