Thread Subject:
similarity between two vector with different sizes

Subject: similarity between two vector with different sizes

From: sara

Date: 16 Nov, 2012 02:47:06

Message: 1 of 8

Hi,

I converted the text to frequency vectors. The problem is that the vectors all have different sizes. Could you help me calculate the similarity between two vectors of different sizes? I need it to cluster my data.
Thanks

Subject: similarity between two vector with different sizes

From: Nasser M. Abbasi

Date: 16 Nov, 2012 03:01:32

Message: 2 of 8

On 11/15/2012 8:47 PM, sara wrote:
> Hi,
>
> I converted the text to frequency vectors. The thing is all the vectors have different sizes.
>Could you help me how I can calculate the similarity between two vector with different sizes.
>I need them to cluster my data.
> Thanks
>

It is not clear what you mean by 'similarity'. In which sense?

Norm? Correlation? Which measure of similarity do you want?

--Nasser

Subject: similarity between two vector with different sizes

From: sara

Date: 16 Nov, 2012 03:13:16

Message: 3 of 8

I mean distance or cosine similarity between vectors. I want to use k-means for clustering.
Sara


"Nasser M. Abbasi" <nma@12000.org> wrote in message <k84aaj$ikb$1@speranza.aioe.org>...
> On 11/15/2012 8:47 PM, sara wrote:
> > Hi,
> >
> > I converted the text to frequency vectors. The thing is all the vectors have different sizes.
> >Could you help me how I can calculate the similarity between two vector with different sizes.
> >I need them to cluster my data.
> > Thanks
> >
>
> not clear what you mean by `similarity'. In which sense?
>
> norm? correlation? What is the measure of similarity you want?
>
> --Nasser

Subject: similarity between two vector with different sizes

From: Nasser M. Abbasi

Date: 16 Nov, 2012 04:04:10

Message: 4 of 8

On 11/15/2012 9:13 PM, sara wrote:
> I mean distance or cosine similarity between vectors. I want to use k-means for clustering.
> Sara
>

Well, cosine similarity is defined here

http://en.wikipedia.org/wiki/Cosine_similarity

So all you have to do is type the MATLAB code for it? Maybe:

           dot(A,B)/(norm(A)*norm(B))

Subject: similarity between two vector with different sizes

From: sara

Date: 16 Nov, 2012 04:41:11

Message: 5 of 8

I know the definition. My problem is that dot(A,B) works for vectors of the same size, but my vectors have different sizes. (I converted documents to vectors in order to cluster them.)

> Well, cosine similarity is defined here
>
> http://en.wikipedia.org/wiki/Cosine_similarity
>
> So, all what you have to do is type the Matlab code for it?
> may be
>
> dot(A,B)/(norm(A)*norm(B))

Subject: similarity between two vector with different sizes

From: Nasser M. Abbasi

Date: 16 Nov, 2012 06:07:56

Message: 6 of 8

On 11/15/2012 10:41 PM, sara wrote:
> I know the definition. My problem is the dot(A,B) works for same size vectors
>but my vectors have different sizes. (I converted documents to vectors to cluster them)
>

If a vector V1 spans 2D only and the other vector V2 spans 3D, this
means V1 has a zero in that third component.

So all you have to do is make them all live in the same
space, which means giving them the same length. Pad the shorter
vectors with zeros so they are all the same size, then use the definition.
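
A minimal sketch of the zero-padding idea, under stated assumptions: the vectors A and B below are made-up examples (they are not from this thread), and padding with trailing zeros is only meaningful if component i refers to the same word in every vector.

```matlab
% Hypothetical example vectors of different lengths (not from the thread).
A = [1 2 0 3];
B = [2 1];

% Pad the shorter vector with trailing zeros so both have length n.
n  = max(numel(A), numel(B));
Ap = [A(:).' zeros(1, n - numel(A))];
Bp = [B(:).' zeros(1, n - numel(B))];

% Cosine similarity, using the same formula as above.
cosSim = dot(Ap, Bp) / (norm(Ap) * norm(Bp));
```

If either vector is all zeros, the denominator is zero and the result is NaN, so such vectors would need to be handled separately.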


>> Well, cosine similarity is defined here
>>
>> http://en.wikipedia.org/wiki/Cosine_similarity
>>
>> So, all what you have to do is type the Matlab code for it?
>> may be
>>
>> dot(A,B)/(norm(A)*norm(B))

Subject: similarity between two vector with different sizes

From: Torsten

Date: 16 Nov, 2012 07:23:13

Message: 7 of 8

"sara" wrote in message <k849fa$igh$1@newscl01ah.mathworks.com>...
> Hi,
>
> I converted the text to frequency vectors. The thing is all the vectors have different sizes. Could you help me how I can calculate the similarity between two vector with different sizes. I need them to cluster my data.
> Thanks

What do you mean by
"I converted the text to frequency vectors"?
What does a single vector component represent?
Why do the resulting vectors have different sizes?

Best wishes
Torsten.

Subject: similarity between two vector with different sizes

From: sara

Date: 16 Nov, 2012 15:04:19

Message: 8 of 8

Thanks for your answers.
Nasser: I did that and padded the vectors with zeros so that they all have the same length. But when I use Gaussian mixtures for clustering, it does not work, because I have many zeros in my matrix and all of my parameters come out as NaN.

Torsten:
I converted the text to term frequency vectors. A vector like [1 1 2 1 3 2 4 1 5 1] shows the words and their frequencies: the 1st word has frequency 1, the 2nd word has frequency 1, and so on.
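
One possible reading of the vector above is that it alternates word index and count, i.e. [index count index count ...]. That reading is an assumption, not something the post confirms. Under it, a rough sketch of turning such a vector into a fixed-length frequency vector over a shared vocabulary could look like this (vocabSize is a hypothetical value chosen for illustration):

```matlab
% Vector from the post, read here as [index count index count ...] (assumption).
v = [1 1 2 1 3 2 4 1 5 1];
vocabSize = 6;   % hypothetical number of distinct words across all documents

idx = v(1:2:end);   % word indices
cnt = v(2:2:end);   % corresponding frequencies

% accumarray sums the counts per index; words that never occur get zero.
tf = accumarray(idx(:), cnt(:), [vocabSize 1]).';
```

With every document mapped onto the same vocabulary this way, all vectors have length vocabSize and component i always refers to the same word, so dot(A,B)/(norm(A)*norm(B)) can be applied directly.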
