From: mprocopio@gmail.com
Newsgroups: comp.soft-sys.matlab
Subject: MATLAB Reshape Challenge: N x d matrix to N/K x d x K matrix
Date: Wed, 12 Nov 2008 16:33:39 -0800 (PST)
Message-ID: <f6776e56-5088-4685-85ee-fa28328610e6@v4g2000yqa.googlegroups.com>


Hi folks,

I'm working on parallelizing some machine learning code in MATLAB. I'm
using the Parallel Computing Toolbox and the parfor construct;
therefore, I have certain restrictions on how I must "slice" into
certain data structures accessed within the parfor (parallel for)
loop.

Basically, I need to reshape my training data from an (N x d) two-
dimensional matrix to a fully "slicable" (N/K x d x K) array, where K
is the number of available parallel threads (which corresponds to the
number of "slicable" input data partitions). Here, I would slice along
the third dimension, i.e., partitioned_data(:,:,parallel_loop_index_i).
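
To make the slicing requirement concrete, here is a minimal sketch of
the access pattern I have in mind. The sizes, the placeholder data, and
the names partition_sums and block are made up for illustration; the
per-partition work is just a sum standing in for the real computation.

```matlab
% Hypothetical sizes and placeholder data, for illustration only.
N = 8; d = 3; K = 4;
partitioned_data = reshape(1:N*d, N/K, d, K);  % stand-in for the reshaped data
partition_sums = zeros(1, K);                  % one result per partition

% parfor comes from the Parallel Computing Toolbox.
parfor i = 1:K
    % Sliced access: the loop variable indexes exactly one dimension,
    % and every other dimension is taken whole with a colon.
    block = partitioned_data(:, :, i);
    partition_sums(i) = sum(block(:));         % placeholder for the real work
end
```

With the data laid out this way, each iteration only needs its own
(N/K x d) block rather than the full array.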

(For now, assume N is a multiple of K.)

I'm usually very good with reshape, permute, shiftdim, etc., but I am
having trouble making this work nicely.

I could hack in something like slicable_data_partitions = cat(3,
manual_partition_1, manual_partition_2, ..., manual_partition_K) but
that's horrible and I was hoping for an efficient one-liner with some
combination of reshape, permute, etc. Performance is important, as
this will be fairly large scale, i.e., numel(data) on the order of
10^7.
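
For reference, the cat-based hack can at least be written without
typing out K partitions by hand. This is only a sketch of that
fallback, using the small 8 x 3 example from further down; parts and
rows_per_part are names I made up here.

```matlab
input_data = reshape(1:24, 3, 8).';    % 8 x 3 example matrix
K = 4;                                 % number of partitions
N = size(input_data, 1);
rows_per_part = N / K;                 % assumes N is a multiple of K

% Collect each block of consecutive rows, then stack along dimension 3.
parts = cell(1, K);
for k = 1:K
    rows = (k - 1) * rows_per_part + (1:rows_per_part);
    parts{k} = input_data(rows, :);
end
slicable_data_partitions = cat(3, parts{:});
```

This gives the layout I want, but it loops over partitions and copies
every block, which is what I would like to avoid.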

Note that I can actually get reshape to return output with the desired
dimensions; however, the d data elements along the second dimension in
each of the N rows are no longer in the correct order.
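
Here is a small sketch of that failed attempt, using the 8 x 3 example
matrix from the detailed example. The variable name naive is just for
illustration. Because reshape walks the data in column-major order, it
fills each slice by running down the columns of the input, so the rows
get interleaved.

```matlab
input_data = reshape(1:24, 3, 8).';    % 8 x 3 example matrix
[N, d] = size(input_data);
K = 4;

% Right size (N/K x d x K), wrong contents: elements are taken in
% column-major order, i.e. 1, 4, 7, 10, ... rather than 1, 2, 3, ...
naive = reshape(input_data, N/K, d, K);

disp(size(naive))          % 2 3 4
disp(naive(:, :, 1))       % [1 7 13; 4 10 16] instead of [1 2 3; 4 5 6]
```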

Detailed example of desired output:

[N x d, 8 x 3]:

input_data = reshape(1:24, 3, 8)'   % Note transpose

input_data =

     1     2     3
     4     5     6
     7     8     9
    10    11    12
    13    14    15
    16    17    18
    19    20    21
    22    23    24


For K = 4, [N/K x d x K, 2 x 3 x 4]:

desired_transformed_data(:,:,1) =

     1     2     3
     4     5     6

desired_transformed_data(:,:,2) =

     7     8     9
    10    11    12

desired_transformed_data(:,:,3) =

    13    14    15
    16    17    18

desired_transformed_data(:,:,4) =

    19    20    21
    22    23    24
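
One candidate that appears to give this layout, offered as a sketch to
check rather than a settled answer: transpose first so that the d
elements of each row are contiguous in memory, reshape to
(d x N/K x K), then permute the first two dimensions back. The names
candidate and expected are illustrative, and the transpose makes a
full copy of the data, which may matter at 10^7 elements.

```matlab
input_data = reshape(1:24, 3, 8).';    % 8 x 3 example matrix
[N, d] = size(input_data);
K = 4;                                 % assumes N is a multiple of K

% Transpose -> (d x N), so each original row is one contiguous column.
% Reshape   -> (d x N/K x K), grouping N/K consecutive rows per slice.
% Permute   -> (N/K x d x K), restoring row/column orientation.
candidate = permute(reshape(input_data.', d, N/K, K), [2 1 3]);

% Check against the manual cat(3, ...) construction for this example.
expected = cat(3, input_data(1:2, :), input_data(3:4, :), ...
                  input_data(5:6, :), input_data(7:8, :));
assert(isequal(candidate, expected));
```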


Any help would be greatly appreciated.

On a final note, from a memory performance standpoint: is there any
performance difference depending on the dimension along which the data
is sliced? I.e., is slicing [N/K x d x K] along the third dimension
any faster or slower than slicing [K x d x N/K] along the first
dimension?
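
To illustrate what I mean: MATLAB stores arrays in column-major order,
so the two layouts touch memory differently. The sketch below fills
each array with its own linear indices (the names idx3, idx1 and the
two slice variables are made up) and shows which memory positions a
single slice covers. Whether this makes a measurable difference inside
parfor is something I have not timed.

```matlab
N = 8; d = 3; K = 4;

% Layout 1: [N/K x d x K], sliced along the third dimension.
idx3 = reshape(1:N*d, N/K, d, K);      % each element holds its linear index
third_dim_slice = idx3(:, :, 2);

% Layout 2: [K x d x N/K], sliced along the first dimension.
idx1 = reshape(1:N*d, K, d, N/K);
first_dim_slice = idx1(2, :, :);

disp(third_dim_slice(:).')   % 7 8 9 10 11 12   (one contiguous run)
disp(first_dim_slice(:).')   % 2 6 10 14 18 22  (every K-th element)
```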


With thanks,

--Mike