How to process as much data as possible
Hello,
I previously had a binary sequence, and my goal was to create substrings of various lengths from it, e.g. length 4:
Sequence
1(1), 0(2), 1(3), 1(4), 0(5), 0(6), 1(7), 0(8), 0(9), 1(10), 1(11), 1(12),
1(13), 0(14), 0(15), 0(16), 1(17), 1(18), 1(19), 0(20)
Substrings
01: 1(01), 0(02), 1(03), 1(04) -> [1,0,1,1],
02: 1(01), 1(03), 0(05), 1(07) -> [1,1,0,1],
03: 1(01), 1(04), 1(07), 1(10) -> [1,1,1,1],
04: 1(01), 0(05), 0(09), 1(13) -> [1,0,0,1],
05: 1(01), 0(06), 1(11), 0(16) -> [1,0,1,0],
06: 1(01), 1(07), 1(13), 1(19) -> [1,1,1,1],
07: 0(02), 1(03), 1(04), 0(05) -> [0,1,1,0],
08: 0(02), 1(04), 0(06), 0(08) -> [0,1,0,0],
09: 0(02), 0(05), 0(08), 1(11) -> [0,0,0,1],
10: 0(02), 0(06), 1(10), 0(14) -> [0,0,1,0],
11: 0(02), 1(07), 1(12), 1(17) -> [0,1,1,1],
12: 0(02), 0(08), 0(14), 0(20) -> [0,0,0,0],
13: 1(03), 1(04), 0(05), 0(06) -> [1,1,0,0],
14: 1(03), 0(05), 1(07), 0(09) -> [1,0,1,0],
15: 1(03), 0(06), 0(09), 1(12) -> [1,0,0,1],
16: 1(03), 1(07), 1(11), 0(15) -> [1,1,1,0],
17: 1(03), 0(08), 1(13), 1(18) -> [1,0,1,1],
18: 1(04), 0(05), 0(06), 1(07) -> [1,0,0,1],
19: 1(04), 0(06), 0(08), 1(10) -> [1,0,0,1],
20: 1(04), 1(07), 1(10), 1(13) -> [1,1,1,1],
21: 1(04), 0(08), 1(12), 0(16) -> [1,0,1,0],
22: 1(04), 0(09), 0(14), 1(19) -> [1,0,0,1],
23: 0(05), 0(06), 1(07), 0(08) -> [0,0,1,0],
24: 0(05), 1(07), 0(09), 1(11) -> [0,1,0,1],
25: 0(05), 0(08), 1(11), 0(14) -> [0,0,1,0],
26: 0(05), 0(09), 1(13), 1(17) -> [0,0,1,1],
27: 0(05), 1(10), 0(15), 0(20) -> [0,1,0,0],
28: 0(06), 1(07), 0(08), 0(09) -> [0,1,0,0],
29: 0(06), 0(08), 1(10), 1(12) -> [0,0,1,1],
30: 0(06), 0(09), 1(12), 0(15) -> [0,0,1,0],
31: 0(06), 1(10), 0(14), 1(18) -> [0,1,0,1],
32: 1(07), 0(08), 0(09), 1(10) -> [1,0,0,1],
33: 1(07), 0(09), 1(11), 1(13) -> [1,0,1,1],
34: 1(07), 1(10), 1(13), 0(16) -> [1,1,1,0],
35: 1(07), 1(11), 0(15), 1(19) -> [1,1,0,1],
36: 0(08), 0(09), 1(10), 1(11) -> [0,0,1,1],
37: 0(08), 1(10), 1(12), 0(14) -> [0,1,1,0],
38: 0(08), 1(11), 0(14), 1(17) -> [0,1,0,1],
39: 0(08), 1(12), 0(16), 0(20) -> [0,1,0,0],
40: 0(09), 1(10), 1(11), 1(12) -> [0,1,1,1],
41: 0(09), 1(11), 1(13), 0(15) -> [0,1,1,0],
42: 0(09), 1(12), 0(15), 1(18) -> [0,1,0,1],
43: 1(10), 1(11), 1(12), 1(13) -> [1,1,1,1],
44: 1(10), 1(12), 0(14), 0(16) -> [1,1,0,0],
45: 1(10), 1(13), 0(16), 1(19) -> [1,1,0,1],
46: 1(11), 1(12), 1(13), 0(14) -> [1,1,1,0],
47: 1(11), 1(13), 0(15), 1(17) -> [1,1,0,1],
48: 1(11), 0(14), 1(17), 0(20) -> [1,0,1,0],
49: 1(12), 1(13), 0(14), 0(15) -> [1,1,0,0],
50: 1(12), 0(14), 0(16), 1(18) -> [1,0,0,1],
51: 1(13), 0(14), 0(15), 0(16) -> [1,0,0,0],
52: 1(13), 0(15), 1(17), 1(19) -> [1,0,1,1],
53: 0(14), 0(15), 0(16), 1(17) -> [0,0,0,1],
54: 0(14), 0(16), 1(18), 0(20) -> [0,0,1,0],
55: 0(15), 0(16), 1(17), 1(18) -> [0,0,1,1],
56: 0(16), 1(17), 1(18), 1(19) -> [0,1,1,1],
57: 1(17), 1(18), 1(19), 0(20) -> [1,1,1,0],
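using the following code:
% s is the binary sequence; here, the 20-bit example above
s = [1 0 1 1 0 0 1 0 0 1 1 1 1 0 0 0 1 1 1 0];
N = 20; % sequence length
n = 4; % substring length
A = hankel(1:N-n+1,N-n+1:N); % row jj holds the indices jj:jj+n-1
k = 0:n-1;
c = ceil((N - A(:,end) + 1)/k(end)); % number of strides that fit from each start
i2 = cumsum(c);
i1 = i2 - c + 1;
idx = zeros(i2(end),n); % one row of indices per substring
for jj = 1:N-n+1
    idx(i1(jj):i2(jj),:) = bsxfun(@plus,A(jj,:),(0:c(jj)-1)'*k);
end
[j1,~,j2] = unique(s(idx),'rows'); % j1: distinct patterns, j2: pattern number per row
out = [j1, histc(j2,1:max(j2))/i2(end)]; % This row corrected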
and in the end I get the number of times each pattern occurs and its relative frequency:
Pattern    Count     Relative frequency
0 0 0 0    161697    0.0606515378844711
0 0 0 1    163593    0.0613627156789197
0 0 1 0    164201    0.0615907726931733
0 0 1 1    166680    0.0625206301575394
0 1 0 0    164105    0.0615547636909227
0 1 0 1    166501    0.0624534883720930
0 1 1 0    167099    0.0626777944486122
0 1 1 1    168835    0.0633289572393098
1 0 0 0    164086    0.0615476369092273
1 0 0 1    166963    0.0626267816954239
1 0 1 0    166931    0.0626147786946737
1 0 1 1    169470    0.0635671417854464
1 1 0 0    166622    0.0624988747186797
1 1 0 1    169326    0.0635131282820705
1 1 1 0    169251    0.0634849962490623
1 1 1 1    170640    0.0640060015003751
The problem is that with this approach I can only process about 4000 data points, and I need to process many more. I have 4 GB of RAM and MATLAB 2012. My idea is this: assign each pattern an integer:
0 0 0 0 -> 1
0 0 0 1 -> 2
0 0 1 0 -> 3
0 0 1 1 -> 4
0 1 0 0 -> 5
0 1 0 1 -> 6
0 1 1 0 -> 7
0 1 1 1 -> 8
1 0 0 0 -> 9
1 0 0 1 -> 10
1 0 1 0 -> 11
1 0 1 1 -> 12
1 1 0 0 -> 13
1 1 0 1 -> 14
1 1 1 0 -> 15
1 1 1 1 -> 16
and keep a counter for each integer that records how many times its pattern occurs. That way I might be able to process much more data. Thank you very much.
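One way the counter idea might look, as a minimal sketch rather than a tested solution for the full data set: instead of storing one row of indices per substring, loop over the stride d, turn every substring for that stride into its integer code, and add the tallies to a vector of 2^n counters. The names w, d, last, code and relfreq are illustrative and not from the original code; s is the 20-bit example sequence from the question.

```matlab
% Sketch: count length-n patterns over all strides without building idx.
s = [1 0 1 1 0 0 1 0 0 1 1 1 1 0 0 0 1 1 1 0];  % example sequence from the question
N = numel(s);
n = 4;                          % substring length
w = 2.^(n-1:-1:0);              % bit weights, [8 4 2 1] for n = 4
counts = zeros(2^n, 1);         % one counter per pattern

for d = 1:floor((N-1)/(n-1))    % stride between sampled positions
    last = N - (n-1)*d;         % last starting position that still fits
    code = ones(1, last);       % 1-based pattern number for each start
    for t = 1:n
        code = code + w(t) * s((1:last) + (t-1)*d);
    end
    counts = counts + accumarray(code(:), 1, [2^n 1]);
end

relfreq = counts / sum(counts); % relative frequency of each pattern
```

For the example sequence this visits the same 57 substrings as the listing above, but the largest temporary array has N elements instead of one row per substring, so memory grows linearly with the sequence length.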
Answers (1)
Walter Roberson
on 25 Oct 2013
If you are going to do that, consider using accumarray() to do the additions.
If B is the array of bits, such as
B = [0 0 0 0; 1 0 0 0; 0 1 0 0; 1 0 0 0]
then
counts = accumarray( B(:,1) * 8 + B(:,2) * 4 + B(:,3) * 2 + B(:,4) * 1 + 1, 1 );
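As a small illustration of that call on the example B, the sketch below first computes the subscripts and then tallies them. The third argument, [16 1], is an addition that is not in the call above: it fixes the output length at 16, since otherwise accumarray returns a vector only as long as the largest subscript that occurs (9 for this B). The name subs is illustrative.

```matlab
B = [0 0 0 0; 1 0 0 0; 0 1 0 0; 1 0 0 0];

% Convert each row of bits to a 1-based pattern number.
subs = B(:,1)*8 + B(:,2)*4 + B(:,3)*2 + B(:,4)*1 + 1;   % [1; 9; 5; 9]

% Tally the pattern numbers; [16 1] forces one counter per possible pattern.
counts = accumarray(subs, 1, [16 1]);
```

Here counts(9) is 2 because the pattern [1 0 0 0] occurs twice, and counts(1) and counts(5) are each 1.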
16 Comments
FRANCISCO
on 25 Oct 2013
Walter Roberson
on 26 Oct 2013
You have some existing logic that can figure out the 1 0 0 0 part of your
1 0 0 0------ 164086-- 0,0615476369092273
line, for each combination you are trying to process. Adapt that existing logic slightly to produce a row-oriented matrix (number of samples by 4) of these decoded values. The accumarray() call that I showed will then convert the 4 bits into an integer subscript, and accumarray() will do the totaling for you.
The result will be a vector of (probably) 16 elements, one count per element. The corresponding bit patterns are the binary representations of (the index minus 1). So [0 0 0 0] for the first vector entry, [0 0 0 1] for the second vector entry, and so on.
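To get from the count vector back to a table of patterns, one possibility is to generate the binary representations of (index minus 1) with dec2bin. This is only a sketch: the counts used here are a made-up example, and the names patterns and out are illustrative.

```matlab
n = 4;                                          % substring length
counts = accumarray([1; 9; 5; 9], 1, [2^n 1]);  % made-up example counts

% Row k holds the bits of k-1, so row 1 is [0 0 0 0] and row 16 is [1 1 1 1].
patterns = dec2bin(0:2^n-1, n) - '0';

% Columns: the n bits, the count, and the relative frequency.
out = [patterns, counts, counts / sum(counts)];
```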
FRANCISCO
on 27 Oct 2013
Edited: Walter Roberson
on 27 Oct 2013
Walter Roberson
on 27 Oct 2013
B(:,1) * 16 + B(:,2) * 8 + B(:,3) * 4 + B(:,4) * 2 + B(:,5) * 1 + 1
Notice the pattern of weights: [8 4 2 1] for four bits, [16 8 4 2 1] for five. You can calculate that pattern for substrings of length N, and do not need to write it out explicitly:
B * (2.^fliplr(0:N-1)).' + 1
Note: that is * and not .* as it is matrix multiplication.
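A short check of the matrix-multiplication form for N = 5, assuming the weights are the powers of two from 2^(N-1) down to 2^0 so that they match the explicit [16 8 4 2 1] line. The matrix B below is a made-up example, and w and subs are illustrative names.

```matlab
N = 5;                                   % substring length
B = [0 0 0 0 0; 1 0 0 0 0; 0 0 0 1 1; 1 1 1 1 1];

w = 2.^(N-1:-1:0);                       % [16 8 4 2 1]
subs = B * w.' + 1;                      % matrix product: [1; 17; 4; 32]

counts = accumarray(subs, 1, [2^N 1]);   % one counter per 5-bit pattern
```

The subscripts run from 1 (all zeros) to 2^N (all ones), so every pattern gets exactly one slot in counts.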
FRANCISCO
on 27 Oct 2013
Walter Roberson
on 27 Oct 2013
The s * (2.^fliplr(0:N-1)).' + 1 form can be used for N = 4 as well.
FRANCISCO
on 28 Oct 2013
FRANCISCO
on 28 Oct 2013
Walter Roberson
on 28 Oct 2013
Edited: Walter Roberson
on 28 Oct 2013
accumarray( (s(1:4:end) * 8 + s(2:4:end) * 4 + s(3:4:end) * 2 + s(4:4:end) * 1 + 1) .', 1)
FRANCISCO
on 28 Oct 2013
Walter Roberson
on 28 Oct 2013
In your original code, how do you handle the boundary cases at the end, such as when there are only 3 bits left?
If you could upload a .txt file with your bit pattern, I will run it through a couple of different counting methods and see if I get agreement.
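For the non-overlapping blocks counted by the s(1:4:end) form above, one possible way to handle leftover bits is simply to drop them before reshaping. This is an assumption about the desired behavior, not something the thread settles. The 23-bit sequence below is the question's 20-bit example with three made-up bits appended, and m, B and subs are illustrative names.

```matlab
s = [1 0 1 1 0 0 1 0 0 1 1 1 1 0 0 0 1 1 1 0 1 0 1];  % 23 bits, so 3 are left over
n = 4;                                   % block length

m = floor(numel(s) / n);                 % number of complete blocks
B = reshape(s(1:m*n), n, m).';           % one block per row; trailing bits dropped

subs = B * (2.^(n-1:-1:0)).' + 1;        % 1-based pattern number per block
counts = accumarray(subs, 1, [2^n 1]);   % one counter per pattern
```

With this choice the five complete blocks are counted and the final three bits are ignored; padding them instead would count a pattern that is not actually in the data.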
FRANCISCO
on 28 Oct 2013
FRANCISCO
on 29 Oct 2013
Walter Roberson
on 29 Oct 2013
Sorry, I have been busy, and now I need to go to sleep.
FRANCISCO
on 29 Oct 2013
FRANCISCO
on 29 Oct 2013