How to implement a pipe function as in F# (|>) or in R (%>%)

I use R from time to time and really love having the pipe function there. Have there been any thoughts on implementing something similar in MATLAB?
The pipe function lets you pass an intermediate result on to the next function. For example, if you have something like
logical(reshape(repmat(A,1,4),n,1))
You could instead write
A %>% repmat(1,4) %>% reshape(n,1) %>% logical()
which is much easier to read.
Is there any way to do this in MATLAB, and has anyone discussed whether it could be included?
  2 Comments
Stephen23 on 28 Aug 2015
Edited: Stephen23 on 28 Aug 2015
"which is much easier to read."
Really?
The example functions repmat and reshape both use three arguments, but your pipe example implies the first argument (via the pipe) and explicitly provides only the second and third. How is implied passing of arguments clearer? What happens if I want to pass a value as the second argument instead of the first? How would that be implied?
foo(1,bar(x)) <- easy and clear
bar(x) %>% foo(1) <- implied second argument?
A complete solution would essentially end up having to allocate the values to named/indexed variables, or use numbered pipes or something similar:
y = bar(x) %>% foo(1,y) <- how to imply y?
which is really just variable allocation by implication. Any solution that only implies the first argument is basically useless.
Why not just learn to read and write MATLAB code, which does not imply anything? Some might even say that not implying arguments is much easier to read...
Walter Roberson on 28 Aug 2015
x %>% bar %>% map2(@foo,1)
in my notation could become
pin(x).bar().mapN(2,@foo,1).pout();
so it is possible to handle without introducing new syntax.
If we are going to invent syntax, then:
x %>% bar %>% foo#2(1)
or I guess
x %>% bar %>2% foo(1)
Maybe
x %>% bar %>>% foo(1)
But the syntax is lacking for handling multiple streams.
pdf(x, mean(x), sqrt(var(x))) %easy in MATLAB
We can break out one sequence of calls
x %>% mean %>2% pdf(x, sqrt(var(x)))
or the other
x %>% var %>% sqrt %>3% pdf(x, mean(x))
but both at the same time??
(x %>% mean, x %>% var %>% sqrt) %>2,3% pdf(x)
to be specific about where the parts go, or maybe
(x %>% mean, x %>% var %>% sqrt) %>>% pdf(x)
for the case where all the arguments are to go at the "end" of the provided list.
Maybe some improvements could be made, but I am not at all sure that we are gaining in readability.
Maybe the Bourne Shell (and later) numbered pipes?
x %>% var %>% sqrt >&, x %>% mean %>2% pdf(x, <&)
More flexibility but I'm still not exactly thrilled with the ease of readability.


Accepted Answer

Guillaume on 28 Aug 2015
Like everybody else, I don't think this pipe syntax is any easier to read. More importantly, it does not resolve MATLAB's problem that you can't chain expressions that make use of anything but the first return value of a function. If anything, it gets worse, because now you only chain the first input as well.
Nonetheless, as Walter suggested, it could be implemented with a class. This simple class pretty much does it:
classdef Pipe
    properties (Access = private)
        arg   % value currently travelling through the pipe
    end
    methods
        function this = Pipe(arg)
            this.arg = arg;
        end
        function result = subsref(this, s)
            % Expect alternating '.name' and '(args)' pairs, e.g. .repmat(1, 2)
            if mod(numel(s), 2) ~= 0 || any(~strcmp({s(1:2:end).type}, '.')) || any(~strcmp({s(2:2:end).type}, '()'))
                error('invalid syntax');
            end
            result = this.arg;
            for idx = 1:2:numel(s)
                % Call the named function with the piped value as its first argument
                result = feval(s(idx).subs, result, s(idx+1).subs{:});
            end
        end
    end
end
which you'd use as:
in = rand(3, 5);
result = Pipe(in).repmat(1, 2).reshape([], 1).sqrt()
Because of my crude syntax checks, you always need to have the () after a function call, even if it's only the pipe argument that should be passed. It's fairly trivial to improve the class to be more flexible.
One downside you'll always have is the lack of tab completion, even for built-in functions.
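To see what the subsref loop in the class is doing, the indexing structure can be built by hand and run through the same loop outside any class. This is only a sketch: the struct array s below is an assumption about what MATLAB would pass to subsref for a chain like .repmat(1, 2).sqrt(), and the input value is made up for illustration.

```matlab
% Hand-built stand-in (assumption) for the struct array that subsref
% would receive for the chain  .repmat(1, 2).sqrt()
s = struct('type', {'.', '()', '.', '()'}, ...
           'subs', {'repmat', {1, 2}, 'sqrt', {}});

result = [4 9];                      % illustrative piped value
for idx = 1:2:numel(s)
    % s(idx).subs is the function name, s(idx+1).subs the explicit arguments
    result = feval(s(idx).subs, result, s(idx+1).subs{:});
end
disp(result)                         % same as sqrt(repmat([4 9], 1, 2))
```

Each pass through the loop consumes one name/arguments pair, which is why the class insists on the trailing () after every function name.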
  1 Comment
Guillaume on 14 Sep 2015
Edited: Guillaume on 14 Sep 2015
A slightly improved class that allows you to pass the piped value as the nth argument to a function (using as(nth) before the call to the function):
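classdef Pipe
    properties (Access = private)
        arg   % value currently travelling through the pipe
    end
    methods
        function this = Pipe(arg)
            this.arg = arg;
        end
        function result = subsref(this, s)
            % Expect alternating '.name' and '(args)' pairs
            if mod(numel(s), 2) ~= 0 || any(~strcmp({s(1:2:end).type}, '.')) || any(~strcmp({s(2:2:end).type}, '()'))
                error('invalid syntax');
            end
            result = this.arg;
            argpos = 1;   % position at which the piped value is inserted
            for idx = 1:2:numel(s)
                if strcmp(s(idx).subs, 'as')
                    % as(n) only records the position for the next call
                    argpos = s(idx+1).subs{1};
                else
                    % Splice the piped value into the explicit argument list
                    args = [s(idx+1).subs(1:argpos-1), {result}, s(idx+1).subs(argpos:end)];
                    result = feval(s(idx).subs, args{:});
                    argpos = 1;   % reset to first-argument piping
                end
            end
        end
    end
end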
Usage:
Pipe([2 6]).randi().as(2).repmat([1 2 3], 5)
%equivalent to:
repmat([1 2 3], randi([2 6]), 5)
Pipe([2 6]).randi().as(3).repmat([1 2 3], 5)
%equivalent to:
repmat([1 2 3], 5, randi([2 6]))
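The splice that as(n) relies on can be tried on its own, without the class. In this sketch the names piped and explicit are hypothetical, and a fixed value stands in for the random randi result so the outcome is reproducible.

```matlab
piped    = 4;               % stand-in for the value arriving through the pipe
explicit = {[1 2 3], 5};    % arguments written explicitly in the call
argpos   = 2;               % position requested with as(2)

% Insert the piped value at position argpos among the explicit arguments
args = [explicit(1:argpos-1), {piped}, explicit(argpos:end)];
result = repmat(args{:});   % same as repmat([1 2 3], 4, 5)
disp(size(result))
```

With argpos = 1 the first slice is empty, so the piped value simply leads the list, which is the default behaviour of the original class.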


More Answers (3)

Walter Roberson on 28 Aug 2015
Personally, I find it more difficult to read, but that's just my personal opinion.
You could create a class that had a constructor method to enter things into the pipeline. Then you would have methods for each operation you wanted to be able to perform, such as repmat and reshape and logical, each taking a pipe object, extracting the associated value, performing the operation according to the parameters, and putting the result back into a pipe object. And you would have a method that detached the content from the pipe.
Once that is in place then you use "." method notation. For example,
B = pin(A).repmat(1,4).reshape(n,1).logical().pout();
I do not know anything about the spacing requirements for method calls.
The constructor would be unusual in that it would need to be able to act upon pipe objects and encapsulate them within a pipe. For example if
A = rand(3,5);
B = pin(A);
C = pin(B).repmat(1,1).reshape([],1).pout();
D = pout(C);
then that should work. If instead the pipe constructor applied to an existing pipe just returned the existing pipe, then the pout() for C would get back the plain rand(3,5) data and you would not be able to pout() that. That would break type transparency for tools.
MATLAB does not allow the construction of new operators so we cannot for example use
B = |"A.repmat(1,4).logical()"|;
where |" and "| would call the pipe constructor and destructor respectively.

Benjamin on 31 Aug 2015
Thanks a lot for all the answers!
As to whether the pipe notation is easier to read, I guess it depends on what you are used to. However, if you have to nest a lot of functions, I find that the pipe function makes this really easy to read. You can read each pipe as "and then" and quickly understand a lot of nested calls.
The point about passing as the second argument can probably be solved in different ways. The R way is to write for example
bar(x) %>% foo(1,.)
where the . indicates that bar(x) should be the second input to foo().
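In MATLAB itself, an anonymous function can play roughly the role of the R placeholder, since its parameter can be put in any argument position. A minimal sketch, using sum and power purely as stand-ins for bar and foo:

```matlab
x = [1 2 3];

% v plays the role of R's "." and lands in the second argument slot
placeSecond = @(v) power(2, v);

result = placeSecond(sum(x));   % same as power(2, sum(x))
disp(result)
```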
Once again thanks for the comments.
Ben :)
  3 Comments
Benjamin on 14 Sep 2015
Edited: Benjamin on 14 Sep 2015
Then you will have to pipe a list of arguments. It is like in MATLAB: when you have a cell with multiple elements, c = {a, b}, you can write f(c{:}) to give both a and b as inputs.
Walter Roberson on 14 Sep 2015
And how do you build that list of arguments if more than one of them needs to be computed? For example, how would you write f(x(t),y(t)) in piped notation, where x(t) and y(t) are functions?
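One way the cell-expansion idea could be stretched to cover several computed arguments is to apply a list of function handles to the same input and expand the results. This is only a sketch and does not settle the readability question: sin, cos and atan2 are stand-ins for x, y and f, and the name branches is hypothetical.

```matlab
t = linspace(0, 1, 5);

branches = {@sin, @cos};    % stand-ins for x(t) and y(t)

% Evaluate every branch on the same input, keeping the results in a cell
args = cellfun(@(g) g(t), branches, 'UniformOutput', false);

result = atan2(args{:});    % same as atan2(sin(t), cos(t))
```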



Alejandro Garcia Fernandez
Most of the comments in this thread say that the pipe syntax is harder to read. But to me and the original poster (@Benjamin) it is clearer.
I think it is a matter of background: I'm used to using pipes in the Unix sense, so that makes them desirable.
By way of example, I currently have an algorithm that is basically a series of transformations on data. Right now it looks like this:
a=transform1(original);
b=transform2(a);
c=transform3(b);
d=transform4(c);
But then I need to add an extra transformation between 1 and 2, so it becomes
a=transform1(original);
d=interimTransform(a);
b=transform2(d); %had to change the input here from a to d.
And you end up with a bunch of variables whose only purpose is to carry data from the output of one function to the next. To me it would be clearer to use
transform1() |> transform2() |> transform3()
or when things change:
transform1() |> interimTransform |> transform2 |> transform3()
Of course the other option is to use
transform3(transform2(transform1()));
But that is less readable for non-coders or beginning coders, because the last transformation is the first thing you read. I actually tested this by asking non-coders what they thought each of the different options was doing.
The winner was the pipe...
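Without any new syntax, a single-input, single-output chain like this can be approximated by keeping the steps in a cell array of function handles and folding over it. A minimal sketch, where sqrt, the anonymous functions and sum are made-up stand-ins for the transformations:

```matlab
original = [4 16 36];

% Stand-ins for transform1 .. transform3 (hypothetical)
steps = {@sqrt, @(v) v + 1, @sum};

% Adding an interim step is a one-entry edit to the list,
% with no intermediate variables to rename
steps = [steps(1), {@(v) 2 * v}, steps(2:end)];

result = original;
for k = 1:numel(steps)
    result = steps{k}(result);   % each output feeds the next step
end
disp(result)
```

The steps read top to bottom in execution order, which is the property the pipe notation is being praised for here.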
  1 Comment
Walter Roberson on 8 Apr 2016
However, you are only using a single input and a single output. If you were willing to restrict yourself to that, you would probably use class methods and write something like
transform1(original).transform2().transform3()
The hard part is what we wrote about extensively: making a workable and readable syntax for multiple arguments, or for the piped value getting injected into something other than the first argument, or dealing with multiple outputs.
Pipes in the sh / ksh / bash / zsh sense can be numbered, which solves some of the difficulties. On the other hand, in the *sh syntax, it gets messy to generate multiple piped arguments. For example,
A >&3 && B >&4 | C <&3 2<&4
or
B >&3 && A | C 2<&3
or
A >&3 && B >&4 | <&3 2<&4 C
or
B >&3 && A | 2<&3 C
or
(A >&3; B >&4) | C <&3 2<&4
or
(B >&4; A) | C 2<&4
to implement the equivalent of
C(A(), B())
...unless you are into the newer shell variants that allow piping from a command, such as
C <$(A) 2&<$(B)
But, Hey! At least it's piping, right?
