Thread Subject: Why are nested function variables so slow?

Subject: Why are nested function variables so slow?

From: Peter Meilstrup

Date: 28 Nov, 2007 22:59:45

Message: 1 of 10

I keep running across a problem when I have a handle to a nested
function which uses a variable in the nested function's workspace. If
there is a large struct or cell array in the variable, invoking the
nested function becomes immensely slow.

Below is code to demonstrate the effect by building stacks inside a
nested function's scope. Note that the only difference between the two
conditions is the word 'persistent' which takes the variable out of
nested scope. This is enough to change the algorithm's performance
from linear to worse than quadratic!

I've worked with many languages that implement closures/object member
variables/references (features equivalent to nested functions and
handles) and this performance problem does not show up in any of them.
For instance, a straight translation of this code to Python performs
linearly in both cases.

-pm

function StackDemo()
    %demonstrates how nested function invocation is unacceptably slow.

    sizes = round(logspace(2, 3.2, 20));

    timesNested = zeros(size(sizes));
    timesPersistent = zeros(size(sizes));

    for i = 1:numel(sizes)
        disp(sizes(i));
        %time the nested stack -- how a stack would be implemented if
its
        %speed was at all acceptable.
        [push, pop] = nestedStack();
        a = cputime();
        testStack(push, pop, sizes(i), sizes(i));
        timesNested(i) = cputime() - a;

        %What happens when the stack is stored in a persistent,
instead of
        %externally scoped variable. This is unusable for general
purposes
        %because now there can only be one stack at a time.
        [push, pop] = persistentStack();
        a = cputime();
        testStack(push, pop, sizes(i), sizes(i));
        timesPersistent(i) = cputime() - a;
    end

    %plot the results.
    plot(sizes, timesNested, 'r-', sizes, timesPersistent, 'b-');
    xlabel('Input size');
    ylabel('Execution time');
    legend('nested', 'persistent', 'Location', 'NorthWest');

end

function testStack(push, pop, n, m)
    for i = 1:n
        push(n);
    end
    for i = 1:m
        pop();
    end
end

function [pushFn, popFn] = nestedStack()
    %implement a stack using nested functions.
    stack = {};

    pushFn = @push;
    popFn = @pop;

    function push(what)
        stack = {what stack};
    end

    function what = pop()
        if isempty(stack)
            what = [];
            stack = {};
        else
            what = stack{1};
            stack = stack{2};
        end
    end
end

function [pushFn, popFn] = persistentStack()
    %same as above ut the stack is in a persistent var.
    %NO GOOD FOR ACTUAL USE since you cannot have more than one stack
this way.

    persistent stack;

    pushFn = @push;
    popFn = @pop;

    function push(what)
        stack = {what stack};
    end

    function what = pop()
        if isempty(stack)
            what = [];
            stack = {};
        else
            what = stack{1};
            stack = stack{2};
        end
    end
end

Subject: Why are nested function variables so slow?

From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)

Date: 29 Nov, 2007 02:55:30

Message: 2 of 10

In article <4de789a5-08d9-450c-8c3e-6b29381d0f41@e10g2000prf.googlegroups.com>,
Peter Meilstrup <peter.meilstrup@gmail.com> wrote:

>function [pushFn, popFn] = persistentStack()
> %same as above ut the stack is in a persistent var.
> %NO GOOD FOR ACTUAL USE since you cannot have more than one stack
>this way.
>
> persistent stack;
>
> pushFn = @push;
> popFn = @pop;
>
> function push(what)
> stack = {what stack};
> end
>
> function what = pop()
> if isempty(stack)
> what = [];
> stack = {};
> else
> what = stack{1};
> stack = stack{2};
> end
> end
>end

I believe this should generalize your code to allow multiple
stacks. Just pass in a string that is a valid identifier.
You won't need to refer to the identifier afterwards.

function [pushFn, popFn] = persistentStack(stackname)

    persistent stack;

    sn = stackname;
    stack.(sn) = {};

    % since the nested functions are built after sn is set,
    % they will bind into a closure that includes sn.

    function push(what)
        stack.(sn) = {what stack};
    end

    function what = pop()
        if isempty(stack.(sn))
            what = [];
            stack.(sn) = {};
        else
            what = stack.(sn){1};
            stack.(sn) = stack.(sn){2};
        end
    end

    pushFn = @push;
    popFn = @pop;

end

--
   "Okay, buzzwords only. Two syllables, tops." -- Laurie Anderson

Subject: Why are nested function variables so slow?

From: Peter Meilstrup

Date: 29 Nov, 2007 03:22:12

Message: 3 of 10

That wasn't really the question I was asking.... the stacks are a toy
example. For my actual code persistent var storage is a no go. (you'd
need explicit deallocation code or else the persistent struct leaks
memory; your data in memory will get erased every time you tweak a
function's code; data is not saved to disk when you save the
functions, etc.)

-pm

On Nov 28, 6:55 pm, rober...@ibd.nrc-cnrc.gc.ca (Walter Roberson)
wrote:
> In article <4de789a5-08d9-450c-8c3e-6b29381d0...@e10g2000prf.googlegroups.com>,
> Peter Meilstrup <peter.meilst...@gmail.com> wrote:
>
>
>
> >function [pushFn, popFn] = persistentStack()
> > %same as above ut the stack is in a persistent var.
> > %NO GOOD FOR ACTUAL USE since you cannot have more than one stack
> >this way.
>
> > persistent stack;
>
> > pushFn = @push;
> > popFn = @pop;
>
> > function push(what)
> > stack = {what stack};
> > end
>
> > function what = pop()
> > if isempty(stack)
> > what = [];
> > stack = {};
> > else
> > what = stack{1};
> > stack = stack{2};
> > end
> > end
> >end
>
> I believe this should generalize your code to allow multiple
> stacks. Just pass in a string that is a valid identifier.
> You won't need to refer to the identifier afterwards.
>
> function [pushFn, popFn] = persistentStack(stackname)
>
> persistent stack;
>
> sn = stackname;
> stack.(sn) = {};
>
> % since the nested functions are built after sn is set,
> % they will bind into a closure that includes sn.
>
> function push(what)
> stack.(sn) = {what stack};
> end
>
> function what = pop()
> if isempty(stack.(sn))
> what = [];
> stack.(sn) = {};
> else
> what = stack.(sn){1};
> stack.(sn) = stack.(sn){2};
> end
> end
>
> pushFn = @push;
> popFn = @pop;
>
> end
>
> --
> "Okay, buzzwords only. Two syllables, tops." -- Laurie Anderson

Subject: Why are nested function variables so slow?

From: Peter Meilstrup

Date: 29 Nov, 2007 03:36:53

Message: 4 of 10

On Nov 28, 6:55 pm, rober...@ibd.nrc-cnrc.gc.ca (Walter Roberson)
wrote:
> sn = stackname;
> stack.(sn) = {};
>
> % since the nested functions are built after sn is set,
> % they will bind into a closure that includes sn.
>

BTW, the order of when you create nested function handles versus when
you set sn does not matter. It only matters that the variable exists
somewhere in the enclosing function. If you change sn after creating
the handles, the change will still be reflected through the handles.
That's part of what makes it a closure :) The order makes a difference
only for anonymous functions, which are not closures.

-pm

Subject: Why are nested function variables so slow?

From: Titus

Date: 29 Nov, 2007 10:28:42

Message: 5 of 10


"Peter Meilstrup" <peter.meilstrup@gmail.com> schrieb im Newsbeitrag
news:4de789a5-08d9-450c-8c3e-6b29381d0f41@e10g2000prf.googlegroups.com...
>I keep running across a problem when I have a handle to a nested
> function which uses a variable in the nested function's workspace. If
> there is a large struct or cell array in the variable, invoking the
> nested function becomes immensely slow.
>
> Below is code to demonstrate the effect by building stacks inside a
> nested function's scope. Note that the only difference between the two
> conditions is the word 'persistent' which takes the variable out of
> nested scope. This is enough to change the algorithm's performance
> from linear to worse than quadratic!
>
> I've worked with many languages that implement closures/object member
> variables/references (features equivalent to nested functions and
> handles) and this performance problem does not show up in any of them.
> For instance, a straight translation of this code to Python performs
> linearly in both cases.
>
> -pm
>

Hi Peter,

I admit, I don't know the reason. But what I would suggest nevertheless is
not to nest the cell array but to grow it:

function [pushFn, popFn] = nestedStack()
%implement a stack using nested functions.
stack = {};
pushFn = @push;
popFn = @pop;
  function push(what)
    stack = [what stack];
  end
  function what = pop()
    if isempty(stack)
      what = [];
      stack = {};
    else
      what = stack{1};
      stack = stack(2:end);
    end
end
end


This at least brings the nestedStack down from 5 times to about
the double time of the persistentStack, better but not yet optimal.
Perhaps others have explanations about why the performance is lower.

Titus

Subject: Why are nested function variables so slow?

From: Bruno Luong

Date: 29 Nov, 2007 11:04:49

Message: 6 of 10

Peter,

Thanks for the warning. That's quite ugly.

Bruno

Subject: Why are nested function variables so slow?

From: Bruno Luong

Date: 29 Nov, 2007 11:04:50

Message: 7 of 10

Peter,

Thanks for the warning. That's quite ugly.

Bruno

Subject: Why are nested function variables so slow?

From: Bruno Luong

Date: 29 Nov, 2007 11:04:55

Message: 8 of 10

Peter,

Thanks for the warning. That's quite ugly.

Bruno

Subject: Why are nested function variables so slow?

From: Peter Meilstrup

Date: 1 Dec, 2007 01:14:31

Message: 9 of 10

On Nov 29, 2:28 am, "Titus" <titus.edelho...@mathworks.de> wrote:
>
> Hi Peter,
>
> I admit, I don't know the reason. But what I would suggest nevertheless is
> not to nest the cell array but to grow it:
>
> function [pushFn, popFn] = nestedStack()
> %implement a stack using nested functions.
> stack = {};
> pushFn = @push;
> popFn = @pop;
> function push(what)
> stack = [what stack];
> end
> function what = pop()
> if isempty(stack)
> what = [];
> stack = {};
> else
> what = stack{1};
> stack = stack(2:end);
> end
> end
> end
>
> This at least brings the nestedStack down from 5 times to about
> the double time of the persistentStack, better but not yet optimal.
> Perhaps others have explanations about why the performance is lower.
>
> Titus

No good--growing the stack by reallocating leads to quadratic-or-worse
time complexity. Try increasing the input size to 10^4 or so and see
what the graph looks like. Also, read about Shlemiel the painter:

http://www.joelonsoftware.com/articles/fog0000000319.html

"About X times faster" is not an acceptable description of algorithmic
performance. Acceptable would be to express the performance as
function of the input size. The problem is that the time skyrockets,
at least quadratically, for the nested functions as input size
increases, but is linear when the exact same data structure is in a
persistent variable.

For extra fun, try running the profiler. All the extra time is
assigned to overhead in testStack, instead of in nestedStack/push and
nestedStack/pop. But there's nothing slow about testStack, it's the
functions it calls that are slow.

-pm

Subject: Why are nested function variables so slow?

From: Mike Karr

Date: 3 Dec, 2007 14:41:01

Message: 10 of 10

Peter Meilstrup wrote:
> I keep running across a problem when I have a handle to a nested
> function which uses a variable in the nested function's workspace. If
> there is a large struct or cell array in the variable, invoking the
> nested function becomes immensely slow.
>
> Below is code to demonstrate the effect by building stacks inside a
> nested function's scope.
> (snip)

This is an excellent demo. We are aware of the problem. It is a
difficult one, and we're working on it.

mike

Tags for this Thread

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

rssFeed for this Thread

Contact us at files@mathworks.com