Code covered by the BSD License  

Highlights from
Table Breakpoint Optimization

image thumbnail

Table Breakpoint Optimization


Tucker McClure (view profile)


04 Apr 2012 (Updated )

A set of tools for finding the best way to reduce the size of a table.

function [x_f, z_f, mse_f, me_f] = find_best_table_nde(...
                                      x_0, z_0, ...
                                      mse_target, max_error, ...

%% find_best_table_nde
% This function finds the best way to reduce the size of a table to the
% so that the smaller table matches the original within specified error 
% bounds. That is, suppose you had a table of 100-by-100 points, but wanted
% to find the smallest table that had no more than 1.5 units of mean 
% squared error. This function would find how many points are necessary 
% along each axis to accomplish this and where those points go. "Goodness"
% will be measured by the difference between the original table and the new
% table sampled at the original data points. See find_best_table_demo.m for
% examples.
% Inputs:
%   x_0    Cell array of values for the first independent variables (e.g., 
%          {x1, x2, ... xn}.
%   z_0    Matrix of values for the dependent variable for all x_0 and
%          y_0 using ndgrid-style syntax (e.g., z_0(2, 3) corresponds
%          to x{1}(2) and x{2}(3)). Note that this is different from what
%          is produced by meshgrid.
%   mse_target Target mean squared error
%   method     Interpolation method to use, e.g., 'linear' (help interp2)
% Outputs:
%   x_f    Cell array of best divisions of independent variables found
%   z_f    Resulting fit of z_0 on x_f
%   mse_f  Mean square error of final table
% This builds off of work by Richard Willey, Stuart Kozola, Eric Johnson, 
% Sarah Zaranek, and Peter Maloney at The MathWorks, Inc.
% Requires the Optimization Toolbox (TM).
% Supports the Parallel Computing Toolbox (TM).
%  Tucker McClure @ The MathWorks
%  Copyright 2013 The MathWorks, Inc.
    % Default to linear interpolation.
    if nargin < 5, method = 'linear'; end;
    if nargin == 4 && ischar(max_error)
        method = max_error;
        max_error = inf;
    if nargin == 3 || isempty(max_error), max_error = inf;  end;
    if isempty(mse_target),               mse_target = inf; end;

    % Number of dimensions
    n_d = length(x_0);
    % Initialize the algorithm with the simplest possible table.
    n = 3 * ones(1, n_d);

    % See how well simplest possible table does.
    [~, ~, mse, me] = find_best_table_nd(x_0, z_0, n, method);

    % Loop until we meet tolerance or have tried too many times.
    count = 0;
    max_iterations = sum(cellfun(@(x) length(x), x_0));
    while (mse > mse_target || me > max_error) && count < max_iterations

        % Keep count of number of iterations.
        count = count + 1;

        % Try increasing the number of points in each dimension by 2 
        % separately.
        mse_i = zeros(1, n_d);
        me_i  = zeros(1, n_d);
        for k = 1:n_d
            n_i = n;
            n_i(k) = 2*n(k);
            [~, ~, mse_i(k), me_i(k)] = find_best_table_nd(x_0, z_0, ...
                                                           n_i, method);
        % Find which made the most improvement and use it alone.
        [~, best_index] = min(mse_i);
        n(best_index)   = 2*n(best_index);
        mse             = mse_i(best_index);
        me              = me_i(best_index);

    % Ok, now we now that (n_x/2, n_y) and (n_x, n_y/2) aren't good enough,
    % but (n_x, n_y) is. Search between them.
    n_left  = max(floor(n/2), 3);
    n_right = n;

    % Try to bring in any of the right bounds if it's possible, starting
    % with what's currently the smallest number of indices. If it's not
    % possible to move in any of the right bounds, move in the left bounds.
    count = 0;
    while    any(n_right > n_left + 1) ...
          && count < max_iterations

        % Keep count of number of iterations.
        count = count + 1;

        % Sort the indices; we want to decrease the number of breakpoints
        % for the biggest dimension first.
        [~, sort_indices] = sort(n_right, 2, 'descend');

        % We're going to look for the first dimension we can update. If we
        % can't update anything, we'll allow multiple dimensions to grow.
        updated = false;
        for k = sort_indices
            % If the bounds are already pressed together, skip to the next
            % dimension.
            if n_right(k) <= n_left(k) + 1

            % Update this dimension as the midpoint between left and right
            % bounds.
            n_t = n_right;
            n_t(k) = round(0.5*(n_right(k) - n_left(k))) + n_left(k);
            % See how well it does.
            [~,~, mse_i, me_i] = find_best_table_nd(x_0, z_0, n_t, method);
            % If we're still good, move the right bound in.
            if mse_i < mse_target && me_i < max_error
                n_right = n_t;
                updated = true;

        % Otherwise, we can't update any dimension, so move in the left 
        % bounds.
        if ~updated
            n_left = n_t;

    % Take the (now just good enough) right bounds.
    n = n_right;

    % Gather the final output.
    [x_f, z_f, mse_f, me_f] = find_best_table_nd(x_0, z_0, n, method);


Contact us