Code covered by the BSD License

# Table Breakpoint Optimization

### Tucker McClure (view profile)

04 Apr 2012 (Updated )

A set of tools for finding the best way to reduce the size of a table.

find_best_table_nde(...
function [x_f, z_f, mse_f, me_f] = find_best_table_nde(...
x_0, z_0, ...
mse_target, max_error, ...
method)

%% find_best_table_nde
%
% This function finds the best way to reduce the size of a table to the
% so that the smaller table matches the original within specified error
% bounds. That is, suppose you had a table of 100-by-100 points, but wanted
% to find the smallest table that had no more than 1.5 units of mean
% squared error. This function would find how many points are necessary
% along each axis to accomplish this and where those points go. "Goodness"
% will be measured by the difference between the original table and the new
% table sampled at the original data points. See find_best_table_demo.m for
% examples.
%
% Inputs:
%   x_0    Cell array of values for the first independent variables (e.g.,
%          {x1, x2, ... xn}.
%   z_0    Matrix of values for the dependent variable for all x_0 and
%          y_0 using ndgrid-style syntax (e.g., z_0(2, 3) corresponds
%          to x{1}(2) and x{2}(3)). Note that this is different from what
%          is produced by meshgrid.
%   mse_target Target mean squared error
%   method     Interpolation method to use, e.g., 'linear' (help interp2)
%
% Outputs:
%   x_f    Cell array of best divisions of independent variables found
%   z_f    Resulting fit of z_0 on x_f
%   mse_f  Mean square error of final table
%
% This builds off of work by Richard Willey, Stuart Kozola, Eric Johnson,
% Sarah Zaranek, and Peter Maloney at The MathWorks, Inc.
%
% Requires the Optimization Toolbox (TM).
% Supports the Parallel Computing Toolbox (TM).
%
%  Tucker McClure @ The MathWorks
%  Copyright 2013 The MathWorks, Inc.

% Default to linear interpolation.
if nargin < 5, method = 'linear'; end;
if nargin == 4 && ischar(max_error)
method = max_error;
max_error = inf;
end
if nargin == 3 || isempty(max_error), max_error = inf;  end;
if isempty(mse_target),               mse_target = inf; end;

% Number of dimensions
n_d = length(x_0);

% Initialize the algorithm with the simplest possible table.
n = 3 * ones(1, n_d);

% See how well simplest possible table does.
[~, ~, mse, me] = find_best_table_nd(x_0, z_0, n, method);

% Loop until we meet tolerance or have tried too many times.
count = 0;
max_iterations = sum(cellfun(@(x) length(x), x_0));
while (mse > mse_target || me > max_error) && count < max_iterations

% Keep count of number of iterations.
count = count + 1;

% Try increasing the number of points in each dimension by 2
% separately.
mse_i = zeros(1, n_d);
me_i  = zeros(1, n_d);
for k = 1:n_d
n_i = n;
n_i(k) = 2*n(k);
[~, ~, mse_i(k), me_i(k)] = find_best_table_nd(x_0, z_0, ...
n_i, method);
end

% Find which made the most improvement and use it alone.
[~, best_index] = min(mse_i);
n(best_index)   = 2*n(best_index);
mse             = mse_i(best_index);
me              = me_i(best_index);

end

% Ok, now we now that (n_x/2, n_y) and (n_x, n_y/2) aren't good enough,
% but (n_x, n_y) is. Search between them.
n_left  = max(floor(n/2), 3);
n_right = n;

% Try to bring in any of the right bounds if it's possible, starting
% with what's currently the smallest number of indices. If it's not
% possible to move in any of the right bounds, move in the left bounds.
count = 0;
while    any(n_right > n_left + 1) ...
&& count < max_iterations

% Keep count of number of iterations.
count = count + 1;

% Sort the indices; we want to decrease the number of breakpoints
% for the biggest dimension first.
[~, sort_indices] = sort(n_right, 2, 'descend');

% We're going to look for the first dimension we can update. If we
% can't update anything, we'll allow multiple dimensions to grow.
updated = false;
for k = sort_indices

% dimension.
if n_right(k) <= n_left(k) + 1
continue;
end

% Update this dimension as the midpoint between left and right
% bounds.
n_t = n_right;
n_t(k) = round(0.5*(n_right(k) - n_left(k))) + n_left(k);

% See how well it does.
[~,~, mse_i, me_i] = find_best_table_nd(x_0, z_0, n_t, method);

% If we're still good, move the right bound in.
if mse_i < mse_target && me_i < max_error
n_right = n_t;
updated = true;
break;
end

end

% Otherwise, we can't update any dimension, so move in the left
% bounds.
if ~updated
n_left = n_t;
end

end

% Take the (now just good enough) right bounds.
n = n_right;

% Gather the final output.
[x_f, z_f, mse_f, me_f] = find_best_table_nd(x_0, z_0, n, method);

end