KeyValueStore

Store key-value pairs for use with mapreduce

Description

The mapreduce function automatically creates a KeyValueStore object during execution and uses it to store key-value pairs added by the map and reduce functions. Although you never need to explicitly create a KeyValueStore object to use mapreduce, you do need to use the add and addmulti object functions to interact with this object in the map and reduce functions.

Creation

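You do not create KeyValueStore objects directly. The mapreduce function creates them automatically during execution and passes one as the last input argument to each call of the map function and the reduce function.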

Object Functions

add         Add single key-value pair to KeyValueStore
addmulti    Add multiple key-value pairs to KeyValueStore

Examples

The following map function uses the add function to add key-value pairs one at a time to an intermediate KeyValueStore object (named intermKVStore).

function MeanDistMapFun(data, info, intermKVStore)
    distances = data.Distance(~isnan(data.Distance));
    sumLenKey = 'sumAndLength';
    sumLenValue = [sum(distances), length(distances)];
    add(intermKVStore, sumLenKey, sumLenValue);
end
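
To show where the pair added by the map function ends up, here is a minimal end-to-end sketch. It is not part of the original example: the reduce function MeanDistReduceFun, the output key 'Mean', and the small CSV data set are assumptions made for illustration. The sketch also assumes a release that supports local functions in scripts (R2016b or later).

```matlab
% Hypothetical data: a tiny CSV file containing one missing distance.
csvFile = [tempname '.csv'];
writetable(table([100; 200; NaN; 300], [1; 2; 3; 4], ...
    'VariableNames', {'Distance', 'DayOfWeek'}), csvFile);
ds = datastore(csvFile);

% Run serially in the client session and write output files to a temp folder.
outFolder = tempname;
mkdir(outFolder);
outds = mapreduce(ds, @MeanDistMapFun, @MeanDistReduceFun, mapreducer(0), ...
    'OutputFolder', outFolder, 'Display', 'off');

result = readall(outds);      % table with variables Key and Value
meanDist = result.Value{1};   % mean of the non-missing distances
disp(meanDist)

function MeanDistMapFun(data, ~, intermKVStore)
    % Same logic as the map function above: one [sum, count] pair per chunk.
    distances = data.Distance(~isnan(data.Distance));
    add(intermKVStore, 'sumAndLength', [sum(distances), length(distances)]);
end

function MeanDistReduceFun(~, intermValIter, outKVStore)
    % Hypothetical reducer: combine the per-chunk [sum, count] pairs.
    sumLen = [0 0];
    while hasnext(intermValIter)
        sumLen = sumLen + getnext(intermValIter);
    end
    % The output KeyValueStore is filled with the same add function.
    add(outKVStore, 'Mean', sumLen(1) / sumLen(2));
end
```

Both functions receive their KeyValueStore as the last input argument and never construct it themselves. The map function writes to the intermediate store, and the reduce function writes to the output store that mapreduce returns as a datastore.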

The following map function uses addmulti to add several key-value pairs to an intermediate KeyValueStore object (named intermKVStore). Note that this map function collects multiple keys in the intermKeys variable, and multiple values in the intermVals variable. This prepares a single call to addmulti to add all of the key-value pairs at once. It is a best practice to use a single call to addmulti rather than using add in a loop.

function meanArrivalDelayByDayMapper(data, ~, intermKVStore)
% Mapper function for the MeanByGroupMapReduceExample.

% Copyright 2014 The MathWorks, Inc.

% Data is an n-by-2 table: first column is the DayOfWeek and the second
% is the ArrDelay. Remove missing values first.
delays = data.ArrDelay;
day = data.DayOfWeek;
notNaN = ~isnan(delays);
day = day(notNaN);
delays = delays(notNaN);

% find the unique days in this chunk
[intermKeys,~,idx] = unique(day, 'stable');

% group delays by idx and apply the countsum function to each group
intermVals = accumarray(idx,delays,size(intermKeys),@countsum);
addmulti(intermKVStore,intermKeys,intermVals);

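function out = countsum(x)
% Return the count and sum of x, wrapped in a cell so that accumarray
% can store the two-element result for each group.
n = length(x); % count
s = sum(x); % sum
out = {[n, s]};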
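
The pairs added by addmulti are grouped by key before they reach the reduce function. The following sketch pairs a compact variant of the mapper with a reduce function and runs both on made-up data. The names delayMapper and delayReducer and the data values are hypothetical, and the script assumes a release that supports local functions in scripts (R2016b or later).

```matlab
% Hypothetical data: two days of arrival delays, one of them missing.
csvFile = [tempname '.csv'];
T = table([1; 1; 2; 2; 2], [10; 20; 0; 60; NaN], ...
    'VariableNames', {'DayOfWeek', 'ArrDelay'});
writetable(T, csvFile);
ds = datastore(csvFile);

% Run serially in the client session and write output files to a temp folder.
outFolder = tempname;
mkdir(outFolder);
outds = mapreduce(ds, @delayMapper, @delayReducer, mapreducer(0), ...
    'OutputFolder', outFolder, 'Display', 'off');
result = readall(outds);   % one row per day: Key is the day, Value the mean
disp(result)

function delayMapper(data, ~, intermKVStore)
    valid = ~isnan(data.ArrDelay);
    [days, ~, idx] = unique(data.DayOfWeek(valid), 'stable');
    % One {[count, sum]} cell per day, added with a single addmulti call.
    countSum = accumarray(idx, data.ArrDelay(valid), size(days), ...
        @(x) {[numel(x), sum(x)]});
    addmulti(intermKVStore, days, countSum);
end

function delayReducer(day, intermValIter, outKVStore)
    % Accumulate [count, sum] over all chunks that contributed to this day.
    total = [0 0];
    while hasnext(intermValIter)
        total = total + getnext(intermValIter);
    end
    add(outKVStore, day, total(2) / total(1));
end
```

Storing the count and the sum, rather than a per-chunk mean, lets the reducer combine chunks of different sizes correctly.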

Introduced in R2014b
