This one-day course focuses on adapting existing algorithms to work with a collection of data files or with a single file that is too big to fit in memory. Learn to represent big data in MATLAB®, adjust existing code to work efficiently with it, and scale up the analysis to take advantage of your own computing resources or a cloud environment.
Topics include:
- Creating datastores to read from data sources
- Representing and manipulating big data using tall arrays
- Importing custom data formats and applying custom functions to tall arrays
- Working with clusters of computers and cloud environments
Day 1 of 1
Prototyping Big Data Algorithms
Objective: Apply existing algorithms to data sets that do not fit into memory.
- Importing data using datastores
- Creating tall arrays
- Running algorithms on tall arrays
- Optimizing code for tall arrays
- Reading data from cloud environments
Handling Custom Data and Algorithms
Objective: Import custom-formatted data and apply algorithms that are not implemented for tall arrays.
- Importing custom-formatted data using file datastores and custom datastores
- Partially importing single files
- Applying transformations, reductions, and moving-window operations to tall arrays
Working with Clusters and Clouds
Objective: Run big data algorithms on a cluster of computers or on cloud environments.
- Local and remote clusters
- Cluster discovery and connection
- Setting up a cluster in a cloud environment
- File access considerations