Data sets often require preprocessing to ensure accurate, efficient, or meaningful analysis. Data cleaning refers to methods for finding, removing, and replacing bad or missing data. Detecting local extrema and abrupt changes can help identify significant data trends. Smoothing and detrending remove noise and linear trends from data, while scaling changes the bounds of the data. Grouping and binning methods identify relationships among the data variables.
|Find missing values|
|Remove missing entries|
|Fill missing values|
|Create missing values|
|Insert standard missing values|
|Find outliers in data|
|Detect and replace outliers in data|
|Detect and remove outliers in data|
|Moving median absolute deviation|
|Group data into bins or categories|
|Number of group elements|
|Group summary computations|
|Transform by group|
|Histogram bin counts|
|Bivariate histogram bin counts|
|Find groups and return group numbers|
|Split data into groups and apply function|
|Apply function to table or timetable rows|
|Apply function to table or timetable variables|
|Construct array with accumulation|
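As an illustration of the missing-data and outlier entries in the table above, here is a minimal sketch in plain Python (standard library only). The helper names `is_missing`, `fill_missing_previous`, and `is_outlier` are hypothetical and are not the functions this page documents. The outlier rule is an assumption: a value is flagged when it lies more than three scaled median absolute deviations (MAD) from the median, using the common 1.4826 scaling constant.

```python
import math
from statistics import median


def is_missing(values):
    """Flag entries that are None or NaN (hypothetical helper)."""
    return [v is None or (isinstance(v, float) and math.isnan(v)) for v in values]


def fill_missing_previous(values):
    """Replace each missing entry with the most recent non-missing value.

    Leading missing entries have no earlier value and stay None.
    """
    filled, last = [], None
    for v, miss in zip(values, is_missing(values)):
        if not miss:
            last = v
        filled.append(last)
    return filled


def is_outlier(values, threshold=3.0):
    """Flag values more than `threshold` scaled MADs from the median.

    Assumption: 1.4826 is the usual constant that makes the MAD comparable
    to a standard deviation for normally distributed data.
    """
    med = median(values)
    scaled_mad = 1.4826 * median(abs(v - med) for v in values)
    if scaled_mad == 0:
        return [False] * len(values)  # no spread, so nothing is flagged
    return [abs(v - med) > threshold * scaled_mad for v in values]


raw = [1.0, 2.0, float("nan"), 3.0, None, 100.0, 2.5]
filled = fill_missing_previous(raw)    # [1.0, 2.0, 2.0, 3.0, 3.0, 100.0, 2.5]
outliers = is_outlier(filled)          # only the 100.0 entry is flagged
cleaned = [v for v, bad in zip(filled, outliers) if not bad]
```

Filling before detecting outliers is a design choice in this sketch; the reverse order is equally reasonable when filled values should not influence the median.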
Handle missing values in data sets.
This example shows how to find, clean, and delete table rows with missing data.
Eliminate unwanted noise or behavior in data, and find, fill, and remove outliers.
Remove linear trends from data.
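To make the idea of removing a linear trend concrete, the following plain-Python sketch fits a straight line by least squares and subtracts it. The function name `detrend_linear` and the sample values are hypothetical, and the sketch assumes evenly spaced samples indexed 0, 1, 2, and so on; it is not the implementation of any particular library routine.

```python
def detrend_linear(y):
    """Subtract the least-squares straight-line fit from y.

    Assumes samples are evenly spaced, so the index serves as the x value.
    """
    n = len(y)
    if n < 2:
        return [0.0] * n  # a single point lies exactly on its own fit
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [v - (slope * i + intercept) for i, v in enumerate(y)]


signal = [0.0, 1.5, 1.0, 3.5, 3.0, 5.5]  # upward drift plus a small wiggle
residual = detrend_linear(signal)         # drift removed, wiggle remains
```

After detrending, the residual has zero mean and no remaining linear correlation with the sample index, which is what makes the fluctuations around the trend easier to analyze.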
You can use grouping variables to categorize data variables.
This example shows how to group data and apply statistics functions to each group.
This example shows how to group data variables and apply functions to each group.
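The grouping and binning ideas above can be sketched in plain Python as follows. The names `group_apply` and `bin_index`, the sample data, and the bin edges are all hypothetical. The binning convention is an assumption: each bin includes its left edge, and the last bin also includes its right edge.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean


def group_apply(keys, values, func):
    """Split values by grouping key, apply func to each group, return {key: result}."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[k].append(v)
    return {k: func(vs) for k, vs in groups.items()}


def bin_index(value, edges):
    """Return i such that edges[i] <= value < edges[i + 1].

    The last bin also includes its right edge; values outside the edges give None.
    """
    if value < edges[0] or value > edges[-1]:
        return None
    if value == edges[-1]:
        return len(edges) - 2
    return bisect_right(edges, value) - 1


location = ["east", "west", "east", "west", "east"]   # grouping variable
temps = [12.0, 18.0, 14.0, 22.0, 25.0]                # data variable

mean_by_location = group_apply(location, temps, mean)  # one mean per group
count_by_location = group_apply(location, temps, len)  # number of elements per group
bins = [bin_index(t, [10, 15, 20, 25]) for t in temps]
```

Passing the function as an argument keeps the split step separate from the computation, so the same grouping can feed a mean, a count, or any other per-group summary.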