Biosciences: Statistical Methods
This curriculum module uses biological data to teach fundamental concepts of statistics, data analysis, and data visualization.
This module teaches students how to use statistical methods in MATLAB® to analyze ecologically relevant data. We explore the data using descriptive statistics, fit the data with a predictive model, find linear correlations between variables, and finally discuss how to test a hypothesis. Before continuing, make sure you're familiar with the basics of MATLAB by going through the MATLAB Onramp. We also recommend reviewing the Biosciences Data module.
This module utilizes the Palmer penguins [1] dataset, which contains data about three different species of penguin in Antarctica.
A Gentoo penguin spreading its flippers
This module assumes basic MATLAB knowledge. We recommend that all students take the MATLAB Onramp and go through the related Biosciences Data curriculum module.
To learn more about opening and using MATLAB, see the accompanying Getting Started guide.
Notes:
- These live scripts can all be run independently, though we recommend going through them in order.
- The live scripts are intended to be used with output inline. To change the output, go to the View tab of the toolstrip and select Output Inline.
- The scripts have areas for students to interact with the code.
- Most scripts include exercises, with answers provided at the end.
- A problem set for students to practice these concepts is also included here.
- Throughout the scripts, there are moments for students to reflect on what they've learned or on what the data means.
- Particularly interesting examples of how these concepts are used in "real-world" biology are also pointed out.
- Learning objective: Students will learn why statistical methods are important in biology.
- Learning objective: Students will learn how to clean data and prepare a dataset for analysis.
Further explore and visualize penguins
- Learning objective: Students will learn how to use histograms and box plots to understand the distribution of data.
- Learning objective: Students will learn to calculate and interpret descriptive statistics including mean, median, and standard deviation.
- Learning objective: Students will learn how to fit linear regression models to data and make predictions about the data.
- Learning objective: Students will learn how to calculate and visualize linear correlations between variables.
- Learning objective: Students will learn to formulate null and alternative hypotheses, test them using t-tests, and interpret p-values.
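To give a feel for the workflow these objectives describe, here is a minimal sketch in MATLAB. It is not taken from the module's live scripts. It uses synthetic numbers rather than the real Palmer penguins data, and the variable names (`FlipperLength`, `BodyMass`, `Group`) and all numeric parameters are hypothetical choices made for illustration. `fitlm` and `ttest2` require Statistics and Machine Learning Toolbox, and `boxchart` assumes a reasonably recent MATLAB release.

```matlab
% Synthetic stand-in data (NOT the real Palmer penguins measurements).
rng(0);                                    % fix the seed so the run is repeatable
n = 50;                                    % observations per hypothetical group
flipperA = 190 + 6*randn(n,1);             % made-up flipper lengths, group A (mm)
flipperB = 215 + 6*randn(n,1);             % made-up flipper lengths, group B (mm)
flipper  = [flipperA; flipperB];
mass     = 50*flipper - 5800 + 300*randn(2*n,1);   % made-up body mass (g)
mass(3)  = NaN;                            % simulate one missing measurement

% Clean the data: build a table and drop rows with missing values.
T = table(flipper, mass, 'VariableNames', {'FlipperLength','BodyMass'});
T.Group = categorical([repmat({'A'},n,1); repmat({'B'},n,1)]);
T = rmmissing(T);

% Descriptive statistics for body mass.
mu  = mean(T.BodyMass);
med = median(T.BodyMass);
sd  = std(T.BodyMass);

% Visualize the distribution with a histogram and grouped box plots.
figure; histogram(T.BodyMass); xlabel('Body mass (g)');
figure; boxchart(T.Group, T.BodyMass); ylabel('Body mass (g)');

% Fit a linear regression model and predict mass for a 200 mm flipper.
mdl = fitlm(T, 'BodyMass ~ FlipperLength');
predMass = predict(mdl, table(200, 'VariableNames', {'FlipperLength'}));

% Linear (Pearson) correlation between the two measurements.
R = corrcoef(T.FlipperLength, T.BodyMass);
r = R(1,2);

% Two-sample t-test. Null hypothesis: both groups have the same mean body mass.
[h, p] = ttest2(T.BodyMass(T.Group == 'A'), T.BodyMass(T.Group == 'B'));
```

Here `h` is 1 when the null hypothesis of equal means is rejected at the default 5% significance level, and `p` is the corresponding p-value. Displaying `mdl` shows the fitted coefficients along with their standard errors and the model's R-squared.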
Link to 5 other modules here once set up.
MATLAB®, Statistics and Machine Learning Toolbox™, Curve Fitting Toolbox™
[1] Horst AM, Hill AP, Gorman KB (2020). palmerpenguins: Palmer Archipelago (Antarctica) penguin data. R package version 0.1.0. https://allisonhorst.github.io/palmerpenguins/. doi:10.5281/zenodo.3960218.
The License for this project is in the License.txt file in this repository.
© Copyright 2023 The MathWorks, Inc.
Cite As
Emma Smith Zbarsky (2024). Biosciences: Statistical Methods (https://github.com/MathWorks-Teaching-Resources/Biosciences-Statistical-Methods), GitHub.