Sequenom2pedfile
This script has been freely uploaded to MATLAB Central with no warranties.
Author: Ismael Huertas, Movement Disorders Lab, IBiS / CSIC / Universidad de Sevilla
Abstract:
This script combines the genotyping information generated by multiple runs of the Sequenom MassARRAY platform.
Each Sequenom run is processed with the MASSARRAY TYPER 4.0.20 software, which lets the user export reports in xls format.
The script was developed specifically to combine the "Genotypes" sheets of multiple GenotypeArea.xls reports.
Inputs:
* samplefile.txt: This file should contain four columns (sample, sex, age, pheno); the header row must be included.
sample indicates the sample ID
sex indicates the sex of the subject (1=male, 2=female)
age indicates the age of the subject
pheno indicates the subject's value for the trait under study (1=unaffected, 2=affected, or a numeric value for quantitative traits)
* mapfile.txt: This file should contain four columns (chr, SNP, gene, position); the header row must be included.
chr indicates the chromosome on which the SNP is located
SNP indicates the SNP ID
gene indicates the gene in which the SNP is located (tip: type INTERGENIC if the SNP does not belong to a gene)
position indicates the bp position of the SNP on the chromosome
* duplications.txt: This file should contain two columns (SUBID, SAMID); the header row must be included.
SUBID indicates the subject ID
SAMID indicates the sample ID
Two or more samples that share a SUBID (each must have a different SAMID) are treated as repeated samples of the same subject
and are used in the concordance calculations.
* GenotypeArea.xls: The genotyping data must follow the GenotypeArea.xls report format. The script reads
the sheet named "Genotypes" in each xls file. Multiple xls files can be loaded.
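The role of duplications.txt can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and is not the script's own code. It assumes that concordance means the fraction of SNPs called in both repeated samples whose genotypes agree; the listing does not state the exact definition used. The function names, sample IDs, SNP IDs and genotype strings are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def group_by_subject(dup_rows):
    """Map each SUBID to the SAMIDs that share it (rows as in duplications.txt)."""
    groups = defaultdict(list)
    for subid, samid in dup_rows:
        groups[subid].append(samid)
    # Only subjects with two or more samples count as repeated samples.
    return {subid: ids for subid, ids in groups.items() if len(ids) > 1}


def pair_concordance(g1, g2, missing=""):
    """Assumed definition: share of SNPs called in both samples that agree."""
    shared = [
        snp for snp in g1
        if snp in g2 and g1[snp] != missing and g2[snp] != missing
    ]
    if not shared:
        return None  # no SNP was called in both samples
    # Sort the alleles so that "AG" and "GA" count as the same genotype.
    agree = sum(sorted(g1[snp]) == sorted(g2[snp]) for snp in shared)
    return agree / len(shared)


def concordance_report(dup_rows, genotypes):
    """Concordance for every pair of repeated samples of each subject."""
    report = {}
    for subid, samids in group_by_subject(dup_rows).items():
        for a, b in combinations(samids, 2):
            report[(subid, a, b)] = pair_concordance(genotypes[a], genotypes[b])
    return report


# Hypothetical data: subject S1 was genotyped in two runs, S2 only once.
dup_rows = [("S1", "S1_run1"), ("S1", "S1_run2"), ("S2", "S2_run1")]
genotypes = {
    "S1_run1": {"rs1": "AG", "rs2": "CC", "rs3": ""},
    "S1_run2": {"rs1": "GA", "rs2": "CT", "rs3": "TT"},
    "S2_run1": {"rs1": "AA", "rs2": "CC", "rs3": "TT"},
}
report = concordance_report(dup_rows, genotypes)
```

In this example rs3 is skipped because it has no call in the first run, rs1 agrees and rs2 does not, so the single repeated pair has a concordance of 0.5.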
Outputs:
* descriptives.txt: This file contains four columns indicating: sample ID, sex, age and phenotype.
* GENEX.txt: One textfile is generated per gene containing the genotyping info.
* GENOTYPES.xls: The resulting genotyping matrix is printed in xls format. This can be useful for importing the information to a database.
* pedfile.ped and mapfile.map: These are the text files in linkage format. They can be used as input for many commonly used
genetic analysis programs, such as PLINK, UNPHASED, etc.
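For readers unfamiliar with the linkage format, the sketch below shows how one line of each file is laid out in the standard PLINK convention: a PED line holds family ID, individual ID, paternal ID, maternal ID, sex and phenotype followed by two alleles per SNP, and a MAP line holds chromosome, SNP ID, genetic distance and bp position. It is written in Python rather than MATLAB and is not the script's own code. Using the sample ID as the family ID, setting both parents to 0 and setting the genetic distance to 0 are assumptions; the helper names and the example values are hypothetical.

```python
def ped_line(sample, sex, pheno, genotypes, snp_order):
    """Build one PED line; genotypes maps SNP ID to a two-letter call."""
    alleles = []
    for snp in snp_order:
        call = genotypes.get(snp, "")
        if len(call) == 2:
            alleles.extend(call)  # "AG" becomes the two alleles "A", "G"
        else:
            alleles.extend(["0", "0"])  # PLINK code for a missing genotype
    # Assumption: the sample ID doubles as family ID and parents are unknown.
    fields = [sample, sample, "0", "0", str(sex), str(pheno)] + alleles
    return " ".join(fields)


def map_line(chrom, snp, position):
    """Build one MAP line; the genetic distance column is set to 0."""
    return f"{chrom}\t{snp}\t0\t{position}"


# Hypothetical subject: male (1), affected (2), with no call for rs2.
example_ped = ped_line("S1", 1, 2, {"rs1": "AG", "rs2": ""}, ["rs1", "rs2"])
example_map = map_line("4", "rs1", 12345)
```

The SNP order in the PED line must match the row order of the MAP file, which is why the sketch passes an explicit `snp_order` list instead of relying on dictionary order.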
Cite As
Ismael Huertas (2024). Sequenom2pedfile (https://www.mathworks.com/matlabcentral/fileexchange/41569-sequenom2pedfile), MATLAB Central File Exchange. Retrieved .
MATLAB Release Compatibility
Platform Compatibility
Windows macOS Linux
Version | Published | Release Notes
---|---|---
1.0.0.0 | |