Sequenom2pedfile

Creates a pedfile from the genotyping data generated by Sequenom MassARRAY runs
232 Downloads
Updated 1 May 2013


This script has been freely uploaded to MATLAB Central with no warranties.
Author: Ismael Huertas, Movement Disorders Lab, IBiS / CSIC / Universidad de Sevilla

Abstract:
The aim of this script is to combine the genotyping information generated by multiple runs of the Sequenom MassARRAY.
Each Sequenom run is assumed to have been processed with the software MASSARRAY TYPER 4.0.20, which lets the user
generate reports in xls format. In particular, this script has been developed to combine multiple Genotypes sheets from the
GenotypeArea.xls report.
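
The combining step described above can be illustrated with a minimal Python sketch. It is not the author's MATLAB code, and the layout of the Genotypes sheet is not described here, so the input structure (one dict per run, keyed by sample ID and SNP ID) and the function name combine_runs are hypothetical. The rule for overlapping calls (keep the first non-empty call) is also an assumption.

```python
from collections import defaultdict

def combine_runs(runs):
    """Merge per-run genotype calls into one sample-by-SNP table.

    `runs` is a list of dicts mapping (sample_id, snp_id) -> genotype call
    such as "AG"; an empty string means no call. This layout is an
    assumption made for illustration only.
    """
    table = defaultdict(dict)
    for run in runs:
        for (sample, snp), call in run.items():
            if call and not table[sample].get(snp):
                # Keep the first non-empty call; later runs only fill gaps.
                table[sample][snp] = call
            else:
                # Record the SNP as seen without overwriting an existing call.
                table[sample].setdefault(snp, "")
    return {sample: dict(calls) for sample, calls in table.items()}

run1 = {("S1", "rs1"): "AG", ("S2", "rs1"): ""}
run2 = {("S2", "rs1"): "GG", ("S1", "rs2"): "CC"}
combined = combine_runs([run1, run2])
```

Here the missing call for S2 in the first run is filled by the second run, while calls already present are left untouched.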

Inputs:

* samplefile.txt: This file should contain four columns (sample, sex, age, pheno); the header row must be included.
sample indicates the sample ID
sex indicates the sex of the subject (1=male, 2=female)
age indicates the age of the subject
pheno indicates the subject's value for the trait under study (1=unaffected, 2=affected, or a numeric value for quantitative traits)

* mapfile.txt: This file should contain four columns (chr, SNP, gene, position); the header row must be included.
chr indicates the chromosome on which the SNP is located
SNP indicates the SNP ID
gene indicates the gene in which the SNP is located (tip: type INTERGENIC if it does not belong to a gene)
position indicates the bp position of the SNP on the chromosome

* duplications.txt: This file should contain two columns (SUBID, SAMID); the header row must be included.
SUBID indicates the subject ID
SAMID indicates the sample ID

Two or more samples that share a SUBID (each with a different SAMID) are treated as repeated samples of the same subject and are
included in the concordance calculations.

* GenotypeArea.xls: The genotyping info must be loaded according to the GenotypeArea.xls report format. This script will read
the sheet named "Genotypes" contained in this xls file. Multiple xls files can be loaded.
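
As an illustration of how repeated samples could enter a concordance calculation, here is a minimal Python sketch. The script's exact definition of concordance is not described here, so this assumes a simple one: the fraction of SNPs with identical calls over all pairs of samples sharing a SUBID, skipping missing calls and ignoring allele order. The tab delimiter and the helper names read_table and concordance are also assumptions.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

def read_table(text):
    """Parse a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

def concordance(duplications, genotypes):
    """Fraction of matching calls between repeated samples of each subject.

    `genotypes` maps sample ID -> {SNP ID -> call}; this layout is hypothetical.
    Returns None when no pair of repeated samples shares a called SNP.
    """
    by_subject = defaultdict(list)
    for row in duplications:
        by_subject[row["SUBID"]].append(row["SAMID"])

    same = total = 0
    for samples in by_subject.values():
        for a, b in combinations(samples, 2):
            calls_a = genotypes.get(a, {})
            calls_b = genotypes.get(b, {})
            for snp in calls_a.keys() & calls_b.keys():
                ga, gb = calls_a[snp], calls_b[snp]
                if ga and gb:  # skip SNPs with a missing call in either sample
                    total += 1
                    # Compare alleles regardless of order ("AG" equals "GA").
                    same += sorted(ga) == sorted(gb)
    return same / total if total else None

dups = read_table("SUBID\tSAMID\nP1\tS1\nP1\tS2\nP2\tS3\n")
genos = {
    "S1": {"rs1": "AG", "rs2": "CC"},
    "S2": {"rs1": "GA", "rs2": "CT"},
    "S3": {"rs1": "AA"},
}
rate = concordance(dups, genos)
```

In this example only subject P1 has repeated samples; its two samples agree on rs1 and disagree on rs2.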

Outputs:

* descriptives.txt: This file contains four columns indicating: sample ID, sex, age and phenotype.

* GENEX.txt: One textfile is generated per gene containing the genotyping info.

* GENOTYPES.xls: The resulting genotyping matrix is printed in xls format. This can be useful for importing the information to a database.

* pedfile.ped and mapfile.map: These are the text files in linkage format. They can be used as input to many commonly used programs for
genetic analysis, such as PLINK, UNPHASED, etc.
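
For readers unfamiliar with the linkage format, the following Python sketch shows what one row of each file looks like in the layout PLINK reads: a .ped row holds family ID, individual ID, paternal ID, maternal ID, sex and phenotype followed by two alleles per SNP, and a .map row holds chromosome, SNP ID, genetic distance and bp position. How the script itself fills the family and parent columns is not stated here; this sketch assumes unrelated subjects (family ID equal to sample ID, parents set to 0) and two-letter genotype calls. The function names ped_line and map_line are hypothetical.

```python
def ped_line(sample, sex, pheno, calls, snps):
    """Format one .ped row for an unrelated subject.

    `calls` maps SNP ID -> two-letter genotype such as "AG" (assumed layout).
    """
    # Assumption: family ID = sample ID, both parents unknown (0).
    fields = [sample, sample, "0", "0", str(sex), str(pheno)]
    for snp in snps:
        call = calls.get(snp, "")
        # Two alleles per SNP; "0 0" marks a missing genotype.
        fields += list(call) if len(call) == 2 else ["0", "0"]
    return " ".join(fields)

def map_line(chrom, snp, position):
    """Format one .map row; genetic distance is set to 0 (unknown)."""
    return f"{chrom} {snp} 0 {position}"

snps = ["rs1", "rs2"]
ped_row = ped_line("S1", 1, 2, {"rs1": "AG"}, snps)
map_row = map_line("4", "rs1", 90645250)
```

The SNP order in the .ped rows must match the row order of the .map file, which is why both are driven by the same snps list. Note that age has no column in this format.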

Cite As

Ismael Huertas (2024). Sequenom2pedfile (https://www.mathworks.com/matlabcentral/fileexchange/41569-sequenom2pedfile), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2010b
Compatible with any release
Platform Compatibility
Windows macOS Linux

Version Published Release Notes
1.0.0.0