File Size: 21.6 KB, File ID: #41569, Version: 1.0

Creates the pedfile from the genotyping information generated by Sequenom Mass Array runs.

File Information

This script has been freely uploaded to MATLAB Central with no warranties.
Author: Ismael Huertas, lab of movement disorders, IBiS / CSIC / Universidad de Sevilla

The aim of this script is to combine the genotyping information generated by multiple runs of the Mass Array Sequenom.
Each Sequenom run is assumed to have been processed with the software MASSARRAY TYPER 4.0.20, which allows the user to
generate reports in xls format. In particular, this script has been developed to combine multiple Genotypes sheets from the
GenotypeArea.xls report.


Input files:

* samplefile.txt: This file should contain four columns (sample, sex, age, pheno). The header must be included.
sample: the sample ID
sex: the sex of the subject (1=male, 2=female)
age: the age of the subject
pheno: the subject's value for the trait under study (1=unaffected, 2=affected, or a quantitative value for quantitative traits)

* mapfile.txt: This file should contain four columns (chr, SNP, gene, position). The header must be included.
chr: the chromosome on which the SNP is located
SNP: the SNP ID
gene: the gene in which the SNP is located (tip: type INTERGENIC if it does not belong to a gene)
position: the bp position of the SNP on the chromosome

* duplications.txt: This file should contain two columns (SUBID, SAMID). The header must be included.
SUBID: the subject ID
SAMID: the sample ID

Two or more samples with the same SUBID (each must have a different SAMID) are treated as repeated samples of one subject
and are taken into account in the concordance calculations.

* GenotypeArea.xls: The genotyping information must be supplied in the GenotypeArea.xls report format. The script reads
the sheet named "Genotypes" in this xls file. Multiple xls files can be loaded.
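
The description does not say how concordance between repeated samples is computed. As a rough illustration only, the sketch below (written in Python, not taken from the MATLAB script) groups SAMIDs by SUBID and reports, for each pair of repeated samples, the fraction of SNPs called in both samples whose genotypes agree. The whitespace-delimited layout, the two-letter genotype strings, the empty string for a missing call, and all function names are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def read_duplications(lines):
    """Group sample IDs (SAMID) by subject ID (SUBID).

    Assumes whitespace-delimited rows with a header naming both columns.
    """
    rows = [line.split() for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    sub, sam = header.index("SUBID"), header.index("SAMID")
    groups = defaultdict(list)
    for row in body:
        groups[row[sub]].append(row[sam])
    return groups


def concordance(calls_a, calls_b):
    """Fraction of SNPs called in both samples whose genotypes agree.

    Genotypes are assumed to be two-letter strings; "" means no call.
    Allele order is ignored, so "AG" and "GA" count as the same call.
    Returns None when no SNP was called in both samples.
    """
    shared = [snp for snp in calls_a if calls_a[snp] and calls_b.get(snp)]
    if not shared:
        return None
    same = sum(sorted(calls_a[snp]) == sorted(calls_b[snp]) for snp in shared)
    return same / len(shared)


def replicate_concordance(groups, genotypes):
    """Concordance for every pair of samples that share a SUBID."""
    results = {}
    for samids in groups.values():
        for a, b in combinations(samids, 2):
            results[(a, b)] = concordance(genotypes[a], genotypes[b])
    return results


# Made-up data: subject S1 was genotyped twice, subject S2 once.
dup_text = "SUBID SAMID\nS1 S1_run1\nS1 S1_run2\nS2 S2_run1\n"
genotypes = {
    "S1_run1": {"snp1": "AG", "snp2": "CC", "snp3": ""},
    "S1_run2": {"snp1": "GA", "snp2": "CT", "snp3": "TT"},
    "S2_run1": {"snp1": "AA", "snp2": "CC", "snp3": "TT"},
}
groups = read_duplications(dup_text.splitlines())
result = replicate_concordance(groups, genotypes)
print(result)
```

In this toy data, snp3 is skipped because one replicate has no call, snp1 agrees and snp2 disagrees, so the single replicate pair scores 0.5. Subjects with only one sample produce no pairs and therefore do not appear in the result.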


Output files:

* descriptives.txt: This file contains four columns: sample ID, sex, age and phenotype.

* GENEX.txt: One text file is generated per gene, containing the genotyping information.

* GENOTYPES.xls: The resulting genotyping matrix is written in xls format. This can be useful for importing the information into a database.

* pedfile.ped and an accompanying file: These are the text files in linkage format. They can be used as input to many commonly used
programs for genetic analysis, such as PLINK, UNPHASED, etc.
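
For readers unfamiliar with the linkage format, the following Python sketch (an illustration, not the MATLAB script's code) builds one row of a PED file and one row of a MAP file as PLINK reads them. A PED row starts with family ID, individual ID, father ID, mother ID, sex and phenotype, followed by two alleles per SNP; a MAP row holds chromosome, SNP ID, genetic distance and bp position. Using the sample ID as the family ID, writing 0 for unknown parents and for missing alleles, and the function names themselves are assumptions of this example; the script may make different choices.

```python
def ped_line(sample, sex, pheno, calls, snp_order, missing="0"):
    """Build one PED row for a single sample.

    calls maps SNP ID to a two-letter genotype string (assumed encoding).
    Any SNP without a two-letter call is written as two missing alleles.
    """
    # Family ID = sample ID and unknown parents = 0 are assumptions.
    fields = [sample, sample, "0", "0", str(sex), str(pheno)]
    for snp in snp_order:
        call = calls.get(snp, "")
        alleles = list(call) if len(call) == 2 else [missing, missing]
        fields.extend(alleles)
    return " ".join(fields)


def map_line(chrom, snp, position):
    """Build one MAP row; genetic distance is written as 0 (unknown)."""
    return " ".join([str(chrom), snp, "0", str(position)])


# Made-up example: one affected male, two SNPs, the second one uncalled.
ped_row = ped_line("ID001", 1, 2, {"snp1": "AG"}, ["snp1", "snp2"])
map_row = map_line("12", "snp1", 12345)
print(ped_row)
print(map_row)
```

The sex and pheno codes from samplefile.txt (1=male, 2=female; 1=unaffected, 2=affected) match the coding PLINK expects in columns five and six, so they can be passed through unchanged. The gene and age columns have no place in either file and are left out here.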

Required Products: MATLAB
MATLAB release: MATLAB 7.11 (R2010b)