Code covered by the BSD License  


File Size: 21.6 KB, File ID: #41569

Sequenom2pedfile

Creates a pedfile from the genotyping information generated by Sequenom Mass Array runs.

Description

This script has been freely uploaded to Matlab Central with no warranties.
Author: Ismael Huertas, lab of movement disorders, IBiS / CSIC / Universidad de Sevilla

Abstract:
This script combines the genotyping information generated by multiple runs of the Sequenom Mass Array.
Each Sequenom run is assumed to have been processed with the MASSARRAY TYPER 4.0.20 software, which lets the user
generate reports in xls format. In particular, this script combines multiple "Genotypes" sheets from the
GenotypeArea.xls report.

Inputs:

* samplefile.txt: This file should contain four columns (sample, sex, age, pheno); the header row must be included.
sample indicates the sample ID
sex indicates the sex of the subject (1=male, 2=female)
age indicates the age of the subject
pheno indicates the value for the subject of the trait of study (1=unaffected, 2=affected, or a quantitative value for quantitative traits)
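
The layout of samplefile.txt can be illustrated with a short Python sketch (the script itself is written in MATLAB). The tab delimiter, the function name read_samplefile and the example rows are assumptions made for illustration; the description above does not state which delimiter the script expects.

```python
import csv
import io

EXPECTED_COLUMNS = ["sample", "sex", "age", "pheno"]


def read_samplefile(handle):
    """Read a sample file with the header: sample, sex, age, pheno.

    Assumption: columns are tab-delimited (not stated in the description).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for row in reader:
        # sex is coded 1 = male, 2 = female
        if row["sex"] not in ("1", "2"):
            raise ValueError(f"invalid sex code for {row['sample']}: {row['sex']}")
        rows.append(row)
    return rows


# Made-up example content: two subjects, one affected (2) and one unaffected (1).
example = "sample\tsex\tage\tpheno\nS001\t1\t63\t2\nS002\t2\t58\t1\n"
samples = read_samplefile(io.StringIO(example))
```

Values are kept as strings so that a quantitative pheno value passes through unchanged.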

* mapfile.txt: This file should contain four columns (chr, SNP, gene, position); the header row must be included.
chr indicates the chromosome in which the SNP is located
SNP indicates the SNP ID
gene indicates the gene in which the SNP is located (tip: type INTERGENIC if the SNP does not belong to a gene)
position indicates the bp position of the SNP in the chromosome
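
As a rough illustration of how these four columns relate to a PLINK-style .map file, the Python sketch below builds one .map line per SNP. A PLINK .map line holds the chromosome, the variant ID, the genetic distance and the bp position, so the gene column is not carried over. The function name, the SNP and gene names, and the genetic distance of 0 are assumptions for illustration, not details taken from the script.

```python
def to_plink_map_line(chrom, snp, position, genetic_distance=0):
    """Build one PLINK .map line: chromosome, SNP ID, genetic distance, bp position."""
    return f"{chrom}\t{snp}\t{genetic_distance}\t{position}"


# Made-up rows in the mapfile.txt layout (chr, SNP, gene, position).
example_rows = [
    {"chr": "4", "SNP": "snp_a", "gene": "GENE_A", "position": "1000"},
    {"chr": "12", "SNP": "snp_b", "gene": "INTERGENIC", "position": "2500"},
]

map_lines = [
    to_plink_map_line(row["chr"], row["SNP"], row["position"])
    for row in example_rows
]
```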

* duplications.txt: This file should contain two columns (SUBID, SAMID); the header row must be included.
SUBID indicates the subject id
SAMID indicates the sample id

Two or more samples with the same SUBID (each with a different SAMID) are therefore treated as repeated samples of one subject
and are taken into account in the concordance calculations.
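
The description does not say how concordance is computed. The Python sketch below shows one common definition, given only as an assumption: for each pair of repeated samples, the fraction of SNPs called in both samples whose genotypes agree. The function names, the empty string as the missing-call code and the two-letter genotype strings are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def group_replicates(duplications):
    """Map each SUBID to its SAMIDs, keeping only subjects with repeated samples."""
    by_subject = defaultdict(list)
    for subid, samid in duplications:
        by_subject[subid].append(samid)
    return {subid: ids for subid, ids in by_subject.items() if len(ids) > 1}


def concordance(calls_a, calls_b, missing=""):
    """Fraction of SNPs called in both samples whose genotypes agree."""
    compared = 0
    matched = 0
    for snp in calls_a.keys() & calls_b.keys():
        a, b = calls_a[snp], calls_b[snp]
        if a == missing or b == missing:
            continue  # skip SNPs without a call in either sample
        compared += 1
        # Allele order is ignored, so "AG" and "GA" count as the same genotype.
        if sorted(a) == sorted(b):
            matched += 1
    return matched / compared if compared else None


# Made-up data: SUB1 was genotyped twice (SAM1, SAM2), SUB2 once.
duplications = [("SUB1", "SAM1"), ("SUB1", "SAM2"), ("SUB2", "SAM3")]
genotypes = {
    "SAM1": {"snp_a": "AG", "snp_b": "CC", "snp_c": ""},
    "SAM2": {"snp_a": "GA", "snp_b": "CT", "snp_c": "TT"},
    "SAM3": {"snp_a": "AA", "snp_b": "CC", "snp_c": "TT"},
}

replicates = group_replicates(duplications)
results = {}
for subid, samids in replicates.items():
    for first, second in combinations(samids, 2):
        results[(first, second)] = concordance(genotypes[first], genotypes[second])
```

In this example SAM1 and SAM2 agree at snp_a, disagree at snp_b, and snp_c is skipped because SAM1 has no call there.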

* GenotypeArea.xls: The genotyping information must be supplied in the GenotypeArea.xls report format. This script reads
the sheet named "Genotypes" contained in this xls file. Multiple xls files can be loaded.

Outputs:

* descriptives.txt: This file contains four columns indicating: sample ID, sex, age and phenotype.

* GENEX.txt: One text file is generated per gene, containing the genotyping information for that gene.

* GENOTYPES.xls: The resulting genotyping matrix is written in xls format. This can be useful for importing the information into a database.

* pedfile.ped and mapfile.map: These are the text files in linkage format. They can be used as input to many commonly used programs for
genetic analysis, such as PLINK, UNPHASED, etc.
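
For readers unfamiliar with the linkage format, the Python sketch below assembles one line in the PLINK .ped layout: family ID, individual ID, paternal ID, maternal ID, sex, phenotype, then two alleles per SNP, with "0 0" for a missing genotype. Using the sample ID as the family ID, setting both parents to 0 (unrelated subjects) and storing each call as a two-letter string are assumptions for illustration, not details taken from the script.

```python
def to_ped_line(sample_id, sex, pheno, genotypes, snp_order):
    """Build one .ped line for an unrelated individual.

    Assumptions: family ID equals sample ID, parents are unknown (0),
    and each genotype call is a two-letter string such as "AG".
    """
    fields = [sample_id, sample_id, "0", "0", str(sex), str(pheno)]
    for snp in snp_order:
        call = genotypes.get(snp, "")
        if len(call) == 2:
            fields.extend([call[0], call[1]])
        else:
            fields.extend(["0", "0"])  # missing genotype
    return " ".join(fields)


# Made-up example: an affected male with one call and one missing genotype.
ped_line = to_ped_line(
    "S001", 1, 2, {"snp_a": "AG", "snp_b": ""}, ["snp_a", "snp_b"]
)
```

The SNP order passed in must match the order of the lines in the accompanying .map file.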

Required Products: MATLAB
MATLAB release: MATLAB 7.11 (R2010b)
Comments and Ratings (1)
26 Sep 2014 mila

ok
