Author: Irina Zorkoltseva, Institute of Cytology and Genetics, Novosibirsk
December, 2004
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. You can download it from our website (http://mga.bionet.nsc.ru/soft/gen_qc/gen_qc.zip or http://mga.bionet.nsc.ru/soft/gen_qc/gen_qc.tar.gz) or by FTP (ftp://mga.bionet.nsc.ru/gencheck/).
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
Contact:
Table of contents
INTRODUCTION
INSTALLATION
USAGE
OPTIONS
INPUT FILE FORMAT
OUTPUT FILE
RUN
EXAMPLE
REFERENCE
INTRODUCTION
In large data sets obtained in genetic studies, errors are almost inevitable. Pedigree data and genotype data are often kept in different files, and the individual ID in the pedigree data may not coincide with the person identification number (code) in the genotype file. The program pre_pedcheck combines the pedigree and genotype data. The resulting file is then tested for Mendelian inconsistencies. At this stage, the external program PedCheck (O'Connell and Weeks, 1998) is used to detect inheritance errors in autosomal markers, and our program x_check is used for X-linked markers. The information on errors is extracted from the output, linked back to the initial coding, and reported in table format (program rec_pedcheck_err).
INSTALLATION
This program is written in Perl, which is freely available for both Windows and Unix/Linux. After downloading the distribution file gen_qc.tar.gz, put it in a folder of your choice and type the command "tar -xzvf gen_qc.tar.gz" to unpack it. If you are a Windows user, you can unzip it with WinZip. NOTE: Windows users might need to install Perl to run the program; it can be downloaded for free at http://www.activestate.com/Products/ActivePerl/.
After unpacking the file, you will find a new folder gen_qc, which contains 7 files (readme.txt, gen_qc.pl, pre_pedcheck.pl, pedcheck.exe (executable file for Windows), pedcheck (executable file for Linux), x_check.pl, and rec_pedcheck_err.pl) and 2 folders (doc and example). In readme.txt you will find general information on the files in each folder and short instructions on how to run the program. In the folder doc, you can find this manual (gen_qc.html), the documentation of this program, and the GNU General Public License. In the folder example, you can find files that contain an example command, an example input pedigree file, and an example input genotype file.
USAGE
perl gen_qc.pl [-p myfile.ped] [-d myfile.dat] [-s /sep/]
OPTIONS
By default, the program looks for pedigree.dat and genotype.dat as inputs. This may be changed using the command line options -p (pedigree file) and -d (genotype file).
-p, -d
### Examples:
perl gen_qc.pl -p myfile.ped -d myfile.dat
One, none, or both of these options may be used. The order of the options is irrelevant.
-s
The input files are text files whose entries are assumed to be comma-separated (this may be changed using the option -s on the command line), with missing data indicated by empty entries.
### Examples:
perl gen_qc.pl -d myfile.dat -s
-x
If you are using X-linked data, you must include the command line option -x.
### Examples:
perl gen_qc.pl -x -p myfile.ped
INPUT FILE FORMAT
You need 2 input files to run this program: a pedigree file and a genotype file. Both input files should be in comma-delimited format (this may be changed using the option -s on the command line), with missing data indicated by empty entries. Each input file must contain a header line describing every column.
Comment lines starting with "#" may precede the header line.
Pedigree file
The pedigree file contains genealogy information in the standard pre-makeped LINKAGE pedigree file format.
The first 7 columns should contain the following information:
Column1: family identifier (ped)
Column2: individual ID (id)
Column3: father's ID (fa)
Column4: mother's ID (mo)
Column5: sex (1 = male, 2 = female) (sex)
Column6: unique person identification number (code)
Column7: old individual ID (old_id) (if you used recode_ped.pl; this column may be absent)
Column3 and Column4 are both zero when the individual is a founder. Any column after the 7th is ignored.
Genotype file
The genotype file contains at least three columns.
One column, named code, is required; it holds the unique person identification number.
The other columns contain genotype data.
In this file, genotypes are coded as allele1 and allele2 for both homozygotes and heterozygotes.
If your genotypes are coded differently, run recodesnp.pl, which is available on this site.
OUTPUT FILE
All information about errors is written to a file named pedcheck_rec.err.
RUN
To run the program, put the 2 input files and all programs into the same directory and, from that directory, type the command line at the DOS or Unix/Linux prompt. If you transferred the input files from Windows to Unix/Linux, remember to convert their format (e.g. with dos2unix).
EXAMPLE
In the folder example, the file example_command.txt records the command to run the example. You can also find an example input pedigree file and an example input genotype file there.
To run the example, copy gen_qc.pl, pre_pedcheck.pl, pedcheck.exe (executable file for Windows), pedcheck (executable file for Linux), x_check.pl, and rec_pedcheck_err.pl to the example directory, then from the example directory type:
perl gen_qc.pl -p ped_test.dat -d data_test.dat -s
REFERENCE
When using this software, please cite our website (http://mga.bionet.nsc.ru/soft/gen_qc).
We will provide a reference as soon as we publish a note on this software.
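As a supplement to the INPUT FILE FORMAT section, the following is a minimal sketch of a pre-flight check that could be run on the two input files before calling gen_qc.pl. It is not part of the gen_qc distribution. It assumes that the header line uses exactly the column names given in this manual (ped, id, fa, mo, sex, code) and that every row has as many fields as the header. The function names, the marker names snp1 and snp2, and the genotype values in the example data are hypothetical; the genotype values are not interpreted.

```python
import csv

# Column names as listed in this manual; assuming the header uses them verbatim.
PEDIGREE_COLUMNS = ["ped", "id", "fa", "mo", "sex", "code"]


def read_table(text, sep=","):
    """Skip '#' comment lines before the header, then return (header, rows)."""
    lines = text.strip().splitlines()
    while lines and lines[0].startswith("#"):
        lines.pop(0)
    reader = csv.reader(lines, delimiter=sep)
    header = [name.strip() for name in next(reader)]
    rows = [dict(zip(header, (f.strip() for f in row))) for row in reader if row]
    return header, rows


def check_inputs(pedigree_text, genotype_text, sep=","):
    """Return a list of problems found in the pedigree and genotype inputs."""
    problems = []
    ped_header, ped_rows = read_table(pedigree_text, sep)
    gen_header, gen_rows = read_table(genotype_text, sep)

    missing = [c for c in PEDIGREE_COLUMNS if c not in ped_header]
    if missing:
        problems.append("pedigree file lacks column(s): " + ", ".join(missing))
    if "code" not in gen_header:
        problems.append("genotype file has no 'code' column")
    if problems:
        return problems

    # Parents must be 0 (founder) or refer to an individual of the same family.
    members = {(r["ped"], r["id"]) for r in ped_rows}
    for r in ped_rows:
        for parent in ("fa", "mo"):
            if r[parent] != "0" and (r["ped"], r[parent]) not in members:
                problems.append(
                    f"family {r['ped']}: {parent} {r[parent]} of individual {r['id']} is not listed"
                )
        if r["sex"] not in ("1", "2"):
            problems.append(f"family {r['ped']}: individual {r['id']} has sex '{r['sex']}'")

    # Every genotyped person must be identifiable in the pedigree via 'code'.
    known_codes = {r["code"] for r in ped_rows}
    for r in gen_rows:
        if r["code"] not in known_codes:
            problems.append(f"genotype code {r['code']} not found in pedigree file")
    return problems


# Hypothetical example data: one family of two founders and a child.
PEDIGREE_EXAMPLE = """# example pedigree
ped,id,fa,mo,sex,code
1,1,0,0,1,101
1,2,0,0,2,102
1,3,1,2,1,103
"""

# The empty last field on the second data row stands for a missing genotype.
GENOTYPE_EXAMPLE = """# example genotypes
code,snp1,snp2
101,1/2,1/1
102,2/2,
104,1/2,1/2
"""

for problem in check_inputs(PEDIGREE_EXAMPLE, GENOTYPE_EXAMPLE):
    print(problem)
```

In this example the genotype file lists code 104, which has no counterpart in the pedigree file, so the sketch reports it. A different separator can be passed through the sep argument, mirroring the -s option.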