SIMCROSS

A program for haplotyping on general pedigrees

Written by Daniel E. Weeks
in collaboration with Eric Sobel, Jeffrey R. O'Connell, and Kenneth Lange
(c) 1995

We are maintaining a user e-mail list, so please register by sending
e-mail to dweeks@watson.hgen.pitt.edu or daniel.weeks@well.ox.ac.uk.

Given a set of codominant markers with known order, simcross carries out
haplotyping by simulated annealing. The original program was developed
as a research project with Dr. Kenneth Lange in 1987 when I was a
graduate student. We were originally trying to order loci by using
simulated annealing to minimize cross-over counts within each order --
this problem had too many layers of complexity to work well, and so the
project was abandoned until it was revived in this more useful context.

Important Note:
Since simcross uses simulated annealing to search a space of often
immense size, it may not converge to the best answer on the first run.
It may be necessary to run simcross several times on your data in order
to be assured of finding the optimal haplotype configuration.

Input:
   Two files in MENDEL-format:
      locus.dat
      pedm.dat
   Interactively input:
      locus order
      recombination fractions

Output:
   out.out       - Summary file
   recomb.first  - Haplotype configuration with the minimum energy
   recomb.ped    - Haplotype configuration that simcross converged to

The recomb.first and recomb.ped files are pedigree files in
MENDEL-format, and may be easily converted to Ped/Draw format for
graphical display on the Macintosh using my program PedPrep (available
by anonymous ftp to watson.hgen.pitt.edu). For each person, the maternal
chromosome is drawn on the left, and the paternal chromosome is drawn on
the right.
In order to make it easy to find the crossovers on the pedigree drawing,
we have added the following symbols:

Key to symbols used in recomb.first and recomb.ped:

   |  =  No crossovers in the next interval
   \  =  Crossover in the next interval in the right (paternal) chromosome
   /  =  Crossover in the next interval in the left (maternal) chromosome
   +  =  Crossovers in the next interval in both chromosomes
   *  =  Phase not determined by parental genotypes

Since there are often several states with the same minimum energy, each
time a new minimum energy is encountered, we record that state in
recomb.first. However, the simulated annealing routine may continue to
search the space after this state is encountered, and so may often
converge to another state with the same energy. The file 'recomb.first'
contains the first state encountered with the minimum energy.

Usage:
To use 'simcross' on data in LINKAGE-format, first extract only the
codominant marker loci using lsp to create a 'datafile.dat' and a
'pedfile.dat'. Then run 'linkmend' to convert from LINKAGE-format to
MENDEL-format. This will create a 'locus.dat' and a 'pedm.dat' file.
Then run 'simcross'. If the order of the markers in the datafiles
differs from the order you'd like to analyze under, you will have to
input the new order (as integers) when prompted to do so.

Note: 'linkmend' may also be obtained by anonymous ftp to
watson.hgen.pitt.edu

If you publish results generated by simcross, please cite these
articles:

Sobel E, Lange K, O'Connell JR, Weeks DE (1995) Haplotyping algorithms.
   In: Speed TP, Waterman MS (eds) Genetic mapping and DNA sequencing.
   Springer-Verlag, New York, in press.

Weeks DE, Sobel E, O'Connell JR, Lange K (1995) Computer programs for
   multilocus haplotyping of general pedigrees. American Journal of
   Human Genetics 56:1506-1507.

Please address any bug reports, queries, or comments to me (see
addresses below). Don't forget to register by sending me an e-mail
message.

Thank you,

-- Dan Weeks --
________________________________

Daniel E. Weeks, Ph.D.
                                    The Wellcome Trust Centre
Department of Human Genetics        for Human Genetics
University of Pittsburgh            University of Oxford
Crabtree Hall, Room A310            Windmill Road
130 DeSoto Street                   Oxford OX3 7BN
Pittsburgh, PA 15261                (+44) 865 740 043 (desk)
                                    (+44) 865 742 441 (main)
1 412 624-3066                      FAX: (+44) 865 742 196
FAX: 1 412 624-3020                 daniel.weeks@well.ox.ac.uk
dweeks@watson.hgen.pitt.edu

Files in distribution:

The files are available by anonymous ftp from watson.hgen.pitt.edu.
There are three different files available:

1) simcross.tar.Z:
      simcross.doc    Documentation file
      simcross.f      Source code (main module)
      rsub.f          Source code (subroutines)
      Makefile        Unix makefile for compiling programs
      locus.dat       Sample locus file (Krabbe disease pedigree)
      pedm.dat        Sample pedigree file (Krabbe disease pedigree)
      simcross.sh     Unix shell script for running simcross on Krabbe data
      out.out         Sample output file
      recomb.first    Sample output file
      recomb.ped      Sample output file

2) simcross.SUNOS.tar.Z:
      simcross.SUNOS  simcross executable compiled for SunOS 4.1.3
      simcross.doc    Documentation file

3) simcross.SOL.tar.Z:
      simcross.SOL    simcross executable compiled for Solaris (SunOS 5.3)
      simcross.doc    Documentation file

Compilation Instructions:

Our program is written in standard FORTRAN 77 and should be portable
across different operating systems. To compile on a Unix system, simply
type 'make'. This should generate an executable file called 'simcross'.
If this does not work, try

   f77 -o simcross simcross.f rsub.f
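
Running simcross several times:

The Important Note above advises running simcross more than once. The
following is only a minimal sketch of one way to do that from a Bourne
shell, not part of the distribution. It assumes that the interactive
responses (locus order, recombination fractions) can be supplied on
standard input from a file, and that separate runs can explore
different paths through the search space; this documentation does not
confirm either point. The function name 'run_many', the file
'answers.txt', the 'runN' directories, and 'screen.log' are all
hypothetical names chosen for illustration.

```shell
# run_many N ANSWERS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND N times in the current directory,
# feeding it the file ANSWERS on standard input, and moves the three
# output files named in this documentation into run1/, run2/, ...
run_many() {
    n=$1
    answers=$2
    shift 2
    i=1
    while [ "$i" -le "$n" ]; do
        mkdir -p "run$i" || return 1
        # Screen output is kept so that each run can be inspected later.
        "$@" < "$answers" > "run$i/screen.log" 2>&1 ||
            echo "run $i: command exited with an error" >&2
        # Move the outputs aside so the next run does not overwrite them.
        for f in out.out recomb.first recomb.ped; do
            if [ -f "$f" ]; then
                mv "$f" "run$i/$f"
            fi
        done
        i=$((i + 1))
    done
}
```

With locus.dat and pedm.dat in the current directory, a call such as
'run_many 5 answers.txt ./simcross' would leave five sets of output
files to compare by hand. The exact lines that belong in answers.txt
depend on the prompts simcross issues, which are not listed here; the
simcross.sh script in the distribution is likely a better guide.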