CHAPM(1)                 APM GENETICS PROGRAMS                 CHAPM(1)

NAME
     chapm - convert LINKAGE to APM and APM to APM
     [ link2apm - The same (L.Bachner,Infobiogen 24/6/96) ]

SYNTAX
     Getting brief help:

          chapm {-help,-usage}

     Reading from LINKAGE:

          chapm [-intype <type>] [-outtype <type>]
                [{-pedfile,-infile} <file>] [-locusfile <file>]
                [-outfile <file>] [-disease <num>]
                [-affdata "<string>"] [{-loci,-locus} "<string>"]
                [-check]

          chapm -quiet [-intype <type>] -outtype <type>
                [{-pedfile,-infile} <file>] -locusfile <file>
                [-outfile <file>] -disease <num>
                -affdata "<string>" [{-loci,-locus} "<string>"]
                [-check]

     Reading from APM:

          chapm [-intype <type>] [-outtype <type>]
                [{-pedfile,-infile} <file>] [-outfile <file>]
                [{-loci,-locus} "<string>"] [-check]

          chapm -quiet -intype <type> -outtype <type>
                [{-pedfile,-infile} <file>] [-outfile <file>]
                [{-loci,-locus} "<string>"] [-check]

     []         = optional (otherwise required)
     {}         = one of the items in this list
     <type>     = a valid file type
     <file>     = a file name
     <num>      = a number between 1 and the number of loci
     "<string>" = a string enclosed in quotes (in all cases the
                  quotes may be omitted if there are no spaces or
                  special characters in the string)

DESCRIPTION
     Chapm can read LINKAGE or any of the APM file formats (SL, ML,
     and MULT) and write any of the APM formats (see the INTRO file
     for a description of these formats). When reading LINKAGE, it
     uses other necessary information provided by the user (either
     through interactive input or command-line arguments) to
     determine which pedigree members are affected. For LINKAGE
     files, the supported locus types are Affection Status, Binary
     Factor, Numbered Alleles, and Quantitative Variable. Any of
     these types may be used to determine who is affected.

     Since chapm is capable of more extensive checking of the
     pedigree and locus data than the APM analysis programs
     themselves, we recommend that you use chapm with the -check
     option before doing any analyses.

     Chapm can also be used to polish a data file that has some
     simple format problem - if the loci are in the wrong order in
     an ML file, for example. The polished file that it writes will
     be compatible with the APM programs.
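     As a minimal sketch of the recommendation above, the following
     Python helper assembles the argument list for a non-interactive
     LINKAGE conversion with -check, using only the options listed
     under SYNTAX. The helper name and the file names are
     hypothetical, and the sketch only builds and prints the command;
     it does not run chapm.

```python
import shlex


def build_quiet_linkage_command(pedfile, locusfile, outfile, outtype,
                                disease, affdata, loci=None, check=True):
    """Assemble an argument list for a -quiet LINKAGE conversion.

    Hypothetical helper: it only mirrors the SYNTAX section of this page.
    """
    # With -quiet, reading LINKAGE requires -outtype, -locusfile,
    # -disease and -affdata; the remaining options are optional.
    args = [
        "chapm", "-quiet",
        "-intype", "LINKAGE",
        "-outtype", outtype,
        "-pedfile", pedfile,
        "-locusfile", locusfile,
        "-outfile", outfile,
        "-disease", str(disease),
        "-affdata", affdata,
    ]
    if loci:
        # The locus list is passed as one space-separated string.
        args += ["-loci", " ".join(str(n) for n in loci)]
    if check:
        args.append("-check")
    return args


# File names below are placeholders chosen for illustration only.
cmd = build_quiet_linkage_command("ped.dat", "loci.dat", "out.ml", "ML",
                                  disease=1, affdata="2", loci=[2, 3])
print(shlex.join(cmd))  # shell-quoted form, suitable for a script
```

     Building the list first and quoting it with shlex.join keeps
     strings such as the locus list intact when the command is pasted
     into a shell script.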
     The program may be used interactively or non-interactively.
     With the -quiet option, it suppresses all unimportant output
     and bypasses certain safety features (such as the prompt to
     confirm that the user wishes to overwrite a file). It can also
     read the pedigree file from standard input and write the output
     to standard output if the -quiet option is used.

     Chapm performs checks for obvious errors whenever it reads new
     data. These checks are automatic and do not require the -check
     option. Also, loops that have been broken in LINKAGE files are
     automatically reconnected.

     After the file has been read in (and after it has been
     internally converted to APM if it was LINKAGE), chapm can, at
     the user's request, change the order and number of marker loci.
     It will also perform more extensive checks on pedigree and
     locus integrity if the -check option was specified (see the
     description of the -check argument below).

     Before writing the output file, chapm checks to see if any of
     the pedigrees need to be renumbered. The APM programs require
     that the ancestors of any given member have smaller ID numbers
     than the given member; if this is not the case for all members
     in a pedigree (as often happens when reading LINKAGE), then the
     pedigree must be renumbered.

     Also before writing, it deletes any pedigrees that have fewer
     than two typed affecteds. What actually happens depends on the
     output file format - when writing MULT and SL format files, a
     pedigree is deleted if it does not have at least two affecteds
     typed at all markers, but when writing ML format files, a
     pedigree is deleted if it does not have at least two affecteds
     typed at at least one marker. This is because the APM programs
     that use ML format can support affecteds which are not typed at
     all loci.

ARGUMENTS
     -affdata
          This option may be used to tell the program how it is to
          determine affection when converting from LINKAGE.
          The exact format depends on the type of the disease locus
          (note that the data that goes with this option MUST be
          enclosed in quotes):

          For an Affection Status disease locus with liability
          classes:
               You might use -affdata "2-2" to declare all members
               with status 2 (affection) and in liability class 2 of
               that locus affected. Or you might use -affdata "2-*"
               to declare all members of status 2 (in all classes)
               affected. You can also make multiple specifications;
               -affdata "2-1 2-2" for example. Legal status numbers
               are 0, 1, and 2. Legal class numbers are between 0
               and the number of classes (specified in the locus
               file).

          For an Affection Status locus with no liability classes:
               This is the same as for the above, only without the
               class specification. For example, to declare all
               members of status 2 affected, you might use
               -affdata "2".

          For a Binary Factor or Numbered Allele locus:
               Input for both of these locus types is the same. The
               specification is the pair of allele numbers that
               define the affected genotype. For example, for a
               locus with two alleles, you might use -affdata "1/2"
               or -affdata "2/1" for those with alleles 1 and 2, or
               -affdata "1/*" for all those with allele 1 present
               (the other can be anything), and so on.

          For a Quantitative Variable disease locus:
               Input for this is a range or a single value. To mark
               as affected all those of quantitative value 100, you
               might use -affdata "100". To mark all those between
               100 and 200 (inclusive), you might use
               -affdata "100-200". To mark all those below (or equal
               to) 100, you might use -affdata "*-100".

     -check
          This option requests that pedigree and locus integrity be
          checked. Some errors will be fatal to the program and will
          need to be corrected by the user.
          These are the checks that are currently performed. The
          keywords used below whose meanings are not obvious are:

               defined:  has been assigned a value or is known to
                         exist (if values are not applicable)
               sensible: has been assigned a value within reasonable
                         bounds
               complete: all required attributes have been defined

          In checking pedigree integrity:

               For each pedigree:
                 o The number of members is defined and sensible.
                 o The proband is defined.
                 o All affecteds in the list of affecteds are
                   actually affected, defined, complete, and typed.
                 o The pedigree is connected (that is, we can go
                   from any one member in the pedigree to any other
                   member by moving from one relative to another).

               For each entry:
                 o The member's ID is defined and sensible.
                 o The member's status and sex are defined and
                   sensible.
                 o The member's mother and father are defined,
                   complete, and of the correct sex, or both parents
                   are missing.
                 o The number of offspring is sensible (>= 0 and
                   < # Entries).
                 o All the offspring are defined and complete.

               For each locus:
                 o Both alleles are defined or both are missing.
                 o The inherited genotype is possible given the
                   genotypes of the parents.

          In checking locus integrity:

               For each locus:
                 o The number of alleles is greater than 1 (a
                   1-allele locus is not of any interest in terms of
                   mapping a disease).
                 o The array of allele records is present and
                   complete.
                 o The sum of the allele frequencies is 1 (within
                   0.000001); if it is off by more than 0.001 it is
                   considered to be an error.

               For each allele:
                 o The allele frequency is between 0 and 1.

     -disease
          This is used to specify the disease locus when converting
          from LINKAGE. The number provided is the number of the
          locus in the pedigree data file and should be between 1
          and the number of loci.

     -infile, -pedfile
          Use one of these (synonymous) arguments to provide the
          program with the name of the pedigree data file to be
          converted. If the name of the input pedigree file is not
          supplied in -quiet mode, it is read from the standard
          input.
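     The allele-frequency checks listed under -check can be sketched
     in Python as follows. This is an illustration, not chapm's own
     code: the function name is hypothetical, the bounds 0 and 1 are
     assumed to be inclusive, and a sum that is off by more than
     0.000001 but no more than 0.001 is assumed to produce only a
     warning, since this page does not say how that case is handled.

```python
import math

SUM_TOLERANCE = 1e-6    # "within 0.000001" in the -check description
ERROR_THRESHOLD = 1e-3  # "off by more than 0.001" is an error


def classify_allele_frequencies(freqs):
    """Return 'ok', 'warning', or 'error' for one locus (hypothetical)."""
    # A locus needs more than one allele.
    if len(freqs) < 2:
        return "error"
    # Each frequency must lie between 0 and 1 (inclusive bounds assumed).
    if any(f < 0.0 or f > 1.0 for f in freqs):
        return "error"
    # fsum limits rounding error when adding many small frequencies.
    deviation = abs(math.fsum(freqs) - 1.0)
    if deviation <= SUM_TOLERANCE:
        return "ok"
    # Assumption: deviations between the two thresholds are a warning.
    return "error" if deviation > ERROR_THRESHOLD else "warning"


print(classify_allele_frequencies([0.25, 0.25, 0.5]))
```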
     -intype, -outtype
          These arguments specify the input and output file types.
          Valid parameters are ML, SL, MULT, and LINKAGE. All but
          the first L in LINKAGE may be omitted (so L alone is
          valid).

     -locus, -loci
          These allow you to specify the marker loci that you wish
          to save and the order in which to save them. The default
          is all of them (in their current order) for ML and MULT
          files, or the first locus for SL files. Note that when
          converting from LINKAGE, only Binary Factor and Numbered
          Allele loci can be written out to APM files (the others
          are deleted when the structures are converted
          internally), so the numbers that you specify must reflect
          this.

          As an example, consider a LINKAGE file that you wish to
          convert to ML format. Say that it has an Affection Status
          locus followed by two Numbered Allele loci, followed by a
          Quantitative Variable locus. You might use the Affection
          Status locus to determine affection, and you might want
          to save both marker loci (the second and third loci in
          the file). To do this, you would supply -loci "2 3" on
          the command line. You could instead use -loci "3 2" to
          reverse the order, or save only the second marker locus
          (the third locus in the original file) via -locus "3",
          and so on.

     -locusfile
          This option is used when reading LINKAGE to provide the
          program with the name of the locus data file.

     -outfile
          This is used to specify the name of the output file to
          write. Standard output is used in -quiet mode if this
          argument is not provided.

     -quiet
          This option is supplied mainly to facilitate use of the
          program in shell scripts. It requests non-interactive
          mode, which means that the user is required to supply all
          necessary information via the arguments. It also
          suppresses unimportant output. Certain safety features
          are bypassed, so you should really know what you are
          doing before using this option.

ERROR MESSAGES
     The error messages are meant to be as concise and informative
     as possible.
     The program, of course, can't always tell you what is wrong;
     most of the time it can only say what it thinks is wrong and
     describe the symptoms that it found.

NOTES ABOUT LIMITATIONS OF FILE FORMATS
     Most notably, there is a disparity between MULT and ML format:
     in ML format, affecteds need not be typed at every locus,
     whereas in MULT format, all affecteds must be typed at all
     loci. You may, therefore, find that you are losing affecteds or
     whole pedigrees when converting from ML to MULT, because some
     affecteds are untyped at some loci. There is nothing inherently
     wrong with this, but it is something to keep in mind.

BUGS
     As of this date, the support for Quantitative Variable loci has
     not been fully tested.

REFERENCES
     See the accompanying REFERENCES file.

APM Release 2.0          Last change: 5 Jul 1993
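     The difference between ML and MULT described under NOTES ABOUT
     LIMITATIONS OF FILE FORMATS can be illustrated with a short
     Python sketch of the rule for keeping a pedigree. The data
     layout (a list of dictionaries with "affected" and "genotypes"
     keys) and the function name are hypothetical. For ML output the
     sketch assumes one reading of the rule: an affected counts as
     typed if typed at one or more markers.

```python
def keeps_pedigree(members, outtype):
    """Return True if a pedigree has at least two typed affecteds.

    Hypothetical sketch; each member is a dict with an "affected" flag
    and a "genotypes" list holding one entry per marker (None = untyped).
    """
    # MULT and SL count an affected only if typed at every marker;
    # ML counts an affected typed at one or more markers (assumed reading).
    is_typed = any if outtype == "ML" else all
    typed_affecteds = sum(
        1
        for m in members
        if m["affected"] and is_typed(g is not None for g in m["genotypes"])
    )
    return typed_affecteds >= 2


# Two affecteds; the first is untyped at the second marker.
pedigree = [
    {"affected": True, "genotypes": [(1, 2), None]},
    {"affected": True, "genotypes": [(1, 1), (2, 2)]},
    {"affected": False, "genotypes": [None, None]},
]
for fmt in ("ML", "MULT"):
    print(fmt, keeps_pedigree(pedigree, fmt))
```

     Under these assumptions the example pedigree is kept for ML
     output but dropped for MULT output, which is the loss of
     pedigrees that the note warns about.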