TDTLIKE is a program that computes the TDT statistic for linkage in family data, along with its multiallelic likelihood-based counterpart.

The TDT statistic looks at all parents of affected children who are heterozygous for a specific marker allele, H, and counts how often H is transmitted to the affected offspring from such heterozygous parents. If there is no linkage, H is transmitted 50% of the time and the other allele 50% of the time. If there is linkage AND association, then the H allele would most likely be on the chromosome with the disease allele, D, and the two would be inherited together more than 50% of the time. Thus, a simple test of linkage between the two loci can be performed. Note that you MUST have BOTH association AND linkage to expect anything other than 50% transmission of allele H if the test is performed on a sample of singleton affecteds. If the sample instead consists of a small number of large pedigrees, a positive result does NOT mean there must be association.

The form of the statistic is very simple:

        (A - B)^2
       -----------
         (A + B)

where A is the number of affected offspring who inherited the H allele from a parent heterozygous for H, and B is the number of affected offspring to whom a parent heterozygous for H did not transmit the H allele. Under the null hypothesis, the expected values of A and B are equal for any given allele H, and the statistic is asymptotically distributed as a chi-square with 1 degree of freedom. Further, it should be done as a one-sided test, since the alternative hypothesis of interest is A > B, which indicates a positive association. However, in realistically sized datasets it is easy enough (and more accurate) to compute the exact p-value from the Binomial distribution, and this is how the program computes these p-values.
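
To make the arithmetic concrete, here is a minimal Python sketch of the statistic (A - B)^2 / (A + B) and of an exact one-sided Binomial p-value, P(X >= A) for X ~ Binomial(A + B, 0.5). It is an illustration only, not the TDTLIKE source: the function names and the example counts are made up, and the program's own handling of ties or edge cases may differ.

```python
from math import comb


def tdt_chi_square(a, b):
    """Asymptotic TDT statistic (A - B)^2 / (A + B)."""
    if a + b == 0:
        raise ValueError("no informative transmissions (A + B = 0)")
    return (a - b) ** 2 / (a + b)


def exact_one_sided_p(a, b):
    """P(X >= a) for X ~ Binomial(a + b, 0.5), the one-sided exact test.

    Assumption: 'one-sided' means the upper tail, matching the
    alternative hypothesis A > B.
    """
    n = a + b
    # Sum the upper tail of the Binomial(n, 0.5) distribution exactly.
    return sum(comb(n, k) for k in range(a, n + 1)) / 2 ** n


# Hypothetical counts: H transmitted 30 times, not transmitted 15 times.
a, b = 30, 15
print(tdt_chi_square(a, b))      # 5.0
print(exact_one_sided_p(a, b))   # exact upper-tail probability
```

For small A + B the exact tail and the chi-square approximation can differ noticeably, which is the reason given above for preferring the Binomial calculation.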
Of course, if there are multiple alleles at the marker locus, you would need to compute this statistic for each marker allele in turn, and the resulting multiple testing problem must be taken into account. In fact, application of a Bonferroni correction for n tests, where there are n alleles at a given locus, has been shown by simulation to give appropriate p-values in the multiple allele case. To this end, the TDTLIKE program outputs only these corrected one-sided p-values for each of the alleles. This does not take into account the problem of multiple markers, but it is something. Additionally, there is a program constant 'min' which sets the minimum value of (A + B) above, below which a given marker allele will not be tested. This avoids having to count very rare alleles in the multiple testing correction.

To try to make a more powerful statistic for multiallelic systems, I have adapted my likelihood-based disequilibrium model to the TDT design, allowing that one allele out of n would show an increased propensity to be transmitted, while all other heterozygotes would transmit their alleles with 50% probability. The likelihood of the data can then be formulated in terms of a value xi, where a parent heterozygous H/? transmits the H allele to an affected child with probability 0.5 + xi, so that the null hypothesis is xi = 0. This multiallelic test circumvents the problem of multiple testing, and is assumed to be approximately distributed, asymptotically, as a one-sided chi-square statistic with one degree of freedom. The corresponding p-values for this test are given by the program as well, where the test statistic is of the form:

              max L(xi > 0)
       2 ln  ---------------
               L(xi = 0)

Using this program is very simple. You need a LINKAGE format pedigree file, called PEDIN.DAT, and a corresponding parameter file called DATAIN.DAT. The first locus must be the trait (disease) locus.
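
As an aside on the two tests described above, the following Python sketch shows (1) a Bonferroni-corrected one-sided p-value per allele, skipping alleles with A + B below a threshold, and (2) the likelihood ratio 2 ln [max L(xi > 0) / L(xi = 0)] for a single candidate allele H. Several things here are assumptions rather than facts about TDTLIKE: the names are hypothetical, the default threshold of 5 is invented (the actual value of 'min' is not stated in this file), the number of tests is taken to be the number of alleles that pass the threshold, and the one-sided chi-square p-value is taken to be half the upper tail of a 1 df chi-square. How the program combines the n candidate alleles into one likelihood is not spelled out here, so the sketch treats H as given.

```python
from math import comb, erfc, log, sqrt


def exact_one_sided_p(a, b):
    """P(X >= a) for X ~ Binomial(a + b, 0.5)."""
    n = a + b
    return sum(comb(n, k) for k in range(a, n + 1)) / 2 ** n


def bonferroni_tdt(counts, min_informative=5):
    """counts maps allele -> (A, B); returns allele -> corrected p-value.

    Assumption: alleles with A + B < min_informative are not tested and
    do not count toward the number of tests.
    """
    tested = {h: ab for h, ab in counts.items() if sum(ab) >= min_informative}
    m = len(tested)
    return {h: min(1.0, m * exact_one_sided_p(a, b))
            for h, (a, b) in tested.items()}


def lrt_single_allele(a, b):
    """2 ln [max over xi >= 0 of L(xi) / L(0)] for one candidate allele.

    L(xi) is proportional to (0.5 + xi)^A * (0.5 - xi)^B, so the
    constrained estimate is xi = max(0, A / (A + B) - 0.5).
    """
    n = a + b
    if n == 0 or a <= b:
        return 0.0  # the maximum over xi >= 0 sits at xi = 0
    p_hat = a / n
    stat = 2 * a * log(2 * p_hat)
    if b > 0:
        stat += 2 * b * log(2 * (1 - p_hat))
    return stat


def one_sided_chi2_p(stat):
    """Half the upper tail of a 1 df chi-square (assumed interpretation)."""
    if stat <= 0:
        return 1.0
    return 0.5 * erfc(sqrt(stat / 2))


# Hypothetical transmission counts for a three-allele marker.
counts = {"1": (30, 15), "2": (15, 30), "3": (1, 1)}
print(bonferroni_tdt(counts))  # allele "3" is skipped: A + B < 5
stat = lrt_single_allele(30, 15)
print(stat, one_sided_chi2_p(stat))
```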
The program only uses information from families where both parents are typed and at least one is heterozygous for the locus under study (of course). The output is written to the screen and also to a file called TDT.OUT, which gives the TDT statistic for each allele, corrected for multiple testing, and the overall multiallelic TDT statistic and corresponding p-value.

If you are testing the null hypothesis of no linkage disequilibrium, it is imperative that you have only singleton affected cases, not multiplex pedigrees. In this context, each marker locus statistic would be independent relative to the null hypothesis, though this is not the case when multiplex pedigrees are used, as in the TDT design of Spielman et al (1993). A statistic with the same form has been referred to as a McNemar test when association is being tested on singleton affecteds.

For the McNemar test there is a simple extension to multiple loci, using similar arguments to those for the DISMULT program (as explained in the Am J Hum Genet paper below). The basic idea is that you assume the disease locus to be at a given point on the map of markers, and at that point you have a value for the strength of presumed association, alpha. You then let the association decay so that, for a point at distance theta away,

       lambda = alpha (1 - theta)^n

where lambda is the strength of the association, as in the normal disequilibrium statistic outlined in the manuscript. So, the multipoint McNemar consists of the sum of the single locus McNemar LRT statistics, where each locus has its lambda determined as a function of three quantities: the map distance between that marker and the presumed disease location; the horizontal decay parameter, n, which can be loosely interpreted as the age of the mutation; and alpha, the proportion of disease alleles estimated to be IBD from a common founder (carrying the associated alleles at each locus).
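
The decay rule lambda = alpha (1 - theta)^n can be illustrated with a few lines of Python. This is only a sketch of that one formula: the function name, the parameter values, and the marker distances are invented for the example, and theta is taken directly as an input because the conversion from map distance to theta is not described in this file. The per-locus LRT statistics that the multipoint test sums are defined in the cited papers and are not reproduced here.

```python
def decayed_association(alpha, theta, n):
    """Return lambda = alpha * (1 - theta)^n for one marker.

    alpha: strength of association at the presumed disease location
    theta: distance from that location to the marker (assumed to lie
           between 0 and 0.5)
    n:     decay parameter, loosely the age of the mutation
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is a proportion and must lie in [0, 1]")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta is assumed to lie in [0, 0.5]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return alpha * (1.0 - theta) ** n


# Hypothetical values: alpha = 0.4, n = 50, four markers at increasing
# distance from the presumed disease location.
alpha, n = 0.4, 50
for theta in (0.0, 0.01, 0.05, 0.10):
    print(theta, decayed_association(alpha, theta, n))
```

With a large n the association falls off quickly with distance, so only markers very close to the presumed disease location contribute much to the summed statistic.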
The multipoint extension is also defined in Terwilliger (1995b).

If there are problems, or if you detect any bugs, contact me at joe@well.ox.ac.uk

Also, in any publications resulting from the use of this program, please cite the following papers:

Terwilliger, JD (1995a) "A Powerful Likelihood Method for the Analysis of Linkage Disequilibrium between Trait Loci and One or More Polymorphic Marker Loci" Am J Hum Genet 56:777-787

Spielman RS, McGinnis RE, Ewens WJ (1993) "Transmission test for linkage disequilibrium: the insulin gene region and insulin dependent diabetes mellitus (IDDM)" Am J Hum Genet 52:506-516

Terwilliger, JD (1995b) "Pedigree-Based Likelihood Methods for Analysis of Linkage and Linkage Disequilibrium for One or More Polymorphic Marker Loci" (Under Review)