PROMOTER SCAN

PROMOTER SCAN is a program developed to facilitate the analysis of DNA sequences for Pol II promoter sequences.

This program is FREE. You may copy and distribute this program, but you may not charge for its distribution. You MUST register the program by sending your name and E-mail address. Registering this program helps to justify the program for funding purposes. Once registered, you will automatically be notified of PROMOTER SCAN changes and updates. PLEASE REGISTER; you will help assure the future of PROMOTER SCAN. Note that if you obtained this program directly from me, you are already registered.

The source code is written in the `C' language and is fairly easy to port to other hardware and operating systems. The source code will be made available upon request. You may make changes to the source code, but may not release the modified program or source code without my authorization.

If you use this program in published research, please cite:

Prestridge, D.S. (1995). Predicting Pol II Promoter Sequences Using Transcription Factor Binding Sites. J. Mol. Biol. 249: 923-32.

The author welcomes comments and suggestions on the program. Please contact:

    Dr. Dan S. Prestridge                    Tele: (612) 625-3744
    Advanced Biosciences Computing Center    E-mail: danp@biosci.umn.edu
    1479 Gortner Ave.
    University of Minnesota
    St. Paul, MN 55108

SEQUENCE FORMAT

PROMOTER SCAN will accept sequences in either GCG (Genetics Computer Group) or Staden format (i.e., as in Roger Staden).

You may refer to the GCG manuals for explanations and examples of GCG format, and may use the GCG SEQED editor to enter sequence data. See the example sequence SAMPLE.GCG. Note that sequences with in-sequence comments are not accepted, and there is a problem with sequences that have been reformatted using the GCG "reformat" command.

Staden format consists of only upper-case letters (A, T, C, and G).
Blank lines are ignored, no comments should be in the file, no spaces are allowed, and lines may be at most 80 nucleotides long. Please look at the sample sequence file SAMPLE.SEQ for details (it should be included with the program files).

The file must be ASCII, which means that if you use a word processor (such as WORD (tm), WordPerfect (tm), etc.) you must export the file into an ASCII format. This is because a word processor adds many things to a file that are hidden from the user (such as page formatting, the type of printer you selected, and many other things). If you have MS-DOS 5.0 or greater, its 'edit' program serves as a good ASCII sequence editor. If you are familiar with the Genetics Computer Group programs, you can use their sequence conversion utility programs to convert GenBank files.

Currently the maximum sequence length accepted by PROMOTER SCAN is limited to 50kb (UNIX version) or 10kb (PC version). Please note that your DNA sequence must be in the same subdirectory as the PROMOTER SCAN program, and the file name must have a 3-letter extension.

PROMOTER SCAN is offered to you as is, and so are its results, with no implied warranty.

PROMOTER SCAN is designed to find putative eukaryotic Pol II promoter sequences in primary sequence data. This program is experimental in nature, and should be used as an experimental tool. PROMOTER SCAN is best used to find regions in primary DNA sequence that might be good candidates for further testing of promoter functionality. At this time, using test sets of promoter and non-promoter sequences, the program recognizes approximately 70% of primate promoter sequences, with a false positive rate of about one in every 17,200 bases.

If you locate a promoter sequence using this program, you must cite:

Prestridge, D.S. (1995). Predicting Pol II Promoter Sequences Using Transcription Factor Binding Sites. J. Mol. Biol. 249: 923-32.

To start scanning your sequence, press "s".
See the other help selections for sequence format, interpreting results, running the program, program information, and how to contact the author.

RUNNING PROMOTER SCAN

You will need your sequence in GCG or Staden format (see help on sequence format). You must also have your sequence in the same directory from which you start PROMOTER SCAN.

Since you have obviously started the program with the PROSCAN command, I won't tell you that. Next you are given a menu; type the first letter of the menu item you want. If you choose to Scan your sequence for promoters, you will be asked for the sequence name. The program then begins its search for putative transcriptional elements in your sequence, which can take several minutes depending on the length of your sequence. Once this step is completed, PROMOTER SCAN begins to scan your sequence for promoter sequences. This will take even longer, from a few to several minutes. Be patient.

Your results are stored in a file named after your sequence and ending with a .pro extension. Thus if your sequence was named sample.seq, the results are stored in a file named sample.pro.

Predicted promoter in fickett.sdn from (+)114-364

INTERPRETING THE RESULTS:

The result file (whatever.pro) contains your results and can be printed using the UNIX lp or lpr command once the program has completed.

The results show the location of predicted promoter sequences. Predicted sequence regions are regions of DNA that contain a significant number and type of transcriptional elements (TEs) that are usually associated with Pol II promoter sequences. These promoter-associated TEs were previously determined by analysis (Prestridge, D.S. (1995) J. Mol. Biol. 249: 923-32).

Reported putative promoters are those regions of your sequence that exceed a predetermined cutoff score, set to recognize 70% of primate promoter sequences in the Eukaryotic Promoter Database (Bucher & Trifonov (1986) NAR 14: 10009-26).
At this cutoff score, false positive predictions occur at a rate of approximately one in every 17,200 single-strand bases. These predictive estimates are based upon experimental test sets of promoter and non-promoter sequences; you may find very different results.

I would be very happy to hear what you find, whether the results are positive or negative; send E-mail to danp@biosci.umn.edu. Any feedback I can get will help me to improve the program in the future.
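
To put the false positive rate quoted above in practical terms, here is a rough sketch in Python. It is not part of PROMOTER SCAN. The function name and the `strands` parameter are illustrative assumptions, as is the default of counting both strands. It only applies the stated rate of about one false positive per 17,200 single-strand bases to a sequence of a given length.

```python
# Rough estimate of chance (false positive) predictions for one sequence.
# The rate of one per 17,200 single-strand bases is taken from the help text;
# the function name and the `strands` parameter are illustrative assumptions.

FALSE_POSITIVE_SPACING = 17200  # single-strand bases per expected false positive


def expected_false_positives(length_bases, strands=2):
    """Return the expected number of false positive predictions.

    length_bases: sequence length in bases, counted on one strand.
    strands: number of strands counted (assumed; use 1 for a single strand).
    """
    if length_bases < 0 or strands < 1:
        raise ValueError("length_bases must be >= 0 and strands must be >= 1")
    return length_bases * strands / FALSE_POSITIVE_SPACING


if __name__ == "__main__":
    # 10 kb and 50 kb are the PC and UNIX sequence length limits given in
    # the sequence format help.
    for length in (10000, 50000):
        estimate = expected_false_positives(length)
        print(f"{length} bases, both strands: about {estimate:.1f} expected false positives")
```

Under these assumptions, a maximum-length 50kb sequence counted on both strands would be expected to yield roughly six chance predictions, so a hit in a long sequence is best treated as a candidate for further testing and not as a confirmed promoter.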