HMMER Manual                                           hmmpfam(1)


NAME
     hmmpfam - search one or more sequences against an HMM database


SYNOPSIS
     hmmpfam [_o_p_t_i_o_n_s] _h_m_m_f_i_l_e _s_e_q_f_i_l_e


DESCRIPTION
     hmmpfam reads a sequence file seqfile and compares each
     sequence in it, one at a time, against all the HMMs in
     hmmfile, looking for significantly similar sequence matches.


     _h_m_m_f_i_l_e will be looked for  first  in  the  current  working
     directory,  then  in  a  directory  named by the environment
     variable  _H_M_M_E_R_D_B.  This  lets  administrators  install  HMM
     library(s) such as Pfam in a common location.
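
     The lookup order can be sketched in Python as follows. This
     is an illustration only, not HMMER's own code: the function
     name find_hmmfile is hypothetical, and HMMERDB is treated as
     naming a single directory, which is an assumption.

```python
import os


def find_hmmfile(name, cwd=".", env=None):
    """Return the first existing candidate path for `name`, or None.

    Order follows the man page: the current working directory first,
    then the directory named by HMMERDB. Treating HMMERDB as a single
    directory is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    candidates = [os.path.join(cwd, name)]
    hmmerdb = env.get("HMMERDB")
    if hmmerdb:
        candidates.append(os.path.join(hmmerdb, name))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

     A copy in the current working directory therefore shadows a
     copy of the same name installed under HMMERDB.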


     There is a separate output report for each sequence in
     seqfile. This report consists of three sections: a ranked
     list of the best scoring HMMs, a list of the best scoring
     domains in order of their occurrence in the sequence, and
     alignments for all the best scoring domains. A sequence
     score may be higher than a domain score for the same
     sequence if there is more than one domain in the sequence;
     the sequence score takes into account all the domains. All
     HMMs that pass the -E and -T cutoffs are shown in the first
     list, then every domain found for the HMMs in this list is
     shown in the second list of domain hits. If desired, E-value
     and bit score thresholds may also be applied to the domain
     list using the --domE and --domT options.
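
     The two-stage selection described above can be sketched as
     follows. This is a minimal illustration under stated
     assumptions, not HMMER's implementation: the dictionary
     layout and the name select_hits are hypothetical, and
     treating a score or E-value exactly equal to a cutoff as
     passing is an assumption.

```python
import math


def select_hits(hmm_hits, seq_E=10.0, seq_T=-math.inf,
                dom_E=math.inf, dom_T=-math.inf):
    """Apply per-sequence cutoffs, then per-domain cutoffs.

    hmm_hits: list of dicts with keys 'name', 'score', 'evalue' and
    'domains'; each domain is a dict with 'start', 'score', 'evalue'.
    Defaults mirror the documented defaults of -E, -T, --domE, --domT.
    """
    # First list: hits passing both per-sequence cutoffs, best score first.
    ranked = [h for h in hmm_hits
              if h["evalue"] <= seq_E and h["score"] >= seq_T]
    ranked.sort(key=lambda h: h["score"], reverse=True)

    # Second list: domains of those hits only, in order of occurrence.
    domains = [(d["start"], h["name"], d["score"])
               for h in ranked
               for d in h["domains"]
               if d["evalue"] <= dom_E and d["score"] >= dom_T]
    domains.sort()
    return [h["name"] for h in ranked], domains
```

     With the default domain cutoffs, every domain of a reported
     hit appears in the second list, so the two lists stay
     consistent with each other.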


OPTIONS
     -h   Print brief help; includes version number  and  summary
          of all options, including expert options.


     -n   Specify that models and sequence are nucleic acid,  not
          protein.   Other  HMMER  programs  autodetect this; but
          because of the order in which hmmpfam accesses data, it
          can't  reliably  determine  the  correct  "alphabet" by
          itself.


     -A <_n>
          Limits the alignment output to  the  <_n>  best  scoring
          domains.  -A0 shuts off the alignment output and can be
          used to reduce the size of output files.


     -E <_x>
          Set the E-value cutoff for the per-sequence ranked  hit
          list  to  <_x>, where <_x> is a positive real number. The
          default is 10.0. Hits with E-values better  than  (less
          than) this threshold will be shown.


     -T <_x>
          Set the bit score cutoff for  the  per-sequence  ranked
          hit  list  to  <_x>,  where  <_x>  is a real number.  The
          default is negative infinity; by default, the threshold
          is  controlled  by  E-value and not by bit score.  Hits
          with bit scores better than (greater than) this  thres-
          hold will be shown.


     -Z <_n>
          Calculate the E-value  scores  as  if  we  had  seen  a
          sequence  database  of  <_n>  sequences.  The default is
          arbitrarily set to 59021, the size of Swissprot 34.
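
          Because the E-value depends on the assumed database
          size, a reported E-value can be rescaled to another
          size. The sketch below assumes the E-value is
          proportional to Z (E = P * Z for a fixed P-value); the
          function name rescale_evalue is hypothetical.

```python
DEFAULT_Z = 59021  # documented default: the size of Swissprot 34


def rescale_evalue(evalue, old_z=DEFAULT_Z, new_z=DEFAULT_Z):
    """Rescale an E-value computed for old_z sequences to new_z sequences.

    Assumes E = P * Z, i.e. the E-value grows linearly with the
    number of sequences the search is treated as having seen.
    """
    if old_z <= 0 or new_z <= 0:
        raise ValueError("Z must be positive")
    return evalue * new_z / old_z
```

          Under this assumption, doubling <n> doubles every
          E-value, so a hit that just passes -E at the default
          size may fall below the cutoff at a larger one.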


EXPERT OPTIONS
     --acc
          Report HMM accessions instead of names  in  the  output
          reports.   Useful for high-throughput annotation, where
          the data are being parsed for storage in  a  relational
          database.


     --compat
          Use the output format of  HMMER  2.1.1,  the  1998-2001
          public release; provided so 2.1.1 parsers don't have to
          be rewritten.


     --cpu <_n>
          Sets the maximum number of CPUs that the  program  will
          run  on. The default is to use all CPUs in the machine.
          Overrides the  HMMER_NCPU  environment  variable.  Only
          affects threaded versions of HMMER (the default on most
          systems).
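
          The precedence described above can be sketched as
          follows. The function name effective_cpu_limit is
          hypothetical, and reading HMMER_NCPU as a plain integer
          is an assumption.

```python
import os


def effective_cpu_limit(cpu_option=None, env=None, machine_cpus=None):
    """Return the CPU limit implied by --cpu, HMMER_NCPU, and the machine.

    cpu_option is the value given to --cpu, or None if it was not given.
    """
    env = os.environ if env is None else env
    if machine_cpus is None:
        machine_cpus = os.cpu_count() or 1
    if cpu_option is not None:       # --cpu overrides HMMER_NCPU
        return cpu_option
    if env.get("HMMER_NCPU"):        # otherwise the environment variable
        return int(env["HMMER_NCPU"])
    return machine_cpus              # default: all CPUs in the machine
```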


     --cut_ga
          Use Pfam GA (gathering threshold) score cutoffs.
          Equivalent to -T <GA1> --domT <GA2>, but the GA1 and
          GA2 cutoffs are read from each HMM in hmmfile
          individually. hmmbuild puts these cutoffs there if the
          alignment file was annotated in a Pfam-friendly
          alignment format (extended SELEX or Stockholm format)
          and the optional GA annotation line was present. If
          these cutoffs are not set in the HMM file, --cut_ga
          doesn't work.


     --cut_tc
          Use Pfam TC (trusted cutoff) score cutoffs. Equivalent
          to -T <TC1> --domT <TC2>, but the TC1 and TC2 cutoffs
          are read from each HMM in hmmfile individually.
          hmmbuild puts these cutoffs there if the alignment file
          was annotated in a Pfam-friendly alignment format
          (extended SELEX or Stockholm format) and the optional
          TC annotation line was present. If these cutoffs are
          not set in the HMM file, --cut_tc doesn't work.


     --cut_nc
          Use Pfam NC (noise cutoff) score cutoffs. Equivalent to
          -T <NC1> --domT <NC2>, but the NC1 and NC2 cutoffs are
          read from each HMM in hmmfile individually. hmmbuild
          puts these cutoffs there if the alignment file was
          annotated in a Pfam-friendly alignment format
          (extended SELEX or Stockholm format) and the optional
          NC annotation line was present. If these cutoffs are
          not set in the HMM file, --cut_nc doesn't work.
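
          Since the three --cut options depend on cutoffs stored
          in each HMM, it can be useful to check an HMM file
          before a long run. The sketch below rests on an
          assumption about the ASCII HMM file layout that this
          page does not state: each model record has a NAME line,
          an optional GA, TC or NC line carrying two numbers, and
          ends with a line containing only "//". The function
          name models_missing_cutoff is hypothetical.

```python
def models_missing_cutoff(hmm_text, tag="GA"):
    """Return the names of models that carry no `tag` cutoff line.

    tag is one of 'GA', 'TC', 'NC'. The record layout (NAME line,
    optional cutoff line with two numbers, '//' terminator) is an
    assumption of this sketch.
    """
    missing, name, seen = [], None, False
    for line in hmm_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "NAME" and len(fields) > 1:
            name = fields[1]
        elif fields[0] == tag and len(fields) >= 3:
            seen = True
        elif fields[0] == "//":          # end of one model record
            if name is not None and not seen:
                missing.append(name)
            name, seen = None, False
    return missing
```

          An empty result suggests that every model in the file
          carries the requested cutoff.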


     --domE <_x>
          Set the E-value cutoff for the  per-domain  ranked  hit
          list  to <_x>, where <_x> is a positive real number.  The
          default is infinity; by default,  all  domains  in  the
          sequences  that  passed  the  first  threshold  will be
          reported in the second list,  so  that  the  number  of
          domains reported in the per-sequence list is consistent
          with the number that appear in the per-domain list.


     --domT <_x>
          Set the bit score cutoff for the per-domain ranked  hit
          list to <_x>, where <_x> is a real number. The default is
          negative infinity;  by  default,  all  domains  in  the
          sequences  that  passed  the  first  threshold  will be
          reported in the second list,  so  that  the  number  of
          domains reported in the per-sequence list is consistent
          with the number that appear  in  the  per-domain  list.
          _I_m_p_o_r_t_a_n_t  _n_o_t_e: only one domain in a sequence is abso-
          lutely controlled by this parameter, or by --domT.  The
          second  and  subsequent domains in a sequence have a de
          facto bit score threshold of 0 because of  the  details
          of  how  HMMER  works. HMMER requires at least one pass
          through the main model per sequence; to  do  more  than
          one  pass (more than one domain) the multidomain align-
          ment must have a better score than  the  single  domain


HMMER @RELEASE@    Last change: @RELEASEDATE@                   3


HMMER Manual                                           hmmpfam(1)


          alignment,  and hence the extra domains must contribute
          positive score. See the Users' Guide for more detail.


     --forward
          Use the Forward algorithm instead of the Viterbi  algo-
          rithm  to determine the per-sequence scores. Per-domain
          scores are still determined by the  Viterbi  algorithm.
          Some have argued that Forward is a more sensitive algo-
          rithm for  detecting  remote  sequence  homologues;  my
          experiments  with  HMMER  have not confirmed this, how-
          ever.


     --informat <_s>
          Assert that the input _s_e_q_f_i_l_e is in format <_s>; do  not
          run  Babelfish  format  autodection. This increases the
          reliability  of  the  program  somewhat,  because   the
          Babelfish  can  make mistakes; particularly recommended
          for unattended, high-throughput runs  of  HMMER.  Valid
          format  strings include FASTA, GENBANK, EMBL, GCG, PIR,
          STOCKHOLM, SELEX, MSF, CLUSTAL,  and  PHYLIP.  See  the
          User's Guide for a complete list.
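
          For unattended runs, the command line can be assembled
          programmatically. The sketch below only builds and
          prints an argument list in the order given in the
          SYNOPSIS; it does not run hmmpfam. The helper name
          build_hmmpfam_command and the file names Pfam and
          seqs.fa are hypothetical.

```python
import shlex


def build_hmmpfam_command(hmmfile, seqfile, informat=None, acc=False,
                          evalue=None, cpu=None):
    """Build an hmmpfam argument list from options documented on this page."""
    argv = ["hmmpfam"]
    if informat is not None:
        argv += ["--informat", informat]   # skip format autodetection
    if acc:
        argv.append("--acc")               # report accessions, not names
    if evalue is not None:
        argv += ["-E", str(evalue)]        # per-sequence E-value cutoff
    if cpu is not None:
        argv += ["--cpu", str(cpu)]        # limit the number of CPUs
    argv += [hmmfile, seqfile]             # SYNOPSIS: options hmmfile seqfile
    return argv


cmd = build_hmmpfam_command("Pfam", "seqs.fa",
                            informat="FASTA", acc=True, cpu=2)
print(shlex.join(cmd))
```

          The resulting list can be passed to subprocess.run on a
          system where hmmpfam is installed.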


     --null2
          Turn off the post hoc second null  model.  By  default,
          each  alignment  is  rescored  by a postprocessing step
          that takes into account possible biased composition  in
          either  the HMM or the target sequence.  This is almost
          essential in database searches, especially  with  local
          alignment  models.  There  is  a very small chance that
          this postprocessing might remove real matches,  and  in
          these  cases  --null2  may  improve  sensitivity at the
          expense of reducing specificity by letting biased  com-
          position hits through.


     --pvm
          Run on a Parallel Virtual Machine (PVM). The PVM must
          already be running. The client program hmmpfam-pvm must
          be installed on all the PVM nodes. The HMM database
          hmmfile and an associated GSI index file hmmfile.gsi
          must also be installed on all the PVM nodes. (The GSI
          index is produced by the program hmmindex.) Because
          the PVM implementation is I/O bound, it is highly
          recommended that each node have a local copy of hmmfile
          rather than NFS mounting a shared copy. Optional PVM
          support must have been compiled into HMMER for --pvm to
          function.


     --xnu
          Turn on XNU filtering of target protein sequences.  Has
          no  effect  on nucleic acid sequences. In trial experi-
          ments, --xnu appears to  perform  less  well  than  the
          default post hoc null2 model.


SEE ALSO
     Master man page, with full list of and guide to the
     individual man pages: see hmmer(1).

     A User's Guide and tutorial come with the distribution:
     Userguide.ps (PostScript) and/or Userguide.pdf (PDF).

     Finally, all documentation is also available online via WWW:
     http://hmmer.wustl.edu/


AUTHOR
     This software and documentation is:
     HMMER - Biological sequence analysis with profile HMMs
     Copyright (C) 1992-1999 Washington University School of Medicine
     All Rights Reserved

     This source code is distributed under the terms of the GNU
     General Public License. See the files COPYING and LICENSE in
     your distribution for complete details.

     Sean Eddy
     HHMI/Dept. of Genetics
     Washington Univ. School of Medicine
     4566 Scott Ave.
     St Louis, MO 63110 USA
     Phone: 1-314-362-7666
     FAX  : 1-314-362-7855
     Email: eddy@genetics.wustl.edu


HMMER @RELEASE@    Last change: @RELEASEDATE@                   5