CONSED 8.0 DOCUMENTATION

CONTENTS:
    QUICK TOUR OF CONSED
    INSTALLING CONSED
    NOTE TO SGI USERS
    PRIMER PICKING PARAMETERS
    FOR PROGRAMMERS AND FELLOW TRAVELLERS ONLY
    NEW ACE FILE FORMAT
    ADVANCED PHRAP/CONSED USAGE
    WHAT IS NEW IN CONSED 8.0


QUICK TOUR OF CONSED


Release 8.0

Consed is a program for viewing and editing sequence assemblies
produced by the phrap assembly program.

If you are already an advanced consed user, you should read through
this and do any of the exercises on features that you are unfamiliar
with.  I frequently run across people who do something in consed the
hard way month after month and then request a new feature that is
already in consed.

If you have never used consed before, following this Quick Tour will
take you less than 2 hours.  However, it will save you approximately 2
days of agony.  If you have 2 extra days to spare, and prefer to waste
them in agony, then do not do this Quick Tour and instead immediately
skip down to 'INSTALLING CONSED' below.

When you do the quick tour, I encourage you to be free about changing
the data set.  If you really mess things up (such as changing all a
read's bases to N's), no problem--just delete the data set and start
again with a fresh copy.

1) After downloading the distribution with netscape (see www.phrap.org
and click on 'consed'), copy the distribution to a unix computer (if
it is not already on one), and then unpack the files by typing the
appropriate line below (which one depends on what you named the file
downloaded by netscape):

zcat consed_solaris.tar.Z | tar -xvf -
zcat consed_alpha.tar.Z   | tar -xvf -
zcat consed_hp.tar.Z      | tar -xvf -
zcat consed_sgi.tar.Z     | tar -xvf -
zcat consed_sunos.tar.Z   | tar -xvf -
zcat consed.tar.Z         | tar -xvf -

Note:  You must untar on a UNIX computer--not on an NT computer.

2)  The only unix commands you must learn are the following 3:
pwd   -- this tells you where you are
ls    -- this tells you what files are there  (Same as DIR in DOS)
cd    -- this moves you  (Same as CD in DOS)
That's it--use them a lot!

USING CONSED GRAPHICALLY

3)  Type the following:

cd standard/edit_dir


4) start consed by typing the appropriate command below:

../../consed_solaris
../../consed_alpha
../../consed_hp
../../consed_sgi
../../consed_sunos

Two windows will appear.  One of these will have the list of .ace
files and say 'select assembly file to open' and
'standard.fasta.screen.ace.1'.  Double click on that name.

You will see a list of one contig and a list of reads.  This is the
'Main Consed Window'.  

Double click on 'Contig1'.

The 'Aligned Reads Window' will appear.  

Try scrolling back and forth.  Try scrolling by dragging the thumb of
the scrollbar.  Also try scrolling by clicking on the 4 << < > >>
buttons for scrolling by small amounts.  For scrolling by tiny
amounts, click on the arrows at either end of the scrollbar.  For
scrolling by huge amounts, use the middle mouse button and just click
on some location on the scrollbar.

Notice the colors.  The bases that are in red are the ones that
disagree with the consensus.

Notice the different shades of grey background (around the bases).
They have the following meanings, but first, you need to understand
the meaning of the quality values:

A quality value of 10 means 1 error in ten to the 1.0 power
A quality value of 20 means 1 error in ten to the 2.0 power
A quality value of 30 means 1 error in ten to the 3.0 power
A quality value of 40 means 1 error in ten to the 4.0 power

and for quality values in between:

A quality value of 25 means 1 error in ten to the 2.5 power

Get the idea?
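
If you prefer to see the rule as arithmetic, here is a minimal sketch
in Python.  The function names are made up for this illustration and
are not part of consed or phred; the formula is simply the rule stated
above (error probability = ten to the minus quality/10 power).

```python
import math


def error_probability(quality: float) -> float:
    """Chance that a base call is wrong, for a phred-style quality value."""
    return 10.0 ** (-quality / 10.0)


def quality_from_error(p_error: float) -> float:
    """Inverse mapping: the quality value for a given error probability."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p_error)


if __name__ == "__main__":
    # Print the table from the text, including the in-between value 25.
    for q in (10, 20, 25, 30, 40):
        p = error_probability(q)
        print(f"quality {q}: about 1 error in {1 / p:.0f} bases")
```

So a quality of 25 works out to roughly 1 error in 316 bases, which
sits between the values for 20 and 30 as you would expect.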


(These have actually been empirically verified--if you are interested
in the gory details, read the phred papers:

Ewing B, Hillier L, Wendl M, Green P: Basecalling of automated
sequencer traces using phred. I. Accuracy assessment.  Genome Research
8, 175-185 (1998).

Ewing B, Green P: Basecalling of automated sequencer traces using
phred. II. Error probabilities.  Genome Research 8, 186-194 (1998).

In that same copy of the journal is a paper about consed, as well.)

These quality values are shown in grey scales:

Quality 0 through 4 is given by dark grey
Quality 5 through 9 is given by a shade lighter
Quality 10 through 14 is given by a shade still lighter
.
.
.
Quality of 40 through 97 is given by white (the brightest shade)

A quality value of 99 is reserved for bases that have been edited and
the user is absolutely sure of the base ('high quality edited').

A quality value of 98 is reserved for bases that have been edited and
the user is not sure of the base ('low quality edit').

The ends of the reads show bases that are grey and have a black
background.  These are the low quality ends of the reads or the
unaligned ends of reads, as determined by phrap.

5)  Scroll so that location 490 is about in the middle of the aligned
reads window.  Push the left mouse button down on the menu item 'Dim'.
There will be a list of choices that will appear.  Drag the cursor
down to 'Nothing' and release.  Now look what happened to the color of
the bases.  The ends of the reads that used to have a black
background now appear red with a grey background.  You are seeing
the clipped-off bases with all the same information as any other base.
Since there is a huge amount of red (discrepant) bases, the screen
becomes distracting and busy.  Thus by default the low quality clipped
off bases are made with a black background and a grey foreground so
they don't distract you.

You can play with the dimming options a bit.  Then return it to 'Dim
Low Quality' for the rest of this tour.

(Notice there is a distinction here between 'low quality ends of
reads' and 'unaligned ends of reads'.  Until you get the version of
phrap released only to commercial users in Aug 1998, there is no
difference.  However, when you get that version (or better), there
will be an important difference.)


TRACES AND EDITING

6) Put the cursor on the bases of one of the reads and click with the
middle mouse button.  The trace window showing the traces for that
stretch of read should pop up.

There are 4 rows of bases in the trace window:

'CON' is the consensus
'EDT' is where you can edit the base calls of the read
'PHD' is the original phred base calls
'ABI' is the ABI base calls (if you are using ABI chromatograms rather
    than scf chromatograms)

Notice that a red cursor blinks in the corresponding positions of the
aligned reads window and the trace window.


7) Try editing in the trace window.  You can click the left mouse
button on a base on the 'edt' line to set the cursor (a blinking red
rectangle).  You can directly overstrike a base by typing a letter.
Try this.  Try undoing it (by clicking on 'undo' ).  If you want to
undo more than one edit, you will have to go back to the main consed
window and click on the button labeled 'Undo Edit...'.

We believe that the user should only change a base call while
examining the traces.  That is why editing is done here--not in the
Aligned Reads Window.

8)  You can insert a column of pads by pushing the space bar.  Try
this.  (You may need to click on the 'edt' line first.)

(For those of you new to editing assemblies, a 'pad', which in consed
and phrap is represented by the '*' character, is used to align
two or more sequences such as these:
     gttgacagtaatcta
     gttgacataatcta
in which one sequence has an inserted or deleted base with respect to
the other.  By inserting the pad character, it is possible to get a
good alignment: 
     gttgacagtaatcta
     gttgaca*taatcta
This is the purpose of pad character--it is just a placeholder.)
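
The same idea as a small Python sketch, using the two sequences from
the example above.  The helper names are hypothetical and have nothing
to do with how consed or phrap store pads internally; the sketch only
shows that a pad is a placeholder that can be dropped again without
losing any bases.

```python
from typing import List

PAD = "*"


def strip_pads(padded: str) -> str:
    """Return the sequence with every pad placeholder removed."""
    return padded.replace(PAD, "")


def mismatched_columns(a: str, b: str) -> List[int]:
    """0-based columns where two padded rows hold different bases.

    Columns in which either row has a pad are not counted.
    """
    if len(a) != len(b):
        raise ValueError("padded rows must have the same length")
    return [
        col
        for col, (x, y) in enumerate(zip(a, b))
        if x != y and PAD not in (x, y)
    ]


if __name__ == "__main__":
    top = "gttgacagtaatcta"
    bottom = "gttgaca*taatcta"
    print(mismatched_columns(top, bottom))  # no base-vs-base mismatches
    print(strip_pads(bottom))               # the original, unpadded read
```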

You can then overstrike a pad with a base.  In this way you
can insert a base, and still preserve the alignment.

9) Try highlighting a stretch of a read on the EDT line by holding
down the middle mouse button and dragging the cursor over some bases.
They will turn yellow as you drag.  Then release the mouse button.  A
window will pop up giving you some choices of what to do with those
(yellow) bases:


    Make High Quality--makes the highlighted bases edited high quality
        (99).  This tells phrap (when it reassembles) that you are
        sure of the sequence here.
    Change Consensus--make the highlighted bases edited high quality and
        change the consensus to agree with that stretch of the read.
        This is a directive to phrap (upon reassembly) to use that
        stretch of that read as the consensus.
    Make low quality--makes the highlighted bases edited low quality.
        This tells phrap (when it reassembles) that you are not sure
        of the bases here and phrap can go ahead and make a join even
        if the bases in this region don't match perfectly.
    Make Low Quality to Left End--same as above, but all the way to
        the left end of the read
    Make Low Quality to Right End--same as above, but all the way to
        the right end of the read
    Change to n's--Change the highlighted bases to n's which means
        they are unknown bases.  This tells phrap (when it
        reassembles) to not make any join based on these bases.  It is
        useful when you believe the bases may be in the chimeric
        portion of a read.
    Change to n's to left end--same as above but to left end
    Change to n's to right--same as above but to right end
    Add Comment Tag--allows user to add a comment to a stretch of read
        bases
    Add Tag--allows user to add any tag to a stretch of read bases
    Dismiss--you decided you don't really want to do anything with
        this stretch of bases

This popup is made so that nothing else works until you choose
something.  Try each of these choices, except for tags, which you'll
try below.

'Change Consensus' has an additional function--if a read extends out
on the right beyond the end of the consensus, you can extend the
consensus by using this function.  You might want to do this, for
example, if crossmatch did not correctly find the cloning site.  You
can't try it with this dataset since no read extends beyond the end of
the consensus, but you may see this phenomenon with your own data.

10) To delete a base, overstrike it with a '*' character.  (Phrap
ignores '*', so this is the same as deleting the character.)  There is
no way to remove the resulting '*' from an assembly even if the entire
column now consists of *'s.  This is OK since when you export the
consensus (try the exercise on EXPORTING THE CONSENSUS), the
*'s are not exported.  While you are editing in consed, 
we believe there should be a visual indication that a base was
deleted.

SAVING THE ASSEMBLY

11)  To save the assembly, pull down the 'File' menu on the Aligned
Reads Window, and release on 'Save assembly'.  A box will pop up with
a suggested name.  I suggest you always use the one it suggests.  The
idea is that the ace files:


(project).fasta.screen.ace.1
(project).fasta.screen.ace.2
(project).fasta.screen.ace.3
(project).fasta.screen.ace.4
(project).fasta.screen.ace.5

are in order of how old they are.  If you feel you are taking up too
much disk space, then start deleting the ace files starting at the
oldest.  I do not recommend that you overwrite existing ace files.
The version numbers just keep growing, and that is not a problem.
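
If you clean up old ace files with a script, remember to compare the
version numbers as numbers (so that .ace.10 is newer than .ace.9).
Here is a minimal sketch in Python.  It assumes all the names belong
to one project and that the next version is simply the highest
existing number plus one; the text above does not say exactly how
consed chooses its suggested name, so treat that part as a guess.
The function names are made up for this illustration.

```python
import re
from typing import Iterable, List, Optional

# Matches names such as (project).fasta.screen.ace.3
ACE_PATTERN = re.compile(r"^(?P<stem>.+\.ace)\.(?P<version>\d+)$")


def sort_by_version(names: Iterable[str]) -> List[str]:
    """Return the ace file names, oldest (lowest version) first."""
    versioned = []
    for name in names:
        match = ACE_PATTERN.match(name)
        if match:  # silently skip anything that is not an ace file
            versioned.append((int(match.group("version")), name))
    return [name for _, name in sorted(versioned)]


def next_ace_name(names: Iterable[str]) -> Optional[str]:
    """Guess the next ace file name: highest version plus one."""
    ordered = sort_by_version(names)
    if not ordered:
        return None
    match = ACE_PATTERN.match(ordered[-1])
    return f"{match.group('stem')}.{int(match.group('version')) + 1}"
```

The first entries of sort_by_version() are the ones to delete first
if you need the disk space.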


EXPORTING THE CONSENSUS

12)  Exporting the consensus.  Bring the Aligned Reads Window into view
 again.  Hold down the left mouse button on the 'File' menu and
 release the button on 'Export Consensus Sequence'.  Notice that the
 consensus will be stored (in this case) in a file called
 'Contig1.fasta'.  Click 'OK'.  There is now a file in your edit_dir
 directory called 'Contig1.fasta' that has the consensus sequence in
 it.  If you want to see the file, bring up another Xterm (if you are
 UNIX literate), and type: 


 cd standard/edit_dir
 more Contig1.fasta


13)  Fancier exporting of the consensus.  Bring the Aligned Reads Window
 into view again.  Hold down the left mouse button on the 'File' menu
 but this time release on 'Export Consensus Sequence (with
 options)...'.  Just export a little snip of the consensus, from 400
 to 410.  (You will notice this contains a pad * character.)  Ask for
 both the bases file and the quality file.  Click 'OK'.  Consed will
 want to call this file 'Contig1.fasta' again.  You can overwrite the
 existing file.  

 Look in your other Xterm at these files:

 more Contig1.fasta
 more Contig1.fasta.qual

 The one file contains the bases (but no * pads) and the other
 contains the corresponding qualities of those bases.
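
 The point to notice is that the two files stay in step: when a pad
 is dropped from the bases, its column is dropped from the qualities
 too.  Here is a minimal sketch of that idea in Python.  It assumes
 one quality value per padded consensus column, which is an
 assumption made for this illustration and not a description of
 consed's internals; the function name is hypothetical.

```python
from typing import List, Tuple

PAD = "*"


def export_without_pads(bases: str, qualities: List[int]) -> Tuple[str, List[int]]:
    """Drop pad columns from the bases and from the matching qualities."""
    if len(bases) != len(qualities):
        raise ValueError("need exactly one quality value per consensus column")
    kept = [(b, q) for b, q in zip(bases, qualities) if b != PAD]
    exported_bases = "".join(b for b, _ in kept)
    exported_quals = [q for _, q in kept]
    return exported_bases, exported_quals


if __name__ == "__main__":
    bases, quals = export_without_pads("ac*gt", [30, 40, 0, 50, 60])
    print(bases)
    print(" ".join(str(q) for q in quals))
```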


14) (For this step, first click on the 'Dim' menu and release on 'Dim
Nothing'.)  Point to the 'Color' menu, hold down the left mouse button
and release on 'color means edited and tags'.  Notice that the bases
that you have edited will stand out in either white or grey (depending
on whether the base was made high quality or low quality).  Observe
this both in the trace window and the aligned reads window.  This
colormode is useful if you are interested in easily spotting which
bases are edited.

Return to the 'color means quality and tags' colormode by the
following:  point to the 'Color' menu, hold down the left mouse button
and release on 'color means quality and tags'.

FIND MAIN WINDOW

15) On the aligned reads window, click on 'Find Main Win'.  This will
cause the main window to pop up in the event you have buried it under
other windows or iconified it.  (This may not work with non-X
compliant window managers, such as NT.  In that case you will have to
find and click on the Main Window to bring it up.)


MULTIPLE UNDO EDIT

16) Now that the main window is visible, click the 'Undo Edit...'
button.  There will be a popup indicating the most recent edit.  Click
'undo'.  Then you will see the edit that was done before that.  Click
'undo'.  You can continue if you like.  You now know how to undo more
than one edit.  You cannot choose which edits to undo and which to not
undo--edits can only be undone in precisely reverse order from the
order you made them.
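
In programming terms the edit list behaves like a stack: the last
edit made is the first one undone.  Here is a minimal sketch in
Python of that behaviour.  The class and its methods are hypothetical
and only model overstrike edits on a single read; this is not how
consed is implemented.

```python
from typing import List, Optional, Tuple

# (0-based position, base before the edit, base after the edit)
Edit = Tuple[int, str, str]


class EditHistory:
    """A read plus a last-in, first-out record of overstrike edits."""

    def __init__(self, bases: str) -> None:
        self.bases: List[str] = list(bases)
        self._edits: List[Edit] = []

    def overstrike(self, position: int, new_base: str) -> None:
        """Replace one base and remember what was there before."""
        self._edits.append((position, self.bases[position], new_base))
        self.bases[position] = new_base

    def undo(self) -> Optional[Edit]:
        """Undo the most recent edit; return it, or None if none is left."""
        if not self._edits:
            return None
        position, old_base, new_base = self._edits.pop()
        self.bases[position] = old_base
        return (position, old_base, new_base)

    def sequence(self) -> str:
        return "".join(self.bases)
```

Because each undo restores what the base was just before that edit,
skipping an edit in the middle of the list would leave the read in a
state that never existed, which is why the order is fixed.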

SCROLLING TRACES AND ALIGNED READS TOGETHER

17) In the aligned reads window, scroll along the contig to a
different point.  Click the left mouse button on a read whose trace is
already up.  Notice that the existing trace is scrolled to the new
location.  Then go to the trace window and scroll the traces to a new
location.  Click on the EDT line with the left mouse button.  You will
notice that the aligned reads window will scroll to the corresponding
location.  Thus you can keep the aligned reads window and the traces
scrolled to the same location.

EXAMINING  ALL  TRACES

18) Go to a region where there are lots of reads, say base 1660.  Push
down the right mouse button and release on 'Display All Traces'.  You
will see all traces displayed in a scrolling window.  You can drag the
scrollbar on the right down and up to see all the traces.  This
feature is particularly useful for polymorphism/mutation detection
work.  This feature was added to work in cooperation with polyphred.
To see it in action, exit consed.

CONSED-POLYPHRED INTERACTION

Polyphred is a program for finding polymorphic sites developed by
Debbie Nickerson's group (contact them at stay@u.washington.edu).

We have a test database, 'polyphred', which has had polyphred run on
it already.  Polyphred has put a polymorphism tag on each polymorphic
site.

Type:

cd ../../polyphred/edit_dir
ls
../../consed -ace example2.fasta.screen.ace.1

When consed comes up, you should see 2 contigs.
Double click on Contig2

In the Aligned Reads Window, push the left mouse button while pointing
to the 'Navigate' menu and release on 

'Toggle feature:  when navigating to consensus location, pop up all
traces (currently off)' 

That will turn this feature on.

Now push the left mouse button while pointing to the 'Navigate' menu
and release on 'Tags'.  Up should pop a list of tag types.  Double
click on 'polymorphism'.   Polyphred has already been run so the
consensus is tagged with polymorphism tags at each polymorphic site.  
Up will pop a window labelled 'Polymorphism Tags' with a list of
sites.  Click on 'Next'.

If you correctly followed the instructions above, all the traces should
pop up at the first polymorphic site.  You may want to reposition the
traces window to see it better.  

Now ignore the original 'Polymorphism Tags' window and instead click
on 'Next' in the *traces* window.  This will take you to the next
polymorphic site.  Pretty nice, huh?


After you are done playing with this feature, exit consed and go back
to the previous database:

cd ../../standard/edit_dir
ls
../../consed -ace standard.fasta.screen.ace.1
Double click on Contig1 to bring up the Aligned Reads Window again in
preparation for the next step.


NAVIGATING

19) In the aligned reads window, pull down the Navigate menu and
release on 'Low Consensus Quality'.  You will see a list of locations.
Move the 'Low Consensus Quality' window down so you can see the
aligned reads window.  Repeatedly click on 'Next' until you reach the
end of the list.  (Low consensus quality means an area in which the
bases have too high probability of being wrong.)  This saves you from
having to look through large amounts of high quality data trying to
find problem areas.
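
To give a feel for what such a list contains, here is a minimal
sketch in Python that groups consecutive low quality consensus
positions into regions.  The simple 'quality below a threshold' test
and the threshold itself are assumptions made for this illustration;
consed's actual criterion is not spelled out here and may well
differ.

```python
from typing import List, Tuple


def low_quality_regions(qualities: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Return (start, end) regions, 1-based inclusive, with quality < threshold."""
    regions = []
    start = None
    for pos, quality in enumerate(qualities, start=1):
        if quality < threshold:
            if start is None:
                start = pos  # a new low quality region begins here
        elif start is not None:
            regions.append((start, pos - 1))
            start = None
    if start is not None:  # region runs to the end of the consensus
        regions.append((start, len(qualities)))
    return regions
```

Each tuple corresponds to one line you would step through with
'Next'.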

You may want to click on the 'save' button to save to a file a copy of
this list of problem areas as you work through them.

In our experience, this will be the most important navigate list you
will use.  In fact, finishing consists mainly of adding reads and
rephrapping until this list is reduced to nothing.

20) Dismiss the Low Consensus Quality window.  Pull down the
'Navigate' menu again and release on 'High quality discrepancies as
above but omitting tagged compressions and G dropouts'.  You will
probably notice there are no entries (unless you created some yourself
by editing).  That is because there are no high quality discrepancies
with this dataset.  So let's force there to be some by lowering the
quality threshold.  First, dismiss the High Quality Discrepancies
Window.

Click on 'Find Main Win'.  In the main consed window, pull down the
'Options' menu and release on 'General Preferences'.  Notice that the
default for 'Threshold for High Quality Discrepancy' is 40.  Change it
to 15 and click 'Apply and Dismiss'.

Then follow the steps above to bring up the High Quality Discrepancies
menu.  Now you will see several entries.  Click 'next' repeatedly to
go successively to the next high quality discrepancy in the Aligned
Reads Window.
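
Why does lowering the threshold produce entries?  Here is a minimal
sketch in Python.  It assumes a high quality discrepancy is a read
base that disagrees with the consensus and whose quality is at or
above the threshold; that definition is our reading of the text
above, pads are not treated specially, and the function name is
hypothetical.

```python
from typing import List


def high_quality_discrepancies(
    consensus: str, read: str, read_qualities: List[int], threshold: int = 40
) -> List[int]:
    """1-based positions where the read disagrees with the consensus
    and the read base has quality >= threshold."""
    if not (len(consensus) == len(read) == len(read_qualities)):
        raise ValueError("consensus, read and qualities must line up")
    return [
        pos
        for pos, (c, r, q) in enumerate(zip(consensus, read, read_qualities), start=1)
        if c.lower() != r.lower() and q >= threshold
    ]


if __name__ == "__main__":
    consensus = "acgtac"
    read = "aagtcc"
    quals = [50, 20, 50, 50, 45, 10]
    print(high_quality_discrepancies(consensus, read, quals, threshold=40))
    print(high_quality_discrepancies(consensus, read, quals, threshold=15))
```

With the threshold at 40 only the quality 45 mismatch is reported;
at 15 the quality 20 mismatch shows up as well.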

You can also double click on a particular line in the High Quality
Discrepancies Window to go to that location.  Alternatively, you can
single click on a line and then click the 'Go' button.

Dismiss the High Quality Discrepancies Window.


21) Similarly, try the other navigate lists: Unaligned High Quality
Regions (this list will be empty with this data set), Edits, Regions
Covered by Only 1 Strand and Chemistry, and Regions Covered by Only 1
Subclone.

Unaligned High Quality regions are regions in which the traces are
high quality so there is no question of the bases, but the region
differs so much from other reads that phrap has given up trying to
align the region with the consensus.  This could be due to a chimeric
portion of a read, or perhaps the read belongs somewhere else.

We believe that regions covered by only 1 subclone should be covered
by a 2nd subclone to guard against the possibility of a deletion
in the single subclone.

The dimming options affect the algorithm used to find single subclone
regions and single stranded regions.  We suggest you set the dimming
option to 'Dim Unaligned' while doing this navigating.

There are so many different problem lists that you may forget to check
one of them and thus miss a serious problem.  Thus we combined them
all into a single list.  This is the first menu item: 'Low Cons/High
Qual Discrep/Single Stranded/Single Subclone/Unaligned High'.  We
suggest you use this list.

Also try navigate by tags: when the Select Tag Type Window appears,
double click on 'compression'.  (Note that you can't do anything else
until you deal with this window.)


PRIMER-PICKING


**** Temporary step **** 

After you have completed the 'install vector files' step (below), you
should never do this again. 

Click on 'Find Main Win'.  On the Main Window, open the Options menu,
and release on 'Primer Picking Preferences'.  Notice the question
'Screen Primers Against Sequences in File?'  (If you have trouble
finding this question, you may have to make the Primer Picking
Preferences window larger to see it.  It is between
'PrimersMaxLengthOfMononucleotideRepeat' and 'File of template you do
not want the primer picker to choose'.)  Click on 'False'.  Then click
'ok' and the Primer Picking Preferences box will pop down.

(In real use, 'Screen Primers Against Sequences in File?' should be
set to 'True'.  I have had you set it to False just this once so you
can go ahead and see how this is supposed to work until your system
administrator has time to correctly install the vector sequences file.)

**** end of temporary step ****


22) Go to some location near the right end of the contig, say base
2570.  Click with the right mouse button on the consensus and click on
either one of the top strand primer choices (either from subclone
template or from clone template).  Consed will pause a moment, and
then there will appear a selection of primers that pass all of
consed's requirements.  Templates are also chosen for each primer.
You may have to scroll the primer list to the right to see the
templates.  Consed lists these templates in order of quality--all of
them will cover the read you want to make.

Double click on one of the primers in the Primers Window.  That will
cause the Aligned Reads Window to scroll to show that oligo in
context.  Click on 'Accept Primer'.  Notice that a yellow oligo tag is
created on the consensus for that primer.  That tag contains all the
information you need to order that oligo and do the reaction--pop it
up and see.

When you are done editing and have saved the assembly and exited
consed, run ace2Oligos.perl (supplied with this distribution--make
sure your system administrator installed it) which will extract all
the oligos you just created.  This is handy for email ordering of
oligos.

In the xterm, type:

ace2Oligos.perl standard.fasta.screen.ace.2 oligos.txt

where standard.fasta.screen.ace.2 is whatever the name is of the ace
file you just saved.

If you are interested in the details of primer-picking, see the
section 'PRIMER PICKING PARAMETERS' (below).

What is the difference between 'Pick Primer from Subclone Template'
and 'Pick Primer from Clone Template'?  

There are 2 differences:  

A.  which vector file the primers are screened against.  In the former
case, the primer is screened against the file primerSubcloneScreen.seq
and in the latter case against the file primerCloneScreen.seq 

B.  In checking for false matches elsewhere in the assembly, if the
template is the whole clone, then consed must check for false matches
in the *entire* assembly, including all other contigs.  But if the
template is just going to be a subclone, consed only needs to check
elsewhere in that subclone.  Actually, to be conservative, consed
checks for false matches +/- the maximum insert size of a subclone.

The screen files and the maximum insert size are parameters.  For the
details, see the section 'PRIMER PICKING PARAMETERS' (below).


SEARCH FOR STRING

25) Try the 'Search for String' button (left side of the Aligned Reads
Window).  Type in a string (such as aaaca), and click 'ok'.  There
should be a list of 'hits'.  Double click on one of the hits (or
single click on it and click on 'go'.)  Notice that the Aligned Reads
Window scrolls to that position and has the cursor on the found
string.  (It might be complemented.)

Dismiss this window.  Try this again, only this time in the Search For
String Window select 'Search Just Reads'.  Then click 'OK'.  You will
notice there are many more hits.  This is because this shows hits in
each read, even if they are at the same consensus position.

COPY AND PASTE

26) In the Aligned Reads Window, swipe some bases by holding down the
left mouse button.  You should see the bases turn yellow, at least
temporarily.  Then click the 'Search for String' button.  Use the
middle mouse button to paste the bases you have just swiped into the
'Query string:' box.  Notice that you can swipe bases either from the
consensus or from a read.

The search for string is case-insensitive so don't worry about the
pasting being upper or lowercase.
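
For the curious, here is a minimal sketch in Python of a search that
ignores case and also reports hits on the opposite strand (which is
why a hit 'might be complemented').  It is an illustration only: the
function names are made up, pads are not handled, and consed's own
search may work differently.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq: str) -> str:
    """Opposite strand of a sequence, read in the usual direction."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def find_hits(sequence: str, query: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, strand) hits, 1-based inclusive, either strand."""
    seq = sequence.lower()
    q = query.lower()
    if not q:
        return []
    patterns = [(q, "uncomplemented")]
    rc = reverse_complement(q)
    if rc != q:  # a palindromic query would otherwise be reported twice
        patterns.append((rc, "complemented"))
    hits = []
    for pattern, strand in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, start + len(pattern), strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)
```

Searching 'ttAAACAgggtgttt' for aaaca this way finds the string
itself at 3-7 and its complement (tgttt) at 11-15.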


CORRECTING FALSE JOINS MADE BY PHRAP

27)  Phrap may put several reads together that you believe do not belong
together.  (For example, you may see several high quality
discrepancies between the reads.)  If you are sure these reads do not
belong together, you can force a subsequent reassembly by phrap to not
assemble those reads together.  You do this by finding a location
where there is a high quality discrepancy.  Then click on the read
with the right mouse button and release on 'Tell phrap not to overlap
reads discrepant at this location'.  There are no high quality
discrepancies with this dataset so consed won't let you do this.
(Try it and see.)  However, when you use your own data, you may get
the chance! 


ADDING READS

28) For this to work, your system administrator must have set up
everything correctly (See below in INSTALLING CONSED.)  Assuming you
have set everything up correctly, you can now experiment with adding
reads.

Now bring up consed again using ace file standard.fasta.screen.ace.1.
If it asks if you want to apply edits, just say 'no'.

On the Main Window, click on the Add New Reads button.  There will
appear a list of files ending with .fof.  These are files that contain
lists of chromatograms.  Double click on 'reads_to_add.fof'.  There
should be lots of progress output in the xterm from which you started
consed.  When it completes, there will be a Reads Added Window popup
with a report of which reads were added.  In this case, it should say
that 9 reads were successfully added and list them.


TEARS AND JOINS
      
29) When phrap really screws up, you may want to just tear the contig
apart in several places and then join the pieces back together in a
different way.  Although we discourage you from doing this, we do give
you the power to do it, if you want to.  Let's try it:

Go to location 1550.  Point the mouse at the consensus base at 1550
and push the right mouse button down.  Release the button on 'Tear
Contig at This Consensus Position'.  Up will pop a list of reads with
2 little buttons next to them <- and ->.  This time don't change the
buttons (you can try this another time changing the buttons) and just 
click 'Do Tear'.  

Now you should have 2 Aligned Reads Windows on top of each other.  One
should contain 'Contig2' and the other 'Contig3'.  

Now let's join these 2 contigs back together:


Click on 'Search for String' and type in the following bases:
agctgccatc

Click 'OK'. 

Search for string should find 2 locations, one in Contig2 and one in
Contig3:

Contig2     (consensus)     1447-1456   (uncomplemented)
Contig3     (consensus)     829-838     (uncomplemented)

Double click on the first one.  The Aligned Reads Window for Contig2
will scroll to location 1447 and lift up.  In that Aligned Reads
Window, click on 'Compare Cont'.  

Now double click on the 'Contig3' line in the above Search for String
results.  The Aligned Reads Window for Contig3 will scroll to location
829 and lift up.  In that Aligned Reads Window, click on 'Compare
Cont'.

Now the Compare Contigs window should be visible.  In the Compare
Contigs Window, try scrolling back and forth.  You can change the
cursors (blinking red), but if you do, please return them to the
locations 1447 and 829 for the next step.  The cursors 'pin' these
bases together when doing an alignment.  (The algorithm is a pinned
Smith-Waterman alignment.)

Click on Align.  Try scrolling the alignment by dragging the thumb in
the lower half of the Compare Contigs.  An 'X' means there is a
discrepancy between the 2 contigs.  There is also a 'P' (see if you
can find it!)  The P indicates the bases that you pinned together.
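
Here is a minimal sketch in Python of how such a marker line relates
to the two aligned rows.  It does not perform the pinned
Smith-Waterman alignment itself; it only marks up two rows that are
already aligned.  Showing agreeing columns as blanks, and letting 'P'
take the place of any other mark in the pinned column, are
assumptions made for this illustration, and the function name is
hypothetical.

```python
def marker_line(top: str, bottom: str, pinned_column: int) -> str:
    """Build a marker row: 'P' at the pinned column (0-based),
    'X' where the two aligned rows disagree, blank elsewhere."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    marks = []
    for col, (a, b) in enumerate(zip(top, bottom)):
        if col == pinned_column:
            marks.append("P")
        elif a.lower() != b.lower():
            marks.append("X")
        else:
            marks.append(" ")
    return "".join(marks)


if __name__ == "__main__":
    top = "acgta"
    bottom = "acctA"
    print(top)
    print(marker_line(top, bottom, pinned_column=0))
    print(bottom)
```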

Click with the left mouse button on either contig in the bottom
alignment.  You will notice that both contigs will have the red
blinking cursor in the same position.  Click on 'Scroll Both Aligned
Reads Windows' and look at the Aligned Reads Windows to see that they
scroll to the corresponding positions.  You can have traces up for the
contigs, and they will scroll as well.  Experiment with this.  Then
click 'Join'.  The 2 previous Aligned Reads Windows will disappear and
there will be a new one which has a new contig 'Contig4'.  You have
made a join!

This is one method of exploring joins of contigs that were not made by
phrap.  Another method is to use phrapview, supplied with phrap.
phrapview gives a high level view of all internal joins while 'compare
contigs' shows the alignment of a single internal join.  Some users
have found them to work well together--phrapview to find a join and,
having found it, 'compare contigs' to examine it in more detail.


TAGS

30) Bring up a trace for a read (as above).  Swipe some bases on the
'edt' line with the middle mouse button.  A list of choices will pop up.
Select 'Add Comment Tag'.  Type in a comment in the box that appears,
and click 'OK'.  You will now see a blue box both in the Aligned Reads
Window and in the Traces Window on that read.

To see the comment, you can click on that blue tag in the Aligned
Reads Window with the right mouse button and release on 'Tag: comment
Show more info?'.  Alternatively, you can click on the blue tag in the
traces window with the right mouse button.

Try creating some other kinds of tags: again swipe some bases in the
Trace Window.  But this time instead of clicking 'Add Comment Tag',
click on 'Add Tag'.  Select another tag type.  You will notice that
different tags are in different colors.  You can always click with the
right mouse button on the tag (as above) if you forget what a
particular color means.

You can also define your own tag types.  See below CREATING CUSTOM TAG
TYPES for how to do that.

31) You can create really, really long tags as follows: Just create a
short version of the tag as above for where you want the tag to start.
Then figure out the consensus position of where you want the tag to
end.  In the Aligned Reads Window, click on the short tag with the
right mouse button and release on 'tag: show more info?' (as above).
A Tag Window will appear for that tag.  In the Tag Window, simply
change the End Unpadded Consensus Position to the place you want it to
end.  Then click 'OK'.  You will now notice that the tag will be as
long as you wanted.

32) You can create tags on the consensus in the same way.  In the
Aligned Reads Window, use the middle mouse button to swipe some bases
on the consensus in the Aligned Reads Window.  Up will pop a list of
tag types.  Click on one of them.  Try it again somewhere else.  Try
it with the tag type being 'comment'.  In this case, you must enter a
comment.  Notice the pretty colors!  If you forget what a particular
color means, you can click on the colored tag with the right mouse
button and it will tell you.

33)  Try creating some tags that overlap each other.  You will notice
that the overlapping region will be purple.  If you want to know which
tags overlap, you can click with the right mouse button on the purple
and you will be told all tags that are on that base.

34) If you have many tags that overlap and thus are purple, you can
hide some less relevant tag types so there is less purple and there is
less distraction.  Make sure you have a few tags visible.  Then click
on 'Find Main Win'.  In the Main Window, open the Options menu, and
release on 'Hide Some Tag Types'.  A list of tag types will popup.
Select the type that you have visible (above).  Then click 'OK'.  Go
back to the Aligned Reads Window.  That tag should still be visible.
Click on the button 'Some Tags' in the upper right part of the Aligned
Reads Window.  Your tag should disappear.  The 'Some Tags' button
should have changed to 'Sh All Tags'.  Click on it again.  Your tags
should have reappeared.


INCREMENTAL SEARCH FOR READ NAME

35) Restart consed.  Instead of clicking on a read or contig name,
type a read name into the 'Find read:' box.  Try typing djs74_2.  You
will notice that as you type each letter, the first item in the list
that matches the letters typed will be highlighted.  Experiment with
deleting a few letters and typing others.  This is a powerful method
of quickly getting to the read name you are interested in.  When you
get to the read you want, just type carriage return or click the 'OK'
button.

ONLINE DOCUMENTATION

36)  On the Aligned Reads Window, click on the 'Help' menu and release
on 'Show Documentation'.  You will see this document.


GOTO POSITION

37) In the Aligned Reads Window, click in the 'Pos:' box in the upper
right-hand corner.  Type in a number, such as 540, and push the
'Return' or 'Enter' key.  The Aligned Reads Window will scroll to
position 540.  We find this feature is particularly useful when one
person wants another person to look at something in the sequence.

HIGHLIGHTING READ NAMES

38)  In the Aligned Reads Window, click on a read name with the left
mouse button.  The name will turn magenta.  Click again and it will
turn yellow again.  Try turning it magenta and then scrolling.  This
feature is helpful in keeping track of a particular read as you scroll.

COMPLEMENTING THE CONTIG

39)  Push 'Comp Contig' in the Aligned Reads Window to complement the
contig.  This displays the opposite strand of the contig including the
consensus and all reads.  Push this button again to uncomplement it.


RECOVERY FROM CRASHES

40)  It is important to feel that your data is safe, even if the
computer (or consed) were to crash.  Consed will recover your data
from such a crash.

Make an edit (remember, edits are made in the Trace Window) and jot
down its location.  Also note the name of the ace file which is
displayed in the upper left box in the Aligned Reads Window.  Then
simulate a crash by going to the xterm where you started consed and
typing control-C.  Restart consed and select the same ace file you
noted (above).  A box will come up saying 'There is an edit history (a
.wrk file) Consed may have crashed during a previous session with this
same file.  Do you want to apply those edits?'  Click on 'yes'.  Go
and find the edits you made before consed crashed--you will find them.

This is the purpose of the .wrk files--they are a log file of your
edits and they are added to as you make edits.

41)  You should save your edits by pulling open the 'File' menu on the
Aligned Reads Window, and releasing on 'Save assembly'.

PROTEIN TRANSLATION AND OPEN READING FRAMES

42)  If you would like, you can see the amino acid translation of the
 consensus in all reading frames.  In the Aligned Reads Window, push
 down the left mouse button on the 'Misc' menu and release on 'Show
 Top Strand Protein Translation'.  Try again but this time release on
 'Show Bottom Strand Protein Translation'.  Notice that there are 2
 characters that are in magenta color.  What are those characters?
 Why are they made in a different color?  To not show the protein
 translation, push down the left mouse button on the 'Misc' menu and
 release on 'Don't show protein translation'.

43)  You can search for open reading frames within a contig.  In the
 Aligned Reads Window, push the left mouse button on 'Navigate' and
 release on 'Search for Open Reading Frames'.  Notice that the open
 reading frames are shown for all 6 reading frames and are sorted by
 length.
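
If you are curious what such a search involves, here is a minimal
sketch in Python.  It is an illustration only and is NOT consed's own
algorithm: it assumes an open reading frame runs from an ATG to the
next in-frame stop codon, it expects an unpadded sequence, and all of
the function names are made up for this example.

```python
# Illustrative six-frame ORF scan; NOT consed's implementation.
# Assumption: an ORF starts at ATG and ends at the first in-frame stop.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, strand):
    """Yield (length, strand, frame, start, end) for each ORF on a strand.

    start and end are 0-based offsets into that strand; end is exclusive
    and includes the stop codon.
    """
    seq = seq.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                yield (i + 3 - start, strand, frame, start, i + 3)
                start = None


def find_orfs(consensus):
    """Search all 6 reading frames and sort the ORFs longest first."""
    found = list(orfs_in_strand(consensus, "top"))
    found += list(orfs_in_strand(reverse_complement(consensus), "bottom"))
    return sorted(found, reverse=True)
```

For the toy sequence CCATGAAATAGGG, find_orfs returns a single top
strand ORF of 9 bases in frame 2.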


AUTOFINISH

44) cd to autofinish/edit_dir

45)  Try starting consed by typing:

../../consed -autofinish -ace autofinish.fasta.screen.ace.1

(Note 'consed' above may be 'consed_solaris', 'consed_alpha',
'consed_hp', 'consed_sgi', or 'consed_linux' depending on your
executable.  If you have trouble, use that 'ls' command (see above)! )

Consed will start by printing stuff like:

PARAMETERS {
! These parameters are printed in a form useful for cutting and
! pasting into your ~/.Xdefaults file for the purpose of modifying
! any of these.  Note that for consed to read your ~/.Xdefaults
! file, you must type: 
! xrdb -remove
!
consed.autoFinishMaxAcceptableErrorsPerMegabase: 100
!  (target error rate) 
consed.autoFinishCostOfUniversalPrimerSubcloneReaction: 20
!  (compares universal primer subclone reaction, custom primer subclone reaction, and custom primer clone reaction to decide which to favor) 
consed.autoFinishCostOfCustomPrimerSubcloneReaction: 60
!  (see above) 
.
.
.

If it ends with:

Run-time exception error; current exception: InputDataError
        No handler for exception.
Abort

that means that you have not followed the instructions under
'INSTALLING CONSED' below.  Please follow those instructions and then
try this again.

If you correctly installed consed, it will print out a list of
experiments you should do to make reads in order to reduce the number
of errors below a target threshold.  If you want to have this output
go to a file instead of to the screen, you can just instead type:

../../consed -autofinish -ace autofinish.fasta.screen.ace.1 >myfile

where myfile is the name of the file the output should go to.


This finishing tool is designed to be run in batch after each
assembly.  In a high throughput operation, the production people can
make these reads without anyone using consed to examine the assembly
interactively.  Only when consed -autofinish cannot help you any
longer (either it reduces the number of expected errors below your
error threshold, or it says it can't help you further), must you bring
up consed interactively and examine the assembly.

AUTOFINISH TARGET ERROR RATE

Now let's experiment with some of the autofinish options.  By default,
autofinish will suggest finishing reads until the error rate is less
than 100 errors per megabase.  Suppose you want fewer errors.  Fine:

46)  Create a file in edit_dir called .consedrc 
and put the following line in it:

consed.autoFinishMaxAcceptableErrorsPerMegabase: 10

Run autofinish again the same as before:

../../consed -autofinish -ace autofinish.fasta.screen.ace.1

You will notice two differences in the output:  First, near the top of
the autofinish output it will say:

consed.autoFinishMaxAcceptableErrorsPerMegabase: 10

whereas before it said:

consed.autoFinishMaxAcceptableErrorsPerMegabase: 100

A second difference is that this time it suggested an additional
experiment.

AUTOFINISH:  CHANGING COSTS

47)  Now please change it back to
consed.autoFinishMaxAcceptableErrorsPerMegabase: 100
or else just comment out the line by putting a '!' in the first column
like this:

!consed.autoFinishMaxAcceptableErrorsPerMegabase: 10


and run autofinish again:

../../consed -autofinish -ace autofinish.fasta.screen.ace.1

Check that it now says:

consed.autoFinishMaxAcceptableErrorsPerMegabase: 100

near the top of the autofinish output.

Notice that it calls 5 custom primer subclone sequencing reactions and
1 universal primer sequencing reaction.

Suppose you want to indicate that you prefer doing whole clone
sequencing reactions to subclone reactions (perhaps because you don't
want to try to find the M13 subclones).  You can do this by increasing
the relative cost of subclone sequencing reactions.  Put the following
in .consedrc

consed.autoFinishCostOfCustomPrimerSubcloneReaction: 200


And then run autofinish again:

../../consed -autofinish -ace autofinish.fasta.screen.ace.1

Check that it now says:

consed.autoFinishCostOfCustomPrimerSubcloneReaction: 200

near the top of the autofinish output.

Now you will notice that there are 5 whole clone custom primer
reactions, and just 1 subclone reaction and 1 universal primer
reaction.

AUTOFINISH:  CHANGING MELTING TEMPERATURES

48)  Look near the top of the autofinish output and you will see the
following lines:

consed.primersMinMeltingTemp: 50
consed.primersMaxMeltingTemp: 55

Some labs prefer to use primers with higher melting temperatures.  In
your .consedrc file, put the following lines:

consed.primersMinMeltingTemp: 55
consed.primersMaxMeltingTemp: 60

Then run autofinish again:

../../consed -autofinish -ace autofinish.fasta.screen.ace.1

Check that it now says:

consed.primersMinMeltingTemp: 55
consed.primersMaxMeltingTemp: 60

near the top of the autofinish output.

Compare the first experiment from the last 2 autofinish runs.  The
primer changed from the 1st to the 2nd:

 cgttatctctactattggcttatt melting temp 53
tcgttatctctactattggcttatt melting temp 55

AUTOFINISH:  OTHER CONTROL

49)  Try adding to .consedrc the following:

consed.autoFinishCloseGaps: false

and run autofinish again.

What happened?


Another parameter that people sometimes change is:

consed.autoFinishMinNumberOfErrorsFixedByAnExp: 0.1

One finisher says that she prefers to set this at 0.5 errors and to
decrease:
consed.autoFinishMaxAcceptableErrorsPerMegabase: 1

This has the effect of making autofinish work hard to resolve every
region where errors are clustered tightly together, even if the total
error rate for the entire BAC is very low.

You can change any of the parameters listed at the top of the
autofinish output (or actually any of the more exhaustive list of
resources listed in the 'Info' menu, 'Show Consed Resources' list.)

We believe the defaults are an excellent starting point.
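
If you change these parameters often, editing .consedrc by hand gets
tedious.  Here is a small sketch in Python of a helper that sets one
resource in the text of a .consedrc file.  The helper name is
hypothetical; the only things taken from this document are the
'name: value' line format and the '!' comment character.

```python
# Sketch: set one resource in the text of a .consedrc file.
# The helper name is hypothetical; only the "name: value" line format
# and the '!' comment character come from this document.

def set_resource(text, name, value):
    """Return text with `name: value` set, replacing any earlier line."""
    new_line = "%s: %s" % (name, value)
    out = []
    replaced = False
    for line in text.splitlines():
        # A commented-out line ('!' in column 1) for this resource
        # counts as a match too, so it gets replaced rather than kept.
        bare = line.lstrip("!").strip()
        if bare.startswith(name + ":"):
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # drop any further lines for the same resource
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

You would read edit_dir/.consedrc into a string, call set_resource on
it, and write the result back before running autofinish again.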


------------------------------------------------------------------------

INSTALLING CONSED


50) Follow the first few steps of USING CONSED GRAPHICALLY of the
 Quick Tour (above).  If you have problems, it may be due to your X
 emulator.  See 'MONITORS AND MICE FOR CONSED' below.

51) The default locations for most of consed, phred, and phrap require
that there be a directory /usr/local/genome.  I strongly suggest you
make such a location--it will save you many headaches of trying to
customize scripts for other locations.  However, I know that you will
probably start off ignoring this advice, so just keep it in mind if
you get to all the headaches.  At that point you could then download a
fresh distribution of consed and start over, this time using
/usr/local/genome


52) Put the consed executable in /usr/local/genome/bin (or wherever you
like to keep consed).  Make sure this location is some place that is
certain to be in every user's PATH.

53)  Check this by logging on as a user and typing:

consed -V

You should see 'Version 8.0'.  If you see something else, you have
some debugging to do.

54)  Build phd2fasta:
Go to the misc/phd2fasta directory and type 'make'
Move the phd2fasta executable to /usr/local/genome/bin

55)  Build mktrace:
Go to the misc/mktrace/980701 directory and type 'make'
Move the mktrace executable to /usr/local/genome/bin

56)  Move all perl scripts from the scripts directory to
/usr/local/genome/bin 
Make sure all are executable (chmod a+x *)

57)  Get perl 5.  You can check where to get perl via the perl web
site:

    http://www.perl.com/perl/info/software.html


(If you don't know about perl, try it--it will save you a
huge amount of time over developing the same utilities in C, awk, or
csh or sh.)  


58)  From the misc subdirectory, copy primerCloneScreen.seq and
primerSubcloneScreen.seq to the directory
/usr/local/genome/lib/screenLibs
(You may have to create this directory.)

Take a look at these files.  They are dummy files indicating the fasta
format of the sequences that should be put in them.  You should put
into primerCloneScreen.seq the vector sequence of the cloning vectors
you are using (BAC or cosmid) and into primerSubcloneScreen.seq the
sequencing vectors you are using (plasmid, M13, etc).  Don't be too
generous in putting lots of vectors into the files!  The larger they
are, the slower primer picking will be.  Our files are:

-rw-r--r--   1 root     root       29938 Nov  7  1997 primerCloneScreen.seq
-rw-r--r--   1 root     root        7381 Aug 13  1997 primerSubcloneScreen.seq

and primer picking is quite fast enough.

Now that you have set this up, you should try step 36 in the Quick
Tour (above) to make sure this works.  Note that you should *not* do
the temporary step just prior to step 36.

59)  You should also create a file 

/usr/local/genome/lib/screenLibs/vector.seq

This contains all the vector sequence that you want to mask out before
phrapping.  In general, it is the combination of primerCloneScreen.seq
and primerSubcloneScreen.seq
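
Combining the two files can be scripted.  The sketch below, in Python,
simply concatenates FASTA files; it assumes both inputs are plain
FASTA text, and the function name is made up for this example.

```python
# Sketch: build vector.seq by concatenating the two primer screen files.
# Assumption: both inputs are plain FASTA text files.

def combine_fasta(input_paths, output_path):
    """Concatenate FASTA files, making sure each ends with a newline."""
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as src:
                text = src.read()
            if text and not text.endswith("\n"):
                text += "\n"  # keep the next '>' header on its own line
            out.write(text)
```

For example, combine_fasta(['primerCloneScreen.seq',
'primerSubcloneScreen.seq'], 'vector.seq') run in
/usr/local/genome/lib/screenLibs would write the combined file there.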


ADDING NEW READS


60)  It will make your life easier if phred, phrap, and crossmatch are
all where consed expects them:  in /usr/local/genome/bin

61)  Make sure that phred's parameter file is put:
/usr/local/etc/PhredPar/phredpar.dat

62) Next you should test the ADDING NEW READS step (step 33) in the
Quick Tour (above).  This step requires that everything be set up
correctly and in the correct location.  Hopefully the error messages
are clear enough to help you if you have set up anything incorrectly.

RUNNING PHREDPHRAP

Follow instructions 10 and 11 (above)

If you do not have the latest phrap (Aug 1998 or better), then your
phrap will not have the -new_ace option.  Thus edit the phredPhrap
script so that -new_ace is replaced by -ace.  MAKE SURE YOU CHANGE THIS 
BACK AS SOON AS YOU GET THE NEW PHRAP!!!!  This is crucial for many
features in consed to work correctly--see NEW ACE FILE FORMAT (below)
for details.


63)  Make a copy of the standard dataset.  E.g.,

cp -r standard test
cd test

64)  Delete all the files in phd_dir and edit_dir

65)  cd edit_dir

66)  Run phredPhrap by typing

phredPhrap

That's it--you no longer need to type *any* arguments, and generally
you should not.  (Please do *not* use the -notags option any longer.)
If you want to add phrap options, you can do that: 

e.g.,

phredPhrap -forcelevel 3

Then run consed on the resulting ace file as indicated in step 1 of
the Quick Tour (above).  If you have any problems, this is the time to 
diagnose them before you use your own data.  

After you have done this successfully, you are ready to use your own
data.  

USING YOUR OWN DATA


67)  Create the following directory structure:

Directory structure:
    top level directory (generally named after the BAC or cosmid)
        subdirectory 'chromat_dir'--chromatograms go in here
        subdirectory 'phd_dir'--phd files will automatically be put here
        subdirectory 'edit_dir'--ace files will automatically be put here

If you already have your chromatograms somewhere else, you can make
chromat_dir be a link to wherever you have them.  

The various phrap and crossmatch files will be put into edit_dir by
the phredPhrap script.
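
If you set up many projects, the directory structure above can be
created by a short script.  This is a minimal sketch in Python; the
function name is hypothetical, and the three subdirectory names are
the ones listed above.

```python
import os

# Sketch: create the three subdirectories consed expects for a project.

def make_project_dirs(top_level):
    """Create top_level with chromat_dir, phd_dir and edit_dir in it."""
    created = []
    for sub in ("chromat_dir", "phd_dir", "edit_dir"):
        path = os.path.join(top_level, sub)
        os.makedirs(path, exist_ok=True)  # harmless if it already exists
        created.append(path)
    return created
```

If chromat_dir is going to be a link to an existing directory, create
that link yourself first; the sketch leaves an existing chromat_dir
alone.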

68)  cd to the edit_dir directory, and type:

phredPhrap

If you are successful, the script will tell you so and you can bring
up consed on the ace file:

69)  Type:

consed

You should see a file with the extension .ace.1
Double click on it.

You should see a list of contigs.

Double click on the one you want to see.

Now you should see a big colorful alignment of your sequences.  Repeat 
some of the experimenting you did with the test data set above.


70) determineReadTypes.perl

Phrap, Consed's primer picking, and Consed/Autofinish all need the
following information for each read:
          is it a universal primer forward, a universal primer reverse,  
             or a walking read?
          what is its template name?

Generally this information can be determined from the read name, using
*your* naming convention.  Modify the perl script
determineReadTypes.perl to put this information into the phd file
using WR{ info items.
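
The classification itself is usually simple string handling.  The
sketch below shows the idea in Python rather than perl, and it uses a
naming convention that is entirely HYPOTHETICAL (template name, a dot,
then f, r, or w and a number).  The function name and the read type
labels are also made up; substitute your own convention and whatever
determineReadTypes.perl expects.

```python
# Sketch of classifying a read from its name.
# HYPOTHETICAL naming convention (substitute your own):
#   <template>.f<n>  universal primer forward read
#   <template>.r<n>  universal primer reverse read
#   <template>.w<n>  walking (custom primer) read

READ_TYPES = {
    "f": "universal forward",
    "r": "universal reverse",
    "w": "walk",
}


def classify_read(read_name):
    """Return (template_name, read_type) for one read name."""
    template, sep, suffix = read_name.partition(".")
    if not sep or not suffix or suffix[0] not in READ_TYPES:
        raise ValueError(
            "read name does not follow the convention: %r" % read_name)
    return template, READ_TYPES[suffix[0]]
```

Under this made-up convention, classify_read('djs74_2.f1') returns
('djs74_2', 'universal forward').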


USING NON-STANDARD LOCATIONS FOR FILES

You have a lot of work to do.  You will need to edit nearly every
script mentioned above.  In addition, you will need to make sure that
the CONSED_PARAMETERS environment variable is set for every user and
that the CONSED_PARAMETERS file points to the new locations for these files:

consed.primersSubcloneFullPathnameOfFileOfSequencesForScreening: /usr/local/genome/lib/screenLibs/primerSubcloneScreen.seq
consed.primersCloneFullPathnameOfFileOfSequencesForScreening: /usr/local/genome/lib/screenLibs/primerCloneScreen.seq
consed.primersBadTemplatesFile: badTemplates.txt
consed.fullPathnameOfAddReads2ConsedScript: /usr/local/genome/bin/addReads2Consed.perl
consed.fullPathnameOfCrossMatch: /usr/local/genome/bin/cross_match
consed.fullPathnameOfPhred: /usr/local/genome/bin/phred


As you can see, sticking with the defaults will make your life
easier--not just at installation, but even in day to day operations.


--------------------------------------------------------------------------
NOTE TO SGI USERS

In /usr/lib, there must be a file: libCsup.so

If you don't have this file, you must get it from SGI.  To get it, if
you are on Irix 6.2 through 6.4, request:

SG0001637 'C++ Exception handling patch for 7.00 (and above) compilers
on irix 6.2' (it's on the 'Development Options 7.1' CD).

If you are on Irix 5.3, install patch 1600

To make things easier for you, I've included my libCsup.so
This might save you having to get the patches above.


--------------------------------------------------------------------------

MONITORS AND MICE FOR CONSED

If your monitor is part of a Unix computer (a Sun, an HP, a DEC, an
SGI, or a Linux box) or is an Xterminal, then you will have absolutely
no problems.

You must have a 3 button mouse or 3 button emulation.  3 Button
emulation is tricky since consed uses all 3 buttons of the mouse and
it also uses Control-Middle-Mouse-button, Shift-Middle-Mouse-Button
and Control-Right-Mouse-Button.  So if you are going to try to just
use a 2 button mouse (or, God-forbid, a 1 button mouse), you should
make sure that you can emulate each of those.

If your monitor is a PC running Windows or NT, then you must have an X
emulator installed and running.  X emulators include:  Exceed, XWin32,
Reflection X, and OpenNT.  Any of these will work if configured
correctly (and the 'correctly' is the key).  I encourage you to use
single window mode and then use a Unix window manager such as CDE,
fvwm, or mwm.

If your monitor is a MAC, then you must also have an X emulator, such
as Exodus or MACX installed and running.  You *must* use this emulator
in single window mode, and then use a Unix window manager such as CDE,
fvwm, or mwm.  (If you don't use single window mode, consed might
crash in some circumstances.)


--------------------------------------------------------------------------

PRIMER PICKING PARAMETERS


On the main window, click on 'Options'/'Primer Picking Preferences'
again.  A great deal of science and experimentation has gone into
setting these defaults and I suggest you do not change them.  However,
I know you will anyway, so now you know where to find them.

This is what they mean (I suggest you skip over this for now):

    PrimersNumberOfBasesToBackupToStartLooking
        Consed is designed for you to put the cursor on the left-most
        (or right-most) edge of a region that you want to cover with a
        new read.  Since the data quality immediately after an oligo
        is not good, you don't want the oligo immediately next to the
        region you want to cover, but rather a little bit back from
        it.  This parameter gives how far back.

    PrimersWindowSizeInLooking
        This is the width of the region in which consed looks for
        primers.  So if PrimersNumberOfBasesToBackupToStartLooking is
        50 and PrimersWindowSizeInLooking is 450, and you are looking
        for a forward primer, then consed will look from 500 bases
        to the left of the cursor up to 50 bases to the left of the
        cursor.  If you are looking for a reverse primer, then consed
        will start looking 50 bases to the right of the cursor and
        continue until 500 bases to the right of the cursor.

    PrimersMinimumLengthOfAPrimer
    PrimersMaximumLengthOfAPrimer
        (just what they sound like)

    PrimersMaxInsertSizeOfASubclone
        When you click on forward or reverse primer/subclone template,
        consed knows that it is all right if it finds a primer that
        has an additional match to somewhere else in the assembly, as
        long as that location is not on the same subclone template you
        intend to use.  Consed uses this parameter to specify the
        range of the search for unacceptable additional matches.

    PrimersMinMeltingTemp
    PrimersMaxMeltingTemp
        Consed uses the nearest-neighbor (with salt concentration
        correction) formula, just as all modern primer picking
        programs do.

    PrimersMaxSelfMatchScore
        In choosing a primer, you don't want the primer to bind to
        itself (form a hairpin) or bind to another copy of itself.  It
        is particularly bad if it binds to another copy at its 3' end.
        This parameter is used in the algorithm that tests this.

    PrimersMaxMatchElsewhereScore
        In choosing a primer, it is important that the primer not
        stick somewhere besides the place you are trying to get a
        read--a 'false match'.  This can cause a primer to fail even
        if the false match is not perfect.  The worst kind of false
        matches are those that extend to the 3' end of the primer, and
        worse yet if they have a high percentage of G/C matches since
        G and C bind more tightly than A and T.  The algorithm used
        here takes both of these effects into account.  This parameter
        sets the max acceptable false match.

    PrimersMinQuality
        Some primers fail because the primers don't match where they
        are supposed to.  This is because the sequence where the
        primer is supposed to stick isn't accurately known.  Thus it
        is important to be certain of the sequence where the primer is
        chosen from.  This parameter is an indication of this
        certainty--it is the min quality of every base in an
        acceptable primer.

    PrimersMaxLengthOfMononucleotideRepeat
        Folklore says that mononucleotide repeats are bad.  To please
        consed users, I've put this check in.

    Screen Primers Against Sequences in File?  True False
        It is important that the primers not stick to the vector of
        the template.  Thus you must provide consed with two files--a
        file in fasta format of all subclone vectors, and a file in
        fasta format of all clone vectors.  Consed will not accept any
        primer that has a match against the appropriate one of these
        vectors (depending on whether you click in the aligned reads
        window mouse button 3 on forward/reverse primer from subclone
        template or clone template).  A primer that has a false match
        to a vector is rejected if that false match has a score worse
        than PrimersMaxMatchElsewhereScore
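
The arithmetic described under PrimersWindowSizeInLooking can be
written out as a small Python function.  This is only a worked example
of the numbers given there; the function name and argument names are
hypothetical.

```python
# Worked arithmetic for the primer search window described under
# PrimersWindowSizeInLooking.  Positions are consensus positions.

def primer_search_window(cursor, backup, window_size, forward=True):
    """Return (leftmost, rightmost) consensus positions to search."""
    if forward:
        # forward primers are looked for to the left of the cursor
        return cursor - backup - window_size, cursor - backup
    # reverse primers are looked for to the right of the cursor
    return cursor + backup, cursor + backup + window_size
```

With the cursor at 1000, a backup of 50 and a window size of 450, a
forward primer is searched for from 500 to 950 and a reverse primer
from 1050 to 1500.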


You can also read about this in the consed paper:

Gordon, D., C. Abajian, and P. Green. 1998. Consed: A graphical tool
for sequence finishing. Genome Research. 8:195-202


----------------------------------------------------------------------------

FOR PROGRAMMERS AND FELLOW TRAVELLERS ONLY


CONSED VERSION

On the command line, type:

consed -v

This is particularly useful to system administrators to make sure the
latest version is installed on all computers.

CONSED CUSTOMIZATION

Click on the 'Info' menu on the Main Consed Window and release on menu 
item 'Show Consed Resources'.  This shows you what is available to be
changed by putting in your ~/.consedrc file.

Changes in ~/.consedrc only affect one user.  If you want to make a
change to affect all consed users on the system, put a file in some
central location (e.g., /usr/local/genome/lib/.consedrc ) and then
have every user set the environment variable CONSED_PARAMETERS to
that location:

setenv CONSED_PARAMETERS /usr/local/genome/lib/.consedrc

Anything the user puts in ~/.consedrc will override whatever is in the
CONSED_PARAMETERS file.

You can also have different parameters for different projects.  Put a
.consedrc file in the edit_dir of a particular project.  When you are
working on that project, whatever is in that .consedrc will override
whatever is in your ~/.consedrc file or the  CONSED_PARAMETERS file.
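
The override order can be sketched in Python as follows.  This is not
consed's own parser.  It assumes one 'name: value' pair per line and
that a line starting with '!' is a comment, and the function names are
made up for this example.

```python
# Sketch of the override order described above; not consed's parser.
# Assumptions: one "name: value" pair per line, '!' starts a comment.

def parse_resources(text):
    """Parse resource lines into a dict, skipping comments and blanks."""
    resources = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("!") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        resources[name.strip()] = value.strip()
    return resources


def effective_resources(site_text, user_text, project_text):
    """Merge the three files; later sources override earlier ones."""
    merged = {}
    # order: CONSED_PARAMETERS file, then ~/.consedrc, then the
    # .consedrc in the project's edit_dir
    for text in (site_text, user_text, project_text):
        merged.update(parse_resources(text))
    return merged
```

A value set only in the CONSED_PARAMETERS file survives, while a value
set in all three places ends up with the edit_dir setting.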


COMPRESSING CHROMATOGRAMS

If you are interested in compressing your chromatogram files, go into
chromat_dir and gzip one of the chromatogram files.  Make sure that
gunzip is in /usr/local/bin   (You can change this location via the
consed resource

consed.gunzipFullPath: /usr/local/bin/gunzip

--see CONSED CUSTOMIZATION (above), but it will be easiest for 
you and your users if you just put gunzip in /usr/local/bin and not
have to bother with consed resources.)

Restart consed and bring up the corresponding trace.  You will notice
no appreciable delay.


CONSED -ACE

Try bringing up consed like this:

consed -ace (name of ace file)

This can be useful if you are going to have consed brought up from
some other program.


NO PHD FILES

Try bringing up consed like this:

consed -nophd

This mode does not allow editing and does not show quality
information.  It allows you to view an assembly when you don't have
phd files or chromatograms but you only have the ace file.  You will
not be able to see the quality information, since that information is
kept in the phd files.  I do not recommend nor support this option!


CUSTOM NAVIGATION

Take a look at the file
standard/edit_dir/custom_navigation.nav 
supplied with this distribution.  You should also experiment with the
custom navigation feature as explained under step 22 (above) in the
Quick Tour.  You may want to write programs that produce such files.


CREATING CUSTOM TAG TYPES


The following consed resources are available for creating custom tag
types:

consed.tagColorCustomTag1: 
consed.tagColorCustomTag2: 
consed.tagColorCustomTag3: 
consed.tagColorCustomTag4: 
consed.tagColorCustomTag5: 
consed.tagColorCustomTag6: 
consed.tagColorCustomTag7: 
consed.tagColorCustomTag8: 
consed.tagColorCustomTag9: 
consed.tagColorCustomTag10: 
consed.tagColorCustomTag11: 
consed.tagColorCustomTag12: 
consed.tagColorCustomTag13: 
consed.tagColorCustomTag14: 
consed.tagColorCustomTag15: 
consed.customTag1: 
consed.customTag2: 
consed.customTag3: 
consed.customTag4: 
consed.customTag5: 
consed.customTag6: 
consed.customTag7: 
consed.customTag8: 
consed.customTag9: 
consed.customTag10: 
consed.customTag11: 
consed.customTag12: 
consed.customTag13: 
consed.customTag14: 
consed.customTag15: 
consed.tagColorCustomConsensusTag1: 
consed.tagColorCustomConsensusTag2: 
consed.tagColorCustomConsensusTag3: 
consed.tagColorCustomConsensusTag4: 
consed.tagColorCustomConsensusTag5: 
consed.tagColorCustomConsensusTag6: 
consed.tagColorCustomConsensusTag7: 
consed.tagColorCustomConsensusTag8: 
consed.tagColorCustomConsensusTag9: 
consed.tagColorCustomConsensusTag10: 
consed.tagColorCustomConsensusTag11: 
consed.tagColorCustomConsensusTag12: 
consed.tagColorCustomConsensusTag13: 
consed.tagColorCustomConsensusTag14: 
consed.tagColorCustomConsensusTag15: 
consed.customConsensusTag1: 
consed.customConsensusTag2: 
consed.customConsensusTag3: 
consed.customConsensusTag4: 
consed.customConsensusTag5: 
consed.customConsensusTag6: 
consed.customConsensusTag7: 
consed.customConsensusTag8: 
consed.customConsensusTag9: 
consed.customConsensusTag10: 
consed.customConsensusTag11: 
consed.customConsensusTag12: 
consed.customConsensusTag13: 
consed.customConsensusTag14: 
consed.customConsensusTag15: 

When you create a custom tag type, you specify its name and the color
you want it displayed in.

For example:

consed.tagColorCustomTag1: SlateBlue2
consed.tagColorCustomTag2: SlateBlue2
consed.tagColorCustomTag3: SlateBlue2
consed.tagColorCustomTag4: brown
consed.tagColorCustomTag5: MediumPurple
consed.tagColorCustomTag6: purple
consed.customTag1: polymorphismInsertion
consed.customTag2: polymorphismDeletion
consed.customTag3: polymorphismSubstitution
consed.customTag4: qualityCoreComment
consed.customTag5: coordinatorApproval
consed.customTag6: coordinatorComment

(All of these tag types are read tag types.  Consensus tag types are
specified separately--see the consed resource names (above).)

Once you have done this, the user of consed can add tags of these
types in the method described in steps 31 through 33 of the Quick Tour 
(above).
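
If you maintain a long list of custom read tag types, you may prefer
to generate the resource lines rather than type them.  Here is a
minimal sketch in Python; the helper name is hypothetical, and the
resource names and the limit of 15 slots are taken from the list
above.

```python
# Sketch: generate custom read tag resource lines from (name, color)
# pairs.  The resource names come from the list above; the helper is
# hypothetical.

MAX_CUSTOM_TAGS = 15  # the list above provides slots 1 through 15


def custom_tag_resources(tags):
    """Return .consedrc lines for a list of (tag_name, color) pairs."""
    if len(tags) > MAX_CUSTOM_TAGS:
        raise ValueError(
            "only %d custom tag slots are listed" % MAX_CUSTOM_TAGS)
    lines = []
    for slot, (name, color) in enumerate(tags, start=1):
        lines.append("consed.tagColorCustomTag%d: %s" % (slot, color))
        lines.append("consed.customTag%d: %s" % (slot, name))
    return lines
```

Write the returned lines into the .consedrc file you want them to
apply to.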

You can also write external programs that add tags to the ace file
and/or the phd files.

CONTROL OF CONSED FROM SOME OTHER PROGRAM

Consed can be controlled by some other program.  For example, you
might have a program that displays mapping data and you would like the
user to be able to click on a location and have consed come up showing
the bases in that region.  This feature allows a programmer to do
this.


The external program can start up consed as follows:

consed -socket (local port number) -ace (ace filename)

For example,

consed -socket 5432 -ace standard.fasta.screen.ace

After consed finishes starting up (including your answering whether
you want to apply edits), you will see this message in the xterm:

success bind to local port number: 5432

And then you will see a file created by consed in the default
directory called consedSocketLocalPortNumber

This gives the port number of the Berkeley socket that consed has
opened and is listening on.  Thus your program can read this file and
create a connection to the Berkeley socket created by consed.

Once the connection is established, your program can send commands to
consed at that socket indicating to consed which contig to display and
what consensus position to scroll to.  Currently, the only acceptable
command is:

Scroll (contigname) (consensus position)<return>

Just send such a command to the Berkeley socket, and consed will
respond appropriately.
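
A client for this can be quite small.  The Python sketch below makes
several assumptions that this document does not state: that the file
consedSocketLocalPortNumber holds just the port number as text, that
consed is listening on the local host, and that <return> means a
newline character.  The function names are made up for this example.

```python
import socket

# Sketch of an external program driving consed over its socket.
# Assumptions (not stated in this document): the file
# consedSocketLocalPortNumber holds just the port number as text,
# consed listens on the local host, and <return> means a newline.


def read_port(path="consedSocketLocalPortNumber"):
    """Read the port number consed wrote into its port file."""
    with open(path) as port_file:
        return int(port_file.read().strip())


def scroll_command(contig_name, consensus_position):
    """Build the bytes of the Scroll command described above."""
    text = "Scroll %s %d\n" % (contig_name, consensus_position)
    return text.encode("ascii")


def send_scroll(port, contig_name, consensus_position, host="localhost"):
    """Connect to consed and ask it to scroll to a consensus position."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(scroll_command(contig_name, consensus_position))
```

A mapping program would call read_port() once after starting consed
and then send_scroll(port, contigname, position) each time the user
clicks on a location.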


AUTOMATIC ORDERING OF OLIGOS

I heard of a finisher who manually ordered 72 oligos.  She had to
cut/paste the bases of each oligo.  That is not only painful, but also
error prone.  I've supplied you a script that you can use to
automatically determine which oligos have been newly requested since
the last order, aggregate them into a single order, and email the
request off.

The script is ace2Oligos.perl.  It takes as parameters the name of an
ace file and the name of the oligo file.  The oligo file is a list of
oligos that have been ordered for that particular project, and looks
like this:

name=G1980A181.1
sequence=ctgcatggctaggga
template=seq from subclone
date=980427 temp=52
 
name=G1980A181.2
sequence=tcttactttctgactttcattt
template=seq from clone
date=980427 temp=50

ace2Oligos.perl finds all oligo tags in the ace file and makes sure
that all of them are in this oligo file.

To automatically order oligos each night, there is an additional
script you will have to write.  I suggest that you run your script
each night under cron and that it do the following:

for each project, it will look for the most recent ace file.  It will
run ace2Oligos.perl on that ace file and direct the oligo file to be
in the parent directory of edit_dir, phd_dir, and chromat_dir for that
project.  Thus there will be one oligos file for each project.  Your
script will run ace2Oligos.perl once for each project.

Then your script would, for each project, look in the oligos file for
new oligos, and aggregate the unordered oligos into a central file,
which it would email to the oligo company.  If it finds any new oligos
in an oligo file, it draws a line at the bottom:

-------------------------------

which indicates that all oligos have been ordered.  When this script
looks at this file the next night, it uses this line to determine
whether any additional oligos have been requested since the previous
order.  (The idea of this line came from St Louis.)  Thus the oligos
file tells you which oligos have been ordered and which have not yet
been ordered.
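
The line-drawing idea described above can be sketched in Python as
follows.  This is only an illustration of the bookkeeping, not a
supplied script.  The function names are hypothetical, and the sketch
assumes that any line beginning with five or more dashes is an "already
ordered" marker.

```python
SEPARATOR_PREFIX = "-----"  # assumption: a line starting with dashes marks an order


def split_at_last_order(lines):
    """Return (already_ordered_lines, pending_lines) around the last marker."""
    last = -1
    for i, line in enumerate(lines):
        if line.startswith(SEPARATOR_PREFIX):
            last = i
    return lines[: last + 1], lines[last + 1:]


def pending_oligo_names(lines):
    """Names of oligos that appear after the last marker line."""
    _, pending = split_at_last_order(lines)
    return [l.split("=", 1)[1].strip() for l in pending if l.startswith("name=")]


def mark_all_ordered(lines):
    """Append a marker line, but only if some oligos are still pending."""
    if pending_oligo_names(lines):
        return lines + ["-" * 31]
    return lines


example = [
    "name=a.1", "sequence=acgt", "",
    "-" * 31, "",
    "name=a.2", "sequence=ttga",
]
print(pending_oligo_names(example))
```

Your script would email the pending records and then write back the
result of mark_all_ordered, so that the next night's run finds nothing
new unless more oligos have been added.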


CUSTOM NAVIGATION

In the Main Window, there is also a Navigate menu.  Pull it down and
release on the Custom Navigation menu item.  A box will pop up saying
'Select custom navigation file:'  
There will be a file:
custom_navigation.nav
Double click on it.

You will see the now-familiar custom navigation box.  Click 'Next'
repeatedly until you get to the end of the list.

Consed doesn't write such a file--it just reads it.  This feature
lets you write your own programs that select locations that you want
your finishers to examine.  Your program writes a file, the user reads
that file into consed in this manner, and the user can then go to each
of the locations.


----------------------------------------------------------------------------

NEW ACE FILE FORMAT

There is a new ace file format.  You *must* change to the new ace file
format as soon as possible, since it contains information that is not
contained in the old ace file format.  This additional information
(e.g., the alignment and quality clipping values) is essential for
some of the consed functions (e.g., navigate by single stranded,
navigate by single subclone, autofinish) to work correctly.

Another reason to switch to the new ace format is that you will get
faster consed startup performance.  The new ace file format is also
much smaller (about 60% as big as the old).

The new phrap (Aug 1998 and later) writes the new ace format (using
the -new_ace switch).  Since consed now uses the additional
information found only in the new ace format, if you are editing an
assembly, you should first re-phrap to take advantage of this
additional information.

Consed can read either old or new ace format.
Consed can also write either new or old ace format.  It writes the new
ace format by default--see 'Options'/'General Preferences'.  Also see
the consed resource:

consed.writeThisAceFormat: 2

(where 2 means 'new' and 1 means 'old')

If you have scripts that read the ace file, you will need to modify
those scripts for the new ace format.  Here is the format:

Ace File Format

Refer to the accompanying sample_ace_file.txt (below)

AS <number of contigs> <total number of reads in ace file>

CO <contig name> <# of bases> <# of reads in contig> <# of base segments in contig> <U or C>

The U or C indicates whether the contig has been complemented from the
way phrap originally created it.  Thus this is always U for an ace
file created by phrap.

BQ

This starts the list of base qualities for the unpadded consensus
bases.  The contig is the one from the previous CO, hence no name is
needed here.

AF <read name> <C or U> <padded start consensus position>

This line replaces the 'AssembledFrom*' line in the previous ace file
format.  C or U means complemented or uncomplemented.  The <read name>
is the true read name (no .comp on it as with the previous ace file
format.)

BS <padded start consensus position> <padded end consensus position> <read name>

This replaces the 'BaseSegment*' line from the previous ace file format.
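
If you are modifying a script to read the new format, the single-line
records described so far (AS, CO, AF, BS) can be split on whitespace.
Here is a minimal Python sketch.  The function name parse_ace_line and
the dictionary keys are hypothetical, and the sample values are made
up for illustration.

```python
def parse_ace_line(line):
    """Parse one AS, CO, AF or BS line into a dict; return None otherwise."""
    fields = line.split()
    if not fields:
        return None
    tag = fields[0]
    if tag == "AS":
        return {"tag": "AS", "contigs": int(fields[1]), "reads": int(fields[2])}
    if tag == "CO":
        return {
            "tag": "CO",
            "name": fields[1],
            "bases": int(fields[2]),
            "reads": int(fields[3]),
            "base_segments": int(fields[4]),
            "complemented": fields[5] == "C",
        }
    if tag == "AF":
        return {
            "tag": "AF",
            "read": fields[1],
            "complemented": fields[2] == "C",
            # The padded start is parsed as a signed integer.
            "padded_start": int(fields[3]),
        }
    if tag == "BS":
        return {
            "tag": "BS",
            "padded_start": int(fields[1]),
            "padded_end": int(fields[2]),
            "read": fields[3],
        }
    return None


print(parse_ace_line("CO Contig1 1475 8 156 U"))
```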

RD <read name> <# of padded bases> <# of whole read info items> <# of read tags>

QA <qual clipping start> <qual clipping end> <align clipping start> <align clipping end>

This is new information not found in the previous ace file.  If the
entire read is low quality, then <qual clipping start> and <qual
clipping end> will both be -1.  These positions are offsets from the
left end of the read (left, as shown in consed).  Hence for bottom
strand reads, the offsets are from the end of the read.  The offsets
are 1-based.  That is, if the left-most base is in the aligned,
high-quality region, <qual clipping start> = 1 and <align clipping
start> = 1 (not zero).
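
Because the QA offsets are 1-based, scripts in languages with 0-based
strings need a small adjustment.  The following Python sketch shows
one way to do it.  The function names are hypothetical, and the sketch
assumes that the end position is inclusive, which the description
above does not state explicitly.

```python
def parse_qa(line):
    """Parse a QA line into a dict of 1-based clipping positions."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "QA":
        raise ValueError("not a QA line: %r" % line)
    qual_start, qual_end, align_start, align_end = (int(f) for f in fields[1:])
    return {
        "qual_start": qual_start,
        "qual_end": qual_end,
        "align_start": align_start,
        "align_end": align_end,
        # Both quality positions are -1 when the whole read is low quality.
        "all_low_quality": qual_start == -1 and qual_end == -1,
    }


def high_quality_bases(padded_bases, qa):
    """Return the high quality part of the read as displayed in consed."""
    if qa["all_low_quality"]:
        return ""
    # Convert the 1-based start to a 0-based index; end assumed inclusive.
    return padded_bases[qa["qual_start"] - 1: qa["qual_end"]]


qa = parse_qa("QA 3 6 2 8")
print(high_quality_bases("acgtacgtac", qa))
```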

DS CHROMAT_FILE: <name of chromat file> PHD_FILE: <name of phd file> TIME: <date/time of the phd file>

This replaces the DESCRIPTION line from the old ace file.
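
A script can pull the three fields out of a DS line with a regular
expression.  This Python sketch is only an illustration.  The function
name parse_ds is hypothetical, the sample file names and time are made
up, and the time is kept as a plain string because its exact layout is
not specified here.

```python
import re

DS_PATTERN = re.compile(
    r"^DS\s+CHROMAT_FILE:\s+(\S+)\s+PHD_FILE:\s+(\S+)\s+TIME:\s+(.*\S)\s*$"
)


def parse_ds(line):
    """Return (chromat_file, phd_file, time_string) from a DS line."""
    match = DS_PATTERN.match(line)
    if match is None:
        raise ValueError("not a DS line: %r" % line)
    return match.group(1), match.group(2), match.group(3)


chromat, phd, when = parse_ds(
    "DS CHROMAT_FILE: myRead PHD_FILE: myRead.phd.1 TIME: Thu Sep 12 15:42:38 1996"
)
print(chromat, phd, when)
```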

The following is for whole read info items.  These are not fully
implemented, and the format may eventually change.  The read is
implied by the location of the whole read info item within the ace
file.  They are found after the DS line for a read.

WR {
<tag type> <what program created tag> <date when tag was created in form YYMMDD:HHMISS>
}

The following is for transient read tags (those generated by
crossmatch and phrap).  They are not fully implemented, and the format
may eventually change.  The read is implied by the location of the
read tag within the ace file.  They are found after the WR lines for a
read.

RT{
<tag type> <what program created tag> <padded cons pos start> <padded cons pos end> <date when tag was created in form YYMMDD:HHMISS>
}

There are consensus tags now in the ace file.  All consensus tags have
the following format:

CT{
<contig name> <tag type> <what program created tag> <padded cons pos start> <padded cons pos end> <date when tag was created in form YYMMDD>
(possibly additional information)
}

 
In the case of most consensus tag types, there is only 1 line for the
consensus tag.  In the case of comment tags and oligo tags, there are
additional lines of information.  The comment tag includes the comment
on the additional lines.  The oligo tag has the following information:
<oligo name> <oligo bases from 5' to 3'> <melting temp> <C or U
indicating whether the oligo is top strand or bottom strand relative
to the orientation of the contig as created by phrap>
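
As an illustration of reading a consensus tag, here is a Python
sketch.  The function name parse_ct_block is hypothetical and the
sample values are made up.  The sketch assumes that the tag type for
oligo tags is the literal word "oligo" and that the four oligo fields
sit together on the first additional line.  Check both assumptions
against your own ace files before relying on them.

```python
def parse_ct_block(lines):
    """Parse the lines of one CT{ ... } block into a dict."""
    if lines[0].strip() != "CT{" or lines[-1].strip() != "}":
        raise ValueError("not a CT block")
    contig, tag_type, program, start, end, date = lines[1].split()[:6]
    tag = {
        "contig": contig,
        "type": tag_type,
        "program": program,
        "start": int(start),   # padded consensus position
        "end": int(end),       # padded consensus position
        "date": date,          # kept as a string
        "extra": [l.rstrip("\n") for l in lines[2:-1]],
    }
    if tag_type == "oligo" and tag["extra"]:
        # Assumed layout: name, bases (5' to 3'), melting temp, C or U.
        name, bases, temp, strand = tag["extra"][0].split()[:4]
        tag["oligo"] = {
            "name": name,
            "bases": bases,
            "melting_temp": float(temp),
            "strand": strand,
        }
    return tag


block = [
    "CT{",
    "Contig1 oligo consed 606 620 980427",
    "G1980A181.1 ctgcatggctaggga 52 U",
    "}",
]
tag = parse_ct_block(block)
print(tag["oligo"]["name"], tag["start"], tag["end"])
```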

WA{
<tag type> <what program created tag> <date tag was created in form YYMMDD:HHMISS>
1 or more lines of data
}

This line is a 'whole assembly' tag.  It is used for information
referring to the assembly as a whole.  Currently, phrap puts its
version and phrap command line options in a WA tag.
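
All of the brace-delimited items (WR, RT, CT, WA) can be collected
with one small loop.  The following Python sketch is one way to do
that.  The function name iter_tag_blocks is hypothetical, the sample
lines are made up, and the sketch assumes that the closing brace
always sits alone on its own line, as in the layouts shown above.

```python
import re

# Matches "WA{", "CT{", "RT{" and also "WR {" (with a space).
BLOCK_START = re.compile(r"^(WA|CT|RT|WR)\s*\{\s*$")


def iter_tag_blocks(lines):
    """Yield (kind, body_lines) for each brace-delimited tag block."""
    kind = None
    body = []
    for raw in lines:
        line = raw.rstrip("\n")
        if kind is None:
            match = BLOCK_START.match(line)
            if match:
                kind = match.group(1)
                body = []
        elif line.strip() == "}":
            yield kind, body
            kind = None
        else:
            body.append(line)


sample = [
    "WA{",
    "phrap_params phrap 990621:161947",
    "phrap standard.fasta.screen -new_ace",
    "}",
    "",
    "WR {",
    "template someProgram 990621:161947",
    "}",
]
for kind, body in iter_tag_blocks(sample):
    print(kind, len(body))
```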


----------------------------------------------------------------------------

ADVANCED PHRAP/CONSED USAGE


70)  BACKING OUT EDITS AFTER YOU HAVE SAVED THE ASSEMBLY

If you decide that all your edits are terrible and you want to start
over (perhaps you have been training a new finisher), the cleanest
solution is to delete everything in phd_dir and edit_dir, but leave
everything in chromat_dir, and just run phredPhrap again.


71)  SELECTIVELY BACKING OUT EDITS AND REMOVING READS

If you want to back out all edits in just particular reads, I have
provided a perl script to do this:


revertToUneditedRead (read name)

What it does is copy the .phd.1 file to a new version numbered 1
greater than the highest existing version.

Then you must reassemble using the phredPhrap script to create an ace
file that has no edits for that particular read.  It will have all
edits for all other reads. 

Why doesn't it just delete all phd files except for the
.phd.1?  In that case, consed could not read any previous ace file
since all previous versions of ace files would refer to phd files that 
have been deleted.
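
The version-numbering idea can be sketched in Python as follows.  This
is not the supplied perl script, which may do more than a plain copy.
The function name revert_to_unedited is hypothetical, and the sketch
assumes phd files are named (read name).phd.(version).

```python
import os
import re
import shutil


def revert_to_unedited(phd_dir, read_name):
    """Copy read_name.phd.1 to read_name.phd.N, N = highest version + 1."""
    pattern = re.compile(re.escape(read_name) + r"\.phd\.(\d+)$")
    versions = []
    for filename in os.listdir(phd_dir):
        match = pattern.match(filename)
        if match:
            versions.append(int(match.group(1)))
    if 1 not in versions:
        raise FileNotFoundError("no %s.phd.1 in %s" % (read_name, phd_dir))
    new_version = max(versions) + 1
    source = os.path.join(phd_dir, "%s.phd.1" % read_name)
    target = os.path.join(phd_dir, "%s.phd.%d" % (read_name, new_version))
    # Older versions are left in place so that earlier ace files stay readable.
    shutil.copyfile(source, target)
    return target
```

For example, with myRead.phd.1 through myRead.phd.3 present, the
sketch would create myRead.phd.4 with the contents of myRead.phd.1.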

72)  REMOVING READS FROM AN ASSEMBLY

Create a file containing the filenames of all the reads you want to
remove, one filename per line.
Then use the perl script

removeReads  <file of filenames>

Then reassemble using the phredPhrap script.


73)  ADDING READS WITHOUT CHROMATOGRAM FILES

This may happen if you, for example, download sequence from Genbank
and want to assemble it along with your reads.  

There are 2 ways to do this, depending on whether you want to edit the 
read or not.  

a)  If you want to edit the read, run mktrace to produce a fake trace.  It 
will have all perfect peaks.  

Run:

mktrace (name of file with fasta sequence)

Then run the phredPhrap script normally.  You will be able to bring up 
the traces in consed and edit the read.

b)  If it is not important to edit the reads, there is a method that
is a little faster.  Create just a fake phd file using:

fasta2Phd.perl (name of file with fasta sequence)


It will create a file whose name is taken from the fasta file name:
for example, if the fasta filename is Contig1.fasta, then the phd file
will be called Contig1.phd.1.  The sequence name inside the fasta file
is ignored.
You can then put this in the phd_dir, and reassemble using the
phredPhrap script.
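
If a script of yours needs to predict the phd filename, the naming
rule above can be sketched in Python like this.  The function name is
hypothetical, and how fasta2Phd.perl treats filenames with several
dots is an assumption that this sketch does not verify.

```python
import os


def phd_name_for_fasta(fasta_path, version=1):
    """Derive the phd filename from the fasta filename (not its contents)."""
    base = os.path.basename(fasta_path)
    # Assumption: only the last extension (e.g. ".fasta") is dropped.
    stem, _extension = os.path.splitext(base)
    return "%s.phd.%d" % (stem, version)


print(phd_name_for_fasta("Contig1.fasta"))
```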


74)  WHY ARE ALL THE READS NOT IN THE ASSEMBLY?

You will notice that there are some contigs that contain only one
read.  You will also notice that there are some reads that are not
shown by consed at all, since phrap did not put them into the ace
file.  Why?

If a read does not have a significant match (with Smith-Waterman score
exceeding minscore) to any other read, that read is not included in
the ace file.  Instead, that read is put in the '.singlets' file.
That read will not appear in consed.

If a read does have a significant match to any other read, then it
will appear in the ace file and be shown by consed.  However, such a
read might have other problems that make it impossible to assemble
with other reads.  In the case of EST's, this read may be a unique
representative of a particular gene (or a genomic sequence
contaminant) that happens to contain an Alu repeat and thus happens to
match other reads in the data set; or it may represent the only read
of a particular alternatively spliced form; or it may have data
anomalies of some sort (chimeras, etc.).  Such a read would end up in
a contig all of its own.


75) VIEWING THE CHROMATOGRAM OF SINGLETS OR NON-ASSEMBLED READS


If you have a chromatogram, you can use consed to view it, even if it
hasn't been assembled into the ace file.  This is common with cDNA
assemblies in which the reads don't overlap and thus phrap doesn't put 
them together into a contig.

To do this, make the same edit_dir, phd_dir, and chromat_dir as above,
put the chromatogram into chromat_dir, and run phred on it to generate
the phd file, which goes into phd_dir.

Then go to edit_dir and run:

phd2Ace.perl (name of phd file)

For example, if your phd file is myRead.phd.1, then from edit_dir,
type:

phd2Ace.perl myRead.phd.1

This will produce myRead.ace

Then just start consed normally:
consed -ace myRead.ace
and you can view the chromatogram.


MULTIPLE TRACE POPUP

76) Bring up dataset standard.  In the aligned reads window, scroll to
a region that has many reads and that has some discrepancies--try
position 1162.  Hold down the shift key, and click with the middle
mouse button on the consensus.  At this location 3 traces will
pop up--these are the 2 highest quality traces that agree with the
consensus (one on each strand) and the highest quality trace that
disagrees with the consensus.  This feature is useful in areas of high
coverage when you want to rapidly examine just the most significant
traces rather than looking at all of them.


MAXIMUM NUMBER OF TRACES DISPLAYED

77) Bring up dataset standard.  Scroll to position 1162.  Bring up 4
reads and then try bringing up additional reads.  You will notice that
new reads are put at the top of the stack of traces and, once there
are 4 traces displayed, traces are automatically removed from the
bottom of the stack.  If you want to change this maximum number of
traces to something besides 4, you can do that: In the Main Consed
Window (click on 'Find Main Win' on the aligned reads window), pull
down the 'Options' menu, and release on 'General Preferences'.  Try
changing the 'Max Number of Traces Shown' to 3.  Then click 'Apply and
Dismiss'.  Now dismiss the Trace Window and again start adding
additional traces to the trace window.  You will notice that now the
number of traces shown will not exceed 3.

SEEING THE NUMERIC VALUE OF QUALITY

78) Click on a base of one of the reads.  Look in the xterm (the window
from which you started consed--you may have to move consed's windows
out of the way to see it).  You will see 'quality = ' the numeric
value of the quality and 'cons pos = ' the consensus position.  Click
on the consensus base.  You will similarly see its quality.  There are
situations in which you really want to see the numeric value of the
quality, rather than just the greyscale background.


HOTKEYS FOR EDITING

79)  If you do a lot of editing, you will want to have a faster method
of doing these edits than having the popup and selecting an option.
Thus the following hot keys exist:


    < and > (less than and greater than) to make n's to the left
        and the right (respectively) of the cursor
    control-l and control-r to make low quality to the left and
        the right (respectively) of the cursor
    overstriking with a capital letter (e.g., C instead of c) causes
        the base to become high quality rather than low quality
    overstriking with a lower case letter causes the base to become
        low quality

Give these a try.

80) Now go to the menu labelled 'color', and pulldown and release on
'color means match'.

Now you will notice different colors.  The colors have the following
meaning:

    Blue:   agrees with consensus
    Orange: disagrees with consensus
    Yellow: this stretch of this read was used to form the consensus
    Grey:   Low quality or unaligned ends of reads 

Now go back to the colormode 'color means quality and tags' (the
default) for the next exercise. 

(The other colormodes will mean more to you later.)


ALPHABETICAL ORDERING OF READS

81)  The reads can be ordered in two ways:

	a) alphabetically
	b) first all the top strand reads and then all the bottom
		strand reads.  The top strand reads are then ordered
		by the left end of the reads.  Same with the bottom
		strand reads.

Try changing between a) and b).  In the Main Consed Window (click on
'Find Main Win' on the aligned reads window if you can't find the Main
Consed Window because it is covered up with other windows), pull down
the 'Options' menu, and release on 'General Preferences'.  Find
'Display reads sorted alphabetically or by strand/left end of read.'
Switch it between 'alpha' and 'strand'.  Then click 'Apply and
Dismiss'.  Notice the effect in the aligned reads window.  Many
polymorphism and mutation detection labs find that alphabetical
sorting is most useful, while many genomic sequencing labs find that
sorting by strand/left end of read is most useful.
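
The two orderings amount to two different sort keys.  The following
Python sketch illustrates the idea only and is not consed's own code.
The Read tuple, its field names, and the sample reads are all
hypothetical.

```python
from collections import namedtuple

# strand is "top" or "bottom"; left_end is the left end position of the read
Read = namedtuple("Read", ["name", "strand", "left_end"])


def sort_alphabetically(reads):
    """Ordering a): by read name."""
    return sorted(reads, key=lambda r: r.name)


def sort_by_strand_and_left_end(reads):
    """Ordering b): top strand reads first, each group by left end."""
    return sorted(reads, key=lambda r: (0 if r.strand == "top" else 1, r.left_end))


reads = [
    Read("readC", "bottom", 50),
    Read("readA", "top", 300),
    Read("readB", "top", 10),
    Read("readD", "bottom", 5),
]
print([r.name for r in sort_alphabetically(reads)])
print([r.name for r in sort_by_strand_and_left_end(reads)])
```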


SCROLLING TRACES INDEPENDENTLY

82) Dismiss all of your trace windows.  Then popup traces for 2
different reads in approximately the same location.  Scroll one of
them.  You may want to scroll by clicking the arrows or clicking to
the left or right of the thumb.  You will notice that both will
scroll.  Consed will do its best to have corresponding peaks lined up.
(Consed can't line all of them up because the peak spacing is not
uniform and differs from read to read.)  Try removing a trace by
clicking on one of the 'Remove' buttons in the Trace Window.  Try
adding other traces.  Then click on 'No' for scrolling the traces
together and try scrolling.  You will now observe that they scroll
separately.


----------------------------------------------------------------------------

WHAT IS NEW IN CONSED 8.0


This section is mainly intended for advanced consed users.  Novice
users should consult the Quick Tour which is provided by clicking
'help' in consed or in the README.txt file downloaded with consed.

For more information, consult the README.txt file.

---------------------------------------------------------------------
Autofinish Improvements 

    Autofinish has now proven itself--it is successfully in use in the
    Genome Center in Seattle.  It is planned for installation at
    several other major sites around the country.  It has allowed the
    same human finishers to handle many times as many BACs in the same
    length of time as without autofinish.  Some BACs are completely
    finished by autofinish and submitted without any human decisions
    and without any editing.
    
    Here are the latest improvements to consed/autofinish:

    Single subclone regions (region covered by a single template) are
    now covered.

    Templates are extensively checked:

        Vector is detected and thus the actual starting/ending
        locations of the insert are detected.  This prevents walking
        into vector.

        Must not have an unaligned high quality region (longer than a
        threshold) nor high quality discrepancies.  This helps with
        locating templates that are misassembled or have deletions.
        
        Walking reads and whole clone reads are recognized (assuming
        you correctly modified the script determineReadTypes.perl) and
        thus do not indicate the start of a template.  This also
        prevents walking into vector.
        
        All existing reads from this template are checked for
        consistency. This allows consed to find misassemblies,
        tracking errors, and mislabelled subclones, thus decreasing
        the failure rate due to picking the wrong template.  This
        information also helps in determining the insert size.

    Tries to close gaps by walking and by resequencing with universal
    primer terminator reads.  Gap closing experiments must extend into
    a gap a minimum number of bases to be considered (by default, this
    number is 30).

    Tries to flank gaps by calling universal primer reverses.

    Clone ends (BAC, cosmid) are detected and autofinish will not
    extend into them.

    You can turn off particular types of reactions (such as BAC
    sequencing reactions) if you don't want autofinish to call them.

    Previously, consed/autofinish had to be run from someone's monitor
    since it needed to open a display.  Now you can run
    consed/autofinish from a batch job (typically the same job that
    runs phrap) without it being on anyone's terminal.

    Reports inconsistent fwd/rev read pairs.

    Contigs are excluded if their depth of coverage is out of line
    (likely contamination).  "Out of line" means more than twice the
    depth of coverage of the largest contig.
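
    The "out of line" rule can be written down as a short Python
    sketch.  This only illustrates the rule as stated and is not
    autofinish's code.  The function name is hypothetical, the sample
    numbers are made up, and the sketch assumes "largest" means the
    contig with the most bases.

```python
def contigs_out_of_line(contigs):
    """Return names of contigs with more than twice the depth of the largest.

    contigs maps a contig name to a (length_in_bases, depth_of_coverage) pair.
    """
    # Assumption: the "largest" contig is the one with the greatest length.
    largest = max(contigs, key=lambda name: contigs[name][0])
    reference_depth = contigs[largest][1]
    return sorted(
        name
        for name, (_length, depth) in contigs.items()
        if depth > 2 * reference_depth
    )


example = {
    "Contig1": (150000, 8.0),
    "Contig2": (3000, 20.0),
    "Contig3": (5000, 15.0),
}
print(contigs_out_of_line(example))
```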

---------------------------------------------------------------------
Tear (split) a contig


Tell Phrap Not To Overlap Reads Discrepant At This Location
     Has been made more powerful

Add new reads
    If a read doesn't match well enough to go into the
    assembly, it is put into a contig by itself.  (This is an option.)

---------------------------------------------------------------------
Features of particular interest to people doing polymorphism detection
and/or cDNA assemblies:

Integration with POLYPHRED
    You can now bring up ALL traces at once (in a scrolling window) at a
    particular location.  You can also have this feature on when
    navigating to consensus locations.  Since POLYPHRED tags the
    consensus with polymorphism tags, you can navigate to those
    locations and conveniently determine if each site is a real
    polymorphism.

Reads can be put into alphabetical order
    This is in addition to the sorting based on top strand/bottom
    strand and left clone end position.

Show Protein Translation 
     You can see this (if you like) in the Aligned
     Reads Window in all 6 reading frames

Find Open Reading Frames
        
---------------------------------------------------------------------
Add read name to a file with options
    Saves your last used options

Windows raise when you want to see them
    When you navigate to a location, the window raises.
    When you use Compare Contigs and click on the second contig, the 
         Compare Contigs Window raises to the top.

Exporting part of the consensus and exporting quality
   Consed can write a part of the consensus, rather than the whole
   consensus.  It can write just the bases, or it can write both the
   bases and the quality values.  
     
Aesthetic improvements for very large assemblies (over 10,000 reads)

---------------------------------------------------------------------
For programmers only:

Whole read items are now implemented
    Users can see these tags by clicking in the Aligned Reads Window
    on the read name with the right mouse button.

Comment read tags are allowed in the ace file (RT tags)

Consed parameters
    Now it is much easier to set/change consed resources.  If you make
    a typo, consed will tell you.  You can set project-specific
    resources by putting a file .consedrc into the same directory with
    the ace file.  You can also set system-wide consed resources with
    the environment variable CONSED_PARAMETERS.  You can set
    user-specific resources in the file ~/.consedrc.  You no longer
    have to do xrdb -remove.

Consed is now tolerant of missing or corrupted phd files
    Consed handles the missing phd file by reporting the error, making
    the read all quality 0 (dark), and not allowing you to pop up
    the trace or edit the read.


Consed DEC alpha users:  
    Type: uname -sr 
    and see what it says.  If anyone is still on the old OSF1 V3.2, let
    me know since I am considering dropping support for it.  If it
    says OSF1 V4.0, don't worry--I'll continue to support that.  You
    have to rev up to at least V4.0 for Y2K compliance.

Consed Sun users: 
   Type: uname -sr 
   and see what it says.  If anyone is still on the old 'SunOS 5.4',
   consed will still probably work, but I'm not guaranteeing anything.
   If it says 'SunOS 5.5.1' or 'SunOS 5.6' or 'SunOS 5.7', etc., don't
   worry--I'll continue to support those.

Consed has been thoroughly tested by many, many users and all
reproducible bugs have been fixed.  But if you can find and reproduce
one, let me know.