Compiled by Sherwood Casjens, Daniel Haft, Jeremy Peterson, Brian Stevenson and Claire Fraser
Last modified on Feb. 3, 2000.
This document is also available as a Macintosh Microsoft WORD 5.1 or OFFICE 98 (WORD98) document from Sherwood Casjens.
Please send corrections, additions, comments, etc. to sherwood.casjens@hci.utah.edu.
Table of Contents
The Pseudo-, Questionable, and Short Genes on the B31 Plasmids
Ambiguous Nucleotides in the B. burgdorferi B31 Genome Sequence
PURPOSE
This document contains a number of tables which cross-annotate the current
knowledge of the B. burgdorferi B31 genome in various ways. We hope that
this cross-referencing will allow readers to browse through the information
profitably, and that it will allow them to become familiar with what is not
known as well as what is known about this genome. Major conclusions from this
analysis are published in Fraser et al. (1997) and Casjens et al. (2000).
ORGANIZATION
In each section of this document plasmids are listed with circular plasmids ascending in number (approximate size) followed by linear plasmids ascending in number as follows:
cp9, cp26, cp32-1, cp32-3, cp32-4, cp32-6, cp32-7, cp32-8, cp32-9,
lp5, lp17, lp21, lp25, lp28-1, lp28-2, lp28-3, lp28-4, lp36, lp38, lp54,
lp56
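The listing convention described above can be written as a sort key. The following is a minimal Python sketch, not part of the original analysis: the function name `plasmid_sort_key` is hypothetical, and it assumes every plasmid name consists of "cp" or "lp", a number, and an optional "-serial" suffix.

```python
import re

# Order in which the plasmids are listed in this document.
LISTED_ORDER = [
    "cp9", "cp26", "cp32-1", "cp32-3", "cp32-4", "cp32-6", "cp32-7",
    "cp32-8", "cp32-9", "lp5", "lp17", "lp21", "lp25", "lp28-1", "lp28-2",
    "lp28-3", "lp28-4", "lp36", "lp38", "lp54", "lp56",
]

def plasmid_sort_key(name):
    """Return a key that sorts circular plasmids before linear ones,
    then by the number in the name, then by the serial suffix."""
    match = re.fullmatch(r"(cp|lp)(\d+)(?:-(\d+))?", name)
    if match is None:
        raise ValueError(f"unrecognized plasmid name: {name!r}")
    topology, size, serial = match.groups()
    topology_rank = 0 if topology == "cp" else 1
    return (topology_rank, int(size), int(serial or 0))

if __name__ == "__main__":
    # Sorting any permutation of the names recovers the listed order.
    print(sorted(reversed(LISTED_ORDER), key=plasmid_sort_key))
```

Comparing the numeric parts as integers rather than strings is what keeps lp5 ahead of lp17.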
OPEN READING FRAMES and PREDICTED GENES
Throughout this document we use the words "gene"
and "protein" advisedly to mean a putative gene and a putative
protein that have been predicted from the nucleotide sequence. Since little
molecular biology has been done with these organisms, nearly all of the "genes"
in this document are currently only identified as open reading frames.
GENBANK ACCESSION NUMBERS and GENE NAME PREFIXES
The B. burgdorferi B31 chromosome and plasmid sequences are available at the TIGR Borrelia web site or from GENBANK. The accession numbers from GENBANK and gene name prefixes are as follows (as reported in Fraser et al. (1997) and Casjens et al. (2000)):
Replicon | Accession # | Gene name prefix
Chromosome | AE000783 | BB0 (BBzero)
cp9 | AE000791 | BBC
cp26 | AE000792 | BBB
cp32-1 | AE001575 | BBP
cp32-3 | AE001576 | BBS
cp32-4 | AE001577 | BBR
cp32-6 | AE001578 | BBM
cp32-7 | AE001579 | BBO (BB"oh")
cp32-8 | AE001580 | BBL
cp32-9 | AE001581 | BBN
lp5 | AE001583 | BBT
lp17 | AE000793 | BBD
lp21 | AE001582 | BBU
lp25 | AE000785 | BBE
lp28-1 | AE000794 | BBF
lp28-2 | AE000786 | BBG
lp28-3 | AE000784 | BBH
lp28-4 | AE000789 | BBI
lp36 | AE000788 | BBK
lp38 | AE000787 | BBJ
lp54 | AE000790 | BBA
lp56 | AE001584 | BBQ
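Because the first three characters of a gene name identify its replicon, the prefix table can be used as a lookup. The following is a minimal Python sketch under that assumption; the names `PREFIX_TO_REPLICON` and `replicon_for_gene` are hypothetical and are not taken from the TIGR site or from the publications cited above.

```python
# Gene name prefix -> replicon, transcribed from the table above.
# Note that "BB0" (zero, chromosome) and "BBO" (letter O, cp32-7) differ.
PREFIX_TO_REPLICON = {
    "BB0": "Chromosome",
    "BBC": "cp9", "BBB": "cp26",
    "BBP": "cp32-1", "BBS": "cp32-3", "BBR": "cp32-4", "BBM": "cp32-6",
    "BBO": "cp32-7", "BBL": "cp32-8", "BBN": "cp32-9",
    "BBT": "lp5", "BBD": "lp17", "BBU": "lp21", "BBE": "lp25",
    "BBF": "lp28-1", "BBG": "lp28-2", "BBH": "lp28-3", "BBI": "lp28-4",
    "BBK": "lp36", "BBJ": "lp38", "BBA": "lp54", "BBQ": "lp56",
}

def replicon_for_gene(gene_name):
    """Return the replicon for a gene name such as 'BBP27'.

    Surrounding parentheses, as in '(BBC09)', are ignored. This assumes
    the prefix is always the first three characters of the name.
    """
    prefix = gene_name.strip().strip("()")[:3].upper()
    try:
        return PREFIX_TO_REPLICON[prefix]
    except KeyError:
        raise ValueError(f"unknown gene name prefix in {gene_name!r}") from None

if __name__ == "__main__":
    for name in ("BBP27", "BBO27", "BB0001", "(BBC09)"):
        print(name, "->", replicon_for_gene(name))
```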
B31 Plasmid Open Reading Frame Summary
Sherwood Casjens - 1999
ALL B31 PLASMIDS
898 total gene-like entities. Among these gene-like entities are the following:
836 genes (which are not "questionable") + pseudogenes
167 pseudogenes (+ about 10 others that have marginal similarity to "intact" genes)
62 "questionable" genes (29 in-frame fragments of larger pseudogenes; 33 ≤300 bp genes inside a larger pseudogene in another frame and short genes that were not called in paralogous sequence elsewhere on the plasmids).
669 "intact" genes (which are not "questionable")
39 convincing similarity hits to genes of known function outside of Borrelia among plasmid genes
16 convincing similarity hits to genes of unknown function outside of Borrelia among plasmid genes
535 "intact" genes >300 bp (which are not "questionable")
134 "intact" genes ≤300 bp (which are not "questionable")
472 genes (which are not "questionable") have a paralog (it may not be intact)
197 genes (which are not "questionable") have no paralog (63 of these are >300 bp and 134 are ≤300 bp)
98 plasmid gene-like entities that encode potential lipoproteins
90 intact plasmid genes that encode potential lipoproteins
7 gene-like entities that we defined as pseudogenes have translation start codons that could possibly lead to expression of lipoproteins that are truncated relative to their paralogs
32 intact plasmid genes that are below but close to our lipidation cutoff
162 paralogous gene families, 107 of which have plasmid-borne members
9 paralogous gene families encode only predicted lipoproteins
17 paralogous gene families are heterogeneous in that at least one potential lipoprotein (LP) and at least one non-LP are found in the family
THE "LOW PSEUDOGENE" or "WELL BEHAVED" B31 PLASMIDS
These plasmids are: cp9, cp26, all seven of the cp32s, lp28-2, lp54 and the cp32-like portion of lp56
498 gene-like entities on the "well behaved" plasmids on which apparent protein-encoding genes occupy >70% of the DNA.
9 "questionable" genes (all are ≤300 bp genes inside a larger pseudogene in another frame or short genes that were not called in paralogous sequence elsewhere on the plasmids).
489 genes (which are not "questionable") + pseudogenes
22 pseudogenes
467 genes (which are not "questionable")
420 genes >300 bp (which are not "questionable")
47 genes ≤300 bp (which are not "questionable")
54 genes that encode potential lipoproteins
12 genes that are below but close to our lipidation cutoff
23 convincing matches to genes of known function outside of Borrelia among plasmid genes (which are not "questionable")
13 convincing matches to genes of unknown function outside of Borrelia among plasmid genes (which are not "questionable")
THE "HIGH PSEUDOGENE" or "NOT YET AMELIORATED" B31 PLASMIDS
These plasmids are: lp5, lp17, lp21, lp25, lp28-1, lp28-3, lp28-4, lp36, lp38 and the non-cp32-like portion of lp56
400 gene-like entities on the "bad" plasmids on which apparent protein-encoding genes occupy <75% of the DNA.
53 "questionable" genes (29 in-frame fragments of larger pseudogenes; 24 ≤300 bp genes inside a larger pseudogene in another frame and short genes that were not called in paralogous sequence elsewhere on the plasmids).
347 genes (which are not "questionable") + pseudogenes
145 pseudogenes
202 genes (which are not "questionable")
115 genes >300 bp (which are not "questionable")
87 genes ≤300 bp (which are not "questionable")
37 genes that encode potential lipoproteins
5 genes that are below but close to our lipidation cutoff
16 convincing matches to genes of known function outside of Borrelia among plasmid genes (which are not "questionable")
3 convincing matches to genes of unknown function outside of Borrelia among plasmid genes (which are not "questionable")
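The tallies in the three summaries above are related by simple arithmetic: gene-like entities minus "questionable" genes minus pseudogenes should equal the genes that are neither, and the two plasmid groups should add up to the all-plasmid totals. The following minimal Python sketch checks those relations. The dictionary keys and function names are hypothetical labels chosen for illustration, and only the categories listed here are checked.

```python
# Counts transcribed from the three summaries above.
ALL_PLASMIDS = {"entities": 898, "questionable": 62, "pseudogenes": 167,
                "intact": 669, "intact_long": 535, "intact_short": 134}
LOW_PSEUDOGENE = {"entities": 498, "questionable": 9, "pseudogenes": 22,
                  "intact": 467, "intact_long": 420, "intact_short": 47}
HIGH_PSEUDOGENE = {"entities": 400, "questionable": 53, "pseudogenes": 145,
                   "intact": 202, "intact_long": 115, "intact_short": 87}

def internally_consistent(tally):
    """Check the subtraction and the size split within one summary."""
    remaining = tally["entities"] - tally["questionable"] - tally["pseudogenes"]
    size_split = tally["intact_long"] + tally["intact_short"]
    return remaining == tally["intact"] and size_split == tally["intact"]

def groups_sum_to_total(total, groups):
    """Check that each category summed over the groups equals the total."""
    return all(sum(group[key] for group in groups) == total[key]
               for key in total)

if __name__ == "__main__":
    for label, tally in (("all", ALL_PLASMIDS),
                         ("low pseudogene", LOW_PSEUDOGENE),
                         ("high pseudogene", HIGH_PSEUDOGENE)):
        print(label, internally_consistent(tally))
    print("groups sum to total:",
          groups_sum_to_total(ALL_PLASMIDS, [LOW_PSEUDOGENE, HIGH_PSEUDOGENE]))
```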
Annotated B. burgdorferi B31 Plasmid Gene List
Compiled by Sherwood Casjens, Dan Haft and Jeremy Peterson - April 1999
Definitions for Gene List
Note that these definitions are NOT necessarily absolutely identical to those used in the other published gene lists and maps for B. burgdorferi or on the TIGR WEB site. In particular we have an expanded definition of "pseudogene" that includes truncated members of paralogous gene families.
Putative genes and gene names column lists all the putative "gene-like entities" - genes and pseudogenes - currently recognized in the twenty-one B. burgdorferi B31 plasmids. We tentatively interpret those genes not indicated to be pseudogenes to be intact and potentially functional, but since the functionality of most Borrelia genes is unknown this may not be true. The gene and plasmid names used here are those used in Fraser et al. (1997) and Casjens et al. (2000). Of course any given putative pseudo-, questionable, short, fragmented or frameshifted gene could in principle have an important function, but it seems likely that a substantial fraction of them are not functional.
Daggers (†) mark computer-recognized ORFs that are in-frame parts of a larger pseudogene entity. To avoid counting the entity twice, these were ignored when compiling gene and pseudogene numbers in Casjens et al. (2000).
Coordinates - these columns list the positions of the 5' and 3' ends of the gene or pseudogene on the sequence of the relevant plasmid.
Database hit outside Borrelia indicates all similarities to non-Borrelia sequences in the extant database as of January 1999. The criteria for inclusion in the list are those of the TIGR protocol, which uses BLAST (Altschul et al., 1997), and alignments can be found on the TIGR Borrelia WEB page. A search using EMOTIF (Nevill-Manning et al., 1998) did not find any additional convincing B31 plasmid gene similarities to previously known genes.
Common name column gives gene names previously used in the literature. If it was previously named in a strain other than B31, the Borrelia strain is given in parentheses. In addition, we and others have suggested more specific, clarifying common names for genes currently under study in the following paralogous families: mlp [family 113], bdr [80], rev [63] and erp [162/163/164] genes.
Paralog family column indicates the family of paralogous genes (homologs within B. burgdorferi B31) to which individual genes belong. A complete list of genes and pseudogenes in each of these paralogous gene families can be found in PART II of this document.
Comments Column
N-terminal lipidation consensus refers to genes whose products are most likely to be lipoproteins.
Near-consensus N-terminal lipidation signal refers to genes whose products may be lipoproteins, but whose N-terminal amino acid sequences did not quite meet the arbitrary cutoff that we set for inclusion in the "probable lipoproteins" category.
See PART III of this document for a discussion of the strategies used to identify possible lipoprotein-encoding genes.
Authentic frameshift genes contain one or a few simple frameshifts relative to their paralogs. It is unlikely that these are actually expressed by programmed frameshifting mechanisms, since they usually do not contain the expected translationally "slippery" sequences. The TIGR computer uses this term for damaged genes (hence it currently replaces "pseudogene" in some parts of the TIGR Borrelia web page). These are considered to be pseudogenes in this analysis (Casjens et al., 2000).
Authentic point mutation genes have an in-frame stop codon relative to their paralogs. These are considered to be pseudogenes in this analysis.
Gene fragments or truncated genes are substantially shorter than other members of their paralogous families. Some of these could be expressed and have a function, although they are included in the pseudogene category for ease of discussion in this analysis and to point out that they are truncated.
Pseudogenes are regions of DNA that are similar in sequence to a paralogous Borrelia gene or to a gene from another organism, but which are obviously truncated and/or do not have full open reading frames relative to those homologs. These mostly appear to be mutationally damaged genes - they include "authentic frameshift", "authentic point mutation", fused and truncated genes. These pseudogenes often contain multiple frameshifts, deletions, insertions and inversions (see Casjens et al., 2000).
Exceptions to this definition of a pseudogene are the 15 silent vlsE cassettes on lp28-1; these are not damaged but are apparently "designed" to be a reservoir of antigenic variation for the vlsE protein. They are pseudogenes in that they are incomplete relative to the expressed vlsE gene and are probably not expressed themselves.
Of course the gene fragments whose reading frames are intact, that we include in this category for ease of discussion, could in fact be expressed and if so could perform a function. Nonetheless such fragments are very unusual in prokaryotes, and given the other evidence for many rearrangements in the B31 plasmids (Casjens et al., 2000) it seems likely that many, if not all of such fragments, may no longer have a biological function.
See PART IV of this document for a complete list of pseudogenes and the reasons why each is so classified.
"Questionable genes" were called by TIGR's standard gene recognition protocol, but there is reason to suspect they may be spurious calls. Examples are "computer-called genes" that are inside another gene or pseudogene, and small genes that were not called in paralogous sequence elsewhere in the Borrelia sequence. Those marked with daggers (†) are inside larger pseudogenes but were nonetheless called as genes by the TIGR protocol.
See PART IV of this document for a complete list of questionable genes and the reasons why each is so classified.
Short genes are <300 bp in length but ARE NOT in the "questionable" or "pseudogene" categories. The Borrelia plasmids have an inordinately large fraction of called genes that are <300 bp in length. These are often not tightly packed and fall into regions that contain no larger genes. Of course any given putative short gene could in principle be functional, but it seems likely that a substantial fraction of them are not functional.
See PART IV of this document for a complete list of short plasmid "genes".
Putative functions were deduced in most cases from homologies to genes of known function.
WE EMPHASIZE ONE MORE TIME! Any given putative pseudo-, questionable, short, fragmented or frameshifted gene (as we have defined them) could in principle be functional. But it seems likely that a substantial fraction of them are not functional. We use the above pseudogene definitions only as terms to describe relevant features of the B31 plasmid genes, not to imply functionality in any specific cases.
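The coordinate and short-gene definitions above can be illustrated with a small calculation. The following is a minimal Python sketch, not the TIGR protocol. It assumes that both coordinates are inclusive 1-based positions, that a 5' end larger than the 3' end means the gene lies on the reverse strand, and that no gene spans the origin of a circular plasmid. The names `GeneSpan` and `describe_span` are hypothetical, and the 300 bp cutoff is a parameter because it is only a working threshold.

```python
from typing import NamedTuple

class GeneSpan(NamedTuple):
    length_bp: int   # number of base pairs covered, both ends included
    strand: str      # "+" if 5' end <= 3' end, otherwise "-"
    is_short: bool   # True if shorter than the cutoff

def describe_span(five_prime, three_prime, short_cutoff_bp=300):
    """Derive length, strand and the short-gene flag from the two ends."""
    length = abs(three_prime - five_prime) + 1
    strand = "+" if three_prime >= five_prime else "-"
    return GeneSpan(length, strand, length < short_cutoff_bp)

if __name__ == "__main__":
    # Coordinates taken from the cp9 rows of the gene list.
    print("BBC01", describe_span(163, 1269))
    print("BBC04", describe_span(2700, 2593))
```

With these assumptions BBC04 (2700 to 2593) comes out as a 108 bp gene on the reverse strand, which agrees with its "short gene" comment in the list.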
A Complete B. burgdorferi B31 Plasmid Gene List
|
Putative Gene |
5' end |
3' end |
Database hit outside Borrelia {organism of best database hit} |
Common Name |
Paralog Family |
Comments/References |
|
cp9 |
A homolog of cp9, called cp8.3, from B. garinii strain Ip21 was completely sequenced by Dunn et al. (1994) |
|||||
|
BBC01 |
163 |
1269 |
57 |
|||
|
BBC02 |
1282 |
1836 |
50 |
|||
|
BBC03 |
1892 |
2449 |
49 |
|||
|
BBC04 |
2700 |
2593 |
short gene |
|||
|
BBC05 |
2804 |
3709 |
161 |
|||
|
BBC06 |
4377 |
3856 |
eppA |
95 |
exported protein (Champion et al., 1994) |
|
|
BBC07 |
4788 |
4507 |
short gene |
|||
|
BBC08 |
5534 |
5977 |
55 |
|||
|
(BBC09) |
Does not exist; erroneously present in original gene list and map in figure 2 of Fraser et al. (1997) |
|||||
|
BBC10 |
6808 |
6284 |
63 |
N-terminal lipidation consensus |
||
|
BBC11 |
6974 |
7768 |
96 |
|||
|
BBC12 |
9203 |
7914 |
165 |
|||
|
cp26 |
Homolog of cp26 present in essentially all isolates (e.g., Tilly et al., 1997) |
|||||
|
BBB01 |
16 |
321 |
conserved hypothetical protein {Escherichia coli} |
weak similarity to acylphosphatase |
||
|
BBB02 |
751 |
311 |
||||
|
BBB03 |
2186 |
840 |
weak (PSI-BLAST) similarity to phage N15 gene 29 |
The protein encoded by this gene has weak similarity to the putative "protelomerase" encoded by gene 29 of phage N15 (Ravin et al., in preparation). Circumstantial evidence suggests this N15 protein is responsible for hairpin end formation in the N15 prophage plasmid. |
||
|
BBB04 |
3807 |
2479 |
PTS system, cellobiose-specific IIC component (celB) {Bacillus stearothermophilus} |
possible chitobiose transporter (Fraser et al., 1997) |
||
|
BBB05 |
4084 |
4428 |
PTS system, cellobiose-specific IIA component (celC) {Bacillus subtilis} |
possible chitobiose transporter (Fraser et al., 1997) |
||
|
BBB06 |
4440 |
4754 |
PTS system, cellobiose-specific IIB component (celA) {Bacillus subtilis} |
possible chitobiose transporter (Fraser et al., 1997) |
||
|
BBB07 |
4769 |
5863 |
||||
|
BBB08 |
6517 |
5891 |
N-terminal lipidation consensus |
|||
|
BBB09 |
6677 |
7711 |
N-terminal lipidation consensus |
|||
|
BBB10 |
7836 |
8762 |
62 |
|||
|
BBB11 |
8781 |
9296 |
50 |
|||
|
BBB12 |
9275 |
10033 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBB13 |
10104 |
10649 |
49 |
|||
|
BBB14 |
11417 |
10923 |
N-terminal lipidation consensus |
|||
|
BBB15 |
11636 |
11737 |
short gene |
|||
|
BBB16 |
12014 |
13603 |
oligopeptide ABC transporter, periplasmic oligopeptide-binding protein {Escherichia coli} |
oppAIV |
37 |
N-terminal lipidation consensus, not surface exposed, and not essential in culture (Bono et al., 1998) |
|
BBB17 |
15107 |
13896 |
IMP dehydrogenase {Haemophilus influenzae} |
guaB |
IMP dehydrogenase (Margolis et al., 1994b; Zhou et al., 1997) |
|
|
BBB18 |
16718 |
15135 |
GMP synthase {Haemophilus influenzae} |
guaA |
putative GMP synthase (Margolis et al., 1994b); erroneous duplication in cp26 between BBB18 and BBB19 corrected in current gene list; this affected the originally released gene coordinates to the right of BBB18 |
|
|
BBB19 |
16903 |
17532 |
ospC |
surface localized (Wilske et al., 1993), N-terminal lipidation consensus (Fuchs et al., 1992; Jauris-Heipke et al., 1993; Jauris-Heipke et al., 1995; Marconi et al., 1993c; Margolis et al., 1994a; Margolis et al., 1994b; Masuzawa et al., 1997; Stevenson and Barthold, 1994; Stevenson et al., 1994; Tilly et al., 1997; Wang et al., 1999; Wilske et al., 1996a; Wilske et al., 1996b); transcription start site (Marconi et al., 1993b); temperature regulation (Schwan et al., 1995; Stevenson et al., 1995) |
||
|
BBB20 |
17733 |
17626 |
short gene |
|||
|
BBB21 |
17750 |
17842 |
short gene |
|||
|
BBB22 |
19321 |
17969 |
conserved hypothetical protein MJ0326 {Methanococcus jannaschii} |
94 |
12 putative membrane spanning regions; homologs in E. coli |
|
|
BBB23 |
20822 |
19434 |
conserved hypothetical protein MJ0326 {Methanococcus jannaschii} |
94 |
12 putative membrane spanning regions; homologs in E. coli |
|
|
BBB24 |
21364 |
20861 |
near-consensus N-terminal lipidation signal |
|||
|
BBB25 |
21851 |
21342 |
N-terminal lipidation consensus |
|||
|
BBB26 |
21898 |
22590 |
||||
|
BBB27 |
23154 |
22606 |
N-terminal lipidation consensus |
|||
|
BBB28 |
23255 |
24496 |
||||
|
BBB29 |
24825 |
26450 |
PTS system, maltose and glucose-specific IIABC component (malX) {Escherichia coli} |
16 |
putative sugar transport |
|
|
cp32-1 |
||||||
|
BBP01 |
66 |
1286 |
146 |
|||
|
BBP02 |
1306 |
1995 |
147 |
|||
|
BBP03 |
2011 |
2565 |
148 |
|||
|
BBP04 |
2575 |
3336 |
148 |
|||
|
BBP05 |
3369 |
3938 |
148 |
|||
|
BBP06 |
3948 |
4919 |
149 |
(Casjens et al., 1997) |
||
|
BBP07 |
4936 |
5394 |
150 |
|||
|
BBP08 |
5379 |
5777 |
107 |
|||
|
BBP09 |
5768 |
6154 |
108 |
|||
|
BBP10 |
6154 |
6717 |
151 |
|||
|
BBP11 |
6701 |
7810 |
152 |
|||
|
BBP12 |
7828 |
8253 |
153 |
|||
|
BBP13 |
8272 |
8724 |
154 |
|||
|
BBP14 |
8724 |
8957 |
155 |
short gene |
||
|
BBP15 |
8968 |
10239 |
156 |
|||
|
BBP16 |
10265 |
10945 |
157 |
|||
|
BBP17 |
10952 |
11899 |
159 |
|||
|
BBP18 |
11920 |
12462 |
160 |
|||
|
BBP19 |
12495 |
12824 |
139 |
|||
|
BBP20 |
12824 |
13696 |
140 |
|||
|
BBP21 |
13709 |
14311 |
141 |
|||
|
BBP22 |
14324 |
15136 |
142 |
|||
|
BBP23 |
15215 |
15415 |
orfA-1; blyA-1 |
109 |
putative hemolysin; short gene; sequenced for homologous plasmids in strain 297 by Porcella et al. (1996) |
|
|
BBP24 |
15422 |
15766 |
orfB; blyB-1 |
111 |
putative hemolysin; sequenced for homologous plasmids in strain 297 by Porcella et al. (1996) |
|
|
BBP25 |
15759 |
16091 |
orfC |
112 |
(Gilmore and Mbow, 1998); sequenced in homologous plasmids of strain 297 by Porcella et al. (1996) |
|
|
BBP26 |
16081 |
16437 |
orfD |
143 |
(Gilmore and Mbow, 1998); sequenced in homologous plasmids of strain 297 by Porcella et al. (1996); near-consensus N-terminal lipidation signal but strain 297 homolog was not lipidated in E. coli. |
|
|
BBP27 |
17060 |
16581 |
rev-1 |
63 |
N-terminal lipidation consensus (Gilmore and Mbow, 1998); sequenced in homologous plasmids of strain 297 by Porcella et al. (1996) |
|
|
BBP28 |
17232 |
17675 |
mlpA |
113 |
N-terminal lipidation consensus (Gilmore and Mbow, 1998); sequenced in several homologous plasmids of strain 297 by Porcella et al. (1996); lipidated in E. coli (Porcella et al., 1996); paralog lipidated in B. afzelii (Theisen, 1996) |
|
|
BBP29 |
18728 |
17718 |
orf4-1 |
161 |
(Gilmore and Mbow, 1998) |
|
|
BBP30 |
19114 |
20211 |
orf1-1 |
57 |
(Zuckert and Meyer, 1996) |
|
|
BBP31 |
20224 |
20787 |
orf2-1 |
50 |
(Zuckert and Meyer, 1996) |
|
|
BBP32 |
20766 |
21503 |
plasmid partition protein {Bacillus subtilis} |
orfC-1 |
32 |
putative plasmid partition function (Zuckert and Meyer, 1996) |
|
BBP33 |
21510 |
22115 |
orf3-1 |
49 |
(Zuckert and Meyer, 1996) |
|
|
BBP34 |
22131 |
22760 |
bdrA |
80 |
contains 4.7 repeats of a 54 bp sequence; all "bdr" genes contain direct, tandem repeats (Casjens et al., 1999; Zuckert and Meyer, 1996) |
|
|
BBP35 |
23231 |
24553 |
orf8/7-1 |
165 |
(Casjens et al., 1997; Zuckert and Meyer, 1996) |
|
|
BBP36 |
24609 |
25031 |
orf10-1 |
144 |
(Casjens et al., 1997) |
|
|
BBP37 |
25816 |
25043 |
orf6-1 |
96 |
(Casjens et al., 1997) |
|
|
BBP38 |
26235 |
26765 |
erpA |
162 |
surface exposed (Lam et al., 1994); N-terminal lipidation consensus (Stevenson et al., 1996); lipidated in E. coli (Akins et al., 1995b; Wallich et al., 1995); erp-like genes have been sequenced from several other strains (Akins et al., 1999; Lam et al., 1994; Marconi et al., 1996b; Stevenson et al., 1997; Suk et al., 1995) |
|
|
BBP39 |
26796 |
27929 |
erpB |
163 |
N-terminal lipidation consensus (Stevenson et al., 1996) |
|
|
BBP40 |
28074 |
28652 |
114 |
|||
|
BBP41 |
28835 |
29398 |
115 |
|||
|
BBP42 |
29398 |
30747 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
(Amouriaux et al., 1993; Casjens et al., 1997); phage φO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
|
cp32-3 |
||||||
|
BBS01 |
66 |
1286 |
146 |
|||
|
BBS02 |
1306 |
1995 |
147 |
|||
|
BBS03 |
2011 |
2565 |
148 |
|||
|
BBS04 |
2575 |
3336 |
148 |
|||
|
BBS05 |
3369 |
3938 |
148 |
|||
|
BBS06 |
3963 |
4919 |
149 |
(Casjens et al., 1997) |
||
|
BBS07 |
4936 |
5394 |
150 |
|||
|
BBS08 |
5379 |
5777 |
107 |
|||
|
BBS09 |
5768 |
6154 |
108 |
|||
|
BBS10 |
6154 |
6717 |
151 |
|||
|
BBS11 |
6701 |
7810 |
152 |
|||
|
BBS12 |
7828 |
8253 |
153 |
|||
|
BBS13 |
8272 |
8724 |
154 |
|||
|
BBS14 |
8724 |
8957 |
155 |
short gene |
||
|
BBS15 |
8968 |
10239 |
156 |
|||
|
BBS16 |
10265 |
10945 |
157 |
|||
|
BBS17 |
10952 |
11899 |
159 |
|||
|
BBS18 |
11920 |
12462 |
160 |
|||
|
BBS19 |
12495 |
12824 |
139 |
|||
|
BBS20 |
12824 |
13696 |
140 |
|||
|
BBS21 |
13709 |
14311 |
141 |
|||
|
BBS22 |
14324 |
15133 |
142 |
|||
|
BBS23 |
15212 |
15412 |
blyA-3 |
109 |
putative hemolysin; short gene |
|
|
BBS24 |
15419 |
15763 |
blyB-3 |
111 |
putative hemolysin |
|
|
BBS25 |
15756 |
16088 |
112 |
|||
|
BBS26 |
16078 |
16434 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBS27 |
16586 |
16900 |
||||
|
BBS28 |
16915 |
17046 |
short gene |
|||
|
BBS29 |
17068 |
17694 |
bdrF |
80 |
contains 3.6 repeats of a 33 bp sequence |
|
|
BBS30 |
17803 |
18246 |
mlpC |
113 |
N-terminal lipidation consensus |
|
|
BBS31 |
19159 |
18290 |
orf4-3 |
161 |
(Zuckert and Meyer, 1996) |
|
|
BBS32 |
19198 |
19392 |
conserved hypothetical protein {Chlorella vulgaris}(similarity poor) |
questionable gene; gene not called in paralogous sequence on other cp32s |
||
|
BBS33 |
19605 |
20702 |
orf1-3 |
57 |
(Zuckert and Meyer, 1996) |
|
|
BBS34 |
20715 |
21278 |
50 |
|||
|
BBS35 |
21257 |
21994 |
plasmid partition protein {Bacillus subtilis} |
orfC-3 |
32 |
putative plasmid partition function; (Stevenson et al., 1998b) |
|
BBS36 |
22038 |
22577 |
orf3-3 |
49 |
(Stevenson et al., 1998b) |
|
|
BBS37 |
22593 |
23180 |
bdrE |
80 |
contains 4.1 repeats of a 54 bp sequence |
|
|
BBS38 |
23649 |
25013 |
orf8/7-3 |
165 |
(Casjens et al., 1997) |
|
|
BBS39 |
25069 |
25491 |
orf10-3 |
144 |
(Casjens et al., 1997) |
|
|
BBS40 |
26276 |
25503 |
orf6-3 |
96 |
(Casjens et al., 1997) |
|
|
BBS41 |
26708 |
27295 |
erpG; pG |
164 |
N-terminal lipidation consensus; (Stevenson et al., 1996; Wallich et al., 1995) |
|
|
BBS42 |
27410 |
27916 |
bapA |
95 |
(Stevenson et al., 1996; Wallich et al., 1995) |
|
|
BBS43 |
28067 |
28246 |
short gene |
|||
|
BBS44 |
28236 |
28871 |
115 |
|||
|
BBS45 |
28871 |
30220 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
(Amouriaux et al., 1993; Casjens et al., 1997); phage φO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
|
cp32-4 |
||||||
|
BBR01 |
66 |
1286 |
146 |
|||
|
BBR02 |
1306 |
1998 |
147 |
pseudogene; authentic frameshift |
||
|
BBR03 |
2013 |
2573 |
148 |
|||
|
BBR04 |
2580 |
3344 |
148 |
|||
|
BBR05 |
3340 |
3948 |
orfI |
148 |
(Casjens et al., 1997) |
|
|
BBR06 |
3958 |
4929 |
orfII |
149 |
(Casjens et al., 1997) |
|
|
BBR07 |
4952 |
5404 |
orfIII |
150 |
(Casjens et al., 1997) |
|
|
BBR08 |
5389 |
5787 |
orfIV |
107 |
(Casjens et al., 1997) |
|
|
BBR09 |
5778 |
6164 |
orfV |
108 |
(Casjens et al., 1997) |
|
|
BBR10 |
6164 |
6727 |
151 |
|||
|
BBR11 |
6711 |
7820 |
152 |
|||
|
BBR12 |
7838 |
8263 |
153 |
|||
|
BBR13 |
8282 |
8734 |
154 |
|||
|
BBR14 |
8734 |
8967 |
155 |
short gene |
||
|
BBR15 |
8978 |
10270 |
156 |
|||
|
BBR16 |
10296 |
10889 |
157 |
|||
|
BBR17 |
10896 |
11843 |
159 |
|||
|
BBR18 |
11864 |
12415 |
160 |
|||
|
BBR19 |
12448 |
12777 |
139 |
|||
|
BBR20 |
12777 |
13649 |
140 |
|||
|
BBR21 |
13662 |
14264 |
141 |
|||
|
BBR22 |
14277 |
15089 |
142 |
|||
|
BBR23 |
15167 |
15367 |
blyA-4 |
109 |
putative hemolysin; short gene |
|
|
BBR24 |
15374 |
15718 |
blyB-4 |
111 |
putative hemolysin |
|
|
BBR25 |
15711 |
16043 |
112 |
|||
|
BBR26 |
16033 |
16389 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBR27 |
16467 |
16994 |
bdrH |
80 |
sequenced in homologous plasmids of strain 297 by Porcella et al. (1996) and in B. afzelii by Theisen (1996) |
|
|
BBR28 |
17103 |
17522 |
mlpD |
113 |
N-terminal lipidation consensus |
|
|
BBR29 |
18664 |
17576 |
161 |
|||
|
BBR30 |
18829 |
18737 |
questionable gene; gene not called in paralogous sequence on other cp32s |
|||
|
BBR31 |
18960 |
20054 |
57 |
|||
|
BBR32 |
20067 |
20630 |
50 |
|||
|
BBR33 |
20609 |
21361 |
plasmid partition protein {Bacillus subtilis} |
orfC-4 |
32 |
putative plasmid partition function (Stevenson et al., 1998b) |
|
BBR34 |
21415 |
21957 |
orf3-4 |
49 |
(Stevenson et al., 1998b) |
|
|
BBR35 |
21974 |
22249 |
bdrG |
80 |
authentic point mutation; has an in-frame stop codon |
|
|
BBR36 |
22831 |
24153 |
165 |
|||
|
BBR37 |
24210 |
24632 |
orf10-4 |
144 |
(Casjens et al., 1997) |
|
|
BBR38 |
25435 |
24644 |
orf6-4 |
96 |
(Casjens et al., 1997); sequence from strain N40 - accession # AF011453 |
|
|
BBR39 |
25636 |
25538 |
questionable gene; gene not called in paralogous sequence on other cp32s |
|||
|
BBR40 |
25865 |
25966 |
erpH |
162 |
pseudogene; severely truncated relative to other erps; N-terminal lipidation consensus (Stevenson et al., 1996) |
|
|
BBR41 |
26077 |
26817 |
161/162 |
pseudogene; this is a "fusion" gene - a family [161] gene is fused to a [162] erp gene |
||
|
BBR42 |
26853 |
27524 |
erpY |
164 |
N-terminal lipidation consensus |
|
|
BBR43 |
27634 |
28200 |
114 |
|||
|
BBR44 |
28384 |
28947 |
115 |
|||
|
BBR45 |
28947 |
30296 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
homolog of Streptococcus thermophilus phage φO1205 gene orf26, which is likely to encode a phage structural protein (Amouriaux et al., 1993; Casjens et al., 1997) |
|
|
cp32-6 |
||||||
|
BBM01 |
66 |
1286 |
146 |
|||
|
BBM02 |
1306 |
1995 |
147 |
|||
|
BBM03 |
2010 |
2570 |
148 |
|||
|
BBM04 |
2577 |
3341 |
148 |
|||
|
BBM05 |
3337 |
3945 |
148 |
|||
|
BBM06 |
3955 |
4926 |
149 |
|||
|
BBM07 |
4949 |
5401 |
150 |
|||
|
BBM08 |
5386 |
5784 |
107 |
|||
|
BBM09 |
5775 |
6161 |
108 |
|||
|
BBM10 |
6161 |
6727 |
151 |
|||
|
BBM11 |
6711 |
7820 |
152 |
|||
|
BBM12 |
7838 |
8263 |
153 |
|||
|
BBM13 |
8282 |
8734 |
154 |
|||
|
BBM14 |
8734 |
8967 |
155 |
short gene |
||
|
BBM15 |
8978 |
10249 |
156 |
|||
|
BBM16 |
10275 |
10955 |
157 |
|||
|
BBM17 |
10962 |
11909 |
159 |
|||
|
BBM18 |
11930 |
12481 |
160 |
|||
|
BBM19 |
12514 |
12843 |
139 |
|||
|
BBM20 |
12843 |
13715 |
140 |
|||
|
BBM21 |
13728 |
14330 |
141 |
|||
|
BBM22 |
14343 |
15152 |
142 |
|||
|
BBM23 |
15231 |
15431 |
blyA-6 |
109 |
putative hemolysin; short gene |
|
|
BBM24 |
15438 |
15782 |
blyB-6 |
111 |
putative hemolysin |
|
|
BBM25 |
15775 |
16107 |
112 |
|||
|
BBM26 |
16097 |
16453 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBM27 |
17075 |
16596 |
rev-6 |
63 |
N-terminal lipidation consensus |
|
|
BBM28 |
17247 |
17693 |
mlpF |
113 |
N-terminal lipidation consensus |
|
|
BBM29 |
18680 |
17736 |
161 |
|||
|
BBM30 |
19069 |
20166 |
57 |
|||
|
BBM31 |
20179 |
20742 |
50 |
|||
|
BBM32 |
20721 |
21467 |
plasmid partition protein {Bacillus subtilis} |
orfC-6 |
32 |
putative plasmid partition function (Stevenson et al., 1998b) |
|
BBM33 |
21520 |
22095 |
orf3-6 |
49 |
(Stevenson et al., 1998b) |
|
|
BBM34 |
22102 |
22767 |
bdrK |
80 |
||
|
BBM35 |
23241 |
24563 |
165 |
|||
|
BBM36 |
24619 |
25041 |
144 |
|||
|
BBM37 |
25820 |
25053 |
96 |
only [96] member with signal sequence |
||
|
BBM38 |
26245 |
27012 |
erpK |
164 |
N-terminal lipidation consensus; (Casjens et al., 1997) |
|
|
BBM39 |
27745 |
27080 |
||||
|
BBM40 |
27731 |
27850 |
questionable gene; gene not called in paralogous sequence on other cp32s |
|||
|
BBM41 |
27923 |
28486 |
115 |
|||
|
BBM42 |
28486 |
29835 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
phage φO1205 Orf26 homology; Orf26 is a possible phage structural protein; (Amouriaux et al., 1993; Casjens et al., 1997) |
|
|
cp32-7 |
||||||
|
BBO01 |
65 |
1285 |
146 |
|||
|
BBO02 |
1305 |
1994 |
147 |
|||
|
BBO03 |
2010 |
2564 |
148 |
|||
|
BBO04 |
2574 |
3335 |
148 |
|||
|
BBO05 |
3368 |
3937 |
148 |
|||
|
BBO06 |
3962 |
4918 |
149 |
|||
|
BBO07 |
4935 |
5393 |
150 |
|||
|
BBO08 |
5378 |
5776 |
107 |
|||
|
BBO09 |
5767 |
6153 |
108 |
|||
|
BBO10 |
6153 |
6719 |
151 |
|||
|
BBO11 |
6703 |
7812 |
152 |
|||
|
BBO12 |
7830 |
8255 |
153 |
|||
|
BBO13 |
8274 |
8726 |
154 |
|||
|
BBO14 |
8726 |
8959 |
155 |
short gene |
||
|
BBO15 |
8970 |
10301 |
156 |
|||
|
BBO16 |
10317 |
10955 |
157 |
|||
|
BBO17 |
10962 |
11900 |
159 |
|||
|
BBO18 |
11904 |
12470 |
160 |
|||
|
BBO19 |
12503 |
12832 |
139 |
|||
|
BBO20 |
12832 |
13707 |
140 |
|||
|
BBO21 |
13716 |
14318 |
141 |
|||
|
BBO22 |
14331 |
15143 |
142 |
|||
|
BBO23 |
15222 |
15422 |
blyA-7 |
109 |
putative hemolysin; short gene |
|
|
BBO24 |
15429 |
15782 |
blyB-7 |
111 |
putative hemolysin |
|
|
BBO25 |
15766 |
16098 |
112 |
|||
|
BBO26 |
16088 |
16444 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBO27 |
16522 |
17136 |
bdrN |
80 |
||
|
BBO28 |
17245 |
17664 |
mlpG |
113 |
N-terminal lipidation consensus |
|
|
BBO29 |
18770 |
17715 |