Two Positively Regulated Systems, ara and mal
Chapter 83
ROBERT SCHLEIF
This review (May 1994) briefly mentions several topics concerning the genetics and physiology of the arabinose and maltose operons, but its main focus is the mechanisms of gene regulation found in these two systems. More extensive information on the background, physiology, and genetics of the arabinose and maltose systems is available in the first edition of this work (52a). The main reason for the current focus on the mechanisms of gene regulation is that most of the research published in the past 8 years, and most of the progress made in understanding the ara and mal systems, have been in this area. In particular, significant progress has recently been made in developing and verifying a model for regulation of the arabinose operon that explains a wide variety of perplexing observations. Further, the central parts of this regulatory mechanism seem likely to be found in the regulation of many other genes.
Induction in the arabinose systems appears to result from the positioning of an activation domain of the regulatory protein, AraC, at the –35 region of the promoter. When the domain is placed at this position, transcription activation occurs (64), presumably as a result of direct AraC-RNA polymerase interactions. At ara pBAD, the positioning occurs when a subunit of the protein dissociates from a distal binding site, thereby opening a DNA loop, and rebinds in the –35 region (48). DNA looping seems not to be involved in the regulation of the other arabinose genes (34, 42, 79); instead, an arabinose-induced increase in the affinity of AraC protein for its DNA site probably leads the protein to bind to the site in the promoter and to activate transcription (32).
At each of the promoters induced by maltose, the activator protein of the maltose operons, MalT, also binds to a site overlapping the –35 region. In vitro DNA binding by MalT requires maltose (65). These two facts lead to the expectations that the MalT-binding sites in the promoters are occupied only in the presence of maltose and that when these sites are occupied, the protein stimulates initiation of transcription.
The ara and mal regulatory proteins share no significant sequence similarity, and yet both probably share the same basic mechanism of activating transcription. Many other regulatory proteins also contain binding sites in the –35 region of the promoters they regulate (13). Because thousands of genes need to be regulated in cells and there cannot be thousands of different ways to interact with RNA polymerase and activate transcription, many of these proteins are likely to share the same basic regulation schemes.
Escherichia coli can take up and catabolize the pentose L-arabinose (38, 53, 70). Figure 1 shows the arabinose-inducible genes and their map locations. The arabinose-specific proteins encoded by these genes are AraC, the arabinose-responsive transcription activator protein (20); AraE, a protein conferring a low-affinity, high-capacity arabinose uptake system on the cells (43, 79); AraFGH, three proteins specifying a high-affinity but low-capacity arabinose transport system (2, 37, 38, 70); and AraBAD, the three arabinose-inducible enzymes that convert L-arabinose to D-xylulose-5-phosphate, an intermediate in the pentose phosphate shunt (19, 25, 44, 46, 55, 56). An additional arabinose-inducible protein, AraJ, is known, but its induction is weak, the protein is not involved in either of the two arabinose uptake systems, and cells lacking the protein display no observable phenotype (63). Figure 2 shows the binding positions of the proteins known to regulate the five arabinose promoters.
Genetic and physiological experiments by Englesberg et al. indicated that the L-arabinose araBAD operon is positively regulated by AraC (20, 21, 74, 75). That is, AraC protein must be present and active to induce the synthesis of AraBAD enzymes. At the time of their work, these results stood in sharp contrast to the behavior of the somewhat better understood negatively regulated genes of the lactose operon and the early genes of lambda phage. Consequently, the scientific community had some doubt that positive regulation really existed. Eventually, definitive in vitro coupled transcription-translation and in vitro transcription experiments excluded any possibility that the system was negatively regulated in a tricky way that superficially appeared to be positive regulation (27, 47).
In the interval between the demonstration of negative regulation of lac and the discovery of the sigma subunit of RNA polymerase, mechanisms for the positive regulation of gene expression were difficult to imagine. The discovery of the sigma subunit raised the possibility that promoters of some genes might be inactive except in the presence of a protein that acted like a gene-specific sigma subunit. Much interest therefore developed in determining the biochemical mechanism of positive regulation of araBAD. Subsequently, the discovery of other positively regulated genes and operons served only to increase interest in mechanisms of positive regulation.
Well after the initial experiments indicating that ara was positively regulated, a form of negative regulation was also exposed in araBAD (22). As indicated in Fig. 3, a deletion, Δ2, ending upstream of the araBAD genes and their promoter, pBAD, left the genes fully inducible by AraC protein in the presence of arabinose but with an AraC-dependent elevated basal level. In the absence of AraC protein, the uninduced level of pBAD activity was approximately the same as that in wild-type cells, but in the presence of AraC and the absence of arabinose, the basal level in the deletion strain was increased up to 30-fold. Somehow, in the absence of arabinose, AraC was able to induce, albeit weakly, pBAD only in the strain deleted of the critical site between Δ1 and Δ2. One way of looking at this phenomenon is to say that in the wild-type strain in the absence of arabinose, some of the AraC is spontaneously in an inducing state but that its induction of pBAD is repressed by the presence of other AraC protein only if the critical DNA site is present. This site lies upstream of all the sequences required for normal arabinose induction of the promoter. Because such behavior appeared to be a form of repression from upstream, considerable effort was expended in verifying this unusual result. Eventually, a number of additional deletions with similar properties were isolated and mapped (71, 73). Their existence then led to the discovery that AraC protein simultaneously binds to a site within the promoter and to an upstream site, thereby forming a DNA loop (51).
Upon arabinose addition to growing E. coli, transcription initiation at pBAD begins within 5 s (35). This rapid response suggests that few steps are required for induction and thus that arabinose is the true inducer of the ara operons. In vitro experiments have directly demonstrated the same result (32, 47). AraC responds to arabinose in an in vitro system containing only purified AraC protein, RNA polymerase, and DNA (47). Also, the presence of arabinose increases the affinity of AraC protein for DNA by a factor of about 50 (32). The induction kinetics of the arabinose transport genes has not been carefully measured.
The synthesis of AraC protein itself also responds to arabinose (6, 30). Immediately following arabinose addition, AraC protein synthesis increases approximately 10-fold for about 10 min. Then its synthesis rate falls back to the prearabinose value. The biological reason for this transient derepression is unknown, but the DNA loop mechanism described below explains the phenomenon.
An estimation of the affinity of AraC protein for the sugar arabinose based on induction as a function of arabinose concentration gave the initially surprising result of a dissociation constant on the order of 0.01 M (17). Such a weak binding might be expected from the high concentration of arabinose that likely exists in cells growing on the sugar. Judging from the binding constant of the first enzyme of the pathway (55) and the rate at which arabinose must be catabolized, cells growing on arabinose may possess an intracellular arabinose concentration as high as 0.1 M.
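The reasoning above can be made concrete with a simple binding calculation. The following is a minimal sketch that assumes simple one-site, noncooperative binding of arabinose to AraC (an assumption made for illustration, not a result from the cited work) and uses the approximate values quoted in the text. The function name is hypothetical.

```python
def fraction_bound(ligand_conc_molar: float, kd_molar: float) -> float:
    """Fraction of protein carrying ligand, assuming simple one-site binding.

    Uses the standard hyperbolic relation: bound = [L] / ([L] + Kd).
    """
    if ligand_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)


# Approximate dissociation constant quoted in the text (order of 0.01 M).
KD_ARABINOSE_MOLAR = 0.01

# Compare a range of intracellular arabinose concentrations, up to the
# roughly 0.1 M suggested in the text for cells growing on arabinose.
for conc in (0.001, 0.01, 0.1):
    occupied = fraction_bound(conc, KD_ARABINOSE_MOLAR)
    print(f"{conc:g} M arabinose: {occupied:.2f} of AraC occupied")
```

Under these assumptions, a dissociation constant near 0.01 M leaves AraC about half occupied at 0.01 M arabinose and about 90% occupied at 0.1 M, so the weak binding is well matched to the sugar concentrations expected in growing cells.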
AraC protein was first reliably assayed by making use of its ability to stimulate AraB synthesis in a coupled transcription-translation system (27). Despite solubility and stability problems leading to the recovery of only a tiny fraction of the total AraC protein synthesized, 2- to 50-mg amounts of the protein may now be obtained at greater than 90% purity from 100 g of cells (72). The protein is a dimer of 30,000-molecular-mass subunits (78, 85). The DNA-binding ability of the protein can now be conveniently assayed with the DNA migration retardation assay (32). This assay requires that the amounts of protein-DNA complex not change significantly as the sample leaves the binding buffer and enters the acrylamide gel and the electrophoresis buffer. Extensive studies have shown that binding experiments can be done without artifact and that biochemically meaningful results can be obtained with the assay (32).
In contrast to many prokaryotic DNA-binding sites, which consist of two 5- to 10-bp inverted-repeat sequences, the binding site for AraC ideally consists of two 17-bp half-sites in a direct-repeat orientation (3, 5, 33, 45). Thus, each of the half-sites contains two major groove regions, and a dimer of the protein can contact four adjacent major groove regions of DNA. Initial DNase and dimethyl sulfate protection experiments with AraC revealed contacts to only three of the major grooves at araI. In retrospect, the fourth major groove region apparently binds AraC particularly weakly (3, 32).
Two basic sets of questions are raised by the binding sites of AraC. The first set relates to contact with the DNA. Does the dimeric AraC protein really contact four adjacent major grooves of DNA? If so, then each monomer must contact two major grooves, making use of two separate contacting structures. Does the protein contact the DNA by making use of any of the familiar DNA-contacting motifs such as the helix-turn-helix? What are the amino acids in AraC that contact the DNA? The second set of questions relates to the orientation of the half-sites. Most half-sites of DNA-binding proteins are inverted repeats, reflecting the face-to-face oligomerization symmetry of the protein. Can AraC polymerize indefinitely in a head-to-tail fashion? If it does not polymerize indefinitely, why not?
Examination of the sequence of AraC protein for homology to known DNA-binding motifs reveals two potential helix-turn-helix regions (Fig. 4) (3). Neither region shows high homology to the consensus sequence, however. In the first region, a cysteine is found in a position at which a glycine is normally found. In the second region, the pattern of hydrophobic amino acids in the second potential helix region differs sharply from that observed in authentic helix-turn-helix proteins.
Direct biochemical probing has been performed to locate amino acids in AraC that actually do contact the DNA. In these experiments, the contacting bases are identified as those that when chemically removed by depurination or depyrimidination or damaged by hydroxyl radical weaken binding by the protein (3). All bases can be probed at once by randomly modifying the DNA, adding protein which can bind to those DNA molecules that have not been modified in a contacted base, separating bound and free species, and quantitating on a sequencing gel the amount of modified base at each position in the protein-bound and free DNA species. After identification of potentially contacted bases, an amino acid can then be tested for DNA contact by changing the amino acid to alanine. Because alanine is smaller than most of the amino acids which contact DNA, if the amino acid does contact a base, the alanine-substituted protein should lack that specific contact. A base thus contacted by a particular amino acid can be identified. After a candidate base for contact has been identified, the base-amino acid interaction must be confirmed by changing the base in the DNA and showing that the wild-type protein is sensitive to the identity of the base but that the AraC with the alanine modification is not sensitive.
The missing-contact method for identifying amino acid-base contacts depends upon substantial structural independence among amino acids of the contacting region. If an amino acid is altered or if a base of the DNA is altered, the method works if the structure and DNA contacts of the remainder of the protein remain normal. If altering an amino acid or base in the DNA globally changes the structure, as when the bottom card is removed from a house of cards, then the method cannot work. Thus far in the control experiments, the method has worked satisfactorily.
Within the first potential helix-turn-helix region in AraC, the missing-contact methods described above identified residues 2 and 6 of the second helix as contacting bases in the araI site (3). If the region actually were in a helix-turn-helix, these two amino acids would be in a good position to make contact with DNA. Although these experiments cannot prove that the region possesses a helix-turn-helix conformation, they indicate that it is highly probable that it does. Similar missing-contact probing of the second helix-turn-helix region found no evidence of contacts by residues 1, 2, and 6. Thus, it seems unlikely that this region forms a helix-turn-helix structure and contacts DNA as most such structures do. Therefore, discussions based on sequence similarities between this region and actual helix-turn-helix regions are of dubious value.
As mentioned above, the half-sites for AraC binding are arranged as direct repeats (3, 5, 33, 45, 54). If AraC protein monomers also possess the same head-to-tail symmetry, it appears that the protein could polymerize indefinitely. Since this situation seems dangerous biologically, the symmetry was examined more carefully (5). It was found that only a dimer of AraC would bind to three direct repeats of the sequence consensus half-site, I1, and that two dimers bound independently to DNA containing four direct repeats of the half-site (5). Additionally, AraC could bind to half-sites that were in either a direct-repeat orientation or an inverted-repeat orientation (Table 1). Finally, AraC could also bind tightly to two repeated I1 half-sites whose separation was increased by the insertion of 10 or 21 bp; these sites were designated I1-10-I1 and I1-21-I1. These findings could most easily be explained if AraC possesses DNA-binding domains loosely connected to dimerization domains. Such a structure would permit the protein to be dimerized by a self-limiting face-to-face interaction between dimerization domains and at the same time would permit flexibility in the orientation and positioning of the DNA-binding domains.
Table 1. Dissociation times from I-like sites
Two chimeric proteins were constructed to test the predicted two-domain structure of AraC (4). One protein consisted of the first two-thirds of AraC fused to the DNA-binding domain of LexA. This protein bound to LexA operators, as did the LexA DNA-binding domain fused with a leucine zipper, a well-defined dimerization domain. The LexA DNA-binding domain without the N-terminal two-thirds of AraC did not bind DNA, thereby proving that the fused portion of AraC provides a dimerization function. Additionally, this chimera displayed an arabinose-sensitive response both in vivo and in vitro, demonstrating that the arabinose-binding region of AraC is located in the dimerization domain and not in the DNA-binding domain.
A chimera consisting of the C-terminal one-third of AraC joined to a leucine zipper dimerization structure from the eukaryotic transcription factor C/EBP was also functional. Only when the dimerizing function provided by the leucine zipper was present did the AraC DNA-binding domain bind DNA and activate transcription from pBAD.
The results with both chimeras demonstrate that AraC protein consists of two independent functional domains. Additional experiments showed that insertion of extra amino acids in the region between the domains permits the resulting protein to bind to half-sites that are more widely separated than normal (22a). Finally, the experiments with chimeras show that the portion of AraC responsible for activation of transcription lies within the DNA-binding domain of AraC or possibly within the linker amino acids but definitely not in the dimerization domain. The conclusion that the carboxyl-terminal one-third of the protein can bind DNA and activate transcription has also been reached from experiments with truncated AraC protein molecules (52).
In the presence of arabinose, AraC protein could no longer bind to half-sites separated by an extra 21 bp (5). This finding reveals two important facts about AraC protein. First, if arabinose binds directly to the DNA-binding domain of AraC so as to alter its structure and affect DNA binding, then the presence of arabinose should strengthen binding to all sites, independent of the half-site separation. Thus, the affinity for I1-21-I1 should also increase. Since arabinose did not increase the affinity for I1-21-I1, the sugar cannot be altering the intrinsic affinity of the DNA-binding domain for a half-site. Arabinose must be altering the ability of the protein to extend and bind to distantly separated half-sites. Most likely, therefore, arabinose was interacting with the dimerization domain. This inference was found to be correct from the arabinose-responsive behavior of the chimera formed from the AraC dimerization domain fused to the LexA DNA-binding domain. The second important fact revealed by the arabinose response is that in the presence of arabinose, the protein must shorten its arms or otherwise alter its conformation to decrease its arm length.
An effective shortening or tightening of the connection between the dimerization and DNA-binding domains of AraC may also explain how arabinose makes the protein shift its binding states. In the absence of arabinose, the protein binds one subunit to a half-site in the promoter, araI1, and the other subunit to a half-site located several hundred base pairs upstream, araO2; in the presence of arabinose, the two subunits bind to two adjacent half-sites, araI1 and araI2, in the promoter (48). If contacting the araI1 and araO2 half-sites in looped DNA requires a minimum length or flexibility in AraC, the conformational alteration caused by the addition of arabinose could make it difficult for the protein to maintain contact with both half-sites. Thus, arabinose would make it cease looping. Similarly, a shortening or reorientation of the flexible linker arms connecting the domains would increase the affinity of the protein for two adjacent half-sites on linear DNA. This effect results from the fact that in the stiffened protein, when one subunit of the arabinose-AraC binds to DNA, the other subunit will be more closely positioned near its half-site and will therefore be more likely to bind.
Our inability to crystallize AraC led to a search for and study of homologs that might be more tractable. The rhamnose operon yielded two such regulatory proteins, RhaR and RhaS, both highly homologous to AraC (81). Each responds to rhamnose, activates transcription, and binds in the –35 region of the regulated promoter (18). Since then, other homologs like MelR and XylS have been studied (7, 41). More than 25 additional homologs have been identified in a wide variety of bacteria and in a wide variety of gene systems by sequencing genes that complement regulatory mutations (Fig. 5). Information and references on these proteins are contained in GenBank. Within the AraC family, the homologous region includes the DNA-binding domain of AraC and occasionally extends toward the N terminus. Typically, a 25% similarity is seen in the regions of homology. This degree of similarity is sufficient to ensure that the tertiary structures of all the proteins are nearly identical over the homologous region (69). Within the DNA-binding region is a subset of about 20 amino acids that are very highly conserved (24; T. Reeder and R. Schleif, unpublished data).
Two of the homologs to AraC, SoxS and MarA, consist of only the amino acids which are homologous to the DNA-binding domain (11, 86). SoxS is a 12,000-molecular-weight protein and is one of the regulators of genes induced by oxidative damage to DNA. MarA is involved with the multiple-drug-resistance genes. The existence of these two proteins also indicates that the DNA-binding domain of this family of proteins activates transcription. It will be interesting to see how they manage to bind to DNA without obvious domains for dimerization.
Each of the ara promoters contains a cyclic AMP receptor protein (CRP)-binding site and displays catabolite sensitivity. In ara pBAD and pE, the CRP-binding site is not adjacent to RNA polymerase. The AraC-binding site lies between the two. Since CRP does not seem to be positioned so as to make direct contact with RNA polymerase, what can it be doing to stimulate transcription initiation from the two promoters? One obvious possibility is that CRP and AraC bind cooperatively and that the role of CRP is merely to assist AraC binding. This seems not to be the case, however, as an entire blank major groove of DNA exists between the two proteins. Additionally, in vitro binding studies (J. Withey and R. Schleif, unpublished data) do not reveal any interaction between CRP and AraC in binding to linear DNA fragments.
One major role for CRP at the ara pBAD promoter is to assist opening the DNA loop between araI1 and araO2 (49). Only when this loop is open can the DNA-binding domain of AraC bind to araI2 to activate transcription. CRP stimulates about fivefold in this role. Overall, however, CRP stimulates pBAD about 20-fold. Thus, even after the loop has been opened, CRP does something. Perhaps CRP makes a direct contact with RNA polymerase despite its binding location 100 nucleotides away. Such a contact would require that the DNA of the regulatory region be bent or coiled. Most likely, the activity of CRP in this second role is the same as that at the araE promoter pE.
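The arithmetic behind the statement that CRP still "does something" after the loop has opened can be written out explicitly. This sketch assumes that the two contributions of CRP combine multiplicatively, which is an illustrative assumption rather than a demonstrated property of the system. The function name is hypothetical.

```python
def residual_fold(overall_fold: float, known_step_fold: float) -> float:
    """Fold stimulation left unexplained after accounting for one known step.

    Assumes independent contributions that multiply to give the overall effect.
    """
    if overall_fold <= 0 or known_step_fold <= 0:
        raise ValueError("fold values must be positive")
    return overall_fold / known_step_fold


# Approximate values quoted in the text: about 20-fold overall stimulation
# of pBAD by CRP, of which about 5-fold is attributed to loop opening.
unexplained = residual_fold(overall_fold=20.0, known_step_fold=5.0)
print(f"Stimulation not accounted for by loop opening: about {unexplained:g}-fold")
```

On this reading, roughly a fourfold stimulation remains to be explained by a second activity of CRP.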
The araFGH promoter is an oddball compared to the other ara promoters. In it, the CRP-binding site is immediately upstream from RNA polymerase, centered at position –41.5, and AraC binds to two sites further upstream, one centered at position –80 and the other at –154 (50). At both of these sites, the orientation of the half-sites is opposite to their orientation at pBAD. Physiological experiments and in vitro transcription experiments show that this promoter is more strongly dependent upon CRP than upon AraC protein. The role of the more upstream AraC-binding site is also unclear. Its deletion reduces inducibility only a moderate amount (50).
Regulation of araBAD can be well approximated by a two-state system (Fig. 6) (5, 48). In the absence of arabinose, most copies of the araBAD genes within cells are in a looped state (Fig. 6, left). In this state, one subunit of AraC protein contacts the araI1 half-site, and one subunit contacts the araO2 half-site. The looped state of the DNA largely blocks activity of the pC promoter even though the araO1 site from which AraC binding represses activity of the promoter is unoccupied (51). The ara pBAD promoter is not active, because AraC protein is not bound at the araI2 half-site, and the activation domain of AraC is therefore not properly positioned to activate the promoter (64). As mentioned earlier, AraC protein itself consists of a dimerization domain loosely connected to a DNA-binding domain (4). In the absence of arabinose, AraC protein is apparently free to bind to the two best half-sites within a 400- to 500-bp separation. Those half-sites are araI1 and araO2.
Upon the addition of arabinose and with the assistance of CRP to open the DNA loop (49), AraC loses its ability to contact both araO2 and araI1 and shifts to a state in which binding to two adjacent half-sites is energetically preferred (5). As a result, that subunit which previously contacted the araO2 half-site relocates and now contacts the araI2 half-site. Thus, the DNA loop opens, and the pC promoter is accessible to RNA polymerase until AraC binds to the araO1 site, which takes about 10 min on average (30). Because of the relocation of the DNA-binding domain from araO2 to araI2, an activation domain is properly positioned to assist transcription initiation at pBAD. Occupancy of this half-site is necessary and sufficient for activation of transcription from the pBAD promoter (64). When a subunit of AraC protein occupies araI2, specific contacts could then be made between RNA polymerase and amino acid residues on or near the DNA-binding domain of AraC. Such contacts have not yet been demonstrated.
The mechanism described above provides a simple explanation for the repression phenomenon first observed by Englesberg. In wild-type cells in the absence of arabinose, AraC is looped between araI1 and araO2, as indicated above. This binding of one subunit to araO2 reduces its availability to bind to the araI2 half-site. If araO2 is deleted as it was in the original Englesberg deletion (Fig. 3) (22) or is unable to bind AraC because of mutation, the domain of AraC which would normally bind there binds instead sometimes to araI2 even in the absence of arabinose. This binding stimulates an occasional spurious round of transcription from pBAD and raises the uninduced, or basal, level of transcription. Hence, the role of araO2 is to prevent inadvertent induction, and thus its role really ought to be called anti-induction rather than repression.
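The qualitative logic of the two-state model and of the araO2 deletion can be summarized as a small lookup. This is a deliberately simplified sketch: it ignores CRP, araO1, and the pC promoter, treats each condition as a single predominant state, and uses hypothetical names and labels chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PromoterState:
    occupied_half_sites: Tuple[str, ...]  # half-sites contacted by the AraC dimer
    looped: bool                          # whether a DNA loop is formed
    pbad_activity: str                    # qualitative pBAD transcription level


def arac_state(arabinose: bool, araO2_present: bool = True) -> PromoterState:
    """Predominant state of AraC at pBAD under the simplified two-state model."""
    if arabinose:
        # Arabinose favors binding to adjacent half-sites; araI2 is occupied.
        return PromoterState(("araI1", "araI2"), looped=False, pbad_activity="induced")
    if araO2_present:
        # Without arabinose, the dimer bridges araI1 and the distant araO2.
        return PromoterState(("araI1", "araO2"), looped=True, pbad_activity="basal")
    # Without araO2, the free subunit sometimes binds araI2 even without arabinose.
    return PromoterState(
        ("araI1", "araI2 (occasional)"), looped=False, pbad_activity="elevated basal"
    )


for arabinose, o2 in ((False, True), (True, True), (False, False)):
    print(f"arabinose={arabinose}, araO2 present={o2}: {arac_state(arabinose, o2)}")
```

The third case corresponds to the Englesberg deletion: removing araO2 does not induce the promoter fully, but it frees a subunit to occupy araI2 occasionally and so raises the basal level.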
Maltose consists of two glucose monosaccharides joined in an α-1,4 linkage. Maltodextrins are higher homologs of maltose. These sugars are most commonly derived from starch and glycogen. E. coli can grow on maltose and maltodextrins by taking up the sugars and cleaving them to release glucose and glucose 1-phosphate. Although E. coli cannot utilize starch, the closely related Klebsiella pneumoniae possesses two enzymes that linearize starch polysaccharides to yield maltose and maltodextrins (9).
At least nine genes encode proteins that are involved in the utilization of maltose and maltodextrins in E. coli (Fig. 7). MalT is the maltose-responsive transcription activator of the maltose system (29, 31, 57). LamB is a maltose-inducible porin apparently with a somewhat larger pore size than those of the porins normally present in the outer membrane. LamB received its name as the lambda phage receptor before its role as a maltose porin was known (36, 62, 80). MalE is a periplasmic maltose-binding protein that facilitates transport (40). MalM and MalS are also located in the periplasmic space (23, 26). MalS may serve to cleave long maltodextrins (23), and the role of MalM is unknown. MalF, MalG, and MalK constitute the active transport system for maltose and are located in the inner membrane (15, 76, 77). MalP and MalQ produce glucose and glucose 1-phosphate from maltose and maltodextrin (31). When the genes for the two K. pneumoniae enzymes (pullulanase and an enzyme of unknown function) are placed in E. coli, they also are regulated by MalT protein (9). The binding sites of the regulatory proteins in the promoter regions for the mal operons are shown in Fig. 8.
As would be expected for a set of genes devoted to utilization of carbohydrate, the ability of cells to catabolize maltose is glucose sensitive. This sensitivity occurs at two levels. First, the synthesis of MalT protein itself is stimulated by CRP (8, 16, 60). Also, the mal promoters, pE and pK, are directly stimulated by CRP (58, 82). Thus, MalT is like AraC in that its synthesis is CRP stimulated but different in that its synthesis is not autoregulated.
MalT is a positive regulator of the maltose genes. The protein is unusually large (103,000 molecular weight) (12). It is a monomer in solution, although its binding to DNA may be cooperative (65). A possible explanation for the difference between the protein's behavior in solution and its behavior when bound to DNA is that its oligomerization interaction is too weak for a significant amount of oligomer to form at the concentrations normally found in solution. When bound to DNA, the monomers of the protein would be held alongside one another by the DNA-binding sites. In this state, relatively weak protein-protein interactions could then have a detectable effect on the overall binding.
MalT responds to maltotriose but not to maltose or any other maltodextrin (59). The N terminus of MalT possesses the sequence Gly-X-X-Gly-X-Gly-Lys-Thr-Thr, which is a known ATP-binding motif, and the protein binds ATP. Both maltotriose and ATP are required for DNA binding and for transcription activation by the protein (65, 66). The role of the ATP is unknown. Apparently, ATP hydrolysis is not required for any step up to and including formation of open complexes stimulated by the protein. The affinities of MalT for specific and nonspecific DNA in the presence and absence of ATP and maltotriose have not been published.
Like AraC, MalT possesses a number of homologs. Significant sequence similarity exists between the C-terminal 100 amino acids of MalT and other regulatory proteins or domains capable of binding to DNA and activating transcription (28, 39). Many of these, but not MalT, share homology in their N-terminal regions as well (Fig. 9). As with AraC, the region of sequence similarity to the homologs lies in the C-terminal part of the protein. Among these homologs is GerA, a protein that stimulates transcription of genes involved with spore germination in Bacillus subtilis (87). This homolog contains only the homology region, can bind DNA, and can activate transcription. Thus, the homology region is likely an independent folding domain capable in most cases of binding DNA and of activating transcription when it is bound to DNA. The same conclusion can be drawn from the fact that the C-terminal portions of FixJ and LuxR (10, 39), which are homologous to the C-terminal portion of MalT, activate transcription from the genes which the intact protein regulates.
The family of proteins to which MalT is homologous also includes the sigma factors of RNA polymerase (39). The homologous region lies within region 4 of the sigma factors, the portion of the protein that contacts the –35 region of the promoter. This fact leads to the speculation that in stimulating transcription initiation, the MalT family of regulators places its DNA-binding domain in the promoter –35 region in place of the domain of the sigma factor that normally binds there. On the other hand, comparisons of similarity may be biased by the inherent sequence similarity among helix-turn-helix regions of proteins.
A fusion of the C-terminal portion of MalT to the enzyme glutathione S-transferase permitted affinity purification of the truncated MalT (83). As expected, the fusion protein did bind DNA and yielded a DNase footprint nearly identical to that produced by full-length MalT. Curiously, the truncated MalT protein with glutathione S-transferase present or absent was incapable of activating transcription. By the homology argument advanced above, we would expect the activating portion of the protein also to lie within the DNA-binding domain.
Comparison of the MalT-binding sites at the four maltose-responsive promoters (pPQ, pEFG, pKBM, and pS), as determined by DNase footprinting and by protection from dimethyl sulfate and hydroxyl radical, provides a consensus binding site of GGGGAT/GGAGG (84). Actual contacts to these bases or phosphates on the backbone have not been determined. Therefore, their relative importance to binding is not known.
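A consensus of this kind can be used to scan a sequence for candidate sites. The sketch below reads the consensus as GGGGA(T/G)GAGG, that is, with T or G at the sixth position; this reading of the notation is an assumption. Exact matching is used only for simplicity, since real sites will deviate from the consensus. The function names and the test sequence are hypothetical.

```python
import re

# Consensus read as GGGGA(T/G)GAGG; the lookahead allows overlapping matches.
CONSENSUS = re.compile(r"(?=(GGGGA[TG]GAGG))")
SITE_LENGTH = 10
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_consensus_sites(seq: str):
    """Return (start, strand, match) tuples for exact consensus matches.

    start is the 0-based index of the leftmost base of the site on the
    given (top) strand, for matches on either strand.
    """
    seq = seq.upper()
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for match in CONSENSUS.finditer(text):
            start = match.start()
            if strand == "-":
                # Convert bottom-strand coordinates back to the top strand.
                start = len(seq) - (start + SITE_LENGTH)
            hits.append((start, strand, match.group(1)))
    return sorted(hits)


# Made-up sequence containing one top-strand match, for illustration only.
print(find_consensus_sites("AAGGGGATGAGGTT"))
```

Recording the strand of each match also gives the orientation of each candidate site, which matters for identifying pairs of sites oriented in the same direction.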
One of the most puzzling features of transcription activation by MalT is the requirement for multiple MalT-binding sites in the promoters. One binding site is always centered at base –37.5 or –38.5 (61, 68, 84). Additionally, one or two more binding sites are always present further upstream. In all cases, the ensemble of binding sites contains a pair of sites oriented in the same direction and separated by about three bases (14). Thus, functional binding regions contain two or three MalT-binding sites.
In a regulatory region containing three binding sites, the required positioning of the upstream pair has been investigated. Transcription activation is best when the pair is on one particular face of the DNA (14). Insertions or deletions that move the pair around the helix greatly reduce activation. Although the cooperativity observed in the binding of MalT to multiple sites indicates the presence of an interaction among the molecules of MalT, we do not understand the location and orientation requirements for the sites. Also unclear at present is whether only three MalT protein molecules are capable of interacting while bound to DNA or whether four or even an unlimited number may interact.
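The geometric reasoning behind the face-of-the-helix experiments can be made concrete with a little arithmetic. The sketch below assumes a helical repeat of 10.5 bp per turn, a commonly used value for B-form DNA that is not taken from the studies cited here, and an arbitrary angular tolerance. Both parameters and the function names are illustrative only.

```python
def rotation_degrees(delta_bp, bp_per_turn=10.5):
    """Angle (0 to 360 degrees) by which a site is rotated about the
    helix axis when delta_bp base pairs are inserted (positive) or
    deleted (negative) between it and a fixed reference point.
    bp_per_turn = 10.5 is an assumed helical repeat."""
    return (delta_bp * 360.0 / bp_per_turn) % 360.0


def same_face(delta_bp, bp_per_turn=10.5, tolerance_deg=40.0):
    """True if the shifted site stays within tolerance_deg of its
    original face of the helix. The tolerance is a hypothetical value."""
    angle = rotation_degrees(delta_bp, bp_per_turn)
    offset = min(angle, 360.0 - angle)
    return offset <= tolerance_deg
```

Under these assumptions an insertion of 5 bp rotates the upstream pair by roughly half a turn, placing it on the opposite face, whereas an insertion of 10 or 21 bp returns it close to its original face.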
A single MalT-binding site located in the –38 region is insufficient for transcription activation (14). Why? One might guess that a single site never becomes occupied in vivo without the presence of the additional MalT-binding sites. Although oversynthesis of MalT does not permit a promoter containing a single MalT-binding site to be induced, the actual level of soluble protein within cells may not be greatly elevated, since the protein aggregates upon significant overproduction (65). Thus, the uninducibility of a promoter containing a single MalT site could still result from an absence of MalT binding. In vitro experiments examining the same question do find a relatively low affinity of MalT binding to a single site (61). Perhaps at the levels of protein necessary to obtain occupancy of the single site in vitro, so much protein is present that transcription in general is inhibited.
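The argument that an isolated site may simply remain unoccupied can be illustrated with a standard equilibrium binding model. This is a generic textbook treatment, not a model taken from the MalT literature, and the dissociation constant, protein concentration, and cooperativity factor used with it are hypothetical.

```python
def single_site_occupancy(protein, kd):
    """Fractional occupancy of one isolated site at free protein
    concentration `protein` and dissociation constant `kd` (same units)."""
    return protein / (protein + kd)


def two_site_occupancy(protein, kd, omega):
    """Fractional occupancy of one site when a second, identical site
    is nearby and proteins bound at both sites interact.

    omega is a hypothetical cooperativity factor: omega = 1 means no
    interaction, omega > 1 means the two bound proteins stabilize
    each other.
    """
    x = protein / kd
    # Statistical weights: empty, two singly occupied states, doubly occupied.
    partition = 1.0 + 2.0 * x + omega * x * x
    return (x + omega * x * x) / partition
```

For example, at a protein concentration one-tenth of the dissociation constant, an isolated site is occupied about 9% of the time, whereas with an assumed cooperativity factor of 100 the same site is occupied about half the time. This is the sense in which additional sites could be needed merely to achieve occupancy of the site in the –38 region.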
The mal pE and pK promoters are divergently oriented, and their transcription start sites are separated by 271 bp. Within the common regulatory region are eight binding sites for MalT protein and three high-affinity binding sites for CRP (61, 82, 84). Collectively, these sites ensure that the two promoters are active only when maltotriose and cyclic AMP-CRP are present. In the absence of CRP, MalT binds to three sites upstream from pKBM, but no activation of the promoter occurs, because these sites are not properly located with respect to the RNA polymerase-binding site (Fig. 10) (68). When CRP has bound to its three sites, MalT binding to the sites formerly occupied is prevented, and instead, MalT binds to a set of three weaker sites partially overlapping the tighter binding sites. One of these weaker binding sites is correctly positioned to activate transcription of pKBM. The binding of the three molecules of CRP and the three molecules of MalT is insufficient for full activation of the promoter, however (61). Occupancy of the two additional MalT sites just upstream from pEFG is also required for full activity of pKBM. Thus, the complete complex is required for full activity of either or both promoters.
Although data to confirm the picture are not yet available, we can imagine an induction mechanism analogous to that seen at the arabinose pBAD promoter. Suppose MalT binding to the site centered near position –38 is necessary and sufficient for activation of the promoter. If MalT bound at the two sites near pEFG interacts cooperatively with MalT bound at the three sites upstream of the CRP sites, and if CRP is bound to its three sites, then MalT is forced to bind to the three lower-affinity binding sites near pKBM. As a result, the crucial location overlapping the –35 region is occupied, and the DNA-binding domain activates transcription. Because all the proteins must be present for full activity of both promoters and the DNA must be supercoiled (67), the complex is probably wrapped so that many internal interactions occur to signal the presence of each protein molecule.
I thank Susan Egan for comments on the manuscript and members of my laboratory for their interest in the arabinose problem and their help over many years. I also thank the National Institutes of Health for their support of the work in my laboratory.
References
1. Bedouelle, H. 1983. Mutations in the promoter regions of the malEFG and malK-lamB operons of Escherichia coli K-12. J. Mol. Biol. 170:861–882.
2. Brown, C. E., and R. W. Hogg. 1972. A second transport system of l-arabinose in Escherichia coli B/r controlled by the araC gene. J. Bacteriol. 111:606–613.
3. Brunelle, A., and R. Schleif. 1989. Determining residue-base interactions between AraC protein and araI DNA. J. Mol. Biol. 209:607–622.
4. Bustos, S., and R. Schleif. 1993. Functional domains of the AraC protein. Proc. Natl. Acad. Sci. USA 90:5638–5642.
5. Carra, J., and R. Schleif. 1993. Variation of half-site organization and DNA looping by AraC protein. EMBO J. 12:35–44.
6. Casadaban, M. J. 1976. Regulation of the regulatory gene for the arabinose pathway, araC. J. Mol. Biol. 104:557–566.
7. Caswell, R., J. Williams, A. Lyddiatt, and S. Busby. 1992. Overexpression, purification, and characterization of the Escherichia coli MelR transcription activator protein. Biochem. J. 287:501–508.
8. Chapon, C., and A. Kolb. 1983. Action of CAP on the malT promoter in vitro. J. Bacteriol. 156:1135–1143.
9. Chapon, C., and O. Raibaud. 1985. Structure of two divergent promoters located in front of the gene encoding pullulanase in Klebsiella pneumoniae and positively regulated by the malT product. J. Bacteriol. 164:639–645.
10. Choi, S., and E. Greenberg. 1991. The C-terminal region of the Vibrio fischeri LuxR protein contains an inducer-independent lux gene activating domain. Proc. Natl. Acad. Sci. USA 88:11115–11119.
11. Cohen, S. P., H. Hachler, and S. B. Levy. 1993. Genetic and functional analysis of the multiple antibiotic resistance (mar) locus in Escherichia coli. J. Bacteriol. 175:1484–1492.
12. Cole, S. T., and O. Raibaud. 1986. The nucleotide sequence of the malT gene encoding the positive regulator of the Escherichia coli maltose regulon. Gene 42:201–208.
13. Collado-Vides, J., B. Magasanik, and J. Gralla. 1991. Control site location and transcriptional regulation in Escherichia coli. Microbiol. Rev. 55:371–394.
14. Danot, O., and O. Raibaud. 1993. On the puzzling arrangement of the asymmetric MalT-binding sites in the MalT-dependent promoters. Proc. Natl. Acad. Sci. USA 90:10999–11003.
15. Dassa, E., and M. Hofnung. 1985. Sequence of gene malG in E. coli K-12: homologies between internal membrane components from binding protein-dependent transport systems. EMBO J. 4:2287–2293.
16. Débarbouillé, M., and M. Schwartz. 1979. The use of gene fusions to study the expression of malT, the positive regulator gene of the maltose regulon. J. Mol. Biol. 132:521–534.
17. Doyle, M. E., C. Brown, R. W. Hogg, and R. Helling. 1972. Induction of the ara operon of Escherichia coli B/r. J. Bacteriol. 110:56–65.
18. Egan, S. M., and R. Schleif. 1993. A regulatory cascade in the induction of rhaBAD. J. Mol. Biol. 234:87–98.
19. Englesberg, E., R. L. Anderson, R. Weinberg, N. Lee, P. Hoffee, G. Huttenhauer, and H. Boyer. 1962. l-Arabinose-sensitive, l-ribulose 5-phosphate 4-epimerase-deficient mutants of Escherichia coli. J. Bacteriol. 84:137–146.
20. Englesberg, E., J. Irr, J. Power, and N. Lee. 1965. Positive control of enzyme synthesis by gene C in the l-arabinose system. J. Bacteriol. 90:946–957.
21. Englesberg, E., D. Sheppard, C. Squires, and F. Meronk, Jr. 1969. An analysis of "Revertants" of a deletion mutant in the C gene of the l-arabinose gene complex in Escherichia coli B/r: isolation of initiator constitutive mutants (Ic). J. Mol. Biol. 43:281–298.
22. Englesberg, E., C. Squires, and F. Meronk. 1969. The l-arabinose operon in Escherichia coli B/r: a genetic demonstration of two functional states of the product of a regulator gene. Proc. Natl. Acad. Sci. USA 62:1100–1107.
22a. Eustance, R., S. Bustos, and R. Schleif. 1994. Reaching out, locating and lengthening the interdomain linker in AraC protein. J. Mol. Biol. 242:330–338.
23. Freundlich, S., and W. Boos. 1986. α-Amylase of Escherichia coli, mapping and cloning of the structural gene malS and identification of its product as a periplasmic protein. J. Biol. Chem. 261:2946–2953.
24. Gallegos, M., C. Michàn, and J. Ramos. 1993. The XylS/AraC family of regulators. Nucleic Acids Res. 21:807–810.
25. Gielow, W. O., and N. Lee. 1974. l-Ribulokinase from an initiator constitutive mutant. J. Bacteriol. 120:539–541.
26. Gilson, E., J. P. Rousset, A. Charbit, D. Perrin, and M. Hofnung. 1986. malM, a new gene of the maltose regulon in Escherichia coli K12. J. Mol. Biol. 191:303–311.
27. Greenblatt, J., and R. Schleif. 1971. Arabinose C protein: regulation of the arabinose operon in vitro. Nature (London) New Biol. 233:166–170.
28. Gross, R., B. Aricò, and R. Rappuoli. 1989. Families of bacterial signal-transducing proteins. Mol. Microbiol. 3:1661–1667.
29. Gutierrez, C., and O. Raibaud. 1984. Point mutations that reduce the expression of malPQ, a positively controlled operon of Escherichia coli. J. Mol. Biol. 177:69–86.
30. Hahn, S., and R. Schleif. 1983. In vivo regulation of the Escherichia coli araC promoter. J. Bacteriol. 155:593–600.
31. Hatfield, D., M. Hofnung, and M. Schwartz. 1969. Genetic analysis of the maltose A region in Escherichia coli. J. Bacteriol. 98:559–567.
31a. Hatfield, D., M. Hofnung, and M. Schwartz. 1969. Nonsense mutations in the maltose A region of the genetic map of Escherichia coli. J. Bacteriol. 100:1311–1315.
32. Hendrickson, W., and R. Schleif. 1984. Regulation of the Escherichia coli l-arabinose operon studied by gel electrophoresis DNA binding assay. J. Mol. Biol. 178:611–628.
33. Hendrickson, W., and R. Schleif. 1985. A dimer of araC protein contacts three adjacent major groove regions of the araI DNA site. Proc. Natl. Acad. Sci. USA 82:3129–3133.
34. Hendrickson, W., C. Stoner, and R. Schleif. 1990. Characterization of the Escherichia coli araFGH and araI promoters. J. Mol. Biol. 215:497–510.
35. Hirsh, J., and R. Schleif. 1973. In vivo experiments on the mechanism of action of l-arabinose C gene activator and lactose repressor. J. Mol. Biol. 80:433–444.
36. Hofnung, M., D. Hatfield, and M. Schwartz. 1974. malB region in Escherichia coli K-12: characterization of new mutations. J. Bacteriol. 117:40–47.
37. Hogg, R. W., and E. Englesberg. 1969. l-Arabinose binding protein from Escherichia coli B/r. J. Bacteriol. 100:423–432.
38. Horazdovsky, B., and R. Hogg. 1989. Genetic reconstitution of the high-affinity l-arabinose transport system. J. Bacteriol. 171:3053–3059.
39. Kahn, D., and G. Ditta. 1991. Modular structure of FixJ: homology of the transcriptional activator domain with the –35 binding domain of sigma factors. Mol. Microbiol. 5:987–997.
40. Kellermann, O., and S. Szmelcman. 1974. Active transport of maltose in Escherichia coli K12. Eur. J. Biochem. 47:139–149.
41. Kessler, B., V. de Lorenzo, and K. Timmis. 1993. Identification of a cis-acting sequence within the Pm promoter of the TOL plasmid which confers XylS-mediated responsiveness to substituted benzoates. J. Mol. Biol. 230:699–703.
42. Kosiba, B. E., and R. Schleif. 1982. Arabinose-inducible promoter from Escherichia coli, its cloning from chromosomal DNA, identification as the araFG promoter, and sequence. J. Mol. Biol. 156:53–66.
43. Lee, J.-H., S. Al-Zarban, and G. Wilcox. 1981. Genetic characterization of the araE gene in Salmonella typhimurium LT2. J. Bacteriol. 146:298–304.
44. Lee, N., and I. Bendet. 1968. Crystalline l-ribulokinase from Escherichia coli. J. Biol. Chem. 9:2043–2050.
45. Lee, N., C. Francklyn, and E. P. Hamilton. 1987. Arabinose-induced binding of AraC protein to araI2 activates the araBAD operon promoter. Proc. Natl. Acad. Sci. USA 84:8814–8818.
46. Lee, N., J. W. Patrick, and M. Masson. 1968. Crystalline l-ribulose 5-phosphate 4-epimerase from Escherichia coli. J. Biol. Chem. 243:4700–4705.
47. Lee, N., G. Wilcox, W. Gielow, J. Arnold, P. Cleary, and E. Englesberg. 1974. In vitro activation of the transcription of araBAD operon by araC activator. Proc. Natl. Acad. Sci. USA 71:634–638.
48. Lobell, R., and R. Schleif. 1990. DNA looping and unlooping by AraC protein. Science 250:528–532.
49. Lobell, R. B., and R. F. Schleif. 1991. AraC-DNA looping: orientation and distance-dependent loop breaking by the cyclic AMP receptor protein. J. Mol. Biol. 218:45–54.
50. Lu, Y., C. Flaherty, and W. Hendrickson. 1992. AraC protein contacts asymmetric sites in the Escherichia coli araFGH promoter. J. Biol. Chem. 267:24848–24857.
51. Martin, K., L. Huo, and R. Schleif. 1986. The DNA loop model for ara repression: AraC protein occupies the proposed loop sites in vivo and repression-negative mutations lie in these same sites. Proc. Natl. Acad. Sci. USA 83:3654–3658.
52. Menon, K., and N. Lee. 1990. Activation of ara operons by a truncated AraC protein does not require inducer. Proc. Natl. Acad. Sci. USA 87:3708–3712.
52a. Neidhardt, F. C., J. L. Ingraham, K. B. Low, B. Magasanik, M. Schaechter, and H. E. Umbarger (ed.). 1987. Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology. American Society for Microbiology, Washington, D.C.
53. Novotny, C. P., and E. Englesberg. 1966. The l-arabinose permease system in Escherichia coli B/r. Biochim. Biophys. Acta 117:217–230.
54. Ogden, S., D. Haggerty, C. Stoner, D. Kolodrubetz, and R. Schleif. 1980. The Escherichia coli l-arabinose operon: binding sites of the regulatory proteins and a mechanism of positive and negative regulation. Proc. Natl. Acad. Sci. USA 77:3346–3350.
55. Patrick, J. W., and N. Lee. 1968. Purification and properties of an l-arabinose isomerase from Escherichia coli. J. Biol. Chem. 243:4312–4318.
56. Patrick, J. W., and N. Lee. 1969. Subunit structure of l-arabinose isomerase from Escherichia coli. J. Biol. Chem. 244:4277–4283.
57. Raibaud, O., M. Débarbouillé, and M. Schwartz. 1983. Use of deletions created in vitro to map transcriptional regulatory signals in the malA region of Escherichia coli. J. Mol. Biol. 163:395–408.
58. Raibaud, O., C. Gutierrez, and M. Schwartz. 1985. Essential and nonessential sequences in malPp, a positively controlled promoter in Escherichia coli. J. Bacteriol. 161:1201–1208.
59. Raibaud, O., and E. Richet. 1987. Maltotriose is the inducer of the maltose regulon of Escherichia coli. J. Bacteriol. 169:3059–3061.
60. Raibaud, O., D. Vidal-Ingigliardi, and A. Kolb. 1991. Genetic studies on the promoter of malT, the gene that encodes the activator of the Escherichia coli maltose regulon. Res. Microbiol. 142:937–942.
61. Raibaud, O., D. Vidal-Ingigliardi, and E. Richet. 1989. A complex nucleoprotein structure involved in activation of transcription of two divergent Escherichia coli promoters. J. Mol. Biol. 205:471–485.
62. Randall-Hazelbauer, L., and M. Schwartz. 1973. Isolation of the bacteriophage lambda receptor from Escherichia coli. J. Bacteriol. 116:1436–1446.
63. Reeder, T., and R. Schleif. 1991. Mapping, sequence, and apparent lack of function of araJ, a gene of the Escherichia coli arabinose regulon. J. Bacteriol. 173:7765–7771.
64. Reeder, T., and R. Schleif. 1993. AraC protein can activate transcription from only one position and when pointed in only one direction. J. Mol. Biol. 231:205–213.
65. Richet, E., and O. Raibaud. 1987. Purification and properties of the MalT protein, the transcription activator of the Escherichia coli maltose regulon. J. Biol. Chem. 262:12647–12653.
66. Richet, E., and O. Raibaud. 1989. MalT, the regulatory protein of the Escherichia coli maltose system, is an ATP-dependent transcriptional activator. EMBO J. 8:981–987.
67. Richet, E., and O. Raibaud. 1991. Supercoiling is essential for the formation and stability of the initiation complex at the divergent malEp and malKp promoters. J. Mol. Biol. 218:529–542.
68. Richet, E., D. Vidal-Ingigliardi, and O. Raibaud. 1991. A new mechanism for coactivation of transcription initiation: repositioning of an activator triggered by the binding of a second activator. Cell 66:1185–1195.
69. Sander, C., and R. Schneider. 1991. Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins 9:56–68.
70. Schleif, R. 1969. An l-arabinose binding protein and arabinose permeation in E. coli. J. Mol. Biol. 46:185–196.
71. Schleif, R. 1972. Fine-structure deletion map of the Escherichia coli l-arabinose operon. Proc. Natl. Acad. Sci. USA 69:3479–3484.
72. Schleif, R., and M. Favreau. 1982. Hyperproduction of araC protein from Escherichia coli. Biochemistry 21:778–782.
73. Schleif, R., and J. T. Lis. 1975. The regulatory region of the l-arabinose operon: a physical, genetic and physiological study. J. Mol. Biol. 95:417–431.
74. Sheppard, D. E., and E. Englesberg. 1967. Further evidence for positive control of the l-arabinose system by gene araC. J. Mol. Biol. 25:443–454.
75. Sheppard, D., and E. Englesberg. 1971. Positive control in the l-arabinose gene-enzyme complex of Escherichia coli B/r as exhibited with stable merodiploids. Biochemistry 10:345–348.
76. Shuman, H., and T. J. Silhavy. 1981. Identification of the malK gene product, a peripheral membrane component of the Escherichia coli maltose transport system. J. Biol. Chem. 256:560–562.
77. Shuman, H. A., T. J. Silhavy, and J. Beckwith. 1980. Labeling of proteins with β-galactosidase by gene fusion. J. Biol. Chem. 255:168–174.
78. Steffen, D., and R. Schleif. 1977. Overproducing araC protein with lambda-arabinose transducing phage. Mol. Gen. Genet. 157:333–339.
79. Stoner, C., and R. Schleif. 1983. The araE low affinity l-arabinose transport promoter: cloning, sequence, transcription start site and DNA binding sites of regulatory proteins. J. Mol. Biol. 171:369–381.
80. Thirion, J. P., and M. Hofnung. 1972. On some genetic aspects of phage lambda resistance in E. coli K-12. Genetics 71:207–216.
81. Tobin, J. F., and R. F. Schleif. 1987. Positive regulation of the Escherichia coli l-rhamnose operon is mediated by the products of tandemly repeated regulatory genes. J. Mol. Biol. 196:789–799.
82. Vidal-Ingigliardi, D., and O. Raibaud. 1991. Three adjacent binding sites for cAMP receptor protein are involved in the activation of the divergent malEp-malKp promoters. Proc. Natl. Acad. Sci. USA 88:229–233.
83. Vidal-Ingigliardi, D., E. Richet, O. Danot, and O. Raibaud. 1993. A small C-terminal region of the Escherichia coli MalT protein contains the DNA-binding domain. J. Biol. Chem. 268:24527–24530.
84. Vidal-Ingigliardi, D., E. Richet, and O. Raibaud. 1991. Two MalT binding sites in direct repeat. J. Mol. Biol. 218:323–334.
85. Wilcox, G., and P. Meuris. 1976. Stabilization and size of AraC protein. Mol. Gen. Genet. 145:97–100.
86. Wu, J., and B. Weiss. 1991. Two divergently transcribed genes, soxR and soxS, control a superoxide response regulon of Escherichia coli. J. Bacteriol. 173:2864–2871.
87. Zheng, L., R. Halberg, S. Roels, H. Ichikawa, L. Kroos, and R. Losick. 1992. Sporulation regulatory protein GerE from Bacillus subtilis binds to and can activate or repress transcription from promoters for mother-cell-specific genes. J. Mol. Biol. 226:1037–1050.