Reconstruction and Use of Microbial Metabolic Networks: the Core Escherichia coli Metabolic Model as an Educational Guide
JEFFREY D. ORTH, R. M. T. FLEMING, AND BERNHARD Ø. PALSSON*
[SECTION EDITOR: PETER KARP]
Posted February 01, 2010
Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093–0412
*Corresponding author. Mailing address: Department of Bioengineering, University of California, San Diego, 9500 Gilman Drive, Mail Code 0412, La Jolla, CA 92093–0412. Phone: (858) 534–5668.
Biochemical network reconstructions have become popular tools in systems biology (135). There are different types of reconstructions representing various types of biological networks (metabolic, regulatory, transcription/translation), although metabolic network reconstructions have been used the most extensively. Early reconstructions of Escherichia coli metabolism were small, containing only central metabolic reactions (185). Today, there are many genome-scale reconstructions available, based on all known metabolic genes in the annotated genome of an organism along with other data sources (49, 57, 127, 155). Metabolic network reconstructions are biochemically, genetically, and genomically (BiGG) structured databases of biochemical reactions and metabolites. They contain information such as exact reaction stoichiometry, reaction reversibility, and the relationships between genes, proteins, and reactions. Reconstructed networks serve as flexible BiGG knowledge bases (58), storing curated information in a useful format while allowing for content to be updated based on new research. Although many organisms have similar central metabolic networks, there can be differences even between two closely related organisms. Network reconstructions are therefore organism specific (146). For most applications of network reconstructions, it is necessary to convert the network into a mathematical model. Metabolic models are usually formulated as a stoichiometric matrix, while regulatory network models are often formulated as Boolean networks.
There have been many practical uses of network reconstructions. Some reconstructions have been used as tools to study bacterial evolution. The effects on metabolism of adding or removing genes from the network can be simulated, enabling studies of horizontal gene transfer (132), adaptation to new environments (131), and evolution to minimal genomes (133). Reconstructions can also be used for analysis of network properties. In these studies, methods have been developed to determine the interactions between different sets of reactions and compounds, improving our understanding of the organisms under investigation. Some examples include identification of alternate optimal network states (112), identification of sets of coupled reactions (23), and studies of the states of regulatory networks (7, 160). It has also been determined by simulating thousands of growth conditions that E. coli contains a set of common, high-flux backbone reactions (3). Network reconstructions have been extensively used to study the phenotypic behavior of wild-type and mutant strains under a variety of conditions, linking genotypes with phenotypes. These predictions have been verified by experimental studies (53). Such phenotypic simulations have allowed for the prediction of growth after genetic manipulations (167, 169), prediction of growth phenotypes after adaptive evolution (83), and prediction of essential genes (92). Another promising use of reconstructions is in the discovery of unknown biological features. By comparing experimental data such as growth phenotypes (150), metabolic flux measurements (77), or gene essentiality (107) with model-based predictions, missing content in reconstructions can be identified. The reconstructions can be updated and new biological knowledge can be elucidated. Finally, network reconstructions have proven to be very useful for metabolic engineering and synthetic biology (136). 
Because of the capacity of models to be used to predict growth and metabolite secretion phenotypes, it is possible to predict the genetic interventions most likely to produce a strain with the desired properties (63). Model-based algorithms can even predict nonintuitive designs that couple production of desired metabolites to cell growth (24, 139).
This chapter serves as an introduction to metabolic and regulatory network reconstructions and models, and gives a complete description of the core E. coli metabolic model. This model can be encoded and analyzed in any computational format (such as MATLAB or Mathematica) based on the information given in this chapter. The core E. coli model is a small-scale model that can be used for educational purposes. It is meant to be used by senior undergraduate and first-year graduate students learning about constraint-based modeling and systems biology. This model has enough reactions and pathways to enable interesting and insightful calculations, but it is also simple enough that the results of such calculations can be easily understood. This model is also useful for testing and evaluating new constraint-based analysis methods, since its small scope makes troubleshooting and interpretation of results easier.
The construction of a genome-scale metabolic network is a long-term process (from many months to several years, depending on the size of the network) consisting of four major steps, each step requiring the use of different types of biological data (Fig. 1). In the first step, an organism's annotated genome is used to generate a draft reconstruction. Second, the draft reconstruction is curated in a long process that involves the study of many highly specific data sources. In the third step, the reconstruction is converted to a mathematical model, and model-based simulations can be compared with phenotypic data. In the fourth and final step, high-throughput data can be integrated with the model, allowing for biological discovery and iterative model refinement. After the model is complete, it can be used for a variety of purposes (see “Uses of Metabolic Models,” below).
The first step in building any new genome-scale reconstruction, whether of metabolic or regulatory networks, is to use the genome annotation for the desired organism to generate an initial list of functions. Genomes have been annotated to some degree for hundreds of microbial organisms, and they provide several types of valuable information. First, they contain the genome sequence of an organism, from which open reading frames (ORFs) can be identified. The function of each ORF can then be determined through a variety of methods (150). The strongest evidence for the function of a particular ORF usually comes from direct biochemical analysis, such as isolation and characterization of the function of an enzyme. E. coli is a very well studied model organism and many of its ORFs have been experimentally characterized (95). Unfortunately, for many other organisms, few biochemical data are available. To identify the ORFs in these organisms (and also to identify uncharacterized ORFs in well studied organisms), their genomic sequences are compared with the genomes of other organisms to identify homologous genes. In silico methods can also be used for annotation, including methods that identify genes based on protein-protein interactions, transcriptomics, phylogenetic profiles, protein fusion, and operon clustering (173). These methods typically allow for 40 to 70% of the genes in a new genome to be annotated. When a high-quality genome annotation is not available for a particular organism, it becomes more challenging to build a reconstruction of that organism on a genome scale.
There are organism-specific databases for some genome annotations, including EcoCyc (95) for E. coli and the Saccharomyces Genome Database (SGD) (34) and Comprehensive Yeast Genome Database (CYGD) (73) for yeast. For many other microbial organisms (111), Comprehensive Microbial Resource (CMR) (142), Genome Reviews (174), and Integrated Microbial Genomes (IMG) (114) contain useful genomic information. All of the metabolic genes identified in the genome annotation for the desired organism can be assembled into an initial parts list. From this initial list of genes, an initial list of enzymes and the reactions they catalyze can be constructed by mapping each gene to one or more reactions according to information from a database. The data used for this mapping can be included in the genome annotations or it can be obtained from metabolic databases such as KEGG (94), BRENDA (28), ENZYME (5), MetaCyc (105), the SEED (47), or TransportDB (154). Most databases include EC (Enzyme Commission) numbers that can be used to easily identify the enzymes and reactions associated with a particular gene. With the appropriate databases, the reactions known to be associated with each gene can be identified. Some of these databases will be more useful for certain organisms than others. The process of building an initial reconstruction from a genome annotation and reaction information from databases can be performed manually, or it can be partially or fully automated. Tools available for building draft reconstructions include SimPheny (Genomatica, Inc., San Diego, CA), PathoLogic (134), and PRIAM (36). SimPheny is a commercial software platform for building and analyzing metabolic constraint-based models. It can download the annotated genome of an organism and provide a framework for manually associating metabolic genes with reactions. 
SimPheny is also useful in the other stages of building reconstructions because it contains tools for manual curation and model quality control and quality assurance, as well as tools for performing simulations and analyzing experimental data. PathoLogic, part of the Pathway Tools software system, is a tool for mapping genes to reactions in an automated manner. It requires a fully annotated genome, and uses EC numbers, Gene Ontology terms (4), or the annotated gene names to predict which reactions are associated with a particular gene. It can then predict which pathways are present in an organism by comparing the predicted set of reactions with a reference database such as MetaCyc. PRIAM is another automated method that identifies enzymes in any genome sequence. This program uses all of the known sequences for any individual enzyme in the ENZYME database to identify the characteristic sequence modules of that enzyme. Specific rules that can identify an enzyme based on which modules are present in a sequence are then formulated. PRIAM forms these modules and rules for every enzyme in the database, and then uses scoring matrices to identify modules in the genome of interest. It then uses the rules to predict which enzyme is associated with every gene in the genome. This algorithm can be very useful because it does not require a fully annotated genome. The result of the initial mapping process is a draft reconstruction that lists most of the metabolic genes and reactions in an organism with reasonable accuracy.
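The gene-to-reaction mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure used by SimPheny, PathoLogic, or PRIAM: the small annotation table, the EC-to-reaction lookup, the placeholder ORF `orfX`, and the function name `draft_reconstruction` are all assumptions made for the example.

```python
from collections import defaultdict

# Illustrative annotation (assumed data): gene -> EC numbers assigned to it.
ANNOTATION = {
    "pgi": ["5.3.1.9"],
    "pfkA": ["2.7.1.11"],
    "pfkB": ["2.7.1.11"],
    "eno": ["4.2.1.11"],
    "orfX": [],  # hypothetical ORF with no functional annotation
}

# Illustrative lookup (assumed data): EC number -> reaction abbreviations.
EC_TO_REACTIONS = {
    "5.3.1.9": ["PGI"],
    "2.7.1.11": ["PFK"],
    "4.2.1.11": ["ENO"],
}

def draft_reconstruction(annotation, ec_to_reactions):
    """Return (reaction -> sorted gene list, sorted list of unmapped genes)."""
    reaction_genes = defaultdict(set)
    unmapped = []
    for gene, ec_numbers in annotation.items():
        reactions = [r for ec in ec_numbers for r in ec_to_reactions.get(ec, [])]
        if not reactions:
            # No reaction could be assigned; a candidate for manual curation.
            unmapped.append(gene)
        for reaction in reactions:
            reaction_genes[reaction].add(gene)
    return {r: sorted(g) for r, g in reaction_genes.items()}, sorted(unmapped)

draft, unmapped = draft_reconstruction(ANNOTATION, EC_TO_REACTIONS)
print(draft)     # reactions with the genes mapped to each
print(unmapped)  # genes left without a reaction
```

Genes that end up in the unmapped list, and reactions supported by several genes, are the kinds of entries that would need attention during manual curation.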
The next step in the reconstruction process is to manually curate the initial reconstruction. Any reconstruction resulting from a fully automated procedure will be incomplete and, in some cases, incorrect. Some reactions will be missing because it is not known which genes encode their enzymes, leaving gaps in pathways. Other reactions may be mistakenly included because of incorrect genome annotation or nonspecific information in databases. Reactions in the reconstruction may have incorrect or unbalanced stoichiometries or cofactor usage, because these attributes are often unique to enzymes in specific organisms (146). Gene-protein-reaction associations (GPRs) must also be included to formally connect reactions to one or more functional proteins, and every protein to one or more known genes (149). To correct any mistakes and improve the reconstruction, a researcher must manually curate the list of reactions by using data from many different sources. Organism-specific textbooks and databases are very useful for this purpose. The genome-scale E. coli reconstruction iAF1260 (57) relied heavily on the textbook Escherichia coli and Salmonella: Cellular and Molecular Biology (122). Unfortunately, such texts are usually not available for less-well-studied organisms. Literature data, from both primary and review articles, are also extremely useful. These articles can contain useful and specific information about reaction stoichiometry and directionality and can indicate the presence of many reactions with unknown genes. Many different types of studies can be useful, including enzyme assays, gene knockout studies, metabolomic studies including flux measurements, and protein localization studies. The information from these sources often cannot be found in databases. The manual curation process is extremely labor intensive, usually requiring the study of hundreds of literature sources over a period of several months or years.
After a high-quality reconstruction has been assembled, it must be converted into a genome-scale constraint-based model to be further analyzed (135). A reconstruction is a BiGG knowledge base, a list of stoichiometrically balanced reactions and their associated genes and proteins. A model is a network in a mathematical format with defined system boundaries and constraints on the reactions (185). While a reconstruction is unique to an organism, many different models (e.g., condition specific) can be derived from a reconstruction. Metabolic network models are usually encoded in a stoichiometric matrix, in which each unique metabolite is represented by a row in the matrix, and each reaction is represented by a column. The entries in each column are the stoichiometric coefficients of the metabolites in a reaction, with negative coefficients for consumed metabolites and positive coefficients for produced metabolites. The properties of this matrix can be investigated through various constraint-based analysis methods, including flux balance analysis (54, 185, 186, 187). This method uses linear optimization to identify optimal reaction flux distributions of the network given a set of minimum and maximum reaction rates and an objective, such as maximum cellular growth. To simulate growth with a genome-scale metabolic model, a biomass reaction is needed. This is an organism-specific reaction that drains specified metabolites from the network, representing the metabolic precursors that contribute to biomass (185, 186). To construct a biomass reaction, the relative amounts of nucleic acids, lipids, proteins, and other macromolecules of an organism must be known. These macromolecules can then be broken down into building blocks such as nucleotides or amino acids. The relative amounts of each of these building blocks form the stoichiometric coefficients of the biomass reaction (see “Biomass Reaction,” below). 
Experimentally determined growth data must also be used to determine the amount of energy needed for growth and for non-growth-associated maintenance functions, representing the energy demands of the cell (57). Once the biomass reaction has been constructed, flux balance analysis (185, 186) can be used to predict optimal growth rates under many different conditions (187). A completed genome-scale metabolic model can be used to assess the quality of the reconstruction. There must be continuous pathways to every biomass precursor or the in silico cell cannot grow. Other gaps in the network can be identified by unreachable reactions or metabolites (50). Another common test of a new model is to compare growth simulations under different conditions with actual growth data (53).
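The optimization problem solved in flux balance analysis can be written compactly. The notation is an assumption of this sketch rather than taken from the text above: S is the stoichiometric matrix (m metabolites by n reactions), v is the vector of reaction fluxes, c selects the objective (for growth, 1 for the biomass reaction and 0 elsewhere), and v_min and v_max hold the minimum and maximum reaction rates. The equality constraint expresses the usual steady-state assumption that each metabolite is produced as fast as it is consumed.

```latex
\begin{aligned}
\max_{\mathbf{v}} \quad & \mathbf{c}^{\mathsf{T}} \mathbf{v} \\
\text{subject to} \quad & \mathbf{S}\,\mathbf{v} = \mathbf{0} \\
& v_{\min,j} \le v_j \le v_{\max,j}, \qquad j = 1, \dots, n
\end{aligned}
```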
Genome-scale metabolic models can be used to map many types of experimental data to a biological network, allowing for the integration of different data types. These data can be compared with model predictions, and the discrepancies can lead to discovery of new reactions and pathways. High-throughput screens for growth on different media conditions can be used to reveal previously unknown substrate uptake pathways. In vivo knockout screens and synthetic lethal screens can be compared with in silico-predicted knockout effects (Fig. 2), with discrepancies indicating the presence of alternative metabolic pathways (150). Metabolomic data can predict the presence of metabolites not accounted for in a reconstruction, necessitating the addition of new production and utilization pathways. Proteomic and transcriptomic data can be used to suggest the genes and proteins that fill gaps in a reconstruction (19). As new biological features and capabilities are discovered, the model can be improved by incorporating these new data. The updated model can then be used to probe different pathways and features, leading to an iterative cycle of discovery and model improvement.
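As a rough illustration of how knockout screens can be compared with model predictions, the sketch below sorts genes by whether the in silico and in vivo growth calls agree. The gene names, the growth calls, and the category labels are hypothetical and chosen only for the example.

```python
# Hypothetical knockout screen: True means the mutant grows.
predicted = {"geneA": True, "geneB": False, "geneC": False, "geneD": True}
observed = {"geneA": True, "geneB": True, "geneC": False, "geneD": False}

def classify_discrepancies(predicted, observed):
    """Group genes by agreement between model prediction and experiment."""
    result = {"agree": [], "grows_only_in_vivo": [], "grows_only_in_silico": []}
    for gene in sorted(predicted):
        if predicted[gene] == observed[gene]:
            result["agree"].append(gene)
        elif observed[gene]:
            # Growth seen experimentally but not predicted: possibly an
            # alternative pathway that is missing from the reconstruction.
            result["grows_only_in_vivo"].append(gene)
        else:
            # Growth predicted but not seen experimentally: the model may
            # allow a route that the cell does not actually use.
            result["grows_only_in_silico"].append(gene)
    return result

comparison = classify_discrepancies(predicted, observed)
print(comparison)
```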
The process of reconstructing transcriptional regulatory networks is not as well developed as the process for building metabolic reconstructions. To date, only a few examples of genome-scale regulatory reconstructions exist (16, 78, 193). The oldest of these is iMC1010, which contains the regulators of E. coli metabolism (41). This model was built by a process similar to that used for metabolic reconstructions, relying on a variety of data types including the genome annotation and literature sources. A gene expression study was then conducted in which gene expression under aerobic and anaerobic conditions was compared in several different transcription factor knockout strains. By analyzing the discrepancies between the experimental results and the model predictions, the model was iteratively updated and improved.
Several automated methods for inferring transcriptional regulatory networks have been developed recently (16, 56, 81, 166). Progress in high-throughput experimental methods is allowing for transcriptional regulatory reconstructions to be assembled in a top-down, automated manner (30, 31, 33). The connectivity of an organism's transcriptional regulatory network can be determined by performing ChIP-chip experiments (chromatin immunoprecipitation followed by microarray hybridization). In these studies, all of the DNA binding sites of a particular transcription factor on an entire genome under a particular set of conditions can be identified in vivo. First, proteins are fixed to genomic DNA in living cells, and then the DNA is extracted and sheared into small fragments. Next, the fragments with a particular transcription factor bound are filtered out by using antibodies, and the fragments are identified by hybridization to a DNA microarray. When ChIP-chip experiments are repeated under a variety of conditions, all of the binding sites of a transcription factor can be found, identifying all of the genes regulated by that transcription factor.
A set of ChIP-chip experiments must be run for every transcription factor to elucidate an entire regulatory network. As useful as these experiments are, they do not reveal what the effect of a transcription factor on its targets is, or how different transcription factors interact. The direct and indirect effects of transcription factors can be determined by expression profiling of strains with those transcription factor genes knocked out (84). Performing so many high-throughput experiments is a very expensive and time-consuming process, but improvements in parallel sequencing technologies should improve the effectiveness of these approaches. ChIP-seq experiments (chromatin immunoprecipitation followed by DNA sequencing) are an example of this trend (30, 31, 32, 70, 168).
As with metabolic reconstructions, regulatory reconstructions must be converted to computational models to utilize their predictive potential. Several different model types can be used. The most common modeling framework is the Boolean model, which represents the connections between genes or other variables as logical rules. Boolean models can qualitatively describe the functions of a regulatory network and make accurate predictions of behavior (42). An equivalent structure to the Boolean model is the “regulation matrix” (68). By reformulating a Boolean model as a matrix, more advanced mathematical analysis is possible, and all of the possible expression states of the model can be sampled (67). Newer regulatory network models are being formulated in structures similar to the stoichiometric matrix, allowing them to be interrogated with constraint-based analysis methods (179).
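A Boolean regulatory model of the kind described above can be sketched as a set of logical rules that are applied repeatedly until the expression state stops changing. The regulator, the three target genes, and the rules themselves are hypothetical, and the synchronous update scheme is one common choice assumed here for simplicity.

```python
# Hypothetical rules: each entry is ON (True) or OFF (False) depending on
# environmental signals and on the state of the regulator.
RULES = {
    "regulatorR": lambda s: s["oxygen"],                      # active only with oxygen
    "geneA": lambda s: s["regulatorR"],                       # activated by R
    "geneB": lambda s: not s["regulatorR"],                   # repressed by R
    "geneC": lambda s: s["glucose"] and not s["regulatorR"],  # needs glucose, repressed by R
}

def update(state, rules):
    """Apply every rule once to the current state (synchronous update)."""
    new_state = dict(state)
    for gene, rule in rules.items():
        new_state[gene] = bool(rule(state))
    return new_state

def steady_state(environment, rules, max_steps=20):
    """Iterate from an all-OFF start until the state no longer changes."""
    state = dict(environment)
    state.update({gene: False for gene in rules})
    for _ in range(max_steps):
        next_state = update(state, rules)
        if next_state == state:
            return state
        state = next_state
    raise RuntimeError("no fixed point reached")

aerobic = steady_state({"oxygen": True, "glucose": True}, RULES)
anaerobic = steady_state({"oxygen": False, "glucose": True}, RULES)
print(aerobic)
print(anaerobic)
```

The two calls differ only in the oxygen signal, so comparing the printed states shows which genes the hypothetical regulator switches between conditions.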
Here, the biochemistry of the reactions in the core E. coli metabolic model is summarized. An overview of the reactions is given in Fig. 3. Where possible, reactions are described in the context of their major functional pathway. Reactions occasionally participate in more than one pathway and, in such cases, this is highlighted. The E. coli core model is based on the first stoichiometric reconstruction of E. coli fueling pathways (185). The current model contains the reactions of glycolysis, the pentose phosphate shunt, the tricarboxylic acid cycle, the glyoxylate cycle, gluconeogenesis, anaplerotic reactions, the electron transport chain and oxidative phosphorylation, the transfer of reducing equivalents, fermentation, and nitrogen metabolism.
In this model, metabolites and reactions are given both full names and abbreviations. Metabolite abbreviations are lowercase, and extracellular metabolites are denoted with the suffix “[e],” e.g., extracellular acetate is abbreviated “ac[e]”. This reconstruction does not distinguish between the periplasmic space and the extracellular medium. All metabolites that are not denoted as extracellular are cytosolic. In many reconstructions, cytosolic metabolites use the suffix “[c],” but here, this is omitted for clarity. In the figures describing the metabolic network, such as Fig. 3, cytosolic metabolites are represented by orange circles and extracellular metabolites are represented by yellow circles. Reaction abbreviations are uppercase and italicized. For example, acetaldehyde dehydrogenase (acetylating) is abbreviated as ACALD. There are several common suffixes used in the reaction abbreviations, including “abc” (ATP-binding cassette transporter), “i” (irreversible), “r” (reversible), and “t” (transport). Most reactions in the reconstruction are named after the enzymes that catalyze them. In this section, the text often uses the reaction abbreviations to refer to the enzymes catalyzing the reactions.
In the figures, metabolic reactions are represented as blue arrows, and the reaction abbreviations are inside blue boxes with yellow outlines. Certain reactions are assumed to be effectively irreversible when thermodynamic considerations are taken into account (1, 57). In brief, if the in vivo change in Gibbs energy for a biochemical reaction is highly negative, then it is assumed that net flux is always in the forward direction. In all figures, effectively irreversible reactions are denoted with arrowheads in one direction only. The tables associated with each pathway also indicate reactions that are effectively irreversible with the symbol → in the reaction equation. Reversible reactions use the symbol ↔ in the reaction equation.
GPRs are shown adjacent to each reaction. The reconstruction associates each reaction with an enzyme or an enzyme complex. Some reactions may be catalyzed by more than one enzyme and this distinction is also represented (Fig. 4). A few reactions are known to occur in E. coli, but the corresponding gene has yet to be identified. These orphan reactions are discussed further in “Discovery.” Each protein is associated with a gene name and genomic locus. The genomic locus ties a reaction to one particular unique nucleotide sequence in the 4,639,675-base-pair E. coli K-12 MG1655 circular chromosome (157). The functional genomic structure, such as operon structure or transcriptional units, is not specifically represented in the E. coli core reconstruction but is represented in more comprehensive models of E. coli (179). The charge of each metabolite is included in the core model to determine the proper chemical formula in E. coli. These charges were determined by using the pKa of each metabolite at a pH of 7.2 (57).
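One way to make a GPR computable is to store it as a nested logical expression, with "and" joining the subunits of an enzyme complex and "or" joining alternative enzymes. The sketch below assumes this tuple-based representation and uses hypothetical gene names; it is not the format used in the core model files.

```python
def gpr_active(rule, knocked_out):
    """Evaluate a GPR rule given a set of knocked-out genes.

    A rule is either a gene name (str) or a tuple of the form
    ("and" | "or", rule, rule, ...).
    """
    if isinstance(rule, str):
        return rule not in knocked_out
    operator, *operands = rule
    results = [gpr_active(operand, knocked_out) for operand in operands]
    if operator == "and":
        return all(results)  # every subunit of the complex is required
    if operator == "or":
        return any(results)  # any one alternative enzyme is sufficient
    raise ValueError(f"unknown operator: {operator}")

# Hypothetical reaction catalyzed either by a two-subunit complex
# (geneA and geneB) or by a single alternative enzyme (geneC).
RULE = ("or", ("and", "geneA", "geneB"), "geneC")

for knockout in [set(), {"geneA"}, {"geneC"}, {"geneA", "geneC"}]:
    print(sorted(knockout), gpr_active(RULE, knockout))
```

Under this rule the reaction is lost only when the complex and the alternative enzyme are disrupted together, which is why single-gene knockouts of reactions with redundant enzymes often show no phenotype in simulations.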
The reactions and pathways in the core model were chosen to represent the most well known and widely studied metabolic pathways of E. coli. These pathways are often the subjects of textbook chapters and should be familiar to most readers with a basic biochemistry background. As much as possible, the reactions and GPRs were taken directly from the iAF1260 genome-scale E. coli reconstruction (57). Some pathways, such as the electron transport system, were greatly simplified to limit the scope of the model and ensure that every reaction is understandable. This model contains a total of 72 metabolites and 95 reactions. There are 20 extracellular metabolites and 52 intracellular metabolites, with a total of 54 unique metabolites (most extracellular metabolites are just extracellular versions of intracellular metabolites). There are 20 exchange reactions, one for each extracellular metabolite. The model also has 25 transport reactions, 49 metabolic reactions, and one biomass reaction (see “Biomass Reaction,” below).
One of the most important and widely studied metabolic pathways is glycolysis, a series of ten chemical reactions that convert one 6-carbon glucose molecule into two 3-carbon pyruvate molecules (see Table 1, Table 2, and Fig. 5). In these reactions, a net two molecules of adenosine triphosphate (ATP) are produced by substrate-level phosphorylation, and two molecules of reduced nicotinamide adenine dinucleotide (NADH) are also produced. In addition to producing these energy and redox carriers, glycolysis also produces several compounds that are precursors for E. coli biomass (see “Biomass Reaction,” below).
TABLE 1. Glycolysis reactions
| Abbr. | Reaction | Equation |
| GLCpts | D-Glucose transport via PEP:Pyr PTS | glc-D[e] + pep → g6p + pyr |
| PGI | Glucose-6-phosphate isomerase | g6p ↔ f6p |
| FRUpts2 | Fructose transport via PEP:Pyr PTS (f6p generating) | fru[e] + pep → f6p + pyr |
| PFK | Phosphofructokinase | atp + f6p → adp + fdp + h |
| FBP | Fructose-bisphosphatase | fdp + h2o → f6p + pi |
| FBA | Fructose-bisphosphate aldolase | fdp ↔ dhap + g3p |
| TPI | Triose-phosphate isomerase | dhap ↔ g3p |
| GAPD | Glyceraldehyde-3-phosphate dehydrogenase | g3p + nad + pi ↔ 13dpg + h + nadh |
| PGK | Phosphoglycerate kinase | 3pg + atp ↔ 13dpg + adp |
| PGM | Phosphoglycerate mutase | 2pg ↔ 3pg |
| ENO | Enolase | 2pg ↔ h2o + pep |
| PYK | Pyruvate kinase | adp + h + pep → atp + pyr |
| PPS | Phosphoenolpyruvate synthase | atp + h2o + pyr → amp + 2 h + pep + pi |
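Equations written in this style can be turned into columns of a stoichiometric matrix mechanically. The sketch below parses three of the equations from Table 1, assigns negative coefficients to consumed metabolites and positive coefficients to produced ones, and derives flux bounds from the arrow type. The function names and the bound magnitude of 1000 are assumptions made for the example, not values taken from the model.

```python
import re

# Three equations copied from Table 1.
EQUATIONS = {
    "PFK": "atp + f6p → adp + fdp + h",
    "FBA": "fdp ↔ dhap + g3p",
    "PPS": "atp + h2o + pyr → amp + 2 h + pep + pi",
}

def parse_side(side):
    """Parse 'a + 2 b' into {'a': 1.0, 'b': 2.0}."""
    coefficients = {}
    for term in side.split(" + "):
        # An optional numeric coefficient, a space, then the metabolite name.
        match = re.fullmatch(r"(?:(\d+(?:\.\d+)?) )?(\S+)", term.strip())
        value = float(match.group(1)) if match.group(1) else 1.0
        name = match.group(2)
        coefficients[name] = coefficients.get(name, 0.0) + value
    return coefficients

def parse_reaction(equation):
    """Return (stoichiometry dict, reversible flag) for one equation."""
    reversible = "↔" in equation
    left, right = re.split(r"\s*[→↔]\s*", equation)
    stoichiometry = {m: -c for m, c in parse_side(left).items()}
    for metabolite, coefficient in parse_side(right).items():
        stoichiometry[metabolite] = stoichiometry.get(metabolite, 0.0) + coefficient
    return stoichiometry, reversible

def build_matrix(equations, bound=1000.0):
    """Return metabolites (rows), reactions (columns), matrix, and flux bounds."""
    reactions = list(equations)
    parsed = {r: parse_reaction(eq) for r, eq in equations.items()}
    metabolites = sorted({m for stoich, _ in parsed.values() for m in stoich})
    matrix = [[parsed[r][0].get(m, 0.0) for r in reactions] for m in metabolites]
    # Irreversible reactions get a lower bound of zero (assumed convention).
    bounds = {r: ((-bound if parsed[r][1] else 0.0), bound) for r in reactions}
    return metabolites, reactions, matrix, bounds

metabolites, reactions, matrix, bounds = build_matrix(EQUATIONS)
for name, row in zip(metabolites, matrix):
    print(f"{name:>5}", row)
print(bounds)
```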
TABLE 2. Glycolysis metabolites
| Abbr. | Metabolite | Formula | Charge |
| glc-D | D-Glucose | C6H12O6 | 0 |
| g6p | D-Glucose 6-phosphate | C6H11O9P | −2 |
| fru | D-Fructose | C6H12O6 | 0 |
| f6p | D-Fructose 6-phosphate | C6H11O9P | −2 |
| fdp | D-Fructose 1,6-bisphosphate | C6H10O12P2 | −4 |
| dhap | Dihydroxyacetone phosphate | C3H5O6P | −2 |
| g3p | Glyceraldehyde 3-phosphate | C3H5O6P | −2 |
| 13dpg | 3-Phospho-D-glyceroyl-phosphate | C3H4O10P2 | −4 |
| 3pg | 3-Phospho-D-glycerate | C3H4O7P | −3 |
| 2pg | D-Glycerate-2-phosphate | C3H4O7P | −3 |
| pep | Phosphoenolpyruvate | C3H2O6P | −3 |
| pyr | Pyruvate | C3H3O3 | −1 |
| h | H+ | H | 1 |
| h2o | H2O | H2O | 0 |
| amp | Adenosine monophosphate | C10H12N5O7P | −2 |
| adp | Adenosine diphosphate | C10H12N5O10P2 | −3 |
| atp | Adenosine triphosphate | C10H12N5O13P3 | −4 |
| pi | Phosphate | HO4P | −2 |
| nad | Nicotinamide adenine dinucleotide (NAD+) | C21H26N7O14P2 | −1 |
| nadh | Nicotinamide adenine dinucleotide-reduced | C21H27N7O14P2 | −2 |
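Because every metabolite carries a formula and a charge, each reaction can be checked for elemental and charge balance. The sketch below does this for three glycolysis reactions (GAPD, ENO, and FBA), using formulas and charges copied from Table 2 and stoichiometries from Table 1. The function names are assumptions of the example, and the formula parser handles only simple formulas without parentheses.

```python
import re
from collections import Counter

# (formula, charge) pairs copied from Table 2.
METABOLITES = {
    "g3p": ("C3H5O6P", -2),
    "dhap": ("C3H5O6P", -2),
    "fdp": ("C6H10O12P2", -4),
    "13dpg": ("C3H4O10P2", -4),
    "2pg": ("C3H4O7P", -3),
    "pep": ("C3H2O6P", -3),
    "h2o": ("H2O", 0),
    "h": ("H", 1),
    "pi": ("HO4P", -2),
    "nad": ("C21H26N7O14P2", -1),
    "nadh": ("C21H27N7O14P2", -2),
}

# Stoichiometries from Table 1 (negative = consumed, positive = produced).
GAPD = {"g3p": -1, "nad": -1, "pi": -1, "13dpg": 1, "h": 1, "nadh": 1}
ENO = {"2pg": -1, "h2o": 1, "pep": 1}
FBA = {"fdp": -1, "dhap": 1, "g3p": 1}

def element_counts(formula):
    """Count atoms in a simple formula such as 'C3H5O6P'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def imbalance(stoichiometry):
    """Return (net element counts, net charge); ({}, 0) means balanced."""
    elements = Counter()
    charge = 0
    for metabolite, coefficient in stoichiometry.items():
        formula, metabolite_charge = METABOLITES[metabolite]
        for element, count in element_counts(formula).items():
            elements[element] += coefficient * count
        charge += coefficient * metabolite_charge
    return {e: n for e, n in elements.items() if n != 0}, charge

for name, reaction in [("GAPD", GAPD), ("ENO", ENO), ("FBA", FBA)]:
    print(name, imbalance(reaction))

# Deliberately dropping water from ENO shows what an imbalance looks like.
print("ENO without h2o", imbalance({"2pg": -1, "pep": 1}))
```

A check of this kind is a quick way to catch transcription errors in formulas, charges, or coefficients during curation.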
Glycolysis begins at the phosphoenolpyruvate:pyruvate phosphotransferase protein complex, which actively translocates hexoses across the inner cytoplasmic membrane (175). Certain proteins of the phosphoenolpyruvate:pyruvate phosphotransferase complex (PEP:Pyr PTS) are carbohydrate specific, but in each case, transport is driven by transfer of the phosphate group of phosphoenolpyruvate to the carbohydrate. In the reaction D-glucose transport via PEP:Pyr PTS, GLCpts, phosphoenolpyruvate donates the phosphate group to glucose to form D-glucose 6-phosphate, and the dephosphorylated remainder of phosphoenolpyruvate is pyruvate (153). The same general procedure generates D-fructose 6-phosphate from fructose in the reaction fructose transport via PEP:Pyr PTS (f6p generating), FRUpts2 (52). The interconversion of D-glucose 6-phosphate and D-fructose 6-phosphate is catalyzed by glucose-6-phosphate isomerase, PGI (65). Phosphofructokinase, PFK, catalyzes the transfer of a phosphate group from ATP to D-fructose 6-phosphate to form D-fructose 1,6-bisphosphate and adenosine diphosphate (ADP) (45, 158). This reaction is effectively thermodynamically irreversible. However, in the reverse direction, the dephosphorylation of D-fructose 1,6-bisphosphate to form D-fructose 6-phosphate is catalyzed by fructose-bisphosphatase, FBP (79). The role of this reaction is discussed further in “Glyoxylate Cycle, Gluconeogenesis, and Anaplerotic Reactions,” below.
Fructose-bisphosphate aldolase, FBA, splits the 6-carbon D-fructose 1,6-bisphosphate into two 3-carbon molecules, dihydroxyacetone phosphate and glyceraldehyde 3-phosphate (2, 6, 79, 181). Triose-phosphate isomerase, TPI, rapidly and reversibly structurally rearranges dihydroxyacetone phosphate to glyceraldehyde 3-phosphate (144). A linear sequence of four reversible reactions catalyzed by glyceraldehyde-3-phosphate dehydrogenase, GAPD (18), phosphoglycerate kinase, PGK (124), phosphoglycerate mutase, PGM (64), and enolase, ENO (106), converts glyceraldehyde 3-phosphate to phosphoenolpyruvate. This sequence of reactions also reduces one NAD+ to form NADH in the glyceraldehyde-3-phosphate dehydrogenase reaction and yields one high-energy currency metabolite, ATP, in the phosphoglycerate kinase reaction. In the final step of glycolysis, pyruvate kinase, PYK, catalyzes the transfer of the phosphate group of phosphoenolpyruvate to ADP resulting in the production of pyruvate and ATP (66, 119). A cursory glance at Fig. 3 reveals that pyruvate is an important precursor involved in many pathways. It can be converted into acetyl-CoA, which provides carbon for the tricarboxylic acid cycle, as discussed in “Tricarboxylic Acid Cycle.” It can also be converted into lactate as part of fermentation, discussed in “Fermentation.”
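The net yield of two ATP and two NADH per glucose can be verified by weighting each reaction by its flux and summing. In the sketch below the stoichiometries follow Table 1, while the flux values are an assumption chosen to carry one glucose to two pyruvate: PGK and PGM run opposite to the direction written in the table (negative flux), and PYK carries a flux of only 1 because the other phosphoenolpyruvate is consumed by GLCpts.

```python
from collections import defaultdict

# Stoichiometry of the ten Table 1 reactions that take glucose to pyruvate,
# in the direction written in the table (negative = consumed).
REACTIONS = {
    "GLCpts": {"glc-D[e]": -1, "pep": -1, "g6p": 1, "pyr": 1},
    "PGI": {"g6p": -1, "f6p": 1},
    "PFK": {"atp": -1, "f6p": -1, "adp": 1, "fdp": 1, "h": 1},
    "FBA": {"fdp": -1, "dhap": 1, "g3p": 1},
    "TPI": {"dhap": -1, "g3p": 1},
    "GAPD": {"g3p": -1, "nad": -1, "pi": -1, "13dpg": 1, "h": 1, "nadh": 1},
    "PGK": {"3pg": -1, "atp": -1, "13dpg": 1, "adp": 1},
    "PGM": {"2pg": -1, "3pg": 1},
    "ENO": {"2pg": -1, "h2o": 1, "pep": 1},
    "PYK": {"adp": -1, "h": -1, "pep": -1, "atp": 1, "pyr": 1},
}

# Assumed fluxes per glucose; a negative value means the reaction runs
# opposite to the direction written in Table 1.
FLUXES = {
    "GLCpts": 1, "PGI": 1, "PFK": 1, "FBA": 1, "TPI": 1,
    "GAPD": 2, "PGK": -2, "PGM": -2, "ENO": 2, "PYK": 1,
}

def net_conversion(reactions, fluxes):
    """Sum flux-weighted stoichiometries; intermediates cancel to zero."""
    net = defaultdict(int)
    for name, flux in fluxes.items():
        for metabolite, coefficient in reactions[name].items():
            net[metabolite] += coefficient * flux
    return {m: v for m, v in net.items() if v != 0}

net = net_conversion(REACTIONS, FLUXES)
print(net)
```

All pathway intermediates cancel, leaving only glucose, the cofactors, and the products, which corresponds to the overall conversion glc-D[e] + 2 adp + 2 nad + 2 pi → 2 pyr + 2 atp + 2 nadh + 2 h + 2 h2o.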
The primary function of the pentose phosphate shunt is to provide the 5-carbon and 4-carbon biosynthetic precursors α-D-ribose 5-phosphate and D-erythrose 4-phosphate. α-D-Ribose 5-phosphate and D-erythrose 4-phosphate can be produced by either of two parallel pathways, the decarboxylating oxidative pathway or the nonoxidative pathway (see Table 3, Table 4, and Fig. 6). Under anaerobic conditions there is a greater flux through the nonoxidative pathway than through the oxidative pathway (90, 91).
TABLE 3. Pentose phosphate shunt reactions
| Abbr. | Reaction | Equation |
| G6PDH2r | Glucose-6-phosphate dehydrogenase | g6p + nadp ↔ 6pgl + h + nadph |
| PGL | 6-Phosphogluconolactonase | 6pgl + h2o → 6pgc + h |
| GND | Phosphogluconate dehydrogenase | 6pgc + nadp → co2 + nadph + ru5p-D |
| RPI | Ribose-5-phosphate isomerase | r5p ↔ ru5p-D |
| TKT2 | Transketolase | e4p + xu5p-D ↔ f6p + g3p |
| TALA | Transaldolase | g3p + s7p ↔ e4p + f6p |
| TKT1 | Transketolase | r5p + xu5p-D ↔ g3p + s7p |
| RPE | Ribulose-5-phosphate 3-epimerase | ru5p-D ↔ xu5p-D |
TABLE 4. Pentose phosphate metabolites
| Abbr. | Metabolite | Formula | Charge |
| 6pgl | 6-Phospho-D-glucono-1,5-lactone | C6H9O9P | −2 |
| 6pgc | 6-Phospho-D-gluconate | C6H10O10P | −3 |
| ru5p-D | D-Ribulose 5-phosphate | C5H9O8P | −2 |
| r5p | α-D-Ribose 5-phosphate | C5H9O8P | −2 |
| f6p | D-Fructose 6-phosphate | C6H11O9P | −2 |
| g3p | Glyceraldehyde 3-phosphate | C3H5O6P | −2 |
| xu5p-D | D-Xylulose 5-phosphate | C5H9O8P | −2 |
| s7p | Sedoheptulose 7-phosphate | C7H13O10P | −2 |
| e4p | D-Erythrose 4-phosphate | C4H7O7P | −2 |
| nadp | Nicotinamide adenine dinucleotide phosphate (NADP+) | C21H25N7O17P3 | −3 |
| nadph | Nicotinamide adenine dinucleotide phosphate-reduced | C21H26N7O17P3 | −4 |
The decarboxylating oxidative pathway is effectively irreversible. It consumes D-glucose 6-phosphate and, after three reactions catalyzed by glucose-6-phosphate dehydrogenase, G6PDH2r (143), 6-phosphogluconolactonase, PGL (180), and phosphogluconate dehydrogenase, GND (188), produces D-ribulose 5-phosphate. The first and third reactions in this pathway each reduce one nicotinamide adenine dinucleotide phosphate (NADP+) to NADPH. D-Ribulose 5-phosphate is then reversibly isomerized to α-D-ribose 5-phosphate by ribose-5-phosphate isomerase, RPI (55). The oxidative branch of the pentose phosphate shunt is important for production of reducing power in the form of NADPH. However, the pentose phosphate shunt is not the only source of NADPH (43). The reactions catalyzed by NAD(P) transhydrogenase, THD2, isocitrate dehydrogenase (NADP), ICDHyr, and malic enzyme (NADP), ME2, can also supply E. coli with NADPH.
The nonoxidative reversible rearrangement of the glycolytic sugar monophosphates to the pentose phosphate shunt sugar monophosphates is a simple mechanism for creating these precursors (115), but it does not contribute to reducing power. This rearrangement requires three steps, as illustrated in Fig. 6. First, transketolase, TKT2 (85, 172), catalyzes the conversion of a 6-carbon compound, D-fructose 6-phosphate, plus a 3-carbon compound, glyceraldehyde 3-phosphate, into a 5-carbon compound, D-xylulose 5-phosphate, plus a 4-carbon precursor, D-erythrose 4-phosphate. Then transaldolase, TALA, catalyzes the conversion of this 4-carbon compound plus another 6-carbon D-fructose 6-phosphate, into a 7-carbon, sedoheptulose 7-phosphate, plus another molecule of 3-carbon glyceraldehyde 3-phosphate. Next, the multifunctional enzyme transketolase, TKT1, catalyzes the conversion of this 7-carbon plus this 3-carbon into two different 5-carbon compounds, D-xylulose 5-phosphate and the precursor α-D-ribose 5-phosphate. In addition to ribose-5-phosphate isomerase, ribulose-5-phosphate 3-epimerase, RPE (110), provides another reversible catalytic link between the oxidative and nonoxidative branches since it interconverts D-xylulose 5-phosphate and D-ribulose 5-phosphate.
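The carbon bookkeeping described above (6 + 3 → 5 + 4, then 4 + 6 → 7 + 3, then 7 + 3 → 5 + 5) can be checked mechanically. The following is a minimal sketch, not part of the original model distribution, that uses only the formulas and charges listed in Table 4 and the equations in Table 3. The helper names (`parse_formula`, `side_totals`, `is_balanced`) are illustrative assumptions.

```python
import re
from collections import Counter

# Formulas and charges copied from Table 4.
METABOLITES = {
    "r5p":    ("C5H9O8P", -2),
    "ru5p-D": ("C5H9O8P", -2),
    "xu5p-D": ("C5H9O8P", -2),
    "e4p":    ("C4H7O7P", -2),
    "f6p":    ("C6H11O9P", -2),
    "g3p":    ("C3H5O6P", -2),
    "s7p":    ("C7H13O10P", -2),
}

# Nonoxidative reactions from Table 3 as (left side, right side).
REACTIONS = {
    "TKT2": (["e4p", "xu5p-D"], ["f6p", "g3p"]),
    "TALA": (["g3p", "s7p"], ["e4p", "f6p"]),
    "TKT1": (["r5p", "xu5p-D"], ["g3p", "s7p"]),
    "RPE":  (["ru5p-D"], ["xu5p-D"]),
    "RPI":  (["r5p"], ["ru5p-D"]),
}

def parse_formula(formula):
    """Return element counts for a simple formula such as 'C5H9O8P'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def side_totals(names):
    """Sum element counts and charge over one side of a reaction."""
    elements, charge = Counter(), 0
    for name in names:
        formula, q = METABOLITES[name]
        elements += parse_formula(formula)
        charge += q
    return elements, charge

def is_balanced(reaction_id):
    """True when both sides carry the same elements and the same charge."""
    left, right = REACTIONS[reaction_id]
    return side_totals(left) == side_totals(right)

balanced = {rid: is_balanced(rid) for rid in REACTIONS}
for rid, ok in balanced.items():
    print(rid, "balanced" if ok else "NOT balanced")
```

Because every metabolite in these five reactions carries a charge of −2 and the isomerases interconvert compounds with identical formulas, all five reactions should pass both the elemental and the charge check.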
The tricarboxylic acid (TCA) cycle is a well-studied pathway with a variety of functions depending on the environment (Table 5, Table 6, and Fig. 7). During aerobic growth on 6-carbon sugars such as glucose, the TCA cycle functions to create the precursors oxaloacetate, 2-oxoglutarate (also commonly called α-ketoglutarate), and succinyl-CoA. The aerobic production of biomass precursors is carried out primarily by the oxidative arm of the TCA cycle, from oxaloacetate to 2-oxoglutarate. This is the counterclockwise, lower part of the cycle in Fig. 7. The full TCA cycle, continuing counterclockwise in Fig. 7, can totally oxidize acetyl-CoA. Cycle intermediates are still required as biosynthetic precursors, so flux from anaplerotic pathways is required to maintain the pool of dicarboxylic intermediates (“Glyoxylate Cycle, Gluconeogenesis, and Anaplerotic Reactions,” below). Under anaerobic conditions, the TCA cycle functions not as a cycle but as two separate pathways. The oxidative pathway, the counterclockwise lower part of the cycle in Fig. 7, forms the precursor 2-oxoglutarate. The reductive pathway, the clockwise upper part of the cycle in Fig. 7, forms the precursor succinyl-CoA. E. coli can grow in an environment where the only carbon substrate is one of the TCA cycle intermediates. This is enabled by proton symport transport reactions that translocate either 2-oxoglutarate, succinate, fumarate, or L-malate into the cell.
TABLE 5. Tricarboxylic acid cycle reactions
| Abbr. | Reaction | Equation |
| CS | Citrate synthase | accoa + h2o + oaa → cit + coa + h |
| ACONTa | Aconitase (half-reaction A, citrate hydro-lyase) | cit ↔ acon-C + h2o |
| ACONTb | Aconitase (half-reaction B, isocitrate hydro-lyase) | acon-C + h2o ↔ icit |
| ICDHyr | Isocitrate dehydrogenase (NADP) | icit + nadp ↔ akg + co2 + nadph |
| AKGDH | 2-Oxoglutarate dehydrogenase | akg + coa + nad → co2 + nadh + succoa |
| SUCOAS | Succinyl-CoA synthetase (ADP-forming) | atp + coa + succ ↔ adp + pi + succoa |
| FRD7 | Fumarate reductase | fum + q8h2 → q8 + succ |
| SUCDi | Succinate dehydrogenase (irreversible) | q8 + succ → fum + q8h2 |
| FUM | Fumarase | fum + h2o ↔ mal-L |
| MDH | Malate dehydrogenase | mal-L + nad ↔ h + nadh + oaa |
| AKGt2r | 2-Oxoglutarate reversible transport via symport | akg[e] + h[e] ↔ akg + h |
| SUCCt3 | Succinate transport out via proton antiport | h[e] + succ → h + succ[e] |
| SUCCt2_2 | Succinate transport via proton symport (2 H) | 2 h[e] + succ[e] → 2 h + succ |
| FUMt2_2 | Fumarate transport via proton symport (2 H) | fum[e] + 2 h[e] → fum + 2 h |
| MALt2_2 | Malate transport via proton symport (2 H) | 2 h[e] + mal-L[e] → 2 h + mal-L |
TABLE 6. Tricarboxylic acid cycle metabolites
| Abbr. | Metabolite | Formula | Charge |
| accoa | Acetyl-CoA | C23H34N7O17P3S | −4 |
| cit | Citrate | C6H5O7 | −3 |
| acon-C | cis-aconitate | C6H3O6 | −3 |
| icit | Isocitrate | C6H5O7 | −3 |
| akg | 2-Oxoglutarate | C5H4O5 | −2 |
| succoa | Succinyl-CoA | C25H35N7O19P3S | −5 |
| fum | Fumarate | C4H2O4 | −2 |
| mal-L | L-Malate | C4H4O5 | −2 |
| oaa | Oxaloacetate | C4H2O5 | −2 |
| coa | Coenzyme A | C21H32N7O16P3S | −4 |
| q8 | Ubiquinone-8 | C49H74O4 | 0 |
| q8h2 | Ubiquinol-8 | C49H76O4 | 0 |
Pyruvate dehydrogenase, PDH (see Fig. 10), catalyzes the synthesis of acetyl-CoA from pyruvate and coenzyme A with concomitant reduction of NAD+ to NADH. This reaction is carried out by a large multienzyme complex containing 12 AceE dimers, 6 Lpd dimers, and a 24-AceF core (152). Citrate synthase, CS (125), catalyzes the condensation reaction of the 2-carbon acetate residue from acetyl-CoA and a molecule of 4-carbon oxaloacetate to form the 6-carbon compound citrate, with the release of coenzyme A. Next, the reactions aconitase A, ACONTa, and aconitase B, ACONTb, isomerize citrate to isocitrate via the metabolite cis-aconitate (21, 147). Isocitrate sits at a branch point in the TCA cycle, where carbon flux either continues in the oxidative branch to the reaction catalyzed by isocitrate dehydrogenase, ICDHyr (25), or is diverted into the glyoxylate cycle by isocitrate lyase, ICL. The glyoxylate cycle is discussed further in “Glyoxylate Cycle, Gluconeogenesis, and Anaplerotic Reactions,” below.
Isocitrate dehydrogenase catalyzes the decarboxylation of isocitrate, producing 2-oxoglutarate and CO2 while reducing NADP+ to NADPH. 2-Oxoglutarate provides a carbon backbone for synthesis of glutamate and glutamine, the central metabolites in nitrogen metabolism, discussed in “Nitrogen Metabolism.” 2-Oxoglutarate dehydrogenase, AKGDH, catalyzes the decarboxylation of 2-oxoglutarate, producing CO2, reducing NAD+ to NADH, and transferring coenzyme A to the decarboxylated carbon compound to form succinyl-CoA. 2-Oxoglutarate dehydrogenase is a large enzyme complex of the same family as pyruvate dehydrogenase, containing 12 SucA enzymes, an Lpd dimer, and 24 SucB enzymes (141).
During aerobic growth, the TCA cycle continues counterclockwise from succinyl-CoA in Fig. 7. Succinyl-CoA synthetase, SUCOAS, generates ATP by separating succinate and coenzyme A (20). Succinate dehydrogenase, SUCDi, is a multiprotein enzyme complex that straddles the cytoplasmic membrane allowing it to couple the TCA cycle to the electron transport chain (195), discussed in “Electron Transport Chain, Oxidative Phosphorylation, and Transfer of Reducing Equivalents.” The succinate dehydrogenase complex catalyzes the irreversible oxidation of succinate to fumarate while reducing ubiquinone-8, q8, to ubiquinol-8, q8h2 (38). Ubiquinol-8 is then released from the enzyme complex and is free to diffuse through the cytoplasmic membrane to interact with subsequent enzymes of the electron transport chain. After succinate is converted to fumarate, fumarase, FUM, reversibly catalyzes the conversion of fumarate and water into L-malate (11, 62, 192). Finally, to complete the TCA cycle, malate dehydrogenase, MDH, reversibly catalyzes the conversion of malate into oxaloacetate while reducing NAD+ to NADH (176). When this set of reactions is used in reverse, clockwise in Fig. 7, as the reductive pathway of the TCA cycle, a reversing reaction is catalyzed by fumarate reductase, FRD7 (37). In the model, this reaction oxidizes ubiquinol-8 to ubiquinone-8, although in actual E. coli, fumarate reductase oxidizes the electron carrier menaquinol-8 instead (87). This reaction had to be included in an unrealistic form because the simplified electron transport system in the model includes only ubiquinone-8/ubiquinol-8 as an electron carrier.
When growing on some substrates, the glyoxylate cycle is used instead of the TCA cycle because it bypasses the reactions that lose carbon in the form of CO2. The glyoxylate cycle consists of some of the reactions in the TCA cycle as well as other reactions used only by the glyoxylate cycle (see Table 7, Table 8, and Fig. 8). It overlaps with the TCA cycle from the incorporation of acetyl-CoA to the production of isocitrate at the aforementioned branch point in the oxidative arm of the TCA cycle. Isocitrate lyase, ICL, catalyzes the cleavage of 6-carbon isocitrate into 4-carbon succinate and 2-carbon glyoxylate (80). Malate synthase, MALS, then catalyzes the condensation of glyoxylate with another acetyl-CoA, yielding malate (118). Succinate generated in the first step can continue along the TCA cycle to eventually form oxaloacetate.
TABLE 7. Glyoxylate cycle, anaplerotic reactions, and gluconeogenesis reactions
| Abbr. | Reaction | Equation |
| ICL | Isocitrate lyase | icit → glx + succ |
| MALS | Malate synthase | accoa + glx + h2o → coa + h + mal-L |
| ME1 | Malic enzyme (NAD) | mal-L + nad → co2 + nadh + pyr |
| ME2 | Malic enzyme (NADP) | mal-L + nadp → co2 + nadph + pyr |
| PPS | Phosphoenolpyruvate synthase | atp + h2o + pyr → amp + 2 h + pep + pi |
| PPCK | Phosphoenolpyruvate carboxykinase | atp + oaa → adp + co2 + pep |
| PPC | Phosphoenolpyruvate carboxylase | co2 + h2o + pep → h + oaa + pi |
TABLE 8. Glyoxylate metabolite
| Abbr. | Metabolite | Formula | Charge |
| glx | Glyoxylate | C2HO3 | −1 |
In addition to growing on hexoses or pentoses, E. coli can also grow on 2-, 3-, or 4-carbon sources, such as lactate or malate, but in this situation, some 6-carbon metabolites in the glycolytic pathway are still required as precursors for biomass components. With only 2-, 3-, or 4-carbon sources in the environment, the glycolytic pathway can actually be reversed to produce net flux from pyruvate to glucose 6-phosphate. This reversal of glycolytic flux is referred to as gluconeogenesis. The two reactions of glycolysis that are effectively irreversible, catalyzed by phosphofructokinase, PFK, and pyruvate kinase, PYK, are replaced with two reversing reactions, catalyzed by fructose-bisphosphatase, FBP, and phosphoenolpyruvate synthase, PPS, respectively. Phosphoenolpyruvate synthase (120) catalyzes the conversion of pyruvate to phosphoenolpyruvate and in the process hydrolyzes one ATP to AMP (39).
Anaplerotic reactions replenish TCA cycle intermediates drained off for biosynthesis. The TCA cycle operating cyclically can completely oxidize acetate to carbon dioxide without net consumption or production of intermediates. However, intermediates of the TCA cycle such as oxaloacetate and 2-oxoglutarate are consumed in the production of macromolecules. TCA cycle intermediate generation from the glycolytic metabolites is accomplished by the irreversible carbon dioxide-fixing conversion of the 3-carbon phosphoenolpyruvate to the 4-carbon oxaloacetate, catalyzed by the enzyme phosphoenolpyruvate carboxylase, PPC (Fig. 8).
Growth on 4-carbon dicarboxylic acid intermediates of the TCA cycle, such as malate, requires that the cell be able to produce phosphoenolpyruvate for gluconeogenesis. Two pathways exist to fulfill these phosphoenolpyruvate demands (74, 75). One pathway involves the conversion of malate to pyruvate by malic enzyme, ME1 or ME2 (88, 113), followed by the synthesis of phosphoenolpyruvate from pyruvate by phosphoenolpyruvate synthase, PPS (126). Malic enzyme, ME1, reduces one molecule of NAD+ to NADH while converting malate to pyruvate. A second parallel reaction, malic enzyme (NADP), ME2, reduces one molecule of NADP+ to NADPH. The other pathway from the TCA cycle to glycolytic intermediates is the conversion of oxaloacetate to phosphoenolpyruvate by the action of phosphoenolpyruvate carboxykinase, PPCK (48). Phosphoenolpyruvate carboxykinase catalyzes the reverse reaction to the anaplerotic enzyme, phosphoenolpyruvate carboxylase, PPC (93). The former reaction consumes a high-energy phosphate bond in ATP and produces CO2, whereas phosphoenolpyruvate carboxylase releases inorganic phosphate and consumes CO2. Although the reactions catalyzed by the enzymes phosphoenolpyruvate carboxykinase and malic enzyme are thermodynamically reversible, physiologically they are found to operate unidirectionally (103, 104).
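As a worked illustration of the two routes from malate to phosphoenolpyruvate, the net reactions can be written out by adding the equations listed in Table 5 and Table 7. This is a bookkeeping sketch only; which route carries flux in a given simulation depends on the constraints applied.

$$
\begin{aligned}
\text{ME1 + PPS:}\quad & \text{mal-L} + \text{nad} + \text{atp} + \text{h2o} \rightarrow \text{co2} + \text{nadh} + \text{amp} + 2\,\text{h} + \text{pep} + \text{pi} \\
\text{MDH + PPCK:}\quad & \text{mal-L} + \text{nad} + \text{atp} \rightarrow \text{co2} + \text{nadh} + \text{adp} + \text{h} + \text{pep}
\end{aligned}
$$

If the AMP produced by the first route is recycled by adenylate kinase, ADK1 (amp + atp ↔ 2 adp), that route converts two ATP to two ADP per phosphoenolpyruvate formed, whereas the route through phosphoenolpyruvate carboxykinase converts only one.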
The electron transport chain and oxidative phosphorylation are used to produce the bulk of the cell's ATP under aerobic conditions. The electron transport chain translocates protons (H+) from the cytoplasm across the cytoplasmic membrane into the periplasmic space (see Table 9, Table 10, and Fig. 9). Since the cytoplasmic membrane is effectively impermeable to protons and hydroxyl ions (OH−), this establishes a difference in concentration of protons, and a difference in electrical charge, across the cytoplasmic membrane. This thermodynamic potential difference gives rise to a proton motive force that can be utilized to drive a myriad of endergonic reactions, such as synthesis of high-energy currency metabolites like ATP. In the model, protons are translocated into the extracellular medium as a simplification, but this is a reasonable approximation given that the pH of the periplasm and that of the extracellular medium are the same (191).
TABLE 9. Electron transport chain, oxidative phosphorylation, and transfer of reducing equivalents reactions
| Abbr. | Reaction | Equation |
| NADH16 | NADH dehydrogenase (ubiquinone-8 & 3 protons) | 4 h + nadh + q8 → 3 h[e] + nad + q8h2 |
| CYTBD | Cytochrome oxidase bd (ubiquinol-8: 2 protons) | 2 h + ½ o2 + q8h2 → h2o + 2 h[e] + q8 |
| O2t | o2 transport via diffusion | o2[e] ↔ o2 |
| ATPS4r | ATP synthase (four protons for one ATP) | adp + 4 h[e] + pi ↔ atp + h2o + 3 h |
| ATPM | ATP maintenance requirement | atp + h2o → adp + h + pi |
| ADK1 | Adenylate kinase | amp + atp ↔ 2 adp |
| THD2 | NAD(P) transhydrogenase | 2 h[e] + nadh + nadp → 2 h + nad + nadph |
| NADTRHD | NAD transhydrogenase | nad + nadph → nadh + nadp |
TABLE 10. Electron transport chain, oxidative phosphorylation, and transfer of reducing equivalents metabolites
| Abbr. | Metabolite | Formula | Charge |
| q8 | Ubiquinone-8 | C49H74O4 | 0 |
| q8h2 | Ubiquinol-8 | C49H76O4 | 0 |
| nad | Nicotinamide adenine dinucleotide (NAD+) | C21H26N7O14P2 | −1 |
| nadh | Nicotinamide adenine dinucleotide-reduced | C21H27N7O14P2 | −2 |
| nadp | Nicotinamide adenine dinucleotide phosphate (NADP+) | C21H25N7O17P3 | −3 |
| nadph | Nicotinamide adenine dinucleotide phosphate-reduced | C21H26N7O17P3 | −4 |
| atp | Adenosine triphosphate | C10H12N5O13P3 | −4 |
| adp | Adenosine diphosphate | C10H12N5O10P2 | −3 |
| amp | Adenosine monophosphate | C10H12N5O7P | −2 |
| h | H+ | H | 1 |
The electron transport chain of E. coli consists of several different respiratory dehydrogenases, quinones, and terminal reductases. There are 15 different dehydrogenases that accept electrons from donors such as NADH or succinate, and then pass the electrons to one of three different quinones, which then deliver the electrons to one of at least 14 different terminal reductases. The reductases complete the chain by reducing a terminal electron acceptor such as oxygen or fumarate. Some, but not all, dehydrogenases and reductases pump protons into the periplasm. It is possible for the dehydrogenases, quinones, and reductases to be used in many different combinations, so the entire system can be very complicated (184). The core E. coli model condenses the sequence of steps in the electron transport chain into two reactions, representing generic NADH dehydrogenase and cytochrome oxidase reactions, connected by only one quinone. First, an NADH dehydrogenase, NADH16, catalyzes the oxidation of NADH to form NAD+ while removing four protons from the cytoplasm. Three of these protons are translocated to the extracellular space, and the fourth is combined with a proton plus two electrons from NADH and with ubiquinone-8 to form the reduced ubiquinol-8 (165, 171). Ubiquinone-8 and ubiquinol-8 are oil-soluble coenzymes that diffuse freely within the lipid environment of the cytoplasmic membrane (Fig. 9). In the next condensed step, cytochrome oxidase, CYTBD, catalyzes the oxidation of ubiquinol-8 back to ubiquinone-8, which drives the translocation of two more protons into the extracellular space (102). The two spare electrons are then combined with two cytoplasmic protons and an oxygen atom to form water. Oxygen spontaneously diffuses from the environment into the cell down a concentration gradient, O2t, and represents an exogenous source of terminal electron acceptor.
The enzyme ATP synthase, ATPS4r, catalyzes the synthesis of ATP from ADP, forming a high-energy phosphate bond by coupling catalysis to the import of four protons that were pumped out by the electron transport chain (26, 96). The exact number of high-energy phosphate bonds that are generated per oxygen atom used as a terminal acceptor is the P/O ratio. This varies depending on periplasmic pH and other environmental conditions, but the core model P/O ratio is stoichiometrically fixed at 1.25. The ATP maintenance reaction, ATPM, is not a real biochemical reaction. It is included for modeling purposes since the scope of the E. coli core model does not extend to all of the reactions that consume ATP in the cell. Adenylate kinase, ADK1, is a phosphotransferase enzyme that catalyzes the interconversion of adenine nucleotides, and plays an important role in cellular energy homeostasis (12, 22).
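The fixed P/O ratio follows directly from the proton stoichiometries in Table 9. As a short worked calculation: for each NADH oxidized, NADH16 exports three protons and CYTBD exports two more while consuming ½ O2 (one oxygen atom), and ATPS4r imports four protons per ATP formed.

$$
\frac{P}{O} = \frac{3\ (\text{NADH16}) + 2\ (\text{CYTBD})}{4\ (\text{ATPS4r})} = \frac{5}{4} = 1.25
$$

Electrons that enter at the quinone level, such as those from succinate dehydrogenase, bypass NADH16 and therefore drive the export of only two protons per oxygen atom in this simplified chain.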
NADH is used for the catabolic activities of the cell, for instance, driving the export of protons into the periplasmic space in the electron transport chain in the reaction catalyzed by NADH dehydrogenase, NADH16. In contrast, NADPH is essential for anabolic metabolism such as the biosynthesis of building blocks for polymerization reactions from precursor metabolites produced by the fueling pathways of core metabolism. Maintaining the proper balance between anabolic reduction charge, NADPH/NADP+, and catabolic reduction charge, NADH/NAD+, is achieved by reactions catalyzed by transhydrogenase enzymes. NAD(P) transhydrogenase, THD2, catalyzes the transfer of a hydride ion from NADH to create NADPH, in a reaction coupled to the proton motive force (13). The opposite transfer, of a hydride ion from NADPH to create NADH, is catalyzed by another enzyme, NAD transhydrogenase, NADTRHD, but it is not coupled to the translocation of protons (161). This pair of reactions effectively allows the transfer of reducing equivalents between anabolic and catabolic reduction charge.
During aerobic respiration, oxygen is the terminal electron acceptor for the electron transport chain, and the ATP required for biosynthesis is produced by ATP synthase. Under anaerobic conditions, E. coli can generate ATP by substrate-level phosphorylation in the process of fermentation, where excess carbon is secreted as various organic by-products (see Table 11, Table 12, and Fig. 10). Glycolysis results in the net production of 2 ATP per glucose by substrate-level phosphorylation, but this is very low compared with the 17.5 ATP per glucose generated during aerobic respiration (in the model 17.5 ATP per glucose is generated, but this number can vary in vivo). The substrates of fermentation are typically sugars, so during fermentative growth, each cell must maintain a large flux through glycolysis to generate sufficient ATP to drive the constitutive biosynthesis, polymerization, and assembly reactions required for growth. This necessitates a large efflux of fermentative end products since there is insufficient ATP to assimilate all carbon as biomass. Approximately 10% of carbon substrate is assimilated because of the poor energy yield of fermentation (35). Glycolysis also produces two molecules of NADH for each glucose; therefore, NADH must be reoxidized by fermentation to regenerate NAD+ to maintain the oxidation-reduction balance of the cell.
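The figure of 17.5 ATP per glucose can be reproduced by simple bookkeeping. The sketch below is one plausible accounting, not a flux balance calculation. It assumes that glucose is fully oxidized through glycolysis, pyruvate dehydrogenase, and the complete TCA cycle; that the NADPH made by ICDHyr is passed to NAD+ by NADTRHD; and that no carbon or energy goes to biomass or maintenance. Proton stoichiometries are taken from Table 9, and all variable names are illustrative.

```python
from fractions import Fraction

# Proton stoichiometries read from Table 9.
H_PER_NADH = 3 + 2   # NADH16 exports 3 h[e]; CYTBD exports 2 more
H_PER_Q8H2 = 2       # ubiquinol made by SUCDi enters at CYTBD only
H_PER_ATP = 4        # ATPS4r imports 4 h[e] per ATP

# Assumed per-glucose bookkeeping (two pyruvate, two acetyl-CoA).
substrate_level_atp = 2 + 2   # net glycolysis + SUCOAS (one per acetyl-CoA)
nadh = 2 + 2 + 2 + 2          # glycolysis, PDH, AKGDH, MDH
nadph_to_nadh = 2             # ICDHyr NADPH, transferred by NADTRHD
q8h2 = 2                      # SUCDi (one per acetyl-CoA)

# Protons exported by the condensed electron transport chain.
protons = (nadh + nadph_to_nadh) * H_PER_NADH + q8h2 * H_PER_Q8H2

# ATP made by ATP synthase, then the total including substrate-level ATP.
oxidative_atp = Fraction(protons, H_PER_ATP)
total_atp = substrate_level_atp + oxidative_atp

print("protons exported per glucose:", protons)
print("ATP per glucose:", float(total_atp))
```

Under these assumptions, 54 protons are exported per glucose, giving 13.5 ATP from ATP synthase plus 4 ATP from substrate-level phosphorylation.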
TABLE 11. Fermentation reactions
| Abbr. | Reaction | Equation |
| LDH_D | D-Lactate dehydrogenase | lac-D + nad ↔ h + nadh + pyr |
| D_LACt2 | D-Lactate transport via proton symport | h[e] + lac-D[e] ↔ h + lac-D |
| PDH | Pyruvate dehydrogenase | coa + nad + pyr → accoa + co2 + nadh |
| PFL | Pyruvate formate lyase | coa + pyr → accoa + for |
| FORti | Formate transport via diffusion | for → for[e] |
| FORt2 | Formate transport via proton symport | for[e] + h[e] → for + h |
| PTAr | Phosphotransacetylase | accoa + pi ↔ actp + coa |
| ACKr | Acetate kinase | ac + atp ↔ actp + adp |
| ACALD | Acetaldehyde dehydrogenase (acetylating) | acald + coa + nad ↔ accoa + h + nadh |
| ALCD2x | Alcohol dehydrogenase (ethanol) | etoh + nad ↔ acald + h + nadh |
| ACt2r | Acetate reversible transport via proton symport | ac[e] + h[e] ↔ ac + h |
| ACALDt | Acetaldehyde reversible transport | acald[e] ↔ acald |
| ETOHt2r | Ethanol reversible transport via proton symport | etoh[e] + h[e] ↔ etoh + h |
TABLE 12. Fermentation metabolites
| Abbr. | Metabolite | Formula | Charge |
| lac-D | D-Lactate | C3H5O3 | −1 |
| for | Formate | CHO2 | −1 |
| actp | Acetyl-phosphate | C2H3O5P | −2 |
| ac | Acetate | C2H3O2 | −1 |
| acald | Acetaldehyde | C2H4O | 0 |
| etoh | Ethanol | C2H6O | 0 |
E. coli fermentation normally generates a mixture of end products anaerobically from sugars. The major soluble products are acetate, ethanol, lactate, and formate, with a smaller amount of succinate (35). In addition, fermentation results in the production of substantial quantities of carbon dioxide and hydrogen. Depending on the pH of the culture medium, and the redox state of the fermentation substrate, a cell may vary the relative flux through each of the fermentation pathways branching from pyruvate. When the pH of the environment drops because of increased concentrations of other acidic fermentative end products, such as acetic, formic, or succinic acid, then flux may be increased through the reaction catalyzed by D-lactate dehydrogenase, LDH_D (51, 89). This results in the reduction of pyruvate to form lactate while oxidizing NADH to form NAD+.
Pyruvate formate lyase, PFL, catalyzes the nonoxidative cleavage of pyruvate into acetyl-CoA and formate, with the incorporation of coenzyme A into acetyl-CoA (101, 164). Acetyl-CoA can then lead to either of the fermentative end products, acetate or ethanol. Both pathways involve two-step reversible mechanisms. In the conversion to acetate, phosphotransacetylase, PTAr (177), catalyzes transfer of a phosphate group onto the acetyl moiety of acetyl-CoA, to form acetyl-phosphate and release coenzyme A to be recycled for the previous reaction catalyzed by pyruvate formate lyase, PFL. Then, acetate kinase, ACKr, catalyzes the conversion of acetyl-phosphate to acetate, in the process forming a much needed high-energy phosphate bond by converting ADP to ATP (170). The reactions catalyzed by phosphotransacetylase and acetate kinase are thermodynamically reversible. This allows E. coli to grow aerobically on acetate by reversing the flux through this fermentative pathway. During the conversion of acetyl-CoA to ethanol, two molecules of NADH are reoxidized, one by the first reaction catalyzed by acetaldehyde dehydrogenase (acetylating), ACALD (60), and the second by the subsequent reaction catalyzed by alcohol dehydrogenase (ethanol), ALCD2x (98). Fermentation of many different substrates is possible because ethanol is more reduced than sugars, whereas acetate is more oxidized than ethanol. Redox balancing is achieved by varying the ratio of ethanol to acetate secreted. The end products of each fermentation pathway exit the cell along a concentration gradient, and, in the process, transport a proton from the cytoplasm to the periplasmic space.
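The redox balancing between ethanol and acetate can be illustrated with a small calculation. The sketch below is a simplified assumption-laden example: all pyruvate is taken to pass through PFL (which produces no NADH), no lactate or succinate is made, and no carbon goes to biomass. Each ethanol reoxidizes two NADH (ACALD and ALCD2x), and each acetate yields one ATP (ACKr). The function name and arguments are hypothetical.

```python
from fractions import Fraction

def ethanol_acetate_split(nadh_to_reoxidize, acetyl_coa):
    """Split acetyl-CoA between ethanol and acetate so that NADH balances.

    Each ethanol reoxidizes 2 NADH; acetate formation reoxidizes none.
    """
    ethanol = Fraction(nadh_to_reoxidize, 2)
    if ethanol > acetyl_coa:
        raise ValueError("not enough acetyl-CoA to reoxidize all NADH")
    acetate = acetyl_coa - ethanol
    return ethanol, acetate

# Per glucose: glycolysis gives 2 pyruvate, 2 NADH, and 2 ATP (net);
# PFL then gives 2 acetyl-CoA and 2 formate without forming NADH.
ethanol, acetate = ethanol_acetate_split(nadh_to_reoxidize=2, acetyl_coa=2)

# ATP per glucose: 2 from glycolysis plus 1 per acetate secreted.
atp_per_glucose = 2 + acetate

print("ethanol:", ethanol, "acetate:", acetate, "ATP:", atp_per_glucose)
```

With glucose as the substrate this gives an equimolar split of ethanol and acetate. A more reduced substrate would leave more NADH to reoxidize and shift the split toward ethanol, at the cost of ATP from acetate kinase.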
Nitrogen is the fourth most abundant element in E. coli and enters the cell either by ammonium ion uptake, NH4t, or as a moiety within organic molecules such as the amino acid L-glutamine or L-glutamate. Glutamate is an extremely abundant metabolite, with a measured concentration of 100.55 μmol per gram dry weight (gDW−1) and production rate of 4.77 mmol gDW−1 h−1. Glutamine is less abundant, with a measured concentration of 3.92 μmol gDW−1 and production rate of 3.36 mmol gDW−1 h−1 (196). Glutamate is a central component of the nitrogen metabolism of E. coli, as it is a nitrogen donor in many reactions. In particular, it is required for the synthesis of the other amino acids because its amino group is transferred to other compounds in transamination reactions. Glutamine is actively transported across the inner cytoplasmic membrane by an ATP-binding cassette transporter, GLNabc, which imports one molecule of glutamine while consuming one ATP (194). Glutamate is actively transported across the inner cytoplasmic membrane by proton symport, GLUt2r (190).
Direct ammonia assimilation is catalyzed by NADPH-specific reductive amination of 2-oxoglutarate to glutamate by glutamate dehydrogenase (NADP), GLUDy (196). Indirect ammonia assimilation is by a cyclic pair of sequential reactions catalyzed by glutamine synthetase, GLNS (156), and glutamate synthase (NADPH), GLUSy (196). Glutamine synthetase first catalyzes the assimilation of ammonia by converting glutamate, with one amino moiety, into glutamine, with two amino moieties. Then glutamate synthase, GLUSy, catalyzes the transfer of the amide group of glutamine to 2-oxoglutarate, generating a second molecule of glutamate (see Fig. 11, Table 13, and Table 14).
TABLE 13. Nitrogen metabolism reactions
| Abbr. | Reaction | Equation |
| GLNabc | L-Glutamine transport via ABC system | atp + gln-L[e] + h2o → adp + gln-L + h + pi |
| GLUt2r | L-Glutamate transport via proton symport | glu-L[e] + h[e] ↔ glu-L + h |
| GLUDy | Glutamate dehydrogenase (NADP) | glu-L + h2o + nadp ↔ akg + h + nadph + nh4 |
| GLNS | Glutamine synthetase | atp + glu-L + nh4 → adp + gln-L + h + pi |
| GLUSy | Glutamate synthase (NADPH) | akg + gln-L + h + nadph → 2 glu-L + nadp |
| GLUN | Glutaminase | gln-L + h2o → glu-L + nh4 |
TABLE 14. Nitrogen metabolites
| Abbr. | Metabolite | Formula | Charge |
| glu-L | L-Glutamate | C5H8NO4 | −1 |
| gln-L | L-Glutamine | C5H10N2O3 | 0 |
| nh4 | Ammonium | H4N | 1 |
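The difference between direct and indirect ammonia assimilation becomes clear when the reactions in Table 13 are added together. The following sketch represents each reaction as a dictionary of stoichiometric coefficients (negative for consumed metabolites) and sums them. The helper `net_reaction` is an illustrative assumption, and GLUDy is written in its assimilatory (reverse) direction.

```python
from collections import Counter

def net_reaction(*reactions):
    """Add reactions given as {metabolite: coefficient}; drop zero terms."""
    total = Counter()
    for rxn in reactions:
        for met, coeff in rxn.items():
            total[met] += coeff
    return {met: c for met, c in total.items() if c != 0}

# Coefficients copied from Table 13 (negative = consumed).
GLNS = {"atp": -1, "glu-L": -1, "nh4": -1,
        "adp": 1, "gln-L": 1, "h": 1, "pi": 1}
GLUSy = {"akg": -1, "gln-L": -1, "h": -1, "nadph": -1,
         "glu-L": 2, "nadp": 1}
# GLUDy in the direction that assimilates ammonia.
GLUDy_rev = {"akg": -1, "h": -1, "nadph": -1, "nh4": -1,
             "glu-L": 1, "h2o": 1, "nadp": 1}

# Indirect route: glutamine synthetase followed by glutamate synthase.
indirect = net_reaction(GLNS, GLUSy)

# Indirect minus direct shows what the indirect route costs in addition.
extra_cost = net_reaction(indirect, {m: -c for m, c in GLUDy_rev.items()})

print("indirect route:", indirect)
print("extra cost:", extra_cost)
```

The indirect route nets to atp + akg + nadph + nh4 → adp + pi + glu-L + nadp, and the remainder after subtracting the direct route is atp + h2o → adp + h + pi. In other words, the two routes differ by the hydrolysis of one ATP per ammonium assimilated.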
The core E. coli metabolic reconstruction can be converted into a mathematical model by encoding all of the metabolites and reactions in a stoichiometric matrix:
$$S \in \mathbb{Z}^{m \times n}$$
The stoichiometric matrix for the core E. coli model has m = 72 rows, each corresponding to a metabolite, and n = 95 columns, each corresponding to a reaction. The coefficients within a single column of the stoichiometric matrix represent the stoichiometry of a single reaction. A negative stoichiometric coefficient indicates the number of molecules of a particular metabolite consumed in that reaction, and a positive stoichiometric coefficient represents the number of molecules of a metabolite produced in that reaction. Since each reaction typically involves only a few metabolites, the stoichiometric matrix is sparse, consisting mostly of zero coefficients. Each reaction must be balanced with respect to consumption and production of both chemical elements and electrical charge. The exceptions to this rule are the columns of the E. coli core stoichiometric matrix that correspond to exchange reactions at the boundary of the model and its environment. Each exchange reaction represents the influx or efflux of an extracellular metabolite between the model and the environment. A metabolite leaving the model is represented with a positive exchange flux, and the opposite for a metabolite entering the model. The biomass reaction, representing the synthesis of biomass, is also represented by a column of the stoichiometric matrix. The biomass reaction may be considered a special type of exchange reaction, and is discussed further in “Biomass Reaction.”
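To make the column convention concrete, the sketch below builds a small stoichiometric matrix from three of the equations in Table 3. It is a minimal illustration rather than the parser used by any particular toolbox: the ASCII arrows `<->` and `->` stand in for ↔ and →, and the function names are assumptions.

```python
import re

# Three equations copied from Table 3; "<->" replaces the reversible arrow.
EQUATIONS = {
    "RPI":  "r5p <-> ru5p-D",
    "RPE":  "ru5p-D <-> xu5p-D",
    "TKT1": "r5p + xu5p-D <-> g3p + s7p",
}

def parse_side(side):
    """Parse one side such as 'a + 2 b' into {'a': 1.0, 'b': 2.0}."""
    coeffs = {}
    for term in side.split(" + "):
        parts = term.split()
        if len(parts) == 2:
            coeff, met = float(parts[0]), parts[1]
        else:
            coeff, met = 1.0, parts[0]
        coeffs[met] = coeffs.get(met, 0.0) + coeff
    return coeffs

def parse_equation(equation):
    """Return one matrix column: negative = consumed, positive = produced."""
    left, right = re.split(r"\s*(?:<->|->)\s*", equation)
    column = {met: -c for met, c in parse_side(left).items()}
    for met, c in parse_side(right).items():
        column[met] = column.get(met, 0.0) + c
    return column

columns = {rid: parse_equation(eq) for rid, eq in EQUATIONS.items()}
metabolites = sorted({met for col in columns.values() for met in col})
reactions = list(EQUATIONS)

# S[i][j] is the coefficient of metabolite i in reaction j.
S = [[columns[rid].get(met, 0.0) for rid in reactions] for met in metabolites]

for met, row in zip(metabolites, S):
    print(f"{met:8s}", row)
```

Even in this 5 × 3 example most entries are zero, which is the sparsity the text describes. Reversibility is not stored in the matrix itself; it is carried by the flux bounds.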
The net flux through all of the 95 reactions in the model can be mathematically represented by a 95-dimensional vector:
$$v \in \mathbb{R}^{n}$$
By convention, net flux is represented with the unit millimoles per gram dry weight per hour (mmol gDW−1 h−1). The product of the stoichiometric matrix and the net flux vector gives an m-dimensional vector of changes in metabolite concentration over time:
$$\frac{dx}{dt} \in \mathbb{R}^{m}$$
During balanced growth, when considering a large population of cells, it is reasonable to assume that all metabolite concentrations are constant; therefore, we have the fundamental system of m equations for mass conservation at steady state:
$$S \cdot v = \frac{dx}{dt} = 0 \qquad (1)$$
Let the first row of the stoichiometric matrix correspond to ATP, and consider the meaning of the first equation in the system given by equation 1:
$$S_{1,1} v_{1} + S_{1,2} v_{2} + \dots + S_{1,n} v_{n} = \frac{d[\mathrm{ATP}]}{dt} = 0 \qquad (2)$$
Since the stoichiometric coefficients represent the number of molecules consumed or produced in each reaction in the model, and the net fluxes represent the rates for each respective reaction, then equation 2 balances the total net production and consumption of ATP (Fig. 12).
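Equation 2 can be evaluated by hand for a handful of reactions. The sketch below uses the ATP coefficients of four reactions from Table 7, Table 9, and Table 13, together with purely hypothetical flux values chosen for illustration. A real ATP row contains every ATP-producing and ATP-consuming reaction in the model.

```python
# ATP coefficients of a few reactions (a partial row of S, for illustration).
ATP_ROW = {"ATPS4r": 1, "ATPM": -1, "GLNS": -1, "PPCK": -1}

def row_rate(row, fluxes):
    """Compute sum_j S_ij * v_j for one metabolite row."""
    return sum(coeff * fluxes.get(rxn, 0.0) for rxn, coeff in row.items())

# Hypothetical fluxes in mmol gDW^-1 h^-1, not model predictions.
fluxes = {"ATPS4r": 10.0, "ATPM": 8.0, "GLNS": 2.0, "PPCK": 0.0}

d_atp_dt = row_rate(ATP_ROW, fluxes)

# Raising ATP consumption without more production breaks the balance.
unbalanced = row_rate(ATP_ROW, {**fluxes, "ATPM": 9.0})

print("balanced d[ATP]/dt:", d_atp_dt)
print("unbalanced d[ATP]/dt:", unbalanced)
```

A flux vector satisfies equation 1 only when every row, not just the ATP row, sums to zero in this way.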
The net flux for all biochemical reactions is bounded by diffusion limitations (61). In addition, experimental data on maximum reaction rates, substrate uptake rates, or waste secretion rates may be used to further tighten these bounds, mathematically represented by:
$$v_{lb} \le v \le v_{ub}$$
For reactions that are considered to be effectively thermodynamically irreversible in the forward direction, the convention is to set the lower bound on a reaction rate to zero (57). In any realistic stoichiometric model, there are more columns than rows in the stoichiometric matrix. Therefore, there is an insufficient number of equations to specify a unique net flux in equation 1. To predict a biologically meaningful net flux, flux balance analysis can be used (185, 186). Flux balance analysis uses linear programming to optimize a biologically motivated objective function, subject to steady-state mass balance and bounds on net fluxes to predict an in vivo flux. This can be expressed as the following linear programming problem:
$$
\begin{aligned}
\text{maximize:} \quad & c^{T} \cdot v \\
\text{subject to:} \quad & S \cdot v = 0 \\
& v_{lb} \le v \le v_{ub}
\end{aligned}
$$
The objective function, $c^{T} v$, can be any linear combination of reaction fluxes. In flux balance analysis of E. coli, the biomass reaction is often the sole reaction to be maximized by the objective function since maximum growth is evolutionarily favored in genetically heterogeneous culture (83). The publicly available COBRA Toolbox software for MATLAB (9) may be used to make numerical predictions of in vivo net flux by implementing a mathematical model as a computational model (59).
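A toy problem shows how the pieces of the linear program fit together. The network below is hypothetical and is not part of the core model: reaction 1 takes up metabolite A, reaction 2 converts A to B, and reaction 3 drains B and plays the role of the biomass reaction.

$$
S = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix}, \qquad
c = (0, 0, 1)^{T}, \qquad
0 \le v_{1} \le 10, \qquad 0 \le v_{2}, v_{3} \le 1000
$$

$$
S \cdot v = 0 \;\Rightarrow\; v_{1} = v_{2} = v_{3}
\;\Rightarrow\; \max c^{T} v = v_{3} = 10
$$

With three columns and two rows, steady state leaves one degree of freedom, and the uptake bound on reaction 1 is the constraint that determines the optimum. In a genome-scale or core model the same logic applies, but the optimum is found numerically by a linear programming solver.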
To represent growth, the core E. coli model includes a biomass reaction that drains precursor metabolites from the network at stoichiometrically fixed relative rates while also producing several by-product metabolites (57) (see Table 15 and Fig. 13). These precursors are used to produce the lipids, proteins, nucleic acids, and other macromolecules required to replicate a cell. To determine these metabolites and their quantities, we used the dry weight composition data for an average E. coli B/r cell growing exponentially at 37°C under aerobic conditions in glucose minimal medium, with an approximate doubling time of 40 min and a dry cell weight of 2.8 × 10⁻¹³ g (123). Since most of the subunits of the cellular macromolecules, such as nucleic acids and amino acids, are not present in the core model, they could not be directly accounted for in the biomass reaction. The metabolites in the core model that those macromolecular subunits are synthesized from are included instead. These are precursor metabolites. For example, the amino acid L-alanine is synthesized from pyruvate and L-glutamate, so both of these metabolites are consumed in the biomass reaction. Several metabolites are actually produced by the biomass reaction. ADP, protons, and inorganic phosphate are produced by the hydrolysis of ATP in the balanced reaction “atp + h2o → adp + h + pi.” 2-Oxoglutarate is produced during the synthesis of amino acids, when L-glutamate transfers its amino group to another compound in a transamination reaction. Coenzyme A is produced when acetyl-CoA is consumed, and NAD+ is reduced to NADH and NADPH is oxidized to NADP+ during biomass synthesis.
TABLE 15. Twenty-three different metabolites consumed or produced to simulate growth in the biomass reaction
| Abbr. | Metabolite | Stoichiometry |
| 3pg | 3-Phospho-D-glycerate | −1.496 |
| accoa | Acetyl-CoA | −3.7478 |
| adp | Adenosine diphosphate | 59.81 |
| akg | 2-Oxoglutarate | 4.1182 |
| atp | Adenosine triphosphate | −59.81 |
| coa | Coenzyme A | 3.7478 |
| e4p | D-Erythrose 4-phosphate | −0.361 |
| f6p | D-Fructose 6-phosphate | −0.0709 |
| g3p | Glyceraldehyde 3-phosphate | −0.129 |
| g6p | D-Glucose 6-phosphate | −0.205 |
| gln-L | L-Glutamine | −0.2557 |
| glu-L | L-Glutamate | −4.9414 |
| h | H+ | 59.81 |
| h2o | H2O | −59.81 |
| nad | Nicotinamide adenine dinucleotide (NAD+) | −3.547 |
| nadh | Nicotinamide adenine dinucleotide-reduced | 3.547 |
| nadp | Nicotinamide adenine dinucleotide phosphate (NADP+) | 13.0279 |
| nadph | Nicotinamide adenine dinucleotide phosphate-reduced | −13.0279 |
| oaa | Oxaloacetate | −1.7867 |
| pep | Phosphoenolpyruvate | −0.5191 |
| pi | Phosphate | 59.81 |
| pyr | Pyruvate | −2.8328 |
| r5p | α-D-Ribose 5-phosphate | −0.8977 |
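The by-product relationships described above can be checked directly against the coefficients in Table 15. The following is a minimal sketch in plain Python; the dictionary simply transcribes the table, and the helper name `is_conserved_pair` is a hypothetical name chosen for illustration, not part of any published toolbox.

```python
# Stoichiometric coefficients transcribed from Table 15
# (negative = consumed by the biomass reaction, positive = produced).
BIOMASS = {
    "3pg": -1.496, "accoa": -3.7478, "adp": 59.81, "akg": 4.1182,
    "atp": -59.81, "coa": 3.7478, "e4p": -0.361, "f6p": -0.0709,
    "g3p": -0.129, "g6p": -0.205, "gln-L": -0.2557, "glu-L": -4.9414,
    "h": 59.81, "h2o": -59.81, "nad": -3.547, "nadh": 3.547,
    "nadp": 13.0279, "nadph": -13.0279, "oaa": -1.7867, "pep": -0.5191,
    "pi": 59.81, "pyr": -2.8328, "r5p": -0.8977,
}


def is_conserved_pair(consumed, produced, table=BIOMASS, tol=1e-9):
    """True when `produced` is made exactly as fast as `consumed` is used."""
    return abs(table[consumed] + table[produced]) < tol


# Split the table into precursors (consumed) and by-products (produced).
consumed = sorted(m for m, s in BIOMASS.items() if s < 0)
produced = sorted(m for m, s in BIOMASS.items() if s > 0)

# Cofactor pairs named in the text: each by-product mirrors its partner.
pairs = [("atp", "adp"), ("accoa", "coa"), ("nad", "nadh"), ("nadph", "nadp")]
balanced = all(is_conserved_pair(c, p) for c, p in pairs)

# L-Glutamate consumed minus 2-oxoglutarate returned by transamination.
net_glutamate_carbon_skeleton = BIOMASS["glu-L"] + BIOMASS["akg"]
```

In this transcription, seven of the 23 metabolites are produced and sixteen are consumed, and each cofactor pair cancels exactly.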
Additional energetic requirements exist for growth beyond what is needed to generate the macromolecular content of the cell. These energetic maintenance requirements are for growth-associated maintenance (e.g., protein polymerization costs) and non-growth-associated maintenance (e.g., membrane leakage). To represent growth-associated maintenance, ATP is converted to ADP at 59.81 mmol gDW−1 h−1, accounting for energy used in cell division and other growth processes. Non-growth-associated maintenance is not part of the biomass reaction. Instead, it is represented with a lower bound of 8.39 mmol gDW−1 h−1 on the ATP maintenance reaction (ATPM), simulating energy used for protein turnover and other processes that do not change with growth.
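The two maintenance terms can be combined into a simple estimate of the minimum ATP that must be hydrolyzed for maintenance at a given growth rate. This is a sketch only: it assumes that the biomass reaction flux equals the growth rate (in h−1), so the growth-associated coefficient scales linearly with growth, and the function name is hypothetical.

```python
GAM = 59.81   # growth-associated maintenance coefficient in the biomass reaction
NGAM = 8.39   # non-growth-associated maintenance: lower bound on ATPM


def maintenance_atp_demand(growth_rate):
    """Minimum maintenance ATP hydrolysis (mmol gDW^-1 h^-1) at a growth rate (h^-1).

    Assumes the biomass flux equals the growth rate, so the growth-associated
    term scales linearly with it, while the ATPM bound is a constant.
    """
    if growth_rate < 0:
        raise ValueError("growth rate must be non-negative")
    return GAM * growth_rate + NGAM


# A non-growing cell still has to meet the ATPM lower bound.
resting_demand = maintenance_atp_demand(0.0)
```

Because the non-growth-associated term is a constant, it dominates at low growth rates, whereas the growth-associated term dominates during fast growth.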
In addition to the metabolic reconstruction, the core E. coli model also contains a Boolean representation of part of the associated transcriptional regulatory network. This network is a modified subset of the genome-scale transcriptional regulatory reconstruction iMC1010 (41). In response to external and internal stimuli, in silico transcription factors either activate or repress genes associated with metabolic reactions. This regulation improves the predictive fidelity of the metabolic model by imposing additional context-dependent constraints on certain reactions. The transcriptional regulatory reconstruction consists of a set of Boolean rules that dictate whether a gene is either fully induced or fully repressed. If the genes associated with an enzyme or transport protein/complex are repressed, then in silico flux is constrained to zero for the corresponding reaction. The solution space of the network shrinks when these additional constraints are imposed. Reactions that are not used because of regulatory effects are thus restricted, so when using flux balance analysis, the optimal flux distribution will be consistent with known regulation. This optimal flux distribution may be different from the flux distribution of an unregulated model. In this case, the flux distribution of the unregulated model violated at least one regulatory constraint, making it biologically unrealistic. The use of computationally implemented Boolean rules in a genome-scale model has been shown to lead to more accurate flux balance analysis predictions (41).
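How Boolean gene states translate into flux constraints can be sketched as follows. This illustration is deliberately simplified and is not the COBRA Toolbox implementation: it assumes that all genes listed for a reaction are isozymes (any one expressed gene keeps the reaction open), whereas enzyme complexes would require all subunits to be expressed. The function and argument names are hypothetical.

```python
def apply_regulation(bounds, reaction_genes, gene_state):
    """Return a copy of `bounds` with fully repressed reactions closed.

    bounds:         {reaction: (lower, upper)}
    reaction_genes: {reaction: [gene, ...]}, treated here as isozymes
    gene_state:     {gene: bool}; genes that are absent are unregulated
                    and therefore assumed to be expressed
    """
    constrained = dict(bounds)
    for rxn, genes in reaction_genes.items():
        expressed = any(gene_state.get(g, True) for g in genes)
        if not expressed:
            constrained[rxn] = (0.0, 0.0)  # flux constrained to zero
    return constrained


# Example: pps is repressed, fumA is repressed but fumB is still induced.
bounds = {"PPS": (0.0, 1000.0), "FUM": (-1000.0, 1000.0)}
reaction_genes = {"PPS": ["pps"], "FUM": ["fumA", "fumB", "fumC"]}
gene_state = {"pps": False, "fumA": False, "fumB": True}
regulated = apply_regulation(bounds, reaction_genes, gene_state)
```

Closing reactions in this way can only shrink the solution space, which is why a regulated model never predicts a higher optimal growth rate than the unregulated model under the same conditions.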
A gene is considered to be induced when evaluation of the corresponding Boolean rule gives “true.” In contrast, a gene is considered to be repressed when evaluation of the corresponding Boolean rule gives “false.” Boolean logic is used to evaluate each Boolean rule. For example, consider the enzyme phosphoenolpyruvate synthase, PPS, which catalyzes the first step of gluconeogenesis, the conversion of pyruvate to phosphoenolpyruvate. The gene for phosphoenolpyruvate synthase is pps and its Boolean rule is simply “FruR.” That is, if FruR is “true,” then the pps gene is induced allowing in silico flux through the reaction catalyzed by phosphoenolpyruvate synthase, PPS. FruR is a transcriptional regulator that is active when the cytoplasmic concentration of D-fructose 1,6-bisphosphate, FDP, is low (14). The Boolean rule for FruR is “NOT surplusFDP.” That is, if there is no surplus of D-fructose 1,6-bisphosphate, then FruR is “true,” and therefore the pps gene is induced, allowing in silico gluconeogenic flux through the reaction catalyzed by phosphoenolpyruvate synthase, PPS. In contrast, if surplusFDP is “true,” then FruR is “false,” and therefore pps is repressed.
Regulatory conditions, such as surplusFDP, are variables that represent a complex regulatory rule for a transcription factor that cannot be accurately represented with only one variable. The regulatory rule for surplusFDP is “((NOT FBP) AND (NOT (TKT2 OR TALA OR PGI))) OR fru[e].” If fru[e] is “true,” then surplusFDP is “true,” independent of the state of the other variables. If fructose-bisphosphatase, FBP, is “false,” and all of transketolase, TKT2, transaldolase, TALA, and glucose-6-phosphate isomerase, PGI, are “false,” then surplusFDP is also “true”; in either case, FruR is “false” and pps is repressed. By using Boolean logic, all rules in a regulatory network can be reduced to either “true” or “false,” and ultimately this dictates whether each metabolic gene is induced or repressed. Not every gene in the metabolic network is controlled by the regulatory network, so the unregulated genes are assumed to always be active, and their fluxes are never constrained to zero. Table 16 lists the Boolean regulatory rules for regulated metabolic genes, and Table 17 lists the Boolean regulatory rules for transcription factors and regulatory conditions in the transcriptional regulatory network. An abstract overview of part of the regulatory network is depicted in Fig. 14.
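The chain of rules from surplusFDP through FruR to pps can be written out as ordinary Boolean functions. The sketch below transcribes the rules quoted in the text; the Python function and argument names are illustrative choices, and each reaction argument is true when that reaction is active.

```python
def surplus_fdp(FBP, TKT2, TALA, PGI, fru_e):
    """((NOT FBP) AND (NOT (TKT2 OR TALA OR PGI))) OR fru[e]"""
    return bool(((not FBP) and (not (TKT2 or TALA or PGI))) or fru_e)


def fru_r(**state):
    """FruR rule: NOT surplusFDP."""
    return not surplus_fdp(**state)


def pps_induced(**state):
    """pps rule: FruR."""
    return fru_r(**state)


# Extracellular fructose forces surplusFDP to be true, so pps is repressed.
with_fructose = pps_induced(FBP=True, TKT2=True, TALA=True, PGI=True, fru_e=True)

# One active reaction among TKT2, TALA, and PGI is enough to keep
# surplusFDP false, because NOT (A OR B OR C) requires all three to be false.
one_active = surplus_fdp(FBP=False, TKT2=False, TALA=True, PGI=False, fru_e=False)
```

Here `with_fructose` evaluates to false (pps repressed) and `one_active` evaluates to false (no surplus of D-fructose 1,6-bisphosphate).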
TABLE 16. Regulatory rules for metabolic genes in the core model
| Gene | Locus | Reaction abbr. | Regulatory rule |
| aceA | b4015 | ICL | (NOT IclR) AND ((NOT ArcA) OR FruR) |
| aceB | b4014 | MALS | (NOT IclR) AND ((NOT ArcA) OR FruR) |
| aceE | b0114 | PDH | (NOT PdhR) OR Fis |
| aceF | b0115 | PDH | (NOT PdhR) OR Fis |
| adhE | b1241 | ACALD and ALCD2x | (NOT o2[e]) OR (NOT (o2[e] AND FruR)) OR Fis |
| cydA | b0733 | CYTBD | (NOT Fnr) OR ArcA |
| cydB | b0734 | CYTBD | (NOT Fnr) OR ArcA |
| dctA | b3528 | FUMt2_2, MALt2_2, and SUCCt2_2 | CRPnoGLM AND (NOT ArcA) AND DcuR |
| focA | b0904 | FORt2 and FORti | ArcA OR (Fnr AND CRPnoGLC) |
| focB | b2492 | FORt2 and FORti | ArcA OR (Fnr AND CRPnoGLC) |
| frdA | b4154 | FRD7 | Fnr OR DcuR |
| frdB | b4153 | FRD7 | Fnr OR DcuR |
| frdC | b4152 | FRD7 | Fnr OR DcuR |
| frdD | b4151 | FRD7 | Fnr OR DcuR |
| fumA | b1612 | FUM | NOT (ArcA OR Fnr) |
| fumB | b4122 | FUM | Fnr OR CRPnoGLC OR DcuR |
| fumC | b1611 | FUM | NOT ArcA |
| gdhA | b1761 | GLUDy | NOT (Nac OR glu-L[e]) |
| glcA | b2975 | D_LACt2 | (NOT ArcA) AND GlcC |
| glcB | b2976 | MALS | (NOT ArcA) AND GlcC |
| glnA | b3870 | GLNS | CRPnoGLC |
| gltB | b3212 | GLUSy | (NOT (NRI_hi AND glu-L[e])) |
| gltD | b3213 | GLUSy | (NOT (NRI_hi AND glu-L[e])) |
| lldP | b3603 | D_LACt2 | NOT ArcA |
| manX | b1817 | FRUpts2 and GLCpts | CRPnoGLM OR (NOT Mlc) |
| manY | b1818 | FRUpts2 and GLCpts | CRPnoGLM OR (NOT Mlc) |
| manZ | b1819 | FRUpts2 and GLCpts | CRPnoGLM OR (NOT Mlc) |
| mdh | b3236 | MDH | NOT ArcA |
| nuoA | b2288 | NADH16 | (NOT (ArcA OR Fnr)) |
| nuoB | b2287 | NADH16 | (NOT (ArcA OR Fnr)) |
| nuoC | b2286 | NADH16 | (NOT (ArcA OR Fnr)) |
| nuoE-N | b2276-b2285 | NADH16 | (NOT (ArcA OR Fnr)) |
| pflA | b0902 | PFL | ArcA OR (Fnr AND CRPnoGLC) |
| pflB | b0903 | PFL | ArcA OR (Fnr AND CRPnoGLC) |
| pflC | b3952 | PFL | ArcA OR Fnr |
| pflD | b3951 | PFL | ArcA OR Fnr |
| pitB | b2987 | PIt2r | NOT PhoB |
| pps | b1702 | PPS | FruR |
| ptsG | b1101 | GLCpts | (NOT Mlc) OR (NOT FruR) |
| pykF | b1676 | PYK | NOT FruR |
| sdhA | b0723 | SUCDi | (NOT (ArcA OR Fnr)) OR CRPnoGLC OR Fis |
| sdhB | b0724 | SUCDi | (NOT (ArcA OR Fnr)) OR CRPnoGLC OR Fis |
| sdhC | b0721 | SUCDi | (NOT (ArcA OR Fnr)) OR CRPnoGLC OR Fis |
| sdhD | b0722 | SUCDi | (NOT (ArcA OR Fnr)) OR CRPnoGLC OR Fis |
| tdcD | b3115 | ACKr | CRPnoGLC OR Fnr |
| tdcE | b3114 | PFL | CRPnoGLC OR Fnr |
| yneH | b1524 | GLUN | (NOT glc-D[e]) OR (nh4[e] AND (NOT CRPnoGLC)) |
TABLE 17. Regulatory rules for transcriptional regulators and regulatory conditions
| Regulator | Locus | Regulatory rule |
| ArcA | b4401 | NOT o2[e] |
| DcuR | b4124 | DcuS |
| DcuS | b4125 | succ[e] OR fum[e] OR mal-L[e] |
| FadR | b1187 | glc-D[e] OR (NOT ac[e]) |
| Fis | b3261 | Biomass_Ecoli_core_w_GAM |
| Fnr | b1334 | NOT o2[e] |
| FruR | b0080 | NOT surplusFDP |
| GlcC | b2980 | ac[e] |
| GlnG | b3868 | NOT nh4[e] |
| IclR | b4018 | FadR |
| Mlc | b1594 | NOT glc-D[e] |
| Nac | b1988 | NRI_low |
| PdhR | b0113 | NOT surplusPYR |
| PhoB | b0399 | PhoR |
| PhoR | b0400 | NOT pi[e] |
| CRPnoGLC | b3357 | NOT glc-D[e] |
| CRPnoGLM | b3357 | NOT (glc-D[e] OR mal-L[e] OR lac-D[e]) |
| NRI_hi | | NRI_low |
| NRI_low | | GlnG |
| surplusFDP | | ((NOT FBP) AND (NOT (TKT2 OR TALA OR PGI))) OR fru[e] |
| surplusPYR | | (NOT (ME2 OR ME1)) AND (NOT (GLCpts OR PYK OR PFK OR LDH_D OR SUCCt2_2)) |
Under anoxic conditions, the transcription factors ArcA and Fnr act as global regulators that induce many different genes needed for fermentation and growth without oxygen. However, the principal function of ArcA and Fnr is to repress genes that are required only when oxygen is abundant (86). When oxygen availability is reduced, the drop in redox potential signals the phosphorylation and thereby activation of the global regulator ArcA. Only in anaerobic conditions is the global transcriptional regulator Fnr also activated. When there is oxygen in the cytoplasm, an oxidized 4Fe-4S cluster in Fnr inactivates it. The regulatory rule “NOT o2[e]” is the same for ArcA and Fnr. Therefore, when o2[e] is false, ArcA and Fnr are true, representing activation.
Both ArcA and Fnr induce fermentative genes such as pflA, pflB, pflC, and pflD, coding for pyruvate formate lyase, PFL (163), and focA and focB, coding for formate transport, FORt2 and FORti. The regulatory rule for pflC is “ArcA OR Fnr.” Therefore, if either ArcA or Fnr is true, then pflC is true, representing expression. Fnr also induces tdcD, coding for the fermentative enzyme acetate kinase, ACKr. Anoxic conditions also induce adhE, encoding the enzymes for both acetaldehyde dehydrogenase (acetylating), ACALD, and alcohol dehydrogenase (ethanol), ALCD2x. The regulatory rule for adhE is “(NOT o2[e]) OR (NOT (o2[e] AND FruR)) OR Fis,” representing its indirect induction when oxygen is absent or derepression by the inactivity of the transcriptional regulator FruR. Fis is discussed in “Regulation by Fis.”
In the absence of a common electron acceptor such as oxygen or nitrate, the reactions of the TCA cycle no longer operate as an energy-producing cycle. Instead, they function as two separate biosynthetic pathways, both beginning at oxaloacetate: a reductive pathway via fumarate reductase, FRD7, producing succinyl-CoA, and an oxidative pathway producing 2-oxoglutarate. In the absence of oxygen, ArcA represses a number of genes in the TCA cycle that are unnecessary for this noncyclic operation, including sdhA, sdhB, sdhC, and sdhD, coding for succinate dehydrogenase (irreversible), SUCDi (44, 137), in addition to fumA and fumC coding for fumarase, FUM (138), and mdh coding for malate dehydrogenase, MDH. The regulatory rule for sdhA-D, “(NOT (ArcA OR Fnr)) OR CRPnoGLC OR Fis,” indicates that these genes are expressed when ArcA and Fnr are both false, or when either the Crp regulatory condition CRPnoGLC or the transcriptional regulator Fis is true.
The cell adapts to changing environmental oxygen conditions by utilizing different isozymes of fumarase, FUM. Both Fnr and ArcA repress fumA coding for fumarase A (138), but Fnr induces fumB coding for the isozyme fumarase B that has greater affinity for malate as a substrate (182). The regulatory rule for fumA is “NOT (ArcA OR Fnr),” so the fumA isozyme is expressed when ArcA and Fnr are both false. The regulatory rule for fumB is “Fnr OR CRPnoGLC OR DcuR,” so the fumB isozyme is expressed when any one of Fnr, the Crp regulatory condition CRPnoGLC, or DcuR is true. The transcriptional regulator DcuR is discussed in “Growth on Acetate or C4-Dicarboxylate Compounds,” below. Fnr complements induction by DcuR of frdA, frdB, frdC, and frdD, coding for fumarate reductase, FRD7 (183). In anoxic conditions, fumarate reductase, FRD7, catalyzes the reverse reaction to that catalyzed by succinate dehydrogenase, SUCDi, in oxic conditions. The regulatory rule for frdA-D is “Fnr OR DcuR,” meaning that these genes are expressed when either Fnr or DcuR is activated.
Both ArcA and Fnr downregulate oxidative phosphorylation by repressing the nuoA-nuoN operon coding for NADH dehydrogenase (ubiquinone-8 & 3 protons), NADH16 (15, 71). This reaction is used for aerobic respiration and is therefore unnecessary in anoxic conditions. The corresponding regulatory rule for nuoA-N is “(NOT (ArcA OR Fnr)).” When oxygen is present at low concentration, ArcA induces cydA and cydB, coding for cytochrome oxidase bd (ubiquinol-8: 2 protons), CYTBD (183). In fully anoxic conditions, cydA and cydB are repressed by Fnr (183). The regulatory rule for cydA and cydB is “(NOT Fnr) OR ArcA.” ArcA downregulates the glyoxylate cycle by repressing aceB and glcB, coding for malate synthase, MALS, and aceA coding for isocitrate lyase, ICL (140). ArcA also represses a number of transporters, including glcA and lldP coding for D-lactate transport via proton symport, D_LACt2, and dctA coding for the aerobic transporter for fumarate, malate, and succinate, FUMt2_2, MALt2_2, and SUCCt2_2 (46). Many of the latter genes are also regulated by other transcription factors, giving rise to a combinatorial expansion in the total number of network states.
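The oxygen-dependent rules discussed in this section can be evaluated side by side for aerobic and anaerobic conditions. The sketch below encodes a handful of the rules from Tables 16 and 17 as Python lambdas; the dictionary layout and function names are illustrative assumptions rather than the format used by any particular software.

```python
def oxygen_regulators(o2_present):
    """ArcA and Fnr share the rule NOT o2[e] (Table 17)."""
    return {"ArcA": not o2_present, "Fnr": not o2_present}


# A subset of the gene rules from Table 16 that depend only on ArcA and Fnr.
GENE_RULES = {
    "pflC": lambda s: s["ArcA"] or s["Fnr"],
    "fumA": lambda s: not (s["ArcA"] or s["Fnr"]),
    "fumC": lambda s: not s["ArcA"],
    "mdh":  lambda s: not s["ArcA"],
    "nuoA": lambda s: not (s["ArcA"] or s["Fnr"]),
    "cydA": lambda s: (not s["Fnr"]) or s["ArcA"],
}


def expression(o2_present):
    """Evaluate every rule for the given oxygen availability."""
    state = oxygen_regulators(o2_present)
    return {gene: bool(rule(state)) for gene, rule in GENE_RULES.items()}


aerobic = expression(o2_present=True)
anaerobic = expression(o2_present=False)
```

One consequence of this encoding is that cydA evaluates to true in both conditions: because ArcA and Fnr share the same rule, the Boolean model cannot distinguish low oxygen from fully anoxic conditions.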
In a medium containing glucose and another substrate such as lactate or malate, E. coli preferentially catabolizes glucose until it is depleted, thereafter switching to catabolism of the less desirable substrate (128). The repression of enzymes for catabolism of a less desirable substrate by the presence of a desirable substrate is generally termed catabolite repression (159) (see Fig. 15). Cytoplasmic concentrations of the cofactor cyclic adenosine monophosphate, cAMP, allosterically control the activity of the cAMP receptor protein, Crp (17). The presence of glucose reduces the cytoplasmic concentration of Crp-cAMP that is necessary for induction of various genes necessary for catabolism of less desirable substrates. To represent regulation by Crp-cAMP in a Boolean manner, extra regulatory condition variables are necessary to represent Crp-cAMP under different conditions. The CRPnoGLM regulatory condition is false when any of glucose (glc-D[e]), malate (mal-L[e]), or lactate (lac-D[e]) is present in the media. In this condition the gene dctA, which codes for the fumarate, malate, and succinate transporters, FUMt2_2, MALt2_2, and SUCCt2_2, is repressed (46). The regulatory rule for dctA is “CRPnoGLM AND (NOT ArcA) AND DcuR.” Therefore, ArcA must also be false and DcuR must also be true for dctA to be expressed. When the CRPnoGLM regulatory condition is false, the genes manX, manY, and manZ are repressed, downregulating fructose transport via PEP:Pyr PTS, FRUpts2. Note that manX, manY, and manZ can also code for subunits of D-glucose transport via PEP:Pyr PTS, GLCpts. The latter can also be encoded by ptsH, ptsI, malX, crr, and ptsG, providing an independent means of glucose transport.
When glucose is absent, the regulatory condition CRPnoGLC is true, representing activation of the global regulator Crp-cAMP. CRPnoGLC induces the genes sdhA, sdhB, sdhC, and sdhD, coding for succinate dehydrogenase, SUCDi (44). CRPnoGLC also upregulates fermentation by inducing pflA, pflB, and tdcE, coding for pyruvate formate lyase, PFL; focA and focB, coding for formate transport, FORt2 and FORti; and tdcD, coding for acetate kinase, ACKr (162). The regulatory rule for focA and focB is “ArcA OR (Fnr AND CRPnoGLC)”; therefore, these genes are induced when ArcA is true or when both Fnr and CRPnoGLC are true. When CRPnoGLC is true, this induces fumB coding for fumarase, FUM, in the TCA cycle. Consistent with Table 16, CRPnoGLC induces glnA coding for glutamine synthetase, GLNS, but represses yneH coding for glutaminase, GLUN. Another transcription factor that is activated when glucose is absent is Mlc. Mlc represses ptsG, manX, manY, and manZ, which all code for subunits of the glucose and fructose phosphotransferase systems, FRUpts2 and GLCpts (145). The regulatory rule for manX-Z is “CRPnoGLM OR (NOT Mlc),” indicating that these transport genes are induced when CRPnoGLM is true or Mlc is false.
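The catabolite repression conditions depend only on which substrates are present in the medium, so they can be computed from a set of extracellular metabolite identifiers. The following sketch transcribes the CRPnoGLC, CRPnoGLM, and Mlc rules from Table 17 and the manX-Z rule from Table 16; the function names and the use of a Python set to describe the medium are illustrative assumptions.

```python
def crp_conditions(medium):
    """Evaluate glucose-dependent regulators for a set of metabolite IDs.

    medium: set of extracellular metabolites present, e.g. {"glc-D[e]"}
    """
    glucose = "glc-D[e]" in medium
    return {
        "CRPnoGLC": not glucose,
        "CRPnoGLM": not (glucose or "mal-L[e]" in medium or "lac-D[e]" in medium),
        "Mlc": not glucose,
    }


def manXYZ_induced(medium):
    """manX, manY, manZ rule: CRPnoGLM OR (NOT Mlc)."""
    state = crp_conditions(medium)
    return state["CRPnoGLM"] or (not state["Mlc"])


on_glucose = manXYZ_induced({"glc-D[e]"})   # Mlc is false, so the genes are induced
on_malate = manXYZ_induced({"mal-L[e]"})    # CRPnoGLM is false and Mlc is true
on_fructose = manXYZ_induced({"fru[e]"})    # CRPnoGLM is true
```

Under these rules the manX-Z genes are repressed during growth on malate alone but remain induced on glucose or fructose, consistent with their role in the GLCpts and FRUpts2 transport reactions.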
The gene glcC codes for a transcription factor that is induced when acetate (ac[e]) is present in the media (140). The regulatory rule for GlcC is therefore simply “ac[e].” GlcC induces glcA and glcB, which code for D-lactate transport via proton symport, D_LACt2, and malate synthase, MALS, respectively (140). The regulatory rule for glcA and glcB is “(NOT ArcA) AND GlcC,” so these genes are induced when acetate and oxygen are present. The dual transcriptional regulator FadR is activated by the presence of glucose or the absence of acetate in the media, and it is a regulator of fatty acid synthesis. FadR activates transcription of the gene iclR, which is also a transcription factor (72). IclR represses transcription of the genes aceB and aceA (40), which code for the malate synthase, MALS, and isocitrate lyase, ICL, enzymes, respectively. The overall effect of FadR is thus to inhibit the glyoxylate cycle when the cell is not consuming acetate. The genes aceB and aceA have the regulatory rule “(NOT IclR) AND ((NOT ArcA) OR FruR),” indicating that the glyoxylate cycle genes are expressed only when IclR is false (glucose is absent and acetate is present) and, in addition, either ArcA is false (oxygen is present) or FruR is true.
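The nested dependence of the glyoxylate cycle genes on glucose, acetate, and oxygen is easier to follow when the rules are composed explicitly. The sketch below chains the FadR, IclR, and ArcA rules from Table 17 into the aceA/aceB rule from Table 16; FruR is passed in as an argument because its own rule depends on reaction fluxes. The function name and argument names are hypothetical.

```python
def glyoxylate_genes_induced(glucose, acetate, oxygen, fru_r):
    """aceA/aceB rule: (NOT IclR) AND ((NOT ArcA) OR FruR)."""
    fad_r = glucose or (not acetate)   # FadR: glc-D[e] OR (NOT ac[e])
    icl_r = fad_r                      # IclR: FadR
    arc_a = not oxygen                 # ArcA: NOT o2[e]
    return (not icl_r) and ((not arc_a) or fru_r)


# Aerobic growth on acetate without glucose: glyoxylate cycle genes are on.
aerobic_acetate = glyoxylate_genes_induced(
    glucose=False, acetate=True, oxygen=True, fru_r=False)

# Any glucose in the medium activates FadR and IclR, repressing the genes.
with_glucose = glyoxylate_genes_induced(
    glucose=True, acetate=True, oxygen=True, fru_r=True)
```

Written this way, it is clear that the absence of glucose together with the presence of acetate is a necessary condition for expression, with oxygen or active FruR as the second requirement.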
The response to C4-dicarboxylate compounds is regulated by the two-component system dcuR and dcuS. dcuS codes for a sensor histidine kinase that is activated when fumarate (fum[e]), L-malate (mal-L[e]), or succinate (succ[e]) is present in the media (69). Once activated, the DcuS protein phosphorylates DcuR, activating it. DcuR then induces the gene dctA (46), which codes for the transporter for fumarate, malate, and succinate, FUMt2_2, MALt2_2, and SUCCt2_2, allowing these metabolites to be consumed. DcuR also upregulates fumB (182), which codes for the fumarase B isozyme. Finally, DcuR upregulates frdA, frdB, frdC, and frdD (69, 183), which are subunits of the fumarate reductase enzyme, FRD7. These two reactions are responsible for interconversion of succinate, fumarate, and malate. The regulatory rule for frdA-D is “Fnr OR DcuR,” indicating that fumarate reductase is induced either when oxygen is absent or when fumarate, L-malate, or succinate is present in the media.
When cytoplasmic concentrations of D-fructose 1,6-bisphosphate (fdp) are low, the dual transcriptional regulator FruR represses glycolytic and fermentative enzymes, yet simultaneously induces gluconeogenic genes (14). In contrast, surplus D-fructose 1,6-bisphosphate binds to FruR, dislodging it from its binding sites and thereby derepressing glycolytic and fermentative enzymes and deactivating gluconeogenic enzymes. Since current constraint-based models do not explicitly model metabolite concentrations, the regulatory condition surplusFDP is used to indicate conditions of excess D-fructose 1,6-bisphosphate. The surplusFDP condition is met when fructose (fru[e]) is present in the media, or when the reaction FBP and all of the reactions TKT2, TALA, and PGI have zero flux. When surplusFDP is false, then FruR is true, thereby repressing glycolytic and fermentative enzymes and inducing gluconeogenic enzymes.
FruR represses ptsG coding for D-glucose transport via PEP:Pyr PTS, GLCpts, and pykF coding for the glycolytic enzyme pyruvate kinase, PYK (178), in agreement with the rules “(NOT Mlc) OR (NOT FruR)” and “NOT FruR” in Table 16. FruR also downregulates fermentation by repressing adhE, coding for acetaldehyde dehydrogenase (acetylating), ACALD, and alcohol dehydrogenase (ethanol), ALCD2x (116, 117). FruR upregulates the glyoxylate cycle by inducing aceA coding for isocitrate lyase, ICL, and inducing aceB coding for malate synthase, MALS (40). FruR also upregulates gluconeogenesis by inducing pps, which encodes phosphoenolpyruvate synthase, PPS (121). In summary, FruR is capable of reversing the flow of carbon to replenish glycolytic intermediates as sensed by the level of D-fructose 1,6-bisphosphate.
The dual transcriptional regulator PdhR (pyruvate dehydrogenase complex regulator) downregulates pyruvate dehydrogenase when the pyruvate concentration in the cell is low (148). The Boolean regulatory rule for PdhR is “NOT surplusPYR” (see Fig. 17). High pyruvate concentration is represented by the variable surplusPYR, which is true when there is no flux through either ME1 or ME2 and no flux through any of GLCpts, PYK, PFK, LDH_D, or SUCCt2_2. The Boolean rule for surplusPYR is “(NOT (ME2 OR ME1)) AND (NOT (GLCpts OR PYK OR PFK OR LDH_D OR SUCCt2_2)).” PdhR inhibits the genes aceE and aceF, which both code for subunits of the pyruvate dehydrogenase complex, PDH. The regulatory rule for aceE and aceF is “(NOT PdhR) OR Fis,” meaning that, unless Fis is true, these genes are repressed when PdhR is true, that is, when cytoplasmic pyruvate concentration is low.
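Because surplusPYR is defined in terms of which reactions carry flux, it can be evaluated from a set of active reaction identifiers. The sketch below follows the surplusPYR, PdhR, and aceE/aceF rules quoted in the text; representing the flux state as a Python set of active reactions is an assumption made for illustration, as are the function names.

```python
def surplus_pyr(active):
    """(NOT (ME2 OR ME1)) AND (NOT (GLCpts OR PYK OR PFK OR LDH_D OR SUCCt2_2)).

    active: set of reaction IDs that carry nonzero flux.
    """
    malic_enzymes = {"ME1", "ME2"}
    others = {"GLCpts", "PYK", "PFK", "LDH_D", "SUCCt2_2"}
    return (not (active & malic_enzymes)) and (not (active & others))


def aceEF_induced(active, fis=True):
    """aceE/aceF rule: (NOT PdhR) OR Fis, with PdhR = NOT surplusPYR."""
    pdh_r = not surplus_pyr(active)
    return (not pdh_r) or fis


# A single active reaction from either group is enough to make surplusPYR false.
no_surplus = surplus_pyr({"PYK"})

# With Fis set to true (exponential growth), aceE and aceF are always induced.
exponential = aceEF_induced({"PYK"}, fis=True)
stationary_like = aceEF_induced({"PYK"}, fis=False)
```

Since Fis is set to true when modeling exponential growth, the PdhR term only affects the outcome in simulations where Fis is false.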
The DNA-binding protein Fis regulates the expression of many genes by bending the genomic DNA, changing its topological structure. It has been shown to directly or indirectly regulate 21% of the genes in E. coli, and 894 Fis-associated regions of the genome have been identified (31). Transcription of fis is activated when the cell is in exponential growth phase. Fis then activates the genes aceE and aceF, coding for two of the subunits of the pyruvate dehydrogenase enzyme, PDH. It also activates transcription of the succinate dehydrogenase, SUCDi, subunit genes sdhC, sdhD, sdhA, and sdhB (44). Fis can also activate adhE, which codes for an enzyme that carries out both the acetaldehyde dehydrogenase (acetylating), ACALD, and alcohol dehydrogenase (ethanol), ALCD2x, reactions (116). When modeling the balanced steady-state growth typical of exponential phase, the state of Fis is always set to true. Other growth phases can be modeled with the full E. coli Boolean regulatory model (41).
The response to low nitrogen concentration in E. coli is a complex process (Fig. 16). There is a fast (low-level) response and a slower (high-level) response, represented in the Boolean regulatory model with the regulatory condition variables NRI_low and NRI_hi. The low-level response is activated by the transcription factor GlnG. GlnG is activated by low extracellular ammonium (nh4[e]) concentration and regulates nitrogen levels by induction or repression of many different genes (108). The high-level nitrogen response is activated by the low-level response, so NRI_hi is always activated after NRI_low is activated.
As a whole, the low- and high-level nitrogen responses conserve nitrogen by decreasing glutamate production. NRI_low induces transcription of the transcription factor Nac (nitrogen assimilation control). The product of this gene then represses gdhA (27), which codes for the glutamate dehydrogenase enzyme, GLUDy, which produces glutamate from 2-oxoglutarate. NRI_hi represses the genes gltB and gltD (122), which code for the subunits of the glutamate synthase enzyme, GLUSy. This enzyme produces glutamate through an alternate mechanism, so NRI_hi leads to further nitrogen conservation.
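The nitrogen response is a short cascade, so it can be traced step by step. The sketch below chains the GlnG, NRI_low, NRI_hi, and Nac rules from Table 17 into the gdhA and gltB/gltD rules from Table 16; note that, in those rules, extracellular L-glutamate (glu-L[e]) also appears as an input. The function name and the returned dictionary layout are illustrative assumptions.

```python
def nitrogen_response(nh4_present, glutamate_present):
    """Evaluate the Boolean nitrogen cascade for a given medium."""
    gln_g = not nh4_present   # GlnG: NOT nh4[e]
    nri_low = gln_g           # NRI_low: GlnG
    nri_hi = nri_low          # NRI_hi: NRI_low
    nac = nri_low             # Nac: NRI_low
    return {
        "GlnG": gln_g,
        "NRI_low": nri_low,
        "NRI_hi": nri_hi,
        "Nac": nac,
        # gdhA: NOT (Nac OR glu-L[e])
        "gdhA": not (nac or glutamate_present),
        # gltB and gltD: NOT (NRI_hi AND glu-L[e])
        "gltBD": not (nri_hi and glutamate_present),
    }


ammonium_replete = nitrogen_response(nh4_present=True, glutamate_present=False)
ammonium_limited = nitrogen_response(nh4_present=False, glutamate_present=True)
```

In this encoding, gdhA is repressed as soon as ammonium is absent, whereas gltB and gltD are repressed only when ammonium is absent and extracellular glutamate is available.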
Phosphorus uptake is regulated by the two-component system phoR/phoB (76). phoR codes for a sensor kinase that is phosphorylated when extracellular inorganic phosphate (pi[e]) is not present. The phosphorylated enzyme is activated, and it phosphorylates the transcriptional regulator PhoB. Phosphorylated PhoB then represses the pitB gene, which codes for the phosphate transporter, PIt2r. As indicated in Fig. 17, the regulatory rule for pitB is “NOT PhoB”; therefore, pitB is true when phoB is false, and inorganic phosphate is present. The overall effect of phosphorus regulation is to downregulate the phosphate transport reaction, PIt2r, when no extracellular inorganic phosphate is present.
Once an accurate metabolic reconstruction is converted into a computational model, it may be used for a growing number of applications. Such models have been utilized to address a broad spectrum of basic and practical applications in five main categories: studies of evolutionary processes, analysis of network properties, interpretation of phenotypic screens, model-directed discovery, and metabolic engineering (Fig. 18). Metabolic reconstructions are a common denominator in the systems analysis of metabolic functions. As evident from “Description of the Core E. coli Metabolic Reconstruction” and “Boolean Core E. coli Transcriptional Regulation,” a wealth of biological knowledge is encoded in a manually curated network reconstruction. “Understanding Metabolic Capabilities” describes how this knowledge may then be used to predict capabilities of a network that emerge from the interaction of multiple components. Likewise, the knowledge encoded in a model may be used to study the process of bacterial evolution (59). Applications include the interpretation of experimental adaptive evolution (131), horizontal gene transfer (131, 132), and evolution to minimal metabolic networks (133). Network reconstructions also provide a context for integration of high-throughput data from multiple complementary experiments, as described in “High-Throughput Data Analysis.” Discovery of the biochemical function of previously uncharacterized genes using metabolic models is discussed in “Discovery.” A growing application of genome-scale metabolic reconstructions is the prediction of optimal mutant strains in synthetic biological and engineering settings (“Synthetic Biology and Metabolic Engineering”).
Flux balance analysis and other constraint-based methods can be used to analyze the capabilities of a metabolic network (135). In flux balance analysis, the biomass reaction can be optimized (maximized) by use of linear programming software to simulate growth (187). The result is an in silico prediction of steady-state flux through each reaction in the model including a prediction of the maximum balanced growth rate of the cell. Growth can be simulated under many different conditions, such as aerobic or anaerobic conditions, or growth on glucose or other substrates. Different conditions are simulated by changing the constraints on the exchange reactions. Exchange reactions, at the boundary of the model and the environment, act as sources or sinks of substrate and waste metabolites (Fig. 3). For example, anaerobic conditions are simulated when the lower bound of the O2 exchange reaction, EX_o2(e), is constrained to zero flux, allowing no O2 to enter the system. To simulate different media, the exchange reactions for metabolites present in the media are constrained to have a lower bound equal to their desired uptake rate, and all metabolites not present will have exchange reactions with lower bounds constrained to zero.
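The way a growth medium maps onto exchange reaction bounds can be sketched without any modeling software. The example below assumes the sign convention used in this document, in which uptake is a negative flux through an exchange reaction; the function name is hypothetical, and EX_ac(e) is included only as an illustrative third exchange reaction.

```python
def exchange_lower_bounds(all_exchanges, medium):
    """Build lower bounds for exchange reactions from a medium definition.

    all_exchanges: iterable of exchange reaction IDs in the model
    medium: {exchange_id: maximum uptake rate}, given as positive numbers
            in mmol gDW^-1 h^-1; metabolites not listed are absent
    """
    bounds = {}
    for rxn in all_exchanges:
        uptake = medium.get(rxn, 0.0)
        # Uptake is a negative exchange flux; absent metabolites get zero.
        bounds[rxn] = -abs(uptake) if uptake else 0.0
    return bounds


exchanges = ["EX_glc(e)", "EX_o2(e)", "EX_ac(e)"]

aerobic_glucose = exchange_lower_bounds(
    exchanges, {"EX_glc(e)": 10.0, "EX_o2(e)": 1000.0})
anaerobic_glucose = exchange_lower_bounds(
    exchanges, {"EX_glc(e)": 10.0})
```

Only the lower bounds are changed here; the upper bounds are left open so that any metabolite can still be secreted as a by-product.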
Growth was simulated by using the core E. coli metabolic model (without regulation) with glucose as the only organic substrate, under both aerobic and anaerobic conditions; the resulting flux distributions are shown in Fig. 19. Under aerobic conditions with a glucose uptake rate (EX_glc(e)) of 10 mmol gDW−1 h−1, the growth rate is 0.87 h−1. The flux through the electron transport chain is high, and no organic by-products are secreted. Under anaerobic conditions with the same glucose uptake rate, the growth rate is 0.21 h−1. The electron transport chain and most of the TCA cycle are not used at all, and formate, acetate, and ethanol are all secreted. When the regulatory model is combined with the metabolic model, the reactions FORt2, FORti, FUMt2_2, ICL, MALS, MALt2_2, PFL, and SUCCt2_2 are inactivated under aerobic conditions when growing on glucose. However, since none of these reactions are used in the optimal flux distribution for growth on glucose, the growth rate is not affected. Under anaerobic conditions, the reactions D_LACt2, FUMt2_2, ICL, MALS, MALt2_2, MDH, NADH16, and SUCCt2_2 are inactivated. These reactions are also not part of the optimal flux distribution, so anaerobic growth rate is still 0.21 h−1. These results highlight the predictive capacity of assuming an optimal growth rate, given a particular environmental condition.
The maximum yields of important cofactors such as ATP, NADH, and NADPH can also be determined by using flux balance analysis (185). By constraining the glucose uptake rate to exactly −1 mmol gDW−1 h−1, and setting the ATP maintenance reaction, ATPM, as the objective to be maximized, the yield of ATP from glucose can be calculated. ATPM is a stoichiometrically balanced reaction that drains ATP from the network. To determine the maximum yields of NADH and NADPH, similar balanced drain reactions must be added to the network and set as objectives. The maximum yields of these cofactors are given in Table 18. These yields are limited by the balancing of protons and by the stoichiometry of the network. See Table 19 for examples of COBRA Toolbox (9) commands for performing these simulations.
TABLE 18. Maximum cofactor production from glucose, aerobically
| Cofactor | Yield | PPS, % | Constraint |
| ATP | 17.5 | 0 | H+ |
| NADH | 10 | 0 | Energy, stoichiometry |
| NADPH | 8.778 | 300 | Energy, stoichiometry |
TABLE 19. Example COBRA Toolbox commands for performing flux balance analysis
| Action | Command |
| Change bounds for anaerobic growth | model = changeRxnBounds(model,'EX_o2(e)',0,'l'); |
| Change bounds for aerobic growth | model = changeRxnBounds(model,'EX_o2(e)',-1000,'l'); |
| Change glucose uptake rate to 10 mmol gDW−1 h−1 | model = changeRxnBounds(model,'EX_glc(e)',-10,'l'); |
| Simulate maximum growth by FBA | solution = optimizeCbModel(model); |
| Simulate maximum growth of regulated model | [FBAsols,DRgenes,constrainedRxns,cycleStart,states] = optimizeRegModel(model); |
| Change objective to maximum ATP yield | model = changeObjective(model,'ATPM'); |
A metabolic network can be used as a tool to analyze different types of high-throughput data, including gene expression data. These data can be mapped to the network, providing a context in which to interpret the results. In this example, gene expression data for E. coli comparing anaerobic growth with aerobic growth on glucose (41) were mapped to the core E. coli model (Fig. 20). In this figure, the triangles next to each reaction name represent the genes associated with each reaction. The GPRs in the core model were used to determine which reactions were upregulated or downregulated based on the gene regulation. When plotted against the network map, it is clear that glycolysis and the oxidative branch of the pentose phosphate pathway are upregulated, while the TCA cycle is downregulated. These results do not fully agree with the fluxes predicted by flux balance analysis in “Understanding Metabolic Capabilities,” which is expected because gene expression and metabolic fluxes are not trivially quantitatively related. Nevertheless, such experimental data can be used to identify repressed genes and hence regulatory rules for a particular condition. Setting the flux to zero for reactions catalyzed by repressed genes allows for more biochemically realistic flux balance analysis.
A recently developed computational algorithm called GIMME (10) can be used to interpret gene expression data using constraint-based models. This algorithm uses data from microarrays to form context-specific models by adding reactions for which the associated gene expression levels are above a specified threshold. Additional reactions are then added to meet a known cellular function such as growth or secretion of a metabolite for the conditions in which the microarray data was gathered. These reactions are assigned an inconsistency score based on their agreement with the gene expression data. The GIMME algorithm and the E. coli core metabolic model were used along with 170 E. coli microarrays covering a large number of growth conditions and genetic perturbations to find which conditions are most consistent with the network topology needed for secretion of 12 different organic metabolites. This analysis reveals which conditions are most amenable to production of certain metabolites. Figure 21 shows the consistency scores for each context-specific model created. As we would expect, the common anaerobic secretion products such as ethanol, acetate, and formate (187) are more consistent with the arrays from anaerobically grown strains than with aerobic strains.
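The idea of scoring a flux distribution against expression data can be illustrated with a small calculation. The sketch below assumes one common formulation of the inconsistency score, in which each reaction whose expression falls below the threshold is penalized by the shortfall multiplied by the absolute flux it carries; this formula, the function name, and the example numbers are all assumptions made for illustration and are not taken from the GIMME software itself.

```python
def inconsistency_score(fluxes, expression, threshold):
    """Sum of (threshold - expression) * |flux| over below-threshold reactions.

    fluxes:     {reaction: flux value}
    expression: {reaction: expression level mapped from its genes}
    threshold:  expression level above which a reaction counts as present
    """
    score = 0.0
    for rxn, flux in fluxes.items():
        level = expression.get(rxn)
        if level is None:
            continue  # no expression data: the reaction is not penalized
        penalty = max(0.0, threshold - level)
        score += penalty * abs(flux)
    return score


# Hypothetical flux values and expression levels for three reactions.
fluxes = {"PFL": 2.0, "PDH": 0.0, "ACKr": -1.5}
expression = {"PFL": 3.0, "PDH": 12.0, "ACKr": 8.0}
score = inconsistency_score(fluxes, expression, threshold=10.0)
```

A lower score means that the flux distribution needed to meet the required cellular function relies less on reactions with poorly expressed genes, and is therefore more consistent with the expression data.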
Metabolic and regulatory models can be used to facilitate biological discovery. Computational predictions can be compared with experimental measurements under many different conditions; disagreements indicate that the model may be incomplete or incorrect in some way. These disagreements can be analyzed, aiding in the discovery of new biological features. Usually, the most up-to-date genome-scale models are used for discovery purposes. The core E. coli model is not suitable for discovery because it intentionally lacks most known metabolic reactions and has a limited scope.
Models have been used most often to characterize unknown ORFs in an organism's genome. An algorithm that combines computational analysis of the genome-scale metabolic model iJR904 (151) with experimental growth phenotype screening was recently used to identify eight unknown ORFs in E. coli (150). First, E. coli was grown under many different minimal media conditions in a high-throughput screen with different carbon and nitrogen sources. The growth phenotypes were qualitatively compared with growth phenotypes predicted by flux balance analysis using the genome-scale E. coli metabolic model. The comparison highlighted 50 conditions where growth is possible in vivo but not possible in silico. For every growth phenotype that could not be explained by the model, an optimization algorithm was used to determine the minimum number of reactions that needed to be added to the model from a universal database of reactions (from KEGG [94]) to make growth possible.
In this optimization algorithm, the stoichiometric matrix, representing E. coli reactions and metabolites, was used in addition to a second matrix, containing all the reactions from the universal database, and a third matrix, containing exchange reactions for metabolites not included in iJR904. For 26 of the 50 tested conditions, at least one set of universal database and/or exchange reactions was predicted to be necessary for in silico growth. In some growth conditions, up to 15 different sets of reactions were predicted to each allow in silico growth. These new reaction sets served as hypotheses for the identities of unknown ORFs, and several of these hypotheses were experimentally investigated in more detail.
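The search for a minimum set of added reactions can be illustrated on a toy network. The sketch below is not the published optimization algorithm: it replaces flux balance analysis with a simple reachability test (a metabolite counts as producible once all substrates of some reaction are available), and it replaces the optimization with brute-force enumeration of candidate sets in order of increasing size. All metabolite and reaction names are hypothetical.

```python
from itertools import combinations

def producible(reactions, seeds):
    """Return all metabolites reachable from the seed nutrients."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available

def minimal_additions(model_rxns, candidate_rxns, seeds, target):
    """Return every smallest set of candidate reactions that makes target reachable."""
    names = sorted(candidate_rxns)
    for size in range(len(names) + 1):
        hits = []
        for combo in combinations(names, size):
            rxns = list(model_rxns) + [candidate_rxns[n] for n in combo]
            if target in producible(rxns, seeds):
                hits.append(combo)
        if hits:
            return hits  # all alternative sets of the minimum size
    return []

# Toy model: A -> B and C -> biomass, with no route from the nutrient to A or C.
model_rxns = [(("A",), ("B",)), (("C",), ("biomass",))]
# Toy stand-in for the universal database and exchange reactions.
candidate_rxns = {
    "transport": (("X_ext",), ("A",)),  # uptake of the external nutrient
    "convert": (("B",), ("C",)),        # enzymatic conversion
    "side": (("B",), ("D",)),           # irrelevant side reaction
}

solutions = minimal_additions(model_rxns, candidate_rxns, {"X_ext"}, "biomass")
print(solutions)
```

For this toy case the smallest solution combines one transport reaction with one enzymatic conversion. Returning every set of the minimum size mirrors the observation above that several alternative reaction sets can each restore in silico growth.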
The algorithm predicted that transport reactions needed to be added to the model to allow growth on propionate and 5-keto-D-gluconate. Eight genes that were predicted to be transporters, by homology search, were tested experimentally. It was found that E. coli cannot grow on propionate without the gene putP, and cannot grow on 5-keto-D-gluconate without idnT, indicating that these genes do code for transporters. The algorithm also predicted that growth on D-malate is possible with a transport reaction and enzymatic conversion to succinate. E. coli strains lacking the genes dctA, yeaT, or yeaU are unable to grow on D-malate. Through analysis of gene expression data and biochemical assays, it was found that dctA is the transporter, yeaU is the enzyme, and yeaT is a regulator of the enzyme.
In this same study, genes involved in growth on galactonate γ-lactone were identified through gene expression data, even though the algorithm was unable to identify the necessary reactions. The genome-scale model was still useful, however, because it demonstrated that no known genes were responsible for growth on this substrate. Orphan reactions, reactions that are known to exist but are catalyzed by unknown enzymes, provide another way to identify unknown ORFs in microbial organisms. In the core E. coli model, there are four orphan transport reactions (listed in Table 20). Algorithms that consider the phylogenetic profiles (29) or coexpression and clustering on the genome (99) of the reactions adjacent to the orphan reactions in the network have been shown to be successful in assigning tentative ORFs to such reactions.
TABLE 20. Orphan reactions in the core E. coli reconstruction
| Abbr. | Full name | Equation |
| ACt2r | Acetate reversible transport via proton symport | ac[e] + h[e] ↔ ac + h |
| ETOHt2r | Ethanol reversible transport via proton symport | etoh[e] + h[e] ↔ etoh + h |
| PYRt2r | Pyruvate reversible transport via proton symport | h[e] + pyr[e] ↔ h + pyr |
| SUCCt3 | Succinate transport out via proton antiport | h[e] + succ ↔ h + succ[e] |
Recently, an entirely new pathway for pyrimidine catabolism was discovered in E. coli by use of a combination of a subsystem approach to genome annotation and experimental validation (109). The subsystem approach to genome annotation and pathway analysis allows integrated knowledge of existing biochemical network structure and genomic structure to be projected across the entire collection of diverse species with completely sequenced genomes (130). In addition to establishing which organisms implement one or the other functional variants of a subsystem, this approach helps to reveal gaps in knowledge (missing genes) and potential new players (predicted genes) (129).
Because of the predictive capabilities of constraint-based metabolic models, these models can be used in synthetic biology and metabolic engineering applications (8, 59). In particular, constraint-based models are useful because they can predict when a particular metabolite will be overproduced and secreted. Gene knockouts and knock-ins can be simulated by removing or adding reactions to the network, and the behavior of the modified network can be predicted by flux balance analysis. These model-based predictions are more accurate than predictions based on intuition and knowledge of gene functions, because flux balance analysis considers the complicated interacting effects that a knockout has on all pathways simultaneously.
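As a sketch of how a knockout enters the calculation, the flux balance problem can be written with one extra set of constraints. The notation here is assumed for illustration: S is the stoichiometric matrix, v the flux vector, c the objective vector (for example, selecting the biomass reaction), v_min and v_max the flux bounds, and K a hypothetical index set of the reactions removed by the knockout.

```latex
\begin{aligned}
\max_{v}\quad & c^{T} v \\
\text{subject to}\quad & S v = 0, \\
& v_{\min,i} \le v_i \le v_{\max,i} \quad \text{for all reactions } i, \\
& v_j = 0 \quad \text{for all } j \in K .
\end{aligned}
```

A knock-in corresponds to the opposite change: a new column is appended to S, with its own bounds, before the problem is solved. Because every flux is coupled to the others through S v = 0, fixing a single v_j at zero can shift the optimal fluxes in pathways far from the deleted reaction.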
Growth-coupled designs are a particularly promising class of metabolically engineered strains. When the product of interest in one of these strains is produced at a higher rate, the growth rate of the strain also increases. This is unlike most strain designs, in which the growth rate decreases as flux is diverted to a metabolic by-product. Growth-coupled designs are evolutionarily stable, because their production rates actually increase as mutations that increase growth rate accumulate. The growth coupling of designs can be visualized by using a production envelope, a graph that shows the solution space of a model in the dimensions of growth rate and the exchange reaction of a particular metabolite. When the metabolite is growth coupled, its minimum exchange rate at the maximum possible growth rate will be greater than zero. OptKnock is a bilevel linear programming optimization algorithm that uses a constraint-based model to identify sets of reaction knockouts that couple the production of a metabolite to growth (24). It seeks to simultaneously maximize growth rate and product secretion rate. This algorithm was used to identify growth-coupled designs for many metabolites in the core E. coli model under anaerobic conditions. The production envelopes of some growth-coupled designs are shown in Fig. 22.
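The growth-coupling criterion stated above can be checked directly on the points of a production envelope. The sketch below assumes that the envelope has already been computed and is given as a list of (growth rate, minimum production rate, maximum production rate) tuples. The function name and the two example envelopes are hypothetical and use made-up numbers, not values from Fig. 22.

```python
def is_growth_coupled(envelope, tol=1e-9):
    """True if the minimum production rate at maximum growth is above zero.

    envelope: list of (growth_rate, min_production, max_production) tuples.
    """
    max_growth = max(point[0] for point in envelope)
    at_max = [p for p in envelope if abs(p[0] - max_growth) <= tol]
    min_production = min(p[1] for p in at_max)
    return min_production > tol

# Made-up envelopes for illustration only.
wild_type = [(0.0, 0.0, 20.0), (0.1, 0.0, 12.0), (0.2, 0.0, 3.0)]
knockout = [(0.0, 0.0, 18.0), (0.05, 4.0, 14.0), (0.1, 9.0, 9.0)]

print(is_growth_coupled(wild_type))  # lower edge stays at zero production
print(is_growth_coupled(knockout))   # production is forced at maximum growth
```

In the hypothetical knockout envelope, the lower edge rises with growth rate, so a cell growing at its maximum rate has no choice but to secrete the product. The knockout also lowers the maximum growth rate, which is the usual trade-off in such designs.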
Currently, the wealth of biochemical information exceeds the scope and depth of even the largest network reconstructions. The scope of the E. coli core model represents perhaps the most well characterized fraction of the latest E. coli genome-scale metabolic model (57). Nevertheless, the results of future genetic studies that establish the genes corresponding to orphan reactions need to be incorporated as part of an iterative cycle of development (59). The regulatory network for E. coli core metabolism is rapidly being discovered by use of high-throughput protein-DNA-binding assays (33, 100, 189, 197). The incorporation of this new knowledge, with a more comprehensive study of the biochemical literature on core metabolic regulation, would result in a significant expansion in the number and complexity of Boolean regulatory rules in the E. coli core model.
Biochemical network reconstructions are biochemically, genetically, and genomically structured databases. These reconstructions rely heavily on information retrieved from biochemical characterization of reactions and their substrates. Genetic studies are essential to identify the ORF(s) that encode the enzyme(s) responsible for each catalytic reaction, giving rise to GPRs. More broadly, species-specific biochemical and molecular biological studies are essential to provide a sound experimental basis for the components and reactions that are the key elements in any reconstruction. The utility of biochemical network reconstructions is driven by the fact that they can be transformed into a computational model. In turn, the computational model can be applied to address an increasingly wide variety of biological questions (59), including bacterial evolution (133), analysis of network properties (23, 112, 160), study of phenotypic behavior (53, 83), biological discovery (41, 77, 150), and metabolic engineering (24, 136, 139).
In the future, we can look forward to complete reconstructions of all known biochemical processes in E. coli and Salmonella. Such models ideally serve as structured self-consistent representations of our knowledge. Inevitably they will grow in scope, and more details will need to be added such that model predictions finally reach parity with experimental observations. Even now, in certain situations, model predictions not only provide qualitatively and quantitatively accurate predictions of experiments, but can also be used to suggest profitable avenues for experimental confirmation. Even though reconstructions continue to grow in size and scope, fundamentally, they will still be systems of biochemical reactions, just like the core E. coli model, but with the rest of the cell built around it. Ultimately, the full utility of computational models will be realized at the fingertips of biological domain-specific experts. If all other domain-specific experts have contributed to the same model, then each expert who probes her or his own area of biological expertise in silico can do so knowing that the predictions automatically account for the others’ expertise through the biochemically, genetically, and genomically structured model. We hope that this introduction to the core E. coli model brings this era a little closer.
The following supplemental files can be found online at http://systemsbiology.ucsd.edu/:
ecoli_core_model.xls: A Microsoft Excel file that describes the reactions and metabolites in the core E. coli model. This file includes the full S matrix.
core_regulatory_rules.xls: A Microsoft Excel file that gives the Boolean regulatory rules for every gene in the regulated model.
ecoli_core_model.mat: A MATLAB data file that contains the core E. coli model (without regulation) in a format that can be used with the COBRA Toolbox (see Becker et al. [9] for details on this toolbox).
modelReg.mat: A MATLAB data file that contains the regulated core E. coli model for the COBRA Toolbox.
optimizeRegModel.m: A new COBRA Toolbox function that is needed to run simulations using the regulated model. This function uses flux balance analysis to determine the state of the network while considering regulatory constraints.
dynamicRFBA.m: A new COBRA Toolbox function that performs dynamic flux balance analysis using the regulated model (see Covert et al. [42] for details on dynamic rFBA).
solveBooleanRegModel.m: A COBRA Toolbox function that is required by both optimizeRegModel.m and dynamicRFBA.m.
ecoli_core_model.xml: An SBML file for the core E. coli model without regulation. COBRA Toolbox users with the SBML Toolbox can load this file using the function “readCbModel.”
We thank B. K. Cho, N. Lewis, I. Thiele, and K. Zengler for their helpful comments.
References
1. Alberty, R. A. 2003. Thermodynamics of Biochemical Reactions. Massachusetts Institute of Technology, Cambridge, MA.
2. Alefounder, P. R., S. A. Baldwin, R. N. Perham, and N. J. Short. 1989. Cloning, sequence analysis and over-expression of the gene for the class II fructose 1,6-bisphosphate aldolase of Escherichia coli. Biochem. J.257:529–534.[PubMed]
3. Almaas, E., B. Kovacs, T. Vicsek, Z. N. Oltvai, and A. L. Barabasi. 2004. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature427:839–843.[PubMed] [CrossRef]
4. Ashburner, M., C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin, and G. Sherlock. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet.25:25–29.[PubMed] [CrossRef]
5. Bairoch, A. 2000. The ENZYME database in 2000. Nucleic Acids Res.28:304–305.[PubMed] [CrossRef]
6. Baldwin, S. A., and R. N. Perham. 1978. Novel kinetic and structural properties of the class-I D-fructose 1,6-bisphosphate aldolase from Escherichia coli (Crookes’ strain). Biochem. J.169:643–652.[PubMed]
7. Barrett, C. L., C. D. Herring, J. L. Reed, and B. O. Palsson. 2005. The global transcriptional regulatory network for metabolism in Escherichia coli attains few dominant functional states. Proc. Natl. Acad. Sci. USA102:19103–19108.[PubMed] [CrossRef]
8. Barrett, C. L., T. Y. Kim, H. U. Kim, B. O. Palsson, and S. Y. Lee. 2006. Systems biology as a foundation for genome-scale synthetic biology. Curr. Opin. Biotechnol.17:488–492.[PubMed] [CrossRef]
9. Becker, S. A., A. M. Feist, M. L. Mo, G. Hannum, B. O. Palsson, and M. J. Herrgard. 2007. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nat. Protocols2:727–738. [CrossRef]
10. Becker, S. A., and B. Ø. Palsson. 2008. Context-specific metabolic networks are consistent with experiments. PLoS Comput. Biol.4:e1000082.[PubMed] [CrossRef]
11. Bell, P. J., S. C. Andrews, M. N. Sivak, and J. R. Guest. 1989. Nucleotide sequence of the FNR-regulated fumarase gene (fumB) of Escherichia coli K-12. J. Bacteriol.171:3494–3503.[PubMed]
12. Berry, M. B., E. Bae, T. R. Bilderback, M. Glaser, and G. N. Phillips, Jr. 2006. Crystal structure of ADP/AMP complex of Escherichia coli adenylate kinase. Proteins62:555–556.[PubMed] [CrossRef]
13. Bizouarn, T., O. Fjellstrom, J. Meuller, M. Axelsson, A. Bergkvist, C. Johansson, B. Goran Karlsson, and J. Rydstrom. 2000. Proton translocating nicotinamide nucleotide transhydrogenase from E. coli. Mechanism of action deduced from its structural and catalytic properties. Biochim. Biophys. Acta1457:211–228.[PubMed] [CrossRef]
14. Bledig, S. A., T. M. Ramseier, and M. H. Saier, Jr. 1996. FruR mediates catabolite activation of pyruvate kinase (pykF) gene expression in Escherichia coli. J. Bacteriol.178:280–283.[PubMed]
15. Bongaerts, J., S. Zoske, U. Weidner, and G. Unden. 1995. Transcriptional regulation of the proton translocating NADH dehydrogenase genes (nuoA-N) of Escherichia coli by electron acceptors, electron donors and gene regulators. Mol. Microbiol.16:521–534.[PubMed] [CrossRef]
16. Bonneau, R., M. T. Facciotti, D. J. Reiss, A. K. Schmid, M. Pan, A. Kaur, V. Thorsson, P. Shannon, M. H. Johnson, J. C. Bare, W. Longabaugh, M. Vuthoori, K. Whitehead, A. Madar, L. Suzuki, T. Mori, D. E. Chang, J. Diruggiero, C. H. Johnson, L. Hood, and N. S. Baliga. 2007. A predictive model for transcriptional control of physiology in a free living cell. Cell131:1354–1365.[PubMed] [CrossRef]
17. Botsford, J. L., and J. G. Harman. 1992. Cyclic AMP in prokaryotes. Microbiol. Rev.56:100–122.[PubMed]
18. Branlant, G., and C. Branlant. 1985. Nucleotide sequence of the Escherichia coli gap gene. Different evolutionary behavior of the NAD+-binding domain and of the catalytic domain of D-glyceraldehyde-3-phosphate dehydrogenase. Eur. J. Biochem.150:61–66.[PubMed] [CrossRef]
19. Breitling, R., D. Vitkup, and M. P. Barrett. 2008. New surveyor tools for charting microbial metabolic maps. Nat. Rev. Microbiol.6:156–161.[PubMed] [CrossRef]
20. Bridger, W. A., W. T. Wolodko, W. Henning, C. Upton, R. Majumdar, and S. P. Williams. 1987. The subunits of succinyl-coenzyme A synthetase—function and assembly. Biochem. Soc. Symp.54:103–111.[PubMed]
21. Brock, M., C. Maerker, A. Schutz, U. Volker, and W. Buckel. 2002. Oxidation of propionate to pyruvate in Escherichia coli. Involvement of methylcitrate dehydratase and aconitase. Eur. J. Biochem.269:6184–6194.[PubMed] [CrossRef]
22. Brune, M., R. Schumann, and F. Wittinghofer. 1985. Cloning and sequencing of the adenylate kinase gene (adk) of Escherichia coli. Nucleic Acids Res.13:7139–7151.[PubMed] [CrossRef]
23. Burgard, A. P., E. V. Nikolaev, C. H. Schilling, and C. D. Maranas. 2004. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res.14:301–312.[PubMed] [CrossRef]
24. Burgard, A. P., P. Pharkya, and C. D. Maranas. 2003. Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng.84:647–657.[PubMed] [CrossRef]
25. Burke, W. F., R. A. Johanson, and H. C. Reeves. 1974. NADP+-specific isocitrate dehydrogenase of Escherichia coli. II. Subunit structure. Biochim. Biophys. Acta351:333–340.[PubMed]
26. Cain, B. D., and R. D. Simoni. 1989. Proton translocation by the F1F0 ATPase of Escherichia coli. Mutagenic analysis of the a subunit. J. Biol. Chem.264:3292–3300.[PubMed]
27. Camarena, L., S. Poggio, N. Garcia, and A. Osorio. 1998. Transcriptional repression of gdhA in Escherichia coli is mediated by the Nac protein. FEMS Microbiol. Lett.167:51–56.[PubMed] [CrossRef]
28. Chang, A., M. Scheer, A. Grote, I. Schomburg, and D. Schomburg. 2009. BRENDA, AMENDA and FRENDA the enzyme information system: new content and tools in 2009. Nucleic Acids Res.37:D588–D592.[PubMed] [CrossRef]
29. Chen, L., and D. Vitkup. 2006. Predicting genes for orphan metabolic activities using phylogenetic profiles. Genome Biol.7:R17.[PubMed] [CrossRef]
30. Cho, B. K., C. L. Barrett, E. M. Knight, Y. S. Park, and B. O. Palsson. 2008. Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli. Proc. Natl. Acad. Sci. USA105:19462–19467.[PubMed] [CrossRef]
31. Cho, B. K., E. M. Knight, C. L. Barrett, and B. Ø. Palsson. 2008. Genome-wide analysis of Fis binding in Escherichia coli indicates a causative role for A-/AT-tracts. Genome Res.18:900–910.[PubMed] [CrossRef]
32. Cho, B. K., E. M. Knight, and B. Ø. Palsson. 2006. Transcriptional regulation of the fad regulon genes of Escherichia coli by ArcA. Microbiology152:2207–2219.[PubMed] [CrossRef]
33. Cho, B. K., E. M. Knight, and B. Ø. Palsson. 2008. Genomewide identification of protein binding locations using chromatin immunoprecipitation coupled with microarray. Methods Mol. Biol.439:131–145.[PubMed] [CrossRef]
34. Christie, K. R., S. Weng, R. Balakrishnan, M. C. Costanzo, K. Dolinski, S. S. Dwight, S. R. Engel, B. Feierbach, D. G. Fisk, J. E. Hirschman, E. L. Hong, L. Issel-Tarver, R. Nash, A. Sethuraman, B. Starr, C. L. Theesfeld, R. Andrada, G. Binkley, Q. Dong, C. Lane, M. Schroeder, D. Botstein, and J. M. Cherry. 2004. Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms. Nucleic Acids Res.32(Database issue):D311–D314.[PubMed] [CrossRef]
35. Clark, D. P. 1989. The fermentation pathways of Escherichia coli. FEMS Microbiol. Rev.5:223–234.[PubMed] [CrossRef]
36. Claudel-Renard, C., C. Chevalet, T. Faraut, and D. Kahn. 2003. Enzyme-specific profiles for genome annotation: PRIAM. Nucleic Acids Res.31:6633–6639.[PubMed] [CrossRef]
37. Cole, S. T., C. Condon, B. D. Lemire, and J. H. Weiner. 1985. Molecular biology, biochemistry and bioenergetics of fumarate reductase, a complex membrane-bound iron-sulfur flavoenzyme of Escherichia coli. Biochim. Biophys. Acta811:381–403.[PubMed]
38. Condon, C., R. Cammack, D. S. Patil, and P. Owen. 1985. The succinate dehydrogenase of Escherichia coli. Immunochemical resolution and biophysical characterization of a 4-subunit enzyme complex. J. Biol. Chem.260:9427–9434.[PubMed]
39. Cooper, R. A., and H. L. Kornberg. 1965. Net formation of phosphoenolpyruvate from pyruvate by Escherichia coli. Biochim. Biophys. Acta104:618–620.[PubMed]
40. Cortay, J. C., D. Negre, A. Galinier, B. Duclos, G. Perriere, and A. J. Cozzone. 1991. Regulation of the acetate operon in Escherichia coli: purification and functional characterization of the IclR repressor. EMBO J.10:675–679.[PubMed]
41. Covert, M. W., E. M. Knight, J. L. Reed, M. J. Herrgard, and B. Ø. Palsson. 2004. Integrating high-throughput and computational data elucidates bacterial networks. Nature429:92–96.[PubMed] [CrossRef]
42. Covert, M. W., C. H. Schilling, and B. Palsson. 2001. Regulation of gene expression in flux balance models of metabolism. J. Theor. Biol.213:73–88.[PubMed] [CrossRef]
43. Csonka, L. N., and D. G. Fraenkel. 1977. Pathways of NADPH formation in Escherichia coli. J. Biol. Chem.252:3382–3391.[PubMed]
44. Cunningham, L., and J. R. Guest. 1998. Transcription and transcript processing in the sdhCDAB-sucABCD operon of Escherichia coli. Microbiology144(Pt 8):2113–2123.[PubMed] [CrossRef]
45. Daldal, F. 1984. Nucleotide sequence of gene pfkB encoding the minor phosphofructokinase of Escherichia coli K-12. Gene28:337–342.[PubMed] [CrossRef]
46. Davies, S. J., P. Golby, D. Omrani, S. A. Broad, V. L. Harrington, J. R. Guest, D. J. Kelly, and S. C. Andrews. 1999. Inactivation and regulation of the aerobic C(4)-dicarboxylate transport (dctA) gene of Escherichia coli. J. Bacteriol.181:5624–5635.[PubMed]
47. DeJongh, M., K. Formsma, P. Boillot, J. Gould, M. Rycenga, and A. Best. 2007. Toward the automated generation of genome-scale metabolic networks in the SEED. BMC Bioinformatics8:139.[PubMed] [CrossRef]
48. Delbaere, L. T., A. M. Sudom, L. Prasad, Y. Leduc, and H. Goldie. 2004. Structure/function studies of phosphoryl transfer by phosphoenolpyruvate carboxykinase. Biochim. Biophys. Acta1697:271–278.[PubMed]
49. Duarte, N. C., S. A. Becker, N. Jamshidi, I. Thiele, M. L. Mo, T. D. Vo, R. Srivas, and B. Ø. Palsson. 2007. Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proc. Natl. Acad. Sci. USA104:1777–1782.[PubMed] [CrossRef]
50. Duarte, N. C., M. J. Herrgard, and B. Palsson. 2004. Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res.14:1298–1309.[PubMed] [CrossRef]
51. Dym, O., E. A. Pratt, C. Ho, and D. Eisenberg. 2000. The crystal structure of D-lactate dehydrogenase, a peripheral membrane respiratory enzyme. Proc. Natl. Acad. Sci. USA97:9413–9418.[PubMed] [CrossRef]
52. Eberstadt, M., S. G. Grdadolnik, G. Gemmecker, H. Kessler, A. Buhr, and B. Erni. 1996. Solution structure of the IIB domain of the glucose transporter of Escherichia coli. Biochemistry35:11286–11292.[PubMed] [CrossRef]
53. Edwards, J. S., R. U. Ibarra, and B. Ø. Palsson. 2001. In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nat. Biotechnol.19:125–130.[PubMed] [CrossRef]
54. Edwards, J. S., and B. O. Palsson. 2000. The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc. Natl. Acad. Sci. USA97:5528–5533.[PubMed] [CrossRef]
55. Essenberg, M. K., and R. A. Cooper. 1975. Two ribose-5-phosphate isomerases from Escherichia coli K12: partial characterisation of the enzymes and consideration of their possible physiological roles. Eur. J. Biochem.55:323–332.[PubMed] [CrossRef]
56. Faith, J. J., B. Hayete, J. T. Thaden, I. Mogno, J. Wierzbowski, G. Cottarel, S. Kasif, J. J. Collins, and T. S. Gardner. 2007. Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol.5:e8.[PubMed] [CrossRef]
57. Feist, A. M., C. S. Henry, J. L. Reed, M. Krummenacker, A. R. Joyce, P. D. Karp, L. J. Broadbelt, V. Hatzimanikatis, and B. O. Palsson. 2007. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol. Syst. Biol.3:121.[PubMed] [CrossRef]
58. Feist, A. M., M. J. Herrgard, I. Thiele, J. L. Reed, and B. O. Palsson. 2009. Reconstruction of biochemical networks in microorganisms. Nat. Rev. Microbiol. 7:129–143.[PubMed]
59. Feist, A. M., and B. Ø. Palsson. 2008. The growing scope of applications of genome-scale metabolic reconstructions using Escherichia coli. Nat. Biotechnol.26:659–667.[PubMed] [CrossRef]
60. Ferrandez, A., J. L. Garcia, and E. Diaz. 1997. Genetic characterization and expression in heterologous hosts of the 3-(3-hydroxyphenyl)propionate catabolic pathway of Escherichia coli K-12. J. Bacteriol.179:2573–2581.[PubMed]
61. Fersht, A. 1999. Structure and Mechanism in Protein Science: a Guide to Enzyme Catalysis and Protein Folding. W. H. Freeman, New York, NY.
62. Flint, D. H. 1994. Initial kinetic and mechanistic characterization of Escherichia coli fumarase A. Arch. Biochem. Biophys.311:509–516.[PubMed] [CrossRef]
63. Fong, S. S., A. P. Burgard, C. D. Herring, E. M. Knight, F. R. Blattner, C. D. Maranas, and B. O. Palsson. 2005. In silico design and adaptive evolution of Escherichia coli for production of lactic acid. Biotechnol. Bioeng.91:643–648.[PubMed] [CrossRef]
64. Fraser, H. I., M. Kvaratskhelia, and M. F. White. 1999. The two analogous phosphoglycerate mutases of Escherichia coli. FEBS Lett.455:344–348.[PubMed] [CrossRef]
65. Froman, B. E., R. C. Tait, and L. D. Gottlieb. 1989. Isolation and characterization of the phosphoglucose isomerase gene from Escherichia coli. Mol. Gen. Genet.217:126–131.[PubMed] [CrossRef]
66. Garrido-Pertierra, A., and R. A. Cooper. 1983. Evidence for two distinct pyruvate kinase genes in Escherichia coli K-12. FEBS Lett.162:420–422.[PubMed] [CrossRef]
67. Gianchandani, E. P., A. R. Joyce, B. O. Palsson, and J. A. Papin. 2009. Functional states of the genome-scale Escherichia coli transcriptional regulatory system. PLoS Comput. Biol.5:e1000403.[PubMed] [CrossRef]
68. Gianchandani, E. P., J. A. Papin, N. D. Price, A. R. Joyce, and B. Ø. Palsson. 2006. Matrix formalism to describe functional states of transcriptional regulatory systems. PLoS Comput. Biol.2:e101.[PubMed] [CrossRef]
69. Golby, P., S. Davies, D. J. Kelly, J. R. Guest, and S. C. Andrews. 1999. Identification and characterization of a two-component sensor-kinase and response-regulator system (DcuS-DcuR) controlling gene expression in response to C4-dicarboxylates in Escherichia coli. J. Bacteriol.181:1238–1248.[PubMed]
70. Grainger, D. C., H. Aiba, D. Hurd, D. F. Browning, and S. J. Busby. 2007. Transcription factor distribution in Escherichia coli: studies with FNR protein. Nucleic Acids Res.35:269–278.[PubMed] [CrossRef]
71. Green, J., and J. R. Guest. 1994. Regulation of transcription at the ndh promoter of Escherichia coli by FNR and novel factors. Mol. Microbiol.12:433–444.[PubMed] [CrossRef]
72. Gui, L., A. Sunnarborg, and D. C. LaPorte. 1996. Regulated expression of a repressor protein: FadR activates iclR. J. Bacteriol.178:4704–4709.[PubMed]
73. Guldener, U., M. Munsterkotter, G. Kastenmuller, N. Strack, J. van Helden, C. Lemer, J. Richelles, S. J. Wodak, J. Garcia-Martinez, J. E. Perez-Ortin, H. Michael, A. Kaps, E. Talla, B. Dujon, B. Andre, J. L. Souciet, J. De Montigny, E. Bon, C. Gaillardin, and H. W. Mewes. 2005. CYGD: the Comprehensive Yeast Genome Database. Nucleic Acids Res.33:D364–D368.[PubMed] [CrossRef]
74. Hansen, E. J., and E. Juni. 1975. Isolation of mutants of Escherichia coli lacking NAD- and NADP-linked malic enzyme activities. Biochem. Biophys. Res. Commun.65:559–566.[PubMed] [CrossRef]
75. Hansen, E. J., and E. Juni. 1974. Two routes for synthesis of phosphoenolpyruvate from C4-dicarboxylic acids in Escherichia coli. Biochem. Biophys. Res. Commun.59:1204–1210.[PubMed] [CrossRef]
76. Harris, R. M., D. C. Webb, S. M. Howitt, and G. B. Cox. 2001. Characterization of PitA and PitB from Escherichia coli. J. Bacteriol.183:5008–5014.[PubMed] [CrossRef]
77. Herrgard, M. J., S. S. Fong, and B. Ø. Palsson. 2006. Identification of genome-scale metabolic network models using experimentally measured flux profiles. PLoS Comput. Biol.2:e72.[PubMed] [CrossRef]
78. Herrgard, M. J., B. S. Lee, V. Portnoy, and B. Ø. Palsson. 2006. Integrated analysis of regulatory and metabolic networks reveals novel regulatory mechanisms in Saccharomyces cerevisiae. Genome Res.16:627–635.[PubMed] [CrossRef]
79. Hines, J. K., H. J. Fromm, and R. B. Honzatko. 2006. Novel allosteric activation site in Escherichia coli fructose-1,6-bisphosphatase. J. Biol. Chem.281:18386–18393.[PubMed] [CrossRef]
80. Hoyt, J. C., E. F. Robertson, K. A. Berlyn, and H. C. Reeves. 1988. Escherichia coli isocitrate lyase: properties and comparisons. Biochim. Biophys. Acta966:30–35.[PubMed]
81. Hu, Z., P. J. Killion, and V. R. Iyer. 2007. Genetic reconstruction of a functional transcriptional regulatory network. Nat. Genet.39:683–687.[PubMed] [CrossRef]
82. Hua, Q., C. Yang, T. Baba, H. Mori, and K. Shimizu. 2003. Responses of the central metabolism in Escherichia coli to phosphoglucose isomerase and glucose-6-phosphate dehydrogenase knockouts. J. Bacteriol.185:7053–7067.[PubMed] [CrossRef]
83. Ibarra, R. U., J. S. Edwards, and B. Ø. Palsson. 2002. Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature420:186–189.[PubMed] [CrossRef]
84. Ideker, T. E., V. Thorsson, and R. M. Karp. 2000. Discovery of regulatory interactions through perturbation: inference and experimental design. Pacific Symp. Biocomput.292:305–316.
85. Iida, A., S. Teshiba, and K. Mizobuchi. 1993. Identification and characterization of the tktB gene encoding a second transketolase in Escherichia coli K-12. J. Bacteriol.175:5375–5383.[PubMed]
86. Iuchi, S., and E. C. Lin. 1988. arcA (dye), a global regulatory gene in Escherichia coli mediating repression of enzymes in aerobic pathways. Proc. Natl. Acad. Sci. USA85:1888–1892.[PubMed] [CrossRef]
87. Iverson, T. M., C. Luna-Chavez, G. Cecchini, and D. C. Rees. 1999. Structure of the Escherichia coli fumarate reductase respiratory complex. Science284:1961–1966.[PubMed] [CrossRef]
88. Iwakura, M., J. Hattori, Y. Arita, M. Tokushige, and H. Katsuki. 1979. Studies on regulatory functions of malic enzymes. VI. Purification and molecular properties of NADP-linked malic enzyme from Escherichia coli W. J. Biochem.85:1355–1365.[PubMed]
89. Jiang, G. R., S. Nikolova, and D. P. Clark. 2001. Regulation of the ldhA gene, encoding the fermentative lactate dehydrogenase of Escherichia coli. Microbiology147:2437–2446.[PubMed]
90. Josephson, B. L., and D. G. Fraenkel. 1974. Sugar metabolism in transketolase mutants of Escherichia coli. J. Bacteriol.118:1082–1089.[PubMed]
91. Josephson, B. L., and D. G. Fraenkel. 1969. Transketolase mutants of Escherichia coli. J. Bacteriol.100:1289–1295.[PubMed]
92. Joyce, A. R., J. L. Reed, A. White, R. Edwards, A. Osterman, T. Baba, H. Mori, S. A. Lesely, B. O. Palsson, and S. Agarwalla. 2006. Experimental and computational assessment of conditionally essential genes in Escherichia coli. J. Bacteriol.188:8259–8271.[PubMed] [CrossRef]
93. Kai, Y., H. Matsumura, and K. Izui. 2003. Phosphoenolpyruvate carboxylase: three-dimensional structure and molecular mechanisms. Arch. Biochem. Biophys.414:170–179.[PubMed] [CrossRef]
94. Kanehisa, M., and S. Goto. 2000. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res.28:27–30.[PubMed] [CrossRef]
95. Karp, P. D., I. M. Keseler, A. Shearer, M. Latendresse, M. Krummenacker, S. M. Paley, I. Paulsen, J. Collado-Vides, S. Gama-Castro, M. Peralta-Gil, A. Santos-Zavaleta, M. I. Penaloza-Spinola, C. Bonavides-Martinez, and J. Ingraham. 2007. Multidimensional annotation of the Escherichia coli K-12 genome. Nucleic Acids Res.35:7577–7590.[PubMed] [CrossRef]
96. Kasimoglu, E., S. J. Park, J. Malek, C. P. Tseng, and R. P. Gunsalus. 1996. Transcriptional regulation of the proton-translocating ATPase (atpIBEFHAGDC) operon of Escherichia coli: control by cell growth rate. J. Bacteriol.178:5563–5567.[PubMed]
97. Keseler, I. M., J. Collado-Vides, S. Gama-Castro, J. Ingraham, S. Paley, I. T. Paulsen, M. Peralta-Gil, and P. D. Karp. 2005. EcoCyc: a comprehensive database resource for Escherichia coli. Nucleic Acids Res.33:D334–D337.[PubMed] [CrossRef]
98. Kessler, D., W. Herth, and J. Knappe. 1992. Ultrastructure and pyruvate formate-lyase radical quenching property of the multienzymic AdhE protein of Escherichia coli. J. Biol. Chem.267:18073–18079.[PubMed]
99. Kharchenko, P., D. Vitkup, and G. M. Church. 2004. Filling gaps in a metabolic network using expression information. Bioinformatics20(Suppl. 1):I178–I185.[PubMed] [CrossRef]
100. Kim, T. H., and B. Ren. 2006. Genome-wide analysis of protein-DNA interactions. Annu. Rev. Genomics Hum. Genet.7:81–102.[PubMed] [CrossRef]
101. Knappe, J., and G. Sawers. 1990. A radical-chemical route to acetyl-CoA: the anaerobically induced pyruvate formate-lyase system of Escherichia coli. FEMS Microbiol. Rev.6:383–398.[PubMed] [CrossRef]
102. Kobayashi, K., S. Tagawa, and T. Mogi. 1999. Electron transfer process in cytochrome bd-type ubiquinol oxidase from Escherichia coli revealed by pulse radiolysis. Biochemistry38:5913–5917.[PubMed] [CrossRef]
103. Kornberg, H. L. 1966. Anaplerotic sequences and their role in metabolism. Essays Biochem.2:1–31.
104. Kornberg, H. L. 1965. The coordination of metabolic routes. Function and Structure in Microorganisms: Fifteenth Symposium of the Society for General Microbiology. University Press, London, United Kingdom.
105. Krieger, C. J., P. Zhang, L. A. Mueller, A. Wang, S. Paley, M. Arnaud, J. Pick, S. Y. Rhee, and P. D. Karp. 2004. MetaCyc: a multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res.32(Database issue):D438–D442.[PubMed] [CrossRef]
106. Kuhnel, K., and B. F. Luisi. 2001. Crystal structure of the Escherichia coli RNA degradosome component enolase. J. Mol. Biol.313:583–592.[PubMed] [CrossRef]
107. Kumar, V. S., and C. D. Maranas. 2009. GrowMatch: an automated method for reconciling in silico/in vivo growth predictions. PLoS Comput. Biol.5:e1000308.[PubMed] [CrossRef]
108. Lee, J., J. T. Owens, I. Hwang, C. Meares, and S. Kustu. 2000. Phosphorylation-induced signal propagation in the response regulator ntrC. J. Bacteriol.182:5188–5195.[PubMed] [CrossRef]
109. Loh, K. D., P. Gyaneshwar, E. Markenscoff Papadimitriou, R. Fong, K. S. Kim, R. Parales, Z. Zhou, W. Inwood, and S. Kustu. 2006. A previously undescribed pathway for pyrimidine catabolism. Proc. Natl. Acad. Sci. USA103:5114–5119.[PubMed] [CrossRef]
110. Lyngstadaas, A., G. A. Sprenger, and E. Boye. 1998. Impaired growth of an Escherichia coli rpe mutant lacking ribulose-5-phosphate epimerase activity. Biochim. Biophys. Acta1381:319–330.[PubMed]
111. Maglott, D., J. Ostell, K. D. Pruitt, and T. Tatusova. 2005. Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res.33:D54–D58.[PubMed] [CrossRef]
112. Mahadevan, R., and C. H. Schilling. 2003. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab. Eng.5:264–276.[PubMed] [CrossRef]
113. Mahajan, S. K., C. C. Chu, D. K. Willis, A. Templin, and A. J. Clark. 1990. Physical analysis of spontaneous and mutagen-induced mutants of Escherichia coli K-12 expressing DNA exonuclease VIII activity. Genetics125:261–273.[PubMed]
114. Markowitz, V. M., F. Korzeniewski, K. Palaniappan, E. Szeto, G. Werner, A. Padki, X. Zhao, I. Dubchak, P. Hugenholtz, I. Anderson, A. Lykidis, K. Mavromatis, N. Ivanova, and N. C. Kyrpides. 2006. The integrated microbial genomes (IMG) system. Nucleic Acids Res.34:D344–D348.[PubMed] [CrossRef]
115. Melendez-Hevia, E., and A. Isidoro. 1985. The game of the pentose phosphate cycle. J. Theor. Biol.117:251–263.[PubMed] [CrossRef]
116. Membrillo-Hernandez, J., and E. C. Lin. 1999. Regulation of expression of the adhE gene, encoding ethanol oxidoreductase in Escherichia coli: transcription from a downstream promoter and regulation by fnr and RpoS. J. Bacteriol.181:7571–7579.[PubMed]
117. Mikulskis, A., A. Aristarkhov, and E. C. Lin. 1997. Regulation of expression of the ethanol dehydrogenase gene (adhE) in Escherichia coli by catabolite repressor activator protein Cra. J. Bacteriol.179:7129–7134.[PubMed]
118. Molina, I., M. T. Pellicer, J. Badia, J. Aguilar, and L. Baldoma. 1994. Molecular characterization of Escherichia coli malate synthase G. Differentiation with the malate synthase A isoenzyme. Eur. J. Biochem.224:541–548.[PubMed] [CrossRef]
119. Muirhead, H. 1990. Isoenzymes of pyruvate kinase. Biochem. Soc. Trans.18:193–196.[PubMed]
120. Narindrasorasak, S., and W. A. Bridger. 1977. Phosphoenolpyruvate synthetase of Escherichia coli: molecular weight, subunit composition, and identification of phosphohistidine in phosphoenzyme intermediate. J. Biol. Chem.252:3121–3127.[PubMed]
121. Negre, D., C. Oudot, J. F. Prost, K. Murakami, A. Ishihama, A. J. Cozzone, and J. C. Cortay. 1998. FruR-mediated transcriptional activation at the ppsA promoter of Escherichia coli. J. Mol. Biol.276:355–365.[PubMed] [CrossRef]
122. Neidhardt, F. C., R. Curtiss III, J. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger (ed.). 1996. Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd ed. ASM Press, Washington, DC.
123. Neidhardt, F. C., and H. E. Umbarger. 1996. Chemical composition of Escherichia coli, p. 13–16. In F. C. Neidhardt, R. Curtiss III, J. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger (ed.), Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd ed., vol. 1. ASM Press, Washington, DC.
124. Nellemann, L. J., F. Holm, T. Atlung, and F. G. Hansen. 1989. Cloning and characterization of the Escherichia coli phosphoglycerate kinase (pgk) gene. Gene77:185–191.[PubMed] [CrossRef]
125. Nguyen, N. T., R. Maurus, D. J. Stokell, A. Ayed, H. W. Duckworth, and G. D. Brayer. 2001. Comparative analysis of folding and substrate binding sites between regulated hexameric type II citrate synthases and unregulated dimeric type I enzymes. Biochemistry40:13177–13187.[PubMed] [CrossRef]
126. Niersbach, M., F. Kreuzaler, R. H. Geerse, P. W. Postma, and H. J. Hirsch. 1992. Cloning and nucleotide sequence of the Escherichia coli K-12 ppsA gene, encoding PEP synthase. Mol. Gen. Genet.231:332–336.[PubMed]
127. Nogales, J., B. Ø. Palsson, and I. Thiele. 2008. A genome-scale metabolic reconstruction of Pseudomonas putida KT2440: iJN746 as a cell factory. BMC Syst. Biol.2:79.[PubMed] [CrossRef]
128. Okinaka, R. T., and W. J. Dobrogosz. 1967. Catabolite repression and pyruvate metabolism in Escherichia coli. J. Bacteriol.93:1644–1650.[PubMed]
129. Osterman, A. 2006. A hidden metabolic pathway exposed. Proc. Natl. Acad. Sci. USA103:5637–5638.[PubMed] [CrossRef]
130. Overbeek, R., T. Begley, R. M. Butler, J. V. Choudhuri, H. Y. Chuang, M. Cohoon, V. de Crecy-Lagard, N. Diaz, T. Disz, R. Edwards, M. Fonstein, E. D. Frank, S. Gerdes, E. M. Glass, A. Goesmann, A. Hanson, D. Iwata-Reuyl, R. Jensen, N. Jamshidi, L. Krause, M. Kubal, N. Larsen, B. Linke, A. C. McHardy, F. Meyer, H. Neuweger, G. Olsen, R. Olson, A. Osterman, V. Portnoy, G. D. Pusch, D. A. Rodionov, C. Ruckert, J. Steiner, R. Stevens, I. Thiele, O. Vassieva, Y. Ye, O. Zagnitko, and V. Vonstein. 2005. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res.33:5691–5702.[PubMed] [CrossRef]
131. Pal, C., B. Papp, and M. J. Lercher. 2005. Adaptive evolution of bacterial metabolic networks by horizontal gene transfer. Nat. Genet.37:1372–1375.[PubMed] [CrossRef]
132. Pal, C., B. Papp, and M. J. Lercher. 2005. Horizontal gene transfer depends on gene content of the host. Bioinformatics21(Suppl. 2):ii222–ii223.[PubMed] [CrossRef]
133. Pal, C., B. Papp, M. J. Lercher, P. Csermely, S. G. Oliver, and L. D. Hurst. 2006. Chance and necessity in the evolution of minimal metabolic networks. Nature440:667–670.[PubMed] [CrossRef]
134. Paley, S. M., and P. D. Karp. 2002. Evaluation of computational metabolic-pathway predictions for Helicobacter pylori. Bioinformatics18:715–724.[PubMed] [CrossRef]
135. Palsson, B. Ø. 2006. Systems Biology: Properties of Reconstructed Networks. Cambridge University Press, New York, NY.
136. Park, J. H., K. H. Lee, T. Y. Kim, and S. Y. Lee. 2007. Metabolic engineering of Escherichia coli for the production of L-valine based on transcriptome analysis and in silico gene knockout simulation. Proc. Natl. Acad. Sci. USA104:7797–7802.[PubMed] [CrossRef]
137. Park, S. J., G. Chao, and R. P. Gunsalus. 1997. Aerobic regulation of the sucABCD genes of Escherichia coli, which encode alpha-ketoglutarate dehydrogenase and succinyl coenzyme A synthetase: roles of ArcA, Fnr, and the upstream sdhCDAB promoter. J. Bacteriol.179:4138–4142.[PubMed]
138. Park, S. J., and R. P. Gunsalus. 1995. Oxygen, iron, carbon, and superoxide control of the fumarase fumA and fumC genes of Escherichia coli: role of the arcA, fnr, and soxR gene products. J. Bacteriol.177:6255–6262.[PubMed]
139. Patil, K. R., I. Rocha, J. Forster, and J. Nielsen. 2005. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics6:308.[PubMed] [CrossRef]
140. Pellicer, M. T., C. Fernandez, J. Badia, J. Aguilar, E. C. Lin, and L. Baldoma. 1999. Cross-induction of glc and ace operons of Escherichia coli attributable to pathway intersection. Characterization of the glc promoter. J. Biol. Chem.274:1745–1752.[PubMed] [CrossRef]
141. Perham, R. N., and L. C. Packman. 1989. 2-Oxo acid dehydrogenase multienzyme complexes: domains, dynamics, and design. Ann. N. Y. Acad. Sci.573:1–20.[PubMed] [CrossRef]
142. Peterson, J. D., L. A. Umayam, T. Dickinson, E. K. Hickey, and O. White. 2001. The comprehensive microbial resource. Nucleic Acids Res.29:123–125.[PubMed] [CrossRef]
143. Peyru, G., and D. G. Fraenkel. 1968. Genetic mapping of loci for glucose-6-phosphate dehydrogenase, gluconate-6-phosphate dehydrogenase, and gluconate-6-phosphate dehydrase in Escherichia coli. J. Bacteriol.95:1272–1278.[PubMed]
144. Pichersky, E., L. D. Gottlieb, and J. F. Hess. 1984. Nucleotide sequence of the triose phosphate isomerase gene of Escherichia coli. Mol. Gen. Genet.195:314–320.[PubMed] [CrossRef]
145. Plumbridge, J. 1998. Control of the expression of the manXYZ operon in Escherichia coli: Mlc is a negative regulator of the mannose PTS. Mol. Microbiol.27:369–380.[PubMed] [CrossRef]
146. Price, N. D., J. A. Papin, C. H. Schilling, and B. Palsson. 2003. Genome-scale microbial in silico models: the constraints-based approach. Trends Biotechnol.21:162–169.[PubMed] [CrossRef]
147. Prodromou, C., M. J. Haynes, and J. R. Guest. 1991. The aconitase of Escherichia coli: purification of the enzyme and molecular cloning and map location of the gene (acn). J. Gen. Microbiol.137:2505–2515.[PubMed]
148. Quail, M. A., and J. R. Guest. 1995. Purification, characterization and mode of action of PdhR, the transcriptional repressor of the pdhR-aceEF-lpd operon of Escherichia coli. Mol. Microbiol.15:519–529.[PubMed] [CrossRef]
149. Reed, J. L., I. Famili, I. Thiele, and B. Ø. Palsson. 2006. Towards multidimensional genome annotation. Nat. Rev. Genet.7:130–141.[PubMed] [CrossRef]
150. Reed, J. L., T. R. Patel, K. H. Chen, A. R. Joyce, M. K. Applebee, C. D. Herring, O. T. Bui, E. M. Knight, S. S. Fong, and B. Ø. Palsson. 2006. Systems approach to refining genome annotation. Proc. Natl. Acad. Sci. USA103:17480–17484.[PubMed] [CrossRef]
151. Reed, J. L., T. D. Vo, C. H. Schilling, and B. Ø. Palsson. 2003. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol.4:R54.1–R54.12.[PubMed] [CrossRef]
152. Reed, L. J., F. H. Pettit, M. H. Eley, L. Hamilton, J. H. Collins, and R. M. Oliver. 1975. Reconstitution of the Escherichia coli pyruvate dehydrogenase complex. Proc. Natl. Acad. Sci. USA72:3068–3072.[PubMed] [CrossRef]
153. Reidl, J., and W. Boos. 1991. The malX malY operon of Escherichia coli encodes a novel enzyme II of the phosphotransferase system recognizing glucose and maltose and an enzyme abolishing the endogenous induction of the maltose system. J. Bacteriol.173:4862–4876.[PubMed]
154. Ren, Q., K. Chen, and I. T. Paulsen. 2007. TransportDB: a comprehensive database resource for cytoplasmic membrane transport systems and outer membrane channels. Nucleic Acids Res.35:D274–D279.[PubMed] [CrossRef]
155. Resendis-Antonio, O., J. L. Reed, S. Encarnacion, J. Collado-Vides, and B. Ø. Palsson. 2007. Metabolic reconstruction and modeling of nitrogen fixation in Rhizobium etli. PLoS Comput. Biol.3:1887–1895.[PubMed] [CrossRef]
156. Rhee, S. G., G. A. Ubom, J. B. Hunt, and P. B. Chock. 1982. Catalytic cycle of the biosynthetic reaction catalyzed by adenylylated glutamine synthetase from Escherichia coli. J. Biol. Chem.257:289–297.[PubMed]
157. Riley, M., T. Abe, M. B. Arnaud, M. K. Berlyn, F. R. Blattner, R. R. Chaudhuri, J. D. Glasner, T. Horiuchi, I. M. Keseler, T. Kosuge, H. Mori, N. T. Perna, G. Plunkett III, K. E. Rudd, M. H. Serres, G. H. Thomas, N. R. Thomson, D. Wishart, and B. L. Wanner. 2006. Escherichia coli K-12: a cooperatively developed annotation snapshot-2005. Nucleic Acids Res.34:1–9.[PubMed] [CrossRef]
158. Rypniewski, W. R., and P. R. Evans. 1989. Crystal structure of unliganded phosphofructokinase from Escherichia coli. J. Mol. Biol.207:805–821.[PubMed] [CrossRef]
159. Saier, M. H., Jr. 1998. Multiple mechanisms controlling carbon metabolism in bacteria. Biotechnol. Bioeng.58:170–174.[PubMed] [CrossRef]
160. Samal, A., and S. Jain. 2008. The regulatory network of E. coli metabolism as a Boolean dynamical system exhibits both homeostasis and flexibility of response. BMC Syst. Biol.2:21.[PubMed] [CrossRef]
161. Sauer, U., F. Canonaco, S. Heri, A. Perrenoud, and E. Fischer. 2004. The soluble and membrane-bound transhydrogenases UdhA and PntAB have divergent functions in NADPH metabolism of Escherichia coli. J. Biol. Chem.279:6613–6619.[PubMed] [CrossRef]
162. Sawers, G. 2001. A novel mechanism controls anaerobic and catabolite regulation of the Escherichia coli tdc operon. Mol. Microbiol.39:1285–1298.[PubMed] [CrossRef]
163. Sawers, G. 1993. Specific transcriptional requirements for positive regulation of the anaerobically inducible pfl operon by ArcA and FNR. Mol. Microbiol.10:737–747.[PubMed] [CrossRef]
164. Sawers, G., and G. Watson. 1998. A glycyl radical solution: oxygen-dependent interconversion of pyruvate formate-lyase. Mol. Microbiol.29:945–954.[PubMed] [CrossRef]
165. Schneider, D., T. Pohl, J. Walter, K. Dorner, M. Kohlstadt, A. Berger, V. Spehr, and T. Friedrich. 2008. Assembly of the Escherichia coli NADH:ubiquinone oxidoreductase (complex I). Biochim. Biophys. Acta1777:735–739.[PubMed] [CrossRef]
166. Segal, E., M. Shapira, A. Regev, D. Pe’er, D. Botstein, D. Koller, and N. Friedman. 2003. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat. Genet.34:166–176.[PubMed] [CrossRef]
167. Segre, D., D. Vitkup, and G. M. Church. 2002. Analysis of optimality in natural and perturbed metabolic networks. Proc. Natl. Acad. Sci. USA99:15112–15117.[PubMed] [CrossRef]
168. Shimada, T., A. Ishihama, S. J. Busby, and D. C. Grainger. 2008. The Escherichia coli RutR transcription factor binds at targets within genes as well as intergenic regions. Nucleic Acids Res.36:3950–3955.[PubMed] [CrossRef]
169. Shlomi, T., O. Berkman, and E. Ruppin. 2005. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proc. Natl. Acad. Sci. USA102:7695–7700.[PubMed] [CrossRef]
170. Skarstedt, M. T., and E. Silverstein. 1976. Escherichia coli acetate kinase mechanism studied by net initial rate, equilibrium, and independent isotopic exchange kinetics. J. Biol. Chem.251:6775–6783.[PubMed]
171. Spehr, V., A. Schlitt, D. Scheide, V. Guenebaut, and T. Friedrich. 1999. Overexpression of the Escherichia coli nuo-operon and isolation of the overproduced NADH:ubiquinone oxidoreductase (complex I). Biochemistry38:16261–16267.[PubMed] [CrossRef]
172. Sprenger, G. A., U. Schorken, G. Sprenger, and H. Sahm. 1995. Transketolase A of Escherichia coli K12. Purification and properties of the enzyme from recombinant strains. Eur. J. Biochem.230:525–532.[PubMed] [CrossRef]
173. Stein, L. 2001. Genome annotation: from sequence to biology. Nat. Rev. Genet.2:493–503.[PubMed] [CrossRef]
174. Stoesser, G., M. A. Tuli, R. Lopez, and P. Sterk. 1999. The EMBL nucleotide sequence database. Nucleic Acids Res.27:18–24.[PubMed] [CrossRef]
175. Stolz, B., M. Huber, Z. Markovic-Housley, and B. Erni. 1993. The mannose transporter of Escherichia coli. Structure and function of the IIABMan subunit. J. Biol. Chem.268:27094–27099.[PubMed]
176. Sutherland, P., and L. McAlister-Henn. 1985. Isolation and expression of the Escherichia coli gene encoding malate dehydrogenase. J. Bacteriol.163:1074–1079.[PubMed]
177. Suzuki, T. 1969. Phosphotransacetylase of Escherichia coli B, activation by pyruvate and inhibition by NADH and certain nucleotides. Biochim. Biophys. Acta191:559–569.[PubMed]
178. Tanaka, Y., K. Kimata, T. Inada, H. Tagami, and H. Aiba. 1999. Negative regulation of the pts operon by Mlc: mechanism underlying glucose induction in Escherichia coli. Genes Cells4:391–399.[PubMed] [CrossRef]
179. Thiele, I., N. Jamshidi, R. M. Fleming, and B. Ø. Palsson. 2009. Genome-scale reconstruction of Escherichia coli's transcriptional and translational machinery: a knowledge base, its mathematical formulation, and its functional characterization. PLoS Comput. Biol.5:e1000312.[PubMed] [CrossRef]
180. Thomason, L. C., D. L. Court, A. R. Datta, R. Khanna, and J. L. Rosner. 2004. Identification of the Escherichia coli K-12 ybhE gene as pgl, encoding 6-phosphogluconolactonase. J. Bacteriol.186:8248–8253.[PubMed] [CrossRef]
181. Thomson, G. J., G. J. Howlett, A. E. Ashcroft, and A. Berry. 1998. The dhnA gene of Escherichia coli encodes a class I fructose bisphosphate aldolase. Biochem. J.331(Pt. 2):437–445.[PubMed]
182. Tseng, C. P. 1997. Regulation of fumarase (fumB) gene expression in Escherichia coli in response to oxygen, iron and heme availability: role of the arcA, fur, and hemA gene products. FEMS Microbiol. Lett.157:67–72.[PubMed] [CrossRef]
183. Tseng, C. P., J. Albrecht, and R. P. Gunsalus. 1996. Effect of microaerophilic cell growth conditions on expression of the aerobic (cyoABCDE and cydAB) and anaerobic (narGHJI, frdABCD, and dmsABC) respiratory pathway genes in Escherichia coli. J. Bacteriol.178:1094–1098.[PubMed]
184. Unden, G., and P. Dunnwald. 11 March 2008, posting date. The Aerobic and Anaerobic Respiratory Chain of Escherichia coli and Salmonella enterica: Enzymes and Energetics. In A. Böck, R. Curtiss III, J. B. Kaper, P. D. Karp, F. C. Neidhardt, T. Nyström, J. M. Slauch, C. L. Squires, and D. Ussery (ed.), EcoSal—Escherichia coli and Salmonella: Cellular and Molecular Biology. ASM Press, Washington, DC.
185. Varma, A., and B. Ø. Palsson. 1993. Metabolic capabilities of Escherichia coli: I. Synthesis of biosynthetic precursors and cofactors. J. Theor. Biol.165:477–502. [CrossRef]
186. Varma, A., and B. Ø. Palsson. 1993. Metabolic capabilities of Escherichia coli: II. Optimal growth patterns. J. Theor. Biol.165:503–522. [CrossRef]
187. Varma, A., and B. Ø. Palsson. 1994. Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Appl. Environ. Microbiol.60:3724–3731.[PubMed]
188. Veronese, F. M., E. Boccu, and A. Fontana. 1976. Isolation and properties of 6-phosphogluconate dehydrogenase from Escherichia coli. Some comparisons with the thermophilic enzyme from Bacillus stearothermophilus. Biochemistry15:4026–4033.[PubMed] [CrossRef]
189. Wade, J. T., K. Struhl, S. J. Busby, and D. C. Grainger. 2007. Genomic analysis of protein-DNA interactions in bacteria: insights into transcription and chromosome organization. Mol. Microbiol.65:21–26.[PubMed] [CrossRef]
190. Wallace, B., Y. J. Yang, J. S. Hong, and D. Lum. 1990. Cloning and sequencing of a gene encoding a glutamate and aspartate carrier of Escherichia coli K-12. J. Bacteriol.172:3214–3220.[PubMed]
191. Wilks, J. C., and J. L. Slonczewski. 2007. pH of the cytoplasm and periplasm of Escherichia coli: rapid measurement by green fluorescent protein fluorimetry. J. Bacteriol.189:5601–5607.[PubMed] [CrossRef]
192. Woods, S. A., S. D. Schwartzbach, and J. R. Guest. 1988. Two biochemically distinct classes of fumarase in Escherichia coli. Biochim. Biophys. Acta954:14–26.[PubMed]
193. Workman, C. T., H. C. Mak, S. McCuine, J. B. Tagne, M. Agarwal, O. Ozier, T. J. Begley, L. D. Samson, and T. Ideker. 2006. A systems approach to mapping DNA damage response pathways. Science312:1054–1059.[PubMed] [CrossRef]
194. Wu, L. F., and M. A. Mandrand-Berthelot. 1995. A family of homologous substrate-binding proteins with a broad range of substrate specificity and dissimilar biological functions. Biochimie77:744–750.[PubMed] [CrossRef]
195. Yankovskaya, V., R. Horsefield, S. Tornroth, C. Luna-Chavez, H. Miyoshi, C. Leger, B. Byrne, G. Cecchini, and S. Iwata. 2003. Architecture of succinate dehydrogenase and reactive oxygen species generation. Science299:700–704.[PubMed] [CrossRef]
196. Yuan, J., W. U. Fowler, E. Kimball, W. Lu, and J. D. Rabinowitz. 2006. Kinetic flux profiling of nitrogen assimilation in Escherichia coli. Nat. Chem. Biol.2:529–530.[PubMed] [CrossRef]
197. Zhou, D., and R. Yang. 2006. Global analysis of gene transcription regulation in prokaryotes. Cell. Mol. Life Sci.63:2260–2290.[PubMed] [CrossRef]