

Chapter 3

Construction of a Symbol and Arrow Diagram for Your System

Defining the Boundary of Your Model

Thoughtful consideration of what is to be included in your model, and what is not, can save you a lot of trouble in the later stages of the modeling process. A good first step in defining the boundary of your model is to list all the variables that might reasonably be expected to control, influence or regulate your biological system, but which you confidently expect will remain constant during all experiments you wish to analyze. If you will, for example, perform all your experiments at 20 degrees C, then you will not wish to write differential equations for the delivery and removal of heat to and from your system. Or if you consider that the biological system contains sufficient buffers that no reasonable production of protons or hydroxyl ions will alter the ambient pH, then you will not want to include the possible dependence of protein function on pH in your model. Similarly, you may be willing to assume that the concentrations of ATP and NADH will remain constant, or that, on the time scale of your experiment, the concentrations of the enzymes and other proteins do not change. A good definition of the boundary of your model is provided when you can name the variables that you expect to change with time. These are the variables for which you will write differential equations, and they are the variables INSIDE your model boundary.

Returning to the last example in the previous paragraph, there is a useful generalization to be made. Many cellular responses take place on time scales that are rapid compared to the time required to initiate transcription and translation. Models of such systems generally will NOT include differential equations for the mass of a transport protein or an enzyme. This is because the processes that might change these masses are too slow to do so during the period over which measurements are being made. This is the difference between responses governed by pre-existing proteins, and responses that require gene transcription and translation of mRNA into protein.

Biological variables that you place outside the boundary of your model generally do not appear in your symbol and arrow diagram. You should be constantly aware that everything left out of your diagram is equivalent to an assumption. Don't hesitate to add such assumptions to your assumption list.

The boundary of your model may have other features. You may for example have recorded measurements of a particular variable which you know serves as a regulator of processes in your model, but the mechanisms by which the recorded dynamics come about are not of interest. In this case, even though the quantity is not constant, you may wish to use it as a forcing function and thus decouple your system of interest from this variable by placing it on the other side of your system boundary.
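
The forcing-function idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the recorded time course (`T_OBS`, `U_OBS`), the single state variable `x`, the rate constants `k_in` and `k_out`, and the simple rate law are all hypothetical and do not come from any particular experiment. The recorded regulator is interpolated and imposed on the model rather than being given a differential equation of its own.

```python
from bisect import bisect_right

# Hypothetical recorded time course of a regulator (time in s, arbitrary units).
T_OBS = [0.0, 10.0, 20.0, 30.0]
U_OBS = [1.0, 3.0, 3.0, 1.0]


def forcing(t):
    """Linearly interpolate the record; hold the end values outside it."""
    if t <= T_OBS[0]:
        return U_OBS[0]
    if t >= T_OBS[-1]:
        return U_OBS[-1]
    i = bisect_right(T_OBS, t) - 1
    frac = (t - T_OBS[i]) / (T_OBS[i + 1] - T_OBS[i])
    return U_OBS[i] + frac * (U_OBS[i + 1] - U_OBS[i])


def simulate(x0, k_in, k_out, t_end, dt=0.01):
    """Forward-Euler solution of dx/dt = k_in*u(t) - k_out*x (assumed rate law).

    u(t) is the forcing function: it drives x, but nothing in the model
    feeds back on it, so it sits outside the model boundary.
    """
    x = x0
    for step in range(round(t_end / dt)):
        t = step * dt
        x += dt * (k_in * forcing(t) - k_out * x)
    return x


if __name__ == "__main__":
    print(forcing(5.0))                      # regulator value between samples
    print(simulate(0.0, 0.5, 0.2, 30.0))     # state variable at t = 30 s
```

Because the regulator enters only through `forcing(t)`, the model needs no hypothesis about how the regulator's own dynamics arise.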

Choosing a basis for your model is often confused with setting the boundary. But, as the term is used in this book, your basis embodies the principal normalization for all the model's variables. You may, for example, choose to express all your model's fluxes as pmol/sec/million cells. If so, you have chosen a million cells as the basis of your model. The same model could be constructed on the basis of a single cell, and fluxes might then be reported as amol/sec/cell. If many of your experimental measurements are normalized to cell protein content, you might choose one mg cell protein as the basis of your model. The primary criterion for choosing your basis is convenience. Whenever possible, choose a basis that is a natural normalization for experimental measurements. An experienced modeler will carry in his or her head several of the most frequently required conversion factors, so that numbers from the model simulations can readily be converted to other bases.

Here are some useful rough approximations:

  • There are about 5 - 20 pg total DNA per cell
  • There are about 10 pg total RNA per cell
  • There are about 125 - 300 pg total protein per cell
  • The total dry weight of a cell is about 400 pg
  • The total wet weight of a cell is about 1300 pg
  • Cytosolic volume is about 1 pl per cell
  • The number of different proteins in a cell is between 5000 and 10,000
  • There are about 5 x 10^9 protein molecules in a cell
  • Cells make up about half the wet weight of a tissue
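
As a hedged illustration of moving between bases, the following Python sketch converts a flux expressed per million cells into a per-cell flux and into a flux per mg cell protein. The function names are hypothetical, and the protein content of 200 pg per cell is an assumed value chosen from within the 125 - 300 pg range listed above; substitute the value appropriate to your own cell type.

```python
# Assumed protein content per cell (pg); the list above gives 125 - 300 pg.
PROTEIN_PG_PER_CELL = 200.0


def per_million_cells_to_per_cell(flux_pmol_per_s):
    """Convert pmol/sec/million cells to amol/sec/cell."""
    mol_per_s_per_cell = flux_pmol_per_s * 1e-12 / 1e6  # pmol -> mol, per cell
    return mol_per_s_per_cell / 1e-18                   # mol -> amol


def per_million_cells_to_per_mg_protein(flux_pmol_per_s,
                                        protein_pg_per_cell=PROTEIN_PG_PER_CELL):
    """Convert pmol/sec/million cells to pmol/sec/mg cell protein."""
    # 1 mg = 1e-3 g and 1 pg = 1e-12 g, so this is the number of cells per mg.
    cells_per_mg = 1e-3 / (protein_pg_per_cell * 1e-12)
    return flux_pmol_per_s * cells_per_mg / 1e6


if __name__ == "__main__":
    flux = 3.0  # hypothetical flux, pmol/sec/million cells
    print(per_million_cells_to_per_cell(flux), "amol/sec/cell")
    print(per_million_cells_to_per_mg_protein(flux), "pmol/sec/mg protein")
```

Note that a flux in pmol/sec/million cells has the same numerical value in amol/sec/cell, which is one reason the million-cell basis is convenient.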

Especially when modeling individual cells, it is important to be completely familiar with all of the standard prefixes denoting useful orders of magnitude for physical quantities. Here is a table of the most widely used prefixes in biological modeling.

Prefix    Meaning
d         10^-1
c         10^-2
m         10^-3
µ         10^-6
n         10^-9
p         10^-12
f         10^-15
a         10^-18
z         10^-21

As more and more modeling is done at the cellular level, it becomes ever more important to be able to convert easily between molar concentrations and molecular abundances measured in molecules (or copies) per cell. To make this truly memorable, consider a cell whose volume is 1.66 pl. This is a reasonable volume for many cells, and it is chosen because one molecule is 1/(6.022 x 10^23) mol, or about 1.66 x 10^-24 mol, so that a single molecule in 1.66 pl corresponds to a concentration of 1 pM. Given only this constraint, the following table provides a useful modeler's rule of thumb.

Abundance (molecules per cell)    Concentration
1                                 1 pM
10                                10 pM
100                               100 pM
1000                              1 nM
10^4                              10 nM
10^5                              100 nM
10^6                              1 µM
10^7                              10 µM
10^8                              100 µM
10^9                              1 mM
10^10                             10 mM
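
The rule of thumb in the table can be checked, and extended to other cell volumes, with a short Python sketch. The function names and the default volume argument are illustrative choices, not part of any particular software package.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_to_molar(n_molecules, volume_pl=1.66):
    """Molar concentration of n_molecules in a cell of volume_pl picoliters."""
    volume_liters = volume_pl * 1e-12
    return n_molecules / (AVOGADRO * volume_liters)


def molar_to_molecules(conc_molar, volume_pl=1.66):
    """Number of molecules giving conc_molar in a cell of volume_pl picoliters."""
    return conc_molar * AVOGADRO * volume_pl * 1e-12


if __name__ == "__main__":
    for n in (1, 1000, 10**6, 10**9):
        print(n, "molecules per cell ->", molecules_to_molar(n), "M")
    # A smaller cell (1 pl, the cytosolic volume quoted earlier) gives a
    # proportionally higher concentration for the same abundance.
    print(molecules_to_molar(1000, volume_pl=1.0), "M")
```

For the 1.66 pl cell, each abundance maps to the concentration in the table to within a fraction of a percent.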

Formulating Your Working Hypothesis

The most effective way we have found to elicit a scientist's working hypothesis is to ask him or her for a picture of how the system works. Frequently, such diagrams are referred to as "cartoons", but experience shows that the only thing funny about these cartoons is how infrequently they account for the experimental data. Nevertheless, they are extremely effective in communicating how the artist thinks the system works. In this respect, they have the same purpose as a political cartoon.

If, as you are reading this paragraph, you already have in mind a biological system you wish to model, it would be instructive to draw a picture of your current theory before reading the list of rules in the next section. These rules are based only on experience so exceptions may be found, and you're more likely to discover such exceptions if you read the rules with a preconceived notion as to how your drawing should look. If you want to draw a diagram without first looking at the rules below, be sure to represent each important variable and each important process.

Drawing Your Diagram

Practical rules for construction of symbol and arrow diagrams:

  • Every biological quantity you think is essential for characterizing your biological system should be given a symbol and should appear in the diagram
  • Each chemical species, in each physical location, should appear in the diagram only once
  • Use solid arrows to represent conversions of one molecule to another, or movements from one place to another
  • Use dashed or dotted arrows, or arrows of a different color, to represent regulation or control
  • Recognize that each solid arrow represents a process, and that only processes are subject to regulation or control
  • Draw the barriers between physical locations in your biological system, such as cell membranes, organelle membranes, and epithelia
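
Several of these rules are mechanical enough to be checked automatically. The following Python sketch is one possible way to do so, under stated assumptions: the data layout (compartments as `(species, location)` pairs, solid arrows as a dictionary of named processes, dashed arrows as `(regulator, target)` pairs), the function `check_diagram`, and the small calcium example are all hypothetical illustrations rather than a prescribed format.

```python
from collections import Counter


def check_diagram(compartments, processes, regulations):
    """Return a list of rule violations for a symbol-and-arrow diagram.

    compartments: list of (species, location) pairs, one per symbol drawn
    processes:    dict mapping process name -> (source, target) for solid arrows
    regulations:  list of (regulator, target) pairs for dashed arrows
    """
    problems = []
    # Each chemical species in each physical location should appear only once.
    for comp, n in Counter(compartments).items():
        if n > 1:
            problems.append(f"{comp} is drawn {n} times")
    # Every solid arrow should begin and end at a symbol.
    known = set(compartments)
    for name, (source, target) in processes.items():
        for end in (source, target):
            if end not in known:
                problems.append(f"solid arrow {name!r} has an end without a symbol")
    # Only processes are subject to regulation or control.
    for regulator, target in regulations:
        if target not in processes:
            problems.append(f"{regulator} regulates {target!r}, which is not a process")
    return problems


CA_CYT, CA_ER = ("Ca", "cytosol"), ("Ca", "ER")
CA_OUT, IP3 = ("Ca", "extracellular"), ("IP3", "cytosol")

# A deliberately faulty diagram: duplicated symbol, dangling arrow,
# and regulation aimed at a variable instead of a process.
faulty = check_diagram(
    compartments=[CA_CYT, CA_CYT, CA_ER, IP3],
    processes={"release": (CA_ER, CA_CYT), "extrusion": (CA_CYT, None)},
    regulations=[(IP3, CA_CYT)],
)

# The same system drawn according to the rules.
fixed = check_diagram(
    compartments=[CA_CYT, CA_ER, CA_OUT, IP3],
    processes={"release": (CA_ER, CA_CYT), "extrusion": (CA_CYT, CA_OUT)},
    regulations=[(IP3, "release")],
)

if __name__ == "__main__":
    for problem in faulty:
        print("faulty diagram:", problem)
    print("problems in corrected diagram:", len(fixed))
```

A check like this cannot tell you whether the diagram is biologically right, only whether it is well formed enough to be translated into differential equations.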

Compiled in the diagrams that follow are some of the most common errors in constructing symbol and arrow diagrams.

Figure 3-1 Errors in Symbol-and-Arrow Diagrams

This diagram has two errors. First, cytosolic Ca is represented twice, incorrectly suggesting the need for two differential equations where one will do nicely. Second, the arrow representing extrusion of Ca from the cell, perhaps by a CaATPase, does not end in a symbol. Upon converting this diagram to differential equations, the modeler would be likely to omit the term corresponding to this extrusion in the differential equation for extracellular Ca.

Another common error relates to control and regulation.

This diagram suggests that cytosolic Ca is increased by an increase in the concentration of inositol 1,4,5-trisphosphate. This, of course, is true, but the diagram shows regulation of a variable rather than regulation of a process. A correct diagram combining both of the previous ones is drawn below.

Exercise 3-1: Construct a symbol and arrow diagram for a physical, chemical or biological process that you know well. Give your diagram to your instructor for feedback.

All of the solid arrows in these diagrams must correspond to a biological process.

What kinds of processes exist? Answer: Only three, as described in the next section.

There are Only Three Kinds of Biological Processes

  • Translocation
  • Transformation
  • Binding

Translocation is the class of processes that results in movement of a chemical species from one location to another. Examples are diffusion, ion channels, permeases, transporters (active, cotransport, facilitated diffusion), endocytosis, exocytosis, and vesicular trafficking. Notice that many of these involve traversing a membrane; this emphasizes that when we speak of "a chemical species in a physical place" (see below), the "place" is often delimited by a lipid bilayer.

Transformation is the class of processes that results in conversion of one molecular species to another. These may be catalyzed by proteins, catalyzed by some RNA species, or may be spontaneous reactions at biological temperature and pH. A more specific example is the conversion of ATP to cAMP by the action of adenylyl cyclase. All chemical reactions involving making or breaking of covalent bonds are included here.

Binding is the class of processes that consists of two molecular species combining, usually non-covalently, to form a single complex. Examples are transcription factors binding to response elements in DNA, hormones and neurotransmitters binding to their receptors on the cell surface, cAMP binding to the regulatory subunit of protein kinase A, Ca2+ binding to calmodulin, the Ca2+-calmodulin complex binding to an allosteric regulatory site on myosin light chain kinase, and fatty acids binding to albumin. It might reasonably be argued that binding is really a special case of transformation, and that there are only two classes of biological processes. Experience suggests, however, that binding is such a widespread biological phenomenon that it deserves a separate category. This is especially true since the mathematical treatment of binding is often different from the treatment of enzyme kinetics.

You should be able to point to each solid arrow in your system diagram and identify the corresponding process as translocation, transformation, or binding. If one arrow represents more than one of these processes, it is often helpful to break it into its constituent subprocesses.

A Compartment is a Chemical Species in a Physical Place

Another name for the application of ordinary differential equations to the study of physical or chemical systems is compartmental analysis. This name arises because the analysis, in effect, compartmentalizes space and treats all the resulting compartments as uniform in composition. For example, it is common to treat the cytosolic space in a mammalian cell as well-mixed with respect to ion concentrations. This means that within the cytosolic magnesium compartment, for example, there are no spatial gradients. Formally, if $c$ denotes the concentration within the compartment, then

$$\frac{\partial c}{\partial x} = \frac{\partial c}{\partial y} = \frac{\partial c}{\partial z} = 0$$

If the assumption of spatial homogeneity is later found to be in error, then you have two choices:

  • divide your single compartment into multiple compartments, each of which is spatially homogeneous
  • resort to partial differential equation descriptions of your system

It is worth pointing out that if you choose to write and solve partial differential equations, you will be doing the equivalent of choosing to divide your single compartment into a large number (say, 10 to 100) of subcompartments, each of which will be treated as spatially homogeneous. Modern partial differential equation solvers are really ordinary differential equation solvers optimized to work on many small compartments. On the same computational platform, solution of full partial differential equation models can easily require ten to fifty times more computation time than the corresponding system of ordinary differential equations. Generally, you should only consider the partial differential equation model when the spatial resolution of your experimental data is sufficient to justify it. One example of a good case for partial differential equations is the analysis of Ca2+ waves in large cells, but you should be aware that surprisingly complex Ca2+ oscillations can be reproduced with compartmental models containing only a single cytosolic Ca2+ compartment.
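
The equivalence between a spatial model and a chain of well-mixed subcompartments can be made concrete with a minimal Python sketch. It assumes a one-dimensional chain of equal-volume subcompartments, a single hypothetical first-order exchange constant `k_exchange` between neighbors, and simple forward-Euler integration; none of these choices is prescribed by the text, and a production solver would use a more robust integration scheme.

```python
def simulate_chain(masses, k_exchange, t_end, dt=0.001):
    """Exchange between neighboring well-mixed subcompartments (forward Euler).

    masses:     initial mass in each equal-volume subcompartment
    k_exchange: assumed first-order exchange rate constant between neighbors
    """
    m = list(masses)
    for _ in range(round(t_end / dt)):
        # Net flux from subcompartment i to subcompartment i + 1.
        flux = [k_exchange * (m[i] - m[i + 1]) for i in range(len(m) - 1)]
        for i, f in enumerate(flux):
            m[i] -= dt * f
            m[i + 1] += dt * f
    return m


if __name__ == "__main__":
    start = [10.0, 0.0, 0.0, 0.0, 0.0]  # all mass initially at one end
    for t_end in (0.5, 2.0, 30.0):
        profile = simulate_chain(start, k_exchange=1.0, t_end=t_end)
        print(t_end, [round(v, 3) for v in profile])
```

Each subcompartment contributes one ordinary differential equation, which is why the computational cost grows with the spatial resolution you ask for. At long times the gradient disappears and the chain behaves like the single well-mixed compartment it replaced.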

As with your data tables and assumption lists, it is a good idea to document your system diagram. Footnotes to the diagram are a time-tested means of efficient documentation. Any information about a variable or a process (either qualitative or quantitative) can be included in a footnote. Be sure to include the reference to the scientific literature, and mention the cell type or tissue to which the information applies. Most of the symbols in your system diagram will correspond to compartments, although symbols referring to quantities that remain always at fixed values may be treated as parameters. Symbols (and the biological variables they stand for) which vary with time during the proposed experimental protocols will become the state variables of your model. Once your diagram is finalized, you need only understand the concept of a state variable before you begin to translate your diagram into the language of mathematics.

Your model may have a great many state variables, so it pays to choose names for your state variables that convey the identities of the "chemical species" and the "physical place." Try to work with a software package that permits variable names of your choosing.

The State Variable Concept

If your previous training is in engineering, you will already be familiar with the concept of a state variable. Most simply put, the state variables of a system are those quantities whose values will change with time as the proposed experimental protocols are carried out; they are the quantities for which differential equations must be written.

Often, these differential equations are written in matrix form as

$$\frac{d\mathbf{x}}{dt} = A\,\mathbf{x}$$

where $\mathbf{x}$ is the vector of state variables and $A$ is the matrix of rate coefficients. Notice that this form of the differential equations is equivalent to the assertion that the future of the system is determined by its present state; that is, you can calculate the derivatives given only the $A$ matrix and the current values of the state variables. For some sorts of experiments, notably tracer kinetic studies, the $A$ matrix always has constant elements, $a_{ij}$. But for the more general case of a dynamic biological system, the $A$ matrix must represent many sorts of nonlinear rate laws. In such instances the $a_{ij}$ are themselves functions of the state variables of the system. Consequently, the differential equations become nonlinear:

$$\frac{d\mathbf{x}}{dt} = A(\mathbf{x})\,\mathbf{x}$$

In either case, your objective as a modeler is to figure out the elements of the $A$ matrix. It can reasonably be argued that if you can specify all of the $a_{ij}$ or the $a_{ij}(\mathbf{x})$, then you have a complete understanding of the system. The logic of this argument is based on your ability to predict the outcome of future experiments involving the same set of state variables. If your $A$ matrix is correct, then by solving the system of differential equations, you can predict the system's response to any perturbation involving the state variables, $x_i$. This, of course, does not mean that your prediction is correct, but if it is not correct, then the experiment contains new information that should be incorporated in your model, just as described in Chapter 1.
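
To illustrate the statement that the derivatives follow from the A matrix and the current state alone, here is a minimal Python sketch for the constant-coefficient case. The closed two-compartment tracer system and its rate constants `K21` and `K12` are hypothetical, and forward-Euler integration is used only for brevity.

```python
def derivatives(A, x):
    """Return dx/dt = A x for a list-of-lists matrix A and state vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def simulate(A, x0, t_end, dt=0.001):
    """Forward-Euler integration of dx/dt = A x from the initial state x0."""
    x = list(x0)
    for _ in range(round(t_end / dt)):
        dxdt = derivatives(A, x)
        x = [x_i + dt * d for x_i, d in zip(x, dxdt)]
    return x


# Hypothetical closed two-compartment tracer system.
K21 = 0.3  # rate constant for transfer to compartment 2 from compartment 1
K12 = 0.1  # rate constant for transfer to compartment 1 from compartment 2
A = [[-K21, K12],
     [K21, -K12]]

if __name__ == "__main__":
    x0 = [100.0, 0.0]  # all tracer initially in compartment 1
    for t_end in (1.0, 5.0, 50.0):
        print(t_end, [round(v, 3) for v in simulate(A, x0, t_end)])
```

For the nonlinear case, the same loop applies if the constant matrix is replaced by a function that rebuilds the matrix from the current state at every step.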

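If you are unfamiliar with matrix notation, or if you find it more intuitive to consider one differential equation at a time, then consider the following differential equation for a single state variable:

$$\frac{dx_i}{dt} = \sum_{j \neq i} a_{ij}(\mathbf{x})\,x_j - \sum_{j \neq i} a_{ji}(\mathbf{x})\,x_i$$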

Here, there are positive and negative terms on the right-hand side of the equation, corresponding to processes that increase and decrease $x_i$. In the context of this equation, your modeling objective is to identify the functions $a_{ij}$ and $a_{ji}$. Fortunately, many of these will be zero, corresponding to the fact that not every state variable is an immediate determinant of every other state variable. Moreover, the nonzero $a_{ij}$ will rarely be functions of more than a small subset of the state variables. Frequently, in fact, you will begin your analysis by assuming that most of the $a_{ij}$ are constants, not functions of the other state variables.

A major tenet of the modeling philosophy in this hypertextbook is that there must be a correspondence between the terms of your equations and the known or hypothesized processes occurring in the biological system under study. What, then, are the sorts of real processes that are represented in the differential equations above? Perhaps the best example is the case of a molecule that is both synthesized and degraded inside cells. For generality, call the molecule m. Then the differential equation for the rate of change of the mass of m is

$$\frac{dm}{dt} = \text{rate of synthesis} - \text{rate of degradation}$$

Looking back at the general differential equations presented earlier, can you identify a term that might represent synthesis and a term that might represent degradation? The logical choice is that the $a_{ij}(\mathbf{x})$ represent all the possible synthesis terms and the $a_{ji}(\mathbf{x})$ represent all the possible degradation terms. Typically, this great generality is unnecessary in practice, but it serves as a reminder of the universe of possibilities. Another way of making this same point is to emphasize that the $A$ matrix of a biological system is almost always a sparse matrix. This means that most of the elements, $a_{ij}$, of the matrix are, in fact, zero. Consider, for example, the differential equation for the mass of a particular second messenger, cyclic AMP, in the cytosol of a particular cell. Let $C$ be the mass of cAMP, then

$$\frac{dC}{dt} = \text{rate of synthesis by adenylyl cyclase} - \text{rate of hydrolysis by phosphodiesterase}$$

In other words, there is only one positive term and only one negative term in the differential equation for cAMP. You will find that many of the terms in biological differential equations correspond to the functions of particular proteins, and these particular terms correspond to the rate laws for two particular enzymes: adenylyl cyclase, which catalyzes the conversion of ATP to cAMP, and phosphodiesterase, which catalyzes the hydrolysis of the ester bond and thus converts cAMP to AMP.
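
As a minimal sketch of a differential equation with one positive and one negative term, the following Python code simulates cAMP mass under two loudly stated assumptions that are NOT established in this chapter: adenylyl cyclase is taken to produce cAMP at a constant rate `v_ac`, and phosphodiesterase is taken to remove it with first-order kinetics, `k_pde * C`. Both rate laws and all numerical values are placeholders; choosing realistic rate laws is the subject of the next chapter.

```python
def camp_derivative(c, v_ac, k_pde):
    """dC/dt with assumed rate laws: constant synthesis, first-order hydrolysis."""
    synthesis = v_ac        # adenylyl cyclase term (assumed constant)
    hydrolysis = k_pde * c  # phosphodiesterase term (assumed first order)
    return synthesis - hydrolysis


def simulate_camp(c0, v_ac, k_pde, t_end, dt=0.001):
    """Forward-Euler integration of the cAMP mass from the initial value c0."""
    c = c0
    for _ in range(round(t_end / dt)):
        c += dt * camp_derivative(c, v_ac, k_pde)
    return c


if __name__ == "__main__":
    V_AC = 2.0   # hypothetical synthesis rate (mass units per second)
    K_PDE = 0.5  # hypothetical hydrolysis rate constant (per second)
    for t_end in (1.0, 5.0, 40.0):
        print(t_end, simulate_camp(0.0, V_AC, K_PDE, t_end))
    print("steady state expected at v_ac / k_pde =", V_AC / K_PDE)
```

Under these assumed rate laws the cAMP mass relaxes toward `v_ac / k_pde`, the value at which the synthesis and hydrolysis terms balance.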

The most difficult and rewarding work in mathematical modeling of biological systems is the discovery of correct rate laws for the processes you've indicated in your system diagram. This work requires both biological and mathematical sophistication and is the subject of the next chapter.

