Chapter 3
Construction of a Symbol and Arrow Diagram for Your System
Defining the Boundary of Your Model
Thoughtful consideration of what is to be included in your
model, and what is not, can save you a lot of trouble in the
later stages of the modeling process. A good first step in
defining the boundary of your model is to list all the variables
that might reasonably be expected to control, influence or
regulate your biological system, but which you confidently expect
will remain constant during all experiments you wish to analyze.
If you will, for example, perform all your experiments at 20
degrees C, then you will not wish to write differential equations
for the delivery and removal of heat to and from your system. Or
if you consider that the biological system contains sufficient
buffers that no reasonable production of protons or hydroxyl ions
will alter the ambient pH, then you will not want to include the
possible dependence of protein function on pH in your model.
Similarly, you may be willing to assume that the concentrations
of ATP and NADH will remain constant, or that on the time scale
of your experiment, the concentrations of the enzymes and other
proteins do not change. A good definition of the boundary of your
model is provided when you can name the variables that you expect
to change with time. These are the variables for which you will
write differential equations, and they are the variables INSIDE
your model boundary.
Returning to the last example in the previous paragraph, there
is a useful generalization to be made. Many cellular responses
take place on time scales that are rapid compared to the time
required to initiate transcription and translation. Models of
such systems generally will NOT include differential equations
for the mass of a transport protein or an enzyme. This is because
the processes that might change these masses are too slow to do
so during the period in which measurements are being made. This is the
difference between responses governed by preexisting proteins,
and responses that require gene transcription and translation of
mRNA into protein.
Biological variables that you place outside the boundary of
your model generally do not appear in your symbol and arrow
diagram. You should be constantly aware that everything left out
of your diagram is equivalent to an assumption. Don't hesitate to
add such assumptions to your assumption list.
The boundary of your model may have other features. You may
for example have recorded measurements of a particular variable
which you know serves as a regulator of processes in your model,
but the mechanisms by which the recorded dynamics come about are
not of interest. In this case, even though the quantity is not
constant, you may wish to use it as a forcing function and thus
decouple your system of interest from this variable by placing it
on the other side of your system boundary.
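As a minimal sketch of the forcing-function idea, the Python below treats a recorded regulator R(t) as data rather than as a state variable, and integrates a single state variable X that R drives. The rate law dX/dt = k_in * R(t) - k_out * X, the parameter values, and the recorded data points are all hypothetical choices made for illustration; none of them comes from a real experiment.

```python
from bisect import bisect_right

def interpolate(times, values, t):
    """Linearly interpolate recorded (times, values) at time t, clamping at the ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    frac = (t - times[i - 1]) / (times[i] - times[i - 1])
    return values[i - 1] + frac * (values[i] - values[i - 1])

# Hypothetical recording of the regulator R(t). It sits OUTSIDE the model
# boundary: no differential equation is written for it.
recorded_t = [0.0, 10.0, 20.0, 30.0]  # seconds
recorded_r = [1.0, 1.0, 3.0, 3.0]     # arbitrary units

def simulate(k_in=0.5, k_out=0.25, x0=2.0, t_end=30.0, dt=0.01):
    """Forward-Euler integration of the assumed law dX/dt = k_in*R(t) - k_out*X."""
    x, t = x0, 0.0
    for _ in range(round(t_end / dt)):
        r = interpolate(recorded_t, recorded_r, t)  # forcing function lookup
        x += dt * (k_in * r - k_out * x)
        t += dt
    return x

x_final = simulate()
```

The only state variable here is X. Replacing the recorded data changes the simulation without changing the model, which is the practical meaning of placing R on the other side of the boundary.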
Choosing a basis for your model is often confused with setting
the boundary. But, as the term is used in this book, your basis
embodies the principal normalization for all the model's
variables. You may, for example, choose to express all your
model's fluxes as pmol/sec/million cells. If so, you have chosen
a million cells as the basis of your model. The same model could
be constructed on the basis of a single cell, and fluxes might
then be reported as amol/sec/cell. If many of your experimental
measurements are normalized to cell protein content, you might
choose one mg cell protein as the basis of your model. The
primary criterion for choosing your basis is convenience.
Whenever possible, choose a basis that is a natural normalization
for experimental measurements. An experienced modeler will carry
in his or her head several of the most frequently required
conversion factors, so that numbers from the model simulations
can readily be converted to other bases.
Here are some useful rough approximations:
 There are about 5-20 pg total DNA per cell
 There are about 10 pg total RNA per cell
 There are about 125-300 pg total protein per cell
 The total dry weight of a cell is about 400 pg
 The total wet weight of a cell is about 1300 pg
 Cytosolic volume is about 1 pl per cell
 The number of different proteins in a cell is between
5000 and 10,000
 There are about 5 x 10^{9} protein molecules in a cell
 Cells make up about half the wet weight of a tissue
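The basis conversions described above are simple arithmetic, and a short Python sketch may make them easier to check. The function names are hypothetical, and the default of 200 pg protein per cell is an assumed value chosen from within the 125-300 pg range listed above; substitute a measured value for your own cell type.

```python
PICO = 1e-12  # pico prefix
ATTO = 1e-18  # atto prefix

def per_million_cells_to_per_cell(flux_pmol_per_s_per_1e6_cells):
    """Convert a flux in pmol/sec/million cells to amol/sec/cell."""
    mol_per_s_per_cell = flux_pmol_per_s_per_1e6_cells * PICO / 1e6
    return mol_per_s_per_cell / ATTO

def per_million_cells_to_per_mg_protein(flux_pmol_per_s_per_1e6_cells,
                                        protein_pg_per_cell=200.0):
    """Convert a flux in pmol/sec/million cells to pmol/sec/mg cell protein.

    protein_pg_per_cell is an ASSUMED rough value, not a measurement.
    """
    # pg per cell -> g per cell -> g per million cells -> mg per million cells
    mg_protein_per_1e6_cells = protein_pg_per_cell * PICO * 1e6 * 1e3
    return flux_pmol_per_s_per_1e6_cells / mg_protein_per_1e6_cells

flux_per_cell = per_million_cells_to_per_cell(1.0)
flux_per_mg = per_million_cells_to_per_mg_protein(1.0)
```

Because a million cells at 200 pg protein each contain 0.2 mg protein, a flux on the million-cell basis is numerically one fifth of the same flux on the mg-protein basis under this assumption.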
Especially when modeling individual cells, it is important to
be completely familiar with all of the standard prefixes denoting
useful orders of magnitude for physical quantities. Here is a
table of the most widely used prefixes in biological modeling.
Prefix    Meaning
d         10^{-1}
c         10^{-2}
m         10^{-3}
µ         10^{-6}
n         10^{-9}
p         10^{-12}
f         10^{-15}
a         10^{-18}
z         10^{-21}

As more and more modeling is done at the cellular
level, it becomes ever more important to be able to convert
easily between molar concentrations and molecular abundances
measured in molecules (or copies) per cell. To make this truly
memorable, consider a cell whose volume is 1.66 pl. This is a
reasonable volume for many cells. Given only this constraint, the
following table provides a useful modeler's rule of thumb.
Abundance (molecules per cell)    Concentration
1                                 1 pM
10                                10 pM
100                               100 pM
1000                              1 nM
10^{4}                            10 nM
10^{5}                            100 nM
10^{6}                            1 µM
10^{7}                            10 µM
10^{8}                            100 µM
10^{9}                            1 mM
10^{10}                           10 mM
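The rule of thumb follows from Avogadro's number: 1.66 pl multiplied by 6.022 x 10^{23} molecules per mole is very nearly 10^{12} molecules per molar unit, so one molecule per cell corresponds to about 1 pM. The Python sketch below makes the conversion explicit for any cell volume. The function names are hypothetical, and the default volume is the 1.66 pl assumed in the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_to_molar(n_molecules, volume_pl=1.66):
    """Convert molecules per cell to molar concentration for a given cell volume."""
    volume_liters = volume_pl * 1e-12
    return n_molecules / (AVOGADRO * volume_liters)

def molar_to_molecules(conc_molar, volume_pl=1.66):
    """Convert molar concentration to molecules per cell for a given cell volume."""
    return conc_molar * AVOGADRO * volume_pl * 1e-12

one_copy = molecules_to_molar(1)         # close to 1e-12 M (1 pM)
million_copies = molecules_to_molar(1e6) # close to 1e-6 M (1 µM)
```

For a cell of a different volume the table shifts proportionally. With the 1 pl cytosolic volume quoted earlier, for example, each abundance corresponds to a concentration about 1.66 times higher.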
Formulating Your Working Hypothesis
The most effective way we have found to elicit a scientist's
working hypothesis is to ask him or her for a picture of how the
system works. Frequently, such diagrams are referred to as
"cartoons", but experience shows that the only thing
funny about these cartoons is how infrequently they account for
the experimental data. Nevertheless, they are extremely effective
in communicating how the artist thinks the system works. In this
respect, they have the same purpose as a political cartoon.
If, as you are reading this paragraph, you already have in
mind a biological system you wish to model, it would be
instructive to draw a picture of your current theory before
reading the list of rules in the next section. These rules are
based only on experience so exceptions may be found, and you're
more likely to discover such exceptions if you read the rules
with a preconceived notion as to how your drawing should look. If
you want to draw a diagram without first looking at the rules
below, be sure to represent each important variable and each
important process.
Drawing Your Diagram
Practical rules for construction of symbol and arrow diagrams:
 Every biological quantity you think is essential for
characterizing your biological system should be given a
symbol and should appear in the diagram
 Each chemical species, in each physical location, should
appear in the diagram only once
 Use solid arrows to represent conversions of one molecule
to another, or movements from one place to another
 Use dashed or dotted arrows, or arrows of a different
color, to represent regulation or control
 Recognize that each solid arrow represents a process, and
that only processes are subject to regulation or control
 Draw the barriers between physical locations in your
biological system, such as cell membranes, organelle
membranes, and epithelia
Compiled in the diagrams that follow are some of the most
common errors in constructing symbol and arrow diagrams.
Figure 3-1 Errors in Symbol-and-Arrow Diagrams
This diagram has two errors. First, cytosolic Ca is
represented twice, incorrectly suggesting the need for two
differential equations where one will do nicely. Second, the
arrow representing extrusion of Ca from the cell, perhaps by a
Ca-ATPase, does not end in a symbol. Upon converting this diagram
to a differential equation, the modeler would be likely to omit
the term corresponding to this extrusion in the differential
equation for extracellular Ca.
Another common error relates to control and regulation.
This diagram suggests that cytosolic Ca is increased by an
increase in the concentration of inositol 1,4,5-trisphosphate.
This, of course, is true, but the diagram shows regulation of a
variable rather than regulation of a process. A correct diagram
combining both of the previous ones is drawn below.
Exercise 3-1: Construct a symbol and arrow diagram for a
physical, chemical or biological process that you know well. Give
your diagram to your instructor for feedback.
All of the solid arrows in these diagrams must correspond to a
biological process. What kinds of processes exist? Answer: Only
three, as described in the next section.
There are Only Three Kinds of Biological Processes
 Translocation
 Transformation
 Binding
Translocation is the class of processes that results in
movement of a chemical species from one location to another.
Examples are diffusion, ion channels, permeases, transporters
(active, cotransport, facilitated diffusion), endocytosis,
exocytosis, and vesicular trafficking. Notice that many of these
involve traversing a membrane; this emphasizes that when we speak
of "a chemical species in a physical place" (see
below), the "place" is often delimited by a lipid
bilayer.
Transformation is the class of processes that results
in conversion of one molecular species to another. These may be
catalyzed by proteins, catalyzed by some RNA species, or may be
spontaneous reactions at biological temperature and pH. A more
specific example is the conversion of ATP to cAMP by the action
of adenylyl cyclase. All chemical reactions involving making or
breaking of covalent bonds are included here.
Binding is the class of processes that consists of two
molecular species combining, usually noncovalently, to form a
single complex. Examples are transcription factors binding to
response elements in DNA, hormones and neurotransmitters binding
to their receptors on the cell surface, cAMP binding to the
regulatory subunit of protein kinase A, Ca^{2+} binding
to calmodulin, the Ca^{2+}-calmodulin complex binding to an
allosteric regulatory site on myosin light chain kinase, and
fatty acids binding to albumin. It might reasonably be argued
that binding is really a special case of transformation, and that
there are only two classes of biological processes. Experience
suggests that binding is such a widespread biological phenomenon
that it deserves a separate category. This is especially true
since the mathematical treatment of binding is often different
from the treatment of enzyme kinetics.
You should be able to point to each solid arrow in your system
diagram and identify the corresponding process as translocation,
transformation, or binding. If one arrow represents more than one
of these processes, it is often helpful to break it into its
constituent subprocesses.
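The diagram rules and the three process classes can be checked mechanically once a diagram is written down as data. The Python sketch below is one hypothetical encoding, not a standard tool: the class names, the `check_diagram` function, and the example calcium diagram (including the assumption that inositol 1,4,5-trisphosphate regulates release of Ca from the endoplasmic reticulum) are illustrative choices.

```python
from dataclasses import dataclass

PROCESS_KINDS = {"translocation", "transformation", "binding"}

@dataclass(frozen=True)
class Compartment:
    """A chemical species in a physical place (one symbol in the diagram)."""
    species: str
    place: str

@dataclass(frozen=True)
class Process:
    """A solid arrow: a process that consumes its sources and produces its targets."""
    name: str
    kind: str
    sources: tuple
    targets: tuple

@dataclass(frozen=True)
class Regulation:
    """A dashed arrow from a regulator symbol to a named process."""
    regulator: Compartment
    target_process: str

def check_diagram(compartments, processes, regulations):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    seen = set()
    for c in compartments:
        if c in seen:
            problems.append(f"{c.species} in {c.place} appears more than once")
        seen.add(c)
    for p in processes:
        if p.kind not in PROCESS_KINDS:
            problems.append(f"process '{p.name}' has unknown kind '{p.kind}'")
        if not p.sources or not p.targets:
            problems.append(f"arrow '{p.name}' does not start and end in a symbol")
        for end in p.sources + p.targets:
            if end not in seen:
                problems.append(f"arrow '{p.name}' touches an undeclared symbol")
    process_names = {p.name for p in processes}
    for r in regulations:
        if r.target_process not in process_names:
            problems.append(f"regulation by {r.regulator.species} targets "
                            f"'{r.target_process}', which is not a process")
    return problems

ca_cyt = Compartment("Ca", "cytosol")
ca_ext = Compartment("Ca", "extracellular")
ca_er = Compartment("Ca", "ER")
ip3 = Compartment("IP3", "cytosol")

good = check_diagram(
    [ca_cyt, ca_ext, ca_er, ip3],
    [Process("extrusion", "translocation", (ca_cyt,), (ca_ext,)),
     Process("release", "translocation", (ca_er,), (ca_cyt,))],
    [Regulation(ip3, "release")],
)
bad = check_diagram(
    [ca_cyt, ca_cyt, ca_er, ip3],  # cytosolic Ca drawn twice
    [Process("extrusion", "translocation", (ca_cyt,), ()),  # arrow ends nowhere
     Process("release", "translocation", (ca_er,), (ca_cyt,))],
    [Regulation(ip3, "Ca in cytosol")],  # regulates a variable, not a process
)
```

Here `good` comes back empty, while `bad` reports the three errors discussed for the erroneous diagrams: a duplicated symbol, an arrow that does not end in a symbol, and regulation aimed at a variable instead of a process.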
A Compartment is a Chemical Species in a Physical Place
Another name for the application of ordinary differential
equations to the study of physical or chemical systems is
compartmental analysis. This name arises because the analysis, in
effect, compartmentalizes space and treats all the resulting
compartments as uniform in composition. For example, it is common
to treat the cytosolic space in a mammalian cell as well-mixed
with respect to ion concentrations. This means that within the
cytosolic magnesium compartment, for example, there are no
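spatial gradients. Formally, if C is the concentration within the
compartment and x, y and z are the spatial coordinates, then
\[ \frac{\partial C}{\partial x} = \frac{\partial C}{\partial y} = \frac{\partial C}{\partial z} = 0 \]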
If the assumption of spatial homogeneity is later found to be
in error, then you have two choices:
 divide your single compartment into multiple
compartments, each of which is spatially homogeneous
 resort to partial differential equation descriptions of
your system
It is worth pointing out that if you choose to write and solve
partial differential equations, you will be doing the equivalent
of choosing to divide your single compartment into a large number
(say, 10 to 100) of subcompartments, each of which will be treated as
spatially homogeneous. Modern partial differential equation
solvers are really ordinary differential equation solvers
optimized to work on many small compartments. On the same
computational platform, solution of full partial differential
equation models can easily require ten to fifty times more
computation time than the corresponding system of ordinary
differential equations. Generally, you should only consider the
partial differential equation model when the resolution of your
experimental data is sufficient. One example of a good case for
partial differential equations is the analysis of Ca^{2+}
waves in large cells, but you should be aware that surprisingly
complex Ca^{2+} oscillations can be reproduced with
compartmental models containing only a single cytosolic Ca^{2+}
compartment.
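To illustrate the first of the two choices, the Python sketch below divides a single compartment into a chain of equal-volume, well-mixed subcompartments that exchange material with their nearest neighbors. The first-order exchange law, the rate coefficient `k_exchange`, and the initial condition (all material in the first subcompartment) are assumptions made for illustration. Forward Euler is used only for brevity and is not a recommendation of a particular solver.

```python
def step(conc, k_exchange, dt):
    """One forward-Euler step for a chain of well-mixed subcompartments.

    Each subcompartment exchanges with its neighbors at a rate proportional
    to the concentration difference (an assumed first-order law).
    """
    n = len(conc)
    new = conc[:]
    for i in range(n):
        flux = 0.0
        if i > 0:
            flux += k_exchange * (conc[i - 1] - conc[i])
        if i < n - 1:
            flux += k_exchange * (conc[i + 1] - conc[i])
        new[i] = conc[i] + dt * flux
    return new

def simulate(n_sub=20, k_exchange=1.0, dt=0.01, t_end=5.0):
    """Start with all material in the first subcompartment and let it spread."""
    conc = [0.0] * n_sub
    conc[0] = 1.0
    for _ in range(round(t_end / dt)):
        conc = step(conc, k_exchange, dt)
    return conc

profile = simulate()
```

With `n_sub=1` the sketch reduces to the single well-mixed compartment. The cost per time step grows in proportion to `n_sub`, which gives a rough sense of why spatially resolved models take so much longer to solve.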
As with your data tables and assumption lists, it is a good
idea to document your system diagram. Footnotes to the diagram
are a time-tested means of efficient documentation. Any
information about a variable or a process (either qualitative or
quantitative) can be included in a footnote. Be sure to include
the reference to the scientific literature, and mention the cell
type or tissue to which the information applies. Most of the
symbols in your system diagram will correspond to compartments,
although symbols referring to quantities that remain always at
fixed values may be treated as parameters. Symbols (and the
biological variables they stand for) which vary with time during
the proposed experimental protocols will become the state
variables of your model. Once your diagram is finalized, you need
only understand the concept of a state variable before you begin
to translate your diagram into the language of mathematics.
Your model may have a great many state variables, so it pays
to choose names for your state variables that convey the
identities of the "chemical species" and the
"physical place." Try to work with a software package
that permits variable names of your choosing.
The State Variable Concept
If your previous training is in engineering, you will already
be familiar with the concept of a state variable. Most simply
put, the state variables of a system are those quantities whose
values will change with time as the proposed experimental
protocols are carried out; they are the quantities for which
differential equations must be written.
Often, these differential equations are written in matrix form
as
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}\,\mathbf{x} \]
where x is the vector of state variables and A
is the matrix of rate coefficients. Notice that this form of the
differential equations is equivalent to the assertion that the
future of the system is determined by its present state; that is,
you can calculate the derivatives given only the A matrix
and the current values of the state variables. For some sorts of
experiments, notably tracer kinetic studies, the A matrix
always has constant elements, a_{ij}. But for the more
general case of a dynamic biological system, the A matrix must
represent many sorts of nonlinear rate laws. In such instances
the a_{ij} are themselves functions of the state
variables of the system. Consequently, the differential equations
become nonlinear:
\[ \frac{d\mathbf{x}}{dt} = \mathbf{A}(\mathbf{x})\,\mathbf{x} \]
In either case, your objective as a modeler is to figure out
the elements of the A matrix. It can reasonably be argued
that if you can specify all of the a_{ij} or the a_{ij}(x),
then you have a complete understanding of the system. The logic
of this argument is based on your ability to predict the outcome
of future experiments involving the same set of state variables.
If your A matrix is correct, then by solving the system of
differential equations, you can predict the system's response to
any perturbation involving the state variables, x_{i}.
This, of course, does not mean that your prediction is correct,
but if it is not correct, then the experiment contains new
information that should be incorporated in your model, just as
described in Chapter 1.
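A small numerical sketch may help make the constant-coefficient case concrete. The Python below integrates the linear system in which the derivative vector is the A matrix times the state vector, for a hypothetical closed two-compartment tracer exchange. The rate coefficients `K21` and `K12`, the initial condition, and the use of forward Euler are all assumptions made for illustration. The convention used is that the off-diagonal element a_{ij} is the rate coefficient for transfer to compartment i from compartment j.

```python
def derivatives(a, x):
    """Return dx/dt = A x for a list-of-rows matrix a and a state vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def integrate(a, x0, dt, n_steps):
    """Forward-Euler integration; the next state depends only on the current one."""
    x = list(x0)
    for _ in range(n_steps):
        dxdt = derivatives(a, x)
        x = [x_i + dt * d_i for x_i, d_i in zip(x, dxdt)]
    return x

# Hypothetical rate coefficients (per second) for a closed two-compartment system.
K21 = 0.3  # transfer to compartment 2 from compartment 1
K12 = 0.1  # transfer to compartment 1 from compartment 2

# Each column sums to zero because the system is closed (tracer is conserved).
A = [[-K21, K12],
     [K21, -K12]]

x_final = integrate(A, x0=[100.0, 0.0], dt=0.01, n_steps=5000)
```

After 50 simulated seconds the tracer has essentially reached the steady state at which the two transfer rates balance, K21 * x_1 = K12 * x_2. Given A and the present state, nothing else is needed to compute the future of the system.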
If you are unfamiliar with matrix notation, or if you find it
more intuitive to consider one differential equation at a time,
then consider the following differential equation for a single
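state variable:
\[ \frac{dx_i}{dt} = \sum_{j \neq i} a_{ij}(\mathbf{x})\, x_j - \sum_{j \neq i} a_{ji}(\mathbf{x})\, x_i \]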
Here, there are positive and negative terms on the right-hand
side of the equation, corresponding to processes that
increase and decrease x_{i}. In the context of this
equation, your modeling objective is to identify the functions a_{ij}
and a_{ji}. Fortunately, many of these will be zero,
corresponding to the fact that not every state variable is an
immediate determinant of every other state variable. Moreover,
the nonzero a_{ij} will rarely be functions of more than
a small subset of the state variables. Frequently, in fact, you
will begin your analysis by assuming that most of the a_{ij}
are constants, not functions of the other state variables.
A major tenet of the modeling philosophy in this hypertextbook
is that there must be a correspondence between the terms of your
equations and the known or hypothesized processes occurring in
the biological system under study. What, then, are the sorts of
real processes that are represented in the differential equations
above? Perhaps the best example is the case of a molecule that is
both synthesized and degraded inside cells. For generality, call
the molecule m. Then the differential equation for the
rate of change of the mass of m is
\[ \frac{dm}{dt} = \text{rate of synthesis} - \text{rate of degradation} \]
Looking back at the general differential equations presented
earlier, can you identify a term that might represent synthesis
and a term that might represent degradation? The logical choice
is that the a_{ij}(x) represent all the possible
synthesis terms and the a_{ji}(x) represent all
the possible degradation terms. Typically, this great generality
is unnecessary in practice, but it serves as a reminder of the
universe of possibilities. Another way of making this same point
is to emphasize that the A matrix of a biological system
is almost always a sparse matrix. This means that most of the
elements, a_{ij}, of the matrix are, in fact, zero.
Consider, for example, the differential equation for the mass of
a particular second messenger, cyclic AMP, in the cytosol of a
particular cell. Let C be the mass of cAMP, then
In other words, there is only one positive term and only one
negative term in the differential equation for cAMP. You will
find that many of the terms in biological differential equations
correspond to the functions of particular proteins, and these
particular terms correspond to the rate laws for two particular
enzymes: adenylyl cyclase, which catalyzes the conversion of ATP to
cAMP, and phosphodiesterase, which catalyzes the hydrolysis of the
ester bond and thus converts cAMP to AMP.
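To show what a one-positive-term, one-negative-term equation looks like in practice, the Python sketch below integrates a cAMP balance under the simplest rate laws one might try first: a constant adenylyl cyclase rate and a first-order phosphodiesterase rate. These rate laws and the parameter values are placeholders assumed for illustration, not the correct kinetics of either enzyme.

```python
def camp_derivative(c, v_cyclase, k_pde):
    """dC/dt = synthesis - degradation, under ASSUMED placeholder rate laws.

    v_cyclase: constant synthesis rate by adenylyl cyclase (amol/sec/cell)
    k_pde:     first-order rate coefficient for phosphodiesterase (per sec)
    """
    synthesis = v_cyclase
    degradation = k_pde * c
    return synthesis - degradation

def simulate_camp(c0, v_cyclase, k_pde, dt, t_end):
    """Forward-Euler integration of the single state variable C."""
    c = c0
    for _ in range(round(t_end / dt)):
        c += dt * camp_derivative(c, v_cyclase, k_pde)
    return c

# Hypothetical parameter values on a per-cell basis.
V_CYCLASE = 2.0  # amol/sec/cell
K_PDE = 0.5      # per sec

c_final = simulate_camp(c0=0.0, v_cyclase=V_CYCLASE, k_pde=K_PDE, dt=0.01, t_end=40.0)
c_steady = V_CYCLASE / K_PDE  # where synthesis balances degradation
```

Under these assumptions the cAMP mass relaxes to the value at which the two terms cancel. Swapping in more realistic rate laws changes only the body of `camp_derivative`, which keeps each term tied to the process it represents.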
The most difficult and rewarding work in mathematical modeling
of biological systems is the discovery of correct rate laws for
the processes you've indicated in your system diagram. This work
requires both biological and mathematical sophistication and is
the subject of the next chapter.
