Premises underlying systems biology research
The focus of research is a biological system. One can
broadly define a ‘system’ as a group of independent but
interconnected elements that function together to comprise a
unified whole. The boundaries of the system may not be
clearly defined or definable, especially if multicellular organisms
are being investigated. Instead, one may start with an observable
phenotype (for example, yeast filamentation) and use a systems
biology approach to identify the subsets of modules within a system
that specifically relate to the emergence of that phenotype.
Systems biologists usually identify a model organism, such as
yeast, or a model cell population, such as macrophages, that is
well suited to the biological process under investigation.
With a model system in place, systems biology research
proceeds using both discovery-based and hypothesis-based
approaches.
The hallmark of discovery science is a relatively unbiased
collection and cataloging of data pertaining to a specific domain
of inquiry. Examples include:
- sequencing of species’ genomes
- non-normalized Expressed Sequence Tag (EST) libraries of tissue types at various stages of development
- transcriptional profiling and ChIP-chip microarrays
- affinity capture protocols for macromolecular complexes
- delineation of the precise number of cells in an organism, as has been done for the worm C. elegans.
Loosely speaking, performing a data collection exercise whose
specific results cannot be predicted in advance constitutes a
discovery-based approach. Discovery science generates “parts
lists” for the system under investigation and, in some cases,
can provide a complete and comprehensive list of parts that
serves as a basis for narrowing the scope of possible
interactions.
Once a system’s elements and interactions have been
discovered and delineated to a first approximation, specific and
testable hypotheses are required for determining which
elements and interactions are functionally relevant to the
observable phenotypes of the system under the varying conditions
being investigated. To this end, many systems biologists
assume that experimentally observed or inferred interactions among
elements might profitably be conceptualized as networks,
with the individual elements (e.g., genes, proteins, metabolites)
portrayed as nodes, and the interactions or interconnections (e.g.,
DNA-protein binding, protein-protein binding) as links or
edges. In this way, the structure of the system can be
conceptualized. Network diagrams provide a visual representation
of how different types of interconnections might be
organized (see, for example, BioTapestry and Cytoscape).
These representations are used to guide the
formulation of hypotheses.
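
The node-and-edge view described above can be sketched in a few lines of Python. This is only a minimal illustration: the element names (`TF_A`, `gene_X`, `protein_B`, `protein_C`) and the helper `build_network` are hypothetical and do not come from any dataset or tool. Each link carries a label for its interaction type, which is the simplest annotation a static network can hold.

```python
from collections import defaultdict

# Hypothetical interactions, invented for illustration only.
# Each entry is (element, element, interaction type).
interactions = [
    ("TF_A", "gene_X", "DNA-protein binding"),
    ("TF_A", "protein_B", "protein-protein binding"),
    ("protein_B", "protein_C", "protein-protein binding"),
]


def build_network(edges):
    """Return an adjacency map: node -> list of (neighbor, interaction type)."""
    network = defaultdict(list)
    for source, target, kind in edges:
        network[source].append((target, kind))
        network[target].append((source, kind))  # treat links as undirected
    return dict(network)


network = build_network(interactions)

# Number of links per node; highly connected nodes are natural
# starting points when formulating hypotheses.
degree = {node: len(neighbors) for node, neighbors in network.items()}
```

A structure like this records which elements are connected and by what kind of interaction, but nothing about timing or causality.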

For example, results of yeast two-hybrid experiments performed
using different baits and under different physiological conditions
can be more easily compared when portrayed as protein interaction
networks. From these comparisons, the likelihood of predictions about
the content of macromolecular complexes involved in processes
such as gene regulation can be evaluated
before expensive experiments are performed.
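
One way to picture such a comparison is to treat each experiment's interactions as a set of unordered protein pairs and apply set operations. The sketch below assumes made-up protein names (`P1` to `P5`) and two hypothetical conditions; it is not a reconstruction of any published analysis.

```python
def edge_set(pairs):
    """Normalize bait-prey pairs into a set of unordered edges."""
    return {frozenset(pair) for pair in pairs}


# Hypothetical interaction calls under two physiological conditions.
condition_a = edge_set([("P1", "P2"), ("P2", "P3"), ("P1", "P4")])
condition_b = edge_set([("P2", "P1"), ("P2", "P3"), ("P3", "P5")])

shared = condition_a & condition_b   # seen under both conditions
only_a = condition_a - condition_b   # specific to condition A
only_b = condition_b - condition_a   # specific to condition B
```

Interactions that appear under both conditions might be candidates for a stable complex, while condition-specific ones might point to regulated associations. Either reading would still need experimental confirmation.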
However, static network models do not reveal dependency relationships among
interaction partners or the kinetics of changes in the network over
time in response to an experimental perturbation. Thus, they
provide a useful starting point for understanding a system but do
not, in themselves, explain causal significance (i.e., how
functional attributes emerge from the interactions among the
network’s components). The distinction between
“correlated with” and “functionally
affects” is not easily represented in a network diagram of
interconnections among a system’s elements. It is
anticipated, however, that developments in biological network
topology will lead to more effective modeling tools.
Networks cannot easily represent different types of relationships
between elements. They cannot indicate that the interaction of
elements has changed one of the elements (e.g., chemically modified
it), and they cannot represent the dynamics of the network.
The true test of a good system model is successful prediction of
the system’s behavior under targeted alterations (genetic or
environmental perturbations) of experimental conditions.
But the very properties that make biological systems interesting
and worthwhile to study (their emergent properties, robustness,
stability, modularity, and adaptability to change) also make their
behavior hard to predict at the molecular level.
Confounding factors include functional redundancy (i.e., a
given process might be accomplished by several different molecular
mechanisms), and the stochasticity of cell populations (what is
measured, e.g., gene expression, could be an average of a wide
range of discrete responses among individual cells).
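
The averaging problem can be made concrete with a toy calculation. The values below are hypothetical single-cell expression levels in arbitrary units, chosen so that half the cells are "off" and half are "on".

```python
from statistics import mean

# Hypothetical single-cell expression values (arbitrary units):
# 50 cells that do not respond and 50 cells that respond fully.
cells = [0.0] * 50 + [10.0] * 50

# What a bulk measurement of the population would report.
population_average = mean(cells)

# Cells whose own expression lies within 1 unit of the bulk value.
cells_near_average = [c for c in cells if abs(c - population_average) < 1.0]
```

Here the bulk value falls midway between the two subpopulations, and no individual cell actually expresses the gene at that level. A population measurement alone could not distinguish this case from one in which every cell responds at half strength.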
Systems biologists approach this conundrum by adopting the
following principles:
- Global approaches should be taken to data collection and
analyses. Ideally, high-throughput platforms are used to
collect accurate measurements under multiple sets of well-defined
experimental conditions. Technologies for performing
quantitative, multiparameter measurements on a single sample need to
be developed. To add value to the analyses of data obtained from
multiplex technologies such as chips and panels of gene
deletion mutants or RNAi gene knockouts, global approaches will
incorporate relevant findings from curated databases and the
published literature.
- Information derived from diverse data types should be
integrated. Systems biology derives power from the
leveraging of pre-existing biochemical and cell biology knowledge
with the various interaction network models inferred from the
global datasets. Even though each data type
might be sparse, noisy, or contain systematic errors, a meaningful
pattern among the diverse data might become
apparent, and further analysis made possible, if the
network models are integrated.
- Mathematical and statistical modeling is essential to the
quantitative analysis of a system’s properties. Based on
a working model and relevant assumptions, computer simulations are
used to probe the probable effects of perturbations on a
system’s components and interactions in the interest of
making predictions that can be validated by the collection of more
data. Thus, there is a tight integration of computer modeling
with experimental design.
- Biology should drive technology which, in turn, makes better
biology possible. Invention of novel or more sophisticated data collection, analysis and modeling tools is motivated by the need to solve a real-world biological problem. As a paradigm case, the Human Genome Project forced the development of high-throughput DNA sequencing methodologies. The need to perform multiparameter measurements on single cells is currently driving the invention of microfluidic/nanotechnology devices.
- Systems biology research should create an interactive, interdisciplinary scientific culture. For progress to occur, experts in engineering, physics, mathematics, and computer science must join biochemists, cell biologists, and physiologists in the effort to figure out how to obtain the required data and develop the sophisticated computational approaches that will be needed to make viable predictions. For scientists who have been trained primarily in one of these disciplines, doing systems biology research involves stepping outside one's comfort zone to learn new concepts and methodologies. Systems biology-focused institutions accept that cross-disciplinary training from the outset is the best way for new investigators to embrace the field.
- The results of research should be freely disseminated. The Human Genome Project has revealed the enormous benefit that derives from the public release of data to the community of researchers. While not as easy to work with as genomic sequence, available microarray datasets, yeast two-hybrid analyses, collections of gene knockout strains and the like have accelerated progress in systems biology research. Similarly, computational biology is facilitated by the sharing of open-source software.
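
As an illustration of the modeling principle listed above, in which computer simulations probe the probable effects of perturbations, the following sketch simulates a single gene product with constant synthesis and first-order degradation. The model, the rate constants, and the function name `simulate` are assumptions chosen for simplicity, not a method described in this text. The perturbation is a hypothetical halving of the synthesis rate.

```python
def simulate(k_syn, k_deg, x0=0.0, dt=0.01, steps=5000):
    """Integrate dx/dt = k_syn - k_deg * x with the forward Euler method.

    k_syn: synthesis rate (assumed constant)
    k_deg: first-order degradation rate constant
    Returns the abundance x after steps * dt time units.
    """
    x = x0
    for _ in range(steps):
        x += dt * (k_syn - k_deg * x)
    return x


# Baseline condition versus a hypothetical perturbation that
# halves the synthesis rate.
baseline = simulate(k_syn=2.0, k_deg=1.0)
perturbed = simulate(k_syn=1.0, k_deg=1.0)
```

For this simple model the steady state is `k_syn / k_deg`, so the simulation predicts that halving synthesis halves the final abundance. A prediction of this kind is what a follow-up experiment would be designed to confirm or refute.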