When Theories Shake Hands

(p.72) Five When Theories Shake Hands
Science in the Age of Computer Simulation
University of Chicago Press

Abstract and Keywords

This chapter explores the theories used in computer simulation. It suggests that there is a class of computer simulation models that are especially important to philosophers interested in the relations between theories at different levels of description. These are the so-called parallel multiscale simulation models, which are cobbled together using the resources of quantum mechanics, classical molecular dynamics, and the linear-elastic theory of solids. The chapter explains how it is possible for these different theoretical frameworks, which provide essentially incompatible descriptions of the underlying structure of the systems they describe, to be combined together.

Keywords:   computer simulation, multiscale simulation models, quantum mechanics, classical molecular dynamics, theory of solids

In the preceding chapters, we have devoted a fair amount of attention to the role of theory in guiding the construction of simulation models. Most of the examples we have been drawing upon, though, have been simulations whose theoretical ancestry was limited to a single domain. Some simulations, for example, are based on continuum methods, which begin with a theory that treats their objects of study as a medium described by fields and distributions. Others describe their objects of study as a collection of atoms and molecules and rely on the theoretical framework of a classical potential or as a collection of nuclei and electrons, relying on the theoretical background of a quantum Hamiltonian. And so, while we have seen that the relationship between simulation and theory is complex, our discussion of that relationship has been simplified in at least one respect.

That is because not all simulations work with the framework of a single theoretical background. Indeed, there is a class of computer simulation models that are especially important to philosophers interested in the relations between theories at different levels of description. As it turns out, certain kinds of phenomena are best studied with simulations based on theoretically hybrid models. Such models are constructed under the guidance of theories from a variety of domains. These so-called parallel multiscale simulation models are cobbled together using the resources of quantum (p.73) mechanics, classical molecular dynamics, and the linear-elastic theory of solids.

I want to argue in this chapter that a close look at such simulation methods can offer novel insights into the kinds of relationships that exist between theories at different levels of description. Philosophers of science often assume that the interesting relationships that exist between theories at different levels are essentially mereological. But parallel multiscale models are precisely the kinds of hybrid models in which the various scales are not linked by mereology but are instead sewn together using specially constructed algorithms to mediate between otherwise incompatible frameworks.

So I need to offer an account of how it is possible for these different theoretical frameworks, which provide essentially incompatible descriptions of the underlying structure of the systems they describe, to be sewn together. One particularly interesting thing about the process is that it sometimes relies on model features that I argue should be properly understood as fictions. Fictions, according to the account I offer, are representations that are not concerned with truth or any of its philosophical cousins (approximate truth, empirical adequacy, etc.). This pragmatic definition of a fiction places much weaker demands on what constitutes a nonfictional representation than one traditionally encounters. Still, I argue that fictions can play an important role in science: they can help us to sew together inconsistent model-building frameworks and to extend those frameworks beyond their traditional limits.

Hybrid Models at the Nanoscale

One of the places where we find the most interesting uses of hybrid simulation models is in the so-called nanosciences. Nanoscience, intuitively, is the study of phenomena and structures, and the construction of devices, at a novel scale of description: somewhere between the strictly atomic and the macroscopic levels. Theoretical methods in nanoscience, therefore, often have to draw on theoretical resources from more than one level of description.

Take, for example, the field of nanomechanics, which is the study of solid-state materials that are too large to be manageably described at the atomic level and too small to be studied using the laws of continuum mechanics. As it turns out, one of the methods of studying these nano-sized samples of solid-state materials is to run simulations of them based (p.74) on hybrid models constructed out of theories from a variety of levels (Nakano et al. 2001). Such models bear interestingly novel relationships to their theoretical ancestors. So a close look at simulation methods in the nanosciences could offer novel insights into the kinds of relationships that exist between different theories (at different levels of description) and between theories and their models.

For an example of a simulation model likely to stimulate such insights, we need look no further than the so-called parallel multiscale methods of simulation. These methods were developed by a group of researchers interested in studying the mechanical properties (reactions to stress, strain, and temperature) of intermediate-sized solid-state materials. The particular case that I examine below, developed by Farid Abraham and a group of his colleagues, is a pioneering example of this method.1 What makes the modeling technique “multiscale” is that it couples together the effects described by three different levels of description: quantum mechanics, molecular dynamics, and continuum mechanics.

Multiscale Modeling

Modelers of nanoscale solids need to use these multiscale methods—the coupling together of different levels of description—because each theoretical framework is inadequate on its own at the scale in question. The traditional theoretical framework for studying the mechanical behavior of solids is continuum mechanics (CM). CM provides a good description of the mechanics of macroscopic solids close to equilibrium. But the theory breaks down under certain conditions. CM, particularly the flavor of CM that is most computationally tractable—linear-elastic theory—is no good when the dynamics of the system are too far from equilibrium. This is because linear-elastic theory assumes that materials are homogeneous even at the smallest scales, but we know this is far from the truth. It is an idealization. When modeling large samples of material, this idealization works because the sample is large enough that one can effectively average over the inhomogeneities. Linear-elastic theory is in effect a statistical theory. But as we get below the micron scale, the fine-grained structure begins to matter more. When the solid of interest becomes smaller than approximately one micron in diameter, this “averaging” fails to be adequate. Small local variations from mean structure, such as material (p.75) decohesions—an actual tearing of the material—and thermal fluctuations begin to play a significant role in the system. In sum, CM cannot be the sole theoretical foundation of nanomechanics—it is inadequate for studying solids smaller than one micrometer in size (Rudd and Broughton 2000).

The ideal theoretical framework for studying the dynamics of solids far from equilibrium is classical molecular dynamics (MD). This is the level at which thermal fluctuations and material decohesions are most naturally described. But computational issues constrain MD simulations to about 10^7 to 10^8 molecules. In linear dimensions, this corresponds to a constraint of only about fifty nanometers.

So MD methods are too computationally expensive, and CM methods are insufficiently accurate for studying solids that are on the order of one micron in diameter. On the other hand, parts of the solid in which the far-from-equilibrium dynamics take place are usually confined to regions small enough for MD methods. So the idea behind multiscale methods is that a division of labor might be possible—use MD to model the regions where the real action is, and use CM for the surrounding regions, where things remain close enough to equilibrium for CM to be effective.

There is a further complication. The propagation of cracks through a solid involves the breaking of chemical bonds. But the breaking of bonds involves the fundamental electronic structure of atomic interaction. So methods from MD (which uses a classical model of the energetic interaction between atoms) are unreliable right near the tip of a propagating crack. Building a good model of bond-breaking in crack propagation requires a quantum mechanical (QM) approach. Of course, QM modeling methods are orders of magnitude more computationally expensive than MD methods. In practice, these modeling methods cannot model more than two hundred and fifty atoms at a time.

The upshot is that it takes three separate theoretical frameworks to model the mechanics of crack propagation in solid structures on the order of one micron in size. Multiscale models couple together the three theories by dividing the material to be simulated into three roughly concentric spatial regions. At the center is a very small region of atoms surrounding a crack tip, modeled by the methods of computational QM. In this region, bonds are broken and distorted as the crack tip propagates through the solid. Surrounding this small region is a larger region of atoms modeled by classical MD. In that region, material dislocations evolve and move, and thermal fluctuations play an important role in the dynamics. The far-from-equilibrium dynamics of the MD region is driven by the energetics of the breaking bonds in the inner region. In the outer (p.76) region, elastic energy is dissipated smoothly and close to equilibrium on length scales that are well modeled by the linear-elastic, continuum mechanical domain. In turn, it is the stresses and strains applied on the longest scales that drive the propagation of the cracks on the shortest scales (see figure 5.1).

5.1 The three regions of simulation. At the center of action, where the crack propagates, the system is simulated using a tight-binding (TB) algorithm based on quantum mechanics. In the surrounding area, the far-from-equilibrium disturbances caused by the crack are simulated using molecular dynamics (MD). The elastic waves in the outlying areas are simulated using a finite-element (FE) method. Reprinted with permission from Abraham et al. 1998. Copyright 1998, American Institute of Physics.

It is the interactions between the effects on these different scales that lead students of these phenomena to describe them as “inherently multiscale” (Broughton et al. 1999, 2391). What they mean is that there is significant feedback between the three regions. All of these effects, each one of which is best understood at its own unique scale of description, are strongly coupled together. Since they all interact simultaneously, all three of the different modeling regions must be coupled together and modeled simultaneously. The fact that three different theories at three different levels of description need to be employed makes the models “multiscale.” The fact that these different regions interact simultaneously, that they (p.77) are strongly coupled together, means that the models must be “parallel multiscale” models.

An instructive way to think about the meaning of the phrase parallel multiscale is to compare two different ways of going about integrating different scales of description into one simulation. The first and more traditional method is what Abraham's group label “serial multiscale.” The idea of serial multiscale is to choose a region, simulate it at the lower level of description, summarize the results into a set of parameters digestible by the higher-level description, and then pass those results up to a simulation at the higher level.2
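The serial scheme can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions, not the procedure of Abraham's group: it takes a Lennard-Jones pair potential in reduced units as a hypothetical lower-level description, summarizes it as a single stiffness parameter, and hands that number up to a one-dimensional linear-elastic rod. All function names are invented for the example.

```python
def lj_energy(r, eps=1.0, sigma=1.0):
    """Lennard-Jones pair energy: the lower-level description (reduced units)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def bond_stiffness(r0, h=1e-4):
    """Summarize the atomistic model as one parameter: V''(r0) by central differences."""
    return (lj_energy(r0 + h) - 2.0 * lj_energy(r0) + lj_energy(r0 - h)) / h**2

def rod_elongation(force, n_bonds, k_bond):
    """Higher-level description: a linear-elastic rod of n_bonds springs in series."""
    k_rod = k_bond / n_bonds
    return force / k_rod

r0 = 2.0 ** (1.0 / 6.0)      # equilibrium Lennard-Jones separation
k = bond_stiffness(r0)       # the parameter passed "up" one level
stretch = rod_elongation(force=0.01, n_bonds=1000, k_bond=k)
print(f"bond stiffness = {k:.3f}, rod elongation = {stretch:.4f}")
```

Information flows one way only: once the stiffness has been extracted, the higher-level model never consults the atomistic one again, which is why the approach struggles when the scales are strongly coupled.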

But serial multiscale methods are not effective when the different scales are strongly coupled together. There is a large class of problems for which the physics is inherently multiscale; that is, the different scales interact strongly to produce the observed behavior. It is necessary to know what is happening simultaneously in each region since each one is strongly coupled to the others (Broughton et al. 1999, 2391).

What seems to be required for simulating an inherently multiscale problem is an approach that simulates each region simultaneously, at its appropriate level of description, and then allows each modeling domain to continuously pass relevant information back and forth between regions—in effect, a model that seamlessly combines all three theoretical approaches. This kind of method is referred to as parallel multiscale modeling or as “concurrent coupling of length scales.” What allows the integration of the three theories to be seamless is that they overlap at the boundaries between the pairs of regions. At these boundary regions, the different theories “shake hands” with each other. The regions are called “handshaking regions,” and they are governed by “handshaking algorithms.” I discuss how this works in more detail in the next section.

The use of these handshaking algorithms is one of the things that make parallel multiscale models interesting. Parallel multiscale modeling, in particular, appears to be a new way to think about the relationship (p.78) between different levels of description in physics, chemistry, and engineering. Typically, after all, we tend to think about relationships between levels of description in mereological terms: a higher level of description relates to a lower level of description more or less in the way that the entities discussed in the higher level are composed of the entities found in the lower level. That kind of relationship, one grounded in mereology, accords well with the relationship that different levels of models bear to each other in what the Abraham group calls serial multiscale modeling. But parallel multiscale models appear to be a different way of structuring the relationship between different levels of description in physics and chemistry.

I would like to offer a bit more detail about how these models are put together and, in particular, to say a bit more about how the handshaking algorithms work—in effect, to illustrate how one seamless model can integrate more than one level of description. To do this, though, I must first say a bit more about how each separate modeling level works.

Three Theoretical Approaches

Continuum Mechanics (Linear-Elastic Theory)

The basic theoretical background for the model of the largest scale regions is linear-elastic theory, which relates, in linear fashion, stress (a measure of the quantity of force on a point in the solid) with strain (a measure of the degree to which the solid is deformed from equilibrium at a point). Linear-elastic theory, combined with a set of experimentally determined parameters for the specific material under study, enables you to calculate the potential energy stored in a solid as a function of its local deformations. Since linear-elastic theory is continuous, it must be discretized in order to be used in a computational model. This is done using a “finite-element” method. This technique involves a “mesh” made up of points that effectively tile the entire modeling region with tetrahedral or some other tiling. The size of each tetrahedron can vary across the material being simulated according to how much detail is needed in that area. Each mesh point is associated with a certain amount of displacement—the strain field. At each time step, the total energy of the system is calculated by “integrating” over each tetrahedron. The gradient of this energy function is used to calculate the acceleration of each grid point, which is in turn used to calculate its position for the next time step. And so on.
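The energy, gradient, and update cycle just described can be sketched in one dimension. This is a minimal sketch under several assumptions that are not in the source: line segments stand in for tetrahedra, every mesh point carries unit mass, and the positions are advanced with a velocity-Verlet integrator (the text does not say which integrator is used). All names are hypothetical.

```python
def element_energy(u_left, u_right, length, modulus=1.0):
    """Elastic energy of one 1D element: (1/2) * modulus * strain^2 * length."""
    strain = (u_right - u_left) / length
    return 0.5 * modulus * strain**2 * length

def total_energy(u, lengths):
    """Sum ("integrate") the stored energy over every element of the mesh."""
    return sum(element_energy(u[i], u[i + 1], lengths[i]) for i in range(len(lengths)))

def forces(u, lengths, modulus=1.0):
    """Force on each mesh point: minus the gradient of the total energy."""
    f = [0.0] * len(u)
    for i, length in enumerate(lengths):
        tension = modulus * (u[i + 1] - u[i]) / length
        f[i] += tension
        f[i + 1] -= tension
    return f

def step(u, v, lengths, dt=0.01, mass=1.0):
    """One velocity-Verlet time step for displacements u and velocities v."""
    f_old = forces(u, lengths)
    u_new = [ui + vi * dt + 0.5 * fi / mass * dt**2 for ui, vi, fi in zip(u, v, f_old)]
    f_new = forces(u_new, lengths)
    v_new = [vi + 0.5 * (fo + fn) / mass * dt for vi, fo, fn in zip(v, f_old, f_new)]
    return u_new, v_new

lengths = [0.5, 0.5, 1.0, 2.0]     # finer mesh on the left, coarser on the right
u = [0.0, 0.1, 0.0, 0.0, 0.0]      # displacement of each mesh point
v = [0.0] * len(u)
for _ in range(100):
    u, v = step(u, v, lengths)
print(total_energy(u, lengths))
```

The varying element lengths mirror the point that the mesh can be made finer where more detail is needed.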

(p.79) Molecular Dynamics

In the medium-scale regions, the basic theoretical background is a classical theory of interatomic forces. The model begins with a lattice of atoms. The forces between the atoms come from a classical potential energy function for silicon proposed by Stillinger and Weber (1985). The Stillinger-Weber potential is much like the Lennard-Jones potential in that its primary component comes from the energetic interaction of nearest-neighbor pairs. But the Stillinger-Weber potential also adds a component to the energy function from every triplet of atoms, proportional to the degree to which the angle formed by each triplet deviates from its equilibrium value. Just as in the finite-element case, forces are derived from the gradient of the energy function and are in turn used to update the position of each atom at each time step.
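The structure of such a potential, a pair term plus an angle-dependent triplet term, can be sketched as follows. This is a simplified stand-in rather than the Stillinger-Weber potential itself: the pair term here is harmonic, the radial cutoff functions of the real potential are omitted, and the parameter values are arbitrary. What is retained is the angular factor, which vanishes at the tetrahedral angle (cosine of -1/3). All names are hypothetical.

```python
import itertools
import math

COS_TETRAHEDRAL = -1.0 / 3.0   # ideal bond-angle cosine in a diamond-cubic lattice

def pair_energy(r, r0=1.0, k=1.0):
    """Stand-in two-body term: harmonic in bond length (not the real SW form)."""
    return 0.5 * k * (r - r0) ** 2

def angle_energy(center, a, b, strength=1.0):
    """Three-body term: penalizes deviation of the angle a-center-b from tetrahedral."""
    va = [x - c for x, c in zip(a, center)]
    vb = [x - c for x, c in zip(b, center)]
    cos_theta = sum(p * q for p, q in zip(va, vb)) / (math.hypot(*va) * math.hypot(*vb))
    return strength * (cos_theta - COS_TETRAHEDRAL) ** 2

def total_energy(positions, bonds):
    """Sum pair terms over bonds and angle terms over bonded triplets."""
    energy = sum(pair_energy(math.dist(positions[i], positions[j])) for i, j in bonds)
    neighbors = {}
    for i, j in bonds:
        neighbors.setdefault(i, []).append(j)
        neighbors.setdefault(j, []).append(i)
    for center, nbrs in neighbors.items():
        for a, b in itertools.combinations(nbrs, 2):
            energy += angle_energy(positions[center], positions[a], positions[b])
    return energy

def force_on(positions, bonds, atom, h=1e-6):
    """Force on one atom: minus the numerical gradient of the total energy."""
    force = []
    for axis in range(3):
        shifted = [list(p) for p in positions]
        shifted[atom][axis] += h
        e_plus = total_energy(shifted, bonds)
        shifted[atom][axis] -= 2 * h
        e_minus = total_energy(shifted, bonds)
        force.append(-(e_plus - e_minus) / (2 * h))
    return force

s = 1.0 / math.sqrt(3.0)
ideal = [(0.0, 0.0, 0.0), (s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
bent = list(ideal)
bent[1] = (1.0, 0.0, 0.0)          # same bond length, wrong bond angles
print(total_energy(ideal, bonds), total_energy(bent, bonds))
```

The bent configuration keeps every bond at its equilibrium length, so its energy comes entirely from the triplet term, which is the contribution a pure pair potential would miss.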

Quantum Mechanics

The very smallest regions of the solid are modeled as a set of atoms whose energetic interaction is governed not by classical forces but by a quantum Hamiltonian. The quantum mechanical model that Abraham's group uses is based on a semi-empirical method from computational quantum chemistry known as the “tight-binding” method. It begins with the Born-Oppenheimer approximation, which separates electron motion and nuclear motion, and treats the nuclei as basically fixed particles as far as the electronic part of the problem is concerned. The next approximation is to treat each electron as basically separate from the others and confined to its own orbital. The semi-empirical part of the method uses empirical values for the matrix elements in the Hamiltonian of these orbitals. For example, the model system that Abraham's group has focused on is solid-state silicon. Thus the values used for the matrix elements come from a standard reference table for silicon, derived from experiment. Again, once a Hamiltonian can be written down for the whole system, the motions of the nuclei can be calculated from step to step.
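A toy version of the tight-binding recipe may make the steps concrete. The sketch below assumes a one-dimensional chain with one orbital per atom, fixed nuclei, and made-up values for the on-site and hopping matrix elements (standing in for the tabulated silicon values, which are not reproduced here). The small Jacobi diagonalizer is included only to keep the example self-contained; a linear algebra library would normally do this job. All names are hypothetical.

```python
import math

def build_hamiltonian(n_atoms, on_site=0.0, hopping=-1.0):
    """One orbital per atom; nonzero matrix elements only between chain neighbors."""
    h = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        h[i][i] = on_site
        if i + 1 < n_atoms:
            h[i][i + 1] = h[i + 1][i] = hopping
    return h

def eigenvalues(matrix, sweeps=100, tol=1e-20):
    """Eigenvalues of a small real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):            # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):            # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def electronic_energy(n_atoms, n_electrons):
    """Fill the lowest orbitals with two electrons each and sum their energies."""
    levels = eigenvalues(build_hamiltonian(n_atoms))
    energy, remaining = 0.0, n_electrons
    for level in levels:
        occupancy = min(2, remaining)
        energy += occupancy * level
        remaining -= occupancy
    return energy

for n in (2, 3, 4):
    e = electronic_energy(n, n)
    print(n, "atoms: total", round(e, 4), "per bond", round(e / (n - 1), 4))
```

Because the energy comes from the eigenvalues of the whole matrix, the energy per bond changes as the chain grows: the total is not a sum of fixed bond-by-bond contributions.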

Handshaking between Theories

Clearly, these three different modeling methods embody mutually inconsistent frameworks. They each offer fundamentally different descriptions of matter, and they each offer fundamentally different mathematical (p.80) functions describing the energetic interactions among the entities they describe. Nevertheless, as Broughton and colleagues describe the approach, the overarching theme is that a single Hamiltonian is defined for the entire system (Broughton et al. 1999, 2393).

The key to building a single coherent model out of these three regions is to find the right handshaking algorithm to pass the information about what is going on in one region that will affect a neighboring region into that neighbor. One of the difficulties that beset earlier attempts to exchange information between different regions in multiscale models was that they failed, badly, to conserve energy. The key to Abraham's success in avoiding this problem is that his group constructs the handshaking algorithms in such a way as to define a single expression for energy for the whole system. The expression is a function of the positions of the various “entities” in their respective domains, whether they be mesh elements, classical atoms, or the atomic nuclei in the quantum mechanical region.
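Schematically, a single energy expression of this kind can be written as a sum of one term per region plus one term per handshake. The notation below is an assumption introduced for illustration (the subscripts and the symbol x_i for the position of entity i are not taken from the source), and it is meant as a schematic of the idea rather than the group's exact formula:

```latex
H_{\text{total}} = H_{\text{FE}} + H_{\text{FE/MD}} + H_{\text{MD}} + H_{\text{MD/QM}} + H_{\text{QM}},
\qquad
F_i = -\frac{\partial H_{\text{total}}}{\partial x_i}
```

Here the terms with a slash are the handshaking contributions, and F_i is the force on entity i, whether that entity is a mesh point, a classical atom, or a nucleus in the quantum region. Because every force derives from one function, the conserved quantity is well defined across all three regions.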

The best way to think of Abraham's handshaking algorithms, then, is as an expression that defines the energetic interactions between, for example, the matter in the continuum mechanical region and the matter in the molecular dynamical regions. But this is a strange idea indeed (to define the energetic interactions between regions), since the salient property possessed by the matter in one region is a (strain) field value, while in the other it is the position of a constituent particle, and in the third it is an electron cloud configuration. To understand how this is possible, we simply have to look at the details in each case.

Handshaking between CM and MD

To understand the CM/MD handshaking algorithm, first envision a plane separating the two regions. Next, recall that in the finite-element method of simulating linear-elastic theory, the material to be simulated is covered in a mesh that divides it up into tetrahedral regions. One of the original strengths of the finite-element method is that the finite-element (FE) mesh can be varied in size to suit the simulation's needs, allowing the simulationists to vary how fine or coarse the computational grid is in different locations. When the finite-element method is being used in a multiscale model, this feature of the FE mesh becomes especially useful. The first step in defining the handshake region is to ensure that as you approach the plane separating the two domains from the finite-element side, the mesh elements of the FE domain are made to coincide with the atoms of the MD domain. (Farther away from the plane, the mesh will typically get much coarser.) (p.81)

5.2 The dots in the handshake region play the role of mesh elements when they look to the left and of atoms when they look to the right. Reprinted with permission from Abraham et al. 1998. Copyright 1998, American Institute of Physics.

The next step is to calculate the energy of the handshake region. This is the region between the last mesh point on one side and the first atom on the other. The technique that Abraham's group uses is essentially to calculate this energy twice: once from the perspective of FE, and once from the perspective of MD, and then average the two. Doing the first of these involves pretending that the atoms in the first row are actually mesh elements; doing the second involves the opposite—pretending that the mesh elements in the last row are atoms (see figure 5.2).

Suppose, for example, that there is an atom on the MD side of the border. It looks over the border and sees a mesh point. For the purpose of the handshaking algorithm, we treat that mesh point as an atom, calculate the energetic interaction according to the Stillinger-Weber potential, and divide it by two (remember, we are going to be averaging together the two energetics). We do this for every atom/mesh-point pair that spans the border. Since the Stillinger-Weber potential also involves triples, we do the same thing for every triple that spans the border (again dividing by two). This is one-half of the “handshaking Hamiltonian.” The other half comes from the continuum dynamics' energetics. Whenever a mesh point on the CM side of the border looks over and sees an atom, it pretends that atom is a mesh point. Thus, from that imaginary point of view, there are complete tetrahedra that span the border (some of whose (p.82) vertices are mesh points that are “really” atoms). Treating the position of that atom as a mesh-point position, the algorithm can calculate the strain in that tetrahedron and integrate over the energy stored in the tetrahedron. Again, since we are averaging together two Hamiltonians, we divide that energy by two.

We now have a seamless expression for the energy stored in the entire region made up of both the continuous solid and the classical atoms. The gradient of this energy function dictates how both the atoms and the mesh points will move from step to step. In this way, the happenings in the CM region are automatically communicated to the molecular dynamics region, and vice versa.
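The averaging idea can be sketched for a one-dimensional chain. The sketch rests on several assumptions that are not in the source: there is a single border-spanning bond, one dimension means there are no triplets, the atomistic potential is a stand-in anharmonic spring rather than Stillinger-Weber, and both descriptions share the same small-strain stiffness. All names are hypothetical.

```python
SPACING = 1.0      # equilibrium spacing of atoms and of the refined mesh
STIFFNESS = 1.0    # shared small-strain stiffness of both descriptions

def fe_bond_energy(x_left, x_right):
    """Continuum view: linear-elastic energy of one 1D element."""
    strain = (x_right - x_left - SPACING) / SPACING
    return 0.5 * STIFFNESS * SPACING * strain**2

def md_bond_energy(x_left, x_right, anharmonic=-0.5):
    """Atomistic view: a stand-in anharmonic pair potential (not Stillinger-Weber)."""
    stretch = x_right - x_left - SPACING
    return 0.5 * STIFFNESS * stretch**2 + anharmonic * stretch**3

def total_energy(x, n_fe_bonds):
    """Bonds before index n_fe_bonds are FE, bond n_fe_bonds is the handshake, the rest MD."""
    energy = 0.0
    for i in range(len(x) - 1):
        if i < n_fe_bonds:
            energy += fe_bond_energy(x[i], x[i + 1])
        elif i == n_fe_bonds:
            # Handshake: evaluate the border-spanning bond under both
            # descriptions and take half of each.
            energy += 0.5 * fe_bond_energy(x[i], x[i + 1])
            energy += 0.5 * md_bond_energy(x[i], x[i + 1])
        else:
            energy += md_bond_energy(x[i], x[i + 1])
    return energy

def forces(x, n_fe_bonds, h=1e-6):
    """Minus the numerical gradient of the single energy expression."""
    f = []
    for i in range(len(x)):
        plus = x[:i] + [x[i] + h] + x[i + 1:]
        minus = x[:i] + [x[i] - h] + x[i + 1:]
        f.append(-(total_energy(plus, n_fe_bonds) - total_energy(minus, n_fe_bonds)) / (2 * h))
    return f

# Nodes 0-2 are mesh points, nodes 3-5 are atoms; only the border bond is stretched.
x = [0.0, 1.0, 2.0, 3.1, 4.1, 5.1]
print(total_energy(x, n_fe_bonds=2))
print(forces(x, n_fe_bonds=2))
```

The last mesh point and the first atom feel equal and opposite forces derived from the same averaged term, so a disturbance on either side is passed across the border without any separate bookkeeping.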

Handshaking between MD and QM

The general approach for the handshaking algorithm between the quantum region and the molecular dynamics region is similar: the idea is to create a single Hamiltonian that seamlessly spans the union of the two regions. But in this case, there is an added complication. The difficulty is that the tight-binding algorithm does not calculate the energy locally. That is, it does not apportion a value for the energy for each interatomic bond; it calculates energy on a global basis. Thus, there is no straightforward way for the handshaking algorithm between the quantum and MD regions to calculate an isolated quantum mechanical value for the energetic interaction between an outermost quantum atom and a neighboring innermost MD atom. But it needs to do this in order to average it with the MD value for that energy.

The solution that Abraham and his group have developed to this problem is to employ a trick that allows the algorithm to localize that QM value for the energy. The trick is to employ the convention that at the edge of the QM region, each “dangling bond” is “tied off” with an artificial univalent atom. To do this, each atom location that lies at the edge of the QM region is assigned an atom with a hybrid set of electronic properties (see figure 5.3). In the case of silicon, what is needed is something like a silicon atom with one valence electron. These atoms, called “silogens,” have some of the properties of silicon and some of the properties of hydrogen. They produce a bonding energy with other silicon atoms that is equal to the usual Si-Si bond energy, but they are univalent like a hydrogen atom. This is made possible by the fact that the method is semi-empirical, and so fictitious values for matrix elements can simply be assigned at will. The result is that the silogen atoms do not energetically interact with their silogen neighbors, which means that the algorithm (p.83) can localize their quantum mechanical energetic contributions. Finally, once the problem of localization is solved, the algorithm can assign an energy between atoms that span the regional threshold that is the average of the Stillinger-Weber potential and the energy from the Hamiltonian in the tight-binding approximation. Again, this creates a seamless expression for energy.

5.3 The dots in the middle region act as classical molecules in the contributions to the MD Hamiltonian and as either silicon or “silogen” atoms in their contributions to the QM Hamiltonian. Reprinted with permission from Abraham et al. 1998. Copyright 1998, American Institute of Physics.

Confronting Some Philosophical Intuitions

In the next section, I suggest that there are features of these multiscale models—with their integration of different levels of description, their handshaking algorithms, and their silogens—that appear on their face to be at odds with some basic philosophical intuitions about the relationships between different theories and between theories and their models. But before I begin to draw any philosophical conclusions, I think it is important to note that this area of research—nanomechanics in general and these multiscale methods in particular—is in its relative infancy. And while Abraham and his group have had some success with their models, researchers in these areas are still facing important challenges. It is probably (p.84) too early to say whether or not this particular method of simulation will turn out, in the great scheme of things, to be the right way to go about predicting and representing the behavior of “intermediate-sized” samples of solid-state materials. Hence, it is probably also too early to be drawing conclusions, methodological or otherwise, from these sorts of examples.

It might not be too early, however, to start thinking about what kinds of basic philosophical intuitions about science are likely to come under pressure—or to be informed in novel ways—if and when these scientific domains mature. So we might, at this stage, try to pinpoint some basic philosophical questions whose answers are likely to be influenced by this kind of work. In other words, what I want to do here is to offer some ideas about what kinds of questions philosophers are likely to be able to shed light on, prospectively, if they keep an eye on what is going on in nanoscale modeling and simulation—especially with regard to multiscale methods—and to provide a sneak preview of what we might discover as the field progresses.

One issue that has received perennial attention from philosophers of science is that of the relationship between different levels of description. Traditionally, the focus of this inquiry has been debate about whether or not, and to what extent or in what respect, laws or theories at higher levels of description are reducible to those at a lower level.

Underlying all of this debate, I believe, has been a common intuition: the basis for understanding interlevel interaction—to the extent that it is possible—is just applied mereology. In other words, to the extent that the literature in philosophy of science about levels of description has focused on whether and how one level is reducible to another, it has implicitly assumed that the only interesting possible relationships are logical ones—that is, intertheoretic relationships that flow logically from the mereological relationships between the entities posited in the two levels.3

But if methods that are anything like those described above become accepted as successful in nanoscale modeling, that intuition is likely to come under pressure. The reason is that parallel multiscale modeling methods are forced to develop relationships between the different levels that are perhaps suggested, but certainly not logically determined, by their mereology. Rather, developing the appropriate relationships, in Abraham's words, “requires physical insight.”

(p.85) What this suggests is that there can be a substantial physics of inter-level interaction—a physics that is guided but not determined by either the theories at each level or the mereology of their respective entities. Indeed, whether or not the relationships employed by Abraham and his group will turn out to be the correct ones is an empirical/physical question and not a logical/mereological one.

Another issue that has recently begun to receive attention among philosophers of science, particularly in the work of Mathias Frisch (2004), is the importance of consistency in a set of laws. Using classical electrodynamics (CED) as an example, Frisch has challenged a common philosophical intuition about scientific theories: that the internal consistency of their laws is a necessary condition that all successful theories must satisfy. I want to make a similar point here. In this case, the example of multiscale modeling seems to put pressure on a closely related, if somewhat weaker, intuition: that an inconsistent set of laws can have no models.

In a formal setting, this claim is obviously true; indeed, it is true by definition. But rarely in scientific practice do we actually deal with models that have a clear formal relationship to the laws that inspire them. Most likely, the intuition that inconsistent laws cannot produce a coherent model in everyday scientific practice rests as much on pragmatic considerations as it does on the analogy to formal systems: how, in practice, could mutually conflicting sets of laws guide the construction of a coherent and successful model? We can start by looking at what we learn from Frisch. In CED the strategy is usually to keep the inconsistent subsets of the theory properly segregated for a given model.

The Maxwell-Lorentz equations can be used to treat two types of problems. We can appeal to the Maxwell equations to determine the fields associated with a given charge and current distribution, or we can use the Lorentz force law to calculate the motion of a charged particle in a given external electromagnetic field (Frisch 2004, 529). In other words, in most models of CED, each respective model draws from only one of the two mutually inconsistent “sides” of the theory. This technique works for most applications, but there are exceptions where the method fails. Models of synchrotron radiation, for example, necessarily involve both mutually inconsistent parts of the theory.

There are problems, in other words, that require us to calculate the field from the charges, as well as to calculate the motion of the charges from the fields. But the solution method, even in the synchrotron case as Frisch describes it, is still a form of segregation. The segregation is temporal. You break the problem up into time steps: in one time step you (p.86) use the Lorentz equations; in the next you use the Maxwell equations, and so on.
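The alternating scheme just described can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not Frisch's model and not a real synchrotron calculation: two point charges on a line, toy units with the Coulomb constant set to 1, a purely electrostatic field (no magnetic term, no radiation), and hypothetical function names `field_step`, `motion_step`, and `simulate`. It shows only the temporal segregation: fields are computed from the charges in one sub-step, and the charges are moved by those fields in the next.

```python
K = 1.0  # Coulomb constant in toy units (assumption of this sketch)


def field_step(positions, charges):
    """'Maxwell side': field at each particle from the other charges.

    The particle's own field is left out, a simplifying assumption here.
    """
    fields = []
    for i, xi in enumerate(positions):
        e = 0.0
        for j, xj in enumerate(positions):
            if i == j:
                continue
            r = xi - xj
            e += K * charges[j] * r / abs(r) ** 3  # 1D Coulomb field
        fields.append(e)
    return fields


def motion_step(positions, velocities, charges, masses, fields, dt):
    """'Lorentz side': move each charge in the field, taken as given."""
    new_v = [
        v + (q * e / m) * dt
        for v, q, e, m in zip(velocities, charges, fields, masses)
    ]
    new_x = [x + v * dt for x, v in zip(positions, new_v)]
    return new_x, new_v


def simulate(positions, velocities, charges, masses, dt, n_steps):
    """Alternate the two sub-steps; neither law is applied at the same moment."""
    for _ in range(n_steps):
        fields = field_step(positions, charges)
        positions, velocities = motion_step(
            positions, velocities, charges, masses, fields, dt
        )
    return positions, velocities
```

Calling `simulate([-1.0, 1.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0], dt=0.01, n_steps=100)` lets two like charges repel each other. At no point in the loop are the two sets of equations solved jointly; each sub-step treats the output of the other as fixed input.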

A form of segregation is employed in multiscale modeling as well, but it is forced to break down at the boundaries. Each of the three theoretical approaches is confined to its own spatial region of the system.4 But the fact that there are significant simultaneous and back-and-forth interactions between the physics in each of these regions means that the strategy of segregation cannot be entirely effective. Parallel multiscale methods require the modeler to apply, in the handshaking region, two different sets of laws. The laws in Abraham's model, moreover, are each pair-wise inconsistent. They offer conflicting descriptions of matter and conflicting accounts of the energetic interactions between the constituents of that matter. But the construction of the model in the handshaking regions is guided by both members of the pair. When you include the handshaking regions, parallel multiscale models are—all at once—models of an inconsistent set of laws.

The methods developed by these researchers for overcoming these inconsistencies (the handshaking algorithms) may or may not turn out to be too crude to provide a reliable modeling approach. But by paying close attention to developments in the field of nanoscale modeling, a field in which the models are almost certainly going to be required to involve hybrids of classical, quantum, and continuum mechanics, philosophers are likely to learn a great deal about how inconsistencies are managed. In the process, we will be forced to develop richer accounts of the relationships between theories and their models—richer accounts, in any case, than the one suggested by the analogy to formal systems.

Finally, these modeling techniques raise a third issue, a variation on a perennial theme in the philosophy of science: How do models differ from ideal descriptions? In particular, what role can falsehoods play in model building?

It has been widely recognized that many successful scientific models do not represent exactly. A simple example: the model of a simple harmonic oscillator can quite successfully predict the behavior of many real physical systems, but it provides at best only an approximately accurate representation of those systems. Nevertheless, many philosophers (p.87) hold to the intuition that successful models differ from ideal descriptions primarily in that they include idealizations and approximations. Ronald Laymon has made this intuition more precise with the idea of “piecewise improvability” (Laymon 1985). The idea is that while many empirically successful models deviate from ideal description, a small improvement in the model (that is, a move that brings it closer to an ideal description) should always result in a small improvement in its empirical accuracy.
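One way to make this condition explicit, offered as a sketch and not as Laymon's own formulation: write d(M) for how far a model M departs from the ideal description and ε(M) for its empirical error. Both symbols are introduced here purely for illustration. For a model M′ obtained from M by a small change, the condition would then read:

```latex
d(M') < d(M) \;\Longrightarrow\; \varepsilon(M') < \varepsilon(M)
```

Read this way, piecewise improvability is a monotonicity requirement: any small step toward the ideal description should also be a step toward greater empirical accuracy.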

But what about the inclusion of silogen atoms in multiscale models of silicon? Here, piecewise improvability seems to fail. If we make the model “more realistic” by putting in more accurate values for the matrix elements at the periphery of the QM region, then the resulting calculation of the energetic interactions in the handshake region will become less accurate, not more accurate, and the overall simulation will fail to represent accurately at all.

This would suggest that we can add certain sorts of elements to models that are different in kind from ordinary idealizations, approximations, and simplifications. I suggest that we should call such modeling elements, such as the silogen atom, fictions.

Fictions in Science

To a first approximation, fictions are representations that do not concern themselves with truth. Science, to be sure, is full of representations. But the representations offered to us by science, we are inclined to think, are supposed to aim at truth (or at least one of its cousins: approximate truth, empirical adequacy, reliability). If the proper and immediate object of fictions is contrary to the aims of science, what role could there be for fictions in science?

I want to use the silogen example to argue for at least one important role for fictions in science, especially in computer simulation. Fictions, I contend, are sometimes needed for extending the useful scope of theories and model-building frameworks beyond the limits of their traditional domains of application. One especially interesting way in which they do this is by allowing model builders to sew together incompatible theories and apply them in contexts in which neither theory by itself will do the job.

I have already mentioned piecewise improvability as one way in which the use of the silogen atom differs from other sorts of modeling assumptions like idealizations and approximations. But I would like to clarify more precisely what I take it to mean for a representation in science to be a fiction. My view differs substantially from those of many others who (p.88) discuss the role of fictions in science. The history of discussions of fictions in the philosophy of science goes back at least to Hans Vaihinger's famous book, The Philosophy of ‘As If’. On Vaihinger's view, science is full of fictions.5 I believe this view is shared by many who discuss fictions in science, including many contributors to a recent volume on the subject (Suarez 2009). Vaihinger asserts that any representation that contradicts reality is a fiction or, at least, a semifiction. (A full fiction, according to Vaihinger, is something that contradicts itself.) Accordingly, most ordinary models in science are a kind of fiction.

“Fictional,” however, is not the same thing as “inexact” or “not exactly truthful.” Not everything, I would argue, that diverges from reality, or from our best accounts of reality, is a fiction. Many of the books to be found in the nonfiction section of your local bookstore contain claims that are inexact or even false. But we do not, in response, ask to have them reshelved in the fiction section. An article in this morning's newspaper might make a claim, “The green zone is a 10 km² circular area in the center of Baghdad,” that is best seen as an idealization. And though I live at the end of a T-intersection, “Google maps” shows the adjacent street continuing on to run through my home. Still, none of these things are fictions.

I take as a starting point, therefore, the assumption that we ought to count as nonfictional many representations in science that fail to represent exactly; even representations that in fact contradict what our best science tells us to be the case about the world. Many of these kinds of representations are best captured by our ordinary use of the word “model.” The frictionless plane, the simple pendulum, and the point particle all serve as good representations of real systems for a wide variety of purposes. All of them, at the same time, fail to represent exactly the systems they purport to represent or, for that matter, any known part of the world. They all incorporate false assumptions and idealizations.

But, contra Vaihinger, I urge that, because of their function (in ordinary contexts), we continue to call these sorts of representations “models,” and resist calling them fictions. It seems to me to be simply wrong to say that ordinary models in science are not concerned with truth or any of its cousins. In sum, we do not want to get carried away. We do not want all (or almost all) of the representations in science, on maps, in (p.89) journalism, and so on to count as fictions. To do so risks not only giving a misleading overall picture of science (“all of science is fiction!”) but also weakening to the point of emptiness a useful dichotomy between different kinds of representations: the fictional and the nonfictional. If only exact representations are nonfictions, then even the most ardent scientific realist will have to admit that there are precious few nonfictional representations in the world.

Fictions, then, are rarer in science than are models, because most models in science aim at truth or one of its cousins. But fictions do not. It might seem, then, that on this narrower conception of what it is for a representation to be a fiction, there will probably turn out to be no fictions in science. Part of my goal, then, is to show that there are. To do this, I must define my more limited conception of what constitutes a fiction. This will involve being clearer about what it means to “aim at truth or one of its cousins.”

So how should we proceed in demarcating the boundary between fictions and nonfictions? The truly salient difference between a fictional and a nonfictional representation, it seems to me, rests with the proper function of the representation. Indeed, I argue that we should count any representation—even one that misrepresents certain features of the world (as most models do)—as a nonfiction if we offer it for the sort of purpose for which we ordinarily offer nonfictional representations.

I offer, in other words, a pragmatic rather than a correspondence conception of fictionality. What, then, is the ordinary function of nonfictional representations? I suggest that, under normal circumstances, when we offer a nonfictional representation, we offer it for the purpose of being a “good enough” guide to the way some part of the world is, for a particular purpose. Here, “good enough” implies that the model is accountable to the world (in a way that fictions are not) in the context of that purpose. On my account, to hold out a representation as a nonfiction is ipso facto to offer it for a particular purpose and to promise that, for that purpose, the model “won't let you down” when it comes to offering guidance about the world for that purpose. In short, nonfictional representations promise to be reliable in a certain sort of way.

But not just in any way. Consider an obvious fiction: the fable of the grasshopper and the ant. The fable of the grasshopper and the ant, you may recall, is the story of the grasshopper who sings and dances all summer, while the ant toils at collecting and storing food for the coming winter. When the winter comes, the ant is well prepared, and the grasshopper, who is about to starve, begs for his charity. This is what we might call a didactic fiction. The primary function of the fable, one can (p.90) assume, is to offer us important lessons about the way the world is and how we should conduct ourselves in it. For the purpose of teaching children about the importance of hard work, planning, and the dangers of living for today, it is a reasonably reliable guide to certain features of the world (or so one might think).

So why is it a fiction? What is the difference, just for example, between didactic fictions and nonfictions if both of them can serve the purpose of being reliable guides? To answer this, I think we need to consider the representational targets of representations. The fable of the grasshopper and the ant depicts a particular state of affairs. It depicts a land where grasshoppers sing and dance, insects talk, grasshoppers seek charity from ants, and so forth. If you read the fable wrong, if you read it as a nonfiction, you will think the fable is describing some part of the world—its representational target. If you read it in this way, then you will think it is meant to be a reliable guide to the way this part of the world—this little bit of countryside where the grasshopper and the ant live—is. In short, the fable is a useful guide to the way the world is in some general sense, but it is not a guide to the way its prima facie representational target is. And that is what makes it, despite its didactic function, a fiction.

Nonfictions, in other words, are not just reliable guides to the way the world is in any old way. They describe and point to a certain part of the world and say, “If you want to know about that part of the world I am pointing to, for a certain sort of purpose, I promise to help you in that respect and not let you down.” The importance of this point about the prima facie representational targets of representations will become clear later.

Fictional representations, on the other hand, are not thought to be good enough guides in this way. They are offered with no promises of a broad domain of reliability. Unlike most models in science, fictions do not come stamped with promissory notes that say something like “In these respects and to this degree of accuracy (those required for a particular purpose), some domain of the world—the domain that I purport to point to—is like me, or will behave like me.”

So in order to understand the difference between fictional and nonfictional representations, it is crucial to understand the different functions that they are intended to serve. But intended by whom? A brief note about intentionality is in order. It probably sounds, from the above, as though in my view whether or not a representation counts as a fiction depends on the intention of the author of the representation. Thus, if the author intends for the representation to carry with it such a promissory note, then the representation is nonfictional. This is close to my view. But following Catherine Elgin (2009), I prefer to distinguish fictions from (p.91) nonfictions without reference to the intention of the author. If we find, someday, the secret diaries of James Watson and Francis Crick, and these reveal to us that they intended the double helix model as a planned staircase for a country estate, this has no bearing on the proper function of the model. On my view, what a representation is for depends not on the intention of the author, but on the community's norms of correct use.

Consider the famous tapestry hanging in the Cloisters museum in New York, the Unicorn in Captivity. This is a nice example of a representation. Quite possibly, the artist intended the tapestry be taken as a nonfictional representation belonging to natural history. More important, his contemporaries probably believed in unicorns. Had there been museums of natural history at the time, this might have been where the tapestry would have hung. But today, the tapestry clearly belongs in a museum of art. Deciding whether the tapestry is a fiction or a nonfiction does not depend on the intention of the author; it depends on what, on the basis of the community's norms, we can correctly take its function to be—on what kind of museum the community is likely to display it in. Once upon a time, its function might have been to provide a guide to the animal kingdom. But it no longer is.

It is clear that science is full of representations that are inexact and of representations that contradict what we know or believe to be the case about the world. It is clear, that is, that science is full of models. But are there fictions in science? If a representation is not even meant to serve as a reliable guide to the way the world is, if it is a fiction, wouldn't it necessarily fall outside of the enterprise of science?

Despite the rather liberal constraints I want to impose on what counts as a nonfiction, I believe that fictions can play a variety of roles in science. And I believe the example of the silogen atom from nanomechanics illustrates one such role. In that example, the model builders have added a fiction to their model—the silogen atom—in order to extend the useful scope of the model-building frameworks they employ beyond the limits of their traditional domains of application.

Silogen atoms, it should be clear, are fictions. To see this, we need to look at their function. But we need to be careful. If we examine the overall model that drives the simulation as a whole, it is clearly nonfictional. The representational target of the Abraham model is micron-sized pieces of silicon. And the Abraham model is meant to be a reliable guide to the way that such pieces of silicon behave. To endorse such a model in the relevant respect is to promise that, for the purpose of designing nano-electromechanical systems, the model will not let you down. It will be a good enough guide to the way these pieces of silicon behave, accurate (p.92) to the degree and in the respects necessary for NEMS design. Though it contradicts reality in more ways than we could probably even count, Abraham's model is a nonfiction.

But within the simulation, we can identify components of the model that, prima facie, play their own local representational role. Each point in the QM region appears to represent an individual atom. Each point in the MD region, and each tetrahedron in the FE region, has an identifiable representational target. Some of these points, however (the ones that live in the QM/MD handshaking region), are special. These points, which represent their representational targets as silogen atoms, do not function as reliable guides to the way those atoms behave. Silogens are for tying off the energy function at the edge of the QM region. They are for getting the whole simulation to work properly, not for depicting the behavior of the particular atom being modeled. We are deliberately getting things wrong locally so that we get things right globally. The silogen atoms are fictional entities that “smooth over” the inconsistencies between the different model-building frameworks and extend their scope to domains where they would individually otherwise fail. In a very loose sense, we can think of them as being similar to the fable of the grasshopper and the ant. In the overall scheme of things, their grand purpose is to inform us about the world. But if we read off from them their prima facie representational targets, we are pointed to a particular domain of the world about which they make no promises to be reliable guides.


(1.) Good review literature on parallel multiscale simulation methods for nanomechanics appears in Abraham et al. 1998; Broughton et al. 1999; and Rudd and Broughton 2000.

(2.) Note that the essential idea behind parallel multiscale simulation is not entirely new. It underlies virtually all of the early attempts at subgrid modeling, including the one I discuss at length in chapter 6, namely, the original proposal for computing an artificial viscosity (i.e., using global information about the gradients of the velocity to compute a local quantity, the artificial viscosity). One fundamental difference between that kind of parallel multiscale modeling and the kind discussed here, however, is that artificial viscosity does not come from a basic theory. Indeed most subgrid modeling schemes have this relatively ad hoc character. But in the case discussed here, the schemes being used at the smallest levels actually come from the most fundamental physics. Thus, subgrid schemes like artificial viscosity do not really invoke relations between theories at different levels of description as these methods do.

(3.) An important exception is the recent work of Robert Batterman (2002).

(4.) I should note that multiscale modeling does not necessarily involve coupling spatially disjoint regions; it is also possible for a multiscale method to focus on one region, within which the system under study is modeled on more than one level of detail (viz., using both MD and continuum descriptions on the very same domain). There are many possible reasons for doing so—one being that a problem might call for knowing both the mean behavior (modeled adequately by a continuum approach) and the behavior of molecular fluctuations (modeled by MD) in the same region.

(5.) See Hans Vaihinger, The Philosophy of ‘As If’: A System of the Theoretical, Practical and Religious Fictions of Mankind, trans. C. K. Ogden (New York: Barnes and Noble, 1968); orig. pub. in England by Routledge and Kegan Paul, 1924. My understanding of Vaihinger comes almost entirely from Fine 1993.
