Schelling’s model of segregation has long been considered a paradigmatic example of emergent behavior, as in the model seemingly tolerant individual choices produce complete racial segregation. The Forsé and Parodi article takes issue with both the mathematical and substantive characterization of Schelling’s model. At a mathematical level, they argue that the model’s behavior is not emergent, paradoxical, non-linear or otherwise complex. Instead, they claim that the level of segregation is a linear function of individual preferences for segregation. Substantively, Forsé and Parodi argue that tolerant individual choices do not produce segregation, and therefore that racial ghettos do not “result from innocent decisions.” While the authors identify important issues, their conclusions are potentially misleading and merit fuller discussion in light of contemporary research into segregation in particular and analytical sociology and complex social system dynamics more broadly.

# Do seemingly tolerant individual choices produce racial ghettos?

Substantively, the authors are sceptical of the claim that tolerant or “innocent” decisions produce segregated ghettos, and their scepticism is easily justified on at least two counts. First, as the authors hint at several points, Schelling’s research on segregation has often been treated as an apology and justification for ongoing racism and discrimination. Second, as the authors clearly state, preferences are not the sole – or perhaps even primary – cause of urban segregation. Many people (discriminatory or not) were actively involved in developing and promulgating laws, real estate practices and bank lending policies that create and maintain neighborhood segregation. Nonetheless, it is important to note that the rightful targets of these critiques are those who misuse or misinterpret Schelling’s model for their own ends, not Schelling himself or those whose research was directly inspired by his groundbreaking work. On the first point, it is indisputable that Schelling’s model is often invoked by people who believe that the research “proves” that most racial discrimination is unintentional or benign. And unfortunately, science has a long history of providing prominent political actors with “evidence” used to justify discrimination against minorities, women and others. In contemporary times, the legacy of scientific racism and the eugenics movement lives on in books such as *The Bell Curve*. Thomas Schelling’s work is clearly not in this category, however, nor has Schelling himself become politically involved with the issue of residential segregation. While the subject matter and provocative thesis of the research do lend themselves to misuse, any concern about misuse must be balanced against the equally legitimate goal of encouraging a full and open academic debate, particularly on sensitive topics related to race and ethnicity.
Schelling’s model is not exceptional in inspiring a heated debate over whether – and to what extent – discriminatory outcomes are the result of intentionally discriminatory behavior. Asking whether it is possible for tolerant individuals to live in segregated neighborhoods is not substantially different from asking whether tolerant individuals would pay black employees less than white employees (Altonji and Blank 1999), support decreasing spending on social welfare (Kinder and Sears 1981; Sniderman and Tetlock 1986) or find racial profiling or capital punishment acceptable (Bobo and Johnson 2004). I do not doubt that Forsé and Parodi would endorse the scientific goal of open debate; but explicit discussion of their underlying concerns allows non-specialists in the field of segregation to distinguish between a legitimate scientific debate and misleading partisan advocacy. On the second point, there is also no denying the fact that popular discourse often treats Schelling’s model as “proof” that individual preferences – not institutional factors – are the primary cause of segregation. Once again, however, the popular imagination is a poor reflection of the state of scholarly research on the topic. Schelling’s work was simply a mathematical thought-experiment, and insight into mathematical dynamics necessarily starts with simple mathematical models. Research in the field has moved forward substantially over the half century since Schelling’s model was published, and I am not aware of any researchers interested in making empirical (as opposed to mathematical) claims who would argue that individual preferences are the sole cause of real world segregation. The confusion on this point is understandable, however, given the scholarly premium on simple and parsimonious models. While a laudable goal, simplicity should not be an end unto itself. 
Mathematical models only provide insight into empirical phenomena when the models themselves are appropriate to the circumstances; understanding that 2 + 2 equals 4 isn’t helpful to someone who has in his hands 3 oranges and 3 apples. In the world of non-linear dynamical systems, even straightforward abstract generalizations (*e.g.*, the sum of two large numbers is always greater than the sum of two small numbers) go out the window. The impact of model factors such as population size, network or neighborhood size and structure, etc. is contingent on the specific model parameters and initial conditions in any particular simulation. In other words, at least for the time being, insight into empirical phenomena that are generated by complex systems requires realistic (and typically more complex) mathematical models. Thankfully, the dynamics of segregation processes are well enough understood for researchers to build more complex and realistic models of residential segregation; models that often go beyond individual preferences to include political, economic and social institutions that contribute to segregation. Segregation research is one of relatively few areas of social science to incorporate empirical evidence on both individual preferences and other factors into mathematical simulations, a technique commonly used by natural scientists and engineers to learn more about complex systems. Careful analysis of evidence-based mathematical models of segregation (*e.g.*, Fossett 2006; Bruch and Mare 2006; Clark 1991) is currently our best hope for sorting out the relative impact of the various contributing factors. Unfortunately, better models are still unlikely to fully resolve the substantive debate over whether or not segregation is the product of racism.

# Does the model produce complex or “paradoxical” segregation dynamics?

Most of the article is devoted to a thorough discussion of the mathematical dynamics of Schelling’s segregation model, with the ample use of simulations to illustrate two primary criticisms. First, the authors argue that the discrete nature of the social interactions and preference thresholds means that the “effective” desire to live near co-ethnics is much higher than the stated numerical threshold. Second, the authors claim that once the effective thresholds are taken into account, the resulting mathematical model exhibits no “paradoxical” behavior, or spontaneously produced levels of segregation above and beyond what might be expected. While both claims highlight important and misunderstood aspects of agent-based models (ABM), the authors have again highlighted areas where the popular understanding of ABM is at odds with the view of Schelling and most other researchers in segregation and complex systems more broadly. Schelling originally proposed not one but two models of segregation – the visually entertaining checkerboard (or spatial proximity) model illustrated by moving coins on a checkerboard, and the less entertaining but analytically tractable model in which city residents move into or out of a single neighborhood instead of from square to square. While lacking in visual props, the dynamics of the “neighborhood” model are easier to grasp intuitively – and therefore provide a useful start to understanding segregation dynamics. To simplify the transition from the neighborhood to the checkerboard model, we will consider a slightly modified neighborhood model. All residents live in one of two neighborhoods (*i.e.*, neighborhood *A* or neighborhood *B*), with residents initially assigned by random draw to live in neighborhood *A* or *B*. City residents decide whether or not to move from neighborhood *A* to *B* (or *vice versa*) using threshold rules that are conditional on the decisions (and race) of those around them.
For example, with a threshold of 30%, an individual will move unless at least 30% of neighborhood residents are of the same race or ethnicity. At the limit (*i.e.*, as city population grows infinitely large), we expect high levels of segregation when residential preferences for one’s own race exceed that race’s proportion of the population. Suppose all residents (regardless of race) want at least 30% of their neighbors to be of the same race. If the minority race is only 20% of the population, they will move to another neighborhood to be nearer to their own kind. Once one neighborhood reaches 30% minority, all minorities will move to that neighborhood, ultimately pushing out majority race members. Using a similar logic, segregation becomes more likely as the sum of the thresholds of minority and majority group members approaches 100%, and will always result when the sum of thresholds exceeds 100% (as in Schelling’s original model). On the other hand, we do not expect a “tip” towards segregation when the relative size of the minority population exceeds the minority preference threshold. If preferences remain the same (30% of same race neighbors), but minorities are now 40% of the population, the city will remain integrated most of the time. The only exception is if an extremely unbalanced initial assignment of city residents to neighborhoods (*i.e.*, substantially more than half of the minority group is assigned to one of the two neighborhoods) sets off a cascade of movers, resulting in segregation. As the city population grows larger, such unbalanced initial assignments grow increasingly less common, while unbalanced assignments grow more likely as a function of the number of neighborhoods. In summary, the modified neighborhood model will always “tip” to complete segregation when the thresholds of both groups exceed 50%, and will often tip to complete segregation even when thresholds are much lower unless the minority population is relatively large.
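The two-neighborhood dynamics described above can be sketched in a few lines of Python. This is an illustrative sketch, not Schelling's or the authors' implementation, and it rests on several assumptions that are not in the text: residents are updated one at a time in random order, a dissatisfied resident moves only if the other neighborhood would offer a higher own-group share, both groups share one threshold, and segregation is summarized with the dissimilarity index. The names `simulate` and `dissimilarity` are hypothetical.

```python
import random

def simulate(n_residents, minority_share, threshold, seed=0, max_sweeps=1000):
    """Two-neighborhood threshold model; returns counts[neighborhood][group]."""
    rng = random.Random(seed)
    n_minority = round(n_residents * minority_share)
    groups = [1] * n_minority + [0] * (n_residents - n_minority)  # 1 = minority
    homes = [rng.randrange(2) for _ in groups]  # random initial assignment to A (0) or B (1)
    counts = [[0, 0], [0, 0]]
    for g, h in zip(groups, homes):
        counts[h][g] += 1
    order = list(range(n_residents))
    for _ in range(max_sweeps):
        rng.shuffle(order)
        moved = False
        for i in order:
            g, here = groups[i], homes[i]
            there = 1 - here
            share_here = counts[here][g] / sum(counts[here])
            # Own-group share in the other neighborhood if this resident joined it.
            share_there = (counts[there][g] + 1) / (sum(counts[there]) + 1)
            # Assumption: move only when dissatisfied AND the move improves matters.
            if share_here < threshold and share_there > share_here:
                counts[here][g] -= 1
                counts[there][g] += 1
                homes[i] = there
                moved = True
        if not moved:  # nobody wants to move: a resting state has been reached
            break
    return counts

def dissimilarity(counts):
    """Dissimilarity index: 0 = evenly mixed, 1 = completely separated."""
    total = [counts[0][g] + counts[1][g] for g in (0, 1)]
    return 0.5 * sum(abs(counts[h][0] / total[0] - counts[h][1] / total[1])
                     for h in (0, 1))

# Thresholds above 50% for both groups, equal group sizes.
tipped = simulate(200, 0.5, 0.6, seed=1)
# 30% thresholds with a 40% minority.
stable = simulate(1000, 0.4, 0.3, seed=1)
print(dissimilarity(tipped), dissimilarity(stable))
```

Under these assumptions, the first run should end with the two groups fully separated, while the second should see few or no moves, because almost any random initial assignment already satisfies a 30% threshold when the minority is 40% of the population.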
These results nonetheless understate the natural momentum of the system towards segregation, as this very simple version of the neighborhood model is deterministic (*i.e.*, it does not allow for noise, error, or random moves unrelated to racial preferences). In a model with random moves, integrated neighborhoods are no longer evolutionarily or stochastically stable regardless of the racial preference thresholds (Zhang 2004b). Using the same logic as above, random moves will (purely by chance) after some arbitrarily long period of time accumulate into unbalanced racial proportions that will “tip” the neighborhoods into segregation. Thus, while neighborhoods may remain integrated for long periods of time, complete or near complete segregation is the only stable equilibrium (“basin of attraction”) and neighborhoods will always reach a segregated state from which there is no recovery. While segregation is the almost inevitable outcome of most versions of the neighborhood model, do the same principles apply to the Schelling checkerboard model? The checkerboard model is more difficult to analyze because actors move around a lattice with multiple residential sites rather than merely moving from neighborhood *A* to neighborhood *B*. Nonetheless, existing results – several of which have attempted to use analytical strategies borrowed from evolutionary economics or statistical physics – confirm that checkerboard models exhibit similar patterns to those observed in the neighborhood model (Zhang 2004b; Gauvin, Vannimenus, and Nadal 2009). In particular, while some versions of the checkerboard model do produce persistent integration, near complete segregation is the checkerboard model’s only stochastically stable equilibrium outcome or stable basin of attraction (Zhang 2004a). Why did the current article come to a conclusion opposite to that of the bulk of the existing literature?
I suspect that the primary source of confusion is related to the feature of the checkerboard model that the authors (rightly) highlight as a potential concern: the combination of deterministic threshold rules with discrete choices and discrete spatial arrangements. As the authors correctly note, and as Schelling himself acknowledged, effective preferences for segregation are often considerably higher than the static threshold rules set in the model. For example, they calculate that a 33% threshold was effectively translated into a “move” decision only at around 50% own race neighbors. They attribute this gap between model parameters and effective move rates to the discrete nature of the model: city residents have only a binary choice (move or don’t move) and have only 8 neighbors on whom they base the move decision (and often far fewer). Discrete choices, fixed threshold rules, and other discrete model elements are known to alter or even distort the relationship between simulation outcomes and the basic dynamic of conditional decision models. Discrete elements introduce resistance to state-change and many runs of the simulation will get “stuck” in different, suboptimal states. Several prominent models of culture have been heavily criticized for using fixed threshold rules that produce knife-edge results (Klemm *et al.* 2003; Miguel *et al.* 2005), while threshold simulations run in grid-based Moore neighborhoods (those used in the checkerboard model) put a significant damper on the contagiousness of social behavior when compared to other interaction spaces such as random networks, biased random networks or global knowledge (M. Rolfe 2004). The solution in such cases is not to abandon models with discrete choices or discrete interaction spaces; we live in a bumpy, sticky, clumpy world of discrete people and binary choices.
Instead, the solution is to introduce stochastic noise and/or individual heterogeneity into the model unless there is a good justification for not doing so, and to use extreme caution in making claims about real world phenomena unless discrete model parameters are based on empirical data. What happens to the checkerboard model if noise is introduced, either *via* the explicit introduction of random moves or through the use of probabilistic decision rules as opposed to fixed threshold rules? As might be expected on the basis of the earlier discussion, complete or near-complete segregation emerges far more often than in the deterministic model (Vinković and Kirman 2006). Similar outcomes are produced when individuals have different thresholds (even when those thresholds are fixed), as individual heterogeneity will also greatly increase the degree of segregation produced by the checkerboard model (Bruch and Mare 2006). In this situation, a few initial moves undertaken by prejudiced or intolerant residents may set off a cascade of moves amongst moderate to high tolerance residents. In other words, the authors are correct to point out that the specific version of the checkerboard model that they analyze, with 50% minority population and deterministic fixed thresholds, only rarely “tips” to a segregated state. This result is very much in keeping with what we know about the dynamics of segregation and the impact of fixed thresholds and Moore interaction spaces on conditional decision models. However, the relative lack of segregation in this particular version of the model does not mean that there are never behavioral cascades in a particular “run” of the simulation – only that such cascades will happen more rarely and that a single move will have limited impact, triggering smaller cascades that are more likely to get “stuck” along the way.
Rare and self-limiting “tipping” is not a robust feature of the checkerboard model, however, as introducing stochastic noise or variation into the model greatly increases the occurrence of segregated ghettos. Let us now look at the authors’ final set of mathematical claims. They are that the checkerboard model: 1) does not produce paradoxical segregation above what would be expected by chance, 2) exhibits a linear link between intolerance and segregation, and 3) does not produce results that are dependent on initial conditions. The implication appears to be that the Schelling model does not exhibit three primary characteristics of non-linear complex adaptive systems: 1) emergence, 2) non-linear dynamics, and 3) dependence on initial conditions. Given the frequent references to physics throughout the article, however, I assume that the article’s goal on this point is to highlight popular misunderstandings regarding agent-based modelling and complex adaptive systems, not to undermine the insights produced by a vibrant field of research. The Schelling model is not magical, but it does produce emergent, city-level behavior; otherwise we wouldn’t be able to talk about “segregation” and “ghettos” as if they were properties of cities. Many complex systems concepts (*e.g.*, emergence, chaos, “butterfly effect”) and methods (*e.g.*, agent-based modelling) are treated as if they were magical. However, this popular misunderstanding has more to do with a general lack of familiarity with complex systems mathematics than with the intrinsic properties of complex systems, as we all deal intuitively with complex systems every day. Mathematically speaking, complex systems are made up of heterogeneous interacting components (*e.g.*, cells, city residents), and thus follow standard mathematical rules even if they are often difficult to analyze using traditional or easily available methods.
There is also nothing magical about agent-based modelling; it is simply a method for better understanding mathematical systems that are generally too complex (often because of the discrete elements involved) to analyze using most available methods. The oft-repeated maxim to “keep it simple” should be taken with a large grain of salt, as any mathematical system that requires agent-based modelling is necessarily more complex than analytically tractable models with a far greater number of model parameters. However, even complex discrete-choice systems are increasingly possible to analyze using newer methods, and the introduction of empirical data to parameterize discrete model elements is a strongly recommended practice for empirical applications. Anyone who is familiar with the weather announcer’s prediction of a 40% chance of rain – or statistics more generally – is also familiar with an important aspect of complex systems: dependence on initial conditions. Any “run” or history of a complex system – even one which is completely deterministic in that future states of the system depend exclusively on prior states of the system – can be highly dependent on initial conditions. This dependence on initial conditions is clearly visible in the Schelling model, and can be identified by comparing the very different results obtained for the “almost identical” starting points, or by considering the range of segregation (0-80%) produced by otherwise similar populations with 40% own-race preferences. While the authors are correct to point out that the range of simulated outcomes is (almost) normally distributed, it is still true that the actual level of segregation (let alone the precise arrangement of residents) is predictable only with precise knowledge of the initial conditions of a particular simulation run; it only takes one or two people to be in a different place initially to produce a much different outcome.
Non-linearity is another common feature of complex systems, and the Schelling checkerboard model is decidedly non-linear. Non-linearity is unavoidable in discrete choice mathematical systems, simply because the system does not respond in a continuous manner to changes in the environment. Thus, a white city resident with a threshold of 50% and 2 same race neighbors may have two black neighbors move in and remain in the same neighborhood, and then move after a third black neighbor arrives. The move decision is not continuous (*e.g.*, the resident cannot move 33% of the time after the first black neighbor, 66% of the time after the second), and thus the model is inherently a non-linear system. The authors, however, do not claim that the system itself is linear, but rather that the model produces results that appear to be linearly dependent on the parameters. Appearances can be misleading, and they certainly are in this case. Before explaining the optical illusion in greater detail, it is useful to reflect for a moment on whether or not we would expect an “S” shape plot or “diffusion” curve in the results. To this end, let’s consider what the Schelling neighborhood model would lead us to expect in this particular model (deterministic choices; minorities are 50% of the population; small city size). We expect: 1) near chance-based levels of segregation (*S* of 0.15-0.17) at thresholds below 50%, 2) complete segregation above 50% thresholds, and 3) sporadic “tips” into segregation caused by unbalanced initial distributions that might be experienced even with quite low thresholds and grow relatively frequent as thresholds approach 50%. Thus, in the neighborhood model, we would expect to see an almost complete discontinuity at the 50% threshold point in very large cities – with low levels of segregation below the discontinuity and higher levels above it. In other words, behavioral contagion or diffusion in the neighborhood model does not produce an *S*-shaped curve.
In the checkerboard model, however, we will have micro-tips or diffusion processes that are spatially restricted, and these spatially restricted contagion processes are known for producing diffusion curves in the familiar *S*-shape. Therefore, we expect a pattern of outcomes similar to that in the neighborhood model: an “S” shape with a wide spread of outcomes near the center (at a 50% threshold), because the discrete spatial arrangement of the city residents puts a damper on the “tipping” dynamic and makes it more likely that simulations will stabilize with segregated pockets rather than complete segregation. Now, let’s compare our expectations to the authors’ simulation results, shown in two scatterplots (Figures 2 and 3). Before we can compare our expectations to the plot, it is important to take note of two graph features. The first feature is that simulated segregation levels are mapped against *effective* thresholds, not the actual thresholds used in the simulation. However, while this graphing choice is likely to smooth over the results and disguise some of the inherent bumpiness of discrete systems, it won’t affect our ability to compare the results of the checkerboard model to our expectations of the neighborhood model, except to make clearer the mathematical similarity between the two systems. The second graph feature is a more significant contributor to the optical illusion, as the *x* and *y* axes of the graph are not on the same scale. A linear correspondence between preferences for same race neighbors and city-level segregation would be represented by a diagonal (45°) line, where 20% thresholds resulted in a 20% segregation rate, and 80% preference thresholds resulted in 80% segregation. However, while the best line through the scatterplot may appear to lie along the diagonal, this is only because of the truncated *x*-axis.
Correcting for that, it is easy to see that segregation is on average lower than expected under a 50% threshold, and higher than expected above a 50% threshold – just as we would expect on the basis of the simplified Schelling neighborhood model. Moreover, if the range of simulations included were expanded to look at thresholds below 20% and above 70%, the expected shape of the plot would be even more evident than it is in Figure 3. The expected pattern is easier to see in Figure 3 than in Figure 2 because, as the authors correctly point out, the starting conditions of Schelling’s visual “checkerboard” were quite exceptional in promoting relatively high levels of segregation even at low thresholds.
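Two points from this section lend themselves to a quick arithmetic check: the all-or-nothing move decision of a single resident, and the way a small, discrete neighborhood rounds a stated threshold upward. The Python sketch below is illustrative only. The function names `would_move` and `effective_threshold` are hypothetical, the rule that a resident with no neighbors stays put is an assumption, and the rounding shown is just one contributor to the gap the authors report, not a reproduction of their calculation of roughly 50%.

```python
from fractions import Fraction
import math

def would_move(same: int, other: int, threshold: Fraction) -> bool:
    """Binary rule: move when the same-race share of neighbors is below the threshold."""
    total = same + other
    if total == 0:
        return False  # assumption: a resident with no neighbors stays put
    return Fraction(same, total) < threshold

def effective_threshold(threshold: Fraction, n_neighbors: int) -> Fraction:
    """Smallest attainable share k/n of same-race neighbors that meets the threshold."""
    k = math.ceil(threshold * n_neighbors)  # fewest same-race neighbors needed
    return Fraction(k, n_neighbors)

# A resident with a 50% threshold and 2 same-race neighbors,
# as 1, 2 and then 3 other-race neighbors arrive.
half = Fraction(1, 2)
decisions = [would_move(2, other, half) for other in (1, 2, 3)]
print(decisions)  # [False, False, True]

# A stated 33% threshold, evaluated over neighborhoods of different sizes.
stated = Fraction(33, 100)
for n in (8, 5, 3, 2):
    print(n, effective_threshold(stated, n))
```

The decision flips from "stay" to "move" in a single step rather than changing gradually, and the effective threshold is never below the stated one: with 8 neighbors a 33% rule requires 3 of 8 (37.5%), and with only 2 neighbors it requires 1 of 2 (50%).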

# Conclusion

Given the dynamics of segregation, the authors are absolutely correct on one key point: a preference for 50% of same race neighbors is in many cases mathematically indistinguishable from a desire for complete segregation. Does that mean that Schelling was off-base in claiming that even moderately tolerant decisions could produce segregation? Not really, if a preference for a 50-50 neighborhood is compared to a preference for strict exclusion (Sakoda 1971), or to 1950s America, when liberal Stanford professors limited blacks to only 10% of the neighborhood (Arrow 1998). By contemporary norms, it would probably be more misleading not to differentiate between those who refuse to let blacks live in their neighborhoods and those who would prefer a 50-50 or even 70-30 racial makeup to being in the minority. And even in the mathematical sense, there is a significant difference in the levels of segregation produced by 50% versus 90% thresholds in models that allow for discrete space (as in the checkerboard model) and/or individual heterogeneity. The authors have also highlighted the fact that emergence is neither magical nor impossible to intuitively grasp. In claiming that it is obvious that any system where everyone wanted to be in the majority would produce segregation, the authors have intuitively identified a primary emergent, non-linear feature of the Schelling model. However, for most people unfamiliar with complex systems dynamics, it is an unfortunate paradox that even people who prefer to live in an integrated neighborhood with a 50/50 racial mix will end up living in segregated ghettos.