In the 1970s, Thomas C. Schelling (1971, 1978) proposed a model to explain the link between individual preferences about one’s neighbours and spatial segregation. The model has the advantage of being quite simple: all that is required to implement it is a classification of a population into two distinct groups and some information about the proportion of individuals from the other group that individuals are willing to accept in their local environment. We can imagine a large number of applications (Ruoff & Schneider, 2006), but it is undoubtedly its application to the urban environment which is responsible for the model’s fame. One has to admit that its principal conclusion is striking. On the basis of simulations, Schelling showed that a very high degree of segregation can be the collective outcome of individual decisions which in themselves do not aim to achieve that segregation. There appears to be a “perverse” or system effect: even if individuals are quite tolerant about the makeup of their neighbourhood, a high degree of urban segregation occurs independently of other parameters such as the price of housing or income levels.
In the first version of the model published in 1971, Schelling was fully aware that this segregationist dynamic was not the only explanation. Segregation is also, and perhaps primarily, explained by action that favours it. Such action may be organized and intentional, legal or illegal, coercive or more indirect, subtle or blatant, open or concealed, etc. Segregation is also the outcome of economic and social factors. For example, it is well known that the wealthiest individuals do not live in the same districts as the poorest, or that the most educated individuals do not mix with the least educated. Schelling was nevertheless convinced that segregation must still be explained by a less visible mechanism which he compared to an evolutionary process or the operation of an “unseen hand” (Schelling, 1971: 146). In this regard, he was careful to distinguish the dynamic aspect of his model, which is its original feature, from the static factors that cannot be ignored. Schelling thus quite rightly stressed that if everyone wants his or her group to represent the majority in his or her neighbourhood, the only possible overall solution is complete segregation. More generally still, he stressed that collective demands with regard to one’s neighbours can generate structural constraints which make mixtures highly unlikely or even impossible [1]. However, even though these static constraints obviously play a role, they are not Schelling’s focus: “The simple mathematics of ratios and mixtures tells us something about what outcomes are logically possible, but tells us little about the behavior that leads to, or that leads away from, particular outcomes” (Schelling, 1971: 147). Schelling is thus principally concerned with the dynamic at work. This paper will, on the contrary, examine these static constraints in order to assess their impact.
Although Schelling proposed several dynamic models, the two-dimensional spatial model, in the form in which it was presented in 1978 and which we shall consider here, remains the best known. If we conduct the simulation that Schelling proposes, the result he claims is undeniably produced. However, if we examine it closely, we are forced to conclude that the usual interpretation of this result is distorted by the failure to take proper account of certain structural aspects. Even if, as we stated above, Schelling mentioned structural constraints, the role they play in the final explanation has never been evaluated. And yet the fact that the individual preferences Schelling considers make mixtures unlikely or almost impossible irrespective of the dynamic should lead us to ask very direct questions about the generality of the model.
Nonetheless, this is not the only concern. Schelling takes the view that the only possible explanation for the results of his model is that an evolutionary process is taking place. He does not merely state that individual actions may, by aggregation, produce an undesired collective effect; he goes further, claiming that the undesired effect is additional segregation. This conclusion should, however, give rise to doubt when one considers what we know about the laws of chance, and in particular the principle of entropy, which leads one to think that order (or what amounts to the same thing here, segregation) is not spontaneously produced over and above the conscious efforts made to create some sort of order. Intentional actions create order. The problem is to know whether, beyond this, the unintentional consequences create on average more order than disorder. If they did, there would indeed be a kind of system effect. The most probable dynamic is, on the contrary, that which spontaneously produces disorder or, in this case, a mixture associated with a low level of segregation (which will be computed later in this paper). This raises a question: what is the generality of Schelling’s model? For we cannot simply state that there is a general evolutionary process which creates slightly more segregation than individuals desire. There must be implicit hypotheses which explain this “evolution”. In other words, the general model, which stipulates that the most probable undesired effect is disorder or mixture, prompts us to ask which hypotheses explain Schelling’s strange “paradox”.
To answer this question, after a description of the model and how it relates to the principle of entropy, the paper will examine the levels of spatial segregation which would result from chance alone. Although a benchmark of this type has never been produced, it is obviously indispensable in order to ascertain whether a given level of segregation is high or low. It will allow us to identify all deviations that result from the process implemented by Schelling, thereby revealing the role played by the structural constraints or the set of hypotheses on which Schelling’s model is based, and to show that they fully explain (in other words, construct, from start to finish) the levels of segregation that are achieved. Ultimately, it will be seen that segregation is the result of intolerance but that we cannot say – and this is the principal concern of this paper – that a low level of individual intolerance leads to a high level of collective segregation. Finally, we shall examine some of the many more recent proposals that have been made for extending the model, but only in order to discuss this relationship between low intolerance and segregation [2].
1 – Schelling’s model
Let us begin by restating the form the model took in Schelling’s most recent version, which dates from 1978. Let us take a checkerboard (an 8 × 8 matrix) and place black and white pawns on it. We have here a very stylized representation of an urban area in which two groups live, for example Catholics and Protestants, rich and poor, or any other division. To begin with, the black and white pawns are arranged alternately in order to represent the best possible integration between the two communities.
Schelling then adds a rule for the displacement of pawns: although the individuals are tolerant, they will move if one third or less of their immediate neighbours are of like colour. Conversely, they will be contented and will not move if strictly more than one third of their immediate neighbours are of like colour. This rule of one third may appear to be quite flexible. These individuals cannot be said to be seeking segregation; it is only when each individual wishes to be in the majority that they are really aiming for segregation. Below this level, some minorities may accept each other and a mixture between the two communities remains possible.
In an alternate starting situation (with empty corners), all the individuals are satisfied with their neighbours. Schelling then introduces a minor disturbance. He begins by removing 10 white pawns and 10 black pawns at random, then he selects 5 white or black pawns at random and places these randomly on the board (after which there will be 45 pawns and 19 empty spaces). This disturbance is usually sufficient to create a degree of discontent, and the pawns involved will then be moved randomly towards the blank spaces until a new equilibrium is reached in which everybody is contented with regard to the rule of one third. Under these conditions, the result is surprising: black neighbourhoods and white neighbourhoods come into being without anybody really wishing it, or at least that is what appears to happen. As we can see in Figure 1, there is a high degree of segregation once this process of migration has finished. Moreover, this outcome is not a marginal result. As Schelling states, if one plays this game several times a similar result will usually be obtained. Actually, as we shall see below, if several thousand simulations are performed a non-negligible standard deviation is observed around the mean segregation. But the mean segregation closely matches the outcome described by Schelling.
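The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Schelling's own procedure: the function names are ours, an isolated pawn is assumed to be content, the five added pawns may land on any empty cell (corners included), and discontented pawns are moved to a randomly chosen empty cell.

```python
import random

SIZE = 8  # 8 x 8 checkerboard, as in the text


def neighbours(board, r, c):
    """Colours of the pawns on the (at most 8) cells adjacent to (r, c)."""
    return [board[rr][cc]
            for rr in range(max(0, r - 1), min(SIZE, r + 2))
            for cc in range(max(0, c - 1), min(SIZE, c + 2))
            if (rr, cc) != (r, c) and board[rr][cc] is not None]


def is_content(board, r, c):
    """Rule of one third: strictly more than 1/3 of neighbours are alike."""
    near = neighbours(board, r, c)
    if not near:  # assumption: an isolated pawn does not move
        return True
    return 3 * near.count(board[r][c]) > len(near)


def discontented(board):
    """Positions of all pawns that violate the rule of one third."""
    return [(r, c) for r in range(SIZE) for c in range(SIZE)
            if board[r][c] is not None and not is_content(board, r, c)]


def alternate_board():
    """Perfect alternation of black and white pawns, corners left empty."""
    corners = {(0, 0), (0, SIZE - 1), (SIZE - 1, 0), (SIZE - 1, SIZE - 1)}
    return [[None if (r, c) in corners else "BW"[(r + c) % 2]
             for c in range(SIZE)] for r in range(SIZE)]


def schelling_start(rng):
    """Remove 10 pawns of each colour, then add 5 pawns of random colour."""
    board = alternate_board()
    for colour in "BW":
        cells = [(r, c) for r in range(SIZE) for c in range(SIZE)
                 if board[r][c] == colour]
        for r, c in rng.sample(cells, 10):
            board[r][c] = None
    empty = [(r, c) for r in range(SIZE) for c in range(SIZE)
             if board[r][c] is None]
    for r, c in rng.sample(empty, 5):
        board[r][c] = rng.choice("BW")
    return board


def play(board, rng, max_moves=10_000):
    """Move discontented pawns at random until all are content.

    Returns True if equilibrium was reached within max_moves.
    """
    for _ in range(max_moves):
        unhappy = discontented(board)
        if not unhappy:
            return True
        empty = [(r, c) for r in range(SIZE) for c in range(SIZE)
                 if board[r][c] is None]
        r, c = rng.choice(unhappy)
        nr, nc = rng.choice(empty)
        board[nr][nc], board[r][c] = board[r][c], None
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    board = schelling_start(rng)
    print("equilibrium reached:", play(board, rng))
```

The `max_moves` cap is a safeguard of the sketch: it stops the loop in the rare cases where no equilibrium is found.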
What is so striking in this model is that everybody knows that order does not appear spontaneously. If all the white pawns are placed on one side of the checkerboard and all the black ones on the other and the board is then agitated, a miracle has taken place if they are not mixed up. In a system like Schelling’s, the most likely situation is that of disorder or an increase in entropy, i.e. a degree of integration between the two communities. In fact, in probability theory (whether it is applied to thermodynamics as by Boltzmann or to communication as by Shannon), a system that is subject to random agitation can reach any logically conceivable macrostate, but some are much more probable as many more paths lead to them (i.e. there are many more microstates which correspond to them), and sometimes so many more that the other macrostates only exist as theoretical possibilities. It is this tendency to move towards the most probable situation that Schelling’s model defies, and therefore, before we decide that a paradox is involved, we must consider whether the outcome is not the expression of a set of implicit hypotheses which give the appearance of a surprising effect to what is only a logical result.
Figure 1 – An example of the movement towards segregation
2 – The statistical estimation of social segregation with reference to individual tolerance
To this end, the first question that has to be asked is: what is the probable segregation that results from a given constraint (i.e. a level of individual tolerance)? As tolerance lessens, there are fewer and fewer possible geographical arrangements, and situations with high segregation become more and more probable. It is therefore necessary to estimate the probable level of segregation which one can expect, in order to avoid confusing this probability with a paradoxical effect. To perform an estimate of this type, it is necessary to possess an indicator of the degree of segregation in a given distribution. In this connection, it has now been clearly established that entropy, as defined by Shannon (1948) in the context of information theory (based of course on the definition given by Boltzmann in thermodynamics), measures the degree of disorder in a system in an accurate and reliable manner (Kapur & Kesavan, 1992). It is written as follows:
where N_{tot} is the number of pawns on the board which have neighbours (i.e. that are not isolated), N_{ij} is the number of pawns in the neighbourhood of pawn [ij] (including itself) and C_{ij} is the number of pawns having the same colour as pawn [ij] (again including itself). This entropy H is divided in the above formula by its maximum value and therefore varies between 0 (total segregation) and 1 (perfect mixture). An indicator of segregation (S) is obtained simply by subtracting H from 1:
S = 1 − H
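The definitions given for N_{tot}, N_{ij} and C_{ij}, together with the normalization by ln(2), are consistent with a binary Shannon entropy computed in each pawn's neighbourhood and averaged over the non-isolated pawns. As a sketch under that assumption (the notation p_{ij} and the convention 0 ln 0 = 0 are ours), the normalized entropy can be written:

```latex
H = -\frac{1}{N_{tot}\,\ln 2}
    \sum_{[ij]} \Big[\, p_{ij}\ln p_{ij} + (1 - p_{ij})\ln(1 - p_{ij}) \Big],
\qquad p_{ij} = \frac{C_{ij}}{N_{ij}}
```

With this form, a neighbourhood made up of a single colour contributes 0 to H, and a neighbourhood split exactly 50% – 50% contributes 1/N_{tot}.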
This index of segregation S varies from 0 in the case of perfect mixture to 1 in the case of total segregation. For the sake of accuracy, another index, denoted by S’, has also been calculated, which takes account of the fact that in a neighbourhood which contains, for example, 5 individuals, it is obviously impossible to achieve the situation of perfect mixture (2.5 black pawns and 2.5 white pawns). To overcome this problem it is necessary to modify the standardization depending on whether the number of individuals living in a neighbourhood is even or odd. S’ is then computed by replacing, in the formula which gives S, the factor ln(2) in the denominator – which corresponds in fact to an ln(1/2) and standardizes entropy to a value of 1 when the mixture is exactly 50% – 50% – by a variable factor which depends on the number of residents in the neighbourhood, but standardizes to a value of 1 in the situation which is as near as possible to the perfect mixture (for example the situation where there are 2 white pawns and 3 black pawns or the opposite, in neighbourhoods with 5 residents). This needed to be verified, but it will be shown later in this paper that the difference between S and S’ has a negligible influence.
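As an illustration, the index S can be computed for any board with a short function. The sketch below assumes that H is the binary Shannon entropy of each pawn's neighbourhood (the pawn itself included), averaged over the non-isolated pawns and divided by ln(2); the function name and the board encoding ("B", "W" or None in a square list of lists) are ours. It does not compute S’.

```python
import math


def segregation_index(board):
    """Return S = 1 - H for a square board of 'B', 'W' or None cells."""
    size = len(board)
    total_entropy = 0.0
    n_tot = 0  # pawns that have at least one neighbour
    for r in range(size):
        for c in range(size):
            colour = board[r][c]
            if colour is None:
                continue
            # Pawns in the 3 x 3 block around (r, c), the pawn itself included.
            block = [board[rr][cc]
                     for rr in range(max(0, r - 1), min(size, r + 2))
                     for cc in range(max(0, c - 1), min(size, c + 2))
                     if board[rr][cc] is not None]
            if len(block) == 1:  # isolated pawn: not counted in N_tot
                continue
            n_tot += 1
            p = block.count(colour) / len(block)  # C_ij / N_ij
            if 0 < p < 1:  # convention: 0 * ln(0) = 0
                total_entropy -= p * math.log(p) + (1 - p) * math.log(1 - p)
    if n_tot == 0:
        raise ValueError("no pawn has a neighbour")
    return 1 - total_entropy / (n_tot * math.log(2))
```

For example, a board filled with a single colour gives S = 1, whereas two adjacent pawns of different colours alone on a board give S = 0.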
For the sake of clarity, we will not make a detailed comparison with other possible indices of segregation. They do not possess the fifteen desirable properties of the Shannon index (S) for accurately measuring a system’s level of disorder (Kapur & Kesavan, 1992: 27–35). They are nevertheless correlated with S. For example, the index of dissimilarity D, which is frequently used in studies of urban segregation (Fossett, 2006: 202), overestimates segregation when individual intolerance is low, but this overestimation becomes less marked as intolerance increases and almost disappears when it becomes very high. In fact, the relation between S and D is approximately D = 0.8S + 0.2 and, ultimately, the coefficient of correlation between the two is equal to 99.6% for all the values of segregation that can be calculated for all the possible intolerance levels.
The problem now is how to estimate the statistically likely level of segregation that will result from a given constraint. An analytical solution is difficult to obtain here, as has been shown in both physics and mathematics. In physics, models known as Ising spin models are used to investigate phase changes in a magnetic system where there are local interactions between spins, which makes such models partially similar to Schelling’s [3]. However, at present, there is no general analytical solution to these models, even in some relatively simple cases. In mathematics, Markov random field models, which are also similar to Schelling’s model, do not afford analytical solutions either. One of the most fruitful approaches uses the Monte Carlo method, which is based on repeated computer simulations.
This method involves performing millions of simulations on Schelling’s checkerboard in order to derive a statistic and then examining the characteristics of this statistic. The objective is to produce the segregation statistic S or S’ as a function of the individual tolerance threshold. This expresses the chances of randomly obtaining a checkerboard in which all individuals are satisfied with their neighbours at a given threshold. Table 1 gives the result for 100 million randomly selected checkerboards for which Schelling’s conditions apply (20 black pawns, 20 white pawns, and 5 pawns randomly distributed between the two colours; the four corners are not necessarily empty).
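A minimal sketch of such a Monte Carlo tabulation is given below. It draws random checkerboards under the conditions just stated and records, for each one, the lowest share of like neighbours observed among its pawns. The function names, the use of exact fractions and the treatment of isolated pawns (ignored) are assumptions of this sketch, and the number of boards is kept far below 100 million so that it runs in a moment.

```python
import random
from collections import Counter
from fractions import Fraction

SIZE = 8


def random_board(rng):
    """20 black, 20 white and 5 pawns of random colour on random cells."""
    colours = ["B"] * 20 + ["W"] * 20 + [rng.choice("BW") for _ in range(5)]
    cells = rng.sample(range(SIZE * SIZE), len(colours))
    board = [[None] * SIZE for _ in range(SIZE)]
    for cell, colour in zip(cells, colours):
        board[cell // SIZE][cell % SIZE] = colour
    return board


def min_like_share(board):
    """Lowest share of like neighbours over all pawns that have neighbours."""
    worst = Fraction(1)
    for r in range(SIZE):
        for c in range(SIZE):
            colour = board[r][c]
            if colour is None:
                continue
            near = [board[rr][cc]
                    for rr in range(max(0, r - 1), min(SIZE, r + 2))
                    for cc in range(max(0, c - 1), min(SIZE, c + 2))
                    if (rr, cc) != (r, c) and board[rr][cc] is not None]
            if near:  # isolated pawns are ignored (assumption)
                worst = min(worst, Fraction(near.count(colour), len(near)))
    return worst


def tabulate(n_boards, seed=0):
    """Count random boards by the minimum demand they happen to satisfy."""
    rng = random.Random(seed)
    return Counter(min_like_share(random_board(rng)) for _ in range(n_boards))


if __name__ == "__main__":
    for share, count in sorted(tabulate(10_000).items()):
        print(share, count)
```

Pairing each board with its segregation index would give the distribution of S for each minimum demand level.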
Table 1 – The statistic for 100 million randomly selected checkerboards
The first thing to emerge from these simulations is that it is almost impossible to obtain randomly a checkerboard in which every individual’s level of satisfaction is strictly greater than one third (i.e. the threshold selected by Schelling). Out of the 100 million checkerboards that were generated, none were obtained. Even after generating 200 million checkerboards, still none were obtained. Unsurprisingly, the second law of thermodynamics has been confirmed! The most probable situations correspond to high entropies, that is to say checkerboards where segregation is the lowest. In 97% of the checkerboards, at least one individual has no like neighbour and the average value of segregation S is 0.15. The next most likely case is that where the least satisfied individual has only one like individual out of six neighbours. In this case, segregation rises to 0.16. And so on.
If we now imagine a Schelling checkerboard in which the pawns move in a random manner, whether they are satisfied or not, with no premature cessation of the process, we can deduce the average temporal segregation (which is also the average value of the previous statistic – the phase space statistic – as a consequence of the ergodic principle). In the same way it is possible to find out the scale of the variations. The continual mixing will in all probability maintain the minimum demand level which is observed within the checkerboard at zero, but occasionally some variations will occur. With one different checkerboard generated every second, the observed minimum demand will increase to 1/3 approximately once every 83,752 seconds, that is to say about once a day. However, there is little chance that, even after 3 years, a checkerboard in which a minimum demand strictly greater than 1/3 is met will appear.
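These orders of magnitude can be checked with a line of arithmetic, assuming a 365-day year and one checkerboard per second as in the text:

```latex
\frac{83\,752\ \text{s}}{86\,400\ \text{s/day}} \approx 0.97\ \text{day},
\qquad
3 \times 365 \times 86\,400\ \text{s} = 94\,608\,000\ \text{s}
\approx 0.95 \times 10^{8}\ \text{checkerboards}
```

Three years of continual mixing therefore corresponds to roughly the 100 million checkerboards generated for Table 1, among which none satisfied a demand strictly greater than 1/3.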
The Monte Carlo method therefore tells us nothing about segregation for levels which are strictly higher than one third. However, it is straightforward to compute the average segregation of the checkerboards where an observed minimum is satisfied. This is done by computing an average which is weighted by the number of occurrences of the segregations corresponding to the checkerboards with minimum demands that are equal to or higher than the fixed level.
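The weighted average described here can be sketched as follows. The function name and the row layout (minimum demand, number of checkerboards, mean segregation) are assumptions, and the figures in the example are purely illustrative placeholders, not the values of Table 1.

```python
from fractions import Fraction


def mean_segregation_at_or_above(rows, level):
    """Average S over all boards whose minimum demand is >= level.

    rows: iterable of (minimum_demand, number_of_boards, mean_S) tuples.
    """
    kept = [(n, s) for demand, n, s in rows if demand >= level]
    total = sum(n for n, _ in kept)
    if total == 0:
        raise ValueError("no checkerboard satisfies this minimum demand")
    return sum(n * s for n, s in kept) / total


if __name__ == "__main__":
    # Illustrative placeholder rows only (not taken from Table 1).
    rows = [
        (Fraction(0), 6, 0.10),
        (Fraction(1, 4), 3, 0.20),
        (Fraction(1, 2), 1, 0.40),
    ]
    print(mean_segregation_at_or_above(rows, Fraction(1, 4)))
```

Because rare but highly demanding checkerboards carry little weight, the average for a given level is dominated by the checkerboards that only just satisfy it.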
Table 2 – Average segregation S as a function of the minimum tolerated level
These values are given in Table 2. They give the reference levels of segregation, that is to say values against which we can position a given level of segregation attained by Schelling’s model. Let us consider the situation where the minimum requirement of every individual is to have 1/3 similar neighbours. If we choose at random one checkerboard among all those that meet this criterion, on average we should expect it to have a level of segregation of 0.25. The crucial question is now to decide whether there is a paradoxical effect when we play Schelling’s game, that is to say an effect which is neutral from a random standpoint but which nevertheless tends to increase this segregation even further.
3 – Simulations of Schelling’s game
To answer this question, we need to perform the simulations for this game. This should make it possible to measure two things: the final segregation S, but also the real tolerance of the individuals in this model. The first thing to note is that Schelling presents his model in a deceptive way. He presents the individual tolerance threshold as a continuous variable whereas it is in fact a discrete variable, which completely modifies the interpretation of the results. Let us consider the case where an individual has three neighbours. What concrete form does an individual’s demand for more than a third of like neighbours take? It takes the form of rejecting situations with no like neighbour or only one. The individual accepts only the situations where 2 or more of his neighbours are of like colour. In other words, he wants to belong to the majority. He rejects 0/3 and 1/3 and accepts only 2/3 and 3/3. Although it might be stated that the individual is ready to accept 34%, in concrete, discrete applications an individual obviously cannot have, for example, 1.6 neighbours, which means that in reality his minimum intolerance level is 2/3. This calculation can of course be carried out for each possible number of neighbours from 1 to 8. For example, in Schelling’s case where the threshold is more than 1/3, the real minimum intolerance corresponds to: 1 like neighbour out of 1 neighbour in all, 1 like neighbour at least out of 2, 2 like neighbours at least out of 3, 2 like neighbours at least out of 4, 2 like neighbours out of 5, 3 out of 6, 3 out of 7 and, finally, 3 out of 8. The minimum accepted proportions are therefore as follows: 1/1, 1/2, 2/3, 2/4, 2/5, 3/6, 3/7 and 3/8 (the numerator being the minimum number of like neighbours and the denominator the total number of neighbours). As we can see in this series, the 34% level chosen by Schelling does not appear in any of the cases of numbers of neighbours that can occur during a simulation.
Next, it is straightforward to establish this series of minimum proportions which are really accepted for any continuous level of intolerance (between 0 and 1). Consequently, for a given threshold, it is possible to calculate the average minimum intolerance which is effectively present in the model by weighting each of these proportions by the number of times it has been encountered during a simulation, i.e. by taking account of all the stages that lead to a final state of equilibrium (at which time there are no dissatisfied pawns). In the rest of this paper, the term effective intolerance will be used to describe the result of this calculation as it is this intolerance that corresponds to situations that may effectively be encountered and that we will simply weight by the number of times they occurred during a simulation that ended with equilibrium.
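The series of minimum proportions and the weighted average that defines effective intolerance can be sketched as follows. The function names are ours, and the weights used in the example (equal frequencies for 3, 4, 5 and 6 neighbours) are an assumption chosen for illustration, not frequencies measured in a simulation.

```python
import math
from fractions import Fraction


def min_like_needed(n_neighbours, threshold=Fraction(1, 3), strict=True):
    """Smallest number of like neighbours that satisfies the threshold."""
    bound = threshold * n_neighbours
    if strict:  # "strictly more than" the threshold
        return math.floor(bound) + 1
    return math.ceil(bound)  # "at least" the threshold


def real_thresholds(threshold=Fraction(1, 3), strict=True):
    """(minimum like neighbours, total neighbours) for 1 to 8 neighbours."""
    return [(min_like_needed(n, threshold, strict), n) for n in range(1, 9)]


def effective_intolerance(frequencies, threshold=Fraction(1, 3), strict=True):
    """Mean real threshold, weighted by how often each situation occurred.

    frequencies: mapping {number of neighbours: number of occurrences}.
    """
    total = sum(frequencies.values())
    weighted = sum(count * Fraction(min_like_needed(n, threshold, strict), n)
                   for n, count in frequencies.items())
    return weighted / total


if __name__ == "__main__":
    print(real_thresholds())
    # Illustrative weights only: 3, 4, 5 and 6 neighbours equally frequent.
    print(float(effective_intolerance({3: 1, 4: 1, 5: 1, 6: 1})))
```

With the strict threshold of one third, `real_thresholds()` reproduces the series 1/1, 1/2, 2/3, 2/4, 2/5, 3/6, 3/7 and 3/8 given above, and the equal-weight example yields 31/60, i.e. about 0.52.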
But before performing this calculation, it should also be noted that the initial conditions of Schelling’s model constitute another important factor. These conditions are quite exceptional: they are always very close to perfect alternation which greatly diminishes the generality of the model. Here as elsewhere it is difficult to draw general conclusions from a model whose initial conditions are extremely unusual. Of course, as we are discussing Schelling’s model, we have performed simulations starting from the same initial conditions. Strictly speaking, once pawns have been randomly removed, the starting point is an almost alternate one (with empty corners). But we have also performed simulations starting from completely random situations (as was done to obtain the results given in Table 1, where in each simulation there were 20 black pawns, 20 white pawns, and 5 pawns that were randomly distributed between the two colours). As will be seen below, the comparison between the two starting points is highly instructive. By starting from a random situation we would nevertheless be better able to compare the results obtained with the random reference previously established.
In principle, other features of Schelling’s game could affect the final segregation, but as their effect is very much more marginal they can easily be ignored. To begin with, Schelling leaves the four corners of the board empty. This is to counteract to some extent the edge effects, which are real. In fact, simulations performed with or without this constraint produce very similar results. Likewise, increasing the size of the checkerboard (which diminishes the impact of edge effects) or considering a toric surface does not give a fundamentally different result from that observed by Schelling. Secondly, the numbers of black and white pawns are not the same. But here too, if we impose a condition of equality we obtain practically the same result as Schelling. Lastly, the dissatisfied pawns move to the nearest satisfactory position. The effect here is to speed up convergence to equilibrium, which is understandable if, like Schelling, one performs the operations manually. But it is obviously not necessary if one can use a computer. If the pawns are moved randomly, convergence to equilibrium takes longer, but the end result is very similar to that obtained with Schelling’s procedure. So, simulations can be performed on checkerboards of various sizes, with random moves, with or without empty corners, with or without a condition of equality as regards the numbers of pawns of the two colours, without changing the order of magnitude of the results obtained by following exactly the conditions specified by Schelling in 1978. In what follows we will retain the same size of checkerboard but relax these unnecessary constraints, which make the situation less random, in order to draw as close as possible to a random situation, as this is the reference with which we wish to compare the results.
The fact that we can do this without significantly changing the levels of segregation that are obtained when Schelling’s instructions are followed to the letter is to Schelling’s credit as it shows that his model is robust with respect to these constraints.
In addition, we shall only consider balanced results, by which we mean those that satisfy the condition for stopping moves that stipulates that ultimately all the pawns must be contented with their environment. Under some conditions, for example when there is not enough empty room, this equilibrium may be unattainable. In this case, we may be faced with a state that is fairly stable (as regards the level of segregation) but not stationary. We shall not discuss these states here as they occurred very rarely in our simulations. In any case, they were not considered by Schelling himself, although they have been dealt with in some recent studies. We shall nevertheless return to this topic when discussing the studies in question.
4 – Explanation of the results
4.1 – Real individual intolerance is considerably underestimated
Table 3 shows the results for the simulations conducted for different levels of individual demand, under the conditions we have fixed above. We are now in a position to make three observations about them.
1) Real (“discrete”) intolerance is always considerably greater than is suggested by the corresponding continuous level as used by Schelling, that is to say chosen in the continuous domain [4].
Table 3 – Results of simulations of Schelling’s game with almost alternate starting situations and random starting situations
2) For the threshold of more than one third applied by Schelling, this intolerance is in reality of the order of 50%. With so high a level, the observed segregation is the logical outcome and there is no reason for perceiving a paradox in this case. With such a high level of individual demand, which does not arise by chance alone (see Table 1 or 2) and which expresses a preference or a near preference on the part of every individual to live in an environment where he/she is in the majority, it is not surprising that the environment attains a high degree of segregation, which is abnormal compared with the random result presented in Table 1. When one applies a highly segregationist rule that everyone must obey, one finds this high segregation in the result.
3) On the other hand, when the effective intolerance is approximately 1/3, the segregation is approximately 0.25 for random starting situations (it even falls to 0.13 for quasi-alternate starting situations). However, this figure corresponds to what one can anticipate from the effect of chance on its own (see Table 1 or 2), as it is of the same order of magnitude as the segregation that is obtained when one simply places the same number of pawns at random on a checkerboard and notes that the distribution satisfies the individual tolerance level of an inclusive third of the pawns (after repeating the operation millions of times). In addition, in this case a third of the checkerboards exceed a segregation level of 0.28, a quarter exceed 0.31 and there is a 10% chance of overall segregation exceeding 0.37 (see Table 1).
The conclusion is therefore clear: presenting the rule of one third as a continuous variable conceals an individual intolerance which is actually much higher (approximately one half). The level of segregation obtained is not at all surprising in view of the real sorting rule (which is abnormal compared with the random situation) that has been adopted. On the contrary, when this real rule is effectively set at around one third, sorting logically appears to be quite incapable of producing more segregation than is produced randomly. Strictly speaking, therefore, there is no Schelling effect over and above the segregation that is generated by chance.
4.2 – A linear link between intolerance and segregation
The fact that the effects of a certain level of intolerance are not disproportionate to the observed segregation is also apparent if we examine the relationship between the different levels of effective intolerance and the resulting segregation. Figure 2 shows the results from all the simulations for the random starting points (1,000 for each level) that are presented in Table 3. In these results, we can observe no break or tipping point in the segregation level above a given level of intolerance. Quite a good fit is obtained for these data with a linear function (R^{2} = 0.85), in any case better than that obtained with the other possible functions such as a logistic function (R^{2} = 0.83), whose “S” shape is so flattened that it is quite difficult to distinguish it from a straight line (and yet it is this type of function which is commonly used to model diffusion processes and, in the present case, “contagion” phenomena associated with moderately intolerant dissatisfaction). The best model, and the best explanation, therefore always consists of stating that the resulting segregation is simply proportional to individual intolerance.
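For reference, the R^{2} of a straight-line fit can be obtained by ordinary least squares, as in the sketch below. It is only meant to recall how such a coefficient is computed: the function name is ours, the example data are made-up placeholders, and nothing here reproduces the fits reported above.

```python
def linear_fit_r2(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x and its R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up placeholder points (effective intolerance, segregation).
    intolerance = [0.1, 0.2, 0.3, 0.4, 0.5]
    segregation = [0.05, 0.18, 0.22, 0.41, 0.52]
    print(linear_fit_r2(intolerance, segregation))
```

The same residual-based definition of R^{2} can be applied to the predictions of a logistic curve, which makes the two fits directly comparable.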
It should be noted that the dispersion of the segregation (i.e. the vertical dimension of each of the “ellipses” in Figure 2) is approximately normally distributed around its mean, which incidentally agrees well with Schelling’s result. As observed by Zhang (2004a) and Young (1998), we can speak of “stochastically stable” simulations, but in the case of a given simulation we can nevertheless move away from this mean (given in Table 3) and achieve a considerably higher or lower degree of segregation. The dispersion of segregation nevertheless tends to be considerably lower when intolerance is high.
In the case of quasi-alternate starting situations (see Figure 3), what is above all apparent is a marked reduction in dispersion in the case of low individual demands (we shall return later to the explanation of this very striking difference from random starting situations). In spite of this, the fit that is achieved by a straight line is once more the best it is possible to find. The R^{2} value in this case is 0.89, which is once again slightly better than that obtained with a logistic function (R^{2} = 0.87), which in this case too resembles a straight line. Even here, segregation remains proportional to individual intolerance.
Obviously, our aim is not to fit a straight line at any price. To understand the relationship between intolerance and segregation, all that is necessary is to examine the shape of the scatterplot. Nevertheless, if one considers that a nonlinear fit is satisfactory and that this provides some support for the idea that there is a paradoxical effect in Schelling’s model, we must point out that a linear fit is on average always better, whatever the starting situation.
Figure 2 – Scatterplot of social segregation and individual intolerance for random starting situations
Figure 3 – Scatterplot of social segregation and individual intolerance for almost alternate starting situations
4.3 – The results vary according to the population density
33At a given threshold, it is nevertheless important to realize that it is possible to vary the real tolerance and therefore the segregation by modifying the density of the pawns on the checkerboard. To give an example, let us retain the threshold of one third (exclusive) and compare the results we have just obtained for a density of approximately 70% (= 45/64) with those for a lower density of approximately 60% (with 38 pawns) and a higher density of approximately 80% (with 52 pawns). In Schelling’s situation, a pawn has, on average, 4.7 neighbours; the most frequent numbers of neighbours are 3, 4, 5 or 6, which correspond to minimum requirement thresholds of 2/3, 2/4, 2/5 and 3/6 and lead, as we have seen, to a mean effective intolerance of 0.50. Lower density means that there is more room for districts with a small number of neighbours. The most frequent situations involve 1, 2, 3 or 4 neighbours (on average 3.9), i.e. high minimum intolerances of 1/1, 1/2, 2/3 and 2/4. In this case, the mean effective intolerance for all the simulations is equal to 0.54. With a higher density, there are necessarily fewer such districts (one pawn has 5.4 neighbours on average). The most frequent situations involve 5, 6, 7, or 8 neighbours with what are in this case lower minimum demands: 2/5, 3/6, 3/7 and 3/8. There is therefore a reduction in the mean effective intolerance, which is now only 0.47. The overall segregation levels (at equilibrium) therefore follow the different levels of effective intolerance that correspond to the different densities. Thus with a density of 60%, the effective mean intolerance of 0.54 leads to a mean segregation of 0.63. With a density of 70%, the intolerance of 0.50 leads to a segregation of 0.57, and with a density of 80% intolerance falls to 0.47 and leads to a segregation of 0.51. 
Ultimately, when the satisfaction threshold in the continuous domain is fixed, in this example as in the case of Schelling’s model at one third, the higher the density the lower the effective intolerance and, consequently, the segregation. If we follow Schelling’s model, we would therefore need to believe that for a given level of individual preference, there would be more segregation in a small town than in a large one. This is a prediction which may be contradicted empirically. In any case, it is possible to modify the levels of segregation that are attained, all other things being equal, in Schelling’s model by modifying the overall population density.
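The arithmetic behind these minimum requirements can be reproduced in a few lines. The sketch below assumes that the exclusive threshold of one third means a pawn needs strictly more than one third of its neighbours to share its colour; the names `min_same_neighbours` and `effective_intolerance` are illustrative and do not come from Schelling’s text.

```python
from fractions import Fraction
from math import floor

THRESHOLD = Fraction(1, 3)  # continuous threshold, read as exclusive (assumption)


def min_same_neighbours(n, threshold=THRESHOLD):
    """Smallest number of like neighbours strictly exceeding threshold * n."""
    return floor(threshold * n) + 1


def effective_intolerance(n, threshold=THRESHOLD):
    """Minimum share of like neighbours actually required with n neighbours."""
    return Fraction(min_same_neighbours(n, threshold), n)


# Print the requirement for every possible number of neighbours (1 to 8).
for n in range(1, 9):
    k = min_same_neighbours(n)
    print(f"{n} neighbours: at least {k}/{n} = {float(effective_intolerance(n)):.2f}")
```

The fractions printed match those quoted above (1/1, 1/2, 2/3, 2/4, 2/5, 3/6, 3/7, 3/8). The mean effective intolerances of 0.54, 0.50 and 0.47 additionally depend on how often each neighbour count occurs in the simulations, which this sketch does not model.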
34In this connection, it is important to note that while the density leads to a marked change in the interpretation of the threshold fixed in the continuous domain and therefore, of final segregation, the effective intolerance appears to be a good predictor of segregation independently of density. In other words, as a first approximation, it is quite possible to construct a table of correspondence between effective intolerance and final segregation which takes account of the results of simulations with any level of density. Density only provides a second-order correction. The correspondence table between the continuous threshold and the final segregation is much less reliable. This provides further confirmation that the effective intolerance is the most relevant information in the present case, not the threshold.
4.4 – Dependency on initial conditions
35In view of this, let us return to Figures 2 and 3 and see how it is possible to explain the differences in the shape of the scatterplot between the quasi-alternate starting situations and the random starting situations with moderate levels of intolerance. This difference is also observed in the case of the mean values set out in Table 3. In particular, this table and these figures show that this applies only for low levels of intolerance and that it breaks down as intolerance increases. The two series converge almost perfectly, particularly above the level of one third. This highlights the fact that the so-called Schelling “effect” is not due to a special phenomenon – such as, for example, a catastrophic bifurcation – that occurs with a moderate intolerance threshold. The impression of an effect is simply generated by the selection of starting checkerboards where the alternation is almost too perfect to be realistic. It is therefore not due to excessive segregation above a certain level, but, on the contrary, to excessively low segregation below this level.
36To clarify this point, it is first of all necessary to re-examine the hypotheses which underlie the ergodic principle. The result obtained with the Monte Carlo method is the average segregation for all the possible states of the checkerboard (with a constraint in the form of a minimum level). If it is assumed that the pawns on the checkerboard move randomly in very large numbers while complying with the constraints in place, the ergodic principle can be applied. This consists of saying that the temporal average of the segregation observed on the checkerboard tends towards the calculated average for all the possible states. The equalization of the averages is based on the idea that this random movement of the pawns will, after a certain period of time, cover all the possible situations without favouring any of them as a result of the random nature of the movements.
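In symbols, the ergodic hypothesis described here can be written as follows. The notation is introduced only for illustration: $\Omega$ denotes the set of checkerboards that comply with the constraint, $S(x)$ the segregation of checkerboard $x$, and $x_t$ the checkerboard reached after $t$ random moves.

```latex
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} S(x_t)
  \;=\; \frac{1}{|\Omega|} \sum_{x \in \Omega} S(x)
```

The right-hand side is the average over all admissible states, which the Monte Carlo method estimates by sampling; the left-hand side is what a checkerboard would display if its pawns kept moving at random indefinitely.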
37However, following the rules of Schelling’s game, the ergodic principle is not verified. The game can stop rapidly, meaning that the final segregation does not have enough time to converge towards its target (the segregation calculated by the Monte Carlo method). The ergodic principle comes down to ignoring the initial conditions on the grounds that random movements will eventually remove any features that depend on the starting point: under these conditions, the only factor that counts is chance. In Schelling’s game, on the contrary, path dependency exists as the final checkerboard is still very similar to the initial checkerboard. This is obviously what happens in the case of simulations with checkerboards with an almost alternate starting point. With low demand levels, only a few moves are required for everybody to be contented and final segregation remains particularly low and near to the initial segregation. This explains why the results for the alternate starting situation are also close to the first decile of the Monte Carlo distribution.
38With such a starting point, the rules of Schelling’s game are not necessary to increase segregation. Even pure random movements would lead to an increase. By choosing this exceptional starting point, in which the two communities are almost perfectly integrated, one attributes to a “Schelling effect” a dynamic that random movements alone can easily produce. Next, obviously, the higher the level of intolerance, the less apparent this dependency on the initial conditions becomes. The results are no longer affected by the starting situation. Beyond the threshold of one third, which in fact expresses the intention of each individual to live in an environment where he/she is almost in the majority, there is no longer a difference between the random and the quasi-alternate starting situations. But the difference between the segregation that is obtained for this threshold and the starting point is obviously much more striking if this starting point is one of alternation rather than a random situation.
4.5 – A selective migration process which generates nucleation
39Another constraint that results from the rules of the game is added to those we have just described above in order to produce the result observed by Schelling. On the one hand, it is assumed that there is random movement within each of the checkerboards which complies with a given threshold constraint and from which a target distribution for segregation is deduced. On the other hand, there is a selective movement from checkerboards that fail to satisfy this level, and this movement is continued until a satisfactory position for each pawn is obtained. There would be convergence between the final segregations if, once a satisfactory checkerboard was obtained using Schelling’s method, pawns continued to move randomly with the sole constraint of producing no dissatisfied pawn. However, this is not the form Schelling’s model takes. One of its implicit hypotheses is, on the contrary, that satisfied pawns never move. This means that the migratory process is not simply a minimal selective movement, which would only require pawns to move from an unsatisfactory position to a satisfactory position by any path; at the same time it is a selective process which imposes potentially different mobilities on the pawns. There is no potential for the inhabitants of a completely homogeneous neighbourhood to move. Physicists talk of nucleation phenomena in this connection. Homogeneous cores will form in a mixture if the bonding energy between like pawns (the energy required to create a homogeneous neighbourhood) is lower than the pressure exerted by the pawns of the opposite colour. In this case, the homogeneous cores will grow until they encounter excessive pressure. In the Schelling model, the cost of forming a homogeneous neighbourhood is nil, so any homogeneous neighbourhood can only increase and not decrease in size. It should be noted that nucleation phenomena are kinetic in nature and explain divergences from the thermodynamic equilibrium. 
They are transitory states or metastable equilibrium states (that is to say in a precarious equilibrium where a small change can bring the system back to stable equilibrium). Schelling’s model thus produces a metastable equilibrium by means of a rule whose significance is not directly stated, namely the rule of never making satisfied pawns move. If we can draw a parallel, this rule is like telling a cook to make pancake mixture without breaking up the lumps and then finding it surprising that the lumps have not disappeared but have even grown!
40This rule explains why the final segregation of the checkerboards obtained with Schelling’s method for a random starting point is slightly higher than the equilibrium segregation (from the Monte Carlo method). However, as noted above, for effective intolerances below 50%, the results differ very little from those obtained with the Monte Carlo method.
5 – Beyond Schelling’s model
41If we put all the above elements together, it is clearly apparent that there is no paradox in the results given by Schelling’s model. It is not possible to maintain that in general a low level of individual intolerance leads to high collective segregation. Nevertheless, this model has been very widely applied in different fields such as economics, game theory and urban sociology. Many variants have been put forward in which the role played by one parameter or another is modified. It is obviously impossible to deal here with all the conceivable variants. With regard to the point we are making, we must however examine those which might put our explanation of Schelling’s “paradox” to the test. In this connection, there are essentially two issues which must draw our attention. First, we have shown that the small number of neighbours is completely responsible for creating the illusion of low theoretical tolerance when effective intolerance is high. It is therefore important to increase the size of the neighbourhood in order to see what happens. Next, we have highlighted the role played by the hypothesis that satisfied individuals never move, but its impact can be evaluated with even more precision.
42Let us begin with the issue of the neighbourhood size. A maximum of eight neighbours is too few to distinguish on a finely graduated scale between low intolerance and high intolerance, and we have seen the outcome. There are however many ways of overcoming this problem, for example by changing the form of the checkerboard or the size of the neighbourhood, or by permitting the stacking of pawns. What happens when these parameters are modified? In particular, what happens when the size of the neighbourhood is increased beyond the first ring? This extension reduces the size of the discrete jumps, making the structure of the game less coarse. This allows us to examine models in which the (continuous) theoretical level is less misleading with regard to the (discrete) effective intolerance. What, a priori, should we expect under these conditions? Let us consider the case where there are as many white pawns as black pawns and begin by considering the extreme case where the neighbourhood is so large that it takes in the entire checkerboard. It is quite clear that everybody will be satisfied up to the precise threshold of 50% and dissatisfied above. In the first case, nothing happens. In the second, everybody is and will remain dissatisfied, however the pawns are arranged. Let us now consider an intermediate situation where the neighbourhood is large without covering the entire checkerboard. In this case, each pawn has a large number of neighbours and the structure of the game enables each individual to be above the 33% threshold without automatically wishing to belong to the majority. Mixture, which was almost impossible in Schelling’s structure above the 33% threshold, becomes the most likely situation and constitutes a stable state for intolerance thresholds below 50%. As soon as one draws close to or exceeds the decisive threshold of 50%, however, the extension of the neighbourhood will lead to moving on a very large scale. 
The reason for this is that increasing the size of the neighbourhood also increases the “community of destinies” and if, for example, a white pawn is dissatisfied in a large district, all the white pawns in this district are dissatisfied at the same time and they will all move until there are none of them left there. This process cannot be described as paradoxical: it is the logical and perfectly foreseeable outcome of a collective moving rule. It does not illustrate a “domino” effect. What is occurring is simply the rapid homogenisation of mixed districts by simultaneous moves of those who are dissatisfied and that are identified as such from the starting situation.
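The way a larger neighbourhood shrinks the jumps between attainable levels can be made explicit. As a sketch, assume a continuous threshold $\tau$ read exclusively, so that a pawn with $n$ neighbours is satisfied only if strictly more than $\tau n$ of them share its colour; the symbols $\tau$, $n$ and $e(n)$ are introduced here for illustration only.

```latex
e(n) = \frac{\lfloor \tau n \rfloor + 1}{n},
\qquad
\tau < e(n) \le \tau + \frac{1}{n}
```

With $\tau = 1/3$ and $n = 3$, the effective intolerance $e(n)$ reaches $2/3$, the upper bound; with several dozen neighbours it can exceed the threshold by only a few percentage points.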
43This hypothesis has been clearly confirmed by research conducted by Laurie and Jaggi (2003). This involved “Schelling-like” simulations with different sizes of neighbourhood. They considered that the distance between two pawns was equal to the number of rows and columns between them. In other words, they counted the number of spaces that separate two pawns by moving along the rows or columns of the checkerboard but not along the diagonals (the distance between two pawns is |x_{2} – x_{1}| + |y_{2} – y_{1}|). Then, they defined the neighbourhood with a radius R as being that set of pawns that was located at a distance of less than R. They then showed that almost nothing happened with a 30% threshold when the neighbourhood consisted of all the pawns which were less than 3 spaces away from the central pawn (R=3); in addition, there were even fewer moves for R=5. However, with a 50% threshold, which would logically force segregation, these scholars observed that the larger the neighbourhoods, the larger the homogeneous zones. Thus, for R=1 or R=3, segregation was already high and the zones were already homogeneous, but there were still many regions of contact between opposite colours. For R=5, there were fewer of these; the boundaries became straight, because no pawns could cross the boundary between two regions. In the case of low levels stability therefore increased, and in the case of high levels massive movement occurred within larger and larger “communities of destiny”. Once one has understood the role of each parameter, all the cases in their study are readily explained (even if some criticisms have been made of the model, in particular by Fossett and Waren, 2005). 
Lastly, it should be noted that Laurie and Jaggi failed to realize that if the neighbourhood radius is reduced to 1 – a situation in which the central pawn has at most 4 neighbours – the discrete jumps have the maximum effect and it is simply absurd to speak of a threshold of 33% when the effective intolerance is necessarily greater than 50%.
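The row-and-column distance and the resulting neighbourhood sizes can be illustrated with a short sketch. It is not Laurie and Jaggi’s code: the function names are hypothetical, and the boundary is assumed to be inclusive (distance at most R), which is the reading under which a radius of 1 gives at most 4 neighbours.

```python
def manhattan_distance(a, b):
    """Distance counted along rows and columns only: |x2 - x1| + |y2 - y1|."""
    (x1, y1), (x2, y2) = a, b
    return abs(x2 - x1) + abs(y2 - y1)


def neighbourhood(centre, radius, size=8):
    """Squares of a size x size board within `radius` of `centre`, centre excluded.

    Assumption: inclusive boundary (distance <= radius). The squares returned
    may or may not be occupied by a pawn; occupancy is not modelled here.
    """
    return [
        (x, y)
        for x in range(size)
        for y in range(size)
        if 0 < manhattan_distance(centre, (x, y)) <= radius
    ]


# Number of squares in the neighbourhood of a pawn far from any edge.
for r in (1, 3, 5):
    print(r, len(neighbourhood((10, 10), r, size=21)))
```

Away from the edges this gives 4, 24 and 60 squares for R = 1, 3 and 5, which shows how quickly the number of attainable proportions grows with the radius.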
44When all is said and done, extending the neighbourhood is a useful strategy for reducing the role of the discrete intolerance in Schelling’s model. Nevertheless, this extension has impacts. Schelling’s idea is that, little by little, very local differences cause general effects. But extending the neighbourhood amounts to transforming local differences into large “communities of destiny”. The problem of the discrete jump is therefore solved, but by moving to another model which is not really appropriate for the phenomenon that Schelling was attempting to examine.
45There are however other ways of reducing the problem of the difference between the continuous threshold and effective (discrete) intolerance. The most widely used consists in replacing the level of intolerance by a continuous utility function. This strategy was, in particular, pursued by Junfu Zhang (2004b), Roman Pancs and Nicolaas Vriend (2007) or Elizabeth Bruch and Robert Mare (2006). From the outset, we must nevertheless note that this constitutes a substantial modification of Schelling’s “paradox”, as we no longer know beyond what quantum of utility we should be surprised to obtain a segregated checkerboard. To decide on this point, we need to be able to compare the result to a reference (in the same way as we did above), but it is precisely this comparison which is absent from the research in question. Moreover, if the utility function is an increasing function which is therefore favourable to creating the most homogeneous neighbourhood possible, the Pareto optimum can be easily characterized without even performing a computation: the optimum is achieved when on the checkerboard the black pawns are separated as well as possible from the white pawns. The utility of each pawn is at this point maximum.
46However, these studies have the value of allowing us to test our second explanation which is based on the role played by the fact that satisfied individuals do not move. To understand what is involved here it is no doubt useful to examine the analogy put forward by Dejan Vinkovic and Alan Kirman (2006) between Schelling’s model and the physics of a fluid which cools while nucleations take place within it. For this phenomenon to be observed, two conditions must be met. First, the fluid in question must have a kinetic that favours the formation of pure clusters in that these must form more rapidly (at least up to a certain size) than they are broken up. Second, this fluid must be cooled in order to “freeze” the pure clusters permanently, or they would merely be transient. In statistical physics, the temperature is the indicator of thermal agitation and may be considered as being analogous to the rate of moving (the “agitation” of the individuals) in society. This analogy shows that Schelling’s conclusion is also based on these two conditions which must now be reformulated in social terms. The favourable kinetic corresponds to the central hypothesis that only individuals who are dissatisfied with the colour of their neighbours move. The second condition assumes, among other things, that one is within a society which is “cooling down”, i.e. within a society where nobody would move for other reasons than their racial preferences until such a time as everyone is satisfied from this viewpoint. It is therefore essential to check the “temperature”, i.e. the moving behaviour of individuals as soon as one wishes to identify a Schelling effect within a real society. It is not sufficient to observe that individuals have requirements that are only slightly segregationist, we need to observe in addition that these demands are the only factors which determine their decision to move. 
This is a major hypothesis because we all know – and Schelling as well as anybody – that individuals also move for reasons that have nothing to do with the colour of their neighbours.
47As soon as these other reasons for moving are considered, the society becomes the location of movements which are independent from racial or ethnic issues and which are capable of dismantling ghettoes or stopping them from forming. Apart from a kinetic which is favourable to the formation of ghettoes, we need in reality to take account of a competing kinetic which is indifferent to ethnic preferences and, as a simple consequence of this, favourable to mixture. It is important to stress this point. The unintentional consequence of a blind social dynamic is very likely to be disorder, noise or pollution. The entropy principle explains this very well. There is no need to target mixture to achieve it. On the contrary, it is when segregation is obtained that we need to ask questions about the mechanisms which have caused it. Schelling’s mechanism only produces segregation in proportion to what the individuals desire and, as we have shown, there are no grounds for stating that they obtain more segregation than what can be inferred from their preferences. Put simply, disorder is the most likely situation within a closed system, that is to say a system that changes without any external influence. In order to obtain order, we need to consider an open system similar to Vinkovic and Kirman’s cooling fluid, which loses energy when solidifying. But even in this case, it is imperative to show clearly the role played by the environment by controlling for its influence. In this connection, there are countless possibilities. It is well known, for example, that different rates of cooling of a liquid alloy can lead to different structures (slow cooling as opposed to quenching). The question is which of the two kinetics – that which forms homogeneous clusters or that which favours mixtures – dominates the other within a given experimental system, knowing that on the way stable equilibria, metastable equilibria or situations with no equilibrium may occur. 
In the case of Schelling’s model, it therefore seems difficult to conclude that a ghetto effect exists without investigating the general level of moves, whether these are motivated by issues of racial tolerance or not. To sum up, either we consider that Schelling’s model does not apply to present-day societies, which are manifestly “hot” societies, or we test the model again while keeping the rate at which individuals move, for all sorts of reasons, at a high level, although this at the same time alters the stopping condition of the game.
48In this connection, the modelling performed by Elizabeth Bruch and Robert Mare (2006) allows us to observe what happens when the “temperature” is maintained almost steady within the society. Instead of making only dissatisfied people move, these scholars have modified the moving decision function by adding chance. What they developed was therefore a probabilistic choice model (known as a random utility model), in which the decision to move is no longer entirely determined by an individual’s neighbours but is simply made more probable. There is therefore continual mobility in addition to that of the dissatisfied pawns. This is not the only difference with Schelling’s model. In particular, Bruch and Mare had to abandon the stopping condition of Schelling’s game: the game no longer stops when everybody is satisfied [5]. The degree of segregation is now measured during the game. It nevertheless appears that this provides a good test of the Schelling effect, when the movements of individuals on other than racial grounds are controlled for. However, like many other scholars who have proposed simulations that follow on from Schelling, they adopt a tolerance threshold for the discrete case that is no longer 34% but 50%. So basically what the model does is to observe that if everybody wants to belong to the majority the result is high segregation. The paradox which made Schelling’s model so famous has been lost, but it is nevertheless possible to learn useful lessons from their simulations. Thus, they state there is no longer a Schelling effect when the probability of moving increases continuously with the proportion of neighbours of a different colour. Allowing satisfied individuals to move therefore prevents the formation of ghettoes. The study by Arnout van de Rijt, David Siegel and Michael Macy (2009) disputes Bruch and Mare’s findings. Nevertheless, the two studies are very similar and their findings are perfectly consistent. 
The disagreement relates to the difference between the situation where the probability of moving increases continuously with the proportion of neighbours of different colour (“continuous preferences”) and the case where the same probability increases suddenly beyond the 50% threshold (“threshold preferences”). However, it is quite obvious that this difference depends on the movements of satisfied individuals. The greater the number of moves, the more the conclusions favour Bruch and Mare. And conversely, van de Rijt, Siegel and Macy confirm that there is a difference between the continuous and the threshold preferences and conclude that the scale of random moves among satisfied individuals is unfavourable to segregation: “we find that random behavior eventually precludes segregation with both functional forms, but more randomness is required if preferences are threshold” (2009: 1172). In fact, the decisive factor here is probably not whether we are in the continuous or discrete case; it is more likely to be the effective rate of moving among satisfied individuals. Segregation simply falls as this rate increases and the way this rate is implemented as a decision function is of secondary importance. As it is implemented, this rate is greater in the continuous case than in the discrete case. It is therefore more difficult for the kinetic which is favourable to segregation to establish itself in the continuous case.
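To make the distinction between the two functional forms concrete, here is a minimal sketch. It is not the specification used by Bruch and Mare or by van de Rijt, Siegel and Macy: the linear form, the step at 50% and the `noise` parameter (the probability that a pawn moves regardless of its neighbours) are all hypothetical choices made for illustration.

```python
def p_move_threshold(share_other, noise=0.05, cutoff=0.5):
    """Step form: certain move above the cutoff, only random moves below it."""
    return 1.0 if share_other > cutoff else noise


def p_move_continuous(share_other, noise=0.05):
    """Linear form: probability rises steadily from `noise` to 1."""
    return noise + (1.0 - noise) * share_other


# Compare the two forms for a few shares of neighbours of the other colour.
for share in (0.0, 0.25, 0.5, 0.75):
    print(share, p_move_threshold(share), round(p_move_continuous(share), 4))
```

Below the cutoff, where pawns count as satisfied in the threshold reading, the continuous form makes them move more often than the step form does. This is the difference in the effective rate of moving among satisfied individuals discussed above.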
49Finally, there is nothing unexpected about any of these results. This is certainly the worst conclusion for someone creating models in order to present paradoxes. But from the scientific point of view, it is thoroughly satisfactory: the model leads to results which can be fully explained to the extent that they present no surprise and hide no mystery or paradox due to the intervention of an “Invisible Hand” – or “unseen hand”, using Schelling’s expression. The final segregation is simply the outcome of the law of probability that is defined on the basis of the effective intolerance and the effective rate of moving among satisfied individuals compared with dissatisfied individuals.
Conclusion
50Urban segregation has a large number of political, economic and social causes. And even if these causes are not present, history plays a major role in slowing change in the urban landscape and creating an attachment among residents for their district. This means that urban segregation can hardly be explained by a single factor. A family’s decision to live in a district is certainly, to a considerable extent, determined by household income but also closeness to a good school for the children or a reasonable journey time to go to work for the parents. In addition to these criteria, we can add closeness to grandparents, knowledge of the district, public services such as public transport, the presence of a kindergarten, or the liveliness or the reputation of a district, etc.
51Nevertheless, Schelling’s model was extremely successful because it seemed to illustrate perfectly the idea that at a collective level we do not always obtain what each person intends individually. Urban segregation was considered to be an undesired effect of moderately racist behaviours. Another point in favour of the model was that it appeared at just the right time to explain why segregation still existed in societies which were becoming more and more tolerant. However, before using this model to diagnose these problems, should we not first take account of the limits that Schelling himself laid down? Should we not also ask if the model’s structural constraints really have as little effect on the observed dynamic as Schelling thought or if, on the contrary, they go a long way to explain the results? To go even further, are individuals as tolerant as the model would seem to indicate? Does the form of the checkerboard and the neighbourhood enable individuals to tolerate only 34% of neighbours who are different? Or do these mean that they wish to belong to the majority? And in this case, is not segregation a trivial result?
52Schelling’s thesis according to which we find ourselves in a highly segregated society, even with a moderate level of intolerance does not stand up to detailed examination of the answers to the above questions. As we have seen, when intolerance is really moderate, segregation is moderate and when intolerance is high, segregation is obviously high – the link between these two variables is quasilinear. In this regard, the study of structural constraints confirms that there is nothing exceptional about the level of segregation which results from Schelling’s game. We cannot state that the tipping point towards high segregation occurs at lower levels of intolerance than one would expect. While some people are convinced that this model gives results which are more than trivial, it is finally due to two tautologies that combine to give what, if one looks at it closely, is a paralogism: it is true that small demands about one’s neighbours create some sort of order; it is also true that if everybody wishes to be in the majority it is quite impossible for minorities to coexist and segregation is high; but it is not true that segregation does not reflect what can be expected from the laws of chance and what each individual effectively wishes. Schelling’s model does not allow us to conclude that, in general, major ghettoes result from innocent decisions.
Notes

[1]
Likewise, if we generalize the constraint to a multicultural society, as several authors are doing today (Clark & Fossett, 2008; Fossett, 2006; Macy & van de Rijt, 2006), we must not lose sight of the fact that if everyone wants to live in a district in which his or her ethnic group is overrepresented, coexistence is impossible.

[2]
It should nevertheless be noted that the point we are examining here, and which has certainly contributed to the renown of Schelling’s thesis is not the central concern of many more recent studies that focus more on variants of Schelling’s model by considering only strong individual intolerances.

[3]
Only partially, because the value of the spin can flip, while pawns cannot change their colour.

[4]
In the following, we shall summarize this choice by the expression “continuous threshold”.

[5]
This abandonment is completely compatible with the idea that underlies this new model as it attempts to incorporate random agitation that corresponds to a form of temperature fluctuation. Maintaining the previous stopping condition would mean that the game would only stop if, by chance, the temperature fell to absolute zero. As the chances of this occurring are very small, the game would only end in fairly exceptional circumstances which could not really be used to learn general lessons about segregation.