In keeping with our primary objective of identifying the effects of racial segregation on movement behavior, we calculate the number of visits among users living in tracts of different races. Human mobility, however, tends to follow uniform patterns and can be simulated by parameterized models. The question that arises is whether a user who visits a tract of similar or dissimilar race is simply adhering to the ideal movement pattern he or she would be expected to follow, or whether there is a bias due to the presence of a particular race. To answer this question, we build models of movement patterns for each of the three cities and generate synthetic datasets. Measuring the deviation of the actual mobility data from the ideal (simulated) movement patterns would indicate the presence of any inter-race bias. The steps involved in this process are explained next.

Fig. 2. Displacement from home while tweeting in (a) New York City, (b) Los Angeles, and (c) Chicago.

4.1 Models of Movement Pattern

An established characteristic of human mobility is that it exhibits Lévy flight and random walk properties. In essence, the trajectory of movement follows a sequence of random steps where the step size follows a power law distribution (probability density function (PDF): $f(x) = Cx^{\alpha}$), meaning that there are many short hops and few long hops. As shown in [13], the distance from home while tweeting also follows a power law distribution.

Figure 2 shows the distribution of distance from home over the range 100 m to 50 km, together with the corresponding least-squares power law fit, for the three cities. As a test of correctness we use a two-sample Kolmogorov-Smirnov test, where the null hypothesis states that the two samples are drawn from the same continuous power law distribution. In each case, the null hypothesis was accepted with significance (p < 0.05), hence verifying the correctness of the parameter fits. Tweets within 100 m of a user's home location were removed, as such small shifts in distance may occur due to GPS noise even when the user is stationary.
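The fitting and testing procedure described above can be sketched as follows. This is a minimal illustration rather than the exact pipeline used here: the function names, the log-spaced binning, the number of bins, and the use of the standard large-sample critical value for the two-sample Kolmogorov-Smirnov statistic are all assumptions, as is the negative sign of the exponent in the demo. Distances are taken to be in meters.

```python
import bisect
import math
import random


def sample_power_law(alpha, x_min, x_max, rng):
    """Draw one value from f(x) ~ x**alpha on [x_min, x_max] by inverse transform."""
    u = rng.random()
    a = alpha + 1.0
    if a == 0.0:  # alpha == -1: the CDF is logarithmic
        return x_min * (x_max / x_min) ** u
    return (x_min ** a + u * (x_max ** a - x_min ** a)) ** (1.0 / a)


def fit_power_law(distances, x_min, x_max, n_bins=20):
    """Least-squares fit of log f(x) = log C + alpha * log x on log-spaced bins."""
    log_min = math.log(x_min)
    width = (math.log(x_max) - log_min) / n_bins
    counts = [0] * n_bins
    kept = 0
    for d in distances:
        if x_min <= d < x_max:  # hypothetical range filter, e.g. 100 m to 50 km
            idx = min(int((math.log(d) - log_min) / width), n_bins - 1)
            counts[idx] += 1
            kept += 1
    xs, ys = [], []
    for i, c in enumerate(counts):
        if c == 0:
            continue  # empty bins have no defined log-density
        lo = math.exp(log_min + i * width)
        hi = math.exp(log_min + (i + 1) * width)
        xs.append(math.log(math.sqrt(lo * hi)))     # geometric bin centre
        ys.append(math.log(c / (kept * (hi - lo))))  # empirical density
    if len(xs) < 2:
        raise ValueError("need at least two non-empty bins to fit a line")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    return math.exp(mean_y - alpha * mean_x), alpha  # (C, alpha)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_critical_value(n, m, level=0.05):
    """Large-sample critical value for the two-sample KS statistic."""
    return math.sqrt(-0.5 * math.log(level / 2.0)) * math.sqrt((n + m) / (n * m))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for observed home-to-tweet distances (synthetic, not real data).
    observed = [sample_power_law(-1.5, 100.0, 50_000.0, rng) for _ in range(5000)]
    c, alpha = fit_power_law(observed, 100.0, 50_000.0)
    simulated = [sample_power_law(alpha, 100.0, 50_000.0, rng) for _ in range(5000)]
    d = ks_statistic(observed, simulated)
    print("alpha:", alpha, "D:", d, "critical:", ks_critical_value(5000, 5000))
```

In this sketch the fitted law is judged compatible with the data at the chosen level when the statistic D stays below the critical value; a library routine that returns an exact p-value could be substituted for the last two functions.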

As shown in [13], the direction of travel from home follows a uniform distribution, skewed only by physical and geographic barriers such as freeways and oceans. For computational simplicity we disregard any such skew and assume a perfectly uniform distribution, i.e., travel is equally likely in any direction. The resulting PDF, shown below, is the product of two probabilities, one for the distance and the other for the direction of travel θ (the probability density of θ is constant).
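A product of this kind can be sketched as follows, assuming the power law parameters $C$ and $\alpha$ from the distance fit and a constant direction density of $1/(2\pi)$ on $[0, 2\pi)$. The symbol $g$ and the explicit range bounds are notation introduced here for illustration only.

```latex
g(x, \theta) = C x^{\alpha} \cdot \frac{1}{2\pi},
\qquad 100\,\mathrm{m} \le x \le 50\,\mathrm{km},
\quad 0 \le \theta < 2\pi .
```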

Artificial location data for a user are generated by drawing a random sample from the distribution in Equation 1. Keeping the number of simulated tweet locations equal to the number of actual tweets, we create a synthetic dataset by sampling for each user.
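The sampling step can be illustrated with the sketch below. It assumes that distance and direction are drawn independently, that the distance law is truncated to the 100 m to 50 km range, and that displacements are small enough for a flat-earth (equirectangular) conversion to latitude and longitude. The function names, the default parameters, and the example coordinates are hypothetical.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used for the flat approximation


def sample_distance(alpha, x_min, x_max, rng):
    """Inverse-transform draw from f(x) ~ x**alpha truncated to [x_min, x_max]."""
    u = rng.random()
    a = alpha + 1.0
    if a == 0.0:  # alpha == -1: the CDF is logarithmic
        return x_min * (x_max / x_min) ** u
    return (x_min ** a + u * (x_max ** a - x_min ** a)) ** (1.0 / a)


def synthetic_locations(home_lat, home_lon, n_tweets, alpha,
                        x_min=100.0, x_max=50_000.0, rng=random):
    """Generate n_tweets simulated (lat, lon) points around a home location."""
    points = []
    for _ in range(n_tweets):
        r = sample_distance(alpha, x_min, x_max, rng)  # distance in meters
        theta = rng.uniform(0.0, 2.0 * math.pi)        # direction, uniform
        # Convert the polar offset to angular offsets (small-displacement assumption).
        dlat = (r * math.cos(theta)) / EARTH_RADIUS_M
        dlon = (r * math.sin(theta)) / (
            EARTH_RADIUS_M * math.cos(math.radians(home_lat)))
        points.append((home_lat + math.degrees(dlat),
                       home_lon + math.degrees(dlon)))
    return points


if __name__ == "__main__":
    # One simulated user: as many synthetic points as the user has actual tweets.
    actual_tweet_count = 5
    for lat, lon in synthetic_locations(40.7128, -74.0060, actual_tweet_count,
                                        alpha=-1.5, rng=random.Random(0)):
        print(round(lat, 5), round(lon, 5))
```

Repeating the call for every user, with `n_tweets` set to that user's actual tweet count, would yield a synthetic dataset of the same size as the observed one.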

Fig. 3. Fraction of visits between each of the four races in (a) New York, (b) Los Angeles, and (c) Chicago. Colors: blue = White, green = Black, red = Asian, orange = Hispanic.
