
2 Data

Recent growth in smartphone usage [6] and the emergence of location-aware services have enabled large-scale data collection [7] through participatory sensor networks [8]. A key feature that makes such systems particularly relevant for urban informatics [9] is the ubiquity of the sensors and the existence of infrastructures that enable sensing. Twitter is an example of a participatory sensor network. It is a microblogging service that allows people to share events and news or have conversations in real time [10]. Empirical studies have shown that people generally use Twitter to describe what they are doing or to express how they are feeling [11]. Apart from text content, each tweet is accompanied by a range of metadata such as a timestamp and a geographic location. We refer to geographically referenced tweets as geolocated tweets. Geographic referencing is not exclusive to Twitter; it is a popular concept implemented in many other social media services. Depending on individual preferences, Twitter users may decide to publicly share their activities on other social media. When they do so, the information posted on those services is also publicised on Twitter. Foursquare, an online service for users to share their whereabouts, is an example of such a service. Because Twitter offers a relatively simple protocol to access such information, other studies in the literature [12, 13] have also collected geolocated data from other social media through Twitter. For these reasons, we developed our system based on geolocated Twitter data. Yet the concepts we describe can be generalized to a wider class of geolocated social media data with similar characteristics.

Prior to the availability of geolocated social media data, large-scale studies of mobility were mainly based on cellular activity logs [14-17] that track the spatial position of people at different moments in time. To analyse movement in cellular datasets, analysts rely on techniques that partition a given territory into subspaces based on the locations of cellular base stations. The position of a cell phone is then approximated by the location of the base station responsible for routing its signal. An estimated trajectory can then be constructed by chronologically ordering the locations of the base stations that served the cell phone. While this approach has revealed valuable insights about human mobility [14, 15], the spatial resolution at which studies can be conducted depends on the physical geometry of the infrastructure. In comparison to geolocated social media datasets, which offer spatial information of up to street-level precision, the space partitioning technique implies that studies conducted in territories with sparsely distributed base stations will be limited to relatively low-resolution spatial analysis. Moreover, cellular datasets are proprietary in nature. In most cases, obtaining such data takes a long time because of complicated procedures and long discussions with stakeholders.
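The trajectory estimation described above can be illustrated with a minimal Python sketch. It is not taken from the cited studies. The record layout (`phone_id`, `timestamp`, `station_id`), the `station_locations` lookup table, and the function name `build_trajectories` are all hypothetical and chosen only for illustration. Each log record is mapped to the coordinates of the base station that served the phone, and the records of each phone are then ordered chronologically.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    """One cellular activity log entry (hypothetical layout)."""
    phone_id: str
    timestamp: int    # e.g. seconds since some reference time
    station_id: str   # base station that routed the signal


def build_trajectories(records, station_locations):
    """Estimate one trajectory per phone.

    The phone's position at each record is approximated by the
    location of the serving base station, and the positions are
    then sorted by time.
    """
    # Group the log entries by phone.
    by_phone = defaultdict(list)
    for rec in records:
        by_phone[rec.phone_id].append(rec)

    trajectories = {}
    for phone_id, recs in by_phone.items():
        # Chronological ordering of the serving base stations.
        ordered = sorted(recs, key=lambda r: r.timestamp)
        trajectories[phone_id] = [
            (r.timestamp, station_locations[r.station_id]) for r in ordered
        ]
    return trajectories
```

Because every position collapses to a base station location, two phones served by the same station are indistinguishable in space, which is why the achievable resolution depends on how densely the stations are distributed.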
