
Zhao, Z., Shaw, S. L., Xu, Y., Lu, F., Chen, J., & Yin, L. (2016). Understanding the bias of call detail records in human mobility research. International Journal of Geographical Information Science, 30(9), 1738-1762.

by lucky__lucy 2024. 9. 2.

1. Introduction

- There have been debates regarding the biases that come with the geo-tagged social media data.

- Call detail records (CDRs) were most widely used in existing studies.

- However, most previous studies did not discuss how representative their data were and the applicability of their analysis results to the entire population. Therefore, the representativeness of CDRs needs to be examined.

- When analyzing CDRs, the uneven distribution of people's phone communication activities in space and time is a concern. The number of subscribers drops as the intensity of phone-related activities increases.

- This paper compares human mobility patterns derived from CDRs vs. the complete dataset.

 

2. Relevant research

2.1. CDRs and human mobility

- CDRs have helped enhance our knowledge of human mobility in recent years. A large body of literature focuses on individual activity space, which denotes the spatial extent of people’s daily activities.

- Activity space can be characterized by individual trajectories reflecting a person’s space movement over time.

- Human movements often follow reproducible patterns. However, even CDRs collected over a long period cannot capture the mobility patterns of the 'quiet minority'.

 

2.2. CDRs and urban dynamics

- Instead of focusing on individual trajectories, many studies adopt a collective approach to uncover varying mobility patterns, using methods such as K-means clustering, eigendecomposition, and dynamic time warping.

- CDR-based urban dynamics studies also implicitly assume that phone communication records can serve as a direct indication of human activity intensity, which is debatable.

 

2.3. Uncertainty issue

- Many concerns have been raised regarding how uncertainties could influence our findings and the risk level in a decision-making process.

- From a spatial perspective, (i) spatial resolution of CDR data is limited to the cell tower level; (ii) occurrence of a signal jump is another major issue, which occurs when a mobile device switches back and forth among a set of neighboring cell towers due to similar intensity of signal strength.

- From a temporal perspective, the locations of a subscriber between two phone communication events are uncertain. This can be mitigated with additional data such as active pinging.

- The uncertainty issue itself cannot be fully prevented, so the objective of this paper is to understand how uncertainties could result in imperfect knowledge and to recognize 'what cannot be known'.

 

3. Data

3.1. Area of study

- Eight of Shanghai's sixteen districts are known as the Puxi region, the downtown area of the city.

 

3.2. Dataset

- The dataset was provided by a major mobile network operator (MNO) as part of a joint research collaboration. It contains seven event codes. This MNO operates over 33,000 cell towers in Shanghai, and every tower has a unique ID.

 

3.3. Data processing

- The authors split the dataset into two: one containing only CDRs (i.e., IN and OT events) and the other containing the complete set of records.

- Depending on how actively one engages in phone communication activities, the ratio (number of CDRs/number of total records) varies significantly.
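As a minimal sketch of the ratio described above, assuming each subscriber's records are available as a list of event codes: only `IN` and `OT` come from the notes, while the function name `cdr_ratio` and the placeholder code `XX` (standing for any non-CDR event) are hypothetical.

```python
# The two call-related event codes named in the notes.
CDR_EVENTS = {"IN", "OT"}


def cdr_ratio(event_codes):
    """Return (number of CDRs) / (number of total records) for one subscriber."""
    if not event_codes:
        raise ValueError("subscriber has no records")
    n_cdr = sum(1 for code in event_codes if code in CDR_EVENTS)
    return n_cdr / len(event_codes)


# Hypothetical subscriber: two CDR events out of five records in total.
records = ["IN", "XX", "XX", "OT", "XX"]
print(cdr_ratio(records))  # 0.4
```

A subscriber who rarely calls has a ratio close to 0, which means most of their footprints are invisible in a CDR-only dataset.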

 

4. Individual human mobility

- This section focuses on evaluating the representativeness of CDRs in individual daily mobility pattern analysis. They used the following three indicators: (1) the total travel distance, (2) the radius of gyration, and (3) movement entropy.

- They divided the day into four six-hour periods, and only subscribers with at least one footprint in each of the four periods are included in this study.

- Also, they grouped subscribers into four classes by their CDR ratio.
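The inclusion rule above (at least one footprint in each of the four six-hour periods) can be sketched as follows. This is an illustration only: representing timestamps as hour-of-day numbers and the name `covers_all_periods` are assumptions, not details from the paper.

```python
def covers_all_periods(hours, n_periods=4):
    """Return True if every period of the day contains at least one footprint.

    hours: footprint timestamps as hour-of-day values in [0, 24).
    """
    period_length = 24 / n_periods  # 6 hours when n_periods is 4
    seen = set()
    for h in hours:
        if not 0 <= h < 24:
            raise ValueError(f"hour out of range: {h}")
        seen.add(int(h // period_length))
    return seen == set(range(n_periods))


# Footprints at 01:30, 07:00, 13:15 and 22:00 fall into all four periods.
print(covers_all_periods([1.5, 7, 13.25, 22]))  # True
# No footprint between 06:00 and 12:00, so this subscriber is excluded.
print(covers_all_periods([1, 2, 13, 22]))  # False
```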

 

4.1. Total travel distance

- Total travel distance is calculated as the sum of the Euclidean distances between consecutive footprints. In a scatter plot of $D_{CDR}$ against $D_{complete}$, a large deviation from the diagonal line indicates that CDRs tend to underestimate the total travel distance.

- The findings are as follows: (1) $D_{CDR}$ and $D_{complete}$ are highly positively correlated; (2) as the CDR ratio declines, points deviate more from the diagonal. This heteroscedasticity has two possible explanations: (a) the longer one's daily travel distance, the wider the range of estimated travel distances, or (b) the number of subscribers drops as the total travel distance increases.
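To make the underestimation concrete, here is a minimal sketch that sums the Euclidean distances between consecutive footprints. It assumes footprints are already projected to planar (x, y) coordinates; the function name and the toy trajectory are hypothetical.

```python
import math


def total_travel_distance(points):
    """Sum of Euclidean distances between consecutive footprints."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


# Hypothetical trajectory: all records vs. the subset that are CDRs.
complete = [(0, 0), (3, 4), (3, 0), (0, 0)]
cdr_only = [(0, 0), (3, 0), (0, 0)]  # the stop at (3, 4) left no CDR

print(total_travel_distance(complete))  # 12.0
print(total_travel_distance(cdr_only))  # 6.0
```

When the CDR footprints are a subsequence of the complete footprints, the triangle inequality guarantees that the CDR-based distance is never larger than the complete one, so any error is an underestimate.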

 

4.2. Radius of gyration 

- The radius of gyration is defined as the root mean squared distance between the set of locations visited up to time $t$ and their center of mass.

- $R_{CDR}$ can be larger than $R_{complete}$ if CDR footprints spread more widely than non-CDR records.
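The following sketch shows how $R_{CDR}$ can exceed $R_{complete}$. It assumes planar (x, y) coordinates and that every record counts once toward the center of mass; the paper's exact weighting may differ, and the toy data are hypothetical.

```python
import math


def radius_of_gyration(points):
    """Root mean squared distance of the footprints from their center of mass."""
    if not points:
        raise ValueError("no footprints")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# Hypothetical subscriber: many non-CDR records at (0, 0), one trip to (4, 0).
complete = [(0, 0), (0, 0), (0, 0), (4, 0)]
cdr_only = [(0, 0), (4, 0)]  # one call at each place

print(radius_of_gyration(complete))  # about 1.73
print(radius_of_gyration(cdr_only))  # 2.0
```

Here the non-CDR records are concentrated at one place and pull the center of mass toward it, so dropping them makes the remaining footprints look more dispersed.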

 

4.3. Movement entropy

- It measures the heterogeneity of visitation patterns. The value of movement entropy grows with a more heterogeneous visitation pattern.

- If a subscriber visits Location A one time and Location B four times, $E = -(0.2\log_{2}0.2 + 0.8\log_{2}0.8) \approx 0.72$

made by ChatGPT
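The worked example above can be reproduced with a short sketch of Shannon entropy over visit frequencies, $E = -\sum_i p_i \log_2 p_i$, where $p_i$ is the share of visits to location $i$. The function name and the location labels are hypothetical.

```python
import math
from collections import Counter


def movement_entropy(visits):
    """Shannon entropy (base 2) of the visit frequency distribution."""
    counts = Counter(visits)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Location A once, Location B four times, as in the example above.
visits = ["A", "B", "B", "B", "B"]
print(round(movement_entropy(visits), 2))  # 0.72
```

Two locations visited equally often would give the maximum value for two locations, 1 bit, while the skewed pattern here gives a lower value.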

 

5. Collective human mobility

5.1. Distance decay effect

- The existing studies reveal that human movements can be modeled by a Lévy flight, while the power law distribution of displacements is an indication of the distance decay effect.


Lévy flights (Lévy walks) are random walks comprised of clusters of multiple short steps with longer steps between them.

source: https://www.sciencedirect.com/topics/physics-and-astronomy/levy-flight

- CDR data slightly underestimate the distance decay effect in Shanghai.

 

5.2. Community detection

- Identifying communities in a network can help us understand the internal structure of a city that is shaped by human interactions as opposed to pre-defined administrative boundaries.

Community detection aims at partitioning a network into communities that consist of densely connected nodes.

 

 

 

 

 

gyration /ˌjīˈrāSH(ə)n/

heteroscedastic /ˌhɛtərəʊskɪˈdæstɪk/

 

 

 

 

 

- How did CDR data perform in terms of modeling individual mobility patterns?

     - What measures did they use to measure individual mobility?

     - What did they find?

- How did they test the distance decay effect?
