Abstract:
Frequent transportation planning has become increasingly necessary as people have
grown more mobile, making urban traffic patterns more complex. The primary source of
data for analyzing such travel behavior is the manual survey. These surveys are
expensive and time consuming, and are often outdated by the time they are completed
and ready for analysis. To overcome these issues, Mobile Network Big Data (MNBD),
which comprises large data sets, can be used in place of such traditional data
collection processes. Call Detail Records (CDRs), a subset of MNBD, are readily
available because most telecommunication service providers maintain them. Analyzing
CDRs therefore offers an efficient means of identifying human behavior and location.
Many studies have used CDRs to identify travel patterns in order to understand human
mobility behavior. However, a relatively high percentage of sparse data and phenomena
such as the Load Sharing Effect (LSE) make it difficult to identify the precise
location of a user from CDR data. Existing approaches for identifying precise user
location patterns have certain constraints. Past studies utilizing CDRs have used only
basic approaches to recognizing load sharing effects and have given minimal
consideration to the transmission power of the respective cell towers when localizing
users. Furthermore, these studies have neglected the differences in mobility behavior
among different segments of users and have treated the entire community of users as a
single cluster.
In this research, a novel methodology for locating users from CDRs is introduced to
overcome these limitations. It divides the users into distinct clusters when
identifying the model parameters and improves the identification of load sharing
effects by taking transmission power into consideration. Further, this study
contributes to the transport sector by identifying secondary activities from CDR data,
rather than being limited to primary activity recognition. This research uses
approximately 4 billion CDR data points, voluntarily collected mobile data, and
manually collected travel survey data to develop techniques that overcome the existing
limitations and to validate the results.
The proposed dynamic filtering algorithm for identifying load-shared records showed a
significant improvement in accuracy over previous predefined speed-based filtering
methods. Further, we found that the Input-Output Hidden Markov Model (IO-HMM)
outperforms the standard Hidden Markov Model (HMM) in activity recognition.