Cross-device tracking with machine learning

Volkova, Elena

Master thesis

Year

2017

Abstract

Personalizing the user experience on the web is important for news and product recommendations, online advertising and other domains. Users increasingly expect personalized experiences, and personalization technologies allow publishers to create more appealing digital products that they can charge for or otherwise monetize. On the web, users are generally anonymous, and the majority of traffic comes from users who have not logged in or otherwise authenticated themselves. One of the big challenges for personalization is therefore that the same user will often access a website from several different devices (e.g., PCs, mobile phones, or tablets), and the website cannot easily tell whether the requests come from the same user. The problem of identifying users across multiple devices is known as cross-device tracking, and it has not yet been extensively researched. In this project we carried out a number of experiments with cross-device tracking techniques based on applying machine learning to real-world traffic data. We extracted labelled datasets from traffic logs, applied both supervised classifiers and unsupervised clustering techniques to the data with a focus on minimizing the number of false connections, and evaluated the resulting models using the standard binary precision and recall metrics. Some of the models performed well enough for practical applications, although important issues remain unsolved, such as how to ensure that a model still performs well when applied to data from a new website.
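
The evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a "connection" is an unordered pair of device identifiers predicted to belong to the same user, and that a labelled set of true pairs is available. The names `pair`, `precision_recall` and the device identifiers are hypothetical and are not taken from the thesis.

```python
from typing import FrozenSet, Set, Tuple

# Hypothetical representation: a connection is an unordered pair of device IDs.
Pair = FrozenSet[str]


def pair(a: str, b: str) -> Pair:
    """Build an unordered device pair, so (a, b) equals (b, a)."""
    return frozenset((a, b))


def precision_recall(predicted: Set[Pair], actual: Set[Pair]) -> Tuple[float, float]:
    """Binary precision and recall over predicted device connections.

    Precision falls with every false connection (a predicted pair that is
    not a true pair); recall falls with every true pair that is missed.
    Both are reported as 0.0 when their denominator is empty.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Illustrative, made-up data: two true pairs, two predicted pairs, one overlap.
actual = {pair("phone_1", "laptop_1"), pair("tablet_2", "laptop_2")}
predicted = {pair("phone_1", "laptop_1"), pair("phone_1", "laptop_2")}
precision, recall = precision_recall(predicted, actual)
```

Under this reading, a model tuned to minimize false connections would favor high precision, possibly at the cost of lower recall.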