4.7 Article

Session stitching using sequence fingerprinting for web page visits

Journal

DECISION SUPPORT SYSTEMS
Volume 150

Publisher

ELSEVIER
DOI: 10.1016/j.dss.2021.113579

Keywords

Session stitching; Web analytics; Sequence mining; Session fingerprinting

Funding

  1. Edinburgh Parallel Computing Centre (EPCC)
  2. DataLab [7868323]

The way people navigate the web has changed significantly with the use of multiple devices and shared devices. Analyzing a large volume of seemingly disjoint data can support decision-making through machine learning. This study introduces an alternative approach based on learning behavioral patterns from web page visit fingerprints to identify and stitch web sessions efficiently.
The nature of people's web navigation has significantly changed in recent years. The advent of smartphones and other handheld devices has given rise to web users consulting websites on more than one device, or using a shared device. As a result, large volumes of seemingly disjoint data are available, which, when analysed together, can support decision-making. The task of identifying web sessions by linking such data back to a specific person, however, is hard. Session stitching aims to overcome this by using machine learning inference to identify similar or identical users. Many such efforts use various demographic data or device-based features to train matching algorithms. However, these variables are often not available for every dataset or are recorded differently, making a streamlined setup difficult. Besides, they often result in vast feature spaces which are hard to use for actionable interpretation. In this paper, we present an alternative approach based on the fingerprinting of web pages visited by users in a single session. By learning behavioural patterns from these sequences of page visits, we obtain features that can be used for matching without requiring sensitive user-agent data such as IP address, geolocation, or device details, as is common with other approaches. Using these sequential fingerprints does not rely on pre-defined features, but only requires the recording of web page visits, making our approach actionable. The approach is empirically tested on real-life web logs and compared with matching using regular user-agent features and state-of-the-art embedding techniques. Results in an ecommerce context show that sequential features can still obtain strong performance with fewer features, facilitating decision-making on session stitching and informing subsequent related activities such as marketing or customer analysis.
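To make the idea of matching sessions on page-visit sequences concrete, the following is a minimal illustrative sketch, not the method used in the paper. It assumes a fingerprint can be approximated by counts of consecutive page pairs (bigrams) and that sessions are compared by cosine similarity. The function names (`fingerprint`, `cosine`, `best_match`) and the toy sessions are hypothetical.

```python
from collections import Counter
from math import sqrt


def fingerprint(pages, n=2):
    """Count contiguous n-grams of page identifiers in one session.

    Assumption: an n-gram count vector stands in for a learned
    sequential fingerprint; the paper's actual features may differ.
    """
    return Counter(tuple(pages[i:i + n]) for i in range(len(pages) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_match(query, candidates, n=2):
    """Return the id of the candidate session most similar to the query."""
    fp = fingerprint(query, n)
    return max(candidates, key=lambda sid: cosine(fp, fingerprint(candidates[sid], n)))


# Hypothetical toy data: page identifiers only, no IP or device details.
sessions = {
    "s1": ["home", "shoes", "cart", "checkout"],
    "s2": ["home", "blog", "about"],
}
new_session = ["home", "shoes", "cart"]
matched = best_match(new_session, sessions)
```

In this sketch only the ordered list of visited pages is needed as input, which mirrors the abstract's point that no user-agent data is required. A real system would likely learn richer sequence features and apply a trained matcher rather than a fixed similarity rule.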
