TrackSign: Guided Web Tracking Discovery

2021
Current web tracking practices pose a constant threat to the privacy of Internet users. As a result, the research community has recently proposed different tools to combat well-known tracking methods. However, the early detection of new, previously unseen tracking systems is still an open research problem. In this paper, we present TrackSign, a novel approach to discover new web tracking methods. The main idea behind TrackSign is to use code fingerprinting to identify common pieces of code shared across multiple domains. To detect tracking fingerprints, TrackSign builds a novel 3-mode network graph that captures the relationships between fingerprints, resources and domains. We evaluated TrackSign on the 100K most popular Internet domains, covering almost 1M web resources from more than 5M HTTP requests. Our results show that our method can detect new web tracking resources with high precision (over 92%). TrackSign was able to detect 30K new trackers, more than 10K new tracking resources and 270K new tracking URLs not yet detected by the most popular blacklists. Finally, we also validate the effectiveness of TrackSign with more than 20 years of historical data from the Internet Archive.
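
The idea of linking fingerprints, resources and domains can be illustrated with a minimal sketch. It is not the TrackSign implementation: the abstract does not specify the fingerprinting technique, so a SHA-256 hash of whitespace-normalized code stands in for it here, and a "domain" is assumed to be the site on which a resource was observed. The function names (`fingerprint`, `build_graph`, `shared_fingerprints`) and the example URLs are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(body: str) -> str:
    """Hash whitespace-normalized code (an assumed stand-in for a real code fingerprint)."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_graph(observations):
    """Build the two edge sets of a fingerprint/resource/domain graph.

    observations: iterable of (page_domain, resource_url, resource_body).
    Returns (fingerprint -> resource URLs, resource URL -> domains).
    """
    fp_to_resources = defaultdict(set)
    resource_to_domains = defaultdict(set)
    for page_domain, resource_url, body in observations:
        fp_to_resources[fingerprint(body)].add(resource_url)
        resource_to_domains[resource_url].add(page_domain)
    return fp_to_resources, resource_to_domains


def shared_fingerprints(fp_to_resources, resource_to_domains, min_domains=2):
    """Return fingerprints reachable from at least min_domains distinct domains."""
    shared = {}
    for fp, resources in fp_to_resources.items():
        domains = set()
        for resource in resources:
            domains |= resource_to_domains[resource]
        if len(domains) >= min_domains:
            shared[fp] = domains
    return shared


# Hypothetical crawl data: the same code served from two different URLs.
observations = [
    ("a.example", "https://cdn.example/t.js", "var x = 1;"),
    ("b.example", "https://other.example/copy.js", "var  x =\n1;"),
    ("c.example", "https://c.example/app.js", "console.log('hi');"),
]
fp_to_resources, resource_to_domains = build_graph(observations)
shared = shared_fingerprints(fp_to_resources, resource_to_domains)
```

In this toy data the first two resources share one fingerprint despite having different URLs, so that fingerprint is linked to two domains and becomes a candidate for further inspection. Code that appears on many domains is not necessarily tracking code, so a real system would need an additional step to label fingerprints.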