A groundbreaking study shows that your everyday browsing routine, namely the sites you visit most, can uniquely identify you, suggesting that anonymity online may be more illusion than reality.
In a recent study published in Scientific Reports, researchers examined whether individuals can be uniquely identified based solely on their web browsing behavior, particularly their most frequently visited websites.
Concerningly, in 95% of cases, knowing a user’s four most visited domains allowed researchers to identify them; on average, only 2.45 steps (roughly two or three top websites) were needed to isolate a user, and in 80% of cases, the user could be re-identified over time. However, re-identification rates depended on the fingerprint length, rising from about 60% for five domains to 80% for 10 and 90% for 15. Thus, patterns in browsing habits create unique and stable ‘behavioral fingerprints’ that threaten online privacy.
Background
In today’s digital world, people’s online behaviors have become valuable assets for companies that collect and monetize data through personalized advertising. By analyzing browsing patterns, businesses can predict and influence individual actions, yet the behavioral foundations of this profitability remain poorly understood.
Research shows that online behavior is highly predictable (about 85% predictable on average) because people tend to follow consistent browsing routines, much like the habitual behavior observed in shopping or mobility. While this predictability enhances user experience through tailored services, it also raises privacy and ethical concerns.
The ability to anticipate and manipulate behavior forms the basis of “surveillance capitalism,” where users’ actions are monitored and potentially shaped to serve commercial or political goals.
Uniqueness in behavior, whether in movement, purchases, or web use, can be a digital fingerprint, allowing individuals to be identified without traditional personal identifiers. Earlier studies demonstrated that only a few data points from phone records or credit card transactions could re-identify most users.
Similarly, prior online research has shown that factors like browser settings or browsing history can reveal user identity. However, few studies have examined how the repetitive, habitual nature of everyday web use might produce stable and identifiable behavioral patterns in real-world settings.
About the study
The study analyzed the web browsing activity of 2,148 German users over one month. Participants were recruited through a General Data Protection Regulation (GDPR)-compliant online panel, gave informed consent, and were compensated. The anonymized dataset contained over 9 million website visits across nearly 50,000 unique domains.
Each record included the website’s domain name, visit time, and duration of activity, with all personally identifiable information removed before analysis. Participants also provided demographic data such as age, gender, education, family status, and income; the sample was representative of German internet users under 65.
To identify unique browsing “fingerprints,” researchers represented each user by an n-tuple of their n most visited domains and calculated how many users had unique combinations. Statistical variability was assessed using the Jackknife method.
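As a rough illustration of the fingerprinting idea described above, the following Python sketch builds a top-n fingerprint per user and measures the share of users whose fingerprint is unique. It is not the authors' code: the function names, the toy data, the use of an ordered tuple, and the alphabetical tie-breaking are all assumptions made for this example. The jackknife shown is the standard leave-one-out version applied over users; how the study applied the method in detail is not stated here.

```python
import math
from collections import Counter


def top_n_fingerprint(visits, n):
    """Return the n most visited domains as an ordered tuple (assumed representation)."""
    counts = Counter(visits)
    # Sort by descending visit count; ties are broken alphabetically (an assumption)
    # so that the result is deterministic.
    ranked = sorted(counts, key=lambda d: (-counts[d], d))
    return tuple(ranked[:n])


def unique_fraction(user_visits, n):
    """Share of users whose top-n fingerprint is shared with no other user."""
    fingerprints = {u: top_n_fingerprint(v, n) for u, v in user_visits.items()}
    freq = Counter(fingerprints.values())
    unique = sum(1 for fp in fingerprints.values() if freq[fp] == 1)
    return unique / len(fingerprints)


def jackknife_se(user_visits, n):
    """Leave-one-user-out jackknife standard error of the unique fraction."""
    estimates = []
    for left_out in user_visits:
        subset = {u: v for u, v in user_visits.items() if u != left_out}
        estimates.append(unique_fraction(subset, n))
    m = len(estimates)
    mean = sum(estimates) / m
    return math.sqrt((m - 1) / m * sum((e - mean) ** 2 for e in estimates))


# Hypothetical toy data: each user maps to a list of visited domains.
user_visits = {
    "u1": ["a.com", "a.com", "b.com", "c.com"],
    "u2": ["a.com", "a.com", "b.com", "d.com"],
    "u3": ["b.com", "b.com", "a.com", "c.com"],
}

for n in (1, 2, 3):
    print(n, unique_fraction(user_visits, n), jackknife_se(user_visits, n))
```

In this toy example, longer fingerprints separate more users, which mirrors the direction of the effect reported in the study, though the numbers themselves carry no meaning.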
To determine how easily users could be identified, they simulated stepwise matching by progressively comparing domain overlaps until a single user remained, repeating this process 300 times per user.
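One plausible reading of this stepwise matching can be sketched in Python as follows. The details are assumptions for illustration only: domains of the target user are revealed one at a time in random order, candidates who did not visit a revealed domain are dropped, and the step count is recorded once only the target remains. The names `steps_to_isolate` and `mean_steps` and the toy data are hypothetical; only the 300 repetitions per user come from the description above.

```python
import random


def steps_to_isolate(target, user_domains, rng):
    """Count how many revealed domains are needed until only the target remains."""
    candidates = set(user_domains)
    # Sort before shuffling so that a given seed always yields the same order.
    order = sorted(user_domains[target])
    rng.shuffle(order)
    for step, domain in enumerate(order, start=1):
        # Keep only the candidates who also visited the revealed domain.
        candidates = {u for u in candidates if domain in user_domains[u]}
        if candidates == {target}:
            return step
    return None  # the target's domains never narrowed the pool to one user


def mean_steps(target, user_domains, trials=300, seed=0):
    """Average step count over repeated random orderings (300 by default)."""
    rng = random.Random(seed)
    results = [steps_to_isolate(target, user_domains, rng) for _ in range(trials)]
    solved = [s for s in results if s is not None]
    return sum(solved) / len(solved) if solved else None


# Hypothetical toy data: each user maps to the set of domains they visited.
user_domains = {
    "u1": {"a.com", "b.com", "c.com"},
    "u2": {"a.com", "b.com", "d.com"},
    "u3": {"a.com", "e.com"},
}

for user in user_domains:
    print(user, mean_steps(user, user_domains))
```

A rare domain such as `e.com` isolates its user in a single step, while a popular one such as `a.com` removes nobody, which matches the finding that less common domains drive distinctiveness.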
Re-identification analysis tested the stability of these fingerprints by dividing each user’s browsing data into two consecutive periods, each lasting from a few to several hours, and checking whether fingerprints from the first period matched those from the second. Success rates were calculated as the proportion of users consistently re-identified across time slices.
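The re-identification check can be illustrated with a short Python sketch. This is an assumed simplification rather than the study's procedure: a user counts as re-identified when their top-n fingerprint from the first period equals their own fingerprint from the second period and no other user has that fingerprint in the second period. The function names, the ordered-tuple representation, and the toy data are hypothetical.

```python
from collections import Counter


def fingerprint(visits, n):
    """Top-n domains as an ordered tuple; ties broken alphabetically (an assumption)."""
    counts = Counter(visits)
    ranked = sorted(counts, key=lambda d: (-counts[d], d))
    return tuple(ranked[:n])


def reidentification_rate(period1, period2, n):
    """Share of users whose period-1 fingerprint uniquely matches their period-2 one."""
    fp2 = {u: fingerprint(v, n) for u, v in period2.items()}
    fp2_freq = Counter(fp2.values())
    hits = 0
    for user, visits in period1.items():
        fp1 = fingerprint(visits, n)
        # Success requires a match with the same user and no other user in period 2.
        if fp2.get(user) == fp1 and fp2_freq[fp1] == 1:
            hits += 1
    return hits / len(period1)


# Hypothetical toy data for two consecutive time slices.
period1 = {
    "u1": ["a.com", "a.com", "b.com"],
    "u2": ["c.com", "c.com", "d.com"],
    "u3": ["a.com", "b.com", "b.com"],
}
period2 = {
    "u1": ["a.com", "a.com", "a.com", "b.com"],
    "u2": ["d.com", "d.com", "c.com"],
    "u3": ["b.com", "b.com", "a.com"],
}

print(reidentification_rate(period1, period2, n=2))
```

Here `u2` is missed because the order of their two top domains flips between periods. Replacing the tuple with a `frozenset` would make matching insensitive to order, which is a design choice this sketch leaves open.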
Key findings
Researchers analyzed web tracking data from 2,148 German users, covering over 9 million website visits across nearly 50,000 domains, to determine how browsing habits create unique behavioral “fingerprints.”
The researchers found that individuals’ four most visited websites were enough to uniquely identify 95% of users, regardless of gender, age, education, or income. On average, only 2.45 steps (equivalent to identifying two or three top websites) were needed to pinpoint a user, showing that few data points can reveal identity.
The findings also demonstrated that user identifiability remains high even with limited data: information from just the top 100 most visited domains (0.2% of all domains) still identified 82% of users.
Behavioral uniqueness was driven largely by personal browsing differences, with popular domains reducing distinctiveness while less common ones enhanced it.
Moreover, these fingerprints were stable over time, with 80% of users successfully re-identified across adjacent time slices of data, showing high short-term consistency. Re-identification rates increased with longer browsing fingerprints and longer tracking durations, though gains diminished after about six hours of data collection.
Conclusions
Researchers successfully demonstrated that individuals’ web browsing habits act as distinctive and stable behavioral fingerprints, allowing them to be uniquely and repeatedly identified online.
Unlike earlier research on technical identifiers, this work highlights that ordinary browsing routines pose significant privacy risks. The findings show high identifiability and re-identifiability across short time spans, emphasizing that users’ consistent habits can compromise digital anonymity.
Despite widespread privacy precautions, such as cookie blocking or virtual private network (VPN) use, these risks persist because they stem from behavior, not technology. The study's strengths include robust evidence drawn from real-world, GDPR-compliant data and replication across multiple datasets.
However, it is limited by its regional scope, short-term analysis, and focus on simple domain-based fingerprints, and it makes no claims about the long-term stability of these behavioral fingerprints. Future studies should examine the long-term and cross-cultural stability of these patterns, integrate temporal or contextual factors, and develop practical privacy-preserving strategies to mitigate online identifiability.
Journal reference:
- Oliveira, M., Yang, J., Griffiths, D., Bonnay, D., Kulshrestha, J. (2025). Browsing behavior exposes identities on the Web. Scientific Reports 15, 36066. DOI: 10.1038/s41598-025-19950-3. https://www.nature.com/articles/s41598-025-19950-3