Swedes Online: You Are More Tracked Than You Think

Joel Purra's master's thesis


Thesis defense presentation

Video recording from the master's thesis defense 2015-02-19 at Linköping University. The length is 48:16 minutes. Also see the presentation slides, IEEE LCN 2016 paper, and full thesis.

Video download: high resolution (258 MB), medium resolution (123 MB), low resolution (42 MB). All are in MP4 (H.264/AAC) format.

Paper presented at the 41st IEEE Conference on Local Computer Networks (LCN)

The paper Third-party Tracking on the Web: A Swedish Perspective (pdf, slides), based on the master's thesis, has been accepted to IEEE LCN 2016. It was co-written and presented (2016-11-08, Dubai, United Arab Emirates) by my thesis supervisor Niklas Carlsson, associate professor at LiU.

Abstract

Today, third-party tracking services and passive traffic monitoring are extensively used to gather knowledge about users’ internet activities and interests. Such tracking has significant privacy implications for the end users. This paper presents an overview of third-party tracking usage. Using measurements, we highlight the current state of the third-party tracking landscape and differences observed across tracking service classes (e.g., advertising, analytics, and content), across domain categories (e.g., popular vs. less popular, and national vs. global domains), and with regard to the organizations that own many of the tracker services, when using HTTP and HTTPS, respectively. Understanding these differences helps answer questions related to the third-party services that track modern web users and their coverage of our browsing.

IEEE LCN 2016 full paper (pdf, 7 pages, doi 10.1109/LCN.2016.14, pgp signature, sha512, sha256, sha1, md5)

Reference/citation

J. Purra, N. Carlsson, Third-party Tracking on the Web: A Swedish Perspective, Proc. IEEE Conference on Local Computer Networks (LCN), Dubai, UAE, Nov. 2016. https://joelpurra.com/projects/masters-thesis/

Master's thesis presented at Linköping University

The thesis Swedes Online: You Are More Tracked Than You Think. (pdf, slides) was presented (2015-02-19, Linköping, Sweden) in front of the thesis supervisor, the thesis opponent, and an audience of interested students.

Abstract

When you are browsing websites, third-party resources record your online habits; such tracking can be considered an invasion of privacy. It was previously unknown how many third-party resources, trackers and tracker companies are present in the different classes of websites chosen: globally popular websites, random samples of .se/.dk/.com/.net domains and curated lists of websites of public interest in Sweden. The in-browser HTTP/HTTPS traffic was recorded while downloading over 150,000 websites, allowing comparison of HTTPS adaption and third-party tracking within and across the different classes of websites.

The data shows that known third-party resources including known trackers are present on over 90% of most classes, that third-party hosted content such as video, scripts and fonts make up a large portion of the known trackers seen on a typical website and that tracking is just as prevalent on secure as insecure sites.

Observations include that Google is the most widespread tracker organization by far, that content is being served by known trackers may suggest that trackers are moving to providing services to the end user to avoid being blocked by privacy tools and ad blockers, and that the small difference in tracking between using HTTP and HTTPS connections may suggest that users are given a false sense of privacy when using HTTPS.

Thesis full report v1.0.0 (pdf, 144 pages, urn:nbn:se:liu:diva-117075, pgp signature, sha512, sha256, sha1, md5, text-only html)
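The checksum files listed with the report can be used to check that a downloaded copy is intact. The following is a minimal sketch, assuming a checksum file holds a hexadecimal digest as its first whitespace-separated token; the function names and the file names in the usage comment are hypothetical placeholders, not the actual download names.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(path, checksum_path, algorithm="sha256"):
    """Compare a file's digest with the first token of a checksum file.

    Assumption: the checksum file starts with the hex digest, optionally
    followed by a file name, as produced by tools such as sha256sum.
    """
    with open(checksum_path, "r", encoding="utf-8") as handle:
        expected = handle.read().split()[0].lower()
    return file_digest(path, algorithm) == expected


# Hypothetical usage; replace the placeholder names with the downloaded files:
# matches_checksum_file("thesis.pdf", "thesis.pdf.sha256", "sha256")
```

A matching checksum only shows that the download is not corrupted; the PGP signature is the stronger check of who published the file.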

Reference/citation

Joel Purra. 2015. Swedes Online: You Are More Tracked Than You Think. Master's thesis. Linköping University (LiU), Linköping, Sweden. https://joelpurra.com/projects/masters-thesis/

More reference/citation formats

Swedish standard (SS038207)

Purra, Joel, Swedes Online: You Are More Tracked Than You Think. - 2015

Vancouver

Purra J. Swedes Online: You Are More Tracked Than You Think. 2015.

ACM Reference Format

Joel Purra. 2015. Swedes Online: You Are More Tracked Than You Think. Master's thesis. Linköping University (LiU), Linköping, Sweden. https://joelpurra.com/projects/masters-thesis/

Bibtex

@mastersthesis{Purra:2015:DIVA2:807623,
Abstract = {When you are browsing websites, third-party resources record your online habits; such tracking can be considered an invasion of privacy. It was previously unknown how many third-party resources, trackers and tracker companies are present in the different classes of websites chosen: globally popular websites, random samples of .se/.dk/.com/.net domains and curated lists of websites of public interest in Sweden. The in-browser HTTP/HTTPS traffic was recorded while downloading over 150,000 websites, allowing comparison of HTTPS adaption and third-party tracking within and across the different classes of websites.

The data shows that known third-party resources including known trackers are present on over 90% of most classes, that third-party hosted content such as video, scripts and fonts make up a large portion of the known trackers seen on a typical website and that tracking is just as prevalent on secure as insecure sites.

Observations include that Google is the most widespread tracker organization by far, that content is being served by known trackers may suggest that trackers are moving to providing services to the end user to avoid being blocked by privacy tools and ad blockers, and that the small difference in tracking between using HTTP and HTTPS connections may suggest that users are given a false sense of privacy when using HTTPS.},
Author = {Purra, Joel},
Diva = {diva2:807623},
Institution = {Department of Computer and Information Science},
Isrn = {LIU-IDA/LITH-EX-A--15/007--SE},
Keywords = {measurement, security, privacy, internet, web, http, https, tracking, domains, web browser, website retrieval, redirects, privacy enhancing technologies, online advertising, web analytics, content blocking, google, facebook, twitter, adblock, disconnect.me, phantomjs, http archive, alexa, sweden},
Month = {February},
Note = {Source code, datasets, and a video recording of the presentation are available on the master's thesis website.},
Oai = {oai:DiVA.org:liu-117075},
School = {Link{\"o}ping University},
Title = {Swedes Online: You Are More Tracked Than You Think},
Url = {https://joelpurra.com/projects/masters-thesis/},
Urn = {urn:nbn:se:liu:diva-117075},
Year = {2015}
}

RIS

TY  - MAST
AB  - When you are browsing websites, third-party resources record your online habits; such tracking can be considered an invasion of privacy. It was previously unknown how many third-party resources, trackers and tracker companies are present in the different classes of websites chosen: globally popular websites, random samples of .se/.dk/.com/.net domains and curated lists of websites of public interest in Sweden. The in-browser HTTP/HTTPS traffic was recorded while downloading over 150,000 websites, allowing comparison of HTTPS adaption and third-party tracking within and across the different classes of websites.

The data shows that known third-party resources including known trackers are present on over 90% of most classes, that third-party hosted content such as video, scripts and fonts make up a large portion of the known trackers seen on a typical website and that tracking is just as prevalent on secure as insecure sites.

Observations include that Google is the most widespread tracker organization by far, that content is being served by known trackers may suggest that trackers are moving to providing services to the end user to avoid being blocked by privacy tools and ad blockers, and that the small difference in tracking between using HTTP and HTTPS connections may suggest that users are given a false sense of privacy when using HTTPS.
AU  - Purra, Joel
KW  - measurement
KW  - security
KW  - privacy
KW  - internet
KW  - web
KW  - http
KW  - https
KW  - tracking
KW  - domains
KW  - web browser
KW  - website retrieval
KW  - redirects
KW  - privacy enhancing technologies
KW  - online advertising
KW  - web analytics
KW  - content blocking
KW  - google
KW  - facebook
KW  - twitter
KW  - adblock
KW  - disconnect.me
KW  - phantomjs
KW  - http archive
KW  - alexa
KW  - sweden
Y2  - February
T1  - Swedes Online: You Are More Tracked Than You Think
UR  - https://joelpurra.com/projects/masters-thesis/
PY  - 2015
ER  - 

Other documents

Thesis planning and draft (pdf)

A document that combines planning with ongoing thesis writing. Read the abstract and background for an overview.

Thesis subject proposal (pdf)

A broadly formulated subject proposal, which allowed .SE and me to specify direction and scope after an initial survey.

Open source projects

In the spirit of free and open source software as well as open data, the source code for both documents and tools is released to the public. If you use the code in your research, please include a reference to the IEEE LCN 2016 paper (pdf) and a link to this page in your work.

The power is in your hands. Use the source, Luke!

Open data

To encourage a quick start as well as verification of my results, downloaded data based on publicly available lists of domains are made available. If you use the data in your research, please include a reference to the IEEE LCN 2016 paper (pdf) and a link to this page in your work.

Current datasets include those based on Alexa's top 1,000,000 list from 2014-09-01. Screenshots have not been made available at the moment, but HAR data has. Please use har-dulcify to analyze it.
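For a first look at a single HAR file before running the full toolchain, the sketch below counts requests per third-party hostname. It relies only on the standard HAR layout (`log.entries[].request.url`). The function names, the sample data, and the naive hostname suffix comparison are assumptions made for illustration, and this is not necessarily how har-dulcify classifies requests.

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def load_har(path):
    """Load a HAR file, which is plain JSON, from disk."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def third_party_hosts(har, first_party_host):
    """Count requests per hostname outside the first-party host and its subdomains.

    Assumption: a plain suffix comparison stands in for a proper
    registered-domain (public suffix) comparison.
    """
    counts = Counter()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        if host is None:
            continue  # skip entries without a hostname, such as data: URLs
        is_first_party = (
            host == first_party_host or host.endswith("." + first_party_host)
        )
        if not is_first_party:
            counts[host] += 1
    return counts


# Hand-written HAR fragment for illustration only; not taken from the datasets.
example_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://example.se/"}},
            {"request": {"url": "https://static.example.se/style.css"}},
            {"request": {"url": "https://tracker.example.net/pixel.gif"}},
            {"request": {"url": "https://tracker.example.net/script.js"}},
        ]
    }
}

print(third_party_hosts(example_har, "example.se"))
# prints Counter({'tracker.example.net': 2})
```

The suffix comparison treats every subdomain of the visited site as first-party and everything else as third-party, which is enough for a quick overview but too coarse for a real measurement.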

People involved