Today’s study group was led by Luke Mather, who gave us an
insight into three advanced methods of web tracking taken from the
paper “The Web Never Forgets: Persistent Tracking Mechanisms in the Wild” by
Acar et al. [1].
The session started with a reminder of what cookies are and how they are used on the web: a browser identifies itself to a server through small pieces of information (cookies) stored in the browser. We then went on to examine how, although users can change their browser’s tracking preferences to not use cookies, there are ways in which these preferences can be circumvented. The paper that was presented looks at three tracking mechanisms that bypass users’ tracking preferences because they are difficult to discover and hard to remove: canvas fingerprinting, evercookies and respawning, and cookie syncing. A description of each is given below.
1) Canvas Fingerprinting
This exploits the Canvas API that is available in modern
browsers. The same text or WebGL scene is rendered slightly differently on
different computers, depending on features such as the operating system, font
library and graphics card. Because the rendered output differs between
machines, it can be used to create a fingerprint of a machine, which can
then be used to track a user. A description of how this process works is given
below.
Stage 1: A user visits a page and the fingerprinting script
first draws text with the font and size of its choice and adds background colours.
Stage 2: The script calls the Canvas API’s toDataURL method to
get the canvas pixel data in dataURL format, a Base64-encoded representation of
the binary pixel data.
Stage 3: The script takes the hash of the text-encoded pixel
data, which serves as the fingerprint and may be combined with other
high-entropy browser properties such as the list of plugins, the list of fonts,
or the user agent string [1].
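Stages 1 and 2 run as JavaScript inside the browser, so they cannot be reproduced outside one. Stage 3, however, is just hashing, and a minimal Python sketch may help make it concrete. The function name `canvas_fingerprint`, the choice of SHA-256 and the way extra properties are mixed in are all assumptions made for illustration; the paper does not prescribe a particular hash or encoding.

```python
import hashlib
from typing import Dict, Optional


def canvas_fingerprint(data_url: str,
                       extra_properties: Optional[Dict[str, str]] = None) -> str:
    """Hash the data URL returned by toDataURL (stage 3).

    data_url is assumed to be the string a browser returned in stage 2.
    extra_properties is a hypothetical dict of other browser properties
    (e.g. user agent, plugin list) to combine with the canvas data.
    """
    hasher = hashlib.sha256()  # SHA-256 is an assumption, not from the paper
    hasher.update(data_url.encode("utf-8"))
    # Sort the keys so the same properties always give the same fingerprint.
    for key in sorted(extra_properties or {}):
        hasher.update(b"\x00")  # separator between fields
        hasher.update(f"{key}={extra_properties[key]}".encode("utf-8"))
    return hasher.hexdigest()
```

Two machines that render the canvas differently produce different data URLs and therefore different hashes, while the same machine produces the same hash on every visit without anything being stored in the browser.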
2) Evercookies and Respawning
This method stores copies of an identifier outside the browser’s normal cookie store, in Flash cookies, localStorage, sessionStorage and ETags, and uses them to “respawn” cookies that were previously removed from the browser. This allows the cookies to be reused, so users can still be tracked even though they believe their cookies have been removed.
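The respawning logic can be sketched with a toy model in Python. Here each storage location is simply a dictionary; the names `new_browser`, `set_evercookie`, `clear_http_cookies` and `respawn` are hypothetical, and treating ETags as a key-value store is a deliberate simplification (in reality the identifier is carried by the HTTP cache).

```python
from typing import Dict, Optional

# Hypothetical model: one dict per storage location mentioned above.
STORAGE_VECTORS = ("http_cookie", "flash_cookie", "local_storage",
                   "session_storage", "etag")

Browser = Dict[str, Dict[str, str]]


def new_browser() -> Browser:
    """Create an empty toy browser with every storage location."""
    return {vector: {} for vector in STORAGE_VECTORS}


def set_evercookie(browser: Browser, name: str, value: str) -> None:
    """Write the identifier to every storage location."""
    for store in browser.values():
        store[name] = value


def clear_http_cookies(browser: Browser) -> None:
    """Model a user clearing only the ordinary cookie store."""
    browser["http_cookie"].clear()


def respawn(browser: Browser, name: str) -> Optional[str]:
    """Restore the identifier everywhere if any copy survived."""
    surviving = next(
        (store[name] for store in browser.values() if name in store), None
    )
    if surviving is not None:
        set_evercookie(browser, name, surviving)
    return surviving
```

In this model, clearing the ordinary cookies is not enough: as long as one copy survives anywhere, the next call to `respawn` writes the identifier back to every location.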
3) Cookie Syncing
This is the practice of tracker domains passing pseudonymous
IDs associated with a given user (usually stored as cookies) between each
other. Domain A, for instance, could pass an ID to domain B by making a request
to a URL hosted by domain B which contains the ID as a parameter string.
According to Google’s developer guide to cookie syncing, it provides a means
for domains to share cookie values, given the restriction that sites can’t read
each other’s cookies, in order to better facilitate targeting and real-time
bidding [1]. Because third parties can share information about users in this
way, users can be tracked beyond what their preferences may state.
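As a rough illustration of the exchange between domain A and domain B, the sketch below builds the sync URL on A’s side and parses it on B’s side. The domains `tracker-a.example` and `tracker-b.example`, the query parameter names `partner` and `partner_uid`, and both function names are hypothetical; real trackers use their own endpoints and parameter names.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qs, urlencode, urlparse

# Maps (partner domain, partner's ID for the user) -> our own ID for the user.
MatchTable = Dict[Tuple[str, str], str]


def build_sync_url(partner_endpoint: str, own_domain: str,
                   own_user_id: str) -> str:
    """Domain A: embed its ID for the user in a URL hosted by domain B."""
    query = urlencode({"partner": own_domain, "partner_uid": own_user_id})
    return f"{partner_endpoint}?{query}"


def handle_sync_request(url: str, own_user_id: str,
                        match_table: MatchTable) -> MatchTable:
    """Domain B: link A's ID (from the URL) to B's own ID for the user.

    own_user_id stands for the value of B's own cookie, which the browser
    sends along with the request to B.
    """
    params = parse_qs(urlparse(url).query)
    partner = params["partner"][0]
    partner_uid = params["partner_uid"][0]
    match_table[(partner, partner_uid)] = own_user_id
    return match_table
```

Once domain B holds such a table, the two domains can refer to the same user without either one reading the other’s cookies.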
Discussion
After a few questions regarding the details of these three
methods, our discussion at the end of the talk focused on how these methods
are actually used “in the wild”. For instance, a study in the paper found
canvas fingerprinting on 5.5% of the Alexa top 100,000 sites.
[1] Gunes Acar, Christian Eubank, Steven Englehardt, Marc Juárez, Arvind Narayanan, Claudia Díaz: The Web Never Forgets: Persistent Tracking Mechanisms in the Wild. ACM Conference on Computer and Communications Security 2014: 674-689