The experiment was published on two occasions. Groups of eight male college students participated in a simple 'perceptual' task. In reality, all but one of the participants were actors, and the true focus of the study was about how the remaining participant would react to the actors' behavior.
Satellite images provide a wealth of visual data that we can visualize in interesting ways. Land Lines is an experiment that lets you explore Google Earth satellite imagery through gesture. “Draw” to find satellite images that match your every line; “Drag” to create an infinite line of connected rivers, highways and coastlines.
Using a combination of machine learning, optimized algorithms, and graphics card power, the experiment is able to run efficiently on your phone’s web browser without a need for backend servers.
Learn more about how the project was created in this technical case study or browse the open-source code on GitHub.
We used a combination of OpenCV Structured Forests and ImageJ’s Ridge Detection to analyze and identify dominant visual lines in the initial dataset of 50,000+ images. This helped cull the original dataset down to just a few thousand of the most interesting images.
For the draw application, we stored the resulting line data in a vantage point tree. This data structure made it fast and easy to find matches from the dataset in real time right in your phone or desktop web browser.
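To illustrate why a vantage point tree makes matching fast, here is a minimal sketch in Python. It is not the project's code: the 2D tuples, the Euclidean metric, and the names `build_vp_tree` and `nearest` are assumptions chosen for illustration, and the real line data and distance function may differ. Each node stores one vantage point and a radius that splits the remaining points into a near half and a far half, so a query can skip whole subtrees using the triangle inequality.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Stand-in metric; any function obeying the triangle inequality works."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class VPNode:
    vantage: Point
    radius: float                 # median distance from vantage to the rest
    inside: Optional["VPNode"]    # points with distance <= radius
    outside: Optional["VPNode"]   # points with distance >= radius


def build_vp_tree(points, dist=euclidean, rng=None):
    """Recursively build the tree; returns None for an empty input."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    by_dist = sorted(((dist(vantage, p), p) for p in points), key=lambda t: t[0])
    mid = len(by_dist) // 2
    radius = by_dist[mid][0]
    inside = [p for _, p in by_dist[:mid]]
    outside = [p for _, p in by_dist[mid:]]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, dist, rng),
                  build_vp_tree(outside, dist, rng))


def nearest(root, query, dist=euclidean):
    """Return (closest point, distance), pruning subtrees that cannot win."""
    best_d, best_p = float("inf"), None

    def visit(node):
        nonlocal best_d, best_p
        if node is None:
            return
        d = dist(query, node.vantage)
        if d < best_d:
            best_d, best_p = d, node.vantage
        if d < node.radius:
            visit(node.inside)
            # The far half can only help if the search ball crosses the radius.
            if d + best_d >= node.radius:
                visit(node.outside)
        else:
            visit(node.outside)
            if d - best_d <= node.radius:
                visit(node.inside)

    visit(root)
    return best_p, best_d


tree = build_vp_tree([(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 2.0)])
match, distance = nearest(tree, (4.5, 4.0))
```

On balanced data a lookup typically touches only a small fraction of the nodes, which is the property that makes in-browser, real-time matching plausible.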
We used Pixi.js, an open source library built upon the WebGL API, to rapidly draw and redraw 2D WebGL graphics without hindering performance.
All images are hosted on Google Cloud Storage so images are served quickly to users worldwide.
Made by Zach Lieberman, Matt Felsen, and the Data Arts Team. Special thanks to Local Projects.
This is part 3 of a series of posts on user tracking on the modern web. You can also read part 1 and part 2.
Whenever you visit a web page, your browser sends a 'User Agent' header to the website saying precisely which operating system and web browser you are using. This information could help distinguish Internet users from one another because these versions differ, often considerably, from person to person. We recently ran an experiment to see to what extent this information could be used to track people (for instance, if someone deletes their browser cookies, would the User Agent, alone or in combination with some other detail, be unique enough to let a site recognize them and re-create their old cookie?).
Our experiment to date has shown that the browser User Agent string usually carries 5-15 bits of identifying information (about 10.5 bits on average). That means that on average, only one person in about 1,500 (2^10.5) will have the same User Agent as you. On its own, that isn't enough to recreate cookies and track people perfectly, but in combination with another detail like geolocation to a particular ZIP code or having an uncommon browser plugin installed, the User Agent string becomes a real privacy problem.
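The arithmetic behind these figures can be sketched in a few lines of Python. The function names are illustrative, and the 8-bit figure for a second detail is a made-up placeholder, not a measurement. Adding bits together also assumes the two details are statistically independent, which real browser attributes often are not.

```python
import math


def crowd_size(bits: float) -> float:
    """Number of people who share a value that carries `bits` of information."""
    return 2 ** bits


def bits_for_crowd(people: float) -> float:
    """Bits needed to single one person out of a crowd of `people`."""
    return math.log2(people)


def combined_bits(*bits: float) -> float:
    """Bits from independent details add (an assumption; correlated details add less)."""
    return sum(bits)


print(crowd_size(10.5))                 # about 1448
hypothetical_second_detail = 8.0        # placeholder value, not from the experiment
total = combined_bits(10.5, hypothetical_second_detail)
print(total, crowd_size(total))
```

Each extra bit halves the crowd you can hide in, which is why a User Agent that is harmless on its own becomes a problem once it is combined with other details.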
When we analyze the privacy of web users, we usually focus on user accounts, cookies, and IP addresses, because those are the usual means by which a request to a web server can be associated with other requests and/or linked back to an individual human being, computer, or local network.
Typical advice for improving your privacy as you surf the web might include blocking or deleting cookies (and supercookies), and using proxy servers or tools like Tor to hide your IP address.
It's not intuitively obvious that a User Agent poses a similar risk to a unique tracking cookie. After all, cookies were designed, in part, to help web sites distinguish and recognize individual browsers, and User Agents weren't. And there could be millions of people out there who use the same browser and operating system that you do. But let's examine the matter more closely. A typical User Agent string looks something like this:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)
In fact, that was the most common user agent string among browsers visiting the EFF website during the test period: Firefox 3.5.3 running on Windows XP. Notice that the operating system and browser versions are extremely specific and that the User Agent also includes the user's preferred language. There are a lot of things that can vary inside that string, and those variations can be used to distinguish and track people as they browse the Web.
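To make the "things that can vary" concrete, the following Python sketch splits a User Agent string into its individual fields. The splitter is a deliberately simple regular expression, not a standards-compliant parser, and the function name `split_user_agent` is hypothetical. The sample string is the Firefox 3.5.3 on Windows XP entry from the results table.

```python
import re


def split_user_agent(ua: str) -> list:
    """Split a User Agent into product tokens and parenthesised comment fields."""
    fields = []
    for match in re.finditer(r"\(([^)]*)\)|(\S+)", ua):
        if match.group(1) is not None:
            # A comment such as "(Windows; U; ...)" holds several ';'-separated fields.
            fields.extend(part.strip() for part in match.group(1).split(";"))
        else:
            fields.append(match.group(2))
    return fields


ua = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) "
      "Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)")
for field in split_user_agent(ua):
    print(field)
```

Every printed field (operating system version, language, rendering engine build, browser version, installed .NET runtime) is a separate place where two otherwise similar browsers can differ.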
We ran an experiment to measure precisely how identifying the User Agent strings would have been among a 36-hour anonymized sample of requests to the EFF website. The following table shows different classes of browser, with the number of bits for best and average case User Agents within that class:
Browser class | Avg. identifying information | Minimum identifying information | Least identifying user agent |
---|---|---|---|
Modern Windows Desktops | 10.3-11.3 bits | 4.6-5.0 bits | Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729) |
Internet Explorer | 13.2-13.5 bits | 6.3-7.2 bits | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) |
Firefox | 8.6-9.4 bits | 4.6-5.0 bits | Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729) |
Chrome | 7.5-8.5 bits | 5.7-6.2 bits | Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/532.0 (KHTML, like Gecko) Chrome/3.0.195.27 Safari/532.0 |
Linux | 11.8-13.15 bits | 6.6-7.9 bits | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.14) Gecko/2009090216 Ubuntu/9.04 (jaunty) Firefox/3.0.14 |
Ubuntu | 9.6-11.7 bits | 6.6-7.8 bits | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.14) Gecko/2009090216 Ubuntu/9.04 (jaunty) Firefox/3.0.14 |
Debian | 13.5-15.3 bits | 10.50-11.7 bits | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.14) Gecko/2009091010 Iceweasel/3.0.6 (Debian-3.0.6-3) |
Macintosh | 8.8-9.3 bits | 5.8-5.8 bits | Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 |
iPhone | 10.8-11.3 bits | 8.7-9.3 bits | Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_1 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7C144 Safari/528.16 |
Blackberry | 14.7-15.5 bits | 12.0-12.7 bits | BlackBerry9530/4.7.0.148 Profile/MIDP-2.0 Configuration/CLDC-1.1 VendorID/105 |
Android | 14.4-14.4 bits | 12.2-12.4 bits | Mozilla/5.0 (Linux; U; Android 1.6; en-us; T-Mobile G1 Build/DRC83) AppleWebKit/528.5+ (KHTML, like Gecko) Version/3.1.2 Mobile Safari/525.20.1 |
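Figures like those in the table can be derived from request counts. The sketch below assumes that the identifying information of a User Agent is its self-information, -log2 of the share of requests carrying that string, which matches the "crowd size" description but is not confirmed as the exact method used. The sample data are toy labels, not values from the EFF dataset.

```python
import math
from collections import Counter


def surprisal_bits(user_agents):
    """Map each distinct User Agent to -log2(share of requests that sent it)."""
    counts = Counter(user_agents)
    total = sum(counts.values())
    return {ua: -math.log2(n / total) for ua, n in counts.items()}


def average_bits(user_agents):
    """Average identifying information per request in the sample."""
    bits = surprisal_bits(user_agents)
    return sum(bits[ua] for ua in user_agents) / len(user_agents)


# Toy sample: 8 requests, 4 distinct (made-up) User Agent labels.
sample = ["ua-A"] * 4 + ["ua-B"] * 2 + ["ua-C", "ua-D"]
per_ua = surprisal_bits(sample)
print(per_ua["ua-A"])        # 1.0 bit: shared by half of the sample
print(per_ua["ua-C"])        # 3.0 bits: unique among 8 requests
print(average_bits(sample))  # 1.75
```

The most common string carries the fewest bits, which is why the table's "least identifying" column lists the most popular User Agent in each class.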
There are several remarkable facts about this dataset. Overall, it's amazing how identifying User Agent strings are. 10.5 bits is about one-third of the total information required to identify an Internet user.
It's also surprising that platforms like Firefox and Ubuntu, which have lower market penetration, are on average comparable or even less identifying than Windows and Microsoft Internet Explorer, which have very large userbases and should therefore have larger crowds to hide in. Part of this may be that visitors to the EFF website are over-representative of the former groups, but it's also clear that a large part of this is that Internet Explorer has a very high level of variation in its User Agent strings, with typical examples looking something like this:
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30618)
All of the different library and component versions there essentially function as partial tracking tokens.
We've launched a project called Panopticlick to collect a new dataset that extends this analysis from User Agents to the full browser plugin and configuration space. You can use Panopticlick to receive a uniqueness measurement for your own browser, and help EFF's privacy research efforts at the same time!
During September 2009, we took a 36-hour sample of anonymized requests to the eff.org web server by hashing the IP address of each request with a random salt, and throwing away the salt. We then calculated the amount of identifying information conveyed by each browser. Identifying information is measured in 'bits of entropy', which indicate how large a crowd you could be picked out of. Browsers usually convey between 5 and 15 bits of identifying information, about 10.5 bits on average. 10 bits of identifying information would allow you to be picked out of a crowd of 2^10, or 1,024 people. 10.5 bits can pick a person out of a crowd of about 1,448.
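The anonymization step can be sketched as follows. The source only says that IP addresses were hashed with a random salt that was then thrown away. The use of HMAC-SHA256, the 32-byte salt, and the name `make_anonymizer` are assumptions made for this illustration.

```python
import hashlib
import hmac
import secrets


def make_anonymizer():
    """Return a function that maps an IP address to a salted hash.

    The salt exists only inside this closure. Once the returned function is
    discarded, the salt is gone and the hashes cannot be recomputed from IPs.
    """
    salt = secrets.token_bytes(32)  # random, never written to disk

    def anonymize(ip: str) -> str:
        return hmac.new(salt, ip.encode("ascii"), hashlib.sha256).hexdigest()

    return anonymize


anonymize = make_anonymizer()
first = anonymize("192.0.2.1")    # documentation-range address, not real data
again = anonymize("192.0.2.1")
other = anonymize("192.0.2.2")
print(first == again, first == other)  # True False
del anonymize                          # "throwing away the salt"
```

Within one sample, requests from the same address still group together, but without the salt nobody can later test whether a given IP address appears in the data.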
Because we did not use cookies or any other mechanism to distinguish between repeat and new visitors, each measurement of bits of identifying information lies between an upper and lower bound.1