Data Leakage

Online privacy is eroding because of a growing disconnect between the leakage of private data to, and by, aggregators and the methods available for protection. Krishnamurthy, Naryshkin & Wills (2011) define “leakage” as any distribution of a user’s private data to a third-party site. First-party sites distribute information to third parties to obtain analytics tailored to the user’s interests.

To measure leakage, Krishnamurthy, Naryshkin & Wills used Alexa.com to draw up a list of websites that allow user registration, and signed up for accounts on them. They identified the ten most popular websites from seventeen categories. Web-based social networks were treated as a special category because they contain sensitive information. Leakage was observed upon confirmation of an account. While the researchers were logged in and navigating a website, private information was sent to a third-party server that was not identified as such: it appeared to belong to the first-party domain but was actually a third-party address. The results showed that 56% of the sampled websites directly leaked private information to at least one third-party domain. Additionally, 48% of the websites leaked a user ID to third parties. This occurred mostly through social networks. Combined, 75% of the websites directly leaked the user ID or private information to at least one third party.
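The kind of check described above can be illustrated with a minimal sketch. It is not the procedure used by Krishnamurthy, Naryshkin & Wills; it only assumes that a list of request URLs recorded during a logged-in session is available, and it looks for known private values in the query strings of requests sent to other sites. The function names `site_of` and `find_leaks`, the example hostnames, and the "last two DNS labels" rule for deciding what counts as the same site are all hypothetical simplifications.

```python
from urllib.parse import urlsplit, parse_qsl


def site_of(host: str) -> str:
    """Naive site key: the last two DNS labels.

    Assumption for illustration only; it mishandles suffixes such as
    co.uk, where a public-suffix list would be needed.
    """
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def find_leaks(first_party_host, request_urls, private_values):
    """Return (third_party_host, parameter, value) for every private value
    found in the query string of a request to a different site."""
    first_site = site_of(first_party_host)
    wanted = {v.lower() for v in private_values}
    leaks = []
    for url in request_urls:
        parts = urlsplit(url)
        host = parts.hostname or ""
        # Skip malformed URLs and requests that stay on the first-party site.
        if not host or site_of(host) == first_site:
            continue
        # parse_qsl percent-decodes values, so "%40" is compared as "@".
        for name, value in parse_qsl(parts.query):
            if value.lower() in wanted:
                leaks.append((host, name, value))
    return leaks


# Hypothetical traffic recorded while logged in to www.example-news.com.
recorded = [
    "https://www.example-news.com/account?email=jane%40example.com",
    "https://ads.tracker-example.net/pixel?uid=12345&e=jane%40example.com",
    "https://cdn.example-news.com/app.js",
]
leaks = find_leaks("www.example-news.com", recorded, ["jane@example.com", "12345"])
```

A hostname comparison like this would miss the case reported in the study, where a server appears to belong to the first-party domain but is actually a third-party address; catching that would require resolving where the hostname really points. The sketch also ignores private data carried in request headers, cookies, or request bodies.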

Instances of Data Leakage
The Wall Street Journal was found to be passing users’ names and e-mail addresses to third parties. In response, the company said that this was done in error and that it was working to correct the situation. The Wall Street Journal has a data protection policy that forbids sharing or selling personal information. The children’s site ClubPenguin.com was found to be providing usernames to twelve separate companies. Disney, which operates the site, stated that these third parties are not allowed to use the information for anything besides specified purposes, such as ad serving and traffic reporting. The receivers of this data claim that they do not keep the usernames. Pandora admits to using the first half of a user’s e-mail address to track ad traffic. It also uses gender, age, and zip code information to provide the user with relevant advertising. Practices of this kind exist on most websites that allow user registration.