The Article 29 Data Protection Working Party has apparently released [pdf] its “Opinion on data protection issues related to search engines.” It is an important opinion that addresses how search engines collect, process, use, and retain individual consumer information, and it expands the discussion of why IP addresses and cookies are to be considered personal information in the context of search. Search engine companies doing business in the EU, including Google, Yahoo, and MSN, will be expected to comply with it.
Among the key findings [excerpts]:
“… search engines collect and process vast amounts of user data, including data gathered by technical means, such as cookies. Data collected can range from the IP address of individual users to extensive histories of past searching behaviour or data provided by users themselves when signing up to use personalised services… It is the opinion of the Working Party that search engines in their role as collectors of user data have so far insufficiently explained the nature and purpose of their operations to the users of their services.”
“…IP information as personal data… These considerations will apply equally to search engine operators.”
“When a cookie contains a unique user ID, this ID is clearly personal data. The use of persistent cookies or similar devices with a unique user ID allows tracking users of a certain computer even when dynamic IP addresses are used. The behavioural data that is generated through the use of these devices allows focusing even more on the personal characteristics of the individual concerned.”
“Generally, search engine providers fail to provide a comprehensive overview of the different specified, explicit and legitimate purposes for which they process personal data. Firstly, some purposes, such as ‘improvement of the service’ or ‘the offering of personalised advertising’ are too broadly defined to offer an appropriate framework to judge the legitimacy of the purpose. Secondly, because many search engine providers mention many different purposes for the processing, it is not clear to what extent data are reprocessed for another purpose that is incompatible with the purpose for which they were originally collected.”
“Search engine providers that wish to provide personalised advertising in order to increase their revenues, may find a ground for the legitimate processing of some personal data in Article 7 (a) of the Directive (consent) or Article 7 (b) of the Directive (performance of a contract) but it is difficult to find a legitimate ground for this practice for users who have not specifically signed in based on specific information about the purpose of the processing. The Working Party has a clear preference for anonymised data.”
“If personal data are stored, the retention period should be no longer than necessary for the specific purposes of the processing. Therefore, after the end of a search session, personal data could be deleted, and continued storage therefore needs an adequate justification. However, some search engine companies seem to retain data indefinitely, which is prohibited. For each purpose, a limited retention time should be defined. Moreover, the set of personal data to be retained should not be excessive in relation to each purpose.
In practice, the major search engines retain data about their users in personally identifiable form for over a year (precise terms vary). The Working Party welcomes the recent reductions in retention periods of personal data by major search engine providers. However, the fact that leading companies in the field have been able to reduce their retention periods suggests that the previous terms were longer than necessary.
In view of the initial explanations given by search engine providers on the possible purposes for collecting personal data, the Working Party does not see a basis for a retention period beyond 6 months.”
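To make the retention point concrete, here is a minimal Python sketch of a purge step that drops records older than a fixed retention window. It is only an illustration: the record layout, the function names, and the choice to approximate "6 months" as 183 days are all assumptions, not anything specified in the opinion.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "6 months" is approximated as 183 days for illustration.
RETENTION = timedelta(days=183)


def is_past_retention(collected_at, now, retention=RETENTION):
    """Return True if a record is older than the retention window."""
    return now - collected_at > retention


def purge(records, now, retention=RETENTION):
    """Keep only the records still inside the retention window.

    Each record is a hypothetical dict with a 'collected_at' datetime.
    """
    return [
        r for r in records
        if not is_past_retention(r["collected_at"], now, retention)
    ]


if __name__ == "__main__":
    now = datetime(2008, 10, 1, tzinfo=timezone.utc)
    records = [
        {"query": "old search", "collected_at": datetime(2008, 1, 1, tzinfo=timezone.utc)},
        {"query": "recent search", "collected_at": datetime(2008, 9, 1, tzinfo=timezone.utc)},
    ]
    for kept in purge(records, now):
        print(kept["query"])
```

A real system would also need a separate window per purpose, since the opinion asks for a limited retention time to be defined for each purpose.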
“Persistent cookies containing a unique user ID are personal data and therefore subject to applicable data protection legislation. The responsibility for their processing cannot be reduced to the responsibility of the user for taking or not taking certain precautions in his browser settings. The search engine provider decides if a cookie is stored, what cookie is stored and for what purposes it is used. Finally, expiration dates of cookies set by some search engine providers seem to be excessive. For instance, several companies set cookies that expire after many years. When a cookie is used, an appropriate cookie lifetime should be defined both to allow an improved surfing experience and a limited cookie duration. Especially in view of the default settings of browsers, it is very important that users are fully informed about the use and effect of cookies. This information should be more prominent than simply being part of a search engine’s privacy policy, which may not be immediately apparent.
Anonymisation
If there is no legitimate ground for processing, or for use beyond the well-specified legitimate purposes, search engine providers must delete personal data. Instead of deletion, search engines may also anonymise data, but such anonymisation must be completely irreversible for the Data Protection Directive to no longer apply.
Even where an IP address and cookie are replaced by a unique identifier, the correlation of stored search queries may allow individuals to be identified. For this reason, where anonymisation rather than deletion of data is chosen, the methods used should be considered carefully and performed thoroughly. This might involve the removal of parts of the search history to avoid the possibility of indirect identification of the user who performed those searches.
Anonymisation of data should exclude any possibility of individuals to be identified, even by combining anonymised information held by the search engine company with information held by another stakeholder (for instance, an internet service provider). Currently, some search engine providers truncate IPv4 addresses by removing the final octet, thus in effect retaining information about the user’s ISP or subnet, but not directly identifying the individual. The activity could then originate from any of 254 IP addresses. This may not always be enough to guarantee anonymisation.”
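The final-octet truncation described in that excerpt can be sketched in a few lines of Python using the standard `ipaddress` module. This is an illustration under assumptions, not any search engine's actual method: the function names are made up, and the address used is from a range reserved for documentation.

```python
import ipaddress


def truncate_ipv4(address):
    """Zero the final octet of an IPv4 address, keeping only the /24 prefix."""
    network = ipaddress.ip_network(f"{address}/24", strict=False)
    return str(network.network_address)


def candidate_hosts(truncated):
    """Count the usable host addresses that share the truncated prefix."""
    network = ipaddress.ip_network(f"{truncated}/24")
    return sum(1 for _ in network.hosts())


if __name__ == "__main__":
    truncated = truncate_ipv4("192.0.2.57")
    print(truncated)                   # 192.0.2.0
    print(candidate_hosts(truncated))  # 254
```

Every address in the same subnet collapses to the same truncated value, which is why the ISP or subnet remains visible. Combined with stored search queries or a cookie ID, that small candidate set can still be narrowed down, which is the Working Party's concern.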
“The Working Party finds that the correlation of personal data across services and platforms for authenticated users can only be legitimately done based on consent, after the users have been adequately informed… Finally, some search engine providers explicitly admit in their privacy policy that they enrich data provided by users with data from third parties, other companies that may for example attach geographical information to ranges of IP addresses or websites carrying advertisements sold by the search engine provider. This kind of correlation might be unlawful, if the data subjects are not informed at the time of collecting their personal data and if they are not granted an easy way of access to their personal profiles and the right to correct or delete certain elements that are incorrect or superfluous. If the processing in question is not necessary for providing the (search) service, the freely given, informed consent of the user would be required for lawful processing.”
“OBLIGATION TO INFORM DATA SUBJECT
Most internet users are unaware of the large amounts of data that are processed about their search behaviour, and of the purposes they are being used for. If they are not aware of this processing they are unable to make informed decisions about it.”