Undertaking data analytics without breaking the law

In brief

New guidance has been released on the need for transparency when undertaking data analytics on personal information, and on other matters that organisations should consider to better manage compliance risk when undertaking data analytics. Partner Michael Morris, Lawyer Jaclyn Webb and Lawyer Amy Detheridge report on some of the key messages.

The release of the Guides reminds us that transparency is fundamental to privacy compliance, including when conducting different types of data analytics on personal information.
Where your organisation might be using de-identified data to conduct data analytics, you should continually re-assess compliance risk, given the rise of big data and the increasing risks of re-identification.

Background

The Office of the Australian Information Commissioner (OAIC) has recently released guidance on Data Analytics and the Australian Privacy Principles, and De-identification and the Privacy Act (the Guides).

Data analytics has already become a business-critical tool, driving key business decisions and revenue streams. It enables businesses to turn raw data into something useful and valuable, such as insights into customer behaviours, preferences and trends. It also allows businesses to identify and implement more effective targeting campaigns, and to develop new products or services. In addition, businesses with high volumes of data can now utilise artificial intelligence to efficiently process data for commercialisation.

If your business undertakes data analytics using personal information, it is important to comply with your obligations under the Privacy Act 1988 (Cth) (the Privacy Act).

The release of the Guides, together with the OAIC's recent decision to investigate Facebook after the personal information of 87 million users (including more than 311,000 Australians) was improperly used and disclosed to data analytics company Cambridge Analytica, is a timely and useful reminder to organisations of the ramifications under the Privacy Act of conducting data analytics.

The OAIC considers data analytics to include:

big data (the processing of high-volume and high-speed data from varied sources);
data integration (collating multiple datasets to create new datasets) – this is often performed when undertaking research or utilising statistics internally;
data mining (identifying trends and patterns from large data sets) – this is commonly used by organisations for product development and targeted marketing campaigns; and
data matching (comparing two data sets containing personal information to obtain a match) – this is typically used by banks, government or debt-collection agencies for various purposes including fraud detection, tax purposes or debt collection.

Even if you are not currently undertaking data analytics but may wish to do so in future, you should consider future-proofing your privacy practices and approach, so that you retain flexibility to conduct data analytics within regulatory parameters.

Of course, as with all areas of privacy compliance, this cannot be a 'set and forget' approach. Your organisation should regularly monitor changes to the regulatory environment, prevailing industry practices and individuals' expectations, to ensure it remains compliant.

Why does it matter?

As was the case with Facebook, compliance risks generally arise where data analytics are conducted on information that could be considered to be 'personal information', including 'sensitive information'.

If you currently undertake data analytics, it is worth considering whether you comply with the Privacy Act by either clearly and transparently informing individuals that their information will be used for the purposes of data analytics; or, alternatively, first de-identifying the information and then conducting data analytics on information that is no longer considered 'personal information'.

Data de-identification is a two-step process. The first step is to modify personal information to remove direct identifiers (such as name, address, and other information by which an individual could be identified, or be reasonably identifiable). The second step is to remove or alter the remaining information that may allow an individual to be identified (eg because of a rare characteristic), and to implement controls and safeguards around processes and the people with access to the de-identified data. However, data that has not been properly de-identified raises compliance concerns, because it may be re-identifiable.
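The two steps described above can be sketched in code. The following Python example is illustrative only and is not drawn from the Guides: the field names (name, address, postcode, occupation), the sample records and the minimum group size of 2 are all assumptions made for the example. It removes direct identifiers and then flags combinations of the remaining values that are shared by very few records, which is one simple way to spot a 'rare characteristic'.

```python
from collections import Counter

# Hypothetical field names treated as direct identifiers for this example.
DIRECT_IDENTIFIERS = {"name", "address"}


def strip_direct_identifiers(records):
    """Step 1: return copies of the records without the direct identifiers."""
    return [
        {key: value for key, value in record.items() if key not in DIRECT_IDENTIFIERS}
        for record in records
    ]


def rare_combinations(records, fields, min_group_size=2):
    """Step 2 (check only): list value combinations held by fewer than
    min_group_size records. These may need to be removed or altered."""
    counts = Counter(tuple(record[field] for field in fields) for record in records)
    return [combo for combo, count in counts.items() if count < min_group_size]


# Invented sample data, for illustration only.
records = [
    {"name": "A", "address": "1 X St", "postcode": "3000", "occupation": "nurse"},
    {"name": "B", "address": "2 Y St", "postcode": "3000", "occupation": "nurse"},
    {"name": "C", "address": "3 Z St", "postcode": "3000", "occupation": "astronaut"},
]

stripped = strip_direct_identifiers(records)
flagged = rare_combinations(stripped, ["postcode", "occupation"])
print(flagged)
```

A flagged combination does not, by itself, mean an individual is reasonably identifiable. It indicates only where further modification, or tighter access controls, may be worth considering.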

What does it mean for you?

Importantly, you must determine whether you conduct (or will conduct) data analytics on de-identified information or on personal information. If you use personal information, you need to be transparent about that approach with individuals (ie tell them at the time of collection that you will use their information for data analytics and, in certain circumstances, obtain their consent). This may apply even where your data analytics activities involve mere data integration and not data mining or data matching.

Data analytics using de-identified data

Where your organisation is undertaking data analytics in relation to de-identified data, it is important that this data is in fact properly de-identified and cannot be re-identified (whether the re-identification occurs by virtue of data analytics or otherwise). In assessing risk of re-identification, data will be considered de-identified where there is no reasonable likelihood of re-identification (so, the risk of re-identification does not have to be zero).

In de-identifying data, you should consider the following:

whether the modified data would enable an individual to be reasonably identifiable in the context of:
- the nature and amount of information;
- who will hold and have access to the information;
- the other information available to persons with access to the modified data; and
- the practicability of using that information to identify an individual (eg is identification theoretically possible, but so impracticable that it is not reasonably likely to occur?);
what motivations may exist to re-identify data;
the gravity of harm to individuals arising from re-identification;
implementing data reduction techniques such as sampling; limiting the variables (ie removing further information such as the individual's profession, significant dates or income); grouping data (eg using age brackets, rather than individual ages); or encryption or 'hashing' of identifiers (so that different data sets can be linked but the personal information is hidden);
limiting access to the de-identified data. You might also consider procuring contractual obligations from persons receiving the de-identified data, including not to attempt to re-identify the information; and
continuing to reassess the re-identification risk, as new technology develops, or further information is made accessible.
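Some of the data reduction techniques listed above can be illustrated with a short Python sketch. This is a minimal example under stated assumptions, not a method prescribed by the Guides: the function names, the 10-year bracket width and the choice of a keyed hash (HMAC-SHA256) are ours. A keyed hash is used here because a plain hash of a guessable identifier (such as an email address) could potentially be reversed by hashing candidate values.

```python
import hashlib
import hmac
import random


def age_bracket(age, width=10):
    """Grouping: replace an exact age with a bracket such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def pseudonymise(identifier, secret_key):
    """Hashing: replace an identifier with a keyed hash, so that data sets
    processed with the same key can still be linked to each other.
    secret_key must be bytes and should be stored separately from the data."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def sample_records(records, fraction, seed=None):
    """Sampling: keep only a random subset of the records."""
    rng = random.Random(seed)
    size = min(len(records), max(1, round(len(records) * fraction)))
    return rng.sample(records, size)


# Invented values, for illustration only.
key = b"example-key-kept-apart-from-the-data"
print(age_bracket(37))
print(pseudonymise("customer-001", key) == pseudonymise("customer-001", key))
print(len(sample_records(list(range(10)), 0.3, seed=1)))
```

Applying these techniques does not by itself establish that data is de-identified. Whether there is a reasonable likelihood of re-identification still depends on the other factors listed above, including who can access the data and what other information they hold.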

Data analytics using personal information (not de-identified)

If your organisation conducts data analytics on personal information, adequate disclosures should be properly communicated to affected individuals. Collection statements and privacy policies should include information about the collection and the potential uses and disclosures. Specifically, if you are undertaking data analytics, the OAIC would expect to see privacy notices that include clear statements such as:

that analytics will be conducted on personal information for the purposes of marketing and determining preferred customer products / services;
that analytics may be conducted on the information provided directly by the individual, as well as third party sources (these sources should be listed);
any anticipated secondary purposes that the data may be used for (eg an insurance company could use an individual's personal information to determine the likelihood of certain events occurring);
any potential adverse consequences for the individual, by virtue of the data collection, use or disclosure; and
any anticipated disclosures of personal information to third parties and a list of those entities.

Generic and broad statements in collection notices and privacy policies – eg that information might be used for 'business purposes' – will not be adequate disclosures for data analytics purposes.

Key takeaways

The release of the Guides serves as a reminder to check your business's processes and identify any privacy compliance issues. Based on the Guides, we suggest keeping the following tips in mind to help ensure your business is compliant with the Privacy Act.

If you are using de-identified data:

ensure that any data is properly de-identified and that there is no reasonable likelihood of re-identification as a result of the data analytics, or otherwise;
where retaining both a de-identified dataset and the original dataset, put appropriate measures in place, such that those persons with access to the de-identified data (for the purpose of data analytics) cannot also access the original data; and
even where data is de-identified, when dealing with genetic or biometric information, health data, tax file numbers or sensitive information, do not run data analytics without first assessing privacy implications.

If you are using (non-sensitive) personal information:

ensure that any use or disclosure of personal information is aligned to the primary purpose of collecting the information. Otherwise, your business will likely:
- require the individual's consent; or
- need to be comfortable with an assessment that the relevant individual would have a reasonable expectation the information would be used for the proposed secondary purpose;
if your business is undertaking data analytics with personal information for direct marketing purposes, ensure that a clear opt-out is provided to the relevant individual; and
ensure your privacy policy and other notifications are up to date and transparent regarding any data analytics that you will be undertaking and the purposes of the data analytics.