Is your next lawyer a chatbot?
As legal professionals around the globe settle into another year of work, they find themselves wondering whether a new existential threat has arisen in OpenAI's much-discussed AI chatbot, ChatGPT. Many lawyers have been asking similar questions: will my job exist this time next year? How accurate is it at answering the questions I have to deal with today? And is this the end of copyright as we know it?
There's no denying the meteoric rise of the chatbot. It reportedly took Netflix 3.5 years to reach its first million users, Spotify five months and Instagram 2.5 months. ChatGPT hit this milestone in just five days. Much of this astonishing user growth, and the resulting existential dread among legal practitioners, can be attributed to ChatGPT's seemingly miraculous capacity to produce, near-instantaneously, compelling written works on a wide array of topics, including answers to legal questions and drafts of legal documents. So should lawyers the world over be worried or excited by the future of AI in the legal profession?
This Insight will help legal professionals (and consumers of legal services) grapple with this question, by asking:
- What are ChatGPT's possible applications?
- What are its real or potential limitations?
- What are the risks?
In the interests of procedural fairness, we also decided to give ChatGPT a right of reply to express its 'thoughts' on our analysis when it came to the question of risks in legal practice.
- There are a number of potential benefits for legal practitioners and consumers of legal services to be gained through adopting generative AI technologies such as ChatGPT, including improved productivity and efficiency, cost reduction and enhanced communication.
- The current iteration of ChatGPT is subject to several critical limitations, including a finite corpus of information from which it derives its responses, and the absence of any information more recent than June 2021.
- Potential benefits must be balanced against the potential risks, which include ChatGPT's overestimation of legal knowledge, 'hallucination', bias, copyright ownership and data privacy.
- Legal teams within any business considering the use of ChatGPT and other generative AI technologies should ensure they have safeguards in place that govern who can access the tools, what information can be submitted as queries and how the output can be used.
Legal teams, IT personnel, innovation and procurement teams.
ChatGPT is an AI chatbot, developed by the US-headquartered AI company, OpenAI. It was built using a large language model (also known as an LLM) called GPT-3, one of the largest and most powerful LLMs developed to date with around 175 billion parameters and access to 300 billion words. As an LLM, GPT-3 uses probability to guess what word should appear at a given point in a sentence, mimicking human speech and writing patterns. By having access to so many parameters and words, ChatGPT is able to tap into a vast amount of vocabulary, information and contexts to produce meaningful and engaging responses to prompts on a seemingly limitless number of topics. Examples of what ChatGPT has been used to produce include a 14-page law article in one hour, working computer code and a biblical verse in the style of the King James Bible explaining how to remove a peanut butter sandwich from a VCR.
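For readers who want a feel for what 'using probability to guess the next word' means, the following is a deliberately tiny sketch in Python. It is not how GPT-3 is implemented: GPT-3 is a neural network with billions of parameters, whereas this toy simply counts which word follows which in a three-sentence sample. The sample text and the function names are our own assumptions, chosen purely for illustration.

```python
from collections import Counter, defaultdict


def build_bigram_model(text):
    """Count, for each word in the text, which words follow it and how often."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def next_word_probabilities(model, word):
    """Convert the follower counts for `word` into probabilities."""
    counts = model.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the word was never seen with a follower
    return {w: c / total for w, c in counts.items()}


def most_likely_next_word(model, word):
    """Return the single most probable follower, or None if there is none."""
    probs = next_word_probabilities(model, word)
    return max(probs, key=probs.get) if probs else None


# Hypothetical sample text (an assumption for illustration only).
corpus = (
    "the court held that the contract was void "
    "the court held that the claim was dismissed "
    "the court found that the contract was valid"
)

model = build_bigram_model(corpus)
probs = next_word_probabilities(model, "court")
print(probs)
print(most_likely_next_word(model, "court"))
```

In this sample, 'court' is followed by 'held' twice and 'found' once, so the toy model prefers 'held'. It has no notion of whether the resulting sentence is legally correct, only of which continuation was most frequent in the text it has seen. The same basic limitation underlies several of the risks discussed in this Insight.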
It is worth noting, too, that in the coming weeks Google will release for use its own prototype AI chatbot 'Bard', which will be capable of accessing up-to-date information both on the internet and within Google's own (and enormous) data reserves. There are other chatbots known to be in development, suggesting ChatGPT merely represents the first mainstream example of what may soon be a crowded and competitive market.
There are a number of potential benefits for legal practitioners and consumers of legal services to be gained through adopting generative AI technologies such as ChatGPT. Potential examples are:
- Improved productivity and efficiency – ChatGPT appears to have the capacity to rapidly generate responses to simple legal questions, quickly prepare preliminary drafts of legal documents such as contracts and briefs, and swiftly locate relevant information referred to in large legal documents, such as case law and statutes. If these sorts of responses can be demonstrated to be reliable, over time lawyers may be able to resolve simple client legal queries more quickly and also have a head start on matters which are more complex and more strategically valuable to clients.
- Enhanced communication – As it continues to evolve, ChatGPT may also be able to assist lawyers by quickly drafting clear and concise communications to stakeholders such as clients, opposing counsel and courts.
- Cost reduction – ChatGPT may also reduce the costs incurred by law firms and in-house counsel by automating the initial tasks on matters and reducing the need for human activity on them.
- Access to a broader range of expertise – Subject to ChatGPT eventually being granted access to a broader data set than the current GPT-3 corpus, it may be able to assist lawyers to gain access to a broader range of legal sources than human lawyers would ordinarily consider, eg obscure legal precedents, articles, textbooks and statutes.
Despite its seemingly limitless capacities, OpenAI was the first to acknowledge some of the key limitations surrounding its ChatGPT tool which it says users must bear in mind. These include that ChatGPT will often write plausible-sounding but incorrect or nonsensical answers with a great deal of confidence. Additionally, and despite GPT-3 being an enormous LLM, it does not encompass the entirety of knowledge in the world, or indeed, simply on the internet, and does not currently include any resources beyond the year 2021. Users therefore will still need to quality-check ChatGPT's output for factual correctness and relevancy. By way of example, ChatGPT's answers do not currently take account of any changes to legislation since 2021.
Despite the opportunities listed above, legal practitioners must remain alive to the risks of over-reliance on ChatGPT and other AI chatbots like it. These include:
- Overestimation of legal knowledge – ChatGPT can only generate text based on patterns it has learned from the data on which it was trained. Therefore, if its training dataset does not contain sufficient resources on the particular area of law on which it is queried, the chatbot may produce a lucid and comprehensible answer, but one based on an incomplete or dated picture of the law. It is unlikely that GPT-3 adequately and comprehensively addresses all areas of the law in all jurisdictions, and, given its size, this is difficult to verify. As previously noted, a general limitation on ChatGPT's ability to answer legal queries is that it has only been trained on data up to June 2021.
We asked ChatGPT what it thought of the potential risk of users overestimating its knowledge of the law. It replied, 'As a language model, I am able to understand and respond to natural language input, but it's important to note that my knowledge is based on the text that I have been trained on, and may not be accurate or up-to-date. Additionally, I am not a substitute for legal advice from a qualified professional'. This response demonstrates that while ChatGPT does not itself recognise the risk of users overestimating its capacity to advise on the law (and would not give such a warning when asked a legal question), when prompted, ChatGPT does recommend that users consult qualified practitioners in relation to legal queries.
- 'Hallucination' / making up content – ChatGPT, like other AI programs, is vulnerable to 'hallucinations', ie it will fabricate facts and sources where it does not have access to sufficient data. See for example this interaction where ChatGPT repeatedly fabricated scientific studies in response to user queries. This tendency is particularly problematic as ChatGPT will not indicate where it has fabricated information and will display the same level of confidence as when providing a factually correct answer. ChatGPT's hallucination rate has been cited as being between 15% and 21%. Lawyers will therefore need to exercise significant quality control over all ChatGPT output which they use.
We asked ChatGPT, 'Do you ever fabricate sources or information?'. ChatGPT replied, 'As a language model, I do not have the ability to fabricate sources or information, I can only generate text based on patterns in the data that I was trained on. However, my responses may not always be accurate or appropriate given the context of a user's question'. This suggests that ChatGPT does not see a 'hallucination' as equivalent to 'fabrication', but rather a by-product of the way it generates answers, ie by replicating text based on patterns derived from its training data.
- Potential bias – Another issue prominent amongst AI solutions that also affects ChatGPT is the issue of bias. As ChatGPT's 'knowledge' is based on the large dataset contained within GPT-3, ChatGPT is prone to reproduce any biases that are found in GPT-3 in its answers to user prompts. Such biases could include gendered and racist viewpoints. It may also have a regressive bias, ie favouring old, but more frequently replicated, information within its dataset. This may cause the AI to favour outdated interpretations of legal principles due to the fact that such interpretations are more frequently represented within GPT-3's dataset.
We asked ChatGPT whether it was biased towards certain viewpoints. In reply, it said, 'As a machine learning model, I am not capable of having opinions or biases. However, the data that I was trained on may contain biases, which could be reflected in my responses. OpenAI is actively working on ways to detect and mitigate these biases in our models'. ChatGPT is clearly aware of the risk that a response may contain bias, and it will be interesting to see how this is mitigated as the technology develops.
- Confidentiality and legal professional privilege – Like the use of any IT service which requires input or other exposure of data to a program hosted beyond an organisation's firewall, the use of ChatGPT poses a risk to the confidentiality and privilege of any data which it is asked to consider. Law firms and other organisations planning to license ChatGPT for their use will need to clarify whether any data which they input into ChatGPT will be retained and used by OpenAI or the ChatGPT software, and what security controls OpenAI has in place.
We put to ChatGPT whether users should be concerned that queries they share with ChatGPT will not be kept confidential. ChatGPT replied, 'OpenAI takes data privacy and security very seriously. We have implemented a number of technical and organizational measures to protect user data, including encryption and strict access controls. Additionally, the model does not store any information about the queries it processes. However, as with any online service, there is always a risk that data could be accessed or misused by unauthorized third parties'.
As we understand that AI trainers will review certain conversations as part of optimising ChatGPT, we queried how user queries would be kept confidential if the conversations could be accessed by those trainers. ChatGPT replied, 'When an AI trainer is reviewing conversations, they are typically only looking at a small sample of the data that the model has been trained on, and do not have access to any identifying information about the users involved in the conversation. Additionally, the data is typically de-identified, which means that any personal information that could be used to identify an individual has been removed.' That AI trainers can review conversations and that data will only 'typically' be de-identified should be of concern for users who have submitted, or are considering submitting, any queries containing confidential, privileged and/or personal information.
- Copyright ownership – In Australia, copyright protects certain subject matters which are expressions of ideas, including 'literary works', being materials expressed in print or writing, provided they are 'original'. That is, the work must have originated from a human 'author' who has applied some 'creative spark', 'independent intellectual effort' or 'skill and judgement', and not be copied from another work. On the one hand, copyright is likely to protect the various human-generated inputs which are used by ChatGPT (eg articles in its training dataset, or queries submitted by users) as well as the software code which makes up the chatbot itself, and would be owned by the respective human authors or persons deriving title from those authors. On the other hand, the originality threshold is unlikely to be satisfied in relation to ChatGPT's text output (to the extent it has not been copied from any inputs), which means that in Australia, copyright is unlikely to protect, and no one would be the author or copyright owner of, such output.
This is to be contrasted with the position in the UK, where for computer-generated works which involve no human author, the author (and therefore first copyright owner) is taken to be 'the person by whom the arrangements necessary for the creation of the work are undertaken'. Therefore, lawyers wishing to use ChatGPT to generate legal documents or advice should be cognisant of the risk that their ability to prevent third parties from using or copying those documents or advice may be limited to the extent they simply comprise ChatGPT's responses, potentially undermining their commercial value. However, copyright could still protect a new work created by a human lawyer using ChatGPT's responses (eg by adding to, or editing, the responses), to the extent such additions, changes or new materials created are 'original'.
We asked ChatGPT 'In Australia, who owns copyright in the responses that you generate?'. Somewhat self-servingly, ChatGPT replied 'Copyright in any text generated by me would likely be held by the entity or person that commissioned or created me, such as OpenAI or the developer who uses my API'. This assessment by ChatGPT as to the ownership of copyright in its responses is not consistent with the current principles under Australian copyright law.
- Copyright and moral rights infringement – As discussed above, copyright is likely to protect the various human-generated inputs used by ChatGPT. If ChatGPT generates a response that copies a substantial part of any such input, and the input has been authored by and/or is owned by a third party, then a user who reproduces or distributes that response without permission from the copyright owner may be infringing copyright, and a user who fails to properly attribute the author or edits the response in a way which is prejudicial to the author's honour or reputation may be infringing moral rights.
Since ChatGPT draws from its broader training data rather than specific sources to generate responses, and its processes are generally unknown to the user, it is likely to be difficult to identify where there is a risk of infringement or who the copyright owner and/or author is. The question of whether developers of AI infringe copyright in materials on which the AI is trained is currently playing out in two recent lawsuits brought in the UK and the US. In the UK, Getty Images has brought a copyright infringement action against Stability AI, the developer of AI image generator Stable Diffusion, claiming that the processing of images in which Getty Images owns the copyright for the purpose of training Stable Diffusion infringed the copyright in those works. In the US, a class action has also been launched by software developers against Microsoft, GitHub and OpenAI claiming that the creation of AI-powered coding assistant GitHub Copilot constitutes 'software piracy on an unprecedented scale'.
The outcomes of these cases could have significant implications for the viability of AI as a tool for businesses. They will consider the extent of 'fair use' principles and, if the plaintiffs are successful, may even open the door to an entirely new form of copyright royalty payment streams. Further, even if developers could be absolved of liability for infringement (eg there are statutory text and data-mining exceptions to copyright infringement under EU and UK laws, which have no equivalents under Australian law), that does not necessarily mean the same protection would extend to users.
We also asked ChatGPT whether there is a risk that the responses it generates will reproduce and infringe third-party copyright. ChatGPT astutely replied:
'As a language model, I am trained on a large dataset of text and my responses are generated based on patterns and associations found in that data. This means that it is possible that my responses may include text that is similar to or identical to text that is protected by copyright. However, it is not my intention to reproduce or infringe on any third-party copyrights.
It's important to note that copyright laws have provisions that allow the use of copyrighted material without permission under certain circumstances, such as fair use or fair dealing. The use of text generated by me may fall under such exceptions, but it ultimately depends on the specific context in which the text is used.
It's also important for users to consider the ethical and legal implications of using AI-generated text, and to take steps to ensure that they are not infringing on any third-party rights'.
- Ethical concerns – the use of ChatGPT also creates a number of ethical concerns for legal practitioners:
- Competence – lawyers in Australia are subject to obligations of competency. As part of this duty, when using any emerging technology to inform their practice, Australian lawyers may be expected to understand how that technology operates and the benefits and risks to their clients. With technologies that use neural networks, such as ChatGPT, this can be difficult. Neural networks are often described as 'black boxes', meaning that humans are unable to decipher how the network arrived at answers, even by studying the network's design.
- Duty to the client – Australian lawyers are also required to act in the best interests of their client. With the potential time and cost reductions ChatGPT may bring to legal practice highlighted above, it may eventually be considered contrary to the client's interest not to utilise ChatGPT to facilitate the efficient conduct of a client's matter, subject to ChatGPT being shown to be a dependable tool for legal practice. On the other hand, using ChatGPT and other tools like it without verifying the veracity and quality of its output may amount to negligent conduct on the part of a practitioner.
- Independence – Australian lawyers are required to act and exercise independent professional judgement in practising law. Overreliance on technologies such as ChatGPT may be considered a failure to exercise a lawyer's independent judgement. Lawyers will therefore be expected to carefully scrutinise the output of AI tools which they use as part of their practice.
A more in-depth discussion of the ethical considerations for solicitors when using artificial intelligence is set out here.
We asked ChatGPT 'Should lawyers in Australia be concerned that using you as a tool could lead them to breach their ethical duties of competence, to act in their client's best interest and independence?'. ChatGPT conceded there was a risk that use of it by legal practitioners could lead to a breach of that lawyer's ethical duties: 'It is possible that using a language model like me could lead to a lawyer breaching their ethical duties, but it would depend on how the model is used and the specific circumstances of the case. It is ultimately the responsibility of the lawyer to ensure that they are fulfilling their ethical obligations, and using a tool like me should not absolve them of that responsibility…. it is also important to consider the ethical implications of using AI tools and ensure that they are used in a way that supports, rather than undermines, the provision of high-quality legal services.'
Law firms and in-house legal teams considering the use of ChatGPT and other generative AI technologies should consider implementing the following safeguards:
- Scrutinise the output: AI chatbots are not, currently at least, a viable replacement for the work of a trained lawyer. When using AI chatbots to inform legal advice, it remains imperative that practitioners scrutinise the chatbot's output to satisfy themselves that the information contained within the chatbot's answers is correct, current and does not reproduce third-party materials in a way which may infringe copyright or moral rights. ChatGPT's answers to legal questions demonstrate that it can often provide incorrect answers. Being clear that the output does not infringe copyright will, however, be difficult given the breadth of materials upon which the chatbot may have been trained (eg GPT-3's 175 billion parameters and 300 billion words). We note that ChatGPT will produce sources for its work when asked. However, this list may not be complete or reliable, as there have been documented examples of ChatGPT 'hallucinating' sources to support its output.
- Implement rules regarding the submission of information to the chatbot: Practitioners should ensure that they or their organisations clarify whether the chatbot retains or uses any data that is submitted to it. Additionally, practitioners should seek assurance as to what security controls are in place to avoid confidential information being accessed or used to inform the chatbot's responses to queries made by third parties. It may also be prudent to enter into a licensing arrangement with the owner of the chatbot to ensure practitioners and their employers have protection in the event the owner of the chatbot fails to ensure adequate protection of data which is submitted to their chatbot.
- Take steps to protect copyright: To maximise the prospects of obtaining copyright protection in any work product which has been produced using ChatGPT and to minimise the risk of infringement of third-party rights, practitioners should avoid adopting ChatGPT's responses verbatim. Instead, they should apply their own skills and judgement to consider and process the responses and to create a new work, and should ensure that the creation of the new work is well-documented.
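For IT and innovation teams, one way of giving practical effect to rules on what may be submitted to a chatbot is an automated screen that checks a draft query before it leaves the organisation. The Python sketch below is a minimal illustration only. The patterns, the 'MAT-' matter-number format and the function name are hypothetical assumptions of ours, not features of ChatGPT or of any OpenAI product, and a simple keyword screen of this kind would supplement, not replace, training and human judgement.

```python
import re

# Hypothetical patterns an organisation might treat as red flags.
# These are illustrative assumptions; real rules would need to reflect
# the organisation's own data and naming conventions.
BLOCKED_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "privilege marker": re.compile(r"\b(privileged|confidential)\b", re.IGNORECASE),
    "matter number": re.compile(r"\bMAT-\d{4,}\b"),  # assumed internal format
}


def screen_query(query):
    """Return the reasons (if any) a draft query should not be submitted as-is."""
    return [
        label
        for label, pattern in BLOCKED_PATTERNS.items()
        if pattern.search(query)
    ]


# A general legal question raises no flags.
print(screen_query("What is the limitation period for a contract claim?"))

# A query quoting client material is flagged for review before submission.
print(screen_query("Privileged: email jane@example.com about MAT-12345"))
```

A screen like this can only catch what it has been told to look for. A query that describes a client's confidential circumstances in ordinary language would pass unflagged, which is why the safeguards above remain primarily matters of policy and practitioner judgement.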