How good is AI at reviewing contracts? A practical legal test

Written by LawVu
Updated April 20, 2026

We put AI through a series of real contract review tasks, from redlining an NDA to identifying missing provisions and anonymizing a licensing agreement. Here is what it got right, what it got wrong, and what it means for your legal team.

TL;DR

  • AI is genuinely useful for contract review tasks, but its performance is uneven and depends heavily on how the task is structured.
  • For simple, clearly defined instructions like applying a specific clause standard, AI performs well.
  • For complex or nuanced legal judgment calls, like assessing whether a clause is problematic in context, AI frequently misses the mark.
  • The most effective approach is AI working alongside a qualified lawyer, not replacing one.
  • Purpose-built tools like LawVu Draft Review combine AI capabilities with your own playbooks and clause library, which produces far better results than using a general AI model alone.

Can AI actually review a contract?

It is one of the most common questions in legal tech right now. AI models can pass the bar exam, draft coherent clauses, and summarize complex documents in seconds.

So how do they hold up when asked to do what many lawyers spend a significant portion of their time on: reviewing a contract, identifying problems, and marking up proposed changes?

To find out, we ran a series of practical tests using AI on real contract review tasks. The tasks ranged from simple redlining instructions to open-ended analysis and document anonymization. The results were instructive, not because AI failed, but because of the specific pattern of where it succeeded and where it fell short.

The short answer: AI is a genuinely useful contract review tool when it is well-directed. It struggles when the task requires deep legal judgment, contextual understanding, or consistency across a long document. And it works best when it is grounded in your own standards, not left to interpret instructions from scratch.

Continue reading for the specifics.

What is AI contract review?

AI contract review is the use of large language models (LLMs) or purpose-built AI tools to read, analyze, and suggest changes to contracts. It can include:

  • Redlining: Identifying language that does not meet your standards and proposing alternative wording
  • Risk flagging: Surfacing clauses that are missing, inconsistent, or potentially problematic
  • Playbook enforcement: Checking a counterparty’s contract against your approved positions
  • Anonymization: Replacing party names and identifying information with generic terms

Each of these tasks requires a different kind of AI capability, and each has its own ceiling.

Test 1: Redlining a specific clause

We started with the most straightforward type of review task. We gave AI a mutual NDA and asked it to amend the definition of Confidential Information so that it covers all information that would reasonably be considered confidential, rather than only information explicitly marked as such.

The result was good. AI understood the instruction, identified the relevant definition, and proposed a reasonable amendment that reflected the requested change. The language was clean, the redline was appropriate, and a lawyer reviewing the output would have little to fix.

This kind of task is where AI performs best. The instruction is specific, the scope is narrow, and success is easy to define. When you tell AI exactly what change to make and where, it can execute reliably.

Takeaway: For targeted, well-defined redline instructions, AI is genuinely useful and fast. This is the clearest current use case for AI in contract review.

Test 2: Making an NDA unilateral

We stepped up the complexity. We asked AI to make the entire NDA unilateral in favor of one party.

This requires the AI to understand what “unilateral” means in a legal context and to apply that understanding consistently across an entire document. The results were mixed.

On the positive side, AI correctly understood that “unilateral” means one party discloses and one receives. It replaced “Disclosing Party” with the favored party name and “Receiving Party” with the other, in most places. It also updated the document title, which was unnecessary but not wrong.

The problems showed up in two areas. First, inconsistency: across the document, AI made the right change in some places and missed it in others, for no apparent reason. A lawyer reviewing the output would need to read every clause carefully to catch the missed instances. Second, one early change was simply wrong: AI removed the other party from the introduction clause entirely, apparently taking “unilateral” too literally. A contract with one party is not a contract.

This test illustrates a pattern that appears consistently in AI contract review. The model understands the concept at a general level but does not apply it with the legal precision and consistency that a careful human review would produce. It gets most of it right, which can create false confidence, and then gets something wrong in a way that requires the same careful read-through as if AI had not been involved at all.
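
Part of that read-through can be backed by a mechanical check for leftover terms after a document-wide change. The sketch below is illustrative only: the function names, party names, and sample text are hypothetical and were not part of our test. It catches literal leftovers and dropped parties, not legal errors.

```python
import re

def find_leftover_terms(text: str, terms: list[str]) -> dict[str, list[int]]:
    """Return, for each term, the 1-based line numbers where it still appears."""
    hits: dict[str, list[int]] = {}
    for term in terms:
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        lines = [n for n, line in enumerate(text.splitlines(), start=1)
                 if pattern.search(line)]
        if lines:
            hits[term] = lines
    return hits

def missing_parties(intro: str, parties: list[str]) -> list[str]:
    """Return the parties that no longer appear in the introduction clause."""
    return [p for p in parties if p.lower() not in intro.lower()]

# Hypothetical output of an AI "make this NDA unilateral" pass
revised = """This Agreement is made by Acme Ltd.
1. Acme Ltd may disclose Confidential Information to Beta GmbH.
2. The Receiving Party shall keep Confidential Information secret.
3. Beta GmbH shall return all materials on request."""

leftovers = find_leftover_terms(revised, ["Disclosing Party", "Receiving Party"])
dropped = missing_parties(revised.splitlines()[0], ["Acme Ltd", "Beta GmbH"])
print(leftovers)  # {'Receiving Party': [3]}
print(dropped)    # ['Beta GmbH']
```

A check like this narrows down where to look, but it does not replace reading the clauses: a term can be replaced everywhere and still leave an obligation pointing the wrong way.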

Takeaway: AI can handle document-wide changes but introduces inconsistencies that require careful human review. The time savings are real but smaller than they first appear.

Test 3: Open-ended risk analysis of an NDA

We asked AI to identify the most problematic clauses in the NDA, without giving it any specific criteria. This is the kind of open-ended analysis a senior lawyer might ask a junior associate to perform.

The results were genuinely interesting, though uneven.

What it got right: AI flagged that the definition of “Affiliate” referenced an outdated piece of legislation that was no longer in force. This is exactly the kind of thing that is easy to miss in a document you are reviewing quickly and hard to catch without reading closely. The fact that AI caught it demonstrates real value.

AI also correctly identified that no annex on “Inside Information” was attached to the document, even though the agreement referenced one. Finding missing attachments requires reading the entire document carefully and cross-referencing every reference. AI handled this well.
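
The missing-annex catch is at heart a cross-referencing task, which can also be approximated mechanically. The following is a minimal sketch under stated assumptions: it assumes references are written as "Annex <label>" and that an attached annex starts with a heading line "ANNEX <label>". Both conventions, the function name, and the sample text are hypothetical.

```python
import re

def missing_annexes(text: str) -> list[str]:
    """Annex labels referenced in the body that have no heading line of their own."""
    # A reference is any mention of "Annex <label>" (assumed convention)
    referenced = set(re.findall(r"\bAnnex\s+([A-Z0-9]+)\b", text))
    # An attached annex is assumed to begin with a line "ANNEX <label>"
    present = set(re.findall(r"^ANNEX\s+([A-Z0-9]+)\b", text, flags=re.MULTILINE))
    return sorted(referenced - present)

sample = """3. Inside Information is handled as set out in Annex 2.
4. Fees are listed in Annex 1.

ANNEX 1
Fee schedule"""

print(missing_annexes(sample))  # ['2']
```

Real agreements use many referencing styles (Schedule, Exhibit, Appendix, defined terms), so a pattern like this would need adapting to each document.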

What it got wrong: AI flagged the definition of “Confidential Information” as potentially overly broad. On the surface, that is a reasonable concern. But looking at the actual definition, the language was restricted to information explicitly marked as confidential, which in fact significantly narrows its scope. AI identified a surface-level concern without understanding the limiting language that made it less of a problem than it appeared.

AI also flagged the receiving party’s obligations as potentially too strict. Whether that is true depends on the commercial context, the bargaining positions of the parties, and what the NDA is for. AI had no way to assess any of that.

Takeaway: Open-ended AI analysis produces a mix of genuinely useful catches and noise that requires legal judgment to sort. It is a useful first pass, not a reliable replacement for a qualified review.

Test 4: Playbook-based review of a licensing agreement

This test is where AI showed the most practical potential for in-house and law firm use cases. Rather than open-ended analysis, we gave AI specific criteria for what makes a clause problematic:

  • Liability cap equal to or below $50,000
  • Confidentiality obligations are mutual
  • Dispute resolution not handled by arbitration under specific rules
  • Notice period for termination longer than one month

With clear, measurable criteria, AI’s performance improved significantly. It correctly identified and addressed the dispute resolution clause, inserting the required arbitration language. It also acted on the liability cap.

However, it failed to act on the other two criteria: it identified the issues in its written response but did not make the corresponding changes. And the liability cap change it did make was superficial: it inserted the dollar amount without checking whether the existing annual fee provision was above or below the threshold, which means the change may have been unnecessary or even incorrect.

This test makes the case for playbook-based AI review more clearly than any other. When AI has structured, specific criteria to work with, it performs much better than when left to use its own judgment. The implication is that the quality of AI contract review depends heavily on the quality of the instructions it is given.
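
To make "structured, specific criteria" concrete, the four criteria from this test can be written as measurable checks. This is a sketch under assumptions, not how any particular tool works: the `ContractTerms` fields, the `Rule` type, and the sample values are hypothetical. The terms are assumed to be already extracted from the agreement, "one month" is approximated as 30 days, and the specific arbitration rules are not modeled.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class ContractTerms:
    """Hypothetical terms, assumed already extracted from the agreement."""
    liability_cap_usd: float
    confidentiality_mutual: bool
    dispute_resolution: str       # e.g. "arbitration" or "litigation"
    termination_notice_days: int

@dataclass(frozen=True)
class Rule:
    name: str
    is_problem: Callable[[ContractTerms], bool]

# The four criteria from the test, each reduced to a yes/no check.
PLAYBOOK = [
    Rule("liability cap at or below $50,000",
         lambda t: t.liability_cap_usd <= 50_000),
    Rule("mutual confidentiality",
         lambda t: t.confidentiality_mutual),
    Rule("dispute resolution not by arbitration",
         lambda t: t.dispute_resolution != "arbitration"),
    Rule("termination notice over one month",
         lambda t: t.termination_notice_days > 30),  # 30 days is an assumption
]

def flag(terms: ContractTerms) -> list[str]:
    """Return the names of the playbook rules that the terms trigger."""
    return [rule.name for rule in PLAYBOOK if rule.is_problem(terms)]

terms = ContractTerms(
    liability_cap_usd=75_000,
    confidentiality_mutual=True,
    dispute_resolution="litigation",
    termination_notice_days=30,
)
print(flag(terms))
# ['mutual confidentiality', 'dispute resolution not by arbitration']
```

Note that the liability cap rule compares the existing value with the threshold before anything is flagged. That comparison is the step the AI skipped in this test when it inserted the dollar amount.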

Takeaway: Playbook-enforced AI review outperforms open-ended AI review. The more specific your criteria, the better the results.

Test 5: Document anonymization

We asked AI to anonymize a software licensing agreement by replacing party names and identifying information with generic terms.

AI started well. It recognized the document as a software license agreement and replaced the party names with generic “Licensor” and “Licensee” terms in most places. But it missed some instances of the original party names, anonymized a company’s full name in one location but not its short-form reference throughout the rest of the document, and introduced inconsistencies that would require a careful line-by-line read to catch.

Full anonymization at scale remains one of the harder problems in legal AI. No tool currently guarantees 100 percent accuracy, and this test confirmed that limitation. AI gets you most of the way there but not all the way.
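
One way to catch the short-form problem described above is to list every known variant of each party name up front and then scan the output for any that survived. The sketch below assumes the reviewer can supply those variants. The names, sample text, and function names are hypothetical, and the scan finds only the variants it is given.

```python
import re

def anonymize(text: str, aliases: dict[str, list[str]]) -> str:
    """Replace every listed name variant with its generic role."""
    for role, names in aliases.items():
        # Longest variants first, so the full name is replaced before its short form
        for name in sorted(names, key=len, reverse=True):
            text = re.sub(rf"\b{re.escape(name)}\b", role, text, flags=re.IGNORECASE)
    return text

def residual_names(text: str, aliases: dict[str, list[str]]) -> list[str]:
    """Return the known name variants that still appear in the text."""
    return sorted({
        name
        for names in aliases.values()
        for name in names
        if re.search(rf"\b{re.escape(name)}\b", text, flags=re.IGNORECASE)
    })

aliases = {
    "Licensor": ["Acme Software Ltd", "Acme"],
    "Licensee": ["Beta GmbH", "Beta"],
}
source = 'Acme Software Ltd ("Acme") licenses the Software to Beta GmbH. Acme may audit Beta.'

print(anonymize(source, aliases))
# Licensor ("Licensor") licenses the Software to Licensee. Licensor may audit Licensee.

# Hypothetical AI output that replaced the full name but kept the short form
ai_output = 'Licensor ("Acme") licenses the Software to Licensee. Acme may audit Licensee.'
print(residual_names(ai_output, aliases))  # ['Acme']
```

Addresses, registration numbers, signatory names, and other identifying details are outside what this sketch covers, which is why the output still needs a human read.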

Takeaway: AI anonymization is useful for reducing manual work but cannot be trusted without a thorough human review of the output.

Try LawVu Draft for free

See what's possible when AI and institutional knowledge work together. Request a 14-day free trial and we'll help you get started.

What AI for law firms and in-house legal teams needs to do

The test results above reflect what happens when AI is asked to review contracts without any grounding in your specific standards. The performance ceiling is real, but it is not fixed. It rises significantly when AI is connected to your playbooks, your clause library, and your preferred language.

This is the fundamental difference between using a general AI model for contract review and using a purpose-built AI review tool.

For law firms, the critical requirement is that AI enforces the firm’s standards, not generic ones. When a client sends a contract, the review should check it against the firm’s approved positions on limitation of liability, IP ownership, data protection, and every other issue that matters to the practice group handling the matter. That requires the AI to know those positions, which means connecting it to the firm’s playbook and clause library.

An Haenen, Operations Manager at Sirius Legal, described what purpose-built AI review made possible for their firm:

“The automation of the audit documents through LawVu Draft turned out to be a major step forward for our firm. Now the end-of-year surge can be managed by just one person, and large amounts of audits can be completed within the agreed upon time frame and within available budgets.”


For in-house legal teams, the most valuable use case is reviewing counterparty paper against your company’s approved positions. Every incoming contract should be checked against the same standards, consistently, regardless of which lawyer handles the review. AI can do this at a scale and speed that manual review cannot match, but only if it knows what your standards are.

Fabienne Lallemand, Chief Legal and Compliance Officer at SD Worx, described the shift from manual to structured AI-assisted review:

“LawVu Draft allows our in-house lawyers to centrally manage contracts and make them available in an intelligent, user-friendly way to colleagues who need them. In this way, we streamline the operation between the legal department and the rest of the company and increase the quality of our documents.”


How LawVu Draft Review works

LawVu Draft’s Review feature is built around the insight that AI contract review works best when it is grounded in your standards. Here is what that looks like in practice.

Redline against your clause library. Rather than generating generic redlines, LawVu Draft compares third-party contract language against your approved clause library and proposes alternatives using your preferred language, with a one-click insertion that preserves formatting.

Enforce playbook standards automatically. Apply custom playbooks to automatically flag and rewrite non-compliant clauses. When AI has specific criteria to work with, it performs dramatically better than open-ended analysis, which is exactly what the tests above demonstrated.

Spot risks and missing terms in real time. AI highlights risky language, inconsistencies, and missing provisions, so you can direct your attention where it matters most, rather than reading every clause with the same level of attention.

Compare clauses and alternatives. Compare language across documents, evaluate alternatives, and insert the best option into your agreement, all inside Microsoft Word without switching between tools.

Yunna Choi, Former Head of Legal Operations and Innovation at Axel Springer, described why the Word integration mattered for their team:

“Many solutions require lawyers to work in separate platforms with Word-like text editors, which adds unnecessary friction. LawVu Draft’s deep integration with Microsoft Word was a major advantage for us. It supports our team in their existing workflow rather than forcing a system change, while also allowing us to fully leverage Word’s native formatting and tools.”


Key takeaways

  • AI performs best on contract review tasks with specific, well-defined criteria. Open-ended review produces a mix of useful catches and noise.
  • AI introduces inconsistencies across longer documents that require careful human review. The time savings are real but depend on how well the task is structured.
  • Playbook-based AI review significantly outperforms general AI review. Connecting AI to your specific standards is the most important factor in review quality.
  • AI is not a replacement for legal judgment. It is a tool that makes qualified lawyers faster and more consistent.
  • Purpose-built AI review tools outperform general models because they are grounded in your clause library, playbooks, and preferred language.
