A mid-sized allocator reviews 200+ funds per year and invests in 10–20. That means hundreds of PPMs read, hundreds of DDQs processed, hundreds of track records analyzed — mostly to say no. The analysts doing this work are the most expensive employees in the shop, and they spend the bulk of their time on pattern-matching tasks that are structured, repetitive, and machine-tractable.
The useful framing is not "AI replaces manager diligence." It is "AI handles the parts that are pattern-matching so analysts do the parts that require judgment." The parts that require judgment — reference calls, sitting with the GP, understanding edge cases in the track record — are where alpha lives. The parts that do not are where analyst time currently gets burned.
## Where AI handles the work cleanly
Four tasks where production-grade automation works today.
DDQ extraction and comparison. A GP responds to the allocator's DDQ with 80 pages of answers. Another GP in the same strategy responds to the same DDQ differently. Extracting structured answers, normalizing them, and comparing across GPs is exactly what document AI plus a schema does well. This alone saves 2–4 hours per manager.
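The extraction-plus-schema pattern can be sketched as follows. The schema fields here are illustrative placeholders, not a real DDQ standard; the point is that once every GP's answers are normalized into one shape, cross-manager comparison is a field-by-field diff rather than a document-by-document read.

```python
from dataclasses import dataclass, fields

# Hypothetical DDQ schema: each GP's free-text answers are extracted and
# normalized into this structure so comparisons happen field by field.
@dataclass
class DDQRecord:
    aum_usd_mm: float        # assets under management, USD millions
    team_size: int
    uses_side_letters: bool
    audit_firm: str

def compare(a: DDQRecord, b: DDQRecord) -> dict:
    """Return the fields where two managers' normalized answers diverge."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(DDQRecord)
        if getattr(a, f.name) != getattr(b, f.name)
    }

gp_a = DDQRecord(aum_usd_mm=850.0, team_size=12, uses_side_letters=True, audit_firm="EY")
gp_b = DDQRecord(aum_usd_mm=850.0, team_size=9, uses_side_letters=False, audit_firm="EY")
diffs = compare(gp_a, gp_b)
# diffs -> {'team_size': (12, 9), 'uses_side_letters': (True, False)}
```

In production the `DDQRecord` fields would come from the allocator's own DDQ template, and extraction into them would be done by a document-AI model rather than hand-entry.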
Track record normalization. GPs report performance in idiosyncratic ways — gross vs. net, with and without fund-of-one accommodations, different vintage classifications, different benchmark choices. Normalizing to a consistent basis so cross-manager comparisons are meaningful is mechanical work that gets done inconsistently by analysts. Software does it consistently.
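A minimal sketch of the normalization step, assuming a simple fee model (flat management fee plus incentive on positive gross returns) purely for illustration — no particular GP reports this way:

```python
# Illustrative gross-to-net conversion; the 2-and-20 fee model here is an
# assumption for the sketch, not how any specific manager charges.
def to_net(gross_return: float, mgmt_fee: float = 0.02, incentive: float = 0.20) -> float:
    """Approximate net annual return from a gross figure."""
    after_mgmt = gross_return - mgmt_fee
    return after_mgmt - incentive * max(after_mgmt, 0.0)

def normalize(track_record: list[dict]) -> list[dict]:
    """Put every reported year on the same net-of-fees basis."""
    out = []
    for row in track_record:
        net = row["return"] if row["basis"] == "net" else to_net(row["return"])
        out.append({"year": row["year"], "net_return": round(net, 4)})
    return out

reported = [
    {"year": 2022, "return": 0.15, "basis": "gross"},
    {"year": 2023, "return": 0.08, "basis": "net"},
]
normalized = normalize(reported)
# normalized -> [{'year': 2022, 'net_return': 0.104}, {'year': 2023, 'net_return': 0.08}]
```

The value is not the arithmetic, which is trivial, but that the same basis is applied to every manager every time — the consistency analysts rarely achieve by hand.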
PPM and LPA red flag scanning. Key man provisions, removal rights, fee structures, waterfall mechanics, indemnification language, valuation policies — known risk patterns that every good LP's legal team screens for. AI can flag deviations from market-standard terms automatically. The legal review still happens; it happens with a prioritized list of issues to examine rather than a cold read.
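The term screen reduces to rules over extracted values. The baseline ranges and field names below are illustrative assumptions, not market data — in practice the baseline would come from the LP's own view of market-standard terms:

```python
# Sketch of a term screen: extracted fund terms are checked against a
# market-standard baseline and deviations are queued for legal review.
MARKET_STANDARD = {
    "mgmt_fee": (0.015, 0.02),   # acceptable range (illustrative)
    "carry": (0.15, 0.20),
    "key_person_clause": True,   # expected to be present
    "no_fault_removal": True,
}

def scan_terms(extracted: dict) -> list[str]:
    """Flag terms missing or outside the market-standard baseline."""
    flags = []
    for term, standard in MARKET_STANDard.items() if False else MARKET_STANDARD.items():
        value = extracted.get(term)
        if isinstance(standard, tuple):
            lo, hi = standard
            if value is None or not (lo <= value <= hi):
                flags.append(f"{term}: {value} outside market range {standard}")
        elif value != standard:
            flags.append(f"{term}: expected {standard}, found {value}")
    return flags

fund = {"mgmt_fee": 0.025, "carry": 0.20, "key_person_clause": True, "no_fault_removal": False}
flags = scan_terms(fund)
# flags -> mgmt_fee above range, no_fault_removal deviates; carry and key person pass
```

The output is the prioritized list the legal team starts from — every flag still gets human review.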
News and background monitoring. Ongoing monitoring of the GP, key people, and portfolio companies for news, regulatory actions, litigation, and reputational signals. This is pure retrieval and classification work. Software does it continuously; humans do it when they remember.
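As retrieval-and-classification work, the monitoring loop is simple to sketch. The keyword rules below are placeholders — a production system would use entity resolution and a trained classifier rather than substring matching:

```python
# Minimal sketch of continuous monitoring: route incoming headlines about
# a GP into watchlist categories. Keyword lists are illustrative only.
CATEGORIES = {
    "regulatory": ("sec ", "enforcement", "subpoena", "fine"),
    "litigation": ("lawsuit", "complaint", "settlement"),
    "personnel": ("departure", "resigned", "hired"),
}

def classify(headline: str) -> list[str]:
    """Return every watchlist category the headline matches."""
    text = headline.lower() + " "
    return [cat for cat, kws in CATEGORIES.items() if any(k in text for k in kws)]

hits = classify("CFO resigned amid enforcement inquiry")
# hits -> ['regulatory', 'personnel']
```

Running this continuously against a news feed is what turns "humans do it when they remember" into a systematic process.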
| Task | Manual time | AI-assisted time | Quality delta |
|---|---|---|---|
| DDQ review per manager | 4–6 hours | 1–2 hours | More consistent |
| Track record normalization | 2–4 hours | 30 minutes | More comparable |
| PPM/LPA risk scan | 4–8 hours (legal) | 1–2 hours | More exhaustive |
| Ongoing monitoring | Ad hoc | Continuous | Systematic |
## Where AI does not work yet
Three tasks where software is not ready, and where pretending otherwise creates problems.
Judgment about people. Whether this GP is trustworthy, whether the team has the chemistry to survive the next downturn, whether the stated strategy reflects what the team actually does — these are read from tone, body language, reference quality, and a thousand other signals that do not reduce to documents. Models cannot do this reliably and probably will not for a long time.
Strategy coherence assessment. Does the stated strategy actually make sense? Does it fit the market environment? Does the team's background support their thesis? Models can summarize the stated strategy accurately and still miss that it is internally incoherent or out of step with reality. Pattern recognition without domain understanding produces confident wrong answers.
Reference call synthesis. The 30-minute call with a former LP contains signal that is mostly implicit — what was not said, hesitations, careful word choices. Transcription captures words. Signal is not only in words. AI can summarize the transcript; it cannot tell you what the call actually meant.
## What a production diligence workflow looks like
Well-designed diligence workflows use AI to produce a structured, pre-digested package that the analyst enters with prepared questions rather than a blank PPM.
- Intake: PPM, DDQ, track record, LPA, team bios ingested automatically
- Extraction: Key terms, performance metrics, team composition extracted to structured form
- Comparison: Cross-manager comparison against same-strategy peers and market standards
- Red flag scan: Non-standard terms, performance anomalies, team issues flagged
- Analyst review: Analyst enters with prepared hypothesis and prioritized questions
- Manager meetings: Focused on judgment and reference calls — the AI-unsolvable parts
- Investment committee: Memo draft auto-generated from structured data, analyst adds judgment
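The automated stages above compose into a linear pipeline, each enriching a structured package that the next stage consumes. The stage bodies here are stubs standing in for the document-AI components described earlier; only the shape of the flow is the point:

```python
# Sketch of the diligence pipeline. Each stage is a placeholder for the
# corresponding automated step; the analyst receives the final package.
def intake(docs):
    return {"docs": docs}

def extract(pkg):
    pkg["terms"] = {"carry": 0.20}   # stub: extracted key terms
    return pkg

def compare_peers(pkg):
    pkg["peer_delta"] = {}           # stub: deviations vs same-strategy peers
    return pkg

def red_flags(pkg):
    pkg["flags"] = []                # stub: prioritized issues for review
    return pkg

PIPELINE = [intake, extract, compare_peers, red_flags]

def run(docs):
    pkg = docs
    for stage in PIPELINE:
        pkg = stage(pkg)
    return pkg  # the pre-digested package the analyst enters with

package = run(["ppm.pdf", "ddq.pdf", "track_record.xlsx"])
```

Everything downstream of `run` — analyst review, manager meetings, the committee memo — consumes this package rather than raw documents.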
The analyst spends more time on manager conversations and less on document processing. The quality of the investment decision improves because the document processing is more thorough and consistent than manual review would be, and the analyst's attention is on the judgment-critical parts.
## Where firms get this wrong
Two failure modes.
Over-relying on AI-generated memos. Some firms use AI to generate the investment committee memo and treat the output as the substantive product. This is a mistake. The memo should reflect the analyst's judgment. AI produces a structured draft; the analyst edits to reflect actual conviction. Memos written entirely by AI read like they were — fluent, generic, and uninformative.
Treating AI-generated red flags as decisions. A model flags a non-standard key person provision. That is a signal to examine, not a reason to decline. Firms that filter funds on raw AI outputs without analyst review end up rejecting good managers for nothing and accepting bad ones because the model missed the real issues. AI supports judgment; it does not substitute for it.
For allocators building diligence workflows with AI integration, the alternative investments capability model maps diligence against adjacent capabilities like manager monitoring, portfolio construction, and risk aggregation — useful for scoping where AI investment produces real leverage versus where it is just instrumented document review.