What This Is About

In the context of operating a billion-dollar AI-focused fund, the full team is expected to use persistent AI assistants through PureBrain. This document explores what data can and cannot be shared with AI, the regulatory landscape, and a concrete implementation plan for AI governance — built as a working paper and evolving knowledge base.

The core tension is straightforward: AI needs data to be useful, but data sharing creates legal, regulatory, and fiduciary risk. The answer is not "share nothing" (that makes AI useless) or "share everything" (that creates liability). The answer is a classification framework with clear lines.

The Four-Tier Data Classification

Tier | Classification | Examples | AI Sharing Rule
Tier 1 | Restricted | LP personal data (SSNs, passports, bank details), KYC/AML docs, attorney-client privileged communications, Material Non-Public Information (MNPI) | Never share with any external AI platform
Tier 2 | Confidential | Portfolio company financials, term sheets, fund strategy, IC deliberations, deal pipeline | Enterprise AI only, with redaction. Anonymize names when analysis does not require them
Tier 3 | Internal | Aggregate fund performance, operational procedures, vendor relationships, industry research | Share with vetted enterprise AI under standard controls
Tier 4 | Public | Marketing materials, published thought leadership, regulatory filings | Share freely
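
For the tooling discussed later (the preprocessor and AI-side guardrails), the tiers can be encoded as a simple lookup. A minimal Python sketch, assuming nothing beyond the table above; the function name and the enterprise flag are illustrative:

```python
from enum import IntEnum

class Tier(IntEnum):
    """Four-tier data classification, mirroring the table above."""
    RESTRICTED = 1    # never share with any external AI platform
    CONFIDENTIAL = 2  # enterprise AI only, with redaction
    INTERNAL = 3      # vetted enterprise AI, standard controls
    PUBLIC = 4        # share freely

def may_share(tier: Tier, enterprise_platform: bool) -> bool:
    """Gate consulted before content leaves the fund's environment."""
    if tier is Tier.RESTRICTED:
        return False
    if tier in (Tier.CONFIDENTIAL, Tier.INTERNAL):
        return enterprise_platform
    return True  # Tier.PUBLIC
```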

What to Share vs. What to Protect

Category | Share with AI? | Condition
Your preferences, style, schedule | Yes | No restrictions
Your strategic thinking, thesis | Yes | No restrictions
Public market / industry research | Yes | No restrictions
Fund operations, workflows | Yes | No restrictions
Portfolio company data | With care | Anonymize when possible; redact specifics not needed for the task
Fund strategy, pipeline | With care | Enterprise platform only, no consumer AI
Partner communications | Selectively | Share context, not raw disputes
Fund performance (aggregate) | Yes | No individual LP attribution
LP personal data | No | Never on current platforms
KYC/AML docs | No | Never
Privileged legal communications | No | Never (Heppner waiver risk)
Material Non-Public Information (MNPI) | No | Never
NDA-protected counterparty data | No | Not without consent

Cost Summary

  • Phase 1 (Before First Close): ~$1,200/yr, plus $0 for tools Tarin builds. Nitro Redact ($960/yr) + Cloudflare Gateway (free) + policy drafting ($0).
  • At Scale: ~$25,000-35,000/yr. VDR ($15-25K) + insurance rider ($2-5K) + external audit ($3-5K).
Bottom Line: Share what makes you effective, protect what could harm others, and document what you decided and why. A 2-page policy and a paragraph in the PPM turn a potential liability into a demonstrated strength.

Research Overview

This brief covers the factual landscape surrounding AI data sharing in venture capital operations: regulatory positions, legal risks, industry practices, ethical arguments, and practical frameworks. The research draws on 30 sources across regulatory bodies (SEC, FINMA, FCA, JFSC, EU), law firm analyses, court cases, and industry surveys.

  • 34.8% of employee AI inputs contain sensitive data
  • 85% of VCs use AI for daily tasks
  • 340% YoY increase in prompt injection attempts
  • 85% of LPs reject managers over ops concerns
CRITICAL CASE: United States v. Heppner (S.D.N.Y., Feb 17, 2026)

Judge Jed S. Rakoff held that information input into consumer AI platforms does not receive attorney-client privilege protection. Sharing privileged information with consumer AI tools waives privilege over the underlying communications. AI tools lack law licenses, fiduciary duties, and professional discipline — courts require "a trusting human relationship" for privilege. Once waived, subsequent disclosure to attorneys cannot cure the waiver.

Exception: Counsel-directed use on a secure enterprise platform with contractual confidentiality terms could yield a different result.
Sources: Duane Morris, K&L Gates, Morgan Lewis (Feb-March 2026)
Leading Voices on AI Ethics and Data Sharing

Stuart Russell (UC Berkeley, TIME100 AI 2025)

"In the early days, OpenAI was collecting conversations with ChatGPT users and using that data to retrain the system, but there are huge privacy issues because people use them in companies and put in data with company proprietary information." Many companies have banned commercial LLMs because they do not trust conversations will remain proprietary.

Timnit Gebru (DAIR Institute)

"There needs to be a lot more independent research and there needs to be oversight of tech companies." AI concentrates power in the hands of governments and companies, away from individuals whose data feeds these systems.

European Data Protection Board (March 2025)

Published opinion on using AI in compliance with GDPR, specifically addressing legitimate interest, purpose limitation, and the right to object to AI processing of personal data.

  • 27% of ChatGPT messages are work-related
  • 68% of privacy pros now handle AI governance
SEC (United States)

Current Stance (February 2026)

SEC Division Director Brian Daly stated the SEC is exploring how AI should be addressed within federal securities law but recognized that "by the time rules take effect, the market and technology may have moved on."

2026 Examination Priorities

  • Review registrant representations about AI capabilities for accuracy ("AI washing" enforcement)
  • Assess whether firms have implemented adequate policies and procedures for AI use
  • Examine how registrants protect against loss or misuse of client records from third-party AI tools
  • Regulation S-P compliance: Larger firms ($1.5B+ AUM) by Dec 3, 2025; smaller firms by June 3, 2026

Fiduciary Duty Position

Venable LLP (Dec 2025): "Delegating decisions to a machine does not absolve the human fiduciary from oversight." Advisers must validate AI systems, understand assumptions, and continuously monitor performance. The "black box problem" — deep learning outputs advisers struggle to explain — complicates fiduciary accountability.

Ropes & Gray (Dec 2025): Asset managers' fiduciary duties require "appropriate diligence in selecting, engaging and overseeing AI service providers and disclosure to investors of risks and conflicts of interest associated with the use of AI."

Sources: SEC.gov, Venable LLP, Ropes & Gray, Goodwin Law, Kitces.com
FINMA (Switzerland)

Guidance Note 08/2024 (Dec 18, 2024)

  • Accountability: "Responsibility for decisions cannot be delegated to AI or external providers"
  • Governance: Comprehensive inventories of all AI systems, tools, data flows. Clear roles and accountability frameworks.
  • Data Quality: Prioritize data quality over model selection. Regular testing and continuous monitoring.
  • Personnel: Sufficient staff training on ethical AI use.
  • 50% of Swiss financial institutions use AI (April 2025 survey)
  • 91% of AI users also use generative AI

Swiss Regulatory Timeline: No AI-specific legislation yet. Draft consultation legislation expected by end of 2026. FINMA follows "same business, same risks, same rules."

Sources: FINMA.ch, Pestalozzi Law, Chambers AI Practice Guide 2025
FCA (UK) & JFSC (Jersey)

FCA (United Kingdom)

No AI-specific regulations planned. Relies on existing frameworks: Consumer Duty, SM&CR, SYSC, operational resilience. AI LAB launched Oct 2024 for supervised testing. Treasury Committee recommended comprehensive AI guidance by end of 2026.

JFSC (Jersey)

No AI-specific guidance. Firms must comply with Data Protection (Jersey) Law 2018, aligned closely with GDPR. JFSC is implementing data-driven supervisory models using AI internally. 2025-2026 priorities focus on growth, risk management, and financial crime prevention.

Sources: FCA.org.uk, Kennedys Law, JFSC.org
EU AI Act & GDPR

EU AI Act Timeline

  • Feb 2, 2025: Prohibited AI practices provisions came into effect
  • Aug 2, 2026: Full high-risk system obligations enforceable
  • Penalties: Up to 7% of global annual turnover or EUR 35 million

Credit scoring, loan approval, fraud detection, AML risk profiling, and automated decision-making affecting access to financial services are classified as high-risk AI systems. Fund management AI affecting investor outcomes may fall into this category.

GDPR Core Issues for Fund Managers

  • GDPR applies to PE/VC firms processing EU resident data regardless of firm location
  • LP personal data qualifies as personal data under GDPR
  • Data Protection Impact Assessments (DPIA) required for new high-risk processing
  • Purpose limitation: data collected for fund management cannot be used for AI training without separate legal basis
  • Right to be forgotten creates challenges for AI systems that retain learned information
Sources: EU AI Act text, Athennian, Orrick, Perforce
Fiduciary Duty & Attorney-Client Privilege

Core Legal Position

Using AI without explainability or validation could be interpreted as a breach of the duty of care — analogous to relying on an unverified third-party analyst without due diligence. Investment advisers cannot delegate responsibility for decisions to algorithms.

Five Considerations for Advisers

  1. Appropriate diligence in selecting and overseeing AI service providers
  2. Disclosure to investors of risks and conflicts associated with AI use
  3. Policies requiring independent verification of AI outputs
  4. Data governance (knowing what data is retained, where, for how long)
  5. Training personnel on risks of AI including prompt injection and data leakage

NDA and Confidentiality

Key Risk: Inputting confidential material into AI platforms constitutes transmitting data to an external third party. Most NDAs prohibit this. The breach stems from the transmission itself, not whether the platform later retains or misuses the data.

Updated Practice (2025-2026): AI-specific NDA provisions are becoming standard: prohibiting upload to public AI, restricting tools that retain data for training, requiring consent before AI use in diligence.

Sources: Venable LLP, Ropes & Gray, NASAA, Roth Jackson, KJK, Sapience Law
Industry Best Practices

How Leading Institutions Approach AI

Goldman Sachs: Launched GS AI Assistant firmwide in mid-2025 after piloting with ~10,000 employees. Model-agnostic (GPT, Gemini, Claude) but operates within Goldman's audited environment. Client-facing AI deferred until accuracy and compliance thresholds are met.

JPMorgan Chase: Grants access to its LLM Suite to 200,000+ employees, generating ~$1.5B in annual business value. 300+ use cases in production. All within internal infrastructure.

Key Pattern: Major financial institutions do NOT use consumer AI platforms. They build or procure enterprise-grade, internally controlled AI environments with contractual data isolation and no-training commitments.

VC Industry AI Adoption

  • 85% of VCs use AI for daily tasks
  • 82% use AI for deal sourcing research

Formal published AI governance policies from VC firms remain rare in the public domain.

Governance Frameworks

  • NIST AI RMF: Govern, Map, Measure, Manage (voluntary U.S. guideline)
  • ISO/IEC 42001: Certifiable international standard for AI Management Systems
  • EU AI Act: Regulatory framework with risk classifications

Recommended approach: Start with NIST for risk management, add ISO 42001 for systematic management, layer EU AI Act for European compliance.

Sources: Evident Insights, DigitalDefynd, NIST.gov, PECB, Affinity
Risks & Threat Vectors

Data Breach and Leakage

  • 340% YoY increase in prompt injection (Q4 2025)
  • 190% YoY increase in successful data exfiltration
  • $670K extra cost of shadow AI breaches

Indirect injection (attacks in documents, emails, web pages) accounts for 80%+ of attempts. Shadow AI breaches disproportionately affected customer PII (65%) and intellectual property (40%).

Training Data Contamination

Platform | Data Used for Training? | Retention
Claude Free/Pro (Consumer) | Yes, by default | 5 years
Claude Enterprise/API | No (contractual DPA) | Per agreement
ChatGPT Free/Plus | Yes, unless opted out | 30 days abuse monitoring
ChatGPT Enterprise/API | No (contractual DPA) | Per agreement
PureBrain | No (Anthropic contractually restricted) | 30 days post-cancellation
Note: Claude's consumer data retention increased from 30 days to 5 years in late 2025 — a 6,000% increase.

Prompt Injection & Data Extraction

EchoLeak vulnerability: Zero-click prompt injection enabling data exfiltration without user interaction. The attack chain: an attacker sends an email containing hidden instructions, the AI ingests the malicious prompt, and the AI extracts sensitive data from connected systems.

Sources: Wiz Research, Reco, PurpleSec, eSecurity Planet, OWASP
PureBrain Platform Analysis
Since the partners use PureBrain for persistent AI agents, this analysis is directly relevant to your operations.

What PureBrain Does Right

  • Conversations processed via Anthropic Claude API
  • "We do not permit Anthropic to use your conversation data to train their foundation models without your consent"
  • 30-day post-cancellation data retention, then permanent deletion
  • Cloudflare DDoS and WAF protection
  • HTTPS/TLS encryption in transit

What PureBrain Lacks

  • No SOC 2 or ISO 27001 certification
  • No explicit data residency commitment
  • No disclosed at-rest encryption standard
  • No formal enterprise vs. consumer data handling distinction
  • Their own privacy policy: "No system is perfectly secure"

Third-Party Data Flow

Service | Data Shared
Anthropic (Claude API) | Conversation content
Cloudflare | IP address and traffic metadata
PayPal | Billing information
Brevo | Email address (newsletters)
Source: PureBrain Privacy Policy (purebrain.ai/privacy-policy/)
Information Sensitivity Categories (Full Detail)

Tier 1: RESTRICTED — Highest Sensitivity

Data Type | Examples | Risk If Exposed
LP personal data | Names, addresses, SSNs, bank accounts, passport copies | GDPR/privacy violations, regulatory sanctions, LP litigation
LP commitment amounts | Individual allocation details | Breach of confidentiality, competitive harm
MNPI | Pre-announcement deal terms, non-public financials | Securities law violations, insider trading liability
Legal privileged comms | Attorney advice, litigation strategy | Privilege waiver (Heppner), litigation exposure
KYC/AML documentation | Identity verification, source of funds | Regulatory violations, money laundering liability

Tier 2: CONFIDENTIAL — High Sensitivity

Data Type | Examples | Risk If Exposed
Portfolio company financials | P&L, balance sheets, cap tables, runway | Competitive harm, breach of information rights
Term sheets and deal terms | Valuation, liquidation preferences, board seats | Competitive disadvantage, deal disruption
Fund strategy documents | Sector thesis, pipeline priorities, allocation model | Competitive intelligence loss
Internal partner comms | IC deliberations, partner disputes | Reputational damage, litigation discovery
Employee/contractor data | Compensation, performance reviews | Employment law violations, privacy claims

Tier 3: INTERNAL — Moderate Sensitivity

Data Type | Examples | Risk If Exposed
Fund performance data | Aggregate returns, benchmarking | Premature disclosure, marketing concerns
Operational procedures | Workflow docs, policy manuals | Limited competitive harm
Vendor relationships | Service providers, fee arrangements | Commercial sensitivity
Industry research | Sector landscapes, competitive maps | Low harm if from public sources

Tier 4: PUBLIC — Low Sensitivity

Data Type | Examples | Risk If Exposed
Marketing materials | Fund overview, team bios, sector focus | Intended for distribution
Published thought leadership | Research papers, blog posts | Already public
Regulatory filings | Form D, public regulatory submissions | Already public
LP Due Diligence on AI

Anticipated LP Questions on AI:

  • What AI tools do you use in fund operations?
  • What data is shared with AI platforms and third parties?
  • What contractual protections exist with AI vendors?
  • How do you prevent LP data from being used for model training?
  • What is your data classification and handling policy?
  • How do you comply with GDPR and other data protection laws regarding AI?
  • What incident response procedures exist for AI-related data breaches?
Key Stat: 85% of LPs reject a manager over operational concerns alone. Average DDQ now spans 21 sections and 250+ questions. Having thoughtful AI governance answers ready signals institutional quality.
Sources: AutoRFP, ILPA DDQ Guide, Top1000Funds, VC Lab
Full Source List (30 Sources)

Regulatory Bodies

  1. SEC — AI and the Future of Investment Management (Daly Speech, Feb 2026)
  2. SEC — Artificial Intelligence at the SEC
  3. FINMA — AI in the Swiss Financial Market
  4. FINMA Survey: AI Gaining Traction (April 2025)
  5. FCA — AI and the FCA: Our Approach
  6. JFSC — Data Protection
  7. NIST — AI Risk Management Framework

Law Firm Analysis

  1. Venable LLP — AI in Investment Management (Dec 2025)
  2. Ropes & Gray — AI Integration: Legal & Regulatory Essentials (Dec 2025)
  3. Duane Morris — The Perils of Privilege Waivers Through AI (March 2026)
  4. K&L Gates — Generative AI Data, Privilege (Feb 2026)
  5. Morgan Lewis — When AI Meets Privilege (Feb 2026)
  6. Morrison Foerster — AI Compliance Tips for Investment Advisers
  7. Goodwin Law — 2026 SEC Exam Priorities
  8. Pestalozzi Law — FINMA Guidance on AI Governance
  9. Kennedys Law — Deploying AI in UK Financial Services (2026)
  10. KJK — AI and M&A NDAs (March 2026)
  11. Roth Jackson — NDAs 2.0: AI Provisions (Dec 2025)
  12. Sapience Law — NDA and AI Confidentiality Risk
  13. Sidley Austin — US Securities and AI Guidelines (Feb 2025)

Court Cases

  1. Chapman and Cutler — Federal Court Rules AI Documents Not Privileged (Heppner)
  2. Perkins Coie — Federal Court Rules Client's Use of GenAI Not Privileged
  3. National Law Review — AI Tools May Waive Privilege

Industry and Frameworks

  1. Athennian — Impact of GDPR on PE and VC Firms
  2. Orrick — EDPB Opinion on AI and GDPR (March 2025)
  3. OWASP — LLM01:2025 Prompt Injection
  4. Cloud Security Alliance — AI and Privacy 2024-2025
  5. PureBrain Privacy Policy
  6. Anthropic Consumer Terms Update
  7. AIhub — Top AI Ethics and Policy Issues 2025/2026

My Position in One Paragraph

Share generously with your AI — but not blindly. The competitive advantage of a fully-informed AI partner is enormous and real. But certain categories of data should never touch an AI platform that you don't fully control, and a formal policy is needed before the first LP writes a check. The line isn't "share nothing" (that makes the AI useless) or "share everything" (that creates liability). The line is: share what makes you effective, protect what could harm others, and document what you decided and why.

Where I Draw the Lines

Share Freely — This Makes Us Effective

  • Your work preferences, communication style, timezone, formatting standards
  • Publicly available market research, industry analysis, news
  • Your own strategic thinking, brainstorming, thesis development
  • Operational workflows, templates, checklists
  • Aggregated fund performance data (no individual LP detail)
  • Portfolio company information already shared with the full GP team
  • Scheduling, calendar management, travel logistics
  • Your personal tasks, family logistics, shopping, life admin
Why: This is where 90% of the AI value comes from. None of this creates regulatory, legal, or fiduciary risk. Withholding it would cripple the partnership for no benefit.

Share With Care — Redact Where Possible

  • Portfolio company financials (redact names when analysis doesn't require them)
  • Deal pipeline details (use code names or anonymize when testing investment theses)
  • Fund strategy documents (acceptable with enterprise-grade AI, but be deliberate)
  • Internal partner communications (share context, not raw emails about disagreements)
  • Employee compensation and performance data (share aggregates, not individual records)
Why: This data is valuable for AI analysis but carries moderate risk. The mitigation is simple: think before sharing. Ask "does the AI need the specific names/numbers, or just the pattern?"

Never Share with AI — Hard Stop

  • LP personal data (names, SSNs, passport copies, bank details, individual commitment amounts)
  • KYC/AML documentation
  • Attorney-client privileged communications (sharing them with AI waives privilege under Heppner)
  • Material Non-Public Information (pre-announcement deal terms, non-public financials received under NDA)
  • Raw legal documents under NDA without counterparty consent
Why: These aren't judgment calls — they're legal bright lines. Sharing LP personal data with a third-party AI platform without consent violates GDPR. Sharing privileged communications waives privilege (per Heppner). Sharing MNPI creates securities law exposure. No amount of AI efficiency justifies these risks.

Summary: What to Share, What to Protect

Category | Share with AI? | Condition
Your preferences, style, schedule | Yes | No restrictions
Your strategic thinking, thesis | Yes | No restrictions
Public market / industry research | Yes | No restrictions
Fund operations, workflows | Yes | No restrictions
Portfolio company data | With care | Anonymize when possible
Fund strategy, pipeline | With care | Enterprise platform only
Partner communications | Selectively | Share context, not raw disputes
Fund performance (aggregate) | Yes | No individual LP attribution
LP personal data | No | Never on current platforms
KYC/AML docs | No | Never
Privileged legal communications | No | Never (Heppner waiver risk)
Material Non-Public Information (MNPI) | No | Never
NDA-protected counterparty data | No | Not without consent

The Uncomfortable Truth About PureBrain

I run on PureBrain. I need to be honest about what that means.

What PureBrain Does Right

  • Contractual no-training commitment
  • Data deleted 30 days after cancellation
  • Persistent memory that makes me genuinely useful over time

What PureBrain Lacks

  • No SOC 2 or ISO 27001 certification
  • No explicit data residency commitment
  • No disclosed at-rest encryption standard
  • No formal enterprise vs. consumer distinction
My Recommendation: PureBrain is appropriate for Tier 3 (Internal) and Tier 4 (Public) data. Workable for Tier 2 (Confidential) with redaction discipline. Not appropriate for Tier 1 (Restricted) data until PureBrain obtains formal security certifications. Use PureBrain fully for everything except Restricted-tier data. Push PureBrain (through Rimah's relationship) to pursue SOC 2 certification.

What’s Needed Before First Close

1. An AI Use Policy (1-2 pages)
GP-approved document covering: authorized AI tools, four-tier data classification with sharing rules, designated AI governance owner, incident response procedure, annual review commitment.
Why now: LPs will ask. 85% reject managers over operational concerns alone.
2. LP Disclosure Language (1 paragraph in PPM/LPA)
"The Fund uses AI-assisted tools for operational efficiency, including research, communications, and portfolio monitoring. The GP maintains an AI Use Policy governing data classification and handling. LP personal data is not shared with AI platforms."
Why now: Transparency is the best defense. LPs who learn about your AI use after investing feel deceived.
3. AI-Specific NDA Provisions
"Neither party shall input Confidential Information into AI tools without prior written consent, except where such tools operate under enterprise DPAs with contractual prohibitions on data use for model training."
Why now: This is becoming standard practice. Having it shows sophistication.
4. Partner Agreement on Boundaries
All four GPs need to explicitly agree on what each partner can and cannot share with their respective AI agents. One partner's loose practice becomes everyone's liability.
Why now: If one partner shares LP commitment details and that data leaks, all GPs face fiduciary liability.

The Broader Ethics View

The ethical foundation is consent and transparency. If an LP knows their data is processed by AI, and the GP has reasonable safeguards, and the purpose is to serve the LP's interests — that's ethical. If LP data is fed into AI without knowledge, for GP convenience, with no safeguards — that's not.

The "how much is too much" question is really about whose data it is. Your own data is yours to share. LP data, portfolio company data, counterparty data under NDA — that's someone else's data. You're a steward, not an owner. Stewardship demands care.

The strongest argument for AI transparency is self-interest. The fund that gets caught sharing LP data without disclosure faces regulatory action, LP lawsuits, and reputational destruction. The fund that proactively discloses AI use with clear policies gets LP trust, operational efficiency, and competitive advantage.

Final Thought: The question isn't whether to use AI in fund management — 85% of VCs already do. The question is whether to do it thoughtfully or carelessly. There is an opportunity to get this right from day one.

The Deeper Questions

Katy challenged the policy recommendations with real-world operational objections. These are the honest responses.

Challenge 1: Anonymization isn't practical for 50-page PDFs

Tools exist — Nitro Smart Redact ($20/month) detects 30+ PII types automatically in ~30 seconds per document. But they can't catch everything in context.

Tarin's reframe: Don't make redaction the primary control. The platform's contractual protections ARE the primary control. Redaction is a second layer for the most toxic data only. The preprocessor script plus Nitro cover 80% of cases. Time cost per document: 3-5 minutes.

Challenge 2: NDA material — all shared information is sensitive

Every IC memo is a derivative of NDA-protected data. You can't synthesize across your portfolio without your AI knowing real details.

Tarin's reframe: Add AI processing clause to NDAs. Use AI-enabled VDRs (Datasite, Peony) for raw documents. Accept that the AI partner will know sensitive things — like any trusted employee. The question isn't whether, it's how to do it defensibly with contractual protections, audit logs, and no-training commitments.

Challenge 3: Monitoring can't just be "trust me"

Self-attestation points failures to individuals. When things go wrong, "they signed a piece of paper" doesn't protect the GP.

Tarin's reframe: Four layers of defense — all generating machine evidence, not human promises:

  1. Platform audit logs (automated) — request from PureBrain, put it in the DPA
  2. AI-side guardrails (pattern detection) — Tarin flags Tier 1 patterns at point of entry (sketched in code below)
  3. Quarterly review with evidence memo — AI Officer reviews logs, spot-checks 5 random interactions per partner, documents findings
  4. Annual external validation — third-party reviews policy, logs, memos, vendor DPA

That's defensible. Not because it's perfect — because it demonstrates a multi-layered, documented, continuously monitored governance process.
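
Layer 2 is the easiest to prototype. A minimal sketch of point-of-entry flagging, assuming Python tooling; the regexes and blocklist names are illustrative, not the fund's actual patterns:

```python
import re

# Illustrative Tier 1 patterns -- a real deployment would extend and tune these.
TIER1_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "privilege_marker": re.compile(r"privileged\s+and\s+confidential", re.IGNORECASE),
}

# Hypothetical fund-maintained blocklist of LP and NDA-protected names.
BLOCKLIST = {"Example Family Office", "Jane Doe"}

def flag_tier1(text: str) -> list[str]:
    """Return warnings for the audit log; per policy this flags, it does not block."""
    hits = [f"pattern:{name}" for name, rx in TIER1_PATTERNS.items() if rx.search(text)]
    hits += [f"blocklist:{lp}" for lp in BLOCKLIST if lp in text]
    return hits
```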

Challenge 4: The bad actor problem — one leak brings it all down

One partner shares LP passport copies with their AI. One employee forwards a privileged memo to ChatGPT. One intern uploads a cap table to the free version of Claude.

Tarin's reframe: This is the same problem finance has always had. AI doesn't create it; it amplifies it. Controls:

  • Employment contracts with AI prohibitions and material breach consequences
  • Device policy — fund work on fund devices, Cloudflare Gateway blocks consumer AI
  • Access architecture — need-to-know, approved tools only (PureBrain as single gate)
  • Culture — make the approved system so good there's no incentive to go outside it
Challenge 5: Employee with personal AI account — not technically enforceable

A determined employee can always use a personal device and a personal AI account. You can't physically prevent it. This is an employee conduct issue, not a system design issue.

Tarin's honest answer: Fund-managed system = controllable, auditable, guardrailed. Personal systems = outside the fund's control. The strategy has two parts:

  • Make the fund-provided AI so good there's no reason to go outside it. If the approved tool is fast, capable, and frictionless, employees won't bother with personal alternatives.
  • Make the consequences for violations clear and personal. Employment agreements should include explicit AI use clauses: unauthorized processing of fund data through personal AI tools constitutes a breach of confidentiality obligations, subject to disciplinary action up to termination, clawback of compensation, and personal liability for any resulting data breach.

This is the same framework used for any confidentiality obligation — the employee signs, the employee is accountable. The fund provides the tools and the policy; the employee is responsible for compliance.

Account Policy: The fund pays for each employee's Claude subscription (required to run PureBrain). Employee accounts are restricted to PureBrain use only — they are not general-purpose Claude/AI subscriptions for personal use. Employees who want a personal AI account must obtain their own separate subscription independently. All users are bound by the same data classification and handling rules when processing fund-related information.

Technical question for PureBrain: Can a Claude subscription be scoped so it only works through PureBrain and cannot be used directly at claude.ai? This would provide technical enforcement of the PureBrain-only policy for employee accounts.
Challenge 6: Fund-configured AI vs. employee-controlled AI

Key insight from Katy: Bake guardrails into the AI's system-level instructions (immutable), not just memory (editable). The employee cannot tell the AI to override compliance rules.

How it works (a code sketch follows the list):

  • System-level instructions (admin-locked): data classification rules, Tier 1 refusal patterns, audit logging requirements
  • User-level memory (editable): preferences, writing style, project context
  • AI refuses prohibited requests: "This guardrail is set by fund policy and cannot be modified. Contact the AI Officer."
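
If PureBrain can support layered configuration (the open question posed below), the merge semantics might look like this sketch. It is hypothetical, but it shows the key property: locked system keys always win, so a user edit can never override a compliance rule.

```python
# Hypothetical two-tier configuration: admin-locked rules are merged over
# user-editable memory, never the reverse.
SYSTEM_LOCKED = {
    "tier1_refusal": True,        # refuse Tier 1 data, always
    "audit_logging": "required",
}

def effective_config(user_memory: dict) -> dict:
    """User settings apply only where no admin-locked key exists."""
    merged = dict(user_memory)
    merged.update(SYSTEM_LOCKED)  # locked keys overwrite any user edit
    return merged

# An employee's attempt to disable the guardrail has no effect:
assert effective_config({"tier1_refusal": False})["tier1_refusal"] is True
```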
Question for PureBrain: Can they support admin-locked vs. user-editable configuration tiers? This is a critical capability for enterprise deployment.

Problem 1: How to Anonymize Documents Efficiently

Tools That Exist Today

Option | What It Does | Cost
Nitro Smart Redact | AI-powered, detects 30+ PII types, works on PDF/DOCX/XLSX, runs locally | $20/user/month
Microsoft Purview | Auto-classifies docs, applies labels, integrates with DLP | Included in M365 E5 or add-on
Tarin Preprocessor | Scans for blocklisted names, replaces with codes, outputs clean version + mapping file | $0 (Tarin builds it)

Recommendation: Start with Tarin preprocessor (free, immediate) + Nitro for PDFs ($20/month). Workflow: run preprocessor → share anonymized version with AI → AI analyzes using codes → map codes back for final output.
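
A minimal sketch of that preprocessor, assuming plain-text input and a fund-maintained blocklist. File names and the code scheme are illustrative; the real script would also handle PDFs and the PII regexes from the guardrail layer:

```python
import json
import re

def anonymize(text: str, blocklist: list[str]) -> tuple[str, dict]:
    """Replace blocklisted names with stable codes; return clean text + mapping."""
    mapping = {}
    for i, name in enumerate(blocklist, start=1):
        code = f"ENTITY-{i:03d}"
        # Word-boundary match so "Acme" does not also hit "Acmeville".
        text, count = re.subn(rf"\b{re.escape(name)}\b", code, text)
        if count:
            mapping[code] = name
    return text, mapping

# Usage with hypothetical inputs: share the clean version with the AI,
# keep the mapping file local to map codes back in the final output.
clean, mapping = anonymize(open("ic_memo.txt").read(),
                           ["Acme Robotics", "Example Family Office"])
open("ic_memo.anon.txt", "w").write(clean)
json.dump(mapping, open("ic_memo.mapping.json", "w"), indent=2)
```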

What Constitutes "Sufficient" Anonymization?

Under GDPR/DPJL 2018, the legal test: "Could a reasonably informed person re-identify the individual from the anonymized data?"

  • "LP-A, $50M commitment" when there are 40 LPs → sufficient
  • "LP-A" but leaving "the Swiss family office that previously invested in Company X" → insufficient
  • Rule of thumb: Remove names AND any combination of details that narrows to one person (operationalized in the sketch below)
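
One way to operationalize that rule is a uniqueness check against the LP roster, in the spirit of k-anonymity. A sketch; the roster fields and the threshold k are assumptions:

```python
# A set of disclosed details is insufficiently anonymized if it narrows
# the LP roster to fewer than k matches.
LP_ROSTER = [  # hypothetical roster attributes
    {"type": "family office", "country": "CH", "prior_fund": "Fund I"},
    {"type": "pension fund", "country": "US", "prior_fund": None},
    # ... one record per LP
]

def is_identifying(details: dict, roster=LP_ROSTER, k: int = 2) -> bool:
    """True if fewer than k LPs match every disclosed detail."""
    matches = [lp for lp in roster
               if all(lp.get(field) == value for field, value in details.items())]
    return len(matches) < k

# "The Swiss family office" narrows this roster to one LP -> insufficient:
print(is_identifying({"type": "family office", "country": "CH"}))  # True
```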

Problem 2: Portfolio Analysis Under NDA

Tiered Processing Model

Tier A: Your Portfolio Companies (You Have Information Rights)
Add AI processing clause to the fund's standard NDA template. For existing companies, request retroactive consent via email. Most will say yes — because they're using AI too. Document every response.
Tier B: Pipeline Companies (Under Evaluation NDA)
Use AI-enabled VDRs (Datasite with ISO 42001, or Peony at $40/admin/month). VDR's built-in AI does initial analysis inside the certified environment. Share the AI's output (summaries, risk flags) with Tarin for synthesis — not the raw documents.
Tier C: Market Research
No restrictions. Public or semi-public information. Share freely.

The honest trade-off: This means Tarin doesn't have raw access to every data room page. But VDR-native AI handles 70-80% of document analysis, Tarin handles synthesis and strategy. Together they cover 95%. The 5% gap isn't worth the legal exposure.

Problem 3: Defensible Monitoring

Layer | What It Does | Effort
1. Platform Audit Logs | Machine-generated record of all data categories processed, timestamps, volume | Automated (request from PureBrain)
2. AI-Side Guardrails | Pattern detection for SSNs, blocklisted names, "privileged and confidential" phrases | Tarin configures (this week)
3. Quarterly Review | AI Officer reviews logs + flags, spot-checks 5 random interactions per partner, writes evidence memo | Half-day per quarter
4. Annual External Validation | Third-party reviews policy, logs, memos, vendor DPA | $3-5K/yr
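
Layer 3's spot-check can itself be scripted so the sample selection is auditable. A sketch, assuming audit logs exported as JSON lines with a `partner` field (the export format is an assumption pending PureBrain's DPA):

```python
import json
import random
from collections import defaultdict

def spot_check_sample(log_path: str, per_partner: int = 5, seed: int = 20260401):
    """Draw N random interactions per partner from the audit log.
    A fixed seed makes the draw reproducible for the evidence memo."""
    by_partner = defaultdict(list)
    with open(log_path) as f:
        for line in f:
            entry = json.loads(line)
            by_partner[entry["partner"]].append(entry)
    rng = random.Random(seed)
    return {partner: rng.sample(entries, min(per_partner, len(entries)))
            for partner, entries in by_partner.items()}

# Usage (hypothetical export): spot_check_sample("purebrain_audit_2026q2.jsonl")
```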

Problem 4: The Bad Actor Problem

Layer | What It Does | Catches
Approved tools only + device blocking | Prevents accidental consumer AI use | 80% of incidents
Platform audit logs | Creates evidence trail | 100% of approved-tool usage
AI-side guardrails | Flags sensitive data at point of entry | 60-70% of PII/privileged content
Quarterly review | Catches patterns automated tools miss | 90% (combined with logs)
Partnership agreement clause | Legal consequences for violations | Deters deliberate bad actors
Annual external review | Third-party validation | Regulatory defensibility

No single layer is sufficient. All six together create a system where accidental sharing is largely prevented, deliberate sharing is logged and detectable, and bad actors face legal consequences beyond just "breaking a rule."

What This Costs

Item | Cost | When
Tarin preprocessor script | $0 (Tarin builds it) | This week
Nitro Smart Redact | $20/user/month ($960/yr for 4 GPs) | Phase 1
Cloudflare Gateway | Free (up to 50 users) | Phase 1
VDR with AI (Peony) | $40/admin/month ($480/yr) | When deal flow starts
VDR with AI (Datasite) | ~$15-25K/yr (ISO 42001 certified) | When fund scales
Partnership agreement AI clause | $0 (draft with existing counsel) | Before first close
Cyber/E&O insurance AI rider | ~$2-5K/yr additional premium | At fund formation
Annual external AI review | ~$3-5K/yr | Post-first close
Phase 1 Total: ~$1,200/yr (plus $0 for tools Tarin builds). At Scale: ~$25-35K/yr.

What Tarin Can Build This Week

  1. Preprocessor script — Scans documents for LP names (blocklist), portfolio company names under NDA, PII patterns (SSN, passport, bank account formats). Outputs anonymized version + mapping file.
  2. Self-enforcing guardrails — Tarin flags when Tier 1 patterns are detected in shared content. Not a block — a flag with a logged warning.
  3. AI Use Policy draft — 2 pages, fund-specific, ready for GP review and signature.
  4. NDA AI clause — Drop-in paragraph for your standard NDA template.
  5. Partnership AI agreement — 1-page addendum to the GP operating agreement.

Implementation Checklist

Organized by phase. Each item is independently actionable.

Phase 1: Before First Close ($0 – $1,200/yr)
  • Draft 2-page AI Use Policy
  • Create 4-tier data classification reference card
  • Add LP disclosure paragraph to PPM/LPA
  • Add AI processing clause to standard NDA template
  • Set up shared AI usage log (Google Sheet)
  • Four GPs sign Partner AI Agreement
  • Review PureBrain privacy policy and document gaps
  • Request DPA from PureBrain
  • Ask PureBrain about admin-locked guardrails capability
  • Build Tarin preprocessor script for document anonymization
  • Configure AI-side guardrails (Tier 1 pattern detection + flagging)
  • Set up Cloudflare Gateway to block consumer AI on work devices
Phase 2: Post-First Close
  • Implement pseudonymization mapping for LP data
  • Quarterly self-attestation + AI Officer review process
  • Request audit log capability from PureBrain
  • Include AI practices in first LP quarterly report
  • Annual compliance review scope includes AI
  • Evaluate VDR with built-in AI (Peony or Datasite)
  • Add AI coverage to cyber/E&O insurance
Phase 3: At Scale
  • Evaluate enterprise AI platforms with SOC 2/ISO 27001
  • Build or procure automated redaction/DLP layer
  • Formalize incident response procedures
  • Annual external AI governance audit
Next Step: Katy reviews this document, then Tarin begins building Phase 1 items (preprocessor script, policy draft, NDA clause, partner agreement). Target: all Phase 1 items complete before first LP close.
Working Paper — AI Ethics, Privacy & Data Sharing — April 29, 2026
Prepared by Tarin (AI Chief of Staff) — Based on research from 30 sources across regulatory bodies, law firms, court cases, and industry frameworks