Enterprise AI Security Pre-Flight Checklist

A practical go-live gate for connecting LLMs to enterprise data safely.

If you're connecting LLMs to real organizational data, this is the "don't embarrass yourself in production" list.

1) Identity and Access

  • SSO + MFA enforced for all users (no exceptions, no "temporary" accounts)
  • Separate identities for humans, services, and agents (no shared "data_bot")
  • Least privilege by default: start read-only, scoped datasets, scoped tools
  • RBAC/ABAC defined (roles/attributes mapped to reality, not fantasy)
  • Row/column-level security for sensitive domains (HR, finance, customer PII)
  • Time-bound access for elevated permissions (auto-expire; approvals logged)
  • Secrets management + rotation (no long-lived keys; no creds in prompts/logs)
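
The access items above can be collapsed into a single grant object: read-only by default, scoped to named datasets, and time-bound so elevated permissions expire on their own. A minimal sketch; the `AccessGrant` class and its field names are illustrative assumptions, not a real API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

@dataclass
class AccessGrant:
    principal: str           # separate identities for humans, services, agents
    datasets: frozenset      # explicitly scoped datasets, nothing implicit
    can_write: bool = False  # least privilege: read-only by default
    expires_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc) + timedelta(hours=1)
    )

    def allows(self, dataset: str, write: bool = False) -> bool:
        if datetime.now(timezone.utc) >= self.expires_at:
            return False     # time-bound: access auto-expires
        if dataset not in self.datasets:
            return False     # scoped: only the named datasets
        return self.can_write or not write

grant = AccessGrant("agent:report-bot", frozenset({"sales_summary"}))
assert grant.allows("sales_summary")                  # in-scope read: ok
assert not grant.allows("sales_summary", write=True)  # writes blocked
assert not grant.allows("hr_payroll")                 # out of scope
```

The point of the shape is that every check fails closed: an expired, out-of-scope, or write request is denied unless something explicitly allows it.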

2) Data Classification and Handling

  • Classification tags applied (PII/PHI/PCI/Confidential/Public)
  • Approved-use policies per classification (e.g., "PII can't leave VPC")
  • Masking/tokenization defined and enforced
  • Retention rules for data and AI artifacts (prompts, transcripts, embeddings)
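
Classification only works if a policy table is consulted before data reaches a prompt. A minimal sketch of per-tag handling plus masking; the policy table and mask format are assumptions, not a standard.

```python
# Per-classification handling policy, enforced at prompt-assembly time.
POLICY = {
    "PII":          {"leave_vpc": False, "mask": True},
    "Confidential": {"leave_vpc": False, "mask": False},
    "Public":       {"leave_vpc": True,  "mask": False},
}

def mask(value: str) -> str:
    # Keep the last 4 characters for operator debugging; redact the rest.
    return "*" * max(len(value) - 4, 0) + value[-4:]

def prepare_field(value: str, tag: str) -> str:
    policy = POLICY[tag]  # unknown tags fail closed with a KeyError
    return mask(value) if policy["mask"] else value

assert prepare_field("555-12-6789", "PII") == "*******6789"
assert prepare_field("Q3 roadmap", "Public") == "Q3 roadmap"
```

Retention rules apply the same way to AI artifacts: prompts, transcripts, and embeddings get a classification tag too, not just the source rows.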

3) Agent Tooling Controls (the "sharp objects" section)

  • Tool allowlist (explicit connectors/actions; everything else blocked)
  • Read vs write separation (writes require stronger auth + gating)
  • Action approvals for high-risk actions (exports, perms, deletions, sends)
  • Sandboxed execution (no unrestricted network, no arbitrary file writes)
  • Query validation to prevent exfil patterns; enforce limits and safe joins
  • Egress controls (network policies, domain allowlists, "no internet" mode)
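
The allowlist-plus-gating pattern above is small enough to sketch directly: unknown tools are blocked outright, and write-capable tools are held for approval. Tool names and the approval flag are illustrative assumptions.

```python
# Explicit tool allowlist; everything not listed is blocked.
ALLOWED_TOOLS = {
    "sql_query":   {"write": False},
    "send_report": {"write": True},  # high-risk: moves data externally
}

def invoke(tool: str, approved: bool = False) -> str:
    meta = ALLOWED_TOOLS.get(tool)
    if meta is None:
        return "blocked"             # default-deny: not on the allowlist
    if meta["write"] and not approved:
        return "pending_approval"    # writes require explicit human gating
    return "executed"

assert invoke("sql_query") == "executed"
assert invoke("send_report") == "pending_approval"
assert invoke("send_report", approved=True) == "executed"
assert invoke("shell_exec") == "blocked"
```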

4) Prompt Injection and Retrieval Safety

  • Treat retrieved content as untrusted input
  • Grounding required: answers cite catalog/contracts/lineage where possible
  • Permission-filtered retrieval (per-user isolation, per-dataset constraints)
  • Output filtering: redact sensitive values; block disallowed classes
  • No training on customer data unless explicitly governed/contracted
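
Two of the items above combine into one pattern: filter at retrieval time by the caller's permissions (never after generation), and redact sensitive values on the way out. A minimal sketch; the corpus, permission sets, and regex are illustrative assumptions.

```python
import re

CORPUS = [
    {"dataset": "handbook", "text": "PTO policy: 20 days."},
    {"dataset": "payroll",  "text": "J. Doe salary: 120000. SSN 555-12-6789"},
]

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def retrieve(query: str, user_datasets: set) -> list:
    # Permission filter at retrieval time: documents the caller cannot
    # read never enter the model's context in the first place.
    return [d["text"] for d in CORPUS if d["dataset"] in user_datasets]

def redact(text: str) -> str:
    # Output filter: defense in depth for anything that slips through.
    return SSN_RE.sub("[REDACTED-SSN]", text)

docs = retrieve("benefits", {"handbook"})
assert docs == ["PTO policy: 20 days."]
assert redact("SSN 555-12-6789") == "SSN [REDACTED-SSN]"
```

Filtering before retrieval matters because retrieved content is untrusted: an injected instruction can't ask the model to reveal a document it never saw.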

5) Auditability and Evidence

  • End-to-end audit logs: who asked, what was retrieved/queried/output
  • Immutable log storage with compliance-aligned retention
  • Lineage + run logs connected to AI actions ("this answer came from...")
  • Change control for prompts/tools/policies (versioning + approvals + rollbacks)
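
"Immutable" audit storage is usually an object-store or SIEM property, but the idea is easy to show in miniature with a hash chain: each record commits to the one before it, so any edit breaks verification. Field names here are assumptions.

```python
import hashlib
import json

def append_record(log: list, record: dict) -> None:
    # Each record hashes the previous record's hash plus its own body.
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    log.append(dict(record, prev=prev_hash, hash=digest))

def verify(log: list) -> bool:
    prev_hash = "0" * 64
    for rec in log:
        body = json.dumps(
            {k: v for k, v in rec.items() if k not in ("prev", "hash")},
            sort_keys=True,
        )
        expected = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        if rec["prev"] != prev_hash or rec["hash"] != expected:
            return False  # chain broken: a record was altered or removed
        prev_hash = rec["hash"]
    return True

log = []
append_record(log, {"who": "agent:report-bot", "asked": "Q3 revenue",
                    "retrieved": ["sales_summary"]})
append_record(log, {"who": "agent:report-bot", "output_class": "Confidential"})
assert verify(log)
log[0]["asked"] = "tampered"   # any edit invalidates the chain
assert not verify(log)
```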

6) Observability and Detection

  • Data observability: freshness, volume, distribution shifts, schema changes
  • Security observability: query anomalies, export spikes, new principals, drift
  • Cost observability: token usage + query spend per team/agent
  • Alerts routed to owners (not #general) with playbooks attached
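
An "export spike" detector can be as simple as a z-score against the recent baseline. A minimal sketch; the threshold and window are assumptions you'd tune per team.

```python
from statistics import mean, pstdev

def is_export_spike(history: list, today: float,
                    z_threshold: float = 3.0) -> bool:
    # Flag a value that sits far outside the recent baseline.
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return today != mu      # flat baseline: any change is notable
    return (today - mu) / sigma > z_threshold

baseline = [100, 110, 95, 105, 90, 100, 108]  # rows exported per day
assert not is_export_spike(baseline, 115)     # normal variation
assert is_export_spike(baseline, 500)         # route this to the owner
```

The same shape covers the other observability items: swap in token counts, query spend, or new-principal counts as the metric.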

7) Incident Response (it's "when," not "if")

  • Playbooks for: leaks, injection, tool misuse, credential exposure
  • Kill switch: disable an agent/connector instantly
  • Key rotation runbook tested (not theoretical)
  • Postmortems include blast radius + datasets + outputs + remediations
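
A kill switch only works instantly if every action checks it. In practice the flag lives in a fast shared store (feature flags, a key-value cache); this sketch uses an in-process set, and all names are illustrative.

```python
DISABLED: set = set()  # in production: a shared, low-latency flag store

def kill(target: str) -> None:
    # Operator action: disable an agent or connector by identifier.
    DISABLED.add(target)

def pre_action_check(agent: str, connector: str) -> bool:
    # Consulted before EVERY action, so a flip takes effect on the
    # very next call rather than waiting for a restart or redeploy.
    return agent not in DISABLED and connector not in DISABLED

assert pre_action_check("agent:report-bot", "connector:warehouse")
kill("connector:warehouse")
assert not pre_action_check("agent:report-bot", "connector:warehouse")
assert pre_action_check("agent:report-bot", "connector:catalog")
```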

8) Trust Gradients (adoption hinge)

  • Graduated autonomy levels (Manual -> Suggested -> Assisted -> Autonomous)
  • Show sources + planned actions + impact before execution
  • Risk scoring for actions (higher risk -> more friction/approval)
  • Easy rollback for automated changes
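
The graduated-autonomy idea above reduces to a mapping from (risk score, autonomy level) to required friction. A minimal sketch; the action names, risk scores, and friction labels are assumptions you'd define per deployment.

```python
LEVELS = ["manual", "suggested", "assisted", "autonomous"]

# Illustrative per-action risk scores (higher = more dangerous).
RISK = {"read_table": 1, "export_csv": 3, "delete_rows": 5}

def required_friction(action: str, level: str) -> str:
    risk = RISK[action]  # unknown actions fail closed with a KeyError
    # Higher risk or lower autonomy -> more friction before execution.
    if risk >= 5 or level == "manual":
        return "human_executes"
    if risk >= 3 or level == "suggested":
        return "human_approves"
    return "auto_with_rollback"

assert required_friction("read_table", "autonomous") == "auto_with_rollback"
assert required_friction("export_csv", "autonomous") == "human_approves"
assert required_friction("delete_rows", "autonomous") == "human_executes"
```

Note that even the fully automated path carries a rollback: autonomy is earned per action class, never granted wholesale.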