💡
Key Concept: Vault is a control plane, not a separate data store
Google Vault does not store copies of data. It operates as a control and metadata layer over the existing Gmail, Drive, Chat, and other Workspace stores. Holds, retention rules, and searches in Vault act as flags and directives applied to data that lives in the native service stores. Export is the only operation that produces a separate copy.
Vault Coverage — Supported Data Sources
Services whose data Vault can hold, search, and export
✉️ Gmail 📁 Drive 💬 Chat 📹 Meet recordings 👥 Groups 📅 Calendar 🌐 Sites
Note: Vault does not cover third-party app data in Drive or Chat messages sent with history turned off. Google Voice data is covered only for accounts licensed for Google Voice for Google Workspace.
📁
Matter Object Model
The top-level organizational unit in Vault
Matter
Case or investigation container
A Matter is the top-level container in Vault, representing a legal case, investigation, audit, or compliance project. All Vault activity — holds, searches, exports — is scoped to a specific matter. Matters are independent of each other.
  • Unique matter ID assigned at creation
  • Status: Open, Closed, or Deleted
  • Closed matters are read-only
  • Deleted matters are purged after 30 days
  • Matter access controlled by collaborator list
Matter Collaborators
Per-matter access control
Access to a matter is controlled by a collaborator list. Roles: Owner (full control, can delete) and Collaborator (can view, search, and export but not delete). Vault admins can access all matters regardless of collaborator list.
  • One owner per matter (creator by default)
  • Multiple collaborators allowed
  • Access scoped to the matter, not org-wide
  • Vault Admin role bypasses collaborator list
Matter Audit Trail
Immutable log of all matter activity
Every action taken within a matter is logged: hold creation/modification/release, search execution, export creation, collaborator changes, and matter status changes. Audit logs are immutable and cannot be deleted by Vault users — accessible via Vault audit reports and the Admin SDK Reports API.
  • Captures who did what and when
  • Cannot be modified or deleted by admins
  • Retained even after matter deletion
  • Exportable via Admin SDK audit reports
🔒
Holds
Preservation directives on user data
Hold Object
Indefinite preservation marker
A hold is a directive stored in Vault metadata that instructs the native service (Gmail, Drive, Chat) to preserve matching data indefinitely, regardless of user actions or retention rules. The hold itself is a lightweight record — the data remains in its native store.
  • Stored in Vault metadata, not the data store
  • No expiry — indefinite until released
  • Overrides all retention policies
  • Overrides user deletion, filter-to-delete, and auto-purge
Hold Scope
Who and what is preserved
A hold targets one data source (Gmail, Drive, Chat, Groups, Meet, Voice) and specifies the custodians (individual users) or org units to hold. Optionally, a search query (date range, terms) can narrow the hold to matching data only.
  • Custodian hold: named users or service accounts
  • Org unit hold: all current/future members of an OU
  • Query-based: only data matching search criteria
  • Non-query hold: all data for the custodian
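The scope options above map fairly directly onto a hold request body. The following is a minimal sketch, assuming the field names of the Vault API v1 Hold resource (`corpus`, `accounts`, `orgUnit`, `query.mailQuery`). The helper name `build_mail_hold` is hypothetical, and the field names should be checked against the current API reference before use.

```python
from typing import Dict, List, Optional


def build_mail_hold(
    name: str,
    emails: Optional[List[str]] = None,
    org_unit_id: Optional[str] = None,
    terms: Optional[str] = None,
) -> Dict:
    """Build a request body for a Gmail hold (hypothetical helper)."""
    # A hold targets either named custodians or an org unit, not both.
    if bool(emails) == bool(org_unit_id):
        raise ValueError("specify exactly one of emails or org_unit_id")

    body: Dict = {"name": name, "corpus": "MAIL"}
    if emails:
        body["accounts"] = [{"email": e} for e in emails]  # custodian hold
    else:
        body["orgUnit"] = {"orgUnitId": org_unit_id}  # org unit hold

    # Query-based hold: only matching data is preserved.
    # Without a query, all data for the custodian is held.
    if terms:
        body["query"] = {"mailQuery": {"terms": terms}}
    return body
```

For example, `build_mail_hold("Case 42", emails=["a@example.com"], terms="project-x")` yields a query-based custodian hold, while omitting `terms` yields a non-query hold.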
Hold Behavior on Delete
Preserved data stays in native store
When a user deletes a message or file that is under a hold, the item is removed from their view (moved to Trash, then appears deleted) but is retained in a hidden preserved state in the native service store. Vault can still search and export it. The item is only truly deleted when the hold is released and any retention period has expired.
  • Invisible to end users after deletion
  • Still indexed and searchable via Vault
  • Survives emptying Trash, but not permanent deletion of the user account
  • Account suspension does not release holds
Hold Release
Explicit admin action, not automatic
Holds must be explicitly released by a Vault admin or matter collaborator. There is no automatic expiry. Releasing a hold does not immediately delete the preserved data — the data then becomes subject to normal retention rules, which may result in eventual deletion.
  • Manual action required — no scheduled expiry
  • Released data then governed by retention policy
  • If no retention rule: user-controlled lifecycle resumes
  • Release is logged in the matter audit trail
⏱️
Retention Rules
Time-based data lifecycle policies
Retention Rule Object
Service + scope + duration + action
A retention rule specifies: the data service (Gmail, Drive, Chat), the scope (entire domain, specific OUs, or specific users), the retention duration (in days from creation or receipt), and the action on expiry (delete or do nothing). Rules are evaluated continuously against all in-scope data.
  • Duration: 1 day to 36,500 days (100 years)
  • Action on expiry: delete, or do nothing (no-op)
  • Scope: domain-wide, OU, or individual user
  • More specific rules override broader ones
Default vs Custom Rules
Rule priority cascade
If no custom rule applies to a user's data, the default retention rule for that service applies. Custom rules (scoped to an OU or specific users) override the default rule for their scope. The most specific rule wins — individual user rule beats OU rule beats default.
  • Default rule: applies when no custom rule matches
  • Custom rule: OU-scoped or user-scoped
  • Priority: User > OU > Default
  • No rule = data retained until user deletes
Retention vs Holds Interaction
Hold always wins
When both a retention rule and a hold apply to the same data, the hold takes precedence: the data is preserved indefinitely regardless of the retention rule's expiry timer. When the hold is released, the retention rule applies again. The retention period is still measured from the item's original date, so if it has already elapsed, deletion may follow shortly after the hold is released.
  • Hold overrides retention expiry
  • Retention resumes after hold release
  • Post-release deletion is not immediate
  • Retention can also extend beyond hold (if rule is longer)
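The interaction between holds and retention can be modeled as a small predicate. This is an illustrative sketch only, not Vault's implementation: the function name and parameters are hypothetical, and "eligible" means the item may be picked up by a later deletion batch, not that it is deleted at that instant.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def eligible_for_purge(
    created: datetime,
    retention_days: Optional[int],
    on_hold: bool,
    now: datetime,
) -> bool:
    """Return True if an item could be purged under this simplified model."""
    if on_hold:
        return False  # an active hold always wins
    if retention_days is None:
        return False  # no rule applies: the user controls deletion
    # The retention period is measured from the item's original date,
    # so time spent on hold still counts toward it.
    return now >= created + timedelta(days=retention_days)


created = datetime(2020, 1, 1, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
held = eligible_for_purge(created, 365, on_hold=True, now=now)
released = eligible_for_purge(created, 365, on_hold=False, now=now)
```

In this example `held` is `False` and `released` is `True`: the one-year period elapsed while the item was on hold, so it becomes eligible as soon as the hold is released.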
Expiry and Deletion
How data is purged after retention period
When a retention rule's duration expires for a piece of data, Vault marks it for deletion from the native service store. Deletion is not instantaneous; Google processes it in batches. Once processed, the deletion is permanent. Data can be rescued only if a hold is applied or the rule is changed before processing completes.
  • Expiry triggers deletion from native store
  • Not immediate — batch processing delay
  • Deletion is permanent and unrecoverable
  • Changing rule before processing can rescue data
🔍
Vault Search
Query execution across preserved data
Search Scope
Custodians, dates, terms, services
Vault searches are scoped by: data service (Gmail, Drive, Chat, etc.), account scope (all accounts, specific accounts, or OUs), date range (sent/received, creation date, or modification date), and optional search terms. Vault searches the live data stores — not a separate index.
  • Queries the native service's preserved data
  • Includes data hidden from users (held items)
  • Date range: absolute or relative
  • Terms: Gmail search operators, Drive query syntax
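The scoping dimensions above can be expressed as a query object. The sketch below assumes the field names of the Vault API v1 Query resource (`corpus`, `dataScope`, `searchMethod`, `accountInfo`, `terms`, `startTime`, `endTime`); the helper `build_mail_query` is hypothetical and only covers the account-scoped Gmail case.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def _rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_mail_query(
    emails: List[str],
    terms: str = "",
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    held_only: bool = False,
) -> Dict:
    """Build an account-scoped Gmail search query (hypothetical helper)."""
    query: Dict = {
        "corpus": "MAIL",
        # HELD_DATA restricts the search to data on hold; ALL_DATA does not.
        "dataScope": "HELD_DATA" if held_only else "ALL_DATA",
        "searchMethod": "ACCOUNT",
        "accountInfo": {"emails": list(emails)},
    }
    if terms:
        query["terms"] = terms  # Gmail search operators
    if start:
        query["startTime"] = _rfc3339(start)
    if end:
        query["endTime"] = _rfc3339(end)
    return query
```

The same dictionary shape could then be reused for a count request, a saved query, or an export, which is one reason to build it in a single place.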
Search Result Preview
In-browser preview before export
Vault provides an in-browser preview of search results — message text, Drive file previews, and Chat messages — without requiring an export. This allows investigators to review relevance before committing to a full export. Preview access is logged in the audit trail.
  • No export required to review results
  • Preview is read-only — no modification possible
  • All previews logged in matter audit trail
  • Drive files preview in read-only viewer
Count Estimates
Message/file counts before export
Before running an export, Vault can return an estimated count of messages or files matching a search query. This is useful for gauging export size and scope. Counts are estimates, not exact — the actual export may differ due to deduplication and processing.
  • Available before committing to export
  • Counts are estimates, not guaranteed exact
  • Helps size export storage requirements
  • Separate count request from search/export
📤
Exports
How Vault produces discoverable data packages
Export Object
Immutable data package per export run
An export is an async job that materializes a search query into a downloadable data package stored in Google Cloud Storage. The export object in Vault tracks: status (in-progress/completed/failed), file list with download URLs, export query parameters, and creation metadata.
  • Async — large exports can take hours
  • Stored in GCS for 15 days then auto-deleted
  • Download URLs are time-limited signed URLs
  • Export is associated with the matter, not a hold
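Because exports are asynchronous, a client typically polls until the job reaches a terminal status. The sketch below keeps the API call abstract: `fetch_status` is a hypothetical callable (for instance, a wrapper around an export lookup) that returns one of the status strings `IN_PROGRESS`, `COMPLETED`, or `FAILED`. The interval and poll limit are arbitrary illustrative defaults.

```python
import time
from typing import Callable

TERMINAL_STATUSES = ("COMPLETED", "FAILED")


def wait_for_export(
    fetch_status: Callable[[], str],
    interval_s: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the export reaches a terminal status, then return it."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_s)  # still in progress: wait before the next poll
    raise TimeoutError(f"export still running after {max_polls} polls")
```

Passing `sleep` as a parameter makes the loop easy to test without real delays. Large exports can take hours, so a production job would likely use a longer interval or a scheduler rather than a blocking loop.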
Export Deduplication
Duplicate message handling options
For Gmail, Vault can optionally deduplicate messages across custodians — if the same message exists in multiple accounts (e.g., sender and recipient), only one copy is exported. Deduplication uses the RFC 2822 Message-ID header as the key. Drive and Chat exports do not deduplicate.
  • Gmail: deduplication is optional per export
  • Dedup key: RFC 2822 Message-ID header
  • Drive: no deduplication — all revisions included
  • Chat: deduplication not applicable (per-space)
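The idea of keying deduplication on the Message-ID header can be illustrated with Python's standard `email` package. This is a conceptual sketch of the technique, not a description of how Vault implements it; the function name is hypothetical, and keeping messages that lack a Message-ID is an assumption made here to avoid dropping data.

```python
from email import message_from_string
from typing import Iterable, List


def dedupe_by_message_id(raw_messages: Iterable[str]) -> List[str]:
    """Keep the first copy of each message, keyed on its Message-ID header."""
    seen = set()
    unique = []
    for raw in raw_messages:
        message_id = message_from_string(raw)["Message-ID"]
        if message_id is None:
            unique.append(raw)  # no key available: keep rather than drop
            continue
        key = message_id.strip()
        if key not in seen:
            seen.add(key)
            unique.append(raw)
    return unique


sender_copy = "Message-ID: <abc@example.com>\nSubject: Q3 plan\n\nSee attached."
recipient_copy = "Message-ID: <abc@example.com>\nSubject: Q3 plan\n\nSee attached."
other = "Message-ID: <xyz@example.com>\nSubject: Lunch\n\nNoon?"
result = dedupe_by_message_id([sender_copy, recipient_copy, other])
```

Here `result` contains two messages: the sender's and recipient's copies collapse into one because they share a Message-ID.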
Export Package Contents
Data files + metadata + integrity hashes
Every Vault export package includes: the data files (format varies by service), a metadata file (JSON) with per-item metadata, a custodian list, and SHA-256 hash files for every data file for chain-of-custody verification. The hash file is critical for legal admissibility.
  • Data files: MBOX, PST, JSON, ZIP
  • Metadata.json: per-item accounts, dates, labels
  • SHA-256 hash file for every data file
  • Hashes enable chain-of-custody attestation
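Verifying a downloaded file against its published digest is a small, mechanical step. The sketch below assumes the expected digest has already been read from the export's hash file as a hex string; the layout of that file is not specified here, so parsing it is left out. The algorithm defaults to SHA-256 as described above but is a parameter, in case a package lists a different digest type.

```python
import hashlib
import hmac


def file_digest_hex(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Compute a file's digest by streaming it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True if the file's digest matches the expected hex digest."""
    actual = file_digest_hex(path, algorithm)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Streaming in chunks keeps memory use flat for multi-gigabyte MBOX or PST files. A mismatch should be treated as a failed download or a tampered file, and the result recorded alongside the export for chain-of-custody purposes.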
⚙️
Vault API & Admin Integration
Programmatic access and admin touchpoints
Vault API (v1)
Full CRUD on matters, holds, exports
The Google Vault API exposes full programmatic access to matters, holds, saved queries, and exports. Common automation: bulk hold creation across many custodians, automatic matter creation from a ticketing system, export status polling and download automation.
  • matters.*, holds.*, savedQueries.*, exports.*
  • Requires Vault Admin or matter collaborator role
  • OAuth 2.0 with vault.readonly or vault scope
  • Rate limited — use exponential backoff
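Exponential backoff for rate-limited calls can be sketched independently of any client library. In the example below, `fn` stands for any API call and `is_retryable` is a hypothetical predicate that decides whether an exception (for example, a rate-limit error) is worth retrying. The delay values are illustrative defaults, not documented Vault quotas.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable failures with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc):
                raise
            # Delay ceiling doubles each attempt: 1s, 2s, 4s, ... up to the cap.
            ceiling = min(max_delay_s, base_delay_s * (2 ** attempt))
            # Full jitter spreads retries so many clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
    raise RuntimeError("max_attempts must be at least 1")
```

Bulk operations such as creating holds for many custodians are where this matters most, since they issue many requests in a short window.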
Admin SDK Integration
Vault audit events via Reports API
Vault audit events are surfaced through the Admin SDK Reports API under the vault application name. Events include: matter creation, hold changes, export jobs, and access events. These can be streamed to SIEM systems or BigQuery for compliance monitoring.
  • Admin SDK: reports.activities.list (vault)
  • Streamable to BigQuery via Log Exports
  • Integrates with Google Cloud Pub/Sub
  • Useful for SIEM integration and alerts
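A request for Vault audit events through the Reports API is an `activities.list` call with the application name set to `vault`. The sketch below only builds the request URL, assuming the Reports API v1 path layout and the `startTime` and `maxResults` query parameters; authentication and paging are omitted, and the helper name is hypothetical.

```python
from typing import Optional
from urllib.parse import quote, urlencode

REPORTS_BASE = "https://admin.googleapis.com/admin/reports/v1"


def vault_activity_url(
    user_key: str = "all",
    start_time: Optional[str] = None,
    max_results: Optional[int] = None,
) -> str:
    """Build an activities.list URL for the vault application."""
    url = f"{REPORTS_BASE}/activity/users/{quote(user_key, safe='')}/applications/vault"
    params = {}
    if start_time:
        params["startTime"] = start_time  # RFC 3339 timestamp
    if max_results:
        params["maxResults"] = max_results
    return f"{url}?{urlencode(params)}" if params else url


url = vault_activity_url(start_time="2024-01-01T00:00:00Z", max_results=100)
```

A scheduled job could call such a URL with an authorized session and forward the returned events to a SIEM, remembering the last event time to use as the next `startTime`.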
Saved Queries
Reusable search definitions per matter
Vault allows saving search queries within a matter for reuse. Saved queries store the full query parameters (service, scope, date range, terms) and can be re-executed at any time. They are matter-scoped and accessible to all matter collaborators.
  • Stored as savedQuery objects in the matter
  • Can be used to create exports directly
  • Shared across all matter collaborators
  • Manageable via Vault API savedQueries.*
👤
Account Lifecycle Interactions
How Vault behaves during account suspension, deletion, and offboarding
Account Suspended
Holds remain active
Suspending a user account does not affect Vault holds. All holds on the suspended account remain active, and held data is preserved. Vault can still search and export data from a suspended account. Retention rules continue to run on non-held data.
  • Holds: unaffected by suspension
  • Retention rules: continue to run
  • Search and export: still possible via Vault
  • License: Vault license still required
Account Deletion Initiated
20-day grace period before data purge
When a Workspace account is deleted, Google retains the data for 20 days, during which the account can be restored. Vault can still search and export the data during this window. After 20 days the data is permanently purged, even if it is subject to a hold or retention rule. To keep held data, suspend the account (or assign an Archived User license where available) instead of deleting it.
  • 20-day grace window for recovery/export
  • Vault can export during the 20-day window
  • Holds do not protect data once the account is permanently deleted
  • All data, held or not, is purged after 20 days
Former Employee / Custodian
Best practices for offboarding
For offboarding employees who are custodians of active matters, best practice is: (1) confirm a Vault hold covers the account before deprovisioning, (2) export relevant data, (3) transfer Drive file ownership to a manager or archive account, (4) suspend the account rather than deleting it for as long as the matter is active. Deleting the account purges its data after the grace period, including data on hold.
  • Hold and export before any deprovisioning step
  • Transfer Drive ownership first
  • Suspend, do not delete, while a hold is active
  • Consider retaining license if active legal matter
Matter Lifecycle — Typical eDiscovery Workflow
  1. Create Matter: open case, name it, add collaborators
  2. Apply Holds: preserve custodian data indefinitely
  3. Search: query across held and live data
  4. Preview & Refine: review results in-browser, adjust scope
  5. Export: generate data package with hashes
  6. Close / Release: release holds, close matter when complete
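The workflow above corresponds roughly to a sequence of Vault API v1 REST calls. The sketch below lists them as method and URL pairs, assuming the v1 path layout under `https://vault.googleapis.com`. The in-browser preview step is omitted because it happens in the Vault UI, and `workflow_calls` is a hypothetical helper that only describes the calls; it does not send them.

```python
from typing import List, Tuple

VAULT_BASE = "https://vault.googleapis.com/v1"


def workflow_calls(matter_id: str, hold_id: str) -> List[Tuple[str, str, str]]:
    """Return (step, HTTP method, URL) for a typical matter lifecycle."""
    matter = f"{VAULT_BASE}/matters/{matter_id}"
    return [
        # The matter ID is assigned by the service when the matter is created.
        ("Create Matter", "POST", f"{VAULT_BASE}/matters"),
        ("Add collaborators", "POST", f"{matter}:addPermissions"),
        ("Apply Holds", "POST", f"{matter}/holds"),
        ("Search (count estimate)", "POST", f"{matter}:count"),
        ("Export", "POST", f"{matter}/exports"),
        ("Release hold", "DELETE", f"{matter}/holds/{hold_id}"),
        ("Close matter", "POST", f"{matter}:close"),
    ]


for step, method, url in workflow_calls("MATTER_ID", "HOLD_ID"):
    print(f"{step:<24} {method:<6} {url}")
```

Keeping the sequence explicit like this makes it easier to map each step to an audit-trail entry when documenting a defensible process.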
Hold vs Retention Rule — Key Differences

🔒 Hold

  • Indefinite — no expiry until explicitly released
  • Triggered by legal/investigation need
  • Scoped to specific custodians or OUs
  • Overrides all retention rules
  • Managed within a Matter
  • Invisible to end users
  • Releases must be logged and deliberate
  • Survives user deletion of items and account suspension, but not account deletion

⏱️ Retention Rule

  • Time-bound — expires after configured duration
  • Triggered by compliance/policy need
  • Scoped to domain, OU, or user
  • Overridden by any active Hold
  • Managed at the org/Vault policy level (not per-matter)
  • Transparent — part of org policy
  • Auto-applies and auto-expires
  • Deletion on expiry is automatic (if configured)
Export Formats by Data Source
Gmail
MBOX or PST
Full RFC 2822 messages with headers. Metadata JSON includes label info, account, and dates. SHA-256 hashes per file. Deduplication optional.
Google Drive
Native format or PDF
Google Docs/Sheets/Slides exported as Office or PDF. Uploaded files in original format. Metadata JSON with Drive file IDs, owners, dates.
Google Chat
JSON
Messages exported as structured JSON with space ID, sender, timestamp, thread ID, and message text. Attachments exported separately.
Google Meet
MP4 video + chat JSON
Meeting recordings as MP4. In-meeting chat as JSON. Metadata includes organizer, attendees, start/end times.
Google Groups
MBOX
Group messages in MBOX format with full headers. Metadata JSON includes group address and message threading.
Google Voice
MP3 + JSON
Voicemails as MP3. Call logs and SMS as JSON. Requires Google Voice for Google Workspace — separate from standard Vault licensing.

Architecture Documentation Notes

Key Insight: Control Plane, Not Data Plane

Vault is purely a control layer. It stores matter metadata, hold directives, retention rule definitions, and export job records — but not the underlying data. All preservation happens in-place in Gmail, Drive, and Chat stores. This means Vault's storage footprint is tiny; the compliance cost is in the native service stores.

Hold Priority Chain

The effective lifecycle of any piece of data follows this priority chain: Active Hold wins absolutely → if no hold, Retention Rule governs → if no rule, user controls deletion. Understanding this chain is essential for designing defensible legal hold and retention workflows.

Migration Implication

During a Workspace tenant-to-tenant migration, Vault holds and retention rules do not migrate — they must be recreated in the destination tenant. Held data may be trapped in the source tenant and require export before migration. Export packages (MBOX, JSON) can be re-imported into destination tools or eDiscovery platforms.

Source Confidence

Matter model, holds, retention rules, export formats, account lifecycle behavior: all documented in the Google Vault Help Center, Vault API v1 reference, and Admin SDK Reports API documentation. These are fully publicly documented behaviors. Underlying storage infrastructure details are not disclosed.