- "Encrypted analytics" is not binary. Systems differ in what they protect against (cloud operators, collaborators, DBAs, output inference, side channels, collusion) and in what they can compute.
- The choice is threat-model-first, not feature-first — pick the weakest tool that still closes the threat you actually care about.
- TEEs give the broadest functionality at near-native speed; MPC suits multi-party analytics; FHE removes the most server trust; searchable/OPE trade leakage for query speed; FE is narrow; DP guards the outputs.
- The dominant 2026 production pattern is hybridization — confidential computing for general execution, cryptographic sub-protocols for the sensitive steps, searchable encryption for queryability, and DP for release.
- PETs are complementary safeguards, not compliance exemptions — pseudonymised data is still personal data under GDPR.
01 — Definitions and scope
In the strict cryptographic sense, encrypted analytics covers homomorphic encryption, MPC, structured/searchable encryption, private set operations, and functional encryption. In industry practice the umbrella also includes trusted execution environments and confidential computing.
As a working definition, encrypted analytics is any architecture that allows useful analytics, search, joining, or ML over sensitive data while preventing the processing environment from having ordinary, unrestricted plaintext visibility, whether via computation on ciphertext or secret shares, queries over protected indexes, hardware-attested execution, or DP-controlled output release.
A useful distinction is between strict cryptographic opacity and reduced plaintext exposure. FHE, MPC, PIR/PSI, and most functional-encryption constructions aim for the former. TEEs aim for the latter: data becomes plaintext inside an attested enclave or confidential VM, but the operator, hypervisor, and surrounding platform are meant to stay outside the trust boundary. Crucially, ordinary encryption at rest or in transit alone does not qualify — the analytic engine still processes plaintext in its normal memory.
NIST's Privacy-Enhancing Cryptography project gives a stable foundation: MPC lets distrustful parties compute on private inputs; FHE evaluates functions on ciphertexts; PIR retrieves an item without revealing the query; structured encryption enables private queries over encrypted data structures; and functional encryption is listed among related tools. The "analytics" in scope is broad — aggregation, ML inference and (where feasible) training, SQL-like filtering and selected joins, search over encrypted data, and cross-party linkage such as private join and compute.
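To make the MPC idea above concrete, here is a minimal sketch of additive secret sharing, one common building block for multi-party sums. It assumes semi-honest parties that do not collude, and the modulus, function names, and salary figures are illustrative choices, not taken from any product mentioned in this guide.

```python
import secrets

# Hypothetical public modulus; it only needs to exceed the largest possible sum.
MODULUS = 2**61 - 1


def share(value: int, n_parties: int) -> list[int]:
    """Split `value` into n additive shares that sum to `value` mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recombine shares; any proper subset of them looks uniformly random."""
    return sum(shares) % MODULUS


def secure_sum(private_inputs: list[int]) -> int:
    n = len(private_inputs)
    # Each input owner splits its value and sends share j to party j.
    distributed = [share(v, n) for v in private_inputs]
    # Party j adds up the shares it received; it never sees a raw input.
    partial_sums = [sum(row[j] for row in distributed) % MODULUS for j in range(n)]
    # Only the partial sums are combined, which reveals the total and nothing else.
    return reconstruct(partial_sums)


salaries = [52_000, 61_500, 48_250]  # one private value per organization
total = secure_sum(salaries)
```

Real protocols add authenticated channels, malicious-security checks, and abort handling on top of this arithmetic core.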
02 — Threat models and regulatory constraints
The first question is who the adversary is. Common models include an honest-but-curious cloud provider; a malicious operator or hypervisor; colluding input parties in an MPC workflow; a database administrator who can dump storage and logs; a side-channel attacker observing memory, page faults, or microarchitectural effects; and an analyst who sees only aggregate outputs but attempts membership or reconstruction attacks. Different PETs address very different subsets of that list.
- Cryptographic techniques rest on lattice or number-theoretic hardness and on how much leakage the scheme permits by design.
- MPC hinges on the corruption model and collusion threshold — passive vs malicious, honest- vs dishonest-majority, and behavior under abort.
- TEEs rest on a narrower, operational assumption: only an attested workload may access keys or plaintext. Hardware or firmware bugs, side channels, and vendor supply-chain compromise fall outside that guarantee and remain residual risks.
Regulation is nuanced. Under GDPR, pseudonymisation is still processing of personal data, not automatic anonymisation; Article 25 requires data protection by design and by default, and Article 32 cites pseudonymisation and encryption as example measures. The ICO's PET guidance is explicit that PETs support data minimisation and security but are not a silver bullet, and that they still require lawful, fair, transparent processing plus a case-by-case DPIA. U.S. sectoral regimes echo this: HIPAA points to NIST controls, the FTC stresses understanding data flows and avoiding deceptive privacy claims, and the GLBA Safeguards Rule is explicitly risk-based.
Encrypted analytics helps with risk reduction, processor minimization, breach resilience, and cross-organizational sharing — but it usually does not eliminate obligations around purpose limitation, transparency, data-subject rights, retention, or transfers, especially when a controller can still decrypt outputs or relink results to individuals.
03 — The technique catalog
Eight families, each strongest at a different job.
04 — Comparative tradeoffs
A qualitative synthesis, not a literal benchmark. "Security level" means how much trust is removed from the execution environment when the stated assumptions hold.
| Technique | Security level | Supported analytics | Performance | Best fit | Primary caveat |
|---|---|---|---|---|---|
| Partial HE | High for narrow arithmetic | Counts, sums, weighted sums, aggregation | High (no bootstrapping) | Simple outsourced arithmetic | Functionality too narrow for rich queries |
| Full HE | Very high trust reduction | Aggregation, vector ops, similarity, ML inference | Low–medium; often slowest | Single-owner outsourced compute | Ciphertext blow-up, slow bootstrapping, limited SQL/training |
| MPC | Very high within collusion thresholds | Aggregation, joins, PSI/PJC, partitioned ML | Medium; network/round-bound | Cross-org collaboration, no trusted hardware | Operational complexity & collusion assumptions |
| TEE / confidential computing | High if HW/firmware/attestation hold | Broadest: SQL, joins, arbitrary code, ML | High; often closest to native | Lift-and-shift confidential analytics | Side channels, larger TCB, HW vulnerabilities |
| Searchable encryption | Medium–high, leakage-prone | Equality, keyword, some range/prefix/suffix | High | Queryable encrypted databases | Search/access/frequency leakage |
| OPE / ORE | Low–medium (order leaks) | Sorting, range filters, thresholding | Very high | Fast range search when leakage acceptable | Inference attacks recover plaintext structure |
| Functional encryption | High for supported functions | Inner products, selected linear/quadratic | Medium for narrow tasks | Fine-grained delegated analytics | Narrow functionality, low ecosystem maturity |
| DP + encrypted execution | High vs output inference | Aggregates, telemetry, federated learning | High for DP step | Safe result release after protected compute | Utility/privacy tradeoff & budget accounting |
| Hybrid | Potentially strongest overall | Broadest practical coverage | Medium–high if well partitioned | Real-world enterprise deployments | Compositional proofs & operating complexity |
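The "Partial HE" row is easiest to appreciate with a worked example. The sketch below is a toy Paillier implementation showing how an untrusted server can compute a weighted sum directly on ciphertexts. The primes are two small, well-known Mersenne primes chosen only for readability and are far too small for real use; the helper names and sample values are illustrative assumptions, and a real deployment would use a vetted library with keys of at least 2048 bits.

```python
import math
import secrets

# Toy key material (insecure): two well-known Mersenne primes.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)
MU = pow(LAMBDA, -1, N)  # private; valid because the generator is g = n + 1


def encrypt(m: int) -> int:
    """Encrypt 0 <= m < N under the public key N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # With g = n + 1, g^m = 1 + m*n (mod n^2), so no exponentiation is needed.
    return ((1 + m * N) % N_SQ) * pow(r, N, N_SQ) % N_SQ


def decrypt(c: int) -> int:
    """Decrypt with the private values LAMBDA and MU."""
    u = pow(c, LAMBDA, N_SQ)
    return ((u - 1) // N) * MU % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N_SQ


def scale_encrypted(c: int, k: int) -> int:
    """Raising a ciphertext to k multiplies the plaintext by the public scalar k."""
    return pow(c, k, N_SQ)


values = [120, 75, 310]   # private inputs, encrypted by the data owner
weights = [2, 1, 3]       # public weights known to the server
ciphertexts = [encrypt(v) for v in values]

acc = encrypt(0)
for c, w in zip(ciphertexts, weights):
    acc = add_encrypted(acc, scale_encrypted(c, w))  # server-side, no decryption

decrypted_weighted_sum = decrypt(acc)  # 2*120 + 1*75 + 3*310
```

The server can add and scale but cannot multiply two ciphertexts together, which is exactly the "functionality too narrow for rich queries" caveat in the table.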
05 — Hybrid architectures
Hybrids are increasingly the production default because they align technique to task. SecretFlow abstracts MPC, HE, and TEE into one framework; Duality's AWS work uses Nitro Enclaves alongside prior FHE, federated learning, and DP; Decentriq combines Azure confidential computing with other privacy technologies including DP. A well-partitioned stack runs general execution inside confidential computing, hands the most sensitive steps to cryptographic sub-protocols, uses searchable encryption for queryability, and applies DP at release.
The value is straightforward: this often beats any single primitive on the joint objective of security, functionality, and cost. The downside is equally straightforward — security proofs become compositional rather than monolithic, and operational complexity rises sharply because each layer adds new assumptions, observability needs, and failure modes.
06 — Deployments and vendor landscape
The clearest production maturity today is in confidential-computing and clean-room deployments — broad-functionality encrypted analytics is, in practice, currently led by TEE-centric and hybrid architectures.
- Confidential computing and clean rooms: Google Confidential Space (and Google Ads confidential matching), Azure Confidential Computing with Decentriq clean rooms, and AWS Nitro Enclaves with Duality (including cross-border cancer research).
- Homomorphic encryption and MPC: IBM HElayers (an Intesa Sanpaolo digital-transaction deployment), Duality for healthcare/finance/government, Zama (TFHE-rs, Concrete, Concrete ML), and Inpher for MPC/HE/federated learning.
- Searchable and queryable encryption: MongoDB Queryable Encryption, the most prominent mainstream example, plus CipherSweet, OpenSSE, Cosmian Findex, and CipherStash for application-level searchable encryption.
- Private aggregation and DP: Prio and descendants, including Mozilla's Prio-based DAP in Firefox and Divvi Up (Prio3), plus Google federated learning with secure aggregation and AWS Clean Rooms Differential Privacy.
On the cryptography side the market is real but more selective. MongoDB Queryable Encryption is the clearest queryable-database example: equality and range queries are production-supported, while prefix/suffix/substring remain public preview in 8.2 (GA targeted for 2026) — and MongoDB documents the real costs of queryability: extra storage, query-performance impact, and reduced observability because encrypted collections redact some logs.
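The leakage side of queryability is easier to reason about with a small example. The sketch below shows a generic application-level blind index built from a keyed HMAC, which supports equality search over tokens the server cannot invert. It is an illustration under assumed names and a hypothetical demo key, and it is not a description of how MongoDB Queryable Encryption works internally. Note what the server still learns: equal plaintexts produce equal tokens, so value frequencies are visible.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical demo key; in practice it stays with the client or a KMS.
INDEX_KEY = b"demo-index-key-never-hardcode-me"


def blind_index(value: str) -> str:
    """Deterministic keyed token for equality matching (normalized input)."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(INDEX_KEY, normalized, hashlib.sha256).hexdigest()


# What the server stores: row ids plus tokens (payload ciphertexts omitted).
diagnoses = ["asthma", "diabetes", "asthma", "asthma", "migraine"]
rows = [{"id": i, "diagnosis_token": blind_index(d)} for i, d in enumerate(diagnoses)]


def find_equal(stored_rows: list[dict], token: str) -> list[int]:
    """Server-side equality filter over tokens; no plaintext involved."""
    return [
        r["id"]
        for r in stored_rows
        if hmac.compare_digest(r["diagnosis_token"], token)
    ]


# Client computes the query token locally and sends only the token.
matches = find_equal(rows, blind_index("Asthma"))

# Frequency leakage: the server can count repeated tokens without any key.
leaked_frequencies = sorted(
    Counter(r["diagnosis_token"] for r in rows).values(), reverse=True
)
```

Here `matches` holds the three asthma rows, and `leaked_frequencies` shows a 3/1/1 histogram that an attacker with auxiliary knowledge of diagnosis prevalence could use to guess which token is which.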
07 — 2026 signals
What's moving right now
- Confidential computing went to the GPU. NVIDIA Confidential Computing across Hopper and Blackwell GPUs (encrypted VRAM, attestable alongside a CPU TEE) makes TEE-based private AI inference and training practical — a major boost for the broad-functionality end of this spectrum.
- TEE risk kept evolving. The 2024 SGX.Fail systematization and the 2026 "Fabricked" SEV-SNP result (arbitrary read/write and forged attestation under a routing-misconfiguration attack, acknowledged by an AMD bulletin) confirm that TEE security must be revisited as hardware, attestation, and patch guidance change.
- FHE commercialized and accelerated. Zama became the first FHE unicorn (2025) and targets 500–1,000 TPS via GPU; the FHE Benchmarking Suite matured into a standard way to compare latency, throughput, memory, storage blow-up, communication, and accuracy loss.
- Queryable encryption broadened. MongoDB QE added production range queries and moved prefix/suffix/substring into public preview (8.2), with GA expected in 2026 — narrowing the gap between "encrypted at rest" and "queryable in use."
- Private telemetry scaled. Prio/DAP deployments (Firefox, Divvi Up) show encrypted analytics is not only about databases and model serving, but also safe measurement at population scale.
08 — Selection criteria & deployment checklist
The first decision criterion is not vendor or algorithm — it's the trust boundary you are trying to move. The second is workload shape. A concise decision rule: use the weakest tool that still closes the threat you actually care about, and combine two or more rather than forcing one primitive to do everything.
- Distrust the cloud operator but trust a hardware root and need rich existing software → TEEs.
- Multiple organizations, no single operator may see data → MPC / PJC.
- One owner outsourcing computation, server fully untrusted → FHE / PHE.
- Mostly equality/range retrieval in a database → searchable / queryable encryption.
- Sensitive outputs, not just inputs → add differential privacy.
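The five rules above can be encoded as a small shortlist helper, which is handy for keeping the threat-model-first ordering explicit in design reviews. This is a sketch only: the `Workload` fields and the `shortlist` function are hypothetical names, and the output is a list of candidates to evaluate, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    organizations: int
    no_single_operator_may_see_data: bool
    trusts_hardware_root: bool
    needs_rich_existing_software: bool
    server_fully_untrusted: bool
    mostly_equality_or_range_retrieval: bool
    outputs_are_sensitive: bool


def shortlist(w: Workload) -> list[str]:
    """Map a workload description to candidate techniques, in rule order."""
    picks: list[str] = []
    if w.trusts_hardware_root and w.needs_rich_existing_software:
        picks.append("TEE")
    if w.organizations > 1 and w.no_single_operator_may_see_data:
        picks.append("MPC / PJC")
    if w.organizations == 1 and w.server_fully_untrusted:
        picks.append("FHE / PHE")
    if w.mostly_equality_or_range_retrieval:
        picks.append("searchable / queryable encryption")
    if w.outputs_are_sensitive:
        picks.append("differential privacy")
    return picks


# Example: three hospitals computing joint statistics that will be published.
example = Workload(
    organizations=3,
    no_single_operator_may_see_data=True,
    trusts_hardware_root=False,
    needs_rich_existing_software=False,
    server_fully_untrusted=False,
    mostly_equality_or_range_retrieval=False,
    outputs_are_sensitive=True,
)
picks = shortlist(example)
```

More than one entry in the result is the normal case and corresponds to the "combine two or more" part of the decision rule.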
Stage-gate deployment checklist
| Stage | What to do | Pass condition |
|---|---|---|
| Problem framing | Classify data, outputs, parties, and exact operators | You know if it's aggregation, search, join, inference, or training |
| Threat model | Write down adversaries, collusion assumptions, unacceptable leakages | Named threat model approved by security/legal |
| Technique shortlist | Map workload to 2–3 candidate architectures | ≥1 cryptographic and ≥1 operationally efficient option considered |
| Key & identity design | Define key custody, attestation flow, or share-holder governance | Keys/shares are never ad hoc |
| Prototype | Benchmark on representative data at realistic security levels | Meets p95 latency, throughput, and cost guardrails |
| Leakage review | Document observable metadata, patterns, or outputs | Explicit acceptance or rejection of the leakage profile |
| Release controls | Add DP, quotas, or query governance if results leave the boundary | Output policy defined and testable |
| Red-team & compliance | Test side channels, patching, logging, legal claims | Findings resolved before rollout |
09 — What to actually measure
The benchmark program should be explicit. The FHE Benchmarking Suite is a good model, centering latency, throughput, memory, storage expansion, communication complexity, and quality loss. Extend it per technique:
- TEE-based SQL — attestation time, EPC/enclave-paging behavior, cache-miss amplification, observable overhead under realistic OLAP.
- Searchable encryption — index size, query selectivity, token-generation cost, and a documented leakage profile.
- DP-based releases — epsilon, delta, contribution bounding, privacy-budget burn rate, and utility loss.
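For the DP metrics in particular, a small sketch shows how epsilon, contribution bounding, and budget burn fit together. It assumes pure epsilon-DP (delta = 0), one value per individual, basic sequential composition, and a Laplace mechanism sampled with the standard library. The class and function names are illustrative. Floating-point Laplace sampling has known weaknesses, so production systems should rely on a vetted DP library.

```python
import random


class PrivacyBudget:
    """Tracks cumulative epsilon under basic sequential composition."""

    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon: float) -> None:
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

    @property
    def remaining(self) -> float:
        return self.total - self.spent


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dp_bounded_sum(values, cap, epsilon, budget, rng):
    """Release a noisy sum after clipping each contribution to [0, cap]."""
    budget.spend(epsilon)  # refuse the query before touching the data
    clipped = [min(max(v, 0.0), cap) for v in values]
    # Adding or removing one person changes the clipped sum by at most `cap`.
    return sum(clipped) + laplace_noise(cap / epsilon, rng)


rng = random.SystemRandom()
budget = PrivacyBudget(total_epsilon=1.0)
spend_per_user = [12.0, 250.0, 40.0, 7.5]  # 250.0 is clipped to the cap
noisy_total = dp_bounded_sum(spend_per_user, cap=100.0, epsilon=0.5, budget=budget, rng=rng)
```

Burn rate is then `budget.spent` over time, and utility loss can be estimated by comparing the noisy release against the clipped true sum on representative data.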
A good cross-technique suite includes at least five workload families: aggregations on wide tables; private join / PSI-plus-sum on skewed identifiers; search with equality and range predicates; SQL analytics on a TPC-H-like subset with one or two joins; and ML with one classical and one compact neural model. For each, measure p50/p95 latency, throughput, ciphertext/share expansion, network bytes, RAM/VRAM, accuracy degradation, deployment time, and operator effort. If a solution needs hidden parameter tuning or hand-crafted circuits your own engineers can't maintain, treat that as a first-class cost signal, not a footnote.
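A minimal harness for the latency, throughput, and expansion figures above might look like the following. It is a sketch using only the standard library; the `measure` and `expansion_ratio` names, the run count, and the placeholder workload are assumptions, and a real suite would add warm-up runs, network byte counters, and memory sampling.

```python
import statistics
import time
from typing import Callable


def measure(workload: Callable[[], object], runs: int = 50) -> dict[str, float]:
    """Time repeated runs and report p50/p95 latency plus throughput."""
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples_ms.append((time.perf_counter() - start) * 1000)
    # n=100 yields 99 cut points; index 49 is p50 and index 94 is p95.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    total_s = max(sum(samples_ms) / 1000, 1e-12)  # guard against coarse timers
    return {
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "throughput_per_s": runs / total_s,
    }


def expansion_ratio(plaintext_bytes: int, protected_bytes: int) -> float:
    """Ciphertext or share size relative to the plaintext it protects."""
    return protected_bytes / plaintext_bytes


# Placeholder workload; swap in the encrypted query or inference call under test.
stats = measure(lambda: sum(range(10_000)), runs=20)
```

Running the same harness against the plaintext baseline and each candidate technique gives directly comparable overhead factors.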
10 — Open questions and limitations
Several parts of this field move quickly enough that any static guide has limits. General-purpose FHE for SQL and large-model training is improving, but the most credible evidence still points to selective inference and narrow analytics rather than drop-in encrypted datastores. Searchable-encryption leakage remains an open design fault line — "acceptable leakage" is highly context-specific and still being quantified, and this is where vendor positioning and academic caution most often diverge. TEE risk is not stable, as recent SGX and SEV-SNP results show. And functional encryption remains under-evidenced in production relative to the rest of the field, which argues for targeted pilots rather than broad enterprise commitments unless the function family is unusually well matched.
11 — Frequently asked questions
- What is encrypted data analytics?
- Which PET should I choose for analytics?
- Does encrypted analytics make data exempt from GDPR or HIPAA?
- Is FHE practical for analytics in 2026?
- What's the risk with searchable and order-preserving encryption?
- What is the most common production pattern?
12 — Primary sources
Synthesized from standards bodies, regulators, vendor documentation, peer-reviewed research, and 2024–2026 systems work.