Resource disaggregation
- Co-locate components of the same type (e.g., CPUs, DRAM) as a group and interconnect the groups over a high-speed network fabric.
- Resource disaggregation first started with computing units such as CPUs, GPUs, and FPGAs.
- It allows flexible and efficient resource allocation for computing workloads whose demand changes over time.
- e.g., CPU, GPU, AI processor, SmartNIC, IPU
- More recently, the idea has expanded to memory and storage components such as DRAM, PRAM, and SSDs.
- This allows efficient sharing of persistent data among tenants as well as across VM migrations.
Challenges
- Current usage of TEEs is confined to the monolithic server model.
- All required resources, such as storage and memory, are assumed to reside in the same physical machine, without considering the disaggregated resources present in data centers.
- CC doesn't trust any component outside of its TCB, and the current TEE ecosystem does not provide a way to build trust among different TEE components or other non-secure computing units.
- Data inside TEEs, by design, cannot be shared with hardware accelerators such as FPGAs.
- Recent works solve this problem: [Sec'24] ACAI and [NDSS'24] CAGE
HW components
Survey on the readiness of commodity HW
How to expand trust from an in-host TEE to other components, especially to other physical machines?
- One solution is to make all nodes TEE-enabled.
- Protecting non-TEE nodes can be done via bus-level isolation if they are directly physically connected to a TEE host.
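One concrete way to read the "make all nodes TEE-enabled" option is the pairwise handshake the nodes would run: each node presents an attestation report over its measurement, the peer verifies it, and both sides derive a key for a secure channel. Below is a minimal Python sketch of that flow, assuming a toy report format and a plain ECDSA key in place of a hardware-rooted, vendor-endorsed attestation key; the names and structures are illustrative, not any vendor's actual attestation API.

```python
from dataclasses import dataclass

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF


@dataclass
class AttestationReport:
    measurement: bytes    # hash of the code/config loaded on the node
    exchange_pub: bytes   # ephemeral ECDH public key bound to the report
    signature: bytes      # signed by the node's attestation key


class TeeNode:
    """Toy TEE-enabled node; the ECDSA key stands in for a hardware-rooted key."""

    def __init__(self, firmware: bytes):
        self._attest_key = ec.generate_private_key(ec.SECP256R1())
        self.attest_pub = self._attest_key.public_key()
        digest = hashes.Hash(hashes.SHA256())
        digest.update(firmware)
        self.measurement = digest.finalize()

    def make_report(self):
        """Produce a signed report binding an ephemeral key to our measurement."""
        eph = ec.generate_private_key(ec.SECP256R1())
        eph_pub = eph.public_key().public_bytes(
            serialization.Encoding.X962,
            serialization.PublicFormat.UncompressedPoint)
        sig = self._attest_key.sign(self.measurement + eph_pub,
                                    ec.ECDSA(hashes.SHA256()))
        return AttestationReport(self.measurement, eph_pub, sig), eph


def verify_and_derive(report, peer_attest_pub, expected_measurement, my_eph):
    """Verify the peer's report, then derive the shared channel key."""
    peer_attest_pub.verify(report.signature,
                           report.measurement + report.exchange_pub,
                           ec.ECDSA(hashes.SHA256()))      # raises if forged
    assert report.measurement == expected_measurement, "unexpected measurement"
    peer_eph = ec.EllipticCurvePublicKey.from_encoded_point(
        ec.SECP256R1(), report.exchange_pub)
    shared = my_eph.exchange(ec.ECDH(), peer_eph)
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"tee-node-secure-channel").derive(shared)


# Both nodes run the same measured firmware, swap reports, and end up with
# the same 32-byte key for their secure channel.
node_a, node_b = TeeNode(b"firmware-v1"), TeeNode(b"firmware-v1")
report_a, eph_a = node_a.make_report()
report_b, eph_b = node_b.make_report()
key_a = verify_and_derive(report_b, node_b.attest_pub, node_a.measurement, eph_a)
key_b = verify_and_derive(report_a, node_a.attest_pub, node_b.measurement, eph_b)
assert key_a == key_b
```

In a real deployment the attestation public key would itself be validated against the vendor's certificate chain, and the report would carry platform-specific claims rather than a bare firmware hash.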
DSA (Data Streaming Accelerator)
DSA is an Intel accelerator that provides high-performance data-mover capabilities (e.g., copies to/from volatile memory, persistent memory, and memory-mapped I/O) and transformation operations (e.g., memory comparison and delta generation, fast VM checkpointing). It will be integrated into the upcoming Intel Xeon processors (Sapphire Rapids).
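To make the data-mover operations concrete, here is a rough CPU-side sketch (Python) of the semantics DSA offloads: memory move, compare, and delta-record creation/application. This only mimics the operation semantics; the real accelerator is programmed through work descriptors submitted via the idxd kernel driver, and the function names and 8-byte delta granularity here are illustrative assumptions.

```python
def dsa_memmove(dst: bytearray, src: bytes) -> None:
    """Rough equivalent of a DSA memory-move operation."""
    dst[:len(src)] = src


def dsa_compare(a: bytes, b: bytes) -> bool:
    """Rough equivalent of a DSA compare operation."""
    return a == b


def dsa_create_delta(old: bytes, new: bytes, chunk: int = 8):
    """Delta generation: record (offset, bytes) wherever `new` diverges from `old`."""
    return [(off, new[off:off + chunk])
            for off in range(0, len(old), chunk)
            if old[off:off + chunk] != new[off:off + chunk]]


def dsa_apply_delta(base: bytearray, deltas) -> None:
    """Apply a delta record to patch `base` back to the new contents."""
    for off, data in deltas:
        base[off:off + len(data)] = data


# Fast-checkpoint-style diff: record only the regions of a "VM page" that changed.
old_page = bytes(4096)
new_page = bytearray(old_page)
new_page[128:136] = b"modified"
delta = dsa_create_delta(old_page, bytes(new_page))
restored = bytearray(old_page)
dsa_apply_delta(restored, delta)
assert restored == new_page
```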
CXL (Compute Express Link)
CXL is an open standard for high-speed low-latency interconnect between CPUs and other hardware devices such as accelerators. CXL was first proposed by Intel to solve the communication problems of hardware disaggregation in data centers. CXL will be a feature in the new Intel Xeon processor (Sapphire Rapids), which is to be released at the end of 2022.
PCIe
PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard used for connecting hardware components to the motherboard.
TDISP
TDISP stands for TEE Device Interface Security Protocol. It is a new framework and architecture to secure I/O virtualization (for both CXL and PCIe) and manage secure environments.
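As a rough illustration of what TDISP manages, the sketch below models the TEE device interface (TDI) lifecycle: the interface configuration is locked first, its report can then be attested, and only afterwards is the interface started and allowed to access confidential memory. The state and message names approximate the public TDISP material; this is an illustration of the flow under those assumptions, not a spec-conformant implementation.

```python
from enum import Enum, auto


class TdiState(Enum):
    CONFIG_UNLOCKED = auto()  # interface can still be reconfigured; untrusted
    CONFIG_LOCKED = auto()    # configuration frozen; interface report can be attested
    RUN = auto()              # accepted into the TEE; may access confidential memory
    ERROR = auto()            # out-of-order requests poison the interface


# message -> (required current state, next state); None means "any state".
TRANSITIONS = {
    "LOCK_INTERFACE_REQUEST":  (TdiState.CONFIG_UNLOCKED, TdiState.CONFIG_LOCKED),
    "START_INTERFACE_REQUEST": (TdiState.CONFIG_LOCKED,   TdiState.RUN),
    "STOP_INTERFACE_REQUEST":  (None,                     TdiState.CONFIG_UNLOCKED),
}


def step(state: TdiState, message: str) -> TdiState:
    required, nxt = TRANSITIONS[message]
    if required is not None and state != required:
        return TdiState.ERROR
    return nxt


# Typical accept path: lock the config, (attest the interface report), then start.
s = TdiState.CONFIG_UNLOCKED
s = step(s, "LOCK_INTERFACE_REQUEST")   # CONFIG_LOCKED
s = step(s, "START_INTERFACE_REQUEST")  # RUN
assert s is TdiState.RUN
```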
GPU / Accelerators
Accelerators supporting CC
- [OSDI'18] Graviton: Trusted Execution Environments on GPUs
- [arXiv'21] IceClave: A Trusted Execution Environment for In-Storage Computing
- [arXiv'22] ShEF: Shielded Enclaves for Cloud FPGAs
- [FCCM'21] Trusted Configuration in Cloud FPGAs
- [Web] NVIDIA Hopper Architecture In-Depth
- New Confidential Computing support protects user data, defends against hardware and software attacks, and better isolates and protects virtual machines (VMs) from each other in virtualized and MIG environments. H100 implements the world's first native Confidential Computing GPU and extends the trusted execution environment (TEE) with CPUs at full PCIe line rate.
Building trust from cpu to accelerator
- [Sec'24] ACAI: Protecting Accelerator Execution with Arm Confidential Computing Architecture
- [NDSS'24] CAGE: Complementing Arm CCA with GPU Extensions
- [USENIX Sec'23] SHELTER: Extending Arm CCA with Isolation in User Space
IPU
TODO
Software challenges
What can be additional overhead for applying confidential computing?
Are there any software techniques or optimizations that cannot be applied in the context of CC?
Is there anything missing in the software layer for applying CC?
Opposite stance
Based on the observation that VMs used in public clouds don't utilize hypervisor features and that resources are mostly assigned statically, the [OSDI'23] Core Slicing paper suggests using bare-metal hardware without virtualization, statically slicing hardware across guests. We share the observation that current CC can't fully leverage hypervisor features; that paper takes the step of removing the hypervisor, whereas we want to make those missing features work properly.
Memory deduplication
This is straightforward: each confidential VM's memory is encrypted with its own key, so identical plaintext pages no longer have identical ciphertext and a content-based scanner (e.g., KSM) finds nothing to merge.
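A minimal sketch of this effect, with AES-CTR plus a per-page tweak standing in for the hardware memory-encryption engine (a loose model with made-up key handling, not how real engines work):

```python
import os
from collections import defaultdict

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def dedup_groups(pages):
    """Group page indices by identical content (what a KSM-style scanner looks for)."""
    groups = defaultdict(list)
    for i, page in enumerate(pages):
        groups[page].append(i)
    return [idxs for idxs in groups.values() if len(idxs) > 1]


def encrypt_page(page: bytes, vm_key: bytes, page_index: int) -> bytes:
    """Toy per-VM encryption with a per-page counter tweak."""
    nonce = page_index.to_bytes(16, "big")
    enc = Cipher(algorithms.AES(vm_key), modes.CTR(nonce)).encryptor()
    return enc.update(page) + enc.finalize()


plain = [b"A" * 4096, b"A" * 4096, b"B" * 4096]   # two guests map an identical page
print(dedup_groups(plain))                        # [[0, 1]] -> merge is possible

key_vm0, key_vm1 = os.urandom(32), os.urandom(32)
cipher_pages = [encrypt_page(plain[0], key_vm0, 0),
                encrypt_page(plain[1], key_vm1, 1),
                encrypt_page(plain[2], key_vm1, 2)]
print(dedup_groups(cipher_pages))                 # [] -> nothing to merge under CC
```

Integrity protection and page-ownership tracking in real TEEs add further reasons the hypervisor cannot merge guest pages, even before the ciphertext differences come into play.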
Large page table
TODO
IOMMU
TODO
Similar works
- [arXiv'22] Empowering Data Centers for Next Generation Trusted Computing
- The paper designs a distributed TEE solution that allows a tenant to securely use TEE nodes (including CPUs and accelerators) and non-TEE legacy nodes.
- Use TEEs on CPUs and DSAs when available. Use such TEEs to protect all the data leaving the corresponding nodes.
- Employ a centralized security controller that shields all the non-TEE nodes. All non-TEE nodes are placed behind the trusted controller, which imparts TEE properties such as attestation, isolation, and secure channels.
- The initial state of the nodes is attested and cannot be changed thereafter. The controller checks that the CSP's resource management decisions do not violate resource isolation and secure-path guarantees.
- [arXiv'21] Composite Enclaves: Towards Disaggregated Trusted Execution
- [Micro'22] CRONUS: Fault-isolated, Secure and High-performance Heterogeneous Computing for Trusted Execution Environment
- [S&P'20] Enabling rack-scale confidential computing using heterogeneous trusted execution environment
- Enable TEE abstractions for a single rack containing non-TEE nodes
- Do not scale to multiple racks and are not designed to leverage nodes that have TEE support.
- [APSys'23] Trusted Heterogeneous Disaggregated Architectures