Resource disaggregation
- Co-locate components of the same type (e.g., CPUs, DRAM) as a group and interconnect the groups over a high-speed network fabric.
- Resource disaggregation first started with computing units such as CPUs, GPUs, and FPGAs.
- It allows flexible and efficient resource allocation for computing workloads whose demand changes over time.
- e.g., CPU, GPU, AI processor, SmartNIC, IPU
- More recently, the idea has expanded to memory and storage components such as DRAM, PRAM, and SSDs.
- This allows efficient sharing of persistent data among tenants as well as across VM migrations.
Challenges
- Current usage of TEEs is confined to the monolithic server model.
- All required resources, such as storage and memory, are assumed to reside in the same physical machine, without considering the disaggregated resources present in data centers.
- CC doesn't trust any component outside of its TCB, and the current TEE ecosystem does not provide a way to build trust among different TEE components or other non-secure computing units.
- Data inside TEEs, by design, cannot be shared with hardware accelerators such as FPGAs.
- Recent works solve this problem: [Sec'24] ACAI and [NDSS'24] CAGE
HW components
Survey on the readiness of commodity HW
How to expand trust from an in-host TEE to other components, especially to other physical machines?
- One solution is to make all nodes TEE-enabled.
- Protecting non-TEE nodes can be done via bus-level isolation if they are directly physically connected to a TEE host.
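One concrete way to read the "make all nodes TEE-enabled" option is the pairwise handshake the nodes would run: each node presents an attestation report over its measurement, the peer verifies it, and both sides derive a key for a secure channel. Below is a minimal Python sketch of that flow, assuming a toy report format and a plain ECDSA key in place of a hardware-rooted, vendor-endorsed attestation key; the names and structures are illustrative, not any vendor's actual attestation API.

```python
from dataclasses import dataclass

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF


@dataclass
class AttestationReport:
    measurement: bytes    # hash of the code/config loaded on the node
    exchange_pub: bytes   # ephemeral ECDH public key bound to the report
    signature: bytes      # signed by the node's attestation key


class TeeNode:
    """Toy TEE-enabled node; the ECDSA key stands in for a hardware-rooted key."""

    def __init__(self, firmware: bytes):
        self._attest_key = ec.generate_private_key(ec.SECP256R1())
        self.attest_pub = self._attest_key.public_key()
        digest = hashes.Hash(hashes.SHA256())
        digest.update(firmware)
        self.measurement = digest.finalize()

    def make_report(self):
        """Produce a signed report binding an ephemeral key to our measurement."""
        eph = ec.generate_private_key(ec.SECP256R1())
        eph_pub = eph.public_key().public_bytes(
            serialization.Encoding.X962,
            serialization.PublicFormat.UncompressedPoint)
        sig = self._attest_key.sign(self.measurement + eph_pub,
                                    ec.ECDSA(hashes.SHA256()))
        return AttestationReport(self.measurement, eph_pub, sig), eph


def verify_and_derive(report, peer_attest_pub, expected_measurement, my_eph):
    """Verify the peer's report, then derive the shared channel key."""
    peer_attest_pub.verify(report.signature,
                           report.measurement + report.exchange_pub,
                           ec.ECDSA(hashes.SHA256()))      # raises if forged
    assert report.measurement == expected_measurement, "unexpected measurement"
    peer_eph = ec.EllipticCurvePublicKey.from_encoded_point(
        ec.SECP256R1(), report.exchange_pub)
    shared = my_eph.exchange(ec.ECDH(), peer_eph)
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"tee-node-secure-channel").derive(shared)


# Both nodes run the same measured firmware, swap reports, and end up with
# the same 32-byte key for their secure channel.
node_a, node_b = TeeNode(b"firmware-v1"), TeeNode(b"firmware-v1")
report_a, eph_a = node_a.make_report()
report_b, eph_b = node_b.make_report()
key_a = verify_and_derive(report_b, node_b.attest_pub, node_a.measurement, eph_a)
key_b = verify_and_derive(report_a, node_a.attest_pub, node_b.measurement, eph_b)
assert key_a == key_b
```

In a real deployment the attestation public key would itself be validated against the vendor's certificate chain, and the report would carry platform-specific claims rather than a bare firmware hash.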
DSA (Data Streaming Accelerator)
DSA is an Intel accelerator that provides high-performance data-mover capabilities (e.g., copies to/from volatile memory, persistent memory, and memory-mapped I/O) and transformation operations (e.g., memory comparison and delta generation, fast VM checkpointing). It will be integrated into the upcoming Intel Xeon processors (Sapphire Rapids).
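To make the data-mover operations concrete, here is a rough CPU-side sketch (Python) of the semantics DSA offloads: memory move, compare, and delta-record creation/application. This only mimics the operation semantics; the real accelerator is programmed through work descriptors submitted via the idxd kernel driver, and the function names and 8-byte delta granularity here are illustrative assumptions.

```python
def dsa_memmove(dst: bytearray, src: bytes) -> None:
    """Rough equivalent of a DSA memory-move operation."""
    dst[:len(src)] = src


def dsa_compare(a: bytes, b: bytes) -> bool:
    """Rough equivalent of a DSA compare operation."""
    return a == b


def dsa_create_delta(old: bytes, new: bytes, chunk: int = 8):
    """Delta generation: record (offset, bytes) wherever `new` diverges from `old`."""
    return [(off, new[off:off + chunk])
            for off in range(0, len(old), chunk)
            if old[off:off + chunk] != new[off:off + chunk]]


def dsa_apply_delta(base: bytearray, deltas) -> None:
    """Apply a delta record to patch `base` back to the new contents."""
    for off, data in deltas:
        base[off:off + len(data)] = data


# Fast-checkpoint-style diff: record only the regions of a "VM page" that changed.
old_page = bytes(4096)
new_page = bytearray(old_page)
new_page[128:136] = b"modified"
delta = dsa_create_delta(old_page, bytes(new_page))
restored = bytearray(old_page)
dsa_apply_delta(restored, delta)
assert restored == new_page
```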
CXL (Compute Express Link)
CXL is an open standard for high-speed low-latency interconnect between CPUs and other hardware devices such as accelerators. CXL was first proposed by Intel to solve the communication problems of hardware disaggregation in data centers. CXL will be a feature in the new Intel Xeon processor (Sapphire Rapids), which is to be released at the end of 2022.
PCIe
PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard used for connecting hardware components to the motherboard.
TDISP
TDISP stands for TEE Device Interface Security Protocol. It is a new framework and architecture to secure I/O virtualization (for both CXL and PCIe) and manage secure environments.
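As a rough illustration of what TDISP manages, the sketch below models the TEE device interface (TDI) lifecycle: the interface configuration is locked first, its report can then be attested, and only afterwards is the interface started and allowed to access confidential memory. The state and message names approximate the public TDISP material; this is an illustration of the flow under those assumptions, not a spec-conformant implementation.

```python
from enum import Enum, auto


class TdiState(Enum):
    CONFIG_UNLOCKED = auto()  # interface can still be reconfigured; untrusted
    CONFIG_LOCKED = auto()    # configuration frozen; interface report can be attested
    RUN = auto()              # accepted into the TEE; may access confidential memory
    ERROR = auto()            # out-of-order requests poison the interface


# message -> (required current state, next state); None means "any state".
TRANSITIONS = {
    "LOCK_INTERFACE_REQUEST":  (TdiState.CONFIG_UNLOCKED, TdiState.CONFIG_LOCKED),
    "START_INTERFACE_REQUEST": (TdiState.CONFIG_LOCKED,   TdiState.RUN),
    "STOP_INTERFACE_REQUEST":  (None,                     TdiState.CONFIG_UNLOCKED),
}


def step(state: TdiState, message: str) -> TdiState:
    required, nxt = TRANSITIONS[message]
    if required is not None and state != required:
        return TdiState.ERROR
    return nxt


# Typical accept path: lock the config, (attest the interface report), then start.
s = TdiState.CONFIG_UNLOCKED
s = step(s, "LOCK_INTERFACE_REQUEST")   # CONFIG_LOCKED
s = step(s, "START_INTERFACE_REQUEST")  # RUN
assert s is TdiState.RUN
```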
GPU / Accelerators
Accelerators supporting CC
- [OSDI'18] Graviton: Trusted Execution Environments on GPUs
- [arXiv'21] IceClave: A Trusted Execution Environment for In-Storage Computing
- [arXiv'22] ShEF: Shielded Enclaves for Cloud FPGAs
- [FCCM'21] Trusted Configuration in Cloud FPGAs
- [Web] NVIDIA Hopper Architecture In-Depth
- New Confidential Computing support protects user data, defends against hardware and software attacks, and better isolates and protects virtual machines (VMs) from each other in virtualized and MIG environments. H100 implements the world's first native Confidential Computing GPU and extends the trusted execution environment (TEE) with CPUs at full PCIe line rate.
Building trust from cpu to accelerator
- [Sec'24] ACAI: Protecting Accelerator Execution with Arm Confidential Computing Architecture
- [NDSS'24] CAGE: Complementing Arm CCA with GPU Extensions
- [USENIX Sec'23] SHELTER: Extending Arm CCA with Isolation in User Space
IPU
TODO
Software challenges
What can be additional overhead for applying confidential computing?
Are there any software techniques or optimizations that cannot be applied in the context of CC?
Is there anything missing in the software layer for applying CC?
Opposite stance
Based on the observation that VMs used in public clouds don't utilize hypervisor features and that resources are mostly assigned statically, the [OSDI'23] Core Slicing paper suggests using bare-metal hardware without virtualization, statically slicing hardware across guests. We share the observation that current CC can't fully leverage hypervisor features; that paper takes the step of removing the hypervisor, whereas we want to make those missing features work properly.
Memory deduplication
This is straightforward: each confidential VM's memory is encrypted with its own key, so identical plaintext pages no longer have identical ciphertext and a content-based scanner (e.g., KSM) finds nothing to merge.
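A minimal sketch of this effect, with AES-CTR plus a per-page tweak standing in for the hardware memory-encryption engine (a loose model with made-up key handling, not how real engines work):

```python
import os
from collections import defaultdict

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def dedup_groups(pages):
    """Group page indices by identical content (what a KSM-style scanner looks for)."""
    groups = defaultdict(list)
    for i, page in enumerate(pages):
        groups[page].append(i)
    return [idxs for idxs in groups.values() if len(idxs) > 1]


def encrypt_page(page: bytes, vm_key: bytes, page_index: int) -> bytes:
    """Toy per-VM encryption with a per-page counter tweak."""
    nonce = page_index.to_bytes(16, "big")
    enc = Cipher(algorithms.AES(vm_key), modes.CTR(nonce)).encryptor()
    return enc.update(page) + enc.finalize()


plain = [b"A" * 4096, b"A" * 4096, b"B" * 4096]   # two guests map an identical page
print(dedup_groups(plain))                        # [[0, 1]] -> merge is possible

key_vm0, key_vm1 = os.urandom(32), os.urandom(32)
cipher_pages = [encrypt_page(plain[0], key_vm0, 0),
                encrypt_page(plain[1], key_vm1, 1),
                encrypt_page(plain[2], key_vm1, 2)]
print(dedup_groups(cipher_pages))                 # [] -> nothing to merge under CC
```

Integrity protection and page-ownership tracking in real TEEs add further reasons the hypervisor cannot merge guest pages, even before the ciphertext differences come into play.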
Large page table
TODO
IOMMU
TODO
Similar works
- [arXiv'22] Empowering Data Centers for Next Generation Trusted Computing
- The paper designs a distributed TEE solution that allows a tenant to securely use TEE nodes (including CPUs and accelerators) and non-TEE legacy nodes.
- Use TEEs on CPUs and DSAs when available. Use such TEEs to protect all the data leaving the corresponding nodes.
- Employ a centralized security controller that shields all the non-TEE nodes. All non-TEE nodes are placed behind the trusted controller, which imparts TEE properties such as attestation, isolation, and secure channels.
- The initial state of the nodes is attested and cannot be changed thereafter. The controller checks that the CSP's resource management decisions do not violate resource isolation and secure-path guarantees.
- [arXiv'21] Composite Enclaves: Towards Disaggregated Trusted Execution
- [Micro'22] CRONUS: Fault-isolated, Secure and High-performance Heterogeneous Computing for Trusted Execution Environment
- [S&P'20] Enabling rack-scale confidential computing using heterogeneous trusted execution environment
- Enable TEE abstractions for a single rack containing non-TEE nodes
- Do not scale to multiple racks and are not designed to leverage nodes that have TEE support.
- [APSys'23] Trusted Heterogeneous Disaggregated Architectures