June 10, 2024
Private Cloud Compute: A new frontier for AI privacy in the cloud

Written by Apple Security Engineering and Architecture (SEAR), User Privacy, Core Operating Systems (Core OS), Services Engineering (ASE), and Machine Learning and AI (AIML)
Apple Intelligence is the personal intelligence system that brings powerful generative models to iPhone, iPad, and Mac. For advanced features that need to reason over complex data with larger foundation models, we created Private Cloud Compute (PCC), a groundbreaking cloud intelligence system designed specifically for private AI processing. For the first time ever, Private Cloud Compute extends the industry-leading security and privacy of Apple devices into the cloud, making sure that personal user data sent to PCC isn't accessible to anyone other than the user, not even to Apple. Built with custom Apple silicon and a hardened operating system designed for privacy, we believe PCC is the most advanced security architecture ever deployed for cloud AI compute at scale.
Apple has long championed on-device processing as the cornerstone for the security and privacy of user data. Data that exists only on user devices is by definition disaggregated and not subject to any centralized point of attack. When Apple is responsible for user data in the cloud, we protect it with state-of-the-art security in our services, and for the most sensitive data, we believe end-to-end encryption is our most powerful defense. For cloud services where end-to-end encryption is not appropriate, we strive to process user data ephemerally or under uncorrelated randomized identifiers that obscure the user's identity.
Secure and private AI processing in the cloud poses a formidable new challenge. Powerful AI hardware in the data center can fulfill a user's request with large, complex machine learning models, but it requires unencrypted access to the user's request and accompanying personal data. That precludes the use of end-to-end encryption, so cloud AI applications have to date employed traditional approaches to cloud security. Such approaches present a few key challenges: their privacy guarantees are difficult for outside researchers to verify and for the operator to technically enforce, privileged administrative access is hard to constrain, and users have no way to confirm what software is actually processing their data.
When on-device computation with Apple devices such as iPhone and Mac is possible, the security and privacy advantages are clear: users control their own devices, researchers can inspect both hardware and software, runtime transparency is cryptographically assured through Secure Boot, and Apple retains no privileged access (as a concrete example, the Data Protection file encryption system cryptographically prevents Apple from disabling or guessing the passcode of a given iPhone).
However, to process more sophisticated requests, Apple Intelligence needs to be able to enlist help from larger, more complex models in the cloud. For these cloud requests to live up to the security and privacy guarantees that our users expect from our devices, the traditional cloud service security model isn't a viable starting point. Instead, we need to bring our industry-leading device security model, for the first time ever, to the cloud.
The rest of this post is an initial technical overview of Private Cloud Compute, to be followed by a deep dive after PCC becomes available in beta. We know researchers will have many detailed questions, and we look forward to answering more of them in our follow-up post.
Designing Private Cloud Compute

We set out to build Private Cloud Compute with a set of core requirements:

- Stateless computation on personal user data. PCC must use a user's personal data exclusively to fulfill that user's request, and must retain nothing once the response is returned.
- Enforceable guarantees. Privacy properties must be technically enforced by the system, not rest on policy promises.
- No privileged runtime access. There must be no mechanism that lets Apple staff bypass the privacy guarantees during system administration.
- Non-targetability. An attacker must not be able to steer a particular user's requests to compromised hardware.
- Verifiable transparency. Security researchers must be able to verify that production PCC nodes run exactly the software whose privacy properties are claimed.
This is an extraordinary set of requirements, and one that we believe represents a generational leap over any traditional cloud service security model.
Introducing Private Cloud Compute nodes

The root of trust for Private Cloud Compute is our compute node: custom-built server hardware that brings the power and security of Apple silicon to the data center, with the same hardware security technologies used in iPhone, including the Secure Enclave and Secure Boot. We paired this hardware with a new operating system: a hardened subset of the foundations of iOS and macOS tailored to support Large Language Model (LLM) inference workloads while presenting an extremely narrow attack surface. This allows us to take advantage of iOS security technologies such as Code Signing and sandboxing.
On top of this foundation, we built a custom set of cloud extensions with privacy in mind. We excluded components that are traditionally critical to data center administration, such as remote shells and system introspection and observability tools. We replaced those general-purpose software components with components that are purpose-built to deterministically provide only a small, restricted set of operational metrics to SRE staff. And finally, we used Swift on Server to build a new Machine Learning stack specifically for hosting our cloud-based foundation model.
Let's take another look at our core Private Cloud Compute requirements and the features we built to achieve them.
Stateless computation and enforceable guarantees

With services that are end-to-end encrypted, such as iMessage, the service operator cannot access the data that transits through the system. One of the key reasons such designs can assure privacy is specifically because they prevent the service from performing computations on user data. Since Private Cloud Compute needs to be able to access the data in the user's request to allow a large foundation model to fulfill it, complete end-to-end encryption is not an option. Instead, the PCC compute node must have technical enforcement for the privacy of user data during processing, and must be incapable of retaining user data after its duty cycle is complete.
We designed Private Cloud Compute to make several guarantees about the way it handles user data:

- User data is used only to fulfill the user's inference request, never for any other purpose.
- User data is never available to Apple, not even to staff operating the production service.
- User data is never retained once the request is complete: it is not durably written, and anything held during processing is destroyed when the request finishes or the node reboots.
When Apple Intelligence needs to draw on Private Cloud Compute, it constructs a request, consisting of the prompt, plus the desired model and inferencing parameters, that will serve as input to the cloud model. The PCC client on the user's device then encrypts this request directly to the public keys of the PCC nodes that it has first confirmed are valid and cryptographically certified. This provides end-to-end encryption from the user's device to the validated PCC nodes, ensuring the request cannot be accessed in transit by anything outside those highly protected PCC nodes. Supporting data center services, such as load balancers and privacy gateways, run outside of this trust boundary and do not have the keys required to decrypt the user's request, thus contributing to our enforceable guarantees.
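To make the shape of this flow concrete, here is a minimal client-side sketch using HPKE (RFC 9180), which CryptoKit provides on recent Apple platforms, to seal a request to one validated node's public key. The PCCRequest fields, the info string, and the ciphersuite choice are our illustrative assumptions, not PCC's actual wire format.

```swift
import CryptoKit
import Foundation

// Hypothetical request shape; the real PCC request format is not public.
struct PCCRequest: Codable {
    let prompt: String
    let model: String        // desired model identifier
    let temperature: Double  // example inferencing parameter
}

// Encrypt a request directly to a validated node's public key with HPKE,
// so only the node holding the matching private key can decrypt it.
// Everything between the device and the node sees only ciphertext.
func seal(_ request: PCCRequest,
          to nodeKey: Curve25519.KeyAgreement.PublicKey) throws
    -> (encapsulatedKey: Data, ciphertext: Data) {
    let plaintext = try JSONEncoder().encode(request)
    var sender = try HPKE.Sender(
        recipientKey: nodeKey,
        ciphersuite: .Curve25519_SHA256_ChachaPoly,
        info: Data("pcc-request-v1".utf8)  // assumed context label
    )
    let ciphertext = try sender.seal(plaintext)
    return (sender.encapsulatedKey, ciphertext)
}
```

A device would run a step like this once per candidate node, which is how the subset-scoped encryption described later in this post is achieved.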
Next, we must protect the integrity of the PCC node and prevent any tampering with the keys used by PCC to decrypt user requests. The system uses Secure Boot and Code Signing for an enforceable guarantee that only authorized and cryptographically measured code is executable on the node. All code that can run on the node must be part of a trust cache that has been signed by Apple, approved for that specific PCC node, and loaded by the Secure Enclave such that it cannot be changed or amended at runtime. This also ensures that JIT mappings cannot be created, preventing compilation or injection of new code at runtime. Additionally, all code and model assets use the same integrity protection that powers the Signed System Volume. Finally, the Secure Enclave provides an enforceable guarantee that the keys that are used to decrypt requests cannot be duplicated or extracted.
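The enforcement itself lives in Secure Boot, the kernel, and the Secure Enclave, but the core decision reduces to set membership over cryptographic measurements. The toy model below illustrates only that check; the types and the whole-binary SHA-256 measurement are our simplifications.

```swift
import CryptoKit
import Foundation

// Toy model of a trust cache: the set of code measurements that was signed
// by Apple, approved for this node, and loaded by the Secure Enclave at
// boot. Real enforcement happens in the kernel; this shows only the check.
struct TrustCache {
    private let allowedMeasurements: Set<Data>

    init(allowedMeasurements: Set<Data>) {
        self.allowedMeasurements = allowedMeasurements
    }

    // Code may execute only if its measurement is in the loaded cache.
    // Because the cache is immutable after boot, nothing added at runtime
    // can ever pass this test.
    func permitsExecution(of binary: Data) -> Bool {
        allowedMeasurements.contains(Data(SHA256.hash(data: binary)))
    }
}
```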
The Private Cloud Compute software stack is designed to ensure that user data is not leaked outside the trust boundary or retained once a request is complete, even in the presence of implementation errors. The Secure Enclave randomizes the data volume's encryption keys on every reboot and does not persist these random keys, ensuring that data written to the data volume cannot be retained across reboot. In other words, there is an enforceable guarantee that the data volume is cryptographically erased every time the PCC node's Secure Enclave Processor reboots. The inference process on the PCC node deletes data associated with a request upon completion, and the address spaces that are used to handle user data are periodically recycled to limit the impact of any data that may have been unexpectedly retained in memory.
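The cryptographic-erasure guarantee can be sketched in a few lines: if the volume key is random, exists only in memory, and is never persisted, then losing the key at reboot makes everything written under it permanently unreadable. This is a conceptual illustration, not the Secure Enclave's implementation.

```swift
import CryptoKit
import Foundation

// Sketch of a per-boot encrypted volume. The key is freshly random each
// "boot" and lives only in this object; when the object goes away (the
// node reboots), every sealed record becomes undecryptable ciphertext.
final class EphemeralVolume {
    private let bootKey = SymmetricKey(size: .bits256)  // never persisted

    func write(_ plaintext: Data) throws -> Data {
        try ChaChaPoly.seal(plaintext, using: bootKey).combined
    }

    func read(_ sealed: Data) throws -> Data {
        try ChaChaPoly.open(ChaChaPoly.SealedBox(combined: sealed),
                            using: bootKey)
    }
}
```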
Finally, for our enforceable guarantees to be meaningful, we also need to protect against exploitation that could bypass these guarantees. Technologies such as Pointer Authentication Codes and sandboxing act to resist such exploitation and limit an attacker's horizontal movement within the PCC node. The inference control and dispatch layers are written in Swift, ensuring memory safety, and use separate address spaces to isolate initial processing of requests. This combination of memory safety and the principle of least privilege removes entire classes of attacks on the inference stack itself and limits the level of control and capability that a successful attack can obtain.
No privileged runtime access

We designed Private Cloud Compute to ensure that privileged access doesn't allow anyone to bypass our stateless computation guarantees.
First, we intentionally did not include remote shell or interactive debugging mechanisms on the PCC node. Our Code Signing machinery prevents such mechanisms from loading additional code, but this sort of open-ended access would provide a broad attack surface to subvert the system's security or privacy. Beyond simply not including a shell, remote or otherwise, PCC nodes cannot enable Developer Mode and do not include the tools needed by debugging workflows.
Next, we built the system's observability and management tooling with privacy safeguards that are designed to prevent user data from being exposed. For example, the system doesn't even include a general-purpose logging mechanism. Instead, only pre-specified, structured, and audited logs and metrics can leave the node, and multiple independent layers of review help prevent user data from accidentally being exposed through these mechanisms. With traditional cloud AI services, such mechanisms might allow someone with privileged access to observe or collect user data.
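One way to picture "pre-specified, structured, and audited" telemetry is as a closed type: if the only events a node can emit are the cases of a fixed enum, and no case carries request contents, there is no channel through which a prompt or response can ride along. The metric names below are invented for illustration.

```swift
// The complete, audited set of events that may leave the node. Adding a
// new metric means adding a case here and re-reviewing this file; there
// is no general-purpose log call to smuggle data through.
enum NodeMetric {
    case requestCompleted(durationMilliseconds: Int)
    case modelLoaded(modelVersion: String)  // named by the system, never
                                            // derived from a request
    case memoryPressure(level: Int)
}

func emit(_ metric: NodeMetric) {
    // Serialization is driven entirely by the enum's fixed shape; no
    // free-form payload parameter exists that could carry user data.
    switch metric {
    case .requestCompleted(let ms):
        print("metric=request_completed duration_ms=\(ms)")
    case .modelLoaded(let version):
        print("metric=model_loaded version=\(version)")
    case .memoryPressure(let level):
        print("metric=memory_pressure level=\(level)")
    }
}
```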
Together, these techniques provide enforceable guarantees that only specifically designated code has access to user data and that user data cannot leak outside the PCC node during system administration.
Non-targetability

Our threat model for Private Cloud Compute includes an attacker with physical access to a compute node and a high level of sophistication: that is, an attacker who has the resources and expertise to subvert some of the hardware security properties of the system and potentially extract data that is being actively processed by a compute node.
We defend against this type of attack in two ways:

- We harden the supply chain and the physical integrity of PCC hardware, so that compromised nodes are detected before they can ever serve user requests.
- We apply target diffusion, limiting what any single compromised node can see and preventing an attacker from steering a specific user's requests to hardware they control.
Private Cloud Compute hardware security starts at manufacturing, where we inventory and perform high-resolution imaging of the components of the PCC node before each server is sealed and its tamper switch is activated. When they arrive in the data center, we perform extensive revalidation before the servers are allowed to be provisioned for PCC. The process involves multiple Apple teams that cross-check data from independent sources, and the process is further monitored by a third-party observer not affiliated with Apple. At the end, a certificate is issued for keys rooted in the Secure Enclave UID for each PCC node. The user's device will not send data to any PCC nodes if it cannot validate their certificates.
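On the device side, that final check is an ordinary signature verification rooted in a key the device already trusts. The flat structure below, a raw signature over the node's key bytes, is a deliberate simplification of a real certificate chain, and all names are ours.

```swift
import CryptoKit
import Foundation

// Simplified stand-in for a PCC node certificate: an attestation authority
// signs the node's public key (itself rooted in the Secure Enclave UID).
struct NodeCertificate {
    let nodePublicKeyBytes: Data
    let signature: P256.Signing.ECDSASignature
}

// A device refuses to send data to any node whose certificate does not
// verify against the trusted root it already holds.
func isTrusted(_ certificate: NodeCertificate,
               root: P256.Signing.PublicKey) -> Bool {
    root.isValidSignature(certificate.signature,
                          for: certificate.nodePublicKeyBytes)
}
```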
These processes broadly protect hardware from compromise. To guard against smaller, more sophisticated attacks that might otherwise avoid detection, Private Cloud Compute uses an approach we call target diffusion to ensure requests cannot be routed to specific nodes based on the user or their content.
Target diffusion starts with the request metadata, which leaves out any personally identifiable information about the source device or user, and includes only limited contextual data about the request that's required to enable routing to the appropriate model. This metadata is the only part of the user's request that is available to load balancers and other data center components running outside of the PCC trust boundary. The metadata also includes a single-use credential, based on RSA Blind Signatures, to authorize valid requests without tying them to a specific user. Additionally, PCC requests go through an OHTTP relay (operated by a third party) which hides the device's source IP address before the request ever reaches the PCC infrastructure. This prevents an attacker from using an IP address to identify requests or associate them with an individual. It also means that an attacker would have to compromise both the third-party relay and our load balancer to steer traffic based on the source IP address.
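A sketch of what such metadata might look like is below. The field names are invented; the point is what the structure does not contain: no account identifier, no device identifier, and no source IP address, since the OHTTP relay has already hidden it.

```swift
import Foundation

// Everything the load balancer is allowed to see about a request.
struct RoutingMetadata: Codable {
    // Enough context to route to the right model pool, nothing more.
    let modelIdentifier: String

    // Single-use token based on RSA Blind Signatures: it proves the
    // request is authorized, but because the issuer signed it blind, the
    // token cannot be linked back to the user who obtained it.
    let oneTimeCredential: Data
}
```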
User devices encrypt requests only for a subset of PCC nodes, rather than the PCC service as a whole. When asked by a user device, the load balancer returns a subset of PCC nodes that are most likely to be ready to process the user's inference request. However, as the load balancer has no identifying information about the user or device for which it's choosing nodes, it cannot bias the set for targeted users. By limiting the PCC nodes that can decrypt each request in this way, we ensure that if a single node were ever to be compromised, it would not be able to decrypt more than a small portion of incoming requests. Finally, the selection of PCC nodes by the load balancer is statistically auditable to protect against a highly sophisticated attack where the attacker compromises a PCC node as well as obtains complete control of the PCC load balancer.
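The arithmetic behind this blast-radius bound is straightforward: if each request is decryptable by k of n nodes and the subset is chosen without reference to the user, a single compromised node expects to see roughly k/n of overall traffic, and it cannot be handed any particular user's requests. A sketch of an unbiased selection, with our own types:

```swift
import Foundation

// The balancer picks from nodes that are ready, uniformly at random.
// Crucially, this function has no user or device parameter, so there is
// nothing it could use to bias the subset toward a targeted person.
func chooseRecipients<Node>(from readyNodes: [Node],
                            subsetSize k: Int) -> [Node] {
    Array(readyNodes.shuffled().prefix(k))
}
```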
Verifiable transparency

We consider allowing security researchers to verify the end-to-end security and privacy guarantees of Private Cloud Compute to be a critical requirement for ongoing public trust in the system. Traditional cloud services do not make their full production software images available to researchers, and even if they did, there's no general mechanism to allow researchers to verify that those software images match what's actually running in the production environment. (Some specialized mechanisms exist, such as Intel SGX and AWS Nitro attestation.)
When we launch Private Cloud Compute, we'll take the extraordinary step of making software images of every production build of PCC publicly available for security research. This promise, too, is an enforceable guarantee: user devices will be willing to send data only to PCC nodes that can cryptographically attest to running publicly listed software. We want to ensure that security and privacy researchers can inspect Private Cloud Compute software, verify its functionality, and help identify issues, just like they can with Apple devices.
Our commitment to verifiable transparency includes:

- Publishing the measurements of all code running on PCC in an append-only, cryptographically tamper-evident transparency log.
- Making the software images of every production PCC build available for independent security research.
- Ensuring that user devices send data only to nodes that can attest to running one of these publicly logged releases.
Every production Private Cloud Compute software image will be published for independent binary inspection, including the OS, applications, and all relevant executables, which researchers can verify against the measurements in the transparency log. Software will be published within 90 days of inclusion in the log, or after relevant software updates are available, whichever is sooner. Once a release has been signed into the log, it cannot be removed without detection, much like the log-backed map data structure used by the Key Transparency mechanism for iMessage Contact Key Verification.
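From a researcher's perspective, the basic verification is mechanical: measure the published image and confirm that measurement appears in the log. Real PCC attestation measures individual code components rather than one monolithic image, so treat the following as schematic.

```swift
import CryptoKit
import Foundation

// Check that a published software image corresponds to a logged release.
// `logEntries` stands in for measurements fetched from the transparency
// log; a production check would also verify the log's inclusion proofs.
func imageIsLogged(imageBytes: Data, logEntries: Set<Data>) -> Bool {
    logEntries.contains(Data(SHA256.hash(data: imageBytes)))
}
```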
As we mentioned, user devices will ensure that they're communicating only with PCC nodes running authorized and verifiable software images. Specifically, the user's device will wrap its request payload key only to the public keys of those PCC nodes whose attested measurements match a software release in the public transparency log. And the same strict Code Signing technologies that prevent loading unauthorized software also ensure that all code on the PCC node is included in the attestation.
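Put together, the device-side rule is a filter: a candidate node may receive the wrapped request key only if its attested measurement matches a logged release. The types below are illustrative.

```swift
import Foundation

// What the device learns about each candidate node before sending data.
struct CandidateNode {
    let attestedMeasurement: Data  // reported via hardware attestation
    let publicKeyBytes: Data       // key the request would be wrapped to
}

// Nodes that cannot prove they run publicly logged software simply never
// become recipients; there is no override path.
func eligibleRecipients(_ candidates: [CandidateNode],
                        loggedReleases: Set<Data>) -> [CandidateNode] {
    candidates.filter { loggedReleases.contains($0.attestedMeasurement) }
}
```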
Making Private Cloud Compute software logged and inspectable in this way is a strong demonstration of our commitment to enable independent research on the platform. But we want to ensure researchers can rapidly get up to speed, verify our PCC privacy claims, and look for issues, so we're going further with three specific steps:

- We'll publish a PCC security guide that describes the architecture and technical details of the components that implement our privacy guarantees.
- We'll release a PCC Virtual Research Environment, so researchers can run and probe the PCC node software on their own machines.
- We'll publish source code for select security-critical components of the PCC stack.
The Apple Security Bounty will reward research findings in the entire Private Cloud Compute software stack, with especially significant payouts for any issues that undermine our privacy claims.
More to come

Private Cloud Compute continues Apple's profound commitment to user privacy. With sophisticated technologies to satisfy our requirements of stateless computation, enforceable guarantees, no privileged access, non-targetability, and verifiable transparency, we believe Private Cloud Compute is nothing short of the world-leading security architecture for cloud AI compute at scale.
We look forward to sharing many more technical details about PCC, including the implementation and behavior behind each of our core requirements. And we're especially excited to soon invite security researchers for a first look at the Private Cloud Compute software and our PCC Virtual Research Environment.