Wiz Research finds architecture risks that may compromise AI-as-a-Service providers and consequently risk customer data; works with Hugging Face on mitigations

Wiz researchers identified architecture risks in AI-as-a-Service platforms that could jeopardize customer data, leading to a collaboration with Hugging Face to address these vulnerabilities. The research revealed that untrusted, potentially malicious models could exploit Hugging Face’s infrastructure to execute remote code, gain escalated privileges, and perform cross-tenant attacks. Specifically, malicious PyTorch models could compromise the Inference API and Inference Endpoints, allowing unauthorized access to other customers' models. Additionally, vulnerabilities in Hugging Face Spaces were discovered, where malicious Dockerfiles could exploit network isolation issues to access and overwrite container registries. These findings underscore the importance of ensuring AI models run in sandboxed environments and highlight the need for robust security measures in rapidly growing AI services. Hugging Face has taken steps to mitigate these risks by implementing vulnerability scanning and undergoing regular penetration testing.
In the attack described, the adversaries began by uploading a malicious PyTorch model to Hugging Face's platform, exploiting the unsafe Pickle format to execute arbitrary code. BlueRock's Reverse Shell Protection would have mitigated this step: it prevents unauthorized attempts to bind shell input and output streams to network sockets, thereby blocking reverse shell attacks initiated by the malicious model. Following this, the attackers escalated their privileges by querying the node’s IMDS, obtaining the role of a Node inside the EKS cluster. BlueRock's Cloud IMDS Firewall (AWS) would have mitigated this step by restricting access to the Instance Metadata Service, preventing the attackers from exploiting cloud instance metadata for privilege escalation. Together, these mechanisms defend against the described attack vectors by blocking both the reverse shell opened by the malicious code and the privilege escalation that followed it.
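To make the reverse-shell idea concrete, the following is a minimal sketch of the underlying observation: a typical reverse shell has its standard streams attached to sockets. It is not BlueRock's implementation; the function names `fd_is_socket` and `stdio_bound_to_sockets` are hypothetical, and the demonstration assumes a POSIX system and uses only a local socket pair and a pipe.

```python
import os
import socket
import stat


def fd_is_socket(fd: int) -> bool:
    """Return True if the file descriptor refers to a socket."""
    try:
        mode = os.fstat(fd).st_mode
    except OSError:
        # Closed or invalid descriptor: nothing is bound to it.
        return False
    return stat.S_ISSOCK(mode)


def stdio_bound_to_sockets(fds=(0, 1, 2)) -> bool:
    """Hypothetical heuristic: True when every standard stream is a socket,
    the pattern a typical reverse shell produces."""
    return all(fd_is_socket(fd) for fd in fds)


# Benign demonstration with a local socket pair and a pipe; no network traffic.
left, right = socket.socketpair()
read_end, write_end = os.pipe()
try:
    print(stdio_bound_to_sockets((left.fileno(),) * 3))              # True
    print(stdio_bound_to_sockets((read_end, write_end, write_end)))  # False
finally:
    left.close()
    right.close()
    os.close(read_end)
    os.close(write_end)
```

A real control would have to apply such a check at the moment a shell process is started or its descriptors are duplicated, rather than by inspecting descriptors after the fact as this sketch does.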
- T1204.002: User Execution: Malicious File: The article mentions that Wiz Research was able to compromise the service running the custom models by uploading their own malicious model and leveraging container escape techniques to break out from their tenant and compromise the entire service. This implies the use of 'User Execution: Malicious File' as the attackers uploaded a specially crafted malicious file (the model) to achieve their goals.
- T1059: Command and Scripting Interpreter: The article explains that the malicious model could contain a remote code execution payload, potentially granting the attacker escalated privileges and cross-tenant access to other customers' models. This indicates the use of 'Execution: Command and Scripting Interpreter' as the attackers used the Pickle format to execute arbitrary code.
- T1210: Exploitation of Remote Services: Wiz Research was able to gain cross-tenant access to other customers' models stored and run in Hugging Face. This demonstrates the use of 'Lateral Movement: Exploitation of Remote Services' as the attackers moved laterally within the environment by exploiting the shared infrastructure.
- T1195: Supply Chain Compromise: The attackers used a specially crafted Pickle file to achieve remote code execution upon deserialization of untrusted data. This represents 'Initial Access: Supply Chain Compromise' as the attackers compromised the supply chain by injecting malicious code into the AI models.
- T1552.005: Unsecured Credentials: Cloud Instance Metadata API: After gaining initial access, the attackers escalated their privileges by querying the node’s IMDS and obtaining the role of a Node inside the EKS cluster. This corresponds to 'Credential Access: Cloud Instance Metadata API' as they used the cloud instance metadata API to obtain credentials, which they then used to escalate privileges.
- T1526: Cloud Service Discovery: The article describes how the attackers listed all pods in the cluster with their new token, which shows 'Discovery: Cloud Service Discovery' as they enumerated cloud resources to understand the environment.
- T1552: Unsecured Credentials: The attackers were able to obtain secrets associated with their pod, enabling lateral movement within the EKS cluster. This indicates 'Credential Access: Unsecured Credentials' as they accessed sensitive credentials stored within the environment.
- T1609: Container Administration Command: The attackers used a Dockerfile with a malicious payload to gain code execution in the Hugging Face Spaces service. This is an example of 'Execution: Container Administration Command' as they used container commands to execute their payload.
- T1610: Deploy Container: The article describes how the attackers exploited a network isolation issue to write to the centralized container registry, indicating 'Persistence: Container Image' as they manipulated container images to maintain access.
F1: Gaining Remote Code Execution (RCE) in Hugging Face's Inference API via Malicious Pickle Model Upload.
- Attacker crafts a malicious AI model using the PyTorch (Pickle) format, embedding arbitrary code designed to execute upon deserialization. (Cited from: "A malicious pickle-serialized model could contain a remote code execution payload", "it is relatively straightforward to craft a PyTorch (Pickle) model that will execute arbitrary code upon loading", "cloned a legitimate model... modified it in a way that would run our reverse-shell upon loading")
- Attacker uploads the crafted malicious model to the Hugging Face platform, potentially as a private model. (Cited from: "uploading our own malicious model", "uploaded our hand-crafted model to Hugging Face as a private model")
- Attacker interacts with the uploaded model using the platform's Inference API feature (e.g., the model preview modal). (Cited from: "attempted to interact with it using the Inference API feature")
- Hugging Face's backend infrastructure loads the attacker's model to provide the inference service, triggering the deserialization of the Pickle file. (Cited from: "Hugging Face will dedicate resources... required for users to be able to interact with it", "upon loading")
- BR-76: Python Deserialization Protection - This mechanism is applicable because it is designed to intercept the Python deserialization process (like Pickle) for objects originating from the network and apply policies to restrict potentially harmful actions, such as executing system-native binaries.
- BR-77: Python OS Command Injection Prevention - This mechanism is applicable because the Pickle deserialization likely leads to OS command execution attempts from within the Python process, which this mechanism monitors and blocks.
- The embedded malicious code within the Pickle file executes during deserialization, granting the attacker RCE within the Inference API environment. (Cited from: "achieve remote code execution upon deserialization", "voila, we got our reverse shell!")
- BR-54: Container Drift Protection (Binaries & Scripts) - This mechanism is applicable because if the RCE payload involves writing a new executable binary or script to the container's filesystem and then executing it, this mechanism would block the execution as the file wasn't part of the original container image manifest.
- BR-55: Reverse Shell Protection - This mechanism is applicable because the attack explicitly resulted in a reverse shell. BR-55 monitors shell processes and prevents their standard input/output/error streams from being bound to network sockets, thus blocking typical reverse shell implementations.
- BR-76: Python Deserialization Protection - This mechanism is applicable because it blocks the subsequent execution of OS commands by the deserialized Python object, preventing the RCE from achieving its goal.
- BR-77: Python OS Command Injection Prevention - This mechanism is applicable because it blocks the Python process executing the deserialized code from making unauthorized OS command executions.
- BR-82: Process Runtime Execution Guardrails - This mechanism is applicable because if the RCE attempts to start new, unauthorized processes (beyond the initial Python process), NSJail policies enforced by this mechanism could block their creation.
- BR-83: Syscall Deny Filter - This mechanism is applicable because a policy could be crafted to deny specific system calls (like execve, socket, connect) that the RCE payload relies on to establish the reverse shell or execute further commands.
- BR-85: Ephemeral Filesystem Behavior Analysis - This mechanism is applicable because it could detect if the RCE payload uses fileless techniques like executing scripts from memory-backed filesystems (e.g., /dev/shm) or piping downloaded content directly to a shell.
- BR-87: Process Socket Deny - This mechanism is applicable because if the policy denies network socket operations for the Python process running the inference, the reverse shell connection attempt would be blocked.
- BR-88: Process Path Exec Allow - This mechanism is applicable because if the RCE payload attempts to execute a binary or script from a location not on the allowlist (e.g., /tmp, a user-writable directory), the execution would be blocked.
- BR-90: Process Exec Deny - This mechanism is applicable because if the RCE attempts to execute a specifically denied binary (e.g., /bin/bash, /bin/sh, or tools like nc), the execution would be blocked.
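The deserialization step in F1 can be illustrated with a benign sketch. It shows that loading a pickle calls whatever callable the file names, and that a loader restricted to an allowlist of globals (the pattern described in the Python pickle documentation) refuses it. This is not how any BlueRock mechanism is implemented; `Payload`, `AllowlistUnpickler`, `restricted_loads`, and the contents of the allowlist are assumptions made for illustration, and `len` stands in for an attacker-chosen callable.

```python
import io
import pickle


class Payload:
    """Benign stand-in for a tampered model object."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair; the loader calls it
        # during deserialization. len is harmless; an attacker would
        # choose a different callable.
        return (len, ("abc",))


class AllowlistUnpickler(pickle.Unpickler):
    """Rejects any global that is not on an explicit allowlist."""

    ALLOWED = frozenset({("collections", "OrderedDict")})

    def find_class(self, module, name):
        if (module, name) not in self.ALLOWED:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def restricted_loads(data: bytes):
    """Deserialize data with the allowlisting unpickler."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(Payload())

# The plain loader calls len("abc") and returns its result, not a Payload.
print(pickle.loads(blob))  # 3

# Plain containers need no globals, so they still load.
print(restricted_loads(pickle.dumps({"weights": [1, 2, 3]})))  # {'weights': [1, 2, 3]}

try:
    restricted_loads(blob)
except pickle.UnpicklingError as exc:
    print("rejected:", exc)  # rejected: blocked global: builtins.len
```

An allowlist of this kind narrows what a file can invoke at load time, but it is only one layer; the runtime controls listed above address what happens if code execution is reached anyway.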
F2: Escalating Privileges within the Hugging Face Inference API Kubernetes (EKS) Cluster by Abusing IMDS Access.
- Attacker, having achieved RCE within an Inference API pod (as per F1), explores the execution environment. (Cited from: "After executing code inside Hugging Face Inference API and receiving our reverse shell, we started exploring the environment")
- Attacker identifies the environment as a pod within a Kubernetes cluster hosted on Amazon EKS. (Cited from: "running inside a Pod in a Kubernetes cluster hosted on Amazon EKS")
- Attacker discovers that the pod can access the EC2 Instance Metadata Service (IMDS) endpoint at 169.254.169.254. (Cited from: "noticed that we could query the node’s IMDS (169.254.169.254) from within the pod")
- BR-59: Cloud IMDS Firewall (AWS) - This mechanism is applicable because it acts as a firewall specifically for the AWS IMDS endpoint, allowing policies to block or allow requests based on source and nature, potentially preventing the pod from reaching the IMDS endpoint at all.
- Attacker queries the IMDS to retrieve the IAM role credentials associated with the underlying EKS node. (Cited from: "obtain its identity", "obtain the role of a Node inside the EKS cluster")
- BR-59: Cloud IMDS Firewall (AWS) - This mechanism is applicable because it intercepts requests to the IMDS and can enforce policies to block unauthorized attempts to retrieve credentials, mitigating the risk described in T1552.005.
- Attacker determines the IAM role has permissions for ec2:DescribeInstances. (Cited from: "noticed that our AWS role also had permissions to call DescribeInstances (a default configuration)")
- Attacker uses the DescribeInstances permission to query EC2 instance tags and identify the correct EKS cluster name. (Cited from: "revealed the name of the cluster via a tag attached to nodes’ compute")
- BR-54: Container Drift Protection (Binaries & Scripts) - This mechanism is applicable if the attacker needed to download and run the aws CLI or another tool not present in the original container image to perform this action.
- BR-87: Process Socket Deny - This mechanism is applicable because the process making the DescribeInstances API call (e.g., aws CLI) requires network access. If this process is denied socket operations by policy, the call would fail.
- BR-88: Process Path Exec Allow - This mechanism is applicable if the aws CLI or tool used is executed from a path not on the allowlist.
- BR-90: Process Exec Deny - This mechanism is applicable if the aws CLI or specific tool used is explicitly denied execution.
- Attacker uses the retrieved node IAM credentials and the discovered cluster name with the aws eks get-token command to generate a Kubernetes API token associated with the node's privileges. (Cited from: "Using the aws eks get-token command and the IAM identity from the IMDS, we generated a valid Kubernetes token with the role of a Node.")
- BR-54: Container Drift Protection (Binaries & Scripts) - This mechanism is applicable if the aws CLI executable was not part of the original container image and was introduced by the attacker.
- BR-87: Process Socket Deny - This mechanism is applicable because the aws eks get-token command needs network access to communicate with the AWS EKS service. If the aws process is denied socket operations, this step fails.
- BR-88: Process Path Exec Allow - This mechanism is applicable if the aws CLI is executed from a non-allowlisted path.
- BR-90: Process Exec Deny - This mechanism is applicable if the aws CLI (or the specific eks get-token subcommand if granularity allows) is explicitly denied.
- Attacker uses the obtained node-level Kubernetes token to interact with the Kubernetes API, gaining elevated privileges within the cluster (e.g., listing pods, accessing secrets). (Cited from: "Now that we have the role of a node inside the Amazon EKS cluster, we have more privileges", "Listing all pods in the cluster with our new token", "obtaining secrets (using kubectl get secrets)")
- BR-54: Container Drift Protection (Binaries & Scripts) - This mechanism is applicable if the attacker uses a tool like kubectl that wasn't part of the original container image.
- BR-87: Process Socket Deny - This mechanism is applicable because interacting with the Kubernetes API requires network connections. If the process making these calls (e.g., kubectl, or a script using K8s client libraries) is denied socket operations, this step fails.
- BR-88: Process Path Exec Allow - This mechanism is applicable if kubectl or another tool is executed from a non-allowlisted path.
- BR-90: Process Exec Deny - This mechanism is applicable if kubectl or the specific tool used is explicitly denied execution.
- Attacker leverages access to Kubernetes secrets obtained via the node role for potential lateral movement or cross-tenant data access. (Cited from: "it was possible to perform lateral movement within the EKS cluster", "Secrets within shared environments may often lead to cross-tenant access")
- BR-91: Sensitive File Access - This mechanism is applicable if the Kubernetes secrets are mounted as files within the pod (a common pattern) and these file paths are included in the sensitive file list. It could block or alert on the attempt to read these secret files.
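The IMDS step in F2 hinges on a pod being able to reach 169.254.169.254. The sketch below shows the shape of an egress decision that treats that address specially. It is a hypothetical policy check, not the BR-59 implementation: `is_imds_destination`, `egress_allowed`, and the source labels are assumed names, and a real control would act on network traffic rather than on strings.

```python
import ipaddress

# IPv4 address of the EC2 Instance Metadata Service, as cited above.
IMDS_ADDRESS = ipaddress.ip_address("169.254.169.254")


def is_imds_destination(host: str) -> bool:
    """Return True if host is an IP literal that points at the IMDS address."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal (for example, a hostname)
    # Treat the IPv4-mapped IPv6 spelling as the same destination.
    if address.version == 6 and address.ipv4_mapped is not None:
        address = address.ipv4_mapped
    return address == IMDS_ADDRESS


def egress_allowed(source: str, host: str, imds_allowed_sources: frozenset) -> bool:
    """Hypothetical policy: only listed sources may reach IMDS.
    Traffic to other destinations is not judged by this check."""
    if is_imds_destination(host):
        return source in imds_allowed_sources
    return True


allowed_sources = frozenset({"node-agent"})  # assumed label for a trusted workload
print(egress_allowed("inference-pod", "169.254.169.254", allowed_sources))  # False
print(egress_allowed("node-agent", "169.254.169.254", allowed_sources))     # True
print(egress_allowed("inference-pod", "10.0.0.12", allowed_sources))        # True
```

Denying the request at this point would stop the chain before any node credentials are retrieved, which is why the later steps (DescribeInstances, aws eks get-token, kubectl) depend on it.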
F3: Gaining Remote Code Execution (RCE) during the build process in Hugging Face Spaces via a Malicious Dockerfile RUN Instruction.
- Attacker utilizes the Hugging Face Spaces service, which allows hosting applications defined by a user-provided Dockerfile. (Cited from: "Spaces is a different service in Hugging Face that allows users to host their AI-powered application", "all Hugging Face requires from the user in order to run their application... is a Dockerfile")
- Attacker crafts a Dockerfile containing malicious commands within one or more RUN instructions. (Cited from: "decided to use the RUN instruction instead of the CMD instruction, enabling us to execute code in the build process")
- Attacker submits this malicious Dockerfile to the Spaces service to trigger an application build.
- During the image building phase executed by Hugging Face's infrastructure, the RUN instructions containing the attacker's commands are executed. (Cited from: "execute code in the building process of our image")
- BR-54: Container Drift Protection (Binaries & Scripts) - This mechanism is applicable if the RUN instruction attempts to execute a binary or script that was not present in the base image layers used before this RUN step. Its effectiveness depends on how BR-54 establishes the baseline during image builds.
- BR-82: Process Runtime Execution Guardrails - This mechanism is applicable because if the commands within the RUN instruction spawn processes not permitted by the NSJail policy governing the build environment, their execution would be blocked.
- BR-83: Syscall Deny Filter - This mechanism is applicable if the commands executed by RUN rely on syscalls that are explicitly denied by policy within the build environment.
- BR-88: Process Path Exec Allow - This mechanism is applicable if a command within the RUN instruction attempts to execute a program from a filesystem path not included in the build environment's allowlist.
- BR-90: Process Exec Deny - This mechanism is applicable if a command within the RUN instruction attempts to execute a binary explicitly denied by policy (e.g., nc, wget).
- Attacker achieves code execution within the context of the build environment. (Cited from: "After executing code in the building process of our image")
- BR-55: Reverse Shell Protection - This mechanism is applicable if the code executed via the RUN instruction attempts to establish an interactive reverse shell from the build environment.
- BR-85: Ephemeral Filesystem Behavior Analysis - This mechanism is applicable if the RCE achieved during the build process uses techniques like execution from memory-backed filesystems or piping downloads to shells.
- BR-87: Process Socket Deny - This mechanism is applicable because if the process executing the RCE payload within the build environment is denied socket operations, any outbound network connection attempts (like C2 or reverse shell) would fail.
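As a rough static counterpart to the exec-deny idea in F3, the sketch below scans a Dockerfile for shell-form RUN instructions and reports any that name a denied binary. It is an illustrative lint only, not a description of BR-90, which acts at runtime; `run_commands`, `denied_in`, and the denylist contents are assumptions. It does not handle the JSON exec form of RUN, and it is easy to evade (for example, by renaming a binary), so it complements runtime enforcement rather than replacing it.

```python
import shlex

# Example denylist taken from the binaries named above; a real policy would differ.
DENIED_BINARIES = frozenset({"nc", "wget"})


def run_commands(dockerfile_text: str):
    """Yield the command of each shell-form RUN instruction,
    joining backslash line continuations."""
    pending = ""
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        instruction = (pending + line).strip()
        pending = ""
        keyword, _, rest = instruction.partition(" ")
        if keyword.upper() == "RUN":
            yield rest.strip()


def denied_in(command: str, denied=DENIED_BINARIES):
    """Return the denied binary names that appear as tokens in command."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        tokens = command.split()  # unbalanced quotes: fall back to whitespace
    names = {token.rsplit("/", 1)[-1] for token in tokens}
    return sorted(names & denied)


sample = "\n".join([
    "FROM python:3.11-slim",
    "RUN apt-get update && \\",
    "    apt-get install -y wget",
    "RUN /usr/bin/nc -h",
    'CMD ["python", "app.py"]',
])

for command in run_commands(sample):
    print(denied_in(command), "<-", command)
```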
F4: Overwriting Container Images in a Shared Internal Registry via Network Access from the Hugging Face Spaces Build Environment.
- Attacker, having achieved RCE within the Spaces build environment (as per F3), investigates network activity. (Cited from: "After executing code in the building process of our image, we used the netstat command to examine network connections")
- BR-54: Container Drift Protection (Binaries & Scripts) - This mechanism is applicable if the netstat command or other network investigation tools were not part of the base image used for the build.
- BR-88: Process Path Exec Allow - This mechanism is applicable if the network investigation tool is executed from a path not on the allowlist within the build environment.
- BR-90: Process Exec Deny - This mechanism is applicable if netstat or other specific tools used are explicitly denied execution.
- Attacker identifies network connections to an internal container registry used to store image layers built by the service. (Cited from: "One connection was to an internal container registry to which our built layers were pushed.")
- Attacker determines that this registry is shared among multiple Hugging Face customers using the Spaces service. (Cited from: "this container registry did not serve only us; it also served more of Hugging Face’s customers.")
- Attacker discovers that due to insufficient access controls or scoping, their build environment has write access to the registry beyond their own images. (Cited from: "Due to insufficient scoping")
- Attacker gains the ability to pull and push (overwrite) arbitrary images stored within the shared internal container registry. (Cited from: "it was possible to pull and push (thus overwrite) all the images that were available on that container registry.")
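The "insufficient scoping" in F4 can be contrasted with a per-repository grant. The sketch below assumes scope strings of the form `repository:<name>:<actions>`, the convention used by Docker registry token authentication; whether Hugging Face's internal registry uses that scheme is not stated in the source, and the repository names and helper functions are hypothetical.

```python
def parse_scope(scope: str):
    """Split 'type:name:actions' into (type, name, set_of_actions)."""
    resource_type, _, rest = scope.partition(":")
    name, _, actions = rest.rpartition(":")
    return resource_type, name, {a for a in actions.split(",") if a}


def is_allowed(granted_scopes, repository: str, action: str) -> bool:
    """True only if some granted scope names this exact repository and action."""
    for scope in granted_scopes:
        resource_type, name, actions = parse_scope(scope)
        if resource_type == "repository" and name == repository and action in actions:
            return True
    return False


# Hypothetical per-tenant grant issued to a single build job.
tenant_a_scopes = ["repository:spaces/tenant-a/app:pull,push"]

print(is_allowed(tenant_a_scopes, "spaces/tenant-a/app", "push"))  # True
print(is_allowed(tenant_a_scopes, "spaces/tenant-b/app", "push"))  # False
print(is_allowed(tenant_a_scopes, "spaces/tenant-b/app", "pull"))  # False
```

With grants scoped this narrowly, a compromised build environment could still overwrite its own image, but the pull and push of other customers' images described above would be refused.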