Critical Unpatched Ray AI Platform Vulnerability Exploited for Cryptocurrency Mining

Summary:
Researchers have uncovered an active attack campaign targeting a shadow vulnerability in Ray, an open-source AI framework developed by Anyscale and used by thousands of companies. A shadow vulnerability is a CVE that does not appear in static scans but can still be used by attackers to cause data breaches and significant losses. CVE-2023-48022 does not appear in static scans because it remains a disputed vulnerability: Anyscale does not consider it a risk, and it was not addressed in the hotfix that covered the other four CVEs disclosed at the same time. The vulnerability, which researchers have named ShadowRay, has been observed in the wild compromising thousands of publicly exposed Ray servers, some for as long as seven months.

The attackers often had access to the command history stored on the compromised machine, which made it straightforward to work out where sensitive information resides. ShadowRay, described as the first known instance of AI workloads being exploited in the wild, opens the door to a wide range of attacks that can cause irreversible damage to a company and expose its most sensitive data. Because the vulnerable service is internet-facing and allows remote code execution, attackers have an easy way to monetize access for different purposes. They can also remain undetected, since the disputed CVE is treated as benign by public scans.

This unknown attacker could:

  • Degrade an AI model’s accuracy or tamper with it during the training phase.
  • Download complete databases while remaining undetected.
  • Steal password hashes by analyzing the command history.
  • Use private SSH keys to move laterally, gaining more compute power and maintaining persistence in the environment.
  • Drain company credits on OpenAI accounts and payment accounts through token theft.
  • Deliver supply chain attacks through Hugging Face tokens that allow editing of existing AI models.
  • Use access to the cloud environment to infect cloud workloads.
  • Most importantly, drop cryptocurrency mining malware on machines with immense compute power.

On-demand AWS pricing for GPU machines can exceed $850,000 per year, which puts the estimated collective value of the machines that may have been compromised at almost 1 billion USD. The crypto-mining campaigns leverage ShadowRay to deploy three different kinds of crypto-miners. The first crypto-miner was uncovered on February 21, 2024, before the vulnerability was disclosed by Anyscale. The attackers are part of the ZEPH mining pool. A mining pool is a group of miners who combine their efforts to earn rewards more quickly; each reward is split in proportion to the processing power each miner contributes.

Analyst Comments:
Because CVE-2023-48022 is a shadow vulnerability, many development teams that use Ray are unaware that it poses a critical risk. To further evade detection, the attacker uses free public servers of the open-source interactsh service to send C2 communication containing the compromised machine’s IP address to a free subdomain under their control. Anyscale states that code execution capabilities were included in Ray intentionally. By design, Ray expects its users to be responsible for where it runs and how it is secured. The Ray dashboard should not be internet-facing because it intentionally lacks authorization features. However, Ray’s official Kubernetes deployment guide and KubeRay’s Kubernetes operator encourage users to expose the dashboard on address 0.0.0.0. This campaign raises a pressing question: should SaaS providers bear the main responsibility for balancing the security and usability of their software, or does some of that responsibility belong to customers as well? The question is especially important for critical systems like Ray.

Suggested Corrections:
Follow the best practices for securing Ray deployments:

  • Start by running Ray within a secured, trusted environment.
  • Always add firewall rules or security groups to prevent unauthorized access.
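
One way to sanity-check the firewall rules above is to test, from a machine outside the trusted network, whether the dashboard port accepts TCP connections. The sketch below is only an illustration: the function name is_port_reachable is hypothetical, and a successful connection shows that the port is open, not that the service behind it is Ray.

```python
import socket


def is_port_reachable(host, port=8265, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    8265 is the default Ray Dashboard port mentioned in this advisory.
    """
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (refused, unreachable, timed out) if it cannot complete.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from an external host against the public address of a Ray head node, this should return False when the firewall rule or security group is doing its job.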

Add authorization on top of the Ray Dashboard port (8265 by default):

  • If you do need Ray’s dashboard to be accessible, implement a proxy that adds an authorization layer to the Ray API when exposing it over the network.
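
The proxy idea above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not a production design: it assumes the dashboard listens only on loopback at port 8265, that clients send an "Authorization: Bearer <token>" header, and that TLS is terminated in front of the proxy. The names UPSTREAM, RAY_PROXY_TOKEN, is_authorized, AuthProxyHandler, and serve are hypothetical. Only GET requests are forwarded; other methods are rejected by the handler's default behavior.

```python
import hmac
import os
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Assumptions (hypothetical, not taken from Ray's documentation):
# the dashboard is bound to loopback only, and the shared token is
# supplied through the RAY_PROXY_TOKEN environment variable.
UPSTREAM = "http://127.0.0.1:8265"
EXPECTED_TOKEN = os.environ.get("RAY_PROXY_TOKEN", "")


def is_authorized(header_value, expected_token):
    """Return True only if the header carries the expected bearer token."""
    if not expected_token or not header_value:
        return False  # an unset token never authorizes anyone
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())


class AuthProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization"), EXPECTED_TOKEN):
            self.send_response(401)
            self.send_header("WWW-Authenticate", "Bearer")
            self.end_headers()
            return
        try:
            with urllib.request.urlopen(UPSTREAM + self.path, timeout=10) as resp:
                body = resp.read()
                status = resp.status
                ctype = resp.headers.get("Content-Type", "application/octet-stream")
        except urllib.error.HTTPError as exc:
            # Pass upstream error responses through unchanged.
            body, status = exc.read(), exc.code
            ctype = exc.headers.get("Content-Type", "text/plain")
        except urllib.error.URLError:
            self.send_error(502, "Upstream unreachable")
            return
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(listen_host="127.0.0.1", listen_port=8443):
    """Start the proxy; blocks until interrupted."""
    ThreadingHTTPServer((listen_host, listen_port), AuthProxyHandler).serve_forever()
```

Calling serve() starts the proxy on the chosen interface. In practice, an established reverse proxy or identity-aware gateway is likely a better choice than hand-written code; the sketch only shows where the authorization check sits relative to the Ray API.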

Continuously monitor your production environments and AI clusters for anomalies, even within Ray:

  • Ray depends on arbitrary code execution to function. Code scanning and misconfiguration tools will not detect such attacks, because the open-source maintainers of Ray (Anyscale) marked the CVE as disputed and confirmed it is not a bug; at the time of writing, it is a feature.
  • Don’t bind on 0.0.0.0 to make your life easy - Use the IP of an explicit network interface instead, such as an address in the subnet of your local network or in a trusted private VPC/VPN.
  • Don’t trust the default - Sometimes tools assume you have read their docs. Read them.
  • Use the right tools - The technical burden of securing open source is yours. Don’t rely on the maintainers; there are tools that can help protect your production workloads from the runtime risks of using open source.
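
The advice against binding on 0.0.0.0 can be turned into a small pre-deployment check. The sketch below uses Python's standard ipaddress module; the function name is_safe_bind_address is hypothetical, and treating every private-range address as acceptable is an assumption that may be too permissive for some networks.

```python
import ipaddress


def is_safe_bind_address(host):
    """Return True for loopback or private addresses, False for wildcards,
    public addresses, and anything that does not parse as an IP."""
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # unparsable input is treated as unsafe
    # Check the wildcard first: 0.0.0.0 and :: listen on every interface,
    # and 0.0.0.0 would otherwise pass the is_private test below.
    if addr.is_unspecified:
        return False
    return addr.is_loopback or addr.is_private
```

A deployment script could call this on the configured dashboard host and refuse to start when it returns False.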

Researchers at Oligo Security have published IOCs relevant to this campaign here:

https://www.oligo.security/blog/shadowray-attack-ai-workloads-actively-exploited-in-the-wild

Link(s):

https://thehackernews.com/2024/03/critical-unpatched-ray-ai-platform.html