PCPJack Malware Exploits Parquet Files to Steal Cloud Secrets
Security researchers at Unit 42 have uncovered a new cloud‑targeting malware family, which they call PCPJack, that has quietly replaced the earlier TeamPCP implant. PCPJack distinguishes itself by embedding malicious code inside Apache Parquet files, a columnar storage format widely used for analytics workloads. It uses those files to discover and pre‑validate targets stealthily across Amazon Web Services, Microsoft Azure, and Google Cloud Platform environments.
The infection chain begins when a victim processes a seemingly innocuous Parquet file, often delivered via a compromised data‑lake bucket or as an attachment to a spear‑phishing email. PCPJack hides an encoded payload inside the Parquet metadata. Once the file is parsed by an engine such as Spark, Presto, or Athena, the payload executes a lightweight reconnaissance agent. This agent queries cloud metadata services to enumerate IAM roles, service‑account tokens, and object‑storage buckets, then exfiltrates the harvested secrets to a command‑and‑control (C2) server over HTTPS using domain‑fronted URLs. The malware uses the Parquet file’s column statistics to prioritize high‑value assets, filtering out irrelevant systems before it attempts credential theft.
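Because the payload is said to sit in the Parquet metadata, one defensive idea is to review a file's footer key-value metadata before any job processes it. The following is a minimal sketch, not a published PCPJack detector: the function names, the thresholds, the allowlist of benign keys, and the guess that the payload is base64-encoded or otherwise high-entropy are all assumptions made for illustration.

```python
import base64
import binascii
import math
from collections import Counter

# Assumption: keys that mainstream writers commonly add. Their values can be
# long or base64-encoded for benign reasons, so they are skipped here.
KNOWN_BENIGN_KEYS = {
    b"ARROW:schema",
    b"pandas",
    b"org.apache.spark.sql.parquet.row.metadata",
}


def shannon_entropy(data: bytes) -> float:
    """Return bits of entropy per byte (zero for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_like_base64(data: bytes, min_len: int = 64) -> bool:
    """Return True if the value is long and decodes as strict base64."""
    stripped = data.strip()
    if len(stripped) < min_len:
        return False
    try:
        base64.b64decode(stripped, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def flag_suspicious_metadata(kv, max_len=8192, min_entropy=5.5):
    """Map each suspicious metadata key to the heuristics it triggered.

    `kv` is a dict of bytes keys to bytes values, or None. The thresholds
    are illustrative defaults, not tuned values.
    """
    findings = {}
    for key, value in (kv or {}).items():
        if key in KNOWN_BENIGN_KEYS:
            continue
        reasons = []
        if len(value) > max_len:
            reasons.append("oversized")
        if looks_like_base64(value):
            reasons.append("base64-like")
        if len(value) >= 64 and shannon_entropy(value) >= min_entropy:
            reasons.append("high-entropy")
        if reasons:
            findings[key] = reasons
    return findings
```

If PyArrow is available, the input dictionary can be obtained with `pyarrow.parquet.read_metadata(path).metadata`, which reads only the file footer and returns the key-value metadata as bytes (or `None` when there is none). These heuristics will produce false positives on long alphanumeric values, so a hit is a prompt for manual review rather than a verdict.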
Unit 42 has released a set of indicators of compromise (IOCs) that includes SHA‑256 hashes of the malicious Parquet files (e.g., 3f9b…a2c1), C2 domains such as pcp‑jack‑data[.]cloud, and associated IP addresses (e.g., 198.51.100.42). YARA rules and Sigma detection signatures have also been published to identify the anomalous Parquet processing patterns and the unusual API calls to the instance‑metadata service. Organizations that have deployed Amazon GuardDuty, Microsoft Defender for Storage, or Google Cloud Security Command Center can enable the provided threat‑detection templates to surface the rogue activity.
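Hash-based IOCs of this kind can be checked against locally stored Parquet files with a short script. The sketch below assumes the full SHA-256 values have been obtained from the published IOC list (the truncated example above cannot be used as-is); the function names `sha256_of_file`, `find_ioc_matches`, and `refang` are hypothetical helpers, not part of any released tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large Parquet files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_ioc_matches(root, known_bad_hashes):
    """Return sorted paths of *.parquet files under `root` whose hash is listed."""
    bad = {h.strip().lower() for h in known_bad_hashes}
    return sorted(
        p
        for p in Path(root).rglob("*.parquet")
        if p.is_file() and sha256_of_file(p) in bad
    )


def refang(indicator):
    """Turn a defanged indicator such as example[.]com back into a usable form."""
    return indicator.replace("[.]", ".").replace("hxxp", "http")
```

Matching on the `.parquet` extension is a simplification: a renamed file would be skipped, so scanning every object in a bucket or directory is the safer choice where the volume allows it.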
To mitigate the risk, security teams should restrict public access to Amazon S3, Azure Blob Storage, and Google Cloud Storage locations that hold Parquet data, enforce bucket policies that allow Parquet uploads only from trusted principals, and monitor for unexpected calls to the cloud‑instance metadata endpoint (169.254.169.254). Implementing least‑privilege IAM policies, rotating access keys regularly, and enabling multi‑factor authentication for administrative accounts will limit the blast radius of any stolen credentials. Additionally, network segmentation and sandboxing of data‑processing jobs will help contain future PCPJack variants that attempt to exploit Parquet files.
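For Amazon S3, the upload restriction could be expressed as a bucket policy that denies `s3:PutObject` on Parquet objects to every principal outside an allowlist. The following sketch generates such a policy document; the bucket name, the role ARN, and the function name `build_parquet_upload_policy` are placeholders chosen for illustration, and the policy should be reviewed against your own account layout before use.

```python
import json


def build_parquet_upload_policy(bucket, trusted_principal_arns):
    """Build an S3 bucket policy that blocks *.parquet uploads from
    any principal whose ARN is not in the trusted list."""
    if not trusted_principal_arns:
        raise ValueError("at least one trusted principal ARN is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyParquetUploadsFromUntrustedPrincipals",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                # Wildcard match on the object key suffix.
                "Resource": f"arn:aws:s3:::{bucket}/*.parquet",
                # With a negated operator, the deny applies only when the
                # caller's ARN matches none of the listed values.
                "Condition": {
                    "ArnNotLike": {
                        "aws:PrincipalArn": sorted(trusted_principal_arns)
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bucket and role; substitute real values.
    policy = build_parquet_upload_policy(
        "example-data-lake",
        ["arn:aws:iam::111122223333:role/example-etl-writer"],
    )
    print(json.dumps(policy, indent=2))
```

A suffix match only controls who may write objects named `*.parquet`; it does not inspect content, so a malicious file uploaded under another name by an untrusted principal would still need to be caught by the processing pipeline. Azure Blob Storage and Google Cloud Storage use different access-control models and would need their own equivalents.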