
A critical remote code execution (RCE) vulnerability has been discovered in Apache Parquet’s Java library, potentially affecting thousands of data analytics systems worldwide.

The flaw, identified as CVE-2025-30065, carries the highest possible CVSS score of 10.0 and allows attackers to execute arbitrary code by exploiting unsafe deserialization in the parquet-avro module.

The security issue, classified as “Deserialization of Untrusted Data” (CWE-502), affects all Apache Parquet Java versions through 1.15.0.

Apache Parquet RCE Vulnerability

The vulnerability is reported to have been introduced in version 1.8.0, but because the advisory covers every release up to and including 1.15.0, all historical versions should be reviewed. At the core of this vulnerability lies a critical flaw in schema parsing within the parquet-avro module.

“Schema parsing in the parquet-avro module of Apache Parquet 1.15.0 and previous versions allows bad actors to execute arbitrary code,” according to the official advisory from Apache.

The vulnerability’s technical root cause involves insecure class loading during Avro schema parsing, allowing attackers to inject and execute malicious code when a specially crafted Parquet file is processed.

Exploitation requires no authentication and no direct user interaction. An attacker only needs to get a malicious Parquet file processed by the target's data pipeline.

The vulnerability was discovered and responsibly disclosed by Amazon researcher Keyi Li. The summary of the vulnerability is given below:

Risk Factors | Details
Affected Products | Apache Parquet Java library versions ≤ 1.15.0 (including parquet-avro module)
Impact | Remote Code Execution (RCE)
Exploit Prerequisites | Specially crafted Parquet file; no user interaction or authentication needed
CVSS 3.1 Score | 10.0 (Critical)

Widespread Impact on Big Data Ecosystems

The vulnerability affects numerous big data environments, including Hadoop, Spark, and Flink implementations, as well as analytics systems on AWS, Google, and Azure cloud platforms.

Major companies known to use Parquet in their data infrastructure include Netflix, Uber, Airbnb, and LinkedIn.

If successfully exploited, attackers could:

Gain complete control of vulnerable systems

Exfiltrate or tamper with sensitive data

Deploy ransomware or other malicious payloads

Disrupt critical data services and operations

“The vulnerability can impact data pipelines and analytics systems that import Parquet files, particularly when those files come from external or untrusted sources,” warns Endor Labs in their security advisory.

All aspects of system security—confidentiality, integrity, and availability—are at high risk.

Immediate Remediation Steps

The Apache Software Foundation has released version 1.15.1, which fixes the vulnerability. Organizations are strongly advised to take the following actions immediately:

Upgrade all Apache Parquet Java dependencies to version 1.15.1.

Implement strict validation of Parquet files, particularly those from external sources, for systems that cannot be immediately updated.

Enhance monitoring and logging for systems processing Parquet files to detect potential exploitation attempts.

Review data processing workflows to identify potential exposure points.
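As a starting point for the upgrade and review steps above, the following Python sketch walks a directory tree and lists Parquet JARs older than 1.15.1. It is illustrative only: the function names are hypothetical, and it assumes JARs follow the usual artifact-major.minor.patch.jar naming. It will not see Parquet classes bundled inside shaded or "uber" JARs, so a dependency report from the build tool remains the more reliable source.

```python
import re
from pathlib import Path

# First fixed release, per the advisory described in this article.
FIXED_VERSION = (1, 15, 1)

# Assumed naming convention, e.g. parquet-avro-1.15.0.jar or
# parquet-hadoop-1.13.1-SNAPSHOT.jar. Names that do not fit are skipped.
JAR_PATTERN = re.compile(
    r"^(parquet-[a-z0-9-]+?)-(\d+)\.(\d+)\.(\d+)([.-].*)?\.jar$"
)


def parse_parquet_jar(name):
    """Return (artifact, (major, minor, patch)), or None if the name does not match."""
    match = JAR_PATTERN.match(name)
    if match is None:
        return None
    artifact = match.group(1)
    version = tuple(int(match.group(i)) for i in (2, 3, 4))
    return artifact, version


def is_vulnerable(version):
    """True when the numeric version is lower than the fixed release.

    Tuples compare component by component, so (1, 9, 0) correctly sorts
    before (1, 15, 1), which a plain string comparison would get wrong.
    Pre-release suffixes such as -SNAPSHOT are ignored by this check.
    """
    return version < FIXED_VERSION


def find_vulnerable_jars(root):
    """Recursively list (path, artifact, version) for outdated Parquet JARs under root."""
    findings = []
    for path in sorted(Path(root).rglob("parquet-*.jar")):
        parsed = parse_parquet_jar(path.name)
        if parsed is not None and is_vulnerable(parsed[1]):
            findings.append((str(path), parsed[0], parsed[1]))
    return findings


if __name__ == "__main__":
    # Scans the current directory; point this at a lib/ or deployment folder.
    for jar_path, artifact, version in find_vulnerable_jars("."):
        print(f"{jar_path}: {artifact} {'.'.join(map(str, version))} predates 1.15.1")
```

Running the script from an application's library directory prints one line per outdated JAR. An empty result means only that no matching file names were found, not that the system is unaffected.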

As of April 4, 2025, there are no confirmed reports of this vulnerability being exploited in the wild. However, security experts warn that, given the vulnerability’s severity and now-public nature, exploitation attempts may begin soon.

“Despite the frightening potential, it’s important to note that the vulnerability can only be exploited if a malicious Parquet file is imported,” researchers said.
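Because exploitation depends on a malicious file being imported, one possible stopgap for systems that cannot be patched right away is to admit only files whose checksums were published by a trusted producer. The Python sketch below illustrates that idea under stated assumptions: the function names and the set of trusted digests are hypothetical, and how the digests are distributed and protected is left open. It does not inspect file contents and is not a substitute for upgrading.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def may_ingest(path, trusted_digests):
    """Allow a file into the pipeline only if its digest is on the allowlist.

    trusted_digests is a set of lowercase hex SHA-256 strings that the
    operator obtained from a trusted producer (an assumption of this sketch).
    """
    return sha256_of(path) in trusted_digests
```

A pipeline would call may_ingest before handing a file to any Parquet reader and would quarantine and log anything that fails the check, which also supports the monitoring advice given earlier.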

Nevertheless, the critical nature of this vulnerability demands immediate attention from all organizations using Apache Parquet in their data infrastructure.
