VMware Greenplum Platform Extension Framework Overview
What is the Platform Extension Framework?
When multiple data sets exist in external systems, it is often more efficient to join them remotely and return only the results than to incur the time and storage costs of an expensive full data load. The VMware Greenplum Platform Extension Framework (PXF) provides this capability through parallel, high-throughput data access and federated query processing.
PXF extends the VMware Greenplum database platform by mapping external data sources to Greenplum external tables through connectors. The connectors cover stores such as HDFS, Hive, HBase, object stores, and SQL databases reachable over JDBC. The framework is designed to be flexible and modular, so developers can also create custom extensions for specific use cases.
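To make the federated query idea concrete, the sketch below assembles the kind of `CREATE EXTERNAL TABLE` statement used to expose external data to Greenplum through the `pxf` protocol. It is an illustration under stated assumptions, not a definitive recipe: the helper `pxf_external_table_ddl`, the table name `sales_ext`, its columns, and the path `data/sales` are all hypothetical, and the `hdfs:parquet` profile and `default` server are only example values. Check the PXF documentation for the options your version and data source require.

```python
def pxf_external_table_ddl(table, columns, path,
                           profile="hdfs:parquet", server="default"):
    """Build the DDL text for a readable PXF external table.

    All arguments are illustrative; the function only formats a string
    and does not connect to Greenplum or validate its inputs.
    """
    # Render the column definitions, e.g. "id int, amount numeric".
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)

    # The LOCATION URI names the external data plus the PXF profile and server.
    location = f"pxf://{path}?PROFILE={profile}&SERVER={server}"

    return (
        f"CREATE EXTERNAL TABLE {table} ({column_list})\n"
        f"LOCATION ('{location}')\n"
        "FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');"
    )


# Hypothetical example: a Parquet data set stored under data/sales.
ddl = pxf_external_table_ddl(
    "sales_ext",
    [("id", "int"), ("amount", "numeric")],
    "data/sales",
)
print(ddl)
```

Once such a table exists, ordinary SQL (for example `SELECT` statements and joins against regular Greenplum tables) can read from it, which is what lets the remote data be queried without a full load.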
With PXF, you can use VMware Greenplum and SQL to query a variety of heterogeneous data sources and data formats. The latest PXF version offers some enhanced features for several of the supported data formats, including:
- Parquet Array Support
- Fixed Width Text Data Support
- Multibyte Delimiter Support
Check out the links below to learn more!
- Parquet Array Support (Whitepaper)
- Greenplum PXF Support for Apache Parquet (Video)
- Reading and Writing HDFS Parquet Data (Documentation)
- Fixed Width Text Data Support (Whitepaper)
- Greenplum PXF Support for Read and Write with ORC (Video)
- VMware Greenplum Platform Extension Framework (Documentation)