Parqify converts CSV and JSON to optimized Parquet files directly on AWS. Fast.
Automated CSV/JSON-to-Parquet Conversion on AWS
Deploy the Parqify AMI to transform your S3 data into optimized Parquet files.
Parqify provides a server application, packaged as an AMI, that customers can deploy within their AWS environment.
The server reads CSV and JSON files from a specified S3 bucket, converts them to the Parquet format, and writes the converted files to a second S3 bucket.
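The read, convert, and write flow described above can be sketched roughly as follows. This is a minimal illustration, not Parqify's actual implementation: it assumes `boto3` and `pyarrow` are available, and the function names `output_key_for` and `convert_object`, as well as the key layout, are hypothetical.

```python
import io
from pathlib import PurePosixPath


def output_key_for(input_key: str, input_prefix: str, output_prefix: str) -> str:
    """Map an input object key to a Parquet output key (hypothetical layout)."""
    # Strip the input prefix when present, then swap the file extension.
    if input_key.startswith(input_prefix):
        relative = input_key[len(input_prefix):]
    else:
        relative = input_key
    return output_prefix + str(PurePosixPath(relative).with_suffix(".parquet"))


def convert_object(s3, src_bucket: str, key: str, dst_bucket: str, dst_key: str) -> None:
    """Download one CSV/JSON object, convert it in memory, upload the Parquet result."""
    # Imported lazily so the key-mapping helper works without pyarrow installed.
    import pyarrow.csv as pa_csv
    import pyarrow.json as pa_json
    import pyarrow.parquet as pq

    body = s3.get_object(Bucket=src_bucket, Key=key)["Body"].read()
    if key.lower().endswith(".csv"):
        table = pa_csv.read_csv(io.BytesIO(body))
    else:
        # pyarrow's JSON reader expects newline-delimited JSON objects.
        table = pa_json.read_json(io.BytesIO(body))

    out = io.BytesIO()
    pq.write_table(table, out)
    s3.put_object(Bucket=dst_bucket, Key=dst_key, Body=out.getvalue())
```

In this sketch, `s3` would be a client created with `boto3.client("s3")`, and the whole object is held in memory, which is only reasonable for files that fit comfortably in the instance's RAM.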
Key Features of Parqify
- File Format Conversion: Supports conversion from CSV and JSON to Parquet.
- S3 Integration: Seamless integration with Amazon S3 for both input and output.
- AMI-based Deployment: Easy deployment via AWS Marketplace.
- Scalability: Customers can scale the EC2 instance size based on their processing needs.
- Configurability: Configuration options for S3 bucket names, file prefixes, and other conversion parameters.
- Parallel Execution: Processes multiple files concurrently for improved performance.
- Parquet File Optimizations: Optimizes the generated Parquet files for efficient storage and querying.
- Partitioning Support: Partitions output Parquet files based on specified criteria.
- Custom Schema Definition: Allows users to define custom schemas for Parquet conversion.
- Compression Options: Provides various compression options for Parquet files to reduce storage size.
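
To make the configurability and parallel-execution features more concrete, here is a small sketch using only the Python standard library. The `ConversionConfig` fields, the `convert_all` helper, and the list of compression codecs are assumptions made for illustration; they are not Parqify's documented configuration keys.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable

# Codecs commonly offered by Parquet writers; what Parqify exposes is an assumption.
SUPPORTED_COMPRESSION = {"snappy", "gzip", "zstd", "none"}


@dataclass(frozen=True)
class ConversionConfig:
    """Hypothetical settings covering buckets, prefixes, and conversion options."""
    input_bucket: str
    output_bucket: str
    input_prefix: str = ""
    output_prefix: str = ""
    compression: str = "snappy"
    partition_columns: tuple = ()
    max_workers: int = 4

    def __post_init__(self) -> None:
        if self.compression not in SUPPORTED_COMPRESSION:
            raise ValueError(f"unsupported compression: {self.compression!r}")
        if self.max_workers < 1:
            raise ValueError("max_workers must be at least 1")


def convert_all(
    keys: Iterable[str],
    convert_one: Callable[[str, ConversionConfig], object],
    config: ConversionConfig,
) -> dict:
    """Run convert_one(key, config) concurrently; map each key to its result or exception."""
    with ThreadPoolExecutor(max_workers=config.max_workers) as pool:
        futures = {key: pool.submit(convert_one, key, config) for key in keys}
    # Leaving the 'with' block waits for all submitted work to finish.
    results = {}
    for key, future in futures.items():
        exc = future.exception()
        results[key] = exc if exc is not None else future.result()
    return results
```

Collecting exceptions per key, rather than failing on the first error, lets one malformed file be reported without blocking the rest of the batch. Threads suit the S3 transfer portion of the work; for CPU-heavy conversions a process pool may be a better fit.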