Performance testing and benchmarking for ParquetSharpLINQ.
dotnet run -c Release -- generate ./benchmark_data 5000
Generates test data with Hive-style partitioning across multiple years, months, and regions.

dotnet run -c Release -- analyze ./benchmark_data
Runs detailed performance analysis with partition pruning and column projection metrics.

dotnet run -c Release -- cleanup ./benchmark_data

Azure benchmarks use Azurite by default - no Azure account needed!
docker run -d -p 10000:10000 --name azurite \
  mcr.microsoft.com/azure-storage/azurite

dotnet run -c Release -- azure-upload parquet-bench 5000
Generates and uploads test data to Azurite.

dotnet run -c Release -- azure-analyze parquet-bench
Tests Azure streaming performance including caching effects.

dotnet run -c Release -- azure-cleanup parquet-bench

To use real Azure Storage instead of Azurite, set the AZURE_STORAGE_CONNECTION_STRING environment variable:
Linux/macOS:
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=...;AccountKey=..."
dotnet run -c Release -- azure-upload parquet-bench 5000

Windows (PowerShell):
$env:AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=..."
dotnet run -c Release -- azure-upload parquet-bench 5000

benchmark_data/
├── year=2023/
│ ├── month=01/
│ │ ├── region=us-east/data.parquet
│ │ ├── region=us-west/data.parquet
│ │ ├── region=eu-central/data.parquet
│ │ ├── region=eu-west/data.parquet
│ │ └── region=ap-southeast/data.parquet
│ ├── month=02/...
│ └── month=12/...
├── year=2024/...
└── year=2025/...
Total: 180 partitions (3 years × 12 months × 5 regions), 5K-10K records per partition
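To make the partition count concrete, the sketch below counts the data.parquet files under a root directory and uses a shell glob to select only the partitions for one year and one region. That is the directory-level idea behind partition pruning. It is a plain-shell illustration, not how ParquetSharpLINQ implements pruning, and the function names count_partitions and select_partitions are made up for this example.

```shell
# Hypothetical helpers for inspecting the Hive-style layout shown above.

# Count every partition file under the given root directory.
count_partitions() {
  find "$1" -type f -name 'data.parquet' | wc -l | tr -d ' '
}

# List only the partition directories for one year and one region.
# The glob expands only inside the matching year= directory.
select_partitions() {
  root="$1"
  year="$2"
  region="$3"
  for dir in "$root"/year="$year"/month=*/region="$region"; do
    # Skip the literal pattern when nothing matches.
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
    fi
  done
}
```

For the layout above, `count_partitions ./benchmark_data` should report 180, and `select_partitions ./benchmark_data 2024 us-east` should list one directory per month.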
# Generate with custom size
dotnet run -c Release -- generate ./data 10000
# Detailed analysis
dotnet run -c Release -- analyze ./data
# Cleanup
dotnet run -c Release -- cleanup ./data
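The three commands above can be chained in a small wrapper so that cleanup still runs when the analysis step fails. This is a minimal sketch: run_benchmark and the DOTNET_CMD override are hypothetical names introduced for this example and are not part of the benchmark tool.

```shell
# Hypothetical wrapper: generate, analyze, then always clean up.
# DOTNET_CMD is an assumed override (e.g. DOTNET_CMD=echo for a dry run).
run_benchmark() {
  dir="$1"
  records="$2"
  dotnet_cmd="${DOTNET_CMD:-dotnet}"

  # Stop early if data generation fails; there is nothing to analyze.
  "$dotnet_cmd" run -c Release -- generate "$dir" "$records" || return 1

  # Remember the analysis result but do not return yet.
  status=0
  "$dotnet_cmd" run -c Release -- analyze "$dir" || status=$?

  # Remove the generated data whether or not analysis succeeded.
  "$dotnet_cmd" run -c Release -- cleanup "$dir" || return 1

  return "$status"
}
```

Calling `run_benchmark ./data 10000` runs the same sequence as the commands listed above, and `DOTNET_CMD=echo run_benchmark ./data 10000` prints the argument lists without invoking dotnet.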