
Testing Strategy for Soroban Groth16 Verifier

Executive Summary

This document outlines the comprehensive testing strategy for the Soroban Groth16 verifier implementation. Given the critical nature of cryptographic code, we employ multiple testing methodologies to ensure correctness, security, and robustness.


Table of Contents

  1. Testing Principles
  2. Test Categories
  3. Current Test Coverage
  4. Test Execution
  5. Regression Testing
  6. Security Testing
  7. Performance Benchmarks
  8. Cross-Validation with EVM
  9. Continuous Integration
  10. Future Improvements

1. Testing Principles

1.1 Cryptographic Code Requirements

Cryptographic implementations must satisfy:

    • Correctness: Results match mathematical specifications
    • Security: Resistant to timing attacks and side channels
    • Robustness: Handle edge cases and malicious inputs
    • Determinism: Same inputs always produce same outputs
    • Completeness: All code paths tested
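Determinism in particular is cheap to assert mechanically: run the same operation twice on identical inputs and compare results byte-for-byte. A toy sketch of this check, using modular exponentiation as a stand-in for a real field operation (the `modpow` helper is illustrative, not part of the verifier):

```rust
// Stand-in operation: square-and-multiply modular exponentiation,
// acting as a proxy for a real field operation under test.
fn modpow(mut base: u64, mut exp: u64, m: u64) -> u64 {
    let mut acc = 1u64 % m;
    base %= m;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % m;
        }
        base = base * base % m;
        exp >>= 1;
    }
    acc
}

fn main() {
    // Determinism: identical inputs must yield identical outputs.
    let first = modpow(7, 560, 561);
    let second = modpow(7, 560, 561);
    assert_eq!(first, second);
    println!("deterministic: {}", first);
}
```

The same pattern scales up directly: serialize the output of `verify_proof` on a fixed vector, run it twice, and assert the serializations match.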

1.2 Testing Philosophy

We follow a defense-in-depth approach with multiple layers:

Layer 1: Unit Tests           → Individual function correctness
Layer 2: Integration Tests    → Module interaction correctness
Layer 3: Property-Based Tests → Mathematical properties
Layer 4: Reference Vectors    → Cross-validation with known results
Layer 5: Fuzzing              → Edge cases and malicious inputs
Layer 6: Regression Tests     → Prevent introduction of bugs
Layer 7: Security Audits      → Professional third-party review

2. Test Categories

2.1 Unit Tests (✅ Implemented)

Purpose: Verify individual functions work correctly in isolation

Coverage:

  • Field arithmetic (Fq, Fq2, Fq6, Fq12)
  • Curve operations (G1, G2)
  • Pairing operations
  • Helper functions

Examples:

// Field arithmetic
#[test]
fn test_fq_add() { ... }
#[test]
fn test_fq_mul() { ... }
#[test]
fn test_fq_inverse() { ... }

// Curve operations
#[test]
fn test_g1_double() { ... }
#[test]
fn test_g1_add() { ... }
#[test]
fn test_g2_scalar_mul() { ... }

// Pairing
#[test]
fn test_pairing_identity() { ... }
#[test]
fn test_miller_loop_structure() { ... }
#[test]
fn test_final_exponentiation() { ... }

Current Count: 30 unit tests

2.2 Property-Based Tests (📋 Planned)

Purpose: Verify mathematical properties hold for all valid inputs

Properties to Test:

// Field properties
∀ a, b, c ∈ Fq:
  - Commutativity: a + b = b + a
  - Associativity: (a + b) + c = a + (b + c)
  - Identity: a + 0 = a, a * 1 = a
  - Inverse: a * a⁻¹ = 1 (for a ≠ 0)
  - Distributivity: a * (b + c) = a*b + a*c

// Curve properties
∀ P, Q, R ∈ G1:
  - Commutativity: P + Q = Q + P
  - Associativity: (P + Q) + R = P + (Q + R)
  - Identity: P + O = P (O = point at infinity)
  - Inverse: P + (-P) = O

// Pairing properties
∀ P ∈ G1, Q ∈ G2, a, b ∈ Fr:
  - Bilinearity: e(aP, bQ) = e(P, Q)^(ab)
  - Non-degeneracy: e(G1, G2) ≠ 1
  - Computability: e(P, Q) is efficiently computable
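Bilinearity can be sanity-checked mechanically even before the real pairing is exercised, using a toy bilinear map e(x, y) = g^(x·y) in Z_p^*. This is purely an illustrative stand-in (small prime, integer "points"), not the optimal-ate pairing on BN254:

```rust
// Toy bilinear map: e(x, y) = g^(x*y) mod p, with g a generator of Z_p^*.
// It satisfies e(a*x, b*y) = e(x, y)^(a*b), mirroring the pairing law.
const P: u64 = 1_000_003; // small prime modulus, illustrative only
const G: u64 = 2;

fn modpow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

fn e(x: u64, y: u64) -> u64 {
    modpow(G, x * y)
}

fn main() {
    let (x, y, a, b) = (5u64, 7u64, 3u64, 11u64);
    // Bilinearity: e(a*x, b*y) == e(x, y)^(a*b)
    assert_eq!(e(a * x, b * y), modpow(e(x, y), a * b));
    println!("bilinearity holds for the toy map");
}
```

The real bilinearity test replaces `e` with the BN254 pairing and the integers with scalar multiples of the G1/G2 generators, but the assertion shape is identical.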

Implementation Plan:

# Add to Cargo.toml
[dev-dependencies]
proptest = "1.4"
quickcheck = "1.0"
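Until proptest is wired in, the field axioms can already be spot-checked with a plain loop over pseudo-random values. The sketch below is dependency-free and illustrative only: it uses a toy prime field (p = 101) and a simple LCG in place of the real BN254 Fq type and proptest's generators:

```rust
// Toy prime field: arithmetic mod a small prime, standing in for Fq.
const P: u64 = 101;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % P }

// Inverse via Fermat's little theorem: a^(p-2) mod p; None for zero.
fn inv(a: u64) -> Option<u64> {
    if a % P == 0 {
        return None;
    }
    let (mut base, mut exp, mut acc) = (a % P, P - 2, 1u64);
    while exp > 0 {
        if exp & 1 == 1 { acc = mul(acc, base); }
        base = mul(base, base);
        exp >>= 1;
    }
    Some(acc)
}

// Tiny LCG so the check is deterministic and dependency-free.
fn lcg(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

fn main() {
    let mut seed = 42u64;
    for _ in 0..1000 {
        let (a, b, c) = (lcg(&mut seed) % P, lcg(&mut seed) % P, lcg(&mut seed) % P);
        assert_eq!(add(a, b), add(b, a));                         // commutativity
        assert_eq!(add(add(a, b), c), add(a, add(b, c)));         // associativity
        assert_eq!(mul(a, add(b, c)), add(mul(a, b), mul(a, c))); // distributivity
        if a != 0 {
            assert_eq!(mul(a, inv(a).unwrap()), 1);               // inverse
        }
    }
    println!("all field axioms hold for 1000 samples");
}
```

With proptest in place, the loop and LCG disappear in favor of `proptest!` strategies generating arbitrary Fq elements, but the assertions stay the same.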

2.3 Integration Tests (⚠️ Partial)

Purpose: Test complete verification workflow

Scenarios:

  • Valid proof verification
  • Invalid proof rejection
  • Malformed input handling
  • Boundary conditions

Example:

#[test]
fn test_verify_valid_proof() {
    // Setup: Create proof with known valid inputs
    let proof = create_test_proof();
    let vk = create_test_vk();
    let public_inputs = vec![...];

    // Execute
    let result = verify_proof(proof, vk, public_inputs);

    // Assert
    assert!(result);
}

#[test]
fn test_verify_invalid_proof() {
    // Test that invalid proofs are rejected
    let proof = create_invalid_proof();
    let result = verify_proof(proof, vk, public_inputs);
    assert!(!result);
}

2.4 Reference Test Vectors (📋 Planned)

Purpose: Cross-validate with known correct results

Sources:

  1. EVM Verifier: Compare results with EVM precompile outputs
  2. snarkJS: Use test vectors from snarkJS library
  3. Circom: Proofs generated by Circom circuits
  4. Academic Papers: Test vectors from BN254 research papers

Implementation:

#[test]
fn test_evm_cross_validation() {
    // Use same inputs as EVM test
    let test_vectors = load_evm_test_vectors();

    for vector in test_vectors {
        let soroban_result = verify_proof(
            vector.proof,
            vector.vk,
            vector.public_inputs
        );

        let evm_result = vector.expected_result;

        assert_eq!(soroban_result, evm_result,
            "Soroban result differs from EVM for vector {:?}",
            vector.name
        );
    }
}

Test Vector Format (JSON):

{
  "name": "simple_range_proof",
  "curve": "BN254",
  "proof": {
    "pi_a": ["0x...", "0x..."],
    "pi_b": [["0x...", "0x..."], ["0x...", "0x..."]],
    "pi_c": ["0x...", "0x..."]
  },
  "vk": {
    "alpha": ["0x...", "0x..."],
    "beta": [["0x...", "0x..."], ["0x...", "0x..."]],
    ...
  },
  "public_inputs": ["0x..."],
  "expected": true
}

2.5 Fuzzing Tests (📋 Planned)

Purpose: Discover edge cases and crashes with random inputs

Tools:

  • cargo-fuzz (libFuzzer)
  • AFL.rs
  • Honggfuzz

Targets:

// Fuzz field operations
fuzz_target!(|data: &[u8]| {
    if data.len() >= 64 {
        let a = Fq::from_bytes_be(&data[0..32]);
        let b = Fq::from_bytes_be(&data[32..64]);

        // Should never panic
        let _ = a.add(&b);
        let _ = a.mul(&b);

        if !b.is_zero() {
            let _ = a.mul(&b.inverse().unwrap());
        }
    }
});

// Fuzz pairing with arbitrary points
fuzz_target!(|data: &[u8]| {
    if let Ok((g1, g2)) = parse_points(data) {
        // Should never panic
        let _ = pairing(&g1, &g2);
    }
});

2.6 Regression Tests (📋 Planned)

Purpose: Ensure bugs don't reappear

Approach:

  • Create test for every bug fix
  • Version tag each test
  • Run full regression suite on every commit

Example:

// Regression test for issue #123: Division by zero in field inversion
#[test]
fn test_field_inverse_zero_regression() {
    let zero = Fq::zero();
    assert!(zero.inverse().is_none());
}

// Regression test for issue #456: Miller loop overflow
#[test]
fn test_miller_loop_no_overflow() {
    let large_scalar = [u64::MAX, u64::MAX, u64::MAX, u64::MAX];
    let g1 = G1Affine::generator().mul(&large_scalar);
    let g2 = G2Affine::generator();

    // Should not overflow or panic
    let _ = pairing(&g1, &g2);
}

3. Current Test Coverage

3.1 Module Breakdown

Module       Unit Tests   Coverage   Status
field.rs     3            Basic      ✅
curve.rs     3            Basic      ✅
fq12.rs      4            Basic      ✅
pairing.rs   15           Good       ✅
lib.rs       5            Basic      ✅
Total        30           ~40%       ⚠️ Partial

3.2 Test Coverage by Category

Field Arithmetic:
  ✅ Addition, multiplication, squaring
  ✅ Inverse computation
  ✅ Zero and one identities
  ⚠️ Edge cases (overflow, modular reduction)
  ❌ Frobenius map properties
  ❌ Montgomery form correctness

Curve Operations:
  ✅ Point addition and doubling
  ✅ Scalar multiplication
  ✅ Infinity point handling
  ✅ Point negation
  ⚠️ On-curve validation
  ❌ Subgroup check
  ❌ Curve equation verification

Pairing:
  ✅ Identity pairing
  ✅ Infinity handling
  ✅ Miller loop structure
  ✅ Final exponentiation
  ✅ Multi-pairing
  ⚠️ Bilinearity property
  ❌ Cross-validation with EVM
  ❌ Edge case inputs

Integration:
  ✅ Basic structure validation
  ✅ Contract version
  ⚠️ End-to-end proof verification
  ❌ Malformed input handling
  ❌ Gas/resource estimation

Legend:

  • ✅ Fully tested
  • ⚠️ Partially tested
  • ❌ Not tested

3.3 Code Coverage Metrics

To measure actual code coverage, use tarpaulin:

# Install tarpaulin
cargo install cargo-tarpaulin

# Run coverage (for non-WASM target)
cargo tarpaulin --out Html --output-dir coverage

# View report
open coverage/index.html

Target Coverage Goals:

  • Critical paths (pairing, field ops): 100%
  • Helper functions: 90%
  • Error handling: 85%
  • Overall: 95%+

4. Test Execution

4.1 Running Tests

# Run all library tests (non-WASM)
cargo test --lib

# Run specific module tests
cargo test --lib field::tests
cargo test --lib pairing::tests

# Run with verbose output
cargo test --lib -- --nocapture

# Run specific test
cargo test --lib test_pairing_bilinearity_scalar

# Run ignored tests
cargo test --lib -- --ignored

# Show test output even for passing tests
cargo test --lib -- --show-output

4.2 Test Organization

Tests are organized in each module:

// In field.rs
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_fq_operations() { ... }
}

// In pairing.rs
#[cfg(test)]
mod tests {
    use super::*;
    use crate::curve::{G1Affine, G2Affine};

    #[test]
    fn test_pairing_properties() { ... }
}

4.3 Test Data

For complex tests requiring test vectors:

# Create test data directory
mkdir -p tests/vectors

# Store test vectors
tests/vectors/
├── bn254_g1_points.json
├── bn254_g2_points.json
├── pairing_test_vectors.json
├── evm_comparison_vectors.json
└── edge_cases.json

5. Regression Testing

5.1 Regression Test Suite

Create a comprehensive regression test suite:

// tests/regression.rs
#[cfg(test)]
mod regression_tests {
    use soroban_groth16_verifier::*;

    /// Test suite version
    const TEST_SUITE_VERSION: &str = "v1.0.0";

    #[test]
    fn test_suite_version() {
        // Ensure test suite is up to date
        println!("Running regression test suite {}", TEST_SUITE_VERSION);
    }

    // Add regression test for each discovered bug
    #[test]
    fn regression_field_inverse_none() {
        // Bug #001: Field inverse should return None for zero
        let zero = Fq::zero();
        assert!(zero.inverse().is_none());
    }

    #[test]
    fn regression_pairing_infinity() {
        // Bug #002: Pairing with infinity should return 1
        let g1_inf = G1Affine::infinity();
        let g2 = G2Affine::generator();
        let result = pairing(&g1_inf, &g2);
        assert!(result.is_one());
    }
}

5.2 Regression Test Workflow

1. Bug Discovered
   ↓
2. Create Failing Test
   ↓
3. Fix Bug
   ↓
4. Verify Test Passes
   ↓
5. Add to Regression Suite
   ↓
6. Tag with Issue Number
   ↓
7. Run Full Regression Suite

6. Security Testing

6.1 Timing Attack Tests

Concern: Constant-time operations to prevent side-channel attacks

#[test]
#[ignore] // Requires special timing setup
fn test_field_mul_constant_time() {
    use std::time::Instant;

    let samples = 10000;
    let a = Fq::from_montgomery([42, 0, 0, 0]);

    // Time multiplication with zero
    let b_zero = Fq::zero();
    let start = Instant::now();
    for _ in 0..samples {
        let _ = a.mul(&b_zero);
    }
    let time_zero = start.elapsed();

    // Time multiplication with non-zero
    let b_nonzero = Fq::from_montgomery([123, 456, 789, 101]);
    let start = Instant::now();
    for _ in 0..samples {
        let _ = a.mul(&b_nonzero);
    }
    let time_nonzero = start.elapsed();

    // Times should be similar (within 10%)
    let ratio = time_zero.as_nanos() as f64 / time_nonzero.as_nanos() as f64;
    assert!(ratio > 0.9 && ratio < 1.1,
        "Timing difference detected: {:?} vs {:?}",
        time_zero, time_nonzero);
}

6.2 Input Validation Tests

#[test]
fn test_point_not_on_curve_rejected() {
    // Create point not on curve
    let bad_point = G1Point {
        x: Bytes::from_array(&env, &[1u8; 32]),
        y: Bytes::from_array(&env, &[2u8; 32]),
    };

    assert!(!is_on_curve_g1(&env, &bad_point));
}

#[test]
fn test_malformed_proof_rejected() {
    // Proof with invalid field elements
    let proof = create_invalid_proof();
    let result = verify_proof(env, proof, vk, public_inputs);
    assert!(!result);
}

6.3 Resource Exhaustion Tests

#[test]
fn test_large_public_input_array() {
    // Should handle large (but valid) arrays without panic
    let large_inputs = vec![Bytes::from_array(&env, &[0u8; 32]); 1000];

    // Should fail gracefully, not panic:
    // either succeeds or returns false, but never aborts.
    let _ = verify_proof(env, proof, vk, large_inputs);
}

7. Performance Benchmarks

7.1 Benchmark Setup

# Add to Cargo.toml
[[bench]]
name = "field_ops"
harness = false

[[bench]]
name = "pairing_bench"
harness = false

[dev-dependencies]
criterion = "0.5"

7.2 Benchmark Examples

// benches/field_ops.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use soroban_groth16_verifier::field::*;

fn bench_field_mul(c: &mut Criterion) {
    let a = Fq::from_montgomery([1, 2, 3, 4]);
    let b = Fq::from_montgomery([5, 6, 7, 8]);

    c.bench_function("field_mul", |bench| {
        bench.iter(|| {
            black_box(a.mul(&b))
        });
    });
}

fn bench_pairing(c: &mut Criterion) {
    let g1 = G1Affine::generator();
    let g2 = G2Affine::generator();

    c.bench_function("pairing", |bench| {
        bench.iter(|| {
            black_box(pairing(&g1, &g2))
        });
    });
}

criterion_group!(benches, bench_field_mul, bench_pairing);
criterion_main!(benches);

7.3 Running Benchmarks

# Run all benchmarks
cargo bench

# Run specific benchmark
cargo bench field_mul

# Save baseline
cargo bench -- --save-baseline main

# Compare with baseline
cargo bench -- --baseline main

8. Cross-Validation with EVM

8.1 Test Vector Generation

Generate test vectors using snarkJS and EVM verifier:

// scripts/generate_test_vectors.js
const snarkjs = require("snarkjs");
const fs = require("fs");

async function generateVectors() {
    // Generate proof using snarkJS
    const { proof, publicSignals } = await snarkjs.groth16.fullProve(
        { input: 42 },
        "circuit.wasm",
        "circuit_final.zkey"
    );

    // Export for Soroban testing
    const testVector = {
        name: "simple_range_proof",
        proof: {
            pi_a: proof.pi_a,
            pi_b: proof.pi_b,
            pi_c: proof.pi_c
        },
        publicSignals: publicSignals,
        expected: true
    };

    fs.writeFileSync(
        "soroban/tests/vectors/simple_range_proof.json",
        JSON.stringify(testVector, null, 2)
    );
}

generateVectors();

8.2 Cross-Validation Test

#[test]
fn test_cross_validate_with_evm() {
    // Load test vector
    let vector = load_test_vector("simple_range_proof.json");

    // Convert to Soroban format
    let proof = convert_proof(&vector.proof);
    let vk = load_verification_key();
    let inputs = convert_inputs(&vector.publicSignals);

    // Verify
    let result = verify_proof(env, proof, vk, inputs);

    // Should match EVM result
    assert_eq!(result, vector.expected);
}

9. Continuous Integration

9.1 CI Pipeline

# .github/workflows/tests.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install Rust
        uses: actions-rs/toolchain@v1
        with:
          toolchain: stable

      - name: Run tests
        run: cargo test --lib

      - name: Run clippy
        run: cargo clippy -- -D warnings

      - name: Check formatting
        run: cargo fmt -- --check

  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install tarpaulin
        run: cargo install cargo-tarpaulin

      - name: Generate coverage
        run: cargo tarpaulin --out Xml

      - name: Upload to codecov
        uses: codecov/codecov-action@v3

9.2 Pre-commit Hooks

# .git/hooks/pre-commit
#!/bin/bash

# Run tests
cargo test --lib || exit 1

# Run clippy
cargo clippy -- -D warnings || exit 1

# Check formatting
cargo fmt -- --check || exit 1

echo "All checks passed!"

10. Future Improvements

10.1 Short-term (1-2 weeks)

  • Add property-based tests with proptest
  • Create EVM cross-validation test suite
  • Implement fuzzing for field operations
  • Add regression test template
  • Set up CI/CD pipeline

10.2 Medium-term (1-2 months)

  • Achieve 95% code coverage
  • Complete timing attack analysis
  • Add performance benchmarks
  • Create comprehensive test vector library
  • Document all edge cases

10.3 Long-term (3-6 months)

  • Formal verification of critical paths
  • Independent security audit
  • Chaos engineering tests
  • Load testing for Soroban limits
  • Comparative analysis with other implementations

Appendix A: Test Checklist

Pre-release Checklist

  • All unit tests passing
  • Property-based tests added for new features
  • Cross-validation with EVM successful
  • No compiler warnings
  • Code coverage ≥ 95%
  • Benchmarks show no performance regression
  • Security tests passing (timing, input validation)
  • Regression suite updated
  • Documentation updated
  • CHANGELOG updated

Quality Gates

Gate         Requirement       Status
Unit Tests   100% passing      ✅
Coverage     ≥ 95%             ⚠️ (40%)
Clippy       No warnings       ✅
Format       cargo fmt clean   ✅
Benchmarks   No regressions    ⚠️ (not set up)
Security     All checks pass   ⚠️ (partial)

Appendix B: Test Execution Commands

# Quick test run (unit tests only)
cargo test --lib

# Full test suite
cargo test --lib --all-features

# With coverage
cargo tarpaulin --out Html

# With benchmarks
cargo bench

# With fuzzing
cargo fuzz run field_ops

# Regression suite
cargo test --lib regression_tests::

# Security tests
cargo test --lib -- --ignored

# Cross-validation
npm run generate-vectors && cargo test --lib test_cross_validate

# CI simulation
./scripts/ci_local.sh

Document Information

Version: 1.0
Date: 2025
Authors: OpenZKTool Development Team
Last Updated: After v4 pairing implementation
Related:

  • CRYPTOGRAPHIC_COMPARISON.md - EVM vs Soroban comparison
  • README.md - Main project documentation
  • soroban/src/* - Implementation files

This testing strategy ensures the Soroban Groth16 verifier meets the highest standards of correctness, security, and reliability required for production cryptographic systems.