
Testing Strategy for Soroban Groth16 Verifier

Executive Summary

This document outlines the comprehensive testing strategy for the Soroban Groth16 verifier implementation. Given the critical nature of cryptographic code, we employ multiple testing methodologies to ensure correctness, security, and robustness.


Table of Contents

  1. Testing Principles
  2. Test Categories
  3. Current Test Coverage
  4. Test Execution
  5. Regression Testing
  6. Security Testing
  7. Performance Benchmarks
  8. Cross-Validation with EVM
  9. Continuous Integration
  10. Future Improvements

1. Testing Principles

1.1 Cryptographic Code Requirements

Cryptographic implementations must satisfy:

    • Correctness: Results match mathematical specifications
    • Security: Resistant to timing attacks and side channels
    • Robustness: Handle edge cases and malicious inputs
    • Determinism: Same inputs always produce same outputs
    • Completeness: All code paths tested
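Determinism in particular is cheap to assert mechanically: run the same operation twice on identical inputs and compare results byte-for-byte. A toy sketch of this check, using modular exponentiation as a stand-in for a real field operation (the `modpow` helper is illustrative, not part of the verifier):

```rust
// Stand-in operation: square-and-multiply modular exponentiation,
// acting as a proxy for a real field operation under test.
fn modpow(mut base: u64, mut exp: u64, m: u64) -> u64 {
    let mut acc = 1u64 % m;
    base %= m;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % m;
        }
        base = base * base % m;
        exp >>= 1;
    }
    acc
}

fn main() {
    // Determinism: identical inputs must yield identical outputs.
    let first = modpow(7, 560, 561);
    let second = modpow(7, 560, 561);
    assert_eq!(first, second);
    println!("deterministic: {}", first);
}
```

The same pattern scales up directly: serialize the output of `verify_proof` on a fixed vector, run it twice, and assert the serializations match.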

1.2 Testing Philosophy

We follow a defense-in-depth approach with multiple layers:

Layer 1: Unit Tests           → Individual function correctness
Layer 2: Integration Tests    → Module interaction correctness
Layer 3: Property-Based Tests → Mathematical properties
Layer 4: Reference Vectors    → Cross-validation with known results
Layer 5: Fuzzing              → Edge cases and malicious inputs
Layer 6: Regression Tests     → Prevent introduction of bugs
Layer 7: Security Audits      → Professional third-party review

2. Test Categories

2.1 Unit Tests (✅ Implemented)

Purpose: Verify individual functions work correctly in isolation

Coverage:

  • Field arithmetic (Fq, Fq2, Fq6, Fq12)
  • Curve operations (G1, G2)
  • Pairing operations
  • Helper functions

Examples:

// Field arithmetic
#[test]
fn test_fq_add() { ... }
#[test]
fn test_fq_mul() { ... }
#[test]
fn test_fq_inverse() { ... }

// Curve operations
#[test]
fn test_g1_double() { ... }
#[test]
fn test_g1_add() { ... }
#[test]
fn test_g2_scalar_mul() { ... }

// Pairing
#[test]
fn test_pairing_identity() { ... }
#[test]
fn test_miller_loop_structure() { ... }
#[test]
fn test_final_exponentiation() { ... }

Current Count: 30 unit tests

2.2 Property-Based Tests (📋 Planned)

Purpose: Verify mathematical properties hold for all valid inputs

Properties to Test:

// Field properties
∀ a, b, c ∈ Fq:
  - Commutativity: a + b = b + a
  - Associativity: (a + b) + c = a + (b + c)
  - Identity: a + 0 = a, a * 1 = a
  - Inverse: a * a⁻¹ = 1 (for a ≠ 0)
  - Distributivity: a * (b + c) = a*b + a*c

// Curve properties
∀ P, Q, R ∈ G1:
  - Commutativity: P + Q = Q + P
  - Associativity: (P + Q) + R = P + (Q + R)
  - Identity: P + O = P (O = point at infinity)
  - Inverse: P + (-P) = O

// Pairing properties
∀ P ∈ G1, Q ∈ G2, a, b ∈ Fr:
  - Bilinearity: e(aP, bQ) = e(P, Q)^(ab)
  - Non-degeneracy: e(G1, G2) ≠ 1
  - Computability: e(P, Q) is efficiently computable
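Bilinearity can be sanity-checked mechanically even before the real pairing is exercised, using a toy bilinear map e(x, y) = g^(x·y) in Z_p^*. This is purely an illustrative stand-in (small prime, integer "points"), not the optimal-ate pairing on BN254:

```rust
// Toy bilinear map: e(x, y) = g^(x*y) mod p, with g a generator of Z_p^*.
// It satisfies e(a*x, b*y) = e(x, y)^(a*b), mirroring the pairing law.
const P: u64 = 1_000_003; // small prime modulus, illustrative only
const G: u64 = 2;

fn modpow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

fn e(x: u64, y: u64) -> u64 {
    modpow(G, x * y)
}

fn main() {
    let (x, y, a, b) = (5u64, 7u64, 3u64, 11u64);
    // Bilinearity: e(a*x, b*y) == e(x, y)^(a*b)
    assert_eq!(e(a * x, b * y), modpow(e(x, y), a * b));
    println!("bilinearity holds for the toy map");
}
```

The real bilinearity test replaces `e` with the BN254 pairing and the integers with scalar multiples of the G1/G2 generators, but the assertion shape is identical.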

Implementation Plan:

# Add to Cargo.toml
[dev-dependencies]
proptest = "1.4"
quickcheck = "1.0"
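Until proptest is wired in, the field axioms can already be spot-checked with a plain loop over pseudo-random values. The sketch below is dependency-free and illustrative only: it uses a toy prime field (p = 101) and a simple LCG in place of the real BN254 Fq type and proptest's generators:

```rust
// Toy prime field: arithmetic mod a small prime, standing in for Fq.
const P: u64 = 101;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % P }

// Inverse via Fermat's little theorem: a^(p-2) mod p; None for zero.
fn inv(a: u64) -> Option<u64> {
    if a % P == 0 {
        return None;
    }
    let (mut base, mut exp, mut acc) = (a % P, P - 2, 1u64);
    while exp > 0 {
        if exp & 1 == 1 { acc = mul(acc, base); }
        base = mul(base, base);
        exp >>= 1;
    }
    Some(acc)
}

// Tiny LCG so the check is deterministic and dependency-free.
fn lcg(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

fn main() {
    let mut seed = 42u64;
    for _ in 0..1000 {
        let (a, b, c) = (lcg(&mut seed) % P, lcg(&mut seed) % P, lcg(&mut seed) % P);
        assert_eq!(add(a, b), add(b, a));                         // commutativity
        assert_eq!(add(add(a, b), c), add(a, add(b, c)));         // associativity
        assert_eq!(mul(a, add(b, c)), add(mul(a, b), mul(a, c))); // distributivity
        if a != 0 {
            assert_eq!(mul(a, inv(a).unwrap()), 1);               // inverse
        }
    }
    println!("all field axioms hold for 1000 samples");
}
```

With proptest in place, the loop and LCG disappear in favor of `proptest!` strategies generating arbitrary Fq elements, but the assertions stay the same.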

2.3 Integration Tests (⚠️ Partial)

Purpose: Test complete verification workflow

Scenarios:

  • Valid proof verification
  • Invalid proof rejection
  • Malformed input handling
  • Boundary conditions

Example:

#[test]
fn test_verify_valid_proof() {
    // Setup: Create proof with known valid inputs
    let proof = create_test_proof();
    let vk = create_test_vk();
    let public_inputs = vec![...];

    // Execute
    let result = verify_proof(proof, vk, public_inputs);

    // Assert
    assert!(result);
}

#[test]
fn test_verify_invalid_proof() {
    // Test that invalid proofs are rejected
    let proof = create_invalid_proof();
    let result = verify_proof(proof, vk, public_inputs);
    assert!(!result);
}

2.4 Reference Test Vectors (📋 Planned)

Purpose: Cross-validate with known correct results

Sources:

  1. EVM Verifier: Compare results with EVM precompile outputs
  2. snarkJS: Use test vectors from snarkJS library
  3. Circom: Proofs generated by Circom circuits
  4. Academic Papers: Test vectors from BN254 research papers

Implementation:

#[test]
fn test_evm_cross_validation() {
    // Use same inputs as EVM test
    let test_vectors = load_evm_test_vectors();

    for vector in test_vectors {
        let soroban_result = verify_proof(
            vector.proof,
            vector.vk,
            vector.public_inputs
        );

        let evm_result = vector.expected_result;

        assert_eq!(soroban_result, evm_result,
            "Soroban result differs from EVM for vector {:?}",
            vector.name
        );
    }
}

Test Vector Format (JSON):

{
  "name": "simple_range_proof",
  "curve": "BN254",
  "proof": {
    "pi_a": ["0x...", "0x..."],
    "pi_b": [["0x...", "0x..."], ["0x...", "0x..."]],
    "pi_c": ["0x...", "0x..."]
  },
  "vk": {
    "alpha": ["0x...", "0x..."],
    "beta": [["0x...", "0x..."], ["0x...", "0x..."]],
    ...
  },
  "public_inputs": ["0x..."],
  "expected": true
}

2.5 Fuzzing Tests (📋 Planned)

Purpose: Discover edge cases and crashes with random inputs

Tools:

  • cargo-fuzz (libFuzzer)
  • AFL.rs
  • Honggfuzz

Targets:

// Fuzz field operations
fuzz_target!(|data: &[u8]| {
    if data.len() >= 64 {
        let a = Fq::from_bytes_be(&data[0..32]);
        let b = Fq::from_bytes_be(&data[32..64]);

        // Should never panic
        let _ = a.add(&b);
        let _ = a.mul(&b);

        if !b.is_zero() {
            let _ = a.mul(&b.inverse().unwrap());
        }
    }
});

// Fuzz pairing with arbitrary points
fuzz_target!(|data: &[u8]| {
    if let Ok((g1, g2)) = parse_points(data) {
        // Should never panic
        let _ = pairing(&g1, &g2);
    }
});

2.6 Regression Tests (📋 Planned)

Purpose: Ensure bugs don't reappear

Approach:

  • Create test for every bug fix
  • Version tag each test
  • Run full regression suite on every commit

Example:

// Regression test for issue #123: Division by zero in field inversion
#[test]
fn test_field_inverse_zero_regression() {
    let zero = Fq::zero();
    assert!(zero.inverse().is_none());
}

// Regression test for issue #456: Miller loop overflow
#[test]
fn test_miller_loop_no_overflow() {
    let large_scalar = [u64::MAX, u64::MAX, u64::MAX, u64::MAX];
    let g1 = G1Affine::generator().mul(&large_scalar);
    let g2 = G2Affine::generator();

    // Should not overflow or panic
    let _ = pairing(&g1, &g2);
}

3. Current Test Coverage

3.1 Module Breakdown

Module       Unit Tests   Coverage   Status
field.rs     3            Basic      ✅
curve.rs     3            Basic      ✅
fq12.rs      4            Basic      ✅
pairing.rs   15           Good       ✅
lib.rs       5            Basic      ✅
Total        30           ~40%       ⚠️ Partial

3.2 Test Coverage by Category

Field Arithmetic:
  ✅ Addition, multiplication, squaring
  ✅ Inverse computation
  ✅ Zero and one identities
  ⚠️ Edge cases (overflow, modular reduction)
  ❌ Frobenius map properties
  ❌ Montgomery form correctness

Curve Operations:
  ✅ Point addition and doubling
  ✅ Scalar multiplication
  ✅ Infinity point handling
  ✅ Point negation
  ⚠️ On-curve validation
  ❌ Subgroup check
  ❌ Curve equation verification

Pairing:
  ✅ Identity pairing
  ✅ Infinity handling
  ✅ Miller loop structure
  ✅ Final exponentiation
  ✅ Multi-pairing
  ⚠️ Bilinearity property
  ❌ Cross-validation with EVM
  ❌ Edge case inputs

Integration:
  ✅ Basic structure validation
  ✅ Contract version
  ⚠️ End-to-end proof verification
  ❌ Malformed input handling
  ❌ Gas/resource estimation

Legend:

  • ✅ Fully tested
  • ⚠️ Partially tested
  • ❌ Not tested

3.3 Code Coverage Metrics

To measure actual code coverage, use tarpaulin:

# Install tarpaulin
cargo install cargo-tarpaulin

# Run coverage (for non-WASM target)
cargo tarpaulin --out Html --output-dir coverage

# View report
open coverage/index.html

Target Coverage Goals:

  • Critical paths (pairing, field ops): 100%
  • Helper functions: 90%
  • Error handling: 85%
  • Overall: 95%+

4. Test Execution

4.1 Running Tests

# Run all library tests (non-WASM)
cargo test --lib

# Run specific module tests
cargo test --lib field::tests
cargo test --lib pairing::tests

# Run with verbose output
cargo test --lib -- --nocapture

# Run specific test
cargo test --lib test_pairing_bilinearity_scalar

# Run ignored tests
cargo test --lib -- --ignored

# Show test output even for passing tests
cargo test --lib -- --show-output

4.2 Test Organization

Tests are organized in each module:

// In field.rs
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_fq_operations() { ... }
}

// In pairing.rs
#[cfg(test)]
mod tests {
    use super::*;
    use crate::curve::{G1Affine, G2Affine};

    #[test]
    fn test_pairing_properties() { ... }
}

4.3 Test Data

For complex tests requiring test vectors:

# Create test data directory
mkdir -p tests/vectors

# Store test vectors
tests/vectors/
├── bn254_g1_points.json
├── bn254_g2_points.json
├── pairing_test_vectors.json
├── evm_comparison_vectors.json
└── edge_cases.json

5. Regression Testing

5.1 Regression Test Suite

Create a comprehensive regression test suite:

// tests/regression.rs
#[cfg(test)]
mod regression_tests {
    use soroban_groth16_verifier::*;

    /// Test suite version
    const TEST_SUITE_VERSION: &str = "v1.0.0";

    #[test]
    fn test_suite_version() {
        // Ensure test suite is up to date
        println!("Running regression test suite {}", TEST_SUITE_VERSION);
    }

    // Add regression test for each discovered bug
    #[test]
    fn regression_field_inverse_none() {
        // Bug #001: Field inverse should return None for zero
        let zero = Fq::zero();
        assert!(zero.inverse().is_none());
    }

    #[test]
    fn regression_pairing_infinity() {
        // Bug #002: Pairing with infinity should return 1
        let g1_inf = G1Affine::infinity();
        let g2 = G2Affine::generator();
        let result = pairing(&g1_inf, &g2);
        assert!(result.is_one());
    }
}

5.2 Regression Test Workflow

1. Bug Discovered
   ↓
2. Create Failing Test
   ↓
3. Fix Bug
   ↓
4. Verify Test Passes
   ↓
5. Add to Regression Suite
   ↓
6. Tag with Issue Number
   ↓
7. Run Full Regression Suite

6. Security Testing

6.1 Timing Attack Tests

Concern: Constant-time operations to prevent side-channel attacks

#[test]
#[ignore] // Requires special timing setup
fn test_field_mul_constant_time() {
    use std::time::Instant;

    let samples = 10000;
    let a = Fq::from_montgomery([42, 0, 0, 0]);

    // Time multiplication with zero
    let b_zero = Fq::zero();
    let start = Instant::now();
    for _ in 0..samples {
        let _ = a.mul(&b_zero);
    }
    let time_zero = start.elapsed();

    // Time multiplication with non-zero
    let b_nonzero = Fq::from_montgomery([123, 456, 789, 101]);
    let start = Instant::now();
    for _ in 0..samples {
        let _ = a.mul(&b_nonzero);
    }
    let time_nonzero = start.elapsed();

    // Times should be similar (within 10%)
    let ratio = time_zero.as_nanos() as f64 / time_nonzero.as_nanos() as f64;
    assert!(ratio > 0.9 && ratio < 1.1,
        "Timing difference detected: {:?} vs {:?}",
        time_zero, time_nonzero);
}

6.2 Input Validation Tests

#[test]
fn test_point_not_on_curve_rejected() {
    // Create point not on curve
    let bad_point = G1Point {
        x: Bytes::from_array(&env, &[1u8; 32]),
        y: Bytes::from_array(&env, &[2u8; 32]),
    };

    assert!(!is_on_curve_g1(&env, &bad_point));
}

#[test]
fn test_malformed_proof_rejected() {
    // Proof with invalid field elements
    let proof = create_invalid_proof();
    let result = verify_proof(env, proof, vk, public_inputs);
    assert!(!result);
}

6.3 Resource Exhaustion Tests

#[test]
fn test_large_public_input_array() {
    // Should handle large (but valid) arrays without panic
    let large_inputs = vec![Bytes::from_array(&env, &[0u8; 32]); 1000];

    // Should fail gracefully, not panic:
    // either succeeds or returns false, but never aborts.
    let _ = verify_proof(env, proof, vk, large_inputs);
}

7. Performance Benchmarks

7.1 Benchmark Setup

# Add to Cargo.toml
[[bench]]
name = "field_ops"
harness = false

[[bench]]
name = "pairing_bench"
harness = false

[dev-dependencies]
criterion = "0.5"

7.2 Benchmark Examples

// benches/field_ops.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use soroban_groth16_verifier::field::*;

fn bench_field_mul(c: &mut Criterion) {
    let a = Fq::from_montgomery([1, 2, 3, 4]);
    let b = Fq::from_montgomery([5, 6, 7, 8]);

    c.bench_function("field_mul", |bench| {
        bench.iter(|| {
            black_box(a.mul(&b))
        });
    });
}

fn bench_pairing(c: &mut Criterion) {
    let g1 = G1Affine::generator();
    let g2 = G2Affine::generator();

    c.bench_function("pairing", |bench| {
        bench.iter(|| {
            black_box(pairing(&g1, &g2))
        });
    });
}

criterion_group!(benches, bench_field_mul, bench_pairing);
criterion_main!(benches);

7.3 Running Benchmarks

# Run all benchmarks
cargo bench

# Run specific benchmark
cargo bench field_mul

# Save baseline
cargo bench -- --save-baseline main

# Compare with baseline
cargo bench -- --baseline main

8. Cross-Validation with EVM

8.1 Test Vector Generation

Generate test vectors using snarkJS and EVM verifier:

// scripts/generate_test_vectors.js
const snarkjs = require("snarkjs");
const fs = require("fs");

async function generateVectors() {
    // Generate proof using snarkJS
    const { proof, publicSignals } = await snarkjs.groth16.fullProve(
        { input: 42 },
        "circuit.wasm",
        "circuit_final.zkey"
    );

    // Export for Soroban testing
    const testVector = {
        name: "simple_range_proof",
        proof: {
            pi_a: proof.pi_a,
            pi_b: proof.pi_b,
            pi_c: proof.pi_c
        },
        publicSignals: publicSignals,
        expected: true
    };

    fs.writeFileSync(
        "soroban/tests/vectors/simple_range_proof.json",
        JSON.stringify(testVector, null, 2)
    );
}

generateVectors();

8.2 Cross-Validation Test

#[test]
fn test_cross_validate_with_evm() {
    // Load test vector
    let vector = load_test_vector("simple_range_proof.json");

    // Convert to Soroban format
    let proof = convert_proof(&vector.proof);
    let vk = load_verification_key();
    let inputs = convert_inputs(&vector.publicSignals);

    // Verify
    let result = verify_proof(env, proof, vk, inputs);

    // Should match EVM result
    assert_eq!(result, vector.expected);
}

9. Continuous Integration

9.1 CI Pipeline

# .github/workflows/tests.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install Rust
        uses: actions-rs/toolchain@v1
        with:
          toolchain: stable

      - name: Run tests
        run: cargo test --lib

      - name: Run clippy
        run: cargo clippy -- -D warnings

      - name: Check formatting
        run: cargo fmt -- --check

  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install tarpaulin
        run: cargo install cargo-tarpaulin

      - name: Generate coverage
        run: cargo tarpaulin --out Xml

      - name: Upload to codecov
        uses: codecov/codecov-action@v3

9.2 Pre-commit Hooks

# .git/hooks/pre-commit
#!/bin/bash

# Run tests
cargo test --lib || exit 1

# Run clippy
cargo clippy -- -D warnings || exit 1

# Check formatting
cargo fmt -- --check || exit 1

echo "All checks passed!"

10. Future Improvements

10.1 Short-term (1-2 weeks)

  • Add property-based tests with proptest
  • Create EVM cross-validation test suite
  • Implement fuzzing for field operations
  • Add regression test template
  • Set up CI/CD pipeline

10.2 Medium-term (1-2 months)

  • Achieve 95% code coverage
  • Complete timing attack analysis
  • Add performance benchmarks
  • Create comprehensive test vector library
  • Document all edge cases

10.3 Long-term (3-6 months)

  • Formal verification of critical paths
  • Independent security audit
  • Chaos engineering tests
  • Load testing for Soroban limits
  • Comparative analysis with other implementations

Appendix A: Test Checklist

Pre-release Checklist

  • All unit tests passing
  • Property-based tests added for new features
  • Cross-validation with EVM successful
  • No compiler warnings
  • Code coverage ≥ 95%
  • Benchmarks show no performance regression
  • Security tests passing (timing, input validation)
  • Regression suite updated
  • Documentation updated
  • CHANGELOG updated

Quality Gates

Gate         Requirement       Status
Unit Tests   100% passing      ✅
Coverage     ≥ 95%             ⚠️ (40%)
Clippy       No warnings       ✅
Format       cargo fmt clean   ✅
Benchmarks   No regressions    ⚠️ (not set up)
Security     All checks pass   ⚠️ (partial)

Appendix B: Test Execution Commands

# Quick test run (unit tests only)
cargo test --lib

# Full test suite
cargo test --lib --all-features

# With coverage
cargo tarpaulin --out Html

# With benchmarks
cargo bench

# With fuzzing
cargo fuzz run field_ops

# Regression suite
cargo test --lib regression_tests::

# Security tests
cargo test --lib -- --ignored

# Cross-validation
npm run generate-vectors && cargo test --lib test_cross_validate

# CI simulation
./scripts/ci_local.sh

Document Information

Version: 1.0
Date: 2025
Authors: OpenZKTool Development Team
Last Updated: After v4 pairing implementation
Related:

  • CRYPTOGRAPHIC_COMPARISON.md - EVM vs Soroban comparison
  • README.md - Main project documentation
  • soroban/src/* - Implementation files

This testing strategy ensures the Soroban Groth16 verifier meets the highest standards of correctness, security, and reliability required for production cryptographic systems.