Architecture & How It Works

Privalyse goes beyond simple regex matching by building a semantic understanding of your code.

Core Concepts

1. Semantic Data Flow Graph

Privalyse parses your code (a full AST for Python, a mix of partial AST and regex for JavaScript/TypeScript) to build a graph where:

- Nodes represent variables, functions, API calls, and data sources.
- Edges represent data flow (assignments, function calls, return values).

This allows the scanner to trace data from a Source (e.g., user input) to a Sink (e.g., logging, external API).
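To make the node/edge model concrete, here is a minimal sketch of such a graph with a source-to-sink search. The `FlowGraph` class, its method names, and the edge kinds are hypothetical and chosen for illustration; they are not Privalyse's internal API.

```python
from collections import defaultdict, deque


class FlowGraph:
    """Hypothetical directed graph: nodes are program entities, edges are data flows."""

    def __init__(self):
        # node -> list of (successor, edge kind)
        self.edges = defaultdict(list)

    def add_flow(self, src, dst, kind):
        self.edges[src].append((dst, kind))

    def find_path(self, source, sink):
        """Breadth-first search from source to sink; returns the node path or None."""
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            if path[-1] == sink:
                return path
            for nxt, _kind in self.edges[path[-1]]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None


graph = FlowGraph()
graph.add_flow("request.form['email']", "email", "assignment")
graph.add_flow("email", "log_msg", "assignment")
graph.add_flow("log_msg", "logging.info", "call_argument")

path = graph.find_path("request.form['email']", "logging.info")
print(path)
```

A path from a Source node to a Sink node is what would be reported as a potential leak; the reverse direction yields no path because edges are directed.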

2. Taint Tracking

The scanner identifies "tainted" data, meaning variables that contain PII or secrets, and then propagates this taint through the graph.

Example:

email = request.form['email'] -> email is tainted (Source: User Input).
log_msg = f"User: {email}" -> log_msg is tainted (Propagation).
logging.info(log_msg) -> Leak Detected (Sink: Logging).
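The three steps above can be reproduced with Python's standard `ast` module. This is a simplified sketch, not Privalyse's actual implementation: it handles only straight-line, top-level statements, and the `SOURCE_PREFIXES` and `SINK_CALLS` tables are assumptions made for the example.

```python
import ast

SOURCE_PREFIXES = ("request.form",)  # assumed source patterns
SINK_CALLS = {"logging.info"}        # assumed sink calls


def names_in(node):
    """All variable names referenced anywhere inside an AST node."""
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def find_leaks(code):
    tainted, leaks = set(), []
    for stmt in ast.parse(code).body:
        if isinstance(stmt, ast.Assign):
            is_source = ast.unparse(stmt.value).startswith(SOURCE_PREFIXES)
            # Taint the target if the value is a source or mentions tainted names.
            if is_source or names_in(stmt.value) & tainted:
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        tainted.add(target.id)
        elif isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
            call = stmt.value
            if ast.unparse(call.func) in SINK_CALLS:
                for arg in call.args:
                    hit = names_in(arg) & tainted
                    if hit:
                        leaks.append((stmt.lineno, sorted(hit)))
    return tainted, leaks


SAMPLE = '''
email = request.form['email']
log_msg = f"User: {email}"
logging.info(log_msg)
'''

tainted, leaks = find_leaks(SAMPLE)
print(tainted, leaks)
```

Because `SAMPLE` starts with a blank line, the `logging.info` call is reported on line 4. A real scanner would also need to handle branches, loops, attribute access, and sanitizers, which this sketch ignores.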

3. Cross-File Analysis

Privalyse resolves imports to track data flow across multiple files. If a function in utils.py returns PII, and main.py logs the result of that function, Privalyse detects the leak.
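The utils.py/main.py scenario can be sketched as follows. The approach shown (summarising which functions return PII, then treating imported calls to them as sources) is one plausible way to do this and is an assumption, not a description of Privalyse's internals. The module contents, `get_user_email`, and both helper functions are hypothetical, and the modules are held in memory to keep the example self-contained.

```python
import ast

# Hypothetical project, kept in memory instead of on disk.
MODULES = {
    "utils": '''
def get_user_email(request):
    return request.form["email"]
''',
    "main": '''
import logging
from utils import get_user_email

address = get_user_email(req)
logging.info(address)
''',
}


def pii_returning_functions(source):
    """Names of functions whose return value reads request.form (assumed source)."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            for sub in ast.walk(node):
                if isinstance(sub, ast.Return) and sub.value is not None:
                    if "request.form" in ast.unparse(sub.value):
                        found.add(node.name)
    return found


def cross_file_leaks(modules, entry):
    summaries = {name: pii_returning_functions(src) for name, src in modules.items()}
    imported_sources = set()  # local names bound to PII-returning functions
    tainted, leaks = set(), []
    for stmt in ast.parse(modules[entry]).body:
        if isinstance(stmt, ast.ImportFrom) and stmt.module in summaries:
            for alias in stmt.names:
                if alias.name in summaries[stmt.module]:
                    imported_sources.add(alias.asname or alias.name)
        elif isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Call):
            if ast.unparse(stmt.value.func) in imported_sources:
                tainted.update(t.id for t in stmt.targets if isinstance(t, ast.Name))
        elif isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
            if ast.unparse(stmt.value.func) == "logging.info":
                for arg in stmt.value.args:
                    if isinstance(arg, ast.Name) and arg.id in tainted:
                        leaks.append((entry, stmt.lineno, arg.id))
    return leaks


leaks = cross_file_leaks(MODULES, "main")
print(leaks)
```

The key design point is that `main` is never analysed in isolation: the summary computed for `utils` is what turns the otherwise innocuous call `get_user_email(req)` into a taint source.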

Scanner Pipeline

1. Discovery: Find all relevant files in the project.
2. Import Resolution: Build a dependency graph of modules.
3. Symbol Analysis: Index functions, classes, and variables (Global Symbol Table).
4. Intra-file Analysis:
   - Parse code to AST.
   - Identify Sources (PII, Secrets).
   - Identify Sinks (APIs, Logs, DBs).
   - Track data flow within the file.
5. Cross-file Propagation: Connect flows between modules using the Import Graph.
6. Policy Check: Verify findings against configured policies (e.g., GDPR compliance).
7. Reporting: Generate output in the requested format.
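As an illustration of the first two stages, the sketch below discovers Python files, builds an import graph restricted to project modules, and derives an order in which dependencies are analysed before the modules that import them. The function names `discover`, `import_graph`, and `analysis_order` are hypothetical, as are the two temporary files; the ordering step is an assumption about how cross-file propagation could be scheduled.

```python
import ast
import tempfile
from pathlib import Path


def discover(root):
    """Discovery: collect Python files under root, keyed by dotted module name."""
    root = Path(root)
    return {
        p.relative_to(root).with_suffix("").as_posix().replace("/", "."): p
        for p in sorted(root.rglob("*.py"))
    }


def import_graph(files):
    """Import Resolution: map each module to the project modules it imports."""
    graph = {}
    for module, path in files.items():
        deps = set()
        for node in ast.walk(ast.parse(path.read_text(encoding="utf-8"))):
            if isinstance(node, ast.Import):
                deps.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                deps.add(node.module)
        # Keep only imports that resolve to files inside the project.
        graph[module] = sorted(deps.intersection(files))
    return graph


def analysis_order(graph):
    """Depth-first ordering: dependencies come before their importers."""
    order, visiting = [], set()

    def visit(module):
        if module in order or module in visiting:  # done, or import cycle
            return
        visiting.add(module)
        for dep in graph[module]:
            visit(dep)
        visiting.discard(module)
        order.append(module)

    for module in sorted(graph):
        visit(module)
    return order


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "utils.py").write_text(
        "def get_email(r):\n    return r.form['email']\n", encoding="utf-8")
    Path(tmp, "main.py").write_text(
        "import logging\nfrom utils import get_email\n", encoding="utf-8")
    files = discover(tmp)
    graph = import_graph(files)
    order = analysis_order(graph)

print(graph, order)
```

Standard-library imports such as `logging` drop out of the graph because they do not correspond to a discovered file.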

Supported Languages

Python: Full AST-based analysis with cross-file tracking.
JavaScript/TypeScript: Hybrid analysis (Regex + partial AST) for detecting common patterns in React/Node.js apps.
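For a sense of what the regex side of the JavaScript/TypeScript analysis might involve, here is a rough sketch written in Python. The patterns `PII_IDENTIFIER` and `LOG_CALL` and the function `scan_js` are assumptions made for this example and do not reflect the rules Privalyse ships with.

```python
import re

# Assumed heuristics: identifiers that look like PII, and console logging calls.
PII_IDENTIFIER = re.compile(r"\b\w*(?:email|password|token)\w*\b", re.IGNORECASE)
LOG_CALL = re.compile(r"console\.(?:log|info|warn|error)\s*\((.*)\)")


def scan_js(source):
    """Flag lines where a console.* call mentions a PII-looking identifier."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        call = LOG_CALL.search(line)
        if call:
            names = PII_IDENTIFIER.findall(call.group(1))
            if names:
                findings.append((lineno, names))
    return findings


JS_SAMPLE = """\
const userEmail = req.body.email;
console.log("signup", userEmail);
console.log("done");
"""

findings = scan_js(JS_SAMPLE)
print(findings)
```

A line-based pattern like this has no notion of scope or assignment, so it would also flag a string literal that merely contains the word "email". That limitation is the kind of gap a partial AST pass can help close.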
