Cross-platform arbitrage and edge detection for prediction markets.
import { createOpportunityFinder } from './opportunity';
const finder = createOpportunityFinder(db, feeds, embeddings, {
minEdge: 0.5,
semanticMatching: true,
});
// Find opportunities
const opps = await finder.scan({ query: 'fed rate', minEdge: 1 });
// Real-time alerts
finder.on('opportunity', (opp) => console.log('Found:', opp.edgePct, '%'));
await finder.startRealtime();

| Command | Description |
|---|---|
| /opportunity scan [query] | Find opportunities |
| /opportunity active | Show active opportunities |
| /opportunity combinatorial | Scan for combinatorial arbitrage (arXiv:2508.03474) |
| /opportunity link <a> <b> | Link equivalent markets |
| /opportunity stats | View performance stats |
| /opportunity pairs | Platform pair analysis |
| /opportunity realtime start | Enable real-time scanning |
Buy YES + NO on same market for < $1.00
Example: Polymarket "Will X happen?"
YES: 45c + NO: 52c = 97c
Edge: 3% guaranteed profit
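The arithmetic above can be sketched in a few lines. This is a minimal illustration rather than the finder's actual implementation: `internalArbEdge` is a hypothetical helper, prices are assumed to be ask prices in dollars per share, and fees are ignored.

```typescript
// Hypothetical helper (not part of the documented API).
// Assumes yesAsk and noAsk are ask prices in dollars per share (0-1), before fees.
function internalArbEdge(
  yesAsk: number,
  noAsk: number,
): { cost: number; edgePct: number } | null {
  const cost = yesAsk + noAsk;
  // No arbitrage if the pair costs at least the $1.00 it pays out.
  if (cost >= 1) return null;
  return { cost, edgePct: (1 - cost) * 100 };
}

// YES at 45c plus NO at 52c costs 97c for a guaranteed $1.00 payout.
const internalOpp = internalArbEdge(0.45, 0.52);
console.log(internalOpp);
```

Returning `null` when the pair costs $1.00 or more keeps callers from treating a zero or negative edge as an opportunity.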
Same market priced differently across platforms
Example: "Fed rate hike in Jan"
Polymarket YES: 65c
Kalshi YES: 72c
Strategy: Buy YES @ 65c on Polymarket
Buy NO @ 28c on Kalshi (or sell YES)
Edge: 7%
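As a rough sketch of the cross-platform case, the helper below checks both hedge directions and keeps the cheaper one. The `Quote` and `Hedge` shapes and `crossPlatformEdge` are hypothetical names for illustration, the Polymarket NO price is an assumed value (the example above does not give one), and fees are ignored.

```typescript
// Hypothetical types and helper for illustration only.
interface Quote {
  platform: string;
  yesAsk: number; // dollars per YES share
  noAsk: number; // dollars per NO share
}

interface Hedge {
  buyYesOn: string;
  buyNoOn: string;
  cost: number;
  edgePct: number;
}

function crossPlatformEdge(a: Quote, b: Quote): Hedge | null {
  // Two ways to hold both outcomes: YES on a + NO on b, or YES on b + NO on a.
  const candidates = [
    { buyYesOn: a.platform, buyNoOn: b.platform, cost: a.yesAsk + b.noAsk },
    { buyYesOn: b.platform, buyNoOn: a.platform, cost: b.yesAsk + a.noAsk },
  ];
  const best = candidates[0].cost <= candidates[1].cost ? candidates[0] : candidates[1];
  // The hedge only locks in a profit if it costs less than the $1.00 payout.
  if (best.cost >= 1) return null;
  return { ...best, edgePct: (1 - best.cost) * 100 };
}

const hedge = crossPlatformEdge(
  { platform: 'polymarket', yesAsk: 0.65, noAsk: 0.36 }, // noAsk is an assumed value
  { platform: 'kalshi', yesAsk: 0.72, noAsk: 0.28 },
);
console.log(hedge);
```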
Market mispriced vs external benchmarks (polls, models)
Example: Election market
Market price: 45%
538 model: 52%
Edge: 7% (buy YES)
{
"opportunityFinder": {
"enabled": true,
"minEdge": 0.5,
"minLiquidity": 100,
"platforms": ["polymarket", "kalshi", "betfair"],
"semanticMatching": true,
"similarityThreshold": 0.85,
"realtime": false,
"scanIntervalMs": 10000
}
}

Opportunities are scored 0-100 based on:
| Factor | Weight | Description |
|---|---|---|
| Edge % | 35% | Raw arbitrage spread |
| Liquidity | 25% | Available $ to trade |
| Confidence | 25% | Match quality / fair value confidence |
| Execution | 15% | Platform reliability, fees |
Score = EdgeScore + LiquidityScore + ConfidenceScore + ExecutionScore - Penalties
EdgeScore (0-40): edge% / 10 * 40
LiquidityScore (0-25): min(liquidity/$50k, 1) * 25
ConfidenceScore (0-25): confidence * 25
ExecutionScore (0-10): platform reliability factors
Penalties:
- Low liquidity: -5 if < 5x minimum
- Cross-platform: -3 per additional platform
- High slippage: -5 if > 2%
- Low confidence: -5 if fair value confidence < 70%
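The formula and penalties above could be combined roughly as follows. This is a sketch under stated assumptions, not the shipped scorer: `ScoreInput` and `scoreOpportunity` are hypothetical names, it uses the point ranges from the formula lines (0-40, 0-25, 0-25, 0-10), each component is clamped to its range, the low-confidence penalty is applied to the single `confidence` input, and the final score is clamped to 0-100.

```typescript
// Hypothetical input shape for illustration.
interface ScoreInput {
  edgePct: number; // e.g. 5 for a 5% edge
  liquidityUsd: number;
  confidence: number; // 0-1
  executionScore: number; // 0-10, platform reliability factors
  platformCount: number; // 1 for internal, 2+ for cross-platform
  slippagePct: number;
  minLiquidityUsd: number; // configured minLiquidity
}

function clamp(x: number, lo: number, hi: number): number {
  return Math.min(Math.max(x, lo), hi);
}

function scoreOpportunity(o: ScoreInput): number {
  const edgeScore = clamp(o.edgePct / 10, 0, 1) * 40;
  const liquidityScore = Math.min(o.liquidityUsd / 50_000, 1) * 25;
  const confidenceScore = clamp(o.confidence, 0, 1) * 25;
  const executionScore = clamp(o.executionScore, 0, 10);

  let penalties = 0;
  if (o.liquidityUsd < 5 * o.minLiquidityUsd) penalties += 5; // low liquidity
  penalties += 3 * Math.max(o.platformCount - 1, 0); // each additional platform
  if (o.slippagePct > 2) penalties += 5; // high slippage
  if (o.confidence < 0.7) penalties += 5; // low confidence

  const raw = edgeScore + liquidityScore + confidenceScore + executionScore - penalties;
  return clamp(raw, 0, 100);
}

const exampleScore = scoreOpportunity({
  edgePct: 5,
  liquidityUsd: 25_000,
  confidence: 0.8,
  executionScore: 8,
  platformCount: 2,
  slippagePct: 1,
  minLiquidityUsd: 100,
});
console.log(exampleScore); // 20 + 12.5 + 20 + 8 - 3
```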
Uses embeddings to match markets with different wording:
"Will the Fed raise rates?"
= "FOMC vote for rate hike?"
= "Federal Reserve interest rate increase?"
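Embedding-based matching usually comes down to comparing vectors. The sketch below uses cosine similarity with the 0.85 threshold from the config example; whether the finder uses cosine similarity specifically is an assumption, and `cosineSimilarity` and `isSemanticMatch` are hypothetical names.

```typescript
// Hypothetical helpers; the vectors would come from the embeddings service.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as matching nothing.
  return denom === 0 ? 0 : dot / denom;
}

function isSemanticMatch(a: number[], b: number[], threshold = 0.85): boolean {
  return cosineSimilarity(a, b) >= threshold;
}
```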
Tokenizes and compares using Jaccard similarity:
- Removes stop words (will, the, be, etc.)
- Normalizes entities (Fed = FOMC, Jan = January)
- Requires 60% token overlap
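The text-matching steps above can be sketched as follows. The stop-word list and alias table are small illustrative samples rather than the finder's real lists, and the function names are hypothetical.

```typescript
// Illustrative samples only; the real lists are presumably larger.
const STOP_WORDS = new Set(['will', 'the', 'be', 'a', 'an', 'in', 'of', 'to', 'for']);
const ALIASES: Record<string, string> = { fomc: 'fed', jan: 'january' };

function tokenize(text: string): Set<string> {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  const tokens = new Set<string>();
  for (const w of words) {
    if (STOP_WORDS.has(w)) continue; // drop stop words
    tokens.add(ALIASES[w] || w); // normalize entities (FOMC -> fed, Jan -> january)
  }
  return tokens;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

function isTextMatch(q1: string, q2: string, threshold = 0.6): boolean {
  return jaccard(tokenize(q1), tokenize(q2)) >= threshold;
}
```

For instance, "Will the Fed hike rates in Jan?" and "FOMC rates hike January" reduce to the same four tokens, so they match at any threshold.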
Override automatic matching:
/opportunity link polymarket:abc123 kalshi:fed-rate-jan

Note: These factors are heuristic estimates, not empirically validated. Actual slippage varies significantly based on market, time of day, and current orderbook depth.
Platform-specific slippage factors (relative scale):
| Platform | Factor | Rationale |
|---|---|---|
| Betfair | 0.6 | High-volume sports exchange |
| Smarkets | 0.7 | Good liquidity, regulated |
| Polymarket | 0.8 | Varies greatly by market |
| Drift | 0.9 | Solana DEX |
| Kalshi | 1.0 | Baseline (moderate liquidity) |
| PredictIt | 1.2 | Lower liquidity, US only |
| Manifold | 1.5 | Play money (less market depth) |
| Metaculus | 2.0 | Community predictions |
Slippage formula (heuristic):
slippage = sqrt(size / liquidity) * 2 * platform_factor + spread/2
Recommendation: For real trading, fetch actual orderbook depth via /orderbook command rather than relying on these estimates.
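A direct transcription of the heuristic formula, using the factor table above, might look like this. `estimateSlippage` is a hypothetical name, the fallback factor of 1.0 for unlisted platforms is an assumption, and the result is in whatever units `spread` is expressed in.

```typescript
// Factors copied from the table above (relative scale, Kalshi = baseline).
const PLATFORM_FACTORS: Record<string, number> = {
  betfair: 0.6,
  smarkets: 0.7,
  polymarket: 0.8,
  drift: 0.9,
  kalshi: 1.0,
  predictit: 1.2,
  manifold: 1.5,
  metaculus: 2.0,
};

// slippage = sqrt(size / liquidity) * 2 * platform_factor + spread / 2
function estimateSlippage(
  platform: string,
  size: number,
  liquidity: number,
  spread: number,
): number {
  if (liquidity <= 0) return Infinity; // nothing to trade against
  const known = PLATFORM_FACTORS[platform.toLowerCase()];
  const factor = known === undefined ? 1.0 : known; // assumed default for unlisted platforms
  return Math.sqrt(size / liquidity) * 2 * factor + spread / 2;
}

console.log(estimateSlippage('kalshi', 100, 10_000, 1));
```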
Recommended position sizing:
kelly = edge * confidence * 0.25 (quarter Kelly)
Capped at 25% of bankroll per opportunity.
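As a minimal sketch of that sizing rule: `positionSize` is a hypothetical helper, and it assumes the edge is passed as a percentage and converted to a fraction before multiplying.

```typescript
// Hypothetical helper. Assumes edgePct is a percentage (7 means 7%)
// and confidence is in the range 0-1.
function positionSize(bankroll: number, edgePct: number, confidence: number): number {
  const quarterKelly = (edgePct / 100) * confidence * 0.25;
  // Never go below zero or above 25% of bankroll on a single opportunity.
  const fraction = Math.min(Math.max(quarterKelly, 0), 0.25);
  return bankroll * fraction;
}

// 7% edge at 80% confidence on a $10,000 bankroll: 0.07 * 0.8 * 0.25 = 1.4%.
console.log(positionSize(10_000, 7, 0.8));
```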
/opportunity stats 30 # Last 30 days

Example output (illustrative, not actual results):
Found: 1,247
Taken: 89
Win Rate: 67.4%
Total Profit: $4,521.00
Avg Edge: 2.3%
By Type:
internal: 412 found, 34 taken, 71.2% WR
cross_platform: 623 found, 41 taken, 65.8% WR
edge: 212 found, 14 taken, 64.3% WR
Note: Actual results depend on execution speed, market conditions, and timing. Past performance does not guarantee future results.
/opportunity pairs

Example output (illustrative):
polymarket <-> kalshi
Opportunities: 423 | Taken: 32
Win Rate: 68.8% | Profit: $2,140
Avg Edge: 2.1%
polymarket <-> betfair
Opportunities: 198 | Taken: 21
Win Rate: 71.4% | Profit: $1,890
Avg Edge: 2.8%
Cross-platform market identity mapping
| Column | Type | Description |
|---|---|---|
| id | TEXT | Link ID |
| market_a | TEXT | platform:marketId |
| market_b | TEXT | platform:marketId |
| confidence | REAL | 0-1 match confidence |
| source | TEXT | manual/auto/semantic |
Historical opportunity tracking
| Column | Type | Description |
|---|---|---|
| id | TEXT | Opportunity ID |
| type | TEXT | internal/cross_platform/edge |
| edge_pct | REAL | Arbitrage spread % |
| score | REAL | 0-100 score |
| status | TEXT | active/taken/expired/closed |
| realized_pnl | REAL | Actual profit/loss |
Aggregated performance by platform combination
| Column | Type | Description |
|---|---|---|
| platform_a | TEXT | First platform |
| platform_b | TEXT | Second platform |
| total_opportunities | INT | Count found |
| wins | INT | Profitable trades |
| total_profit | REAL | Cumulative P&L |
Creates opportunity finder instance.
Parameters:
- db - Database instance
- feeds - FeedManager instance
- embeddings - Optional EmbeddingsService for semantic matching
- config - OpportunityFinderConfig
Returns: OpportunityFinder
Scan for opportunities.
Options:
- query - Filter by market text
- minEdge - Minimum edge % (default: 0.5)
- minLiquidity - Minimum $ liquidity (default: 100)
- platforms - Platforms to scan
- types - Opportunity types to include
- limit - Max results (default: 50)
- sortBy - Sort by: edge, score, liquidity, profit
Returns: Promise<Opportunity[]>
Start real-time opportunity scanning.
Stop real-time scanning.
Manually link two markets as equivalent.
Get performance statistics.
Options:
- days - Time period (default: 30)
- platform - Filter by platform
Returns: OpportunityStats
| Event | Payload | Description |
|---|---|---|
| opportunity | Opportunity | New opportunity found |
| expired | Opportunity | Opportunity expired |
| taken | Opportunity | Marked as taken |
| closed | Opportunity | Final outcome recorded |
| started | - | Real-time scanning started |
| stopped | - | Real-time scanning stopped |
- Start with higher minEdge (2-3%) to filter noise
- Enable semantic matching if you have embeddings configured
- Monitor platform pairs - some combinations are more reliable
- Use quarter Kelly - the default is conservative for a reason
- Link markets manually when auto-matching misses obvious pairs
- Track outcomes - use /opportunity take and record results

- Lower minEdge threshold
- Add more platforms to scan
- Check feed connectivity
- Enable semantic matching
- Manually link known equivalent markets
- Adjust similarityThreshold
- Reduce position size
- Wait for better liquidity
- Use limit orders instead of market
Based on arXiv:2508.03474 - "Unravelling the Probabilistic Forest"
The paper analyzed Polymarket data from April 2024 to April 2025 (86 million bets across thousands of markets) and found $40M in realized arbitrage profits extracted by traders. Key caveats:
- Most profits were captured by sophisticated arbitrageurs with fast execution
- Political markets (2024 US election) had the largest spreads
- Sports markets had more frequent but smaller opportunities
The paper identifies two mechanisms:
When YES + NO prices don't sum to $1.00:
Example: Market totals $0.97
Buy YES @ 45c + Buy NO @ 52c = 97c
One outcome pays $1.00
Guaranteed profit: 3c per dollar
Long when sum < $1, short when sum > $1.
Markets with logical relationships:
| Relationship | Formula | Example |
|---|---|---|
| Implies (→) | P(A) ≤ P(B) | "Trump wins" → "Republican wins" |
| Inverse (¬) | P(A) + P(B) = 1 | "X happens" vs "X doesn't happen" |
| Exclusive (⊕) | P(A) + P(B) ≤ 1 | "Biden wins" vs "Trump wins" |
| Exhaustive (∨) | ΣP(i) = 1 | All candidates in race |
Arbitrage exists when market prices violate these constraints.
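One way to express these checks in code is to measure how far a set of prices sits outside its constraint. This is a sketch, not the scanner's implementation: `constraintViolation` is a hypothetical function, prices are probabilities in 0-1, and for `implies` the array is assumed to be ordered [A, B].

```typescript
type Relationship = 'implies' | 'inverse' | 'exclusive' | 'exhaustive';

// Returns the size of the violation in probability units (0 means consistent).
function constraintViolation(rel: Relationship, prices: number[]): number {
  const sum = prices.reduce((s, p) => s + p, 0);
  switch (rel) {
    case 'implies':
      // P(A) <= P(B): violated when A is priced above B.
      return Math.max(0, prices[0] - prices[1]);
    case 'exclusive':
      // P(A) + P(B) <= 1: violated when the sum exceeds 1.
      return Math.max(0, sum - 1);
    case 'inverse':
    case 'exhaustive':
      // Prices must sum to exactly 1; either direction is a violation.
      return Math.abs(sum - 1);
  }
}

// "Trump wins" at 0.55 against "Republican wins" at 0.52 breaks P(A) <= P(B).
console.log(constraintViolation('implies', [0.55, 0.52]));
```

In practice a caller would compare the result against a minimum edge that covers fees and slippage before treating it as tradable.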
# Scan for combinatorial arbitrage
/opportunity combinatorial
# With options
/opportunity comb minEdge=1 platforms=polymarket,kalshi

The naive algorithm is O(2^(n+m)), which is computationally infeasible. We use three heuristics:
- Timeliness: Only compare markets ending within 30 days of each other
- Topical similarity: Cluster markets by topic (elections, crypto, fed, sports)
- Logical relationships: Only check pairs with detectable dependencies
This reduces millions of comparisons to thousands.
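The first two heuristics can be sketched as a pair generator that only compares markets within the same topic cluster and with nearby end dates. `MarketInfo` and `candidatePairs` are hypothetical names, topics are assumed to be precomputed labels, and the third heuristic (detecting logical dependencies) is left out because it depends on the matching logic.

```typescript
// Hypothetical market shape for illustration.
interface MarketInfo {
  id: string;
  topic: string; // e.g. 'elections', 'crypto', 'fed', 'sports'
  endsAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function candidatePairs(
  markets: MarketInfo[],
  maxGapDays = 30,
): Array<[MarketInfo, MarketInfo]> {
  // Topical similarity: group markets by topic so pairs never cross clusters.
  const byTopic = new Map<string, MarketInfo[]>();
  for (const m of markets) {
    const group = byTopic.get(m.topic) || [];
    group.push(m);
    byTopic.set(m.topic, group);
  }

  const pairs: Array<[MarketInfo, MarketInfo]> = [];
  byTopic.forEach((group) => {
    for (let i = 0; i < group.length; i++) {
      for (let j = i + 1; j < group.length; j++) {
        // Timeliness: keep only pairs ending within maxGapDays of each other.
        const gapMs = Math.abs(group[i].endsAt.getTime() - group[j].endsAt.getTime());
        if (gapMs <= maxGapDays * DAY_MS) pairs.push([group[i], group[j]]);
      }
    }
  });
  return pairs;
}
```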
Additional predictive indicators:
OBI = (Q_bid - Q_ask) / (Q_bid + Q_ask)
Research shows:
- OBI explains ~65% of short-term price variance
- Imbalance Ratio > 0.65 predicts price increase (58% accuracy)
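The OBI formula translates directly into code. In this sketch `orderBookImbalance` is a hypothetical name, and `bidQty` and `askQty` are assumed to be the resting quantities on each side of the book at whatever depth you choose to measure.

```typescript
// OBI = (Q_bid - Q_ask) / (Q_bid + Q_ask), ranging from -1 (all asks) to +1 (all bids).
function orderBookImbalance(bidQty: number, askQty: number): number {
  const total = bidQty + askQty;
  if (total <= 0) return 0; // empty book: treat as balanced
  return (bidQty - askQty) / total;
}

console.log(orderBookImbalance(300, 100)); // bid-heavy book
```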
Kelly criterion for combinatorial positions:
f* = (P_true - P_market) / (1 - P_market)
Use fractional Kelly (25-50%) for safety.
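A small sketch of the Kelly formula with a fractional multiplier follows. `kellyFraction` is a hypothetical name, the formula is applied to buying YES at `pMarket`, and returning 0 when there is no positive edge is an assumption.

```typescript
// f* = (P_true - P_market) / (1 - P_market), scaled by a safety multiplier.
function kellyFraction(pTrue: number, pMarket: number, multiplier = 0.25): number {
  if (pMarket <= 0 || pMarket >= 1) return 0; // price outside the tradable range
  const full = (pTrue - pMarket) / (1 - pMarket);
  // No position when the market price is at or above the estimated probability.
  return Math.max(full, 0) * multiplier;
}

// Estimated 52% against a 45c price: full Kelly is 0.07 / 0.55, about 12.7% of bankroll.
console.log(kellyFraction(0.52, 0.45, 1));
console.log(kellyFraction(0.52, 0.45)); // quarter Kelly
```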
Reduce position as expiry approaches:
Position(t) = Initial × √(T_remaining / T_initial)
For example, with one week left on a 60-day market, this reduces exposure by roughly 65% (√(7/60) ≈ 0.34 of the initial position).
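The decay rule is a one-liner in code. `decayedPosition` is a hypothetical name, and clamping the time ratio to the range 0-1 is an assumption to keep the result sensible for out-of-range inputs.

```typescript
// Position(t) = Initial * sqrt(T_remaining / T_initial)
function decayedPosition(initial: number, tRemaining: number, tInitial: number): number {
  if (tInitial <= 0) return 0;
  const ratio = Math.min(Math.max(tRemaining / tInitial, 0), 1);
  return initial * Math.sqrt(ratio);
}

// A quarter of the time remaining leaves half the position.
console.log(decayedPosition(1000, 15, 60));
```

Both time arguments only need to share a unit (days, hours, or milliseconds), since only their ratio matters.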
import { scanCombinatorialArbitrage } from './opportunity/combinatorial';
const result = await scanCombinatorialArbitrage(feeds, {
platforms: ['polymarket', 'kalshi'],
minEdgePct: 0.5,
});
// result.rebalance - YES+NO != $1 opportunities
// result.combinatorial - conditional dependency opportunities
// result.clusters - market topic clusters

| Column | Type | Description |
|---|---|---|
| id | TEXT | Opportunity ID |
| type | TEXT | rebalance_long/rebalance_short/combinatorial |
| markets_json | TEXT | Markets involved |
| relationship | TEXT | implies/inverse/exclusive/exhaustive |
| edge_pct | REAL | Arbitrage edge % |
| confidence | REAL | Match confidence |
| Column | Type | Description |
|---|---|---|
| id | TEXT | Cluster ID |
| topic | TEXT | election_2024, bitcoin, fed_rates, etc. |
| market_ids_json | TEXT | Markets in cluster |
| avg_similarity | REAL | Average pairwise similarity |
Find mispriced markets based on logical and statistical correlations between assets.
Markets often have relationships that aren't reflected in their pricing:
- If "Trump wins presidency" is 60%, then "Republican wins presidency" should be ≥ 60%
- If "BTC hits $100k" moves, "ETH hits $5k" often follows within minutes
- "Harris wins" and "Trump wins" are mutually exclusive (their prices should sum to at most 100%)
The correlation finder detects these relationships and alerts when prices violate them.
| Type | Description | Example |
|---|---|---|
| identical | Same underlying event | "Fed raises rates" on Polymarket vs Kalshi |
| implies | A implies B (P(A) ≤ P(B)) | "Trump wins" → "Republican wins" |
| mutually_exclusive | Both can't happen (P(A) + P(B) ≤ 1) | "Biden wins" vs "Trump wins" |
| time_shifted | Correlated with lag (arbitrage window) | BTC price leads ETH by ~60s |
| partial | Statistical correlation (not logical) | Crypto prices tend to move together |
import { createCorrelationFinder } from './opportunity/correlation';
const corrFinder = createCorrelationFinder(feeds, db, {
minCorrelation: 0.7,
maxLagSeconds: 300,
minMispricingPct: 2.0,
});
// Scan for correlation arbitrage
const opportunities = await corrFinder.scan();
// Real-time alerts
corrFinder.on('mispricing', (opp) => {
console.log(`Mispricing: ${opp.marketA.question} vs ${opp.marketB.question}`);
console.log(`Expected: ${opp.expectedPrice}, Actual: ${opp.actualPrice}`);
console.log(`Edge: ${opp.edgePct}%`);
});
await corrFinder.startMonitoring();

The system includes pre-configured rules for common relationships:
// Candidate → Party
"Trump wins" implies "Republican wins"
"Harris wins" implies "Democrat wins"
"Biden wins" implies "Democrat wins"
// State → National (strong states)
"Trump wins Florida" implies "Trump wins presidency" (partial)

// BTC leads other crypto (60-120s typical lag)
"BTC up 5%" → "ETH up" (corr: 0.85, lag: 60-120s)
"BTC up 5%" → "SOL up" (corr: 0.75, lag: 90-180s)
// Major events propagate
"Coinbase lists X" → "X price up" (lag: varies)

// Same race
"Biden wins 2024" + "Trump wins 2024" ≤ 1.0
"Harris wins" + "Trump wins" + "Other wins" = 1.0 (exhaustive)

interface CorrelationConfig {
// Minimum correlation coefficient to consider
minCorrelation?: number; // default: 0.6
// Maximum lag for time-shifted correlations (seconds)
maxLagSeconds?: number; // default: 300
// Minimum mispricing to alert
minMispricingPct?: number; // default: 1.5
// Window for calculating correlations
correlationWindowMs?: number; // default: 3600000 (1 hour)
// Custom rules to add
customRules?: CorrelationRule[];
// Categories to scan (default: all)
categories?: string[];
}

Add your own correlation rules:
const finder = createCorrelationFinder(feeds, db, {
customRules: [
{
type: 'implies',
marketA: { pattern: /SpaceX.*launch/i },
marketB: { pattern: /Starship.*success/i },
description: 'SpaceX launch implies Starship program',
},
{
type: 'time_shifted',
marketA: { category: 'crypto', asset: 'BTC' },
marketB: { category: 'crypto', asset: 'DOGE' },
lagRangeSeconds: [120, 600],
minCorrelation: 0.65,
},
{
type: 'mutually_exclusive',
markets: [
{ pattern: /Biden wins 2024/i },
{ pattern: /Trump wins 2024/i },
{ pattern: /Third party wins 2024/i },
],
sumConstraint: 1.0, // exhaustive
},
],
});

Creates a correlation finder instance.
Parameters:
- feeds - FeedManager instance for market data
- db - Database instance for persistence
- config - CorrelationConfig options
Returns: CorrelationFinder
Scan for correlation-based arbitrage opportunities.
Options:
- categories - Categories to scan
- minEdge - Minimum edge % to include
- limit - Max results
Returns: Promise<CorrelationOpportunity[]>
Add a custom correlation rule.
Get correlation coefficients between markets.
Returns: Map<string, Map<string, number>>
Start real-time correlation monitoring.
Stop monitoring.
| Event | Payload | Description |
|---|---|---|
| mispricing | CorrelationOpportunity | Mispricing detected |
| correlation_break | { marketA, marketB, oldCorr, newCorr } | Historical correlation broke |
| rule_triggered | { rule, markets, edge } | Custom rule matched |
// Detect implies violation
const finder = createCorrelationFinder(feeds, db);
finder.on('mispricing', async (opp) => {
if (opp.type === 'implies' && opp.edgePct > 3) {
// "Trump wins" at 55% but "Republican wins" at 52%
// This violates implies relationship
console.log('Implies violation!');
console.log(`Buy "${opp.marketB.question}" at ${opp.marketB.price}`);
console.log(`Expected fair value: ${opp.expectedPrice} (${opp.edgePct}% edge)`);
}
});

const finder = createCorrelationFinder(feeds, db, {
categories: ['crypto'],
maxLagSeconds: 180,
});
finder.on('mispricing', async (opp) => {
if (opp.type === 'time_shifted') {
// BTC moved 5% up, ETH hasn't moved yet
console.log(`${opp.marketA.asset} moved, ${opp.marketB.asset} lagging`);
console.log(`Expected move: ${opp.expectedMove}%`);
console.log(`Time remaining: ${opp.lagRemaining}s`);
}
});

| Column | Type | Description |
|---|---|---|
| id | TEXT | Rule ID |
| type | TEXT | implies/mutually_exclusive/time_shifted/partial |
| market_a_pattern | TEXT | Regex or market ID |
| market_b_pattern | TEXT | Regex or market ID |
| parameters_json | TEXT | Lag, correlation threshold, etc. |
| enabled | BOOLEAN | Active status |
| Column | Type | Description |
|---|---|---|
| id | TEXT | Opportunity ID |
| rule_id | TEXT | Triggering rule |
| market_a_id | TEXT | First market |
| market_b_id | TEXT | Second market |
| type | TEXT | Correlation type |
| expected_price | REAL | Fair value based on correlation |
| actual_price | REAL | Current market price |
| edge_pct | REAL | Mispricing % |
| created_at | TIMESTAMP | Detection time |
| Column | Type | Description |
|---|---|---|
| market_a_id | TEXT | First market |
| market_b_id | TEXT | Second market |
| correlation | REAL | Pearson correlation |
| lag_seconds | INT | Optimal lag |
| sample_count | INT | Data points |
| updated_at | TIMESTAMP | Last calculation |
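The `correlation` and `lag_seconds` columns suggest a lagged Pearson calculation. The sketch below shows one plausible way to derive them from two equally spaced price series; `pearson` and `bestLag` are hypothetical names, the lag is measured in samples (multiply by the sampling interval to get seconds), and this is not necessarily how the finder computes its stored values.

```typescript
// Pearson correlation over the overlapping prefix of two series.
function pearson(x: number[], y: number[]): number {
  const n = Math.min(x.length, y.length);
  if (n < 2) return 0;
  let sumX = 0;
  let sumY = 0;
  for (let i = 0; i < n; i++) {
    sumX += x[i];
    sumY += y[i];
  }
  const meanX = sumX / n;
  const meanY = sumY / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  const denom = Math.sqrt(varX * varY);
  return denom === 0 ? 0 : cov / denom; // constant series have no defined correlation
}

// Try each lag from 0 to maxLag and keep the one with the highest correlation,
// comparing leader[t] against follower[t + lag].
function bestLag(
  leader: number[],
  follower: number[],
  maxLag: number,
): { lag: number; correlation: number } {
  let best = { lag: 0, correlation: pearson(leader, follower) };
  for (let lag = 1; lag <= maxLag && lag < leader.length - 1; lag++) {
    const c = pearson(leader.slice(0, leader.length - lag), follower.slice(lag));
    if (c > best.correlation) best = { lag, correlation: c };
  }
  return best;
}
```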