Commit 9e15253

NotYuSheng and claude authored
feat: network diagram improvements — node limit, filter persistence, report accuracy (#195)
* feat: story improvements — pre-gen screen, context-length error UX, prompt caps, timeline z-index, report filters

  - Remove auto-generation on Story tab visit; show a pre-generation screen with StoryInfoCard so users can configure settings before generating
  - Detect LLM context-length-exceeded errors (HTTP 422 / errorCode CONTEXT_LENGTH_EXCEEDED) and surface the full prompt in an editable textarea so the user can trim it and retry with a custom prompt
  - Add prompt-cap controls (Max findings / Max risk matrix rows) to StoryInfoCard: preset buttons, a custom number input, and an "All" button that shows the total count once a story has been generated
  - Backend: new ContextLengthExceededException; LlmClient parses token counts from the 400 body; GenerateStoryRequest accepts customPrompt, maxFindings, maxRiskMatrix; StoryService caps findings/risk-matrix rows and short-circuits to the custom-prompt path on retry
  - Fix Recharts Tooltip z-index in TrafficTimeline so the modal renders on top
  - Pass user session filters to captureNetworkDiagrams in report generation

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add LLM_CONTEXT_LENGTH env var and clarify LLM_MAX_TOKENS

  - Add an LLM_CONTEXT_LENGTH config property so operators can explicitly set the model's context-window size instead of relying on /v1/models auto-detection
  - LlmClient now prefers the configured context length over auto-detection, with auto-detection as the fallback and a clearer log message for each path
  - Clarify the 80% guard comment: it caps effectiveMaxTokens as a safety measure, not as the primary purpose of LLM_MAX_TOKENS
  - Rewrite the LLM_MAX_TOKENS comment in .env to make clear it controls response length only (not context-window size)

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: update .env.example with LLM_CONTEXT_LENGTH and a corrected LLM_MAX_TOKENS comment

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: wire LLM_CONTEXT_LENGTH into docker-compose and add a pre-flight token check

  - Add LLM_CONTEXT_LENGTH to docker-compose.yml so operators can set it without rebuilding the image
  - LlmClient now performs a pre-flight estimate (chars/4 heuristic) before calling the LLM server; if the estimated prompt plus response reserve exceeds the known context length, it throws ContextLengthExceededException immediately — no LLM call is made

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: keep suggested-question buttons visible until send is clicked

  Previously, clicking a suggestion immediately cleared the buttons. Now they persist until the question is actually submitted (they still disappear during loading, as before).

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address gemini review — transaction scope, pattern precompile, dead code, timeout check

  - Remove @Transactional from generateStory to avoid holding a DB connection open during multi-minute LLM calls
  - Pre-compile regex Patterns as static constants in LlmClient
  - Delete the unused StoryCountsResponse DTO and STORY_COUNTS endpoint key
  - Remove the unused loadingLimits state from StoryPage
  - Narrow the isTimeout check: drop includes('exceeded') to avoid masking CONTEXT_LENGTH_EXCEEDED errors as timeout messages

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: network diagram improvements — node-limit banner, filter persistence, report accuracy

  - Add a node-limit banner to the compare topology view (mirroring the single-file view), showing "Top N / All N" controls with presets and a custom input; the banner stays visible even when all nodes are shown
  - Per-file buildNetworkGraph in compare mode now passes maxNodes=0 so significance filtering is applied post-merge rather than per-file
  - Lift network-diagram filter state into AnalysisPage so filters survive tab navigation (session-level cache)
  - Report diagram capture now uses the already-filtered nodes/edges visible on screen instead of re-fetching, so the PDF matches exactly what the user sees
  - Active filter labels are sent to the backend and rendered as a dedicated PDF section before the topology diagrams

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address gemini review — empty-diagrams fallback, null filter guard, token heuristic, context error handling, investigation steps in retry

  - AnalysisPage: fall back to a fresh conversation fetch when networkGraphStateRef is empty (the user never visited the Network Diagram tab), so the PDF always has diagrams
  - ReportService: skip null filter strings in addNetworkDiagramFilters to prevent an NPE
  - LlmClient: tighten the pre-flight token estimate from chars/4 to chars/2 for denser technical content; broaden context-length error detection to catch provider variants (Ollama, LM Studio); use estimated counts as a fallback when the regex finds no match
  - StoryService: re-run the Phase 1 investigation during a custom-prompt retry so investigationSteps are preserved in the response instead of being silently dropped

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 6f42bbe commit 9e15253

10 files changed: 404 additions & 102 deletions
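
The context-length retry UX described in the commit message spans several of the diffs below: the backend throws ContextLengthExceededException and the frontend surfaces the prompt for trimming. As a rough sketch of how a client might consume the HTTP 422 / CONTEXT_LENGTH_EXCEEDED contract (the endpoint path and all response field names other than errorCode are illustrative assumptions, not code from this commit):

```typescript
import axios, { AxiosError } from 'axios';

// Assumed wire shape of the 422 body. Only errorCode is confirmed by the
// commit message; the other field names are hypothetical.
interface ContextLengthErrorBody {
  errorCode: string;      // 'CONTEXT_LENGTH_EXCEEDED'
  promptTokens: number;   // parsed from the provider error, or estimated
  contextLength: number;  // configured or auto-detected window size
  prompt: string;         // full prompt, surfaced so the user can trim it
}

async function generateStoryWithRetryUx(
  fileId: string,
  onContextOverflow: (body: ContextLengthErrorBody) => void, // opens the editable textarea
  customPrompt?: string,
): Promise<unknown | null> {
  try {
    // Hypothetical endpoint path.
    const res = await axios.post(`/api/files/${fileId}/story`, { customPrompt });
    return res.data;
  } catch (err) {
    const e = err as AxiosError<ContextLengthErrorBody>;
    if (e.response?.status === 422 && e.response.data?.errorCode === 'CONTEXT_LENGTH_EXCEEDED') {
      // User edits the prompt in the textarea, then this runs again with customPrompt set.
      onContextOverflow(e.response.data);
      return null;
    }
    throw err;
  }
}
```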

backend/src/main/java/com/tracepcap/report/ReportRequest.java

Lines changed: 4 additions & 1 deletion
```diff
@@ -1,12 +1,15 @@
 package com.tracepcap.report;
 
+import java.util.List;
 import lombok.Data;
 import lombok.NoArgsConstructor;
 
-/** Optional diagram images (base64-encoded PNGs) sent by the frontend. */
+/** Optional diagram images (base64-encoded PNGs) and active filter labels sent by the frontend. */
 @Data
 @NoArgsConstructor
 public class ReportRequest {
   private String forceDirectedImage;
   private String hierarchicalImage;
+  /** Human-readable labels for each active network-diagram filter, e.g. "Protocol: HTTPS". */
+  private List<String> activeFilters;
 }
```
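
For reference, a request body matching this DTO might look as follows; the image fields come straight from the DTO, while any filter string other than "Protocol: HTTPS" (the Javadoc's own example) is invented here to show the "Label: value" convention:

```typescript
// Illustrative payload only; labels other than 'Protocol: HTTPS' are invented.
const reportRequest = {
  forceDirectedImage: '<base64-encoded PNG>',
  hierarchicalImage: '<base64-encoded PNG>',
  activeFilters: [
    'Protocol: HTTPS', // example from the DTO Javadoc
    'Port: 443',       // hypothetical label
    'Has risks',       // hypothetical colon-less label; see ReportService below for how it renders
  ],
};
```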

backend/src/main/java/com/tracepcap/report/ReportService.java

Lines changed: 48 additions & 0 deletions
```diff
@@ -202,6 +202,10 @@ public void generateReport(UUID fileId, ReportRequest request, OutputStream out)
       addExtractedFiles(document, extractedFiles, sec++);
     }
 
+    if (request.getActiveFilters() != null && !request.getActiveFilters().isEmpty()) {
+      addNetworkDiagramFilters(document, request.getActiveFilters(), sec++);
+    }
+
     addTopologyDiagram(document, request.getForceDirectedImage(), "Force-Directed Layout", sec++);
     addTopologyDiagram(
         document, request.getHierarchicalImage(), "Hierarchical Layout (Top-Down)", sec++);
@@ -775,6 +779,50 @@ private void addExtractedFiles(Document doc, List<ExtractedFileEntity> files, in
     doc.add(table);
   }
 
+  // ══════════════════════════════════════════════════════════════════════════
+  // Section: Network Diagram Filters Applied
+  // ══════════════════════════════════════════════════════════════════════════
+
+  private void addNetworkDiagramFilters(Document doc, List<String> filters, int sec)
+      throws Exception {
+    addSectionHeader(doc, sec + ". Network Diagram — Active Filters");
+
+    Font bodyFont = new Font(Font.HELVETICA, 10, Font.NORMAL, C_TEXT);
+    Paragraph intro =
+        new Paragraph(
+            "The network topology diagrams in this report were generated with the following filters applied:",
+            bodyFont);
+    intro.setSpacingBefore(6);
+    intro.setSpacingAfter(8);
+    doc.add(intro);
+
+    PdfPTable table = new PdfPTable(1);
+    table.setWidthPercentage(100);
+    table.setSpacingAfter(12);
+    Font labelFont = new Font(Font.HELVETICA, 10, Font.BOLD, C_LABEL);
+    Font valueFont = new Font(Font.HELVETICA, 10, Font.NORMAL, C_TEXT);
+    for (int i = 0; i < filters.size(); i++) {
+      String filter = filters.get(i);
+      if (filter == null) continue;
+      int colon = filter.indexOf(':');
+      Color bg = i % 2 == 0 ? Color.WHITE : C_ROW_ALT;
+      PdfPCell cell = new PdfPCell();
+      cell.setBackgroundColor(bg);
+      cell.setPadding(6);
+      cell.setBorder(Rectangle.NO_BORDER);
+      if (colon > 0) {
+        Phrase phrase = new Phrase();
+        phrase.add(new Phrase(filter.substring(0, colon + 1) + " ", labelFont));
+        phrase.add(new Phrase(filter.substring(colon + 1).trim(), valueFont));
+        cell.setPhrase(phrase);
+      } else {
+        cell.setPhrase(new Phrase(filter, valueFont));
+      }
+      table.addCell(cell);
+    }
+    doc.add(table);
+  }
+
   // ══════════════════════════════════════════════════════════════════════════
   // Section: Network Topology Diagrams (frontend-rendered PNG)
   // ══════════════════════════════════════════════════════════════════════════
```
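
Something on the frontend has to produce these label strings. A hedged sketch of a label builder, assuming a simplified filter-state shape (the real filter state is richer; the names here are hypothetical):

```typescript
// Hypothetical, simplified filter-state shape.
interface DiagramFilterState {
  protocols: string[];
  ports: string[];
  hasRisks: boolean;
}

// Produce "Label: value" strings. addNetworkDiagramFilters bolds the part up to
// the first colon and renders colon-less labels as plain single-font rows.
function buildActiveFilterLabels(f: DiagramFilterState): string[] {
  const labels: string[] = [];
  if (f.protocols.length > 0) labels.push(`Protocol: ${f.protocols.join(', ')}`);
  if (f.ports.length > 0) labels.push(`Port: ${f.ports.join(', ')}`);
  if (f.hasRisks) labels.push('Has risks');
  return labels;
}
```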

backend/src/main/java/com/tracepcap/story/service/LlmClient.java

Lines changed: 28 additions & 5 deletions
```diff
@@ -157,10 +157,12 @@ public Integer getModelContextLength() {
    */
   public String generateCompletion(String systemPrompt, String userPrompt) {
     // Pre-flight context-length check — fail immediately without calling the LLM server.
-    // Estimate token count using the ~4 chars/token heuristic (good enough for a guard).
+    // Use ~2 chars/token (conservative) rather than 4 — technical content such as network
+    // logs, JSON, and hex strings tokenises more densely and can easily fall below 3 chars/token.
+    // A tighter estimate means fewer false passes that still fail at the server side.
     // Only fires when modelContextLength is known (LLM_CONTEXT_LENGTH set or auto-detected).
     if (modelContextLength != null) {
-      int estimatedPromptTokens = (systemPrompt.length() + userPrompt.length()) / 4;
+      int estimatedPromptTokens = (systemPrompt.length() + userPrompt.length()) / 2;
       int responseReserve = getEffectiveMaxTokens();
       if (estimatedPromptTokens + responseReserve > modelContextLength) {
         log.warn(
@@ -214,15 +216,36 @@ public String generateCompletion(String systemPrompt, String userPrompt) {
     } catch (LlmException e) {
       throw e;
     } catch (Exception e) {
-      // Detect context-length exceeded (OpenAI-compatible 400 response)
+      // Detect context-length exceeded errors.
+      // Primary: OpenAI-compatible "maximum context length" message with token counts.
+      // Fallback: any HTTP 400 whose message hints at context/token overflow — different
+      // providers (Ollama, LM Studio, vLLM) use varying formats so regex may not match.
       String msg = e.getMessage() != null ? e.getMessage() : "";
-      if (msg.contains("maximum context length")) {
+      boolean isHttp400 = msg.contains("400");
+      boolean looksLikeContextError = msg.contains("maximum context length")
+          || msg.contains("context_length_exceeded")
+          || msg.contains("context window")
+          || msg.contains("token limit");
+      if (looksLikeContextError) {
         int promptTokens = parseGroup(msg, PATTERN_PROMPT_TOKENS);
         int contextTokens = parseGroup(msg, PATTERN_CONTEXT_LENGTH);
         if (contextTokens == 0) contextTokens = parseGroup(msg, PATTERN_CONTEXT_LENGTH_ALT);
+        // If regex couldn't extract counts, use the pre-flight estimate as a best-effort value
+        // so the UI shows something meaningful rather than "0 tokens".
+        if (promptTokens == 0) {
+          promptTokens = (systemPrompt.length() + userPrompt.length()) / 2;
+        }
+        if (contextTokens == 0 && modelContextLength != null) {
+          contextTokens = modelContextLength;
+        }
         throw new ContextLengthExceededException(promptTokens, contextTokens, userPrompt);
       }
-      log.error("Error calling LLM API", e);
+      // Re-classify ambiguous HTTP 400s that don't match the patterns above
+      if (isHttp400) {
+        log.warn("LLM returned HTTP 400 — likely context length or bad request: {}", msg);
+      } else {
+        log.error("Error calling LLM API", e);
+      }
       throw new LlmException("Failed to reach the LLM service: " + e.getMessage(), e);
     }
   }
```
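
Worked numbers make the chars/4 to chars/2 change concrete. The guard's arithmetic, mirrored in TypeScript as a sketch (not project code):

```typescript
// Mirrors the Java pre-flight guard: estimate prompt tokens at ~2 chars/token,
// reserve room for the response, and fail fast if the window cannot fit both.
function preflightFits(promptChars: number, responseReserve: number, contextLength: number): boolean {
  const estimatedPromptTokens = Math.floor(promptChars / 2);
  return estimatedPromptTokens + responseReserve <= contextLength;
}

// A 24,000-char prompt with a 2,048-token response reserve:
//   chars/2: 12,000 + 2,048 = 14,048 -> correctly rejected by an 8,192-token window
//   chars/4:  6,000 + 2,048 =  8,048 -> would have passed 8,192 and failed server-side
preflightFits(24_000, 2_048, 8_192);  // false: throw before any LLM call is made
preflightFits(24_000, 2_048, 16_384); // true
```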

backend/src/main/java/com/tracepcap/story/service/StoryService.java

Lines changed: 28 additions & 2 deletions
```diff
@@ -86,14 +86,40 @@ public StoryResponse generateStory(UUID fileId, String additionalContext, String
     // skip all prompt-building phases and send it directly to the LLM.
     if (customPrompt != null && !customPrompt.isBlank()) {
       log.info("Using user-supplied custom prompt for file: {}", fileId);
+      long totalConvsCustom = conversationRepository.countByFileId(fileId);
       StoryAggregates aggregates = storyAggregatesService.compute(
-          fileId, List.of(), conversationRepository.countByFileId(fileId));
+          fileId, List.of(), totalConvsCustom);
       List<Finding> findings = findingsService.detectAll(
-          fileId, conversationRepository.countByFileId(fileId), analysis.getTotalBytes());
+          fileId, totalConvsCustom, analysis.getTotalBytes());
+
+      // Re-run Phase 1 so investigation steps are preserved in the response.
+      // The custom prompt already encodes the narrative context, but structured
+      // investigation results are attached separately and displayed in the UI.
+      List<InvestigationStep> investigationSteps = List.of();
+      List<TimelineDataDto> timelineBinsCustom = List.of();
+      try {
+        timelineBinsCustom = timelineService.getTimelineData(fileId, 1, 50);
+      } catch (Exception e) {
+        log.warn("Failed to fetch timeline bins for custom prompt retry: {}", e.getMessage());
+      }
+      try {
+        String phase1Json = llmClient.generateCompletion(
+            buildHypothesisSystemPrompt(),
+            buildHypothesisUserPrompt(file, analysis, additionalContext, aggregates, findings,
+                timelineBinsCustom, maxFindings, maxRiskMatrix));
+        var phase1 = parseHypothesesAndQueries(phase1Json);
+        investigationSteps =
+            investigationService.executeQueries(fileId, phase1.queries(), phase1.hypotheses());
+        log.info("Custom prompt retry investigation complete: {} steps", investigationSteps.size());
+      } catch (Exception e) {
+        log.warn("Investigation phase skipped during custom prompt retry: {}", e.getMessage());
+      }
+
       String storyContent = llmClient.generateCompletion(buildSystemPrompt(), customPrompt);
       StoryResponse storyResponse = parseStoryContent(storyContent, storyId, fileId);
       storyResponse.setAggregates(aggregates);
       storyResponse.setFindings(findings);
+      storyResponse.setInvestigationSteps(investigationSteps.isEmpty() ? null : investigationSteps);
       StoryEntity story = StoryEntity.builder()
           .id(storyId).fileId(fileId).generatedAt(generatedAt)
           .status(StoryEntity.StoryStatus.COMPLETED)
```

frontend/src/features/network/hooks/useCompareData.ts

Lines changed: 38 additions & 9 deletions
```diff
@@ -1,6 +1,6 @@
-import { useState, useEffect } from 'react';
+import { useState, useEffect, useMemo } from 'react';
 import { conversationService } from '@/features/conversation/services/conversationService';
-import { networkService } from '../services/networkService';
+import { networkService, selectSignificantNodes } from '../services/networkService';
 import { mergeGraphs } from '../services/mergeGraphs';
 import type { GraphNode, GraphEdge } from '../types';
 
@@ -21,6 +21,8 @@ export interface FileStats {
 export interface UseCompareDataReturn {
   mergedNodes: GraphNode[];
   mergedEdges: GraphEdge[];
+  totalNodes: number;
+  hiddenNodes: number;
   perFileStats: FileStats[];
   labels: string[];
   loading: boolean;
@@ -55,13 +57,18 @@ async function fetchGraphForFile(fileId: string) {
     response.data,
     undefined,
     MAX_CONVERSATIONS,
-    hostClassifications
+    hostClassifications,
+    0 // no per-file limit — significance filtering is applied after merge
   );
 }
 
-export function useCompareData(fileIds: string[], labels: string[]): UseCompareDataReturn {
-  const [mergedNodes, setMergedNodes] = useState<GraphNode[]>([]);
-  const [mergedEdges, setMergedEdges] = useState<GraphEdge[]>([]);
+export function useCompareData(
+  fileIds: string[],
+  labels: string[],
+  nodeLimit: number
+): UseCompareDataReturn {
+  const [allMergedNodes, setAllMergedNodes] = useState<GraphNode[]>([]);
+  const [allMergedEdges, setAllMergedEdges] = useState<GraphEdge[]>([]);
   const [perFileStats, setPerFileStats] = useState<FileStats[]>([]);
   const [loading, setLoading] = useState(true);
   const [error, setError] = useState<string | null>(null);
@@ -87,8 +94,8 @@ export function useCompareData(fileIds: string[], labels: string[]): UseCompareD
 
       const merged = mergeGraphs(graphs, labels);
 
-      setMergedNodes(merged.nodes);
-      setMergedEdges(merged.edges);
+      setAllMergedNodes(merged.nodes);
+      setAllMergedEdges(merged.edges);
       setPerFileStats(
         graphs.map((g, i) => ({
           label: labels[i],
@@ -115,5 +122,27 @@ export function useCompareData(fileIds: string[], labels: string[]): UseCompareD
     // eslint-disable-next-line react-hooks/exhaustive-deps
   }, [fileIdsKey, labelsKey]);
 
-  return { mergedNodes, mergedEdges, perFileStats, labels, loading, error };
+  const { mergedNodes, mergedEdges, hiddenNodes } = useMemo(() => {
+    const { significantNodes, hiddenCount } = selectSignificantNodes(
+      allMergedNodes,
+      allMergedEdges,
+      nodeLimit
+    );
+    const sigIds = new Set(significantNodes.map(n => n.id));
+    const visibleEdges = allMergedEdges.filter(
+      e => sigIds.has(e.source) && sigIds.has(e.target)
+    );
+    return { mergedNodes: significantNodes, mergedEdges: visibleEdges, hiddenNodes: hiddenCount };
+  }, [allMergedNodes, allMergedEdges, nodeLimit]);
+
+  return {
+    mergedNodes,
+    mergedEdges,
+    totalNodes: allMergedNodes.length,
+    hiddenNodes,
+    perFileStats,
+    labels,
+    loading,
+    error,
+  };
 }
```
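
Call-site wiring for the new nodeLimit parameter might look like the following. The wrapper hook is hypothetical, and 0 is assumed to mean "show all", matching the maxNodes=0 convention in the per-file call above:

```typescript
import { useState } from 'react';
import { useCompareData } from '@/features/network/hooks/useCompareData';

// Hypothetical wrapper, not the actual compare view.
function useCompareTopologyBanner(fileIds: string[], labels: string[]) {
  const [nodeLimit, setNodeLimit] = useState(50); // "Top 50" preset; 0 assumed to mean "All"
  const { mergedNodes, mergedEdges, totalNodes, hiddenNodes, loading } =
    useCompareData(fileIds, labels, nodeLimit);

  // Banner copy; per the commit message it stays rendered even when hiddenNodes === 0.
  const banner = `Showing ${totalNodes - hiddenNodes} of ${totalNodes} nodes`;
  return { mergedNodes, mergedEdges, banner, setNodeLimit, loading };
}
```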

frontend/src/features/network/services/networkService.ts

Lines changed: 1 addition & 1 deletion
```diff
@@ -243,7 +243,7 @@ function finalizeNodeRole(node: GraphNode, srcPort: number, dstPort: number) {
  *
  * Returns the selected nodes and the count of nodes that were hidden.
  */
-function selectSignificantNodes(
+export function selectSignificantNodes(
   nodes: GraphNode[],
   edges: GraphEdge[],
   limit: number
```

frontend/src/features/report/captureNetworkDiagrams.ts

Lines changed: 7 additions & 47 deletions
```diff
@@ -26,17 +26,8 @@
 import { createElement } from 'react';
 import { createRoot } from 'react-dom/client';
 import { toPng } from 'html-to-image';
-import { conversationService } from '@/features/conversation/services/conversationService';
-import { networkService } from '@/features/network/services/networkService';
 import { NetworkGraph } from '@/components/network/NetworkGraph/NetworkGraph';
-import { CONVERSATION_LIMIT_ENABLED } from '@/features/network/hooks/useNetworkData';
 import type { GraphNode, GraphEdge } from '@/features/network/types';
-import type { ConversationFilters } from '@/features/conversation/types';
-import type { AnalysisSummary } from '@/types';
-
-// Match the same conversation cap the Network Diagram page uses so the report
-// shows identical edges. If the env flag disables the limit, capture all.
-const MAX_CONVERSATIONS = CONVERSATION_LIMIT_ENABLED ? 500 : Infinity;
 const CAPTURE_W = 1400;
 const CAPTURE_H = 860;
 
@@ -161,46 +152,15 @@ export interface DiagramImages {
   hierarchical: string; // base64 PNG
 }
 
+/**
+ * Captures both ELK layouts for the given pre-filtered nodes and edges.
+ * The caller is responsible for passing exactly the nodes/edges currently
+ * visible on screen so the report matches what the user sees.
+ */
 export async function captureNetworkDiagrams(
-  fileId: string,
-  analysisSummary?: AnalysisSummary,
-  sessionFilters?: Partial<ConversationFilters>
+  nodes: GraphNode[],
+  edges: GraphEdge[]
 ): Promise<DiagramImages> {
-  const response = await conversationService.getConversations(fileId, {
-    ip: '',
-    port: '',
-    payloadContains: '',
-    protocols: [],
-    l7Protocols: [],
-    apps: [],
-    categories: [],
-    hasRisks: false,
-    fileTypes: [],
-    riskTypes: [],
-    customSignatures: [],
-    deviceTypes: [],
-    countries: [],
-    ...sessionFilters,
-    sortBy: '',
-    sortDir: 'asc',
-    page: 1,
-    pageSize: 10000,
-  });
-
-  let hostClassifications;
-  try {
-    hostClassifications = await conversationService.getHostClassifications(fileId);
-  } catch {
-    /* optional — best effort */
-  }
-
-  const { nodes, edges } = networkService.buildNetworkGraph(
-    response.data,
-    analysisSummary,
-    MAX_CONVERSATIONS,
-    hostClassifications
-  );
-
   // Sequential — both layouts share the module-level ELK singleton inside
   // NetworkGraph.tsx; running them in parallel risks a layout race condition.
   const forceDirected = await captureLayout(nodes, edges, 'forceDirected2d');
```
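
The new signature moves data fetching to the caller. A sketch of the report-generation call site implied by the commit message, where the service wiring and variable names are assumptions:

```typescript
import { captureNetworkDiagrams } from '@/features/report/captureNetworkDiagrams';
import type { GraphNode, GraphEdge } from '@/features/network/types';

// Hypothetical call site (AnalysisPage-style): visibleNodes/visibleEdges are
// whatever the Network Diagram tab currently renders, e.g. from a session ref.
async function buildReportPayload(
  visibleNodes: GraphNode[],
  visibleEdges: GraphEdge[],
  activeFilterLabels: string[],
) {
  // Field names follow DiagramImages; forceDirected is inferred from the capture code above.
  const { forceDirected, hierarchical } = await captureNetworkDiagrams(visibleNodes, visibleEdges);
  return {
    forceDirectedImage: forceDirected,
    hierarchicalImage: hierarchical,
    activeFilters: activeFilterLabels, // rendered as a dedicated PDF section
  };
}
```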
