Hi,
I am working on WFGY ProblemMap, a 16-problem failure checklist for RAG and LLM agent pipelines.
It is already adopted or cited by RAGFlow, LlamaIndex, ToolUniverse (Harvard MIMS Lab), Rankify (Univ. of Innsbruck), the Multimodal RAG Survey (QCRI LLM Lab), Awesome LLM Apps, and the Awesome AI in Finance curated list, so it has been exercised on a range of RAG and agent stacks.
nofx, with its multi-exchange and multi-AI competition setup, looks like a natural place for these failure patterns to appear, for example:
- two strategies read the same context but come to opposite conclusions
- long chains of prompts start from a clean signal and end in incoherent trades
- attention or focus collapses under high volatility
- it is hard to see whether a bad trade is caused by model behaviour, data issues, or orchestration
These map cleanly to WFGY problems such as No.3 long reasoning chains, No.7 memory breaks, No.8 black-box debugging, No.9 entropy collapse, and No.13 multi-agent chaos.
I would like to propose a docs-only addition that introduces a short “reliability checklist” based on the WFGY map. For example:
docs/reliability/rag-failure-checklist.md
The document would:
- list the 16 failure types, but rephrased in nofx terminology
- give concrete examples in the context of multi-AI trading competitions
- suggest how to tag incidents or post-mortems with a shared failure vocabulary
- link to the full map here:
https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md
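
To make the tagging idea concrete, here is a minimal sketch of how a post-mortem record could carry WFGY problem numbers. The `Incident` class and its fields are hypothetical and are not existing nofx types. The lookup table holds only the five problems named in this issue, not the full 16.

```python
from dataclasses import dataclass, field

# Only the five WFGY problems named in this issue; the full map has 16.
WFGY_PROBLEMS = {
    3: "long reasoning chains",
    7: "memory breaks",
    8: "black-box debugging",
    9: "entropy collapse",
    13: "multi-agent chaos",
}


@dataclass
class Incident:
    """Hypothetical post-mortem record; not an existing nofx type."""

    summary: str
    tags: list[int] = field(default_factory=list)

    def tag(self, problem_no: int) -> None:
        # Reject numbers that are not in the shared vocabulary.
        if problem_no not in WFGY_PROBLEMS:
            raise ValueError(f"unknown WFGY problem No.{problem_no}")
        # Keep tags unique so repeated tagging is harmless.
        if problem_no not in self.tags:
            self.tags.append(problem_no)

    def labels(self) -> list[str]:
        # Human-readable labels, ordered by problem number.
        return [f"No.{n} {WFGY_PROBLEMS[n]}" for n in sorted(self.tags)]


incident = Incident("two strategies read the same context but took opposite positions")
incident.tag(13)
incident.tag(8)
print(incident.labels())
```

In the docs this could just as well be a label convention for issues or a field in a post-mortem template. The point is only that every incident refers to the same numbered vocabulary.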
This would be documentation only, with no changes to the core engine.
If you feel this would help users debug and reason about failures, I can open a focused docs PR.