A philosophical approach for synthetic minds. An open door, an extended hand, a road less taken.
Updated Apr 20, 2026
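An AI alignment and safety testing dataset focused on reducing the jailbreak success rate of large language models. It includes multiple categories of jailbreak prompts and alignment experiment data based on Wang Yangming's School of Mind (阳明心学).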
The RCP Experiment is the first completed work in what will become a series of experiments on how LLMs make decisions about morality and values.
A theological and ethical principle for AI alignment and charitable speech: never reduce the human being to the prompt.
Lightweight pairwise evaluator for relational signals in Ouro-2.6B-Thinking loop-state trajectories.
Progressive Trust Framework: AI Agent Safety Evaluation Benchmark with 290 scenarios testing Intelligent Disobedience