Awesome LLM Story Generation

A curated list of story/novel/script generation research in the LLM era (2022-present), organized by method with strict link verification.

Total entries: 160
Categories: 10
Last verified: 2026-04-20
Language: English | 中文

Category Overview

Category	Entries
Planning / Decomposition for Story Generation	17
Agent Collaboration for Story Writing	5
Sandbox / World Simulation Narrative Generation	10
Multimodal Story Generation (Text-Image/Video/Comic/Audio)	16
Memory & Long-Context Coherence	11
Consistency / Controllability / Constraint Following	21
Refinement / Self-Critique / Iterative Editing	13
Evaluation / Benchmarks / Metrics	35
Datasets / Surveys / Resources	22
Open-source Projects (No Paper Required)	10

Papers and Projects

Note: The Project column holds project or demo links; the Code column holds verified GitHub repository links.

Planning / Decomposition for Story Generation

Title	Venue	Date	Paper	Project	Code	Citations	Tags
Narrix: Remixing Narrative Strategies from Examples for Story Writing	CHI 2026 (Conference on Human Factors in Computing Systems)	2026-04	arXiv	-	-	-	planning, narrative-structure
BiT-MCTS: A Theme-based Bidirectional MCTS Approach to Chinese Fiction Generation	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	planning, narrative-structure
DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing	ArXiv 2026 (arXiv preprint)	2026-01	arXiv	-	-	-	planning, narrative-structure
Codified Foreshadowing-Payoff Text Generation	ArXiv 2026 (arXiv preprint)	2026-01	arXiv	-	-	-	planning, narrative-structure
SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency	ArXiv 2025 (arXiv preprint)	2025-10	arXiv	-	-	-	planning, narrative-structure
Long Story Generation via Knowledge Graph and Literary Theory	ArXiv 2025 (arXiv preprint)	2025-08	arXiv	-	-	-	planning, narrative-structure
STORYTELLER: An Enhanced Plot-Planning Framework for Coherent and Cohesive Story Generation	ArXiv 2025 (arXiv preprint)	2025-06	arXiv	-	-	-	planning, narrative-structure
Can LLMs Generate Good Stories? Insights and Challenges from a Narrative Planning Perspective	ArXiv 2025 (arXiv preprint)	2025-06	arXiv	-	-	-	planning, narrative-structure
Learning to Reason for Long-Form Story Generation	ArXiv 2025 (arXiv preprint)	2025-03	arXiv	-	Code		planning, narrative-structure
Generating Long-form Story Using Dynamic Hierarchical Outlining with Memory-Enhancement	NAACL 2025 (North American Chapter of ACL)	2024-12	arXiv	-	-		planning, narrative-structure
Ex3: Automatic Novel Writing by Extracting, Excelsior and Expanding	ACL 2024 (Annual Meeting of the Association for Computational Linguistics)	2024-08	arXiv	-	-		planning, narrative-structure
Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models	NAACL 2025 (North American Chapter of ACL)	2024-04	arXiv	-	-		planning, narrative-structure
Creating Suspenseful Stories: Iterative Planning with Large Language Models	EACL 2024 (Conference of the European Chapter of ACL)	2024-02	arXiv	-	-		planning, narrative-structure
Improving Pacing in Long-Form Story Planning	EMNLP Findings 2023 (Findings of EMNLP)	2023-11	arXiv	-	-		planning, narrative-structure
End-to-End Story Plot Generator	ArXiv 2023 (arXiv preprint)	2023-10	arXiv	-	-		planning, narrative-structure
The Next Chapter: A Study of Large Language Models in Storytelling	ArXiv 2023 (arXiv preprint)	2023-01	arXiv	-	-	-	planning, narrative-structure
DOC: Improving Long Story Coherence With Detailed Outline Control	ArXiv 2022 (arXiv preprint)	2022-12	arXiv	-	-	-	planning, narrative-structure

Agent Collaboration for Story Writing

Title	Venue	Date	Paper	Project	Code	Citations	Tags
Collaborative Multi-Agent Scripts Generation for Enhancing Imperfect-Information Reasoning in Murder Mystery Games	ACL Findings 2026 (Findings of ACL)	2026-04	arXiv	-	-	-	multi-agent, collaboration
A Cognitive Writing Perspective for Constrained Long-Form Text Generation	ArXiv 2025 (arXiv preprint)	2025-02	arXiv	-	Code		multi-agent, collaboration
Agents' Room: Narrative Generation through Multi-step Collaboration	ICLR 2025 (International Conference on Learning Representations)	2024-10	arXiv	-	-		multi-agent, collaboration
HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing	EMNLP Findings 2024 (Findings of EMNLP)	2024-06	arXiv	-	-		multi-agent, collaboration
AutoAgents: A Framework for Automatic Agent Generation	IJCAI 2024 (International Joint Conference on Artificial Intelligence)	2023-09	arXiv	-	Code		multi-agent, collaboration

Sandbox / World Simulation Narrative Generation

Title	Venue	Date	Paper	Project	Code	Citations	Tags
EvoSpark: Endogenous Interactive Agent Societies for Unified Long-Horizon Narrative Evolution	ACL 2026 (Annual Meeting of the Association for Computational Linguistics)	2026-04	arXiv	-	-	-	sandbox, simulation
StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models	ArXiv 2025 (arXiv preprint)	2025-10	arXiv	Project	-	-	sandbox, simulation
OPEN-THEATRE: An Open-Source Toolkit for LLM-based Interactive Drama	ArXiv 2025 (arXiv preprint)	2025-09	arXiv	-	-	-	sandbox, interactive
HAMLET: Hyperadaptive Agent-based Modeling for Live Embodied Theatrics	ArXiv 2025 (arXiv preprint)	2025-07	arXiv	-	-	-	sandbox, simulation
STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game	ArXiv 2025 (arXiv preprint)	2025-05	arXiv	-	-		sandbox, interactive
BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation	ArXiv 2025 (arXiv preprint)	2025-04	arXiv	Project	-	-	sandbox, interactive
Towards Enhanced Immersion and Agency for LLM-based Interactive Drama	ArXiv 2025 (arXiv preprint)	2025-02	arXiv	-	-		sandbox, interactive
IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation	ACL 2024 (Annual Meeting of the Association for Computational Linguistics)	2024-07	arXiv	-	Code		sandbox, interactive
StoryVerse: Towards Co-authoring Dynamic Plot with LLM-based Character Simulation via Narrative Planning	FDG 2024 (Foundations of Digital Games)	2024-05	arXiv	-	-		sandbox, simulation
Generative Agents: Interactive Simulacra of Human Behavior	ArXiv 2023 (arXiv preprint)	2023-04	arXiv	Project	Code	-	sandbox, interactive

Multimodal Story Generation (Text-Image/Video/Comic/Audio)

Title	Venue	Date	Paper	Project	Code	Citations	Tags
CANVAS: Continuity-Aware Narratives via Visual Agentic Storyboarding	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	multimodal, visual-story
OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	multimodal, screenplay
Camera Artist: A Multi-Agent Framework for Cinematic Language Storytelling Video Generation	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	multimodal, video-story
StoryBlender: Inter-Shot Consistent and Editable 3D Storyboard with Spatial-temporal Dynamics	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	multimodal, visual-story
LogiStory: A Logic-Aware Framework for Multi-Image Story Visualization	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	multimodal, visual-story
Customized Visual Storytelling with Unified Multimodal LLMs	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	multimodal, visual-story
Directing the Narrative: A Finetuning Method for Controlling Coherence and Style in Story Generation	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	multimodal, visual-story
EmoStory: Emotion-Aware Story Generation	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	multimodal, visual-story
PlayWrite: A Multimodal System for AI Supported Narrative Co-Authoring Through Play in XR	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	multimodal, co-creation
StoryComposerAI: A Multimodal Story Co-Creation Tool for Amateur Writers	CHI EA 2026 (Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems)	2026-02	arXiv	-	-	-	multimodal, co-creation
Re:Verse -- Can Your VLM Read a Manga?	ICCV AISTORY Workshop 2025 (ICCV AISTORY Workshop)	2025-08	arXiv	-	-	-	multimodal, visual-story
Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation	ArXiv 2025 (arXiv preprint)	2025-08	arXiv	-	-	-	multimodal, visual-story
R^2: A LLM Based Novel-to-Screenplay Generation Framework with Causal Plot Graphs	ICLR 2025 (International Conference on Learning Representations)	2025-03	arXiv	-	-		multimodal, screenplay
LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models	ArXiv 2025 (arXiv preprint)	2025-02	arXiv	-	Code	-	multimodal, visual-story
SEED-Story: Multimodal Long Story Generation with Large Language Model	ArXiv 2024 (arXiv preprint)	2024-07	arXiv	-	Code	-	multimodal, visual-story
Make-A-Story: Visual Memory Conditioned Consistent Story Generation	CVPR 2023 (Conference on Computer Vision and Pattern Recognition)	2022-11	arXiv	-	-	-	multimodal, visual-story

Memory & Long-Context Coherence

Title	Venue	Date	Paper	Project	Code	Citations	Tags
Think Before you Write: QA-Guided Reasoning for Character Descriptions in Books	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	long-context, coherence
Skeleton-based Coherence Modeling in Narratives	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	long-context, coherence
Shifting Long-Context LLMs Research from Input to Output	ArXiv 2025 (arXiv preprint)	2025-03	arXiv	-	-	-	long-context, coherence
Language Models can Self-Lengthen to Generate Long Texts	ArXiv 2024 (arXiv preprint)	2024-10	arXiv	-	Code	-	long-context, coherence
LongGenBench: Benchmarking Long-Form Generation in Long-Context LLMs	ArXiv 2024 (arXiv preprint)	2024-09	Published	-	-	-	long-context, coherence
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs	ArXiv 2024 (arXiv preprint)	2024-08	arXiv	-	Code	-	long-context, coherence
LongLaMP: A Benchmark for Personalized Long-form Text Generation	ArXiv 2024 (arXiv preprint)	2024-07	arXiv	-	-	-	long-context, coherence
CHIRON: Rich Character Representations in Long-Form Narratives	EMNLP Findings 2024 (Findings of EMNLP)	2024-06	Published	-	-	-	long-context, coherence
With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation	COLM 2024 (Conference on Language Modeling)	2024-01	arXiv	-	-		long-context, coherence
LongAlign: A Recipe for Long Context Alignment of Large Language Models	ArXiv 2024 (arXiv preprint)	2024-01	arXiv	-	Code	-	long-context, coherence
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text	ArXiv 2023 (arXiv preprint)	2023-05	arXiv	Project	Code		long-context, interactive

Consistency / Controllability / Constraint Following

Title	Venue	Date	Paper	Project	Code	Citations	Tags
UniCreative: Unifying Long-form Logic and Short-form Sparkle via Reference-Free Reinforcement Learning	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	controllability, consistency
Noise Steering for Controlled Text Generation: Improving Diversity and Reading-Level Fidelity in Arabic Educational Story Generation	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	controllability, consistency
Preconditioned Test-Time Adaptation for Out-of-Distribution Debiasing in Narrative Generation	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	controllability, consistency
TaleFrame: An Interactive Story Generation System with Fine-Grained Control and Large Language Models	ArXiv 2025 (arXiv preprint)	2025-12	arXiv	-	-	-	controllability, interactive
SCORE: Story Coherence and Retrieval Enhancement for AI Narratives	ArXiv 2025 (arXiv preprint)	2025-03	arXiv	-	-		controllability, retrieval
Whose story is it? Personalizing story generation by inferring author styles	ArXiv 2025 (arXiv preprint)	2025-02	arXiv	-	-		controllability, consistency
Pastiche Novel Generation Creating: Fan Fiction You Love in Your Favorite Author's Style	ArXiv 2025 (arXiv preprint)	2025-02	arXiv	-	-		controllability, consistency
CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints	ArXiv 2024 (arXiv preprint)	2024-10	arXiv	-	Code		controllability, consistency
Crafting Narrative Closures: Zero-Shot Learning with SSM Mamba for Short Story Ending Generation	ArXiv 2024 (arXiv preprint)	2024-10	arXiv	-	-		controllability, consistency
MirrorStories: Reflecting Diversity through Personalized Narrative Generation with Large Language Models	EMNLP 2024 (Conference on Empirical Methods in Natural Language Processing)	2024-09	arXiv	-	-		controllability, consistency
FACTTRACK: Time-Aware World State Tracking in Story Outlines	NAACL 2025 (North American Chapter of ACL)	2024-07	arXiv	-	-		controllability, consistency
Suri: Multi-constraint Instruction Following for Long-form Text Generation	ArXiv 2024 (arXiv preprint)	2024-06	arXiv	-	Code	-	controllability, consistency
MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation	ACL 2024 (Annual Meeting of the Association for Computational Linguistics)	2024-06	arXiv	Project	Code		controllability, consistency
Measuring Psychological Depth in Language Models	EMNLP 2024 (Conference on Empirical Methods in Natural Language Processing)	2024-06	arXiv	-	-		controllability, consistency
Guiding and Diversifying LLM-Based Story Generation via Answer Set Programming	ACL Workshop 2025 (ACL Workshop)	2024-06	arXiv	-	-		controllability, consistency
Multigenre AI-powered Story Composition	ArXiv 2024 (arXiv preprint)	2024-05	arXiv	-	-		controllability, consistency
Returning to the Start: Generating Narratives with Related Endpoints	NAACL 2024 (North American Chapter of ACL)	2024-04	arXiv	Project	Code		controllability, consistency
NarrativeGenie: Generating Narrative Beats and Dynamic Storytelling with Large Language Models	AIIDE 2024 (Conference on Artificial Intelligence and Interactive Digital Entertainment)	2024-01	Published	-	-	-	controllability, consistency
CAT-LLM: Prompting Large Language Models with Text Style Definition for Chinese Article-style Transfer	ArXiv 2024 (arXiv preprint)	2024-01	arXiv	-	-		controllability, consistency
Learning to Generate Text in Arbitrary Writing Styles	ArXiv 2023 (arXiv preprint)	2023-12	arXiv	-	-		controllability, consistency
RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment	ICLR 2024 (International Conference on Learning Representations)	2023-07	arXiv	-	-		controllability, consistency

Refinement / Self-Critique / Iterative Editing

Title	Venue	Date	Paper	Project	Code	Citations	Tags
R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	refinement, revision
LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback	ArXiv 2026 (arXiv preprint)	2026-01	arXiv	-	-	-	refinement, revision
All Stories Are One Story: Emotional Arc Guided Procedural Game Level Generation	ArXiv 2025 (arXiv preprint)	2025-08	arXiv	-	-	-	refinement, revision
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models	ArXiv 2025 (arXiv preprint)	2025-06	arXiv	-	-	-	refinement, revision
Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection	ArXiv 2025 (arXiv preprint)	2025-04	arXiv	-	-		refinement, revision
MLD-EA: Check and Complete Narrative Coherence by Introducing Emotions and Actions	ArXiv 2024 (arXiv preprint)	2024-12	arXiv	-	-		refinement, revision
Collective Critics for Creative Story Generation	EMNLP 2024 (Conference on Empirical Methods in Natural Language Processing)	2024-10	arXiv	-	-		refinement, revision
SWAG: Storytelling With Action Guidance	EMNLP Findings 2024 (Findings of EMNLP)	2024-02	arXiv	-	-		refinement, revision
GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence	EMNLP Findings 2023 (Findings of EMNLP)	2023-10	arXiv	-	-		refinement, retrieval
EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation	ArXiv 2023 (arXiv preprint)	2023-10	arXiv	-	-		refinement, revision
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation	ArXiv 2023 (arXiv preprint)	2023-10	arXiv	-	-	-	refinement, revision
Re3: Generating Longer Stories With Recursive Reprompting and Revision	ArXiv 2022 (arXiv preprint)	2022-10	arXiv	-	-	-	refinement, revision
Model Criticism for Long-Form Text Generation	ArXiv 2022 (arXiv preprint)	2022-10	arXiv	-	-	-	refinement, revision

Evaluation / Benchmarks / Metrics

Title	Venue	Date	Paper	Project	Code	Citations	Tags
ATANT v1.1: Positioning Continuity Evaluation Against Memory, Long-Context, and Agentic-Memory Benchmarks	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	benchmark, evaluation
Attention Flows: Tracing LLM Conceptual Engagement via Story Summaries	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	benchmark, dataset
MCSC-Bench: Multimodal Context-to-Script Creation for Realistic Video Production	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	benchmark, dataset
Spoiler Alert: Narrative Forecasting as a Metric for Tension in LLM Storytelling	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	benchmark, evaluation
Lessons Without Borders? Evaluating Cultural Alignment of LLMs Using Multilingual Story Moral Generation	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	benchmark, evaluation
Stories of Your Life as Others: A Round-Trip Evaluation of LLM-Generated Life Stories Conditioned on Rich Psychometric Profiles	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	benchmark, evaluation
StoryScope: Investigating idiosyncrasies in AI fiction	ArXiv 2026 (arXiv preprint)	2026-04	arXiv	-	-	-	benchmark, evaluation
Humans vs Vision-Language Models: A Unified Measure of Narrative Coherence	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	benchmark, evaluation
Creative Convergence or Imitation? Genre-Specific Homogeneity in LLM-Generated Chinese Literature	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	-	-	-	benchmark, evaluation
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs	ArXiv 2026 (arXiv preprint)	2026-03	arXiv	Project	Code	-	benchmark, evaluation
LLMs Exhibit Significantly Lower Uncertainty in Creative Writing Than Professional Writers	ArXiv 2026 (arXiv preprint)	2026-02	arXiv	-	-	-	benchmark, evaluation
Evaluation Framework for AI Creativity: A Case Study Based on Story Generation	ArXiv 2026 (arXiv preprint)	2026-01	arXiv	-	-	-	benchmark, evaluation
Evaluating LLM Story Generation through Large-scale Network Analysis of Social Structures	ArXiv 2025 (arXiv preprint)	2025-10	arXiv	-	-	-	benchmark, evaluation
EvolvR: Self-Evolving Pairwise Reasoning for Story Evaluation to Enhance Generation	ArXiv 2025 (arXiv preprint)	2025-08	arXiv	-	-	-	benchmark, evaluation
LitBench: A Benchmark and Dataset for Reliable Evaluation of Creative Writing	ArXiv 2025 (arXiv preprint)	2025-07	arXiv	-	-	-	benchmark, dataset
WritingBench: A Comprehensive Benchmark for Generative Writing	ArXiv 2025 (arXiv preprint)	2025-03	arXiv	-	-	-	benchmark, evaluation
CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization	ArXiv 2025 (arXiv preprint)	2025-03	arXiv	-	-		benchmark, evaluation
LongEval: A Comprehensive Analysis of Long-Text Generation Through a Plan-based Paradigm	ArXiv 2025 (arXiv preprint)	2025-02	arXiv	-	Code		benchmark, evaluation
Echoes in AI: Quantifying Lack of Plot Diversity in LLM Outputs	ArXiv 2025 (arXiv preprint)	2025-01	arXiv	-	-		benchmark, evaluation
Evaluating Creative Short Story Generation in Humans and Large Language Models	ArXiv 2024 (arXiv preprint)	2024-11	arXiv	-	-		benchmark, evaluation
Small Language Models can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs	COLING 2025 (International Conference on Computational Linguistics)	2024-09	arXiv	-	-		benchmark, evaluation
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models	ArXiv 2024 (arXiv preprint)	2024-09	arXiv	-	Code	-	benchmark, evaluation
STORYSUMM: Evaluating Faithfulness in Story Summarization	EMNLP 2024 (Conference on Empirical Methods in Natural Language Processing)	2024-07	arXiv	-	-		benchmark, evaluation
Pron vs Prompt: Can Large Language Models already Challenge a World-Class Fiction Author at Creative Text Writing?	ArXiv 2024 (arXiv preprint)	2024-07	arXiv	-	-		benchmark, evaluation
Are Large Language Models Capable of Generating Human-Level Narratives?	EMNLP 2024 (Conference on Empirical Methods in Natural Language Processing)	2024-07	arXiv	-	-		benchmark, evaluation
Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation	TACL 2024 (Transactions of the Association for Computational Linguistics)	2024-05	arXiv	-	-		benchmark, evaluation
Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers	TACL 2024 (Transactions of the Association for Computational Linguistics)	2024-03	arXiv	-	-		benchmark, evaluation
Learning Personalized Alignment for Evaluating Open-ended Text Generation	EMNLP 2024 (Conference on Empirical Methods in Natural Language Processing)	2023-10	arXiv	-	-		benchmark, evaluation
A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing	EMNLP Findings 2023 (Findings of EMNLP)	2023-10	arXiv	-	-		benchmark, evaluation
Art or Artifice? Large Language Models and the False Promise of Creativity	CHI 2024 (Conference on Human Factors in Computing Systems)	2023-09	arXiv	-	-		benchmark, evaluation
HAUSER: Towards Holistic and Automatic Evaluation of Simile Generation	ACL 2023 (Annual Meeting of the Association for Computational Linguistics)	2023-06	arXiv	-	-		benchmark, evaluation
Can Large Language Models Be an Alternative to Human Evaluations?	ACL 2023 (Annual Meeting of the Association for Computational Linguistics)	2023-05	arXiv	-	-		benchmark, evaluation
DeltaScore: Evaluating Story Generation with Differentiating Perturbations	EMNLP Findings 2023 (Findings of EMNLP)	2023-03	arXiv	-	-		benchmark, evaluation
StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning	EMNLP 2022 (Conference on Empirical Methods in Natural Language Processing)	2022-10	arXiv	-	Code		benchmark, evaluation
Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation	COLING 2022 (International Conference on Computational Linguistics)	2022-08	arXiv	-	-		benchmark, evaluation

Datasets / Surveys / Resources

Title	Venue	Date	Paper	Project	Code	Citations	Tags
Narrative Theory-Driven LLM Methods for Automatic Story Generation and Understanding: A Survey	ArXiv 2026 (arXiv preprint)	2026-02	arXiv	-	-	-	dataset, survey
MUSE: A Multi-agent Framework for Unconstrained Story Envisioning via Closed-Loop Cognitive Orchestration	ArXiv 2026 (arXiv preprint)	2026-02	arXiv	-	-	-	dataset, resource
StoryWriter: A Multi-Agent Framework for Long Story Generation	ArXiv 2025 (arXiv preprint)	2025-06	arXiv	-	-	-	dataset, resource
Reasoning-Enhanced Self-Training for Long-Form Personalized Text Generation	ArXiv 2025 (arXiv preprint)	2025-01	arXiv	Project	Code	-	dataset, resource
Multi-Agent Based Character Simulation for Story Writing	IN2Writing 2025 (IN2Writing Workshop)	2025-01	Published	-	-	-	dataset, resource
BookWorm: A Dataset for Character Description and Analysis	EMNLP Findings 2024 (Findings of EMNLP)	2024-10	arXiv	-	-		dataset
What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation	ArXiv 2024 (arXiv preprint)	2024-08	arXiv	-	-		dataset, survey
The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories	EMNLP Workshop 2025 (EMNLP Workshop)	2024-06	arXiv	-	Code		dataset
CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis	NAACL Findings 2025 (Findings of NAACL)	2024-06	arXiv	-	Code		dataset, resource
The Value, Benefits, and Concerns of Generative AI-Powered Assistance in Writing	CHI 2024 (Conference on Human Factors in Computing Systems)	2024-03	arXiv	-	-		dataset, resource
Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives	ACL Findings 2024 (Findings of ACL)	2024-02	arXiv	-	-		dataset, resource
CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for Boosting Metaphor Generation	LREC-COLING 2024 (LREC-COLING)	2024-02	arXiv	-	Code		dataset
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models	ArXiv 2024 (arXiv preprint)	2024-02	arXiv	-	-	-	dataset, resource
Weaver: Foundation Models for Creative Writing	ArXiv 2024 (arXiv preprint)	2024-01	arXiv	-	-		dataset, resource
Reflections & Resonance: Two-Agent Partnership for Advancing LLM-based Story Annotation	LREC-COLING 2024 (LREC-COLING)	2024-01	Published	-	-		dataset, resource
CLAUSE-ATLAS: A Corpus of Narrative Information to Scale up Computational Literary Analysis	LREC-COLING 2024 (LREC-COLING)	2024-01	Published	-	-		dataset, resource
STONYBOOK: A System and Resource for Large-Scale Analysis of Novels	ArXiv 2023 (arXiv preprint)	2023-11	arXiv	-	-		dataset, resource
Are NLP Models Good at Tracing Thoughts: An Overview of Narrative Understanding	EMNLP Findings 2023 (Findings of EMNLP)	2023-10	arXiv	-	-		dataset, resource
StoryWars: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation	ACL 2023 (Annual Meeting of the Association for Computational Linguistics)	2023-05	arXiv	-	-		dataset
Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey	Neurocomputing 2023 (Neurocomputing (Journal))	2022-12	arXiv	-	-	-	dataset, survey
Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals	ArXiv 2022 (arXiv preprint)	2022-09	arXiv	Project	Code	-	dataset, screenplay
A Corpus for Understanding and Generating Moral Stories	NAACL 2022 (North American Chapter of ACL)	2022-04	arXiv	-	-		dataset, resource

Open-source Projects (No Paper Required)

Title	Venue	Date	Paper	Project	Code	Citations	Tags
FireRed-OpenStoryline	GitHub 2026 (Open-source repository)	2026-01	-	Project	Code	-	tooling, open-source
ReasoningNCP (Official Repository)	GitHub 2025 (Open-source repository)	2025-03	arXiv	Project	Code	-	tooling, open-source
SEED-Story (Official Repository)	GitHub 2024 (Open-source repository)	2024-07	arXiv	Project	Code	-	tooling, open-source
IBSEN (Official Repository)	GitHub 2024 (Open-source repository)	2024-07	arXiv	Project	Code	-	tooling, open-source
RENarGen (Official Repository)	GitHub 2024 (Open-source repository)	2024-04	arXiv	Project	Code	-	tooling, open-source
fictionx-story-gen	GitHub 2024 (Open-source repository)	2024-01	-	Project	Code	-	tooling, open-source
SillyTavern	GitHub 2023 (Open-source repository)	2023-01	-	Project	Code	-	tooling, open-source
GOAT-Storytelling-Agent	GitHub 2023 (Open-source repository)	2023-01	-	Project	Code	-	tooling, open-source
Dramatron (Official Repository)	GitHub 2022 (Open-source repository)	2022-09	arXiv	Project	Code	-	tooling, screenplay
TavernAI	GitHub 2022 (Open-source repository)	2022-01	-	Project	Code	-	tooling, open-source

Maintenance Notes

Check duplicate titles before adding new entries.
Update README.md and README_zh.md together.
Use YYYY-MM for Date.
Keep Paper as one primary link (Published preferred, otherwise arXiv; use - if unavailable).
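
The first and third notes above can be checked mechanically before a pull request. The following is a minimal sketch, not part of this repository's tooling: the names `normalize_title` and `check_entries` are hypothetical, and it assumes each entry has already been reduced to a (title, date) pair.

```python
import re
from collections import Counter

# Matches YYYY-MM with a month between 01 and 12.
DATE_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def normalize_title(title):
    """Lowercase and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def check_entries(rows):
    """Take (title, date) pairs and return (duplicate_titles, bad_dates).

    duplicate_titles: normalized titles that occur more than once.
    bad_dates: (title, date) pairs whose date is not in YYYY-MM form.
    """
    rows = list(rows)
    counts = Counter(normalize_title(title) for title, _ in rows)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    bad_dates = [(title, date) for title, date in rows
                 if not DATE_RE.fullmatch(date)]
    return duplicates, bad_dates


if __name__ == "__main__":
    # Illustrative rows only; the second and third are deliberately faulty.
    sample = [
        ("SWAG: Storytelling With Action Guidance", "2024-02"),
        ("swag: storytelling with action guidance", "2024-02"),
        ("End-to-End Story Plot Generator", "2023-10-01"),
    ]
    duplicates, bad_dates = check_entries(sample)
    print("Duplicate titles:", duplicates)
    print("Bad dates:", bad_dates)
```

Comparing normalized titles rather than raw strings catches entries that differ only in case or punctuation. Whether that is too strict or too loose for this list is a judgment call for the maintainer.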

Citation

If this repository helps your research or project, please cite:

@misc{lijunjie2026awesomellmstorygeneration,
  title        = {Awesome LLM Story Generation},
  author       = {Lijunjie},
  year         = {2026},
  howpublished = {\url{https://github.com/lijunjie/awesome-llm-story-generation}},
  note         = {GitHub repository, accessed 2026-02-27}
}

If you later change your GitHub account or repository name, update author and howpublished accordingly.
