tduyduc/myai-rag-notebook

"MyAI" RAG Notebook

A demonstration notebook for Retrieval-Augmented Generation (RAG).

Features

What's Included

  • MIT-licensed project for learning and reuse in other projects
  • TypeScript 7 Native Preview for faster type checking and builds
  • NestJS back-end powered by Fastify
  • Swagger API docs available at /api
  • Global request validation via class-validator and NestJS ValidationPipe
  • Memo ingestion
  • Memo retrieval and querying, including SSE streaming
  • CLI commands for memo and query workflows, with streamed and non-streamed modes
  • Category-aware memo filtering during retrieval
  • Ollama-backed response generation and embeddings
  • LanceDB-backed persistent vector storage and similarity search
  • Automatic LanceDB table and schema initialization on startup
  • Abstract class-based dependency injection contracts
  • Optional operation timing and profiling (enabled via environment variable)
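The "abstract class-based dependency injection contracts" item can be pictured with a minimal sketch. An abstract class, unlike an interface, still exists at runtime, so it can serve as both the type and the injection token. All names below (EmbeddingService, FakeEmbeddingService, MemoIngestor) are hypothetical and are not taken from this repository. The wiring is done by hand instead of through the NestJS container, and the method is synchronous to keep the example short.

```typescript
/** Contract that consumers depend on. Hypothetical name, not from this repository. */
abstract class EmbeddingService {
  /** Converts text to a vector. A real implementation would likely be asynchronous. */
  abstract embed(text: string): readonly number[];
}

/** Deterministic stand-in implementation, for illustration only. */
class FakeEmbeddingService extends EmbeddingService {
  embed(text: string): readonly number[] {
    // Two toy "features": character count and word count.
    return [text.length, text.split(' ').length];
  }
}

/** Consumer that only knows about the abstract contract. */
class MemoIngestor {
  private readonly embeddingService: EmbeddingService;

  constructor(embeddingService: EmbeddingService) {
    this.embeddingService = embeddingService;
  }

  /** Embeds the memo text and returns the resulting vector. */
  ingest(text: string): readonly number[] {
    return this.embeddingService.embed(text);
  }
}

// Manual wiring; a DI container would resolve EmbeddingService to a concrete class instead.
const memoIngestor = new MemoIngestor(new FakeEmbeddingService());
const memoVector = memoIngestor.ingest('I love NestJS!');
```

Swapping the concrete class (for example, for a test double) then requires no change to MemoIngestor.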

What's Not Included

  • Memo listing (planned)
  • Memo deletion (planned)
  • Authentication
  • Rate limiting
  • Unit tests
  • Robust error handling and retry strategies (currently minimal)
  • Text chunking for long-document ingestion

Installation

Node.js 20 or later is required to run this project. The latest LTS version is recommended.

# Copy environment variable files
cp --update=none .env.example .env
cp --update=none .env.shared.example .env.shared

# Install packages
npm ci

Getting Started

# Start the background services with Docker Compose
docker compose up

To verify that Ollama is working, assuming that you've pulled the gemma3:270m model, run the following command:

curl http://localhost:11434/api/generate -X POST --data '{ "model": "gemma3:270m", "prompt": "Hello world!", "stream": false }'

To start the NestJS server, run:

# Start NestJS server only
npm run start

# Start NestJS server and watch file changes
npm run start:dev

Assuming that the NestJS application runs on port 3000, open http://localhost:3000/api to view the Swagger docs.

CLI Usage

To ingest a memo, use the memo command. Categories are optional.

# Memo without categories
node myai memo 'I love NestJS!'
# Memo with categories
node myai memo 'I love NestJS!' --category programming --category coding

To query memos, use the query command. Category filtering is optional. Responses are streamed by default.

# Query without category
node myai query 'What do I love?'
# Query with category, explicit streaming flag
node myai query 'What do I love?' --category programming --stream
# Query, no streaming
node myai query 'What do I love?' --no-stream

Contributions Welcome

Contributions are welcome and appreciated.

If you find a bug, have an idea, or want to improve the project, please open an issue first so we can discuss the best approach. Clear issue reports with reproduction steps, expected behavior, and actual behavior are especially helpful.

Pull requests are encouraged for bug fixes, improvements, and new features. Please keep PRs focused and include:

  • A clear summary of what changed and why
  • Related issue link (if applicable)
  • Notes on testing performed

Before opening a PR, please make sure your changes follow the coding guidelines in this repository and do not introduce unrelated refactors.

Thanks for helping make this project better.

Coding Guidelines

Most of the guidelines here are based on the Google TypeScript Style Guide and the TypeScript Coding Guidelines, with a few project-specific adjustments. If a rule is not listed here, follow those two references.

Names

  • Use PascalCase for type names.
  • Don't use an I prefix for interface names.
  • Use PascalCase for enums and SCREAMING_SNAKE_CASE for enum keys. Enum values are at the developer's discretion (preferably camelCase).
  • Names must be descriptive.
  • Use whole words in names when possible.
  • Treat abbreviations as whole words, e.g. XmlHttpRequest instead of XMLHTTPRequest.
  • Use the private keyword for private properties and methods when possible. Avoid prefixes like _.
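A short sketch of several of the naming rules above applied together. The identifiers (MemoSummary, HttpRequestCounter, formatMemoSummary) are made up for illustration and do not come from this repository. The enum rule is not shown here.

```typescript
/** Interface name in PascalCase, without an I prefix. */
interface MemoSummary {
  readonly text: string;
  readonly categoryNames: readonly string[];
}

/** Abbreviation treated as a whole word: Http, not HTTP. */
class HttpRequestCounter {
  // The private keyword is used instead of a _requestCount prefix.
  private requestCount = 0;

  /** Records one request and returns the running total. */
  recordRequest(): number {
    this.requestCount += 1;
    return this.requestCount;
  }
}

/** Descriptive, whole-word names: formatMemoSummary rather than fmtMemo. */
function formatMemoSummary(memoSummary: MemoSummary): string {
  const categoryList = memoSummary.categoryNames.join(', ');
  return `${memoSummary.text} [${categoryList}]`;
}

const httpRequestCounter = new HttpRequestCounter();
httpRequestCounter.recordRequest();
const secondRequestTotal = httpRequestCounter.recordRequest();
const formattedMemo = formatMemoSummary({
  text: 'I love NestJS!',
  categoryNames: ['programming', 'coding'],
});
```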

Source Code Structure

  • Don't export types or functions unless they need to be shared across multiple components.
  • Limit symbol visibility as much as possible (i.e. internal methods should be marked as private).
  • Use named exports. Avoid default exports.
  • Use JSDoc-style comments for functions, interfaces, enums, and classes.
  • Don't create container classes for static members. Use TypeScript namespaces, or export individual members instead.

Types

  • Prefer undefined. Avoid null unless undefined and null intentionally carry different meanings.
  • Use Map to store mappings with keys from user input. Don't use object literals {} for this purpose, because user-supplied keys can collide with inherited properties such as constructor or __proto__.
  • Prefer explicit checks over implicit Boolean coercion, e.g. prefer if (!isDefined(value)) over if (!value), so that falsy values such as false and 0 are handled correctly.
  • Prefer the nullish coalescing operator ?? over the logical OR operator || for assigning default values.
  • Use Number() for numeric coercion. Don't use the unary plus operator +, which is easy to miss.
  • Represent Boolean parameters as an options object or an enum, except for obvious cases such as this.setActive(true).
  • Prefer interfaces over type aliases.
  • Avoid the any type.
  • Prefer readonly T[] for array values (especially function parameters) to mark the array as immutable during use.
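The following sketch illustrates a few of the type rules above. The isDefined helper is named in the guidelines but not shown there, so the implementation below is an assumption. The other names (RetryOptions, resolveRetryCount, countByCategory, parsePort) are hypothetical.

```typescript
/** Assumed implementation of isDefined: true unless the value is undefined or null. */
function isDefined<T>(value: T | undefined | null): value is T {
  return value !== undefined && value !== null;
}

/** Options object with an optional numeric setting. */
interface RetryOptions {
  readonly retryCount?: number;
}

/** ?? keeps an explicit 0, whereas || would replace it with the default. */
function resolveRetryCount(options: RetryOptions): number {
  return options.retryCount ?? 3;
}

/** Map handles user-supplied keys such as "constructor" safely; readonly marks the input as not mutated. */
function countByCategory(categories: readonly string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const category of categories) {
    counts.set(category, (counts.get(category) ?? 0) + 1);
  }
  return counts;
}

/** Number() makes the coercion visible, unlike a leading unary plus. */
function parsePort(rawPort: string): number {
  return Number(rawPort);
}

const zeroIsDefined = isDefined(0);
const explicitZeroRetries = resolveRetryCount({ retryCount: 0 });
const defaultRetries = resolveRetryCount({});
const categoryCounts = countByCategory(['coding', 'constructor', 'coding']);
const port = parsePort('3000');
```

Here zeroIsDefined is true and explicitZeroRetries is 0, which are the two cases where if (!value) and || would silently do the wrong thing.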

Style
