serving-infrastructure

Here are 3 public repositories matching this topic...

mstar-project / mstar

A high-performance, universal serving framework for any-to-any models.

distributed-systems robotics inference high-performance-computing diffusion multimodal world-models vision-language-models serving-infrastructure speech-language-model agentic-infrastructure vision-language-action-model unified-multimodal-models

Updated Jun 24, 2026
Python

ksm26 / Efficiently-Serving-LLMs

Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adaptation (LoRA), and gain hands-on experience with LoRAX, Predibase’s inference server framework.

text-generation batch-processing server-optimization model-serving model-acceleration inference-optimization optimization-techniques machine-learning-operations deep-learning-techniques model-inference-service performance-enhancement scalability-strategies serving-infrastructure large-scale-deployment

Updated Apr 12, 2024
Jupyter Notebook

uw-syfi / TraceLab

An open toolkit and public dataset hub for collecting, sanitizing, analyzing, and visualizing coding agent traces.

trace serving-infrastructure coding-agent

Updated Jun 23, 2026
Python
