# NVIDIA Dynamo Documentation
## Docs
- [Quickstart](https://docs.dynamo.nvidia.com/dynamo/getting-started/quickstart.md)
- [Support Matrix](https://docs.dynamo.nvidia.com/dynamo/getting-started/support-matrix.md)
- [Feature Matrix](https://docs.dynamo.nvidia.com/dynamo/getting-started/feature-matrix.md)
- [Release Artifacts](https://docs.dynamo.nvidia.com/dynamo/getting-started/release-artifacts.md)
- [Examples](https://docs.dynamo.nvidia.com/dynamo/getting-started/examples.md)
- [Deployment Guide](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide.md)
- [Detailed Installation Guide](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/detailed-installation-guide.md)
- [Dynamo Operator](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/dynamo-operator.md)
- [Service Discovery](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/service-discovery.md)
- [Webhooks](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/webhooks.md)
- [Minikube Setup](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/minikube-setup.md)
- [Managing Models with DynamoModel](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/managing-models-with-dynamo-model.md)
- [Autoscaling](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/autoscaling.md)
- [Inference Gateway (GAIE)](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/inference-gateway-gaie.md)
- [Metrics](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/observability-k-8-s/metrics.md)
- [Logging](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/observability-k-8-s/logging.md)
- [Operator Metrics](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/observability-k-8-s/operator-metrics.md)
- [Multinode Deployments](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/multinode/multinode-deployments.md)
- [Grove](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/multinode/grove.md)
- [KV Cache Aware Routing](https://docs.dynamo.nvidia.com/dynamo/user-guides/kv-cache-aware-routing.md): Enable KV-aware routing using Router for Dynamo deployments
- [Disaggregated Serving](https://docs.dynamo.nvidia.com/dynamo/user-guides/disaggregated-serving.md): Find optimal prefill/decode configuration for disaggregated serving deployments
- [KV Cache Offloading](https://docs.dynamo.nvidia.com/dynamo/user-guides/kv-cache-offloading.md): Enable KV offloading using KV Block Manager (KVBM) for Dynamo deployments
- [Dynamo Benchmarking Guide](https://docs.dynamo.nvidia.com/dynamo/user-guides/dynamo-benchmarking.md): Benchmark and compare performance across Dynamo deployment configurations
- [Multimodality Support](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support.md): Deploy multimodal models with image, video, and audio support in Dynamo
- [vLLM Multimodal](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support/v-llm-multimodal.md)
- [TensorRT-LLM Multimodal](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support/tensor-rt-llm-multimodal.md)
- [SGLang Multimodal](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support/sg-lang-multimodal.md)
- [Tool Calling](https://docs.dynamo.nvidia.com/dynamo/user-guides/tool-calling.md): Connect Dynamo to external tools and services using function calling
- [LoRA Adapters](https://docs.dynamo.nvidia.com/dynamo/user-guides/lo-ra-adapters.md): Serve fine-tuned LoRA adapters with dynamic loading and routing in Dynamo
- [Observability (Local)](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local.md): Monitor Dynamo deployments with metrics, logging, and tracing
- [Prometheus + Grafana Setup](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/prometheus-grafana-setup.md)
- [Metrics](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/metrics.md)
- [Metrics Developer Guide](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/metrics-developer-guide.md)
- [Health Checks](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/health-checks.md)
- [Tracing](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/tracing.md)
- [Logging](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/logging.md)
- [Fault Tolerance](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance.md): Handle failures gracefully with request migration, cancellation, and graceful shutdown
- [Request Migration](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/request-migration.md)
- [Request Cancellation](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/request-cancellation.md)
- [Graceful Shutdown](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/graceful-shutdown.md)
- [Request Rejection](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/request-rejection.md)
- [Testing](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/testing.md)
- [Writing Python Workers in Dynamo](https://docs.dynamo.nvidia.com/dynamo/user-guides/writing-python-workers-in-dynamo.md): Create custom Python workers and engines for Dynamo
- [vLLM](https://docs.dynamo.nvidia.com/dynamo/components/backends/v-llm.md)
- [SGLang](https://docs.dynamo.nvidia.com/dynamo/components/backends/sg-lang.md)
- [TensorRT-LLM](https://docs.dynamo.nvidia.com/dynamo/components/backends/tensor-rt-llm.md)
- [Frontend](https://docs.dynamo.nvidia.com/dynamo/components/frontend.md)
- [Frontend Guide](https://docs.dynamo.nvidia.com/dynamo/components/frontend/frontend-guide.md)
- [Router](https://docs.dynamo.nvidia.com/dynamo/components/router.md)
- [Router Guide](https://docs.dynamo.nvidia.com/dynamo/user-guides/kv-cache-aware-routing.md): Enable KV-aware routing using Router for Dynamo deployments
- [Router Examples](https://docs.dynamo.nvidia.com/dynamo/components/router/router-examples.md)
- [Planner](https://docs.dynamo.nvidia.com/dynamo/components/planner.md)
- [Planner Guide](https://docs.dynamo.nvidia.com/dynamo/components/planner/planner-guide.md)
- [Planner Examples](https://docs.dynamo.nvidia.com/dynamo/components/planner/planner-examples.md)
- [Profiler](https://docs.dynamo.nvidia.com/dynamo/components/profiler.md)
- [Profiler Guide](https://docs.dynamo.nvidia.com/dynamo/components/profiler/profiler-guide.md)
- [Profiler Examples](https://docs.dynamo.nvidia.com/dynamo/components/profiler/profiler-examples.md)
- [KVBM](https://docs.dynamo.nvidia.com/dynamo/components/kvbm.md)
- [KVBM Guide](https://docs.dynamo.nvidia.com/dynamo/user-guides/kv-cache-offloading.md): Enable KV offloading using KV Block Manager (KVBM) for Dynamo deployments
- [LMCache](https://docs.dynamo.nvidia.com/dynamo/integrations/lm-cache.md)
- [SGLang HiCache](https://docs.dynamo.nvidia.com/dynamo/integrations/sg-lang-hi-cache.md)
- [FlexKV](https://docs.dynamo.nvidia.com/dynamo/integrations/flex-kv.md)
- [KV Events for Custom Engines](https://docs.dynamo.nvidia.com/dynamo/integrations/kv-events-for-custom-engines.md)
- [Overall Architecture](https://docs.dynamo.nvidia.com/dynamo/design-docs/overall-architecture.md)
- [Architecture Flow](https://docs.dynamo.nvidia.com/dynamo/design-docs/architecture-flow.md)
- [Disaggregated Serving](https://docs.dynamo.nvidia.com/dynamo/design-docs/disaggregated-serving.md)
- [Distributed Runtime](https://docs.dynamo.nvidia.com/dynamo/design-docs/distributed-runtime.md)
- [Discovery Plane](https://docs.dynamo.nvidia.com/dynamo/design-docs/discovery-plane.md)
- [Request Plane](https://docs.dynamo.nvidia.com/dynamo/design-docs/request-plane.md)
- [Event Plane](https://docs.dynamo.nvidia.com/dynamo/design-docs/event-plane.md)
- [Router Design](https://docs.dynamo.nvidia.com/dynamo/design-docs/router-design.md)
- [KVBM Design](https://docs.dynamo.nvidia.com/dynamo/design-docs/kvbm-design.md)
- [Planner Design](https://docs.dynamo.nvidia.com/dynamo/design-docs/planner-design.md)
- [Quickstart](https://docs.dynamo.nvidia.com/dynamo/getting-started/quickstart.md)
- [Support Matrix](https://docs.dynamo.nvidia.com/dynamo/getting-started/support-matrix.md)
- [Feature Matrix](https://docs.dynamo.nvidia.com/dynamo/getting-started/feature-matrix.md)
- [Release Artifacts](https://docs.dynamo.nvidia.com/dynamo/getting-started/release-artifacts.md)
- [Examples](https://docs.dynamo.nvidia.com/dynamo/getting-started/examples.md)
- [Deployment Guide](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide.md)
- [Detailed Installation Guide](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/detailed-installation-guide.md)
- [Dynamo Operator](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/dynamo-operator.md)
- [Service Discovery](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/service-discovery.md)
- [Webhooks](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/webhooks.md)
- [Minikube Setup](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/minikube-setup.md)
- [Managing Models with DynamoModel](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/managing-models-with-dynamo-model.md)
- [Autoscaling](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/autoscaling.md)
- [Inference Gateway (GAIE)](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/inference-gateway-gaie.md)
- [Checkpointing](https://docs.dynamo.nvidia.com/dynamo/dev/kubernetes-deployment/deployment-guide/checkpointing.md)
- [Integration with Dynamo](https://docs.dynamo.nvidia.com/dynamo/dev/kubernetes-deployment/deployment-guide/checkpointing/integration-with-dynamo.md)
- [Standalone Usage](https://docs.dynamo.nvidia.com/dynamo/dev/kubernetes-deployment/deployment-guide/checkpointing/standalone-usage.md)
- [Metrics](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/observability-k-8-s/metrics.md)
- [Logging](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/observability-k-8-s/logging.md)
- [Operator Metrics](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/observability-k-8-s/operator-metrics.md)
- [Multinode Deployments](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/multinode/multinode-deployments.md)
- [Grove](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/multinode/grove.md)
- [Router Guide](https://docs.dynamo.nvidia.com/dynamo/user-guides/kv-cache-aware-routing.md): Enable KV-aware routing using Router for Dynamo deployments
- [Disaggregated Serving](https://docs.dynamo.nvidia.com/dynamo/user-guides/disaggregated-serving.md): Find optimal prefill/decode configuration for disaggregated serving deployments
- [KVBM Guide](https://docs.dynamo.nvidia.com/dynamo/user-guides/kv-cache-offloading.md): Enable KV offloading using KV Block Manager (KVBM) for Dynamo deployments
- [Dynamo Benchmarking](https://docs.dynamo.nvidia.com/dynamo/user-guides/dynamo-benchmarking.md): Benchmark and compare performance across Dynamo deployment configurations
- [Multimodality Support](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support.md): Deploy multimodal models with image, video, and audio support in Dynamo
- [vLLM Multimodal](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support/v-llm-multimodal.md)
- [TensorRT-LLM Multimodal](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support/tensor-rt-llm-multimodal.md)
- [SGLang Multimodal](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support/sg-lang-multimodal.md)
- [Tool Calling](https://docs.dynamo.nvidia.com/dynamo/user-guides/tool-calling.md): Connect Dynamo to external tools and services using function calling
- [LoRA Adapters](https://docs.dynamo.nvidia.com/dynamo/user-guides/lo-ra-adapters.md): Serve fine-tuned LoRA adapters with dynamic loading and routing in Dynamo
- [Observability (Local)](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local.md): Monitor Dynamo deployments with metrics, logging, and tracing
- [Prometheus + Grafana Setup](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/prometheus-grafana-setup.md)
- [Metrics](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/metrics.md)
- [Metrics Developer Guide](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/metrics-developer-guide.md)
- [Health Checks](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/health-checks.md)
- [Tracing](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/tracing.md)
- [Logging](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/logging.md)
- [Fault Tolerance](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance.md): Handle failures gracefully with request migration, cancellation, and graceful shutdown
- [Request Migration](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/request-migration.md)
- [Request Cancellation](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/request-cancellation.md)
- [Graceful Shutdown](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/graceful-shutdown.md)
- [Request Rejection](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/request-rejection.md)
- [Testing](https://docs.dynamo.nvidia.com/dynamo/user-guides/fault-tolerance/testing.md)
- [Writing Python Workers in Dynamo](https://docs.dynamo.nvidia.com/dynamo/user-guides/writing-python-workers-in-dynamo.md): Create custom Python workers and engines for Dynamo
- [vLLM](https://docs.dynamo.nvidia.com/dynamo/components/backends/v-llm.md)
- [SGLang](https://docs.dynamo.nvidia.com/dynamo/components/backends/sg-lang.md)
- [TensorRT-LLM](https://docs.dynamo.nvidia.com/dynamo/components/backends/tensor-rt-llm.md)
- [Frontend](https://docs.dynamo.nvidia.com/dynamo/components/frontend.md)
- [Frontend Guide](https://docs.dynamo.nvidia.com/dynamo/components/frontend/frontend-guide.md)
- [Router](https://docs.dynamo.nvidia.com/dynamo/components/router.md)
- [Router Guide](https://docs.dynamo.nvidia.com/dynamo/dev/components/router/router-guide.md): Enable KV-aware routing using Router for Dynamo deployments
- [Router Examples](https://docs.dynamo.nvidia.com/dynamo/components/router/router-examples.md)
- [Planner](https://docs.dynamo.nvidia.com/dynamo/components/planner.md)
- [Planner Guide](https://docs.dynamo.nvidia.com/dynamo/components/planner/planner-guide.md)
- [Planner Examples](https://docs.dynamo.nvidia.com/dynamo/components/planner/planner-examples.md)
- [Profiler](https://docs.dynamo.nvidia.com/dynamo/components/profiler.md)
- [Profiler Guide](https://docs.dynamo.nvidia.com/dynamo/components/profiler/profiler-guide.md)
- [Profiler Examples](https://docs.dynamo.nvidia.com/dynamo/components/profiler/profiler-examples.md)
- [KVBM](https://docs.dynamo.nvidia.com/dynamo/components/kvbm.md)
- [KVBM Guide](https://docs.dynamo.nvidia.com/dynamo/dev/components/kvbm/kvbm-guide.md): Enable KV offloading using KV Block Manager (KVBM) for Dynamo deployments
- [LMCache](https://docs.dynamo.nvidia.com/dynamo/integrations/lm-cache.md)
- [SGLang HiCache](https://docs.dynamo.nvidia.com/dynamo/integrations/sg-lang-hi-cache.md)
- [FlexKV](https://docs.dynamo.nvidia.com/dynamo/integrations/flex-kv.md)
- [KV Events for Custom Engines](https://docs.dynamo.nvidia.com/dynamo/integrations/kv-events-for-custom-engines.md)
- [Overall Architecture](https://docs.dynamo.nvidia.com/dynamo/design-docs/overall-architecture.md)
- [Architecture Flow](https://docs.dynamo.nvidia.com/dynamo/design-docs/architecture-flow.md)
- [Disaggregated Serving](https://docs.dynamo.nvidia.com/dynamo/design-docs/disaggregated-serving.md)
- [Distributed Runtime](https://docs.dynamo.nvidia.com/dynamo/design-docs/distributed-runtime.md)
- [Discovery Plane](https://docs.dynamo.nvidia.com/dynamo/design-docs/discovery-plane.md)
- [Request Plane](https://docs.dynamo.nvidia.com/dynamo/design-docs/request-plane.md)
- [Event Plane](https://docs.dynamo.nvidia.com/dynamo/design-docs/event-plane.md)
- [Router Design](https://docs.dynamo.nvidia.com/dynamo/design-docs/router-design.md)
- [KVBM Design](https://docs.dynamo.nvidia.com/dynamo/design-docs/kvbm-design.md)
- [Planner Design](https://docs.dynamo.nvidia.com/dynamo/design-docs/planner-design.md)
- [Dynamo Blog](https://docs.dynamo.nvidia.com/dynamo/dev/blog.mdx): Technical deep dives, announcements, and updates from the Dynamo team.
- [Quickstart](https://docs.dynamo.nvidia.com/dynamo/getting-started/quickstart.md)
- [Installation](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/getting-started/installation.md)
- [Dynamo Support Matrix](https://docs.dynamo.nvidia.com/dynamo/getting-started/support-matrix.md)
- [Dynamo Feature Compatibility Matrices](https://docs.dynamo.nvidia.com/dynamo/getting-started/feature-matrix.md)
- [Dynamo Examples](https://docs.dynamo.nvidia.com/dynamo/getting-started/examples.md)
- [Deploying Dynamo on Kubernetes](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/kubernetes-deployment/deployment-guide/kubernetes-quickstart.md)
- [Installation Guide for Dynamo Kubernetes Platform](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/detailed-installation-guide.md)
- [Working with Dynamo Kubernetes Operator](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/dynamo-operator.md)
- [Minikube Setup Guide](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/minikube-setup.md)
- [Managing Models with DynamoModel](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/deployment-guide/managing-models-with-dynamo-model.md)
- [Dynamo Metrics Collection on Kubernetes](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/observability-k-8-s/metrics.md)
- [Log Aggregation in Dynamo on Kubernetes](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/observability-k-8-s/logging.md)
- [Multinode Deployment Guide](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/multinode/multinode-deployments.md)
- [Grove Deployment Guide](https://docs.dynamo.nvidia.com/dynamo/kubernetes-deployment/multinode/grove.md)
- [Tool Calling with Dynamo](https://docs.dynamo.nvidia.com/dynamo/user-guides/tool-calling.md)
- [Multimodal Inference in Dynamo](https://docs.dynamo.nvidia.com/dynamo/user-guides/multimodality-support.md)
- [Finding Best Initial Configs using AIConfigurator](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/user-guides/finding-best-initial-configs.md)
- [Dynamo Benchmarking Guide](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/user-guides/dynamo-benchmarking-guide.md)
- [Disaggregation and Performance Tuning](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/user-guides/tuning-disaggregated-performance.md)
- [Dynamo Runtime](https://docs.dynamo.nvidia.com/dynamo/user-guides/writing-python-workers-in-dynamo.md)
- [Dynamo Observability](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/user-guides/observability-local/overview.md)
- [Metrics Visualization with Prometheus and Grafana](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/prometheus-grafana-setup.md)
- [Dynamo Metrics](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/metrics.md)
- [Metrics Developer Guide](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/metrics-developer-guide.md)
- [Dynamo Health Checks](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/health-checks.md)
- [Distributed Tracing with Tempo](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/tracing.md)
- [Dynamo Logging](https://docs.dynamo.nvidia.com/dynamo/user-guides/observability-local/logging.md)
- [NVIDIA Dynamo Glossary](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/user-guides/glossary.md)
- [LLM Deployment using vLLM](https://docs.dynamo.nvidia.com/dynamo/components/backends/v-llm.md)
- [Running SGLang with Dynamo](https://docs.dynamo.nvidia.com/dynamo/components/backends/sg-lang.md)
- [LLM Deployment using TensorRT-LLM](https://docs.dynamo.nvidia.com/dynamo/components/backends/tensor-rt-llm.md)
- [KV Router](https://docs.dynamo.nvidia.com/dynamo/components/router.md)
- [Planner](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/planner/overview.md)
- [SLA-Driven Profiling and Planner Deployment Quick Start Guide](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/planner/sla-planner-quick-start.md)
- [SLA-Driven Profiling with DynamoGraphDeploymentRequest](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/planner/sla-driven-profiling.md)
- [SLA-based Planner](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/planner/sla-based-planner.md)
- [KV Block Manager](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/overview.md)
- [Motivation behind KVBM](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/motivation.md)
- [KVBM Architecture](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/architecture.md)
- [Understanding KVBM components](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/components.md)
- [KVBM components](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/design-deep-dive.md)
- [KVBM Integrations](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/integrations.md)
- [Running KVBM in vLLM](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/kvbm-in-v-llm.md)
- [Running KVBM in TensorRT-LLM](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/kvbm-in-trtllm.md)
- [LMCache Integration in Dynamo](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/lm-cache-integration.md)
- [KVBM Further Reading](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/further-reading.md)
- [High Level Architecture](https://docs.dynamo.nvidia.com/dynamo/design-docs/overall-architecture.md)
- [Dynamo Architecture Flow](https://docs.dynamo.nvidia.com/dynamo/design-docs/architecture-flow.md)
- [Dynamo Disaggregation: Separating Prefill and Decode for Enhanced Performance](https://docs.dynamo.nvidia.com/dynamo/design-docs/disaggregated-serving.md)
- [Dynamo Distributed Runtime](https://docs.dynamo.nvidia.com/dynamo/design-docs/distributed-runtime.md)
- [LLM Deployment using TensorRT-LLM](https://docs.dynamo.nvidia.com/dynamo/components/backends/tensor-rt-llm.md)
- [KV Router](https://docs.dynamo.nvidia.com/dynamo/components/router.md)
- [Planner](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/planner/overview.md)
- [SLA-Driven Profiling and Planner Deployment Quick Start Guide](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/planner/sla-planner-quick-start.md)
- [SLA-Driven Profiling with DynamoGraphDeploymentRequest](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/planner/sla-driven-profiling.md)
- [SLA-based Planner](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/planner/sla-based-planner.md)
- [KV Block Manager](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/overview.md)
- [Motivation behind KVBM](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/motivation.md)
- [KVBM Architecture](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/architecture.md)
- [Understanding KVBM components](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/components.md)
- [KVBM components](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/design-deep-dive.md)
- [KVBM Integrations](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/integrations.md)
- [Running KVBM in vLLM](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/kvbm-in-v-llm.md)
- [Running KVBM in TensorRT-LLM](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/kvbm-in-trtllm.md)
- [LMCache Integration in Dynamo](https://docs.dynamo.nvidia.com/dynamo/v-0-8-1/components/kvbm/lm-cache-integration.md)