Using Project KAITO in AKS
The Kubernetes AI Toolchain Operator, also known as Project KAITO, is an open-source solution that simplifies deploying inference models in a Kubernetes cluster. In particular, it focuses on simplifying the operation of the most popular models available (e.g., Falcon, Mistral, and Llama 2).
KAITO provides operators that handle three tasks: validating the requested model against the requested node pool hardware, deploying the node pool, and deploying the model itself together with a REST endpoint for reaching it.
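To make the validation step more concrete, here is a minimal sketch in Python of the kind of check involved: comparing what a model needs against what a node pool's hardware offers. This is not KAITO's actual implementation. The class names, the `validate` function, the instance type names, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelRequirement:
    """Hypothetical description of what a model needs to run."""
    name: str
    min_gpu_count: int
    min_total_gpu_memory_gib: int


@dataclass(frozen=True)
class NodeHardware:
    """Hypothetical description of one node in the requested node pool."""
    instance_type: str
    gpu_count: int
    gpu_memory_gib_each: int

    @property
    def total_gpu_memory_gib(self) -> int:
        return self.gpu_count * self.gpu_memory_gib_each


def validate(model: ModelRequirement, hardware: NodeHardware) -> List[str]:
    """Return a list of problems; an empty list means the pairing is acceptable."""
    problems = []
    if hardware.gpu_count < model.min_gpu_count:
        problems.append(
            f"{hardware.instance_type} has {hardware.gpu_count} GPU(s), "
            f"{model.name} needs at least {model.min_gpu_count}"
        )
    if hardware.total_gpu_memory_gib < model.min_total_gpu_memory_gib:
        problems.append(
            f"{hardware.instance_type} has {hardware.total_gpu_memory_gib} GiB "
            f"of GPU memory, {model.name} needs at least "
            f"{model.min_total_gpu_memory_gib} GiB"
        )
    return problems


# Illustrative values only; they do not describe real models or VM sizes.
model = ModelRequirement("example-model", min_gpu_count=1, min_total_gpu_memory_gib=24)
small = NodeHardware("example-small", gpu_count=1, gpu_memory_gib_each=16)
large = NodeHardware("example-large", gpu_count=2, gpu_memory_gib_each=40)

for hw in (small, large):
    issues = validate(model, hw)
    print(hw.instance_type, "OK" if not issues else issues)
```

Running a check like this before provisioning anything means a mismatched request can be rejected early, rather than after a node pool has been created and a model has failed to load.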

