Deploying Azure Red Hat OpenShift with Managed Identities

· 6 min read
Diego Casati
Principal Cloud Architect, Azure Global Black Belt

When deploying Azure Red Hat OpenShift (ARO) clusters, managing authentication and authorization for various cluster components traditionally relies on service principals or other credential-based approaches. This introduces operational overhead and potential security risks related to credential rotation and management.

Getting started with Anyscale running on Azure

· 12 min read
Steve Griffith
Principal Cloud Architect, Azure Global Black Belt

In this walkthrough, we'll set up a basic AKS cluster to get you quickly up and running with the Anyscale platform, using AKS as the compute backend. We'll run this cluster in our own Azure Virtual Network and connect it to an Azure Blob Storage account on that VNet. Finally, we'll execute the basic Anyscale 'Hello World' demo on that compute.
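At a high level, the setup above can be sketched with the Azure CLI. This is a minimal sketch with placeholder resource names; the full post covers the complete configuration:

```shell
#!/usr/bin/env bash
# Placeholder names; adjust for your environment.
RG=anyscale-demo-rg
LOCATION=eastus

az group create --name $RG --location $LOCATION

# VNet with a dedicated subnet for the AKS nodes
az network vnet create --resource-group $RG --name anyscale-vnet \
  --address-prefixes 10.10.0.0/16 \
  --subnet-name aks-subnet --subnet-prefixes 10.10.1.0/24

SUBNET_ID=$(az network vnet subnet show --resource-group $RG \
  --vnet-name anyscale-vnet --name aks-subnet --query id -o tsv)

# Basic AKS cluster attached to that subnet
az aks create --resource-group $RG --name anyscale-aks \
  --network-plugin azure \
  --vnet-subnet-id "$SUBNET_ID" \
  --node-count 2 \
  --generate-ssh-keys

# Blob storage account for Anyscale data on the same VNet
az storage account create --resource-group $RG --name anyscalestore$RANDOM \
  --location $LOCATION --sku Standard_LRS
```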

When Infrastructure Scales But Understanding Doesn't

· 7 min read
Ray Kao
Principal Cloud Architect, Azure Global Black Belt
Diego Casati
Principal Cloud Architect, Azure Global Black Belt

We all know this, even if we don't like to admit it: modern infrastructure can scale infinitely, but human understanding doesn't.

We've all seen it happen—organizations going from managing dozens of servers to thousands of containers, from deploying weekly to deploying hundreds of times per day, from serving thousands of users to millions. The technology handled the scale beautifully. The humans? Not so much.

This is the first industry issue that platform engineering should be addressing: how do we manage infrastructure complexity that has outgrown not just individual cognitive capacity, but our collective ability to communicate and transfer knowledge as teams?

The Human Scale Problem in Platform Engineering

· 7 min read
Ray Kao
Principal Cloud Architect, Azure Global Black Belt
Diego Casati
Principal Cloud Architect, Azure Global Black Belt

We keep doing this thing where we solve a problem, celebrate the victory, then realize we've created three new problems we didn't even know existed.

Remember when manually configuring servers was the bottleneck? So we built containers. Great! Now we're orchestrating thousands of them. Remember when monolithic deployments were too slow? So we built microservices. Fantastic! Now we're drowning in distributed system complexity. We solved manual infrastructure provisioning with infrastructure as code. Perfect! Now we're coordinating dozens of Terraform modules across environments and wondering how we got here.

Each step forward has been genuinely valuable. But we keep hitting the same pattern: our solutions outpace our ability to operate them at human scale.

Updating AKS Network Plugin from Kubenet to Azure CNI

· 2 min read
Diego Casati
Principal Cloud Architect, Azure Global Black Belt

Problem Statement

When updating an Azure Kubernetes Service (AKS) network plugin from kubenet to Azure CNI, performing the update directly using Terraform may result in the cluster being deleted and recreated. The Terraform plan typically indicates that the cluster will be replaced.

However, using the Azure CLI (az), the update can be applied successfully without deleting the cluster. Once that's done, you can import the cluster's state back into Terraform.

A validated approach to avoid cluster recreation involves the following steps:

  • Perform the CNI update out-of-band using Azure CLI.
  • Import the state of the cluster back to Terraform.
  • Perform a Terraform refresh and a Terraform plan to validate the new state.
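The steps above can be sketched as follows. Resource names, the Terraform resource address, and the subscription ID are placeholders; the `--network-plugin-mode overlay` flag assumes you are moving to Azure CNI Overlay, which is the supported in-place upgrade path from kubenet:

```shell
# 1. Out-of-band update via Azure CLI (kubenet -> Azure CNI Overlay).
az aks update --resource-group my-rg --name my-aks \
  --network-plugin azure --network-plugin-mode overlay

# 2. Re-import the cluster into Terraform state
#    (resource address and IDs are illustrative).
terraform state rm azurerm_kubernetes_cluster.aks
terraform import azurerm_kubernetes_cluster.aks \
  "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.ContainerService/managedClusters/my-aks"

# 3. Validate the new state.
terraform refresh
terraform plan
```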

End to End TLS Encryption with AKS and AFD

· 18 min read
Steve Griffith
Principal Cloud Architect, Azure Global Black Belt

Introduction

In this walkthrough we'll deploy an app with end-to-end TLS encryption, using Azure Front Door as the internet-facing TLS endpoint and an Nginx ingress controller running inside an AKS cluster as the backend.

We'll use Azure Key Vault to store the TLS certificate, and will use the Key Vault CSI Driver to get the secrets into the ingress controller. The Key Vault CSI Driver will use Azure Workload Identity to safely retrieve the certificate.
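The cluster-side prerequisites for that flow can be sketched as follows (cluster and resource group names are placeholders):

```shell
# Enable the Key Vault CSI driver add-on on the cluster.
az aks enable-addons --resource-group my-rg --name my-aks \
  --addons azure-keyvault-secrets-provider

# Enable the OIDC issuer and workload identity, which the CSI driver
# will use to authenticate to Key Vault without stored credentials.
az aks update --resource-group my-rg --name my-aks \
  --enable-oidc-issuer --enable-workload-identity
```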

Multi-Cluster Layer 4 Load Balancing with Fleet Manager

· 10 min read
Diego Casati
Principal Cloud Architect, Azure Global Black Belt

This guide demonstrates how to set up layer 4 load balancing across multiple AKS clusters using Azure Fleet Manager. We’ll create two AKS clusters in different regions (East US and West US), configure Virtual Network (VNet) peering between them, and deploy a demo application using Fleet Manager. The process covers AKS cluster setup, VNet peering, Fleet Manager configuration, and application deployment across regions.
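The hub-and-member setup described above can be sketched with the Azure CLI `fleet` extension. Names are illustrative, and the member clusters (`aks-eastus`, `aks-westus`) are assumed to already exist:

```shell
# Install the Fleet Manager CLI extension.
az extension add --name fleet

# Create the Fleet Manager hub.
az fleet create --resource-group fleet-rg --name demo-fleet --location eastus

# Join each regional AKS cluster as a fleet member.
az fleet member create --resource-group fleet-rg --fleet-name demo-fleet \
  --name eastus-member \
  --member-cluster-id "$(az aks show -g fleet-rg -n aks-eastus --query id -o tsv)"

az fleet member create --resource-group fleet-rg --fleet-name demo-fleet \
  --name westus-member \
  --member-cluster-id "$(az aks show -g fleet-rg -n aks-westus --query id -o tsv)"
```

With the members joined and VNets peered, the layer 4 load balancing itself is configured through Fleet's multi-cluster networking resources, which the full post walks through.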

Using Stream Analytics to Filter AKS Control Plane Logs

· 11 min read
Steve Griffith
Principal Cloud Architect, Azure Global Black Belt

Introduction

While AKS does NOT provide access to the cluster's managed control plane, it does provide access to the control plane component logs via diagnostic settings. The easiest option to persist and search this data is to send it directly to Azure Log Analytics; however, there is a LOT of data in those logs, which can make Log Analytics cost prohibitive. Alternatively, you can send all the data to an Azure Storage account, but then searching and alerting become challenging.

To address this, one option is to stream the data to Azure Event Hubs, which lets you use Azure Stream Analytics to pull out the events you deem important and store the rest in cheaper storage (e.g., Azure Storage) for potential future diagnostic needs.
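The first hop of that pipeline, routing the control plane logs to an Event Hub via a diagnostic setting, can be sketched as follows. All resource names and the subscription ID are placeholders, and the log categories shown are only two of the several AKS exposes:

```shell
AKS_ID=$(az aks show --resource-group my-rg --name my-aks --query id -o tsv)

# Diagnostic setting that streams selected control plane log
# categories to an existing Event Hub.
az monitor diagnostic-settings create \
  --name aks-controlplane-to-eventhub \
  --resource "$AKS_ID" \
  --event-hub my-hub \
  --event-hub-rule "/subscriptions/<sub-id>/resourceGroups/my-rg/providers/Microsoft.EventHub/namespaces/my-ns/authorizationRules/RootManageSharedAccessKey" \
  --logs '[{"category":"kube-audit","enabled":true},{"category":"kube-apiserver","enabled":true}]'
```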

Using Project KAITO in AKS

· 7 min read
Steve Griffith
Principal Cloud Architect, Azure Global Black Belt

The Kubernetes AI Toolchain Operator, also known as Project KAITO, is an open-source solution that simplifies the deployment of inference models in a Kubernetes cluster. In particular, the focus is on simplifying the operation of the most popular available models (e.g., Falcon, Mistral, and Llama 2).

KAITO provides operators that validate the requested model against the requested node pool hardware, deploy the node pool, and deploy the model itself, along with a REST endpoint for reaching the model.
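A KAITO deployment centers on a `Workspace` custom resource. The sketch below follows the shape of the project's published examples (API version, instance type, and preset name may differ by KAITO release):

```shell
kubectl apply -f - <<'EOF'
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: Standard_NC12s_v3   # GPU SKU the operator validates and provisions
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: falcon-7b                 # preset model to deploy behind a REST endpoint
EOF
```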

Using the Azure Key Vault CSI Driver with Workload Identity

· 7 min read
Steve Griffith
Principal Cloud Architect, Azure Global Black Belt

When working with secrets in an application running in Kubernetes, you can use native Kubernetes Secrets; however, those secrets have security limitations. A better practice is to use a secure vault, like Azure Key Vault.

Azure Key Vault can be accessed via a direct SDK call, as demonstrated in our previous Workload Identity post. In some cases, however, you may not be able to use the SDK (for example, when you don't have access to the source code) and may instead prefer to load secrets directly into an environment variable or a file. In those cases, the Azure Key Vault CSI driver is here to save the day.
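The glue between the CSI driver and workload identity is a `SecretProviderClass`. A minimal sketch, with placeholder vault, identity, and secret names:

```shell
kubectl apply -f - <<'EOF'
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: app-secrets
spec:
  provider: azure
  parameters:
    # Client ID of the managed identity federated to the pod's service account.
    clientID: "<workload-identity-client-id>"
    keyvaultName: "my-keyvault"
    tenantId: "<tenant-id>"
    objects: |
      array:
        - |
          objectName: db-password
          objectType: secret
EOF
```

A pod then references this class through a `secrets-store.csi.k8s.io` volume, and the secret appears as a file in the mounted path.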