MinIO S3 Storage Proxy in AKS

Sebastian Nyberg
9 min read · May 25, 2020

Why give up all those S3 tools just because you are using Azure?

The S3 API has become more or less a standard interface for cloud storage. Whether you’re using AWS, GCP, IBM Cloud, DigitalOcean, well, pretty much any cloud provider except for Azure, storage is provided with an S3-compliant API.

For this reason, it is quite common that tools in the open-source community do not support Azure Blob Storage. After all, these tools only contain what is contributed to them, and most contributors are using S3-compliant storage.

While setting up Argo Workflows, I found out that it is possible to put an S3 API in front of Azure Blob Storage with MinIO’s Azure S3 Gateway. Yeah, you read that right: it is possible to use Azure Blob Storage as if it were S3.

This article will give a step-by-step example of how MinIO can be deployed to an AKS cluster and exposed via a public ingress. Halfway through this article, AWS CLI will be used to copy files to Blob Storage. Bear in mind that the article is quite opinionated and should serve as an example rather than a reference.

Introduction & Terminology

MinIO is a “High Performance, Kubernetes Native Object Storage”. For this article, the focus will be on the S3 Gateway Feature and the AKS deployment. However, MinIO has lots of other features, and can also be deployed via the Azure Marketplace. Please check out the MinIO website for more information.

Before we get started, let’s go through some terminology for mapping the S3 API to Azure Storage:

  • Storage Account Name → Access Key
  • Storage Account Key → Secret Key
  • Storage Account → Roughly an AWS account
  • Storage Container → Bucket

Blobs in the storage account will have S3 URLs in the following format: s3://$CONTAINER/path/to/blob . The endpoint URL for the S3 storage will be the DNS name of the MinIO installation.
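For example, with MinIO served at a hypothetical minio.example.com, a path in the container mycontainer would be addressed like this:

# hypothetical MinIO endpoint and container names
aws --endpoint-url https://minio.example.com s3 ls s3://mycontainer/path/to/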

Placeholders in this guide

For the purpose of demonstration, I’ll call my AKS cluster mycluster and the Azure Storage Account miniostorage.

I’ll use the admin context retrieved from az aks get-credentials … --admin when deploying resources, i.e. mycluster-admin.

Please ensure that you update these placeholders as necessary.
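For reference, the admin context can be fetched like this (the resource group name is a placeholder):

az aks get-credentials \
  --resource-group myresourcegroup \
  --name mycluster \
  --admin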

Basic MinIO Setup

I will be using the stable/minio Helm Chart.

It is best practice to put deployed manifests in version control, including Helm values. This allows for reproducibility and serves as a reference for future, forgetful you.

For third-party installations, helmfile is my go-to tool, and it will be used in this article. To use Helm directly, copy the values section from each helmfile and provide missing secrets via --set , or encrypt the values file with e.g. SOPS or the helm-secrets plugin.
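As a sketch, the plain-Helm route would look something like this, assuming the values section has been copied into a local values.yaml (the archived stable repo is assumed to be reachable at charts.helm.sh/stable):

helm repo add stable https://charts.helm.sh/stable
helm upgrade --install miniotest stable/minio \
  --kube-context mycluster-admin \
  --namespace minio \
  --version 5.0.26 \
  --values values.yaml \
  --set secretKey=$AZURE_STORAGE_KEY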

If you are new to helmfile, it might be helpful to look at CloudPosse’s repository of helmfiles. It has certainly served me well.

Deploying with Helmfile

Below is a basic helmfile installation that deploys the MinIO Helm chart into the minio namespace with a release name of miniotest. I’ve pinned the version to 5.0.26 in this example to ensure that it will run. For a production setup, however, it is recommended to let patch versions through, e.g. with the semver constraint ~5.0 .

To get started, create the minio namespace:

kubectl \
--context mycluster-admin \
create ns minio

Then create the helmfile for the basic setup; let’s call it minio.yaml :
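The file could look roughly like the sketch below. It is based on the stable/minio chart’s azuregateway, accessKey, secretKey and defaultBucket values, so double-check them against the chart’s values.yaml and swap in your own storage account name (the archived stable repo is assumed to be reachable at charts.helm.sh/stable):

repositories:
  - name: stable
    url: https://charts.helm.sh/stable

releases:
  - name: miniotest
    namespace: minio
    chart: stable/minio
    version: 5.0.26
    values:
      # run MinIO as an S3 gateway in front of Azure Blob Storage
      - azuregateway:
          enabled: true
        # the Storage Account Name doubles as the S3 Access Key
        accessKey: miniostorage
        # the Storage Account Key doubles as the S3 Secret Key; taken from the environment
        secretKey: '{{ requiredEnv "AZURE_STORAGE_KEY" }}'
        # create a default bucket (i.e. Storage Container) on startup
        defaultBucket:
          enabled: true
          name: default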

This helmfile showcases the requiredEnv template function, which marks AZURE_STORAGE_KEY as required. If the variable is not set, helmfile fails with an error like this:

$ helmfile --file minio.yaml apply
in ./minio.yaml: error during minio.yaml.part.0 parsing: template: stringTemplate:12:18: executing "stringTemplate" at <requiredEnv "AZURE_STORAGE_KEY">: error calling requiredEnv: required env var `AZURE_STORAGE_KEY` is not set

Let’s retrieve AZURE_STORAGE_KEY using Azure CLI:

# test that key is available
az storage account keys list -n miniostorage -o table
# set storage key
export AZURE_STORAGE_KEY=$(az storage account keys list -n miniostorage --query '[0].value' -o tsv)

Then run helmfile again:

helmfile --file minio.yaml apply

This will install a bunch of resources in your cluster inside the minio namespace. It may take several minutes for the installation to finish, so please be patient. Once done, helm will print instructions in the terminal on how to get started. I would suggest trying out these instructions and putting the output somewhere safe for future reference; they cover, among other things, the MinIO client ( mc ), which I will not show in this article.

Accessing the MinIO UI

Once the installation has finished, open a port-forward to the UI on localhost:9000 :

kubectl \
--context mycluster-admin \
--namespace minio \
port-forward svc/miniotest 9000

The website will ask you to log in:

Provide the Storage Account Name as the Access Key, and the Storage Account Key as the Secret Key.

Once logged in, you will see that the default bucket (aptly named default ) declared in the Helm values has been created for us. This bucket is actually a Storage Container under the hood. The UI can be used to view buckets and their contents, manage policies, and upload files:

MinIO UI while logged in

Let’s put the S3 API to the test by accessing the Azure Storage Account via AWS CLI.

Putting a file into Azure Blob Storage using AWS CLI

To access the Storage Account, provide the storage account name as the AWS_ACCESS_KEY_ID and the storage account key as the AWS_SECRET_ACCESS_KEY .

Bring up the port-forward again if you shut it down before:

kubectl \
--context mycluster-admin \
--namespace minio \
port-forward svc/miniotest 9000

Open a new terminal, then set the storage account name and key (and verify the output):

export AZURE_STORAGE_ACCOUNT=miniostorage
export AZURE_STORAGE_KEY=$(az storage account keys list -n ${AZURE_STORAGE_ACCOUNT} --query '[0].value' -o tsv)
echo $AZURE_STORAGE_KEY
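The AWS CLI reads its credentials from AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, so map the Azure values onto those variables:

export AWS_ACCESS_KEY_ID=$AZURE_STORAGE_ACCOUNT
export AWS_SECRET_ACCESS_KEY=$AZURE_STORAGE_KEY
# only needed if you have no default region configured; any region works for signing
export AWS_DEFAULT_REGION=us-east-1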

You may now list the “buckets” (actually, containers) in the storage account by passing localhost as the endpoint to AWS CLI:

aws --endpoint-url http://localhost:9000 s3 ls

Let’s create a temporary test file and put it in the default “bucket”:

tmpfile=$(mktemp)
echo "test" > $tmpfile
aws --endpoint-url http://localhost:9000 s3 cp $tmpfile s3://default/test.txt

The output can be seen in Storage Explorer:

First uploaded file to Azure Storage via the S3 API

Cleaning up resources

Remember to clean up the port-forward either by exiting the terminal, or resuming the job with fg and ctrl+c .

To clean up the installed resources, use helmfile destroy or helm del :

helmfile --file minio.yaml destroy
# or
helm \
--kube-context mycluster-admin \
--namespace minio \
delete miniotest

Production Setup w/ Public Ingress

Pretty much all guides I see on Medium completely omit this section, and understandably so. There are many ways to set up a public ingress, and many things that can go wrong. While getting started with Kubernetes, I remember this being a big pain point, so I decided to add this section anyway. I hope it helps you in setting up the ingress; the same approach can be used for any other DNS name in need of TLS termination.

By default, the deployment does not expose any public endpoints.

Patching the service type to LoadBalancer will expose a public IP that can be used to access the S3 API. However, traffic will remain unencrypted. There are primarily two approaches to providing encryption: either supply the TLS certificate to the installation yourself, or use cert-manager to generate the certificate for you automatically.

Personally, I prefer not to manage certificates myself, so I tap into a working cert-manager + nginx setup. I then add an A record to the DNS that points to the Public IP used by the nginx LoadBalancer service, so that Let’s Encrypt can verify the ACME challenge and issue a certificate for the Kubernetes ingress.

If what I just wrote makes perfect sense to you, and you already know how to do it, skip ahead to the section called “Deploying the MinIO service with an ingress”.

cert-manager + NGINX

Normally, I would not write a guide for this since it is very error-prone and beyond the scope of this article. However, when getting started with Kubernetes, I recall having issues finding end-to-end examples of exposing a public ingress such as this one, so I decided to put it here anyway.

Setting up cert-manager (Let’s Encrypt) and NGINX can be a struggle. I recommend following the Azure Guide to get started. If you get stuck, below is the helmfile I typically use to deploy NGINX / Let’s Encrypt to new AKS clusters (note that I use the namespace nginx instead of nginx-ingress ):
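In essence it deploys two releases, something along these lines (the node selectors and the installCRDs flag are assumptions; pin chart versions and take the exact values from the Azure guide):

repositories:
  - name: stable
    url: https://charts.helm.sh/stable
  - name: jetstack
    url: https://charts.jetstack.io

releases:
  # the nginx ingress controller, exposed through a LoadBalancer service
  - name: nginx-ingress
    namespace: nginx
    chart: stable/nginx-ingress
    values:
      - controller:
          replicaCount: 2
          nodeSelector:
            beta.kubernetes.io/os: linux
        defaultBackend:
          nodeSelector:
            beta.kubernetes.io/os: linux

  # cert-manager handles the Let's Encrypt / ACME flow
  - name: cert-manager
    namespace: cert-manager
    chart: jetstack/cert-manager
    values:
      # installCRDs requires cert-manager v0.15+; older versions need the CRD manifest applied separately
      - installCRDs: true

As with minio, create the nginx and cert-manager namespaces before applying. A Let’s Encrypt ClusterIssuer also has to be created as described in the Azure guide; the ingress example later in this article assumes it is named letsencrypt.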

Finding the Azure Public IP for the nginx service

Once nginx is up and running, its reverse proxy will be exposed through a service of type LoadBalancer. This service maps to an Azure Public IP inside the Azure Environment. To find the IP, get services in the nginx namespace and make note of the EXTERNAL-IP:

kubectl \
--context mycluster-admin \
--namespace nginx \
get service

Once you know the EXTERNAL-IP of the service, it is a matter of tracking down the corresponding Azure Public IP resource. I typically grep for the IP via Azure CLI; the first column will be the name of the Azure Public IP:

az network public-ip list -o table | grep 123.456.789.012

Pointing a DNS to the Public IP

The next step is to get hold of a DNS name that can point to the public IP address.

A quick and simple way to get a verifiably owned DNS name is to add a DNS name label to the Public IP resource. This will grant you a Fully Qualified Domain Name (FQDN) that can be used as the hostname in the MinIO ingress later on. I have sometimes used this approach for testing when there is only one ingress in the cluster that I want to expose to the outside world.

Creating a FQDN for a Public IP used by Kubernetes
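The same label can also be set via Azure CLI; the resource group and Public IP names below are placeholders, so use the ones found by the grep above:

az network public-ip update \
  --resource-group MC_myresourcegroup_mycluster_westeurope \
  --name kubernetes-a1b2c3d4e5f6 \
  --dns-name miniostorage

This yields an FQDN of the form miniostorage.westeurope.cloudapp.azure.com.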

If you already own a domain name on some other registrar, it is fairly simple to create a new A-record there that points towards the Azure Public IP and be done with it. I typically create new subdomains for different tools, e.g. artifacts.argo.my.company.com .

If you control a domain name and would like to manage it in Azure, follow this guide to delegate control to an Azure DNS Zone. The advantage of doing so is that the record can be an “alias record”, which attaches to the Public IP resource itself instead of to its IP address.

Deploying the MinIO service with an ingress

Once your domain name (e.g. miniostorage.my.domain.com ) is pointing towards the nginx ingress service, a Kubernetes ingress with the same hostname can be set up for the MinIO installation. Below is an example helmfile:
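A sketch of what minio-with-ingress.yaml could look like; it extends the earlier file with the chart’s ingress values. The hostname, the TLS secret name, and the letsencrypt cluster-issuer name are placeholders for your own setup:

repositories:
  - name: stable
    url: https://charts.helm.sh/stable

releases:
  - name: miniotest
    namespace: minio
    chart: stable/minio
    version: 5.0.26
    values:
      - azuregateway:
          enabled: true
        accessKey: miniostorage
        secretKey: '{{ requiredEnv "AZURE_STORAGE_KEY" }}'
        defaultBucket:
          enabled: true
          name: default
        ingress:
          enabled: true
          annotations:
            kubernetes.io/ingress.class: nginx
            # must match the name of your ClusterIssuer
            cert-manager.io/cluster-issuer: letsencrypt
            # lift nginx's default upload size limit (see below)
            nginx.ingress.kubernetes.io/proxy-body-size: "0"
          hosts:
            - miniostorage.my.domain.com
          tls:
            - secretName: miniostorage-tls
              hosts:
                - miniostorage.my.domain.com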

Once again apply the file:

helmfile --file minio-with-ingress.yaml apply

This will deploy a Kubernetes ingress object. If your cert-manager setup is working, cert resources will show up in the minio namespace:

kubectl \
--context mycluster-admin \
--namespace minio \
get certificates,certificaterequests,challenges

To debug problems, check kubectl get events and the logs from the nginx ingress controller. If the DNS does not work, try accessing the service directly via the Public IP. This will show an Invalid Request error when you open the site, but at least you will know that the ingress is correctly handled by the nginx ingress controller.
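For example (the controller deployment name depends on how the nginx chart release is named):

kubectl --context mycluster-admin --namespace minio get events --sort-by=.lastTimestamp
kubectl --context mycluster-admin --namespace nginx logs deploy/nginx-ingress-controller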

There is one important detail in the deployment manifest above: nginx does not allow uploads of large files by default and will respond with a 413 - Payload Too Large. It is, however, possible to remove this limit with an annotation on the ingress object:

nginx.ingress.kubernetes.io/proxy-body-size: "0"

Now that all is said and done, the Azure Storage Account is available via a public URL, encrypted with TLS, and accessible as an S3 API.

Conclusion

MinIO has opened up the door for us Azure developers to use a large set of previously inaccessible tools that rely on S3 as the storage backend.

I really want to stress that MinIO is much more than what’s contained in this article, and that you should check out their docs for more info. For example, I highly recommend trying the MinIO client.

Thank you for reading this article, and make sure to subscribe for more content.
