MinIO S3 Storage Proxy in AKS
Why give up all those S3 tools because you are using Azure?
The S3 API has become a de facto standard interface for cloud object storage. Whether you’re using AWS, GCP, IBM Cloud, DigitalOcean, well, pretty much any cloud provider except for Azure, storage is provided with an S3-compliant API.
For this reason, it is quite common that tools within the Open Source community do not support Azure Blob Storage. After all, these tools contain what people contribute to them, and most people are using S3-compliant storage.
While setting up Argo Workflows, I found out that it is possible to provide an S3 API for Azure Blob Storage with MinIO’s Azure S3 Gateway. Yeah, you read that right: it is possible to use Azure Blob Storage as if it were S3.
This article will give a step-by-step example of how MinIO can be deployed to an AKS cluster and exposed via a public ingress. Halfway through this article, AWS CLI will be used to copy files to Blob Storage. Bear in mind that the article is quite opinionated and should serve as an example rather than a reference.
Introduction & Terminology
MinIO is a “High Performance, Kubernetes Native Object Storage”. For this article, the focus will be on the S3 Gateway Feature and the AKS deployment. However, MinIO has lots of other features, and can also be deployed via the Azure Marketplace. Please check out the MinIO website for more information.
Before we get started, let’s go through some terminology for mapping the S3 API to Azure Storage:
- Storage Account Name → Access Key
- Storage Account Key → Secret Key
- Storage Account → roughly equivalent to an AWS account
- Storage Container → Bucket
The storage account will have an S3 URL in the following format: s3://$CONTAINER/path/to/blob. The endpoint URL of the S3 storage will be the DNS name of the MinIO installation.
Placeholders in this guide
For the purpose of demonstration, I’ll call my AKS cluster mycluster and the Azure Storage Account miniostorage.
I’ll use the admin context retrieved from az aks get-credentials … --admin when deploying resources, i.e. mycluster-admin.
Please ensure that you update these placeholders as necessary.
Basic MinIO Setup
I will be using the stable/minio Helm Chart.
It is best practice to put deployed manifests in version control, including Helm values. This allows for reproducibility and serves as a reference for future, forgetful you.
For third-party installations, helmfile is my go-to tool, and will be used in this article. To use Helm directly, copy the values section from each helmfile and provide missing secrets via --set, or encrypt the values file with e.g. SOPS or the helm-secrets plugin.
If you are new to helmfile, it might be helpful to look at CloudPosse’s repository of helmfiles. It has certainly served me well.
Deploying with Helmfile
Below is a basic Helmfile installation that deploys the MinIO helm chart into the minio namespace with a release name of miniotest. I’ve pinned the version to 5.0.26 in this example to ensure that it will run as described. For a production setup, however, it is recommended to let patches through, e.g. with the semver constraint ~5.0.
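The original embedded helmfile is not shown here, so below is a reconstruction of what such a minio.yaml could look like. The accessKey, secretKey, azuregateway, and defaultBucket values come from the stable/minio chart; the storage account name miniostorage is the placeholder from this guide, and the exact layout is a sketch rather than the author’s verbatim file:

```yaml
repositories:
  - name: stable
    url: https://charts.helm.sh/stable

releases:
  - name: miniotest
    namespace: minio
    chart: stable/minio
    version: 5.0.26
    values:
      - # Storage Account Name maps to the S3 Access Key
        accessKey: miniostorage
        # Storage Account Key maps to the S3 Secret Key;
        # requiredEnv fails the deploy if the env var is unset
        secretKey: {{ requiredEnv "AZURE_STORAGE_KEY" | quote }}
        # run MinIO as a gateway in front of Azure Blob Storage
        azuregateway:
          enabled: true
        # create a default bucket (i.e. an Azure Storage container)
        defaultBucket:
          enabled: true
          name: default
```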
To get started, create the minio namespace:
kubectl \
--context mycluster-admin \
create ns minio
Then create the helmfile for the basic setup, let’s call it minio.yaml:
This helmfile showcases the requiredEnv template directive, which marks AZURE_STORAGE_KEY as required, as seen in this error:
$ helmfile --file minio.yaml apply
in ./minio.yaml: error during minio.yaml.part.0 parsing: template: stringTemplate:12:18: executing "stringTemplate" at <requiredEnv "AZURE_STORAGE_KEY">: error calling requiredEnv: required env var `AZURE_STORAGE_KEY` is not set
Let’s retrieve AZURE_STORAGE_KEY
using Azure CLI:
# test that the key is available
az storage account keys list -n miniostorage -o table

# set the storage key
export AZURE_STORAGE_KEY=$(az storage account keys list -n miniostorage --query '[0].value' -o tsv)
Then, run helmfile again:
helmfile --file minio.yaml apply
This will install a number of resources in your cluster inside the minio namespace. It may take several minutes for the installation to finish, so please be patient. Once done, Helm will print instructions in the terminal on how to get started. I would suggest trying out these instructions and putting the output somewhere safe for future reference. For example, this article will not show how to use the MinIO client (mc).
Accessing the MinIO UI
Once the installation has finished, open a port-forward to the UI on localhost:9000:
kubectl \
--context mycluster-admin \
--namespace minio \
port-forward svc/miniotest 9000
The website will ask you to log in:
Provide the Storage Account Name as the Access Key, and the Storage Account Key as the Secret Key.
Once logged in, you’ll see that the default bucket (aptly named default) that was specified in the Helm values has been created for us. This bucket is actually an Azure Storage container under the hood. The UI can be used to view buckets and their contents, manage policies, and upload files:
Let’s put the S3 API to the test by accessing the Azure Storage Account via AWS CLI.
Putting a file into Azure Blob Storage using AWS CLI
To access the Storage Account, provide the storage account name as the AWS_ACCESS_KEY_ID and the storage account key as the AWS_SECRET_ACCESS_KEY.
Bring up the port-forward again if it was shut down before:
kubectl \
--context mycluster-admin \
--namespace minio \
port-forward svc/miniotest 9000
Open a new terminal. Set the storage account name and key (and verify the output):
export AZURE_STORAGE_ACCOUNT=miniostorage
export AZURE_STORAGE_KEY=$(az storage account keys list -n ${AZURE_STORAGE_ACCOUNT} --query '[0].value' -o tsv)
echo $AZURE_STORAGE_KEY
You may now list the “buckets” (actually, containers) in the storage account by passing localhost as the endpoint to AWS CLI:
aws --endpoint-url http://localhost:9000 s3 ls
Let’s create a temporary test file and put it in the default “bucket”:
tmpfile=$(mktemp)
echo "test" > $tmpfile
aws --endpoint-url http://localhost:9000 s3 cp $tmpfile s3://default/test.txt
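To double-check that the file really landed in Azure Blob Storage (and not in some MinIO-local store), you can list the blobs in the container with Azure CLI. This sketch assumes AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY are still exported from the previous step:

```shell
# list blob names in the "default" container straight from Azure,
# bypassing MinIO entirely — test.txt should show up here
az storage blob list \
  --container-name default \
  --query '[].name' \
  -o tsv
```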
The output can be seen in Storage Explorer:
Cleaning up resources
Remember to clean up the port-forward, either by exiting the terminal or by resuming the job with fg and pressing ctrl+c.
To clean up the installed resources, use helmfile destroy or helm delete:
helmfile --file minio.yaml destroy

# or
helm \
  --kube-context mycluster-admin \
  --namespace minio \
  delete miniotest
Production Setup w/ Public Ingress
Pretty much all guides I see on Medium completely omit this section, and it makes sense. There are many ways to set up a public ingress, and also many things that can go wrong. While getting started with Kubernetes, I remember this being a big pain-point, so I decided to add this section anyway. I hope it helps you in setting up the ingress — the same approach can be used for any other DNS in need of TLS termination.
By default, the deployment does not expose any public endpoints.
Patching the service type to LoadBalancer will expose a public IP that can be used to access the S3 API. However, traffic will remain unencrypted. There are primarily two approaches to providing encryption: either supply the TLS certificate to the installation yourself, or use cert-manager to generate the certificate for you automatically.
Personally, I prefer not to manage certificates myself, so I tap into a working cert-manager + nginx setup. I then add an A record to the DNS zone that points to the public IP used by the nginx service’s LoadBalancer, so that Let’s Encrypt can complete the ACME challenge and give the Kubernetes ingress a certificate.
If what I just wrote makes perfect sense to you, and you already know how to do it, skip ahead to the section called “Deploying the MinIO service with an ingress”.
cert-manager + NGINX
Normally, I would not write a guide for this since it is very error-prone and beyond the scope of this article. However, when getting started with Kubernetes, I recall having issues finding end-to-end examples of exposing a public ingress such as this one, so I decided to put it here anyway.
Setting up cert-manager (Let’s Encrypt) and NGINX can be a struggle. I recommend following the Azure Guide to get started. If you get stuck, below is the helmfile I typically use to deploy NGINX / Let’s Encrypt to new AKS clusters (note that I use the namespace nginx instead of nginx-ingress):
Finding the Azure Public IP for the nginx service
Once nginx is up and running, its reverse proxy will be exposed through a service of type LoadBalancer. This service maps to an Azure Public IP inside the Azure Environment. To find the IP, get services in the nginx namespace and make note of the EXTERNAL-IP:
kubectl \
--context mycluster-admin \
--namespace nginx \
get service
Once you know the EXTERNAL-IP of the service, it is a matter of tracking down the corresponding Azure Public IP. I typically grep for the public IP via Azure CLI; the first column will be the name of the Azure Public IP:
az network public-ip list -o table | grep 203.0.113.10
Pointing a DNS to the Public IP
The next step is to get a hold of a DNS that can point to the public IP address.
A quick and simple solution to get a verifiably owned DNS name is to add a DNS name label to the Public IP resource. This will grant you a Fully Qualified Domain Name (FQDN) that can be used as a hostname in the MinIO ingress later on. I have sometimes used this approach for testing, when there is only one ingress in the cluster that I want to expose to the outside world.
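Adding the DNS name label can be done with Azure CLI. The resource group and Public IP name below are hypothetical placeholders; use the values you found in the previous step:

```shell
# attach a DNS name label to the Public IP, yielding an FQDN like
# miniotest.<region>.cloudapp.azure.com
az network public-ip update \
  --resource-group MC_myresourcegroup_mycluster_westeurope \
  --name kubernetes-a1b2c3 \
  --dns-name miniotest
```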
If you already own a domain name on some other registrar, it is fairly simple to create a new A record there that points towards the Azure Public IP and be done with it. I typically create new subdomains for different tools, e.g. artifacts.argo.my.company.com.
If you control a domain name and would like to manage it in Azure, follow this guide to delegate control to an Azure DNS Zone. The advantage of doing so is that the A record can be of type “alias record”, which allows attachment to the Public IP resource itself, instead of to the IP address of the resource.
Deploying the MinIO service with an ingress
Once your domain name (e.g. miniostorage.my.domain.com) is pointing towards the nginx ingress service, a Kubernetes ingress with the same hostname can be set up for the MinIO installation. Below is an example helmfile:
Once again apply the file:
helmfile --file minio-with-ingress.yaml apply
This will deploy a Kubernetes ingress object. If your cert-manager setup is working, certificate resources will show up in the minio namespace:
kubectl \
--context mycluster-admin \
--namespace minio \
get certificates,certificaterequests,challenges
To debug problems, check kubectl get events and the logs from the nginx ingress controller. If the DNS does not work, try accessing the service directly via the public IP. This will show an Invalid Request error once you enter the website, but at least you will know the ingress is correctly managed by the nginx ingress controller.
There is one important detail in the deployment manifest above: nginx does not support uploads of large files by default and will respond with 413 - Payload Too Large. It is, however, possible to remove this limit with an annotation in the ingress object definition:
nginx.ingress.kubernetes.io/proxy-body-size: "0"
Now that all is said and done, the Azure Storage Account is available via a public URL, encrypted with TLS, and accessible as an S3 API.
Conclusion
MinIO has opened up the door for us Azure developers to use a large set of previously inaccessible tools that rely on S3 as the storage backend.
I really want to stress that MinIO is much more than what’s contained in this article, and that you should check out their docs for more info. For example, I highly recommend trying the MinIO client.
Thank you for reading this article, and make sure to subscribe for more content.