Achieving True High Availability for an 850-Domain SaaS: The Turisapps Migration

The Challenge: Scaling a Stateful, Multi-Tenant SaaS

Turisapps is a robust SaaS platform providing customized web presences and management tools for the tourism and hospitality sector. With over 850 distinct custom domains pointed at their infrastructure, the platform faces a unique set of architectural challenges that standard cloud deployments struggle to handle.

Operating on a legacy cloud setup (DigitalOcean with HashiCorp Nomad), the infrastructure had reached a critical scaling limit. The core challenges were threefold:

1. The Stateful HA Problem

Turisapps runs a legacy stateful application that requires shared read/write access to a specific filesystem directory. Because cloud block storage (like AWS EBS or DigitalOcean Volumes) is fundamentally ReadWriteOnce (RWO), the application was chained to a single server. If that specific node went down, the application went down. Achieving true High Availability (HA) and eliminating this Single Point of Failure (SPOF) was the top priority.

2. Dynamic Multi-Tenant Routing

When a request hits the Turisapps load balancer, it doesn't just go to a generic backend. Customer pages are dynamically rendered by 5 to 7 different template servers. The infrastructure needed to inspect the requested hostname, query the customer's specific configuration, and dynamically route the traffic to the correct template engine, all in milliseconds.

3. The Bot Storm & False Positives

Hosting over 850 independent domains makes you a massive target for web scrapers and bots. Because bots view these as 850 separate websites, the aggregate bot traffic hammering the infrastructure was intense. To protect the legacy cloud servers, strict rate limiting was applied. Unfortunately, this degraded the user experience by occasionally flagging legitimate customer traffic as "abuse" (false positives).

Turisapps needed an enterprise-grade architectural overhaul.

A Long-Term Infrastructure Partnership

This project represents the second major evolution of the Turisapps stack. In 2020, Nubosas led their initial migration from a single bare-metal server to DigitalOcean, modernizing the platform with Docker and later Nomad. As the platform grew to 850+ domains, the requirements for High Availability and cost-efficiency necessitated this latest move to a Sovereign Kubernetes stack.


The Solution: Sovereign Kubernetes & OpenResty

We designed a migration plan to move Turisapps off the restrictive public cloud and onto a highly resilient, bare-metal Kubernetes cluster hosted on Hetzner.

1. Unlocking HA with Distributed Storage

To solve the stateful application constraint, we implemented Rook-Ceph within the new Kubernetes cluster. By pooling raw NVMe drives across multiple physical servers, we provided the legacy application with ReadWriteMany (RWX) persistent volumes. For the first time, the stateful app could be horizontally scaled across multiple worker nodes. The Single Point of Failure was eliminated.
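
To make the access-mode change concrete, the sketch below builds a PersistentVolumeClaim manifest as a Python dictionary and prints it as JSON, a format kubectl accepts alongside YAML. The claim name, the size, and the storage class name `rook-cephfs` are assumptions for illustration only; the actual values used for Turisapps are not stated here.

```python
import json


def build_rwx_claim(name: str, storage_class: str, size: str) -> dict:
    """Return a PersistentVolumeClaim manifest that requests ReadWriteMany access."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany lets pods on several nodes mount the volume read/write.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical name, storage class, and size.
claim = build_rwx_claim("legacy-shared-data", "rook-cephfs", "100Gi")
print(json.dumps(claim, indent=2))
```

The only line that differs from a typical block-storage claim is the access mode; the storage class must point at a backend that actually supports shared mounts.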

2. Intelligent Edge Routing with OpenResty

To handle the complex, dynamic routing requirements for 850+ domains, we deployed an internal OpenResty (Nginx + Lua) gateway.

When a request arrives, OpenResty executes a rapid pre-flight HTTP request to an internal microservice, triggering a lightweight database query. This query determines exactly which customer owns the domain and which template server configuration they require. OpenResty then seamlessly proxies the traffic to the correct backend pod. This decoupled the routing logic from the application logic, allowing both to scale independently.
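
The lookup flow described above can be sketched in Python to show its shape. This is not the production Lua code: the class name, the tenant table, and the backend URL are hypothetical, and the short-lived cache is an added assumption rather than something the deployed gateway is stated to do.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class HostRouter:
    """Resolve a request hostname to a backend URL, caching lookups briefly."""

    def __init__(
        self,
        lookup: Callable[[str], Optional[str]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup  # stands in for the internal microservice call
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Optional[str]]] = {}

    def resolve(self, host_header: str) -> Optional[str]:
        # Normalize the Host header: lowercase and drop any port.
        # IPv6 literals are not handled in this sketch.
        host = host_header.strip().lower().split(":", 1)[0]
        now = self._clock()
        cached = self._cache.get(host)
        if cached is not None and cached[0] > now:
            return cached[1]
        backend = self._lookup(host)
        # Unknown hosts are cached as None too, so repeated misses stay cheap.
        self._cache[host] = (now + self._ttl, backend)
        return backend


# Hypothetical tenant table standing in for the database query.
TENANTS = {"hotel-example.com": "http://template-a.internal:8080"}
calls = []


def fake_lookup(host: str) -> Optional[str]:
    calls.append(host)
    return TENANTS.get(host)


router = HostRouter(fake_lookup)
print(router.resolve("Hotel-Example.com:443"))
print(router.resolve("hotel-example.com"))
print(len(calls))
```

Keeping the lookup behind a single function is what lets the routing layer change or scale without touching the template servers.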

3. Automating the TLS Lifecycle for 850+ Domains

In the previous legacy environment, managing SSL/TLS certificates for over 850 domains relied on a fragile combination of home-grown scripts and certbot. This created a "black box" where certificate failures were often only discovered after a domain's certificate had expired.

By moving to Kubernetes, we implemented cert-manager. We replaced the manual scripts with a fully automated lifecycle that handles issuance and renewals via Let's Encrypt. More importantly, we integrated this with our observability stack. We now have real-time dashboards showing the renewal state of every single domain, and automated alerts fire well before a certificate expires if its renewal has failed. We traded manual anxiety for automated visibility.
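
The alerting rule can be illustrated with a small Python function that classifies a certificate by the time left before it expires. This is a sketch of the idea, not the actual dashboard or alert configuration: the function name, the domain names, and the 21-day warning threshold are assumptions.

```python
from datetime import datetime, timedelta, timezone


def renewal_state(not_after: datetime, now: datetime, warn_days: int = 21) -> str:
    """Classify a certificate as 'ok', 'warning', or 'expired' by its expiry time."""
    if not_after <= now:
        return "expired"
    # A certificate this close to expiry should already have been renewed,
    # so reaching this branch suggests that renewal is failing.
    if not_after - now <= timedelta(days=warn_days):
        return "warning"
    return "ok"


# Hypothetical domains and expiry dates, relative to a fixed reference time.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
certs = {
    "a.example": now + timedelta(days=60),
    "b.example": now + timedelta(days=10),
    "c.example": now - timedelta(days=1),
}
states = {domain: renewal_state(expiry, now) for domain, expiry in certs.items()}
print(states)
```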

4. Brute Force Performance vs. Rate Limits

By moving to a Sovereign bare-metal architecture, we multiplied the raw compute power available to the platform.

Instead of relying on aggressive rate limits that were blocking legitimate users, the new bare-metal worker nodes easily absorbed the heavy traffic spikes. We dropped the strict rate limits for regular web requests, which eliminated the false-positive abuse detections for real users. To keep the scrapers at bay, we implemented targeted User-Agent blocking directly at the K8s Ingress layer, while we evaluate a comprehensive WAF (Web Application Firewall) for future iterations.
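
As an illustration of targeted User-Agent blocking, the Python sketch below matches a request's User-Agent header against a small deny list. The real rules live in the Ingress configuration and are not reproduced here; the bot names in the pattern are hypothetical.

```python
import re
from typing import Optional

# Hypothetical scraper names; a real list would come from observed traffic.
BLOCKED_AGENTS = re.compile(r"(examplebot|badscraper)", re.IGNORECASE)


def is_blocked(user_agent: Optional[str]) -> bool:
    """Return True when the User-Agent matches a blocked scraper pattern."""
    if not user_agent:
        # Missing headers are allowed through in this sketch.
        return False
    return BLOCKED_AGENTS.search(user_agent) is not None


print(is_blocked("Mozilla/5.0 (compatible; ExampleBot/2.1)"))
print(is_blocked("Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"))
```

User-Agent strings are supplied by the client and are easy to spoof, which is one reason this approach is a stopgap while a WAF is evaluated.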


The Execution: The Database Cutover

Migrating a heavily utilized, stateful SaaS requires meticulous planning. The most critical piece of the puzzle was the database.

We executed a cross-provider migration strategy:

  1. We provisioned the new infrastructure on Hetzner and established a secure tunnel to the old DigitalOcean environment.
  2. We created a MySQL read-only replica in the new Hetzner cluster, continuously syncing data from the live DO database.
  3. On the day of the cutover, we halted application writes, promoted the Hetzner replica to the primary Write database, and demoted the old DO database to read-only (disabling replication).
  4. Finally, we updated the DNS records for the 850+ domains to point to the new Hetzner Load Balancer.
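
Step 3 depends on the replica being fully caught up before promotion. A minimal pre-promotion check might look like the Python sketch below. It assumes the status row comes from `SHOW REPLICA STATUS` on MySQL 8.0.22 or later (older versions name these fields `Slave_IO_Running`, `Slave_SQL_Running`, and `Seconds_Behind_Master`); the MySQL version in use and the function name are assumptions.

```python
from typing import Mapping, Optional


def ready_to_promote(status: Mapping[str, Optional[object]]) -> bool:
    """Return True if both replication threads run and the replica reports zero lag."""
    io_ok = status.get("Replica_IO_Running") == "Yes"
    sql_ok = status.get("Replica_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Source")
    # MySQL reports NULL (None here) when lag cannot be determined.
    return io_ok and sql_ok and lag is not None and int(lag) == 0


# Hypothetical status rows, as a database client might return them.
caught_up = {
    "Replica_IO_Running": "Yes",
    "Replica_SQL_Running": "Yes",
    "Seconds_Behind_Source": 0,
}
lagging = dict(caught_up, Seconds_Behind_Source=5)
print(ready_to_promote(caught_up))
print(ready_to_promote(lagging))
```

Running the check only after application writes have been halted matters: zero lag on a database that is still taking writes can change a moment later.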

While we aimed for zero downtime, the realities of DNS propagation and legacy application state resulted in a tightly controlled 30-minute maintenance window. Transparency and precise execution ensured the client and their end-users were fully informed and minimally impacted.


The Business Outcome

The transition to a Sovereign Kubernetes stack transformed the operational reality for the Turisapps engineering team.

  • Zero-Downtime Deployments: With the application's shared state on Ceph RWX volumes, pods are no longer pinned to a single node, so the engineering team can perform rolling updates across a highly available deployment. Deploying new code no longer requires maintenance windows or momentary outages.
  • Frictionless User Experience: By leveraging bare-metal performance to drop aggressive rate limits, customer complaints regarding false-positive blocks dropped to zero.
  • Scalable Foundation: The OpenResty routing layer effortlessly handles the 850+ domains, providing a robust foundation to onboard thousands more without architectural bottlenecks.
  • Automated Trust: cert-manager replaced manual certificate management, providing a fully automated TLS lifecycle and real-time observability for over 850 domains.

Turisapps no longer worries about server failures bringing down their customers' websites. They stopped renting their reliability and started owning it.


Are you facing scaling challenges with a legacy SaaS architecture? Contact Nubosas today to book a Cloud Waste Audit and discover how a Sovereign Kubernetes stack can secure your infrastructure.