Thunk.AI offers versatile deployment options that cater to a wide range of needs, spanning public cloud providers and private cloud environments, and enabling scalability across any permitted region. This flexibility ensures that organizations can deploy Thunk.AI in a manner that aligns with their specific operational requirements and compliance mandates.
The concept of cloud "tenants"
A cloud infrastructure provider (like AWS, Azure, GCP) treats each customer as a tenant (someone who rents a set of server infrastructure). The server infrastructure, data, and network traffic of one tenant are isolated from the infrastructure of another tenant. This isolation is accomplished by a combination of hardware and software automatically provided by the cloud hosting provider. As a customer of cloud hosting services, you have a greater degree of access control over services that are hosted within your tenant.
On the other hand, when you use an external service (e.g., you decide to call the Google Maps API), it does not run in your tenant. Such cloud-hosted services serve many customers at the same time. They are called “multi-tenant” services. This may raise the concern that your API call data might be commingled with data from other customers, a common concern with public SaaS services. However, a well-implemented multi-tenant service ensures that traffic from different customers remains isolated, and it provides contractual guarantees of that isolation. As a customer of a multi-tenant service, you rely on these contractual guarantees to maintain the isolation of your data.
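The isolation property described above can be illustrated with a minimal sketch. This is a conceptual example only, not Thunk.AI's actual implementation: every read and write in a multi-tenant store is qualified by a tenant ID, so one tenant's queries can never see another tenant's data.

```python
from collections import defaultdict

class MultiTenantStore:
    """Conceptual multi-tenant key/value store with per-tenant isolation."""

    def __init__(self):
        # One isolated namespace per tenant; tenants never share keys.
        self._data = defaultdict(dict)

    def put(self, tenant_id: str, key: str, value: str) -> None:
        self._data[tenant_id][key] = value

    def get(self, tenant_id: str, key: str):
        # Lookups are always scoped by tenant_id, never global.
        return self._data[tenant_id].get(key)

store = MultiTenantStore()
store.put("tenant-a", "api_key", "secret-a")
store.put("tenant-b", "api_key", "secret-b")
```

Even though both tenants use the same key name, each lookup resolves only within that tenant's namespace.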
Thunk.AI deployment options
There are three types of deployments available:
Public multi-tenant cloud service run by the Thunk.AI team - this is the shared multi-tenant cloud-hosted SaaS version of Thunk.AI. This is the ideal option for small and medium-sized businesses. It is straightforward to use with no initial infrastructure provisioning. This public instance of the Thunk.AI SaaS service (accessible at https://app.thunk.ai) is currently hosted in a tenant owned by the Thunk.AI team on the Google Cloud Platform (GCP) public cloud.
Single-tenant cloud service run by the Thunk.AI team - this is a single-tenant cloud-hosted SaaS version of Thunk.AI. Each customer gets their own unique instance, which is still in the cloud but runs in a dedicated tenant managed by the Thunk.AI team. This approach does involve initial provisioning and has higher ongoing maintenance costs. This is the ideal option for medium-sized and enterprise businesses that are comfortable with the trade-offs of utilizing cloud-hosted SaaS services but want the benefits of service tenant isolation.
"On-premise" private instance run by the customer - this is a private instance of the Thunk.AI service installed and hosted within a cloud tenant owned by the customer. This cloud tenant could be hosted on AWS, Azure, GCP, or an on-premise data center that supports modern Kubernetes-based deployments. All services are managed by the customer's IT team potentially with collaboration from the Thunk.AI team. This is the option suitable for customers who want absolutely no data to leave their network boundaries, but it has the highest costs of provisioning and maintenance. This is particularly beneficial for industries with stringent data privacy and security regulations.
Choice of AI models
In all three types of deployments, the AI LLM models used can be any of the well-known public models hosted by large AI providers (GPT hosted by OpenAI or Azure, Claude hosted by Anthropic or AWS, Gemini hosted by Google).
Customers using the on-premise private instance can choose to utilize an AI model hosted within their cloud tenant (for example, a customer deploying within an AWS tenant could choose to use the Claude models hosted in AWS Bedrock).
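As a rough sketch of what invoking a Bedrock-hosted Claude model involves, the example below builds a request body in the Anthropic messages format used on AWS Bedrock. The actual model ID and invocation details are assumptions that depend on the customer's AWS account and region; the network call itself is shown only as a comment.

```python
import json

def build_claude_request(prompt: str, max_tokens: int = 512) -> str:
    """Build a Bedrock request body for a Claude model (Anthropic messages format)."""
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)

request_body = build_claude_request("Summarize this workflow step.")

# The actual invocation would run inside the customer's AWS tenant, e.g.:
#   client = boto3.client("bedrock-runtime")
#   response = client.invoke_model(modelId="<claude-model-id>", body=request_body)
```

Because the model runs in AWS Bedrock within the customer's tenant, the prompt and response never leave the customer's cloud boundary.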
If utilizing a cloud-hosted LLM (like OpenAI's GPT), we recommend that enterprise customers set up independent contracts with the LLM providers and register their own LLM API keys with the Thunk.AI platform. For small and medium-sized businesses using the public multi-tenant version of the Thunk.AI service, a simpler option is to utilize the default contract that Thunk.AI has with these LLM providers. In effect, your workload utilizes the LLMs via our contract with the LLM provider, using our LLM API keys.
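The key-selection policy just described can be sketched as a simple resolution rule: prefer a customer-registered key, and fall back to the platform's default-contract key otherwise. The function and variable names here are illustrative assumptions, not the platform's actual API.

```python
# Illustrative platform-side default keys (placeholder values).
PLATFORM_DEFAULT_KEYS = {"openai": "platform-openai-key"}

def resolve_llm_api_key(provider: str, customer_keys: dict) -> str:
    """Prefer a customer-registered key; otherwise fall back to the platform default."""
    if provider in customer_keys:
        return customer_keys[provider]
    if provider in PLATFORM_DEFAULT_KEYS:
        return PLATFORM_DEFAULT_KEYS[provider]
    raise KeyError(f"No API key configured for provider: {provider}")
```

An enterprise that has registered its own key uses it; a small business with no registered key transparently falls back to the platform's contract.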
Deployment scalability
The Thunk.AI service is designed for elastic and dynamic scale. In other words, the service can deploy new server instances to handle larger workloads, and reclaim them when the workload no longer requires it.
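The scale-out/scale-in behavior described above can be sketched as a replica-count decision: size the deployment to the pending workload, bounded by a floor and a ceiling. The thresholds below are illustrative assumptions, not the service's actual tuning parameters.

```python
import math

def desired_replicas(pending_tasks: int, tasks_per_instance: int = 50,
                     min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Pick a server instance count proportional to workload, within bounds."""
    needed = math.ceil(pending_tasks / tasks_per_instance)
    return max(min_replicas, min(max_replicas, needed))

desired_replicas(0)       # idle: shrink to the floor -> 1
desired_replicas(500)     # busy: scale out proportionally -> 10
desired_replicas(10_000)  # very busy: capped at the ceiling -> 20
```

In a Kubernetes-based deployment this kind of policy is typically delegated to a horizontal autoscaler rather than hand-written, but the underlying decision is the same.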
The ability to deploy at scale ensures that Thunk.AI can accommodate the needs of large enterprises. This scalability is crucial for organizations that anticipate growth or have fluctuating workloads, as it allows them to adjust resources dynamically without compromising performance or efficiency.
This article describes the technical limits (or lack thereof) to the scalability of the service. However, depending on your specific subscription level, throttles may be imposed on scalability and throughput as described in your subscription contract.
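Subscription-level throttling of the kind mentioned above is commonly implemented with a token bucket: requests consume tokens that refill at the subscribed rate. This is a generic sketch with illustrative rates, not a description of Thunk.AI's actual throttling mechanism.

```python
class TokenBucket:
    """Minimal token-bucket rate limiter (time passed in explicitly for clarity)."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self.rate = rate_per_sec      # refill rate, tokens per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, then spend one token if available.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate_per_sec=2, capacity=2)
```

A bucket with capacity 2 allows a burst of two requests, rejects a third immediate request, and permits more once tokens have refilled.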
Infrastructure schematic for the public multi-tenant instance
Here is an infrastructure schematic showing the current configuration of services/micro-services for the public multi-tenant instance of Thunk.AI. Each individual service/micro-service may run as a scalable number of physical instances as appropriate.
There are two main “units” of services to consider:
Perimeter services: these are shown in the top-right box and are hosted in Cloudflare. They primarily handle interaction with the web and with email.
Core services: this is a collection of services/servers that communicate with each other and collectively provide the core functionality of the product.
Some of these services (the purple box in the center) are entirely based on proprietary code created by the Thunk.AI team and represent the unique application model, orchestration, and AI agent governance needed for reliable AI agentic workflow execution.
Some of the other services related to storage and messaging use well-known server products (a database, a memory cache, a message queue).
Each of these services is run on one or more cloud server instances (i.e. server machines hosted by the cloud hosting platform).
In order to deploy and manage this set of services in a coherent and reliable manner, the server infrastructure is defined by industry-standard Kubernetes-based deployment artifacts (the “thunk-images” and “docker artifacts”) at the left of the schematic diagram.
In addition to these main units, the multi-tenant version of the service uses some third-party cloud services for diagnostics and monitoring. These can be disabled or replaced in a single-tenant version.
Deployment in a private instance
The single-tenant private instance of the Thunk.AI service differs from the public multi-tenant version in just a few ways. Here, we describe one specific case where the private instance is installed in the customer's existing AWS tenant. Similar descriptions apply to customer tenants in the other cloud providers or in a custom on-premise data center.
Instead of being hosted on GCP, it is hosted on AWS in a standalone AWS tenant provided by the customer.
All Google Cloud services are replaced with equivalent AWS services, as described below:
GKE (Google Kubernetes Engine) -> EKS (Elastic Kubernetes Service)
Artifact Registry -> ECR (Elastic Container Registry)
Google Cloud Storage -> Amazon S3 (Simple Storage Service)
RabbitMQ -> Amazon MQ (RabbitMQ compatible)
PostgreSQL Compatible DB -> Amazon RDS (PostgreSQL)
Redis Compatible KV Store -> Amazon ElastiCache (Redis)
Load Balancer -> Elastic Load Balancing (ELB)
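The service mapping above can also be expressed as a lookup table, for instance when a deployment script needs to select provider-specific services by target cloud. This is an illustrative sketch; the names are the service names from the list above, not actual configuration identifiers.

```python
# GCP service -> AWS equivalent, from the mapping above.
GCP_TO_AWS = {
    "GKE": "EKS",
    "Artifact Registry": "ECR",
    "Google Cloud Storage": "Amazon S3",
    "RabbitMQ": "Amazon MQ",
    "PostgreSQL Compatible DB": "Amazon RDS (PostgreSQL)",
    "Redis Compatible KV Store": "Amazon ElastiCache (Redis)",
    "Load Balancer": "Elastic Load Balancing",
}

def aws_equivalent(gcp_service: str) -> str:
    """Return the AWS service substituted for a given GCP service."""
    return GCP_TO_AWS[gcp_service]
```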
The OpenAI GPT-4o LLM is replaced by the Claude Sonnet LLM hosted on AWS Bedrock.
The perimeter services originally hosted in Cloudflare are moved to AWS equivalents.
Third-party monitoring services are removed or replaced by AWS-internal monitoring services.
The end-user authentication models allowed by the service are configured to whatever is allowed and appropriate in the customer's environment.
Other than these changes, the AWS private instance version of the Thunk.AI platform is essentially identical to the public multi-tenant version. While the public version is redeployed with the latest updates daily, the private instance may deploy the latest updates less frequently.