I am excited to announce that Nile is live today! Nile is a Postgres platform that decouples storage from compute, virtualizes tenants, and supports vertical and horizontal scaling globally, helping you ship AI-native B2B applications fast and safely with limitless scale. Nile is in public preview, and we look forward to all the feedback as we move towards GA. Sign up to Nile and get started today! Try our quickstarts or use one of our many AI examples or use cases.
Nile’s goal is simple – to be the best platform to build and scale multi-tenant AI applications.
We are taking a novel approach to building databases that has not been attempted before. We are virtualizing the concept of a customer (or tenant) into Postgres, making Postgres and its ecosystem ideal for building B2B AI companies. We have been working on this for the past eighteen months, and we think it is the fastest and safest way to build AI-native B2B applications and scale them globally.
Modern relational databases have existed for several decades. They have undergone many improvements, becoming significantly faster and capable of supporting more relational features. However, the gap between the application and the database has continued to grow. A significant number of problems that B2B developers tackle today should be solved within the database. A general-purpose DB will always be designed to solve only the lowest common denominator, pushing many problems unique to B2B up to the application layer.
Multi-tenant architectures – can you have your cake and eat it too?
The foundation of B2B applications is a multi-tenant architecture, where each tenant typically represents a customer, workspace, organization, etc. The main challenge is how to store and query many customers/tenants in the database. Various patterns have been proposed, but at a high level, there are two main approaches.
At one extreme, you can use one database per tenant. This approach has several benefits but also comes with many pain points.
Benefits
- Full isolation across customers, making it very difficult to leak data.
- Backups, performance insights, and read replicas are all specific to each customer, which is a huge benefit for low MTTR per customer.
Drawbacks
- Significant cost: each database is typically deployed into its own virtual machine and has a minimum size. While you could bring databases down when tenants are idle to save costs, this does not work well due to cold start times, and many tenants have low utilization (as opposed to no utilization).
- Schema migration across tenants is hard, especially when adding tenants in a self-serve manner. Deploying a new database and applying the schema in a self-serve product becomes complex.
- No ability to query across tenants. There is always a need to build internal customer dashboards and query across tenants for analytical purposes.
- The application has to be designed to work with multiple databases, custom routing, and multiple connection pools.
- Lots of tooling is required for fleet upgrades, metrics aggregation, alerting, and data ingestion from back-office tools into the DB.
The other extreme is to have one database shared across all tenants.
Benefits
- Simplicity. The application is designed to talk to one database, and all the operational overhead of a database per tenant disappears.
- Lower cost, since multiple tenants share the same DB. You pay for one DB’s provisioned capacity.
- Pretty much all the drawbacks of the per-tenant DB model become advantages here.
Drawbacks
- Restricting access to a tenant’s data is not trivial. You either need to enforce complex row-level security in the DB, which is hard to debug, or apply authorization at the application layer, which can easily leak.
- Noisy neighbor problems arise between tenants as some tenants become more active.
- No per-tenant query insights, backups, or read replicas, which are super useful for reducing MTTR.
Most companies end up having to support a hybrid approach. Companies start with one of the two approaches and, if successful, grow into the other. The reason is that as companies grow, they typically have to place a subset of customers in their own database or in a particular region for compliance, isolation, or latency reasons. The question is: can you get all the benefits of both architecture models without any of the downsides, right from the start? Would it be possible to do this 10x cheaper, with the experience of a single database?
Building RAG apps with customer data
AI is fundamentally changing how applications are built and consumed. One of the core architectural patterns for AI apps is Retrieval Augmented Generation (RAG). This approach involves:
- Calculating vector embeddings for your input dataset
- Using the user’s prompt to search for relevant embeddings
- Finding the relevant records or chunks within the records
- Feeding these records along with the prompt to the Large Language Model (LLM) to get the most accurate answer
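The steps above can be sketched end to end. This is an illustrative toy: `embed` here is plain word overlap standing in for a real embedding model, and the final LLM call is left out.

```python
# Minimal RAG retrieval sketch. Word-set overlap is a stand-in for real
# vector-embedding similarity; a production system would call an
# embedding model and compare vectors with cosine similarity.
import re

def embed(text: str) -> set[str]:
    # Toy "embedding": the set of lowercase words in the text.
    return set(re.findall(r"[a-z]+", text.lower()))

def similarity(a: set[str], b: set[str]) -> float:
    # Jaccard overlap between two word sets.
    return len(a & b) / max(len(a | b), 1)

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(chunks, key=lambda c: similarity(q, embed(c)), reverse=True)[:k]

chunks = [
    "Invoices are emailed on the first of each month.",
    "Refunds are processed within five business days.",
    "Our offices are closed on public holidays.",
]
context = retrieve("How long do refunds take?", chunks, k=1)
prompt = ("Answer using only this context:\n" + "\n".join(context)
          + "\n\nQuestion: How long do refunds take?")
```

The assembled `prompt` would then be sent to the LLM as the final step.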
RAG is becoming a critical part of AI infrastructure. The vector embeddings computed from customer data need to be stored, queried, and scaled to millions. AI has dramatically accelerated the speed of adoption—what once took 2–3 years to achieve scale is now possible in just 6 months. This explosion of AI adoption and the challenge of managing vector embeddings for RAG raise many problems.
Separate databases for vector embeddings and customer data
In recent years, many vector databases have emerged. This trend splits customers’ core data and metadata from their embeddings, forcing companies to manage multiple databases. Such separation increases costs, significantly complicates application development and operation, and leads to inefficient resource utilization between vector embeddings and customer metadata. Moreover, keeping these databases synchronized with customer changes adds yet another layer of complexity.
Lack of isolation for customer workloads
AI workloads require significantly more memory and compute than traditional SaaS workloads. Customer adoption and growth are much faster with AI, though some of this can be attributed to a hype cycle. Moreover, rebuilding indexes for embeddings requires additional resources and may impact production workloads. The ability to isolate customer data and their AI workloads has a significant impact on the customer’s experience. Isolation is a key customer requirement (no one wants their data mixed with anyone else’s) and is also critical to performance – 3 million embeddings is very large, but 1,000 tenants with 3,000 embeddings each is very manageable – you get lower latency and close to 100% recall.
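To make the search-space point concrete, here is a toy sketch (pure Python, one-dimensional "embeddings" for clarity, all names hypothetical) in which a query scans only the requesting tenant’s rows rather than the whole store:

```python
# Toy illustration of per-tenant isolation: a nearest-neighbor query only
# scans the requesting tenant's embeddings, not the global collection.
from collections import defaultdict

# tenant id -> list of (doc id, 1-D "embedding") pairs
store: dict[str, list[tuple[str, float]]] = defaultdict(list)

def insert(tenant_id: str, doc_id: str, value: float) -> None:
    store[tenant_id].append((doc_id, value))

def nearest(tenant_id: str, query: float) -> str:
    # Brute-force scan, but only over this one tenant's rows.
    return min(store[tenant_id], key=lambda item: abs(item[1] - query))[0]

# 3 tenants with 3 "embeddings" each: 9 rows total in the store.
for t in ("t1", "t2", "t3"):
    for i in range(3):
        insert(t, f"{t}-doc{i}", float(i))

scanned = len(store["t1"])   # a t1 query scans 3 rows, not all 9
best = nearest("t1", 1.9)
```

The same shrinkage applies at real scale: a per-tenant index over thousands of embeddings can afford exact search and full recall, where one global index over millions cannot.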
Scaling to billions of embeddings across customers
AI workloads scale to 50–100 million embeddings and, in some cases, even a billion embeddings. The biggest unlock with AI is the ability to search through unstructured data. All the data in different PDFs, images, and wikis is now searchable. In addition, this unstructured data needs to be chunked for better contextual search. The explosion of vector embeddings requires a scalable database that can store billions of embeddings at a really low cost.
Connecting all the customer’s data to the OLTP
90% of AI use cases involve extracting data from customers’ various SaaS services, making it accessible to LLMs, and letting users write prompts against this data. For instance, Glean, an AI-first company, aggregates data from issue trackers, wikis, and Salesforce, making it searchable in one central location using LLMs. Glean must offer a streamlined process for each customer to extract data from their SaaS APIs and move it to Glean’s database. This data needs to be stored and managed on a per-customer basis, and vector embeddings must be computed during data ingestion. In the AI era, ETL pipelines from SaaS services to OLTP databases need to be reimagined for each customer.
Cost of computing, storing and querying customer vector embeddings
The sheer scale of vector embeddings and their associated workloads significantly increases the cost of managing AI infrastructure. The primary expenses stem from compute and storage, which typically scale with customer activity. Ideally, you’d want to pay only for the exact compute resources a customer uses. Similarly, you’d like cheaper storage options when embeddings aren’t being accessed. By implementing per-customer cost management for these workloads, it should be possible to reduce expenses by 10 to 20 times.
Agents will build more B2B apps than humans in the next decade
Historically, the rate of application development has been constrained by human capabilities. Increasing productivity typically meant hiring more developers or adopting tools that offered modest efficiency gains of 5–10%. However, the advent of AI and the emergence of intelligent agents are set to revolutionize this landscape. We are on the brink of witnessing a surge in B2B app creation. These agent-built applications will need to pick their technology stack, with databases playing a pivotal role in that choice.
The main obstacle to an agent world with millions of apps is database access. Currently, companies are limited in the number of databases they can have due to cost and the compute resources available from cloud providers. To fundamentally change this, we need the ability to create unlimited free databases that can fully use the available resources. This is hard to build. It requires a true multi-tenant SQL database, rather than provisioning dedicated compute for each database or relying solely on containers or VMs. An ideal database should move seamlessly between deployments: new databases and tenants should start on serverless compute and gradually transition to provisioned compute as they grow. This approach enables a world where agents can experiment with many new applications in parallel on behalf of their users, increasing the chances of success.
Customer – the atomic unit of a business
Nile aims to rebuild Postgres from the ground up, focusing specifically on B2B AI applications. The customer or tenant is a core building block of B2B companies—everything revolves around them. It’s logical, then, that a data platform built for B2B companies should have the tenant as a native primitive in the database. By defining a customer in the database and extending this concept through to the storage layer, we can address the previously discussed problems at a fundamental level, potentially reducing costs by 10–20x. This approach also simplifies customer workflows that connect to other systems. For instance, when a premium-tier customer onboards, creating a new customer in the support platform becomes straightforward, as the database can integrate with external APIs. This removes the risk of losing customer activity due to failed transactions with third-party services.
Not all customers are created equal. In a typical B2B company, most customers start small and grow over time, while some become power users from the outset. Customer distribution across pricing tiers often follows a pattern: 50% idle, 30% medium usage, and 20% power users. This distribution can vary based on the company’s type and target market.
The revenue each customer brings in differs significantly. Ideally, the cost to serve a customer should mirror this variation. A B2B-focused data platform should accommodate these differences. It should incur no costs for idle customers, charge based on utilization for medium-usage customers, and provide dedicated compute for power users.
This ability to manage infrastructure costs for different customer groups enables companies to run more efficiently. By aligning costs with customer value, businesses can optimize their resources and improve their bottom line.
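As a back-of-envelope illustration of the tiered-cost idea (all prices and the 50/30/20 split here are hypothetical numbers, not Nile pricing):

```python
# Compare paying for dedicated capacity per tenant vs. paying only for
# each tenant's actual activity. All numbers are made up for illustration.

def provisioned_cost(n_tenants: int, price_per_tenant_slot: float) -> float:
    # Dedicated capacity is paid for whether or not a tenant is active.
    return n_tenants * price_per_tenant_slot

def usage_based_cost(active_hours: list[float], price_per_hour: float) -> float:
    # Idle tenants (0 hours) cost nothing.
    return sum(h * price_per_hour for h in active_hours)

# 100 tenants: 50 idle, 30 lightly active (10 h/mo), 20 heavy (100 h/mo).
hours = [0.0] * 50 + [10.0] * 30 + [100.0] * 20

dedicated = provisioned_cost(100, 50.0)      # $50/slot/month
serverless = usage_based_cost(hours, 0.10)   # $0.10/active hour
savings = dedicated / serverless
```

With these made-up numbers, usage-based billing comes out roughly 20x cheaper, purely because half the fleet is idle and pays nothing.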
Nile’s architecture
Nile has built Postgres from the ground up with tenants/customers as a core building block. The key highlights of our design are:
Decoupled storage and compute. The compute layer is essentially Postgres, modified to store each tenant’s data in separate pages. The storage layer consists of a fleet of machines that house these pages. An external machine stores the log, and both the log and pages are archived in S3 for long-term storage.
Tenant-aware Postgres pages. A standard Postgres database comprises objects like tables and indexes, represented as 8KB pages. In Nile, tables are either tenant-specific or shared. Each page of a tenant table belongs exclusively to one tenant, with all records within a page associated with that tenant. This combination of decoupled storage and tenant-dedicated pages allows instantaneous tenant migration between Postgres compute instances: moving a tenant simply transfers ownership from one compute instance to another while keeping references to the same pages in the storage layer.
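A toy model of the ownership-transfer idea (all names and structures here are illustrative, not Nile’s actual implementation):

```python
# Toy model of decoupled storage: pages live in one shared storage layer,
# compute instances only hold tenant ownership, so "moving" a tenant is a
# pointer update and copies no data.

storage_layer = {          # shared across all compute instances
    "page-1": b"tenant-a rows",
    "page-2": b"tenant-a rows",
    "page-3": b"tenant-b rows",
}
tenant_pages = {"tenant-a": ["page-1", "page-2"], "tenant-b": ["page-3"]}
tenant_owner = {"tenant-a": "compute-1", "tenant-b": "compute-1"}

def migrate(tenant: str, target_compute: str) -> None:
    # Transfer ownership only; pages stay where they are in storage.
    tenant_owner[tenant] = target_compute

def read(tenant: str) -> list[bytes]:
    # Any compute instance resolves a tenant's pages via the same layer.
    return [storage_layer[p] for p in tenant_pages[tenant]]

pages_before = dict(storage_layer)
migrate("tenant-a", "compute-2")
```

After the migration, `storage_layer` is byte-for-byte unchanged; only the ownership map differs, which is why such a move can be near-instant.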
Support for both serverless and provisioned compute. The Postgres compute layer offers two types of compute. Serverless compute is built with true multitenancy, while provisioned compute is dedicated to a single Nile customer. Nile users can have any number of provisioned compute instances in the same database, and tenants can be placed on either serverless or provisioned compute.
Distributed querying across tenants and a central schema store. The distributed query layer runs across tenants and spans both serverless and provisioned compute instances. A central schema store uses distributed transactions to apply schemas to every tenant during DDL execution. This ensures correct schema application and enables schema recovery for tenants during failures.
A global gateway for tenant routing, inter-region communication, and connection pooling. The gateway uses the Postgres protocol to route requests to different tenants. It can communicate with gateways in other regions and serves as a connection pooling layer, eliminating the need for a separate pooler.
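The routing idea can be sketched as follows (the placement table and all names are hypothetical):

```python
# Sketch of gateway routing by tenant: each request carries a tenant id,
# and the gateway forwards it to the compute where that tenant lives,
# defaulting to the shared serverless pool.

placements = {
    "acme": "provisioned-1",   # heavy customer on dedicated compute
    # tenants not listed here run on shared serverless compute
}

def route(tenant_id: str) -> str:
    return placements.get(tenant_id, "serverless-pool")

targets = {t: route(t) for t in ("acme", "new-signup")}
```

Because the lookup happens at the gateway, the application keeps a single connection string even as tenants move between compute types.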
Architecture Benefits
This architecture offers many significant benefits. Nile supports various multitenant configurations within a single database, giving users fine-grained control over costs at the customer level. All tenants can be placed on serverless compute for 10–20x lower costs, or a subset can be assigned to provisioned compute for increased isolation and security. The system seamlessly scales horizontally across tenants and vertically per tenant, providing virtually limitless capacity.
Unlimited databases and virtual tenant databases
In Nile, a database is a logical concept. Our serverless compute lets us offer a truly cost-effective, multi-tenant solution that provisions new databases quickly. This enables Nile to provide unlimited databases, even on free tiers. Serverless compute is ideal for testing, prototyping, and supporting early customers. As customers become more active, you can seamlessly transition them to provisioned compute for increased security or scalability. Nile’s efficiency is extraordinary—a new database is provisioned in under a second. The accompanying video shows a typical Nile database being created and an initial use case being run.
Tenant placement on either serverless or provisioned compute with 10x compute cost savings
Tenants can now be placed on different types of compute within the same database. The serverless compute is extremely cost-effective, proving cheaper than provisioning a standard instance on RDS. Built with true multitenancy, it allows Nile to use resources more efficiently across its users. Meanwhile, highly active customers can be moved to provisioned compute. The best part? The capacity required for this is significantly lower than for an entire database housing all customers.
Support billions of vector embeddings across customers with 10-20x storage savings
The architecture supports vertical scaling per tenant and horizontal scaling across tenants. For vector embeddings, the total index is divided into smaller chunks across multiple machines. Additionally, since the storage is in S3, Nile can move a tenant’s embeddings entirely to S3 without maintaining a local cache. The indexes themselves are smaller, and multiple machines can be leveraged to build indexes in parallel. This approach provides lower latency and close to 100% recall by reducing the search space per customer.
Secure isolation for customers’ data and embeddings
Each tenant in this architecture functions as its own virtual database. Postgres connections understand tenants and can route to a specific one. Tenant data isolation is enforced natively in Postgres, which recognizes tenant boundaries without the need for Row-Level Security (RLS). Furthermore, the architecture allows tenants to be moved instantly between compute instances, so performance isolation between tenants can be achieved by relocating them to compute instances with more capacity, without any downtime.
Branching, backups, query insights and read replicas by tenant/customer
Since Postgres understands tenant boundaries, we can now maintain one database for all tenants while performing database operations at the tenant level. This allows us to reproduce customer issues by simply branching the specific customer’s data and replaying their workload. If a customer accidentally deletes their data, backups can be restored instantly. We can create read replicas only for customers with higher workloads, saving both compute and storage resources. Moreover, we can now debug performance for specific tenants or customers, eliminating the need to treat the database as a black box.
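A toy sketch of what per-tenant restore makes possible (illustrative data structures only, not Nile’s API):

```python
# Because rows are grouped by tenant, one customer's data can be
# snapshotted and restored without touching any other tenant.

db = {
    "tenant-a": {"todo-1": "ship v1", "todo-2": "write docs"},
    "tenant-b": {"todo-9": "renew contract"},
}

def backup(tenant: str) -> dict[str, str]:
    # Snapshot a single tenant's rows.
    return dict(db[tenant])

def restore(tenant: str, snapshot: dict[str, str]) -> None:
    # Replace only that tenant's rows; other tenants are untouched.
    db[tenant] = dict(snapshot)

snap = backup("tenant-a")
db["tenant-a"].clear()        # the customer accidentally deletes everything
restore("tenant-a", snap)
```

The same tenant-scoped primitive underlies branching and per-tenant read replicas: each operation addresses one customer’s rows rather than the whole database.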
We are excited to be in public preview today. Try out Nile and build your AI B2B apps on a Postgres platform purpose-built for them. We are looking forward to all the feedback!