How to Build a SaaS Application Block by Block
When constructing a SaaS application, it’s easy to begin in the wrong place — namely, with architecture. Focusing first on software or architecture seems appealing because everyone is doing it, but trust me, you don’t actually want to start there.
Start with a monolith
Service-oriented architecture is ridiculously complex. The moment you have two components that need to talk to each other, one might stop talking altogether. And of course, the more elements you add, the more complicated things get.
A better approach is to start with a monolith and evolve from there. A monolith is a single, self-contained application rather than a distributed set of services, so its behavior is predictable. That predictability makes it an ideal foundation for SaaS architecture.
Important building blocks
With your monolith in place, evolving the architecture gets easier once you know the common (and mostly language-agnostic) building blocks of SaaS applications. From a structural standpoint, here are the ones that are most important in my mind:
Multi-tenancy
Tenant isolation
Audit logs
Multi-tenancy
The most critical element of SaaS is that it’s multi-tenant. In other words, there is a single installation of software, and many customers (or tenants) operate within that installation. For context, the alternative is single-tenant, a self-hosted version of software that you can install or manage for others.
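To make the "single installation, many tenants" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample tenants are assumptions made up for illustration; the point is only that every row carries the ID of the tenant that owns it.

```python
import sqlite3

# Hypothetical schema: one shared database, every row tagged with its tenant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tenant (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE project (
        id INTEGER PRIMARY KEY,
        tenant_id INTEGER NOT NULL REFERENCES tenant(id),
        name TEXT NOT NULL
    );
""")
conn.executemany("INSERT INTO tenant (id, name) VALUES (?, ?)",
                 [(1, "acme"), (2, "globex")])
conn.executemany("INSERT INTO project (id, tenant_id, name) VALUES (?, ?, ?)",
                 [(1, 1, "website"), (2, 1, "billing"), (3, 2, "mobile")])


def projects_for_tenant(tenant_id):
    """Return the project names that belong to a single tenant."""
    rows = conn.execute(
        "SELECT name FROM project WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return [name for (name,) in rows]
```

Both tenants share the same tables, so every query has to say which tenant it is asking about.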
Tenant isolation
As you can imagine, it’s quite dangerous when an application’s system fails to prevent you (or someone else) from doing things that shouldn’t be done. Tenant isolation — the idea that you can store information that identifies the current tenant — is an essential building block in SaaS architecture because it provides security.
Imagine that an HTTP request comes in. When we receive the request, we can set up a context, which allows the application to figure out its state at any point in time and then decide on its security policies. With all security policies and parameters attached, the context can be stashed into a thread-local so that it’s always available.
As the basis of tenant isolation, the thread-local context data stores information that identifies the current request or current tenant. In other words, tenant isolation provides security by allowing code to be written in a way that avoids accidental disclosure of information from other tenants.
Tenant isolation should look something like this:
from flask import g, request

def get_tenant_from_request():
    # validate_auth and Tenant are application-specific helpers.
    auth = validate_auth(request.headers.get('Authorization'))
    return Tenant.query.get(auth.tenant_id)

def get_current_tenant():
    # Reuse the tenant if it was already bound to this request.
    rv = getattr(g, 'current_tenant', None)
    if rv is None:
        rv = get_tenant_from_request()
        g.current_tenant = rv
    return rv
In Flask, the g object is scoped to the current application context (and therefore to the current HTTP request). At any point, the code can figure out which tenant it is executing for by invoking get_current_tenant. If no tenant has been bound yet, it's auto-discovered from the authorization header in this example.
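Outside of Flask, the same idea can be sketched with a plain thread-local. This is an illustration under assumptions, not part of the code above: the names tenant_context, get_current_tenant_id, and handle_request are hypothetical, and the tenant is passed in directly instead of being read from an authorization header.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for Flask's `g`: one context slot per thread.
_context = threading.local()


@contextmanager
def tenant_context(tenant_id):
    """Bind tenant_id to the current thread for the duration of a request."""
    previous = getattr(_context, "tenant_id", None)
    _context.tenant_id = tenant_id
    try:
        yield
    finally:
        # Restore whatever was bound before, even if the request failed.
        _context.tenant_id = previous


def get_current_tenant_id():
    """Return the tenant bound to this thread, or fail loudly if none is."""
    tenant_id = getattr(_context, "tenant_id", None)
    if tenant_id is None:
        raise RuntimeError("no tenant bound to the current context")
    return tenant_id


def handle_request(tenant_id, results):
    # Each worker thread sees only the tenant it bound itself.
    with tenant_context(tenant_id):
        results[tenant_id] = get_current_tenant_id()
```

Raising when no tenant is bound is a deliberate choice in this sketch: code that runs without a tenant fails immediately instead of silently operating on everyone's data.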
Tenant isolation is also necessary because the most critical security problems in SaaS begin with a failure to scope the operations to the current tenant.
Scoping to the current tenant
In many multi-tenant applications, data from different tenants live side by side in the same database. This means that a user might accidentally be given the opportunity to change resources from other tenants.
Let’s say someone writes a function that batch-updates projects. The function takes a list of project IDs and applies the same changes to every matching project. If someone were to pass in arbitrary project IDs, including IDs of projects that belong to a tenant other than the one the user is supposed to operate in, the app’s security would be bypassed.
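def batch_update_projects(ids, changes):
    # Note: nothing here restricts the query to the current tenant.
    projects = Project.query.filter(
        Project.id.in_(ids) &
        # Parentheses are required: & binds more tightly than != in Python.
        (Project.status != ProjectStatus.INVISIBLE)
    )
    for project in projects:
        update_projects(project, changes)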
The hypothetical developer who wrote the above code didn’t scope operations down to the tenant, leaving the tenant’s data vulnerable.
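For comparison, here is a sketch of what manual scoping looks like, written against Python's built-in sqlite3 module so that it runs on its own. The schema, the batch_rename_projects function, and its parameters are assumptions for illustration. The only difference that matters is the extra tenant_id condition in the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project (id INTEGER PRIMARY KEY, tenant_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO project (id, tenant_id, name) VALUES (?, ?, ?)",
    [(1, 1, "website"), (2, 1, "billing"), (3, 2, "mobile")],
)


def batch_rename_projects(current_tenant_id, ids, new_name):
    """Rename the given projects, but only those owned by the current tenant."""
    # One "?" placeholder per ID keeps the query fully parameterized.
    placeholders = ", ".join("?" for _ in ids)
    cursor = conn.execute(
        f"UPDATE project SET name = ? "
        f"WHERE id IN ({placeholders}) AND tenant_id = ?",
        [new_name, *ids, current_tenant_id],
    )
    return cursor.rowcount  # number of rows actually changed
```

Passing a project ID that belongs to another tenant simply matches nothing. The weakness of this approach is that every query author has to remember the extra condition.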
With auto-tenant scoping, the base layer of the database query system can look up the current tenant at any time and automatically scope queries down to it. Whenever a project is queried, the query will already be restricted to the current tenant, ensuring the security of the tenant’s data.
class TenantQuery(db.Query):
    # Queries are restricted to the current tenant unless explicitly opted out.
    current_tenant_constrained = True

    def tenant_unconstrained_unsafe(self):
        rv = self._clone()
        rv.current_tenant_constrained = False
        return rv

@db.event.listens_for(TenantQuery, 'before_compile', retval=True)
def ensure_tenant_constrained(query):
    # Add a tenant filter for every queried entity that has a tenant attribute.
    for desc in query.column_descriptions:
        if hasattr(desc['type'], 'tenant') and \
                query.current_tenant_constrained:
            query = query.filter_by(tenant=get_current_tenant())
    return query
Audit logs
Auditing is especially critical when a SaaS business has customers who encounter issues, such as deleted data. These customers need to figure out why data was removed, and audit logs give developers the opportunity to uncover what dangerous operations were executed on the customer’s tenant.
A general audit log function, which can be called at any time with an action, is useful in this situation. The log function will take whatever contextual information is available and attach it to the log record as metadata. The idea is that someone who writes an API endpoint can call one function, and the function itself will figure out as much information as possible and store it in the audit log. Later, anyone investigating an issue can look up which users and tenants were modified, and by whom.
from datetime import datetime

from flask import request

def log(action, message=None):
    data = {
        'action': action,
        'timestamp': datetime.utcnow(),
    }
    if message is not None:
        data['message'] = message
    # Attach request metadata only when called inside a request.
    if request:
        data['ip'] = request.remote_addr
    user = get_current_user()
    if user is not None:
        data['user'] = user
    db.session.add(LogMessage(**data))
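To show how such a function might be called and queried, here is a framework-free sketch. It makes several assumptions that are not in the code above: records live in an in-memory list, the context is passed in explicitly as a dictionary, the tenant is recorded alongside the user, and the names AUDIT_LOG, actions_for_tenant, and delete_project are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical in-memory store; a real application would persist records.
AUDIT_LOG = []


def log(action, context, message=None):
    """Record an action together with whatever the context can tell us."""
    record = {
        "action": action,
        "timestamp": datetime.now(timezone.utc),
    }
    if message is not None:
        record["message"] = message
    # Copy over only the context fields that are actually present.
    for key in ("tenant", "user", "ip"):
        if context.get(key) is not None:
            record[key] = context[key]
    AUDIT_LOG.append(record)
    return record


def actions_for_tenant(tenant):
    """List the actions recorded for one tenant, oldest first."""
    return [r["action"] for r in AUDIT_LOG if r.get("tenant") == tenant]


def delete_project(project_id, context):
    # The handler makes a single call; log() gathers the metadata.
    log("project.delete", context, message=f"deleted project {project_id}")
```

Recording the tenant on every entry is what makes it possible to answer a customer's question about what happened to their data without reading through everyone else's activity.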
Obviously, there are more SaaS building blocks than the structural patterns I’ve mentioned here. From an architectural perspective, there are many vital elements explicitly not mentioned in this post. For instance, the vast majority of SaaS applications will need a message queue, a caching layer, and much more. But that’s another post for another time.
For now, you can watch my talk from the 2017 WeAreDevelopers Conference for more insight into pragmatic SaaS architecture.