Distributed SaaS Monolith 101

Intro

The first time I implemented a REST service was in order to upgrade an old proprietary communication protocol between our servers. At that time we called our services “servers”, as if each were an enclosed computational unit, and only that unit could do the thing it was designed to do. They were mostly single-threaded, ran in active-standby mode (non-scalable, that is), and of course, monolithic (and I’m not taking sides here in the “Monolith-first” vs. “Microservice-first” discussion).

Man… Those were the good ol’ days, when I didn’t care about most of what I care about today (when it comes to coding), and when logs were saved to disk and you had to ssh in and grep your way through many auto-rotating log files like a scavenger.

Nowadays, when starting a new product, you’ll be focusing your engineering wizardry on your core business offering. Step by step, you are required to make your “project” into a “product”, something that others can use, A SERVICE. Only now, you are a few customers in, and you need to show results.
You start by adding a single REST microservice. Then another, and another one after that. Until your wizardry has turned into a kid’s birthday magic act and you are left with the most unmaintainable, impossible-to-scale, intricate and coupled distributed monolith possible.

And all that Jazz …or SaaS

You probably think I’m exaggerating — Well actually, I am. To a point.

To be honest, when you’re thinking about building your core product, the need to add all the fluff that goes around it can be a really big pain.

“I’m building the next big thing here and want to focus on that. Why on earth am I wasting resources on security, reporting and authentication stuff? This sucks!”

And that’s exactly the point!

It does suck. Because most of you are not building a SaaS platform. Your product is amazing but SaaS is simply not your main business!

And this is often the case. Many of the products offered as a service are truly amazing and life changing, but SaaS is an ecosystem of its own and should be treated as such, with a proper product plan, requirements and dev resources.

I dare say that when SaaS is built haphazardly “around” your core product, it becomes like ugly, murky glass covering your beautiful portrait. And let’s be honest here, if your picture is a Mona Lisa, you’d want people to see it. But remember that a bad “SaaS experience” can overshadow even the most brilliant of ideas.

The (technical) SaaS business

So let’s say you have an idea.

Your best idea yet!

And now you’ve reached a point where your POC is ready to be offered as a service to the world.

What happens next?

First, you start by adding a service to manage users and authentication. That service will proxy an online service providing a secure access solution. In order to engage with new users, you will need to send out emails to welcome them when they register. At that point, you recognize that you will be sending out emails more often than you thought, so you decide to develop a service that is responsible for it, proxying an external email delivery service. To complete the feature, you add an HTTP request to the proxy service, in order to send an email to your users once their registration is complete.

A theoretical simple flow would be:

  1. Check whether the user already exists in the DB.
  2. Insert the new user's profile data (personal info) into the DB in pending state.
  3. Send a request to the secure access service in order to submit all authentication data.
  4. The local email proxy service (or even the secure access service) sends an activation email.
  5. The client activates their account.
  6. Change the user from pending state to active state in the DB.
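
Here is a minimal sketch of that flow in Python, just to make the moving parts visible. Everything in it is an assumption for illustration: `UserService`, `FakeAuthService` and `FakeEmailProxy` are hypothetical names, the "DB" is a dict, and the two fakes stand in for what would really be HTTP calls to the secure access service and the email proxy.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAuthService:
    """Hypothetical stand-in for the external secure access service."""
    credentials: dict = field(default_factory=dict)

    def submit_credentials(self, email: str, password: str) -> None:
        self.credentials[email] = password


@dataclass
class FakeEmailProxy:
    """Hypothetical stand-in for the local email proxy service."""
    outbox: list = field(default_factory=list)

    def send_activation(self, email: str) -> None:
        self.outbox.append(("activation", email))


class UserService:
    def __init__(self, auth: FakeAuthService, mailer: FakeEmailProxy):
        self.db = {}  # email -> {"name": ..., "state": ...}
        self.auth = auth
        self.mailer = mailer

    def register(self, email: str, name: str, password: str) -> None:
        # Step 1: check whether the user already exists.
        if email in self.db:
            raise ValueError(f"user {email} already exists")
        # Step 2: store the profile in pending state.
        self.db[email] = {"name": name, "state": "pending"}
        # Step 3: hand the authentication data to the secure access service.
        self.auth.submit_credentials(email, password)
        # Step 4: ask the email proxy to send the activation email.
        self.mailer.send_activation(email)

    def activate(self, email: str) -> None:
        # Steps 5 and 6: the client activates, the user becomes active.
        user = self.db[email]
        if user["state"] != "pending":
            raise ValueError(f"user {email} is not pending activation")
        user["state"] = "active"
```

Note that `register` already talks to two other services in a row. If either call were a real HTTP request that failed, the user would be stuck in pending state with no email, and the sketch does nothing about that.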

Easy peasy…

A few months and 10/100/1000s of users later, customer support tickets start piling up, saying that users can’t register. You start digging through your logs and find out that your production env is being throttled by the external secure access service with HTTP status code 429 (Too Many Requests).
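
The usual first reaction to a 429 is to retry with backoff. Below is a rough sketch of that idea, not a fix for the underlying quota. The `send` callable is an assumption: it stands for whatever function performs the request and returns a `(status, headers)` pair. The sketch only understands `Retry-After` given in seconds (the header may also carry an HTTP date) and otherwise falls back to exponential backoff.

```python
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it stops returning 429 or attempts run out.

    send is a hypothetical callable returning (status_code, headers_dict).
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            # The server told us how many seconds to wait.
            delay = float(retry_after)
        else:
            # No usable hint: wait base_delay * 2^attempt seconds.
            delay = base_delay * (2 ** attempt)
        sleep(delay)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Retries like this can absorb a short burst, but if your steady traffic is above the plan's limit, every registration just gets slower before it fails anyway.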

You find yourself at a crossroads. You can implement your own authentication service to handle the load, which will take a long time to develop (just think about secure user handling, roles and permissions, profile management, JWT, JWK, multi-factor authentication, OpenID support, password reset, user activation), or you can spend (much) more money on a better subscription plan with the secure access service you are already using… Damn…

Let’s try something else:

One of the product guys wants users to be able to form groups within the product. You add another service to support such a feature: the Team Management service. Now a user can invite others to join their group and collaborate on a shared project.

To make the team aware of a new member joining, the product guy wants a notification to pop up for all existing members. Yet again, you recognize that a new service is required, since notifications will be used more than once. So you build a notifications service, one that can handle a high volume of requests.

After a few months, the product guy comes up with a new requirement: Notify the user whenever someone edits a shared project.

You take the notifications service for another spin, and send the required notification on every edit.

Commit. Deploy. Done. 

A couple of hours later —

“We’re getting too many notifications!”

“This feature is unusable!”

The product changed, but the service was not built to handle it. The notifications service was built to handle only one-time notifications: “A new user just joined the team!”, “User logged in”… Now that you have many events of the same type, you need to filter similar notifications.
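
One way to filter similar notifications is to merge events of the same type on the same target into a single notification. The sketch below is one possible take under my own assumptions: the `Event` shape, the `coalesce` function and the five-minute window are all made up for illustration, and a real service would do this on a stream rather than on a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str         # e.g. "edit" or "join" (hypothetical event types)
    target: str       # the project or team the event is about
    actor: str        # who triggered it
    timestamp: float  # seconds


def coalesce(events, window=300.0):
    """Group events with the same kind and target.

    An event joins the current group for its (kind, target) if it arrives
    within `window` seconds of that group's first event; otherwise it
    starts a new group. Returns the groups in order of first occurrence.
    """
    groups = []
    current = {}  # (kind, target) -> most recent group for that key
    for event in sorted(events, key=lambda e: e.timestamp):
        key = (event.kind, event.target)
        group = current.get(key)
        if group is None or event.timestamp - group["start"] > window:
            group = {"kind": event.kind, "target": event.target,
                     "start": event.timestamp, "actors": [], "count": 0}
            current[key] = group
            groups.append(group)
        group["count"] += 1
        if event.actor not in group["actors"]:
            group["actors"].append(event.actor)
    return groups
```

With this, three quick edits to the same project become one notification with a count of 3 instead of three separate pop-ups. The catch is that none of this existed in the original "fire one notification per request" design, so it has to be retrofitted into a service every other service already calls.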

OK OK! Let’s try again. One last time (To spice things up):

For compliance and security reasons, an audit logs service was developed to keep track of all the activities of the users.

The “Facepalm Architecture”

To comply with different privacy and data protection laws, you are required to support user data deletion. Using only the services we described, let’s try to break down a possible flow:

  1. Send a delete request to the users service
  2. Set a user in “delete-pending” status
  3. Send an audit
  4. Block the user in the authentication service (doesn’t matter if you pay for it or developed a new one)
  5. Send an audit
  6. Remove the user from the teams they are in
  7. Send an audit (?)
  8. Send a notification to the team that the user left
  9. Send an email to the user that you’re sad to see them go
  10. Delete the user
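
To see why this flow hurts, here is a small sketch of it written as a chain of synchronous calls. All names are hypothetical: `delete_user`, `DeletionHalted` and the step labels are mine, and each step function stands in for an HTTP request to one of the services above. Step 1 of the list is the call to `delete_user` itself.

```python
class DeletionHalted(Exception):
    """Raised when one step of the chain fails; records how far we got."""

    def __init__(self, failed_step, completed):
        super().__init__(f"deletion halted at {failed_step!r}")
        self.failed_step = failed_step
        self.completed = completed


def delete_user(user_id, steps):
    """Run each (name, call) step in order, stopping at the first failure."""
    completed = []
    for name, call in steps:
        try:
            call(user_id)
        except Exception as exc:
            raise DeletionHalted(name, completed) from exc
        completed.append(name)
    return completed


def ok(user_id):
    """Stand-in for a remote call that succeeds."""


def notifications_down(user_id):
    """Stand-in for a remote call to a service that is unavailable."""
    raise ConnectionError("notifications service unavailable")


STEPS = [
    ("users: mark delete-pending", ok),
    ("audit: deletion requested", ok),
    ("auth: block user", ok),
    ("audit: user blocked", ok),
    ("teams: remove memberships", ok),
    ("audit: memberships removed", ok),
    ("notifications: tell the team", notifications_down),
    ("email: goodbye message", ok),
    ("users: delete record", ok),
]
```

Running `delete_user("u1", STEPS)` raises `DeletionHalted` at the notifications step: the user is already blocked and removed from their teams, but the record still exists in delete-pending status. Someone now has to decide whether to retry, skip, or undo, and that decision lives in whichever service happens to own the chain.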

And just when you think you’re done with development, you remember that you somehow need to test and deploy it.

Building SaaS properly is an area of expertise and it’s complex. That’s why I call this the “SaaS Facepalm Architecture” because that’s ultimately what you will do when trying to build it. Without a proper design, it will slow you down. Scaling your business is dependent on scaling your SaaS product and the two will eventually affect one another.

“So what is behind the technical business of SaaS and why is everything so highly interconnected?”

When building a good SaaS product, you need to allow your users to interact and engage in a secure way with your platform while at the same time ensuring that you get paid for the services you are offering. These requirements are all connected and overlap with one another. In order for you to avoid building the “Facepalm Architecture” you will need to create proper boundaries between these services.

Despite the fact that REST services are a great way to quickly begin your SaaS journey, they tend to introduce an “HTTP grid” that blurs the boundaries between one service and another by synchronously weaving features and tangling your code with numerous HTTP calls.
While it may seem like the solution is rather obvious, try to keep in mind how you even started your SaaS product, what your initial goals were when you started out building your business offering and how you added in those “SaaS things” along the way, probably not at the same level of finesse.