Webhooks have become an industry standard. Whether you like it or not, there are barely any enterprise-facing products that don’t use webhook notifications as an integral part of their user flow. But why do you need webhook integration in your products? And how do you implement webhooks to create fundamentally sound, enterprise-ready SaaS applications?
In this post, we will try to cover the main aspects you have to keep in mind when implementing webhooks, before adding them to your SaaS offering.
Before we dive into the technical stuff and learn how to implement webhooks, it’s important to understand what they are all about. The idea of webhooks is to expose events from any of the software products you engage with, in order to allow the customers of this product to act upon them, carry out automations, and perform other performance-enhancing or productivity-related actions.
How about some examples? First, applications use webhooks to send users important notifications, as in the following GitHub example:
The popular JIRA application also uses webhooks extensively:
So you get the idea. The way it works is pretty simple. When an event is triggered in a product, the product handles it and then lets all the webhook subscribers know that the event has occurred by sending each of them an HTTP call (a.k.a. a webhook).
It should be somewhat like this:
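As a rough illustration of that flow, here is a minimal Python sketch of the sending side. The names `notify_subscribers` and `post_json`, the payload shape, and the idea of passing in a list of subscriber URLs are assumptions made for this example, not part of any particular product.

```python
import json
from typing import Callable, Iterable
from urllib import request


def post_json(url: str, body: bytes) -> int:
    """Send one webhook call as an HTTP POST and return the status code."""
    req = request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=5) as resp:
        return resp.status


def notify_subscribers(
    event_type: str,
    payload: dict,
    subscriber_urls: Iterable[str],
    send: Callable[[str, bytes], int] = post_json,
) -> dict:
    """Deliver one event to every subscriber; map each URL to a status or None."""
    # Hypothetical payload shape: the event name plus its data.
    body = json.dumps({"event": event_type, "data": payload}).encode("utf-8")
    results = {}
    for url in subscriber_urls:
        try:
            results[url] = send(url, body)
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses, so
            # network failures and error responses are both recorded as None.
            results[url] = None
    return results
```

The `send` parameter is injectable so the loop can be exercised without a network; a real sender would also sign the request and queue failures for retry, as discussed later in this post.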
Securing webhook calls is a crucial part of a proper implementation. As straightforward as these measures may sound, they are often overlooked while setting up webhooks. We need to make sure that the receiving side of the webhook can validate that a request is genuine, using one or more of the following security measures. Keep in mind that some may not fit your use case.
One of the common methods to allow or deny requests and to “authenticate” the sender is IP whitelisting. While this method can work in some cases, in today’s dynamic cloud environments (not to mention serverless architectures) the list of sender addresses is hard to keep up to date, and a stale list can easily break the integration. You may want to pass on this if you lack the time and resources required to maintain it.
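On the receiving side, an IP allowlist check could look roughly like the sketch below. The networks shown are documentation-only placeholder ranges and `is_allowed_sender` is a hypothetical helper; a real list would come from the ranges the webhook sender publishes.

```python
import ipaddress

# Placeholder ranges reserved for documentation; replace with the
# address ranges your webhook sender actually publishes.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed_sender(remote_addr: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True if the caller's IP address falls inside an allowed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address at all: reject.
        return False
    # Membership is False when the address and network IP versions differ.
    return any(addr in net for net in networks)
```

Note that behind a proxy or load balancer the address your application sees may not be the sender's, which is part of why this approach is hard to maintain.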
Pre-shared key
With this method, the webhook sender adds a header to each request. The value of the header is a pre-shared key that was shared and configured between the two parties (the webhook sender and the receiver). The receiver’s responsibility is to validate the header and make sure it actually contains the value that the parties configured.
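A minimal sketch of the receiver-side check, assuming the key travels in a header called `X-Webhook-Key` (a name made up for this example) and that the headers are available as a plain dictionary:

```python
import hmac


def has_valid_shared_key(
    headers: dict,
    expected_key: str,
    header_name: str = "X-Webhook-Key",  # hypothetical header name
) -> bool:
    """Check that the request carries the pre-shared key agreed by both parties."""
    received = headers.get(header_name, "")
    # compare_digest runs in constant time, so response timing does not
    # reveal how many leading characters of the key were guessed correctly.
    return hmac.compare_digest(
        received.encode("utf-8"), expected_key.encode("utf-8")
    )
```

Most web frameworks expose headers through a case-insensitive mapping; with a plain `dict`, as here, the header name must match exactly.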
Replay attack prevention
The use of pre-shared keys exposes the webhook receiver to Man-in-the-Middle (MITM) attacks (especially replay attacks).
This is how it looks (in this case, password sniffing):
The attacker can hijack the request and reuse it over and over again (for example, to wire funds as part of the automation). To protect the receiving end from replay attacks, the webhook sender is expected to send the time at which the request was sent along with a “ValidUntil” header, which is usually limited to 10-20 seconds after the request origination timestamp.
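The receiver's freshness check might be sketched as follows. It assumes both values arrive as Unix timestamps in seconds, and that the send time is carried in a header named `X-Webhook-Timestamp`; that header name, the `is_fresh` helper, and the clock-skew allowance are assumptions for illustration.

```python
import time

MAX_WINDOW_SECONDS = 20  # upper end of the 10-20 second window


def is_fresh(headers: dict, now=None, max_window=MAX_WINDOW_SECONDS,
             clock_skew=5) -> bool:
    """Reject requests that are expired or that claim an overly long lifetime."""
    now = time.time() if now is None else now
    try:
        sent_at = float(headers["X-Webhook-Timestamp"])  # hypothetical header
        valid_until = float(headers["ValidUntil"])
    except (KeyError, TypeError, ValueError):
        return False
    # The sender may not grant itself a longer window than we accept.
    if valid_until < sent_at or valid_until - sent_at > max_window:
        return False
    # A send time far in the future suggests tampering or a broken clock.
    if sent_at > now + clock_skew:
        return False
    return now <= valid_until
```

On their own these headers can be rewritten by whoever intercepts the request, so the check is only meaningful when the values are covered by a signature.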
How to Implement Webhooks?
The well-known JSON Web Token (JWT) mechanism is an all-in-one solution that closes these security loopholes without the complexity of wiring the aforementioned approaches together by hand. JWT combines both ideas: the shared secret is used to sign the token, and the token’s expiry claim plays the role of the “ValidUntil” header. Using this approach, we’ll sign a token with the symmetric HS256 algorithm (HMAC with SHA-256) and set it to expire after 20 seconds.
That is the secret to the validation and replay attack prevention mechanism. Cool, isn’t it? Now let’s move on and cover other bases before summing up.
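To make the idea concrete, here is a standard-library-only sketch of issuing and verifying such a token. The function names are hypothetical and the claim set is deliberately minimal; in production you would more likely rely on a maintained JWT library than on hand-rolled code like this.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_webhook_token(secret: bytes, ttl_seconds: int = 20, now=None) -> str:
    """Sender side: issue an HS256 token that expires after ttl_seconds."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_webhook_token(token: str, secret: bytes, now=None) -> bool:
    """Receiver side: check the algorithm, the signature, and the expiry."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(claims_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError:
        # Wrong number of segments, bad base64, or bad JSON.
        return False
    # Pin the algorithm instead of trusting whatever the token declares.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return False
    expected = hmac.new(secret, f"{header_b64}.{claims_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False
    exp = claims.get("exp") if isinstance(claims, dict) else None
    current = time.time() if now is None else now
    return isinstance(exp, (int, float)) and current < exp
```

The sender would attach the result of `sign_webhook_token(secret)` to each call (for example in an `Authorization` header), and the receiver would call `verify_webhook_token` before processing. As written, the token proves who sent it and bounds its lifetime; it does not cover the request body unless a digest of the body is added as a claim.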
Scale & Fault tolerance
Keep in mind that as your SaaS product and customer base grow, you will need to handle more and more webhook calls, potentially millions of them per minute. Remember too that customers depend on webhooks to drive their automation flows, so in most cases timing and reliability are crucial.
You DON’T want to find yourself with these types of notifications on your status page:
Scale is important but not enough. We need to make sure that in the case of network disconnections or drops, we are able to retry delivery, first automatically and then manually, so that the customer’s automation is not damaged. One of the popular tools for these kinds of challenges is a streaming platform like Apache Kafka.
So how does this give us both scale and fault tolerance?
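The automatic part of that retry logic is often implemented with exponential backoff. The sketch below assumes a caller-supplied `send` function that returns `True` on success; the function name, attempt count, and delays are illustrative defaults rather than recommendations.

```python
import time
from typing import Callable


def deliver_with_retry(
    send: Callable[[], bool],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try a delivery several times, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            if send():
                return True
        except OSError:
            # Network-level failure: treat it like an unsuccessful attempt.
            pass
        if attempt < max_attempts - 1:
            sleep(min(max_delay, base_delay * 2 ** attempt))
    # Automatic retries are exhausted; the caller can park the message
    # somewhere it can be retried manually later.
    return False
```

When this returns `False`, the message would typically be moved to a separate store of failed deliveries so that an operator or the customer can trigger a manual resend.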
This allows greater flexibility and follows the Separation of Concerns (SoC) principle, which is being embraced globally today. More importantly, it allows you to scale and partition in order to meet greater demand and load on webhook handling, and to handle failed items on separate partitions and log indexes. The flow using Apache Kafka will look somewhat like this:
Why should we choose Apache Kafka for this use case? The added values are:
- Horizontal scaling – Kafka lets us spread the load across our cluster and add more handlers as scale and demand grow. Today’s dynamic user behavior and fluctuating conditions make this crucial.
- Partitioning – the partitioning capability of Kafka allows us to “isolate” problematic recipients without blocking the rest of the recipients.
- Message replay – this great feature allows us to replay problematic and failed messages by moving the consumer offset of a specific topic back to an earlier position. In other words, it lets us “go back in time” in the case of errors.
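To show what key-based partitioning and offset-based replay mean in practice, here is a small in-memory model. It is not Kafka client code: the `PartitionedLog` class is invented for this post, and the CRC32 hash merely stands in for Kafka's own partitioner.

```python
import zlib
from collections import defaultdict


class PartitionedLog:
    """Toy stand-in for a partitioned topic, to illustrate the two ideas."""

    def __init__(self, num_partitions: int = 4):
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)

    def partition_for(self, recipient_id: str) -> int:
        # A stable hash keeps every message for one recipient on the same
        # partition, so a slow recipient only delays its own partition.
        return zlib.crc32(recipient_id.encode("utf-8")) % self.num_partitions

    def append(self, recipient_id: str, message: str) -> tuple:
        """Store a message and return its (partition, offset) position."""
        partition = self.partition_for(recipient_id)
        self.partitions[partition].append((recipient_id, message))
        return partition, len(self.partitions[partition]) - 1

    def read_from(self, partition: int, offset: int) -> list:
        """Replay: re-read everything from an earlier offset onward."""
        return self.partitions[partition][offset:]
```

With a real Kafka client, the equivalent steps would be producing each message with the recipient ID as its key and, to replay, seeking the consumer back to an earlier offset on the affected partition.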
Webhooks are the new de facto standard for communicating between systems and platforms. Every webhook implementation should take the main items we discussed in this post into consideration to avoid security and scaling pitfalls. With the SaaS market getting uber-competitive and customers demanding the most dynamic features, you simply must set up webhooks in the best way possible.
When used correctly, webhooks can be true growth enablers and customer satisfaction enhancers, while also saving you time and resources. Get started now!