Why do we use Authentication?
To protect our users of course!
All yolks aside, an easy example of Authentication could be email+password authentication.
- Our users want to access our product.
- We prompt them with an email+password form.
- We generate a secure session for them to work with the product.
This is all lovely but the rise of microservices, as well as CI/CD, introduced new challenges.
We need to be able to authenticate without user presence. This is where API tokens come into play.
A few use cases come to mind when discussing API tokens:
- 3rd party services: every 3rd party that wants to communicate with our service needs to be authenticated.
- CI/CD flows: another 3rd party use case, where some of the services that run as part of the flow are external (code scanners, deployment tools, etc.).
So how do we generate sessions for non-humans and what do we need to pay attention to?
This guide will provide you with most of the “how-tos” in terms of generating and securing machine-to-machine sessions using API tokens.
The OAuth token flow
So we understand that in order to authenticate using API tokens we need to generate them. But which kind of token should be generated? Which flow of authentication do we want to support here?
The OAuth client credentials flow is the most common one when it comes to machine-to-machine authentication.
The idea is simple.
The unauthorized client gets a ClientID and a Client Secret 🤫
This pair is sent to the Authorization Server 🚫
The client, yet to be authorized, receives an Access Token 🔑
The client uses that access token to access the resource 🙏
Upon validation of the token, the client is granted access to the resource 😎
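The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production authorization server: the `CLIENTS` registry, the `SIGNING_KEY`, and the function names are hypothetical, and the token is signed with a shared HS256 key only to keep the example within the standard library (a real deployment would more likely sign with an asymmetric key so that resources only need the public key).

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"demo-signing-key"  # assumption: demo-only shared HS256 key
# Hypothetical client registry: client_id -> (client_secret, scopes)
CLIENTS = {"my-client-id": ("my-client-secret", ["read:reports"])}


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> str:
    digest = hmac.new(SIGNING_KEY, signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def issue_access_token(client_id: str, client_secret: str, ttl_seconds: int = 3600) -> str:
    """Authorization server: exchange a valid client ID/secret pair for a signed JWT."""
    record = CLIENTS.get(client_id)
    if record is None or not hmac.compare_digest(record[0].encode(), client_secret.encode()):
        raise PermissionError("invalid client credentials")
    now = int(time.time())
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": client_id, "scope": record[1], "iat": now, "exp": now + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    signing_input = header + "." + payload
    return signing_input + "." + _sign(signing_input)


def verify_access_token(token: str) -> dict:
    """Resource: check the signature and expiry, then return the claims."""
    parts = token.split(".")
    if len(parts) != 3:
        raise PermissionError("malformed token")
    header, payload, signature = parts
    if not hmac.compare_digest(_sign(header + "." + payload).encode(), signature.encode()):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] <= time.time():
        raise PermissionError("token expired")
    return claims
```

The client calls `issue_access_token` once with its pair and then presents the returned token on every request; the resource only ever sees the token, never the secret.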
The single token flow
The single token flow basically means that instead of generating a client ID and a client secret, we authenticate using a single token.
The idea is to simplify the authentication flow: the client uses a token that doesn’t need to be exchanged for a valid JWT through the authorization server.
There are several approaches for this method:
Grant the client a JWT without expiration during the generation phase, which will then be used as the access token.
Grant the client a token in the generation step, which will then be validated against the central token store.
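As a rough sketch of the second approach, the snippet below generates an opaque token and validates it against an in-memory stand-in for the central token store. `TOKEN_STORE` and the function names are hypothetical; storing only a SHA-256 digest of the token is an assumption of this sketch (it is reasonable for long random tokens, but not for human-chosen secrets).

```python
import hashlib
import secrets
from typing import Optional

# Hypothetical central token store: SHA-256 digest of the token -> token metadata
TOKEN_STORE = {}


def _digest(token: str) -> str:
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def generate_token(tenant_id: str, scopes: list) -> str:
    """Create a random token, store only its digest, and return the plaintext once."""
    token = secrets.token_urlsafe(32)
    TOKEN_STORE[_digest(token)] = {"tenant_id": tenant_id, "scopes": list(scopes)}
    return token


def validate_token(token: str) -> Optional[dict]:
    """Return the token metadata if the token is known, otherwise None."""
    return TOKEN_STORE.get(_digest(token))


def revoke_token(token: str) -> None:
    """Revocation is immediate: the next lookup simply fails."""
    TOKEN_STORE.pop(_digest(token), None)
```

The trade-off compared with a non-expiring JWT is that every request costs a lookup in the store, but revoking a token takes effect right away.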
Token types (account / personal)
In B2B SaaS applications, the “tenant” sits at the center of the application. Every flow the application handles, as well as most of its data, lives in the context of a tenant.
In this context, when users log in to our application, they log in to a specific tenant, see data related to that tenant, and manage the configuration of that exact tenant.
Sounds familiar right?
Usually, with a tenant-level token there is no user context, so we would like to have roles/permissions associated with the token itself.
We can see a great example in SendGrid when generating a new token:
We generate an API token, which has specific permissions and lives at the SendGrid account level (not related to any user).
That means that as long as we maintain the SendGrid account, we can use the API token (regardless of the identity of the users who are members of the account).
However, in some cases, we would like to allow our users to generate personal tokens in order to perform machine-to-machine operations on behalf of a user and not a tenant.
We can see an example of this on GitHub:
The token is created on the user’s personal account with specific scopes (permissions).
It is important to highlight that with personal tokens, once the user is deleted, the token should be deleted as well.
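A minimal sketch of that last rule, assuming hypothetical in-memory `USERS` and `PERSONAL_TOKENS` stores (in a real database this would typically be a cascading delete on the user foreign key):

```python
from typing import Dict, List

# Hypothetical in-memory stores, for illustration only
USERS: Dict[str, dict] = {
    "user-1": {"email": "a@example.com"},
    "user-2": {"email": "b@example.com"},
}
PERSONAL_TOKENS: Dict[str, dict] = {
    "tok-1": {"user_id": "user-1", "scopes": ["repo"]},
    "tok-2": {"user_id": "user-2", "scopes": ["repo"]},
}


def delete_user(user_id: str) -> List[str]:
    """Delete a user together with every personal token they own.

    Returns the IDs of the removed tokens so the caller can audit them.
    """
    USERS.pop(user_id, None)
    removed = [tid for tid, tok in PERSONAL_TOKENS.items() if tok["user_id"] == user_id]
    for tid in removed:
        del PERSONAL_TOKENS[tid]
    return removed
```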
The token repository
So now that we understand the difference between tenant and personal tokens and have a high-level view of the authentication flow, we can start looking at the generation of the token.
How should we generate the initial token? Where should we save it? How should we authenticate it?
Generating the initial token (OAuth flow)
Using the OAuth flow, we need to generate a clientId and a secret. These are saved to the DB so they can be used in the authentication flow.
A few things to pay close attention to when designing this solution:
- Generate the client ID and secret as 128-bit UUIDs (version 4 is preferred to reduce the probability of duplicates; for more info, read about the Birthday Paradox).
- Client secret = password(!!!). It has to be stored salted and hashed in the database so it cannot be reversed in case of a DB breach. That means that during the generation process we show the secret to the user ONLY ONCE and cannot show it again.
- Token pair association: when storing the token in the DB, we must retain (as in any multi-tenant / multi-user product) the relation of this token to its tenant/user, so that when generating the JWT for it we sign it for the relevant customer.
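The three points above can be sketched as follows. This is only an illustration under stated assumptions: the `API_TOKENS` table, the function names, and the PBKDF2 parameters (SHA-256, 100,000 iterations, 16-byte salt) are choices made for this example, not values from the original design.

```python
import hashlib
import hmac
import os
import uuid
from typing import List, Tuple

# Hypothetical table: client_id -> stored record (never the plaintext secret)
API_TOKENS = {}


def _hash_secret(secret: str, salt: bytes) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA256; the iteration count is illustrative
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, 100_000)


def create_api_token(tenant_id: str, scopes: List[str]) -> Tuple[str, str]:
    """Generate a clientId/secret pair and persist only the salted hash."""
    client_id = str(uuid.uuid4())
    secret = str(uuid.uuid4())
    salt = os.urandom(16)
    API_TOKENS[client_id] = {
        "tenant_id": tenant_id,  # token pair association
        "scopes": list(scopes),
        "salt": salt,
        "secret_hash": _hash_secret(secret, salt),
    }
    # This return value is the only time the plaintext secret is available
    return client_id, secret


def check_secret(client_id: str, secret: str) -> bool:
    """Authenticate a pair by re-hashing the presented secret with the stored salt."""
    record = API_TOKENS.get(client_id)
    if record is None:
        return False
    return hmac.compare_digest(record["secret_hash"], _hash_secret(secret, record["salt"]))
```

Because only the hash is stored, a lost secret cannot be recovered; the user has to generate a new token instead.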
A lot to think about…
So let’s take a look at what a tenant API token looks like from an entity point of view:
On top of the clientId and secret, we store the tenant relation as well as the scopes. In some cases, we would also like to store the ID of the user who created the token.
Now let’s take a look at what a user API token looks like from an entity point of view:
Not much has changed, but now we save the user ID instead of the tenant ID (which means we don’t have to store createdBy anymore).
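As a sketch, the two entities described above could look like the dataclasses below. The field names are hypothetical, and `created_at` is an extra field assumed here for illustration; only the clientId, secret, tenant/user relation, scopes, and creator come from the description.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class TenantApiToken:
    client_id: str
    secret_hash: str          # salted hash, never the plaintext secret
    tenant_id: str            # the tenant this token acts for
    scopes: List[str]
    created_by_user_id: str   # who generated the token
    created_at: datetime = field(default_factory=_utc_now)  # assumption: not in the original entity


@dataclass
class UserApiToken:
    client_id: str
    secret_hash: str
    user_id: str              # owner and creator are the same user, so no created_by field
    scopes: List[str]
    created_at: datetime = field(default_factory=_utc_now)  # assumption: not in the original entity
```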
So we know how to create an API token. Good!
Do we allow changes or modifications?
By nature, API tokens are immutable objects. In most cases, we don’t want to allow major changes to a token (updating its display name is fine). We won’t let the creator of the token change its scopes, clientId, secret, etc.
Why? Security-first approach…
We want to make sure that once a token is created, it is safe to hand it to a 3rd party application without the risk of its scopes changing (by mistake or not) and exposing our service to elevated permissions for that 3rd party application.
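One way to express this rule in code is a frozen dataclass whose only supported change is the display name. This is a sketch with hypothetical names (`ApiToken`, `rename`), not a description of any particular product’s model.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ApiToken:
    client_id: str
    scopes: Tuple[str, ...]
    display_name: str


def rename(token: ApiToken, display_name: str) -> ApiToken:
    """The only supported modification: returns a copy with a new display name."""
    return replace(token, display_name=display_name)
```

Assigning to `token.scopes` raises `dataclasses.FrozenInstanceError`, so a scope change has to go through creating a brand-new token.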
OK… So we know how to create tokens and save them.
We know how to authenticate them and generate a JWT from them.
But how do we handle different permissions and scopes?
Authorization with API tokens
We know that authentication is not enough. We would like to carry a list of permissions (aka scopes) in the signed JWT so that each microservice we call can work with this list when handling an authenticated call.
The general flow is simple:
The resource owner’s responsibility in this case is to validate the authentication (using the authentication service’s public key) and to validate the scopes required for accessing the resource (decoded from the JWT).
Looks simple, right?
Well, it is, even when looking at it from a code perspective.
A middleware for this validates the authentication header, extracts the scopes from the JWT, and matches them against the scopes required for the requested resource.
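A minimal sketch of such a middleware, written as a Python decorator. Everything here is an assumption made for illustration: the `require_scopes` and `AuthError` names, the plain `headers` dict standing in for a web framework request, and the shared HS256 key (the flow described above would verify with the authentication service’s public key instead). `encode_jwt` exists only so the example can build a demo token.

```python
import base64
import hashlib
import hmac
import json
import time
from functools import wraps

SIGNING_KEY = b"demo-signing-key"  # assumption: demo-only shared key


class AuthError(Exception):
    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str) -> str:
    return _b64url(hmac.new(SIGNING_KEY, signing_input.encode(), hashlib.sha256).digest())


def encode_jwt(claims: dict) -> str:
    """Demo helper that stands in for the authentication service."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    return header + "." + payload + "." + _sign(header + "." + payload)


def decode_jwt(token: str) -> dict:
    parts = token.split(".")
    if len(parts) != 3:
        raise AuthError(401, "malformed token")
    header, payload, signature = parts
    if not hmac.compare_digest(_sign(header + "." + payload).encode(), signature.encode()):
        raise AuthError(401, "bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims.get("exp", 0) <= time.time():
        raise AuthError(401, "token expired")
    return claims


def require_scopes(*required: str):
    """Reject the call unless the bearer token is valid and carries every required scope."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(headers: dict, *args, **kwargs):
            scheme, _, token = headers.get("Authorization", "").partition(" ")
            if scheme != "Bearer" or not token:
                raise AuthError(401, "missing bearer token")
            claims = decode_jwt(token)                               # authentication
            missing = set(required) - set(claims.get("scope", []))  # authorization
            if missing:
                raise AuthError(403, "missing scopes: " + ", ".join(sorted(missing)))
            return handler(claims, *args, **kwargs)
        return wrapper
    return decorator
```

A handler is then protected with `@require_scopes("reports:read")`; a failed authentication surfaces as status 401 and a missing scope as 403.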
So we are getting close!
We have a full flow working end to end from the generation, through the authentication and even through validating the authentication and authorization!
But how do we revoke an access token?
Let’s say we created a token with a default JWT expiration of 7 days.
We would like to revoke it after a day… Looking at the flow above, once the 3rd party application gets hold of the JWT, it lives for those 7 days, and the resource service has no way of knowing that this JWT is no longer valid.
A few potential solutions here:
Shorter JWT tokens + longer refresh tokens
Using JWTs with a shorter expiration together with longer-lived refresh tokens shortens the exposure window after a token is revoked. The 3rd party application in this case is required to constantly refresh the token (using the refresh token) in order to retrieve a new JWT.
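The idea can be sketched as below. The TTL values, store, and function names are assumptions for this example, and the access token is modeled as a plain dict of claims rather than a signed JWT, since only the expiry matters here.

```python
import secrets
import time

ACCESS_TTL = 5 * 60             # assumption: 5-minute access tokens
REFRESH_TTL = 7 * 24 * 60 * 60  # assumption: 7-day refresh tokens

# Hypothetical store: refresh token -> {"client_id", "expires_at"}
REFRESH_TOKENS = {}


def _new_access_token(client_id: str, now: float) -> dict:
    # Stand-in for a signed JWT: just the claims
    return {"sub": client_id, "exp": now + ACCESS_TTL}


def login(client_id: str, now: float = None):
    """Issue a short-lived access token plus a long-lived refresh token."""
    now = time.time() if now is None else now
    refresh = secrets.token_urlsafe(32)
    REFRESH_TOKENS[refresh] = {"client_id": client_id, "expires_at": now + REFRESH_TTL}
    return _new_access_token(client_id, now), refresh


def refresh_access_token(refresh: str, now: float = None) -> dict:
    """Exchange a still-valid refresh token for a fresh access token."""
    now = time.time() if now is None else now
    record = REFRESH_TOKENS.get(refresh)
    if record is None or record["expires_at"] <= now:
        raise PermissionError("invalid or expired refresh token")
    return _new_access_token(record["client_id"], now)


def revoke(client_id: str) -> None:
    """Drop the client's refresh tokens; access already issued dies within ACCESS_TTL."""
    for token in [t for t, r in REFRESH_TOKENS.items() if r["client_id"] == client_id]:
        del REFRESH_TOKENS[token]
```

With these numbers, a revoked client keeps access for at most five minutes instead of seven days, at the cost of a refresh call every few minutes.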
Central token repository
Implement a central token repository that holds the deny list (aka CRL). In this approach, every token revocation updates the deny list to mark that token as revoked. This passes another responsibility to the resource service: verifying the validity of the token.
There are 2 possible approaches here:
The recommended flow is to hold the CRL on the API gateway (API-GW) and update the list on each revocation. That means that each request to the resource MUST pass through the API gateway, and each revocation should update the CRL (which is used by the API-GW).
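A rough sketch of a gateway-side deny list follows. It assumes each JWT carries a unique `jti` claim (the standard JWT ID claim) alongside `exp`; the `DenyList` class and `gateway_allows` function are hypothetical names. Entries are dropped once the revoked JWT would have expired anyway, which keeps the list small.

```python
import time


class DenyList:
    """In-memory deny list keyed by JWT ID (jti)."""

    def __init__(self):
        self._revoked = {}  # jti -> expiry timestamp of the revoked JWT

    def revoke(self, jti: str, expires_at: float) -> None:
        self._revoked[jti] = expires_at

    def is_revoked(self, jti: str, now: float = None) -> bool:
        now = time.time() if now is None else now
        self._prune(now)
        return jti in self._revoked

    def _prune(self, now: float) -> None:
        # Expired JWTs are rejected on expiry alone, so their entries can go
        for jti in [j for j, exp in self._revoked.items() if exp <= now]:
            del self._revoked[jti]


def gateway_allows(claims: dict, deny_list: DenyList, now: float = None) -> bool:
    """Gateway check, run after the JWT signature has already been verified."""
    now = time.time() if now is None else now
    return claims["exp"] > now and not deny_list.is_revoked(claims["jti"], now)
```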
Central token repository:
Using the central token repository approach, the CRL is saved on the authentication service (the same service that holds the actual tokens).
This practice makes that service a Single Point Of Failure for the entire application: each request must pass through it, and each service in our cluster MUST know how to communicate with it (which takes us back to the world of monolithic applications…).
While this may not look so complex with a single resource service, with multiple services it looks like this:
Do you happen to know an architect who would sign off on this architecture? We risk tightly coupled services that end up like this:
Tracing API tokens activities
The last item on our API tokens checklist is traceability.
Keep in mind that API token authentication is ultimately just another method of authentication in our system. That means that just as we audit every user authentication (email + password), we MUST audit every API token activity.
API tokens can be just as destructive as administrative actions performed by users, so bear in mind that there are a few items on the traceability checklist (I call it the 5-w checklist):
- The who: Which user created this API token? Recording the user who created the API token helps us trace destructive operations back to suspicious users in the system.
- The when: When was this API token created? Recording the timestamp helps track potential issues in time windows where we suspect changes were made in the system.
- The what: What scopes was this API token created with? Recording the scopes helps identify potential issues with elevated permissions granted to this API token.
- The where: Where was this API token accessed from? Which IP address? This is extremely useful for detecting anomalies and building deny lists for specific countries/networks.
- The how: How was this API token used? When was it last authenticated? Which APIs were accessed using it?
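The checklist above maps naturally onto an audit record written on every API token use. The sketch below is one possible shape; the `ApiTokenAuditEvent` class, its field names, and the JSON-lines output are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class ApiTokenAuditEvent:
    token_id: str
    created_by: str          # the who: user that created the token
    created_at: str          # the when: ISO 8601 creation timestamp
    scopes: Tuple[str, ...]  # the what: scopes the token was created with
    source_ip: str           # the where: address the request came from
    method: str              # the how: which API was called...
    path: str
    used_at: str             # ...and when the token was used


def audit_line(event: ApiTokenAuditEvent) -> str:
    """Serialize the event as one JSON line, ready for an append-only audit log."""
    return json.dumps(asdict(event), sort_keys=True)
```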
The complexity of authentication and identity management in the products we build is not what it used to be.
We are using way more microservices now; therefore, we need to make sure that the line of communication between our microservices is authenticated and secured.
More and more 3rd party services are being used, so we need to make sure that when we are consuming these services, we are doing so using the correct authentication and permissions.
This requires us to use machine-to-machine authentication methods which will allow us, as developers, to consume these services securely.
In this post, we talked about scoping, auth flows, revocation, and traceability. We walked through the items necessary to ensure a successful design and implementation of machine-to-machine authentication and authorization mechanisms,
and it was lit. 🔥