API Token Generation Full Guide

Why do we use authentication? To protect our users of course!

All jokes aside, an easy example of authentication is email+password authentication.

Our users want to access our product.
We prompt them with an email+password form.
We generate a secure session for them to work with the product.

This is all lovely but the rise of microservices, as well as CI/CD, introduced new challenges.

We need to be able to authenticate without user presence. This is where API tokens come into play.

There are a few use cases that come to mind when discussing API tokens:

3rd party services: every 3rd party that wants to communicate with our service needs to be authenticated.
CI/CD flows: another 3rd party use case, where some of the services that run as part of the flow are external (such as code scanners, deployment tools, etc.).

So how do we generate sessions for non-humans and what do we need to pay attention to?

This guide will provide you with most of the “how-tos” in terms of generating and securing machine-to-machine sessions using API tokens.

The OAuth token flow

So we understand that in order to authenticate using API tokens we need to generate them. But which kind of token should be generated? Which flow of authentication do we want to support here?

The OAuth client credentials flow is the most common one when it comes to machine-to-machine authentication.

The idea is simple.

The unauthorized client gets a Client ID and a Client Secret.
This pair is sent to the Authorization Server.
The client, yet to be authorized, receives an Access Token.
The client uses that access token to access the resource.
Upon validation of the token, the client is treated as the resource owner and is granted access.
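
The steps above can be sketched end to end in a few lines. This is a minimal in-memory sketch under loud assumptions, not a production OAuth server: `AuthorizationServer`, `register_client`, `issue_token` and `verify_token` are hypothetical names, the token is a simplified HMAC-signed payload rather than a standards-compliant JWT, and the secret is kept in plain form here only to keep the example short.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


class AuthorizationServer:
    """Hypothetical in-memory authorization server (illustration only)."""

    def __init__(self) -> None:
        self._signing_key = secrets.token_bytes(32)
        self._clients = {}  # client_id -> client_secret

    def register_client(self):
        # Step 1: the client gets a client ID and a client secret.
        client_id = secrets.token_hex(16)
        client_secret = secrets.token_urlsafe(32)
        self._clients[client_id] = client_secret
        return client_id, client_secret

    def _sign(self, payload: str) -> str:
        mac = hmac.new(self._signing_key, payload.encode(), hashlib.sha256)
        return _b64(mac.digest())

    def issue_token(self, client_id: str, client_secret: str, ttl: int = 3600) -> str:
        # Steps 2-3: the pair is sent to the server, which returns an access token.
        expected = self._clients.get(client_id)
        if expected is None or not hmac.compare_digest(expected, client_secret):
            raise PermissionError("invalid client credentials")
        claims = {"sub": client_id, "exp": int(time.time()) + ttl}
        payload = _b64(json.dumps(claims).encode())
        return f"{payload}.{self._sign(payload)}"

    def verify_token(self, token: str) -> dict:
        # Steps 4-5: the token is validated before the resource is served.
        payload, _, signature = token.partition(".")
        if not hmac.compare_digest(self._sign(payload), signature):
            raise PermissionError("bad signature")
        claims = json.loads(_unb64(payload))
        if claims["exp"] < time.time():
            raise PermissionError("token expired")
        return claims
```

A caller would run `register_client()` once, keep the pair, and call `issue_token()` whenever it needs a fresh access token.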

The single token flow

The single token flow means that instead of generating a client ID and client secret, we authenticate using a single token.

The idea is to simplify the authentication flow: the client uses a token that doesn’t need to be swapped for a valid JWT through the authorization server.

There are several approaches for this method:

Grant the client a JWT token without expiration during the generation phase, which will then be used as the access token.

Grant the client a token in the generation step, which will then be validated against the central token store.
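
The second approach can be sketched as a small store that hands out an opaque random token and validates it by lookup. This is only an illustration: `TokenStore` and its methods are hypothetical names, the store is an in-memory dict standing in for a real database, and storing only a SHA-256 digest of the token is an assumption on our side (it is reasonable here because the token is long and random, unlike a human-chosen password).

```python
import hashlib
import secrets
from typing import Optional


class TokenStore:
    """Hypothetical central token store keyed by the token's SHA-256 digest."""

    def __init__(self) -> None:
        self._records = {}  # digest -> {"owner_id": ..., "scopes": [...]}

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode()).hexdigest()

    def generate(self, owner_id: str, scopes: list) -> str:
        # The raw token is returned once to the caller and never stored.
        token = secrets.token_urlsafe(32)
        self._records[self._digest(token)] = {"owner_id": owner_id, "scopes": scopes}
        return token

    def validate(self, token: str) -> Optional[dict]:
        # Unknown or revoked tokens simply yield None.
        return self._records.get(self._digest(token))

    def revoke(self, token: str) -> None:
        self._records.pop(self._digest(token), None)
```

The trade-off compared to the non-expiring JWT is that every request needs a lookup against the store, but revoking a token takes effect immediately.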

Token types (account / personal)

In B2B SaaS applications, the “tenant” sits at the center of the application. Every flow our application handles, as well as most of the data in the application, will always be in the context of a “tenant”.

In this context, when users log in to our application, they log in to a specific tenant, see data related to that tenant, and manage the configuration of that exact tenant.

Sounds familiar, right?

Usually, with a tenant-level token, as there is no user context, we would like to have roles/permissions associated with the token itself.

We can see a great example in SendGrid when generating a new token:

We generate an API token which has specific permissions and lives on the SendGrid account level (not related to any user).

That means that as long as we maintain the SendGrid account, we can use the API token (regardless of the identity of the users who are members of the account).

However, in some cases, we would like to allow our users to generate personal tokens in order to perform machine-to-machine operations on behalf of a user and not a tenant.

We can see examples of this on GitHub and npm:

The token is created on the user’s personal account, including specific scopes (permissions).

It is important to highlight that with personal tokens, once the user is deleted the token should be deleted as well.

The token repository

So now that we understand the difference between tenant and personal tokens and have a high-level view of the authentication flow, we can start taking a look at the generation of the token.

How should we generate the initial token? Where should we save it? How should we authenticate it?

Generating the initial token (OAuth flow)

Using the OAuth flow, we need to generate a clientId and a secret. These will be saved to the DB so they can be used in the authentication flow.

A few things to pay close attention to when we design this solution:

We would like to generate the client ID and secret as 128-bit UUIDs (version 4 is preferred to reduce the probability of duplicates; for more info, you can read about the Birthday Paradox).
Client secret = password(!!!). It has to be stored salted and hashed in the database so it cannot be recovered if the DB is compromised. That means that during the generation process we show the secret ONLY ONCE to the user, and we will not be able to show it again.
Token pair association: when storing the token in the DB, we must retain (as in any multi-tenant / multi-user product) the relation of this token to its tenant/user, so that when generating the JWT for it we sign it for the relevant customer.
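
The three points above can be sketched together. This is a minimal sketch, assuming PBKDF2 as the hashing scheme and an illustrative iteration count; the function names and the record layout are hypothetical, and a real system may prefer a dedicated password-hashing library.

```python
import hashlib
import hmac
import os
import uuid


def hash_secret(secret: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256; the iteration count is an illustrative choice.
    return hashlib.pbkdf2_hmac("sha256", secret.encode(), salt, 100_000)


def generate_credentials(tenant_id: str):
    """Return (record to persist, plain secret to show the user once)."""
    client_id = str(uuid.uuid4())
    client_secret = str(uuid.uuid4())
    salt = os.urandom(16)
    record = {
        "client_id": client_id,
        "tenant_id": tenant_id,  # the token pair association
        "salt": salt,
        "secret_hash": hash_secret(client_secret, salt),
    }
    # The plain secret is NOT part of the record, so it cannot be shown again.
    return record, client_secret


def verify_secret(record: dict, candidate: str) -> bool:
    # Constant-time comparison of the stored hash and the candidate's hash.
    candidate_hash = hash_secret(candidate, record["salt"])
    return hmac.compare_digest(record["secret_hash"], candidate_hash)
```

During authentication, the service loads the record by `client_id` and calls `verify_secret` with the secret the client presented.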

A lot to think about…

So let’s take a look at what a tenant API token looks like from an entity point of view:

On top of the clientId and secret, we are storing the tenant relation, as well as the scope. In some cases, we would also like to store the ID of the user who created this token.

So let’s take a look at what a user API token looks like from an entity point of view:

Not much has changed, but now we are saving the user ID instead of the tenant ID (which means we don’t have to store the createdBy anymore).
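
As a rough sketch, the two entities described above could be modeled like this. The class and field names are our own assumptions for illustration; only the set of stored attributes (client ID, hashed secret, tenant or user relation, scopes, optional creator) comes from the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TenantApiToken:
    client_id: str
    secret_hash: str                  # never the plain secret
    tenant_id: str                    # the tenant this token belongs to
    scopes: Tuple[str, ...]
    created_by: Optional[str] = None  # ID of the user who created the token


@dataclass
class UserApiToken:
    client_id: str
    secret_hash: str
    user_id: str                      # replaces tenant_id; no created_by needed
    scopes: Tuple[str, ...]
```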

Cool…

So we know how to create an API token. Good!

Do we allow changing or modifying it?
Well… No…

By nature, API tokens are immutable objects. In most cases, we don’t want to allow major changes to a token (updating the display name of the token can work). We won’t let the creator of the token change its scopes / clientId / secret etc.

Why? Security first approach…

We want to make sure that once a token is created, it is safe to provide it to a 3rd party application without the risk of its scopes changing (by mistake OR not) and exposing our service to elevated permissions for this 3rd party application…
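
One way to express this rule in code is to make the token object itself immutable and expose a single operation for the one change we do allow. This is a sketch only; `ApiToken` and `rename` are hypothetical names, and in practice the same rule would also have to be enforced at the API and database layers.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ApiToken:
    client_id: str
    display_name: str
    scopes: Tuple[str, ...]


def rename(token: ApiToken, display_name: str) -> ApiToken:
    # Only the display name changes; client_id and scopes are carried over as-is.
    return replace(token, display_name=display_name)
```

Any attempt to assign to `token.scopes` raises `dataclasses.FrozenInstanceError`, so widening the scopes requires creating a brand new token.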

OK… So we know how to create tokens and save them.
We know how to authenticate it and generate a JWT from it.
But how do we handle different permissions and scopes?

Authorization with API tokens

We know that authentication is not enough. We would like to maintain a list of permissions (aka scopes) on the signed JWT, so we can work with this list on each of the microservices we call in order to perform an authenticated call.

The general flow is simple:

The resource owner’s responsibility in this case is to validate the authentication (using the authentication service’s public key) and validate the scopes required for accessing the resource (decoded from the JWT).

Looks simple, right?

Well, it is. Even when looking at it from a code perspective, all it takes is a small middleware.

The middleware validates the Authorization header, extracts the scopes from the token, and matches them against the scopes required by the requested resource.
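
A minimal sketch of such a middleware, written as a framework-agnostic decorator. Everything here is an assumption for illustration: the request is a plain dict, the responses are plain dicts, and `verify_token` is a hypothetical callable that stands in for verifying the JWT signature with the authentication service's public key and returning its claims.

```python
from functools import wraps


def require_scopes(verify_token, *required):
    """verify_token(token) must return the claims dict or raise ValueError."""

    def decorator(handler):
        @wraps(handler)
        def wrapper(request: dict):
            header = request.get("headers", {}).get("Authorization", "")
            scheme, _, token = header.partition(" ")
            if scheme != "Bearer" or not token:
                return {"status": 401, "body": "missing bearer token"}
            try:
                claims = verify_token(token)  # authentication
            except ValueError:
                return {"status": 401, "body": "invalid token"}
            granted = set(claims.get("scopes", []))
            if not set(required) <= granted:  # authorization
                return {"status": 403, "body": "insufficient scope"}
            return handler(request, claims)

        return wrapper

    return decorator


def fake_verify(token: str) -> dict:
    # Stand-in verifier for the example; a real one would check the signature.
    if token != "good":
        raise ValueError("invalid token")
    return {"sub": "client-1", "scopes": ["invoices:read"]}


@require_scopes(fake_verify, "invoices:read")
def list_invoices(request: dict, claims: dict) -> dict:
    return {"status": 200, "body": "invoices for " + claims["sub"]}
```

Each resource declares the scopes it needs next to its handler, so the check stays local to the microservice that owns the resource.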

The revocation

So we are getting close!

We have a full flow working end to end from the generation, through the authentication and even through validating the authentication and authorization!

But how do we revoke an access token?

Let’s say we created a token with a default JWT expiration of 7 days.
We would like to revoke it after a day… Looking at the flow above, once the 3rd party application gets hold of the JWT, it lives for the full 7 days, and the resource service has no way of knowing that this JWT is not valid anymore.

A few potential solutions here:

Shorter JWT tokens + longer refresh tokens

Using a shorter expiration on the JWT together with longer-lived refresh tokens shortens the window of exposure after a token is revoked. The 3rd party application in this case is required to constantly refresh the token (using the refresh token) in order to retrieve a new JWT.
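
The mechanics can be sketched as follows. This is a simplified model under stated assumptions: `TokenIssuer` is a hypothetical name, the TTL values are illustrative, the access token is a plain dict standing in for a signed JWT, and refresh tokens live in an in-memory dict.

```python
import secrets
import time

ACCESS_TTL = 5 * 60           # illustrative: access tokens live 5 minutes
REFRESH_TTL = 7 * 24 * 3600   # illustrative: refresh tokens live 7 days


class TokenIssuer:
    def __init__(self, now=time.time):
        self._now = now     # injectable clock, handy for tests
        self._refresh = {}  # refresh token -> {"client_id", "exp"}

    def _access(self, client_id: str) -> dict:
        # Stand-in for a signed, short-lived JWT.
        return {"sub": client_id, "exp": self._now() + ACCESS_TTL}

    def login(self, client_id: str):
        refresh = secrets.token_urlsafe(32)
        self._refresh[refresh] = {
            "client_id": client_id,
            "exp": self._now() + REFRESH_TTL,
        }
        return self._access(client_id), refresh

    def refresh(self, refresh_token: str) -> dict:
        record = self._refresh.get(refresh_token)
        if record is None or record["exp"] < self._now():
            raise PermissionError("refresh token revoked or expired")
        return self._access(record["client_id"])

    def revoke(self, refresh_token: str) -> None:
        # After this call, the client can use its current access token
        # for at most ACCESS_TTL seconds and cannot obtain a new one.
        self._refresh.pop(refresh_token, None)
```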

Central token repository

Implement a central token repository which holds the deny list (aka CRL). In this use case, every time a token is revoked we update the deny list to record it. This passes another responsibility to the resource service: verifying that the token has not been revoked.

2 possible approaches here:

API-GW first:
The recommended flow is to hold the CRL on the API-GW and update the list on each revocation. That means that each request to the resource MUST pass through the API gateway, and each revocation should update the CRL (which is used by the API GW).

Central token repository:
Using the central token repository approach, the CRL is saved on the authentication service (the same service which holds the actual tokens).
This practice makes this service a Single Point Of Failure for the entire application: each request must pass through this service, and each service in our cluster MUST know how to communicate with it (which takes us back to the world of the monolithic application…).

While this may not look so complex with a single resource service, with multiple services every one of them has to call the authentication service on every request.

Do you happen to know an architect who would sign off on this architecture, at the risk of tightly coupling all of the services?
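
Whichever component ends up holding the deny list, the check itself is small. The sketch below assumes each JWT carries a unique ID in the standard `jti` claim and an `exp` claim; `DenyList` and `gateway_allows` are hypothetical names, and the in-memory dict stands in for whatever shared store the gateway uses.

```python
import time


class DenyList:
    def __init__(self, now=time.time):
        self._now = now
        self._revoked = {}  # jti -> expiry time of the revoked token

    def revoke(self, jti: str, token_exp: float) -> None:
        self._revoked[jti] = token_exp

    def is_revoked(self, jti: str) -> bool:
        return jti in self._revoked

    def purge_expired(self) -> int:
        # A revoked token that has expired is rejected anyway,
        # so its entry can be dropped to keep the list small.
        expired = [j for j, exp in self._revoked.items() if exp <= self._now()]
        for jti in expired:
            del self._revoked[jti]
        return len(expired)


def gateway_allows(claims: dict, deny_list: DenyList) -> bool:
    # Runs after signature and expiration checks have already passed.
    return not deny_list.is_revoked(claims["jti"])
```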

Tracing API tokens activities

The last item on our API token checklist is traceability.

Keep in mind that API token authentication is, in the end, just another method of authentication in our system. That means that just as we audit every user authentication (email + password), we MUST audit every API token activity.

API tokens can be just as destructive as administrative actions performed by users, so bear in mind that there are a few items on the traceability checklist (I call it the 5-w checklist):

The who: Which user created this API token? Recording the user who created the API token helps us trace destructive operations back to suspicious users on the system.
The when: When was this API token created? Recording the timestamp helps us track potential issues in time slots where we suspect changes were made in the system.
The what: What scopes was this API token created with? Recording the scopes helps identify potential issues with elevated permissions granted to this API token.
The where: Where was this API token accessed from? Which IP address? This is extremely useful for detecting anomalies and maintaining a deny list for specific countries/networks.
The how: How was this API token used? When was it last authenticated? Which APIs were accessed using this API token?
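
The checklist maps naturally onto an audit record. The sketch below is one possible shape, with hypothetical field names; the network check uses Python's standard `ipaddress` module, and the addresses in the usage note are documentation-only example ranges.

```python
import ipaddress
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class TokenAuditEvent:
    token_id: str
    created_by: str           # the who
    created_at: datetime      # the when
    scopes: Tuple[str, ...]   # the what
    source_ip: str            # the where
    method: str               # the how: which API was called...
    path: str
    occurred_at: datetime     # ...and when the token was used


def is_from_denied_network(event: TokenAuditEvent, denied_networks) -> bool:
    # True when the caller's IP falls inside any denied CIDR range.
    address = ipaddress.ip_address(event.source_ip)
    return any(address in ipaddress.ip_network(net) for net in denied_networks)
```

For example, an event with `source_ip="203.0.113.7"` matches the denied range `"203.0.113.0/24"` but not `"198.51.100.0/24"`.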

Summary

The complexity of authentication and identity management in the products we build is not what it used to be.

We are using way more microservices now; therefore, we need to make sure that the line of communication between our microservices is authenticated and secured.

More and more 3rd party services are being used, so we need to make sure that when we are consuming these services, we are doing so using the correct authentication and permissions.

This requires us to use machine-to-machine authentication methods which will allow us, as developers, to consume these services securely.

In this post, we talked about scoping, auth flows, revocation and traceability. We walked through the items necessary to ensure a successful design and implementation of machine-to-machine authentication and authorization mechanisms, and it was lit.
