Bloom Filters: The Unsung Hero of System Design

By Pradyumna Chippigiri

January 29, 2026

Let’s start from the ground up.

Say we’re building an app, and as part of the user registration process we ask users to choose a username. If the username is already taken, they need to try another one. (Just like your email address, Instagram username, etc.)


So as engineers, how should we approach building a feature like this?

The most straightforward approach we can think of is to look the username up in the database every time a user checks one.
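A minimal sketch of such a direct database check, assuming a hypothetical `users` table with a `username` column and SQLite standing in for the real database (neither is specified in this article):

```python
import sqlite3

def is_username_taken(conn: sqlite3.Connection, username: str) -> bool:
    """Return True if the username already exists in the (hypothetical) users table."""
    row = conn.execute(
        "SELECT 1 FROM users WHERE username = ? LIMIT 1", (username,)
    ).fetchone()
    return row is not None

# In-memory database standing in for the production DB (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")  # no index, so lookups scan rows
conn.executemany(
    "INSERT INTO users VALUES (?)", [("pushu",), ("giri",), ("vikram",)]
)

print(is_username_taken(conn, "giri"))     # True
print(is_username_taken(conn, "newuser"))  # False
```

Every availability check here is one round trip to the database, which is exactly the cost the rest of the article tries to avoid.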

This works perfectly… at small scale.


Now as the app grows, this endpoint suddenly becomes one of our hottest paths. The traditional approach does a full table scan (assuming there is no index on the username column), which takes O(n) time. That is fine while the number of rows in our table is limited, but at millions or billions of records the same query will take far too long to return a result. The problem is not only the number of rows to check: during registration people retry multiple usernames, which increases the QPS, hammers our DB, and raises the response latency.


And introducing a cache doesn’t make sense here, as we are trying to see whether a username exists or not. (A cache stores frequently accessed data, which is irrelevant to us here.)


So at this scale, we have to protect the DB from being overwhelmed.


So our next approach would be to store the existing usernames in a Set. Each time someone requests a username check, instead of hitting the DB, we first check whether the username is present in the Set or not.

A Set lookup is super fast, as it gives an O(1) lookup time, but when our app grows to Instagram scale, storing billions of strings in a traditional in-memory Set will eat up our RAM and is also super expensive. (1 billion usernames with an average of 10 characters each comes to roughly 10 GB or more just for the names 😂)

Here is where Bloom filters come to the rescue and solve the above problem efficiently.
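To get a feel for the saving, here is a back-of-the-envelope sketch. It uses the standard Bloom filter sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The 1% false-positive target `p` is an assumption for illustration and is not a figure from this article.

```python
import math

n = 1_000_000_000      # usernames, from the example above
avg_len_bytes = 10     # average username length, from the example above
p = 0.01               # ASSUMED target false-positive rate (not from the article)

# Raw characters alone, ignoring the per-object overhead of a real Set.
raw_set_bytes = n * avg_len_bytes

# Standard Bloom filter sizing: m = -n * ln(p) / (ln 2)^2 bits.
bloom_bits = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
bloom_bytes = bloom_bits / 8

# Matching number of hash functions: k = (m / n) * ln 2.
k = round((bloom_bits / n) * math.log(2))

print(f"raw usernames: {raw_set_bytes / 1e9:.1f} GB")   # 10.0 GB
print(f"Bloom filter : {bloom_bytes / 1e9:.1f} GB, k = {k}")  # 1.2 GB, k = 7
```

Under that assumed 1% rate, the filter needs roughly an eighth of the space of the raw strings, and its size does not depend on how long the usernames are.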

What is a Bloom Filter?

A Bloom filter is a probabilistic data structure that returns whether an element is definitely not in a set or possibly in it.

So this helps answer questions like “Does the element exist in the set (data structure) or not?” efficiently. For example, see the real-world use cases below.

Bloom Filter Use Cases

It has numerous use cases… Here’s another discussion on LinkedIn from an Amazon employee about how they used Bloom filters…

Now let’s understand how Bloom filters work.

How Does a Bloom Filter Work?

Let’s understand by taking an example:


Imagine we have an array of 128 buckets where each bucket consists of a single bit that is set to zero. Since it’s a bit, it can only have two values: either 0 or 1. Initially, all 128 bits are set to 0.

Let’s continue with the previous example of finding a username. (For better understanding, let’s say we already have 3 users, “pushu”, “giri” and “vikram”, and we have set the bits at their hash positions to 1.)
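The starting state described above can be sketched like this. The number of hash functions (`K = 3`) and the use of salted SHA-256 are assumptions for illustration; the article does not fix either.

```python
import hashlib

M = 128   # number of bits, as in the example above
K = 3     # ASSUMED number of hash functions

def positions(item: str) -> list[int]:
    """Map an item to K bit positions using salted SHA-256 (an illustrative choice)."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big") % M
        for i in range(K)
    ]

bits = [0] * M   # all 128 bits start at 0

for name in ("pushu", "giri", "vikram"):
    for pos in positions(name):
        bits[pos] = 1   # mark each hashed position

print(sum(bits), "of", M, "bits are set")
```

At most 9 bits end up set (3 users times 3 positions), and fewer if two positions happen to collide.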

Insert Operation

While the user is typing a username, we might do a GET availability check against the Bloom filter (refer to the Query/Read Operation section below).

The insert, however, happens only when the Bloom filter query (GET) returns 0 (meaning we are 100% certain that the username is not present) and the user finally clicks Sign Up, at which point we create the account via a POST request.


Only after the DB insert succeeds, we update the Bloom filter like below.

For the insert operation we need to perform the following steps:
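The insert path can be sketched as follows. The usual Bloom filter insert hashes the item with k hash functions, reduces each hash modulo the array size, and sets those bits to 1. This is a minimal sketch, assuming k = 3, salted SHA-256 as the hash functions, SQLite standing in for the real database, and hypothetical names `sign_up` and `bloom_add`; none of these come from the article.

```python
import hashlib
import sqlite3

M, K = 128, 3   # ASSUMED filter size and hash count, for illustration only
bits = [0] * M

def positions(item: str) -> list[int]:
    """K bit positions for an item, via salted SHA-256 (illustrative choice)."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big") % M
        for i in range(K)
    ]

def bloom_add(item: str) -> None:
    for pos in positions(item):
        bits[pos] = 1

def bloom_might_contain(item: str) -> bool:
    return all(bits[pos] for pos in positions(item))

# In-memory SQLite standing in for the real user store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY)")

def sign_up(username: str) -> bool:
    """Hypothetical POST handler: write to the DB first, then update the filter."""
    try:
        with conn:   # commits on success, rolls back on error
            conn.execute("INSERT INTO users VALUES (?)", (username,))
    except sqlite3.IntegrityError:
        return False         # username already taken; filter left untouched
    bloom_add(username)      # only after the DB insert succeeded
    return True

print(sign_up("pradyumna"))              # True
print(sign_up("pradyumna"))              # False
print(bloom_might_contain("pradyumna"))  # True
```

Updating the filter only after a successful write keeps it from claiming usernames that never made it into the database.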

Query/Read Operation

For the query operation (that is, to see whether a username exists or not) we perform the following steps:
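A query follows the same hashing as an insert, but only reads the bits: if any of the k positions is 0, the item is definitely not present; if all are 1, it is possibly present. Here is a minimal sketch of that check, again assuming k = 3 and salted SHA-256 (illustrative choices, not the author's implementation), with a hypothetical `BloomFilter` class:

```python
import hashlib

class BloomFilter:
    """Illustrative Bloom filter; class and method names are hypothetical."""

    def __init__(self, m: int = 128, k: int = 3):
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)   # m bits packed into bytes

    def _positions(self, item: str):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # Any 0 bit means "definitely not present"; all 1s means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )

bf = BloomFilter()
for name in ("pushu", "giri", "vikram"):
    bf.add(name)

print(bf.might_contain("giri"))             # True: added items are never reported absent
print(BloomFilter().might_contain("giri"))  # False: an empty filter has no bits set
```

A `True` result still needs confirmation against the database when correctness matters, since unrelated usernames can happen to hash onto bits that are already set.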

Some Important Points to Note

Obvious Questions That Pop Up

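For the query(element) operation, you might have to check every Bloom filter instance and then decide accordingly: the element is possibly present if any instance reports a match, and definitely absent only if all of them report it missing.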
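Querying across several instances might look like the sketch below. It assumes the instances come from appending a fresh filter whenever the current one reaches a chosen capacity; that setup, the `FilterChain` name, and the sizes used are all assumptions for illustration.

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k, self.count = m, k, 0
        self.bits = [0] * m

    def _positions(self, item: str) -> list[int]:
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big")
            % self.m
            for i in range(self.k)
        ]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1
        self.count += 1

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))

class FilterChain:
    """Hypothetical wrapper: grows by appending filters, queries all of them."""

    def __init__(self, m: int = 128, k: int = 3, capacity: int = 10):
        self.m, self.k, self.capacity = m, k, capacity
        self.filters = [BloomFilter(m, k)]

    def add(self, item: str) -> None:
        if self.filters[-1].count >= self.capacity:
            self.filters.append(BloomFilter(self.m, self.k))  # start a fresh instance
        self.filters[-1].add(item)

    def might_contain(self, item: str) -> bool:
        # Possibly present if ANY instance says so; absent only if ALL say no.
        return any(f.might_contain(item) for f in self.filters)

chain = FilterChain(capacity=2)
for name in ("pushu", "giri", "vikram"):
    chain.add(name)

print(len(chain.filters))             # 2
print(chain.might_contain("vikram"))  # True
```

One trade-off of this design is that every extra instance adds its own chance of a false positive, so the combined false-positive rate grows with the number of filters.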




We have finally come to the end of this long deep dive. Also, I have implemented Bloom filters in Python from first principles, starting from brute force and working up to a production-grade optimal approach.


I will be talking more about the code, step by step, in my coming articles.


If you liked this article, feel free to like, subscribe, and drop a comment to spark a discussion, so we can all learn from each other. And if you think it’ll help someone, share it with your friends or on social media.


If you found this useful and want to support my work, you can also buy me a coffee ☕ - buymeacoffee.com/cpradyumnao