
Architecture

Why Falcoscan ships on three services instead of five

6 min read · Shem Nyachieo

When I sketched the first Falcoscan architecture, I started with the playbook every "production RAG stack" tutorial recommends: a scraper, a queue, an enricher, Pinecone for the vector store, and Postgres for everything else. Five services. Clean. Conventional. Easy to justify on a whiteboard.

Then I measured my actual load — and a leaner three-service stack came out ahead on speed, cost, and reliability. This is the post I wish I had read a month earlier.

  • 3 services in production: Supabase, Vercel, and one scraper.
  • 6,700+ AI tools indexed across 29+ markets, refreshed weekly.
  • 300+ active users: validated the architecture under real traffic.

What the queue was for in the original plan

The classic argument for a queue is, "the scraper produces work faster than the enricher can consume it, so you need a buffer." That is genuinely true at scale. The question I asked myself: what scale am I actually running at?

When I instrumented the system, the answer was specific. The scraper produced about 4,000 rows in a 5-minute window during one daily crawl, then went idle for 23 hours. The enricher chewed through those 4,000 rows in roughly 12 minutes. So the queue was buffering a 12-minute spike, then sitting idle for the rest of the day.
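To make the arithmetic behind those numbers concrete, here is a small back-of-the-envelope sketch. It uses only the figures quoted above, and it assumes (my assumption, not something I measured separately) that the enricher starts consuming the moment the scraper starts producing and that both run at a steady rate.

```python
# Back-of-the-envelope numbers for the daily crawl.
# Figures come from the measurements quoted above; steady rates and a
# simultaneous start for scraper and enricher are simplifying assumptions.
ROWS_PER_CRAWL = 4_000
SCRAPE_MINUTES = 5
ENRICH_MINUTES = 12
MINUTES_PER_DAY = 24 * 60

produce_rate = ROWS_PER_CRAWL / SCRAPE_MINUTES   # rows per minute from the scraper
consume_rate = ROWS_PER_CRAWL / ENRICH_MINUTES   # rows per minute through the enricher

# The backlog is largest at the moment the scraper finishes.
peak_backlog = ROWS_PER_CRAWL - consume_rate * SCRAPE_MINUTES

# Share of the day during which any buffer holds work at all.
busy_fraction = ENRICH_MINUTES / MINUTES_PER_DAY

print(f"scraper:  {produce_rate:.0f} rows/min")
print(f"enricher: {consume_rate:.0f} rows/min")
print(f"peak backlog: {peak_backlog:.0f} rows")
print(f"buffer in use: {busy_fraction:.2%} of the day")
```

Under those assumptions the buffer never holds more than a few thousand rows and is in use for under 1% of the day.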

That is a beautiful piece of infrastructure — for a workload 20× my actual one. The right move for now was to consolidate it onto Postgres.

The Postgres "queue" that replaced it

The replacement is a single Postgres table called enrichment_jobs with a status column. The scraper inserts rows with status = 'pending'. The enricher reads a pending row, marks it running, processes it, and sets it to done. A Postgres advisory lock keeps two workers from grabbing the same row.
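The pending → running → done lifecycle can be sketched in a few lines. To keep the example runnable without a database server, it uses Python's built-in sqlite3 as a stand-in for Postgres, and it swaps the advisory lock for a compare-and-set UPDATE (claim the row only if it is still pending). The payload column and the function names are hypothetical, chosen for illustration; only the enrichment_jobs table name and the status values come from the description above.

```python
import sqlite3
from typing import Optional, Tuple


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the enrichment_jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE enrichment_jobs ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"          # hypothetical column
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def enqueue(conn: sqlite3.Connection, payload: str) -> int:
    """Scraper side: insert a row; status defaults to 'pending'."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO enrichment_jobs (payload) VALUES (?)", (payload,)
        )
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection) -> Optional[Tuple[int, str]]:
    """Enricher side: pick the oldest pending row and mark it running.

    The UPDATE only succeeds if the row is still 'pending', so a worker
    that loses the race gets rowcount 0 and claims nothing.
    """
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM enrichment_jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        cur = conn.execute(
            "UPDATE enrichment_jobs SET status = 'running'"
            " WHERE id = ? AND status = 'pending'",
            (row[0],),
        )
        if cur.rowcount != 1:
            return None
    return row


def mark_done(conn: sqlite3.Connection, job_id: int) -> None:
    """Enricher side: finish a job that this worker was running."""
    with conn:
        conn.execute(
            "UPDATE enrichment_jobs SET status = 'done'"
            " WHERE id = ? AND status = 'running'",
            (job_id,),
        )
```

A worker loop would call claim_next, process the payload, then call mark_done. In Postgres itself, the same guard could be expressed with an advisory lock as described above, or with SELECT ... FOR UPDATE SKIP LOCKED.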

A dedicated queue gives you durability, retries with exponential backoff, dead-letter routing, and strict ordering guarantees. The Postgres pattern gives me durability for free, and the more advanced bits become tiny additions when the workload demands them. So far the workload has not, which is exactly how I want it.
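As an illustration of how small one of those additions could be, here is a sketch of retry scheduling with exponential backoff and a dead-letter status. None of this is in production: the attempt counter, the 'dead' status, and the 30-second base and one-hour cap are all hypothetical values picked for the example.

```python
# Hypothetical retry policy for a status-column job table.
# Nothing here is part of the current system; values are illustrative.
BASE_DELAY_SECONDS = 30.0
MAX_DELAY_SECONDS = 3600.0
MAX_ATTEMPTS = 5


def backoff_seconds(attempt: int) -> float:
    """Delay before retrying after the given failed attempt (1-based).

    Doubles on every attempt and is capped at MAX_DELAY_SECONDS.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(MAX_DELAY_SECONDS, BASE_DELAY_SECONDS * 2 ** (attempt - 1))


def status_after_failure(attempts_so_far: int) -> str:
    """Return 'pending' to retry, or 'dead' once retries are exhausted."""
    return "pending" if attempts_so_far < MAX_ATTEMPTS else "dead"


for attempt in range(1, MAX_ATTEMPTS + 1):
    print(attempt, backoff_seconds(attempt), status_after_failure(attempt))
```

In a table-backed queue, the delay would typically be stored as a "do not run before" timestamp that the claim query filters on, and rows marked 'dead' would play the role of a dead-letter queue.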

Why Postgres beat Pinecone for my data shape

The classic argument for Pinecone is, "vector search is a specialized workload." That was a stronger argument three years ago. Today, pgvector inside Supabase handles my scale comfortably, and it wins on two things that matter more than raw query speed.

The first win is latency. The indexed cosine search on a 1,536-dimensional embedding returns in 8-22ms inside Postgres. Pinecone's published p50 is 40-100ms before any network hop. After the hop from my edge function, my pre-cutover measurements landed at about 120ms.

The second and bigger win is the JOIN. Every Falcoscan result needs four things at once: vector similarity, a filter on published_at > now() - 7 days, a filter on market in (...), and a JOIN to tools for the human-readable card.

Pinecone + Postgres

Two stores, two queries

  • Vector query to Pinecone
  • Separate Postgres query for metadata
  • In-memory merge in the edge function
  • Two systems to keep in sync

Supabase pgvector

One store, one query

  • Single 14-line SQL statement
  • Vector similarity + filters + JOIN in one round trip
  • Zero drift risk between vector and metadata
  • Same backup, same access control, same dashboard

The query I run for the front page is 14 lines and JOINs three tables. Cleaner. Faster. One source of truth.
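A query of that shape might look something like the sketch below. This is not the actual 14-line statement: the tool_embeddings table, every column name other than published_at and market, and the example market codes are assumptions for illustration, and the sketch joins two tables rather than three. The <=> operator is pgvector's cosine-distance operator, and the embedding is passed in pgvector's text form ('[0.1,0.2,...]') and cast to vector.

```python
from typing import List, Sequence, Tuple

# Illustrative only: table and column names other than tools, published_at,
# and market are assumptions, not the real Falcoscan schema.
FRONT_PAGE_SQL = """
SELECT t.id,
       t.name,
       t.market,
       t.published_at,
       1 - (e.embedding <=> %s::vector) AS similarity
FROM tool_embeddings AS e
JOIN tools AS t ON t.id = e.tool_id
WHERE t.published_at > now() - interval '7 days'
  AND t.market = ANY(%s)
ORDER BY e.embedding <=> %s::vector
LIMIT %s
"""


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_front_page_query(
    embedding: Sequence[float], markets: List[str], limit: int = 20
) -> Tuple[str, list]:
    """Return (sql, params) for a DB-API driver using %s placeholders."""
    vec = to_vector_literal(embedding)
    # The vector appears twice: once for the score, once for the ordering.
    return FRONT_PAGE_SQL, [vec, markets, vec, limit]


sql, params = build_front_page_query([0.1, 0.2], ["KE", "US"], limit=10)
print(sql)
print(params)
```

With a driver such as psycopg, the pair would be passed as cursor.execute(sql, params). The similarity ranking, both filters, and the JOIN travel in a single round trip, which is the point of the comparison above.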

What I traded away on purpose

Picking a leaner stack is not free. I made three deliberate trade-offs:

  1. Horizontal vector scale. If Falcoscan grows to 10 million tools, pgvector starts to feel the strain. The migration path is clear: add a read replica first, then move vectors to Pinecone (or to pgvector with HNSW + partitioning) when the data forces the decision. I have explicitly chosen to defer that choice until I have the data to make it well.

  2. Background-job ergonomics. Inngest or BullMQ would give me a dashboard, retry UI, and built-in metrics. My enrichment_jobs table gives me none of those. I built a 40-line Supabase view that surfaces stuck rows, and that has been enough to operate cleanly.

  3. The line on the architecture diagram that says "Pinecone." Some people read that line as a credibility signal. The trade is worth it — the engineers and investors who actually understand the choice nod when I walk them through it.
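The stuck-row check from the second trade-off is simple enough to sketch. The real thing is a Supabase view written in SQL; the version below expresses the same idea in Python so it can run standalone. The started_at field and the 30-minute threshold are hypothetical, chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import List, NamedTuple, Optional


class Job(NamedTuple):
    id: int
    status: str                       # 'pending', 'running', or 'done'
    started_at: Optional[datetime]    # hypothetical column; None until claimed


def stuck_jobs(
    jobs: List[Job],
    now: datetime,
    max_running: timedelta = timedelta(minutes=30),  # illustrative threshold
) -> List[Job]:
    """Return jobs that have been 'running' for longer than max_running."""
    return [
        job
        for job in jobs
        if job.status == "running"
        and job.started_at is not None
        and now - job.started_at > max_running
    ]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
jobs = [
    Job(1, "done", now - timedelta(hours=2)),
    Job(2, "running", now - timedelta(minutes=5)),
    Job(3, "running", now - timedelta(hours=1)),
    Job(4, "pending", None),
]
print([job.id for job in stuck_jobs(jobs, now)])
```

A row that stays in running far longer than a normal enrichment takes usually means the worker died mid-job, so surfacing those rows covers much of what a retry dashboard would otherwise show.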

The PM lens

The deeper lesson is not about Postgres versus Pinecone. It is about the difference between an architecture that buys you optionality and one that buys you complexity.

Most v1 architectures end up with fewer services than they started with, not more. The teams that ship are the ones who measure first and add services in the order their data demands. Falcoscan runs on 3 services right now — Supabase, Vercel, and one scraper. When my data tells me to add a fourth, I will. Until then, every box on the diagram has to earn its place.

That mindset has been the single most useful one I have brought from product into engineering: figure out what the user actually needs, ship the smallest thing that delivers it well, and let real usage tell you what to build next.


I am building Falcoscan in public and writing about the decisions as they happen. If you are hiring for Senior, Staff, or Founding PM roles and this kind of thinking is what you are looking for, I would love to talk — shemnyachieo@live.com.