Engineering

Building a real-time matchmaking scoring engine

How we moved from batch scoring to an incremental, event-driven pipeline that updates 50,000+ pair scores in under a second.

At mytradeshow.ai we process thousands of matchmaking pairs per event — scoring, ranking, and scheduling meetings in near real-time while organizers and participants watch the grid update live. This post walks through how we rebuilt the matchmaking engine from a batch job into a streaming pipeline.

The problem with batch scoring

Our original scoring pipeline ran as a cron job every five minutes. For small events this was fine, but at scale (2,000+ participants, 50,000+ potential pairs) the feedback loop was painfully slow. Organizers would tweak weights, wait five minutes, and wonder if anything happened.

Worse, the batch approach created a thundering-herd pattern: every five minutes the database would spike as the entire pair matrix was recomputed. During peak hours before an event, this competed with the booking and planning APIs for connection pool slots.

Architecture overview

The new engine is built on three layers:

  1. Event bus — Profile updates, weight changes, and manual overrides publish domain events. We use a lightweight in-process event emitter in dev and a durable queue (Vercel Queues) in production.

  2. Incremental scorer — Instead of rescoring the full matrix, we identify the affected pairs from the event payload and only recompute those. A profile change for participant A touches at most N pairs (where N is the other-side count), not N×M.

  3. Score cache — Computed scores live in a per-region cache with tag-based invalidation. The grid API reads from cache first; on miss it falls through to the scorer.
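As a sketch of how the first two layers connect: the event payload alone tells us which pairs need rescoring. The event shapes and helper below are illustrative, not our exact schema.

```typescript
// Hypothetical domain-event shapes -- field names are illustrative.
type DomainEvent =
  | { kind: "profile_updated"; participantId: string; campaignId: string }
  | { kind: "weights_changed"; campaignId: string }
  | { kind: "manual_override"; pairId: string; campaignId: string }

// For a profile update, the affected pairs are just that participant
// crossed with the other side of the campaign -- N pairs, not N×M.
function affectedPairs(
  event: Extract<DomainEvent, { kind: "profile_updated" }>,
  otherSideIds: string[],
): Array<[string, string]> {
  return otherSideIds.map((otherId) => [event.participantId, otherId])
}
```

A weights_changed event, by contrast, skips this derivation entirely and triggers a campaign-wide purge (see the cache section below).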

Incremental scoring in practice

The scorer is a pure function: given a pair of profiles and a weight configuration, it returns a deterministic score between 0 and 100. We decompose this into sub-scores:

type SubScore = {
  dimension: string    // e.g. "industry_fit", "stage_match", "geo_proximity"
  weight: number       // 0–1, from campaign config
  raw: number          // 0–100, dimension-specific
  weighted: number     // raw * weight
}

function computePairScore(
  profileA: ParticipantProfile,
  profileB: ParticipantProfile,
  weights: WeightConfig
): { total: number; breakdown: SubScore[] } {
  const dimensions = [
    industryFit(profileA, profileB, weights),
    stageMatch(profileA, profileB, weights),
    geoProximity(profileA, profileB, weights),
    tagOverlap(profileA, profileB, weights),
    manualBoost(profileA, profileB, weights),
  ]
  const total = dimensions.reduce((sum, d) => sum + d.weighted, 0)
  return { total: Math.round(total), breakdown: dimensions }
}
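For illustration, one dimension function might look like the sketch below. The field names and the scoring math (Jaccard overlap of industry tags) are assumptions for the example, not our production logic.

```typescript
// Minimal local types for the sketch -- real profiles carry more fields.
type ParticipantProfile = { industries: string[] }
type WeightConfig = { industry_fit: number }
type SubScore = { dimension: string; weight: number; raw: number; weighted: number }

function industryFit(
  a: ParticipantProfile,
  b: ParticipantProfile,
  weights: WeightConfig,
): SubScore {
  // Jaccard overlap of industry tags, scaled to 0-100.
  const setB = new Set(b.industries)
  const shared = a.industries.filter((i) => setB.has(i)).length
  const union = new Set([...a.industries, ...b.industries]).size
  const raw = union === 0 ? 0 : Math.round((shared / union) * 100)
  const weight = weights.industry_fit
  return { dimension: "industry_fit", weight, raw, weighted: raw * weight }
}
```

Each dimension follows this shape: compute a raw 0–100 value from the two profiles, then scale by the campaign-configured weight.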

Because each dimension function is stateless and the pair score is a pure function of its inputs, we can score a whole batch of affected pairs concurrently with Promise.all without worrying about shared mutable state.
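A minimal sketch of that batch step, with profile loading, scoring, and cache writes passed in as hypothetical helpers:

```typescript
type Score = { total: number }

// Rescore a batch of affected pairs concurrently. Because scoring is pure,
// the only async work is I/O: loading profiles and writing results.
async function rescorePairs<P>(
  pairs: Array<[string, string]>,
  loadProfile: (id: string) => Promise<P>,
  scorePair: (a: P, b: P) => Score,
  writeScore: (pairKey: string, total: number) => Promise<void>,
): Promise<void> {
  await Promise.all(
    pairs.map(async ([idA, idB]) => {
      const [a, b] = await Promise.all([loadProfile(idA), loadProfile(idB)])
      await writeScore(`${idA}:${idB}`, scorePair(a, b).total)
    }),
  )
}
```

In production the profile loads are batched and deduplicated; the sketch omits that for clarity.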

Cache invalidation strategy

We tag cache entries with two keys: the participant ID and the campaign ID. When a profile updates, we purge by participant tag — this invalidates every pair that includes that participant. When weights change, we purge by campaign tag — this invalidates every pair in the campaign.

This is a good trade-off: weight changes are rare (organizers tweak them a few times, not continuously) so a full-campaign purge is acceptable. Profile changes are frequent but only affect one slice of the matrix.
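An in-memory sketch of the purge semantics, assuming entries carry tags like "participant:abc123" and "campaign:evt42" (the production store is the per-region cache, not this class):

```typescript
// Toy tag-based cache: each entry carries a set of tags, and a single
// purgeTag call invalidates every entry carrying that tag.
class TaggedCache<V> {
  private entries = new Map<string, { value: V; tags: Set<string> }>()

  set(key: string, value: V, tags: string[]): void {
    this.entries.set(key, { value, tags: new Set(tags) })
  }

  get(key: string): V | undefined {
    return this.entries.get(key)?.value
  }

  // Returns how many entries were purged. A participant tag purges one
  // slice of the pair matrix; a campaign tag purges the whole campaign.
  purgeTag(tag: string): number {
    let purged = 0
    for (const [key, entry] of this.entries) {
      if (entry.tags.has(tag)) {
        this.entries.delete(key)
        purged++
      }
    }
    return purged
  }
}
```

The key point is that the writer never has to reconstruct which cache keys exist: it purges by meaning ("everything touching this participant") rather than by enumerating keys.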

Results

After deploying the incremental engine:

  • Score latency dropped from ~5 minutes (batch interval) to under 800ms for the affected pairs.

  • Database load during peak pre-event hours fell by 72%, because we no longer recompute the entire matrix on every cycle.

  • Grid responsiveness improved dramatically — organizers see score changes reflected in the UI within a second of saving a weight adjustment.

Lessons learned

The biggest lesson: don't batch what you can stream. Batch jobs are easy to reason about but they impose a latency floor that compounds as your data grows. By identifying the minimal set of affected entities from each domain event, we kept compute proportional to the change — not the dataset.

Another lesson: cache tags are underrated. Before this project we were doing key-based invalidation with constructed cache keys. Tag-based invalidation (purge everything tagged "participant:abc123") is simpler, less error-prone, and works well with the per-region cache model on Vercel.

The fastest query is the one you never make. The second fastest is the one that hits a warm cache. Everything else is engineering trade-offs.

What comes next

We're experimenting with predictive pre-scoring: when a new participant registers, we speculatively compute scores against the top 50 participants on the other side, so the first grid load after onboarding is already warm. Early results are promising — cache hit rates for first-load jumped from 12% to 78%.
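In sketch form, the speculative step might look like this. The "top 50" selection, the scorer, and the cache-warming helper are all hypothetical here:

```typescript
// On registration, speculatively score the new participant against a
// pre-ranked shortlist from the other side, so their first grid load
// hits a warm cache.
async function prescoreNewParticipant(
  newId: string,
  topOtherSide: string[], // e.g. top 50 on the other side, by some ranking
  score: (a: string, b: string) => Promise<number>,
  warmCache: (key: string, total: number) => void,
): Promise<void> {
  const totals = await Promise.all(topOtherSide.map((id) => score(newId, id)))
  topOtherSide.forEach((id, i) => warmCache(`${newId}:${id}`, totals[i]))
}
```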

We'll write more about the predictive scoring approach and our grid WebSocket layer in future posts. If you're interested in event matchmaking infrastructure, we're hiring.

Tags

  • architecture
  • matchmaking
  • performance
  • caching
  • real-time