HeadlinesBriefing.com

Building a Scalable Multimodal Recommender on Amazon EKS

Towards Data Science

A developer has described a multistage, multimodal recommender system running on Amazon EKS. The pipeline combines data ingestion, model training, Bloom-filter exclusion, feature caching, and real-time ranking. Four stages (Two-Tower candidate generation, Bloom-filter masking, DLRM ranking, and a final rerank) combine tabular collaborative-filtering data with CLIP image embeddings and Sentence-BERT text embeddings. Because the retrieval stage narrows the catalog to a short candidate list before the heavier ranker runs, the layout scales to catalogs of millions of items without scoring the full catalog on every request.
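The four-stage funnel can be illustrated with a small, self-contained sketch. It is not the author's code: the `BloomFilter` class, the `recommend` function, the brute-force dot-product retrieval, and the `rank_fn` callback are all hypothetical stand-ins for the Two-Tower model, the ANN lookup, and the DLRM ranker, and the "rerank" step is reduced to simple truncation.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def recommend(user_vec, item_vecs, seen, rank_fn, k_retrieve=4, k_final=2):
    # Stage 1: retrieval. Score items by user-item dot product and keep the
    # top k_retrieve (brute force here; an ANN index would do this at scale).
    candidates = sorted(
        item_vecs, key=lambda i: dot(user_vec, item_vecs[i]), reverse=True
    )[:k_retrieve]
    # Stage 2: masking. Drop items the Bloom filter reports as already seen.
    candidates = [i for i in candidates if i not in seen]
    # Stage 3: ranking. rank_fn stands in for a heavier ranking model.
    ranked = sorted(candidates, key=rank_fn, reverse=True)
    # Stage 4: rerank. Reduced to truncation in this sketch.
    return ranked[:k_final]
```

The expensive `rank_fn` only ever sees the handful of candidates that survive retrieval and masking, which is the property that lets a funnel like this avoid full-catalog scoring. A Bloom filter can occasionally hide an unseen item (a false positive) but never lets a seen item through, which is usually an acceptable trade for its small memory footprint.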

The architecture relies on Kubeflow to orchestrate two pipelines. The initial pipeline copies raw Parquet logs, runs NVTabular feature workflows, trains the retrieval and ranking models, builds a FAISS approximate-nearest-neighbor (ANN) index, and deploys the trained artifacts to NVIDIA Triton. A daily fine-tuning pipeline then updates the query tower and the ranker without rebuilding the item embeddings, keeping latency low.
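Why the daily job can skip rebuilding item embeddings is easiest to see in a toy model. The sketch below assumes, purely for illustration, that the query tower is a single linear map `W` and that training is one gradient-ascent step on the raw dot-product score; the real towers and loss are certainly richer. The names `fine_tune_query_tower`, `score`, and `item_index` are hypothetical.

```python
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def score(W, user_features, item_vec):
    # Two-tower score: query-tower output dotted with a frozen item embedding.
    return dot(matvec(W, user_features), item_vec)


def fine_tune_query_tower(W, interactions, item_index, lr=0.1):
    """Update only the query tower; item_index is read but never written."""
    for user_features, item_id in interactions:
        item_vec = item_index[item_id]
        # d(score)/dW[r][c] = item_vec[r] * user_features[c], so stepping
        # along it raises the score of the observed (user, item) pair.
        for r in range(len(W)):
            for c in range(len(user_features)):
                W[r][c] += lr * item_vec[r] * user_features[c]
    return W
```

Because the item vectors never change, an index built over them during the initial pipeline stays valid after each daily update; only the query side has to be redeployed.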

Feature stores play a central role. Feast supplies the offline user and item tables, while Amazon ElastiCache for Valkey hosts an online store that maintains per-user Bloom filters and popularity counters. Caching item features in memory shaves milliseconds off each lookup, which lets the system serve cold-start and context-aware recommendations in real time to both logged-in and anonymous shoppers.
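A minimal in-memory sketch shows how per-user exclusion state and popularity counters could work together to cover both logged-in and anonymous shoppers. Everything here is an assumption for illustration: `OnlineStore` and its methods are hypothetical, plain Python sets stand in for the Bloom filters, a dictionary stands in for the Valkey-backed store, and the popularity fallback for anonymous users is one plausible reading of the design rather than a confirmed detail.

```python
from collections import Counter


class OnlineStore:
    """In-memory stand-in for an online feature store (hypothetical interface)."""

    def __init__(self):
        self.popularity = Counter()  # item_id -> view count
        self.seen = {}               # user_id -> set of viewed item ids

    def record_view(self, item_id, user_id=None):
        # Every view bumps global popularity; only known users get history.
        self.popularity[item_id] += 1
        if user_id is not None:
            self.seen.setdefault(user_id, set()).add(item_id)

    def recommend(self, user_id=None, personalized=None, k=3):
        seen = self.seen.get(user_id, set())
        # Use personalized candidates when available; otherwise fall back
        # to the most popular items (the cold-start path).
        if personalized:
            candidates = personalized
        else:
            candidates = [item for item, _ in self.popularity.most_common()]
        return [item for item in candidates if item not in seen][:k]
```

Calling `recommend()` with no `user_id` returns the most-viewed items, while a logged-in user gets personalized candidates with previously viewed items filtered out.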

The author has released the full Kubeflow YAML and Triton ensemble files and invites practitioners to replicate the stack and adapt its caching and Bloom-filter techniques to their own domains.