What are Embeddings & Vector Search?
An embedding is a numeric vector that represents the meaning of text (or images, audio, code) so that semantically similar items sit close together in vector space. Vector search finds the nearest embeddings to a query, enabling search by meaning rather than keywords. Embeddings are the backbone of retrieval-augmented generation, semantic search, clustering and recommendation.
Definition
An embedding is a dense numeric vector that encodes the meaning of a piece of data, positioned so that similar items are close in vector space; vector search retrieves the nearest embeddings to a query.
Key takeaways
- Embeddings turn meaning into vectors; similar things sit close together.
- Vector search retrieves by semantic similarity, not exact keywords.
- They power RAG, semantic search, clustering and recommendation.
- Hybrid search (vector + keyword) usually beats either alone.
- Chunking and the embedding model choice drive retrieval quality.
Context
Computers compare numbers, not meaning. Embeddings bridge that gap: an embedding model maps text to a vector such that 'cancel my subscription' and 'how do I unsubscribe' land near each other, even with no shared words.
This is what makes semantic retrieval possible. Instead of matching keywords, a system embeds the query and finds the closest stored vectors. This is how RAG and modern semantic search retrieve relevant content.
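To make "close in vector space" concrete, here is a minimal sketch of cosine similarity, one common closeness measure. The three-dimensional vectors are hand-made stand-ins for real model output (an assumption for illustration; real embeddings have hundreds or thousands of dimensions), and the variable names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Hand-made vectors standing in for model output (illustrative only).
cancel_subscription = [0.9, 0.1, 0.0]
how_to_unsubscribe = [0.8, 0.2, 0.1]
reset_password = [0.1, 0.9, 0.2]

print(cosine_similarity(cancel_subscription, how_to_unsubscribe))  # high
print(cosine_similarity(cancel_subscription, reset_password))      # much lower
```

With vectors like these, the two unsubscribe phrasings score far higher against each other than either does against the unrelated password request, which is the property a real embedding model is trained to produce.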
Architecture
Indexing: content is split into chunks, each passed through an embedding model to produce a vector, then stored in a vector index. Querying: the query is embedded with the same model and the index returns the nearest vectors by a similarity metric (e.g. cosine similarity).
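The indexing and querying flow can be sketched in a few lines. This is a toy under stated assumptions: `toy_embed` is a hypothetical stand-in for a real embedding model (a hashed bag of words, so it captures word overlap rather than meaning), and `VectorIndex` is an illustrative brute-force index, not the API of any particular vector database.

```python
import math
import re
import zlib

DIM = 256  # toy dimensionality, chosen arbitrarily for this sketch


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class VectorIndex:
    """Brute-force in-memory index: stores (chunk, vector) pairs and scans them all."""

    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def add(self, chunk: str) -> None:
        # Indexing: embed the chunk and store it alongside its vector.
        self.items.append((chunk, toy_embed(chunk)))

    def search(self, query: str, k: int = 3) -> list[tuple[float, str]]:
        # Querying: embed the query with the same function, then rank by
        # dot product (equal to cosine similarity for unit-length vectors).
        q = toy_embed(query)
        scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


chunks = [
    "To cancel your subscription, open billing settings.",
    "Reset your password from the login page.",
    "Invoices are emailed at the start of each month.",
]
index = VectorIndex()
for chunk in chunks:
    index.add(chunk)

results = index.search("cancel subscription", k=2)
for score, chunk in results:
    print(round(score, 3), chunk)
```

A real system would swap `toy_embed` for an actual model and the linear scan for an approximate nearest-neighbor index, but the two phases keep the same shape.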
Production systems add a re-ranker to refine the top results, combine vector search with keyword search (hybrid), and filter by metadata and permissions. Quality depends heavily on chunking and the embedding model.
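One way to combine vector and keyword results is reciprocal rank fusion (RRF), shown below as a sketch. RRF is a technique chosen here for illustration; the text above does not say which fusion method a given system uses. The document IDs, the `team` metadata field, and the choice to filter after fusion are all assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: each document earns 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def filter_by_team(doc_ids: list[str], metadata: dict[str, dict], allowed: set[str]) -> list[str]:
    """Keep only documents whose (hypothetical) 'team' field is permitted."""
    return [d for d in doc_ids if metadata[d]["team"] in allowed]


# Hypothetical result lists from the two retrievers, best match first.
vector_ranking = ["doc_a", "doc_b", "doc_c"]
keyword_ranking = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
fused_ids = [doc_id for doc_id, _ in fused]

metadata = {
    "doc_a": {"team": "support"},
    "doc_b": {"team": "support"},
    "doc_c": {"team": "finance"},
    "doc_d": {"team": "support"},
}
visible = filter_by_team(fused_ids, metadata, allowed={"support"})
print(fused_ids)
print(visible)
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still survives. In practice, permission filters are often applied inside the index query rather than afterwards, so that filtered-out items do not consume result slots.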
Benefits
- Search by meaning, robust to wording.
- Cross-lingual and multimodal matching.
- The backbone of RAG and semantic search.
- Cheap to query at scale once indexed.
Risks
- Poor chunking degrades every downstream result.
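- Embedding model mismatch hurts relevance: queries and documents must be embedded with the same model, and a model ill-suited to the domain retrieves poorly.
- Vectors can leak sensitive information from the source text; secure the vector store.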
- Pure vector search can miss exact-match needs (use hybrid).
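Since chunking sits upstream of everything else, here is a minimal sketch of one common approach: fixed-size word windows with overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. The function name and the default sizes are assumptions for illustration; real pipelines often split on headings, sentences or tokens instead.

```python
def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of up to chunk_size words, each sharing
    `overlap` words with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Chunks that are too small lose context and chunks that are too large dilute the embedding, so the sizes are worth tuning against real queries.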
Examples
- Semantic search over a help center that matches intent, not keywords.
- Retrieving relevant passages to ground a RAG answer.
- Clustering support tickets by topic using their embeddings.
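As a sketch of the ticket-clustering example, the snippet below groups items whose embeddings are similar. It uses a simple greedy threshold scheme rather than a standard algorithm such as k-means, purely to keep the example short. The two-dimensional vectors are hand-made stand-ins for real embeddings and the 0.8 threshold is an arbitrary assumption.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def greedy_cluster(vectors: dict[str, list[float]], threshold: float = 0.8) -> list[list[str]]:
    """Assign each item to the first cluster whose leader (its first member)
    is at least `threshold` similar; otherwise start a new cluster."""
    clusters: list[tuple[list[float], list[str]]] = []
    for item_id, vec in vectors.items():
        for leader_vec, members in clusters:
            if cosine(leader_vec, vec) >= threshold:
                members.append(item_id)
                break
        else:
            clusters.append((vec, [item_id]))
    return [members for _, members in clusters]


# Hand-made vectors standing in for ticket embeddings (illustrative only).
tickets = {
    "refund not received": [0.9, 0.1],
    "charged twice": [0.8, 0.3],
    "app crashes on login": [0.1, 0.9],
    "cannot sign in": [0.2, 0.8],
}
for group in greedy_cluster(tickets):
    print(group)
```

With these vectors the billing tickets fall into one group and the login tickets into another, even though the ticket texts share no words.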
FAQs
- How are embeddings different from keywords?
- Keyword search matches exact words; embeddings match meaning, so paraphrases and synonyms still retrieve the right content.
- What is vector search?
- Finding the stored embeddings nearest to a query embedding by a similarity metric such as cosine similarity. In short, search by semantic closeness.
- Why combine vector and keyword search?
- Vectors excel at meaning but can miss exact terms (codes, names). Hybrid search blends both, which usually gives better recall and precision than either alone.
- Do embeddings power RAG?
- Yes. RAG embeds documents and queries, retrieves the nearest chunks by vector search, and grounds the model's answer in them.
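The grounding step in that last answer can be sketched as simple prompt assembly: the retrieved chunks are placed in the prompt ahead of the question. The function name and the prompt wording are assumptions for illustration, and the call to a language model is deliberately left out.

```python
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the retrieved chunks only."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the numbered context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Chunks that a vector search is assumed to have returned for this question.
retrieved = [
    "To cancel your subscription, open billing settings.",
    "Cancellations take effect at the end of the billing period.",
]
prompt = build_grounded_prompt("How do I unsubscribe?", retrieved)
print(prompt)
```

Numbering the chunks makes it possible to ask the model to cite which passage supports each part of its answer.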