Arcana: Embeddable RAG for Elixir/Phoenix

Everything I've learned building RAG systems, packaged into one Elixir library. Hybrid search, agentic pipelines, GraphRAG, all running on your existing Ecto Repo and pgvector.

I've built RAG systems more than once now. Every time I end up reimplementing the same decisions: how to chunk documents, which embedding model to use, semantic vs full-text vs hybrid search, when to rerank, how to handle multi-part questions, whether you need a knowledge graph. The research is scattered across papers, blog posts, and Twitter threads, and teams keep rediscovering the same lessons.

Arcana is everything I learned packaged into one Elixir library. It's opinionated about defaults but swappable where it matters. You add it to your Phoenix app and get a solid RAG system out of the box, without needing to become a retrieval expert.

Three functions to start

Arcana's core API is three functions: ingest, search, and ask.

# Ingest a document
{:ok, document} = Arcana.ingest("Your document content here", repo: MyApp.Repo)

# Search for relevant chunks
{:ok, results} = Arcana.search("your query", repo: MyApp.Repo)

# Ask a question and get an answer
{:ok, answer} = Arcana.ask("What is Elixir?",
  repo: MyApp.Repo,
  llm: "anthropic:claude-sonnet-4-20250514"
)

It uses your existing Ecto Repo. No separate vector database, no external service to manage. Just pgvector running in the Postgres you already have.

What's baked in

Behind those three functions, Arcana makes these decisions for you.

Chunking uses overlapping token-based windows (450 tokens, 50 overlap) that preserve context boundaries, though you can swap in semantic chunking. Embeddings run locally via Bumblebee by default, so no API keys needed, with E5 prefix handling and OpenAI as an option. Search combines semantic similarity with full-text via Reciprocal Rank Fusion, which in practice outperforms either method alone. Then there's an agentic pipeline for complex questions (query expansion, decomposition, multi-hop reasoning, reranking) and GraphRAG for when entity relationships matter.
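The default chunking strategy is easy to picture in code. Here's a simplified sketch of overlapping token windows, not Arcana's implementation: it uses whitespace-separated words as a stand-in for real tokens, with the window and overlap sizes matching the defaults above (450 and 50).

```elixir
defmodule ChunkSketch do
  # Simplified sketch of overlapping token-window chunking.
  # Assumptions: whitespace "tokens" stand in for a real tokenizer;
  # chunk_size/overlap mirror the 450/50 defaults described above.
  def chunk(text, chunk_size \\ 450, overlap \\ 50) do
    step = chunk_size - overlap

    text
    |> String.split()
    # leftover [] keeps a shorter final chunk instead of discarding it
    |> Enum.chunk_every(chunk_size, step, [])
    |> Enum.map(&Enum.join(&1, " "))
  end
end
```

Because each window starts `chunk_size - overlap` tokens after the previous one, the last 50 tokens of one chunk reappear at the start of the next, so sentences that straddle a boundary still end up whole in at least one chunk.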

Every component is backed by a behaviour, so you can swap anything when the defaults don't fit.

Hybrid search with Reciprocal Rank Fusion

Pure vector search works well for semantic queries, but sometimes you need exact keyword matching too. Arcana's hybrid mode combines both:

{:ok, results} = Arcana.search("query",
  repo: MyApp.Repo,
  mode: :hybrid,
  semantic_weight: 0.7,
  fulltext_weight: 0.3
)

RRF merges ranked results from both search methods without needing to normalize scores across different ranking systems. It just works.
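The fusion itself is small enough to show. This is a sketch of the standard RRF formula, where each list contributes 1/(k + rank) per document and k (conventionally 60) damps the influence of top ranks. It's illustrative of the algorithm, not Arcana's internals, and it omits the per-list weights shown above.

```elixir
defmodule RRFSketch do
  # Standard Reciprocal Rank Fusion over any number of ranked ID lists.
  # Each appearance of an ID contributes 1 / (k + rank); scores are
  # summed across lists, so IDs ranked well in both lists rise to the top.
  def fuse(ranked_lists, k \\ 60) do
    ranked_lists
    |> Enum.flat_map(fn list ->
      list
      |> Enum.with_index(1)
      |> Enum.map(fn {id, rank} -> {id, 1 / (k + rank)} end)
    end)
    |> Enum.group_by(&elem(&1, 0), &elem(&1, 1))
    |> Enum.map(fn {id, scores} -> {id, Enum.sum(scores)} end)
    |> Enum.sort_by(fn {_id, score} -> -score end)
  end
end
```

Because only ranks matter, a cosine distance of 0.12 and a ts_rank of 0.87 never need to be put on the same scale; a document that appears in both lists simply accumulates two reciprocal-rank terms.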

Agentic RAG pipeline

Basic RAG works for simple questions, but real-world queries are messier. "What changed between Phoenix 1.7 and 1.8 regarding LiveView authentication?" needs more than a single vector search. Arcana's Agent pipeline handles this:

alias Arcana.Agent

ctx =
  Agent.new("Compare Elixir and Erlang features", repo: MyApp.Repo, llm: llm)
  |> Agent.gate()       # Skip retrieval if answerable from knowledge
  |> Agent.expand()     # Add synonyms and related terms
  |> Agent.decompose()  # Split multi-part questions
  |> Agent.search()     # Execute vector search for each sub-question
  |> Agent.reason()     # Evaluate results, search again if needed
  |> Agent.rerank()     # Score and filter chunks by relevance
  |> Agent.answer()     # Generate final answer

ctx.answer

Each step is optional. You can build a simple search-and-answer pipeline or go full agentic. The gate step decides if retrieval is even needed, expansion enriches queries with synonyms, decomposition splits multi-part questions into focused sub-queries, reasoning evaluates intermediate results and triggers follow-up searches, and reranking scores each chunk on a 0-10 scale to filter out noise.
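The shape of this pipeline, threading a context struct through optional steps, is plain Elixir. Here's a minimal hypothetical sketch of the pattern (the field names and step bodies are invented for illustration, not Arcana's Agent implementation):

```elixir
defmodule PipelineSketch do
  # Hypothetical context-threading pattern: every step takes a ctx map
  # and returns an updated one, so steps compose with |> and any step
  # can simply be left out of the chain.
  def new(question), do: %{question: question, subqueries: [], answer: nil}

  # Toy decomposition: split a multi-part question on " and ".
  def decompose(ctx) do
    %{ctx | subqueries: String.split(ctx.question, " and ")}
  end

  # Toy answer step: in a real pipeline this would call an LLM
  # with the retrieved chunks in scope.
  def answer(ctx) do
    %{ctx | answer: "Answered #{length(ctx.subqueries)} sub-questions"}
  end
end
```

Since each step only reads and writes the context, skipping `decompose/1` for a simple question is just deleting one line from the chain.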

GraphRAG

For document collections where entities and their relationships matter, Arcana supports GraphRAG. During ingestion, it extracts named entities and their relationships, then clusters them using the Leiden community detection algorithm (via Leidenfold).

# Ingest with graph building
{:ok, document} = Arcana.ingest(content, repo: MyApp.Repo, graph: true)

# Search combines vector + graph traversal with RRF
{:ok, results} = Arcana.search("Who leads OpenAI?", repo: MyApp.Repo, graph: true)

At search time, vector results and graph traversal results are fused with RRF. This works better for relationship-heavy queries, where pure vector search might miss relevant context.

Swap anything

Every component in Arcana is backed by a behaviour: embedders, chunkers, LLMs, vector stores, every step in the Agent pipeline. When you need a custom cross-encoder reranker or a domain-specific chunker, implement the behaviour and pass it in.
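The mechanism here is standard Elixir behaviours. As a hedged sketch of the pattern (the callback name and module names below are invented for illustration, not Arcana's actual behaviour definitions), a domain-specific chunker might look like this:

```elixir
# Hypothetical behaviour: a contract any chunker module must satisfy.
defmodule MyApp.Chunker do
  @callback chunk(text :: String.t(), opts :: keyword()) :: [String.t()]
end

# Domain-specific implementation: split on blank lines instead of
# token windows, useful for documents with meaningful paragraphs.
defmodule MyApp.ParagraphChunker do
  @behaviour MyApp.Chunker

  @impl true
  def chunk(text, _opts), do: String.split(text, ~r/\n{2,}/, trim: true)
end
```

With the contract in place, the library only needs a module that implements the callback; you'd pass `MyApp.ParagraphChunker` in wherever a chunker option is accepted.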

Installation

If you're using Igniter:

mix igniter.install arcana
mix ecto.migrate

The installer creates migrations, configures your repo, and optionally sets up a LiveView dashboard for managing documents and testing searches.

Get started

Check out the GitHub repo and HexDocs for the full guides on agentic RAG, GraphRAG, search algorithms, and custom components.

I'll be giving a talk about Arcana at ElixirConf EU, going deeper into the architecture and the research behind each component.

If you want to see it in action with a real corpus, check out arcana-adept, a demo Phoenix app with a Doctor Who knowledge base ready to embed and query.

