Graphify's query mechanism, evolving from keyword-based BFS to embedding-based semantic search.
Raw Developer Origin & Technical Request
GitHub Issue
Apr 6, 2026
## Problem
Current `/graphify query` is BFS keyword matching - same as grep with graph traversal. Searching "find what handles authentication" only works if the word "auth" appears in node labels.
## Goal
Replace keyword BFS with embedding-based semantic search so queries find concepts by meaning, not exact string match.
## Plan
**Embedding backend (local by default):**
- `sentence-transformers` with `all-MiniLM-L6-v2` (80MB, no API key, works offline)
- Optional: OpenAI embeddings API, nomic-embed via ollama
**What changes:**
- On graph build, embed every node label + source context, store vectors in `graph.json`
- `/graphify query` computes query embedding, ranks nodes by cosine similarity, then does BFS from top-k hits
- `semantically_similar_to` edge detection can use embeddings instead of LLM (faster, cheaper)
- Node similarity surfaced in graph visualization
**New optional dependency:**
```
pip install graphifyy[embeddings]
```
## Why this matters
This is the difference between a search tool and an understanding tool. "Find what connects the optimizer to the attention mechanism" should work even if those exact words don't appear together anywhere in the codebase.
Developer Debate & Comments
No active discussions extracted for this entry yet.
Adjacent Repository Pain Points
Other highly discussed features and pain points extracted from safishamsi/graphify.
Engagement Signals
Cross-Market Term Frequency
Quantifies the cross-market adoption of foundational terms like LLM and cosine similarity by tracking occurrence frequency across active SaaS architectures and enterprise developer debates.
SaaS Metrics