Insight for: v0.4.0: local embeddings via quantized Gemma 4 (no API cost)

Covers Graphify's semantic similarity feature, specifically the addition of local embeddings via quantized models (Gemma 4).

Analyzed: Apr 8, 2026

This issue proposes extending Graphify's semantic similarity feature with a local embedding pass that runs on a quantized model such as Gemma 4. The motivation is twofold: cut API costs, and overcome the subjectivity and sampling limits of relying on Claude's judgment for semantic linking. Because the embedding pass runs offline, it can link concepts across files exhaustively rather than selectively, which addresses a developer pain point around both cost and completeness. The result is a hybrid approach: LLM-driven links capture the 'interesting' connections, while local embeddings supply the 'exhaustive' ones. For B2B SaaS, a cost-effective, privacy-preserving (offline) alternative for core AI functionality is a strong differentiator. It broadens Graphify's appeal to organizations with strict data governance requirements and to those looking to reduce spend on external API calls, which could in turn improve its competitiveness and adoption.
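As a rough illustration of the 'exhaustive' half of that hybrid, the sketch below compares every pair of nodes by cosine similarity and emits a `semantically_similar_to` edge tagged `INFERRED` when two nodes from different files score above a threshold. It is a minimal sketch under stated assumptions, not Graphify's implementation: the toy vectors stand in for embeddings that a quantized local model would produce, and the function names, the edge field names (`source`, `target`, `relation`, `tag`, `score`), and the 0.8 threshold are all hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_edges(nodes, threshold=0.8):
    """Compare every pair of nodes and keep cross-file pairs at or above the threshold.

    `nodes` maps a node id to a (file path, embedding vector) tuple.
    The edge layout below is hypothetical, chosen only for illustration.
    """
    edges = []
    for (id_a, (file_a, vec_a)), (id_b, (file_b, vec_b)) in combinations(nodes.items(), 2):
        if file_a == file_b:
            continue  # only cross-file links are of interest here
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            edges.append({
                "source": id_a,
                "target": id_b,
                "relation": "semantically_similar_to",
                "tag": "INFERRED",
                "score": round(score, 3),
            })
    return edges


# Toy 3-dimensional vectors standing in for real embeddings of node labels or docstrings.
nodes = {
    "auth.login": ("auth.py", [1.0, 0.0, 0.0]),
    "session.sign_in": ("session.py", [0.9, 0.1, 0.0]),
    "plot.render": ("plot.py", [0.0, 0.0, 1.0]),
}

edges = similarity_edges(nodes)
for edge in edges:
    print(edge)
```

The pairwise loop is quadratic in the number of nodes, which is the price of exhaustiveness; since no API tokens are spent, that cost is local compute only.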

local embedding pass, quantized model, Gemma 4, Q4/Q8, `llama.cpp`, `ollama`, `semantically_similar_to` edges, API cost, semantic similarity edges, Claude's judgment, API tokens, AST and semantic passes, cosine-similarity edges, `INFERRED`, cross-file concept linking, offline, SHA256 file cache, node labels, docstrings, pairwise cosine similarity, `llama-cpp-python`

GitHub Issue

Parent Entity: v0.4.0: local embeddings via quantized Gemma 4 (no API cost)

State: Open