Gemini Executive Synthesis

Graphify's semantic similarity feature, specifically adding local embeddings via quantized models (Gemma 4).

Technical Positioning
An AI coding assistant skill that turns code/docs into a queryable knowledge graph.
SaaS Insight & Market Implications
This issue proposes extending Graphify's semantic similarity feature with local embeddings from a quantized model such as Gemma 4. The motivation is twofold: reduce API costs, and stop relying solely on Claude's judgment, which is subjective and limited to one pass per file. Offline, exhaustive cross-file concept linking addresses developer concerns about both cost and completeness. The resulting hybrid combines LLM-driven 'interesting' links with local 'exhaustive' links in the same graph.

For B2B SaaS, a cost-effective, privacy-preserving (offline) alternative for core AI functionality is a strong differentiator. It broadens Graphify's appeal to organizations with strict data governance requirements and to those looking to cut the operational expense of external API calls.
Proprietary Technical Taxonomy
local embedding pass, quantized model, Gemma 4, Q4/Q8, `llama.cpp`, `ollama`, `semantically_similar_to` edges, API cost

Raw Developer Origin & Technical Request

GitHub Issue, Apr 6, 2026
Repo: safishamsi/graphify
v0.4.0: local embeddings via quantized Gemma 4 (no API cost)

## Summary

Add an optional local embedding pass using a quantized model — leading candidate is **Gemma 4** (Q4/Q8 via `llama.cpp` or `ollama`) — to generate `semantically_similar_to` edges across all nodes without any API calls.

## Motivation

Currently, semantic similarity edges come from Claude's judgment during extraction — one pass per file, subjective, and costs API tokens. A local embedding pass would:

- Generate embeddings for every node (label + docstring) after the AST and semantic passes
- Add cosine-similarity edges above a configurable threshold, marked `INFERRED`
- Make cross-file concept linking exhaustive rather than sampled
- Work fully offline, cached per-node alongside the existing SHA256 file cache
- Cost zero API tokens after the initial model download
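
The per-node caching bullet above could look roughly like the sketch below. It is only an illustration: the key scheme (SHA256 over label plus docstring), the JSON file layout, and the names `node_cache_key`, `cached_embedding`, and `embed` are assumptions, not part of Graphify's existing cache.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Sequence


def node_cache_key(label: str, docstring: str) -> str:
    """Hash the exact text that would be embedded (hypothetical key scheme)."""
    # The NUL separator keeps ("ab", "") and ("a", "b") from colliding.
    payload = label + "\x00" + docstring
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_embedding(
    label: str,
    docstring: str,
    embed: Callable[[str], Sequence[float]],
    cache_dir: Path,
) -> list:
    """Return the node's embedding, calling `embed` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / (node_cache_key(label, docstring) + ".json")
    if path.exists():
        # Cache hit: the node text is unchanged, so reuse the stored vector.
        return json.loads(path.read_text(encoding="utf-8"))
    vector = [float(x) for x in embed(label + "\n" + docstring)]
    path.write_text(json.dumps(vector), encoding="utf-8")
    return vector
```

Keying on the node text rather than the file hash means that editing one function would invalidate only that node's embedding, not every node in the file.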

The two approaches complement rather than replace each other — Claude finds the *interesting* cross-cutting edges, local embeddings find the *exhaustive* ones. Both end up in the same graph.

## Design

**Model**: Gemma 4 Q4 or Q8 via `llama.cpp` or `ollama`. Produces strong semantic embeddings for code + text at ~2-4GB RAM, no GPU required.

**Pipeline position**: after Part C (build + cluster), before export. Reads all node labels + docstrings, generates embeddings in batch, computes pairwise cosine similarity, adds edges above threshold.
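
A minimal sketch of the pairwise step described above, in pure Python so it runs without extra dependencies. The edge dictionary fields (`source`, `target`, `relation`, `confidence`, `score`) and the function names are assumptions for illustration; only the `semantically_similar_to` relation, the `INFERRED` marker, and the threshold idea come from the issue.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_edges(embeddings, threshold=0.82):
    """Compare every unordered node pair and keep those above the threshold.

    `embeddings` maps a node id to its embedding vector.
    """
    edges = []
    for a, b in combinations(sorted(embeddings), 2):
        score = cosine(embeddings[a], embeddings[b])
        if score > threshold:
            edges.append({
                "source": a,
                "target": b,
                "relation": "semantically_similar_to",
                "confidence": "INFERRED",  # hypothetical field name
                "score": score,
            })
    return edges
```

Comparing every pair costs n(n-1)/2 similarity computations for n nodes, so a real implementation on a large graph would likely vectorize this or use an approximate nearest-neighbour index.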

**Threshold**: configurable, default ~0.82. Exposed as `--embed-threshold 0.82`.
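
One way the flag could be wired up with `argparse` is sketched below. The `--embed-threshold` name and the 0.82 default come from the issue; the `graphify` program name, the `cosine_threshold` validator, and the range check are assumptions.

```python
import argparse


def cosine_threshold(text: str) -> float:
    """Parse a threshold and reject values outside the cosine range [-1, 1]."""
    value = float(text)  # a ValueError here is reported by argparse as a usage error
    if not -1.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("threshold must be between -1 and 1")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="graphify")  # hypothetical program name
    parser.add_argument(
        "--embed-threshold",
        type=cosine_threshold,
        default=0.82,
        metavar="FLOAT",
        help="minimum cosine similarity for a semantically_similar_to edge",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.embed_threshold)
```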

**Backend**: support both `llama-cpp-python` and `ollama` client, auto-detect which...
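
The issue text is cut off at this point, so the detection logic is not known. The sketch below shows one possible approach: check which Python package is importable. The preference order (`llama-cpp-python` first) and the `detect_backend` name are assumptions; `llama_cpp` and `ollama` are the import names of the two packages.

```python
import importlib.util

# (distribution name, importable module name); the order is an assumed preference.
BACKEND_MODULES = (
    ("llama-cpp-python", "llama_cpp"),
    ("ollama", "ollama"),
)


def detect_backend(candidates=BACKEND_MODULES):
    """Return the name of the first installed backend, or None if neither is found."""
    for name, module in candidates:
        # find_spec locates the module without importing it.
        if importlib.util.find_spec(module) is not None:
            return name
    return None
```

An importable `ollama` client does not guarantee that a local Ollama server is running, so a fuller check would probably also probe the server before choosing that backend.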

Developer Debate & Comments

No active discussions extracted for this entry yet.

Adjacent Repository Pain Points

Other highly discussed features and pain points extracted from safishamsi/graphify.

- Graphify's query mechanism, evolving from keyword-based BFS to embedding-based semantic search.
- Graphify's worked examples and their completeness, specifically the `graph.html` output.
- Graphify's user onboarding and visualization of its output.
- Security vulnerabilities in Graphify's `_fetch_tweet` function (SSRF) and Neo4j Cypher export (injection).
- Graphify's language support expansion to include COBOL.

Engagement Signals

Replies: 0
Issue Status: open

Cross-Market Term Frequency

Quantifies the cross-market adoption of foundational terms like offline and Gemma 4 by tracking occurrence frequency across active SaaS architectures and enterprise developer debates.