Show HN: We fingerprinted 178 AI models' writing styles and similarity clusters

Rating: 4.5 (11 reviews)

Presents quantitative findings on AI model stylistic characteristics, cost-efficiency comparisons, and prompt-induced convergence/divergence, using a 32-dimension stylometric fingerprint.

Traction Score

52

Discussions

11

Launch Date

Apr 8, 2026

Product Positioning & Context

AI Executive Synthesis


This analysis quantifies how AI models differ and converge in writing style. The 'clone clusters' with high cosine similarity suggest that some models lack a distinct voice and may be commoditized. The finding that Gemini 2.5 Flash Lite writes 78% like Claude 3 Opus at 185x lower cost points to possible savings, but only for businesses that prioritize stylistic similarity over other model attributes. Meta's 'strongest provider house style' indicates consistent provider-level style, which could be a differentiator. The prompts that cause the most convergence ('satirical fake news') and the most divergence ('count letters') are useful data points for prompt engineering and model evaluation. The research is relevant to model selection, cost management, and understanding the stylistic biases of various LLMs, especially for applications that require a specific tone or need to avoid detection.

We have a dataset of 3,095 standardized AI responses across 43 prompts. From each response, we extract a 32-dimension stylometric fingerprint (lexical richness, sentence structure, punctuation habits, formatting patterns, discourse markers).

Some findings:
- 9 clone clusters (>90% cosine similarity on z-normalized feature vectors)
- Mistral Large 2 and Large 3 2512 score 84.8% on a composite metric combining 5 independent signals
- Gemini 2.5 Flash Lite writes 78% like Claude 3 Opus. Costs 185x less
- Meta has the strongest provider "house style" (37.5x distinctiveness ratio)
- "Satirical fake news" is the prompt that causes the most writing convergence across all models
- "Count letters" causes the most divergence

The composite clone score combines: prompt-controlled head-to-head similarity, per-feature Pearson correlation across challenges, response length correlation, cross-prompt consistency, and aggregate cosine similarity.

Tech: stylometric extraction in Node.js, z-score normalization, cosine similarity for aggregate, Pearson correlation for per-feature tracking. Analysis script is ~1400 lines.
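The similarity steps named in the post (z-score normalization, cosine similarity, Pearson correlation, a >90% clone threshold) can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis script: the function names, the three-feature toy vectors, and the choice of population standard deviation are all assumptions, and the five-signal composite score is not reproduced because its weighting is not given.

```javascript
// Sketch only: names and toy data are hypothetical, not taken from the original script.

// Column-wise z-score: each feature ends up with mean 0 and std 1 across models.
function zNormalize(rows) {
  const dims = rows[0].length;
  const means = Array(dims).fill(0);
  const stds = Array(dims).fill(0);
  for (let d = 0; d < dims; d++) {
    for (const r of rows) means[d] += r[d] / rows.length;
    for (const r of rows) stds[d] += (r[d] - means[d]) ** 2 / rows.length;
    stds[d] = Math.sqrt(stds[d]); // population std (an assumption)
  }
  // A constant feature carries no signal, so map it to 0 instead of dividing by zero.
  return rows.map((r) => r.map((v, d) => (stds[d] === 0 ? 0 : (v - means[d]) / stds[d])));
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Pearson correlation, e.g. one feature tracked across prompts for two models.
function pearson(x, y) {
  const n = x.length;
  const mx = x.reduce((s, v) => s + v, 0) / n;
  const my = y.reduce((s, v) => s + v, 0) / n;
  let cov = 0, vx = 0, vy = 0;
  for (let i = 0; i < n; i++) {
    cov += (x[i] - mx) * (y[i] - my);
    vx += (x[i] - mx) ** 2;
    vy += (y[i] - my) ** 2;
  }
  return vx === 0 || vy === 0 ? 0 : cov / Math.sqrt(vx * vy);
}

// Report every model pair whose z-normalized vectors exceed the similarity threshold.
function clonePairs(names, vectors, threshold = 0.9) {
  const z = zNormalize(vectors);
  const pairs = [];
  for (let i = 0; i < z.length; i++) {
    for (let j = i + 1; j < z.length; j++) {
      const similarity = cosineSimilarity(z[i], z[j]);
      if (similarity > threshold) pairs.push({ a: names[i], b: names[j], similarity });
    }
  }
  return pairs;
}

// Toy usage with made-up 3-feature fingerprints (the real ones have 32 dimensions).
const names = ['A', 'B', 'C', 'D'];
const vectors = [[1, 2, 3], [1, 2, 3], [3, 1, 0], [0, 5, 1]];
console.log(clonePairs(names, vectors));
```

Normalizing before comparing matters here: without it, a feature measured on a large scale (such as response length) would dominate the cosine score and hide differences in small-scale features such as punctuation rates.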


Deep-Dive FAQs

What is We fingerprinted 178 AI models' writing styles and similarity clusters?

We fingerprinted 178 AI models' writing styles and similarity clusters is summarized by our AI as follows: it presents quantitative findings on AI model stylistic characteristics, cost-efficiency comparisons, and prompt-induced convergence/divergence, using a 32-dimension stylometric fingerprint. It focuses on quantitative insights into AI model stylistic differentiation and convergence, identifying 'clone clusters' with high cosine...

Where did We fingerprinted 178 AI models' writing styles and similarity clusters originate?

Data for We fingerprinted 178 AI models' writing styles and similarity clusters was aggregated directly from the Hacker News community ecosystem, representing raw developer and early-adopter sentiment.

When was We fingerprinted 178 AI models' writing styles and similarity clusters publicly launched?

The initial public indexing or launch date for We fingerprinted 178 AI models' writing styles and similarity clusters within our tracked developer communities was recorded on April 8, 2026.

How popular is We fingerprinted 178 AI models' writing styles and similarity clusters?

We fingerprinted 178 AI models' writing styles and similarity clusters has a traction score of over 52 and 11 recorded discussions or engagements.

Which technical categories define We fingerprinted 178 AI models' writing styles and similarity clusters?

Based on metadata extraction, We fingerprinted 178 AI models' writing styles and similarity clusters is categorized under topics such as: stylometric fingerprint, lexical richness, sentence structure, punctuation habits.

What are some commercial alternatives to We fingerprinted 178 AI models' writing styles and similarity clusters?

Our semantic intelligence engine identifies potential commercial alternatives in the SaaS space, such as PayCan, which offers overlapping value propositions.

How does the creator describe We fingerprinted 178 AI models' writing styles and similarity clusters?

The original author or development team describes the product as follows: "We have a dataset of 3,095 standardized AI responses across 43 prompts. From each response, we extract a 32-dimension stylometric fingerprint (lexical richness, sentence structure, punctuation habi..."
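To make the quoted feature families concrete, here is a rough sketch of how a few such measurements could be computed from a single response in Node.js. The five features, their names, and the discourse-marker word list are hypothetical illustrations; the post does not list the actual 32 dimensions.

```javascript
// Sketch only: these five features and the marker list are assumptions,
// not the 32 dimensions used in the original project.
const DISCOURSE_MARKERS = ['however', 'moreover', 'therefore', 'furthermore'];

function extractFeatures(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  const sentences = text.split(/[.!?]+/).map((s) => s.trim()).filter(Boolean);
  const lines = text.split('\n');
  const wordCount = words.length || 1; // guard against division by zero on empty input
  const commas = (text.match(/,/g) || []).length;
  const markers = words.filter((w) => DISCOURSE_MARKERS.includes(w)).length;
  return {
    // Lexical richness: share of distinct words among all words.
    typeTokenRatio: new Set(words).size / wordCount,
    // Sentence structure: average number of words per sentence.
    meanSentenceLength: words.length / (sentences.length || 1),
    // Punctuation habits: commas per 100 words.
    commasPer100Words: (100 * commas) / wordCount,
    // Formatting patterns: share of lines that start with a bullet marker.
    bulletLineShare: lines.filter((l) => /^\s*[-*]\s/.test(l)).length / lines.length,
    // Discourse markers: listed connectives per 100 words.
    discourseMarkersPer100Words: (100 * markers) / wordCount,
  };
}

const sample = 'The cat sat. The cat ran, however.';
console.log(extractFeatures(sample));
```

Expressing counts as rates (per word, per line, per 100 words) keeps the features comparable between short and long responses, which is one plausible reason response length is tracked as a separate signal.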

Community Voice & Feedback

docheinestages • Apr 8, 2026

The muted colors on a dark background make everything hard to read.

apercu • Apr 8, 2026

Has anyone else used LLMs to fact check other LLMs?

I hate to say it, but Gemini lies less frequently than paid models from OpenAI and Anthropic (OpenAI is worst in my use cases).

My guess is that Google has better training data (and uses less synthetic data, which might be creating training feedback loops in other models), has more of a "be calibrated" model than a "be helpful" model, but it could just be that they lean on RAG more than on weights.

But I really shouldn't speculate about the "why" as I'm out of my domain. Just curious if others use all the models they can and compare outputs as much as I do.

qaid • Apr 8, 2026

Ugh. Subheadings were a major turn-off.

I expected it to be an analysis of AI-generated writing styles. Not full of them. ;)

kurthr • Apr 8, 2026

It would be shocking to me if the large model trainers didn't have tools like this to analyze their outputs, but this is interesting work!

You can see who likely (post)trained/distilled their models or borrowed parameters from each other. I do wonder if the 32 dimensions were chosen/named from principal components or pre-selected and designed, but the tool seems like an effective discriminator in any case.

Were the prompts similarly selected for orthogonality? I've wondered how the different LLMs would respond from iterative zero-shot prompt_n generation by summary from a previous response_n to generate zero-shot response_n+1. Would it statistically converge to a more distinguishable prompt for that LLM?

redox99 • Apr 8, 2026

Besides the claim that Opus and Gemini Flash share 99% of their style being suspicious, the point that you are wasting money on the expensive model is nonsensical. You pay primarily for the intelligence, not the writing style.

Is this article AI slop?

leonidasv • Apr 8, 2026

I've always wondered if the "typical" AI writing style is just an unavoidable RL artifact or a deliberate fingerprint to prevent model collapse as low-effort AI-generated text floods the training data pool (the web).

jefftk • Apr 8, 2026

> Models with >75% writing similarity but massive price gaps. The cheap model writes the same way. You are paying for the brand.
> ...
> Gemini 2.5 Flash Lite Preview 06-17 and Claude 3 Opus: 78.2%

As someone who has tried to use many of these models for writing assistance, you're very wrong here. It really matters whether the model can get what I'm trying to communicate well enough to be helpful, or else I'll just write it myself. If you actually play with them a bit it's very clear these models are not substitutes. This goes for many on your list!

Discovery Source

Hacker News

Aggregated via automated community intelligence tracking.

Tech Stack Dependencies

No direct open-source NPM package mentions detected in the product documentation.

Media Tractions & Mentions

No mainstream media stories specifically mentioning this product name have been intercepted yet.

Deep Research & Science

No direct peer-reviewed scientific literature matched with this product's architecture.