Gemini Executive Synthesis

CriteriaBot – A Universal Customizable Classifier API and service for nuanced, subjective content classification.

Technical Positioning

A solution for subjective classification problems beyond typical ML use-cases, leveraging a consensus of small, open-weight LLMs, dynamically modified by user feedback, and augmented with external knowledge bases like Wikipedia and Wolfram.
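As a rough illustration of a consensus that is "dynamically modified by user feedback", the sketch below weights each model's vote by how often the user has agreed with that model in the past. The function name, the model names, the neutral default weight of 0.5, and the weighting rule itself are assumptions made for illustration; the source does not describe how CriteriaBot actually applies user history.

```python
from typing import Dict


def consensus_verdict(votes: Dict[str, bool],
                      user_agreement: Dict[str, float],
                      threshold: float = 0.5) -> bool:
    """Weighted vote over a pool of models.

    Each model's vote counts in proportion to the user's past agreement
    rate with that model (hypothetical rule, not CriteriaBot's own).
    """
    total = 0.0
    yes = 0.0
    for model, vote in votes.items():
        # Assumption: a model with no history gets a neutral weight of 0.5.
        weight = user_agreement.get(model, 0.5)
        total += weight
        if vote:
            yes += weight
    if total == 0.0:
        return False  # no usable votes, default to "criteria not met"
    return yes / total > threshold


# Hypothetical pool: one model says True, two say False.
votes = {"model_a": True, "model_b": False, "model_c": False}
history = {"model_a": 0.9, "model_b": 0.3, "model_c": 0.4}

print(consensus_verdict(votes, history))  # True: the trusted model outweighs the other two
print(consensus_verdict(votes, {}))       # False: with equal weights the majority wins
```

With equal weights this reduces to a plain majority vote, so the user history only changes the outcome when the user has a clear record of siding with a minority model.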

SaaS Insight & Market Implications

CriteriaBot addresses a pain point in content moderation and data classification: handling nuanced, subjective criteria where traditional ML often fails. Its architecture pairs a pool of small, open-weight LLMs with two factorization machines: one selects a sub-pool of models before the vote, and the other combines the votes into a single verdict, which is then adjusted by the user's history of agreement with the models. Augmentation with Wikipedia and Wolfram supports cases that need current information or mathematical grounding. The reported findings are useful for anyone designing LLM ensembles: Gemma 4 26B scored only about 1 percentage point below Opus 4.8 on the same harness and sample set, and LFM2 24B contributed the most to the consensus despite being the weakest model individually, apparently because it correlates least with the others. The service suits businesses that need customizable, adaptable content classification where human-like judgment matters. The author's note about the legal obligations of handling user-submitted images, which led to image support being disabled for other users, points to the operational complexity of running such a service.
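The point about a weak but uncorrelated model can be illustrated with a toy example. The sketch below assumes "oracle" means an ideal selector that is right whenever at least one model in the pool is right; the source does not define the term. All labels and predictions are made-up illustrative data, not CriteriaBot results.

```python
from typing import Dict, List


def accuracy(preds: List[bool], labels: List[bool]) -> float:
    """Fraction of items a single model gets right."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def oracle_accuracy(pool: Dict[str, List[bool]], labels: List[bool]) -> float:
    """Fraction of items where at least one model in the pool is right."""
    hits = 0
    for i, y in enumerate(labels):
        if any(preds[i] == y for preds in pool.values()):
            hits += 1
    return hits / len(labels)


def agreement(a: List[bool], b: List[bool]) -> float:
    """Fraction of items on which two models give the same answer."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


T, F = True, False
labels   = [T, T, F, F, T, F, T, F]
strong_1 = [T, T, F, F, T, F, F, T]  # wrong on the last two items
strong_2 = [T, T, F, F, T, F, F, T]  # makes exactly the same mistakes
weak     = [F, T, T, F, F, F, T, F]  # worse overall, but right on the last two

print(accuracy(strong_1, labels))   # 0.75
print(accuracy(weak, labels))       # 0.625
print(agreement(strong_1, weak))    # 0.375
print(oracle_accuracy({"s1": strong_1, "s2": strong_2}, labels))             # 0.75
print(oracle_accuracy({"s1": strong_1, "s2": strong_2, "w": weak}, labels))  # 1.0
```

Adding a second strong model that repeats the first one's errors leaves the ceiling unchanged, while the weaker model raises it because its errors fall on different items.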


Raw Developer Origin & Technical Request

Hacker News Jun 15, 2026

Show HN: CriteriaBot – A Universal Customizable Classifier

I needed a classifier for nuanced, subjective buckets that fell outside of typical ML use-cases (e.g., "is this a spoiler?", "is this factually correct?", "is this user being mean?"). I ended up really happy with the architecture I built to solve it, so I rolled it out as a standalone API and service called CriteriaBot.

WHAT IT DOES:

You give it content and plain-English criteria. It gives you a true/false verdict on whether the content meets those criteria.

HOW IT WORKS:

In addition to a traditional classifier, the classification request is routed through a pool of small, open-weight LLMs to achieve a consensus verdict.

I built a pre-vote factorization machine that selects a sub-pool of LLMs optimized for signal strength based on the embedding of the subject/category. A second factorization machine then reads the votes and the embedding to arrive at a single verdict. That verdict is dynamically modified based on the user's history of agreement/disagreement with the models in semantically similar evaluations.

The models are also hooked up to Wikipedia and Wolfram to support edge cases requiring current information or mathematical grounding.

FINDINGS:

* With the same harness and sample set, Gemma 4 26B's accuracy is only ~1 percentage point below Opus 4.8.
* Pure oracle is theoretically very good - currently ~98% accuracy for the datasets. I'm using the second factorization machine as a combiner as it can theoretically push past oracle results, but it's an interesting fallback.
* The single most useful LLM surprised me - LFM2 24B contributes the most to the consensus, despite being the worst individually (of the current pool of LLMs). It correlates the least with the other models (perhaps due to its unique architecture?) which makes it a useful signal for some of the problems.
* The legal obligations of handling user-submitted images are... involved. I've disabled image support for non-me users while I sort that out (in case you were hoping to try out "Hotdog, Not Hotdog").
* Rails singularizes "criteria" as "criterium" and I didn't realize that was incorrect until it was kind of a lot of work to fix.

WHY I'M POSTING:

I’d been dealing with burnout for a while, and getting this running has been incredibly rewarding. The majority of people in my personal life are non-technical so it's been hard to get reactions to it beyond "what is it?".

Would be thrilled with whatever honest feedback you have.
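For readers unfamiliar with factorization machines, here is a minimal sketch of how a second-order factorization machine could combine model votes and a subject embedding into one verdict. The scoring formula is the standard one; the feature encoding (votes as +1/-1 followed by the embedding), the zero cutoff, and all parameter values are assumptions for illustration, and in practice the parameters would be learned from labeled data. This is not the post author's implementation.

```python
from typing import List


def fm_score(x: List[float], w0: float, w: List[float],
             v: List[List[float]]) -> float:
    """Second-order factorization machine score:

        w0 + sum_i w[i]*x[i] + sum_{i<j} <v[i], v[j]> * x[i] * x[j]

    The pairwise term uses the usual O(n*k) identity:
        0.5 * sum_f ((sum_i v[i][f]*x[i])**2 - sum_i (v[i][f]*x[i])**2)
    """
    n = len(x)
    k = len(v[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_verdict(votes: List[bool], embedding: List[float], w0: float,
               w: List[float], v: List[List[float]]) -> bool:
    """Encode votes as +1/-1, append the embedding, and cut the score at 0."""
    x = [1.0 if vote else -1.0 for vote in votes] + list(embedding)
    return fm_score(x, w0, w, v) > 0.0


# Hand-picked, hypothetical parameters: two model votes plus a 1-d embedding.
w0 = 0.1
w = [0.2, 0.3, -0.4]
v = [[0.5, 0.1], [0.2, -0.3], [0.4, 0.6]]

print(fm_verdict([True, False], [0.5], w0, w, v))
```

The pairwise terms are what let such a combiner learn interactions, for example that a particular model's vote is more reliable for some regions of the embedding space than for others, which a fixed majority vote cannot express.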



Frequently Asked Questions

Market intelligence mapped to CriteriaBot – A Universal Customizable Classifier API and service for nuanced, subjective content classification.

How is CriteriaBot positioned in the market?

Based on our AI analysis of the original developer request, its primary technical positioning is: A solution for subjective classification problems beyond typical ML use-cases, leveraging a consensus of small, open-weight LLMs, dynamically modified by user feedback, and augmented with external knowledge bases like Wikipedia and Wolfram.

What architecture is tied to CriteriaBot?

Our proprietary extraction maps CriteriaBot to adjacent architectural concepts including: classifier; nuanced, subjective buckets; ML use-cases; standalone API and service.

