Gemini Executive Synthesis

Convexly – a decision-tracking and calibration platform.

Technical Positioning

A tool to improve decision-making by tracking probability estimates, resolving outcomes, and generating calibration curves to identify over/underconfidence, starting with a 2-minute quiz.

SaaS Insight & Market Implications

Convexly targets decision-making in both personal and professional contexts. By providing a structured method for tracking probability estimates and generating calibration curves, it offers a data-driven way to identify and correct biases such as overconfidence. The developer's own examples ('hiring, firing, choosing equipment') are core business functions. According to the developer, the math uses Beta-PERT distributions for payoff modeling, the Kelly criterion for position sizing, and signal detection theory for separating skill from randomness. The product may have B2B potential as an executive coaching tool, a component of project management, or a training platform for improving organizational forecasting and risk assessment.

Raw Developer Origin & Technical Request

Hacker News Apr 6, 2026

Show HN: I built a 2-min quiz that shows you how bad you are at estimating

I've gotten to the point in my career where I now make strategic decisions often (hiring, firing, choosing what equipment to go with, etc.), as well as in my personal life where I need to strongly weigh my options for a big purchase or investment. I found a not-so-surprising parallel between the two as these decisions "resolved." Am I making good decisions or am I getting lucky?

Did some research, read some books, and realized I should get in the habit of tracking my decision process. That quickly turned into the idea that formed Convexly.

The landing page is a 10-question calibration quiz where you assign a confidence level to statements drawn from a rotating pool of 100 (working on making the pool larger) and you get a Brier score back instantly. No signup required, and you can share your scores right away.

If you find it interesting, you can create a free account where you can track your decisions with probability estimates, resolve them over time, and get calibration curves that show if you are over/underconfident. From what I've seen so far, users are overconfident when they say they're between 70-90% sure about something.

For the math: Beta-PERT distributions for the payoff modeling, Kelly criterion for the position sizing, signal detection theory for separating skill from randomness.

On the coding side: FastAPI with NumPy/SciPy, frontend in Next.js and Supabase.

So far this has been a solo project of mine. If you want to see all the features use code SHOWHN for 30 days of full access, no credit card required.

Curious if anything about your score surprised you after taking the quiz.
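For readers unfamiliar with the scoring mentioned in the post, here is a minimal Python sketch of a Brier score and a binned calibration table. It is not Convexly's code: the function names, the five-bin layout, and the sample forecasts are all assumptions made for illustration.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared gap between stated probability and the 0/1 outcome (lower is better)."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes, n_bins=5):
    """Group forecasts into equal-width bins.

    Returns one (mean forecast, observed hit rate, count) row per non-empty bin.
    """
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # a forecast of 1.0 goes in the top bin
        bins[idx].append((p, o))
    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        hit_rate = sum(o for _, o in pairs) / len(pairs)
        rows.append((mean_forecast, hit_rate, len(pairs)))
    return rows


# Hypothetical data: ten stated confidences and whether each statement was true (1) or false (0).
forecasts = [0.9, 0.8, 0.8, 0.7, 0.6, 0.9, 0.5, 0.7, 0.8, 0.9]
outcomes = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]

print("Brier score:", brier_score(forecasts, outcomes))
for mean_forecast, hit_rate, count in calibration_table(forecasts, outcomes):
    print(f"said ~{mean_forecast:.2f}, right {hit_rate:.2f} of the time (n={count})")
```

On a calibration curve, a bin whose hit rate falls below its mean forecast points to overconfidence in that range; a hit rate above the mean forecast points to underconfidence.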

Developer Debate & Comments

slothsonaplane • Apr 7, 2026

Brier scoring works on questions with cheap, fast resolution; the strategic decisions you mention (hiring, equipment, big purchases) resolve over months or years, often ambiguously, and the counterfactual never resolves at all. Curious whether the calibration gains from the rapid-feedback quiz actually transfer to the slow-feedback domains the tool is designed to help with, or whether it ends up training a slightly different skill. A second thing: most of my strategic decisions weren't solo, and once one calibrated person sits in a room with two louder uncalibrated ones, the calibration math stops being load-bearing. Have you thought about a team variant?

EForEndeavour • Apr 6, 2026

Apologies if this is off-topic, but having spent more time than I'd like to admit having to create and edit webapps that emerged entirely out of Claude Code, Cursor, Codex, etc. with minimal to no direct code-writing by their human subscribers, this website has strong AI smells:

- Inter font
- all caps section headers
- Lucide icons
- em dashes, of course the em dashes
- bubble status badges (of course with all-caps "IN PROGRESS" and "COMING SOON" that mean the same thing)
- Uncited claims like "Most founders are overconfident in the 70-90% range" and "Most people score between 0.20 and 0.30"
- No less than FOUR blog articles all published April 4

None of these points is by any means a dealbreaker. And after all, I suppose a product should be judged on its merits and the value it delivers to its users, not on the tools used to create it. But together, the frontend bears the unmistakeable generative AI "smell" that telegraphs that the human(s) directing the tools building this app might be optimizing for speed over rigor and quality (further supported by the volunteer QA/QC happening in the comments), and may only be as good and reliable as the uncritically accepted outputs of a $20/month coding assistant.

gcanyon • Apr 6, 2026

Wait, so roughly is it rewarding being confident when correct, and penalizing being confident when wrong? Meaning that the highest score is only achievable if you answer fully confident true or false, and get all 10 correct?

If so, isn't that conflating knowledge with over/under confidence?
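One way to see what the score rewards: the Brier score is a proper scoring rule, so in expectation the best report is your actual belief, not 0% or 100%. The sketch below checks this numerically. `expected_brier` and `best_report` are hypothetical helper names, and the quiz is assumed to use a plain Brier score as described in the post.

```python
def expected_brier(report, true_prob):
    """Expected squared error of stating `report` when the statement is true with probability `true_prob`."""
    return true_prob * (1 - report) ** 2 + (1 - true_prob) * report ** 2


def best_report(true_prob):
    """Grid-search reports 0.00, 0.01, ..., 1.00 for the lowest expected score."""
    grid = [i / 100 for i in range(101)]
    return min(grid, key=lambda q: expected_brier(q, true_prob))


# If you are really 70% sure, saying 70% beats saying 100% on average.
print(best_report(0.7))
print(expected_brier(0.7, 0.7), expected_brier(1.0, 0.7))
```

A perfect score of 0 still requires being both certain and right on every question, so a raw Brier score over ten questions reflects knowledge as well as calibration.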

addisonl • Apr 6, 2026

> Question: A fair die rolling a 6 twice in a row is more likely than rolling 1-2-3-4-5-6 in sequence

Two 6s in a row is a 1/36 chance: (1/6)^2

1-2-3-4-5-6 is a 1/46656 chance: (1/6)^6

Website is claiming they are the same probability:

> Same probability: 1/46,656 — Both outcomes have exactly the same probability: (1/6)^6 = 1/46,656. This illustrates the representativeness heuristic — random-looking sequences feel more probable than ordered ones.

Website's "answer" is wrong: was the question supposed to be rolling a 6 six times in a row?
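The arithmetic in this comment can be checked with exact fractions. This is a small sketch; the variable names are illustrative only.

```python
from fractions import Fraction

face = Fraction(1, 6)  # probability of any one specific face on a fair die

two_sixes = face ** 2   # 6, 6
run_1_to_6 = face ** 6  # 1, 2, 3, 4, 5, 6 in that order
six_sixes = face ** 6   # 6, 6, 6, 6, 6, 6

print(two_sixes, run_1_to_6, six_sixes)
print("two sixes are", two_sixes / run_1_to_6, "times as likely as the 1-6 run")
```

Only the six-roll comparison (six 6s versus 1-2-3-4-5-6) gives equal probabilities, which supports the commenter's guess about what the question intended.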

lorenzohess • Apr 6, 2026

Maybe I don't know enough about "calibration" in a technical sense, but it seems like this quiz can't really distinguish between factual knowledge and calibration skill?

Is this type of quiz reproducible for individuals and across various cross-sections of the population?

Are there studies on this? Is the quiz based on these studies?

testycool • Apr 6, 2026

I thought it was interesting, but don't appreciate having to give you my email to see full results.

I unsubscribe from mails that aren't useful to me day-to-day because they're distracting.

Other than that it seems like a cool idea. I'd recommend slightly bigger fonts. I often have this issue with Gemini.

Brier Score: 0.216 (lower is better)
Diagnosis: Overconfident

reltnek • Apr 6, 2026

I think this might be conflating confidence with accuracy. I tried leaving the slider in the middle (nominally the least confident position) and it gave a score of 0.25 and diagnosed it as 'overconfident'.
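The 0.25 reported here is what a plain Brier score gives for an all-50% answer sheet, whatever the true answers are, since (0.5 - 0)^2 and (0.5 - 1)^2 are both 0.25. A short sketch that enumerates every possible answer key for a 10-question quiz, assuming the quiz averages squared errors in the usual way:

```python
from itertools import product


def brier_score(forecasts, outcomes):
    """Mean squared gap between stated probability and the 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


all_fifty = [0.5] * 10

# Score the same all-50% sheet against all 2**10 possible true/false answer keys.
scores = {brier_score(all_fifty, key) for key in product([0, 1], repeat=10)}
print(scores)
```

So 0.25 is the score of claiming no knowledge at all. The score alone does not imply overconfidence, and how the site derives its diagnosis is not stated in the thread.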

iamtedd • Apr 6, 2026

Why do I need to sign up to get the results? Why couldn't it just be on the page?

Evgeniuz • Apr 6, 2026

There’s a bias, I think. When I saw the title about how bad I am at estimating, I leaned towards counterintuitive answers. This got me quite a high score. I think the test set should also include intuitive facts (or maybe I was just lucky).

Frequently Asked Questions

Market intelligence mapped to Convexly – a decision-tracking and calibration platform.

What is the technical positioning of Convexly – a decision-tracking and calibration platform?

Based on our AI analysis of the original developer request, its primary technical positioning is: A tool to improve decision-making by tracking probability estimates, resolving outcomes, and generating calibration curves to identify over/underconfidence, starting with a 2-minute quiz.

How is the developer community reacting to Convexly – a decision-tracking and calibration platform?

We have tracked 59 direct responses and active debates on this topic originating from Hacker News.

What architecture is tied to Convexly – a decision-tracking and calibration platform?

Our proprietary extraction maps Convexly – a decision-tracking and calibration platform – to adjacent architectural concepts including calibration quiz, Brier score, track decisions, probability estimates.

Cross-Market Term Frequency

Quantifies the cross-market adoption of foundational terms like Next.js and Supabase by tracking occurrence frequency across active SaaS architectures and enterprise developer debates.