Question Details

No question body available.

Tags

algorithm-analysis .net-core

Answers (9)

Accepted Answer
September 10, 2025 Score: 48 Rep: 220,789 Quality: Expert Completeness: 50%

When you have two algorithms,

  • one which is simple to implement, but probably slow, and

  • one which is probably much quicker, but more complex and needs more effort to implement,

you always start with the first. Even when you know for sure that only the second one will meet your performance requirements, you will still need to implement the first one to compare results against and to test whether you got the complex implementation of the second one right.

If you are lucky, after you have implemented the first (simple) one and tried it out, it may turn out that it is already quick enough, and you can save yourself the effort of implementing the second one. If you are not so lucky, you need to invest that effort. Maybe it turns out the simple algorithm is not only useful for comparison purposes, but also faster for small inputs, so you may combine both approaches. And if you are really unlucky, the second algorithm will not meet your performance requirements either, and you have to look for a third approach.

Anyway, given that there is a certain complexity involved, you will only find this out by running experiments and measuring.
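
A minimal sketch of what that comparison harness could look like, assuming (from the question's context) that both algorithms return the index of the nearest embedding; the names and the `annNearest` delegate are hypothetical, not part of any particular library:

```csharp
// Illustrative only: uses the simple brute force search as an oracle to
// verify a faster implementation supplied as a delegate (annNearest).
using System;
using System.Linq;

static class OracleComparison
{
    // Simple reference implementation: exhaustive scan over all embeddings.
    public static int BruteForceNearest(float[][] embeddings, float[] query) =>
        Enumerable.Range(0, embeddings.Length)
                  .OrderBy(i => SquaredDistance(embeddings[i], query))
                  .First();

    static float SquaredDistance(float[] a, float[] b) =>
        a.Zip(b, (x, y) => (x - y) * (x - y)).Sum();

    // Compare the complex implementation against the simple one.
    public static void CheckAgainstOracle(
        float[][] embeddings,
        float[][] queries,
        Func<float[][], float[], int> annNearest)
    {
        foreach (var query in queries)
        {
            int expected = BruteForceNearest(embeddings, query);
            int actual   = annNearest(embeddings, query);
            if (expected != actual)
                Console.WriteLine($"Mismatch: brute force -> {expected}, fast version -> {actual}");
        }
    }
}
```

Since an approximate method may legitimately return a slightly different neighbor, you might compare distances within a tolerance (or measure recall) rather than insist on identical indices.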

September 10, 2025 Score: 12 Rep: 46,905 Quality: Expert Completeness: 30%

... I could implement the brute force approach and monitor in production so I'm not prematurely optimizing...

Do this, except not in production. The brute force approach is likely faster to implement and test, but I wouldn't advise deploying it to production straight away. You'll want a safe pre-production environment with realistic enough data to gain confidence that the brute force solution won't cause production problems.

Once you have confidence in this solution, deploy to production and monitor.

The challenge is trying to predict how large the dataset will be, because that ends up being the limiting factor with brute-force approaches. Your current production data is your best point of comparison.

So, let's say you deploy this and start monitoring. When is there a problem? You need to define a performance threshold and then proactively notify developers when runtime exceeds that threshold but the system still responds quickly enough to avoid an emergency patch to production.

I would run some experiments to get a feel for how well the brute force approach scales; when does it become untenable? If all you have is production data, then consider simply copying and pasting data to make the dataset bigger. I like to make order-of-magnitude jumps in size as a first rough test: increase the amount of data 10x for each jump.

So, Test 1 involves 100 vector embeddings, Test 2 would be 1,000 vector embeddings, then 10,000, and so on, until you hit an obviously terrible, panic-inducing level of performance.

If you find the brute force approach handles 1,000 reasonably well, then this indicates it will suffice. If 1,000 vector embeddings cause your computer to fall over, then consider dialing things back: how does 5x perform? How about 3x or 2x?
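
A rough sketch of that kind of scaling experiment, assuming a hypothetical bruteForceSearch delegate and using Stopwatch for coarse timing (a proper benchmark harness such as BenchmarkDotNet would give more reliable numbers):

```csharp
// Rough scaling experiment: duplicate the existing data to 10x, 100x, ...
// and time the brute force search at each size. Names are illustrative.
using System;
using System.Diagnostics;
using System.Linq;

static class ScalingTest
{
    public static void Run(float[][] productionEmbeddings, float[] query,
                           Func<float[][], float[], int> bruteForceSearch)
    {
        foreach (int factor in new[] { 1, 10, 100, 1000 })
        {
            // Grow the dataset by copying the existing embeddings.
            var data = Enumerable.Repeat(productionEmbeddings, factor)
                                 .SelectMany(x => x)
                                 .ToArray();

            var sw = Stopwatch.StartNew();
            bruteForceSearch(data, query);
            sw.Stop();

            Console.WriteLine($"{data.Length,10} embeddings: {sw.ElapsedMilliseconds} ms");
        }
    }
}
```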

If the brute force approach really can't handle much more than production already has, then this could indicate you need to invest time upfront in a more complex but efficient solution.

September 11, 2025 Score: 4 Rep: 62,377 Quality: Medium Completeness: 20%

In practice you have to implement both algorithms and measure in order to determine the break-even point. In theory it could be calculated if you knew the cost of each individual operation, but this gets incredibly complex. For example, some operations become slower once the dataset exceeds the size of a memory page. The reason big-O analysis is feasible is that we ignore the cost of individual operations and just look at the growth in the number of operations.

The recommended approach is to always implement the simplest algorithm first and then only optimize if performance turns out to be a problem with realistic input. It might never be an issue.

The fancy solution is to support both algorithms, and select the optimal one based on the input size. But you should only do this if it gives a measurable real-world benefit, since it gives you twice as much code to debug and maintain.
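
A rough sketch of that kind of size-based selection, with both implementations passed in as delegates; the break-even threshold here is made up and would have to come from your own measurements:

```csharp
// Hybrid dispatch sketch. BreakEvenSize is a placeholder value, not a
// recommendation; determine it by benchmarking both implementations.
using System;

class NearestNeighborSearch
{
    const int BreakEvenSize = 10_000;

    readonly Func<float[][], float[], int> _bruteForce;
    readonly Func<float[][], float[], int> _ann;

    public NearestNeighborSearch(Func<float[][], float[], int> bruteForce,
                                 Func<float[][], float[], int> ann)
    {
        _bruteForce = bruteForce;
        _ann = ann;
    }

    public int FindNearest(float[][] embeddings, float[] query) =>
        embeddings.Length < BreakEvenSize
            ? _bruteForce(embeddings, query)  // simple, low overhead, wins on small inputs
            : _ann(embeddings, query);        // complex, pays off at scale
}
```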

September 11, 2025 Score: 4 Rep: 9,339 Quality: Low Completeness: 20%

Make a prototype, favoring simplicity and correctness over efficiency.
It must still be fast enough for the most basic constructed test though.

Test it until it works, then scale up the test until it either violates the real performance goals or gets safely beyond the peak production workload.

If you have to replace it with something more efficient and complex, you now have a known good test-suite to find the inevitable bugs.

Either way, you might have to add a guard against DoS attacks if you cannot rely on an existing one.

September 11, 2025 Score: 2 Rep: 31,152 Quality: Low Completeness: 40%

I know that the most efficient way of doing this is an Approximate nearest neighbor search (ANN) but as far as I can tell all ANN algorithms have overhead

One thing that you will need to do to make sense of this is to distinguish between space and time costs. There is often a tradeoff between the amount of memory an algorithm requires and its performance in terms of time. You need to consider these costs separately in order to get anywhere. For example, you might find an approach with excellent time performance whose memory requirements exceed what you have or can afford.

Another aspect to consider is that there are different kinds of time complexity (best case, average case, worst case, amortized), and which kind matters depends on your specific constraints and requirements.

September 11, 2025 Score: 2 Rep: 153 Quality: Low Completeness: 50%

Eternal calibration

This is a very lucky case where you can implement both and almost always choose the best one, simply by running both in parallel during the first runs.

Because the data is immutable and the computation has no side effects, running both at the same time will not cause them to interfere with each other, apart from the CPU time wasted on whichever one loses the race. After the fast one returns, setting a flag can interrupt the long-running other one.
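
A sketch of that race in C#, assuming both implementations accept a CancellationToken and check it periodically; the delegates here are placeholders:

```csharp
// Sketch: run both implementations concurrently and take whichever finishes
// first, cancelling the loser. Both delegates are assumed to honour the token.
using System;
using System.Threading;
using System.Threading.Tasks;

static class RaceBothAlgorithms
{
    public static async Task<int> FastestOf(
        Func<CancellationToken, int> simple,
        Func<CancellationToken, int> complex)
    {
        using var cts = new CancellationTokenSource();

        var simpleTask  = Task.Run(() => simple(cts.Token), cts.Token);
        var complexTask = Task.Run(() => complex(cts.Token), cts.Token);

        var winner = await Task.WhenAny(simpleTask, complexTask);
        cts.Cancel();               // interrupt the slower one
        return await winner;        // propagate the winner's result (or exception)
    }
}
```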

And by remembering which conditions make one or the other fast, it's possible to choose between them dynamically, so the performance will almost always be optimal.

Say the recorded statistics start empty, and both versions run in parallel. After some runs, it's possible to determine under which conditions one is "always" fast, and under which conditions they run almost the same. Once this statistic is in place, the code starts running only one version per request, and it will always be a version that:

  1. is faster, or
  2. has the same performance as the other.

Let some small percentage of requests keep running both in parallel, even after the statistic is formed, to allow dynamic recalibration of said statistic.

September 26, 2025 Score: 1 Rep: 49,610 Quality: Low Completeness: 10%

In case you use identical data for both methods and can decide at the time of the call which method to use: measure the execution time of both methods as a function of n, then decide at the time of the call which method to use.

Say method 1 takes n nanoseconds, and method 2 takes 10,000 + sqrt(n) nanoseconds. Method 1 will be faster (or close to it) for n up to roughly 10,000; beyond that, method 2 wins.
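
As a toy check of those example numbers, a few lines are enough to locate the crossover point:

```csharp
// Toy calculation for the cost models above: method 1 costs n ns,
// method 2 costs 10,000 + sqrt(n) ns. Find where they cross.
using System;

int n = 1;
while (n < 10_000 + Math.Sqrt(n))
    n++;

Console.WriteLine($"Method 2 becomes faster at roughly n = {n}");  // ~10,101
```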

September 17, 2025 Score: 0 Rep: 59,591 Quality: Medium Completeness: 70%

Is there a way to determine the overhead cost of algorithms,

Theoretically, yes. This is a deterministic process. However, part of the consideration here is the specific hardware you're using, and the profile might change when the hardware does.

For that reason, I would err on the side of not trying to theorize, and instead just run and benchmark it.

or is it best to implement both and do some performance benchmarks to determine which approach I should use?

Which brings us to this solution.

There are a few things to consider here. A sequence of questions would be:

  • At what point does option A start outpacing option B, and vice versa?
  • Realistically, do we expect to most commonly be working in the range where one solution is decidedly superior?
  • For the (hopefully rare) cases where we end up in the other range, can we live with the inefficiency, or would we rather invest the time to also build the other option and use it when it's more optimal?

Based on those questions, if you're happy with a one-approach solution, pick the one that benchmarks best for your expected use cases. If you'd prefer to implement both solutions so that you can always run the optimal algorithm, develop both and toggle between them using e.g. the strategy pattern and a smart factory that knows which strategy to pick when, as in the sketch below.
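
A bare-bones sketch of that strategy-plus-factory arrangement; the interface, the ANN placeholder, and the break-even threshold are all illustrative assumptions:

```csharp
// Strategy-pattern sketch: both implementations share an interface and a
// factory picks one based on the dataset size. Names are illustrative.
using System;

interface INearestNeighborStrategy
{
    int FindNearest(float[][] embeddings, float[] query);
}

class BruteForceStrategy : INearestNeighborStrategy
{
    public int FindNearest(float[][] embeddings, float[] query)
    {
        int best = 0;
        float bestDist = float.MaxValue;
        for (int i = 0; i < embeddings.Length; i++)
        {
            float dist = 0;
            for (int d = 0; d < query.Length; d++)
            {
                float diff = embeddings[i][d] - query[d];
                dist += diff * diff;
            }
            if (dist < bestDist) { bestDist = dist; best = i; }
        }
        return best;
    }
}

class AnnStrategy : INearestNeighborStrategy
{
    // Placeholder for the complex ANN implementation (e.g. an HNSW index).
    public int FindNearest(float[][] embeddings, float[] query) =>
        throw new NotImplementedException();
}

static class NearestNeighborFactory
{
    const int BreakEvenSize = 10_000;  // hypothetical; determine by benchmarking

    public static INearestNeighborStrategy Create(int datasetSize) =>
        datasetSize < BreakEvenSize
            ? new BruteForceStrategy()
            : new AnnStrategy();
}
```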

September 12, 2025 Score: -1 Rep: 1,406 Quality: Low Completeness: 30%

The only thing that matters is when the simple algorithm becomes too slow

Predicting the performance of the complex algorithm can be difficult without implementing it. But you can easily test how fast the simple algorithm is. It doesn't really matter how much faster you could potentially make it, as long as the implementation meets the requirements.

Then just decide whether the wait is tolerable for the user with the largest data set you could expect to be loaded. A common rule of thumb is that if a process takes more than 5 seconds, you should show a progress bar and a way to cancel.

With many types of algorithms, you can get approximate results even if the process is terminated part-way through. For example, it is useful to let the user know that 5 matching vector pairs have already been found when 15% of the potential pairs have been checked; if they didn't expect any matches, they don't need to wait until the end. How much detail the user should be given depends a lot on the application, though.
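
As an illustration, here is a sketch of a brute force pair search that reports progress and returns whatever it has found so far when cancelled; the distance threshold and the match criterion are made up for the example:

```csharp
// Sketch: brute force pair matching that reports progress and can be
// cancelled part-way through, returning the partial list of matches.
using System;
using System.Collections.Generic;
using System.Threading;

static class PartialMatching
{
    public static List<(int, int)> FindMatchingPairs(
        float[][] embeddings, float maxDistance,
        IProgress<double> progress, CancellationToken token)
    {
        var matches = new List<(int, int)>();
        long totalPairs = Math.Max(1L, (long)embeddings.Length * (embeddings.Length - 1) / 2);
        long checkedPairs = 0;

        for (int i = 0; i < embeddings.Length; i++)
        {
            for (int j = i + 1; j < embeddings.Length; j++)
            {
                if (token.IsCancellationRequested)
                    return matches;               // partial result, not an exception

                float dist = 0;
                for (int d = 0; d < embeddings[i].Length; d++)
                {
                    float diff = embeddings[i][d] - embeddings[j][d];
                    dist += diff * diff;
                }
                if (dist <= maxDistance * maxDistance)
                    matches.Add((i, j));

                checkedPairs++;
            }
            progress.Report((double)checkedPairs / totalPairs);  // e.g. drive a progress bar
        }
        return matches;
    }
}
```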