Nishant

The AI Metric that might surprise you

Before I reveal the metric, let me give you some context.

NonBioS is an AI agent that works as a Junior Software Developer. You give it a high-level task, and it does the work mostly autonomously with occasional guidance, then circles back with updates. As someone managing NonBioS, your work is mostly limited to giving the initial task and following up with "Yes," "OK," or "Confirmed" every couple of minutes as NonBioS navigates the task and keeps you in the loop.

Recently, we launched the billing feature on NonBioS. NonBioS was free to use until this month, but our surging usage forced our hand. The billing feature had three high-level tasks: set up Stripe products, implement usage tracking, and update the UX for subscription nudges.

ALL of these tasks were done by NonBioS. 100% of the work. No IDEs were harmed in the process. No wrists were subjected to RSI.
But here's the thing: each of these tasks took NonBioS almost 30 minutes of working time. In that span, NonBioS wrote the code, made design changes, implemented refactors, fixed build errors, and made everything work end-to-end.

The metric that might surprise you is this: even though NonBioS took only about 30 minutes per task, the developer guiding it spent almost 5 hours per task. We know this now because it's the first time we measured it accurately.

Why is this crazy? It proves what we've been thinking for a long time: the bottleneck is now the human. NonBioS took only about 30 minutes to complete each task, but the human guiding it spent almost 10 times as long (almost 5 hours) understanding the work and steering it within our required architectural guardrails.
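
For anyone who wants to check the arithmetic, here is a rough sketch in Python. The per-task numbers are the approximate ones quoted above, and the constant and function names are made up for illustration. Nothing here comes from NonBioS itself.

```python
# Approximate figures quoted in this post ("almost" 30 minutes, "almost" 5 hours).
# All names below are illustrative, not part of any NonBioS API.
AGENT_MINUTES_PER_TASK = 30   # NonBioS working time per task
HUMAN_HOURS_PER_TASK = 5      # developer time spent guiding each task
TASKS = 3                     # Stripe products, usage tracking, UX nudges


def human_to_agent_ratio(human_hours: float, agent_minutes: float) -> float:
    """Return how many times longer the human spent than the agent."""
    return (human_hours * 60) / agent_minutes


def totals(tasks: int, human_hours: float, agent_minutes: float) -> tuple[float, float]:
    """Return (total human hours, total agent hours) across all tasks."""
    return tasks * human_hours, tasks * agent_minutes / 60


ratio = human_to_agent_ratio(HUMAN_HOURS_PER_TASK, AGENT_MINUTES_PER_TASK)
human_total, agent_total = totals(TASKS, HUMAN_HOURS_PER_TASK, AGENT_MINUTES_PER_TASK)
print(f"ratio: {ratio:.0f}x, human: {human_total:.0f} h, agent: {agent_total:.1f} h")
# prints "ratio: 10x, human: 15 h, agent: 1.5 h"
```

Across the three billing tasks, that works out to roughly 15 hours of human time against about 1.5 hours of agent time, assuming the per-task figures hold for each task.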

What did the human do while NonBioS was working? Other tasks. Coffee breaks. Everything in between.

Very soon, NonBioS will handle multiple tasks in parallel. But this problem won't disappear because you can only multitask so much. For the billing feature, there had to be a sequence to the tasks – it wasn't completely parallelizable.

The other interesting metric: an expert developer could do the same task in about 10 hours, or maybe slightly less using an AI-enabled IDE. But they chose not to. The reason? Using NonBioS requires less cognitive effort.

Think of it like commuting to work. Walking takes a couple of hours - that's like writing code manually in a traditional IDE. Driving a regular car is much faster - similar to using an IDE with an AI assistant like GitHub Copilot. But with NonBioS, it's like having a fully autonomous car that drives itself while you sit back. The trip still takes about the same time as driving yourself, but the car handles everything while you relax, plan your evening, or take calls. Your cognitive effort drops to almost zero.

OpenRouter is an incredible service powering a fair amount of NonBioS's LLM infrastructure. As our usage has grown, so have our rankings on OpenRouter.