What does Agam Arora believe about reduce before evaluating. the 8-year operator discipline.?

Strip the marketing label, name the underlying mechanism, ask whether the mechanism is genuinely new. The substrate test has returned the same verdict across blockchain, no-code, AI, and anti-customization.

Reduce before evaluating. The 8-year operator discipline.

There is a moment in every hype cycle where the honest operator has to say something unpopular about a category they are actively working in. That moment is the substrate test. Strip the marketing label, name the underlying mechanism, ask one question: is this mechanism genuinely new, or is it a rebranded version of something already understood? If the mechanism is new, the excitement is earned and worth building against. If it is familiar wearing different terminology, refuse the rebrand in writing. Regardless of how much financial exposure you have to the category being evaluated.

Every technology hype cycle runs the same sequence: new label, elevated energy, market formation, substrate test, verdict. The operator's job is to run the substrate test before the market does. Reduce every hyped category to its substrate. If the substrate is genuinely new, the excitement is earned. If not, refuse the rebrand.

This discipline has run against four major waves since 2018. The verdicts differ. The method does not.

The 8-year through-line

2018: Blockchain. The market was pricing blockchain as a transformation of trust infrastructure. The substrate test returned: database innovation first. Not wrong in kind, wrong in scope. "No one wants to accept the biggest database innovation as just a database innovation." The category was real at the mechanism layer. The extrapolations were not. That distinction was documented publicly while actively selling a blockchain product. Inside-the-category honesty is the only version of this discipline that counts. (Saying the category is overhyped when you have nothing at stake is not discipline. It is positioning.)

2020: No-code. The substrate test returned: abstraction-layer tooling. Genuine for specific use cases. Not a category that eliminated engineering. The "super customizable" product pitch. The one that hid a six-month implementation tax behind the word "flexible". Failed the substrate test at the product level. That specific variant was deprecated from PRD scope before it was sold into enterprise accounts.

2023: GenAI. The substrate test returned: a platform that radically reduces the cost of cognition. That is a genuinely new mechanism. The substrate passes. The excitement is earned. The claim that GenAI would eliminate PM roles, replace engineering teams, or produce production-grade output without context infrastructure: failed the substrate test. Execution is cheap. Taste is not.

2025: Anti-customization. "Highly configurable" as a product attribute. The substrate test returned: a six-month implementation tax, an implementation partner engagement, and hundreds of training documents before day-zero go-live. Refused at the spec layer. Documented in writing. The category named itself as flexibility and customer-centricity. The mechanism was deployment complexity transferred to the customer.

The substrate test as a single-move procedure

Strip the marketing label. Name the underlying mechanism. Ask whether that mechanism is new.

The reduction is complete when it fits in one declarative sentence. "A database innovation first." "A platform that radically reduces the cost of cognition." "LLMs with tools and memory creating a new serving layer." If the reduction requires a paragraph, the substrate has not been found yet. Keep reducing.

The test is neutral. It does not presuppose skepticism. GenAI passes. The test forced evaluation on the correct variable: is the mechanism new. It returned yes. The agent-first thesis follows directly: if LLMs with tools and memory create a genuinely new serving layer, the architectural implication is that the agent surface precedes the human UI in enterprise system design. That is a substrate claim, not a label claim. It survives the test because the mechanism is real.

The test also enforces a conditional structure for capability claims. The condition under which a category claim gets evaluated as credible is observable and specified in advance, not constructed after the fact. Specifying the condition before the evidence arrives prevents post-hoc rationalization. The operator who cannot state the condition under which they would believe a capability claim is running on sentiment, not analysis.

Operator discipline at B2B scale

The discipline has specific consequences at B2B scale. Enterprise procurement cycles are 6 to 18 months. A technology bet that fails the substrate test mid-cycle does not produce a graceful exit: it produces a sunk-cost negotiation, a contract dispute, or a failed implementation that consumes engineering resources for months after the category verdict is obvious.

Evaluating the substrate before the pilot saves that cost. The 80% of enterprise AI experiments that do not reach production failed at the non-functional layer, the governance layer, and the data-pipeline layer. Not at the model layer. The substrate test applied at the architecture phase, not the post-demo phase, is how the 20% that ships gets separated from the 80% that does not.

The question this method does not answer

What I find harder is the false negative. The substrate test is good at filtering rebrands. It is less reliable at catching genuinely new mechanisms that are dressed conservatively. The substrate claim that undersells. The blockchain case was the opposite problem: oversell. I have been wrong in the other direction too, in categories I was too quick to reduce to "abstraction layer" before the mechanism had fully emerged. The method is sound. The practitioner running it is not infallible. Running the test in writing, with a stated falsification condition, is the only hedge against the practitioner's own priors distorting the verdict.