Evaluator in the Microsoft commercial marketplace

Evaluator is now discoverable in the Microsoft commercial marketplace

Evaluator is our cognitive AGI decision engine that sits on top of large language models and turns raw model output into deterministic, policy-aligned decisions. With this milestone, Evaluator is now discoverable in the Microsoft commercial marketplace ecosystem, alongside Azure services that our customers already trust.

Evaluator runs as a managed, low-latency SaaS API on Azure, fronted by Azure Front Door at https://api.smsquared.ai, with policy-based steering, auditability, and controls designed for production use.

Production performance snapshot

In recent production smoke tests through Azure Front Door, Evaluator’s /v1/predict endpoint delivered consistently low latency at our default configuration:

  • p50 latency: ~235 ms
  • p95 latency: < 260 ms
  • Error rate: 0%
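
For readers who want to reproduce this kind of summary from their own measurements, here is a minimal sketch of how p50, p95, and error rate can be derived from a batch of request timings. It uses the nearest-rank percentile method, which is an assumption on our part for illustration; the function names (percentile, summarize) and the sample values are hypothetical and are not real Evaluator measurements.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms, errors):
    """Summarize one smoke-test run; `errors` is the number of failed requests."""
    total = len(latencies_ms) + errors
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "error_rate": errors / total,
    }


# Illustrative latencies in milliseconds (made up for this example).
sample = [228, 231, 233, 234, 235, 236, 238, 241, 247, 255]
summary = summarize(sample, errors=0)
print(summary)
```

Other percentile definitions (for example, linear interpolation) give slightly different values on small samples, so it helps to state which method a reported number uses.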

What “discoverable in the marketplace” actually means

Being discoverable in the Microsoft commercial marketplace is more than a listing badge. It means Evaluator has gone through Microsoft’s validation pipeline for offer structure, purchase flow, and basic compliance, and it is now aligned with the same commercial rails customers use to buy and manage Azure-native services.

Practically, that means customers can:

  • Find Evaluator directly through Microsoft marketplace channels.
  • Route Evaluator usage and billing through their existing Microsoft relationship.
  • Evaluate our service alongside other Azure-aligned building blocks.

Evaluator: cognitive AGI for deterministic decisions

Modern AI systems are powerful, but raw model output is not a decision. Evaluator is built for teams that need a clear, explainable answer — not just a token stream.

Evaluator ingests unstructured model outputs (text, scores, vectors) and applies a policy graph that encodes business rules, thresholds, and safety constraints. The result is a final decision that is:

  • Deterministic — the same input and policy produce the same outcome.
  • Explainable — decisions are driven by explicit policy, not hidden weights.
  • Auditable — every decision can be traced back to policy and inputs.
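
To make these three properties concrete, the sketch below models a policy as an ordered list of explicit rules rather than a full policy graph. It is a simplified illustration, not Evaluator's implementation: the Rule and Decision types, the decide function, the signal names (toxicity, confidence), and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    applies: Callable[[Mapping[str, float]], bool]
    outcome: str


@dataclass(frozen=True)
class Decision:
    outcome: str
    matched_rule: str
    trace: Tuple[Tuple[str, bool], ...]  # (rule name, matched?) for each rule checked


def decide(signals, rules, default="review"):
    """Return the outcome of the first matching rule, plus a trace of every rule checked."""
    trace = []
    for rule in rules:
        hit = rule.applies(signals)
        trace.append((rule.name, hit))
        if hit:
            return Decision(rule.outcome, rule.name, tuple(trace))
    return Decision(default, "default", tuple(trace))


# Hypothetical policy: safety constraint first, then a confidence threshold.
POLICY = (
    Rule("block_unsafe", lambda s: s["toxicity"] >= 0.8, "reject"),
    Rule("approve_confident", lambda s: s["confidence"] >= 0.9, "approve"),
)

signals = {"toxicity": 0.1, "confidence": 0.95}
first = decide(signals, POLICY)
second = decide(signals, POLICY)
print(first.outcome, first.matched_rule, first.trace)
```

Because decide is a pure function of the signals and the rule list, repeated calls return equal results (deterministic), the matched rule names the reason (explainable), and the trace records which rules were checked and in what order (auditable).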

What this means for customers right now

With Evaluator discoverable in the marketplace and fronted by Azure Front Door at api.smsquared.ai, customers get:

  • Low-latency decisioning on top of their existing LLM stack, with median responses of about 235 ms and p95 under 260 ms in our latest smoke tests.
  • Clear deployment posture — Evaluator runs as a managed SaaS API on Azure, with a design that fits existing cloud-native and enterprise architectures.
  • Governance and controls — policy-based steering, configurable vectors, and a clear contract for tiers and usage.

Near-term roadmap

This marketplace milestone is one step in the product journey. In the next phase, we’re focused on three visible tracks:

1. Public Evaluator demo

We are preparing a public demo experience so teams can explore Evaluator’s decision engine without a full integration. The demo will showcase how policy-based steering, alpha/gate controls, and vectors work together to shape decisions in real time.

2. Deeper Evaluator policy and audit features

We’re expanding policy packs, audit views, and introspection tooling so customers can:

  • Inspect how policy changes impact decisions before they go live.
  • Trace full decision paths for compliance and post-incident review.
  • Roll out changes safely with stronger guardrails and change discipline.
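
One common way to inspect the impact of a policy change before it goes live is to replay recorded inputs through both the current and the candidate policy and list the decisions that would differ. The sketch below illustrates that general idea only; it does not describe Evaluator's tooling, and diff_policies, make_threshold_policy, the confidence signal, and the thresholds are hypothetical.

```python
def diff_policies(recorded_inputs, current_policy, candidate_policy):
    """Return the recorded inputs whose outcome would change under the candidate policy."""
    changes = []
    for signals in recorded_inputs:
        before = current_policy(signals)
        after = candidate_policy(signals)
        if before != after:
            changes.append({"input": signals, "before": before, "after": after})
    return changes


def make_threshold_policy(threshold):
    """Build a toy policy that approves when confidence meets the threshold."""
    def policy(signals):
        return "approve" if signals["confidence"] >= threshold else "review"
    return policy


# Illustrative recorded inputs (made up for this example).
recorded = [{"confidence": 0.95}, {"confidence": 0.88}, {"confidence": 0.70}]
changes = diff_policies(
    recorded,
    current_policy=make_threshold_policy(0.90),
    candidate_policy=make_threshold_policy(0.85),
)
for change in changes:
    print(change)
```

Here, lowering the threshold from 0.90 to 0.85 changes only the decision for the 0.88 input, which moves from review to approve. A report like this gives reviewers a concrete list to sign off on before rollout.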

3. Continued focus on performance and reliability

Evaluator is built by an engineering-first team that treats latency, correctness, and observability as core product features. We’ll continue to:

  • Track p50 and p95 latency as first-class SLOs.
  • Keep error rates effectively at zero under normal traffic patterns.
  • Refine our decision contract and test harness as we add new capabilities.

Keeping the engineering bar high

From the beginning, Evaluator has been engineered as a core infrastructure service, not a demo. That shows up in the details:

  • Strict contracts for decision tiers and vector behavior.
  • Smoke and certifier scripts that exercise the same paths customers use.
  • Operational hygiene around incident response, change management, and hardening.

We believe AI systems need this kind of discipline if they’re going to sit in real user-facing flows. Being listed and discoverable in the Microsoft commercial marketplace is a signpost that Evaluator is ready for that responsibility — and we’re just getting started.