Spec Review Assistantv1.4 · in production 56 days
Executive Dashboard
How this AI workflow is performing, and how much it has improved since launch.
Review accuracy
▲1.2 pts94.1%
12.9 pts since launch
Cost / review
▼2.2%$0.0089
$0.0103 at launch
p95 latency
▼70.0ms1,840ms
per spec clause
Reviewer hours saved / wk
▲9.4%142 hrs
≈ $14.2k labor / wk
Improvement over time
Review accuracy by week. Markers show versions promoted into production.
+12.9 pts
v1.0Apr 22v1.1Apr 29v1.2May 13v1.3May 27v1.4Jun 10
Where the AI still struggles
Accuracy by spec division, the loop targets the weakest slices first.
Recent activity
M. Okafor version promoted
Promoted v1.4 to production after eval pass (+2.3 pts).
M. Okafor review corrected
Corrected fire-rating verdict; added to eval set.
system failure detected
LLM-judge flagged wrong_standard (groundedness 0.41).
D. Vance data reindex
Re-indexed standards library; dropped superseded editions.
M. Okafor version promoted
Promoted citation guardrail.
a.ryan@northforge.com access login
Entra ID SSO sign-in from 10.4.x (East US 2).
Trend and historical metrics on this dashboard are illustrative sample data. Live actions, spec-review runs and the eval benchmark, are genuinely measured when a model key is connected (see Improvements → Test against benchmark).