Why Accurate Medical Coding Matters and How Ember AI Delivers

Lynn Hsing

1. What is medical coding, anyway?

Every time a clinician treats a patient, someone has to translate that visit into a short set of standardized alphanumeric “procedure” and “diagnosis” codes so the practice can get paid. If those codes are wrong, insurers reject the claim or underpay it. Hospitals and clinics typically lose 3–5% of their revenue to simple coding mistakes, money that could be going toward better staffing, new equipment, or lower patient bills.

2. How good is Ember AI?

Think of Ember AI as a super-focused autopilot for billing:

| Metric (macro-average across all specialties) | Ember AI | Certified Coders | Best LLM baseline* |
|---|---|---|---|
| Procedure-code accuracy (CPT/HCPCS) | 93.0% | 83.6% | 74.3% |
| Diagnosis-code accuracy (ICD-10-CM) | 81.3% | 72.3% | 61.7% |

*Highest score among latest OpenAI, Claude, and Gemini models.

By comparison, certified human coders average about 84% on procedure codes and 72% on diagnosis codes. Well-known general-purpose AI models still struggle, often scoring below 60% on the same tasks; even the best baseline we tested reached only 74.3% and 61.7%. A study published in NEJM AI summed it up bluntly: “Large language models are poor medical coders.”
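The table reports macro-averages, meaning accuracy is computed within each specialty first and the per-specialty scores are then averaged with equal weight, so a high-volume specialty cannot drown out a small one. The following is a minimal sketch of that calculation, not Ember's actual scoring code: the function name `macro_average_accuracy`, the specialty names, and the toy results are all hypothetical and chosen only to show how a macro-average can differ from a pooled (micro) average.

```python
from collections import defaultdict


def macro_average_accuracy(results):
    """Average per-specialty accuracy with equal weight per specialty.

    `results` is an iterable of (specialty, is_correct) pairs, one per
    coded chart. This input shape is an assumption for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for specialty, is_correct in results:
        total[specialty] += 1
        correct[specialty] += int(is_correct)
    if not total:
        raise ValueError("no results to score")

    per_specialty = {s: correct[s] / total[s] for s in total}
    macro = sum(per_specialty.values()) / len(per_specialty)
    return macro, per_specialty


# Hypothetical toy data: 10 cardiology charts (9 correct) and
# 2 dermatology charts (1 correct).
results = (
    [("cardiology", True)] * 9
    + [("cardiology", False)] * 1
    + [("dermatology", True)] * 1
    + [("dermatology", False)] * 1
)

macro, per_specialty = macro_average_accuracy(results)

# Micro-average pools every chart together, so large specialties dominate.
micro = sum(ok for _, ok in results) / len(results)

print(f"macro={macro:.3f} micro={micro:.3f}")  # macro=0.700 micro=0.833
```

In this toy case the pooled score looks better than the macro-average because the larger specialty happens to be the easier one; reporting the macro-average avoids that effect.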

3. Why should you trust those numbers?

  • Big, fair sample. We re-code 10,000+ real patient charts in each specialty, chosen at random so there’s no cherry-picking.

  • Outside experts. Independent auditors with at least five years’ experience do the scoring — and they can’t see Ember’s answers while they work.

  • Double-checks for tricky cases. If two auditors disagree, a third steps in so our “ground truth” isn’t just one person’s opinion.

  • Audit scores for the auditors. Their agreement rate stays above 96%, which is well beyond typical industry targets.

  • Fresh data, all the time. We add new charts every quarter so the benchmark keeps up with rule changes.
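The double-check and agreement-rate steps above can be sketched in a few lines. This is an illustration under stated assumptions, not Ember's actual pipeline: the post says only that a third auditor "steps in" on disagreement, so treating the third auditor's code as final is an assumption, as is measuring agreement as simple percent agreement between the first two auditors. The function names `adjudicate` and `agreement_rate` and the sample reviews are hypothetical.

```python
def adjudicate(first, second, third=None):
    """Return the ground-truth code for one chart.

    If the first two auditors agree, their shared code is used.
    Otherwise the third auditor's code is taken as final (an assumed
    tie-break rule; the source does not specify one).
    """
    if first == second:
        return first
    if third is None:
        raise ValueError("auditors disagree; a third review is required")
    return third


def agreement_rate(pairs):
    """Fraction of charts on which the first two auditors chose the same code."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no audited charts")
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical reviews: (auditor 1, auditor 2, auditor 3 or None).
reviews = [
    ("E11.9", "E11.9", None),
    ("I10", "I10", None),
    ("E11.9", "E11.65", "E11.65"),  # disagreement, third auditor decides
    ("J45.909", "J45.909", None),
]

ground_truth = [adjudicate(a, b, c) for a, b, c in reviews]
rate = agreement_rate((a, b) for a, b, _ in reviews)

print(ground_truth)  # ['E11.9', 'I10', 'E11.65', 'J45.909']
print(f"agreement rate: {rate:.0%}")  # agreement rate: 75%
```

A model's accuracy would then be the share of charts where its code matches the adjudicated `ground_truth` entry, computed without the auditors ever seeing the model's answer.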

4. The takeaway

  • Coding errors quietly drain billions from healthcare every year.

  • Generic AI can help, but on its own it’s not accurate enough for money-on-the-line billing.

  • Ember already outperforms seasoned human coders and keeps learning every week.

If you manage a clinic, hospital, or billing service and want to see how Ember AI performs on your charts, let us know — we’ll run a no-cost, blind test and hand you the full report.