Validation flags reference: all flags the pipeline can return
validationFlags[] = hard issues · validationWarnings[] = soft flags.
FLAG surface to nurse
WARN soft / advisory
INFO informational only
| Flag / Warning | Applies to | Type | What it means |
|---|---|---|---|
| EXPIRED | All | FLAG | Document is past its expiration date. |
| POSSIBLE_ALTERATION | All | WARN | AI detected signs of tampering: font inconsistencies, pixel artifacts, white-out, copy-paste edges, or photo manipulation. Check alterationDetails for specifics. |
| UNDER_18 | Gov ID | FLAG | Extracted DOB places person under 18. |
| IMPOSSIBLE_DOB | Gov ID | FLAG | DOB is in the future or implies age >120. Likely an OCR misread. |
| AGE_OVER_100 / AGE_UNDER_16 | Gov ID | WARN | DOB produces an unusual age, probably an OCR digit error. Manual review. |
| EXPIRING_SOON | License | WARN | License expires within 30 days. |
| UNRECOGNIZED_DOCUMENT | License | FLAG | AI couldn't classify the document type. May be a wrong upload or blurry image. |
| POSITIVE_RESULT | TB | FLAG | TB test result is positive. A chest X-ray or clinical follow-up is required. |
| READ_WINDOW_VIOLATION | TB skin | FLAG | PPD was read outside the required 48–72 hour window. Test is invalid. |
| 2_STEP_INTERVAL_TOO_SHORT | TB skin | FLAG | Step 2 placed fewer than 7 days after step 1. CDC protocol requires ≥7 days between steps. |
| 2_STEP_INCOMPLETE_MISSING_STEP_2 | TB skin | FLAG | 2-step form with only step 1 data present. Nurse needs to upload the complete form. |
| NO_PROVIDER_NAME_OR_SIGNATURE | TB | WARN | No nurse, reader, or physician on the document. Lab-issued QuantiFERON/T-SPOT reports with a facility name do NOT trigger this. |
| NO_ACTIVE_TB_STATEMENT_NOT_FOUND | TB X-ray | WARN | No TB-negative phrase found in the radiology report. Standard phrases ("no acute infiltrate", "lungs are clear", etc.) are auto-detected. |
| XRAY_INDICATION_NOT_TB_RELATED | TB X-ray | WARN | X-ray indication is unrelated to TB screening (e.g., "SOB, wheezing"). X-ray may be clinically valid but wasn't ordered for TB. |
| PHYSICAL_MISSING_DATE | Physical | FLAG | No examination date extracted. Required to verify the physical is current. |
| DATE_LOW_CONFIDENCE | Physical | WARN | Exam date extracted with low confidence, often a handwritten year where "5" and "3" look similar. Manual review recommended. |
| typeMatch: false | All | WARN | AI detected a different document type than what was selected. Check typeMatchDetails. Selection is a hint only, never a hard block. |
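The DOB rows in the table above can be read as a small decision function. The following is a minimal sketch, assuming age is counted in whole years at extraction time and that an under-16 DOB raises both UNDER_18 and AGE_UNDER_16. The helper names `age_in_years` and `dob_flags` and the returned dictionary shape are hypothetical, not the pipeline's actual code.

```python
from datetime import date

def age_in_years(dob: date, today: date) -> int:
    """Whole years elapsed between dob and today."""
    years = today.year - dob.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years

def dob_flags(dob: date, today: date) -> dict:
    """Hypothetical helper: map a DOB to the flag names listed in the table."""
    flags, warnings = [], []
    if dob > today:
        flags.append("IMPOSSIBLE_DOB")  # DOB in the future
        return {"validationFlags": flags, "validationWarnings": warnings}
    age = age_in_years(dob, today)
    if age > 120:
        flags.append("IMPOSSIBLE_DOB")
    elif age > 100:
        warnings.append("AGE_OVER_100")
    if age < 18:
        flags.append("UNDER_18")
    if age < 16:
        warnings.append("AGE_UNDER_16")  # assumed to fire alongside UNDER_18
    return {"validationFlags": flags, "validationWarnings": warnings}
```

A clean adult DOB produces two empty lists under this sketch, which a backend could treat the same way as the null arrays returned for clean documents.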
Pipeline summary: what the extraction stamps on every document
Each capability runs automatically. Severity: BLOCKS = backend must reject · WARNS = soft flag for backend to act on · INFO = returned, no action required
| Capability | Applies to | Severity |
|---|---|---|
| Doc type classification — pipeline re-identifies doc type independently; nurse's selection is a hint only, never a hard gate | All | INFO |
| Expiry / validity check — hard block on expired docs; warn on expiring within 30 days (licenses); type-specific windows: TB PPD/blood 1yr, X-ray 5yr, CPR 2yr | Gov ID · License · TB · CPR · Vaccines | BLOCKS WARNS |
| Alteration & fraud detection — font anomalies, pixel artifacts, white-out; Google ID Proofing signals run separately on gov IDs | All | WARNS |
| Multiple docs in one upload — detected when nurse photographs two cards side by side | All | WARNS |
| Identity / name matching — exact, fuzzy (1–2 char diff / OCR noise), partial (first only or last only), no match | All | WARNS INFO |
| Dual-model confidence scoring — Claude + Gemini 3.5 Flash both read the doc; agreement per field raises score, disagreement lowers it | All | INFO |
| OCR cross-check — raw Cloud Vision scan verified against model output per field; match boosts confidence, mismatch reduces it | All (non-PDF) | INFO |
| Per-field confidence scores (0–100) — returned on every field; backend decides auto-approve vs. flag threshold | All | INFO |
| Age validation — impossible DOB (future or age >120) hard blocked; under-18 and over-100 flagged as probable OCR errors | Gov ID | BLOCKS WARNS |
| 2-step PPD protocol compliance — step 2 < 7 days after step 1 blocked (CDC violation); step 1 only blocked (incomplete); 1-step form auto-upgraded if step 2 date present; read window 48–72h validated | TB skin | BLOCKS WARNS |
| Lab report enforcement — QuantiFERON / T-SPOT must come from an actual lab report; physical exam or immunization records rejected; expected lab values (antigen, mitogen, nil) must be present | TB blood | BLOCKS WARNS |
| Chest X-ray clearance statement — radiology report must explicitly state "no active TB"; absence flagged | TB X-ray | WARNS |
| State-specific normalization & exceptions — license type codes normalized (e.g., OH "State Tested Nurse Aide" → CNA); IL/NE CNA no exp date or number expected; AL CNA 2yr expiry auto-computed from issue date | License | INFO |
| PA CNA enrollment phrase check — required phrases confirming PA Nurse Aide Registry enrollment must appear in the document | License (CNA-PA) | WARNS |
| Vaccine immunity determination — returns immune / not immune / exempt / unknown per vaccine; MMR requires all 3 components (Measles, Mumps, Rubella); MMRV satisfies both MMR + Varicella; declinations always rejected | Vaccines | BLOCKS INFO |
| Missing date auto-computation — CPR: 2yr expiry from issue date when not printed; 1-step PPD auto-upgraded to 2-step when step 2 date is present but form says 1-step | CPR · TB skin | INFO |
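The dual-model and OCR rows above say only that agreement raises a field's confidence and disagreement lowers it. A minimal sketch of that idea follows; the step sizes (+10, -20, +5, -15) and the function name `adjust_confidence` are invented for illustration and are not the pipeline's real weights.

```python
from typing import Optional

def adjust_confidence(base: int, models_agree: bool, ocr_match: Optional[bool]) -> int:
    """Hypothetical per-field score adjustment, clamped to the 0-100 range.

    ocr_match is None when the OCR cross-check did not run (e.g. PDF input).
    """
    score = base
    score += 10 if models_agree else -20   # assumed step sizes
    if ocr_match is not None:
        score += 5 if ocr_match else -15   # assumed step sizes
    return max(0, min(100, score))
```

The backend would then compare the adjusted score against its own auto-approve threshold, which this document leaves to the backend to choose.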
What the extraction pipeline handles today
Every item below is logic that runs automatically on every upload — no human review needed unless flagged. BLOCKS = hard stop, document cannot be accepted. WARNS = soft flag passed to backend. INFO = computed/detected, no action needed.
All document types
| Severity | Check |
|---|---|
| BLOCKS | Wrong document type uploaded (e.g. nurse selected "RN License" but uploaded a passport) — detected and flagged |
| WARNS | Signs of tampering or image manipulation detected on the document |
| WARNS | Multiple documents detected in a single upload (e.g. photo of two cards side by side) |
| INFO | Confidence score returned per extracted field (0–100) — backend can use to set review thresholds |
| INFO | Two independent models (Claude + Gemini) extract every document — fields where they agree get a confidence boost; conflicts are flagged for review |
| INFO | OCR text scan independently cross-checks every extracted field — mismatches reduce confidence score |
Government ID
| Severity | Check |
|---|---|
| BLOCKS | Impossible date of birth detected (age over 120, or future date) |
| WARNS | ID is expired |
| WARNS | Nurse appears to be under 18 |
| WARNS | Nurse appears to be over 100 (possible OCR error on DOB) |
| WARNS | Google Document AI fraud signals detected (image manipulation, suspicious marks) |
| INFO | Nurse's pre-selected ID type (e.g. "Driver's License") treated as hint only — the actual document type detected by the pipeline is the source of truth, never blocks |
| INFO | Nurse's age at time of extraction calculated from DOB and returned |
Nursing License / Certification
| Severity | Check |
|---|---|
| BLOCKS | License is expired |
| WARNS | License expires within 30 days |
| WARNS | PA CNA: uploaded a Notice of Enrollment but one or more of the 3 required phrases is missing from the document text |
| INFO | State-specific license type names normalized to standard codes (e.g. Ohio "State Tested Nurse Aide" → CNA; original label also saved) |
| INFO | Alabama CNA: 2-year expiration auto-calculated from issue date (AL does not print expiration on license) |
| INFO | IL, NE, NB CNA: no expiration date required — absence of expiration is not flagged as missing |
| INFO | AL, IL, NE, NB CNA: no license number required — absence of license number is not flagged as missing |
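The expired and expiring-soon rows for licenses reduce to a date comparison. This is a minimal sketch, assuming a license that expires today is not yet expired and that "within 30 days" is inclusive; `license_expiry_flag` is a hypothetical name.

```python
from datetime import date
from typing import Optional

def license_expiry_flag(expiration: date, today: date, soon_days: int = 30) -> Optional[str]:
    """Return EXPIRED, EXPIRING_SOON, or None for a license expiration date."""
    if expiration < today:
        return "EXPIRED"
    if (expiration - today).days <= soon_days:
        return "EXPIRING_SOON"
    return None
```

For a license expiring on 05/31/2026 and checked on 05/26/2026, the sketch returns EXPIRING_SOON.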
Name verification (cross-document)
| Severity | Check |
|---|---|
| INFO | Exact match: name on document matches account name exactly |
| INFO | Fuzzy match: small spelling differences (accent marks, OCR slips, 1-2 character errors) still count as a match |
| WARNS | Partial match: first name matches but last name doesn't (or vice versa) |
| WARNS | No match: neither first nor last name matches account — possible name change or wrong document |
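The four match levels above can be sketched with accent stripping plus an edit-distance check. This is an illustration under stated assumptions, not the pipeline's matcher: the edit limit (2 for names of five or more letters, otherwise 1) and all function names are hypothetical, and very short names can still produce false fuzzy matches.

```python
import unicodedata

def normalize(name: str) -> str:
    """Lowercase, strip accent marks, and keep letters only."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in no_accents.casefold() if c.isalpha())

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def part_match(doc: str, account: str) -> str:
    """Compare one name part: 'exact', 'fuzzy', or 'none'."""
    if doc.strip().casefold() == account.strip().casefold():
        return "exact"
    d, a = normalize(doc), normalize(account)
    limit = 2 if min(len(d), len(a)) >= 5 else 1  # assumed tolerance
    return "fuzzy" if edit_distance(d, a) <= limit else "none"

def classify_name(doc_first: str, doc_last: str, acct_first: str, acct_last: str) -> str:
    first = part_match(doc_first, acct_first)
    last = part_match(doc_last, acct_last)
    if first == "exact" and last == "exact":
        return "EXACT"
    if first != "none" and last != "none":
        return "FUZZY"
    if first != "none" or last != "none":
        return "PARTIAL"
    return "NO_MATCH"
```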
TB Test
| Severity | Check |
|---|---|
| BLOCKS | 2-step PPD: Step 2 was administered fewer than 7 days after Step 1 (protocol violation — results invalid) |
| BLOCKS | 2-step PPD uploaded with Step 1 present but Step 2 missing — incomplete submission |
| BLOCKS | QuantiFERON or T-SPOT uploaded but document is a physical form or immunization record, not a lab report |
| BLOCKS | TB test is expired (PPD/blood test: 1 year; Chest X-Ray: 5 years) |
| WARNS | PPD was not read within the required 48–72 hour window after placement |
| WARNS | Positive TB result detected |
| WARNS | No provider name or signature found on the document |
| WARNS | Chest X-Ray: document does not contain a statement confirming no active TB |
| WARNS | QuantiFERON/T-SPOT: lab values not found on the document |
| WARNS | 2-step PPD: more than 30 days between Step 1 and Step 2 (possible date transcription error) |
| INFO | 1-step PPD with a Step 2 date present is automatically re-classified as 2-step |
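The PPD timing rules above come down to two date differences. The sketch below assumes the form carries dates only (no clock times), so the read window is computed as whole days times 24. The warning name `2_STEP_INTERVAL_OVER_30_DAYS` is hypothetical, since this document does not name the over-30-days warning.

```python
from datetime import date
from typing import Tuple

def read_window(placed: date, read: date) -> Tuple[int, str]:
    """Hours between placement and reading, with the 48-72 hour rule applied."""
    hours = (read - placed).days * 24
    flag = "WITHIN_RANGE" if 48 <= hours <= 72 else "READ_WINDOW_VIOLATION"
    return hours, flag

def steps_interval(step1_placed: date, step2_placed: date) -> Tuple[int, str]:
    """Days between the two placements of a 2-step PPD."""
    days = (step2_placed - step1_placed).days
    if days < 7:
        return days, "2_STEP_INTERVAL_TOO_SHORT"
    if days > 30:
        return days, "2_STEP_INTERVAL_OVER_30_DAYS"  # hypothetical warning name
    return days, "WITHIN_RANGE"
```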
CPR / BLS
| Severity | Check |
|---|---|
| BLOCKS | Document is a First Aid card only — does not include CPR or BLS |
| BLOCKS | CPR is expired (2-year validity from issue date if no explicit expiration printed) |
| BLOCKS | Neither an expiration date nor an issue date can be read — manual review required |
| WARNS | Document does not contain "CPR" or "BLS" keywords (unexpected format) |
| INFO | 2-year expiration auto-calculated from issue date when no expiration date is printed on the card |
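The CPR rows above describe a fallback order: use the printed expiration, otherwise compute it from the issue date, otherwise send the document to manual review. A minimal sketch follows, reusing the validity windows quoted in this document (CPR 2 years, TB skin/blood 1 year, chest X-ray 5 years). The dictionary keys, function names, and the Feb 29 fallback to Feb 28 are assumptions.

```python
from datetime import date
from typing import Optional

# Validity windows quoted in this document; the keys are hypothetical labels.
VALIDITY_YEARS = {"CPR": 2, "TB_SKIN": 1, "TB_BLOOD": 1, "CHEST_XRAY": 5}

def add_years(start: date, years: int) -> date:
    """Add calendar years to a date."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Feb 29 with a non-leap target year: fall back to Feb 28 (assumed policy).
        return start.replace(year=start.year + years, day=28)

def resolve_expiration(doc_type: str, printed: Optional[date], issued: Optional[date]) -> Optional[date]:
    """Printed date wins; otherwise compute from issue date; None means manual review."""
    if printed is not None:
        return printed
    if issued is not None:
        return add_years(issued, VALIDITY_YEARS[doc_type])
    return None
```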
Vaccines — MMR, Varicella, Tdap
| Severity | Check |
|---|---|
| BLOCKS | Document contains no usable vaccine data — no doses and no titer results |
| BLOCKS | MMR titer: one or more of Measles, Mumps, Rubella components is missing from the lab report — cannot confirm full immunity |
| BLOCKS | MMR titer: one or more components is explicitly non-immune (negative result) |
| BLOCKS | Expected vaccine (e.g. MMR) not found anywhere in the document |
| BLOCKS | Declination form uploaded — these are never accepted as proof of immunity |
| WARNS | MMR titer: one or more components is equivocal or indeterminate — manual lab review required |
| WARNS | Medical exemption: signed but not by a licensed clinician (MD, NP, PA, etc.) |
| WARNS | Exemption form does not list MMR (or "all vaccines") — may be for a different vaccine |
| WARNS | Physical exam form uploaded but MMR status is ambiguous or marked "Due" — cannot confirm immunity |
| INFO | Document classified as: vaccination record / titer report / exemption / declination / physical form |
| INFO | MMRV combo vaccine satisfies both MMR and Varicella requirements from a single document |
| INFO | When multiple documents submitted for MMR (e.g. separate measles, mumps, rubella titers), results are aggregated — most recent and best result per component is used |
| INFO | Final immunity result returned: immune / not immune / exempt / unknown — ready for backend to act on |
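The aggregation rule above (latest result per component wins, with a status ranking as tiebreaker, and all three components required) can be sketched as follows. The evidence dictionary shape, the status spellings, the rank order, and the choice to return "unknown" when a component is missing are assumptions for illustration; this is not the actual aggregator.

```python
from datetime import date

COMPONENTS = ("measles", "mumps", "rubella")
STATUS_RANK = {"immune": 2, "unknown": 1, "not_immune": 0}  # assumed ranking

def _is_better(new: dict, old: dict) -> bool:
    """Prefer the later dated result; fall back to status rank."""
    new_date, old_date = new.get("date"), old.get("date")
    if new_date and old_date and new_date != old_date:
        return new_date > old_date
    return STATUS_RANK[new["status"]] > STATUS_RANK[old["status"]]

def aggregate_mmr(evidence: list) -> dict:
    """evidence: dicts with 'component', 'status', and optional 'date'."""
    best = {}
    for item in evidence:
        current = best.get(item["component"])
        if current is None or _is_better(item, current):
            best[item["component"]] = item
    missing = [c for c in COMPONENTS if c not in best]
    statuses = [best[c]["status"] for c in COMPONENTS if c in best]
    if "not_immune" in statuses:
        overall = "not_immune"
    elif missing or "unknown" in statuses:
        overall = "unknown"
    else:
        overall = "immune"
    return {"mmrImmune": overall, "missingComponents": missing}
```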
Live test results: SHFT-4882 Government ID (2026-05-26)
Document: GovID_Ohio-Drivers-License_Crystal-Mae_Stover_Exp-2029-07-01.jpg, sent through POST /api/extract-compare
| What was being verified | Result |
|---|---|
| The nurse's pre-selected ID type is treated as a hint only — the pipeline identifies the document type independently | Correctly identified as Driver's License ✓ |
| Date of birth validated — impossible ages flagged | Age = 37, no impossible-age flag ✓ |
| Tampering/alteration detection runs on every upload | No alterations detected (no false positive) ✓ |
| Single document uploaded — multi-document flag should not fire | Multiple documents = false ✓ |
| Clean document — no validation error or warning flags returned | No flags returned ✓ |
Live test results: SHFT-4950 License/Certification (2026-05-26)
Four real nurse license documents were sent through POST /api/extract-compare.
| Document | What was being verified | Result |
|---|---|---|
| Illinois RN License (NursingLicense_Illinois-Registered-Professional-Nurse-RN-License_Alyissia_Sims_Exp-2026-05-31.png) | All core fields extracted: license type, number, expiration, name, state | RN, number extracted, exp 05/31/2026, state IL ✓ |
| | "REGISTERED PROFESSIONAL NURSE" normalized to RN while keeping original label | "REGISTERED PROFESSIONAL NURSE" → RN ✓ |
| | Expiring within 30 days: warning flag should fire | EXPIRING_SOON warning returned ✓ |
| Illinois CNA Registry Printout (NursingLicense_Illinois-Health-Care-Worker-Registry-CNA-Verification_Tamika-Nicole_Wilson_Active-2026-03-12.pdf) | CNA fields extracted from state registry printout | CNA, number, exp 03/12/2026, issue date 05/11/2009, state IL ✓ |
| | PDF input handled correctly | PDF processed without error ✓ |
| Ohio STNA Card (NursingLicense_Ohio-State-Tested-Nurse-Aide-CNA-Card_Misty_McKee_Issued-2025-08-20.jpg) | Ohio "State Tested Nurse Aide" normalized to CNA | "State Tested Nurse Aide" → CNA ✓ |
| | OH CNA cards have no expiration: only issue date returned | Issue date 08/20/2025, expiration empty (correct) ✓ |
| Pennsylvania CNA Wallet Card (NursingLicense_Pennsylvania-CNA-Nurse-Aide-Registry-Card_Keniesha_Porter_Exp-2027-05-20.jpg) | PA CNA card fields extracted | CNA, number, exp 05/20/2027, state PA ✓ |
| | PA notice validation should only fire on Notice of Enrollment, not a wallet card | PA notice validation = not triggered (correct) ✓ |
All 4 docs: correct type detected, no tampering flagged, no duplicate documents ✓
Live test results: SHFT-4952 TB Test (2026-05-26)
Document: TB_Chest-Xray-Radiology-Report_REDACTED_2025-04-28.png, sent through POST /api/extract-compare
| What was being verified | Result |
|---|---|
| Document correctly classified as a Chest X-Ray | Type = Chest X-Ray ✓ |
| Chest X-Ray expiration auto-calculated as X-Ray date + 5 years | Expiration = 04/28/2030 (from X-Ray date 04/28/2025) ✓ |
| Document contains "no active TB" clearance statement | "No acute pulmonary findings." detected ✓ |
| No provider signature found — missing-signature warning should fire | NO_PROVIDER_NAME_OR_SIGNATURE warning returned ✓ |
| Single document uploaded — multi-document flag should not fire | Multiple documents = false ✓ |
SHFT-4882 — AI Extraction Pipeline: Government ID (P1_upload_id)
Open in Jira → | Field Registry (P2–P11) →
What this demo covers:
Each criterion links to a clickable demo sample above.
Live API verification — 2026-05-26
Doc:
GovID_Ohio-Drivers-License_REDACTED_Exp-2029-07-01.jpg → POST /api/extract-compare
| Criterion | Field returned | Value |
|---|---|---|
| P0B is hint only — AI classification source of truth | claude.typeMatch | true (DRIVERS_LICENSE) |
| P5 DOB validated for impossible dates | claude.ageAtExtraction / impossibleAgeFlag | 37 / null |
| Alteration detection runs on every extraction | claude.hasVisibleAlterations | false (no false positive) |
| Multi-document detection | claude.multipleDocumentsDetected | false |
| Backend flag arrays (clean doc → null) | documentai.validationFlags / validationWarnings | null / null |
| Acceptance Criterion | Status | Test Scenario |
|---|---|---|
| Accepts US passport, driver's license, learner's permit, temporary license, resident card | ✓ | Passport (REDACTED), Resident Card (REDACTED), State ID (REDACTED), DL (REDACTED, REDACTED) |
| Rejects non-accepted document types with error message | ◐ | Dan: Pipeline detects wrong doc type and flags TYPE MISMATCH. Dmitry: Backend reads the flag and returns the rejection message to nurse: "We couldn't identify this document. Please upload a valid U.S. passport, driver's license, learner's permit, temporary license, or resident card." |
| Detects expired documents and returns error | ◐ | Dan: Pipeline extracts expiration date and flags EXPIRED. Dmitry: Backend reads the flag and returns rejection: "This document is expired. Please upload a current, unexpired document." |
| Extracts all fields P2–P11 (first name, last name, suffix, DOB, expiration, issue date, document #, address, sex, middle name) | ✓ | REDACTED IL State ID → all 13 fields at 100% agreement |
| P0B_id_type (nurse's pre-selected doc type) is a hint only — AI classification is source of truth, mismatch never blocks | ✓ | Passport uploaded as "Driver's License" → AI detects passport, no block |
| P5 (date of birth) validated for impossible dates; under-18 triggers blocking flag | ✓ | DOB validated on every extraction. ageAtExtraction computed (REDACTED IL DL → 29). Flags: AGE_OVER_100, AGE_UNDER_16 for manual review. Under-18 → blocking flag. |
| P11 (middle name) extracted where present | ✓ | REDACTED MO DL → "REDACTED" extracted as middle name |
| Passport date formats (DD MMM YYYY) normalized to MM/DD/YYYY | ✓ | REDACTED passport → dates normalized in output |
| Handles both portrait and landscape orientations | ✓ | REDACTED GA DL (portrait), REDACTED Resident Card (landscape) |
| Confidence scores logged per field for threshold tuning (A6 = confidence config) | ✓ | Every extraction shows per-field scores (visible in results table) |
| Alteration detection: font inconsistencies, pixel artifacts, copy-paste edges, white-out, photo tampering | ✓ | Check runs on every extraction (hasVisibleAlterations field). Current test docs are real submissions and don't trigger it — need purpose-built tampered samples to validate the detection path |
| Readability check fails gracefully with message prompting clearer upload | ✓ | Low-quality images return graceful error with re-upload prompt |
| OCR cross-validation as independent third check | ✓ | REDACTED physical → OCR confirms/denies each field value independently |
| P2/P3 (first name / last name) cross-checked against A1/A2 (account first name / last name) for name mismatch detection | ◐ | Dan: Extraction pipeline already returns P2_first_name & P3_last_name from the document. Dmitry: Backend needs to compare extracted P2/P3 against the nurse's account fields A1_first_name/A2_last_name, handle nickname/suffix/hyphenation fuzzy matching, and trigger the mismatch flag + resolution options (update account name or upload proof of name change via FL-09). Also needs to check ALT-1/ALT-2 (alternate names on file) before flagging. |
| Name mismatch resolution options (nurse can update account name to match ID, or upload proof of name change) | — | Backend/UI flow — Dmitry scope |
| Document status flow (In Review → Verified → Needs Attention) | — | Backend state machine — Dmitry scope |
| Re-upload replaces previous, re-extracts all fields | — | Backend persistence — Dmitry scope |
| Duplicate detection (same doc uploaded by different accounts) | — | Backend dedup logic — Dmitry scope |
✓ Done in this demo
◐ Extraction ready, backend integration pending
— Backend/Dmitry scope
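The passport date criterion above (DD MMM YYYY normalized to MM/DD/YYYY) can be illustrated with the standard library. This is a sketch only: the document attributes normalization to the extraction pipeline, so `normalize_passport_date` is a hypothetical post-processing helper, and it assumes English month abbreviations.

```python
from datetime import datetime

def normalize_passport_date(text: str) -> str:
    """Convert 'DD MMM YYYY' (e.g. '01 JUL 2029') to 'MM/DD/YYYY'."""
    # title() turns 'JUL' into 'Jul' so it matches %b in an English locale.
    parsed = datetime.strptime(text.strip().title(), "%d %b %Y")
    return parsed.strftime("%m/%d/%Y")
```

An input that does not fit the pattern raises `ValueError`, which a caller could treat as a low-confidence date rather than guessing.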
Beyond ticket scope (bonus):
• Dual-model comparison (Claude + Gemini) for higher confidence
• Google Cloud Vision OCR as independent cross-check
• Per-field OCR corroboration with confidence adjustment
• Image compression & optimization (sharp: auto-resize, EXIF rotation)
• SHA-256 caching with 24h TTL
• Rate limiting (separate AI + OCR tracking)
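The caching item in the list above (SHA-256 keys, 24h TTL) can be sketched as an in-memory store keyed by the hash of the uploaded bytes. The class name `ExtractionCache`, its methods, and the in-memory dictionary are assumptions; the document does not say where the real cache lives.

```python
import hashlib
import time
from typing import Any, Callable, Optional

class ExtractionCache:
    """Hypothetical in-memory cache: identical uploads reuse a prior result."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60,
                 clock: Callable[[], float] = time.time):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries: dict = {}

    @staticmethod
    def key_for(file_bytes: bytes) -> str:
        return hashlib.sha256(file_bytes).hexdigest()

    def get(self, file_bytes: bytes) -> Optional[Any]:
        key = self.key_for(file_bytes)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return result

    def put(self, file_bytes: bytes, result: Any) -> None:
        self._entries[self.key_for(file_bytes)] = (self._clock(), result)
```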
SHFT-4950 — AI Extraction Pipeline: License/Certification (L1_license_upload)
Open in Jira → | Field Registry (L2–L8) →
What this demo covers:
Each criterion links to a clickable demo sample above.
Live API verification, 2026-05-26 (4 docs × POST /api/extract-compare)
| Doc | Criterion proven | Evidence |
|---|---|---|
| IL RN (REDACTED) license-rn | L2/L3/L4/L5/L6 all extracted | licenseType=RN, L3=REDACTED, L4=05/31/2026, L5=REDACTED T REDACTED, L6=IL |
| | License type normalization | "REGISTERED PROFESSIONAL NURSE" → RN (rawLicenseTypeLabel preserved) |
| | Expiring-soon warning fires (<30 days) | documentai.validationWarnings: ["EXPIRING_SOON"] |
| IL CNA (REDACTED) license-cna | CNA registry verification extraction | licenseType=CNA, L3=REDACTED, L4=03/12/2026, L8=05/11/2009, L6=IL |
| | PDF input handled | .pdf processed without error |
| OH STNA (REDACTED) license-cna | State-specific abbreviation normalized (STNA → CNA) | rawLicenseTypeLabel="State Tested Nurse Aide" → licenseType=CNA |
| | L8 issue date extraction (OH CNA has no expiration) | issueDate=08/20/2025, expirationDate=null (correct: OH CNA card) |
| PA CNA (REDACTED) license-cna | PA CNA registry card extraction | licenseType=CNA, L3=REDACTED, L4=05/20/2027, L6=PA |
| | paCnaNoticeValidation only fires on Notice of Enrollment | paCnaNoticeValidation=null (correct: this is a card, not a Notice) |
| All 4: typeMatch=true, hasVisibleAlterations=false, multipleDocumentsDetected=false ✓ | | |
| Acceptance Criterion | Status | Test Scenario |
|---|---|---|
| Accepts: state nursing board certificates, wallet cards, online verification printouts, screenshots of state board portals | ✓ | IL RN License (REDACTED), PA CNA Wallet Card (REDACTED), IL Registry Search Printout (REDACTED), OH CNA Card (REDACTED) |
| Rejects non-accepted doc types with error message | ◐ | Dan: Pipeline detects UNKNOWN documentType and flags TYPE MISMATCH. Also detects license subtype mismatch (e.g., selected LPN but uploaded CNA). Dmitry: Backend sends rejection message to nurse. |
| Extracts L2 (license type), L3 (license number), L4 (expiration date), L5 (full name), L6 (state), L8 (issue date) | ✓ | All license demo samples → fields extracted with per-field confidence |
| State-specific license type abbreviations (GNA, STNA, TMA, CMA, CMT, QMA, LNA) normalized to internal codes | ✓ | OH "State Tested Nurse Aide" (REDACTED) → normalized to CNA. rawLicenseTypeLabel preserves original. |
| License name formats vary by state — AI handles all common formats | ✓ | Prompt handles "Last, First M.", "First Middle Last", "LAST, FIRST MIDDLE" etc. |
| Passport-style and non-standard date formats normalized to MM/DD/YYYY | ✓ | All dates normalized in extraction prompt |
| Issue date (L8) identified regardless of labeling ("date issued", "effective date", "date of certification") | ✓ | Prompt lists all common label variants; MO CNA (REDACTED) uses "date of completion" |
| Detects expired licenses (L4 in past) + warning message | ◐ | Dan: Pipeline flags EXPIRED with warning text. Dmitry: Backend returns warning to nurse. |
| License expiring within 30 days returns warning | ◐ | Dan: Pipeline flags EXPIRING_SOON with date. Dmitry: Backend returns warning to nurse. |
| Handles portrait and landscape orientations | ✓ | Wallet cards (landscape) vs certificates (portrait) both handled |
| Confidence scores logged per field for threshold tuning | ✓ | Every extraction shows confidencePerField in results |
| Alteration detection (font inconsistencies, pixel artifacts, white-out, tampering) | ✓ | hasVisibleAlterations checked on every extraction |
| Readability check fails gracefully with re-upload prompt | ✓ | Low-quality images return graceful error |
| BACKEND SCOPE (Dmitry) | | |
| License type matching (L2 vs A7) — mismatch prompt + two options | — | Backend compares extracted L2 against A7_license from account creation |
| State matching (L6 vs A5) — CNA must match, LPN/RN flexible | — | Backend compares L6 against A5_license_state |
| Name matching (L5 vs A1/A2) + ALT-N lookup + FL-09 flag | — | Backend cross-checks names, only proof-of-name-change path (no A1/A2 update from license) |
| CNA exceptions: AL/IL/NB no license number; IL/NB no expiration; AL calculated expiration (issue+24mo) | ✓ | IL CNA (REDACTED) → stateException: IL_NO_EXPIRATION_REQUIRED. AL: calculatedExpiration = issueDate+2yr. Pipeline skips extraction for these fields per state rules. |
| PA CNA: validate Notice of Enrollment (3 text checks) | ✓ | Pipeline extracts documentText and checks 3 required phrases (Commonwealth/Dept of Health, nurse aide training completion, Nurse Aide Registry enrollment). Returns paCnaNoticeValidation with per-phrase results. |
| Document status flow (In Review → Verified → Needs Attention) | — | Backend state machine |
| Re-upload replaces previous, re-extracts all fields | — | Backend persistence |
| Duplicate detection (CNA numbers unique across accounts) | — | Backend dedup logic |
| Nursys mapping (LPN→PN, RN→RN) + downstream verification | — | Separate ticket — consumes L2, L3, L6 from this pipeline |
✓ Done in this demo
◐ Extraction ready, backend pending
— Backend/Dmitry scope
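The license type normalization described in the table above keeps the original label alongside the normalized code. A minimal sketch follows, containing only the two mappings this document confirms (Ohio STNA and the Illinois RN label). The alias table, the pass-through set, and the `None` result for unknown labels are assumptions; the other listed abbreviations are omitted because their target codes are not stated here.

```python
from typing import Optional

# Only mappings confirmed in this document; the real table is larger.
LICENSE_TYPE_ALIASES = {
    "REGISTERED PROFESSIONAL NURSE": "RN",
    "STATE TESTED NURSE AIDE": "CNA",
    "STNA": "CNA",
}
KNOWN_CODES = {"RN", "LPN", "CNA"}  # codes mentioned in this document

def normalize_license_type(raw_label: str) -> dict:
    """Return the normalized code plus the untouched original label."""
    key = " ".join(raw_label.upper().split())  # collapse case and whitespace
    code: Optional[str] = LICENSE_TYPE_ALIASES.get(key)
    if code is None and key in KNOWN_CODES:
        code = key
    return {"licenseType": code, "rawLicenseTypeLabel": raw_label}
```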
SHFT-4952 — AI Extraction Pipeline: TB Test (TB4_tb_upload)
Open in Jira → | Field Registry →
What this demo covers:
Validated with automated test suite against prod API (2026-05-20).
Live API verification — 2026-05-26
Doc:
TB_Chest-Xray-Radiology-Report_REDACTED_REDACTED_2025-04-28.png → POST /api/extract-compare
| Criterion | Field returned | Value |
|---|---|---|
| Chest X-ray classification | claude.typeMatch / testType | true / CHEST_XRAY |
| Chest X-ray expiration = performed date + 5 years | claude.calculatedExpiration | 04/28/2030 (from xrayDate 04/28/2025) |
| Detect "no active TB" / equivalent clearance phrase | claude.noActiveTbStatement | "No acute pulmonary findings." |
| Validate doctor's name/signature presence | documentai.validationWarnings | ["NO_PROVIDER_NAME_OR_SIGNATURE"] ✓ fires correctly |
| Multi-document detection | claude.multipleDocumentsDetected | false |
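The clearance-phrase row above returns the matching sentence from the radiology report. The sketch below shows one way such detection could work, using only the phrases quoted in this document. The pattern list is illustrative rather than the pipeline's real list, and `find_clearance_statement` is a hypothetical name.

```python
import re
from typing import Optional

# Phrases quoted in this document; the production list is presumably longer.
CLEARANCE_PATTERNS = [
    r"no active (tb|tuberculosis)",
    r"no acute infiltrate",
    r"lungs are clear",
    r"no acute pulmonary findings",
]

def find_clearance_statement(report_text: str) -> Optional[str]:
    """Return the first sentence containing a clearance phrase, else None."""
    # Split on sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report_text):
        if any(re.search(p, sentence, re.IGNORECASE) for p in CLEARANCE_PATTERNS):
            return sentence.strip()
    return None
```

A `None` result would correspond to the NO_ACTIVE_TB_STATEMENT_NOT_FOUND warning.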
| Acceptance Criterion | Status | Test Scenario / Evidence |
|---|---|---|
| Accepts 4 TB doc types: 1-step skin, 2-step skin, blood test (QuantiFERON/T-Spot/IGRA), chest X-ray | ✓ | REDACTED (1-step), REDACTED (2-step), REDACTED/REDACTED/Scott (QuantiFERON), REDACTED/Turner (X-ray) — all classified correctly |
| Type mismatch: rejects docs that don't match nurse's TB3 selection | ✓ | Test: skin_test selected + X-ray uploaded → typeMatch:false, "Selected PPD_SKIN_TEST but appears to be CHEST_XRAY" |
| 1-step selected but 2-step detected → auto-upgrade silently, return TEST_TYPE=2-STEP | ✓ | Auto-upgrade logic fires when step2DatePlaced detected on a 1-step classification. Warning: AUTO_UPGRADED_TO_2_STEP |
| 2-step selected but only one set of dates → incomplete upload flag | ✓ | Flag: 2_STEP_INCOMPLETE_MISSING_STEP_2 when step2DatePlaced is absent |
| 1-step skin: extract placed+read dates, validate 48-72hr read window | ✓ | REDACTED: placed 05/19, read 05/21 → readWindowHours:48, readWindowFlag:WITHIN_RANGE. Violation → READ_WINDOW_VIOLATION flag |
| 2-step skin: Step 1 and Step 2 placed dates must be at least 1 week (≥7 days) apart | ✓ | REDACTED: step1=02/16, step2=03/02 → stepsIntervalDays:14, stepsIntervalFlag:WITHIN_RANGE. <7 days → 2_STEP_INTERVAL_TOO_SHORT |
| Skin/blood test: calculate expiration = placed/result date + 1 year | ✓ | REDACTED: read 05/21/2025 → calculatedExpiration:05/21/2026. REDACTED: result 11/06/2025 → calculatedExpiration:11/06/2026 |
| Chest X-ray: calculate expiration = performed date + 5 years | ✓ | REDACTED: xrayDate 04/28/2025 → calculatedExpiration:04/28/2030 |
| Expired documents flagged (calculated expiration in past) | ◐ | Dan: Pipeline flags EXPIRED. Dmitry: Backend returns error message to nurse. |
| Positive result returns manual review flag | ✓ | overallResult:POSITIVE → flag:POSITIVE_RESULT. Routing to FL-07/Paused is Dmitry's scope. |
| Validate presence of doctor's name/signature/initials in "given by" field | ✓ | hasPhysicianSignature + physicianName extracted. Missing both → warning: NO_PROVIDER_NAME_OR_SIGNATURE |
| Blood test: detect if document has actual laboratory values | ✓ | REDACTED QuantiFERON: hasLabValues:true (IU/mL values present). Missing → warning: NO_LAB_VALUES_DETECTED |
| Blood test: reject Physical form or Immunization report (no lab values) | ✓ | isPhysicalOrImmunizationForm:true → flag: PHYSICAL_OR_IMMUNIZATION_FORM_NOT_LAB_REPORT. Rejection message is Dmitry's scope. |
| Chest X-ray: detect "no active TB" or equivalent clearance phrase | ✓ | REDACTED: noActiveTbStatement:"No acute pulmonary findings." Missing → warning: NO_ACTIVE_TB_STATEMENT_NOT_FOUND |
| Alteration detection (font inconsistencies, pixel artifacts, tampering) | ✓ | hasVisibleAlterations checked on every extraction |
| Returns tags: test_type, steps_interval_days, expiration_date | ✓ | All returned: testType, stepsIntervalDays (2-step only), calculatedExpiration |
| Confidence scores logged per field | ✓ | overallConfidence + per-field scores in results |
| Handles multi-page uploads (2-step across two pages) | ✓ | PDF uploads supported, all pages processed |
| BACKEND SCOPE (Dmitry) | | |
| Error messages to nurse (expired, read window violation, wrong doc type) | — | Backend reads flags and returns user-facing strings |
| Name matching (applicant name vs A1/A2) | — | Backend cross-checks extracted patientName against account |
| TB1 symptom screening interaction (positive result + symptoms → manual review) | — | Backend reads POSITIVE_RESULT flag + TB1 answers |
| Document status flow (In Review → Verified → Needs Attention) | — | Backend state machine |
| Facility enforcement (expiration_date + steps_interval_days for shift booking) | — | Backend/portal consumes tags from pipeline |
| TB3 update on auto-upgrade (1-step → 2-step) | — | Backend reads AUTO_UPGRADED_TO_2_STEP warning and updates TB3 |
✓ Done in this demo
◐ Extraction ready, backend pending
— Backend/Dmitry scope
Test docs used:
• TB_PPD-Skin-Test-Results_REDACTED_REDACTED_2025-05-21.jpeg (1-step)
• TB_Employee-Screening-Form_REDACTED_REDACTED_2026-03-02.jpg (2-step)
• TB_QuantiFERON-Gold-Plus-Blood-Test_REDACTED_2025-11-06.jpeg (blood)
• TB_Chest-Xray-Radiology-Report_REDACTED_REDACTED_2025-04-28.png (X-ray)
SHFT-5080 — AI Extraction Pipeline: MMR Immunity Proof
Open in Jira →
What this demo covers:
Six MMR proof types extracted to a shared contract (mmrDocType + mmrImmune) so Dmitry's backend can roll up immunity across multiple documents per nurse.
| Acceptance Criterion | Status | Test Scenario / Evidence |
|---|---|---|
| Vaccine record: extract per-vaccine doses, lot, manufacturer, dates | ✓ | VaccineRecordSchema returns vaccines[] with category (MMR/MMRV/MEASLES/MUMPS/RUBELLA/...), doses[], titerResult, immunityStatus |
| Lab titer report: extract POSITIVE/NEGATIVE/EQUIVOCAL per component | ✓ | titerResult + titerDate per vaccine entry; combined MMR titer covers all three components when positive |
| Physical form with MMR section as proof | ✓ | uploadType mmr-physical_form → PhysicalFormMmrSchema, mmrStatus (ADMINISTERED/IMMUNE_BY_TITER/UP_TO_DATE/DUE/DECLINED/EXEMPT) maps to mmrImmune |
| Medical exemption: clinician signature required | ✓ | Flags: MEDICAL_EXEMPTION_MISSING_PHYSICIAN_SIGNATURE, MEDICAL_EXEMPTION_NON_CLINICIAN_SIGNER (token-based MD/DO/NP/PA/APRN/DNP/CNM/PhD check — Pastor ≠ PA) |
| Religious exemption: nurse signature required | ✓ | Flag: RELIGIOUS_EXEMPTION_MISSING_NURSE_SIGNATURE when hasPatientSignature is false |
| Declination form: rejected as exemption | ✓ | DeclinationSchema → mmrDocType:"declination", mmrImmune:"unknown", flag:MMR_DECLINATION_REJECTED. Declinations never satisfy MMR. |
| Cross-document aggregation across multiple uploads | ✓ | POST /api/mmr/aggregate consumes prior extractions, returns mmrImmune + per-component evidence (measles/mumps/rubella) + missingComponents[] |
| Incomplete titer: missing one or more components flagged | ✓ | Flag: MMR_TITER_INCOMPLETE. Warning lists missing components by name. |
| Equivocal/indeterminate titer: manual review | ✓ | Flag: EQUIVOCAL_TITER_MANUAL_REVIEW, mmrImmune:"unknown". Distinguished from outright NEGATIVE. |
| Newer titer overrides older one per component | ✓ | Aggregator prefers later titerDate; falls back to status rank when dates missing |
| Applicant name cross-check (A1 first / A2 last) | ✓ | POST /api/name/verify returns EXACT/FUZZY/PARTIAL/NO_MATCH/INSUFFICIENT_DATA with similarity 0–100. Handles accents, hyphens, OCR slips (LeAnn/LeeAnn). |
| Alteration detection on every MMR doc | ✓ | hasVisibleAlterations + alterationDetails on all 6 schemas |
| Confidence scores per field | ✓ | confidencePerField + overallConfidence on every schema |
| BACKEND SCOPE (Dmitry) | | |
| Persist mmrDocType + mmrImmune per upload, call aggregator | — | Backend stores extraction, replaces prior of same type, calls /api/mmr/aggregate when status needs recomputing |
| User-facing error messages from flags | — | Backend reads warnings[] + flags[] and surfaces to nurse |
| Manual-review routing for exemptions and equivocal titers | — | Backend reads MMR_EXEMPTION_ON_FILE, EQUIVOCAL_TITER_MANUAL_REVIEW |
✓ Done in this demo
— Backend/Dmitry scope
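The token-based signer check from the medical exemption row can be sketched as follows. This is a minimal illustration, not the pipeline's code: the helper names `hasClinicianToken` and `medicalExemptionFlags` and the tokenizing rules are assumptions. Only the credential list and the two flag names come from the table above.

```typescript
// Credential tokens listed in the acceptance table (compared case-insensitively).
const CLINICIAN_TOKENS = new Set(["MD", "DO", "NP", "PA", "APRN", "DNP", "CNM", "PHD"]);

// Hypothetical helper: true when any whole token of the signer line is a credential.
function hasClinicianToken(signerLine: string): boolean {
  const tokens = signerLine
    .replace(/\./g, "") // "M.D." -> "MD", "Ph.D." -> "PhD"
    .split(/[^A-Za-z]+/) // split on spaces, commas, hyphens, digits
    .filter((t) => t.length > 0);
  // Whole-token comparison: "Pastor" is a single token, so it never matches "PA".
  return tokens.some((t) => CLINICIAN_TOKENS.has(t.toUpperCase()));
}

// Hypothetical wrapper that returns the flags named in the table.
function medicalExemptionFlags(signerLine: string | null): string[] {
  if (signerLine === null || signerLine.trim() === "") {
    return ["MEDICAL_EXEMPTION_MISSING_PHYSICIAN_SIGNATURE"];
  }
  return hasClinicianToken(signerLine) ? [] : ["MEDICAL_EXEMPTION_NON_CLINICIAN_SIGNER"];
}
```

A case-insensitive match like this would still accept a surname such as "Do", so a production check would likely also consider token position or casing.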
Upload types:
mmr-vaccine_record, mmr-titer, mmr-physical_form, mmr-medical_exemption, mmr-religious_exemption, mmr-declination
Endpoints:
POST /api/extract, POST /api/mmr/aggregate, POST /api/name/verify
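The "newer titer overrides older one per component" rule behind /api/mmr/aggregate could look roughly like the sketch below. It covers titer evidence only and is not the real aggregator: the type names, the rank order used as a fallback, and the "immune" / "not_immune" values are assumptions (the table above only shows mmrImmune:"unknown").

```typescript
type MmrComponent = "measles" | "mumps" | "rubella";
type TiterResult = "POSITIVE" | "NEGATIVE" | "EQUIVOCAL";

interface TiterEvidence {
  component: MmrComponent;
  titerResult: TiterResult;
  titerDate?: string; // ISO date (YYYY-MM-DD); may be missing
}

// Assumed tie-break order for when dates cannot decide.
const STATUS_RANK: Record<TiterResult, number> = { POSITIVE: 3, EQUIVOCAL: 2, NEGATIVE: 1 };

function preferEvidence(a: TiterEvidence, b: TiterEvidence): TiterEvidence {
  if (a.titerDate && b.titerDate && a.titerDate !== b.titerDate) {
    return a.titerDate > b.titerDate ? a : b; // ISO dates sort chronologically as strings
  }
  return STATUS_RANK[a.titerResult] >= STATUS_RANK[b.titerResult] ? a : b;
}

interface MmrAggregate {
  mmrImmune: "immune" | "not_immune" | "unknown"; // only "unknown" appears in the source
  evidence: Partial<Record<MmrComponent, TiterEvidence>>;
  missingComponents: MmrComponent[];
}

function aggregateTiters(items: TiterEvidence[]): MmrAggregate {
  const all: MmrComponent[] = ["measles", "mumps", "rubella"];
  const evidence: Partial<Record<MmrComponent, TiterEvidence>> = {};
  for (const item of items) {
    const current = evidence[item.component];
    evidence[item.component] = current ? preferEvidence(current, item) : item;
  }
  const missingComponents = all.filter((c) => evidence[c] === undefined);
  const results = all.map((c) => evidence[c]?.titerResult);
  let mmrImmune: MmrAggregate["mmrImmune"] = "unknown";
  if (results.every((r) => r === "POSITIVE")) mmrImmune = "immune";
  else if (results.some((r) => r === "NEGATIVE")) mmrImmune = "not_immune";
  return { mmrImmune, evidence, missingComponents };
}
```

With a 2020 positive and a 2024 negative measles titer, the later negative wins, and any component without evidence ends up in missingComponents.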
Claude Sonnet 4.6 — built by Anthropic
Currently one of the most capable multimodal models for vision-based structured data extraction from documents.
Why we chose it for this pipeline:
Claude is the primary extraction engine. It reads the uploaded document image, identifies all relevant fields (name, DOB, license number, expiration, etc.), and returns structured JSON with per-field confidence scores. It is the most accurate model we tested for this use case.
Strengths:
• Highest accuracy for field extraction across all document types (gov ID, TB tests, physicals, nursing licenses)
• Best at detecting document alterations — catches font mismatches, pixel-level edits, inconsistent backgrounds, and photoshopped text
• Produces nuanced, realistic confidence scores (typically 88–98 range) rather than defaulting to 100
• Superior handwriting recognition — reads handwritten dates, signatures, and doctor notes more reliably
• Strong structured output — consistently returns valid JSON matching our Zod schemas
• Built-in safety guardrails: tends to return null with low confidence rather than fabricate a value it can't read
Trade-offs:
• Slower than Gemini (~5–7 seconds per document vs ~3–5s)
• ~10x more expensive per document (~$0.01–0.03 vs ~$0.001–0.005)
• Occasionally over-cautious — may return lower confidence on legible fields
How it works in the pipeline:
1. Image is compressed & optimized (auto-resize, EXIF rotation via sharp)
2. Base64-encoded image + extraction prompt sent to Claude's vision API
3. Claude returns structured JSON with fields + confidence scores
4. Results validated against Zod schema + post-extraction rules (expiry, age, type match)
5. OCR cross-check adjusts confidence: +5 if OCR confirms, -15 if OCR disagrees
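Step 4's post-extraction rules for a government ID might look like the sketch below. It is an illustration under stated assumptions: the function names and the `expirationDate` field are hypothetical, while the flag names and thresholds (under 18, future DOB or age over 120) follow the validation flags reference.

```typescript
interface ExtractedGovId {
  dateOfBirth: string | null; // ISO date, e.g. "1990-07-04"
  expirationDate: string | null; // hypothetical field name
}

// Whole years between dob and today, compared in UTC.
function ageInYears(dob: Date, today: Date): number {
  let age = today.getUTCFullYear() - dob.getUTCFullYear();
  const beforeBirthday =
    today.getUTCMonth() < dob.getUTCMonth() ||
    (today.getUTCMonth() === dob.getUTCMonth() && today.getUTCDate() < dob.getUTCDate());
  if (beforeBirthday) age -= 1;
  return age;
}

// `today` is a parameter so the rules stay deterministic and testable.
function governmentIdFlags(doc: ExtractedGovId, today: Date): string[] {
  const flags: string[] = [];
  if (doc.expirationDate !== null && new Date(doc.expirationDate) < today) {
    flags.push("EXPIRED");
  }
  if (doc.dateOfBirth !== null) {
    const dob = new Date(doc.dateOfBirth);
    const age = ageInYears(dob, today);
    if (dob > today || age > 120) flags.push("IMPOSSIBLE_DOB");
    else if (age < 18) flags.push("UNDER_18");
  }
  return flags;
}
```

An unparseable date string produces no flag in this sketch, so a real rule set would likely need a separate check for that case.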
Pricing breakdown:
• Input: $3.00 per 1M tokens (the image + prompt you send)
• Output: $15.00 per 1M tokens (the JSON response it generates)
Typical single extraction:
~1,500 input tokens × $3/1M = $0.0045
~400 output tokens × $15/1M = $0.006
Total: ~$0.01 per document
At scale: 10,000 docs/month ≈ $100–300/month
Gemini 3.5 Flash — built by Google
A fast, cost-efficient multimodal model optimized for high-throughput tasks where speed and cost matter more than peak accuracy.
Why we chose it for this pipeline:
Gemini serves as the second opinion. When two independent models agree on a field value, our confidence in that extraction is very high. When they disagree, the field gets flagged for human review. This dual-model approach catches errors that a single model could miss.
Strengths:
• Very fast — typically 3–5 seconds per document
• ~10x cheaper than Claude per extraction
• Good accuracy on clearly printed text and standard document layouts
• Generous free tier (makes testing and development essentially free)
• High throughput — can process many documents quickly in batch scenarios
Trade-offs:
• Tends to give overconfident scores (95–100 for nearly everything, even ambiguous fields)
• Less reliable on handwritten forms, cursive, and poor-quality scans
• Misses some alteration cues that Claude catches (subtle font changes, compression artifacts)
• Occasionally misreads handwritten dates (e.g., "2025" as "2005")
How it works in the pipeline:
1. Same compressed image sent to Gemini's vision API in parallel with Claude
2. Gemini returns structured JSON matching the same Zod schema
3. Results compared field-by-field against Claude's output
4. Agreement/disagreement highlighted in the comparison view
5. OCR cross-check applied independently to Gemini's results too
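The field-by-field comparison in step 3 can be sketched as below. The types, the `compareExtractions` name, and the normalization (trim, uppercase, collapse whitespace) are assumptions made for illustration. The actual comparison rules are not described here.

```typescript
type Extraction = Record<string, string | null>;

interface FieldComparison {
  field: string;
  claude: string | null;
  gemini: string | null;
  agree: boolean;
}

// Assumed normalization so "Sonya " and "SONYA" count as the same value.
function normalizeValue(v: string | null): string | null {
  return v === null ? null : v.trim().toUpperCase().replace(/\s+/g, " ");
}

function compareExtractions(claude: Extraction, gemini: Extraction): FieldComparison[] {
  // Union of field names from both models, sorted for stable output.
  const fields = Array.from(new Set(Object.keys(claude).concat(Object.keys(gemini)))).sort();
  return fields.map((field) => {
    const c = claude[field] ?? null;
    const g = gemini[field] ?? null;
    return { field, claude: c, gemini: g, agree: normalizeValue(c) === normalizeValue(g) };
  });
}
```

Fields where `agree` is false are the ones that would be highlighted for human review. In this sketch two nulls count as agreement.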
Pricing breakdown:
• Input: $0.30 per 1M tokens
• Output: $2.50 per 1M tokens
Typical single extraction:
~1,500 input tokens × $0.30/1M = $0.00045
~400 output tokens × $2.50/1M = $0.001
Total: ~$0.001–0.005 per document
At scale: 10,000 docs/month ≈ $10–50/month
Google Cloud Vision API — TEXT_DETECTION
Traditional OCR (Optical Character Recognition), not a generative AI model. This is the same engine that powers Google Lens, Google Photos text search, and Google Drive's automatic PDF text extraction.
What is OCR?
OCR stands for Optical Character Recognition. It detects and extracts the raw text in an image through character recognition. Unlike the LLMs above, OCR doesn't "understand" the document: it finds every piece of text in the image and returns it as a plain string. It doesn't know what a "first name" or "expiration date" is; it just reads characters.
Why we use it in this pipeline:
OCR serves as an independent third cross-check alongside both AI models. If Claude extracts firstName = "SONYA" and OCR also found "SONYA" in the raw text, we have strong evidence that value is correct. If the AI extracted something OCR can't find anywhere in the document, that's a red flag — the AI may have hallucinated or misread.
How confidence adjustment works:
• AI extracts a field value → we search the OCR raw text for that value
• OCR ✓ Found in OCR text → confidence +5 points (confirmed by independent source)
• OCR ✗ Not found in OCR text → confidence -15 points (flagged for human review)
• This adjustment is applied per-field, independently for each AI model's results
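A minimal sketch of the per-field adjustment described above. The +5 / -15 values come from the list. The function name, the case-insensitive substring match, and the clamp to 0–100 are assumptions.

```typescript
interface AdjustedField {
  confidence: number;
  ocrConfirmed: boolean;
}

// Uppercase and collapse whitespace so layout differences don't block a match.
function normalizeForSearch(s: string): string {
  return s.toUpperCase().replace(/\s+/g, " ").trim();
}

function adjustConfidence(value: string | null, confidence: number, ocrText: string): AdjustedField {
  // Nothing to look up: leave the model's confidence untouched.
  if (value === null || value.trim() === "") {
    return { confidence, ocrConfirmed: false };
  }
  const found = normalizeForSearch(ocrText).includes(normalizeForSearch(value));
  const adjusted = confidence + (found ? 5 : -15);
  // Assumed clamp so scores stay within 0-100.
  return { confidence: Math.min(100, Math.max(0, adjusted)), ocrConfirmed: found };
}
```

For example, a firstName of "SONYA" at confidence 92 would move to 97 when the OCR text contains it, and to 77 when it does not.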
Why run OCR if AI already reads the image?
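AI models can "hallucinate": confidently output text that isn't actually in the document. OCR is deterministic (same image always produces same text), so it acts as a stable, independent check. It can still misread characters, which is why a miss lowers confidence and routes the field to review rather than rejecting the value. The combination of AI understanding + OCR verification is more reliable than either alone.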
Performance:
• Latency: ~0.3–0.5 seconds (runs in parallel with AI, adds zero wait time)
• OCR fires simultaneously with Claude and Gemini — the total request time is determined by the slowest AI model, not the sum
Pricing:
• $1.50 per 1,000 images processed
• First 1,000 images/month are FREE (Google's free tier)
• Per document: ~$0.0015
At scale: 10,000 docs/month ≈ $13.50/month (after free tier)
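The free-tier arithmetic behind the $13.50 figure, as a small sketch. The function name and default parameters are illustrative, and the rates are the ones quoted above.

```typescript
// Monthly Cloud Vision OCR cost in USD, using the rates quoted above.
function visionOcrMonthlyCost(
  images: number,
  freeImages: number = 1000, // first 1,000 images/month are free
  pricePerThousand: number = 1.5 // $1.50 per 1,000 images
): number {
  const billable = Math.max(0, images - freeImages);
  return (billable / 1000) * pricePerThousand;
}

// 10,000 docs: (10,000 - 1,000) / 1,000 * $1.50 = $13.50
console.log(visionOcrMonthlyCost(10000).toFixed(2));
```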
Google Document AI — Google Cloud's specialized document processing platform.
Pre-trained processors built for specific document types (IDs, forms, invoices) — not a general-purpose LLM. Returns structured key-value pairs with bounding boxes and confidence scores.
Why we chose it for this pipeline:
Document AI is the third independent extractor alongside Claude and Gemini. Because it's purpose-built for documents (not generative), it tends to be deterministic and excellent at machine-readable layouts, although it is the slowest of the three. When all three (Claude + Gemini + Doc AI) agree on a field, our confidence in that value is extremely high.
Strengths:
• Purpose-built per document category — separate model for IDs vs forms vs general text
• Returns spatial layout (bounding boxes), so we can see exactly where each value was read from
• Deterministic — same image always returns same output (unlike LLMs)
• Identity Document Proofing processor detects tampering signals (digital alteration scores, evidence inconclusive, etc.) that LLMs sometimes miss
• Strong on structured/printed text — driver's licenses, official forms, lab reports
Trade-offs:
• Slowest of the three (~10–14s vs Claude ~6s, Gemini ~4s)
• Weaker on free-form handwriting and unusual layouts than Claude
• Each document category needs its own processor (more setup than a single LLM call)
• Field labels come back raw — we map them to our schema in code (e.g., "Family Name" → lastName)
Processors we use:
| Category | Processor | Purpose |
|---|---|---|
| government_id | US Driver License Parser | Pre-trained on US DL/state ID layouts; extracts P2–P11 fields directly |
| license | Form Parser | Generic form K/V extraction for nursing license certificates & cards |
| tb_test / physical | OCR Processor | Better at handwriting (TB skin test dates, physician notes) |
| All gov IDs | Identity Document Proofing | Runs in parallel; returns tampering/alteration scores feeding POSSIBLE_ALTERATION flag |
How it works in the pipeline:
1. Same compressed image dispatched to Document AI in parallel with Claude + Gemini
2. Processor picked by upload category (see table above)
3. Raw entities returned with bounding boxes & per-field confidence
4. Field labels mapped to our schema (e.g., DocAI "Date Of Birth" → our dateOfBirth)
5. Results compared against Claude/Gemini for agreement scoring; populates documentai.validationFlags / validationWarnings
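Step 4's label mapping could be sketched as follows. The entity shape is a simplified view of what Document AI returns (type, mentionText, confidence). The "Given Names" label and the `mapDocAiEntities` helper are assumptions. "Family Name" and "Date Of Birth" are the labels quoted above.

```typescript
// Simplified view of a Document AI entity.
interface DocAiEntity {
  type: string;
  mentionText: string;
  confidence: number;
}

interface MappedField {
  value: string;
  confidence: number;
}

// Raw label (lowercased) -> schema field. "given names" is an assumed label.
const LABEL_TO_FIELD = new Map<string, string>([
  ["family name", "lastName"],
  ["given names", "firstName"],
  ["date of birth", "dateOfBirth"],
]);

function mapDocAiEntities(entities: DocAiEntity[]): { fields: Record<string, MappedField>; unmapped: string[] } {
  const fields: Record<string, MappedField> = {};
  const unmapped: string[] = [];
  for (const e of entities) {
    const key = LABEL_TO_FIELD.get(e.type.trim().toLowerCase());
    if (key === undefined) {
      unmapped.push(e.type); // keep unknown labels visible instead of dropping them
      continue;
    }
    const existing = fields[key];
    // If a label appears twice, keep the higher-confidence reading.
    if (existing === undefined || e.confidence > existing.confidence) {
      fields[key] = { value: e.mentionText.trim(), confidence: e.confidence };
    }
  }
  return { fields, unmapped };
}
```

Returning the unmapped labels makes it easier to notice when a processor starts emitting a label the mapping doesn't cover yet.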
Pricing breakdown:
• Form Parser / OCR / DL Parser: $0.030 per page (first 1,000 pages/month FREE)
• Identity Document Proofing: $0.10 per request
Typical single extraction (gov ID):
1 DL Parser call + 1 ID Proofing call = $0.03 + $0.10 = ~$0.13 per gov ID
Non-ID docs (license/TB/physical): ~$0.03 per doc
At scale: 10,000 docs/month — mix-dependent, ~$300–$1,300/month
Note: Doc AI is the most expensive of the three providers — used only because the independent third-opinion catches errors the LLMs miss.
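The "mix-dependent" range above follows from how many uploads are gov IDs. A rough sketch, assuming one page per document and ignoring the free tier, as the $300–$1,300 range does. The function name is illustrative, and prices are kept in cents to avoid floating-point drift.

```typescript
const PARSER_CENTS_PER_PAGE = 3; // Form Parser / OCR / DL Parser: $0.030 per page
const PROOFING_CENTS_PER_REQUEST = 10; // Identity Document Proofing: $0.10 per request

// Monthly Document AI cost in USD for a given document mix.
function docAiMonthlyCostUsd(govIdDocs: number, otherDocs: number): number {
  const govIdCents = govIdDocs * (PARSER_CENTS_PER_PAGE + PROOFING_CENTS_PER_REQUEST);
  const otherCents = otherDocs * PARSER_CENTS_PER_PAGE;
  return (govIdCents + otherCents) / 100;
}

console.log(docAiMonthlyCostUsd(0, 10000)); // no gov IDs: low end of the range
console.log(docAiMonthlyCostUsd(10000, 0)); // all gov IDs: high end of the range
console.log(docAiMonthlyCostUsd(2500, 7500)); // a 25% gov ID mix
```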
What are tokens?
Tokens are the unit AI models use to measure text. Think of them as "word pieces." One token is roughly 4 characters or about ¾ of an English word. The word "extraction" is 2 tokens. A full sentence is typically 15–25 tokens.
Why do tokens matter?
AI model pricing is based entirely on tokens — both the tokens you send (input) and the tokens the model generates back (output). Understanding tokens helps you estimate costs and optimize usage.
Input tokens — what you send TO the model:
• The document image (~1,000–2,000 tokens depending on resolution)
• The extraction prompt/instructions (~200 tokens)
• The schema definition telling the model what fields to extract (~100 tokens)
• Total per request: ~1,300–2,300 input tokens
Output tokens — what the model sends BACK:
• The structured JSON with all extracted fields and confidence scores
• Typically ~200–500 tokens depending on document type
• Output tokens cost roughly 5–8x more than input at the rates above (Claude: $15 vs $3 per 1M; Gemini: $2.50 vs $0.30 per 1M)
Why does output cost more?
The model can process the whole input (image, prompt, schema) in parallel, but it generates the output one token at a time, and each new token requires another pass through the model. That sequential generation is the computation you're paying the premium for.
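The input-token total is the sum of the component ranges listed above. A tiny sketch, where the `TokenRange` type and `addRanges` helper are illustrative:

```typescript
interface TokenRange {
  min: number;
  max: number;
}

// Sum several token ranges into one overall range.
function addRanges(...parts: TokenRange[]): TokenRange {
  return parts.reduce(
    (acc, p) => ({ min: acc.min + p.min, max: acc.max + p.max }),
    { min: 0, max: 0 }
  );
}

const inputBudget = addRanges(
  { min: 1000, max: 2000 }, // document image, depending on resolution
  { min: 200, max: 200 }, // extraction prompt/instructions
  { min: 100, max: 100 } // schema definition
);
console.log(inputBudget); // min 1300, max 2300
```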
Worked example — 1 driver's license:
Claude Sonnet 4.6:
Input: ~1,500 tokens × $3.00/1M = $0.0045
Output: ~400 tokens × $15.00/1M = $0.006
Total: ~$0.01
Gemini 3.5 Flash:
Input: ~1,500 tokens × $0.30/1M = $0.00045
Output: ~400 tokens × $2.50/1M = $0.001
Total: ~$0.0015
Compare mode (both + OCR): ~$0.013 ($0.0105 + $0.00145 + $0.0015 for OCR)
Claude is ~7x pricier in this example but more accurate. Compare mode runs both for maximum confidence.
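The worked example can be reproduced with a small cost function. This is a sketch using the rates and token counts quoted above. The names are illustrative, and the totals are unrounded, so they may differ slightly from the rounded figures in the text.

```typescript
interface TokenRates {
  inputPerMillion: number; // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Cost in USD of one request at the given token counts and rates.
function tokenCost(inputTokens: number, outputTokens: number, rates: TokenRates): number {
  return (inputTokens * rates.inputPerMillion + outputTokens * rates.outputPerMillion) / 1e6;
}

const CLAUDE_RATES: TokenRates = { inputPerMillion: 3.0, outputPerMillion: 15.0 };
const GEMINI_RATES: TokenRates = { inputPerMillion: 0.3, outputPerMillion: 2.5 };
const OCR_COST_PER_IMAGE = 0.0015; // $1.50 per 1,000 images

const claudeCost = tokenCost(1500, 400, CLAUDE_RATES); // about $0.0105
const geminiCost = tokenCost(1500, 400, GEMINI_RATES); // about $0.00145
const compareModeCost = claudeCost + geminiCost + OCR_COST_PER_IMAGE; // about $0.0135

console.log(claudeCost.toFixed(5), geminiCost.toFixed(5), compareModeCost.toFixed(5));
console.log((claudeCost / geminiCost).toFixed(1)); // Claude-to-Gemini cost ratio
```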