How to Convert a Certificate of Analysis (CoA) to Excel — and Verify It's Right
2026-06-16
TL;DR: To convert a Certificate of Analysis (CoA) to Excel, you can re-key it by hand, run OCR/table extraction, use a generic document-AI tool, or use a purpose-built CoA extractor. The harder question is whether the extracted data is correct: right analytes, right units, right limits, and a pass/fail that actually matches the numbers. This guide walks through each method and how to verify the result before it lands in your spreadsheet, LIMS, or ERP.
What is a Certificate of Analysis (CoA)? A Certificate of Analysis (CoA, also written COA) is a lab-issued, batch-specific document that lists each test (analyte), the measured result, the unit, the specification or limit, and a pass/fail verdict for a given lot of material. It’s the evidence used to accept, reject, hold, or release a product.
Why convert a CoA to Excel at all?
Because a PDF is a dead end. The moment you need to trend results across lots, compare a supplier’s numbers to your internal spec, file an audit record, or push data into a LIMS or ERP, you need structured rows and columns — not a scanned image. In large food and beverage manufacturers, receiving and checking a supplier CoA can happen over a million times a year and take 10–15 minutes each; for labs issuing CoAs, manual re-keying can run 30–90 minutes per document. Every one of those minutes is also an opportunity for a transcription error.
The catch: getting the data out is only half the job. Getting it out correctly is what protects you.
How do I convert a CoA PDF to Excel? (4 methods compared)
There are four practical ways to turn a CoA into a spreadsheet. They differ enormously in speed, accuracy, and how well they handle scanned documents.
| Method | Speed | Handles scans? | Accuracy risk | Best for |
|---|---|---|---|---|
| Manual re-keying | Slow (30–90 min) | N/A | High — transcription errors | One-off, no budget |
| OCR / table-extraction scripts (pdfplumber, Tesseract) | Medium | Partial | Medium — breaks on layout changes | Developers, one lab format |
| Generic document-AI (extract-anything tools) | Fast | Yes | Medium — misreads units/limits | Mixed documents |
| Purpose-built CoA extractor with verification | Fast | Yes | Low — recomputes & cross-checks | QA/labs at volume |
Method 1 — Re-key it by hand
Open the PDF, type each analyte, result, unit, and limit into Excel. It works at one or two CoAs a week and needs no tools. It collapses the instant volume rises or formats vary, and it’s the single biggest source of transcription errors in incoming QA.
Method 2 — OCR and table extraction
For digital CoAs, libraries like pdfplumber extract tables reasonably well; PyMuPDF gives coordinate-level control for merged cells and wrapped text. Scanned CoAs need an OCR pass (e.g., Tesseract or a cloud OCR API) first. The weakness: these scripts are brittle. A supplier changes a template and the parser silently mis-maps a column — and now your “limit” column holds a method reference.
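One way to make a table-extraction script fail loudly instead of silently is to map columns by header text rather than by position. The sketch below is a minimal illustration, not a complete parser: it assumes the table arrives as a list of rows with the header first (the shape pdfplumber's `extract_table()` returns), and the accepted header spellings in `EXPECTED_COLUMNS` are hypothetical and would need to match your suppliers' templates.

```python
# Hypothetical canonical column names and the header spellings accepted for each.
EXPECTED_COLUMNS = {
    "analyte": {"analyte", "test", "parameter"},
    "result": {"result", "results", "found"},
    "unit": {"unit", "units", "uom"},
    "limit": {"limit", "specification", "spec"},
}


def map_columns(header_row):
    """Map canonical names to column indexes; raise if one is missing or duplicated."""
    mapping = {}
    for index, cell in enumerate(header_row):
        label = (cell or "").strip().lower()
        for name, aliases in EXPECTED_COLUMNS.items():
            if label in aliases:
                if name in mapping:
                    raise ValueError(f"two columns look like {name!r}")
                mapping[name] = index
    missing = sorted(set(EXPECTED_COLUMNS) - set(mapping))
    if missing:
        raise ValueError(f"template changed? missing columns: {missing}")
    return mapping


def rows_to_records(table):
    """Turn a list-of-lists table (header first) into dicts keyed by canonical name."""
    mapping = map_columns(table[0])
    return [{name: row[i] for name, i in mapping.items()} for row in table[1:]]


table = [
    ["Test", "Method", "Result", "Units", "Specification"],
    ["Lead", "ICP-MS", "0.12", "mg/kg", "NMT 0.5"],
]
records = rows_to_records(table)
```

With this approach an extra "Method" column is simply ignored, while a template that drops or renames the limit column raises an error rather than shifting a method reference into the limit field.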
Method 3 — Generic document-AI
Upload-anything extraction tools will return JSON or Excel from almost any layout, including scans, in many languages. They’re genuinely fast. But they’re built to extract fields, not to understand CoA semantics: they can accept “<0.01” as the value 0.01, treat mg/L as if it were mg/kg, or capture the lab’s printed verdict without checking it against the numbers.
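The "<0.01" problem can be avoided by keeping the qualifier alongside the number instead of collapsing the cell to a float. The following is a minimal sketch under stated assumptions: the `ParsedResult` type and `parse_result` function are hypothetical names, and the set of recognized qualifiers is illustrative rather than exhaustive.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedResult:
    qualifier: str          # "", "<", ">", "<=", ">=", "ND", or "NT"
    value: Optional[float]  # None when no number was reported


_NUMERIC = re.compile(r"^(<=|>=|<|>|≤|≥)?\s*(\d+(?:\.\d+)?)$")


def parse_result(text):
    """Parse a result cell without discarding the qualifier in front of the number."""
    cleaned = text.strip()
    upper = cleaned.upper()
    if upper in {"ND", "NOT DETECTED"}:
        return ParsedResult("ND", None)
    if upper in {"NT", "NOT TESTED"}:
        return ParsedResult("NT", None)
    match = _NUMERIC.match(cleaned)
    if match is None:
        # Anything unrecognized is surfaced for review instead of guessed at.
        raise ValueError(f"unrecognized result: {text!r}")
    symbol = match.group(1)
    qualifier = {"≤": "<=", "≥": ">=", None: ""}.get(symbol, symbol)
    return ParsedResult(qualifier, float(match.group(2)))


censored = parse_result("<0.01")
measured = parse_result("0.01")
```

Here `censored` and `measured` carry the same number but different qualifiers, so downstream logic can tell a reporting limit from a measured value.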
Method 4 — Purpose-built CoA extraction with verification
A CoA-specific tool reads every analyte, result, unit, and limit, then does what the others skip: it recomputes pass/fail from the numbers and cross-checks that against the verdict the lab printed, flagging mismatches, missing limits, ambiguous units, and low-confidence scans. This is the difference between “I have the data” and “I trust the data.”
How do I get data out of a scanned CoA?
Scanned and photographed CoAs can be converted, but only through a method with an OCR layer (Methods 3 and 4 above, or Method 2 with an added OCR step). The risk with scans is higher: faint print, skewed pages, and stamp overlaps cause silent misreads. The safeguard is confidence scoring: any low-confidence field should be flagged for human review rather than accepted at face value.
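Confidence-based routing can be sketched in a few lines. This example assumes each extracted field is a dict with a `confidence` score scaled from 0 to 1; that field layout and the 0.90 threshold are hypothetical and would need tuning against your own scans and OCR engine.

```python
REVIEW_THRESHOLD = 0.90  # assumed cut-off, not a recommended value


def split_by_confidence(fields, threshold=REVIEW_THRESHOLD):
    """Separate fields that can be accepted from those a person should check."""
    accepted, needs_review = [], []
    for field in fields:
        target = accepted if field["confidence"] >= threshold else needs_review
        target.append(field)
    return accepted, needs_review


fields = [
    {"name": "result", "text": "0.12", "confidence": 0.98},
    {"name": "limit", "text": "NMT 0.5", "confidence": 0.61},
]
accepted, needs_review = split_by_confidence(fields)
```

In this example the faintly printed limit lands in `needs_review`, so it is never written to the spreadsheet unchecked.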
The part everyone skips: verifying the extraction
Extraction tells you what the document says. Verification tells you whether what it says is right. Four checks matter most.
1. Does the lab’s verdict actually match the numbers?
A CoA prints “Pass” — but labs apply their own tolerance, and verdicts get copied across templates. A potency result of 480 mg against a 500 mg minimum is a fail, even if the row says “pass.” Recomputing the verdict from the result and the limit catches these.
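Recomputing the verdict is a small comparison once the result, the limit, and the limit's direction are known. The sketch below uses hypothetical function names and assumes the result and limit are already in the same unit.

```python
def recompute_verdict(result, limit, kind):
    """kind is 'min' (result must be >= limit) or 'max' (result must be <= limit)."""
    if kind == "min":
        return "pass" if result >= limit else "fail"
    if kind == "max":
        return "pass" if result <= limit else "fail"
    raise ValueError(f"unknown limit kind: {kind!r}")


def check_row(result, limit, kind, printed_verdict):
    """Compare the recomputed verdict with the one printed on the CoA."""
    computed = recompute_verdict(result, limit, kind)
    printed = printed_verdict.strip().lower()
    return {"computed": computed, "printed": printed, "mismatch": computed != printed}


# The potency example from the text: 480 mg against a 500 mg minimum, printed "Pass".
row = check_row(480, 500, "min", "Pass")
```

The returned `mismatch` flag is true for this row, which is the signal to hold the lot and ask the lab about its tolerance.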
2. Are the units what you think they are?
This is where false pass/fails are born. Use this cheat-sheet:
| You see | It means | Watch out for |
|---|---|---|
| mg/kg | milligrams per kilogram | Exactly equal to ppm (mass basis) |
| ppm | parts per million | 1 ppm = 1 mg/kg; 1% = 10,000 ppm |
| mg/L | milligrams per litre | ≈ ppm only in water (density ≈ 1) |
| % | percent | 1% = 10,000 ppm = 10,000 mg/kg |
| µg/g | micrograms per gram | Equal to ppm; 1 µg/g = 1 mg/kg |
If a result is reported in mg/L but your spec is in mg/kg, you cannot compare them without density — and a tool (or a person) that treats them as equal will pass material it should fail.
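The cheat-sheet translates directly into a conversion step that refuses to guess. This is a minimal sketch: the function name is hypothetical, ppm is assumed to be on a mass basis, and density is expected in kg/L.

```python
# Factors to mg/kg, taken from the cheat-sheet above (ppm assumed mass-based).
TO_MG_PER_KG = {
    "mg/kg": 1.0,
    "ppm": 1.0,
    "µg/g": 1.0,
    "ug/g": 1.0,
    "%": 10_000.0,
}


def to_mg_per_kg(value, unit, density_kg_per_l=None):
    """Convert a result to mg/kg; mg/L is refused unless a density is supplied."""
    key = unit.strip().lower()
    if key in TO_MG_PER_KG:
        return value * TO_MG_PER_KG[key]
    if key == "mg/l":
        if density_kg_per_l is None:
            raise ValueError("mg/L cannot be compared to mg/kg without density")
        # (mg/L) / (kg/L) = mg/kg
        return value / density_kg_per_l
    raise ValueError(f"unknown unit: {unit!r}")


percent_as_mg_per_kg = to_mg_per_kg(0.5, "%")           # 0.5% of the material
syrup_as_mg_per_kg = to_mg_per_kg(2.0, "mg/L", 1.25)    # liquid with density 1.25 kg/L
```

For a liquid denser than water, the mg/kg figure comes out lower than the mg/L figure, so treating the two as equal misstates the result by the density factor.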
3. Do you understand the notation?
CoA shorthand changes the meaning of a result. The essentials:
| Notation | Meaning | Effect on verdict |
|---|---|---|
| NMT (≤) | Not more than | Result must be at or below the limit |
| NLT (≥) | Not less than | Result must be at or above the limit |
| ND | Not detected (below LOD) | Treated as non-detect, not zero |
| <LOQ | Below limit of quantitation | Detected but not reliably quantifiable |
| LOD / LOQ | Limit of detection / quantitation | LOD ≠ LOQ; LOD is lower |
| NT | Not tested | A gap, not a pass |
A common potency spec reads “NLT 90.0% and NMT 110.0% of label claim.” Misreading NMT as NLT silently inverts the test.
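A specification written with NLT and NMT can be turned into explicit lower and upper bounds before any comparison is made. The sketch below handles only the two keywords from the table; the function names and the regular expression are illustrative assumptions, and real specs will include wording it does not cover.

```python
import re

_BOUND = re.compile(r"\b(NLT|NMT)\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_spec(text):
    """Return (lower, upper) bounds from NLT/NMT wording; either may be None."""
    lower = upper = None
    for keyword, number in _BOUND.findall(text):
        if keyword.upper() == "NLT":
            lower = float(number)   # not less than -> lower bound
        else:
            upper = float(number)   # not more than -> upper bound
    if lower is None and upper is None:
        raise ValueError(f"no NLT/NMT bound found in {text!r}")
    return lower, upper


def within_spec(result, lower, upper):
    """True when the result sits inside whichever bounds are present."""
    return (lower is None or result >= lower) and (upper is None or result <= upper)


lower, upper = parse_spec("NLT 90.0% and NMT 110.0% of label claim")
in_range = within_spec(98.5, lower, upper)
too_high = within_spec(112.0, lower, upper)
```

Keeping the bounds as a pair makes the direction of each limit explicit, which is what prevents an NMT from being applied as an NLT.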
4. Are the limits even there — and are they yours?
“Typical” or “Conforms” with no number isn’t a true CoA result; it’s data-sheet language. “Report only” heavy metals have no pass/fail basis. And a supplier’s default spec may be looser than your internal spec, so verify against your limits, not just the vendor’s.
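A quick triage of the limit column can separate rows that have a usable number from rows that only look complete. This is a rough sketch: the phrase list and status labels are hypothetical, and a digit in the cell is treated as a sign of a numeric limit, not as proof that it parsed correctly.

```python
NON_NUMERIC_LIMITS = {"typical", "conforms", "report only"}  # illustrative phrases


def limit_status(limit_text):
    """Classify a limit cell as numeric, missing, or lacking a pass/fail basis."""
    cleaned = (limit_text or "").strip().lower()
    if not cleaned:
        return "missing"
    if cleaned in NON_NUMERIC_LIMITS:
        return "no pass/fail basis"
    if any(ch.isdigit() for ch in cleaned):
        return "numeric"
    return "review"


statuses = [limit_status(cell) for cell in ["NMT 0.5 mg/kg", "Conforms", "", "Report only"]]
```

Rows that come back as anything other than "numeric" need a limit from your own specification before a verdict can be computed.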
A worked example: microbial and heavy-metal rows
Two areas trip up extraction most often.
Microbial. Pathogens are reported as presence/absence, e.g., “Salmonella — absent in 25 g.” That’s a 2-class sampling plan (c = 0): any presence is an automatic fail. It is not a count, so a tool that expects a number can mishandle it.
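Presence/absence rows can be handled as text rather than forced through a numeric parser. The sketch below is an assumption-laden illustration: the vocabulary of accepted words is hypothetical, and anything it does not recognize is routed to review instead of being guessed.

```python
import re

_QUALITATIVE = re.compile(
    r"^(absent|negative|not detected|present|positive|detected)\b",
    re.IGNORECASE,
)
_ABSENT_WORDS = {"absent", "negative", "not detected"}


def pathogen_verdict(reported):
    """Zero-tolerance check: any reported presence fails, unknown text goes to review."""
    match = _QUALITATIVE.match(reported.strip())
    if match is None:
        return "review"
    return "pass" if match.group(1).lower() in _ABSENT_WORDS else "fail"


salmonella = pathogen_verdict("Absent in 25 g")
contaminated = pathogen_verdict("Present in 25 g")
unexpected = pathogen_verdict("<10 cfu/g")
```

A count-style value in a pathogen row, such as the last example, is returned as "review" because it does not fit the presence/absence plan described above.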
Heavy metals. Limits differ by framework. Verified current figures:
| Element | USP <232> oral PDE (drugs) | USP <2232> (dietary supplements) | Prop 65 lead (reproductive MADL) |
|---|---|---|---|
| Lead (Pb) | 5 µg/day | 10 µg/day | 0.5 µg/day |
| Inorganic arsenic | 15 µg/day | 15 µg/day | — |
| Cadmium | 5 µg/day | 5 µg/day | — |
| Mercury | 30 µg/day (inorganic) | 15 µg/day (total) | — |
Note the lead limit is genuinely different between drugs (5 µg/day) and supplements (10 µg/day) by design — and California’s Prop 65 reproductive limit (0.5 µg/day, set by OEHHA under Title 27 CCR §25805) is far stricter, which is why USP-compliant products can still carry a Prop 65 warning. If your extraction grabs the wrong framework’s limit, your pass/fail is wrong.
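Because these limits are expressed per day, a CoA concentration has to be multiplied by the daily intake before it can be compared. The sketch below reuses the lead figures from the table above; the function names, the example concentration, and the 5 g daily serving are hypothetical, and a simple "at or below the limit" comparison is assumed for all three frameworks.

```python
# Lead limits in µg/day, copied from the table above.
LEAD_LIMITS_UG_PER_DAY = {
    "USP <232> oral PDE": 5.0,
    "USP <2232>": 10.0,
    "Prop 65 MADL": 0.5,
}


def daily_exposure_ug(concentration_ug_per_g, daily_intake_g):
    """µg/g multiplied by g/day gives µg/day."""
    return concentration_ug_per_g * daily_intake_g


def lead_verdicts(concentration_ug_per_g, daily_intake_g):
    """Compare one lead result against each framework's daily limit."""
    exposure = daily_exposure_ug(concentration_ug_per_g, daily_intake_g)
    return {
        framework: "pass" if exposure <= limit else "fail"
        for framework, limit in LEAD_LIMITS_UG_PER_DAY.items()
    }


# Hypothetical lot: 0.2 µg/g lead, 5 g daily serving, so 1.0 µg/day.
verdicts = lead_verdicts(0.2, 5.0)
```

The same lot passes both USP figures and exceeds the Prop 65 figure, which is why the framework has to be recorded alongside the limit.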
How Coatables fits in
Coatables converts CoA PDFs, digital or scanned, into clean, verified Excel or JSON. It re-reads every analyte, result, unit, and limit; recomputes pass/fail from the numbers; and cross-checks that against the verdict the lab printed, flagging mismatches, missing limits, ambiguous units, and low-confidence scans. There’s no login, and pricing is by pay-per-use credit packs, so you can verify a single CoA or a stack of them without onboarding a platform. The point isn’t just to save the re-keying time; it’s to make sure a “pass” is really a pass before it reaches your spreadsheet, LIMS, or ERP.
Try it on a sample certificate →
Frequently asked questions
How do I convert a certificate of analysis to Excel for free? You can re-key it manually or use OCR/table-extraction scripts at no software cost, but both carry accuracy risk and don’t verify the result. Pay-per-use tools avoid subscriptions while adding verification.
Can I extract data from a scanned CoA? Yes, with a method that includes OCR. Scanned CoAs carry higher misread risk, so look for confidence scoring that flags uncertain fields for review.
Why does my CoA say “pass” when a result is over the limit? Because the verdict may reflect the lab’s own tolerance or a copied template, or the wrong limit/unit was applied. Recomputing pass/fail from the raw number and the correct limit reveals the conflict.
Is mg/kg the same as ppm? On a mass basis, yes — 1 mg/kg = 1 ppm. But mg/L only approximates ppm in water; comparing mg/L to a mg/kg spec without density is a common error.
What does NMT mean on a CoA? NMT means “not more than” — the result must be at or below the limit. NLT means “not less than.” Confusing the two inverts the pass/fail test.
Do I have to test every incoming lot, or can I accept the supplier’s CoA? It depends on your sector and risk. For dietary supplements, 21 CFR 111.75(a)(1)(i) requires you to “conduct at least one appropriate test or examination to verify the identity of any component that is a dietary ingredient” on every lot — a supplier CoA cannot satisfy that identity requirement. Other attributes can often rely on a verified supplier CoA once the supplier’s reliability is established.