PDF to Excel troubleshooting: scanned files, passwords, numbers stuck as text
Fixes for the common PDF-to-Excel problems — a scanned PDF with no text layer, a password-protected file, numbers or dates that stayed as text, and output that doesn't match the original layout.
Common problems when converting a PDF to Excel at /app/pdf-to-excel, and how to fix each one. Everything below happens in your browser — the file is never uploaded, so none of these fixes involve sending it anywhere.
"This PDF has no text layer — it looks scanned"
Cause: the converter reads the PDF's embedded text. A scanned or photographed document is just images of text, so there's nothing to extract.
Fix:
- Click "OCR & convert to Excel" in the error notice. The text is recognized on your device (Tesseract, running locally), page by page, with a progress bar; the conversion then continues automatically from the recognized text.
- The built-in pass recognizes English. If the document is in another language, open the full OCR tool, pick the language there, download the searchable PDF, and convert that instead.
- If you see "OCR didn't find any readable text on this PDF", the scan is likely too low-resolution, too faint, or in a language other than the one selected. Try the full OCR tool with the right language, or re-scan at a higher quality.
Note that OCR output is a best-effort reading of an image: expect to proofread numbers before relying on totals.
"This PDF is password-protected"
Cause: an encrypted PDF's text can't be read until it's decrypted.
Fix:
- Open Unlock and remove the password (you'll need to know it — this is the legitimate-access tool, not a cracker). Unlocking also runs entirely in your browser.
- Convert the unlocked copy.
Numbers, zip codes or IDs stayed as text
Cause: the converter is deliberately conservative about typing cells. Values that look like codes rather than quantities are kept as text so they're never silently corrupted:
- leading-zero values like "0123" or a 02134 zip code (converting would drop the zero),
- digit runs longer than 15 digits (phone, card and account numbers; Excel keeps only 15 significant digits, so the trailing digits would be lost),
- anything with stray non-numeric characters.
Fix: this behavior is usually what you want for IDs. If a column really is numeric, convert it in Excel: select the column, then run Data → Text to Columns and click Finish, or multiply the values by 1, to coerce them to numbers.
What does become real numbers automatically: plain and thousands-grouped numbers (1,234.56), percents (12.3%), currency with a $, €, £ or ¥ symbol, and accounting negatives like (1,200).
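The typing rules above can be sketched in a few lines of Python. This is an illustration of the behavior described here, not the converter's actual code: the function name type_cell, the regular expression, and the handling of a leading minus sign are all assumptions.

```python
import re
from typing import Union

# Hypothetical pattern: plain numbers or thousands-grouped numbers,
# each with an optional decimal part.
_NUMBER = re.compile(r"\d{1,3}(,\d{3})+(\.\d+)?|\d+(\.\d+)?")
_CURRENCY = "$€£¥"


def type_cell(raw: str) -> Union[float, str]:
    """Return a float for quantity-like values, otherwise the original text."""
    body = raw.strip()
    negative = False
    percent = False

    # Accounting negative: (1,200) means -1200.
    if body.startswith("(") and body.endswith(")"):
        negative = True
        body = body[1:-1].strip()
    elif body.startswith("-"):
        # Assumption: a leading minus sign is also accepted.
        negative = True
        body = body[1:].strip()

    # A single leading currency symbol is allowed.
    if body and body[0] in _CURRENCY:
        body = body[1:].strip()

    # A trailing percent sign marks a percentage.
    if body.endswith("%"):
        percent = True
        body = body[:-1].strip()

    # Anything with stray non-numeric characters stays as text.
    if not _NUMBER.fullmatch(body):
        return raw

    digits = body.replace(",", "")
    integer_part = digits.split(".")[0]

    # A leading zero (as in 0123 or 02134) looks like a code, not a quantity.
    if len(integer_part) > 1 and integer_part.startswith("0"):
        return raw

    # More than 15 digits would lose precision in Excel.
    if len(digits.replace(".", "")) > 15:
        return raw

    value = float(digits)
    if percent:
        value /= 100  # Excel stores 12.3% as roughly 0.123
    return -value if negative else value
```

With this sketch, type_cell("1,234.56") and type_cell("(1,200)") come back as numbers, while type_cell("02134") and a 16-digit account number come back unchanged as text.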
Dates didn't become date cells
Cause: only common, unambiguous spellings are recognized, and a year is required: ISO (2026-06-05), US numeric with slashes or dashes (6/5/2026, 06-05-26), and month names (Jan 5, 2026 or 5 Jan 2026). Numeric dates are read in US month/day order, and dot-separated dates (1.2.2026) are skipped on purpose because they collide with version numbers.
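Fix: dates in other formats stay as text. Either sort them as text, or convert the column in Excel with Data → Text to Columns, choosing Date and the matching day/month/year order in the last step.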
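As a rough illustration of the recognition rules in this section, the Python sketch below accepts the listed spellings and rejects everything else. It is not the tool's real parser: parse_date, the regular expressions, and the choice to read two-digit years as 20xx are assumptions made for the example.

```python
import re
from datetime import date
from typing import Optional

_MONTH_NAMES = [
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
]

_ISO = re.compile(r"(\d{4})-(\d{1,2})-(\d{1,2})")
# The same separator (slash or dash) must be used twice; a year is required.
_US = re.compile(r"(\d{1,2})([/-])(\d{1,2})\2(\d{2}|\d{4})")
_NAME_FIRST = re.compile(r"([A-Za-z]{3,9})\.? (\d{1,2}),? (\d{4})")
_DAY_FIRST = re.compile(r"(\d{1,2}) ([A-Za-z]{3,9})\.?,? (\d{4})")


def _month_number(name: str) -> Optional[int]:
    """Map 'Jan' or 'January' to 1, and so on; None for anything else."""
    key = name.lower()
    for number, full in enumerate(_MONTH_NAMES, start=1):
        if full.startswith(key):
            return number
    return None


def _build(year: int, month: Optional[int], day: int) -> Optional[date]:
    """Return a date, or None when the parts do not form a real calendar day."""
    if month is None:
        return None
    try:
        return date(year, month, day)
    except ValueError:
        return None


def parse_date(raw: str) -> Optional[date]:
    text = raw.strip()
    if m := _ISO.fullmatch(text):
        return _build(int(m[1]), int(m[2]), int(m[3]))
    if m := _US.fullmatch(text):
        year = int(m[4])
        if year < 100:
            year += 2000  # assumption: two-digit years mean 20xx
        # US order: the first number is the month, the second is the day.
        return _build(year, int(m[1]), int(m[3]))
    if m := _NAME_FIRST.fullmatch(text):
        return _build(int(m[3]), _month_number(m[1]), int(m[2]))
    if m := _DAY_FIRST.fullmatch(text):
        return _build(int(m[3]), _month_number(m[2]), int(m[1]))
    # Dot-separated values such as 1.2.2026 fall through and stay as text.
    return None
```

Because numeric dates are read month first, a value like 13/5/2026 has no valid month and is left as text rather than being silently reinterpreted as 13 May.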
The spreadsheet doesn't look like the PDF
Cause: this tool recovers tabular data, not visual layout. Each PDF page with text becomes one worksheet ("Page 1", "Page 2", …). Bordered tables are rebuilt as proper grids with merged cells and a styled header row; everything else is placed into rows and columns by position. Prose- or image-heavy pages will read as loose rows of text, and images are not carried over at all.
Fix:
- If you need an editable document that keeps images, paragraphs and reading order, use PDF to Word instead.
- For tables drawn without ruled borders, columns are inferred from the horizontal gaps between text — usually right, but a ragged column can land in a neighbor. Tidy stray cells in Excel afterwards.
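One simple way to infer columns from horizontal gaps is to merge the x-extents of every text span on the page and treat each wide empty band between them as a column boundary. The Python sketch below shows that idea only; the span layout (x_left, x_right, text), the function names, and the min_gap threshold are hypothetical, and the converter may work differently.

```python
from typing import List, Tuple

# Hypothetical span shape: (x_left, x_right, text) in page units.
Span = Tuple[float, float, str]


def find_column_edges(rows: List[List[Span]], min_gap: float = 8.0) -> List[float]:
    """Return the x positions that split the page into columns."""
    # Merge the horizontal extents of all spans into covered intervals.
    extents = sorted((x0, x1) for row in rows for x0, x1, _ in row)
    merged: List[List[float]] = []
    for x0, x1 in extents:
        if merged and x0 <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], x1)
        else:
            merged.append([x0, x1])

    # A gutter is an empty band between two covered intervals.
    edges: List[float] = []
    for (_, left_end), (right_start, _) in zip(merged, merged[1:]):
        if right_start - left_end >= min_gap:
            edges.append((left_end + right_start) / 2)
    return edges


def assign_columns(row: List[Span], edges: List[float]) -> List[str]:
    """Place each span into the column that contains its horizontal center."""
    cells = [""] * (len(edges) + 1)
    for x0, x1, text in row:
        center = (x0 + x1) / 2
        col = sum(center > edge for edge in edges)
        cells[col] = (cells[col] + " " + text).strip()
    return cells


rows = [
    [(10, 40, "Item"), (100, 130, "Qty"), (200, 240, "Price")],
    [(10, 60, "Widget"), (100, 110, "2"), (200, 230, "9.50")],
]
edges = find_column_edges(rows)
table = [assign_columns(row, edges) for row in rows]
```

The sketch also shows why ragged columns go wrong: if one cell in the first column ran from x = 10 to x = 95, the gutter before "Qty" would shrink below min_gap, the boundary would vanish, and the two columns would merge.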
"Couldn't convert this PDF to Excel"
Cause: the file may be corrupted or in an unsupported format.
Fix:
- Confirm the file opens in another PDF viewer. If it doesn't, re-export or re-download it from the source.
- If it opens fine elsewhere but still fails here, report it to support (see "Still stuck?" below). That's a case the team wants to see.
Conversion is slow on a long document
Cause: every step runs on your device — that's the privacy guarantee, but it means a long document, and especially the OCR fallback, takes real local compute. OCR works page by page.
Fix: leave the tab in the foreground and let it finish; the progress bar shows which page OCR is on, and Cancel stops it cleanly. If you only need a few pages, split them out first with Pages and convert the extract.
Still stuck?
If none of these match what you're seeing, contact support and describe what happened — since your file never leaves your browser, support can't see it, so include the file's size, page count, and whether the OCR fallback was involved.