PDF to text troubleshooting: empty output, passwords, and garbled text

Fixes for the common PDF to text problems: password-protected files, scanned PDFs with no text layer, garbled or out-of-order output, clipboard copy failures, truncated previews, and slow large files.

PDF to text runs entirely in your browser — your PDF is never uploaded — so when something goes wrong it almost always comes down to the file itself. Here are the common problems and how to fix them.

"This PDF is password-protected, so its text can't be read"

Cause: The PDF is encrypted with an open password (one that is required just to open the file), so the extractor can't read its pages at all. The error message links straight to the fix: "Remove the password first".

  1. Open Unlock.
  2. Enter the PDF's password to produce an unlocked copy. This also happens locally — the password and the file both stay on your device.
  3. Bring the unlocked copy back to PDF to text and click Extract text again.
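
If you'd like to check a file yourself before loading it, here is a rough sketch in Python. It is not the tool's own code. It relies on the fact that an encrypted PDF points to its encryption dictionary through an /Encrypt entry in the trailer. The function name looks_encrypted and the sample bytes are hypothetical, and a plain byte search is only a heuristic: it can give a false positive, and it also flags PDFs that restrict printing or copying but open without a password.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic check: encrypted PDFs reference an /Encrypt entry in their trailer.

    This scans the raw bytes rather than parsing the file, so treat a True
    result as "probably encrypted", not as proof.
    """
    return b"/Encrypt" in pdf_bytes


# Hypothetical, heavily shortened trailers used only to exercise the check.
sample_plain = b"%PDF-1.7\ntrailer\n<< /Size 4 /Root 1 0 R >>\n%%EOF\n"
sample_locked = b"%PDF-1.7\ntrailer\n<< /Size 5 /Root 1 0 R /Encrypt 4 0 R >>\n%%EOF\n"

print(looks_encrypted(sample_plain))   # False
print(looks_encrypted(sample_locked))  # True
```

To run it on a real file, read the file in binary mode, for example open(path, "rb").read(), and pass the result to the function.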

"No text layer found — this looks like a scanned PDF"

Cause: The extraction finished but found zero characters. The pages are images — a scan or photo of a document — so there is no embedded text to pull out. Reading words out of pixels needs OCR, which is a different job.

  1. Open Searchable (OCR) — the notice in the tool links there directly.
  2. Run OCR on the file. It recognizes the words on each page and adds an invisible text layer, all on your device.
  3. Take the searchable copy back to PDF to text and extract again. OCR accuracy depends on scan quality, so skim the output for misread words.

The text comes out garbled, out of order, or full of strange characters

Cause: The extractor follows the order in which text is stored inside the PDF, which is usually — but not always — the order you read it on screen. Multi-column layouts, text boxes, and generators that draw text out of sequence can shuffle lines. Separately, some PDFs embed fonts without proper character-mapping information, so what's stored simply isn't the readable text you see rendered.

  1. Try PDF to Word instead — its layout-aware rebuild often handles columns and tables better than a flat text dump.
  2. If specific characters come out as boxes or gibberish in every tool, the font lacks a usable character map. Running the file through Searchable (OCR) re-reads the page visually, which can recover text no extractor can.
  3. For mostly-good output with a few shuffled sections, downloading the .txt and reordering those sections by hand is often the fastest path.
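
To illustrate why stored order and reading order can differ, here is a minimal Python sketch. It assumes each piece of text comes with a position on the page; the Fragment type, the reading_order function, and the sample coordinates are all hypothetical and are not how this tool works internally. The sketch sorts fragments top to bottom, then left to right.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    x: float   # left edge of the text, in PDF points
    y: float   # baseline; PDF y coordinates grow upward from the page bottom
    text: str


def reading_order(fragments: list[Fragment], line_tolerance: float = 2.0) -> list[str]:
    """Group fragments into lines by baseline, then read each line left to right."""
    lines: list[list[Fragment]] = []
    # Highest y first, because the top of the page has the largest y value.
    for frag in sorted(fragments, key=lambda f: -f.y):
        if lines and abs(lines[-1][0].y - frag.y) <= line_tolerance:
            lines[-1].append(frag)   # same baseline: same visual line
        else:
            lines.append([frag])     # new baseline: start a new line
    return [" ".join(f.text for f in sorted(line, key=lambda f: f.x)) for line in lines]


# Fragments in the order a generator might have stored them.
stored = [
    Fragment(72, 700, "Invoice"),
    Fragment(300, 680, "42.00"),
    Fragment(72, 680, "Total:"),
    Fragment(200, 700, "#1001"),
]

print([f.text for f in stored])  # ['Invoice', '42.00', 'Total:', '#1001']
print(reading_order(stored))     # ['Invoice #1001', 'Total: 42.00']
```

A simple sort like this only suits single-column pages. On a multi-column page it would interleave the columns line by line, which is why a layout-aware tool tends to do better there.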

"Couldn't copy — try the download instead."

Cause: Your browser refused the clipboard write. Browsers gate clipboard access behind permissions, and some browsers, strict privacy settings, or non-secure (non-HTTPS) contexts block it outright.

  1. Click Download .txt instead — it always works and contains the same full text.
  2. If you want the clipboard route, check the site permissions in your browser's address bar and allow clipboard access, then click Copy text again.
  3. Alternatively, select the text in the preview pane and copy it manually — though remember the preview only shows the first 5,000 characters.

The preview stops mid-document

Cause: Not a bug. The on-screen preview is deliberately capped at 5,000 characters and appends "… (preview truncated — download the .txt for the full text)" when your document is longer.

  1. Click Download .txt or Copy text — both always carry the complete extracted text, regardless of the preview cap. The character count above the buttons tells you the true total.
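
The cap can be pictured with a short Python sketch. The 5,000-character limit and the notice text come from the description above; the names build_preview, PREVIEW_LIMIT, and TRUNCATION_NOTE are hypothetical, and the exact cut-off behavior is an assumption rather than the tool's actual code.

```python
PREVIEW_LIMIT = 5_000
TRUNCATION_NOTE = "… (preview truncated — download the .txt for the full text)"


def build_preview(full_text: str, limit: int = PREVIEW_LIMIT) -> str:
    """Return the text shown on screen; the full text itself is left untouched."""
    if len(full_text) <= limit:
        return full_text
    # Only the preview is shortened. Download and copy would still use full_text.
    return full_text[:limit] + TRUNCATION_NOTE


short_doc = "hello"
long_doc = "x" * 12_000

print(build_preview(short_doc) == short_doc)                # True
print(len(long_doc))                                        # 12000
print(build_preview(long_doc).endswith(TRUNCATION_NOTE))    # True
```

The point of the sketch is that the preview is a separate, shortened copy: the full extracted text keeps its original length no matter what the preview shows.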

A very large PDF is slow or the tab feels stuck

Cause: Every page is processed on your own machine; that's the privacy trade-off. There's no server farm doing the work, so a PDF with hundreds of pages takes real time, especially on a low-power device.

  1. Leave the tab in the foreground and wait — the button shows Reading… while extraction is running.
  2. Close other heavy tabs to free memory.
  3. If the file is enormous, consider splitting it first with Pages and extracting the parts you need.

"Couldn't read text from this PDF — it may be corrupted or in an unsupported format"

Cause: The file couldn't be parsed as a PDF at all — a truncated download, a corrupted file, or a file that isn't really a PDF despite the extension.

  1. Re-download or re-export the file from its original source and try again.
  2. Check that it opens in another PDF viewer. If it opens there, print it to PDF from that viewer to produce a clean rebuild, then extract from that copy.
  3. If it won't open anywhere, the file is damaged at the source — no extractor, local or cloud, can read it.
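
A quick way to tell these cases apart yourself is to look at the file's first and last bytes. A PDF starts with a %PDF- header and ends with a %%EOF marker. The Python sketch below is a rough check, not the tool's own logic: the name quick_check and the 1,024-byte search windows are assumptions, and passing the check does not guarantee the file is healthy.

```python
def quick_check(data: bytes) -> str:
    """Rough structural check based only on the start and end of the file."""
    # Look near the start for the header; some files have stray bytes before it.
    if b"%PDF-" not in data[:1024]:
        return "not a PDF"
    # A missing end-of-file marker often means the download was cut short.
    if b"%%EOF" not in data[-1024:]:
        return "possibly truncated"
    return "looks structurally plausible"


print(quick_check(b"<html><body>Not found</body></html>"))  # not a PDF
print(quick_check(b"%PDF-1.7\n1 0 obj\n<< >>\n"))           # possibly truncated
print(quick_check(b"%PDF-1.7\n1 0 obj\n<< >>\n%%EOF\n"))    # looks structurally plausible
```

To run it on a real file, read the file in binary mode, for example open(path, "rb").read(), and pass the result to the function. A "not a PDF" result usually means the download saved an error page or a different file type under a .pdf name.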

Still stuck?

If none of these match what you're seeing, contact support and describe what happens. Because your PDF never leaves your browser, we can't see your document, so the exact error text on screen is the most useful thing to include.
