How to scan a PDF for fakes and AI-generated content

Step-by-step guide to scanning any PDF for metadata anomalies, suspicious producer fingerprints, and AI-generation tells — free, entirely in your browser, with no upload.

Scan checks any PDF for metadata anomalies, suspicious producer fingerprints, and statistical signs of AI-generated content. The analysis runs entirely in your browser — the file is never uploaded anywhere.

Before you start

  • Scan is free and needs no account; no paid plan is required to use this tool.
  • The drop zone accepts files up to 100 MB. For anything larger, split out the pages you need with Pages first.
  • Password-protected PDFs can't be analyzed: the metadata Scan reads lives inside an encrypted stream. Remove the password first — Unlock can do this in your browser if you know the password — then re-scan.
  • Know what you're getting: these checks are heuristics, not cryptographic proof. A motivated forger can spoof every field Scan reads by re-saving the file through Adobe Acrobat. Scan catches lazy fakes — AI-generated lease agreements, photoshopped pay stubs, recycled bank statements — and gives you a structured report to make a human judgment call.

Steps

  1. Open Scan.
  2. Drag your PDF onto the "Drop a PDF here" zone, or click "Choose PDF" and pick a file. An "Analyzing…" status appears while the report is computed, usually for a second or two.
  3. Read the verdict banner at the top of the report. It shows one of three states plus a Suspicion score from 0 to 100 — higher means more suspicious. The score is a weighted tally of the signals below, not a probability.
  4. Review the signal rows under the banner. Each row has a severity icon, a one-line explanation of why the signal matters, and the raw value (for example, the exact producer string) when there is one.
  5. Click "Scan another" to reset and check a different file.

Verdict banners and what they mean:

  • No automated red flags: No warning- or high-level signals fired. The document was probably authored by the tool its metadata claims; this does not rule out tampering by a motivated forger.
  • Suspicious — worth a closer look: At least one warning-level signal. Consider asking the source for the original file, the authoring tool, or signed provenance.
  • Likely fake or AI-generated: At least one strong tell, such as a known AI-generator name in the producer string. Treat the contents as unverified.

What the signals check

  • Producer fingerprint — the PDF's /Producer string is classified in order of precedence. Known AI and LLM tool names (ChatPDF, ChatGPT, GPT, Claude, Gemini, Copilot, Perplexity) are a high-severity tell. Generic AI-vendor fragments (OpenAI, Anthropic, Mistral, Llama, DeepSeek, Groq, xAI, phrases like "AI-generated") also score high, even when wrapped around a familiar tool name. Known online re-processors (iLovePDF, Smallpdf, PDF24, Sejda, PDF2Go and similar) raise a warning, because the document has been re-rendered at least once and the original layout or signatures may have been altered. Well-known authoring tools (Microsoft, Adobe Acrobat, LiveCycle Designer, pdfTeX, Ghostscript, LibreOffice, macOS Quartz and others) count as a good sign. Anything unrecognized is shown as informational, not suspicious.
  • Missing metadata — a missing producer string is a warning, since real PDFs almost always carry one. If the metadata simply lives in the modern XMP stream instead of the legacy /Info dictionary — common for InDesign and recent Acrobat exports — Scan notes that as informational instead.
  • Creation and modification dates — a missing creation date is a warning, and a modification date more than a year after creation is a warning when the producer is unknown. Both downgrade to informational when the producer is a known authoring tool, because old templates re-saved years later are normal: the IRS W-9 template dates back to 1996 and is re-issued every year.
  • Document structure — page count, plus Creator, Author, and Subject metadata for cross-reference. Fillable AcroForm form fields are a good sign: AI text generators rarely produce them.
  • Embedded images — Scan extracts embedded JPEG images (the first 6 are analyzed) and measures color diversity and pixel-noise smoothness. Real photographs carry sensor grain; AI-generated images cluster at unnaturally low color counts and unnaturally smooth gradients. An image must trip both tells at once — and in multi-image documents, at least half of the analyzed images must agree — before the report raises it, which keeps false alarms low on photo-heavy real documents.
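
The producer precedence and the date rules described above can be sketched in Python. This is an illustration of the described behavior, not Scan's actual code. The function names, the case-insensitive substring matching, the abbreviated name lists, reading "more than a year" as 365 days, and keeping date findings at warning level for every producer class other than a known authoring tool are all assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Optional

# Name lists abbreviated from the text above; the matching strategy is an assumption.
AI_TOOL_NAMES = ("chatpdf", "chatgpt", "gpt", "claude", "gemini", "copilot", "perplexity")
AI_VENDOR_FRAGMENTS = ("openai", "anthropic", "mistral", "llama", "deepseek", "groq", "xai", "ai-generated")
REPROCESSORS = ("ilovepdf", "smallpdf", "pdf24", "sejda", "pdf2go")
AUTHORING_TOOLS = ("microsoft", "adobe acrobat", "livecycle designer", "pdftex", "ghostscript", "libreoffice")

# Tiers are checked top to bottom, so an AI name wins over a familiar tool name.
PRECEDENCE = (
    (AI_TOOL_NAMES, "high"),
    (AI_VENDOR_FRAGMENTS, "high"),
    (REPROCESSORS, "warning"),
    (AUTHORING_TOOLS, "good"),
)


def classify_producer(producer: Optional[str]) -> str:
    """Return 'high', 'warning', 'good', or 'info' for a /Producer string."""
    if producer is None or not producer.strip():
        return "warning"  # a missing producer string is itself a warning
    text = producer.lower()
    for fragments, severity in PRECEDENCE:
        if any(fragment in text for fragment in fragments):
            return severity
    return "info"  # unrecognized producers are informational, not suspicious


def date_severities(
    created: Optional[datetime],
    modified: Optional[datetime],
    producer_severity: str,
) -> List[str]:
    """Return the severity of each date finding (an empty list means none)."""
    # Known authoring tools downgrade date findings to informational.
    flagged = "info" if producer_severity == "good" else "warning"
    if created is None:
        return [flagged]
    if modified is not None and modified - created > timedelta(days=365):
        return [flagged]
    return []
```

For instance, under these assumptions a producer such as "Microsoft Word, exported by ChatGPT" classifies as high because the AI tier is checked first, while a three-year gap between creation and modification is only informational when the producer is a known authoring tool.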

Scoring: each high-severity signal adds 40 points, each warning adds 18, each good sign subtracts 5, and the total is clamped to 0–100. Any high-severity signal makes the verdict Likely fake or AI-generated; otherwise one or more warnings make it Suspicious — worth a closer look.
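
The scoring arithmetic above can be written as a short Python sketch. The weights and the verdict rules come from the paragraph above; the function names and the short verdict identifiers (standing in for the three banner labels) are hypothetical, chosen only for illustration.

```python
def suspicion_score(high: int, warning: int, good: int) -> int:
    """Weighted tally of signals, clamped to the 0-100 range."""
    raw = 40 * high + 18 * warning - 5 * good
    return max(0, min(100, raw))


def verdict(high: int, warning: int) -> str:
    """Map signal counts to one of the three banner states."""
    if high > 0:
        return "likely_fake"   # banner: Likely fake or AI-generated
    if warning > 0:
        return "suspicious"    # banner: Suspicious, worth a closer look
    return "clean"             # banner: No automated red flags


# One high-severity signal, one warning, two good signs: 40 + 18 - 10 = 48
print(suspicion_score(high=1, warning=1, good=2), verdict(high=1, warning=1))
```

Note that good signs lower the score but never change the verdict: a single high-severity signal yields the strongest banner even if several good signs pull the number down.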

Result

You get a structured, explainable report — and because nothing is stored and no bytes leave your browser, you can safely scan an applicant's pay stub or a counterparty's invoice before acting on it. Remember the limit: a clean verdict means no automated red flags, not proof of authenticity. For verifiable authenticity, ask the source to sign the document with Sign and check the signature yourself at Verify.

  • Verify a signed PDF — cryptographic certainty instead of heuristics, for AttachKit-signed documents.
  • Unlock a PDF — remove a password (when you know it) so Scan can read the metadata.
  • Pages — split an over-100 MB file before scanning.
