# PDF Harvester Skill

Extract and ingest PDF documents into RAG with proper text extraction, table handling, and metadata.

## Overview

PDFs are common for research papers, reports, manuals, and ebooks. This skill covers:

- Text extraction with layout preservation
- Table extraction and conversion to markdown
- Academic paper patterns (abstract, sections, citations)
- OCR for scanned documents
- Multi-page chunking strategies

## Prerequisites

## Extraction Methods

### Method 1: pdfplumber (Recommended)

Best for structured PDFs with tables.

### Method 2: PyMuPDF (fitz)

Faster, better for large PDFs.

### Method 3: OCR…