pdf-construction — Skillopedia

PDF Processing for Construction

Overview: Adapted from Anthropic's PDF skill for construction document workflows.

Construction Use Cases:
1. RFI Processing: Extract structured data from Request for Information documents.
2. Submittal Package Creation: Merge multiple PDFs into organized submittal packages.
3. Specification Extraction: Extract specification sections for analysis.
4. Drawing Sheet Extraction: Split drawing packages by sheet.

Further sections: Integration with DDC Pipeline, Dependencies, Resources.

Resources:
- Original: Anthropic PDF Skill
- PyPDF Docs: https://pypdf.readthedocs.io/
- PDFPlumber: https://github.c…

# PDF Processing for Construction

## Overview

Adapted from Anthropic's PDF skill for construction document workflows.

## Construction Use Cases

### 1. RFI Processing
Extract structured data from Request for Information documents.

```python
from pypdf import PdfReader
import re

def extract_rfi_data(pdf_path: str) -> dict:
    """Extract RFI fields from PDF."""
    reader = PdfReader(pdf_path)
    text = ""
    for page in reader.pages:
        # Scanned (image-only) pages may yield no text; the newline keeps
        # the last line of one page from fusing with the next page's first line
        text += (page.extract_text() or "") + "\n"

    # Parse common RFI fields
    rfi_data = {
        'rfi_number': re.search(r'RFI\s*#?\s*(\d+)', text),
        'subject': re.search(r'Subject:?\s*(.+?)(?:\n|$)', text),
        'from': re.search(r'From:?\s*(.+?)(?:\n|$)', text),
        'to': re.search(r'To:?\s*(.+?)(?:\n|$)', text),
        'date': re.search(r'Date:?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})', text),
        'spec_section': re.search(r'Spec(?:ification)?\s*Section:?\s*(.+?)(?:\n|$)', text),
        'drawing_ref': re.search(r'Drawing\s*(?:Ref)?:?\s*(.+?)(?:\n|$)', text),
    }

    # Fields that were not found come back as None
    return {k: v.group(1) if v else None for k, v in rfi_data.items()}
```

### 2. Submittal Package Creation
Merge multiple PDFs into organized submittal packages.

```python
from pypdf import PdfWriter, PdfReader
from pathlib import Path

def create_submittal_package(
    cover_sheet: str,
    product_data: list,
    shop_drawings: list,
    output_path: str
) -> str:
    """Create organized submittal package."""
    writer = PdfWriter()

    # Add cover sheet
    writer.append(cover_sheet)

    # Add bookmarked sections; page_num tracks the 0-based index
    # of the first page of the next section
    page_num = len(PdfReader(cover_sheet).pages)

    # Product Data section
    writer.add_outline_item("Product Data", page_num)
    for pdf in product_data:
        writer.append(pdf)
        page_num += len(PdfReader(pdf).pages)

    # Shop Drawings section
    writer.add_outline_item("Shop Drawings", page_num)
    for pdf in shop_drawings:
        writer.append(pdf)
        page_num += len(PdfReader(pdf).pages)

    with open(output_path, "wb") as output:
        writer.write(output)

    return output_path
```

### 3. Specification Extraction
Extract specification sections for analysis.

```python
import re
import pdfplumber

def extract_spec_sections(pdf_path: str) -> dict:
    """Extract specification sections by division."""
    sections = {}

    with pdfplumber.open(pdf_path) as pdf:
        current_section = None
        current_text = []

        for page in pdf.pages:
            # Guard against pages with no extractable text
            text = page.extract_text() or ""

            # Match CSI MasterFormat sections
            for line in text.split('\n'):
                # Match section headers like "03 30 00 - Cast-in-Place Concrete"
                match = re.match(r'^(\d{2}\s?\d{2}\s?\d{2})\s*[-–]\s*(.+)$', line)
                if match:
                    if current_section:
                        sections[current_section] = '\n'.join(current_text)
                    current_section = match.group(1).replace(' ', '')
                    current_text = [match.group(2)]
                elif current_section:
                    current_text.append(line)

        # Flush the last section once all pages are read
        if current_section:
            sections[current_section] = '\n'.join(current_text)

    return sections
```

### 4. Drawing Sheet Extraction
Split drawing packages by sheet.

```python
import re
from pathlib import Path
from pypdf import PdfReader, PdfWriter

def split_drawing_package(pdf_path: str, output_dir: str) -> list:
    """Split drawing package into individual sheets."""
    reader = PdfReader(pdf_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    sheets = []
    used_names = set()
    for i, page in enumerate(reader.pages):
        # Extract sheet number from page (if text-based)
        text = page.extract_text() or ""
        sheet_match = re.search(r'([A-Z]+[-]?\d+)', text[:500])
        sheet_name = sheet_match.group(1) if sheet_match else f"Page_{i+1:03d}"

        # Keep names unique so a repeated match does not overwrite an earlier sheet
        if sheet_name in used_names:
            sheet_name = f"{sheet_name}_p{i+1:03d}"
        used_names.add(sheet_name)

        writer = PdfWriter()
        writer.add_page(page)

        output_file = output_dir / f"{sheet_name}.pdf"
        with open(output_file, "wb") as f:
            writer.write(f)

        sheets.append(str(output_file))

    return sheets
```

## Integration with DDC Pipeline

```python
# Example: Process RFI and add to tracking spreadsheet
import pandas as pd

# Extract RFI data
rfi_data = extract_rfi_data("RFI_045.pdf")

# Load existing tracker
tracker = pd.read_excel("RFI_Log.xlsx")

# Add new entry
new_row = pd.DataFrame([rfi_data])
tracker = pd.concat([tracker, new_row], ignore_index=True)

# Save updated tracker
tracker.to_excel("RFI_Log.xlsx", index=False)
```

## Dependencies

```bash
pip install pypdf pdfplumber reportlab
# Needed only for the Excel tracker example above
pip install pandas openpyxl
```

## Resources

- **Original**: Anthropic PDF Skill
- **PyPDF Docs**: https://pypdf.readthedocs.io/
- **PDFPlumber**: https://github.com/jsvine/pdfplumber

---

## Attachments

### claw.json

Content type: application/json; charset=utf-8. Size: 517 bytes. SHA-256: 8854bfdd0dafacd6db74964b6d1859f0ac6f62395cfc8d1f117b0933a6106fbf

```json
{
  "name": "pdf-construction",
  "version": "2.0.0",
  "description": "PDF processing for construction documents: RFIs, submittals, specifications, drawing packages. Extract data, merge packages, fill forms.",
  "author": "datadrivenconstruction",
  "license": "MIT",
  "permissions": [
    "filesystem"
  ],
  "entry": "instructions.md",
  "tags": [
    "construction",
    "data-processing",
    "document-management",
    "CAD"
  ],
  "models": [
    "claude-*",
    "gpt-*"
  ],
  "minOpenClawVersion": "0.8.0"
}
```

### instructions.md

Content type: text/markdown; charset=utf-8. Size: 1205 bytes. SHA-256: 754f0ec2a21e9bc3313930dc57f7756d5b8205fa2be2cc2f1868ed5c3b136d13

```markdown
You are a construction industry assistant specializing in construction document management and reporting.

PDF processing for construction documents: RFIs, submittals, specifications, drawing packages. Extract data, merge packages, fill forms.

When the user asks to convert or extract data:
1. Gather the required input data from the user
2. Process the data using the methods described in SKILL.md
3. Present results in a clear, structured format
4. Offer follow-up analysis or export options

## Input Format
- The user provides project data, file paths, or parameters as described in SKILL.md
- Accept data in common formats: CSV, Excel, JSON, or direct input

## Output Format
- Present results in structured tables when applicable
- Include summary statistics and key findings
- Offer export to Excel/CSV/JSON when relevant

## Key Reference
- See SKILL.md for detailed implementation code, classes, and methods
- Follow the patterns and APIs defined in the skill documentation

## Constraints
- Only use data provided by the user or referenced in the skill
- Validate inputs before processing
- Report errors clearly with suggested fixes
- Follow construction industry standards and best practices
```
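
The constraints in instructions.md ask to validate inputs before processing and to report errors with suggested fixes, but the skill does not show how. The following is a minimal sketch of one way to do that for PDF file paths, using only the standard library. The function name `validate_pdf_inputs` and the wording of the messages are hypothetical and not part of the skill. The header check is a quick sanity test, not a full parse.

```python
from pathlib import Path


def validate_pdf_inputs(paths: list[str]) -> list[str]:
    """Hypothetical helper: return readable problems for the given paths.

    An empty list means every input looks like a usable PDF file.
    """
    problems = []
    for raw in paths:
        path = Path(raw)
        if not path.is_file():
            problems.append(f"{raw}: file not found; check the path or working directory")
            continue
        if path.suffix.lower() != ".pdf":
            problems.append(f"{raw}: expected a .pdf file; export or convert it to PDF first")
            continue
        # PDF files carry a "%PDF-" marker at (or very near) the start
        with path.open("rb") as fh:
            head = fh.read(1024)
        if b"%PDF-" not in head:
            problems.append(f"{raw}: no %PDF- header; the file may be corrupt or mislabeled")
    return problems
```

A caller could run this on the cover sheet, product data, and shop drawing paths before calling `create_submittal_package`, and show the returned messages to the user instead of starting the merge when the list is not empty.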
