This guide explains how to convert PDF files to SVG format with embedded metadata and text extraction.
Python dependencies: pdf2image, pytesseract, Pillow, lxml
Convert a PDF to SVG:
python examples/enhanced_pdf_svg_workflow.py input.pdf -o output.svg
Extract pages 1, 3, and 5-7:
python examples/enhanced_pdf_svg_workflow.py input.pdf --pages "1,3,5-7"
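To make the page specification format concrete, here is a minimal sketch of how a string such as "1,3,5-7" could be expanded into individual page numbers. The function name parse_page_spec is hypothetical and is not taken from the workflow script; the script's actual parsing and error handling may differ.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1,3,5-7" into a sorted list of 1-based page numbers.

    Hypothetical helper for illustration; not the workflow script's own parser.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            # A range like "5-7" is inclusive on both ends.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers start at 1")
    return sorted(pages)


if __name__ == "__main__":
    print(parse_page_spec("1,3,5-7"))  # [1, 3, 5, 6, 7]
```

Using a set means overlapping entries such as "1-3,2" yield each page only once.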
Export metadata to JSON:
python examples/enhanced_pdf_svg_workflow.py input.pdf --format json --output metadata.json
Export metadata to HTML:
python examples/enhanced_pdf_svg_workflow.py input.pdf --format html --output metadata.html
Process a password-protected PDF:
python examples/enhanced_pdf_svg_workflow.py encrypted.pdf --password "your-password"
Increase DPI for better OCR accuracy (default: 300):
python examples/enhanced_pdf_svg_workflow.py input.pdf --dpi 600
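A higher DPI gives the OCR engine more pixels per character, but it also increases memory use and processing time. The sketch below estimates the raster size of a page at a given DPI, assuming the workflow rasterizes each page before OCR (suggested by the pdf2image dependency, but not confirmed here) and using the default PDF unit of 1/72 inch. The function name raster_size and the US Letter page size (612 x 792 points) are illustrative assumptions.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit: 1 point = 1/72 inch


def raster_size(width_pt: float, height_pt: float, dpi: int = 300) -> tuple[int, int]:
    """Return the (width, height) in pixels of a page rasterized at `dpi`.

    Hypothetical helper for illustration; not part of the workflow script.
    """
    width_px = round(width_pt * dpi / POINTS_PER_INCH)
    height_px = round(height_pt * dpi / POINTS_PER_INCH)
    return width_px, height_px


if __name__ == "__main__":
    # US Letter page (612 x 792 pt), chosen only as an example size.
    for dpi in (300, 600):
        w, h = raster_size(612, 792, dpi)
        print(f"{dpi} DPI: {w} x {h} px ({w * h / 1e6:.1f} megapixels)")
```

Doubling the DPI quadruples the pixel count, so 600 DPI is best reserved for documents where small or faint text is recognized poorly at the default.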
XQR supports multiple output formats:
--chunk-size
For more information, see the Troubleshooting Guide.