PDF Tools and Workflows for Archaeology Professionals
Archaeological work generates an extraordinary volume of documentation: site survey records, stratigraphic drawings, artifact inventories, photographic logs, field notebooks, laboratory analysis reports, and publication manuscripts. Because excavation destroys the archaeological context it investigates, the documentation produced between opening a trench and publishing the findings is the only scientific record of that context. That record increasingly lives as a collection of PDF files.

Archaeologists, heritage managers, cultural resource management (CRM) professionals, museum curators, and field technicians all need to manage PDFs effectively. Site reports that compile hundreds of pages of context records, photographs, and artifact descriptions need to be organized and assembled systematically. Field photographs need to be converted to PDF and organized with descriptive metadata. Historical documents from archives need to be compiled with modern field observations. Grey literature reports need to be compressed for sharing with agencies while maintaining the archival quality needed for permanent repository submission.

This guide covers the specific PDF needs of archaeological practice, from digitizing field records and compiling site reports to managing photographic archives and preparing materials for publication and repository submission.
Digitizing and Archiving Field Records
Field notebooks, drawn site plans, stratigraphic profiles, artifact sketches, and handwritten context record sheets are the primary records of an archaeological excavation. These documents capture information that cannot be recreated once the site has been excavated: they are the only surviving record of the spatial relationships, stratigraphy, and material culture in their original context. Converting these physical records to properly organized PDF archives is a fundamental archival responsibility.

Scan field notebooks and paper forms at a minimum of 300 DPI. Higher resolution (400–600 DPI) is preferable for documents with fine pencil sketches, elevation measurements, or written details that may be partially illegible at lower resolutions. Scan in color even if most content is in pencil or black ink, because color scanning captures colored annotations, coded horizon markers, and soil color references. Use a flatbed scanner for the highest quality; phone scanning apps are acceptable for supplementary documentation but not for the primary archival record.

For individual context record sheets (one sheet per excavated context, which is standard in British-system single context recording), scan each sheet and create a PDF named with the context number: Context_0143.pdf, Context_0144.pdf. For field notebooks, scan all pages into a single PDF per notebook and name it with the site code and notebook number: SITE2026_Notebook_01.pdf. These naming conventions ensure the digital archive mirrors the structure of the physical archive.

LazyPDF's image-to-PDF tool converts batches of scanned images to PDF. If your scanning workflow produces individual JPEG or TIFF images, convert them to PDF immediately after scanning and before organizing; this is more efficient than accumulating thousands of loose images and converting them later. Compress each completed PDF to reduce storage overhead while maintaining the detail needed for archival quality. LazyPDF's compress tool is effective for scan-heavy PDFs.
1. Scan all field notebooks, context record sheets, and drawn plans at 300+ DPI in color.
2. Review each scan immediately for legibility and rescan any illegible or incomplete pages.
3. Convert scanned images to PDFs using LazyPDF's Image to PDF tool.
4. Name each PDF file with the site code and document type following your archive naming convention.
5. Compress completed PDFs to reduce storage size while maintaining archival quality.
6. Back up to at least two separate locations immediately after completing each session's scanning.
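The naming convention described in this section can be generated programmatically so that every file follows the same pattern. The following is a minimal Python sketch, not part of LazyPDF: the function names are hypothetical, and the zero-padding widths (four digits for contexts, two for notebooks) are assumptions inferred from the examples Context_0143.pdf and SITE2026_Notebook_01.pdf.

```python
import re


def context_pdf_name(context_number: int) -> str:
    """Build a name such as Context_0143.pdf.

    Assumption: context numbers are zero-padded to four digits,
    as in the examples given in the text.
    """
    if context_number < 0:
        raise ValueError("context number must be non-negative")
    return f"Context_{context_number:04d}.pdf"


def notebook_pdf_name(site_code: str, notebook_number: int) -> str:
    """Build a name such as SITE2026_Notebook_01.pdf.

    Assumption: site codes contain only letters and digits, and
    notebook numbers are zero-padded to two digits.
    """
    if not re.fullmatch(r"[A-Za-z0-9]+", site_code):
        raise ValueError("site code should contain only letters and digits")
    if notebook_number < 1:
        raise ValueError("notebook number must be 1 or greater")
    return f"{site_code}_Notebook_{notebook_number:02d}.pdf"
```

For example, `context_pdf_name(143)` gives `Context_0143.pdf` and `notebook_pdf_name("SITE2026", 1)` gives `SITE2026_Notebook_01.pdf`. Zero-padding keeps files in numeric order when a file browser sorts them alphabetically.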
Compiling Archaeological Site Reports
Site reports, from brief grey literature client reports to comprehensive monograph-length final publications, are the primary mechanism through which excavation results are communicated and preserved for future research. Modern site reports typically consist of many separately authored and formatted sections: excavation methodology, stratigraphic narrative, finds specialist reports (pottery, animal bone, human remains, environmental samples), scientific dating reports (radiocarbon, dendrochronology, luminescence), documentary historical research, and illustrated appendices.

Assembling these sections from multiple specialists into a single coherent PDF report is a significant editorial and technical task. LazyPDF's merge tool can combine PDFs from different authors and different source applications into a single document. The key challenge is maintaining consistent page numbering, figure numbering, and cross-references across sections authored independently. Establish figure numbering and citation conventions with all contributors before they begin writing; this is far easier than reconciling inconsistencies after the fact.

For figure plates (pages with multiple artifact photographs, plan drawings, or section profiles arranged together), compile them in a layout application and export as PDF before merging with the text. Figures in archaeological reports need to be high resolution (300 DPI minimum) and clearly labeled with figure numbers and scale bars, plus north arrows on site plans. Compressing the merged report can reduce a 200MB file to a more manageable 50–80MB while maintaining visual quality.

For grey literature reports submitted to developer clients and Historic Environment Records (HERs), file size requirements and quality standards differ from those for publication submissions. Clients and HERs typically need complete, legible PDFs under 20MB for practical handling. Compress accordingly while maintaining text legibility and the quality of plans and photographs.
1. Establish figure numbering, table numbering, and citation conventions with all contributing specialists before they begin writing.
2. Collect all specialist report PDFs and assemble them in the site report structure.
3. Create a master PDF by merging all sections in the correct order using LazyPDF's Merge tool.
4. Add page numbers to the complete report using LazyPDF's Page Numbers tool.
5. Update all internal cross-references to reflect the final page numbers.
6. Compress the final report for distribution while maintaining a full-quality archival master.
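Updating cross-references after a merge comes down to simple arithmetic: each section starts on the page after the previous section ends. The sketch below shows that calculation in Python, independent of any PDF tool. The function names, section titles, and page counts are illustrative assumptions, not taken from a real report.

```python
def section_start_pages(sections):
    """Map each section title to its first page in the merged report.

    `sections` is a list of (title, page_count) tuples in merge order.
    Pages are numbered from 1.
    """
    starts = {}
    next_page = 1
    for title, page_count in sections:
        if page_count < 1:
            raise ValueError(f"section {title!r} must have at least one page")
        starts[title] = next_page
        next_page += page_count
    return starts


def merged_page(starts, title, local_page):
    """Convert a page number within one section to a merged-report page."""
    if local_page < 1:
        raise ValueError("local page numbers start at 1")
    return starts[title] + local_page - 1


# Hypothetical page counts, for illustration only.
report_sections = [
    ("Methodology", 12),
    ("Stratigraphic narrative", 30),
    ("Pottery report", 18),
]
```

With these hypothetical counts, `section_start_pages(report_sections)` places the pottery report at page 43, so a reference to page 5 of the pottery report becomes page 47 of the merged document.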
Managing Artifact Photographic Archives
Archaeological photography is a primary record type: photographs of artifacts, in-situ features, section profiles, and general site views document conditions and objects that may change or be removed during excavation. Managing artifact photographic archives effectively requires both high-quality image capture and systematic organization into searchable, archival-quality PDFs.

Photograph each significant find under controlled conditions: consistent lighting, an appropriate background (typically a neutral grey or blue backdrop), and a photographic scale and label in frame. RAW capture from a DSLR is best for archival purposes; JPEG at the highest quality setting is acceptable for operational photography. For publication-quality photographs, post-process and export as TIFF at 300 DPI minimum.

Organizing artifact photographs into themed PDF plates, grouped by type, period, or significance, creates navigable visual records. Use LazyPDF's image-to-PDF tool to compile them: upload the photographs for a specific category, arrange them in the desired sequence, and convert to a labeled PDF plate. A PDF plate named Ceramics_IronAge_ContextGroup_45.pdf with 12 artifact photographs arranged on facing pages is a far more useful archival format than 12 loose JPEG files with cryptic filenames.

For large excavation projects with thousands of photographic records, establish a hierarchical naming convention for both the individual photographs and their PDF compilations. Using Site_Area_Context_Find_PhotNum.jpg for individual photos and Plate_FindType_Period.pdf for compiled plates creates a clear relationship between original records and compiled documentation.
1. Photograph each artifact category under consistent controlled conditions with photographic scale in frame.
2. Export processed photographs as high-resolution JPEG or TIFF files with descriptive names.
3. Use LazyPDF's Image to PDF tool to compile related photographs into organized PDF plates.
4. Label each plate with the category, period, and site reference.
5. Compress completed plates for practical storage and distribution while maintaining archival quality.
6. Merge related plates into comprehensive photographic appendices for site reports.
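A structured filename such as Site_Area_Context_Find_PhotNum.jpg can be split back into its parts, which makes it straightforward to gather the photographs for one context before compiling a plate. The following Python sketch assumes exactly five underscore-separated fields with no underscores inside any field. The class name, function names, and sample filenames are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePath
from typing import NamedTuple


class PhotoRecord(NamedTuple):
    site: str
    area: str
    context: str
    find: str
    photo_number: str


def parse_photo_name(filename: str) -> PhotoRecord:
    """Split a Site_Area_Context_Find_PhotNum filename into its fields."""
    parts = PurePath(filename).stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"{filename!r} does not have five underscore-separated fields")
    return PhotoRecord(*parts)


def group_by_context(filenames):
    """Group filenames by their context field, sorted within each group."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        groups[parse_photo_name(name).context].append(name)
    return dict(groups)


# Hypothetical filenames, for illustration only.
sample_photos = [
    "SITE2026_A_0144_F020_001.jpg",
    "SITE2026_A_0143_F012_002.jpg",
    "SITE2026_A_0143_F012_001.jpg",
]
```

Calling `group_by_context(sample_photos)` returns two groups, keyed `"0143"` and `"0144"`, each listing its files in sorted order and ready to be uploaded together as one plate. Filenames that do not match the convention raise an error rather than being silently misfiled.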
Preparing Archaeological Reports for Repository Submission
Archaeological data archives, including the Archaeology Data Service (ADS) in the UK, tDAR in North America, Open Context internationally, and national or state archives in many countries, have specific technical requirements for digital submissions. PDFs are typically required for text-based records, and the quality and format requirements for archival submissions differ from those for client reports.

Many archaeological data archives prefer or require PDF/A format for long-term archival submissions. PDF/A ensures the document remains fully renderable without depending on external fonts, media, or linked content, which is essential for archives that need to guarantee access to records in 50 or 100 years. Convert reports and records to PDF/A using appropriate conversion tools before submission, and validate PDF/A compliance using the free veraPDF tool.

Resolution requirements for scanned records in archival submissions are typically higher than for operational use: the ADS recommends 400 DPI for text documents and 600 DPI for drawn plans and scale drawings. These higher resolutions ensure that fine detail, such as pencil shading on section drawings, pencil annotations on forms, and small-scale features on plans, is captured accurately.

Metadata is a critical component of archival PDF submission. Each PDF submitted to a data archive should have a descriptive title, author, institution, date, site code, and subject keywords embedded in the PDF metadata. Well-populated metadata makes archived records discoverable through the archive's search tools, which directly affects whether future researchers can find and use your site data. Populating PDF metadata takes about five minutes per document but has a lasting impact on the accessibility of your archive.
1. Review the technical submission requirements for your target data archive before preparing files.
2. Convert final report and record PDFs to PDF/A format for archival submission.
3. Validate PDF/A compliance using the free veraPDF validator.
4. Verify that all PDFs are scanned at the archive's minimum resolution (typically 400 DPI for text, 600 DPI for drawings).
5. Populate PDF metadata (title, author, date, site code, keywords) for all submitted files.
6. Prepare a submission manifest listing all files, their format, size, and content description.
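A submission manifest can be generated from the folder of files being submitted rather than typed by hand. The sketch below uses only the Python standard library. The column names and the CSV layout are assumptions made for illustration, so check them against the format your target archive actually asks for.

```python
import csv
from pathlib import Path

# Assumed column names; adjust to the target archive's requirements.
MANIFEST_FIELDS = ["filename", "format", "size_bytes", "description"]


def build_manifest(folder, descriptions=None):
    """List every file in `folder` with its format, size, and description.

    `descriptions` maps filename to a content description; files
    without an entry get an empty description to be filled in later.
    """
    descriptions = descriptions or {}
    rows = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # subfolders are skipped in this simple sketch
        rows.append({
            "filename": path.name,
            "format": path.suffix.lstrip(".").upper(),
            "size_bytes": path.stat().st_size,
            "description": descriptions.get(path.name, ""),
        })
    return rows


def write_manifest(rows, out_path):
    """Write manifest rows to a CSV file with a header line."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=MANIFEST_FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Write the manifest outside the submission folder, or build the rows before writing, so the manifest file does not end up listing itself.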
Frequently Asked Questions
What is the recommended DPI for scanning archaeological context record sheets?
For primary site records like context sheets, 300 DPI is the minimum acceptable resolution, and 400 DPI is preferred for archival quality. For drawn records (site plans, section profiles, elevation drawings, and stratigraphic matrices), 600 DPI is recommended to capture fine pencil lines, hatching, and annotations accurately. The Archaeology Data Service in the UK recommends 400 DPI for text documents and 600 DPI for plans and drawings. Higher resolution increases file sizes but protects against future needs: if you ever need to publish a drawing at large scale, having the high-resolution scan avoids the need to rescan. Always scan at the highest practical quality. Storage is inexpensive, and rescanning is impossible once the physical archive is inaccessible.
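The effect of resolution on file size can be estimated before scanning. The Python sketch below converts a sheet size and DPI into pixel dimensions and an uncompressed size. It assumes an A4 sheet (210 × 297 mm) and 24-bit color at 3 bytes per pixel; neither figure comes from the recommendations above, and compressed files on disk will be much smaller.

```python
MM_PER_INCH = 25.4


def scan_pixels(width_mm, height_mm, dpi):
    """Return (width, height) in pixels for a sheet scanned at `dpi`."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


def uncompressed_megabytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Estimate raw image size in megabytes (1 MB = 1,000,000 bytes).

    Assumption: 24-bit color, i.e. 3 bytes per pixel, no compression.
    """
    width_px, height_px = scan_pixels(width_mm, height_mm, dpi)
    return width_px * height_px * bytes_per_pixel / 1_000_000
```

For an A4 sheet, `scan_pixels(210, 297, 300)` gives 2480 × 3508 pixels, about 26 MB uncompressed, while 600 DPI gives 4961 × 7016 pixels, about 104 MB. Doubling the DPI roughly quadruples the raw data, which is why compression and storage planning matter for large scanning projects.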
How should archaeological grey literature be prepared as a PDF?
Grey literature archaeological reports (evaluation reports, watching brief reports, desk-based assessments) are typically submitted to local Historic Environment Records (HERs) and retained in project archives. Best practices for these PDFs include: complete internal bookmarks for major sections, page numbers, figure list and table list if the report exceeds 20 pages, embedded fonts, high-resolution figures and plans (300 DPI minimum), and comprehensive title page information including site name, site code, NGR, client, contractor, and date. For submission to HERs, most now accept PDFs under 20MB via email or online portal submission. Compress accordingly, but maintain sufficient image quality for plan legibility — site plans with fine feature detail need to be readable at 100% zoom.
Can handwritten field notes and sketches be made searchable with OCR?
Handwritten field notes present a significant OCR challenge. Modern OCR systems are highly accurate with printed text but struggle with handwriting — individual handwriting styles, technical abbreviations, numbers mixed with text, and the variety of conditions under which field notes are written (in rain, in failing light, under time pressure) all reduce OCR accuracy for handwritten records. For printed forms where only certain fields are handwritten, OCR works well for the printed text and poorly for the handwritten entries. The most practical approach is to scan handwritten records for visual preservation and add searchable typed metadata (site code, context number, date, recorder name, key terms) in the PDF's metadata fields rather than attempting full OCR of the handwritten content.
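The typed metadata suggested here can be assembled consistently for every scanned sheet. The Python sketch below builds a dictionary using the standard PDF document information keys (/Title, /Author, /Subject, /Keywords). How the archaeological fields map onto those keys is an assumption for illustration, the function name is hypothetical, and writing the dictionary into a file requires a PDF library or editor of your choice, which is not shown.

```python
def build_info_metadata(site_code, context_number, recorded_on, recorder, key_terms):
    """Build PDF document information entries for a scanned context sheet.

    Assumed mapping: recorder -> /Author; site code, context number,
    date, and key terms -> /Keywords as a comma-separated string.
    """
    keywords = [site_code, f"context {context_number}", recorded_on, *key_terms]
    return {
        "/Title": f"{site_code} context record {context_number}",
        "/Author": recorder,
        "/Subject": "Scanned handwritten context record sheet",
        "/Keywords": ", ".join(keywords),
    }
```

For example, `build_info_metadata("SITE2026", "0143", "2026-06-14", "A. Recorder", ["pit fill", "charcoal"])` produces the keyword string `SITE2026, context 0143, 2026-06-14, pit fill, charcoal`; all of these values are invented for illustration. Building the metadata from the same fields for every sheet keeps searches predictable across the whole archive.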
What file format should archaeological photographs be archived as?
For long-term archival of archaeological photographs, TIFF is the preferred format — it is lossless, widely supported, and has been stable for decades. For practical operational use and PDF compilation, high-quality JPEG (90%+ quality) is acceptable and dramatically more storage-efficient. The Archaeology Data Service recommends TIFF for master archival images and allows JPEG derivatives for access copies. A practical workflow is to capture as RAW, archive the RAW and a TIFF derivative as primary records, and use compressed JPEG copies for reports and PDF plates. This preserves maximum quality in the long-term archive while keeping working copies at practical file sizes.