PDF to Excel Merges Cells Wrong — How to Fix

You convert a PDF table to Excel, expecting clean rows and columns, and instead get a spreadsheet where random cells are merged, data that should be in separate columns is jammed together, and the structure makes no sense for calculation. This is one of the most frustrating PDF conversion outcomes because spreadsheets are only useful when the data is in the right cells. Understanding why converters get table structure wrong points directly to the fixes.

Why PDF to Excel Conversion Gets Cell Merging Wrong

PDF tables are drawn using lines and positioned text; there's no inherent 'cell' concept in the PDF format. When a converter reads a PDF table, it must infer the cell structure entirely from visual information: where the lines are, where text blocks fall, and what overlaps with what. This inference is error-prone in several common situations:

**Multi-line cell content.** When a cell contains two lines of text, the converter sometimes sees two separate rows instead of one cell. Conversely, it might see two narrow cells instead of one wide cell.

**Missing grid lines.** Tables without complete borders are harder to parse. A table with only horizontal lines (no vertical separators) may have all data merged into column A because the converter can't find column boundaries.

**Merged header cells.** A header that spans three columns in the PDF often gets mapped to only the first of three independent columns in Excel, pushing data into the wrong cells.

**Irregular column widths.** When columns aren't evenly spaced, the converter's column-detection algorithm may group two adjacent columns together or split one wide column into two.

**Nested tables.** Tables within tables are common in financial reports and forms. Nested structures confuse most converters, producing cell-merging errors in both the inner and outer tables.
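To make the inference problem concrete, here is a minimal sketch of one way a converter could guess column boundaries from text positions alone. It is an illustration under assumptions, not how any particular converter works: the `(x, y, text)` word tuples, the function names, and the `gap` and `row_tolerance` thresholds are all hypothetical.

```python
from typing import List, Tuple

Word = Tuple[float, float, str]  # (x_left, y_top, text); hypothetical layout data


def infer_column_starts(words: List[Word], gap: float = 15.0) -> List[float]:
    """Guess column starts: a new column begins where left edges jump by more than `gap`."""
    xs = sorted({x for x, _, _ in words})
    starts: List[float] = []
    previous = None
    for x in xs:
        if previous is None or x - previous > gap:
            starts.append(x)
        previous = x
    return starts


def build_grid(words: List[Word], gap: float = 15.0, row_tolerance: float = 3.0) -> List[List[str]]:
    """Place each word into a (row, column) cell using only its position."""
    starts = infer_column_starts(words, gap)
    grid: List[List[str]] = []
    current_y = None
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        # Start a new row when the vertical position moves past the tolerance.
        if current_y is None or y - current_y > row_tolerance:
            grid.append([""] * len(starts))
            current_y = y
        # The word belongs to the last column start that is not to its right.
        col = max(i for i, s in enumerate(starts) if s <= x)
        grid[-1][col] = f"{grid[-1][col]} {text}".strip()
    return grid


words = [
    (50, 100, "Item"), (200, 100, "Qty"), (300, 100, "Price"),
    (50, 120, "Widget"), (202, 120, "4"), (298, 120, "9.50"),
]
print(build_grid(words))           # [['Item', 'Qty', 'Price'], ['Widget', '4', '9.50']]
print(build_grid(words, gap=120))  # [['Item', 'Qty Price'], ['Widget', '4 9.50']]
```

The second call shows the failure mode described above: with a threshold that is too coarse for the spacing, two columns collapse into one and their values end up jammed into the same cell.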

Step-by-Step Fixes for Wrong Cell Merging

Work through these approaches based on the severity of the problem in your converted spreadsheet:

1. Identify the pattern of merging errors first. Are entire rows merged? Individual cells? Is the problem in headers only or throughout the table? Understanding the pattern tells you which fix to apply and whether the problem is systemic or localized.
2. Unmerge incorrectly merged cells manually for small tables. Select the affected cells and choose Home > Merge & Center > Unmerge Cells, or open the Format Cells dialog (Ctrl+1 on Windows, Format > Cells on Mac), go to the Alignment tab, and uncheck 'Merge cells'. Then redistribute the data into the correct cells. For tables under 20 rows, manual correction is often faster than any automated fix.
3. Use Excel's Text to Columns feature for data jammed into one cell. If multiple values are in a single cell separated by spaces, select the column, go to Data > Text to Columns, and split either on a delimiter (space or comma) or at fixed widths. This recovers column structure when the converter missed vertical boundaries.
4. Re-convert using a different tool for systematic merging problems. If the entire table structure is wrong, manual fixing is not efficient. Try a different PDF to Excel converter, since some handle table detection significantly better than others. Adobe Acrobat Pro and Smallpdf have strong table-detection engines.
5. Use a dedicated table-extraction tool. Tools like Tabula (free, open-source) are specifically designed to extract tables from PDFs and usually produce cleaner output than general-purpose converters. Tabula reads text-based PDFs only, so a scanned or image-only PDF needs to go through OCR first.
6. For recurring table types (monthly reports, standard forms), create an Excel template with the correct structure, then use a macro or formula to map the data from the raw converted sheet into the template. This adds overhead but produces reliable results for regular workflows.
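For recurring files, the Text to Columns split from step 3 can also be scripted. The following is a minimal sketch, assuming columns are separated by runs of two or more spaces while single spaces belong inside a value; the function name `split_jammed_cell` and the sample string are hypothetical.

```python
import re
from typing import List


def split_jammed_cell(value: str, min_spaces: int = 2) -> List[str]:
    """Split one cell's text on runs of at least `min_spaces` whitespace characters."""
    pattern = r"\s{%d,}" % min_spaces
    return [part for part in re.split(pattern, value.strip()) if part]


row = split_jammed_cell("Acme Corp   2024-01-15   1,250.00")
print(row)  # ['Acme Corp', '2024-01-15', '1,250.00']
```

Splitting on two or more spaces rather than one keeps multi-word values such as "Acme Corp" intact. If the converter collapsed every gap to a single space, this heuristic will not work and a fixed-width split is likely the better choice.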

Using Tabula for Better Table Extraction

Tabula is a free, open-source tool specifically built for extracting tables from PDFs. It exports CSV, which Excel opens directly. It works differently from general-purpose converters:

- You manually draw a selection box around the table you want to extract
- Tabula analyzes only that selection, reducing errors from surrounding content
- It uses line-detection and spatial analysis optimized specifically for tables
- Output is a clean CSV file with one value per cell

Tabula is particularly strong on financial reports, government data tables, and research papers. It's less useful for tables without clear grid lines, and it only works on text-based PDFs, not scanned images. Download it from tabula.technology.

Preparing PDFs for Better Excel Conversion

If you control how the PDF is created (or can get a better version from the source), these factors significantly improve conversion quality:

**Get the original data source.** If the PDF was generated from an Excel file, request the original Excel file. Converting from Excel to PDF and back is always lossy, so the original file is always cleaner.

**Use PDFs with full table borders.** Tables with complete grids (all four sides of every cell) convert much better than borderless or partially-bordered tables.

**Avoid complex merged headers in the source document.** If you're creating the PDF yourself, simplify the header structure. Headers that span multiple columns are a major source of conversion errors.

**Embed the original data.** Some PDF creation tools embed the original spreadsheet data as an attachment inside the PDF. If this option is available, enable it: the embedded file provides clean extraction without conversion.

Frequently Asked Questions

Why does the first table on the page convert correctly but later tables don't?

Converters detect page regions and process them independently. Later tables may have different styling (fewer borders, different font sizes) that confuses the detection algorithm. Try extracting later tables separately, or use a tool that lets you select specific table regions.

My converted spreadsheet has the right data but in the wrong order — is that a merging problem?

Probably not a merging problem — more likely a reading-order problem. The converter may be reading the PDF in a different order than the visual layout (left-to-right vs. right-to-left, or mixing column order). Check if the PDF has multiple column sections and whether the converter handled the column sequence correctly.

Can I use Python to fix merged cells automatically after conversion?

Yes. Python's openpyxl library can programmatically unmerge cells and redistribute data. You can write a script that iterates through merged cell ranges, unmerges them, and fills each individual cell with the appropriate value. This is efficient for recurring document types.
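A minimal sketch of such a script follows, assuming the "appropriate value" for every cell in a merged range is the range's top-left value (reasonable for spanning labels, but worth checking against your data). The helper name `unmerge_and_fill` is hypothetical; `merged_cells.ranges`, `unmerge_cells`, and `iter_rows` are openpyxl calls.

```python
from openpyxl import Workbook


def unmerge_and_fill(ws):
    """Unmerge every merged range and copy its top-left value into each cell."""
    # Copy the ranges first: unmerging changes ws.merged_cells while we loop.
    for merged in list(ws.merged_cells.ranges):
        value = ws.cell(row=merged.min_row, column=merged.min_col).value
        ws.unmerge_cells(merged.coord)
        for row in ws.iter_rows(min_row=merged.min_row, max_row=merged.max_row,
                                min_col=merged.min_col, max_col=merged.max_col):
            for cell in row:
                cell.value = value


# Small in-memory demonstration instead of a real converted file.
wb = Workbook()
ws = wb.active
ws["A1"] = "Region"
ws.merge_cells("A1:A3")   # a label merged down three rows
ws["B1"] = "Q1"
ws.merge_cells("B1:C1")   # a header spanning two columns
unmerge_and_fill(ws)
print([ws.cell(row=r, column=1).value for r in (1, 2, 3)])  # ['Region', 'Region', 'Region']
```

For a real file you would open it with `openpyxl.load_workbook(...)`, run the helper on each sheet, and save under a new name so the original conversion output stays available for comparison.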

Is there a way to convert PDF tables to Excel without any merging errors?

For well-structured tables in digitally-created PDFs, modern tools get it right most of the time. For complex, multi-level, or borderless tables, some manual cleanup is almost always necessary. Tabula + manual verification is the most reliable free workflow.

Convert your PDF tables to Excel format — LazyPDF handles standard tables cleanly, free and browser-based.
