JakubPelka/MergeExcelFiles


MergeExcelFiles

Status: MAINTAINED / PAUSED
Purpose: Merge several Excel exports into one clean Excel file.

MergeExcelFiles is a small Python/Tkinter desktop tool for combining multiple Excel files with the same or similar table structure into one output workbook.

The project was originally created for merging exports related to Skyddsvärda träd and other tree-related datasets, including data exported from Artportalen-related workflows. The tool is intentionally generic and can also be used for other Excel files when they share a comparable header/data structure.

What the tool does

  • Reads multiple Excel files (.xlsx, .xlsm, .xls).
  • Uses a configurable header row. Default: row 3.
  • Reads data from the rows below the header.
  • Removes fully empty rows and columns.
  • Merges all rows into one output sheet named Merged.
  • Preserves the column order from the first file.
  • Adds new columns from later files at the end.
  • Optionally adds a source file column for traceability.
  • Optionally exports a CSV file next to the Excel output.
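The merge rules listed above can be sketched in plain Python on tables that have already been read into memory. This is a minimal illustration, not the code in core.py: the function name `merge_tables`, the `source_file` column name, and the choice to place the source column last are all assumptions made for the example.

```python
from typing import Any

Row = list[Any]
# Hypothetical in-memory shape: (source file name, headers, data rows)
Table = tuple[str, list[str], list[Row]]


def _is_empty(value: Any) -> bool:
    """Treat None and blank strings as empty cells."""
    return value is None or (isinstance(value, str) and not value.strip())


def merge_tables(
    tables: list[Table],
    add_source: bool = True,
    source_column: str = "source_file",  # assumed name, not taken from the tool
) -> tuple[list[str], list[Row]]:
    columns: list[str] = []
    records: list[dict[str, Any]] = []
    for source, headers, rows in tables:
        # Column order comes from the first file; unseen columns go to the end.
        for name in headers:
            if name not in columns:
                columns.append(name)
        for row in rows:
            if all(_is_empty(value) for value in row):
                continue  # skip fully empty rows
            record = dict(zip(headers, row))
            if add_source:
                record[source_column] = source
            records.append(record)
    # Drop columns that are empty in every merged record.
    columns = [
        name for name in columns
        if any(not _is_empty(record.get(name)) for record in records)
    ]
    if add_source and source_column not in columns:
        columns.append(source_column)
    merged = [[record.get(name) for name in columns] for record in records]
    return columns, merged


tables = [
    ("a.xlsx", ["id", "species"], [[1, "oak"], [None, None]]),
    ("b.xlsx", ["species", "id", "girth_cm"], [["lime", 2, 310]]),
]
columns, rows = merge_tables(tables)
print(columns)  # ['id', 'species', 'girth_cm', 'source_file']
```

Cells for columns that a file does not contain are left as `None`, which an Excel or CSV writer would normally emit as blank cells.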

Main use case

The primary use case is merging several Excel exports containing tree-related records, for example:

  • Skyddsvärda träd exports.
  • Artportalen-related tree observations.
  • Similar Excel-based inventories where each file has the same header structure.

This tool does not validate biological records, deduplicate observations, enrich data from APIs, or perform GIS analysis. It only combines tabular Excel data.

Data safety note

Do not commit real working exports to this repository if they contain:

  • precise locations,
  • protected or sensitive observations,
  • personal data,
  • internal identifiers,
  • non-public project data.

Use only synthetic or clearly sanitized sample files in sample_data/.

Suggested repository structure

MergeExcelFiles/
├── README.md
├── LICENSE
├── requirements.txt
├── .gitignore
├── .gitattributes
├── run_gui.py
├── src/
│   └── merge_excel_files/
│       ├── __init__.py
│       ├── core.py
│       └── gui.py
└── sample_data/
    └── README.md

Requirements

  • Python 3.10+
  • Tkinter, normally included with standard Python installations on Windows and macOS
  • Python packages listed in requirements.txt

Installation

Create a virtual environment:

python -m venv .venv

Activate it on Windows PowerShell:

.\.venv\Scripts\Activate.ps1

Activate it on macOS/Linux:

source .venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Run

python run_gui.py

How to use

  1. Select one or more Excel files.
  2. Select an output folder.
  3. Set the header row if needed. Default: 3.
  4. Set a sheet name or sheet index if the data is not in the first sheet.
  5. Choose whether to add a source file column.
  6. Choose whether to also save CSV.
  7. Click Merge and save.

Input format

The default setup expects this type of structure:

Row 1: description or metadata, ignored
Row 2: description or metadata, ignored
Row 3: column headers
Row 4+: data rows

The header row can be changed in the GUI.
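The row layout above maps onto a simple split between header and data. The sketch below shows the idea on rows already loaded as lists; `split_header_and_data` is a hypothetical helper, not a function from this repository. The header row is 1-based, as in Excel, so it has to be converted to a 0-based list index.

```python
from typing import Any


def split_header_and_data(
    rows: list[list[Any]], header_row: int = 3
) -> tuple[list[str], list[list[Any]]]:
    """Split sheet rows into headers and data using a 1-based header row."""
    if header_row < 1:
        raise ValueError("header_row is 1-based and must be at least 1")
    if header_row > len(rows):
        raise ValueError("header_row is beyond the last row of the sheet")
    # Rows above the header (descriptions, metadata) are ignored.
    raw_headers = rows[header_row - 1]
    headers = ["" if cell is None else str(cell).strip() for cell in raw_headers]
    data = rows[header_row:]
    return headers, data


sheet = [
    ["Export from inventory", None],
    ["Generated 2024", None],
    ["id", "species"],
    [1, "oak"],
    [2, "lime"],
]
headers, data = split_header_and_data(sheet)
print(headers)  # ['id', 'species']
print(data)     # [[1, 'oak'], [2, 'lime']]
```

If the files are read with pandas, the same setting corresponds to passing `header=header_row - 1` to `pandas.read_excel`, because that parameter is 0-indexed. Whether core.py uses pandas is not stated in this README.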

Output

The tool creates:

  • one .xlsx file with a sheet named Merged,
  • optionally one .csv file with the same merged data.

If later files contain columns that were not present in the first file, those columns are added at the end of the output table.
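The optional CSV export can be sketched with the standard library alone. The example assumes the CSV takes the same file name as the Excel output with a `.csv` extension, and it writes UTF-8 with a byte order mark so that Excel displays characters such as å, ä, and ö correctly. Both choices, and the helper name `write_csv_next_to`, are assumptions rather than documented behavior of the tool.

```python
import csv
from pathlib import Path
from typing import Any


def write_csv_next_to(
    xlsx_path: Path, columns: list[str], rows: list[list[Any]]
) -> Path:
    """Write merged data as CSV beside the Excel output and return its path."""
    csv_path = xlsx_path.with_suffix(".csv")
    # newline="" lets the csv module control line endings itself.
    with csv_path.open("w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        writer.writerows(rows)  # None values are written as empty fields
    return csv_path
```

Calling `write_csv_next_to(Path("out/merged.xlsx"), columns, rows)` would create `out/merged.csv`, provided the `out` folder already exists.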

Current limitations

  • No duplicate detection.
  • No coordinate validation.
  • No species/taxon validation.
  • No advanced column mapping between differently named fields.
  • No command-line interface yet.
  • Excel formatting from the input files is not preserved.

Roadmap ideas

Possible future improvements:

  • validation report after merge,
  • duplicate detection based on selected key columns,
  • optional column mapping presets,
  • CLI mode for repeatable batch processing,
  • automatic output naming with timestamp,
  • safer sample data generator.
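As an illustration of the duplicate detection idea, the sketch below reports rows whose values in the selected key columns repeat an earlier row. It is one possible shape for such a feature, not planned or existing code; `find_duplicates` and its return format are hypothetical.

```python
from typing import Any


def find_duplicates(
    columns: list[str], rows: list[list[Any]], key_columns: list[str]
) -> list[tuple[int, int]]:
    """Return (row position, position of first occurrence) for repeated keys."""
    # columns.index raises ValueError if a key column is missing.
    indexes = [columns.index(name) for name in key_columns]
    first_seen: dict[tuple[Any, ...], int] = {}
    duplicates: list[tuple[int, int]] = []
    for position, row in enumerate(rows):
        key = tuple(row[i] for i in indexes)
        if key in first_seen:
            duplicates.append((position, first_seen[key]))
        else:
            first_seen[key] = position
    return duplicates


columns = ["id", "species"]
rows = [[1, "oak"], [2, "lime"], [1, "oak"]]
print(find_duplicates(columns, rows, ["id"]))  # [(2, 0)]
```

Positions are 0-based indexes into the merged data rows, so a validation report would need to add the header offset before showing Excel row numbers.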

License

MIT License. See LICENSE.

About

✅ - working version. Merges several Excel files containing tree data from Artportalen into one file.
