Report & Provenance
BibexPy is built so that every transformation is traceable, reproducible and reversible. This page describes the provenance model behind the Report step.
The three pillars
1 · Append-only audit log
Every data-modifying operation is recorded with:
- timestamp and operation category (merge, filter, disambiguation, enrichment, export, …)
- the parameters used (thresholds, sources, criteria)
- before/after states for affected fields
The log is append-only: entries are never edited or deleted, so the history can't be silently rewritten.
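The append-only idea can be illustrated with a minimal sketch. The names `AuditEntry` and `AuditLog` and their fields are hypothetical and chosen only for illustration; they are not BibexPy's actual schema or API. The sketch shows a log that offers appending and reading but no way to edit or delete an entry.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class AuditEntry:
    """One immutable log entry (hypothetical structure, not BibexPy's schema)."""
    timestamp: str
    category: str               # e.g. "merge", "filter", "enrichment"
    parameters: dict[str, Any]  # thresholds, sources, criteria
    before: dict[str, Any]      # state of affected fields before the operation
    after: dict[str, Any]       # state of affected fields after the operation


class AuditLog:
    """Exposes append and read only; there is no update or delete method."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def append(self, category, parameters, before, after) -> AuditEntry:
        entry = AuditEntry(
            timestamp=datetime.now(timezone.utc).isoformat(),
            category=category,
            # Copy the dicts so later changes by the caller do not leak in.
            parameters=dict(parameters),
            before=dict(before),
            after=dict(after),
        )
        self._entries.append(entry)
        return entry

    @property
    def entries(self) -> tuple[AuditEntry, ...]:
        # A tuple copy: callers can read the history but not reorder or trim it.
        return tuple(self._entries)


log = AuditLog()
log.append("filter", {"min_year": 2000}, {"count": 120}, {"count": 95})
```

Freezing the dataclass means an attempt such as `entry.category = "merge"` raises an error instead of quietly changing the record.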
2 · Snapshots
Before any operation that affects records, a pre-operation snapshot is created automatically (only when records will actually change — no snapshot spam). Restoring a snapshot is itself a logged operation, so even rollbacks leave a trail.
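As a rough sketch of this behaviour, assuming records are held in memory as a list of dictionaries: the `Workspace` class, its methods, and the `drop_before` filter below are hypothetical names, not part of BibexPy. The sketch takes a snapshot only when an operation would change the records, and it logs a restore like any other operation.

```python
import copy


class Workspace:
    """Hypothetical in-memory stand-in for a project's record store."""

    def __init__(self, records):
        self.records = list(records)
        self.snapshots = []  # deep copies of earlier record states
        self.log = []        # simplified audit log (list of dicts)

    def apply(self, category, transform, **parameters):
        # Run the transform on a copy so the live records stay untouched.
        candidate = transform(copy.deepcopy(self.records), **parameters)
        if candidate == self.records:
            return None  # nothing would change: no snapshot is taken
        self.snapshots.append(copy.deepcopy(self.records))
        snapshot_id = len(self.snapshots) - 1
        self.records = candidate
        self.log.append(
            {"category": category, "parameters": parameters, "snapshot": snapshot_id}
        )
        return snapshot_id

    def restore(self, snapshot_id):
        self.records = copy.deepcopy(self.snapshots[snapshot_id])
        # The rollback itself is appended to the log.
        self.log.append({"category": "restore", "parameters": {"snapshot": snapshot_id}})


def drop_before(records, min_year):
    """Example filter: keep records published in or after min_year."""
    return [r for r in records if r["year"] >= min_year]


ws = Workspace([{"title": "A", "year": 1999}, {"title": "B", "year": 2005}])
ws.apply("filter", drop_before, min_year=1990)  # changes nothing, so no snapshot
ws.apply("filter", drop_before, min_year=2000)  # drops "A"; snapshot 0 is taken first
ws.restore(0)                                   # rollback, recorded in ws.log
```

The sketch is simplified: it compares whole record lists to detect a change, which would be too costly for a large corpus, and it does not snapshot the state that a restore replaces.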
3 · Isolated analyses
Each merge run lives in its own workspace inside the project. Re-merging with new data never mutates a previous analysis — old results remain exactly as reported.
The methodology narrative
The Report step compiles provenance into a structured operation report (Markdown / plain text / PDF) and an automatically generated methodology narrative. The narrative is restricted to facts present in the audit log — it cannot claim a cleaning step that never ran, which keeps your methods section honest by construction.
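One way to picture "restricted to facts present in the audit log" is a generator that can only emit sentences derived from log entries. The sketch below is an assumption about how such a constraint could be enforced, not a description of BibexPy's generator; `TEMPLATES`, `build_narrative`, and the log layout are hypothetical.

```python
# Hypothetical sentence templates, keyed by operation category.
TEMPLATES = {
    "merge": "Records from {sources} were merged.",
    "filter": "Records were filtered using the criterion: {criteria}.",
    "disambiguation": (
        "Author names were disambiguated with a similarity threshold of {threshold}."
    ),
}


def build_narrative(log):
    """Build a methods paragraph with exactly one sentence per logged operation."""
    sentences = []
    for entry in log:
        template = TEMPLATES.get(entry["category"])
        if template is None:
            # Unknown category: still report it rather than drop the step.
            sentences.append(f"An operation of type '{entry['category']}' was applied.")
        else:
            sentences.append(template.format(**entry["parameters"]))
    return " ".join(sentences)


log = [
    {"category": "merge", "parameters": {"sources": "Web of Science and Scopus"}},
    {"category": "filter", "parameters": {"criteria": "publication year >= 2000"}},
]
narrative = build_narrative(log)
```

Because every sentence comes from a log entry, a step that was never run (disambiguation, in this example) has no way to appear in the text.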
Alignment with FAIR & computational reproducibility
- Findable/Accessible — exports and reports use plain, open formats.
- Interoperable — WoS-convention field tags, standard BibTeX/RIS/CSV.
- Reusable — presets + audit log + snapshots make the exact corpus reconstructable.
- Deterministic-by-default processing means that repeated runs on the same input with the same parameters produce identical results, which is the foundation of computational reproducibility.
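A simple way to check the determinism claim on your own corpus is to fingerprint the output of two runs and compare. The sketch below is illustrative only: `fingerprint` and `deduplicate` are hypothetical helpers, and the DOI-based deduplication is a stand-in for whatever processing you actually run, not BibexPy's algorithm.

```python
import hashlib
import json


def canonical(record):
    """Serialize a record with sorted keys so equal records give equal strings."""
    return json.dumps(record, sort_keys=True, ensure_ascii=False)


def fingerprint(records):
    """SHA-256 over the canonical serialization of a list of records."""
    payload = json.dumps(records, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep one record per DOI, independent of the input order."""
    kept = {}
    # Sorting on the canonical form removes any dependence on input order.
    for record in sorted(records, key=canonical):
        kept.setdefault(record["doi"], record)
    return list(kept.values())


records = [
    {"doi": "10.1000/a", "title": "Alpha", "source": "wos"},
    {"doi": "10.1000/a", "title": "Alpha", "source": "scopus"},
    {"doi": "10.1000/b", "title": "Beta", "source": "wos"},
]

run_1 = fingerprint(deduplicate(records))
run_2 = fingerprint(deduplicate(list(reversed(records))))
same = run_1 == run_2  # True when the processing is order-independent
```

Recording such a fingerprint next to an exported corpus gives readers a quick way to confirm that they have reconstructed the same data.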
For your supplementary materials
Export the operation report as PDF and attach it — reviewers can verify every preparation decision without access to your machine.
