BibexPy — V2 Helium

Report & Provenance

BibexPy is built so that every transformation is traceable, reproducible and reversible. This page describes the provenance model behind the Report step.

The three pillars

1 · Append-only audit log

Every data-modifying operation is recorded with:

  • timestamp and operation category (merge, filter, disambiguation, enrichment, export, …)
  • the parameters used (thresholds, sources, criteria)
  • before/after states for affected fields

The log is append-only: entries are never edited or deleted, so the history can't be silently rewritten.
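
The shape of such a log can be sketched in a few lines of Python. This is an illustration of the idea only, not BibexPy's internal implementation: the names `AuditEntry` and `AuditLog`, the field names, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class AuditEntry:
    """Hypothetical log entry mirroring the three items listed above."""
    timestamp: str
    category: str
    parameters: Mapping[str, Any]
    before: Mapping[str, Any]
    after: Mapping[str, Any]


class AuditLog:
    """Hypothetical append-only log: it exposes append() and a read-only view,
    and deliberately offers no method to edit or delete an entry."""

    def __init__(self) -> None:
        self._entries = []

    def append(self, category, parameters, before, after) -> AuditEntry:
        entry = AuditEntry(
            timestamp=datetime.now(timezone.utc).isoformat(),
            category=category,
            # Copy each dict and wrap it in a read-only proxy so later changes
            # by the caller cannot alter what was logged.
            parameters=MappingProxyType(dict(parameters)),
            before=MappingProxyType(dict(before)),
            after=MappingProxyType(dict(after)),
        )
        self._entries.append(entry)
        return entry

    @property
    def entries(self) -> tuple:
        # A tuple copy: callers can read the history but not mutate the list.
        return tuple(self._entries)


# Made-up example values, for illustration only.
log = AuditLog()
log.append(
    category="enrichment",
    parameters={"source": "example-source"},
    before={"DI": ""},
    after={"DI": "10.1000/example"},
)
```

In this sketch immutability is enforced only at the Python object level. A real log would also need protection at the storage layer, which is outside the scope of the example.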

2 · Snapshots

Before any operation that affects records, a pre-operation snapshot is created automatically. Snapshots are taken only when records will actually change, so operations that modify nothing do not add one. Restoring a snapshot is itself a logged operation, so even rollbacks leave a trail.
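
The snapshot rules above can be sketched as follows. This is a minimal in-memory model under stated assumptions: the `Project` class, its `apply` and `restore` methods, and the log layout are hypothetical names for illustration, not BibexPy's API.

```python
import copy
from datetime import datetime, timezone


class Project:
    """Hypothetical project holding records, snapshots and a log."""

    def __init__(self, records):
        self.records = list(records)
        self.snapshots = []  # deep copies of earlier record states
        self.log = []        # entries are only ever appended

    def _log(self, category, **details):
        stamp = datetime.now(timezone.utc).isoformat()
        self.log.append({"timestamp": stamp, "category": category, **details})

    def apply(self, category, transform):
        """Run transform on a copy; snapshot only if the result differs."""
        new_records = transform(copy.deepcopy(self.records))
        if new_records == self.records:
            return None  # nothing changed: no snapshot, no log entry
        self.snapshots.append(copy.deepcopy(self.records))
        snapshot_id = len(self.snapshots) - 1
        self.records = new_records
        self._log(category, snapshot_id=snapshot_id)
        return snapshot_id

    def restore(self, snapshot_id):
        """Roll back to a snapshot and record the rollback in the log."""
        self.records = copy.deepcopy(self.snapshots[snapshot_id])
        self._log("restore", snapshot_id=snapshot_id)


# Made-up records, for illustration only.
project = Project([{"id": 1, "PY": 2008}, {"id": 2, "PY": 2015}])
sid = project.apply("filter", lambda rs: [r for r in rs if r["PY"] >= 2010])
project.restore(sid)
print([e["category"] for e in project.log])  # prints ['filter', 'restore']
```

Comparing full record lists to detect a change is the simplest possible check and is chosen here for clarity. It is an assumption of the sketch, not a description of how BibexPy decides that records will change.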

3 · Isolated analyses

Each merge run lives in its own workspace inside the project. Re-merging with new data never mutates a previous analysis — old results remain exactly as reported.
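
One way to picture this isolation is a fresh directory per run. The sketch below assumes a hypothetical `analyses/run-NNNN` layout inside the project folder. The layout and the function name `new_analysis_workspace` are illustrative assumptions, not BibexPy's actual on-disk structure.

```python
from pathlib import Path


def new_analysis_workspace(project_dir: Path) -> Path:
    """Create and return a new, empty workspace for one merge run.

    Existing run directories are never reused or modified.
    """
    analyses = project_dir / "analyses"
    analyses.mkdir(parents=True, exist_ok=True)

    # Collect the numeric suffixes of existing run-NNNN directories.
    numbers = []
    for path in analyses.glob("run-*"):
        suffix = path.name[len("run-"):]
        if path.is_dir() and suffix.isdigit():
            numbers.append(int(suffix))

    run_dir = analyses / f"run-{max(numbers, default=0) + 1:04d}"
    # exist_ok=False: fail loudly rather than write into an earlier analysis.
    run_dir.mkdir(exist_ok=False)
    return run_dir
```

Calling the function twice on the same project yields two separate directories, so files written during the second merge cannot overwrite the results of the first.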

The methodology narrative

The Report step compiles provenance into a structured operation report (Markdown / plain text / PDF) and an automatically generated methodology narrative. The narrative is restricted to facts present in the audit log — it cannot claim a cleaning step that never ran, which keeps your methods section honest by construction.
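
The "honest by construction" property follows from generating text only by iterating over log entries. The sketch below shows the principle. The templates, the entry layout, and the function name `build_narrative` are assumptions for illustration and do not reflect the wording or structure BibexPy actually produces.

```python
# Hypothetical sentence templates keyed by operation category.
TEMPLATES = {
    "merge": "Records from {sources} were merged.",
    "filter": "Records were filtered using the criterion {criteria}.",
    "disambiguation": "Disambiguation was run with a threshold of {threshold}.",
}


def build_narrative(entries):
    """Produce one sentence per logged operation, in log order.

    Sentences come only from entries, so a step that was never logged
    cannot appear in the output.
    """
    sentences = []
    for entry in entries:
        category = entry["category"]
        params = entry.get("parameters", {})
        template = TEMPLATES.get(category)
        sentence = None
        if template is not None:
            try:
                sentence = template.format(**params)
            except KeyError:
                sentence = None  # a required parameter is missing from the log
        if sentence is None:
            # Fall back to a neutral statement of what the log contains.
            sentence = (
                f"An operation of category '{category}' was applied "
                f"with parameters {params}."
            )
        sentences.append(sentence)
    return " ".join(sentences)


# Made-up log entries, for illustration only.
example_log = [
    {"category": "merge", "parameters": {"sources": "two database exports"}},
    {"category": "filter", "parameters": {"criteria": "PY >= 2010"}},
]
print(build_narrative(example_log))
```

Because no disambiguation entry exists in the example log, the generated text contains no disambiguation sentence.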

Alignment with FAIR & computational reproducibility

  • Findable/Accessible: exports and reports use plain, open formats.
  • Interoperable: WoS-convention field tags and standard BibTeX/RIS/CSV formats.
  • Reusable: presets, the audit log and snapshots make the exact corpus reconstructable.
  • Reproducible: deterministic-by-default processing keeps repeated runs identical, which is the foundation of computational reproducibility.
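
A simple way to check that two runs really produced the same corpus is to compare a fingerprint of the output. The sketch below is an independent check you could run yourself, not a BibexPy feature. The function name `corpus_fingerprint` and the representation of records as dictionaries keyed by field tag are assumptions.

```python
import hashlib
import json


def corpus_fingerprint(records):
    """Return a SHA-256 hex digest of a list of record dictionaries.

    Keys are sorted so that field order inside a record does not matter.
    Record order does matter: identical runs should emit identical order.
    """
    canonical = json.dumps(
        records,
        sort_keys=True,
        ensure_ascii=False,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up records standing in for the output of two repeated runs.
run_a = [{"TI": "An example title", "PY": 2020}]
run_b = [{"PY": 2020, "TI": "An example title"}]
print(corpus_fingerprint(run_a) == corpus_fingerprint(run_b))  # prints True
```

Publishing the fingerprint alongside the corpus would give readers a quick way to confirm that they have reconstructed the same data.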

For your supplementary materials

Export the operation report as PDF and attach it — reviewers can verify every preparation decision without access to your machine.