Reproducibility

Reference system

  • OS: Ubuntu 24.04.2 LTS
  • CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
  • GPU: None
  • Python: 3.12.10

Environment setup

  1. Create and activate a virtual environment:
    python -m venv .venv
    source .venv/bin/activate
  2. Install development dependencies and run tests to verify:
    pip install -e ".[dev]"
    PYTHONPATH=src pytest -q

Random seeds

All SheShe estimators accept a random_state argument to ensure deterministic results:

from sheshe import ModalBoundaryClustering
clustering = ModalBoundaryClustering(random_state=0)

Set the same random_state across runs to reproduce experiments.

Export system information

Capture exact package versions and platform details for future reference:

python -m pip freeze > packages.txt
python - <<'PY'
import json, platform, sys
info = {
    "platform": platform.platform(),
    "python": sys.version,
}
print(json.dumps(info, indent=2))
PY

How to report differences

If results differ from published benchmarks:

  • Share the generated packages.txt and platform JSON.
  • Describe the dataset and parameters used, including random_state values.
  • Open an issue with logs and reproduction steps.

TODO

  • Specify recommended RAM capacity.