RegionInterpreter
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sheshe import ModalBoundaryClustering, RegionInterpreter

# Load the data and fit the clusterer
iris = load_iris()
X, y = iris.data, iris.target
sh = ModalBoundaryClustering().fit(X, y)

# Summarize the fitted regions as rule cards
cards = RegionInterpreter(feature_names=iris.feature_names).summarize(sh.regions_)

# Plot the classes found by the clusterer
sh.plot_classes(X, y)
plt.show()
Converts ClusterRegion objects into compact human‑readable rule sets.
It summarises each region with axis‑aligned boxes, highlights informative
projections and offers helpers like pretty_print for reporting. Optional
LLM backends can turn the summaries into natural‑language descriptions.
Mathematical formulation
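For each feature j, an axis-aligned rule uses the quantile interval [Q_q(x_j), Q_{1-q}(x_j)] with q = q_box, which covers a fraction of about 1 - 2q of the data along that feature (90% for q_box=0.05).
A radius r is flagged as capped when its z-score z = (r - μ)/σ satisfies |z| > cap_threshold, where μ is the mean radius and σ the standard deviation.
Example: with q_box=0.05 and the integer values 1 to 10, the rule becomes [1.45, 9.55] (linear interpolation between order statistics). A radius of 3 with mean 2 and σ=0.3 yields z = (3 - 2)/0.3 ≈ 3.3, which is below the default cap_threshold of 6.393 and so would not be flagged.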
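The arithmetic in the example above can be reproduced with a short, dependency-free sketch. It is not the library's implementation: the helper names (linear_quantile, quantile_box, radius_z_score) are hypothetical, and linear interpolation between order statistics is assumed because it matches the [1.45, 9.55] result quoted above.

```python
import math


def linear_quantile(values, q):
    """Quantile with linear interpolation between order statistics (assumed method)."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)            # fractional index into the sorted data
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)      # guard against running past the last element
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def quantile_box(values, q_box=0.05):
    """Return the interval [Q_q, Q_{1-q}] for one feature."""
    return linear_quantile(values, q_box), linear_quantile(values, 1 - q_box)


def radius_z_score(r, mean, std):
    """z = (r - mean) / std, the quantity compared against cap_threshold."""
    return (r - mean) / std


values = list(range(1, 11))            # the integers 1 to 10
low, high = quantile_box(values, q_box=0.05)
z = radius_z_score(3.0, mean=2.0, std=0.3)

print(round(low, 2), round(high, 2))   # 1.45 9.55
print(round(z, 2))                     # 3.33
```

With |z| ≈ 3.33, this radius stays under a threshold of 6.393, so the check |z| > cap_threshold would not flag it.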
Example
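from sheshe import RegionInterpreter
# `region` is a ClusterRegion, e.g. one element of sh.regions_ from a fitted ModalBoundaryClustering
ri = RegionInterpreter(feature_names=["sepal", "petal"])
summary = ri.summarize(region)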
Usage examples
from sheshe import RegionInterpreter
ri = RegionInterpreter(feature_names=["sepal", "petal"])
# `region` is a ClusterRegion, e.g. one element of sh.regions_ from a fitted ModalBoundaryClustering
ri.summarize(region)    # summarize a single region
ri.summarize([region])  # summarize a list of regions
Interpretability example
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sheshe import ModalBoundaryClustering, RegionInterpreter
iris = load_iris()
X, y = iris.data, iris.target
sh = ModalBoundaryClustering(
base_estimator=RandomForestClassifier(random_state=0),
task="classification",
).fit(X, y)
# Tabular summary of the regions
print(sh.interpretability_summary(iris.feature_names).head())
# Human-readable rules per region
cards = RegionInterpreter(feature_names=iris.feature_names).summarize(sh.regions_)
RegionInterpreter.pretty_print(cards)
Parameters
feature_names (list[str] or None, default None): names for each feature used in the generated rules.
q_box (float, default 0.05): quantile used to compute robust axis-aligned boxes.
k_pairs (int, default 2): number of informative 2D projections to include.
decimals (int, default 2): decimal precision in the emitted rules.
cap_threshold (float, default 6.393): z-score threshold to mark capped radii.
near_const_tol (float, default 0.12): tolerance to report nearly constant dimensions.
inverse_transform (callable, optional): function applied to points before rule extraction (e.g. inverse scaling).
feature_bounds (Sequence[Tuple[float, float]] or None): hard bounds for each feature to clamp boxes.
include_center_in_box (bool, default True): include region centre when computing axis-aligned boxes.
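To make the roles of feature_bounds and decimals concrete, here is a minimal sketch of how a box could be clamped to hard bounds and rendered as text. It is an illustration under assumptions, not RegionInterpreter's code: the function format_rules and the "lo <= name <= hi" rule layout are hypothetical, and the box values are made up for the example.

```python
def format_rules(box, feature_names, feature_bounds=None, decimals=2):
    """Clamp each (lo, hi) interval to its hard bounds and format it as a rule string.

    Hypothetical helper: the real output format of RegionInterpreter may differ.
    """
    rules = []
    for j, (lo, hi) in enumerate(box):
        if feature_bounds is not None:
            bound_lo, bound_hi = feature_bounds[j]
            # Clamping: the box may not extend beyond the hard bounds
            lo, hi = max(lo, bound_lo), min(hi, bound_hi)
        rules.append(f"{lo:.{decimals}f} <= {feature_names[j]} <= {hi:.{decimals}f}")
    return rules


# Illustrative box for two features; the first dips below 0, the second exceeds 6.9
box = [(-0.237, 5.912), (1.004, 7.5)]
rules = format_rules(
    box,
    feature_names=["sepal", "petal"],
    feature_bounds=[(0.0, 10.0), (0.0, 6.9)],
    decimals=2,
)
for rule in rules:
    print(rule)
# 0.00 <= sepal <= 5.91
# 1.00 <= petal <= 6.90
```

Clamping keeps rules inside physically meaningful ranges (for instance, non-negative lengths) even when a quantile box computed in a transformed space drifts outside them.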
Methods
summarize(regions): return a list of dictionaries containing headlines, axis-aligned rules and pairwise projections for the provided regions.