ShuShu
```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

from sheshe import ShuShu

X, y = load_iris(return_X_y=True)
sh = ShuShu().fit(X, y)
sh.plot_classes(X, y)
plt.show()
```
Gradient-based optimiser that searches for local maxima of a scalar score or of
class probabilities. It runs one optimisation per class when labels are given,
can also operate on arbitrary user-defined score functions, and returns cluster
centroids for each discovered mode.
Mathematical formulation
ShuShu performs gradient-ascent updates x_{t+1} = x_t + η∇f(x_t) until ‖∇f(x_t)‖ < tol or a maximum iteration count is reached.
When analytic gradients are unavailable, they are estimated by simultaneous perturbation (SPSA): ∂f/∂x_i ≈ (f(x + cΔ) − f(x − cΔ)) / (2cΔ_i), where Δ is a random perturbation vector with Rademacher entries Δ_i ∈ {−1, +1}.
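The two formulas above can be sketched in plain NumPy. This is an illustrative sketch, not ShuShu's actual implementation: `spsa_grad` and `ascend` are hypothetical names, and for simplicity the loop runs a fixed number of iterations instead of checking ‖∇f(x_t)‖ < tol.

```python
import numpy as np

def spsa_grad(f, x, c=1e-3, rng=None):
    """One-sample SPSA estimate of grad f at x using a Rademacher perturbation."""
    rng = np.random.default_rng() if rng is None else rng
    delta = rng.choice([-1.0, 1.0], size=x.shape)  # Delta_i in {-1, +1}
    return (f(x + c * delta) - f(x - c * delta)) / (2.0 * c * delta)

def ascend(f, x0, eta=0.1, n_iter=500, rng=None):
    """Gradient ascent x_{t+1} = x_t + eta * grad f(x_t).

    A practical implementation would also stop early once the estimated
    gradient norm falls below a tolerance, as described above.
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        x = x + eta * spsa_grad(f, x, rng=rng)
    return x

# Concave score with a single mode at (1, 1); ascent should land there.
score = lambda z: -np.sum((z - 1.0) ** 2)
mode = ascend(score, np.zeros(2), rng=np.random.default_rng(0))
```

Because the SPSA estimate is unbiased (E[Δ_i Δ_j] vanishes for i ≠ j), the noisy ascent still contracts toward the mode on average.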
Example

```python
from sklearn.datasets import load_iris

from sheshe import ShuShu

X, y = load_iris(return_X_y=True)
ss = ShuShu()
ss.fit(X, y)
labels = ss.predict(X)
```
Usage examples

All snippets assume a feature matrix `X` and labels `y` as in the quickstart above.

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0)
shu.fit(X, y)  # fit
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0)
shu.fit_predict(X, y)  # fit_predict
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0)
shu.fit_transform(X, y)  # fit_transform
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0).fit(X, y)
shu.transform(X)  # transform
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0).fit(X, y)
shu.predict(X)  # predict
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0).fit(X, y)
shu.predict_proba(X)  # predict_proba
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0).fit(X, y)
shu.decision_function(X)  # decision_function
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0).fit(X, y)
shu.predict_regions(X)  # predict_regions
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0).fit(X, y)
shu.score(X, y)  # score
```

```python
from sheshe import ShuShu

shu = ShuShu(random_state=0).fit(X, y)
shu.save("shu.joblib")  # save
```

```python
from sheshe import ShuShu

shu = ShuShu.load("shu.joblib")  # load
```
Additional examples

```python
from sklearn.datasets import load_iris

from sheshe import ShuShu

X, y = load_iris(return_X_y=True)
sh = ShuShu(random_state=0).fit(X, y)
print(sh.summary_tables()[0][["class_label", "n_clusters"]])
```

```python
import numpy as np

from sheshe import ShuShu

def paraboloid(Z):
    return -np.linalg.norm(Z - 1.0, axis=1)

sc = ShuShu(random_state=0).fit(np.random.rand(100, 2), score_fn=paraboloid)
print(sc.centroids_)
```

```python
from sklearn.linear_model import LogisticRegression

from sheshe import ShuShu

model = LogisticRegression(max_iter=200).fit(X, y)
ShuShu(random_state=0).fit(X, y, score_model=model)
```
Parameters
- `clusterer_factory` (callable or `None`, default `None`): factory returning the internal optimiser. When `None`, a default implementation is used.
- `random_state` (int or `None`, default `None`): seed for reproducibility.
- `**clusterer_kwargs`: additional arguments forwarded to the internal optimiser.
Methods
- `fit(X, y=None, score_fn=None, ...)` – fit the optimiser. When `y` is provided, the score function is learned per class; otherwise `score_fn` must be supplied.
- `fit_predict(X, y=None, **kwargs)` – convenience wrapper calling `fit` then `predict`.
- `predict(X)` – return class labels or cluster ids.
- `predict_proba(X)` – class probabilities (after fitting with labels).
- `decision_function(X)` – raw decision scores.
- `transform(X)` – membership/affinity matrix.
- `fit_transform(X, y=None, **kwargs)` – combine `fit` and `transform`.
- `plot_pairs(X, y=None, max_pairs=None, show_histograms=False)` – scatter plots for feature pairs with optional marginal histograms.
- `plot_classes(X, y, grid_res=200, contour_levels=None, max_paths=20, show_paths=True)` – visualise class-wise score surfaces.
- `plot_pair_3d(X, y=None, features=(i, j), ax=None, fig=None, grid=64)` – 3D surface rendering of the score function.
- `get_cluster(cluster_id, with_geometry=False)` – retrieve information about a discovered cluster.