Prototype-Based Explainable Deep Clustering for Time Series

Introduction and Background

Time series data are generated in many domains, such as energy monitoring, finance, industrial IoT, and human activity recognition. Uncovering latent structure in such data without labels is a key challenge for data exploration and anomaly discovery.
Deep clustering methods jointly learn representations and cluster assignments, but they typically lack transparency. In sensitive applications, domain experts need to understand why instances are grouped together and what characterizes each cluster.

Prototype learning offers a promising approach to explainable clustering: each cluster is represented by one or more learned prototype time series, which serve as human-interpretable cluster exemplars. Instead of post-hoc explanations, prototypes provide built-in interpretability. However, while prototype networks have shown success in computer vision and tabular data, their potential for time series deep clustering remains largely unexplored.
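
To make this concrete, the sketch below shows one possible form of a latent prototype layer on top of a time-series encoder: a small set of learnable latent vectors is compared to each encoded series, and the negative distances give soft cluster assignments. All names and shapes (PrototypeLayer, n_prototypes, the placeholder encoder) are illustrative assumptions, not a fixed design.

    # Minimal sketch (PyTorch) of a latent prototype layer for deep clustering.
    # Names, shapes, and the placeholder encoder are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PrototypeLayer(nn.Module):
        def __init__(self, n_prototypes: int, latent_dim: int):
            super().__init__()
            # One learnable prototype vector per cluster, living in latent space.
            self.prototypes = nn.Parameter(torch.randn(n_prototypes, latent_dim))

        def forward(self, z: torch.Tensor) -> torch.Tensor:
            # z: (batch, latent_dim) encoder output.
            dists = torch.cdist(z, self.prototypes) ** 2   # (batch, n_prototypes)
            # Soft cluster assignments: closer prototypes get higher probability.
            return F.softmax(-dists, dim=1)

    # Usage with a placeholder encoder (a 1D-CNN or recurrent autoencoder in practice).
    encoder = nn.Sequential(nn.Flatten(), nn.Linear(128, 16))
    proto = PrototypeLayer(n_prototypes=5, latent_dim=16)
    x = torch.randn(32, 1, 128)        # 32 univariate series of length 128
    assignments = proto(encoder(x))    # (32, 5) soft cluster assignments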

This thesis will develop prototype-based deep clustering architectures for time series, providing interpretable clusters and insights into cluster structure.

Research Question

Main Research Question
How can prototype learning be incorporated into deep clustering for time series to jointly achieve high cluster quality and intrinsic interpretability?

Sub-questions

  • How should prototype representations be defined for time series (prototypes in the raw input space vs. the latent space)?
  • How can time-series-specific properties (temporal alignment, warping, invariances) be integrated into prototype learning (see the alignment sketch after this list)?
  • What is the trade-off between interpretability (prototype clarity) and clustering performance?
  • Can prototype-based clustering reveal meaningful structure in real-world time series datasets?
  • How does prototype-based interpretability compare to post-hoc methods (e.g. SHAP, attention-based explanations)?
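
To make the alignment sub-question above concrete, the following is a minimal dynamic time warping (DTW) distance in NumPy. Inside a learned prototype loss one would more likely use a differentiable relaxation such as soft-DTW; this snippet is only an illustrative sketch of the underlying alignment idea.

    # Minimal DTW distance between two univariate series (illustrative sketch only).
    import numpy as np

    def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
        n, m = len(a), len(b)
        # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j]
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = (a[i - 1] - b[j - 1]) ** 2
                # Allowed warping steps: diagonal match, insertion, deletion.
                cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
        return float(np.sqrt(cost[n, m]))

    # A time-shifted sine wave stays close under DTW despite the misalignment.
    t = np.linspace(0, 2 * np.pi, 100)
    print(dtw_distance(np.sin(t), np.sin(t + 0.5)))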

Tasks & Goals

  • Literature Review
    Study prototype learning, deep clustering, and interpretability for time series.
  • Data Preparation
    Select benchmark time series datasets (e.g. UCR, HAR) and preprocess them.
  • Model Development
    Implement deep clustering with prototype-based extensions (latent prototypes, decoded prototypes, temporal similarity).
  • Training & Evaluation
    Train the models and evaluate both clustering quality and prototype interpretability (clustering metrics are sketched after this list).
  • Comparison & Analysis
    Compare against baseline deep clustering and post-hoc explainability methods.
  • Thesis Documentation
    Summarize findings, discuss limitations, and outline future research directions.
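
For the clustering-quality side of the evaluation, standard external and internal metrics are available in scikit-learn. The sketch below assumes ground-truth labels exist for the benchmark datasets and are used for evaluation only, never during training; the function name and dictionary keys are illustrative.

    # Sketch of clustering-quality evaluation with scikit-learn (labels used for evaluation only).
    import numpy as np
    from sklearn.metrics import (adjusted_rand_score,
                                 normalized_mutual_info_score,
                                 silhouette_score)

    def clustering_report(z: np.ndarray, pred: np.ndarray, true: np.ndarray) -> dict:
        """z: latent embeddings, pred: predicted cluster ids, true: ground-truth labels."""
        return {
            "ARI": adjusted_rand_score(true, pred),           # external: agreement with labels
            "NMI": normalized_mutual_info_score(true, pred),  # external: shared information
            "Silhouette": silhouette_score(z, pred),          # internal: separation in latent space
        }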

Expected Outcome

  • A prototype-based deep clustering architecture (or a few variants) suitable for multivariate time series.
  • Interpretable prototypes: prototypes that can be visualized as time series and help explain each cluster (a visualization sketch follows this list).
  • Empirical results on the trade-off between clustering performance and interpretability.
  • Insights about when prototype approaches work well, when they struggle, and how they compare with alternative explainability techniques.
  • Accompanying code base and reproducible experiments.
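
As an example of how decoded prototypes could be inspected, the sketch below assumes a hypothetical decoder that maps latent prototype vectors back to series of shape (n_prototypes, length) and simply plots each decoded prototype with Matplotlib; the decoder and all names are assumptions for illustration.

    # Sketch: visualizing decoded prototypes as time series. The decoder is a hypothetical
    # module assumed to map (n_prototypes, latent_dim) vectors to (n_prototypes, length) series.
    import matplotlib.pyplot as plt
    import torch

    @torch.no_grad()
    def plot_prototypes(decoder: torch.nn.Module, prototypes: torch.Tensor) -> None:
        series = decoder(prototypes).cpu().numpy()   # (n_prototypes, length)
        fig, axes = plt.subplots(len(series), 1, sharex=True, squeeze=False,
                                 figsize=(6, 1.5 * len(series)))
        for k, ax in enumerate(axes[:, 0]):
            ax.plot(series[k])
            ax.set_title(f"Prototype {k}")
        fig.tight_layout()
        plt.show()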

Requirements

  • Basic Programming Skills:
    Experience with Python, including libraries such as NumPy, pandas, and matplotlib.

  • Introductory Machine Learning Knowledge:
    Understanding of neural networks, especially autoencoders or CNNs. Prior coursework or projects using PyTorch or TensorFlow is helpful but not mandatory.

  • Interest in interpretability, time series, and unsupervised methods
  • Willingness to learn or deepen knowledge about time-series alignment and similarity measures

References

Schlegel, U., Tavares, G. M., & Seidl, T. (2025). Towards Explainable Deep Clustering for Time Series Data. arXiv preprint arXiv:2507.20840.

Ming, Y., Xu, P., Qu, H., & Ren, L. (2019). Interpretable and steerable sequence learning via prototypes. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 903-913).