API Documentation

himap.ab module

Functions:

_curr_u(n_samples, u, t, j, d)

Provides the current value of u after checking that t - d is non-negative, t is less than n_samples, and d is greater than or equal to t - (n_samples - 1).

_forward(n_samples, n_states, n_durations, ...)

Computes the forward variable alpha needed for the likelihood computation and parameter re-estimation.

_backward(n_samples, n_states, n_durations, ...)

Computes the backward variable beta needed for the likelihood computation and parameter re-estimation.

_u_only(n_samples, n_states, n_durations, ...)

Computes the u values needed for the forward variable computation.

himap.ab._curr_u(n_samples, u, t, j, d)

Provides the current value of u after checking that t - d is non-negative, t is less than n_samples, and d is greater than or equal to t - (n_samples - 1). Utilized by the _forward auxiliary function.

Parameters:
  • n_samples (int) – Number of samples.

  • u (np.ndarray) – Array of shape (n_samples, n_states, n_durations) containing the u values as produced by the _u_only function.

  • t (int) – Current time step.

  • j (int) – Current state.

  • d (int) – Current duration.

Returns:

curr_u – Current value of u.

Return type:

float

See also

_forward

Function that computes the forward variable.

himap.base.HSMM._core_u_only

Method that computes the u values.
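The three conditions in the description above can be read as index guards on t and d. A minimal sketch follows, assuming a hypothetical helper guards_hold that is not part of himap. It only evaluates the conditions and makes no claim about which element of u the real function returns.

```python
def guards_hold(n_samples: int, t: int, d: int) -> bool:
    """Evaluate the three index conditions listed for _curr_u.

    Hypothetical helper for illustration only; it is not part of himap.
    """
    return (
        t - d >= 0                    # t - d is non-negative
        and t < n_samples             # t is less than n_samples
        and d >= t - (n_samples - 1)  # d is at least t - (n_samples - 1)
    )


print(guards_hold(n_samples=10, t=4, d=2))  # every condition holds
print(guards_hold(n_samples=10, t=1, d=3))  # fails because t - d is negative
```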

himap.ab._forward(n_samples, n_states, n_durations, log_startprob, log_transmat, log_durprob, left_censor, right_censor, eta, u, xi)

Computes the forward variable alpha needed for the likelihood computation and parameter re-estimation. Utilized by the HSMM._core_forward method.

Parameters:
  • n_samples (int) – Number of samples.

  • n_states (int) – Number of states.

  • n_durations (int) – Number of durations.

  • log_startprob (np.ndarray) – Array of shape (n_states,) containing the log of the initial state probabilities.

  • log_transmat (np.ndarray) – Array of shape (n_states, n_states) containing the log of the transition probabilities.

  • log_durprob (np.ndarray) – Array of shape (n_states, n_durations) containing the log of the duration probabilities.

  • left_censor (int) – 0 if no left censoring, 1 if left censoring (Default is 0).

  • right_censor (int) – 0 if no right censoring, 1 if right censoring (Default is 0).

  • eta (np.ndarray) – Array of shape (n_samples, n_states, n_durations) containing the eta values.

  • u (np.ndarray) – Array of shape (n_samples, n_states, n_durations) containing the u values as produced by the _u_only auxiliary function.

  • xi (np.ndarray) – Array of shape (n_samples, n_states, n_states) containing the xi values.

Returns:

alpha – Array of shape (n_states,) containing the alpha values.

Return type:

np.ndarray

See also

himap.base.HSMM._core_forward

Method that computes the forward variable.

himap.ab._backward(n_samples, n_states, n_durations, log_startprob, log_transmat, log_durprob, right_censor, beta, u, betastar)

Computes the backward variable beta needed for the likelihood computation and parameter re-estimation. Utilized by the HSMM._core_backward method.

Parameters:
  • n_samples (int) – Number of samples.

  • n_states (int) – Number of states.

  • n_durations (int) – Number of durations.

  • log_startprob (np.ndarray) – Array of shape (n_states,) containing the log of the initial state probabilities.

  • log_transmat (np.ndarray) – Array of shape (n_states, n_states) containing the log of the transition probabilities.

  • log_durprob (np.ndarray) – Array of shape (n_states, n_durations) containing the log of the duration probabilities.

  • right_censor (int) – 0 if no right censoring, 1 if right censoring (Default is 0).

  • beta (np.ndarray) – Array of shape (n_samples, n_states) containing the initialized beta values.

  • u (np.ndarray) – Array of shape (n_samples, n_states, n_durations) containing the u values as produced by the _u_only auxiliary function.

  • betastar (np.ndarray) – Array of shape (n_samples, n_states) containing the beta* values.

Return type:

None

See also

himap.base.HSMM._core_backward

Method that computes the backward variable.

Notes

The beta values are computed inplace.

himap.ab._u_only(n_samples, n_states, n_durations, log_obsprob, u)

Computes the u values needed for the forward variable computation. Utilized by the HSMM._core_u_only method.

Parameters:
  • n_samples (int) – Number of samples.

  • n_states (int) – Number of states.

  • n_durations (int) – Number of durations.

  • log_obsprob (np.ndarray) – Array of shape (n_samples, n_states) containing the log of the observation probabilities.

  • u (np.ndarray) – Array of shape (n_samples, n_states, n_durations) containing the u values.

Return type:

None

See also

himap.base.HSMM._core_u_only

Method that computes the u values.

Notes

The u values are computed inplace.
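To illustrate what filling u in place can look like, here is a pure-Python sketch. The recurrence is an assumption borrowed from common explicit-duration HSMM implementations (u accumulates log observation probabilities over the last d + 1 samples). This documentation does not confirm it, and the name u_only_sketch is hypothetical.

```python
def u_only_sketch(n_samples, n_states, n_durations, log_obsprob, u):
    """Fill u in place; nothing is returned.

    Assumed recurrence (not confirmed by the himap documentation):
    u[t][j][d] = log_obsprob[t][j] + u[t-1][j][d-1], with the base case
    u[t][j][d] = log_obsprob[t][j] when t == 0 or d == 0.
    """
    for t in range(n_samples):
        for j in range(n_states):
            for d in range(n_durations):
                if t < 1 or d < 1:
                    u[t][j][d] = log_obsprob[t][j]
                else:
                    u[t][j][d] = u[t - 1][j][d - 1] + log_obsprob[t][j]


# Toy input: 3 samples, 2 states, 2 durations (nested lists stand in for arrays).
log_obsprob = [[-1.0, -2.0], [-0.5, -1.5], [-0.25, -3.0]]
u = [[[0.0] * 2 for _ in range(2)] for _ in range(3)]
u_only_sketch(3, 2, 2, log_obsprob, u)
print(u[2][0])
```

The caller allocates u and passes it in, which mirrors the in-place convention described in the Notes.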

himap.base module

Classes:

HSMM([n_states, n_durations, n_iter, tol, ...])

Base class for Hidden Semi-Markov Models (HSMMs)

GaussianHSMM([n_states, n_durations, ...])

The GaussianHSMM class models Hidden Semi-Markov processes with Gaussian-distributed emissions.

HMM([n_states, n_obs_symbols, n_iter, tol, ...])

The HMM class models Hidden Markov processes with discrete emissions.

class himap.base.HSMM(n_states=2, n_durations=5, n_iter=20, tol=0.01, left_to_right=False, obs_state_len=None, f_value=None, random_state=None, name='', results_parent_path=None)

Bases: object

Base class for Hidden Semi-Markov Models (HSMMs)

Parameters:
  • n_states (int) – Number of hidden states. Must be ≥ 2.

  • n_durations (int) – Number of duration categories per state. Must be ≥ 1.

  • n_iter (int) – Maximum number of iterations for training.

  • tol (float) – Convergence threshold for stopping the training.

  • left_to_right (bool) – Indicates whether the model follows a left-to-right topology.

  • obs_state_len (int, optional) – Length of the observed state (required if f_value is provided).

  • f_value (int/float, optional) – Final observed value of the state (required if obs_state_len is provided).

  • random_state (int/None, optional) – Seed for reproducibility.

  • name (str, optional) – Name of the model. Defaults to “hsmm” if not provided.

  • results_parent_path (str, optional) – The path under which the himap_results directory tree is created and where the results (models, figures, dictionaries, performance metrics) are saved.

Methods:

_init([X])

Initializes model parameters if they are not already set.

_init_mc()

Initialize the model parameters for MC sampling (to be implemented in child class).

_require_core()

Ensures that the HiMAP Cython extension is available.

_check()

Validates the initialized parameters.

_dur_init(*args)

Initializes duration parameters if there are no arguments yet (to be implemented in child class).

_dur_check(*args)

Checks whether the properties of the duration parameters are satisfied (to be implemented in child class).

_dur_probmat(*args)

Compute the probability of each duration per state (to be implemented in child class).

_dur_mstep(*args)

Compute the duration parameters (to be implemented in child class).

_emission_logl(*args)

Compute the log-likelihood of each observation under each state (to be implemented in child class).

_emission_pre_mstep(*args)

Prepare for emission parameters re-estimation (process gamma and save output to emission_var) (to be implemented in child class).

_emission_mstep(*args)

Compute the emission parameters.

_state_sample(*args)

Generate an observation sequence for a given state (to be implemented in child class).

sample([n_samples, random_state])

Generates a sequence of observations and corresponding state sequence performing a random walk on the model (MC Sampling).

mc_dataset(num, timesteps)

Generates a dataset of a number of observations and corresponding state sequences utilizing the sample method.

_core_u_only(logframe)

Computes auxiliary matrix u for duration probabilities utilizing the ab._u_only method.

_core_forward(u, logdur)

Performs the forward step of the HSMM algorithm using duration and transition probabilities, utilizing the ab._forward method.

_core_backward(u, logdur)

Implements the backward algorithm for the HSMM.

_core_smoothed(beta, betastar, eta, xi)

Combines forward and backward variables to compute the smoothed probabilities.

_core_viterbi(u, logdur)

Implements the Viterbi algorithm for finding the most probable state sequence given the observations.

score(X)

Computes the log-likelihood of the observation sequences under the current model.

predict(X)

Predicts the most likely hidden state sequence for a given observation sequence using the Viterbi algorithm.

fit(X[, save_iters])

Trains the model using the Expectation-Maximization (EM) algorithm.

bic(train)

Computes the Bayesian Information Criterion (BIC) score to evaluate model performance.

fit_bic(X, states[, return_models])

Fits multiple models with different numbers of states, evaluates them using the bic method, and selects the best one.

RUL(viterbi_states, max_samples[, equation])

Estimates the Remaining Useful Life (RUL) for a given state history using convolution of duration probabilities.

prognostics(data[, max_samples, plot_rul, ...])

Performs prognostics for given degradation histories, estimating RUL utilizing the RUL method and saving the results.

save_model([path])

Saves the current model state to self.results_path/models/self.name.pkl or to path if provided.

load_model([model_name, path])

Loads a previously saved model state from self.results_path/models/model_name.pkl or from path if provided.

_init(X=None)

Initializes model parameters if they are not already set. For left-to-right models, sets the probability of starting in the first state to 1 (pi[0] = 1) and enforces forward transitions. For other topologies, distributes probabilities evenly among states.

Parameters:

X (dict) – Observation dataset (optional) as a dictionary with trajectory identifiers and observation sequences made with the utils.create_data_hsmm method.

Return type:

None

See also

himap.utils.create_data_hsmm

Generates a dataset of trajectories for the model.
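The two initialization branches described for _init can be sketched as follows. This is an illustration under stated assumptions, not himap's code. The helper name init_start_and_transitions is hypothetical, the absorbing last state is an assumption, and so is the absence of self-transitions in the non-left-to-right branch.

```python
def init_start_and_transitions(n_states, left_to_right):
    """Return (pi, tmat) as nested lists. Hypothetical helper, not part of himap."""
    if left_to_right:
        # All starting mass on the first state: pi[0] = 1.
        pi = [1.0] + [0.0] * (n_states - 1)
        tmat = [[0.0] * n_states for _ in range(n_states)]
        for i in range(n_states - 1):
            tmat[i][i + 1] = 1.0  # only the next state is reachable
        tmat[-1][-1] = 1.0        # assumption: the last state is absorbing
    else:
        # Probabilities spread evenly among states.
        pi = [1.0 / n_states] * n_states
        share = 1.0 / (n_states - 1)  # assumption: no self-transitions
        tmat = [
            [0.0 if i == j else share for j in range(n_states)]
            for i in range(n_states)
        ]
    return pi, tmat


pi_ltr, tmat_ltr = init_start_and_transitions(3, left_to_right=True)
pi_any, tmat_any = init_start_and_transitions(3, left_to_right=False)
print(pi_ltr, tmat_ltr[0])
print(pi_any, tmat_any[0])
```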

_init_mc()

Initialize the model parameters for MC sampling (to be implemented in child class).

_require_core()

Ensures that the HiMAP Cython extension is available.

Return type:

None

_check()

Validates the initialized parameters:

  • Ensures starting probabilities (pi) sum to 1.

  • Checks the transition matrix (tmat) shape and row sums.

  • Verifies duration probabilities.

Return type:

None
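A minimal sketch of the kind of validation listed above, written with plain nested lists. The helper name check_parameters, the tolerance, and the reading of "verifies duration probabilities" as "each row sums to 1" are assumptions made for illustration.

```python
import math


def check_parameters(pi, tmat, dur, tol=1e-9):
    """Raise ValueError if the parameters are inconsistent. Hypothetical helper."""
    n_states = len(pi)
    if not math.isclose(sum(pi), 1.0, abs_tol=tol):
        raise ValueError("starting probabilities must sum to 1")
    if len(tmat) != n_states or any(len(row) != n_states for row in tmat):
        raise ValueError("tmat must have shape (n_states, n_states)")
    for i, row in enumerate(tmat):
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            raise ValueError(f"row {i} of tmat must sum to 1")
    # Assumption: each state's duration distribution must also sum to 1.
    for i, row in enumerate(dur):
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            raise ValueError(f"duration probabilities of state {i} must sum to 1")


# A consistent two-state parameter set passes silently.
check_parameters([1.0, 0.0], [[0.0, 1.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 0.0]])
```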

_dur_init(*args)

Initializes duration parameters if there are no arguments yet (to be implemented in child class).

_dur_check(*args)

Checks whether the properties of the duration parameters are satisfied (to be implemented in child class).

_dur_probmat(*args)

Compute the probability of each duration per state (to be implemented in child class).

_dur_mstep(*args)

Compute the duration parameters (to be implemented in child class).

_emission_logl(*args)

Compute the log-likelihood of each observation under each state (to be implemented in child class).

_emission_pre_mstep(*args)

Prepare for emission parameters re-estimation (process gamma and save output to emission_var) (to be implemented in child class).

_emission_mstep(*args)

Compute the emission parameters (to be implemented in child class).

_state_sample(*args)

Generate an observation sequence for a given state (to be implemented in child class).

sample(n_samples=5, random_state=None)

Generates a sequence of observations and corresponding state sequence performing a random walk on the model (MC Sampling).

Parameters:
  • n_samples (int) – Number of observations to generate.

  • random_state (int/None) – Seed for reproducibility.

Returns:

  • ctr_sample (int) – Number of samples generated.

  • X (ndarray) – Generated observation sequence.

  • state_sequence (ndarray) – State sequence corresponding to the observations.
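The random walk behind sample can be pictured with the sketch below. It is not himap's implementation: the function sample_sketch, its explicit parameter arguments, the one-dimensional Gaussian emissions, and the choice to keep sampling until exactly n_samples observations exist are all assumptions made so the example is self-contained.

```python
import random


def sample_sketch(pi, tmat, dur, means, stds, n_samples=5, random_state=None):
    """Random walk on a toy HSMM; returns (ctr_sample, X, state_sequence)."""
    rng = random.Random(random_state)
    states = list(range(len(pi)))
    X, state_sequence = [], []
    state = rng.choices(states, weights=pi)[0]
    while len(X) < n_samples:
        # dur[state][k] is taken as P(duration = k + 1) (assumption).
        durations = range(1, len(dur[state]) + 1)
        d = rng.choices(durations, weights=dur[state])[0]
        for _ in range(d):
            if len(X) == n_samples:
                break
            X.append(rng.gauss(means[state], stds[state]))
            state_sequence.append(state)
        state = rng.choices(states, weights=tmat[state])[0]
    return len(X), X, state_sequence


# Left-to-right toy model with an absorbing last state.
pi = [1.0, 0.0, 0.0]
tmat = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
dur = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
ctr, X, seq = sample_sketch(pi, tmat, dur, [0.0, 5.0, 10.0], [1.0, 1.0, 1.0],
                            n_samples=8, random_state=0)
print(ctr, seq)
```

Passing the same random_state reproduces the same trajectory, which is the purpose of the seed argument.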

mc_dataset(num, timesteps)

Generates a dataset of a number of observations and corresponding state sequences utilizing the sample method.

Parameters:
  • num (int) – Number of samples to generate.

  • timesteps (int) – Number of maximum timesteps for each sample.

Returns:

  • obs (dict[str, List[int]]) – A dictionary with trajectory observations.

  • states (dict[str, List[int]]) – A dictionary with the corresponding states for each trajectory.

See also

HSMM.sample

Generates a sequence of observations and corresponding state sequence performing a random walk on the model (MC Sampling).

_core_u_only(logframe)

Computes auxiliary matrix u for duration probabilities utilizing the ab._u_only method.

Parameters:

logframe (ndarray) – A 2D array of log-likelihood values for each observation under each state. Shape: (n_samples, n_states).

Returns:

u – A 3D array of intermediate values computed for each sample, state, and duration. Shape: (n_samples, n_states, n_durations).

Return type:

ndarray

See also

himap.ab._u_only

Computes the auxiliary matrix u for duration probabilities.

_core_forward(u, logdur)

Performs the forward step of the HSMM algorithm using duration and transition probabilities, utilizing the ab._forward method.

Parameters:
  • u (ndarray) – Intermediate values computed from _core_u_only. Shape: (n_samples, n_states, n_durations).

  • logdur (ndarray) – Logarithm of the duration probabilities for each state. Shape: (n_states, n_durations).

Returns:

  • eta (ndarray) – Smoothed probabilities for states and durations at each sample. Shape: (n_samples + 1, n_states, n_durations).

  • xi (ndarray) – Transition probabilities between states at each step. Shape: (n_samples + 1, n_states, n_states).

  • alpha (ndarray) – Forward probabilities for each state at each sample. Shape: (n_samples, n_states).

See also

himap.ab._forward

Performs the forward step of the HSMM algorithm.

_core_backward(u, logdur)

Implements the backward algorithm for the HSMM. Computes backward probabilities and intermediate variables for scaling. Utilizes the ab._backward method.

Parameters:
  • u (ndarray) – Auxiliary u values computed by _core_u_only. Shape: (n_samples, n_states, n_durations).

  • logdur (ndarray) – Logarithmic duration probability matrix.

Returns:

  • beta (ndarray) – Backward probabilities for each state.

  • betastar (ndarray) – Scaled backward probabilities.

See also

himap.ab._backward

Implements the backward algorithm for the HSMM.

_core_smoothed(beta, betastar, eta, xi)

Combines forward and backward variables to compute the smoothed probabilities. Implemented in Cython.

Parameters:
  • beta (ndarray) – Backward probabilities for each state.

  • betastar (ndarray) – Scaled backward probabilities.

  • eta (ndarray) – Transition probabilities.

  • xi (ndarray) – Joint probabilities of transitions.

Returns:

gamma – Smoothed probabilities.

Return type:

ndarray

_core_viterbi(u, logdur)

Implements the Viterbi algorithm for finding the most probable state sequence given the observations.

Parameters:
  • u (ndarray) – Auxiliary u values computed by _core_u_only. Shape: (n_samples, n_states, n_durations).

  • logdur (ndarray) – Logarithmic duration probability matrix.

Returns:

  • state_sequence (ndarray) – The most probable sequence of states.

  • state_logl (float) – Log-likelihood of the state sequence.

score(X)

Computes the log-likelihood of the observation sequences under the current model.

Parameters:

X (ndarray) – Observation sequences.

Returns:

score – Total log-likelihood of the observations.

Return type:

float

predict(X)

Predicts the most likely hidden state sequence for a given observation sequence using the Viterbi algorithm.

Parameters:

X (ndarray) – Observation sequences.

Returns:

  • state_sequence (ndarray) – Predicted state sequence.

  • state_logl (float) – Log-likelihood of the predicted state sequence.

fit(X, save_iters=False)

Trains the model using the Expectation-Maximization (EM) algorithm.

Parameters:
  • X (dict) – Observation sequences following the format of the utils.create_data_hsmm method.

  • save_iters (bool, optional) – Whether to save the model after each iteration. Defaults to False.

Returns:

self – The trained model.

Return type:

object

See also

himap.utils.create_data_hsmm

Generates a dataset of trajectories for the model.

bic(train)

Computes the Bayesian Information Criterion (BIC) score to evaluate model performance.

Parameters:

train (dict) – Observation sequences used for training.

Returns:

score – The BIC score for the model.

Return type:

float
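For orientation, the textbook BIC definition is sketched below. How himap counts free parameters and which sign convention it uses are not stated in this documentation, so the function bic_score and its arguments are hypothetical.

```python
import math


def bic_score(log_likelihood, n_params, n_observations):
    """Textbook BIC: k * ln(n) - 2 * ln(L). Lower is better under this convention."""
    return n_params * math.log(n_observations) - 2.0 * log_likelihood


# Made-up numbers: the larger model fits slightly better but pays for its parameters.
small = bic_score(log_likelihood=-120.0, n_params=8, n_observations=200)
large = bic_score(log_likelihood=-118.0, n_params=20, n_observations=200)
print(small, large)
```

Under this convention the model with the smaller score would be preferred, which is the kind of comparison a BIC-based model selection performs.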

fit_bic(X, states, return_models=False)

Fits multiple models with different numbers of states, evaluates them using the bic method, and selects the best one.

Parameters:
  • X (dict) – Observation sequences (same format as fit).

  • states (list[int]) – List of state counts to evaluate.

  • return_models (bool, optional) – Whether to return all trained models. Defaults to False.

Returns:

  • self (object) – The best-performing model.

  • bic (list[float]) – BIC scores for each fitted model.

  • models (dict, optional) – All trained models, returned if return_models=True.

See also

himap.utils.create_data_hsmm

Generates a dataset of trajectories for the model.

HSMM.bic

Computes the Bayesian Information Criterion (BIC) score to evaluate model performance.

HSMM.fit

Trains the model using the Expectation-Maximization (EM) algorithm.

RUL(viterbi_states, max_samples, equation=1)

Estimates the Remaining Useful Life (RUL) for a given state history using convolution of duration probabilities.

Parameters:
  • viterbi_states (numpy.ndarray) – Sequence of Viterbi states representing the history of hidden states.

  • max_samples (int) – Maximum length of RUL to consider.

  • equation (int, optional) – Equation type for RUL estimation. Default is 1.

Returns:

  • RUL (numpy.ndarray) – RUL probability distribution for each timestep.

  • mean_RUL (numpy.ndarray) – Mean RUL for each timestep.

  • UB_RUL (numpy.ndarray) – Upper bound of the RUL distribution.

  • LB_RUL (numpy.ndarray) – Lower bound of the RUL distribution.
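The idea of obtaining an RUL distribution by convolving duration probabilities can be sketched for a left-to-right model. This is a simplified illustration, not the equation himap uses. It assumes the RUL is evaluated at the moment a state is entered, that dur[s][k] is P(duration = k + 1), and that the last state represents failure. The helper names are hypothetical.

```python
def convolve(p, q):
    """Discrete convolution of two probability mass functions."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def rul_pmf_on_entry(dur, state, max_samples):
    """PMF of the RUL when `state` is entered; index r holds P(RUL = r)."""
    pmf = [1.0]  # zero remaining time before any sojourn is added
    # The last state is treated as failure and adds no time (assumption).
    for s in range(state, len(dur) - 1):
        pmf = convolve(pmf, [0.0] + list(dur[s]))  # shift so index == duration
    pmf = pmf[:max_samples]
    return pmf + [0.0] * (max_samples - len(pmf))


dur = [[0.5, 0.5], [1.0, 0.0], [1.0, 0.0]]
pmf = rul_pmf_on_entry(dur, state=0, max_samples=6)
mean_rul = sum(r * p for r, p in enumerate(pmf))
print(pmf, mean_rul)
```

Upper and lower bounds can then be read from the cumulative sum of the PMF at the chosen confidence levels.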

prognostics(data, max_samples=None, plot_rul=True, get_metrics=True, equation=1, return_results=False)

Performs prognostics for given degradation histories, estimating RUL utilizing the RUL method and saving the results.

Parameters:
  • data (dict) – A dictionary where keys are trajectory IDs and values are degradation histories following the format of the utils.create_data_hsmm method.

  • max_samples (int, optional) – Maximum length of RUL. Defaults to 10x the maximum trajectory length.

  • plot_rul (bool, optional) – Whether to plot RUL results for each sample. Default is True.

  • get_metrics (bool, optional) – Whether to compute and save evaluation metrics. Default is True.

  • equation (int, optional) – Equation type for RUL estimation. Default is 1.

  • return_results (bool, optional) – Whether to return the results: mean_rul_per_step, pdf_ruls_all, upper_rul_per_step, lower_rul_per_step (default is False).

Returns:

  • None if return_results is False

  • mean_rul_per_step (dict (Optional)) – A dictionary containing the mean_RUL ndarray per trajectory.

  • pdf_ruls_all (dict (Optional)) – A dictionary containing the RUL ndarray per trajectory.

  • upper_rul_per_step (dict (Optional)) – A dictionary containing the upper_rul_per_step ndarray per trajectory.

  • lower_rul_per_step (dict (Optional)) – A dictionary containing the lower_rul_per_step ndarray per trajectory.

Notes

Saves the following in the ‘results’ directory:

  • PDF RUL distributions.

  • Mean RUL per step.

  • Upper and lower RUL bounds.

  • Evaluation metrics (if get_metrics=True).

  • RUL plots (if plot_rul=True).

See also

HSMM.RUL

Estimates the Remaining Useful Life (RUL) for a given state history using convolution of duration probabilities.

himap.utils.create_data_hsmm

Generates a dataset of trajectories for the model.

save_model(path=None)

Saves the current model state to self.results_path/models/self.name.pkl or to path if provided.

Parameters:

path (str (optional)) – The path to save the model (overrides the default path).

Return type:

None
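The default-path logic described above can be sketched with the standard library. The use of pickle is an assumption suggested by the .pkl extension, the plain dictionary stands in for the model state, and save_state and load_state are hypothetical helpers rather than himap functions.

```python
import pickle
import tempfile
from pathlib import Path


def save_state(state, results_path, name, path=None):
    """Write state to results_path/models/<name>.pkl, or to path if given."""
    target = Path(path) if path is not None else Path(results_path) / "models" / f"{name}.pkl"
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("wb") as fh:
        pickle.dump(state, fh)
    return target


def load_state(results_path, model_name=None, path=None):
    """Read state from results_path/models/<model_name>.pkl, or from path if given."""
    source = Path(path) if path is not None else Path(results_path) / "models" / f"{model_name}.pkl"
    with source.open("rb") as fh:
        return pickle.load(fh)


state = {"n_states": 3, "pi": [1.0, 0.0, 0.0]}
with tempfile.TemporaryDirectory() as tmp:
    written = save_state(state, results_path=tmp, name="hsmm")
    restored = load_state(results_path=tmp, model_name="hsmm")
print(written.name, restored == state)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.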

load_model(model_name=None, path=None)

Loads a previously saved model state from self.results_path/models/model_name.pkl or from path if provided.

Parameters:
  • model_name (str (optional)) – Name of the model file to load (without extension). Not needed if path is provided.

  • path (str (optional)) – The path to load the model from (overrides the default path).

Return type:

None

class himap.base.GaussianHSMM(n_states=2, n_durations=5, n_iter=100, tol=0.5, left_to_right=True, obs_state_len=None, f_value=None, random_state=None, name='', results_parent_path=None, kmeans_init='k-means++', kmeans_n_init='auto')

Bases: HSMM

The GaussianHSMM class models Hidden Semi-Markov processes with Gaussian-distributed emissions. It supports explicit duration modeling, and it can handle left-to-right or arbitrary state transitions. K-means clustering is used for initialization.

Parameters:
  • n_states (int) – Number of hidden states in the model. Default is 2.

  • n_durations (int) – Maximum duration for each state. Default is 5.

  • n_iter (int) – Maximum number of iterations for model fitting. Default is 100.

  • tol (float) – Convergence threshold for the EM algorithm. Default is 0.5.

  • left_to_right (bool) – If True, constrains transitions to progress in a left-to-right manner. Default is True for prognostics.

  • obs_state_len (int, optional) – Length of observed state (relevant in specific configurations).

  • f_value (float, optional) – Emission value for the final state, if applicable.

  • random_state (int or RandomState instance, optional) – Seed or random state for reproducibility.

  • name (str) – Name identifier for the model.

  • kmeans_init (str) – Initialization method for K-means clustering (‘k-means++’ or ‘random’). Default is ‘k-means++’.

  • kmeans_n_init (int or str) – Number of initializations for K-means clustering. Default is ‘auto’.

  • results_parent_path (str, optional) – The path under which the himap_results directory tree is created and where the results (models, figures, dictionaries, performance metrics) are saved.

Methods:

_init([X])

Initializes model parameters based on input data X.

_init_mc()

Initializes model parameters for the Monte Carlo Sampling example.

_check()

Performs validation checks to ensure model parameters are consistent.

_dur_init()

Initializes the duration probability matrix self.dur.

_dur_check()

Validates the duration probability matrix self.dur.

_dur_probmat()

Returns the duration probability matrix self.dur.

_dur_mstep(new_dur)

Performs the M-step update for the duration probabilities.

_emission_logl(X)

Calculates the log-likelihood of the emissions given the observations.

_emission_mstep(X, emission_var[, inplace])

Performs the M-step update for emission parameters.

_state_sample(state[, random_state])

Generates a sample from the Gaussian distribution of a specified state.

_init(X=None)

Initializes model parameters based on input data X.

Parameters:

X (numpy.ndarray, optional) – Observations to initialize the model. If None, defaults to 1D Gaussian emissions.

Return type:

None

_init_mc()

Initializes model parameters for the Monte Carlo Sampling example.

Return type:

None

_check()

Performs validation checks to ensure model parameters are consistent.

Return type:

None

_dur_init()

Initializes the duration probability matrix self.dur.

Return type:

None

_dur_check()

Validates the duration probability matrix self.dur.

Return type:

None

_dur_probmat()

Returns the duration probability matrix self.dur (no changes for non-parametric duration distributions).

_dur_mstep(new_dur)

Performs the M-step update for the duration probabilities (no changes for non-parametric duration distributions).

Parameters:

new_dur (numpy.ndarray) – Updated duration probabilities.

Return type:

None

_emission_logl(X)

Calculates the log-likelihood of the emissions given the observations.

Parameters:

X (numpy.ndarray) – Observations.

Returns:

logframe – Log-likelihood of each observation under each state.

Return type:

numpy.ndarray
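As a worked illustration of a per-state emission log-likelihood, the sketch below evaluates a one-dimensional Gaussian log-density for every observation and state. The function gaussian_logframe and the restriction to scalar observations are assumptions. The library works with mean vectors and covariance matrices.

```python
import math


def gaussian_logframe(X, means, variances):
    """Return logframe[t][j] = log N(X[t]; means[j], variances[j])."""
    logframe = []
    for x in X:
        row = [
            -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
            for mu, var in zip(means, variances)
        ]
        logframe.append(row)
    return logframe


# Two observations scored under two states: shape (n_samples=2, n_states=2).
logframe = gaussian_logframe([0.0, 4.0], means=[0.0, 5.0], variances=[1.0, 1.0])
print(logframe)
```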

_emission_mstep(X, emission_var, inplace=True)

Performs the M-step update for emission parameters.

Parameters:
  • X (numpy.ndarray) – Observations.

  • emission_var (numpy.ndarray) – Responsibilities or posteriors for each observation-state pair.

  • inplace (bool, optional) – If True, updates parameters in-place. If False, returns updated parameters.

Returns:

  • mean (numpy.ndarray, optional) – Updated means for each state (if inplace=False).

  • covmat (numpy.ndarray, optional) – Updated covariance matrices for each state (if inplace=False).
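The M-step for Gaussian emissions amounts to responsibility-weighted means and variances. A minimal one-dimensional sketch follows. The function gaussian_mstep is hypothetical and always returns the new values, which corresponds to the inplace=False behaviour described above.

```python
def gaussian_mstep(X, emission_var):
    """Weighted mean and variance per state from responsibilities.

    emission_var[t][j] is the weight of observation X[t] for state j.
    """
    n_states = len(emission_var[0])
    means, variances = [], []
    for j in range(n_states):
        weights = [row[j] for row in emission_var]
        total = sum(weights)
        mu = sum(w * x for w, x in zip(weights, X)) / total
        var = sum(w * (x - mu) ** 2 for w, x in zip(weights, X)) / total
        means.append(mu)
        variances.append(var)
    return means, variances


# Hard assignments: the first two points belong to state 0, the rest to state 1.
X = [0.0, 2.0, 10.0, 12.0]
resp = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
means, variances = gaussian_mstep(X, resp)
print(means, variances)
```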

_state_sample(state, random_state=None)

Generates a sample from the Gaussian distribution of a specified state.

Parameters:
  • state (int) – Index of the state to sample from.

  • random_state (int or RandomState, optional) – Random seed or state for reproducibility.

Returns:

sample – Sampled observation.

Return type:

numpy.ndarray

class himap.base.HMM(n_states=2, n_obs_symbols=30, n_iter=100, tol=0.01, left_to_right=True, name='', results_parent_path=None)

Bases: object

The HMM class models Hidden Markov processes with discrete emissions.

Parameters:
  • n_states (int) – Number of hidden states in the model. Must be ≥ 2.

  • n_obs_symbols (int) – Number of observation symbols.

  • n_iter (int) – Maximum number of iterations for training. Default is 100.

  • tol (float) – Tolerance for convergence during training. Default is 1e-2.

  • left_to_right (bool) – Whether the HMM uses a left-to-right structure. Default is True for use in prognostics.

  • name (str) – Name of the model. Default is “hmm” if no name is provided.

  • results_parent_path (str, optional) – The path under which the himap_results directory tree is created and where the results (models, figures, dictionaries, performance metrics) are saved.

Methods:

_init([X])

Initializes transition and emission matrices based on model structure (left_to_right).

_init_mc()

Initializes the model parameters for the Monte Carlo Sampling example.

fit(X[, return_all_scores, save_iters])

Trains the HMM using the Baum-Welch algorithm.

fit_bic(X, states[, return_models])

Fits multiple HMMs using the Bayesian Information Criterion (BIC) to select the best model.

decode(history, calc_emi, calc_tr)

Computes forward (fs) and backward (bs) probabilities for a given sequence.

sample()

Generates a sequence of observations and corresponding state sequences performing a random walk on the model.

mc_dataset(n_samples)

Generates a dataset of a number of observations and corresponding state sequences utilizing the sample method.

predict(history[, return_score])

Predicts the most likely state sequence for a given observation sequence using the Viterbi algorithm.

estimate(history, estimatedStates[, ...])

Estimates transition and emission matrices based on observed sequences and states.

RUL(estimatedStates, max_samples[, confidence])

Estimates the remaining useful life of a system based on state sequence.

prognostics(data[, max_samples, plot_rul, ...])

Performs prognostics utilizing the RUL method and evaluates model performance.

save_model([path])

Saves the current model state to self.results_path/models/self.name.pkl or to path if provided.

load_model([model_name, path])

Loads a previously saved model state from self.results_path/models/model_name.pkl or from path if provided.

_init(X=None)

Initializes transition and emission matrices based on model structure (left_to_right).

Parameters:

X (dict) – Dataset of trajectories for determining the maximum sequence length following the format of utils.create_data_hsmm. The default is None.

Return type:

None

See also

himap.utils.create_data_hsmm

Generates a dataset of trajectories for the model.

_init_mc()

Initializes the model parameters for the Monte Carlo Sampling example.

Return type:

None

fit(X, return_all_scores=False, save_iters=False)

Trains the HMM using the Baum-Welch algorithm.

Parameters:
  • X (dict) – Observations organized as { “traj_<index>”: [sequence] } following the format of utils.create_data_hsmm.

  • return_all_scores (bool, optional) – If True, returns log-likelihood scores for all iterations. Default is False.

  • save_iters (bool, optional) – If True, saves the model at each iteration. Default is False.

Returns:

  • hmm (object) – Trained HMM instance.

  • score_per_iter (list, optional) – Log-likelihood scores for each iteration (if return_all_scores=True).

See also

himap.utils.create_data_hsmm

Generates a dataset of trajectories for the model.

fit_bic(X, states, return_models=False)

Fits multiple HMMs using the Bayesian Information Criterion (BIC) to select the best model.

Parameters:
  • X (dict) – Observation dataset.

  • states (list) – List of candidate numbers of states.

  • return_models (bool, optional) – If True, returns all trained models and BIC scores (default is False).

Returns:

  • hmm (object) – Best HMM model based on BIC.

  • bic (list) – BIC scores for each candidate model.

  • models (dict, optional) – All trained models and BIC scores (if return_models=True).

See also

HMM.fit

Fits the HMM using the Baum-Welch algorithm.

decode(history, calc_emi, calc_tr)

Computes forward (fs) and backward (bs) probabilities for a given sequence.

Parameters:
  • history (list) – Observation sequence.

  • calc_emi (array) – Current emission matrix.

  • calc_tr (array) – Current transition matrix.

Returns:

  • pStates (numpy.ndarray) – Posterior probabilities for states.

  • pSeq (float) – Log-probability of the sequence.

  • fs (numpy.ndarray) – Forward probabilities.

  • bs (numpy.ndarray) – Backward probabilities.

  • s (numpy.ndarray) – Scaling factors.
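The relationship between the forward probabilities, the scaling factors, and the sequence log-probability can be sketched with a scaled forward pass. This is the standard textbook recursion, not himap's code. The function scaled_forward and its explicit start distribution pi are assumptions, since decode takes no such argument.

```python
import math


def scaled_forward(history, emi, tr, pi):
    """Scaled forward pass for a discrete HMM.

    emi[j][k] is P(symbol k | state j); tr[i][j] is P(next = j | current = i).
    Returns (fs, s, p_seq) with p_seq = sum(log(s)).
    A sequence that is impossible under the model gives a zero scale and fails.
    """
    n_states = len(tr)
    fs, s = [], []
    prev = None
    for t, obs in enumerate(history):
        if t == 0:
            raw = [pi[j] * emi[j][obs] for j in range(n_states)]
        else:
            raw = [
                emi[j][obs] * sum(prev[i] * tr[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        scale = sum(raw)
        prev = [v / scale for v in raw]  # normalized forward probabilities
        fs.append(prev)
        s.append(scale)
    p_seq = sum(math.log(v) for v in s)
    return fs, s, p_seq


tr = [[0.5, 0.5], [0.0, 1.0]]
emi = [[1.0, 0.0], [0.0, 1.0]]
fs, s, p_seq = scaled_forward([0, 1], emi, tr, pi=[1.0, 0.0])
print(fs, s, p_seq)
```

Scaling keeps each row of fs summing to 1, which avoids numerical underflow on long sequences while the log of the scaling factors still recovers the sequence log-probability.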

sample()

Generates a sequence of observations and corresponding state sequences performing a random walk on the model.

Returns:

  • history (list) – A list containing the generated sequence of observations, where each observation corresponds to a state in the sequence.

  • states (list) – A list containing the sequence of states visited during the process, where each state is represented by its index.

mc_dataset(n_samples)

Generates a dataset of a number of observations and corresponding state sequences utilizing the sample method.

Parameters:

n_samples (int) – Number of sequences to generate.

Returns:

  • obs (dict) – Generated observation sequences.

  • states_all (dict) – Corresponding state sequences.

See also

HMM.sample

Generates a sequence of observations and corresponding state sequences.

predict(history, return_score=False)

Predicts the most likely state sequence for a given observation sequence using the Viterbi algorithm.

Parameters:
  • history (list) – Observation sequence.

  • return_score (bool, optional) – If True, returns the log-probability of the best state sequence (default is False).

Returns:

  • currentState (numpy.ndarray) – Most likely state sequence.

  • logP (float, optional) – Log-probability of the predicted sequence (if return_score=True).
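For readers unfamiliar with Viterbi decoding, a compact log-domain version for a discrete HMM is sketched below. It is the standard algorithm rather than himap's implementation, and the function viterbi with its explicit start distribution pi is a hypothetical stand-in.

```python
import math


def viterbi(history, emi, tr, pi):
    """Return (most likely state path, its log-probability)."""

    def log(p):
        return math.log(p) if p > 0 else -math.inf

    n = len(tr)
    delta = [log(pi[j]) + log(emi[j][history[0]]) for j in range(n)]
    back = []  # back[t][j]: best predecessor of state j at step t + 1
    for obs in history[1:]:
        ptr, new = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: delta[i] + log(tr[i][j]))
            ptr.append(best)
            new.append(delta[best] + log(tr[best][j]) + log(emi[j][obs]))
        back.append(ptr)
        delta = new
    last = max(range(n), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(back):  # follow the back-pointers to the start
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


tr = [[0.5, 0.5], [0.0, 1.0]]
emi = [[0.9, 0.1], [0.1, 0.9]]
path, log_p = viterbi([0, 0, 1, 1], emi, tr, pi=[1.0, 0.0])
print(path, log_p)
```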

estimate(history, estimatedStates, return_matrices=False)

Estimates transition and emission matrices based on observed sequences and states.

Parameters:
  • history (list) – Observation sequence.

  • estimatedStates (list) – Corresponding state sequence.

  • return_matrices (bool, optional) – If True, returns the matrices instead of updating the model (default is False).

Returns:

  • tr (numpy.ndarray, optional) – Updated transition matrix (if return_matrices=True).

  • emi (numpy.ndarray, optional) – Updated emission matrix (if return_matrices=True).

  • hmm (object) – Updated HMM instance.

RUL(estimatedStates, max_samples, confidence=0.95)

Estimates the remaining useful life of a system based on state sequence.

Parameters:
  • estimatedStates (list) – Sequence of estimated states.

  • max_samples (int) – Maximum number of timesteps for RUL estimation.

  • confidence (float) – Confidence level for bounds.

Returns:

  • rul_mean (list) – Mean RUL estimates.

  • rul_upper_bound (list) – Upper confidence bounds.

  • rul_lower_bound (list) – Lower confidence bounds.

  • rul_matrix (numpy.ndarray) – RUL probability distributions.

prognostics(data, max_samples=None, plot_rul=True, get_metrics=True, return_results=False)

Performs prognostics utilizing the RUL method and evaluates model performance.

Parameters:
  • data (dict) – Observation data for multiple trajectories following the format of utils.create_data_hsmm.

  • max_samples (int, optional) – Maximum number of timesteps for RUL estimation. If None (the default), 10× the maximum sequence length is used.

  • plot_rul (bool, optional) – If True, saves RUL plots (default is True).

  • get_metrics (bool, optional) – If True, evaluates RUL predictions with metrics (default is True).

  • return_results (bool, optional) – Whether to return the results: mean_rul_per_step, pdf_ruls_all, upper_rul_per_step, lower_rul_per_step (default is False).

Returns:

  • None if return_results is False

  • rul_mean_all (dict (Optional)) – A dictionary containing the rul_mean ndarray per trajectory.

  • pdf_ruls_all (dict (Optional)) – A dictionary containing the rul_matrix ndarray per trajectory.

  • rul_upper_bound_all (dict (Optional)) – A dictionary containing the rul_upper_bound ndarray per trajectory.

  • rul_lower_bound_all (dict (Optional)) – A dictionary containing the rul_lower_bound ndarray per trajectory.

See also

HMM.RUL

Estimates the remaining useful life of a system based on state sequence.

himap.utils.create_data_hsmm

Generates a dataset of trajectories for the model.

save_model(path=None)

Saves the current model state to self.results_path/models/self.name.pkl or to path if provided.

Parameters:

path (str (optional)) – The path to save the model (overrides the default path).

Return type:

None

load_model(model_name=None, path=None)

Loads a previously saved model state from self.results_path/models/model_name.pkl or from path if provided.

Parameters:
  • model_name (str (optional)) – Name of the model file to load (without extension). Not needed if path is provided.

  • path (str (optional)) – The path to load the model from (overrides the default path).

Return type:

None
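The .pkl extension suggests that models are stored with pickle. Under that assumption, a save/load round trip can be sketched as follows. The helper names are hypothetical and the snippet does not reproduce the default-path logic of save_model and load_model.

```python
import pickle
from pathlib import Path

def save_object(obj, path):
    """Hypothetical helper: pickle obj to path, creating parent folders."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f)

def load_object(path):
    """Hypothetical helper: read a pickled object back from path."""
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

As with any pickle-based format, files should only be loaded from trusted sources.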

himap.main module

Functions:

run_process(args)

Runs the process for the selected model.

himap_main(hsmm, mc_sampling, bic_fit, save, ...)

Main function for running the HMM models.

himap.main.run_process(args)

Runs the process for the selected model.

Parameters:

args (argparse.Namespace) –

Arguments for the process. Expected attributes are:

  • hsmm (bool): Flag to indicate if HSMM model should be used.

  • mc_sampling (bool): Flag to indicate if Monte Carlo sampling should be used.

  • bic_fit (bool): Flag to indicate if BIC fitting should be performed.

  • save (bool): Flag to indicate if the model should be saved.

  • metrics (bool): Flag to indicate if metrics should be calculated.

  • enable_visuals (bool): Flag to indicate if visualizations should be enabled.

  • num_histories (int): Number of histories for Monte Carlo sampling.

  • n_states (int): Number of states for the HMM/HSMM model.

Return type:

None
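A namespace carrying the attributes listed above can be built directly, without going through a command-line parser. The values below are illustrative only, and the final call is left commented out because it requires himap to be installed.

```python
import argparse

# Attribute names follow the list above; the values are made up for the example
args = argparse.Namespace(
    hsmm=True,
    mc_sampling=True,
    bic_fit=False,
    save=True,
    metrics=True,
    enable_visuals=False,
    num_histories=10,
    n_states=4,
)

# With himap installed, the namespace would be passed on as:
# from himap.main import run_process
# run_process(args)
```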

himap.main.himap_main(hsmm, mc_sampling, bic_fit, save, metrics, enable_visuals, num_histories, n_states)

Main function for running the HMM models.

Parameters:
  • hsmm (bool) – If True use Hidden Semi-Markov Model. If False use Hidden Markov Model.

  • mc_sampling (bool) – If True use Monte-Carlo Sampling as case example. If False use CMAPSS data.

  • bic_fit (bool) – If True enable Bayesian Information Criterion fitting for Markov Models.

  • save (bool) – If True enable saving of the fitted models.

  • metrics (bool) – If True enable calculation of performance metrics for RUL prediction.

  • enable_visuals (bool) – If True enable generating and saving figures.

  • num_histories (int) – The number of generated histories via Monte Carlo Sampling. It is only used if mc_sampling is True.

  • n_states (int) – The number of hidden states for Markov Model.

Return type:

None

himap.plot module

Functions:

plot_multiple_observ(obs, states, num2plot)

Plot multiple degradation histories from MC sampling.

plot_ruls(rul_mean, rul_upper, rul_lower, ...)

Plot RUL prediction with confidence intervals vs true RUL.

himap.plot.plot_multiple_observ(obs, states, num2plot)

Plot multiple degradation histories from MC sampling.

Parameters:
  • obs (dict) – Dictionary containing all observations.

  • states (dict) – Dictionary containing all states of the corresponding observations.

  • num2plot (int) – Number of histories to plot.

Return type:

None

Notes

The figure is saved at ‘/path/to/current/directory/results/figures/mc_traj.png’.

himap.plot.plot_ruls(rul_mean, rul_upper, rul_lower, fig_path)

Plot RUL prediction with confidence intervals vs true RUL.

Parameters:
  • rul_mean (list) – Mean RUL predictions.

  • rul_upper (list) – Upper bound of the confidence interval.

  • rul_lower (list) – Lower bound of the confidence interval.

  • fig_path (str) – Path to save the figure.

Return type:

None

Notes

The figure is saved at ‘/path/to/current/directory/results/figures/’.

himap.utils module

Classes:

NumpyArrayEncoder(*[, skipkeys, ...])

Custom JSON encoder to handle numpy.ndarray and numpy.integer objects for serialization.

Functions:

str2bool(v)

create_data_hsmm(files, obs_state_len, f_value)

Creates a dictionary of trajectories for input into the HSMM model.

load_data_cmapss([obs_state_len, f_value])

Loads the C-MAPSS dataset and prepares it for input into the HSMM model.

log_mask_zero(a)

Applies the log function to an array, masking zero values.

get_single_history_states(states, index, ...)

Returns the history states for a single trajectory.

get_viterbi(HSMM, data)

Applies the Viterbi algorithm to predict the most probable states for each trajectory in data using the HSMM.

fix_input_data(traj, f_value, obs_state_len)

Prepares trajectory data for input into the HSMM model by appending f_value and adjusting indexing if needed.

get_rmse(mean_rul_dict, true_rul_dict)

Computes the Root Mean Square Error (RMSE) between predicted Remaining Useful Life (RUL) and true RUL.

get_coverage(upper_bound_dict, ...)

Calculates the coverage of true RUL values within the predicted upper and lower bounds.

calculate_area_weighted_by_time(x_values, ...)

Calculates the area under the curve weighted by time for the given x and y values.

get_wsu(upper_bound_dict, lower_bound_dict)

Computes the Weighted Spread Uncertainty (WSU) between the upper and lower bounds.

evaluate_test_set(mean_rul_dict, ...)

Evaluates the test set by calculating RMSE, coverage, and WSU.

baumwelch_method(n_states, n_obs_symbols, ...)

Implements the Baum-Welch algorithm for parameter estimation in Hidden Markov Models (HMM).

fs_calculation(n_states, end_traj, fs, s, ...)

Computes the forward probabilities (fs) for a given sequence using the emission and transition matrices.

bs_calculation(n_states, end_traj, bs, s, ...)

Computes the backward probabilities (bs) for a given sequence using the emission and transition matrices.

calculate_expected_value(pmf_values)

Calculates the expected value of a probability mass function (PMF).

calculate_cdf(pmf, confidence_level)

Calculates the cumulative distribution function (CDF) and percentile values for a given probability mass function (PMF).

create_folders([results_parent_path])

Create a directory structure for storing results.

class himap.utils.NumpyArrayEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: JSONEncoder

Custom JSON encoder to handle numpy.ndarray and numpy.integer objects for serialization.

Constructor for JSONEncoder, with sensible defaults.

If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.

If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.

If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause a RecursionError). Otherwise, no such check takes place.

If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.

If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.

If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.

If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.

If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.
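To make the purpose of the class concrete, here is a minimal sketch of an encoder that handles numpy.ndarray and numpy.integer objects as the class description states. The class name ArrayEncoderSketch is hypothetical and the body is an assumption about how such an encoder is typically written, not a copy of NumpyArrayEncoder.

```python
import json
import numpy as np

class ArrayEncoderSketch(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()  # arrays become (nested) lists
        if isinstance(obj, np.integer):
            return int(obj)      # numpy integers become plain ints
        # Anything else falls through to the base class, which raises TypeError
        return super().default(obj)

payload = {"rul": np.array([3, 2, 1]), "n_states": np.int64(4)}
text = json.dumps(payload, cls=ArrayEncoderSketch)
```

The encoder is selected through the cls argument of json.dumps, so the rest of the serialization code stays unchanged.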

Methods:

default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)

himap.utils.str2bool(v)

himap.utils.create_data_hsmm(files, obs_state_len, f_value)

Creates a dictionary of trajectories for input into the HSMM model.

Parameters:
  • files (list of str) – List of file paths to CSV files containing trajectory data.

  • obs_state_len (int) – The length of the observed state.

  • f_value (float) – A value used for fixing the input data.

Returns:

traj – A dictionary where keys are trajectory identifiers and values are lists of cluster data.

Return type:

dict

himap.utils.load_data_cmapss(obs_state_len=5, f_value=21)

Loads the C-MAPSS dataset and prepares it for input into the HSMM model.

Parameters:
  • obs_state_len (int, optional) – Length to be used for the failure state, by default 5

  • f_value (int, optional) – Failure value corresponding to the final state, by default 21

Returns:

  • seqs_train (dict) – A dictionary containing the training trajectories.

  • seqs_test (dict) – A dictionary containing the testing trajectories.

himap.utils.log_mask_zero(a)

Applies the log function to an array, masking zero values.

Parameters:

a (np.ndarray) – An array of values.

Returns:

The log-transformed array with zero values masked.

Return type:

np.ndarray
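One plausible reading of "masking zero values" is that zeros are mapped to -inf without triggering NumPy's divide-by-zero warning. The sketch below implements that reading. It is an assumption, and the actual function may treat zeros differently.

```python
import numpy as np

def log_mask_zero_sketch(a):
    """Hypothetical helper: log of a, silencing the warning raised by log(0)."""
    a = np.asarray(a, dtype=float)
    with np.errstate(divide="ignore"):
        return np.log(a)

out = log_mask_zero_sketch([0.0, 1.0, 0.5])
```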

himap.utils.get_single_history_states(states, index, last_state)

Returns the history states for a single trajectory.

Parameters:
  • states (list) – A list of lists, each containing the states for a trajectory.

  • index (int) – The index of the trajectory.

  • last_state (int) – The last state of the trajectory.

Returns:

history_states – A list of the history states for the trajectory.

Return type:

list

himap.utils.get_viterbi(HSMM, data)

Applies the Viterbi algorithm to predict the most probable states for each trajectory in data using the HSMM.

Parameters:
  • HSMM (HSMM) – The trained Hidden Semi-Markov Model used to predict states.

  • data (dict[str, List[int]]) – A dictionary of trajectories where each key is a trajectory name and each value is a list of observations.

Returns:

results – A list of lists containing the predicted states for each trajectory.

Return type:

List[List[int]]

himap.utils.fix_input_data(traj, f_value, obs_state_len, is_zero_indexed=True)

Prepares trajectory data for input into the HSMM model by appending f_value and adjusting indexing if needed.

Parameters:
  • traj (dict[str, List[int]]) – A dictionary containing the trajectories as lists of observed states.

  • f_value (int) – The value to append to each trajectory.

  • obs_state_len (int) – The number of times to append f_value to each trajectory.

  • is_zero_indexed (bool, optional) – Flag indicating whether the data is zero-indexed. Default is True.

Returns:

traj – The modified trajectory dictionary with f_value appended and indexing adjusted if necessary.

Return type:

dict[str, List[int]]

himap.utils.get_rmse(mean_rul_dict, true_rul_dict)

Computes the Root Mean Square Error (RMSE) between predicted Remaining Useful Life (RUL) and true RUL.

Parameters:
  • mean_rul_dict (dict[str, List[float]]) – A dictionary where each key is a trajectory name and the value is the list of predicted RUL values.

  • true_rul_dict (dict[str, int]) – A dictionary where each key is a trajectory name and the value is the true RUL for that trajectory.

Returns:

df_results – A DataFrame containing RMSE values for each trajectory, including the average RMSE.

Return type:

pd.DataFrame
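The RMSE for a single trajectory can be sketched as below. For simplicity the example takes the true RUL as an explicit per-step list. How get_rmse expands the single integer stored in true_rul_dict into per-step values is not described here, so that step is left out, and the function name is hypothetical.

```python
import numpy as np

def rmse_sketch(predicted_rul, true_rul):
    """Root mean square error between two equally long RUL sequences."""
    predicted = np.asarray(predicted_rul, dtype=float)
    true = np.asarray(true_rul, dtype=float)
    return float(np.sqrt(np.mean((predicted - true) ** 2)))

value = rmse_sketch([10, 8, 6], [9, 8, 7])  # errors are 1, 0 and -1
```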

himap.utils.get_coverage(upper_bound_dict, lower_bound_dict, true_rul_dict)

Calculates the coverage of true RUL values within the predicted upper and lower bounds.

Parameters:
  • upper_bound_dict (dict[str, List[float]]) – A dictionary where each key is a trajectory name and the value is the list of upper bounds for predicted RUL.

  • lower_bound_dict (dict[str, List[float]]) – A dictionary where each key is a trajectory name and the value is the list of lower bounds for predicted RUL.

  • true_rul_dict (dict[str, int]) – A dictionary where each key is a trajectory name and the value is the true RUL for that trajectory.

Returns:

df_results – A DataFrame containing coverage values for each trajectory, including the average coverage.

Return type:

pd.DataFrame
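Coverage is commonly defined as the fraction of time steps at which the true value lies between the lower and upper bounds. The sketch below uses that definition for one trajectory, with the true RUL given as an explicit per-step list. Both the definition and the function name are assumptions, since the exact formula used by get_coverage is not given here.

```python
import numpy as np

def coverage_sketch(upper, lower, true_rul):
    """Fraction of steps where lower <= true RUL <= upper."""
    upper = np.asarray(upper, dtype=float)
    lower = np.asarray(lower, dtype=float)
    true = np.asarray(true_rul, dtype=float)
    inside = (true >= lower) & (true <= upper)
    return float(inside.mean())

cov = coverage_sketch(upper=[12, 10, 8], lower=[8, 7, 7.5], true_rul=[9, 8, 7])
```

In this example the last step falls below its lower bound, so two of the three steps are covered.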

himap.utils.calculate_area_weighted_by_time(x_values, y_values)

Calculates the area under the curve weighted by time for the given x and y values.

Parameters:
  • x_values (list[int]) – A list of x values (e.g., time).

  • y_values (list[float]) – A list of y values (predicted values).

Returns:

area – The area under the curve weighted by time.

Return type:

float

himap.utils.get_wsu(upper_bound_dict, lower_bound_dict)

Computes the Weighted Spread Uncertainty (WSU) between the upper and lower bounds.

Parameters:
  • upper_bound_dict (dict[str, List[float]]) – A dictionary where each key is a trajectory name and the value is the list of upper bounds for predicted RUL.

  • lower_bound_dict (dict[str, List[float]]) – A dictionary where each key is a trajectory name and the value is the list of lower bounds for predicted RUL.

Returns:

df_results – A DataFrame containing WSU values for each trajectory, including the average WSU.

Return type:

pd.DataFrame

himap.utils.evaluate_test_set(mean_rul_dict, upper_bound_dict, lower_bound_dict, true_rul_dict)

Evaluates the test set by calculating RMSE, coverage, and WSU.

Parameters:
  • mean_rul_dict (dict[str, List[float]]) – A dictionary where each key is a trajectory name and the value is the list of predicted RUL values.

  • upper_bound_dict (dict[str, List[float]]) – A dictionary where each key is a trajectory name and the value is the list of upper bounds for predicted RUL.

  • lower_bound_dict (dict[str, List[float]]) – A dictionary where each key is a trajectory name and the value is the list of lower bounds for predicted RUL.

  • true_rul_dict (dict[str, int]) – A dictionary where each key is a trajectory name and the value is the true RUL for that trajectory.

Returns:

combined_df – A DataFrame combining RMSE, coverage, and WSU for each trajectory, including the average values.

Return type:

pd.DataFrame

himap.utils.baumwelch_method(n_states, n_obs_symbols, logPseq, fs, bs, scale, score, history, tr, emi, calc_tr, calc_emi)

Implements the Baum-Welch algorithm for parameter estimation in Hidden Markov Models (HMM).

Parameters:
  • n_states (int) – The number of hidden states in the model.

  • n_obs_symbols (int) – The number of observation symbols.

  • logPseq (float) – The log-probability of the observed sequence.

  • fs (np.ndarray) – The forward probabilities matrix (shape: [n_states, sequence_length]).

  • bs (np.ndarray) – The backward probabilities matrix (shape: [n_states, sequence_length]).

  • scale (np.ndarray) – The scale factors for normalization (shape: [1, sequence_length]).

  • score (float) – The cumulative score (log probability) to be updated.

  • history (List[int]) – The sequence of observed symbols (integer indices).

  • tr (np.ndarray) – The transition matrix (shape: [n_states, n_states]).

  • emi (np.ndarray) – The emission matrix (shape: [n_states, n_obs_symbols]).

  • calc_tr (np.ndarray) – A precomputed matrix of transition probabilities (shape: [n_states, n_states]).

  • calc_emi (np.ndarray) – A precomputed matrix of emission probabilities (shape: [n_states, n_obs_symbols]).

Returns:

  • tr (np.ndarray) – Updated transition matrix after the algorithm has performed parameter estimation.

  • emi (np.ndarray) – Updated emission matrix after the algorithm has performed parameter estimation.

himap.utils.fs_calculation(n_states, end_traj, fs, s, history, calc_emi, calc_tr)

Computes the forward probabilities (fs) for a given sequence using the emission and transition matrices.

Parameters:
  • n_states (int) – The number of hidden states in the model.

  • end_traj (int) – The length of the observation sequence.

  • fs (np.ndarray) – The forward probabilities matrix (shape: [n_states, end_traj]).

  • s (np.ndarray) – Scaling factors to prevent underflow (shape: [1, end_traj]).

  • history (List[int]) – The sequence of observed symbols (integer indices).

  • calc_emi (np.ndarray) – A matrix of emission probabilities (shape: [n_states, n_obs_symbols]).

  • calc_tr (np.ndarray) – A matrix of transition probabilities (shape: [n_states, n_states]).

Returns:

  • fs (np.ndarray) – The updated forward probabilities matrix.

  • s (np.ndarray) – The updated scaling factors.
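The role of the scaling factors is easiest to see in a compact version of the scaled forward recursion. The sketch below is a generic textbook variant, not the himap routine: it allocates fs and s itself, keeps s one-dimensional, and takes an explicit start distribution, all of which are assumptions for this example.

```python
import numpy as np

def scaled_forward(history, start, tr, emi):
    """Forward probabilities with per-step normalization."""
    n_states, T = tr.shape[0], len(history)
    fs = np.zeros((n_states, T))  # same [n_states, end_traj] layout as above
    s = np.zeros(T)
    fs[:, 0] = start * emi[:, history[0]]
    s[0] = fs[:, 0].sum()
    fs[:, 0] /= s[0]
    for t in range(1, T):
        # Propagate one step through the transitions, then weight by the emission
        fs[:, t] = (fs[:, t - 1] @ tr) * emi[:, history[t]]
        s[t] = fs[:, t].sum()  # scaling factor keeps each column summing to 1
        fs[:, t] /= s[t]
    return fs, s

start = np.array([0.6, 0.4])
tr = np.array([[0.7, 0.3], [0.2, 0.8]])
emi = np.array([[0.9, 0.1], [0.3, 0.7]])
history = [0, 1, 0]
fs, s = scaled_forward(history, start, tr, emi)
log_likelihood = float(np.log(s).sum())
```

Because each column is renormalized, the log-likelihood of the sequence is recovered as the sum of the logs of the scaling factors.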

himap.utils.bs_calculation(n_states, end_traj, bs, s, history, calc_emi, calc_tr)

Computes the backward probabilities (bs) for a given sequence using the emission and transition matrices.

Parameters:
  • n_states (int) – The number of hidden states in the model.

  • end_traj (int) – The length of the observation sequence.

  • bs (np.ndarray) – The backward probabilities matrix (shape: [n_states, end_traj]).

  • s (np.ndarray) – Scaling factors for normalization (shape: [1, end_traj]).

  • history (List[int]) – The sequence of observed symbols (integer indices).

  • calc_emi (np.ndarray) – A matrix of emission probabilities (shape: [n_states, n_obs_symbols]).

  • calc_tr (np.ndarray) – A matrix of transition probabilities (shape: [n_states, n_states]).

Returns:

bs – The updated backward probabilities matrix.

Return type:

np.ndarray

himap.utils.calculate_expected_value(pmf_values)

Calculates the expected value of a probability mass function (PMF).

Parameters:

pmf_values (List[float]) – A list of probabilities for each possible value.

Returns:

expected_value – The expected value calculated from the PMF.

Return type:

float
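Assuming the possible values are the indices 0, 1, 2, ... of the PMF (an assumption, since the value grid is not stated above), the expected value is the probability-weighted sum of those indices. The function name below is hypothetical.

```python
import numpy as np

def expected_value_sketch(pmf_values):
    """Expected value of a PMF defined over the indices 0..len-1."""
    pmf = np.asarray(pmf_values, dtype=float)
    return float(np.dot(np.arange(len(pmf)), pmf))

ev = expected_value_sketch([0.2, 0.5, 0.3])  # 0*0.2 + 1*0.5 + 2*0.3
```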

himap.utils.calculate_cdf(pmf, confidence_level)

Calculates the cumulative distribution function (CDF) and percentile values for a given probability mass function (PMF).

Parameters:
  • pmf (List[float]) – A list of probabilities for each possible value.

  • confidence_level (float) – The confidence level for calculating the percentiles (e.g., 0.95 for 95%).

Returns:

lower_value – The index corresponding to the lower percentile.

Return type:

int
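A two-sided percentile lookup on a PMF can be sketched as follows: accumulate the PMF into a CDF, then find the first index at which the CDF reaches each tail probability. The symmetric split of the confidence level, the return of both bounds, and the function name are assumptions made for illustration. The entry above documents only lower_value.

```python
import numpy as np

def percentile_bounds_sketch(pmf, confidence_level):
    """Indices of the lower and upper percentile of a PMF."""
    cdf = np.cumsum(np.asarray(pmf, dtype=float))
    alpha = (1.0 - confidence_level) / 2.0  # probability left in each tail
    last = len(cdf) - 1
    # First index where the CDF reaches the requested level (clamped for rounding)
    lower = min(int(np.searchsorted(cdf, alpha, side="left")), last)
    upper = min(int(np.searchsorted(cdf, 1.0 - alpha, side="left")), last)
    return lower, upper

lower_value, upper_value = percentile_bounds_sketch([0.1, 0.2, 0.4, 0.2, 0.1], 0.5)
```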

himap.utils.create_folders(results_parent_path=None)

Create a directory structure for storing results.

This function creates a main himap_results folder in the specified results parent directory path and the subdirectories within it, including “dictionaries”, “figures”, and “models”. If the results_parent_path is not specified, the himap_results folder is created in the current working directory.

Parameters:

results_parent_path (str (Optional)) – Defines the parent directory of the results folder where the himap_results directory tree is created.

Notes

  • The function does not return any values.

  • The created folder structure is as follows:

    himap_results/

    ├── dictionaries/

    ├── figures/

    ├── models/

Examples

>>> create_folders()
Created folder: /results_parent_path/himap_results
Created folder: /results_parent_path/himap_results/dictionaries
Created folder: /results_parent_path/himap_results/figures
Created folder: /results_parent_path/himap_results/models
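The directory tree described above can be reproduced with pathlib in a few lines. This is a sketch of the documented layout only. The function name is hypothetical and it returns the created paths instead of printing them.

```python
from pathlib import Path

def create_result_folders(results_parent_path=None):
    """Create himap_results/ with its three subfolders and return the paths."""
    parent = Path(results_parent_path) if results_parent_path else Path.cwd()
    root = parent / "himap_results"
    created = []
    for folder in (root, root / "dictionaries", root / "figures", root / "models"):
        folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(folder)
    return created
```

Using exist_ok=True makes the call safe to repeat, which is convenient when several scripts share the same results directory.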