EnsembleNormalizer

class ompy.EnsembleNormalizer(*, extractor, normalizer_nld=None, normalizer_gsf=None, normalizer_simultan=None, path='saved_run/normalizers', regenerate=False)[source]

Bases: AbstractNormalizer

Normalizes NLD and γSF extracted from the ensemble.

Usage:

The class can be called either to normalize simultaneously:

EnsembleNormalizer(extractor=...,
                   normalizer_simultan=...)

or to normalize sequentially:

EnsembleNormalizer(extractor=...,
                   normalizer_nld=...,
                   normalizer_gsf=...)
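The two calling conventions above can be read as a simple either/or rule. The following is a minimal sketch of that rule, not ompy's implementation: `choose_mode` is a hypothetical helper, and both the precedence given to `normalizer_simultan` and the `ValueError` are assumptions made for illustration.

```python
from typing import Any, Optional


def choose_mode(normalizer_nld: Optional[Any] = None,
                normalizer_gsf: Optional[Any] = None,
                normalizer_simultan: Optional[Any] = None) -> str:
    """Hypothetical helper: decide which normalization path applies."""
    # Assumption: a simultaneous normalizer, if given, takes precedence.
    if normalizer_simultan is not None:
        return "simultaneous"
    # The sequential path needs both the NLD and the gSF normalizer.
    if normalizer_nld is not None and normalizer_gsf is not None:
        return "sequential"
    raise ValueError("Provide either normalizer_simultan, or both "
                     "normalizer_nld and normalizer_gsf")


# Placeholder objects stand in for the real normalizer instances.
mode_a = choose_mode(normalizer_simultan=object())
mode_b = choose_mode(normalizer_nld=object(), normalizer_gsf=object())
```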

Note

If you add functionality that depends on random numbers within the parallelized loop, make sure to use the random generator exposed via the arguments (see the Ensemble class for an example). If np.random is used instead, each process receives an exact copy of the same random state and therefore draws the same numbers. This is not an issue for the multinest search routine, which is seeded by default as implemented in ompy.
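The pitfall described in this note can be illustrated without ompy. The sketch below uses the standard library's `random.Random` as a stand-in for the NumPy generator, and the names `step_with_rng` and `member_generators` are hypothetical. Copying one generator state imitates what each process inherits from the global state, while passing a separately seeded generator to each member avoids the duplication.

```python
import random
from typing import List


def step_with_rng(i: int, rng: random.Random) -> float:
    """Hypothetical per-member work that needs one random number."""
    return rng.random()


def member_generators(base_seed: int, n_members: int) -> List[random.Random]:
    """One independently seeded generator per ensemble member."""
    return [random.Random(f"{base_seed}-{i}") for i in range(n_members)]


# Pitfall: every worker starts from a copy of the same state.
shared_state = random.Random(0).getstate()
clones = [random.Random() for _ in range(3)]
for clone in clones:
    clone.setstate(shared_state)
clone_draws = [step_with_rng(i, c) for i, c in enumerate(clones)]

# Remedy: hand each member its own generator through the arguments.
rngs = member_generators(base_seed=1234, n_members=3)
member_draws = [step_with_rng(i, r) for i, r in enumerate(rngs)]
```

With NumPy, the analogous pattern would be to pass a generator object into the worker function instead of calling the module-level `np.random` functions.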

Variables:
  • extractor (Extractor) – Extractor instance

  • normalizer_nld (NormalizerNLD) – NormalizerNLD instance

  • normalizer_gsf (NormalizerGSF) – NormalizerGSF instance

  • normalizer_simultan (NormalizerSimultan) – NormalizerSimultan instance

  • res (List[ResultsNormalized]) – List of the results

  • nprocesses (int) – Number of processes for multiprocessing. Defaults to the number of available CPUs minus 1 (with a minimum of 1).
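The default for nprocesses described above amounts to a one-line rule. This is a minimal sketch, assuming the CPU count comes from `os.cpu_count()`; the helper `default_nprocesses` is hypothetical and how ompy actually queries the CPU count is not shown in this page.

```python
import os
from typing import Optional


def default_nprocesses(n_cpus: Optional[int] = None) -> int:
    """Number of CPUs minus one, but never fewer than one process."""
    if n_cpus is None:
        # os.cpu_count() may return None if the count cannot be determined.
        n_cpus = os.cpu_count() or 1
    return max(1, n_cpus - 1)
```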

Parameters:
  • extractor (Extractor) – Extractor instance

  • normalizer_nld (NormalizerNLD, optional) – NormalizerNLD instance

  • normalizer_gsf (NormalizerGSF, optional) – NormalizerGSF instance

  • normalizer_simultan (NormalizerSimultan, optional) – NormalizerSimultan instance

Attributes Summary

LOG

Methods Summary

load([path])

Loads (pickled) instance.

normalize()

Normalize ensemble

normalizeSimultan(num, *, nld, gsf)

Wrapper for simultaneous normalization

normalizeStagewise(num, *, nld, gsf)

Wrapper for stagewise normalization

plot([ax, add_figlegend, n_plot, ...])

Plots randomly drawn samples

plot_gsf_ext_stats(ax, *, xlow, xhigh, ...)

Helper for plotting statistics of the gsf extrapolations

plot_nld_ext_stats(ax, *, x, samples, ...)

Helper for plotting statistics of the nld extrapolation

plot_selection(*, ax, samples, ...[, ...])

Plot some nld and gsf samples

plot_vector_stats(ax, samples, percentiles, ...)

Helper for plotting of stats from a vector

samples_from_res([random_state])

Draw random samples from results with transformed nld & gsf

samples_unify_E(df)

Get nlds (or gsfs) on a common energy grid if their lengths differ

save([path, overwrite])

Saves (pickles) the instance

save_results_txt([path, suffix])

Save results as txt

stats_from_df(df, fmap, shape_out, percentiles)

Helper to get median, 68% or similar from a collection of Vectors

step(i, nld, gsf)

Normalization step for each ensemble member

Attributes Documentation

LOG = <Logger ompy.ensembleNormalizer (WARNING)>

Methods Documentation

load(path=None)

Loads (pickled) instance.

Allows a previously saved instance to be restored when regenerate = False. Note that any modifications of the __getstate__ method affect which attributes are pickled.

Parameters:

path (Union[str, Path, None]) – The path to the directory to load the file from. If the value is None, ‘self.path’ will be used.

Raises:

FileNotFoundError – If file is not found
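The remark about __getstate__ can be illustrated with a small, self-contained sketch. The class `Demo` below is hypothetical and unrelated to ompy. It only shows the general Python mechanism by which an attribute removed in __getstate__ is missing after a pickle round trip.

```python
import pickle


class Demo:
    """Hypothetical class; not part of ompy."""

    def __init__(self):
        self.results = [1.0, 2.0]
        self.scratch = "temporary"

    def __getstate__(self):
        # Copy the instance dict and drop what should not be pickled.
        state = self.__dict__.copy()
        state.pop("scratch", None)
        return state


# Round trip: 'results' survives, 'scratch' does not.
restored = pickle.loads(pickle.dumps(Demo()))
```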

normalize()[source]

Normalize ensemble

Return type:

None

normalizeSimultan(num, *, nld, gsf)[source]

Wrapper for simultaneous normalization

Parameters:
  • num (int) – Loop number

  • nld (Vector) – NLD before normalization

  • gsf (Vector) – gsf before normalization

Returns:

results (/parameters) of normalization

Return type:

res (ResultsNormalized)

normalizeStagewise(num, *, nld, gsf)[source]

Wrapper for stagewise normalization

Parameters:
  • num (int) – Loop number

  • nld (Vector) – NLD before normalization

  • gsf (Vector) – gsf before normalization

Returns:

results (/parameters) of normalization

Return type:

res (ResultsNormalized)

plot(ax=None, add_figlegend=True, n_plot=5, plot_model_stats=False, random_state=None, return_stats=False, **kwargs)[source]

Plots randomly drawn samples

Parameters:
  • ax (Tuple[Any, Any], optional) – The matplotlib axes to plot onto. Creates axes if not provided.

  • add_figlegend (bool, optional) – Defaults to True.

  • n_plot (int, optional) – Number of (nld, gsf) samples to plot

  • plot_model_stats (bool, optional) – Plot stats also for models used in normalization

  • random_state (np.random.RandomState, optional) – random state, set by default such that a repeated use of the function gives the same results.

  • return_stats (bool) – Whether to return vector stats (percentiles)

  • **kwargs – Description

Todo

  • Refactor code

  • Could not find out how to avoid plotting duplicate legend entries, thus using a workaround

  • Checks if extrapolating where nld or gsf is np.nan

Returns:

If return_stats=False, returns fig, ax,

otherwise fig, ax, (stats_nld, stats_gsf)

Return type:

Tuple

static plot_gsf_ext_stats(ax, *, xlow, xhigh, samples, normalizer_gsf, percentiles, color)[source]

Helper for plotting statistics of the gsf extrapolations

Parameters:
  • ax (Any) – The matplotlib axis to plot onto.

  • xlow (ndarray) – x-axis values (Energies) of the lower extrapolation

  • xhigh (ndarray) – x-axis values (Energies) of the higher extrapolation

  • samples (DataFrame) – Samples of (nld, gsf, transformation parameters)

  • normalizer_gsf (NormalizerGSF) – NormalizerGSF instance.

  • percentiles (Tuple[float, float]) – Lower and upper percentile to plot the shading

  • **kwargs – Additional keyword arguments for the plotting

Return type:

Tuple[DataFrame, DataFrame]

Returns:

Tuple of DataFrames with columns [‘median’, ‘low’, ‘high’] and entries for each energy of the Vectors. The first entry is for the lower extrapolation, the second entry is for the higher extrapolation.

static plot_nld_ext_stats(ax, *, x, samples, normalizer_nld, percentiles, **kwargs)[source]

Helper for plotting statistics of the nld extrapolation

Parameters:
  • ax (Any) – The matplotlib axis to plot onto.

  • x (ndarray) – x-axis values (Energies)

  • samples (DataFrame) – Samples of (nld, gsf, transformation parameters)

  • normalizer_nld (NormalizerNLD) – NormalizerNLD instance.

  • percentiles (Tuple[float, float]) – Lower and upper percentile to plot the shading

  • **kwargs – Additional keyword arguments for the plotting

Return type:

DataFrame

Returns:

DataFrame with columns [‘median’, ‘low’, ‘high’] and entries for each energy of the Vectors.

plot_selection(*, ax, samples, normalizer_nld, normalizer_gsf, n_plot=5, random_state=None)[source]

Plot some nld and gsf samples

Parameters:
  • ax (Tuple[Any, Any]) – The matplotlib axes to plot onto. Creates axes if not provided.

  • samples (pd.DataFrame) – Random samples from results with transformed nld & gsf

  • normalizer_nld (NormalizerNLD) – NormalizerNLD instance. Note: Input a copy as the instance attributes will be changed.

  • normalizer_gsf (NormalizerGSF) – NormalizerGSF instance. Note: Input a copy as the instance attributes will be changed.

  • n_plot (int, optional) – Number of (nld, gsf) samples to plot

  • random_state (np.random.RandomState, optional) – random state, set by default such that a repeated use of the function gives the same results.

Return type:

None

static plot_vector_stats(ax, samples, percentiles, color)[source]

Helper for plotting of stats from a vector

Parameters:
  • ax (Tuple[Any, Any]) – Axes to plot on

  • samples (DataFrame) – Samples of (nld, gsf, transformation parameters)

  • percentiles (Tuple[float, float]) – Lower and upper percentile to plot the shading

  • color (Any) – Color of nld and gsf

Return type:

Tuple[Any, DataFrame, DataFrame]

Returns:

Lines of fill between, and stats DataFrame of nld and gsf

samples_from_res(random_state=None)[source]

Draw random samples from results with transformed nld & gsf

Parameters:

random_state (np.random.RandomState, optional) – random state, set by default such that a repeated use of the function gives the same results.

Return type:

DataFrame

Returns:

Samples

samples_unify_E(df)[source]

Get nlds (or gsfs) on a common energy grid if their lengths differ

After applying, the vectors in the DataFrame are on a common energy grid. Missing values are filled with np.nan.

Parameters:

df (DataFrame) – DataFrame column with vectors to be put on a unified energy grid

Return type:

None
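The idea of a common energy grid with NaN padding can be sketched in plain Python. This is an illustration under simplifying assumptions, not ompy's code. Each spectrum is represented as a hypothetical `energy -> value` dict instead of a Vector, and the helper `unify_grid` returns new lists, whereas the method documented here returns None.

```python
import math
from typing import Dict, List, Tuple

Spectrum = Dict[float, float]  # hypothetical stand-in: energy -> value


def unify_grid(spectra: List[Spectrum]) -> Tuple[List[float], List[List[float]]]:
    """Put all spectra on the union of their energies, padding with NaN."""
    grid = sorted({energy for s in spectra for energy in s})
    rows = [[s.get(energy, math.nan) for energy in grid] for s in spectra]
    return grid, rows


short = {1.0: 10.0, 2.0: 20.0}
longer = {1.0: 11.0, 2.0: 21.0, 3.0: 31.0}
grid, rows = unify_grid([short, longer])
```

Here the shorter spectrum gains a NaN entry at the energy it does not cover, so both rows have the same length.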

save(path=None, overwrite=True)

Saves (pickles) the instance

The saved instance can be loaded again later, which enables the regenerate option.

Parameters:
  • path (Union[str, Path, None]) – The path to the save directory. If the value is None, ‘self.path’ will be used.

  • overwrite (bool) – Overwrite file if existent

save_results_txt(path=None, suffix=None)[source]

Save results as txt

Uses a folder to save nld, gsf, and the samples (converted to an array)

Parameters:

path (Union[str, Path, None]) – The path to the save directory. If the value is None, ‘self.path’ will be used.

static stats_from_df(df, fmap, shape_out, percentiles)[source]

Helper to get median, 68% or similar from a collection of Vectors

Parameters:
  • df (DataFrame) – DataFrame of Vectors

  • fmap (Callable[[Vector, array], None]) – Applied to each row of df

  • shape_out (Tuple[int, int]) – output shape

  • percentiles (Tuple[float, float]) – Lower and upper percentiles for the stats (e.g. 16 and 84 for something like 1 sigma)

Return type:

DataFrame

Returns:

DataFrame with columns [‘median’, ‘low’, ‘high’] and entries for each energy of the Vectors.
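The kind of per-energy statistics described above can be sketched without pandas. The helpers `percentile` and `column_stats` below are hypothetical, and samples are plain lists of values on a shared energy grid, not Vectors. The linear interpolation between ranks is an assumption about how the percentiles are defined.

```python
from typing import Dict, List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0 to 100) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def column_stats(samples: List[List[float]],
                 percentiles: Tuple[float, float] = (16.0, 84.0)
                 ) -> List[Dict[str, float]]:
    """Median, low and high percentile for each energy bin (column)."""
    low_q, high_q = percentiles
    stats = []
    for column in zip(*samples):  # one column per energy bin
        stats.append({"median": percentile(column, 50.0),
                      "low": percentile(column, low_q),
                      "high": percentile(column, high_q)})
    return stats


# 101 samples on a two-point energy grid.
samples = [[float(i), 2.0 * i] for i in range(101)]
stats = column_stats(samples)
```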

step(i, nld, gsf)[source]

Normalization step for each ensemble member

Parameters:
  • i (int) – Loop number

  • nld (Vector) – NLD before normalization

  • gsf (Vector) – gsf before normalization

Returns:

results (/parameters) of normalization

Return type:

res (ResultsNormalized)