data_set module

class data_set.MD_data_set(xvec, weights)

Bases: data_set.data_set

This class is a child class of data_set, used for data from molecular systems.

align()

Align all states in the data set with respect to the reference configuration.

This function implements the Kabsch algorithm, which finds the rigid-body rotation (after removing the translation) that minimizes the root mean squared deviation of each state with respect to the reference configuration.
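The alignment step can be illustrated with a minimal NumPy sketch of the Kabsch algorithm. This is not the module's own code: the function name kabsch_align and the (n_atoms, 3) array layout are assumptions made for illustration.

```python
import numpy as np

def kabsch_align(x, ref):
    """Align configuration x onto ref; both are arrays of shape (n_atoms, 3).

    Hypothetical helper for illustration, not part of data_set.
    """
    # Remove the translation by centering both configurations.
    x_center = x.mean(axis=0)
    ref_center = ref.mean(axis=0)
    x_c = x - x_center
    ref_c = ref - ref_center

    # 3x3 cross-covariance matrix between the two centered point sets.
    h = x_c.T @ ref_c
    u, _, vt = np.linalg.svd(h)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = -1.0 if np.linalg.det(vt.T @ u.T) < 0 else 1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate the centered state and move it onto the reference centroid.
    return x_c @ rot.T + ref_center
```

Applying such a function to every state removes overall rotation and translation, so that only internal motion remains in the aligned data.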

classmethod from_file(states_filename)

Initialize the data set from a file.

load_ref_state()

Load a reference configuration from the file ./data/ref_state.txt. It is used to align the data.

map_to_all_features()

Map the states in the mini-batch to all features, by calling map_to_feature() for each feature.

Returns

2d torch tensor, whose dimensions are (batch_size, num_features).

map_to_feature(idx)

Map the states in the mini-batch to one feature.

Parameters

idx – index of the feature to which states are mapped.

pre_processing_layer()

If features are defined, then map data to features. Otherwise, return the aligned data.

write_all_features_file(feature_filename)

Map all states to features, and write the features to file.

Parameters

feature_filename – name of the file to which the features are written.

class data_set.data_set(xvec, weights)

Bases: object

The class for data set.

Parameters
  • xvec (2d numpy array) – array containing trajectory data

  • weights (1d numpy array) – weights of trajectory data

Variables
  • batch_size – current batch-size

  • K – number of data points

  • active_index – indices of active states. A state is included in the mini-batch if its entry is 1.

  • features – the tuple of features (the state is first transformed to features, before it is passed to neural networks).

  • num_features – number of features

  • batch_uniform_weight – when True (default), each data point is included in the mini-batch with equal probability.

dim_of_features()

Number of features.

classmethod from_file(states_filename)

Initialize the data_set from a data file.

generate_minibatch(batch_size, minibatch_flag=True)

Generate a mini-batch.

Parameters
  • batch_size (int) – size of mini-batch.

  • minibatch_flag (bool) – controls whether a mini-batch or the entire data set is generated.

pre_processing_layer()

Process data before it is sent to the neural networks. In the base class, this function returns the mini-batch itself (i.e., no features are used to transform the data). It can be overridden in a child class (e.g., to map data to features).
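The override pattern described here can be sketched as follows. The class names BaseDataSet and FeatureDataSet, the minibatch attribute, and the representation of features as plain callables are all assumptions made for illustration; they do not reproduce the module's implementation.

```python
import numpy as np

class BaseDataSet:
    """Hypothetical stand-in for the base class."""

    def __init__(self, xvec):
        self.xvec = np.asarray(xvec)
        # For this sketch, the whole data set plays the role of the mini-batch.
        self.minibatch = self.xvec

    def pre_processing_layer(self):
        # Base behaviour: return the mini-batch unchanged.
        return self.minibatch

class FeatureDataSet(BaseDataSet):
    """Hypothetical child class that maps states to features when defined."""

    def __init__(self, xvec, features=None):
        super().__init__(xvec)
        self.features = features

    def pre_processing_layer(self):
        if self.features is None:
            return self.minibatch
        # Each feature maps (batch_size, dim) -> (batch_size,); stacking
        # the results gives an array of shape (batch_size, num_features).
        return np.stack([f(self.minibatch) for f in self.features], axis=1)
```

Code that consumes the data only needs to call pre_processing_layer(), regardless of whether features have been set.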

set_features(features)

Set the features of the data set.

Parameters

features – list of features.

set_nonuniform_batch_weight()

By default, batch_uniform_weight is True and the indices for a mini-batch are selected with equal probability. Calling this function sets batch_uniform_weight to False, after which the indices for a mini-batch are selected randomly according to their weights.

weights_minibatch()

Return the array containing the weights of the states in the mini-batch. The array is constant 1 if batch_uniform_weight is False (in that case the weights already enter through the sampling of indices).
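The two sampling modes and the corresponding mini-batch weights can be sketched with NumPy as below. The function sample_minibatch is hypothetical, and the choice of sampling without replacement in the uniform case and with replacement in the weighted case is an assumption of this sketch, not something the documentation specifies.

```python
import numpy as np

def sample_minibatch(weights, batch_size, uniform=True, rng=None):
    """Return (indices, minibatch_weights) for one mini-batch.

    Hypothetical helper for illustration, not part of data_set.
    """
    rng = np.random.default_rng() if rng is None else rng
    weights = np.asarray(weights, dtype=float)
    k = len(weights)
    if uniform:
        # Every state is equally likely; the per-state weights must then
        # be carried along with the mini-batch.
        idx = rng.choice(k, size=batch_size, replace=False)
        w = weights[idx]
    else:
        # States are drawn in proportion to their weights, so the
        # mini-batch weights reduce to a constant 1.
        p = weights / weights.sum()
        idx = rng.choice(k, size=batch_size, replace=True, p=p)
        w = np.ones(batch_size)
    return idx, w
```

In both modes a weighted average over the mini-batch estimates the same weighted average over the full data set, which is why the returned weights differ between the two cases.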

class data_set.feature_tuple(features)

Bases: object

Class containing features.

Parameters

features (string) – user-defined features.

Currently supported types of features:

dihedral angle, angle, bond.
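For orientation, the three feature types can be computed from atomic coordinates as in the sketch below. The function names, the (n_atoms, 3) coordinate layout, and the sign convention of the dihedral angle are assumptions made for illustration; the module's own feature definitions and output type (it returns torch tensors) may differ.

```python
import numpy as np

def bond_length(x, i, j):
    """Distance between atoms i and j."""
    return np.linalg.norm(x[i] - x[j])

def bond_angle(x, i, j, k):
    """Angle in radians at atom j, formed by atoms i, j, k."""
    a = x[i] - x[j]
    b = x[k] - x[j]
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against round-off slightly outside [-1, 1].
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

def dihedral_angle(x, i, j, k, l):
    """Dihedral angle in radians, in (-pi, pi], defined by atoms i, j, k, l."""
    b1 = x[j] - x[i]
    b2 = x[k] - x[j]
    b3 = x[l] - x[k]
    n1 = np.cross(b1, b2)  # normal of the plane through atoms i, j, k
    n2 = np.cross(b2, b3)  # normal of the plane through atoms j, k, l
    y = np.linalg.norm(b2) * np.dot(b1, n2)
    return np.arctan2(y, np.dot(n1, n2))
```

All three quantities depend only on relative positions, so they are unchanged by rigid rotations and translations of the molecule.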

convert_atom_ix_by_file(ids_filename)

Convert indices in features according to the index file.

show_features()

Display information of features.