data_set module
- class data_set.MD_data_set(xvec, weights)
Bases:
data_set.data_set
This class is a child class of data_set, used for data of molecular systems.
- align()
Align all states in the data set with respect to the reference configuration.
This function implements the Kabsch algorithm, which minimizes the root mean squared deviation of the states with respect to the reference configuration.
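The alignment step can be illustrated with a minimal NumPy sketch of the Kabsch algorithm for a single state. The function name `kabsch_align` and the `(n_atoms, 3)` array layout are assumptions made for this example; the module's own `align()` may store and process states differently.

```python
import numpy as np

def kabsch_align(state, ref):
    """Rigidly superimpose `state` onto `ref`; both have shape (n_atoms, 3).

    Hypothetical helper for illustration, not the module's implementation.
    """
    # Remove translation by centering both configurations.
    state_c = state - state.mean(axis=0)
    ref_centroid = ref.mean(axis=0)
    ref_c = ref - ref_centroid
    # Covariance matrix between the two point sets and its SVD.
    H = state_c.T @ ref_c
    U, S, Vt = np.linalg.svd(H)
    # Flip one axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate the centered state and move it onto the reference centroid.
    return state_c @ R.T + ref_centroid
```

Applied to a state that is a rotated and translated copy of the reference, the function should recover the reference up to floating-point error.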
- classmethod from_file(states_filename)
Initialize the data set from a file.
- load_ref_state()
Load a reference configuration from the file ./data/ref_state.txt. It is used to align the data.
- map_to_all_features()
Map the states in the mini-batch to all features, by calling map_to_feature() for each feature.
- Returns
2d torch tensor, whose dimensions are (batch_size, num_features).
- map_to_feature(idx)
Map the states in the mini-batch to one feature.
- Parameters
idx – index of the feature to which states are mapped.
- pre_processing_layer()
If features are defined, then map data to features. Otherwise, return the aligned data.
- write_all_features_file(feature_filename)
Map all states to features, and write the features to file.
- Parameters
feature_filename – filename to output features.
- class data_set.data_set(xvec, weights)
Bases:
object
The class for data set.
- Parameters
xvec (2d numpy array) – array containing trajectory data
weights (1d numpy array) – weights of trajectory data
- Variables
batch_size – current batch-size
K – number of data points (states)
active_index – indices of active states. A state is included in mini-batch, if the entry is 1.
features – the tuple of features (the state is first transformed to features, before it is passed to neural networks).
num_features – number of features
batch_uniform_weight – when True (default), the data is included in mini-batch with equal probability.
- dim_of_features()
Number of features.
- classmethod from_file(states_filename)
Initialize the data_set from a data file.
- generate_minibatch(batch_size, minibatch_flag=True)
Generate a mini-batch.
- Parameters
batch_size (int) – size of mini-batch.
minibatch_flag (bool) – controls whether a mini-batch or the entire data set is generated.
- pre_processing_layer()
Process data before it is sent to neural networks. In the base class, this function returns the mini-batch itself (i.e., no features are used to transform the data). It can be overridden in a child class (e.g., to map data to features).
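The override pattern described here can be sketched with two toy classes. `ToyDataSet`, `ToyFeatureDataSet`, the `x_batch` attribute, and the use of plain callables as features are all hypothetical stand-ins chosen for this example; they are not the module's classes or attributes.

```python
import numpy as np

class ToyDataSet:
    """Hypothetical stand-in for the base class."""
    def __init__(self, xvec):
        # For simplicity, treat the whole data set as the current mini-batch.
        self.x_batch = np.asarray(xvec, dtype=float)

    def pre_processing_layer(self):
        # Base behaviour: pass the mini-batch through unchanged.
        return self.x_batch

class ToyFeatureDataSet(ToyDataSet):
    """Hypothetical child class that maps states to features when defined."""
    def __init__(self, xvec, features=()):
        super().__init__(xvec)
        self.features = tuple(features)  # callables: state -> float

    def pre_processing_layer(self):
        if not self.features:
            return self.x_batch
        # One row per state, one column per feature.
        return np.array([[f(x) for f in self.features] for x in self.x_batch])
```

With two features defined, a batch of four states yields an array of shape (4, 2); with none, the states are returned as they are.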
- set_features(features)
Set the features of the data set.
- Parameters
features – list of features.
- set_nonuniform_batch_weight()
By default, batch_uniform_weight is True and indices for the mini-batch are selected with equal probability. Calling this function sets batch_uniform_weight to False, after which indices for the mini-batch are selected randomly according to their weights.
- weights_minibatch()
Return the array containing the weights of the states in the mini-batch. The array is constant 1 if batch_uniform_weight is False.
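The two sampling modes and the weights returned with them can be sketched as follows. The function `sample_minibatch` is hypothetical, and the choice of sampling without replacement in the uniform case and with replacement in the weighted case is an assumption; the documentation above does not say which the module uses.

```python
import numpy as np

def sample_minibatch(weights, batch_size, uniform=True, rng=None):
    """Hypothetical sketch: pick mini-batch indices and their weights."""
    rng = np.random.default_rng() if rng is None else rng
    weights = np.asarray(weights, dtype=float)
    K = len(weights)
    if uniform:
        # Every state is equally likely; keep the stored weights.
        idx = rng.choice(K, size=batch_size, replace=False)
        batch_weights = weights[idx]
    else:
        # States are drawn in proportion to their weights, so the
        # returned weights are constant 1.
        p = weights / weights.sum()
        idx = rng.choice(K, size=batch_size, replace=True, p=p)
        batch_weights = np.ones(batch_size)
    return idx, batch_weights
```

The `uniform` argument plays the role of batch_uniform_weight: in the weighted mode the weights are already accounted for by the sampling itself, which is why the returned array is all ones.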
- class data_set.feature_tuple(features)
Bases:
object
Class containing features.
- Parameters
features (string) – user-defined features.
- Currently supported types of features:
dihedral angle, angle, bond.
- convert_atom_ix_by_file(ids_filename)
Convert indices in features according to the index file.
- show_features()
Display information of features.
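The three supported feature types listed for feature_tuple (bond, angle, dihedral angle) can be computed from atom coordinates as in the sketch below. The function names, the `(n_atoms, 3)` coordinate layout, and the sign convention of the dihedral angle are assumptions made for this example; the module's own feature evaluation and its feature string syntax are not shown in this documentation.

```python
import numpy as np

def bond_length(x, i, j):
    """Distance between atoms i and j; x has shape (n_atoms, 3)."""
    return np.linalg.norm(x[j] - x[i])

def bond_angle(x, i, j, k):
    """Angle at atom j formed by atoms i-j-k, in radians."""
    u = x[i] - x[j]
    v = x[k] - x[j]
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to guard against round-off slightly outside [-1, 1].
    return np.arccos(np.clip(cos_t, -1.0, 1.0))

def dihedral_angle(x, i, j, k, l):
    """Dihedral angle defined by atoms i-j-k-l, in radians in [-pi, pi]."""
    b1 = x[j] - x[i]
    b2 = x[k] - x[j]
    b3 = x[l] - x[k]
    # Normals of the planes (i, j, k) and (j, k, l).
    n1 = np.cross(b1, b2)
    n2 = np.cross(b2, b3)
    m1 = np.cross(n1, b2 / np.linalg.norm(b2))
    return np.arctan2(np.dot(m1, n2), np.dot(n1, n2))
```

Using `arctan2` rather than `arccos` for the dihedral angle keeps the sign of the angle and avoids loss of precision near 0 and pi.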