contact_map.frequency_task

Task-based implementation of ContactFrequency.

The overall algorithm is:

  1. Identify how we’re going to slice up the trajectory into task-based chunks (block_slices(), default_slices())
  2. On each node
    1. Load the trajectory segment (load_trajectory_task())
    2. Run the analysis on the segment (map_task())
  3. Once all the results have been collected, combine them (reduce_all_results())
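
The steps above can be sketched with a toy example that needs neither MDTraj nor a cluster. Everything in it is an assumption made for illustration: the "trajectory" is a list of frames, each frame is a set of residue pairs in contact, and the toy_* functions are hypothetical stand-ins for the slicing, loading, mapping, and reducing functions in this module, not their actual implementations.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Toy "trajectory": each frame is the set of residue pairs in contact.
frames = [
    {(0, 5), (1, 6)},
    {(0, 5)},
    {(0, 5), (2, 7)},
    {(1, 6)},
    {(0, 5), (1, 6)},
]

def toy_slices(n_total, n_per_block):
    """Step 1 (stand-in): cut the frame range into blocks."""
    return [slice(start, min(start + n_per_block, n_total))
            for start in range(0, n_total, n_per_block)]

def toy_load(subslice):
    """Step 2.1 (stand-in): 'load' one segment of the trajectory."""
    return frames[subslice]

def toy_map(segment):
    """Step 2.2 (stand-in): count contacts within one segment."""
    counts = Counter()
    for frame in segment:
        counts.update(frame)
    return counts, len(segment)

def toy_reduce(partials):
    """Step 3 (stand-in): sum partial counts, then normalize."""
    total = Counter()
    n_frames = 0
    for counts, n in partials:
        total += counts
        n_frames += n
    return {pair: count / n_frames for pair, count in total.items()}

slices = toy_slices(len(frames), 2)
with ThreadPoolExecutor(max_workers=2) as pool:
    # Each task loads its own segment and analyzes it.
    partials = list(pool.map(lambda s: toy_map(toy_load(s)), slices))
frequencies = toy_reduce(partials)
```

Because each partial result carries its own frame count, the combined frequencies do not depend on how the frames were split into blocks.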

Notes

This module includes versions in which messages are Python objects and versions (suffixed with _json) in which messages are JSON-serialized. However, there is not yet a way to JSON-serialize MDTraj objects, so if JSON serialization is the communication method, loading the trajectory and calculating the contacts must be combined into a single task.
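
As a rough illustration of that constraint, the sketch below combines loading and analysis in one task, so that only JSON-friendly values cross the task boundary. It is an assumption-laden toy: TOY_FILES, toy_load_and_map_json, the file name, and the "0-5" key format are all hypothetical, and nothing here reflects how the _json functions in this module encode their results.

```python
import json

# Hypothetical stand-in for files on disk: file name -> list of frames,
# where each frame lists the residue pairs in contact.
TOY_FILES = {
    "trajectory.xtc": [[(0, 5), (1, 6)], [(0, 5)], [(1, 6)]],
}

def toy_load_and_map_json(start, stop, file_name):
    """Load a segment and analyze it in a single task.

    Inputs are plain ints and a string; the output is a JSON string.
    The loaded segment (the non-serializable part) never leaves the task.
    """
    segment = TOY_FILES[file_name][start:stop]
    counts = {}
    for frame in segment:
        for i, j in frame:
            key = "{}-{}".format(i, j)  # JSON object keys must be strings
            counts[key] = counts.get(key, 0) + 1
    return json.dumps({"n_frames": len(segment), "counts": counts})

message = toy_load_and_map_json(0, 2, "trajectory.xtc")
result = json.loads(message)
```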

Functions

block_slices(n_total, n_per_block) – Determine slices for splitting the input array.
default_slices(n_total, n_workers) – Calculate default slices from number of workers.
load_trajectory_task(subslice, file_name, …) – Task for loading file.
map_task(subtrajectory, parameters) – Task to be mapped to all subtrajectories.
map_task_json(subtrajectory, parameters) – JSON-serialized version of map_task().
reduce_all_results(contacts) – Combine multiple ContactFrequency objects into one.
reduce_all_results_json(results_of_map) – JSON-serialized version of reduce_all_results().

contact_map.frequency_task.block_slices(n_total, n_per_block)

Determine slices for splitting the input array.

Parameters:
  • n_total (int) – total length of array
  • n_per_block (int) – maximum number of items per block
Returns:

slices to be applied to the array

Return type:

list of slice
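
A minimal sketch of how such slices could be built, assuming that the last block is simply shorter when n_total is not a multiple of n_per_block. The function sketch_block_slices is hypothetical and is not the library's implementation.

```python
def sketch_block_slices(n_total, n_per_block):
    """Return consecutive slices covering range(n_total).

    Every slice holds n_per_block items except possibly the last,
    which holds whatever remains.
    """
    return [slice(start, min(start + n_per_block, n_total))
            for start in range(0, n_total, n_per_block)]

slices = sketch_block_slices(10, 4)
data = list(range(10))
chunks = [data[s] for s in slices]  # apply each slice to the array
```

Here 10 items with at most 4 per block give blocks of 4, 4, and 2 items.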

contact_map.frequency_task.default_slices(n_total, n_workers)

Calculate default slices from number of workers.

Default behavior is (approximately) one task per worker.

Parameters:
  • n_total (int) – total number of items in array
  • n_workers (int) – number of workers
Returns:

slices to be applied to the array

Return type:

list of slice
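
One plausible way to get roughly one task per worker is to derive a block size from the worker count and then slice by that block size. The sketch below rounds the block size up; that rounding choice is an assumption, the library may round differently, and sketch_default_slices is a hypothetical name.

```python
import math

def sketch_default_slices(n_total, n_workers):
    """Return slices sized so there are at most n_workers blocks."""
    # Assumption: round the block size up, and never go below one item.
    n_per_block = max(1, math.ceil(n_total / n_workers))
    return [slice(start, min(start + n_per_block, n_total))
            for start in range(0, n_total, n_per_block)]

slices = sketch_default_slices(10, 3)  # block size 4: three blocks
few_items = sketch_default_slices(2, 8)  # fewer items than workers
```

When there are fewer items than workers, this sketch yields one single-item slice per item rather than empty slices.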

contact_map.frequency_task.load_trajectory_task(subslice, file_name, **kwargs)

Task for loading file. The arguments are ordered so that the per-task variable (subslice) comes first.

Parameters:
  • subslice (slice) – the slice of the trajectory to use
  • file_name (str) – trajectory file name
  • kwargs – other parameters to mdtraj.load
Returns:

subtrajectory for this slice

Return type:

md.Trajectory
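
Putting the per-task argument first makes it easy to fix the shared arguments once and map the resulting callable over the slices. The sketch below shows that pattern with a hypothetical stand-in loader, toy_load_task, which only records its arguments instead of reading a file. The file names are placeholders; top is an existing keyword of mdtraj.load, used here only as an example of a pass-through option.

```python
from functools import partial

def toy_load_task(subslice, file_name, **kwargs):
    """Stand-in loader with the same argument order: slice first."""
    return {
        "file": file_name,
        "frames": (subslice.start, subslice.stop),
        "options": kwargs,
    }

slices = [slice(0, 50), slice(50, 100)]

# Bind everything that is the same for all tasks ...
load = partial(toy_load_task, file_name="trajectory.xtc", top="topology.pdb")

# ... so that only the per-task slice varies when mapping.
segments = list(map(load, slices))
```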

contact_map.frequency_task.map_task(subtrajectory, parameters)

Task to be mapped to all subtrajectories. Runs ContactFrequency on the given subtrajectory.

Parameters:
  • subtrajectory (mdtraj.Trajectory) – single trajectory segment to calculate ContactFrequency for
  • parameters (dict) – kwargs-style dict for the ContactFrequency object
Returns:

contact frequency for the subtrajectory

Return type:

ContactFrequency

contact_map.frequency_task.map_task_json(subtrajectory, parameters)

JSON-serialized version of map_task()

contact_map.frequency_task.reduce_all_results(contacts)

Combine multiple ContactFrequency objects into one.

Parameters:
  • contacts (iterable of ContactFrequency) – the individual (partial) contact frequencies
Returns:

total of all input contact frequencies (summing them)

Return type:

ContactFrequency

contact_map.frequency_task.reduce_all_results_json(results_of_map)

JSON-serialized version of reduce_all_results()