Structural Bioinformatics Library
Template C++ / Python API for developing structural bioinformatics applications.
BM_Batch Class Reference

Loading

 set_dataset (self, dataset)
 Set the current dataset from an existing dataset.
 add_run_specification (self, run_specification)
 Add a simple run specification to the batch.
 load_dataset (self, directory, file_name_re=".*", recursive=False)
 Load the dataset from a directory.
 check_specification (self)
 Check that the loaded specification is correct.
 load_run_specification (self, file_name)
 Load the specification of the runs from a specification file.

Building the Runs

 build_run_commands (self)
 Builds the run specifications of each run of the batch and returns them.
 split_per_IFO (self)
 Split the batch such that each new batch is invariant w.r.t. the IFO.
 split_per_NFO (self)
 Split the batch such that each new batch is invariant w.r.t. the NFO.
 split_per_selected_NFO_option (self, option_name)
 Split the batch such that each new batch is invariant w.r.t. the NFO option named option_name.
 split_per_selected_IFO_option (self, option_name)
 Split the batch such that each new batch is invariant w.r.t. the IFO option named option_name.
 split_per_selected_option (self, option_name)
 Split the batch such that each new batch is invariant w.r.t. the option named option_name.
 split_per_selected_options (self, option_names)
 Same as split_per_selected_option, but applied recursively over a list of option names.

Accessing the Runs

 get_output_directory (self)
 Simple access to the directory where the runs of the batch are run.
 set_output_directory_prefix (self, prefix)
 Sets a prefix for the output directory where the runs of the batch are run.
 set_output_directory (self, output_directory)
 Sets the output directory, where the runs of the batch are run.
 get_run_commands (self)
 Simple access to the run specification of each run.

Starting the Runs

 print_batch (self)
 Print all the run commands from the list of specifications.
 get_lists_of_run_options (self)
 Return a list of run options for each run in the batch.
 run (self, nb_instances=1)
 Performs the runs nb_instances times.
 repeat (self, nb_instances)
 Synonym of run, but the number of instances has no default value.
 make_scripts (self)
 Make one file per execution instead of running the commands.

Detailed Description

Definition of a batch from a dataset and a specification of runs. The functionalities are:

  • loading the batch from a pair (dataset, specification_file),
  • printing the commands to run, or running them directly,
  • splitting the batch into several smaller batches in which the IFO (options corresponding to input files) or the NFO (options not corresponding to input files) are identical.

Member Function Documentation

◆ add_run_specification()

add_run_specification ( self,
run_specification )

Add a simple run specification to the batch.

◆ build_run_commands()

build_run_commands ( self)

Builds the run specifications of each run of the batch and returns them.

◆ check_specification()

check_specification ( self)

Check that the loaded specification is correct.

In particular, it checks that the IFO entries and the IFO association rule (IFO-ASSOC-RULE) match.

◆ get_lists_of_run_options()

get_lists_of_run_options ( self)

Return a list of run options for each run in the batch.

◆ get_output_directory()

get_output_directory ( self)

Simple access to the directory where the runs of the batch are run.

Note that if an option of the executable redirects its output away from the current directory, the output will not be written into the batch output directory.

◆ get_run_commands()

get_run_commands ( self)

Simple access to the run specification of each run.

◆ load_dataset()

load_dataset ( self,
directory,
file_name_re = ".*",
recursive = False )

Load the dataset from a directory.

Only the files in which the regular expression file_name_re is found are loaded (all files are loaded by default). The recursive flag controls whether the subdirectories of this directory are also included in the dataset (default is False).
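The selection rule described above can be sketched in plain Python. This is a minimal illustration and not the SBL implementation: collect_files is a hypothetical helper, and it assumes that the regular expression is searched in the file name only (not in the full path).

```python
import os
import re


def collect_files(directory, file_name_re=".*", recursive=False):
    """Return the sorted paths of the files whose name contains a match.

    Hypothetical helper for illustration; not part of the SBL API.
    """
    pattern = re.compile(file_name_re)
    selected = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            # re.search accepts a match anywhere in the file name.
            if pattern.search(name):
                selected.append(os.path.join(root, name))
        if not recursive:
            # Emptying dirs in place stops os.walk from descending.
            dirs.clear()
    return sorted(selected)
```

With the default arguments every file directly inside the directory is selected; passing recursive=True extends the search to all subdirectories.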

◆ load_run_specification()

load_run_specification ( self,
file_name )

Load the specification of the runs from a specification file.

The specification file is parsed and four keywords are recognized:

  • EXECUTABLE <executable-path>: determines the path to the executable to run,
  • IFO-ASSOC-RULE: determines how to associate the input files of the dataset for each run (see BM_Dataset_association_rules),
  • IFO <option-name>: option corresponding to an input file. The argument is the name of the option without the leading dash. The arity of the input files is determined by the number of times IFO appears (see BM_IFO_set),
  • NFO <option-name> <argument> [, <argument>] [...]: option not corresponding to an input file. The first argument is the name of the option without the leading dash; the remaining arguments, separated by commas, are the possible values of the option.
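To make the four keywords concrete, here is a minimal parsing sketch. It is not the SBL parser: parse_specification is a hypothetical helper, the one-keyword-per-line layout is an assumption, and every name in EXAMPLE (the executable path, the rule name some_rule, the option names) is a placeholder rather than a value taken from the library.

```python
# Hypothetical specification text; all names below are placeholders.
EXAMPLE = """\
EXECUTABLE ./my_program
IFO-ASSOC-RULE some_rule
IFO f
IFO g
NFO radius 1.0, 2.0, 3.0
"""


def parse_specification(text):
    """Parse specification text into a dictionary (illustrative only)."""
    spec = {"executable": None, "ifo_assoc_rule": None, "ifo": [], "nfo": {}}
    for raw_line in text.splitlines():
        parts = raw_line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        keyword = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "EXECUTABLE":
            spec["executable"] = rest
        elif keyword == "IFO-ASSOC-RULE":
            # Kept as an opaque string; its syntax is not assumed here.
            spec["ifo_assoc_rule"] = rest
        elif keyword == "IFO":
            spec["ifo"].append(rest)
        elif keyword == "NFO":
            fields = rest.split(None, 1)
            if not fields:
                raise ValueError("NFO requires an option name")
            values = fields[1] if len(fields) > 1 else ""
            # The possible values are separated by commas.
            spec["nfo"][fields[0]] = [
                v.strip() for v in values.split(",") if v.strip()
            ]
        else:
            raise ValueError(f"unrecognized keyword: {keyword!r}")
    return spec


spec = parse_specification(EXAMPLE)
```

In this sketch the arity of the input files is len(spec["ifo"]), since IFO appears once per input file option.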

◆ make_scripts()

make_scripts ( self)

Make one file per execution instead of running the commands.

Useful when another system is used to execute all the runs.
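The idea of writing one file per execution can be sketched as follows. This is only an illustration under stated assumptions: write_scripts is a hypothetical helper, and the file naming scheme (run_0.sh, run_1.sh, ...) and the shell script format are choices made for this example, not the output format of make_scripts.

```python
import os
import stat


def write_scripts(commands, directory, prefix="run"):
    """Write one executable shell script per command and return the paths.

    Hypothetical helper for illustration; not part of the SBL API.
    """
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index, command in enumerate(commands):
        path = os.path.join(directory, f"{prefix}_{index}.sh")
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\n")
            handle.write(command + "\n")
        # Add the owner-executable bit so a scheduler can launch the file.
        os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
        paths.append(path)
    return paths
```

The resulting files can then be handed to an external job scheduler, one script per job.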

◆ print_batch()

print_batch ( self)

Print all the run commands from the list of specifications.

◆ repeat()

repeat ( self,
nb_instances )

Synonym of run, but the number of instances has no default value.

◆ run()

run ( self,
nb_instances = 1 )

Performs the runs nb_instances times.

◆ set_dataset()

set_dataset ( self,
dataset )

Set the current dataset from an existing dataset.

◆ set_output_directory()

set_output_directory ( self,
output_directory )

Sets the output directory, where the runs of the batch are run.

As a result, all the outputs from the executable should be written in the specified directory. To run batches in different directories under a root other than the current directory, use set_output_directory_prefix.

◆ set_output_directory_prefix()

set_output_directory_prefix ( self,
prefix )

Sets a prefix for the output directory where the runs of the batch are run.

Useful when the batch directories should be created in a directory other than the current one.

◆ split_per_IFO()

split_per_IFO ( self)

Split the batch such that each new batch is invariant w.r.t. the IFO.

Note that this also builds the run specifications of the new batches. Note also that run specifications manually added to the batch remain in the current batch but are not included in the split batches.

◆ split_per_NFO()

split_per_NFO ( self)

Split the batch such that each new batch is invariant w.r.t. the NFO.

Note that this also builds the run specifications of the new batches. Note also that run specifications manually added to the batch remain in the current batch but are not included in the split batches.

◆ split_per_selected_IFO_option()

split_per_selected_IFO_option ( self,
option_name )

Split the batch such that each new batch is invariant w.r.t. the IFO option named option_name.

Note that this also builds the run specifications of the new batches. Note also that run specifications manually added to the batch remain in the current batch but are not included in the split batches.

◆ split_per_selected_NFO_option()

split_per_selected_NFO_option ( self,
option_name )

Split the batch such that each new batch is invariant w.r.t. the NFO option named option_name.

Note that this also builds the run specifications of the new batches. Note also that run specifications manually added to the batch remain in the current batch but are not included in the split batches.

◆ split_per_selected_option()

split_per_selected_option ( self,
option_name )

Split the batch such that each new batch is invariant w.r.t. the option named option_name.

Note that this also builds the run specifications of the new batches. Note also that run specifications manually added to the batch remain in the current batch but are not included in the split batches.
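What "invariant w.r.t. an option" means can be illustrated with a small grouping sketch. It does not reproduce the internals of BM_Batch: the representation of a run as a dictionary mapping option names to values is an assumption made for this example, and split_by_option is a hypothetical helper.

```python
from collections import defaultdict


def split_by_option(runs, option_name):
    """Group runs so that option_name has a single value within each group.

    Hypothetical helper for illustration; not part of the SBL API.
    """
    groups = defaultdict(list)
    for run in runs:
        groups[run[option_name]].append(run)
    # Dictionaries keep insertion order, so groups appear in order of
    # first occurrence of each value.
    return list(groups.values())


# Placeholder runs: the option names and values are invented for the example.
runs = [
    {"radius": "1.0", "f": "a.pdb"},
    {"radius": "2.0", "f": "a.pdb"},
    {"radius": "1.0", "f": "b.pdb"},
]
batches = split_by_option(runs, "radius")
```

Here batches contains two groups: the two runs with radius 1.0, followed by the single run with radius 2.0.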

◆ split_per_selected_options()

split_per_selected_options ( self,
option_names )

Same as split_per_selected_option, but applied recursively over a list of option names.
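The recursive splitting can be sketched in the same spirit. This is an illustration only, assuming each run is a dictionary mapping option names to values; split_by_options is a hypothetical helper and does not mirror the library's data structures.

```python
def split_by_options(runs, option_names):
    """Split runs on the first option, then recurse on the remaining ones.

    Hypothetical helper for illustration; not part of the SBL API.
    """
    if not option_names:
        return [list(runs)]
    first, rest = option_names[0], option_names[1:]
    groups = {}
    for run in runs:
        groups.setdefault(run[first], []).append(run)
    batches = []
    for group in groups.values():
        # Each group is invariant for `first`; refine it with the rest.
        batches.extend(split_by_options(group, rest))
    return batches


# Placeholder runs: the option names and values are invented for the example.
runs = [
    {"radius": "1.0", "mode": "a", "f": "x.pdb"},
    {"radius": "1.0", "mode": "a", "f": "y.pdb"},
    {"radius": "1.0", "mode": "b", "f": "x.pdb"},
    {"radius": "2.0", "mode": "a", "f": "x.pdb"},
]
batches = split_by_options(runs, ["radius", "mode"])
```

Every resulting batch is invariant for all the listed options at once, which is the property the recursive split is meant to guarantee.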