rstoolbox.components.DesignFrame.sequence_distance

DesignFrame.sequence_distance(seqID, other=None)

Compute the identity-based sequence distance between the selected decoys.

Generate a matrix with the distance between each pair of sequences in the DesignFrame. This is a time-consuming operation, so it is better to run it over a specific set of selected decoys than over all your designs.

If other is provided as a second DesignFrame, distances are calculated between the sequences of the current DesignFrame and the sequences of the other.
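
As a rough illustration of what such a table contains, here is a minimal pure-Python sketch. It assumes that the distance is the number of positions at which two equal-length sequences differ (a Hamming distance); the source does not spell out the metric, so treat that as an assumption. The helper names hamming and distance_table and the toy sequences are hypothetical and are not part of rstoolbox.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_table(seqs, other=None):
    """Build a nested dict so that table[row_id][col_id] is a distance.

    seqs and other map decoy identifiers to sequences.
    When other is None, seqs is compared against itself.
    """
    if other is None:
        if len(seqs) < 2:
            raise ValueError("need at least two sequences or a second container")
        other = seqs
    return {rid: {cid: hamming(rs, cs) for cid, cs in other.items()}
            for rid, rs in seqs.items()}


# Hypothetical toy sequences, not taken from the rstoolbox test data.
decoys = {"d1": "MKVLA", "d2": "MKVIA", "d3": "AKVIG"}
table = distance_table(decoys)
print(table["d1"]["d3"])
```

The all-against-all case costs one comparison per pair of sequences, so the work grows quadratically with the number of decoys, which is why running it on a small selection is preferable.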

Parameters:
  • seqID (str) – Identifier of the sequence of interest.
  • other (DesignFrame) – Secondary data container. Optional.

Returns: DataFrame – table with the sequence distances.

Raises:
  • KeyError – if there is no sequence information for chain seqID of the decoys.
  • KeyError – if the description column cannot be found.
  • ValueError – if the sequences of self and other are of different length.
  • ValueError – if the data container only has one sequence and no other is provided.

Example

In [1]: from rstoolbox.io import parse_rosetta_file
   ...: import pandas as pd
   ...: pd.set_option('display.width', 1000)
   ...: pd.set_option('display.max_columns', 500)
   ...: df = parse_rosetta_file("../rstoolbox/tests/data/input_2seq.minisilent.gz",
   ...:                         {'scores': ['score', 'description'], 'sequence': 'B'})
   ...: df.sequence_distance('B')
   ...: 
Out[1]: 
                                test_3lhp_binder_labeled_00001  test_3lhp_binder_labeled_00002  test_3lhp_binder_labeled_00003  test_3lhp_binder_labeled_00004  test_3lhp_binder_labeled_00005  test_3lhp_binder_labeled_00006
test_3lhp_binder_labeled_00001  0                               75                              77                              74                              69                              81                            
test_3lhp_binder_labeled_00002  75                              0                               75                              65                              66                              73                            
test_3lhp_binder_labeled_00003  77                              75                              0                               68                              81                              76                            
test_3lhp_binder_labeled_00004  74                              65                              68                              0                               74                              71                            
test_3lhp_binder_labeled_00005  69                              66                              81                              74                              0                               66                            
test_3lhp_binder_labeled_00006  81                              73                              76                              71                              66                              0