rstoolbox.components.DesignFrame.get_sequence_with

DesignFrame.get_sequence_with(seqID, selection, confidence=1, invert=False)

Selects those decoys with a particular set of residue matches.

It is meant to find, for example, all the decoys in which position 25 is A and position 46 is T.

Parameters:
  • seqID (str) – Identifier of the sequence of interest.
  • selection (list of tuple) – List of (position, residue type) tuples, with the residue type in 1-letter code.
  • confidence (float) – Fraction of the selection rules that a decoy must fulfill to count as a match. Default is 1 (all).
  • invert (bool) – When True, return the sequences that do NOT fulfill the search conditions.
Returns:

DesignFrame - containing only the decoys whose sequence passes the selection
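
How selection, confidence and invert combine can be sketched in plain Python on ordinary strings. This is a minimal illustration, not the library's implementation: the helper name matches_selection is hypothetical, positions are assumed to be 1-based (consistent with the example below, where (1, 'T') matches the first residue), and the comparison against confidence is assumed to be "greater than or equal".

```python
def matches_selection(sequence, selection, confidence=1.0, invert=False):
    """Hypothetical helper: decide whether one sequence passes the rules."""
    if not selection:
        raise ValueError("selection must contain at least one rule")
    # Count the fulfilled rules; positions are assumed to be 1-based.
    hits = sum(
        1
        for position, residue in selection
        if 0 < position <= len(sequence) and sequence[position - 1] == residue
    )
    # Assumption: a sequence matches when the fulfilled fraction
    # reaches the requested confidence.
    fulfilled = hits / len(selection) >= confidence
    # invert flips the decision, keeping the non-matching sequences.
    return not fulfilled if invert else fulfilled


sequences = ["TRPEEA", "MKPEEA", "TKAEEW"]
rules = [(1, "T"), (3, "P")]

all_rules = [s for s in sequences if matches_selection(s, rules)]
half_rules = [s for s in sequences if matches_selection(s, rules, confidence=0.5)]
inverted = [s for s in sequences if matches_selection(s, rules, invert=True)]

print(all_rules)   # ['TRPEEA']
print(half_rules)  # ['TRPEEA', 'MKPEEA', 'TKAEEW']
print(inverted)    # ['MKPEEA', 'TKAEEW']
```

With confidence=0.5, a sequence only needs one of the two rules to hold, so all three toy sequences pass; with the default of 1, only the sequence that satisfies both rules is kept.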

Example

In [1]: from rstoolbox.io import parse_rosetta_file
   ...: import pandas as pd
   ...: pd.set_option('display.width', 1000)
   ...: pd.set_option('display.max_columns', 500)
   ...: df = parse_rosetta_file("../rstoolbox/tests/data/input_2seq.minisilent.gz",
   ...:                         {'scores': ['score'], 'sequence': 'B'})
   ...: df.get_sequence_with('B', [(1, 'T')])
   ...: 
Out[1]: 
     score                                                                                                            sequence_B
0 -206.678  TRPEEARERAWRLAEIAMRKGWEEHEREWEWWKRASKGREERDMLPERMIAAALRAIGEIFNAEWQMRLEMEKERKNPNAGEEKMKEQKKEAWKIAYYWGLMAAYWIKQHREKERK
2 -203.582  TKPEEMAREAYKRMLKALKQGEEEMKRMYEQMKKGVDSKEERDMEPEKMIAIALRAIGELFNAWMKALRHMKELRKLGTSGPKEEEKHWRWIFELHRWAGEEIQRAAEIQERKARW
3 -213.779  TKPEEWARWAYKEHLKMAEKHRKEMEIEWEELKRRDGKEEEKDMWPERMIAMALRAIGELFNHHMYAEMRAKEEKKKPEAKTEEARRARREIMKYHHEAGRLIEEAMRRLMERHKK