rstoolbox.analysis.positional_structural_identity

rstoolbox.analysis.positional_structural_identity(df, seqID=None, ref_sse=None, key_residues=None)

Per-position evaluation of how often the provided data matches the expected reference_structure.

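Parameters:
df: data to evaluate; must be a DesignFrame or a FragmentFrame.
seqID: identifier of the chain of interest; required when df is a DesignFrame (default None).
ref_sse: expected (reference) secondary structure; required when df is a FragmentFrame (default None).
key_residues: optional; defaults to None.
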
Returns:

DataFrame - where rows are sequence positions and columns are sse (expected secondary structure), max_sse (most represented secondary structure) and identity_perc (fraction of the data matching the expected secondary structure, reported on a 0 to 1 scale despite the name).
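
To make the three columns concrete, here is a minimal pure-Python sketch of the kind of per-position tally described above. It is not rstoolbox's implementation: the helper name positional_identity_sketch, the list-of-dicts output and the toy secondary structure strings are all assumptions made for illustration, and the real function works on a DesignFrame or FragmentFrame and returns a DataFrame.

```python
from collections import Counter


def positional_identity_sketch(ref_sse, decoy_sses):
    """Hypothetical helper (not part of rstoolbox).

    ref_sse    -- expected secondary structure, one character per position
    decoy_sses -- secondary structure strings, all the same length as ref_sse
    """
    rows = []
    for i, expected in enumerate(ref_sse):
        # Secondary structure observed at this position in every decoy.
        column = [sse[i] for sse in decoy_sses]
        # Most represented state at this position.
        most_common = Counter(column).most_common(1)[0][0]
        # Fraction of decoys that match the expected state (0 to 1).
        identity = column.count(expected) / len(column)
        rows.append({
            "position": i + 1,  # 1-based, as in the example output
            "sse": expected,
            "max_sse": most_common,
            "identity_perc": identity,
        })
    return rows


# Toy data (invented for this sketch).
ref = "LEEH"
decoys = ["LEEH", "LLEH", "LLEL", "LLEH", "LEEH"]
table = positional_identity_sketch(ref, decoys)
for row in table:
    print(row)
```

In this toy data, position 2 is expected to be E but most decoys have L there, so max_sse differs from sse and identity_perc falls below 0.5. That is the kind of position this analysis is meant to flag.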

Raises:
AttributeError: if the data passed is not a DesignFrame or a FragmentFrame. A plain DataFrame is not cast automatically, because there is no way to know which of the two possible input types it should be cast to.
AttributeError: if the input is a DesignFrame and seqID is not provided.
KeyError: if the input is a DesignFrame and the decoys have no structure information for chain seqID.
AttributeError: if the input is a FragmentFrame and ref_sse is not provided.

Example

In [1]: from rstoolbox.io import parse_rosetta_file
   ...: from rstoolbox.analysis import positional_structural_identity
   ...: import pandas as pd
   ...: pd.set_option('display.width', 1000)
   ...: pd.set_option('display.max_columns', 500)
   ...: # Load the score and the chain C structure data of each decoy.
   ...: df = parse_rosetta_file("../rstoolbox/tests/data/input_ssebig.minisilent.gz",
   ...:                         {'scores': ['score'], 'structure': 'C'})
   ...: # Use the structure of the first decoy as the reference for chain C.
   ...: df.add_reference_structure('C', df.get_structure('C').values[0])
   ...: # Evaluate the remaining decoys against that reference, position by position.
   ...: df = positional_structural_identity(df.iloc[1:], 'C')
   ...: df.head()
   ...: 
Out[1]: 
  sse max_sse  identity_perc
1  L   L       1.000000     
2  E   E       0.212389     
3  E   E       0.985841     
4  E   E       1.000000     
5  E   E       1.000000