rstoolbox.analysis.positional_structural_count

rstoolbox.analysis.positional_structural_count(df, seqID=None, key_residues=None)

Fraction of decoys (between 0 and 1) assigned to each secondary structure type at each sequence position.

The secondary structure alphabet is a reduced, three-state one: H (helix), E (strand) and L (loop).

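Parameters:
df (Union[DesignFrame, FragmentFrame]) - Input data.
seqID (str) - Identifier of the chain of interest. Required when df is a DesignFrame.
key_residues - Optional residues of interest to restrict the output to. Defaults to None.

Returns: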

DataFrame - where rows are sequence positions and columns are the secondary structure identifiers H, E, L.

Raises:
AttributeError: if the data passed is not in Union[DesignFrame, FragmentFrame]. The function will not try to cast a provided DataFrame, because it cannot know which of the two possible input types to cast it to.
AttributeError: if input is DesignFrame and seqID is not provided.
KeyError: if there is no structure information for chain seqID of the decoys when input is DesignFrame.

Example

In [1]: from rstoolbox.io import parse_rosetta_file
   ...: from rstoolbox.analysis import positional_structural_count
   ...: import pandas as pd
   ...: pd.set_option('display.width', 1000)
   ...: pd.set_option('display.max_columns', 500)
   ...: df = parse_rosetta_file("../rstoolbox/tests/data/input_ssebig.minisilent.gz",
   ...:                         {'scores': ['score'], 'structure': 'C'})
   ...: df = positional_structural_count(df.iloc[1:], 'C')
   ...: df.head()
   ...: 
Out[1]: 
     H         E         L
1  0.0  0.000000  1.000000
2  0.0  0.787611  0.212389
3  0.0  0.985841  0.014159
4  0.0  1.000000  0.000000
5  0.0  1.000000  0.000000
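
To make the meaning of the returned table concrete, the following is a minimal pandas-only sketch of the same kind of per-position count. It is not the rstoolbox implementation: the helper name positional_ss_fraction and the toy decoy strings are hypothetical, and it assumes every decoy provides a secondary structure string of equal length over the H/E/L alphabet.

```python
import pandas as pd


def positional_ss_fraction(ss_strings):
    """Hypothetical helper (not part of rstoolbox).

    Returns, for every 1-based sequence position, the fraction of decoys
    assigned to each of the secondary structure labels H, E and L.
    """
    lengths = {len(s) for s in ss_strings}
    if len(lengths) != 1:
        raise ValueError("all secondary structure strings must share one length")

    # One row per decoy, one column per sequence position (1-based).
    chars = pd.DataFrame([list(s) for s in ss_strings])
    chars.columns = range(1, chars.shape[1] + 1)

    # The mean of a boolean column is the fraction of decoys with that label.
    return pd.DataFrame({label: (chars == label).mean() for label in "HEL"})


# Toy input (assumed data, not taken from the test file used above).
decoys = ["LEEEE", "LLEEE", "LEEEE", "LEEEH"]
table = positional_ss_fraction(decoys)
print(table)

# Most frequent label per position, read off the table row by row.
consensus = "".join(table.idxmax(axis=1))
print(consensus)
```

The resulting table has the same layout as the output shown above (positions as rows; H, E, L as columns), so each row sums to 1 and idxmax along the columns gives a per-position consensus.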