rstoolbox.components.DesignFrame.retrieve_sequences_from_pdbs

DesignFrame.retrieve_sequences_from_pdbs(prefix=None, dropna=True)

Obtain sequence data related to the decoys through their Rosetta-generated PDB files.

This method may be needed after reading from score files, which do not contain sequence information.

Parameters:
  • prefix (str) – The description of a decoy might not point to the path of its PDB file if the score file was read from a different directory. Apply a prefix to locate the files. If the PDB paths stored in the score file are relative to the score file's location, the prefix must include the path to the score file.
  • dropna (bool) – If True, non-standard residues are dropped when building the sequence. Otherwise, they appear as X. Note that residue modifications known to Rosetta, such as LYS:CtermProteinFull or HIS_D, are considered standard in this context.
Returns:

DataFrame with the new sequence data.
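
The effect of the two parameters can be illustrated with a minimal, standalone sketch. It does not use rstoolbox and is not the library's implementation. The helper names resolve_pdb_path and residues_to_sequence are hypothetical, and the sketch assumes that the prefix is joined to the description as a path component and that Rosetta variant names reduce to their base residue by cutting at ":" or "_".

```python
import os

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def resolve_pdb_path(description, prefix=None):
    """Hypothetical helper: prepend a prefix to a decoy's description.

    Assumption: the prefix is joined as a directory component.
    """
    if prefix is None:
        return description
    return os.path.join(prefix, description)


def residues_to_sequence(residue_names, dropna=True):
    """Hypothetical helper: build a one-letter sequence from residue names.

    Assumption: variants such as LYS:CtermProteinFull or HIS_D map to
    their base residue, so they count as standard.
    """
    letters = []
    for name in residue_names:
        base = name.split(":")[0].split("_")[0]
        letter = THREE_TO_ONE.get(base)
        if letter is not None:
            letters.append(letter)
        elif not dropna:
            # Non-standard residues are kept as X only when dropna is False.
            letters.append("X")
    return "".join(letters)


residues = ["MET", "LYS:CtermProteinFull", "MSE", "HIS_D"]
print(residues_to_sequence(residues, dropna=True))   # MKH
print(residues_to_sequence(residues, dropna=False))  # MKXH
print(resolve_pdb_path("decoy_0001.pdb", prefix="run_01"))
```

Here MSE stands in for a non-standard residue: it is skipped with dropna=True and reported as X otherwise, while the two Rosetta variants are translated like their base residues.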