rstoolbox.components.DesignFrame(*args, **kwargs)
The DesignFrame extends the DataFrame,
adding functionality to improve its usability in
the analysis of sets of design decoys.
When filled through the functions provided by this library, each row represents a decoy and each column represents a score attached to it.
As a rule, it is assumed that the object:
These two assumptions are easy to satisfy when casting a
DataFrame into the class, and several functions
of the library depend on them.
Note
These assumptions are automatically fulfilled when the data container is
loaded through parse_rosetta_file(). To obtain sequence information,
it is necessary to request that particular data, as described in
tutorial: reading Rosetta.
The DesignFrame contains four extra attributes
(accessible through the appropriate functions):
reference_sequence: can be attached to each seqID present in the DesignFrame. By adding this sequence, other functions of the library can add that information to their calculations.
reference_structure: can be attached to each seqID present in the DesignFrame. By adding this structure, other functions of the library can add that information to their calculations.
reference_shift: can be attached to each seqID present in the DesignFrame. In short, this would be the initial number of the protein in the source PDB. This allows working with the right numbering. This value is, by default, 1 for every seqID. A more complex alternative allows a list of numbers to be assigned as reference_shift. This is useful when the original structure does not have a continuous numbering scheme.
source_file: the file(s) the data comes from (see parse_rosetta_file()). This information can be used to extract the structures from the silent files.
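The two forms of reference_shift described above (a single starting number, or one number per residue) can be illustrated with a small standalone sketch. This is not the library's implementation; the helper name `residue_numbers` is hypothetical and only shows how such a shift could translate sequence positions into source PDB numbering.

```python
from typing import List, Sequence, Union


def residue_numbers(sequence: str, shift: Union[int, Sequence[int]] = 1) -> List[int]:
    """Return the PDB-style number of each residue in `sequence`.

    Hypothetical helper: an integer `shift` is read as the number of the
    first residue; a list is read as the explicit number of every residue.
    """
    if isinstance(shift, int):
        return list(range(shift, shift + len(sequence)))
    numbers = list(shift)
    if len(numbers) != len(sequence):
        raise ValueError("a shift list needs exactly one number per residue")
    return numbers


# Default shift: numbering starts at 1.
print(residue_numbers("MKVL"))                    # [1, 2, 3, 4]
# Continuous numbering starting at residue 25.
print(residue_numbers("MKVL", 25))                # [25, 26, 27, 28]
# Discontinuous numbering: residues 27 to 29 are missing in the source.
print(residue_numbers("MKVL", [25, 26, 30, 31]))  # [25, 26, 30, 31]
```

The list form is what makes a non-continuous numbering scheme representable, since a single starting number cannot encode gaps.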
Getters
| get_id() | Return identifier data for the design(s). | 
| get_available_sequences() | List which sequence identifiers are available in the data container. |
| get_sequence(seqID[, key_residues]) | Return the sequence data for seqID available in the container. |
| get_available_structures() | List which structure identifiers are available in the data container. |
| get_structure(seqID[, key_residues]) | Return the structure data for seqID available in the container. |
| get_available_structure_predictions() | List which structure prediction identifiers are available in the data container. | 
| get_structure_prediction(seqID[, key_residues]) | Return the structure prediction(s) data. | 
| get_sequential_data(query, seqID) | Provides data on the requested query. | 
| get_dihedrals(seqID[, key_residues]) | Return the dihedrals data for phi-psi available in the container. |
| get_phi(seqID[, key_residues]) | Return the phi angle for seqID available in the container. |
| get_psi(seqID[, key_residues]) | Return the psi angle for seqID available in the container. |
| get_available_labels() | List which labels are available in the data container. |
| get_label(label[, seqID]) | Return the content(s) of the labels of interest as a Selection for a given sequence. |
Reference Data
| has_reference_sequence(seqID) | Checks if there is a reference_sequence for seqID. |
| add_reference_sequence(seqID, sequence) | Add a reference_sequence attached to chain seqID. |
| get_reference_sequence(seqID[, key_residues]) | Get the reference_sequence attached to chain seqID. |
| has_reference_structure(seqID) | Checks if there is a reference_structure for seqID. |
| add_reference_structure(seqID, structure) | Add a reference_structure attached to chain seqID. |
| get_reference_structure(seqID[, key_residues]) | Get the reference_structure attached to chain seqID. |
| add_reference_shift(seqID, shift[, shift_labels]) | Add a reference_shift attached to a chain seqID. |
| get_reference_shift(seqID) | Get a reference_shift attached to a particular seqID. |
| get_available_references() | List which decoy chain identifiers have some kind of reference data. | 
| add_reference(seqID[, sequence, structure, …]) | Single access to add_reference_sequence(), add_reference_structure() and add_reference_shift(). |
| transfer_reference(df) | Transfer reference data from one container to another. | 
| delete_reference(seqID[, shift_labels]) | Remove all reference data regarding a particular seqID. | 
Source Files
| add_source_file(file) | Adds a source_file to the DesignFrame. |
| add_source_files(files) | Adds source_file to the DesignFrame. |
| get_source_files() | Get source_file stored in the data container. |
| has_source_files() | Checks if there are source files added. | 
| replace_source_files(files) | Replaces source_file of the DesignFrame. |
Frequencies
| sequence_bits(seqID[, seqType, cleanExtra, …]) | Create a bit-based SequenceFrame. | 
| sequence_distance(seqID[, other]) | Compute the identity-based sequence distance between the selected decoys. |
| sequence_frequencies(seqID[, seqType, …]) | Create a frequency-based SequenceFrame. | 
| structure_bits(seqID[, seqType, cleanExtra, …]) | Create a bit-based SequenceFrame for secondary structure assignment. |
| structure_frequencies(seqID[, seqType, …]) | Create a frequency-based SequenceFrame for secondary structure assignment. |
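As a rough illustration of what a frequency-based view and an identity distance mean for a set of decoy sequences, the following standalone sketch computes both with the standard library. The function names are hypothetical and the code does not reproduce the SequenceFrame output; it assumes equal-length sequences and takes "identity distance" to mean the count of mismatched positions.

```python
from collections import Counter
from typing import Dict, List


def positional_frequencies(sequences: List[str]) -> List[Dict[str, float]]:
    """Per-position residue frequencies for a set of equal-length sequences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("all sequences must have the same length")
    total = len(sequences)
    # zip(*sequences) yields one column (all residues at a position) at a time.
    return [
        {residue: count / total for residue, count in Counter(column).items()}
        for column in zip(*sequences)
    ]


def identity_distance(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


decoys = ["MKVL", "MRVL", "MKVI", "MKVL"]
frequencies = positional_frequencies(decoys)
print(frequencies[1])                           # {'K': 0.75, 'R': 0.25}
print(identity_distance(decoys[1], decoys[2]))  # 2
```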
Mutation Methods
| identify_mutants(seqID) | Assess mutations of each decoy for sequence seqID against the reference_sequence. |
| get_identified_mutants() | List for which sequence identifiers mutants have been calculated. | 
| get_mutation_count(seqID) | Return the number of mutation positions for seqID available in the container. |
| get_mutation_positions(seqID) | Return the mutation positions data for seqID available in the container. |
| get_mutations(seqID) | Return the mutations data for seqID available in the container. |
| get_sequence_with(seqID, selection[, …]) | Selects those decoys with a particular set of residue matches. | 
| generate_mutant_variants(seqID, mutations[, …]) | Expands selected decoy sequences generating all the provided mutant combinations. | 
| generate_mutants_from_matrix(seqID, matrix, …) | From a provided positional frequency matrix, generates count random variants. |
| generate_wt_reversions(seqID[, key_residues]) | Generate all variants that revert decoy sequences to the reference_sequence. |
| make_resfile(seqID, header, filename[, write]) | Generate a Rosetta resfile to match the design’s sequence assuming the reference_sequence as the starting point. |
| apply_resfile(seqID, filename[, rscript, …]) | Apply a generated Rosetta resfile to the decoy. | 
| score_by_pssm(seqID, matrix) | Score sequences according to a provided PSSM matrix. | 
| view_mutants_alignment(seqID[, …]) | Generates a pretty-printed alignment of the mutations in Jupyter Notebooks. |
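The idea behind comparing each decoy against a reference_sequence can be sketched without the library. The helper below is hypothetical and is not how identify_mutants() is implemented; it assumes substitutions only (no insertions or deletions), a single integer shift, and a `<reference><number><new>` naming for each mutation.

```python
from typing import List


def find_mutations(reference: str, decoy: str, shift: int = 1) -> List[str]:
    """List substitutions as '<reference><number><new>' strings, e.g. 'K26R'.

    Hypothetical helper: `shift` is the number of the first residue, so the
    reported positions follow the source numbering rather than 1-based order.
    """
    if len(reference) != len(decoy):
        raise ValueError("reference and decoy must have the same length")
    return [
        f"{ref}{index + shift}{new}"
        for index, (ref, new) in enumerate(zip(reference, decoy))
        if ref != new
    ]


reference = "MKVL"
decoys = {"decoy_0001": "MRVL", "decoy_0002": "MRVI", "decoy_0003": "MKVL"}
for name, sequence in decoys.items():
    mutations = find_mutations(reference, sequence, shift=25)
    # Prints the decoy name, its mutation count and the mutation list.
    print(name, len(mutations), mutations)
```

With these inputs the three decoys carry one, two and zero mutations respectively; the count and positions are the kind of per-decoy data the getters in this table return.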
Miscellaneous
| clean_rosetta_suffix() | Remove the numerical suffix that Rosetta adds to the output identifiers. | 
| retrieve_sequences_from_pdbs([prefix, dropna]) | Obtain sequence data related to the decoys through their Rosetta-generated PDB files. |
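Rosetta appends a numerical suffix such as `_0001` to output identifiers. A minimal sketch of removing it from a plain string follows; the function name and the exact pattern (a trailing underscore followed by digits) are assumptions for illustration and may differ from what clean_rosetta_suffix() actually does.

```python
import re


def strip_numeric_suffix(identifier: str) -> str:
    """Remove one trailing '_<digits>' group, if present (assumed pattern)."""
    return re.sub(r"_\d+$", "", identifier)


print(strip_numeric_suffix("design_A_0001"))  # design_A
print(strip_numeric_suffix("design_A"))       # design_A
```

Only the last numeric group is removed, so an identifier such as `decoy_12_0003` becomes `decoy_12`.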