Modeling#
The modeling module within the FAMP pipeline is designed for de novo RNA structure prediction. Two external tools are employed:
RNAfold for secondary structure (2D) prediction
Rosetta (FARFAR2) for tertiary structure (3D) prediction
All modeling-related functions are encapsulated in the Modeling class. This module forms the first step in the FAMP pipeline and enables generation of structural input models for FRET-based analysis.
Initialization#
To start modeling, initialize the class by providing:
working_dir: Output directory for generated filesfile_path_sequence: Path to the input RNA sequence in FASTA formatmodeling_parameter: Python dictionary defining the modeling parameters (see the parameter section)
import famp
import os
current_dir = os.getcwd()
modeling = famp.modeling.Modeling(
working_dir=f"{current_dir}",
file_path_sequence=f"{current_dir}/input_data/RNA_Hairpin.fasta",
modeling_parameter=rosetta_parameter
)
The input FASTA file should contain a single RNA sequence in the following format:
>Example_Sequence
caauauuuauuaauaucuuccggauauuaauaaauauug
Secondary Structure Prediction (2D)#
The function predict_2d_structure() performs secondary structure prediction using RNAfold. The RNA sequence is read from the input FASTA file, and the result is stored as a dot-bracket notation file.
Output directory:
secondary_prediction/Output file:
dot_bracket.secstruct
This file can be edited manually after prediction to define or modify secondary structures.
modeling.predict_2d_structure()
Tertiary Structure Prediction (3D)#
3D structure prediction is performed using Rosetta’s FARFAR2 module via the predict_3d_structure() function. It requires the RNA secondary structure file (from dot_bracket.secstruct)
Output directory:
rosetta_resultsOutput file:
silent_out.out
modeling.predict_3d_structure("dot_bracket.secstruct")
Extracting PDB Structures#
Once Rosetta modeling is complete, the extract_pdb() function is used to convert the generated structure files into PDB format.
The models are ranked by Rosetta’s “res4” score.
The function returns the top
nstructures, wherenis defined bynumber_of_structures.
modeling.extract_pdb(number_of_structures=5)
The resulting .pdb files are stored in the output directory and can be used for subsequent MD simulations or in silico dye labeling.