NYSBC course: nextPYP practical (day 1)
This session shows how to use nextPYP to convert raw tilt-series from EMPIAR-10164 into a ~4 Å resolution structure of immature HIV-1 Gag protein. We will also cover pre-processing, tomogram reconstruction, and particle picking for two other datasets representative of data types often processed in tomography.
Datasets
Immature Gag protein from HIV-1 Virus-Like Particles (EMPIAR-10164)
Ribosomes from whole Mycoplasma pneumoniae cells (EMPIAR-10499)
Ribosomes from FIB-SEM milled mouse epithelial cells (EMPIAR-10987)
Session 1: Pre-processing and particle picking
In this session we will import frames, perform pre-processing and tomogram reconstruction, and pick particles for HIV VLPs together. We will also import workflows to pick ribosomes from whole Mycoplasma cells and lamellae cut from mouse epithelial cells.
Create a new project
Data processing runs are organized into projects. We will create a new project for this tutorial.
Click on Create new project, give the project a name, and select Create
Select the new project from the Dashboard and click Open
The newly created project will be empty and a Jobs panel will appear on the right
Dataset 1: Immature Gag protein from HIV-1 VLPs
Step 1: Import raw tilt-series
Go to Import Data and select Tomography (from Raw Data)
A form to enter parameters will appear:
Go to the Raw data tab:
Set the path to raw data by clicking on the icon and browsing to /nfs/bartesaghilab/nextpyp/workshop/10164/
Type *.tif into the filter box (lower right) and click the icon
Go to the Microscope Parameters tab:
Set Pixel size (A) to 1.35
Set Acceleration voltage (kV) to 300
Set Tilt-axis angle (degrees) to 85.3
Click Save and the new block will appear on the project page. The block is in the modified state (indicated by the sign) and is ready to be executed
Clicking the Run button will show another dialog where you can select which blocks to run:
Click Start Run for 1 block. This will launch a process that reads one tilt at random and displays the resulting image inside the block
Click on the thumbnail inside the block to see a larger version of the projection image
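If the import does not find the files you expect, it can help to check from a terminal what the filter matches. The following is a minimal sketch and is not part of nextPYP: the helper name list_matching is hypothetical, and it assumes you have read access to the workshop directory used above.

```python
from pathlib import Path


def list_matching(directory, pattern="*.tif"):
    """Return the sorted names of files in `directory` that match `pattern`."""
    return sorted(p.name for p in Path(directory).glob(pattern) if p.is_file())


if __name__ == "__main__":
    # Path taken from the tutorial; replace it with your own data directory.
    raw_dir = "/nfs/bartesaghilab/nextpyp/workshop/10164/"
    names = list_matching(raw_dir)
    print(f"{len(names)} files match *.tif in {raw_dir}")
```

An empty result usually means the path or the filter pattern is wrong, which is easier to diagnose here than after launching a run.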
Step 2: Pre-processing
Click on Tilt-series (output of the Tomography (from Raw Data) block) and select Pre-processing
Go to the Frame alignment tab:
nextPYP uses the Frame pattern to extract metadata from the file names. EMPIAR-10164 follows the default file naming scheme and .tif extension, so we will leave the default setting.
We will use unblur for frame alignment.
Go to the CTF determination tab
Set Max resolution to 5
Go to the Tilt-series alignment tab
Our Alignment method will be IMOD fiducial-based, which is the default, so make no changes.
Go to the Tomogram reconstruction tab
Our Reconstruction method will be IMOD, which is the default, so make no changes.
Go to the Resources tab
Set Split, Threads to 41
Click Save, Run, and Start Run for 1 block. Follow the status of the run in the Jobs panel
When the block finishes running, examine the Tilt-series, Plots, Table, and Gallery tabs. We will measure our virions in this block as well.
Step 3: Particle picking
We will use three separate blocks to perform geometrically constrained particle picking. This increases the accuracy of particle detection and provides geometric priors for downstream refinement.
Block 1: Virion selection
Click on Tomograms (output of the Pre-processing block) and select Particle-Picking
Go to the Particle detection tab:
Set Detection method to virions
Set Virion radius (A) to 500 (half the diameter we measured)
Click Save
Block 2: Virion segmentation
Click on Particles (output of the Particle-Picking block) and select Segmentation (closed surfaces)
Click Save
Block 3: Spike (Gag) detection
Click on Segmentation (closed) (output of the Segmentation (closed surfaces) block) and select Particle-Picking (closed surfaces)
Go to the Particle detection tab:
Set Detection method to uniform
Set Particle radius (A) to 50
Set Size of equatorial band to restrict spike picking (A) to 800
Click Save, Run, and Start Run for 3 blocks. Follow the status of the run in the Jobs panel
Dataset 2: Ribosomes (whole Mycoplasma cells)
Step 1: Import workflow
In the upper left of your project page, click Import Workflow
Choose the 2025 NYSBC course: Pre-processing (EMPIAR-10499) workflow by clicking the Import button to its right
We pre-set the parameters for the workflow, so you can immediately click Save. Three blocks will populate on the project page.
Step 2: Edit particle picking parameters
Click into the settings of the Particle-Picking block
Set Particle radius (A) to 80
Change Detection method from none to size-based using the dropdown menu
Click Save, Run, and Start Run for 3 blocks. Follow the status of the run in the Jobs panel
Step 3: Copy particles and manually edit
Click on the menu for the Particle-Picking block
Select Copy
Check Copy files and data and Make automatically-picked particles editable
Click Next
Click into the new Particle-Picking block.
Ensure you are on the Particles tab. Here, you can right click to remove particles and left click to add particles.
This manual picking feature is what I used to generate a particle set for nn-training for the next particle picking method, which we will use on the third dataset.
Dataset 3: Ribosomes (lamellae from mouse epithelial cells)
Step 1: Import workflow
In the upper left of your project page, click Import Workflow
Choose the 2025 NYSBC course: Pre-processing (EMPIAR-10987) workflow by clicking Import
We pre-set the parameters for the workflow, so you can immediately click Save. Three blocks will populate on the project page.
Step 2: Edit particle picking parameters
Click into the settings of the Particle-Picking (eval) block
Click the icon. Browse to /nfs/bartesaghilab/nextpyp/workshop/10987/model_last_contrastive.pth
Set Particle radius (A) to 100
Set Threshold for soft/hard positives to 0.5
Set Max number of particles to 700
Click Save, Run, and Start Run for 3 blocks. Follow the status of the run in the Jobs panel
Session 2: 3D reconstruction and refinement
In this session we will import 19,972 HIV-Gag protein particles, import initial reference-based alignments, then go through a condensed version of the 3D Refinement pipeline to obtain a ~4 Å resolution structure from 5,000 filtered particles. At a high level, we will perform reference-based refinement, filter particles, perform region-based refinement and tilt-geometry refinement, refine movie frames, and complete post-processing. Then we will demonstrate how to use ChimeraX to visualize our results.
Step 1: Import particles
Click on Tomograms (output of the Pre-processing block) and select Particle-Picking
Set Detection method to import
Set Particle radius (A) to 50
Click the icon and browse to /nfs/bartesaghilab/nextpyp/workshop/10164/particles. Select Choose Folder
Click Save, Run, and Start Run for 1 block
Step 2: Import alignments
Click on Particles (output of the Particle-Picking block) and select Calculate reconstruction
Go to the Sample tab
Set Molecular weight (kDa) to 300
Set Particle radius (A) to 150
Set Symmetry to C6
Go to the Extraction tab
Set Box size (pixels/voxels) to 128
Set Image binning to 2
Go to the Alignments tab
From the Import from dropdown menu, select nextPYP (*.bz2)
Click the icon next to Input parameter file (*.bz2) and browse to /nfs/bartesaghilab/nextpyp/workshop/10164/tomo-coarse-refinement-fg2v2MJLSY4Ui908_r01_02.bz2
Click Choose File
Go to the Reconstruction tab
Select Apply dose weighting by checking the box
Go to the Resources tab
Set Split, Threads to 124
Set Split, Threads to 70
Click Save, Run, and Start Run for 1 block
Constrained single-particle tomography (CSPT)
Step 3: Particle filtering
Click on Particles (output of the Particle refinement block) and select Particle filtering
Go to the Particle filtering tab
Set Score threshold to 3.5
Set Min distance between particles (unbinned pixels) to 54
Click the icon next to Input parameter file (*.bz2) and select the *.bz2 file that appears (this is from the parent directory). Click Choose File
Check the box next to Permanently remove particles
Click Save, Run, and Start Run for 1 block
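To make the two filtering parameters concrete, here is a minimal sketch of one way a score threshold and a minimum-distance criterion can be combined. It is an illustration only, not nextPYP's implementation: the function filter_particles, the greedy strategy, and the assumption that higher scores are better are all hypothetical, and the demo coordinates are made up.

```python
import math

PIXEL_SIZE_A = 1.35  # unbinned pixel size (A) from the Microscope Parameters tab


def filter_particles(particles, score_threshold, min_distance):
    """Keep particles scoring at least `score_threshold`, then drop any particle
    closer than `min_distance` to an already-kept, higher-scoring particle.

    `particles` is a list of (x, y, z, score) tuples with coordinates in
    unbinned pixels. Assumes higher scores are better.
    """
    kept = []
    # Visit the best-scoring particles first so they win distance conflicts.
    for x, y, z, score in sorted(particles, key=lambda p: p[3], reverse=True):
        if score < score_threshold:
            break  # all remaining particles score lower still
        if all(math.dist((x, y, z), k[:3]) >= min_distance for k in kept):
            kept.append((x, y, z, score))
    return kept


if __name__ == "__main__":
    # Made-up coordinates and scores, for illustration only.
    demo = [(0, 0, 0, 5.0), (10, 0, 0, 4.0), (100, 0, 0, 3.0), (200, 0, 0, 6.0)]
    print(filter_particles(demo, score_threshold=3.5, min_distance=54))
    print(f"54 unbinned pixels = {54 * PIXEL_SIZE_A:.1f} A")
```

With the 1.35 Å pixel size of this dataset, the 54-pixel minimum distance corresponds to 72.9 Å between neighboring particle centers.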
Step 4: Region-based refinement, tilt-geometry refinement, further particle refinement
Click on Particles (output of the Particle filtering block) and select 3D refinement
Go to the Extraction tab
Set Box size (pixels/voxels) to 256
Set Image binning to 1
Go to the Particle scoring function tab
Set Last tilt for refinement to 8
Set Max resolution (A) to 4:3.5
From the Masking strategy dropdown menu, select from file
Click the icon to select the Shape mask (*.mrc), browse to /nfs/bartesaghilab/nextpyp/workshop/10164/EMPIAR-10164_shape_mask.mrc, and click Choose File
Go to the Refinement tab
Next to Input parameter file (*.bz2) click the icon. Select the _r01_02_clean.bz2 file and click Choose File
Set Last iteration to 3
Check Refine tilt-geometry
Check Refine particle alignments
Set Number of regions to 8,8,2
Go to the Reconstruction tab
Check Apply dose weighting (it may already be checked)
Click Save, Run, and Start Run for 1 block
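The change in extraction settings between Step 2 (box size 128, binning 2) and Step 4 (box size 256, binning 1) can be checked with a little arithmetic. The sketch below is only an illustration: the function name extraction_summary is hypothetical, and it uses the standard relation that the Nyquist limit is twice the effective pixel size.

```python
PIXEL_SIZE_A = 1.35  # unbinned pixel size (A) from the Microscope Parameters tab


def extraction_summary(box_size, binning, pixel_size=PIXEL_SIZE_A):
    """Return (effective pixel size, box extent, Nyquist limit), all in angstroms."""
    effective = pixel_size * binning  # size of one binned pixel
    extent = box_size * effective     # physical width of the extracted box
    nyquist = 2 * effective           # finest resolution the sampling can represent
    return effective, extent, nyquist


if __name__ == "__main__":
    for label, box, binning in [("Step 2", 128, 2), ("Step 4", 256, 1)]:
        eff, extent, nyq = extraction_summary(box, binning)
        print(f"{label}: pixel {eff:.2f} A, box {extent:.1f} A, Nyquist {nyq:.2f} A")
```

Both settings cover the same physical extent of 345.6 Å, but binning 2 limits the data to 5.4 Å, whereas binning 1 gives a Nyquist limit of 2.7 Å and therefore leaves room for the 3.5 Å maximum resolution requested in this step.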
Region-based refinement
Step 5: Movie frame refinement
Click on Particles (output of the Particle refinement block) and select Movie refinement
Go to the Particle scoring function tab
Set Last exposure for refinement to 4
Set Max resolution (A) to 3.5
Go to the Frame refinement tab
Next to Input parameter file (*.bz2) click the icon. Select the _r01_03.bz2 file and click Choose File
Set Spatial sigma to 400
Set Time sigma to 16
Go to the Reconstruction tab
Check Apply dose weighting
Click Save, Run, and Start Run for 1 block
Refinement of individual tilt-frames
While the Movie refinement block is running, we will demonstrate use of ArtiaX to visualize particle alignments
3D Visualization of alignments in ArtiaX
For reference, these instructions are also available in the User Guide.
We assume the user already has the ArtiaX plugin installed. If not, a simple web search will bring you to the ArtiaX documentation for installation instructions.
Download files
Select a tomogram in which you wish to visualize the particles. I will be using TS_43.
Click into the Pre-processing block, go to the Tilt Series tab and the Tomogram sub-tab. On this page, click the search icon and search for TS_43. Click the green button immediately above the tomogram display. This will download the tomogram in .rec format.
Click into the Particle refinement block and go to the Metadata tab. On this page, type TS_43 into the search bar and click Search. Click the .star file to download particle alignments.
Go to the Reconstruction tab and download the Cropped Map.
Display in ChimeraX
Open ChimeraX (again, we assume ArtiaX is installed)
Open the tomogram
TS_43.rec
Run the following commands in the ChimeraX shell:
volume permuteAxes #1 xzy
volume flip #2 axis z
Go to the ArtiaX tab and click Launch to start the plugin.
In the Tomograms section on the left, select model #3 (permuted z flip) from the Add Model dropdown menu and click Add!
Go to the ArtiaX options panel on the right, and set the Pixel Size for the Current Tomogram to 10.8 (the current binned pixel size)
On the left panel, under the Particles List section, select Open List … and open the .star file.
Return to the panel on the right and select the Select/Manipulate tab. Set the Origin to 1.35 (the unbinned pixel size)
From the Color Settings section, select Colormap and then rlnLogLikelihoodContribution from the dropdown menu.
Play with the Marker Radius and Axes Size sliders to visualize the particle locations, cross correlation scores, and orientations.
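The two pixel sizes entered in ArtiaX are related by the tomogram binning, and the same relation converts the radii used during picking into tomogram voxels. The following is a minimal sketch of that arithmetic; the function names are hypothetical, and the values are the ones quoted in this tutorial.

```python
UNBINNED_PIXEL_A = 1.35  # unbinned pixel size (A) set at import
TOMOGRAM_PIXEL_A = 10.8  # binned tomogram pixel size (A) entered in ArtiaX


def binning_factor(binned=TOMOGRAM_PIXEL_A, unbinned=UNBINNED_PIXEL_A):
    """Ratio between the tomogram pixel size and the unbinned pixel size."""
    return binned / unbinned


def angstroms_to_voxels(length_a, pixel_size=TOMOGRAM_PIXEL_A):
    """Convert a physical length in angstroms to tomogram voxels."""
    return length_a / pixel_size


if __name__ == "__main__":
    print(f"binning factor: {binning_factor():.1f}")
    # Radii set earlier in the tutorial, expressed in tomogram voxels.
    for name, radius_a in [("virion", 500), ("Gag particle", 50)]:
        print(f"{name} radius: {angstroms_to_voxels(radius_a):.1f} voxels")
```

The ratio works out to a binning factor of 8, so the 500 Å virion radius spans roughly 46 voxels in the downloaded tomogram.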
Step 6: Post-processing
Click on Frames (output of the Movie refinement block) and select Post-processing
Go to the Post-processing tab
Next to First half map (*_half1.mrc) click the icon. Select the *_half1.mrc file and click Choose File
Set Masking method to from file using the dropdown menu
Next to Mask file (*.mrc) click the icon. Browse to /nfs/bartesaghilab/nextpyp/workshop/10164/EMPIAR-10164_shape_mask.mrc and click Choose File
Set the B-factor method to adhoc using the dropdown menu
Set the Adhoc value (A^2) to -25
Click Save, Run, and Start Run for 1 block
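To get a feel for what an ad hoc B-factor of -25 Å² does, the sketch below evaluates the amplitude scaling exp(-B / (4 d²)) at a few resolutions d. This is the convention commonly used for map sharpening; that nextPYP applies exactly this form is an assumption here, and the function name bfactor_gain is hypothetical.

```python
import math


def bfactor_gain(b_factor, resolution):
    """Amplitude scale factor exp(-B / (4 d^2)) at resolution d (angstroms).

    Assumes the common sharpening convention in which a negative B-factor
    boosts high-resolution amplitudes.
    """
    return math.exp(-b_factor / (4.0 * resolution ** 2))


if __name__ == "__main__":
    for d in (20.0, 8.0, 4.0):
        print(f"d = {d:4.1f} A: gain = {bfactor_gain(-25.0, d):.3f}")
```

Under this convention the gain is close to 1 at low resolution and rises to about 1.48 at 4 Å, which is why a negative value makes high-resolution features such as side chains stand out more clearly.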
Map and model assessment in ChimeraX
I will be using a pre-aligned PDB file and files downloaded from nextPYP to demonstrate how to visualize the final map aligned to a model in ChimeraX.
Download files
In the Post-processing block, go to the Reconstruction tab. Click on the drop down menu Select an MRC file to download. Select the Full-Size Map. Your browser will download the post processed map as an MRC file.
We are using a pre-aligned, pre-cropped PDB file (5L93), so we do not need to download this. For your own experiments, you would download whichever model you require.
Open the downloaded MRC file in ChimeraX. Visualize your beautiful map. To get a better look at your map/model fitting, open an atomic model in ChimeraX. Under the Map tab, click Zone. Note that we are left with a slightly larger zone than we would like, so we will copy the zone command from the output to the command line and edit the range. This leaves us with:
volume zone #2 nearAtoms #1 range 2.4
Select the model, go to Actions, Atoms/Bonds, and Show Sidechain/Base
You can now view the model fit to your map interactively in ChimeraX
Day 1 summary
What we learned today
In this session we learned some of the things we are capable of doing in nextPYP:
Raw data import
Pre-processing (frame alignment, tilt-series alignment, CTF estimation)
Tomogram reconstruction (WBP, fakeSIRT, SART)
nextPYP also supports tomogram denoising using cryoCARE, IsoNet and Topaz Denoise
Segmentation (closed surfaces)
nextPYP also supports open surface segmentation which uses membrain-seg
Particle picking (geometrically constrained, size-based, nn-based, manual)
nextPYP also supports template-search and molecular pattern mining
Particle refinement (constrained single particle tomography, particle filtering, exposure weighting, region-based refinement, movie frame refinement, and post-processing)
nextPYP also supports particle-based CTF refinement, building shape masks, ab-initio refinement, and 3D classification
We encourage you to explore the things we learned today as well as the other options available in nextPYP. On day 2 we will demonstrate nextPYP’s functionality for on-the-fly data pre-processing.