Q&A Report: Spectronaut: Expand Biological Insights with DIA Proteomics

Oliver Bernhardt and Andreas-David Brunner answer questions about Spectronaut 15 with a specific focus on directDIA usage in proteomics.

These answers have been provided by:

Oliver M. Bernhardt
Principal Scientist
Bioinformatics
Biognosys

Andreas-David Brunner, PhD
Postdoctoral Researcher
Matthias Mann Lab
Max Planck Institute of Biochemistry

How does directDIA compare to using in-silico libraries?

O. Bernhardt: The directDIA and in-silico library workflows both address the problem of analyzing DIA runs without a library, but they differ substantially in implementation. In-silico libraries are based on predicting spectra and retention times for entire digested protein databases, which results in large libraries that are not sample-specific. In contrast, the directDIA workflow first performs a classical database search directly on your DIA runs to create a library. That library is then used for a targeted analysis of the same DIA runs. Both of these steps in directDIA are rigorously FDR controlled, and the quality of analysis is similar to that of a sample-specific library. This also means that directDIA natively supports post-translational modifications and non-tryptic search spaces, both of which are challenging to cover with in-silico libraries.
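To make "FDR controlled" concrete, here is a minimal sketch of generic target-decoy q-value estimation, the approach commonly used in proteomics database searches. It is an illustration only: the answer above does not describe how Spectronaut computes FDR internally, and the function name `q_values` and the input layout are assumptions made for this example.

```python
from typing import List, Tuple


def q_values(scored: List[Tuple[float, bool]]) -> List[float]:
    """Generic target-decoy q-values (illustrative, not Spectronaut's method).

    scored: (score, is_decoy) pairs; a higher score means a better match.
    Returns one q-value per input entry, in input order.
    """
    # Rank identifications from best to worst score.
    order = sorted(range(len(scored)), key=lambda i: scored[i][0], reverse=True)

    # Estimated FDR at each rank: decoys accepted / targets accepted.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if scored[i][1]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the lowest FDR at this score cutoff or any more permissive one.
    q = [0.0] * len(scored)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = running_min
    return q


if __name__ == "__main__":
    hits = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
    for (score, is_decoy), qv in zip(hits, q_values(hits)):
        print(score, "decoy" if is_decoy else "target", round(qv, 3))
```

In a two-step workflow like the one described, a filter of this kind would be applied once to the search results used to build the library and again to the targeted analysis, so that each step carries its own error control.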

How good is directDIA compared to a library-based analysis?

O. Bernhardt: directDIA performs at a level very comparable to using a project-specific DDA library for many use cases, especially when working with larger biological experiments with some variability across samples. If you are using dia-PASEF, you might get better proteome depth when using a hybrid (DDA + DIA) library.

Can a single peptide be visualized in a directDIA experiment, across multiple samples (to have an idea of its quantity)?

O. Bernhardt: Yes, you can visualize a single peptide with its MS2 and MS1 XIC across all runs. You can also look at individual spectra at apex RT for each instance of that peptide in the experiment. There are also several plots that let you visualize peptide quantity individually or across runs.

How do results of DIA analysis differ if you use empirical Ion Mobility (IM) vs your in-silico generated IM?

O. Bernhardt: Our in-silico IM prediction provides performance that gets close to using a library with empirical IM values. In a comparison, using a library with predicted IM resulted in only 5% fewer precursors than the empirical IM approach. For library generation, Spectronaut will by default use empirical IM annotation whenever possible and will only predict IM if you are creating a library from data that does not include empirical IM.

What is the recommended workflow for phospho analysis (directDIA or a library-based approach)?

O. Bernhardt: directDIA has been shown to perform better than a library-based approach for phospho analysis. However, a recent study has also shown that using a hybrid library (directDIA + DDA) could perform even better.

Are there improvements in Spectronaut 15 for pulsed SILAC-DIA data analysis?

O. Bernhardt: Yes. Besides the more numerous and improved identifications you would expect in general, we now also report Heavy and Light quantities at the precursor and protein group levels in the report.

What kinds of sample/tissue types have you analyzed so far? Only FFPE tissue, or also, e.g., snap-frozen tissue?

AD. Brunner: So far we have presented cell culture data, but we are working on fresh-frozen tissue for single-cell analysis and on FFPE tissue for deep visual proteomics.

Did you benchmark different cell types? For example, is there a difference in protein amount per cell between them?

AD. Brunner: We did our experiments on HeLa cells because they are a readily available standard cell line. There was a noticeable increase in raw signal throughout the cell cycle corresponding to cell growth. We also successfully used other cell types for single-cell proteomics and are working on moving towards tissue applications at the moment to obtain cell-type specific proteomes.

What is the lowest number of cells needed for accurate/reliable quantification? (e.g., if I have treatment groups, each with N ~ 100 cells)

AD. Brunner: Our dia-PASEF single-cell measurements were developed to be fully quantitative, and their accuracy and reliability have been rigorously benchmarked. This means that we are already quantitative when comparing one single cell against another.