References
All citations throughout this thesis are managed with
sphinxcontrib-bibtex and resolved from docs/source/references.bib.
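The list below is rendered from every entry cited in the preceding chapters,
sorted by first author's surname; the two entries with organizational authors
(CAFA Consortium and EvolutionaryScale Team) appear at the end.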
Enis Afgan, Dannon Baker, Bérénice Batut, Marius van den Beek, Dave Bouvier, Martin Čech, John Chilton, Dave Clements, Nate Coraor, Björn A. Grüning, and others. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Research, 46(W1):W537–W544, 2018. doi:10.1093/nar/gky379.
Carlos P. Cantalapiedra, Ana Hernández-Plaza, Ivica Letunic, Peer Bork, and Jaime Huerta-Cepas. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Molecular Biology and Evolution, 38(12):5825–5829, 2021. doi:10.1093/molbev/msab293.
Wyatt T. Clark and Predrag Radivojac. Information-theoretic evaluation of predicted ontological annotations. Bioinformatics, 29(13):i53–i61, 2013. doi:10.1093/bioinformatics/btt228.
Jeff Daily. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments. BMC Bioinformatics, 17(1):81, 2016. doi:10.1186/s12859-016-0930-z.
Paolo Di Tommaso, Maria Chatzou, Evan W. Floden, Pablo Prieto Barja, Emilio Palumbo, and Cedric Notredame. Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4):316–319, 2017. doi:10.1038/nbt.3820.
Ahmed Elnaggar, Michael Heinzinger, Christian Dallago, Ghalia Rehawi, Yu Wang, Llion Jones, Tom Gibbs, Tamas Feher, Christoph Angerer, Martin Steinegger, Debsindhu Bhowmik, and Burkhard Rost. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):7112–7127, 2022. doi:10.1109/TPAMI.2021.3095381.
Martin Fowler. Patterns of Enterprise Application Architecture. Addison-Wesley, 2002. ISBN 978-0321127426.
Jaime Huerta-Cepas, Francois Serra, and Peer Bork. ETE 3: reconstruction, analysis, and visualization of phylogenomic data. Molecular Biology and Evolution, 33(6):1635–1638, 2016. doi:10.1093/molbev/msw046.
Jeff Johnson, Matthijs Douze, and Hervé Jégou. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data, 7(3):535–547, 2021. doi:10.1109/TBDATA.2019.2921572.
Philip Jones, David Binns, Hsin-Yu Chang, Matthew Fraser, Weizhong Li, Craig McAnulla, and others. InterProScan 5: genome-scale protein function classification. Bioinformatics, 30(9):1236–1240, 2014. doi:10.1093/bioinformatics/btu031.
Maxat Kulmanov and Robert Hoehndorf. DeepGOPlus: improved protein function prediction from sequence. Bioinformatics, 36(2):422–429, 2020. doi:10.1093/bioinformatics/btz595.
Johannes Köster and Sven Rahmann. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics, 28(19):2520–2522, 2012. doi:10.1093/bioinformatics/bts480.
Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Nikita Smetanin, Robert Verkuil, Ori Kabeli, Yaniv Shmueli, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Salvatore Candido, and Alexander Rives. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637):1123–1130, 2023. doi:10.1126/science.ade2574.
Damiano Piovesan and others. cafa-evaluator: official CAFA5 evaluation tool. https://github.com/BioComputingUP/CAFA-evaluator, 2023. Accessed 2026-04-10.
Predrag Radivojac, Wyatt T. Clark, Tal Ronnen Oron, Alexandra M. Schnoes, Tobias Wittkop, Artem Sokolov, and others. A large-scale evaluation of computational protein function prediction. Nature Methods, 10(3):221–227, 2013. doi:10.1038/nmeth.2340.
Petri Törönen, Alan Medlar, and Liisa Holm. PANNZER2: a rapid functional annotation web server. Nucleic Acids Research, 46(W1):W84–W88, 2018. doi:10.1093/nar/gky350.
Qianmu Yuan, Junjie Xie, Jiancong Xie, Huiying Zhao, and Yuedong Yang. Fast and accurate protein function prediction from sequence through pretrained language model and homology-based label diffusion. Briefings in Bioinformatics, 24(3):bbad117, 2023. doi:10.1093/bib/bbad117.
Naihui Zhou, Yuxiang Jiang, Timothy R. Bergquist, Alexandra J. Lee, Balint Z. Kacsoh, and others. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biology, 20(1):244, 2019. doi:10.1186/s13059-019-1835-8.
CAFA Consortium. CAFA5: protein function prediction (Kaggle competition). https://www.kaggle.com/competitions/cafa-5-protein-function-prediction, 2023. Accessed 2026-04-10.
EvolutionaryScale Team. ESM Cambrian: revealing the mysteries of proteins with unsupervised learning. https://www.evolutionaryscale.ai/blog/esm-cambrian, 2024. Accessed 2026-04-10.