MSc thesis: NEMALearn, a tool to sort nematodes into feeding groups using Machine Learning
Nematodes are present in almost every ecosystem, even under harsh environmental conditions, and play important roles in soil processes, contributing to key ecosystem services such as nutrient cycling (Mekonen, Petros, and Hailemariam 2017). Nematodes alone are responsible for 2.2% of global carbon emissions from the soil (van den Hoogen et al. 2019), so the impact of climate change on these organisms needs to be considered in future climate scenarios. Because nematodes are such an important part of natural ecosystems, the FUNDER project will focus on how their functional role, associated with their trophic position in the soil food web, changes with climate. Nematode trophic classification is based on feeding groups. However, identifying and counting nematodes is time-consuming, even when nematodes are identified only to the family level. Efficient, automated tools for nematode identification would be faster, cheaper, and more precise, allowing us to process more samples, analyze more data and, hence, better understand the complex ecosystem processes we are studying.
Machine learning has received increasing interest over the last two decades thanks to both growing datasets and easier access to computing power (Krizhevsky, Sutskever, and Hinton 2017). Deep learning is a branch of machine learning that uses internal hidden layers. Convolutional Neural Networks (CNNs), a form of deep learning that adds convolutional layers, have proven very efficient at classifying images and have already been used in various fields of study. So far, only a few recent studies have used CNNs to identify nematodes, but the initial results are encouraging (Bogale, Baniya, and DiGennaro 2020; Thevenoux et al. 2021).
The FUNDER project
Climate change alters plant and soil communities, and interactions in the plant-soil food web. These changes pose threats to biodiversity and key ecosystem processes and functions, such as carbon and nutrient cycling, and ecosystem productivity. The FUNDER project assesses and disentangles the direct effects of climate from the indirect effects, mediated through biotic interactions, on the diversity and functioning of the plant-soil food web. We use a macro-ecological experimental approach to quantify the impacts of vegetation diversity on interactions and ecosystem functioning across factorial broad-scale temperature and precipitation gradients in Norway. The objectives are to disentangle direct and indirect climate impacts on plants, soil nematodes, microarthropods, microbes, and ecosystem processes. We aim to better understand landscape variation and whole-ecosystem consequences of indirect climate impacts, as well as climate feedbacks of the plant-soil food web.
This MSc thesis will be conducted as part of the FUNDER project. The MSc student will be responsible for gathering available datasets of nematode images (Lu et al. 2021) and for completing them with new images of individual nematodes taken with a microscope, in order to train the model. The successful candidate will classify the images in this newly constructed dataset into nematode feeding groups. Building on previous CNN work on nematode recognition and its online code (Lu et al. 2021), the MSc student will train a CNN with varying parameters (size and number of neuron layers in the network) and find the parameters that give the highest accuracy in classifying images. The MSc student will have the opportunity to participate in the fieldwork campaign (if desired). The trained CNN will be applied to the samples collected. The newly collected images will form a new dataset that will be published, helping the global effort of identifying nematodes.
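The parameter search described above could be organised as a simple grid search. The sketch below is only an illustration under stated assumptions, not the project's actual code: `grid_search`, `toy_evaluate` and the candidate values are hypothetical names and numbers introduced here. In practice the `evaluate` callback would train a CNN with the given number and size of layers and return its accuracy on held-out validation images.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Config = Dict[str, int]


def grid_search(
    layer_counts: Sequence[int],
    layer_widths: Sequence[int],
    evaluate: Callable[[int, int], float],
) -> Tuple[Config, float, List[Tuple[Config, float]]]:
    """Score every (number of layers, layer width) pair and return the best.

    `evaluate` is a hypothetical callback supplied by the user. It is assumed
    to train a model with the given configuration and return its validation
    accuracy (higher is better).
    """
    results: List[Tuple[Config, float]] = []
    for n_layers, width in product(layer_counts, layer_widths):
        config = {"n_layers": n_layers, "width": width}
        results.append((config, evaluate(n_layers, width)))
    if not results:
        raise ValueError("the parameter grid is empty")
    # Keep the configuration with the highest score.
    best_config, best_accuracy = max(results, key=lambda item: item[1])
    return best_config, best_accuracy, results


def toy_evaluate(n_layers: int, width: int) -> float:
    """Placeholder score, NOT a real measurement.

    It stands in for "train a CNN and measure validation accuracy" and
    simply peaks at 4 layers of width 64.
    """
    return 1.0 - 0.05 * abs(n_layers - 4) - 0.001 * abs(width - 64)


if __name__ == "__main__":
    best, accuracy, all_results = grid_search([2, 4, 6], [32, 64, 128], toy_evaluate)
    print(f"tried {len(all_results)} configurations, best: {best}")
```

Keeping the training step behind a callback means the search loop does not depend on any particular deep learning library. The accuracy passed back should come from images that were not used for training, otherwise the selected parameters will look better than they are.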
- Create a machine learning tool to identify nematodes at the feeding-group level: train a CNN and make it available for future research
- Use the CNN to identify a large number of soil nematode samples collected during the summer 2022 fieldwork for the FUNDER project
- While identifying, collect, classify and publish new images of nematodes by feeding group, and make this image dataset available on GitHub and OSF (Center for Open Science)
You will be part of a dynamic research team, gain experience with the scientific method, and take a cross-disciplinary approach to developing promising identification techniques.
- Labwork: collect images
- Programming work: build the CNN
- Participate in the fieldwork in August (optional)
- Data management, reproducibility and Open Science practice
- Share your results: write a thesis that can be published as a scientific paper and present your work at national/international conferences
- Programming and machine learning background
- Lab skills (microscope)
- Interest in ecology and soil organisms
- Scientific writing skills
- Team spirit
- The project is funded through research grants
- Date: From May 2022 to October 2022 (possibility to shift the date if necessary)
- Place of work: University of Bergen, Norway
- Supervisors: Florian Muthreich (Post-Doc at UiB who worked on pollen identification using CNNs (Muthreich 2021)), Morgane Demeaux (PhD FUNDER), and Vigdis Vandvik (leader of FUNDER).
If you are interested, send your CV and motivation letter to firstname.lastname@example.org.
Afuye, G. A., A. M. Kalumba, and I. R. Orimoloye. 2021. “Characterisation of Vegetation Response to Climate Change: A Review.” Sustainability 13 (13): 7265. https://doi.org/10.3390/su13137265.
Althuizen, I. H. J., H. Lee, J. Sarneel, and V. Vandvik. 2018. “Long-Term Climate Regime Modulates the Impact of Short-Term Climate Variability on Decomposition in Alpine Grassland Soils.” Ecosystems 21: 1580–1592. https://doi.org/10.1007/s10021-018-0241-5.
Bogale, M., A. Baniya, and P. DiGennaro. 2020. “Nematode Identification Techniques and Recent Advances.” Plants (Basel, Switzerland) 9 (10): E1260. https://doi.org/10.3390/plants9101260.
Engemann, K., B. Sandel, B. J. Enquist, P. M. Jørgensen, N. Kraft, A. Marcuse-Kubitza, B. McGill, et al. 2016. “Patterns and Drivers of Plant Functional Group Dominance across the Western Hemisphere: A Macroecological Re-Assessment Based on a Massive Botanical Dataset.” Botanical Journal of the Linnean Society 180 (2): 141–60. https://doi.org/10.1111/boj.12362.
Hoogen, J. van den, S. Geisen, D. Routh, H. Ferris, W. Traunspurger, D. A. Wardle, R. G. M. de Goede, et al. 2019. “Soil Nematode Abundance and Functional Group Composition at a Global Scale.” Nature 572 (7768): 194–98. https://doi.org/10.1038/s41586-019-1418-6.
Jaroszynska, F. 2019. “Climate and Biotic Interactions – Drivers of Plant Community Structure and Ecosystem Functioning in Alpine Grasslands.” University of Bergen.
Kelly, A. E., and M. L. Goulden. 2008. “Rapid Shifts in Plant Distribution with Recent Climate Change.” Proceedings of the National Academy of Sciences 105 (33): 11823–26. https://doi.org/10.1073/pnas.0802891105.
Krizhevsky, A., I. Sutskever, and G. E. Hinton. 2017. “ImageNet Classification with Deep Convolutional Neural Networks.” Communications of the ACM 60 (6): 84–90. https://doi.org/10.1145/3065386.
Lu, X., Y. Wang, S. Fung, and X. Qing. 2021. “I-Nema: A Biological Image Dataset for Nematode Recognition.” ArXiv:2103.08335 [Cs, Eess, q-Bio], March. http://arxiv.org/abs/2103.08335.
Mekonen, S., I. Petros, and M. Hailemariam. 2017. “The Role of Nematodes in the Processes of Soil Ecology and Their Use as Bioindicators,” 9.
Muthreich, F. 2021. “New Methods in Palaeopalynology: Classification of Pollen through Pollen Chemistry.” University of Bergen.
Thevenoux, R., V. L. Le, H. Villessèche, A. Buisson, M. Beurton-Aimar, E. Grenier, L. Folcher, and N. Parisey. 2021. “Image Based Species Identification of Globodera Quarantine Nematodes Using Computer Vision and Deep Learning.” Computers and Electronics in Agriculture 186 (July): 106058. https://doi.org/10.1016/j.compag.2021.106058.
Vandvik, V., K. Klanderud, O. Skarpaas, R. J. Telford, A. H. Halbritter, and D. E. Goldberg. 2020. “Biotic Rescaling Reveals Importance of Species Interactions for Variation in Biodiversity Responses to Climate Change.” Proceedings of the National Academy of Sciences 117 (37): 22858–22865. https://doi.org/10.1073/pnas.2003377117.