Learning about Machine Learning about Microbiomes and Pollinators

Pete VanZandt
Jul 15, 2021

(Or, how many buzzword topics can I cram into one blog post?)

They say that when you have something to learn, you should try to write about it. Well, I have a lot to learn, and an impending deadline, so…

I don’t really have a lot to say, but I do know a little about all of these topics (learning, ML, microbiomes, and pollinators), so I’ll be learning as I go. How meta. I’ll have to break all of this down into a series of posts. This first one will be about the background and motivational question, the next post will be about the data collection and analysis, and the final one will hopefully reveal some results.

Your microbiome and you

You’ve probably heard about microbiomes. They’ve become a pretty hot topic over the past 10 years or so. If you’re interested, there are great books on them (this one by Ed Yong is tremendous!), short but informative videos (like this one from NPR), and thousands of scientific studies on all human body parts (inside and out), as well as loads of plants, animals, and ecosystems (seriously, read Ed’s book or check out this video!).

How much do we know about microbiomes? A LOT! But what we know is a drop in the bucket compared to what we’ll learn in the future.

If you haven’t heard of microbiomes, here’s a real brief rundown. Basically, bacteria, fungi, and viruses are all over the place — in you, on you, and all around you. They’re also inside of and covering most of the other organisms in the world too. Don’t freak out, though, because almost all of these microbes that are on and in you make up a diverse community that at worst does you no harm and at best helps you digest food, fight infections, and may even help you control your weight.

Note that I said “almost all”. Of course, not all bacteria, fungi, and viruses are beneficial or neutral, and in some cases a disruption or infection can lead to dysbiosis, an imbalance in your microbiome that has been associated with colon cancer, autoimmune disorders, and life-threatening diarrhea and colitis. This last disease can be caused by a disruption of your gut bacteria, and is often associated with an unusually high abundance of Clostridium difficile, also known as C. diff.

While people commonly think of the microbiome in terms of their digestive system, Staphylococcus epidermidis is a common member of the human skin community (as the scientific name suggests). “Staphylococcus epidermidis Bacteria” by NIAID is licensed under CC BY 2.0

I first became interested in microbiomes when I was asked to be a fecal donor for a patient who had a C. diff infection. Each year in the US, these infections impact nearly half a million people and kill as many as 15,000 of them. These infections are often brought on by repeated exposure to antibiotics, and because C. diff is frequently resistant to antibiotics, there isn’t an easy way to get rid of it. Oddly enough, the surest way to cure someone is by recolonizing them with the gut bacteria of a healthy person. Yup, we’re talking about poop transplants.

I don’t do research on humans, so ever since that interaction with real live microbiome researchers, I’ve been trying to think of ways of studying the microbiomes of moths. There are 12 times more moth species than butterfly species, and it turns out that we don’t know what most of the adult moths eat, if anything. As caterpillars, some of them are agricultural pests and almost all of them eat plant material, but why would anyone care what adult moths eat if they don’t cause any economic damage? The answer to this question might be that several moths are pollinators.

Moth pollinators. Image by Pete VanZandt

Lots of people assume that moths pollinate flowers, but there is very little evidence that this is the case for most species. If we can connect the microbiomes of moths with the flowers they visit, then maybe we can get a better understanding of whether and how they are important in these mutualisms. Plus, there’s always the chance that we will discover something that turns out to be more practical along the way. This happens all the time in science. For example, did you know that stomach ulcers are caused by bacteria? This was accidentally discovered by someone who was just interested in bacteria from the harsh environments in people’s stomachs. They weren’t interested in curing stomach ulcers, but they ended up making a huge contribution to that field anyway.

Helicobacter pylori, the bacterium that causes stomach ulcers. Barry Marshall and Robin Warren were awarded the Nobel Prize in 2005 for discovering this connection.

Machine Learning

Now a little bit about the other buzzword topic in this story: machine learning (ML). At its simplest, ML uses statistics and computer modeling to train algorithms to learn patterns from data. There are millions of examples, including determining what movies people will like, voice-activated devices, self-driving cars, and helping people hook up with robots.

This resonates on so many levels

It turns out that people have recently united the fields of ML and microbiome research in an effort to better understand human (and some other) microbiomes. As you probably suspect, microbiomes are complex: insanely and incomprehensibly complex (to humans, anyway). Maybe computer algorithms that are capable of artificial intelligence can make better sense of these microbial communities? One area where machine learning classification models have been used a lot lately is in trying to diagnose people with diseased digestive systems based on their gut microbiomes. Traditional statistical techniques like ordination can have a hard time discriminating smaller groups of important players in microbiome communities. They’re also pretty bad at detecting multiple influential species and their interactions. For ML models, that’s right up their alley. So far, they seem to do a far better job of characterizing the complexity that is common in microbial systems and building predictive models to better understand them.
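To make the "interactions" point a bit more concrete, here is a toy sketch in plain Python. Everything in it is made up for illustration: the three taxa, the abundances, and the assumed rule that a sample counts as "diseased" only when exactly one of taxon A and taxon B is abundant. It compares the best rule you can build from a single taxon (a stand-in for looking at one species at a time, not a real ordination) with a k-nearest-neighbors classifier, which is used here only because it fits in a few lines. Published microbiome studies tend to reach for heavier tools such as random forests, so treat this as a cartoon of the idea rather than anyone's actual method.

```python
import random


def make_samples(n, rng):
    """Return n (abundances, label) pairs for three made-up taxa."""
    samples = []
    for _ in range(n):
        a, b, c = rng.random(), rng.random(), rng.random()
        # Assumed rule (hypothetical): "diseased" (1) when exactly one of
        # taxon A / taxon B is abundant. Taxon C is pure noise.
        label = int((a > 0.5) != (b > 0.5))
        samples.append(((a, b, c), label))
    return samples


def accuracy(predict, samples):
    """Fraction of samples whose label the predict function gets right."""
    return sum(predict(x) == y for x, y in samples) / len(samples)


def best_single_taxon_rule(train):
    """Try every one-taxon threshold rule; keep the best one on the training set."""
    best_rule, best_acc = None, -1.0
    for taxon in range(3):
        for x, _ in train:
            threshold = x[taxon]
            for high_means_sick in (True, False):
                # Default arguments freeze the current loop values in the rule.
                def rule(v, t=taxon, th=threshold, h=high_means_sick):
                    return int((v[t] > th) == h)
                acc = accuracy(rule, train)
                if acc > best_acc:
                    best_rule, best_acc = rule, acc
    return best_rule


def knn_rule(train, k=5):
    """Classify a sample by majority vote among its k closest training samples."""
    def rule(v):
        nearest = sorted(
            train,
            key=lambda s: sum((p - q) ** 2 for p, q in zip(s[0], v)),
        )[:k]
        votes = sum(label for _, label in nearest)
        return int(votes * 2 > k)
    return rule


rng = random.Random(42)          # fixed seed so reruns give the same numbers
train = make_samples(300, rng)
test = make_samples(200, rng)

single_acc = accuracy(best_single_taxon_rule(train), test)
knn_acc = accuracy(knn_rule(train), test)
print(f"best one-taxon rule: {single_acc:.2f}")
print(f"5-nearest neighbors: {knn_acc:.2f}")
```

Under this made-up rule, no single taxon carries any signal on its own, so the one-taxon rule should hover around coin-flip accuracy on the held-out samples, while the neighbor-based model can pick up the A-and-B combination. Real microbiome tables have hundreds or thousands of taxa and far messier signals, so the gap will rarely be this clean.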

While others are uniting these fields to address human health questions, my interest in these applications revolves around studying moths. It’s my hope that these models will be able to help me determine which moths are pollinators and what else they might be eating, just based on their gut bacteria. Just as ML models of microbiome communities are connecting bacterial communities with disease, or associating criminals with the scene of the crime (yes, microbiomes are even used in forensics), I’m hoping that I’ll be able to predict whether a bacterial community is characteristic of a flower or a moth that just pollinated it. In the next post, I’ll tell you about the way we get our data.

Let’s hope the whole project doesn’t end up like this…

Thanks for reading!

Pete VanZandt

Lifelong learner; neither botanist nor statistician; fan of moths