What people say they’ve eaten and what they’ve actually eaten are often two very different lists of foods. But a new technique using DNA barcoding to identify the plant matter in human feces may get at the truth, improving clinical trials, nutrition studies and more.
Building on earlier studies that attempted to compare DNA found in feces with reported diets, researchers in the lab of Lawrence David, an associate professor of molecular genetics and microbiology in the Duke School of Medicine, have developed a genetic marker for plant-based foods that can be retrieved from poop.
“We can go back after the fact and detect what foods were eaten,” said Brianna Petrone, PhD, an MD/PhD student who led the project.
The marker is a region of DNA plants use to power chloroplasts, the organelles that convert sunlight into sugars. Every plant has this genomic region, called trnL-P6, but it varies slightly from species to species. In a series of experiments, the researchers tested the marker on more than 1,000 fecal samples from 324 study participants across five different studies, about 20 of whom had high-quality records of their diet.
In findings appearing June 27 in the Proceedings of the National Academy of Sciences, the researchers show that these DNA markers can indicate not only what was consumed, but the relative amounts of certain food species, and that the diversity of plant DNA found in feces varies according to a person’s diet, age, and household income.
David’s lab relied on a reference database of dietary plants that contains markers for 468 species typically eaten by Americans to connect versions of trnL-P6 detected in poop to specific plant sources. After some tweaking, their barcode was able to distinguish 83 percent of all major crop families.
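As a rough illustration of that lookup step, the sketch below matches detected marker sequences against a small reference table. It is a simplified example, not the lab's actual pipeline: it assumes exact sequence matching, and the species list, the sequences (made-up placeholders, not real trnL-P6 data) and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical reference entries: (species, marker sequence).
# The sequences are invented placeholders, not real trnL-P6 data.
REFERENCE = [
    ("wheat", "ATCCGTGTTT"),
    ("corn", "ATCCGTATTA"),
    ("broccoli", "ATCCTGGGTT"),
    ("kale", "ATCCTGGGTT"),  # same placeholder as broccoli: indistinguishable
]


def build_index(reference):
    """Map each marker sequence to the set of species that carry it."""
    index = defaultdict(set)
    for species, sequence in reference:
        index[sequence].add(species)
    return dict(index)


def assign_plants(detected_sequences, index):
    """Return candidate species per detected sequence (empty set if unknown)."""
    return {seq: index.get(seq, set()) for seq in detected_sequences}


index = build_index(REFERENCE)
hits = assign_plants(["ATCCGTGTTT", "ATCCTGGGTT", "GGGGGGGGGG"], index)
for seq, species in hits.items():
    print(seq, sorted(species) or "no match")
```

In this toy table, one sequence points to a single plant, one is shared by two related plants and can only be narrowed to that pair, and one has no match at all, which is the situation a plant missing from the reference database would create.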
Petrone said the subset of crop families that could not currently be detected tended to be consumed in other parts of the world. The lab is now working to add crops such as pearl millet and pili nuts to their database.
They also haven’t tracked meat intake yet, though the technology is capable of that as well, David said. “That relative ratio of plant to animal intake is probably one of the most important nutritional factors we might look at.”
The scientists first tried the marker out on fecal samples from four individuals in a weight-loss intervention, where they knew exactly what study participants had been fed a day or two before. Knowing the patients had been given a dish called mushroom wild rice pilaf, for example, they looked for the markers of its components: wild rice, white rice, portobello mushrooms, onion, pecans, thyme, parsley and sage.
In this and a second intervention group, they found that barcoding could not only identify the plants but also indicate the relative amounts consumed for some kinds of plants. “When big portions of grains or berries were recorded in the meal, we also saw more trnL from those plants in stool,” Petrone said.
Then they looked at samples from 60 adults who had taken part in two studies of fiber supplementation and kept track of what they were eating with surveys. The number of plants detected by trnL was in good agreement with dietary diversity and quality estimated from participants’ survey responses.
Next, they applied the barcoding to a study of 246 adolescents with and without obesity from diverse racial, ethnic, and socioeconomic backgrounds. There was only a minimal record of diet in this cohort.
“Dietary data collection was challenging because some traditional surveys are 140 pages long and take up to an hour to fill out, families are busy, and a child might not be able to fill it out alone,” David said. “But because they had banked stool, we were able to reanalyze those samples and then gather information about diet that could be used to better understand health and lifestyle patterns between kids. What really struck me was that we could recapitulate things that were known as well as get new insights that might not have been as obvious.”
They found 111 different markers from 46 plant families and 72 species in the adolescents’ diet. Four kinds of plants were eaten by more than two-thirds of subjects: wheat (96 percent of participants), chocolate (88 percent), corn (87 percent) and the potato family (71 percent), a group of closely related plants that includes potato and tomatillo.
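Figures like these come down to counting, for each plant, the share of participants in whom it was detected. The sketch below shows that arithmetic on invented data; the participant IDs, detections and function names are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical detections: participant ID -> plants found in that person's stool.
detections = {
    "p1": {"wheat", "corn", "cacao"},
    "p2": {"wheat", "cacao"},
    "p3": {"wheat", "corn"},
    "p4": {"wheat"},
}


def prevalence(detections):
    """Fraction of participants in whom each plant was detected at least once."""
    counts = Counter(plant for plants in detections.values() for plant in plants)
    total = len(detections)
    return {plant: n / total for plant, n in counts.items()}


def common_plants(detections, threshold=2 / 3):
    """Plants detected in more than `threshold` of participants, sorted by name."""
    return sorted(p for p, frac in prevalence(detections).items() if frac > threshold)


print(prevalence(detections))
print(common_plants(detections))
```

The two-thirds cutoff is written as a parameter so the same function could list plants found in, say, more than half of a cohort.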
David said the barcode isn’t able to distinguish individual members of the cabbage family (the brassicas), such as broccoli, Brussels sprouts, kale, and cauliflower, which are closely related.
Still, the large adolescent cohort showed that dietary variety was greater for the higher-income study participants. The older the adolescents were, however, the lower their intake of fruits, vegetables and whole-grain foods, potentially because of a known pattern of older children eating with their families less often.
David said the barcode is readily able to identify the diversity of plants found in a sample as a proxy for dietary diversity, a known marker of nutrient adequacy and better heart health.
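One simple way to turn detections into a diversity score is to count how many distinct plants show up in each sample. The sketch below does that on made-up read counts. The sample names, the counts and the `min_reads` cutoff are assumptions for illustration, and the study's own diversity measure may be defined differently.

```python
def plant_richness(sample_reads, min_reads=1):
    """Count distinct plants with at least `min_reads` reads in one sample."""
    return sum(1 for count in sample_reads.values() if count >= min_reads)


# Hypothetical read counts per plant for two stool samples.
samples = {
    "sample_A": {"wheat": 120, "corn": 40, "onion": 3, "pecan": 0},
    "sample_B": {"wheat": 300, "corn": 0, "onion": 0, "pecan": 0},
}

richness = {name: plant_richness(reads) for name, reads in samples.items()}
print(richness)
```

Raising `min_reads` makes the count stricter by ignoring plants seen only in trace amounts, a choice that would need to be justified against real data.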
David said that in each of these cohorts, the genomic analyses had been carried out on samples that had been collected years in the past, so the technique opens up the possibility of reconstructing dietary data for studies that have already been finished.
The authors think the new methodology should be a boon for all sorts of studies of human nutrition. “We are limited in how we can track our diets, or participate in nutrition research or improve our own health, because of the current techniques by which diet is tracked,” David said. “Now we can use genomics to help gather data on what people eat around the world, regardless of differences in age, literacy, culture, or health status.”
The team anticipates extending the technique to studies of disease across the globe, as well as monitoring food biodiversity in settings facing climate instability or ecological distress.
Funding for this work came from the National Institute of Diabetes and Digestive and Kidney Diseases (grants 5R24DK110492-05 and 5R01DK116187-05), the Burroughs Wellcome Fund Pathogenesis of Infectious Disease Award, the Duke Microbiome Center, the Springer Nature Limited Global Grant for Gut Health, the Chan Zuckerberg Initiative, the Triangle Center for Evolutionary Medicine, the Integrative Bioinformatics for Investigating and Engineering Microbiomes Graduate Student Fellowship, and the Ruth L. Kirschstein National Research Service Award to the Duke Medical Scientist Training Program. This work used a high-performance computing facility partially supported by grants from the North Carolina Biotechnology Center (2016-IDG-1013 and 2020-IIG-2109).