To determine whether a patient has cancer, a bodily fluid sample is taken and exposed in a vial to several types of DNA-wrapped carbon nanotubes, and the fluorescence of each nanotube is recorded. One goal of the project, DNA-Nanocarbon Hybrid Materials for Perception-Based, Analyte-Agnostic Sensing, is an automated disease detection system that takes data about a bodily fluid sample as its input.
The NSF project focuses on how the system works and how it can be improved. Some of the fundamental questions researchers are looking to answer, according to Jagota, include, “Is there a rationale for why the nanotubes hybridized with DNA can detect diseases?” and “What is the mechanism by which it detects?”
One issue is that the modulation of fluorescence of the DNA-carbon nanotubes isn’t very specific: a shift in fluorescence can be triggered by any of many biomarkers, not just, in this case, the ones that reveal a patient has cancer.
“Most people, they'll say, ‘Ah, that's useless. You can't use this for sensing,’” Jagota says.
With many other molecules present in blood, it’s essentially impossible for any single type of DNA-carbon nanotube to detect whether a cancer biomarker is present. To account for a mixture of molecules in the sample, many types of DNA-nanotubes are needed for collective analysis, he says.
“You try to ask the question … did they all shift in some way?” Jagota says. “Can I train this system? Can I expose it to different combinations of different concentrations of my analyte and look at the output, and from that output, can I train a machine-learning algorithm? Can I train a black box which says, ‘You tell me how each one of these shifted and I'll tell you whether this molecule was present or not’?”
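The "black box" Jagota describes can be illustrated with a toy example. The following is a minimal sketch, not the project's actual method: it assumes six hypothetical sensor types, invents a response pattern and noise level, and uses a simple nearest-centroid rule as a stand-in for whatever machine-learning algorithm the researchers use. All names and numbers are made up for illustration.

```python
import random

# Hypothetical mean shift of each sensor type when the analyte is present
# (arbitrary units). These values are invented for illustration only.
PATTERN = [0.8, -0.5, 0.3, 0.0, 0.6, -0.2]

def make_sample(present, rng):
    """Return one synthetic vector of fluorescence shifts, one per sensor type."""
    return [(p if present else 0.0) + rng.gauss(0, 0.4) for p in PATTERN]

def centroid(vectors):
    """Average the vectors column by column."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two shift vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(samples, labels):
    """Store the mean shift pattern for each class (analyte present or absent)."""
    return {
        label: centroid([s for s, l in zip(samples, labels) if l == label])
        for label in set(labels)
    }

def predict(model, sample):
    """Report the class whose mean shift pattern is closest to the sample."""
    return min(model, key=lambda label: sq_dist(model[label], sample))

rng = random.Random(0)
labels = [i % 2 == 0 for i in range(200)]          # True = analyte present
samples = [make_sample(l, rng) for l in labels]

# Train on the first 100 synthetic samples, evaluate on the remaining 100.
model = train(samples[:100], labels[:100])
correct = sum(predict(model, s) == l for s, l in zip(samples[100:], labels[100:]))
accuracy = correct / 100
print(f"held-out accuracy: {accuracy:.2f}")
```

No single sensor in this toy setup separates the two cases cleanly, but the combined pattern of shifts does, which is the intuition behind analyzing many DNA-nanotube types together.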
By identifying and using a number of sensors, researchers can be more confident they’re finding what is associated with a biomarker and not something else in the blood, Davison says. That could lead to figuring out how to detect other characteristics or disease states in people.
“If there's nothing there at all, there's sort of a baseline fluorescence that will happen,” Davison says. “When one of these nanotubes has attached to some other molecule, then it can change how it fluoresces either by increasing the brightness or changing the frequency and those are the things that we're measuring.”
Davison used the human nose as an analogy for the work they’re doing: Inside the nose, there are different receptors for scent, but it’s not as simple as one scent per sensor. A collection of sensors activating is what allows people to recognize a particular smell.
In this project, the researchers don’t want to rely on a single sensor; they want a set of sensors that together detect, or fail to detect, a recognizable pattern.
“A big worry for us is that we could have lots of conflicting compounds that are discoverable in blood that aren't what we're looking for, but similarly excite the sensors that we have,” Davison says.
One of the broader questions the NSF project asks, according to Davison, is how to identify the best set of sensors, which the researchers expect will need to be as diverse as possible.
Davison’s area of the project is machine learning. He says one of the challenges is building reliable prediction systems with little data. Unlike big tech companies such as Google or Meta, which have millions of data points, Davison says they’re more likely to have just a few hundred because their data corresponds to real patients.
“How can we be as accurate as possible even though we have a small data set to work with?” Davison asks.
He says they also have to figure out how to represent the data gathered. For instance, should measurements of the sample’s fluorescence be represented directly as wavelengths, as differences in wavelength, or as something else?
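The representation question can be made concrete with a short sketch. It assumes, purely for illustration, that each sensor yields one peak emission wavelength and that a "difference" means the shift from that sensor's baseline reading with no analyte present; the article does not specify either, and the wavelength values below are invented.

```python
def represent(baseline_nm, sample_nm):
    """Return three candidate feature representations for one sample.

    baseline_nm: hypothetical peak wavelength per sensor with no analyte (nm)
    sample_nm:   hypothetical peak wavelength per sensor for the sample (nm)
    """
    pairs = list(zip(sample_nm, baseline_nm))
    return {
        # Option 1: use the measured wavelengths directly.
        "raw": list(sample_nm),
        # Option 2: use the shift from each sensor's own baseline.
        "difference": [s - b for s, b in pairs],
        # Option 3 ("something else"): shift as a fraction of the baseline.
        "relative": [(s - b) / b for s, b in pairs],
    }

# Invented example values, not measurements from the project.
baseline = [990.0, 1030.0, 1130.0]
sample = [992.5, 1029.0, 1130.0]

features = represent(baseline, sample)
for name, values in features.items():
    print(name, values)
```

Each option gives a learning algorithm the same measurement in a different form. Which form works best with only a few hundred samples is the kind of question Davison describes.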
A separate National Institutes of Health award aims to make the process suitable for clinical practice. Memorial Sloan Kettering Cancer Center leads the NIH project, with contributions from researchers at Lehigh and the National Institute of Standards and Technology (NIST) and from a collaborator at the University of Maryland.