In this episode, we have the privilege of hosting the outstanding Tom Chittenden, Chief Data Science Officer and Founding Director of the Genuity Science Advanced Artificial Intelligence Research Laboratory. Genuity Science is a data insights company.
Tom discusses how his company leverages AI and sequence data to find or build techniques that can robustly identify the drivers of disease. He explains how applying AI to diseases, from cardiovascular disease to thoracic aortic aneurysm and, more recently, COVID-19, can lead to more effective therapeutics. Tom also shares some of Genuity Science’s breakthroughs and challenges as well as his thoughts on single-cell science.
There’s so much to learn from this amazing interview with Tom, so please tune in!
About Tom Chittenden
Tom Chittenden is Chief Data Science Officer and Founding Director of the Genuity Science Advanced Artificial Intelligence Research Laboratory. He is responsible for the development and execution of the company’s global AI/ML R&D strategy, which includes the development of advanced deep learning, statistical machine learning, and probabilistic programming analytics aimed at furthering scientific understanding of human disease initiation and progression, knowledge that can be directly applied in innovative products for better care and medicine in a range of disease areas.
Tom is an Accredited Professional Statistician™ with the American Statistical Association. In addition to his position at Genuity Science, he holds academic faculty appointments at Boston Children’s Hospital and the Harvard Medical School (HMS), where he lectures on biostatistics and mathematical biology. From 2016 to 2018, Tom held a Visiting Lecturer appointment in the Department of Biological Engineering at the Massachusetts Institute of Technology. He currently serves as a Senior Consultant for the HMS Research Computing Group. He is a Senior Fellow and Chief Statistical Sciences Advisor for the Global Strategic Initiatives and Planning Committee of the International Society for Philosophical Enquiry.
Tom holds a PhD in Molecular Cell Biology and Biotechnology from Virginia Tech and a DPhil in Computational Statistics from the University of Oxford. His multidisciplinary postdoctoral training includes experimental investigations in Molecular and Cellular Cardiology from the Dartmouth Medical School and Integrative Functional Genomics, Biostatistics, and Mathematical Biology from the Dana-Farber Cancer Institute and the Harvard School of Public Health.
What 2021 holds for Biotech, AI, and Genomics with Tom Chittenden, PhD, Chief Data Science Officer at Genuity Science: this mp3 audio file was automatically transcribed by Sonix. This transcript may contain errors.
Saul Marquez:
Hey Outcomes Rocket listeners, Saul Marquez here. I get what a phenomenal asset a podcast could be for your business and also how frustrating it is to navigate editing and production, monetization, and achieving the ROI you’re looking for. Technical busywork shouldn’t stop you from getting your genius into the world, though. You should be able to build your brand easily with a professional podcast that gets attention. A patched-up podcast could ruin your business. Let us do the technical busywork behind the scenes while you share your genius on the mic and take the industry stage. Visit smoothpodcasting.com to learn more. That’s smoothpodcasting.com to learn more.
Saul Marquez:
Welcome back to the Outcomes Rocket everyone, Saul Marquez here. Today I have the privilege of hosting the outstanding Tom Chittenden. He is the Chief Data Science Officer and Founding Director of the Genuity Science Advanced Artificial Intelligence Research Laboratory. Tom is responsible for the development and execution of the global AI and machine learning R&D strategy. This R&D initiative includes the development of advanced deep learning, statistical machine learning, and probabilistic programming analytics aimed at furthering scientific understanding of human disease initiation and progression, knowledge that could be directly applied in innovative products for better care and medicine in a range of disease areas. The principal focus of the work of Tom and his team is the development and application of integrated systems biology models to investigate evolutionary factors of human disease. They’re taking a lot of cost out of the production of these drugs. Tom holds a Ph.D. in molecular cell biology and biotechnology from Virginia Tech and a DPhil in computational statistics from the University of Oxford. I’m excited to have him here on the podcast. And so with that, Tom, why don’t you fill in any of the gaps of the intro? Excited to have you here on the podcast today.
Tom Chittenden:
First of all, thank you very much for having me. And you did a perfect job. There’s nowhere to go but down after that introduction. And so I hope I don’t disappoint you.
Saul Marquez:
You definitely won’t. And so the work you’re doing is incredibly meaningful. And obviously, there’s so much that needs to be done to optimize how we develop drugs. Before we get into the power of Genuity Science, why don’t you tell us a little bit about you and what inspires your work in this space?
Tom Chittenden:
Fantastic question, Saul. I’ve always had a very strong sense of inquiry, particularly about how biology works. Most would say that human biology is the most complex system in the known universe. Most cognitive scientists would like to argue that it is human cognition, but I have learned to gently point out that that is human biology. And so we’re applying artificial intelligence and machine learning techniques, arguably the most advanced technology in human history, to advance our collective understanding of human biology in order to better understand disease. And I think that’s what differentiates us from the rest of the pack: we have to fully understand human disease before we can start building more effective therapeutics. And what we have done over the last six years is figure out how to build in something called probabilistic programming, or causal inference. So we’ve moved away from strict classification, which is basically just dressed-up correlation. We’re not trying to find what correlates with human disease. We’ve had three and a half billion years of natural engineering that has gone into the current state of the cell. So everything in the human genome, in human biology, correlates to one degree or another. And so what we’re trying to do is find or build techniques that can robustly identify the drivers of disease. And we have a very strong track record here, a scientifically published track record now, that says this is actually possible or feasible.
Saul Marquez:
Fascinating. Just incredible. Just to hear your passion and belief in the opportunity that we have in this space. Talk to us a little bit about how you believe Genuity Science is adding value to really, I think, the pharma value chain and the ecosystem of health care at large.
Tom Chittenden:
Great question. So we have been referring to ourselves as a data insights company. I run the AI Research Laboratory, but we also have a very large initiative in patient cohorts, and we have some of the largest expertly curated patient cohorts in the world. My responsibility, or the AI team’s responsibility, is to then extract meaningful information from these disease sets or these patient cohorts. And we represent over 60 disease areas. That’s why I refer to our algorithms as being disease agnostic. So I think this is the scope of the problem here. The last credible study that I came across was back in 2015, where total revenue from the entire pharmaceutical industry was 1.2 trillion. 150 billion of that was spent on R&D. Now, 75 percent of that, or about 112 billion in 2015, can be directly attributed to failed clinical trials. Most don’t know this, but about 86 percent of all clinical trials fail. And we believe at Genuity Science that the reason they’re failing is that we don’t fully understand human biology. If we can identify the drivers, what’s driving cellular behavior and dictating phenotype, we can actually right the ship and build more effective therapeutics. We have done this consistently now over the last four years. So if you just bear with me a little bit here: we published a paper in Nature Metabolism in 2019 with our collaborators, Mike Simons and his team at Yale University Medical School.
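The figures Tom quotes can be checked with a few lines of arithmetic. This is only a sketch: the numbers are taken as stated in the interview (for 2015, in billions, currency not specified), and the variable names are illustrative.

```python
# Figures as quoted in the interview for 2015, in billions.
total_revenue = 1200.0       # "one point two trillion"
rd_spend = 150.0             # spent on R&D
failed_trial_share = 0.75    # share of R&D attributed to failed trials

failed_trial_cost = rd_spend * failed_trial_share
rd_share_of_revenue = rd_spend / total_revenue

print(failed_trial_cost)     # 112.5, the "about 112 billion" in the interview
print(rd_share_of_revenue)   # 0.125, i.e. R&D was 12.5 percent of revenue
```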
Tom Chittenden:
And what we were able to do is apply the AI in a cardiovascular disease state called atherosclerosis, which manifests in many different forms. Heart disease is the number one killer in the world. And what we were able to do in 2019 was not only inhibit atherosclerotic plaque development in mice; we were actually able to reverse it. With the information that we gained from that, we then published another, more advanced paper in 2020 in a journal called Cell Stem Cell, where we were looking at another big killer, thoracic aortic aneurysm. When the aorta ruptures, it’s a guaranteed death sentence. And so in that study, we were able to apply everything that we learned from the 2019 study in a longitudinal experimental design. We were looking at how cells differentiate or change over time, and we were able to reverse thoracic aortic aneurysm. And now here in 2021, we are very, very excited. We’ve been working with our collaborators at the University of Strasbourg, Seiamak Bahram’s team there at the medical school. We are actually looking at the molecular drivers of COVID-19, and what we’ve uncovered is a very complex disease etiology associated with something called Acute Respiratory Distress Syndrome.
Tom Chittenden:
And the bottom line here is that we’ve been able to block the virus’s means of getting into the cell. So we are blocking viral uptake and viral replication in human lung cells. This work has been done in actual patient populations. And so please bear with me here and forgive my Midwest vernacular; I’m just going to put it down at this level so everyone understands what we’re talking about. If the virus can’t get into the dance, it can’t pee in the punch bowl. And that is what we’ve been able to show. It’s not going to be able to basically muck up all the signal transduction networks and how the cell behaves. And so we’re very, very excited about that. And equally exciting is that there is a therapeutic currently in phase one trials that blocks the expression of this protein that we’ve uncovered in COVID-19. So if we can actually block all of this in human patients and show that it’s effective, this is going to be the industry’s first real AI/ML-based repositioning of an actual drug. And so what we’re showing now is that all of the math and statistics can point investigators in the right direction to save time and expense in developing these therapeutics.
Saul Marquez:
Yeah, that’s very interesting, Tom. So this particular drug, at what point is it taken, the one you’re telling me about for COVID that I guess prohibits the protein intake? How does it work? When do people take it?
Tom Chittenden:
We’re not there yet. And I can’t share all the specifics of this. We’re just about ready to submit the paper for peer review, so again, I can’t share the specifics of it. But it’s a protein that’s expressed in human cells. And so when we block the expression of that, the virus can’t interact with the cell, it can’t gain access into the cell, and then do what this virus is doing. That’s what we’re actually doing. And so we’re not teaching the body or the immune system to attack the virus. We’ve uncovered what we believe to be an actual mechanism of action that causes the disease, and we’re inhibiting it.
Saul Marquez:
Fascinating. A new approach to it. And so you arrived at this conclusion, this opportunity, this drug, with the use of AI and machine learning. So talk to us about how and why that’s different.
Tom Chittenden:
Yeah, well, OK, so machine learning has been around for quite some time. When I was in graduate school, we were working on that twenty-five years ago. But these are very, very sophisticated next-generation approaches for looking at high-dimensional omics data. What I mean by omics is that we’re looking at how all the genes in the human transcriptome are expressed or how they are mutated. We’re able to capture that on these data platforms, and then we use these very advanced analytics to go in and find patterns in what’s differentiating a normal patient population or, in our case, those patients that are admitted to the hospital but only need supplemental oxygen versus those that are admitted to the ICU and are on mechanical ventilation. What is the difference between the two? Artificial intelligence is very good at going in and finding the patterns in these high-dimensional omics platforms. At the end of the day, if the algorithms are doing their job, what they do is generate very, very robust working hypotheses. Then the investigators, the experimentalists, can go to the wet bench and validate. And that’s how we work very closely with all of our academic collaborators.
Saul Marquez:
Makes a lot of sense. And so help us understand how you translate these approaches, these unique approaches, into doing things better than what’s available today. And what kind of results are you getting?
Tom Chittenden:
The results that we are getting, again: in 2019, we reversed atherosclerosis. In 2020, we reversed, or inhibited, aortic aneurysm. And now in 2021, we’re addressing the actual pandemic. But the way that we go about this is the strategic application of six or seven classes of these types of algorithms, and we’re integrating them together. So it’s an ensemble approach of these classification methods. I have never trusted a single run of a single algorithm on a single cut of the data. So normally what we have to do is partition the data so that we’re training the algorithms on a certain percentage of the data, usually 80 percent, and then testing on the remaining 20 percent to validate the signal that the algorithm has actually uncovered. And so we cut our data a hundred different times. We’re using seven different classes of algorithms, or anywhere from seven to 10. We’ve actually now published the first successful classification of human cancer patients with quantum machine learning, which is extraordinarily exciting. And so we are now integrating quantum machine learning as one of our classification approaches. But at the end of the day, we have as many as seven hundred different models, and we can then evaluate what these models are saying. Not just one model by itself, but seven hundred different models that are saying that this is what’s most important.
Tom Chittenden:
And so what differentiates us is the strategic application of these algorithms. But then we take it a step further. At the end of the day, if we identify, say, the top six hundred signals that our algorithms are telling us are important, we go downstream with something called structural causal modeling, or probabilistic programming. Now we want to find the gene within those six hundred genes that is responsible for the state of all of these other genes, in a network type of approach. We have shown time and time again that this is extraordinarily powerful, because that single gene then represents a potential drug target. And so then we work with our collaborators to go downstream with actually building drugs on top of what we found from a causal inference standpoint. The field is looking at drug development and assessing drug efficacy in clinical trials. We do all of that, but it can all be traced back to identifying the most appropriate drug target. And what we’re doing is saying, hey, from what the AI is telling us, our best guess is that Gene X is driving this disease state. And that’s what differentiates us from the majority of what everyone else is doing in this space.
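The resampling scheme described above (a hundred 80/20 cuts of the data, seven classes of algorithms, up to seven hundred models) can be sketched in a few lines. This is a toy illustration under stated assumptions, not Genuity Science's pipeline: the data are synthetic, the seven "learners" are hypothetical stand-ins (the same simple threshold rule applied to different random feature subsets), and every function name is made up for the example. The causal modeling step is not shown.

```python
import random
from collections import Counter

N_SPLITS = 100          # "we cut our data a hundred different times"
N_LEARNERS = 7          # "seven different classes of algorithms"
TRAIN_FRACTION = 0.8    # train on 80 percent, test on 20 percent


def make_toy_data(n_samples, n_features, rng):
    """Synthetic data (assumption): only feature 0 shifts with the label."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


def split_indices(n_samples, rng):
    """One random 80/20 partition of the sample indices."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    cut = int(TRAIN_FRACTION * n_samples)
    return idx[:cut], idx[cut:]


def fit_stump(X, y, train_idx, candidate_features):
    """Stand-in learner: pick the candidate feature whose class means
    differ most and threshold halfway between those means."""
    best = None
    for f in candidate_features:
        pos = [X[i][f] for i in train_idx if y[i] == 1]
        neg = [X[i][f] for i in train_idx if y[i] == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        gap = abs(mean_pos - mean_neg)
        if best is None or gap > best[0]:
            best = (gap, f, (mean_pos + mean_neg) / 2.0, mean_pos > mean_neg)
    _, feature, threshold, pos_is_high = best
    return feature, threshold, pos_is_high


def accuracy(model, X, y, test_idx):
    """Fraction of held-out samples the stump classifies correctly."""
    feature, threshold, pos_is_high = model
    hits = 0
    for i in test_idx:
        pred = int((X[i][feature] > threshold) == pos_is_high)
        hits += pred == y[i]
    return hits / len(test_idx)


def run_ensemble(seed=0, n_samples=200, n_features=20):
    rng = random.Random(seed)
    X, y = make_toy_data(n_samples, n_features, rng)
    results = []
    for split in range(N_SPLITS):
        train_idx, test_idx = split_indices(n_samples, rng)
        for learner in range(N_LEARNERS):
            # Each stand-in learner sees a different random half of the features.
            candidates = rng.sample(range(n_features), n_features // 2)
            model = fit_stump(X, y, train_idx, candidates)
            results.append(
                (split, learner, model[0], accuracy(model, X, y, test_idx))
            )
    return results


results = run_ensemble()
votes = Counter(feature for _, _, feature, _ in results)
print(len(results), "models fitted")
print("most frequently selected features:", votes.most_common(3))
```

The point of the tally at the end is the one made in the interview: no single fitted model is trusted, and a feature matters only if many models across many cuts of the data keep selecting it.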
Saul Marquez:
That’s fascinating. You know, you mentioned, Tom, the reversal of atherosclerosis, right? So what do you mean by that and what’s being reversed?
Tom Chittenden:
So atherosclerosis is the buildup of plaque in the vascular wall, if you will. It occludes blood flow, and that can happen throughout the body. What I mean when I say reversing is this: within these experimental models, and in this instance it was a mouse model, our investigator Mike Simons and his team at Yale would let that process advance. He mutated the mouse genome so that these mice were susceptible to atherosclerosis, so they’re actually disease-prone mice. He lets that pathologic process occur. And then, by knowing how it’s occurring, we turn off that mechanism, and it actually reverses plaque burden within these animals. And so it’s a step further than just inhibiting the disease. For all of those patients out there, I do not want to give any false hope. I think it’s very important to say that there is a lot of hype, a great deal. There are a great many snake oil salesmen, midnight infomercials, the list goes on. In this space, you would have thought that we had already cured cancer by now. We haven’t; we’re getting closer. So I don’t want to provide false hope at this time, but we are getting much, much closer. And we’ve shown in model organisms that we can reverse an actual disease. And here is where I think this is going.
Tom Chittenden:
So can I step back in my career just a little bit here? Everyone has a defining moment, and my defining moment was in a graduate school class. It was a biotechnology course, and this was in the mid-90s. The professor was talking about something called DNA microarray chips and was explaining the actual application. And I was just absolutely fascinated. He said that in a single assay you can actually capture the state of the entire human transcriptome. At the time, we knew that there were about 22,000 or so genes, but you could capture the state of all these genes in a single assay. And there was something that just resonated with me. It’s that voice that told me this is the future, this is where things are going. And now that voice is telling me, 25 years later, that these technologies are actually going to lead to the eradication of human disease. Maybe not in my lifetime, at 58, but definitely in my six-year-old granddaughter’s lifetime, we are going to see some major breakthroughs in our understanding, first of human biology, because we can’t do anything until we understand biology, and then in how we address disease.
Saul Marquez:
Yeah, that’s fascinating. Thanks for that clarification, and it is exciting. I do see the value, and I think, listeners, you probably hear the value as well of understanding biology, because if we understand how these cells get diseases and how they break down, we’re able to stop that from happening, maybe even reverse it, like the examples that Tom just shared. So, Tom, you obviously have been thinking about this for a long time. And the team at Genuity Science is working on a slew of different global disease data sets that are going to lead to some great solutions in the work that you do. What would you say is one of the biggest setbacks you’ve experienced, and a great learning that has made you and the team and the company better?
Tom Chittenden:
Yeah, we are tackling a number of very, very difficult or complex problems, and one of them plagues everything that we do. It is one of the reasons that we run ensemble approaches. I won’t get too deep into the weeds here, but it’s something called statistical optimization. How well can an actual algorithm define a pattern in data where each feature is so highly correlated with all the other features? If you run that algorithm once and then run it again, you’re going to get the same classification performance. You’re going to be able to discriminate between these two classes or three classes or four classes, depending on the experimental design. But the features that the classification is based on will, for the most part, always be different. And so how do we build algorithms that are more robust, that can capture the real underlying biology? What is actually driving disease, versus the significant amount of signal that merely correlates with the disease? That has been the biggest hurdle or problem that we are currently addressing. And that’s why we’ve moved into unconventional computing approaches like neuromorphic computing and quantum computing. It appears that they are much better at defining what’s actually driving the disease, or at more consistently defining what is important and not just what is correlated with the disease.
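The instability described above, where performance stays the same from run to run but the selected features change, is easy to reproduce on toy data. The sketch below is an assumption-laden illustration, not the team's method: it builds two synthetic, nearly duplicate features, reruns a simple threshold rule on different random cuts of the data, and records which feature was picked each time.

```python
import random
from collections import Counter

N_RUNS = 50


def make_pool(n_samples, rng):
    """Synthetic pool (assumption): two near-duplicate features that
    both track the class label, differing only by small noise."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        shared = 2.0 * label + rng.gauss(0.0, 1.0)
        X.append([shared + rng.gauss(0.0, 0.1), shared + rng.gauss(0.0, 0.1)])
        y.append(label)
    return X, y


def one_run(X, y, rng, n_train=160, n_test=40):
    """Draw a fresh random cut, pick the feature with the larger gap
    between class means, and score it on the held-out samples."""
    idx = rng.sample(range(len(X)), n_train + n_test)
    train, test = idx[:n_train], idx[n_train:]
    best = None
    for f in (0, 1):
        pos = [X[i][f] for i in train if y[i] == 1]
        neg = [X[i][f] for i in train if y[i] == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        gap = mean_pos - mean_neg
        if best is None or gap > best[0]:
            best = (gap, f, (mean_pos + mean_neg) / 2.0)
    _, feature, threshold = best
    hits = sum(int(X[i][feature] > threshold) == y[i] for i in test)
    return feature, hits / n_test


rng = random.Random(0)
X, y = make_pool(2000, rng)
runs = [one_run(X, y, rng) for _ in range(N_RUNS)]
picked = Counter(feature for feature, _ in runs)
accuracies = [acc for _, acc in runs]
print("feature picked per run:", dict(picked))
print("mean accuracy: %.2f" % (sum(accuracies) / len(accuracies)))
```

Both features classify about equally well, so which one wins depends on the cut of the data. In real omics data, with thousands of correlated features, the same effect makes a single run's feature list unreliable, which is the motivation given for ensembles and causal modeling.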
Saul Marquez:
It’s a challenge, right? Because when you get in there, there’s a lot of noise. And so what do you actually do with it? What’s the signal? And it’s challenging when you have such diverse data sets. I think that the approaches you guys are taking are exciting and unique ways to get to the answer in a clearer, quicker way. What are you most excited about today?
Tom Chittenden:
Oh, fantastic question, Saul. I could spend the next hour and a half talking to you about this. It is the advent of single-cell science. What I’ve been talking about, the Nature Metabolism and the Cell Stem Cell papers, was based on extracting signals from single cells and coupling that with very advanced generative models. And that, I believe, is going to not only revolutionize our understanding of human biology; it’s going to completely transform the field. What we found with the longitudinal experimental design from the Cell Stem Cell paper is that there is evidence now that it’s not the whole collection of a cell population that’s actually driving diseases. It’s specific pockets of abnormal cells that are driving diseases. And we can now, again at the single-cell level, actually look at how these cells are changing over time, which gives us vastly more information than we could ever have thought to capture in the past, even the recent past. What this is also going to do is help us address rare diseases. The problem in the past has been the small number of patients, which you can’t build machine learning approaches with. For example, we’re working with Children’s Hospital of Philadelphia and the University of Pennsylvania on a disease called NWF two. It’s a rare childhood disorder. But now, because we’re looking at the single-cell level, we can build these algorithms within a single patient and capture all of the biological variation within that patient. And then we can take very sophisticated approaches such as transfer learning and go look in bigger populations. And so that is going to be a real game-changer in the field as well.
Saul Marquez:
Wow. Just to be able to do that, right? Yeah. These rare diseases, you know, I had an interview with another gentleman who is focused on child oncology and he’s talking about the number of drugs developed there compared to adult oncology drugs is just shameful. And a lot of it has to do with the number of patients and the size of the market. And it’s a shame. But the things that you and your team are doing give broader access because of the leverage. So, OK, there’s this idea that I just learned and I’m curious about your take on this, Tom, that when you take a look at, for example, a blood sample, you look at the cells that are there, but so much is not actually currently studied, such as the broken-down cells that we could learn so much from. What’s your take on that?
Tom Chittenden:
Let me take a step back as well to address that question. One of the issues that has been associated with machine learning arises when you’re using a priori biological information. We’re only looking at the universe of features that we actually know, in this case genes where we know what they do. But in the human genome, you can pull down from a data repository, on any given day, about 68,000 gene entities. We only know the molecular function of about 22,000 of these gene entities. And so what we have been able to do is take all of that signal that is very, very informative, where we have no idea what it does, and couple it to what we actually know. That’s how we are advancing our collective understanding. And in the same setting, we can do that with abnormal cells that we are pulling out of the blood. In fact, we’ve done this with Alzheimer’s already. The question is, can you detect a signal in the systemic circulation, or in the blood, that is actually a surrogate of what’s going on in the brain? To directly answer that question, the bottom line is yes. We have been able to pull out signatures that are highly informative of what’s actually occurring in the brain.
Saul Marquez:
That’s awesome. So going back to your point, it makes the process of discovery more efficient and it cuts into the one hundred and fifty billion dollars of lost R&D. So if you’re a pharma company listening to this, thinking about all that money going out the window, think about this. This is an opportunity for you to do it differently. It’s very exciting. I’m excited about this and the future is now here. Tom, you talked about that day that you were in your classroom and you thought, man, this is the future. Well, now we have the computing power. We have the genome sequenced. We’re ready for this. So take us home with what you believe we should be thinking about here. And then also, as folks think about how they can best engage with the Genuity Science team, where’s the best place they could do that? How can they reach out to you and the team?
Tom Chittenden:
First of all, the way I’m looking at it is that this is just the tip of the iceberg. Basically, we’re off to the races here. We have all the tools now to actually start addressing diseases in a very, very robust, reproducible, meaningful manner that will lead to more efficacious approaches to addressing diseases. As far as how to get a hold of us, the website that we have, GenuitySci.com, is a good way. Your listeners can also reach out to me on LinkedIn and get a hold of me. That’s a good way as well. I would love to hear from your listeners about their interests and what they are working on, because it’s going to take a team effort, and we are extraordinarily collaborative at Genuity Science, to be able to address these very, very complex questions. It’s going to take a lot of individuals thinking outside the box.
Saul Marquez:
Well, there you have it, folks. Reach out. The time to collaborate is now. Tom, I’m floored with the awesome work that you and your team are doing. And I’m excited for the future. So thank you for inspiring that in me, and I know the listeners are inspired, too, so I appreciate you doing that. And I’m certainly excited to keep up with you and the success that you guys are having and will continue to have. Thanks, Tom.
Tom Chittenden:
Fantastic, Saul. Thank you for giving us the opportunity to share what we’re working on. Thank you.
Saul Marquez:
Hey, everyone. Saul Marquez here. Have you launched your podcast already and discovered what a pain it can be to keep up with editing, production, show notes, transcripts and operations? What if you could turn over the keys to your podcast busywork while you do the fun stuff like expanding your network and taking the industry stage? Let us edit your first episode for free so you can experience the freedom. Visit smoothpodcasting.com to learn more. That’s smoothpodcasting.com to learn more.
Resources
Website: https://genuitysci.com/
LinkedIn: https://www.linkedin.com/in/tom-chittenden-24004027/