Originally published Sep 6, 2019 / https://bit.ly/32iqouU
Alex Zhavoronkov, Ph.D.

Founder & CEO

Insilico Medicine has succeeded in using AI to design a new molecule from scratch in 21 days and validate it in 25 days. This is the first time that the potential for AI in the Pharma industry has been validated in practice, moving from theory into reality.

This is Pharma’s AlphaGo moment. The original AlphaGo moment occurred in 2015 when DeepMind succeeded in developing AlphaGo, the first AI capable of beating a human Go champion. This study by Insilico Medicine may be an analogous game-changing moment for Pharma. Insilico Medicine used a generative approach combined with deep reinforcement learning (the form of AI that was used in AlphaGo) to design and validate a new drug candidate end-to-end in 46 days. This is about 15 times faster than the approach typically used by Pharma companies. The public is now aware of AI’s potential to radically transform the Pharma industry.

This landmark study was published in Nature Biotechnology on September 2, 2019. In the study Insilico Medicine showed that they designed a drug candidate from scratch in 21 days and validated it in 25 days. This achievement demonstrates the true power of AI to accelerate the pace of scientific R&D. This is the first time that GAN-RL technology, a combination of generative adversarial networks and reinforcement learning, was used to generate novel small molecules for a protein target that were validated in vitro and in vivo.

“The drug discovery process consists of many phases and often takes decades. In preclinical phases the failure rates are over 99%. Our AI can be used in all phases and in some cases lead to superhuman results. Our AI is exceptionally good at finding the molecular targets in specific diseases and inventing new chemistry. We intend to use this in a big way.”

Alex Zhavoronkov PhD, Founder & CEO, Insilico Medicine

We expect that this will have a big impact on the Pharma industry generally, and incentivize an increasing number of large Pharma companies to adopt AI as an integral part of their R&D. This news may even mark the beginning of an arms race in drug development, where the largest Pharma and Tech corporations compete to acquire the strongest AI drug discovery companies.

AI holds the promise to reduce the 99% preclinical failure rate and to expedite the time it takes to go from R&D to real treatments. AI’s potential has been a topic of discussion for years, but so far no AI designed drugs have reached the market. This new study is the closest that the industry has come to demonstrating the real-world potential of AI in drug discovery in a tangible way. This is Insilico’s most important publication to date because it shows that the molecules imagined using the GENTRL approach work in vitro and in vivo.

Insilico Medicine is known for validated science - they announce every achievement they make. Insilico has published over 330 scientific papers with over 2,300 citations. They strive to be transparent in showing that their methodology and their solutions really work. Most AI companies offering similar services barely have a few released papers or a proof of concept derived from their technology. Insilico is known worldwide for being a pioneer in cutting edge science backed up by transparency and credibility. The company employs over 85 AI experts and scientists in 6 countries and collaborates with over 150 academic and industry partners worldwide. Insilico’s team includes bioinformatics, computational and structural biology experts, highly experienced medicinal chemists, and deep learning engineers, who work with massive amounts of biomedical data and practical problems in drug discovery.

“The platforms built on this technology may save millions to tens of millions in R&D costs and 1-2 years in small molecule discovery time. And since the molecules are generated with the specific conditions and objectives in mind including safety, it is likely that these molecules will have better chances of passing through clinical trials.”

Alex Zhavoronkov PhD, Founder & CEO, Insilico Medicine

Deep Knowledge Ventures provided Insilico Medicine’s initial funding in 2014, and has remained a close advisor in the company’s journey towards becoming a global leader in the application of advanced AI for aging research. Insilico Medicine is one of our most promising portfolio companies, not only in terms of its potential ROI, but also of its potential impact on serious problems facing humanity. Methods like GAN-RL could expedite drug discovery dramatically so that life saving treatments could reach doctors and patients sooner, reducing suffering and saving human lives.

A graphical representation of the GENTRL approach. It generates the molecules with specific conditions and learns to generate molecules with the specific objectives. 

From Molecule Design in 2016 to Molecule Validation in 2019 

This game-changing accomplishment is the culmination of Insilico Medicine’s efforts in pioneering the use of cutting-edge techniques in AI and deep learning, specifically the combination of GANs and reinforcement learning, for drug discovery and biomarker development, which began more than 2 years ago. The concept of GANs is relatively new and is sometimes referred to as the AI Imagination. Conceptually it is a competition between two deep neural networks, where one, the generator, produces novel content that meets a desired set of criteria, and another, the discriminator, judges whether each output is real or generated.

Insilico Medicine was the first to utilize GANs to generate novel molecules in 2016, and since then has spent two years developing the theoretical base for the combined use of GANs and RL, documented in 15+ papers and 80+ conference presentations. Now, for the first time, these efforts have been utilized to design, synthesize and validate a novel DDR1 kinase inhibitor both in vitro and in vivo, end-to-end, in just 46 days.

Insilico Medicine screens potential drug candidates using GANs, which create synthetic datasets that are indistinguishable from real datasets by having two neural networks compete against each other. One neural network generates the data and the other compares it to a real data set in iterative cycles so that the degree of error in the synthetic data set is gradually decreased. Rather than using trial and error when looking for molecular leads, requests are made to the network to generate specific leads and leads are generated on demand.
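The competition between the two networks can be made concrete with the loss each side tries to minimize. The following is a minimal sketch under stated assumptions, not Insilico Medicine's implementation: the function names, the `EPS` guard, and the example scores are hypothetical, and the discriminator is reduced to the probabilities it assigns to each sample.

```python
import math

EPS = 1e-12  # hypothetical guard against log(0) when a score saturates


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy that the discriminator minimizes.

    d_real and d_fake hold the discriminator's probability of "real"
    for a batch of real samples and a batch of generated samples.
    """
    real_term = sum(-math.log(max(p, EPS)) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(max(1.0 - p, EPS)) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: low when generated samples pass as real."""
    return sum(-math.log(max(p, EPS)) for p in d_fake) / len(d_fake)


# A discriminator that cannot tell the two sets apart scores everything 0.5.
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])

# A sharper discriminator scores real samples high and generated samples low,
# which lowers its own loss and raises the generator's loss.
sharp = discriminator_loss([0.9, 0.8], [0.1, 0.2])
```

Training alternates between lowering the first loss (a better critic) and lowering the second (a better generator), which is the iterative cycle described above.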

Applying Deep Learning to Drug Discovery

So, how exactly does this work in practice? The process begins with identification of a protein target. Once a target is identified, scientists use a deep learning algorithm to design molecular structures with desired physical and chemical properties. This is a brand new approach to drug design. The traditional method screens existing molecule libraries against specific targets. Insilico Medicine spent two years developing the theoretical base for this method, a deep generative model called GENTRL.

Deep generative models are machine learning techniques that use neural networks to produce new data objects. This technique generates objects with specific properties so it’s well-suited to discover drug candidates. This new model optimizes for synthetic feasibility and biological activity. Insilico Medicine performed a challenging experiment in which they timed the process of lead-like molecule development from target nomination through small molecule design, synthesis, testing in disease-relevant models, and animal pharmacokinetics.

Identifying new small molecule kinase inhibitors in 46 days

Researchers at Insilico Medicine mapped the chemical space to a continuous space of 50 dimensions, and then explored the space with reinforcement learning to discover new compounds. They used three distinct self-organizing maps as reward functions and used six datasets to build the model. By day 23 after target selection, they had identified 6 lead candidates, and by day 35 the molecules had been successfully synthesized. As a final experimental validation of GENTRL’s potential as a valuable tool, they tested one compound in a rodent model. This study demonstrates the utility of Insilico Medicine’s deep generative model for the successful, rapid design of compounds. The company plans to develop this technology so that it can be used as a useful tool to identify drug candidates.
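To illustrate what exploring a continuous 50-dimensional space with reinforcement learning can look like, here is a toy policy-gradient (REINFORCE-style) sketch. It is not the GENTRL code: `TARGET`, `SIGMA`, `reward`, and `explore` are hypothetical names, and the distance-based reward is only a stand-in for the self-organizing-map rewards, which score decoded molecules rather than raw latent points.

```python
import random

LATENT_DIM = 50               # dimensionality quoted in the text
SIGMA = 0.5                   # fixed exploration noise (assumption)
TARGET = [1.0] * LATENT_DIM   # hypothetical "ideal" latent point


def reward(z):
    """Hypothetical reward: higher the closer z lies to TARGET."""
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, TARGET))


def explore(steps=150, batch=64, lr=0.05, seed=0):
    """Shift the mean of a Gaussian sampling policy toward high-reward regions."""
    rng = random.Random(seed)
    mean = [0.0] * LATENT_DIM  # the only learned parameter in this sketch
    for _ in range(steps):
        # Sample candidate latent points around the current mean.
        samples = [[m + SIGMA * rng.gauss(0.0, 1.0) for m in mean]
                   for _ in range(batch)]
        rewards = [reward(z) for z in samples]
        baseline = sum(rewards) / batch  # subtracting it reduces variance
        for j in range(LATENT_DIM):
            # Score-function gradient: (reward - baseline) * d log N(z; mean) / d mean_j
            grad = sum((r - baseline) * (z[j] - mean[j]) / SIGMA ** 2
                       for z, r in zip(samples, rewards)) / batch
            mean[j] += lr * grad
    return mean


if __name__ == "__main__":
    best = explore()
    print(round(reward(best), 2))
```

In a real pipeline each sampled latent point would be decoded into a molecular structure before scoring, and several reward functions would be combined.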

A timeline summarizing the key advances towards the development of machine and deep learning and a timeline of the release of the successive GAN-based models.

By using AI in drug development, it’s possible to accurately predict which drugs will be safe and effective for specific patient subgroups. AI accelerates the drug development cycle by generating drug candidates for which we already have some evidence of effectiveness. Traditional pharma companies screen through a large number of candidates and test each one with the hope that one will work. Insilico Medicine starts with molecular leads that have been specifically designed, in terms of their pharmacokinetic and pharmacodynamic properties, and therefore have a higher probability of being effective for specific disease targets.

Insilico Medicine’s drug discovery engine is trained on massive amounts of structural, functional, and phenotypic data in order to predict the biological activity of compounds. Insilico Medicine has published seminal papers in Oncotarget and Molecular Pharmaceutics. One of them, published in Molecular Pharmaceutics in 2016, demonstrated a proof of concept for using deep neural networks to predict the therapeutic class of a molecule from transcriptional response data.

Every Drug Can Be Made and Every Disease Can Be Treated

This new study was a close collaboration between Insilico Medicine and WuXi AppTec. WuXi AppTec is a leading pharmaceutical and medical device open-access capability and technology platform company with global operations. WuXi AppTec is committed to enabling innovative collaborators to bring innovative healthcare products to patients, and to fulfilling WuXi’s dream that every drug can be made and every disease can be treated. 

WuXi AppTec and Insilico Medicine share a mutual vision that AI and machine learning will optimize the drug discovery process by increasing the probability of success at the preclinical level. Insilico Medicine’s domain expertise in next-generation AI coupled with WuXi AppTec’s capability platform, can potentially improve the efficiency of drug discovery and increase the productivity to serve partners. By combining WuXi AppTec’s comprehensive platform and services with Insilico Medicine’s hallmark expertise in AI for drug discovery they hope to make dramatic paradigm shifts in the drug development process. They will focus on slashing inefficiencies in the preclinical drug design stage of drug development and cutting development time and cost.

“This paper is a significant milestone in our journey towards AI-driven drug discovery. We’ve been working in generative chemistry since 2015. When Insilico’s and Alán’s theoretical papers were published in 2016 everyone was very skeptical. Now, this technology is going mainstream and we are happy to see the models that we developed a few years ago being validated experimentally in animals. When integrated into comprehensive drug discovery pipelines, these models work for many target classes. We work with the leading biotechnology companies to push the limits of generative chemistry and generative biology even further.”

Alex Zhavoronkov, PhD, Founder and CEO, Insilico Medicine

When Deep Knowledge Ventures chose to provide Insilico Medicine’s initial funding round in 2014, we did so because we saw their potential to increase Quality-Adjusted Life Years for the betterment of humanity as a whole. Since then, they have been the first to use cutting edge deep learning techniques like GANs to design novel drug candidates from scratch with specified molecular properties, and they have succeeded in designing, synthesizing and validating a new drug end-to-end in less than 2 months. We are thrilled that this paper shows what Insilico Medicine was already developing in R&D back in 2017 and submitted for publication in 2018. Insilico Medicine may well have made even greater progress since then in applying next-generation AI techniques for drug design, which might be publicly disclosed in 2020.


The GENTRL model is a variational autoencoder with a rich prior distribution over the latent space. Insilico used tensor decompositions to encode the relations between molecular structures and their properties and to learn from data with missing values. They train the model in two steps. First, they learn a mapping of the chemical space onto the latent manifold by maximizing the evidence lower bound. Then they freeze all the parameters except for the learnable prior and explore the chemical space to find molecules with a high reward.
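The two training steps can be written out in a generic form. The notation below is an assumption made for illustration (encoder q_phi, decoder p_theta, learnable prior p_psi, reward R), and it omits the molecular property variables handled by the tensor decomposition, so it should be read as a standard variational autoencoder sketch rather than the exact objective from the paper.

```latex
% Step 1: fit encoder, decoder, and prior by maximizing the evidence lower bound
\mathcal{L}(\theta, \phi, \psi)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p_\psi(z)\big)
  \le \log p_{\theta, \psi}(x)

% Step 2: freeze \theta and \phi, update only the prior parameters \psi
\max_{\psi} \; \mathbb{E}_{z \sim p_\psi(z)}\big[R(z)\big],
\qquad
\nabla_\psi \, \mathbb{E}_{z \sim p_\psi(z)}\big[R(z)\big]
  = \mathbb{E}_{z \sim p_\psi(z)}\big[R(z)\, \nabla_\psi \log p_\psi(z)\big]
```

The gradient identity in the second step is the score-function (REINFORCE) estimator, a common way to optimize an expected reward when the reward itself is not differentiable.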

The GENTRL source code is open source and available on GitHub. In the repository Insilico provides an implementation of a GENTRL model with an example trained on a MOSES dataset.



Comments From Key Opinion Leaders

“This paper is certainly a really impressive advance and likely to be applicable to many other problems in drug-design. Based on state-of-the-art reinforcement learning, I am also very impressed by the breadth of this study involving as it does molecular modeling, affinity measurements, and animal studies”

Michael Levitt, PhD, professor of structural biology, Stanford University. Dr. Levitt received the Nobel Prize in Chemistry in 2013

“Using Advanced GANs in the discovery of drugs is a great example of cutting edge application of AI in the pharmaceutical industry - it speeds up a critical process from years to just weeks.”

Christian Guttmann, PhD, Executive Director at the Nordic AI Institute and Senior Research Fellow in Artificial Intelligence at the Karolinska Institute

“Zhavoronkov et al. show that AI techniques can be used to guide our search for good drug molecules in the vastness of chemical space, one of the key challenges in drug discovery today. The work provides compelling evidence that AI can learn from historical datasets to generate novel molecular compounds with drug-like properties, and helps clarify how AI can be used to improve the speed of drug development.”

Mark DePristo, PhD, former Head of Genomics at Google Brain, Co-founder and CEO, BigHat Biosciences

“I met Alex when working at OpenAI and have been excited to see him pioneer the use of GANs/RL for the pharmaceutical industry since 2016. One major criticism of GANs is that their usefulness has been limited to image editing applications, so I’m glad that Alex and his team are finding ways to use them for molecular generation,”

Ian Goodfellow, PhD, the original inventor of GANs

“The generative tensorial reinforcement learning in this paper substantially advances the efficiency of biochemistry implementation in drug discovery. Yet to be further experimented at scale, this method signals a breakthrough of pharmaceutical artificial intelligence at industrial level, and may bring significant social and economic impact to our society,”

Kai-Fu Lee, PhD, founder of Sinovation Ventures, former executive of Microsoft and Google, and the original inventor of multiple AI technologies

“This is an important demonstration of the power of AI, using a GAN approach, to markedly accelerate the design and experimental validation of a new molecule, no less one targeting fibrosis, a major unmet medical need.”

Eric Topol, MD, Executive Vice-President of Scripps Research and Founder and Director of the Scripps Research Translational Institute (Dr. Eric Topol has no relationship with the company in question nor its authors).

“I interacted with many AI startups in the past and Insilico was the only deep learning company with impressive, demonstrated capabilities integrating target identification and small molecule discovery. They did a lot of theoretical work in GANs from the very beginning and this experimental validation is a significant demonstration that this technology may improve and accelerate drug discovery,”

John Baldoni, PhD, CTO of a stealth AI-powered drug development startup and former SVP of Platform Technology and Science at GSK.

“Much hyperbole exists about the promise of artificial intelligence (AI) in improving medical care and in the development of new medical tools. Here however is a paper “Deep learning enables rapid identification of potent DDR1 kinase inhibitors” recently published in Nature Biotechnology that describes an application of AI in drug discovery that is indeed important. A new drug candidate was proposed and tested pre-clinically in a remarkably short period of time. The results are significant for two reasons. The AI procedures replaced the role normally played by medicinal chemists, and these individuals are in limited supply. The acceleration in rate translates into longer patent coverage that improves the economics of drug development. If this approach can be generalized it could become a widely adopted method in the pharmaceutical industry,”

Charles Cantor, PhD, a professor at Boston University, co-founder of Retrotope, Inc, and former Chief Scientist of the Human Genome Project with the US Department of Energy

“This technology builds on our early work on adversarial and generative neural networks since 1990. Insilico has been working on generative models for drug discovery since 2015, and I am happy to see that their GENTRL system produced molecules that were experimentally validated in cells and in mice. AI will have a transformative effect on the pharmaceutical industry, and we need more experimental validation results to accelerate progress,”

Jürgen Schmidhuber, PhD, professor at IDSIA, co-founder of NNAISENSE, and the original inventor of many core techniques and initial concepts in the field of AI.

“Reduction of cycle time and overall cost of goods is critical to the future success of Pharma drug discovery activities. In this paper, Insilico highlight a novel AI based technology (GAN-RL) which allowed them to identify lead molecules with efficacy in animal models in notably short timeframes. If this technology proves broadly useful it may well have transformational potential for future lead generation efforts,”

Stevan Djuric, PhD, Adjunct Professor, School of Pharmacy, High Point University and former Vice President, Discovery Chemistry and Technology, Abbvie.

“In a recent Nature Biotechnology article, Zhavoronkov et al., experimentally demonstrate the utility of their novel GENerative Tensorial Reinforcement Learning (GENTRL) strategy for de novo drug design. In this study, GENTRL was used to design novel compounds against Discoidin Domain Receptor Tyrosine Kinase 1 (DDR1), a pro-inflammatory receptor tyrosine kinase involved in idiopathic pulmonary fibrosis and breast cancer. Of most interest, six DDR1 compounds were designed, synthesized, and experimentally tested all within only 46 days. By coupling advanced deep generative AI models, such as GENTRL with robust causal dependency structure prediction of multi-omics data in drug target discovery studies, we now hold the potential to revolutionize the pharmaceutical industry.”  

Tom Chittenden, PhD, DPhil, PStat, Chief AI Scientist and Founding Director, Advanced AI Research Laboratory, WuXi NextCODE Genomics

“This study is a significant step forward in the field of de novo small molecule design. GAN has been used before for generating new molecules but A. Zhavoronkov and colleagues have developed Generative Tensorial Reinforcement Learning framework where they have shown how GAN can be complemented with reinforcement learning and prioritize regenerated structure using self-organizing maps strategies. Moreover, what amazes me is the timeline within which lead compounds are generated which are both in vivo and in vitro validated. I appreciate Insilico Medicine’s efforts for sharing their code repository to the open-source community, I’m confident this study will open many avenues towards the research activities within AI in drug discovery.”

Gopal Karemore, PhD, Principal Data Scientist, Novo Nordisk

“Deep Knowledge Ventures provided Insilico Medicine’s initial funding round in 2014, and has remained a close advisor in the company’s journey towards becoming a global leader in the application of advanced AI for aging research. Insilico Medicine is one of our most promising portfolio companies, not only in terms of its potential ROI, but also because of its potential impact on serious problems facing humanity. Deep Knowledge Ventures continues to make the AI for Drug Discovery sector a major priority in its strategic agenda, and will soon launch a new subsidiary fund, AI-Pharma, which will use hybrid investment technologies combining the profitability of venture funds with the liquidity of hedge funds, significantly de-risking the interests of LPs and simultaneously providing the best and most promising AI companies with a relevant amount of investment.”

Margaretta Colangelo, Managing Partner, Deep Knowledge Ventures

“Exhilarating news in Nature Biotechnology today, as scientists from Insilico Medicine report that an AI process called GENTRL has facilitated the identification of new small molecule kinase inhibitors, DNA damage response (DDR1) inhibitors, in a two month time frame, reducing the current non-AI early ‘research/preclinical development’ time estimates for new drugs by approximately 94%. The cost savings for biomarker drugs using AI processes is huge. Not only is the end-to-end development time reduced, but so too are the costs related to R&D scientific, professional and technical personnel, which account for approximately 29% of the total cost to develop a drug, according to Tufts CSDD. Since the FDA fast tracks many drugs for serious conditions, there is incredible potential to reduce overall development costs while increasing the speed at which novel drugs can be approved for very sick patients waiting for them. This welcome news comes at a time when soaring costs for drug development arguably are being recouped in high prices of novel innovative therapies hitting the market.”

Barbara Gilmore, Senior Consultant, Transformational Health, Frost & Sullivan

“It is extremely exciting seeing Deep Learning and other techniques being used to help pinpoint drug discovery in a matter of days. In particular, exploiting large, publicly-available data sets to accelerate this process can give huge benefits for low cost. The data-driven approach will give better and faster results than the traditional methods, leading to faster drug discovery and safer, more reliable results than clinical trials on their own. While it’s unlikely that AI will replace the current methods overnight, it’s obvious that organizations which add AI to their methods will quickly replace those who do not. It is vital these organizations ‘Uber’ themselves before they get Kodaked”

David Whewel, former Director of Architecture & Software Innovation, Merck Group

“This is the first time that an AI company has designed a novel drug from scratch, synthesized it and preclinically validated it end-to-end in days rather than years - 15 times faster than the approach used by even the most efficient big pharma players. This is a true game changer, and proves that AI will be the central driver in drug development for years to come.”

Robin Starbuck Farmanfarmaian, author of The Patient as CEO: How Technology Empowers the Healthcare Consumer

“This newest achievement made by Insilico Medicine, a leading AI for drug discovery and longevity company and an official partner of Aging Research at King’s, demonstrates the truly disruptive potential that AI holds in terms of accelerating the pace of progress in drug discovery. Furthermore, this is just the latest step in a much grander agenda of applying AI for aging and longevity R&D, and to the accelerated translation of that research into real-world therapies for human patients. It is also quite notable that the team released the code behind their algorithm in an open-source format, allowing other researchers to apply their techniques and build upon their achievements for the advancement of the entire field of AI for drug design, aging research and longevity”

Richard Siow, PhD, Director of Aging Research at King’s and former Vice-Dean (International), Faculty of Life Sciences & Medicine, King’s College London

“Besides cost savings, vendors need to demonstrate high-quality results that can be measured and compared against standard practices potentially reducing the burden on sponsors. Within the drug discovery space Insilico Medicine is one such successful company that leverages Deep Learning Platform solutions for Drug Repurposing and Biomarker Development. Through their commercial partnerships and peer-reviewed publications the company has clearly demonstrated its strong position. AI is becoming a significant source of competitive advantage and differentiation. Frost & Sullivan finds a moderate level of investment towards appropriate AI products and services for R&D can provide up to 5x-8x returns on investment. For example, deep learning and GANs (Generative adversarial networks) are providing opportunities for reducing the timeline for molecule hit discovery to a matter of weeks when compared to years with the traditional approach. Target validation, compound discovery, and repurposing supported by Deep Learning and Big Data will lead to further advances and recognizable benefits. With advances in Deep neural networks based models, the field of de novo drug design will start to produce truly novel drug candidates.”

Kamaljit Behera, Senior Industry Analyst for Transformational Health, Frost & Sullivan

“As far as I know, this marks the first ever demonstration that AI can generate entirely novel, synthesizable, active molecules against a specific pharmacological target. In my view, the fact that they were able to generate entirely novel, pharmacologically viable compounds using AI is the most amazing achievement here. Of course it’s even more amazing that they established this ground-breaking proof of concept in just 46 days!”

Olivier Elemento, PhD, Director of the Englander Institute for Precision Medicine & Associate Director of the Institute for Computational Biomedicine at Weill Cornell Medicine

“When Deep Knowledge Ventures chose to provide Insilico Medicine’s initial funding round in 2014, we did so because we saw their potential to increase Quality-Adjusted Life Years (QALY) for the betterment of humanity as a whole. Since then they have been the first to use cutting edge deep learning techniques like Generative Adversarial Networks to design novel drug candidates from scratch with specified molecular properties in 2016, and in 2018 to succeed in designing, synthesizing and validating a new drug end to end in less than 2 months. I am also thrilled by the fact that this article visualizes what Insilico Medicine has been making in their R&D already back in 2017 and submitted for publication in 2018. I would not be surprised to find out that since then they have made even greater progress in applying next-generation AI techniques for drug design, which might be publicly disclosed in 2020”

Dmitry Kaminskiy, General Partner, Deep Knowledge Ventures

First published by Margaretta Colangelo at Deep Knowledge Ventures.
