At Accelrys, we have seen the use of predictive sciences applications in the Life Sciences transition from being a tool for experts only to a tool used by all, be it the routine use of physicochemical property predictors, ADME and 'Toxicity' models, the now ubiquitous use of structure-based design (SBD) tools across project teams, or even the application of biophysical property predictions by antibody researchers. No longer are these the preserve of a few computational chemists or computational biologists. So, what drove this change?
Across the industry, a number of common factors recur. The first of these that I would like to call out is the impact of the “patent cliff”. 2012 saw something in the region of $67B USD in drug sales put at risk as major blockbuster drugs came off patent (“Embracing the Patent Cliff” EvaluatePharma, 2012). Furthermore, between 2012 and 2018, it’s been estimated that somewhere in the order of $290B USD of sales could be at risk. That’s a pretty big hit on anyone’s budget! Amongst the numerous effects this has had on the industry has been the increased move to externalized research. But such distributed research structures create new challenges. Can each of the research partners plan, resource and collate the relevant results in time to mine them and make informed team decisions? Indeed, it is not always possible to coordinate experiments and get results returned in time to affect decision making. Hence, design decisions are often made in the absence of all information.
Another factor that cannot go unmentioned is the increased regulation and scrutiny of drug discovery and development by governmental agencies across all major drug markets. To address many of the parameters now required in the optimization of a drug candidate, both in vivo and in vitro screening are now introduced early on in the discovery process to weed out potential issues long before clinical testing. However, with so many more experimental parameters to test for, comes the added problem of resource availability and affordability. In the combinatorial chemistry era, it is simply not cost effective to take every potential compound idea and synthesise, purify, characterise and test. Some reduction is essential, so choosing the right molecules to make and test raises challenges. Which ones to test and which ones to infer?
In the face of the above challenges, it becomes apparent why the use of predictive sciences is becoming more widespread. Compared with experiment, predictive methods are fast and cheap and can be integrated seamlessly into existing decision-making systems. However, at the heart of this move has been the question of accuracy. Arguably, predictive algorithms are now mature enough to help resolve a broad range of R&D challenges. Crucially, the industry has recognised that while individual prediction tools might not always be absolutely precise for an individual molecule, across a series they are often broadly accurate and can be reliably applied as a rank-order tool to separate the ‘good’ from the ‘bad’. This pragmatic approach to their use enables scientists to evaluate many more hypotheses more quickly than is possible with experimentation alone. As a result, predictive science methodologies not only enhance the quality and speed of decision making, they also provide a less expensive and more scalable approach to improving R&D efficiency, especially when deployed on a unified informatics platform.
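To make the rank-order point concrete, here is a minimal Python sketch. The compound values are invented purely for illustration (they do not come from any Accelrys model or dataset), and the helper names average_ranks and spearman_rank_correlation are my own. It compares how far individual predictions are from the measured values with how well the predictions order the series.

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rank_correlation(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / (var_x * var_y) ** 0.5


# Hypothetical property values for a five-compound series (illustrative only).
predicted = [1.2, 2.9, 2.1, 4.8, 3.6]
measured = [0.4, 1.9, 2.2, 3.1, 2.7]

mean_abs_error = sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)
rank_agreement = spearman_rank_correlation(predicted, measured)

print(f"mean absolute error: {mean_abs_error:.2f}")
print(f"Spearman rank correlation: {rank_agreement:.2f}")
```

In this made-up series the individual predictions are noticeably off, yet only one adjacent pair of compounds is ordered differently from experiment, which is the situation in which a predictor remains useful for triage.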
Over the past five years, one of the fastest growing areas of research has been the development of novel biologic therapeutics based on second- and third-generation antibody technologies. These are antibodies that use various engineering and optimization technologies to derive improved clinical properties, such as enhanced antigen binding, effector function, clinical safety, etc. Bolstered by increasing drug approvals from the FDA in recent years [i], combined with improved patent protection rights [ii], biotherapeutics are now widely viewed as one of the most valuable opportunities for the pharmaceutical industry. Indeed, some analyst forecasts are now projecting biotherapeutics will comprise 50% of the top 100 drugs by 2018 [i],[iii].
With such promise and competition in biotherapeutics, efficient development is increasingly an essential pre-requisite. However, their development is non-trivial and includes a number of unique challenges. For example, unlike small molecule drugs, storage and administration of biotherapeutics typically require the use of high-concentration solutions. This requires specific biophysical profiles including solubility, thermal and chemical stability and low aggregation propensity. While it is possible to test for many of these properties and characteristics, the testing is typically time consuming, expensive and dependent both on experimental methods to test and on cell lines to express the biologic material. This has become a major bottleneck for the industry.
Increasingly, the pharmaceutical industry is taking a lesson from the small molecule drug discovery process and applying predictive, or in silico methods, to help identify and optimize the best biologic leads early on. When applied in conjunction with experimental procedures, predictive methods can not only help prioritize the selection of biologics; they can also identify potential undesirable properties long before any material has even been expressed.
One important area where predictive methods are being applied is in protein maturation. For example, understanding the effects of mutation on protein binding affinity can help to guide experiment design and reduce the time spent in maturation. At Accelrys, we’ve recently published a paper detailing a new in silico mutagenesis algorithm that can accurately calculate the effect of mutation on protein-protein binding. Uniquely, our method can do this as a function of pH [iv]. Experimental studies have previously shown that small changes in solution pH can have a significant effect on the binding affinity of protein complexes [v],[vi]. Usefully, this same pH-dependent binding profile can be exploited in antibody engineering to improve the half-life of therapeutic antibodies in serum by increasing their binding to FcRn [vii],[viii].
As part of the validation studies, we considered the experimentally measured pH-dependency of the effect of mutations on the dissociation constants for the complex of turkey ovomucoid third domain (OMTKY3) and proteinase B. The results for the predicted pH-dependent energy profiles demonstrated excellent agreement with the experimental data [Figure 1] (further details are available in the paper). Based on the full validation studies, we believe the method should be reliable enough to be useful for initial screening to find candidate antibodies for further experimental study.
Figure 1. Binding of OMTKY3 inhibitor to proteinase B. pH-dependence of logR, derived from the calculated mutation energies shown in Figure 1A. R is the ratio of the binding constants Ka(Glu18)/Ka(Gln18) (red line), Ka(His18)/Ka(Gln18) (blue line), and Ka(Leu18)/Ka(Gln18) (green line). The triangles, circles, and squares represent the experimental logR values, obtained in the study of pH dependency of OMTKY3 binding to proteinase B [vi].
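For readers who want to see how a mutation energy relates to the logR plotted in Figure 1, the short Python sketch below applies the standard thermodynamic relation between a binding free energy difference and a ratio of association constants, ΔΔG = -RT ln(Ka(mutant)/Ka(reference)). This is only a sketch: the function name, the choice of kcal/mol, the default temperature and the sign convention (negative values meaning tighter binding than the Gln18 reference) are my assumptions, not details taken from the paper.

```python
import math

# Molar gas constant in kcal/(mol*K). Named in full to avoid confusion with
# the ratio R of binding constants used in the figure caption.
GAS_CONSTANT_KCAL_PER_MOL_K = 1.987204e-3


def log_ratio_from_mutation_energy(ddg_kcal_per_mol, temperature_k=298.15):
    """Convert a binding free energy difference into log10(Ka_mutant / Ka_reference).

    Assumes ddG = dG_bind(mutant) - dG_bind(reference), so a negative ddG
    (mutant binds more tightly) gives a positive log ratio.
    """
    rt_ln10 = GAS_CONSTANT_KCAL_PER_MOL_K * temperature_k * math.log(10)
    return -ddg_kcal_per_mol / rt_ln10


# Hypothetical mutation energies in kcal/mol at one pH value (illustrative only).
for label, ddg in [("stabilizing", -1.36), ("neutral", 0.0), ("destabilizing", 2.0)]:
    print(f"{label:>13}: logR = {log_ratio_from_mutation_energy(ddg):+.2f}")
```

At room temperature a change of roughly 1.4 kcal/mol corresponds to about one log unit in the ratio of binding constants, which is a handy rule of thumb when reading curves like those in Figure 1.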
The pH-dependent mutational analysis tools are one example of the new generation of in silico algorithms being developed to address the need to predict biologic physical (biophysical) properties. Today the development and validation of these novel biophysical property prediction methods is arguably one of the most exciting and active areas of innovation in predictive science.
[i]. ‘EvaluatePharma: World Preview 2018 Embracing the Patent Cliff’, EvaluatePharma, 2012.
[ii]. ‘Patient Protection and Affordable Care Act (PPACA)’, Public Law 111‑148, 124 STAT. 119, 111th Congress, 2010.
[iii]. Strohl W.R., Strohl L.M., ‘Therapeutic Antibody Engineering: Current and Future Advances Driving the Strongest Growth Area in the Pharma Industry’, Woodhead Publishing Ltd., 16 Oct., 2012, ISBN-10: 1907568379, ISBN-13: 978-1907568374.
[iv]. Spassov V.Z., Yan, L., ‘pH-Selective mutagenesis of protein-protein interfaces: In Silico design of therapeutic antibodies with prolonged half-life’, Proteins, 2013. DOI: 10.1002/prot.24230
[v]. Schreiber G., Fersht A.R., ‘Interaction of barnase with its polypeptide inhibitor barstar studied by protein engineering’, Biochemistry, 1993, 32(19), 5145–5150. DOI: 10.1021/bi00070a025
[vi]. Qasim M.A., Ranjbar M.R., Wynn R., Anderson S., Laskowski M. Jr, ‘Ionizable P1 residues in serine proteinase inhibitors undergo large pK shifts on complex formation’, J. Biol. Chem., 1995, 270, 27419–27422. DOI: 10.1074/jbc.270.46.27419
[vii]. Roopenian D.C., Akilesh S., ‘FcRn: the neonatal Fc receptor comes of age’, Nat. Rev. Immunol., 2007, 7, 715–725. DOI: 10.1038/nri2155
[viii]. Dall’Acqua W.F., Kiener P.A., Wu H., ‘Properties of human IgG1s engineered for enhanced binding to the neonatal Fc receptor (FcRn)’, J. Biol. Chem., 2006, 281, 23514–23524. DOI: 10.1074/jbc.M604292200
Accelrys’ recent acquisition of Aegis comes at an exciting time for the company, as we continue expanding our portfolio of scientific innovation lifecycle management software into downstream development, quality and manufacturing. As many of you know, we expanded into early manufacturing in 2012 with the purchase of Velquest. As a result of that acquisition, we released last month the Accelrys Process Management and Compliance Suite, an integrated suite of software for enhancing product and process insight, facilitating collaboration and streamlining product development from research through late-stage quality control and manufacturing.
Aegis’ software will be added to the suite, giving quality and manufacturing organizations the ability to access, aggregate, contextualize and analyze manufacturing, quality and product development data. In other words, they’ll be able to gain predictive control of their processes. We know this is critically important to our customers.
Aegis also plays a key role as the development of biologics continues to grow in pharmaceutical drug portfolios. For biologics manufacturers, process understanding ensures predictable product outcomes, something that has been difficult to achieve with large molecule development and manufacturing.
On a daily basis – whether from big pharma or biotech, consumer goods or chemicals companies – we hear of our customers’ struggles not just to visualize information but to understand what’s happening with their processes throughout the product development lifecycle. This is especially true in the context of externalization. Today’s science-driven organizations manage and collaborate with multiple partners scattered across all corners of the world. Without insight into their operations and a true understanding of their processes throughout the scientific innovation lifecycle, these organizations are often flying blind. As a result, cycle times are delayed, product quality is threatened, and margins are squeezed.
We believe that a holistic approach to product development – from lab to commercialization – is imperative. It’s a business issue, a global competitiveness issue, an innovation issue. The addition of Aegis’ top-notch team and industry-leading enterprise process intelligence capabilities gives Accelrys – and our customers – an important new tool for managing the scientific innovation lifecycle and solving today’s common, costly and complex business challenges.
The recently released Accelrys Draw 4.0 addresses a key problem in working with biologics--how to represent them so that they can be databased and later searched by scientists. Check out the video below to see a demo I gave on the new features in Accelrys Draw, which can recognize the structural information underlying UniProt files.
Compared to chemical data, biological data has always been harder to manage electronically. But what exactly makes it so problematic? The major hurdle has to do with vocabulary and definitions: what exactly constitutes a unique therapeutic biomolecule and how to represent it so that it can be stored and searched electronically.
Before you can set up a system that can reliably search for and retrieve a unique substance, you first need to define and describe what you’ve made. And this is harder than it should be. Scientists and research managers I’ve spoken to acknowledge that even within the same organization, different departments use different words to describe the same thing. These “dialects” make it particularly difficult to deal with biologics, which are very often defined and characterized by what they aren’t or how they are made. A sequence of interest may be known and understood by its genealogy with respect to parent sequences or the steps and techniques used to make it. Shaken, not stirred doesn’t just apply to 007—the way a biologic is made and handled is often as important as the specific sequence itself.
Standardized vocabularies for specific modifications and processes can help organizations build consistent recipes to describe the biomolecules they are making. And another thing that will help is the ability to describe electronically the actual, underlying structure of biomolecular components. Chemical nomenclature, standardized through IUPAC and other conventions, provides a “thing” that can be represented and stored and, most importantly, retrieved and linked to associated information. But life scientists have lacked a consistent way to represent biologics, which has led to ambiguous shorthand that can be particularly confusing when it comes to hybrid structures containing both amino acids and nucleotides.
Consider the sequence “GGG.” Any of those Gs could refer to guanine, the nucleic acid building block; glycine, the amino acid; or even guanosine, the nucleoside. There’s no ambiguity at all, though, if the shorthand description simply stands in for the underlying chemical structures for each G, stored electronically and visible with a simple mouse over. Here’s what that looks like:
Such a system could also help with some dialect issues. For example, if sulfur and sulphur are both linked to the element 16 [S], then searches based on S find the expected hits no matter how the element is named.
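The idea in the last two paragraphs can be sketched in a few lines of Python. To be clear, this is not how Accelrys Draw stores sequences; the Monomer class, the lookup tables and the polymer-type labels are hypothetical names chosen for illustration, and a molecular formula stands in here for the full chemical structure a real system would hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monomer:
    """A building block that a one-letter shorthand symbol stands in for."""
    name: str
    kind: str
    formula: str  # stand-in for a full chemical structure record


# The same letter resolves to different chemistry depending on its context.
# The context labels ("peptide", "base", "nucleoside") are illustrative.
MONOMERS = {
    ("peptide", "G"): Monomer("glycine", "amino acid", "C2H5NO2"),
    ("base", "G"): Monomer("guanine", "nucleobase", "C5H5N5O"),
    ("nucleoside", "G"): Monomer("guanosine", "nucleoside", "C10H13N5O5"),
}

# Different spellings ("dialects") map onto one canonical element symbol.
ELEMENT_SYNONYMS = {"sulfur": "S", "sulphur": "S"}


def resolve_sequence(sequence, context):
    """Replace each shorthand letter with the monomer it denotes in this context."""
    return [MONOMERS[(context, symbol)] for symbol in sequence]


def element_symbol(name):
    """Look up the canonical element symbol for any known spelling."""
    return ELEMENT_SYNONYMS[name.strip().lower()]


for context in ("peptide", "base", "nucleoside"):
    first = resolve_sequence("GGG", context)[0]
    print(f"GGG as {context}: {first.name} ({first.formula})")

print(element_symbol("Sulphur") == element_symbol("sulfur"))
```

Because searches run against the resolved monomers rather than the letters, a query for glycine never returns a guanine-containing record, and a search on S finds entries however the element was spelled.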
Scientists know that a UniProt sequence is more than a list of letters. But for that meaning to be not just useful but also able to protect IP in an electronic environment, that list of letters must be linked inextricably to the underlying chemistry. Only then can the sequence be stored, searched, and modified to uniquely describe biomolecular components.
We think this approach goes a long way toward solving some of the challenges associated with representing biologics, but there’s clearly a lot to consider. What issues have you encountered in trying to describe, store, and search biologics electronically?
This fall’s ACS meeting in Boston seemed lower key than the spring meeting in San Francisco. But maybe it was just that I was insanely busy with Chemical Information Division (CINF) activities, as my term as chair of the division ended at this meeting. I was really thrilled to be able to present Tony (Anton J. Hopfinger) with the 2010 Herman Skolnik award. I do not have a computational chemistry background, but I worked with Tony on my first MDL business development project, and his reasonable, fair, and generous nature left me with a long-term respect and affection for him.
I also quite enjoyed the speaker we had at the CINF luncheon. Michael Capuzzo, former Philadelphia Inquirer reporter, joined us to talk about his most recent book, The Murder Room: The Heirs of Sherlock Holmes Gather to Solve the World's Most Perplexing Cold Cases. While, yes, he doesn’t work directly in cheminformatics, Capuzzo painted a picture of information gathering and relationship extraction that mapped very closely to the challenges we information professionals face every day.
Capuzzo noted that over meals not unlike the one we had shared prior to his talk, a team of dedicated and skilled detectives known as the Vidocq Society pores over gruesome crime scene photos of corpses and cannibalism and detailed police investigative reports to try to bring serial killers to justice. Good thing we had finished eating before Michael took to the podium! Michael explained that the Vidocq Society has handled hundreds of cases over the years and made significant contributions leading to new arrests and/or exonerations. Michael noted there is still a big difference between knowing who committed the crime and being able to secure justice, so the Vidocq Society is very strict about only getting involved if family members and the local law enforcement invite them to participate.
My colleague Keith Taylor participated in a session dedicated to chemical representation, which Wendy Warr reviewed recently. The session showed that work in this area is by no means finished. One talk covered the recent IUPAC recommendations for chemical structure representation and another covered advances with InChI—both are true community projects that bring together vendors, academics, and publishers. I’ll admit I held my breath when Keith, who spoke on our new flexible sequence representation, launched a live demo—but it worked splendidly, really showing off the flexibility and power of this new approach. You can view Keith’s slides below.
Finally, the ACS exhibition allowed the Accelrys content group to showcase all the database content that is now part of the Accelrys product family. We continue to focus on chemical sourcing, reactions, and bioactivity information, offering multiple in-house and hosted options. We also demoed the All New DiscoveryGate at ACS and are looking for beta testers to come on board as we begin adding reaction content to the sourcing data currently delivered through that system. If you’d like to participate, please email firstname.lastname@example.org or comment on this post.
One of the most talked about scientific publications this year described the creation of a bacterial cell controlled by a synthetic genome (Gibson et al., 2010). The team that conducted the research, based at the Craig Venter Institute, synthesized the modified genome sequence of one species of Mycoplasma (about one million base pairs) and successfully transplanted it into the cell of another Mycoplasma species. The phenotype of the resulting bacteria was exactly as expected and the cells were shown to be capable of self-reproduction.
The paper sparked a series of debates about the significance of this particular experimental strategy and to what extent it constituted the creation of synthetic life. Additionally, and of more practical interest, the efficiency of this de novo strategy was also widely debated. While there is relatively little dispute about the potential value of engineering new biological systems, the work fuelled an ongoing controversy about the efficiency of such an empirical approach to creating a new biological system compared to more conventional strategies that modify existing genomes in their native cellular environments.
But the study also emphasized some more specific challenges beginning to emerge as genome sequencing establishes itself as a mainstream tool in the Life Sciences. Of the many technical obstacles that the team overcame in creating the new cell, one of the most surprising was the impact of seemingly trivial error rates on the successful creation of the synthetic genome sequence. A single base pair deletion in the dnaA gene, involved in chromosome replication, rendered the transplanted cells unviable. The failure to identify this error during quality control sequencing of the synthetic genome significantly delayed the completion of the project. As soon as this one-in-a-million error was identified and rectified in the synthetic sequence, the team was able to successfully recover viable cells.
In this case, the significance of the sequencing error in determining the synthesized genome sequence was obvious, as it resulted in an effectively lethal genotype. However, more generally, it emphasizes the critical need for highly accurate sequence determination. As the pharmaceutical industry increasingly relies on genome sequence data as a foundation of personalized medicine, the accuracy of the genotypic data collected on a large scale will come sharply into focus. Being able to determine sequences with 100% accuracy, particularly in non-coding regions of genomes, may become a challenging pre-requisite to personalized therapeutic strategies.
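A little arithmetic shows why seemingly trivial error rates matter at genome scale. The Python sketch below is a back-of-the-envelope model, not a description of how the synthetic genome was actually checked: it assumes errors occur independently at a fixed per-base rate, and the function names and example rates are my own.

```python
import math


def expected_errors(length_bp, per_base_error_rate):
    """Expected number of erroneous bases in a sequence of the given length."""
    return length_bp * per_base_error_rate


def probability_error_free(length_bp, per_base_error_rate):
    """Probability that every base is correct, assuming independent errors."""
    # log1p keeps the calculation accurate when the error rate is very small.
    return math.exp(length_bp * math.log1p(-per_base_error_rate))


genome_length = 1_000_000  # roughly the size of the Mycoplasma genome discussed above

# Illustrative per-base error rates, from one in ten thousand to one in a hundred million.
for rate in (1e-4, 1e-6, 1e-8):
    print(
        f"error rate {rate:.0e}: "
        f"expected errors = {expected_errors(genome_length, rate):g}, "
        f"P(error-free) = {probability_error_free(genome_length, rate):.4f}"
    )
```

Under these assumptions, even a one-in-a-million per-base error rate leaves well under an even chance that a million-base genome is entirely correct, which is why downstream quality control of the assembled sequence is so important.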
Accelrys is actively working on products, such as the Next Generation Sequencing (NGS) component collection and a biological registration system that can play a critical role in quality control of biological data. In the case of the NGS collection, alongside the integration of the latest mapping and assembly algorithms, the core data pipelining capabilities of the Pipeline Pilot platform make it possible for scientists to develop quality control pipelines without any programming knowledge. As applications of genome sequencing such as synthetic biology and personalized medicine progress, such computational approaches to quality control will play an increasingly central role.
Recently I was reading a very interesting article on the rise of Roundup-resistant weeds. As a person who likes corn-based products, I find this quite important. However, unlike my recent posts on biochemistry and the development of bio-crude and biomass conversion systems, this one is about the convergence of chemistry, environmental fate and toxicology, and genetic biology.
What I mean is, as the threat from weeds resistant to current-generation herbicides grows on a large scale, some farmers are predicting cotton and soy crops with 30%+ weed content in less than five years. This means that, just to sustain our current food production levels and efficiencies, we will have to develop both new compounds and herbicide-resistant variants of older compounds and gene lines. This "older is new again" answer to sustaining our food pipeline is somewhat ironic, but it is clear that all the major agrochemical companies are approaching this and that the next generation choice of chemicals will be governed by the development of seed lines that are tolerant to the materials in use.
Interestingly, this also creates potential captive farmers who gear the agricultural business to a specific seed class and so are locked into a specific herbicide regime and process. What is clear to me is that the need to analyze biological entities alongside chemical, genomic, environmental and other complex data types will require a very flexible, extensive data platform and a unique type of registration engine.
Accelrys is exhibiting at booth #536 at the TechConnect expo & conference in Anaheim and the conference is teeming with researchers in Nanotech, Microtech, Cleantech and BioNanotech areas. With separate tracks in these domains, scientists and leaders in various areas of chemistry, physics and biology have gathered to network, share and create partnerships for innovation and breakthrough in technology.
One of the invited talks stressed how crucial it is to use masterbatch products for maximizing performance of carbon nanostructure composite materials. The speaker showed critical differences in the extent of dispersion in such products. Another presentation dealt with nanotechnology applied in drugs and biologics delivery. Other talks were about formulation, imaging agents, and carbon nanotube ink technology. Feeling at home in the world of nanotech… signing off for now.
The inherently complex nature of cell lines, plasmids, proteins, antibodies and vaccines makes a biological registration system challenging. Yet such systems are needed so that researchers and companies can track these entities and their relationships, creating critical intellectual property positions as well as connections to past research and manufacturing processes.
Patterned on the services of registration systems for chemical entities, which are well-known and entrenched in the drug discovery process, the Accelrys Biological Registration system is an "intelligent" solution for registering, associating, searching and retrieving data for entities such as siRNA, plasmids, cell lines, proteins, antibodies, vaccines and future biological entities.
Join us on Wednesday, May 26 for our live webinar, “Intro to Accelrys Biological Registration,” the first in a series on biological registration. To register or learn more, please click here.
Accelrys has announced the commercial availability of the award-winning Accelrys Biological Registration, the world's first multi-entity, fully-integrated, flexible and extensible database for biological entities. The enterprise-scalable system supports eight major biological entities: Yeast, Cell Lines, DNA, Protein, Plasmid, Vaccine, Antibody and siRNA.
Although registration has been a standard in small molecule research for many years, it is relatively new to biological sciences. With the advent of Accelrys Biological Registration, companies with innovation in biology can now:
Implement and automate the process of biological entity registration
Capture, secure and protect corporate intellectual property
Search, retrieve and utilize biological information across the enterprise
Enable scientists to collaborate and share biological information
Increase operational efficiencies and reduce costs through registering biological entities
Reduce the risks associated with failing to register biological entities
This product was developed in a pre-competitive environment with several of the world's leading biopharmaceutical companies, including Abbott Laboratories and Merck & Co., Inc. With the release of Accelrys Biological Registration, we are further demonstrating our commitment to ongoing innovation in the scientific informatics market.
We invite you to learn more about Accelrys Biological Registration:
Register for our Accelrys Biological Registration webinar series
Biology is undergoing a revolution and is becoming a more analytical science with the advent of omics, high content screening, next generation sequencing, and other methods. These methods lead to a more in-depth understanding of systems biology and the discovery of new biomarkers. This greater understanding can be used to fill in our knowledge about pathways, to the point of building mathematical models of the multiple processes involved in any response to stimuli. All of this taken together should increase the odds of success by providing better information on which to base decisions.
The other area in biology that has great opportunities is in the use of biologics as drug entities. These drug entities vary in complexity and include antibodies, vaccines, siRNA, etc. Their value to the marketplace is in the hundreds of billions of dollars, and intellectual property (IP) protection is essential. Some of the largest patent infringement awards ever made have involved biologics.
In the process of building out these analytical biology systems, and biologics as drug entities, there are many biological innovations and inventions. Examples include new stem cell lines, antibody generation as a tool or as a drug entity, plasmids, algae strains, etc. There is also a lot more inventory, reagents and data to track today than ever before. The best way to track data across multiple sources is the use of a consistent and meaningful key. The way this is handled in the chemical space is to use a registration system to uniquely identify an entity and give the entity a unique integer that represents the entity in every data system. Until recently, the biologist might track a bar code for a cell line or antibody in their notebook, and a possible location for this entity in a simple, lab-based inventory system. However, this type of system only tells the same researcher where the cell line is, not uniquely what it is. In order for the entire company to benefit from the inventory, and protect their IP, there is the need to describe the biological entity uniquely. This is a rather new concept for biologists, and one that needs to be carefully considered moving forward to better protect IP, manage expensive reagents, implement safety systems and, most importantly, ensure the query and aggregation of data. All of this has been implemented in chemistry and shown to be of great value; now it is biology’s turn.
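The core of the registration idea, one unique integer per unique entity, can be sketched as follows. This is a deliberately simplified illustration and not the design of Accelrys Biological Registration: the EntityRegistry class and its methods are hypothetical, and uniqueness is judged here only on entity type plus a normalized sequence, whereas a real system would also have to consider modifications, lineage and how the entity was made.

```python
import itertools


class EntityRegistry:
    """Assigns one stable integer ID to each unique biological entity."""

    def __init__(self):
        self._id_by_key = {}
        self._key_by_id = {}
        self._next_id = itertools.count(1)

    @staticmethod
    def _canonical_key(entity_type, sequence):
        # Normalize case and whitespace so trivially different spellings
        # of the same entity produce the same key.
        return (entity_type.strip().lower(), "".join(sequence.split()).upper())

    def register(self, entity_type, sequence):
        """Return the existing ID for a known entity, or assign a new one."""
        key = self._canonical_key(entity_type, sequence)
        if key not in self._id_by_key:
            new_id = next(self._next_id)
            self._id_by_key[key] = new_id
            self._key_by_id[new_id] = key
        return self._id_by_key[key]

    def describe(self, entity_id):
        """Answer 'what is it?' for an ID, rather than just 'where is it?'."""
        return self._key_by_id[entity_id]


registry = EntityRegistry()
first = registry.register("Protein", "mktay iak")
again = registry.register("protein ", "MKTAYIAK")  # same entity, different notation
other = registry.register("DNA", "ATGAAAACC")

print(first, again, other)
print(registry.describe(first))
```

Two departments that write the same protein differently receive the same ID, so inventory, assay results and notebook entries keyed on that ID can be aggregated across the company.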
Does your organization have a Biological Registration system? How could such a system add value to your organization?
Learn how Accelrys is on the forefront of scientific innovation by being one of the first to preview our BioIT World award-winning application, Accelrys Biologics Registration. Developed with leading pharmaceutical companies, the application was designed to address the challenges posed by the dynamic nature of biological entities.
Get a glimpse into the latest release of Pipeline Pilot and the Imaging Collection; or hear how Accelrys products are being used to address next-generation sequencing analysis challenges by attending “Pipelining Your Next Generation Sequencing Data,” on Wednesday, April 21, 12:00pm in Track 3.
Visit us at booth #301-303 to learn more about our leading scientific informatics solutions.
Accelrys is the official Twitter sponsor for BioIT World Conference & Expo ’10. Follow us (#BioIT10) for your chance to win an Apple iPad.
Accelrys has recently concluded a series of meetings with a specially convened Biological Registration Special Interest Group (SIG), formed between several major pharmaceutical companies and Accelrys. The objective of this forum was to understand some of the critical market and product requirements needed in order to build a state-of-the-art Biologics Registration system.
The success of the SIG can be attributed to the customer members being very open towards one another, in spite of being competitors, and the tremendous diligence each company put into specifying user requirements. This open and collaborative approach to software development has become an innovative way to introduce first-of-a-kind technology into the market.
First-of-a-kind software is usually developed as a bespoke project for a single company and then modified over time to meet the needs of the wider market. This can create disadvantages for early adopters as the product functionality evolves and improves with subsequent releases. This situation can be avoided by getting a wider set of requirements through a collaborative SIG formed of a diverse and representative sample of interested parties.
The ability to capture and prioritize a wider set of requirements through leading companies discussing and debating the relative merits and benefits of proposed features, is a more efficient and effective way of understanding market requirements than more traditional methods. The approach also enables the development team to capture feedback and more rapidly create a product that should be attractive to the wider market. The anticipated result is the timely delivery of a product that is well positioned to capture both broad interest and market share.
Have you innovated through collaborative work groups? If so, we would welcome the chance to learn from your experience.
Why do we view life in 3D rather than 2D? I am not thinking fundamental physics here (which may consider 12D or more), rather what advantages do we have in perceiving chemical and biological molecules as more than two dimensional entities? Are there any benefits to viewing synthetic schemes and reaction mechanisms as more than two dimensional? Likewise, can regarding proteins/DNA/RNA as more than sequences and collections of secondary structures bring us additional insights?
There is an argument to be made that the intersection of biology and chemistry is best viewed in 3D. Just ask your local crystallographer! Life really does happen in 3D. When trying to understand how simple molecules interact with their complex targets, three-dimensional visualization can be indispensable! Nearly 60,000 (and growing) PDB structures can’t be wrong.
There is no consensus on how “best” to view molecular structures or complexes and therefore, a number of solutions are available. Everyone has their personal preference - I find that the DSVisualizer provides all the flexibility and customizability I would like … and it’s free to everyone. That said, I would love to know what your personal favorite visualizer is and why? Vote for your personal favorite over in the right sidebar, and explain why below in the comments.