
Note pubs.acs.org/jnp

Development of a Natural Products Database from the Biodiversity of Brazil

Marilia Valli,† Ricardo N. dos Santos,‡ Leandro D. Figueira,† Cíntia H. Nakajima,† Ian Castro-Gamboa,† Adriano D. Andricopulo,‡ and Vanderlan S. Bolzani*,†

† Núcleo de Bioensaios, Biossíntese e Ecofisiologia de Produtos Naturais (NuBBE), Departamento de Química Orgânica, Instituto de Química, UNESP - Univ. Estadual Paulista, 14801-970, Araraquara-SP, Brazil
‡ Laboratório de Química Medicinal e Computacional (LQMC), Instituto de Física de São Carlos, Universidade de São Paulo (USP), 13560-970, São Carlos-SP, Brazil

ABSTRACT: We describe herein the design and development of an innovative tool called the NuBBE database (NuBBEDB), a new Web-based database, which incorporates several classes of secondary metabolites and derivatives from the biodiversity of Brazil. This natural product database incorporates botanical, chemical, pharmacological, and toxicological compound information. The NuBBEDB provides specialized information to the worldwide scientific community and can serve as a useful tool for studies on the multidisciplinary interfaces related to chemistry and biology, including virtual screening, dereplication, metabolomics, and medicinal chemistry. The NuBBEDB site is at http://nubbe.iq.unesp.br/nubbeDB.html.

Natural products have been a wonderful source of inspiration for the design and development of new drugs.1−6 An inspection of drug approvals reveals that approximately 64% of all drugs considered had a natural product involved in their development.6 This source includes unmodified natural products, natural product derivatives, and drugs whose design was inspired by a natural product. The unique chemical diversity of secondary metabolites is one of the reasons for the continued scientific interest in natural products, and they continue to be an especially important source of new ideas in regard to chemical structure.7,8 Therefore, the availability of natural compound libraries is of significant importance for in vitro and in silico screening in drug discovery.9

Brazil possesses an extremely rich biodiversity, accounting for approximately 20% of all known living species globally, which are found in several important biomes, such as the Amazonian and the Atlantic forest regions.10 There are several research groups in Brazil that focus on exploring this rich biodiversity rationally. One of these is the Nuclei of Bioassays, Biosynthesis and Ecophysiology of Natural Products (NuBBE) research group, which has been involved in the latest advances in natural product chemistry, including the search for biologically active compounds from plants of the Cerrado, the Atlantic forest, and plant endophytic fungi.10−12 Furthermore, in 1999, NuBBE was one of the first Brazilian natural products chemistry research groups involved in the creation of the Virtual Institute of Biodiversity, BIOTA-FAPESP, an ongoing successful program in the state of São Paulo, Brazil, now a recognized worldwide biodiversity program and a useful example of how conservation initiatives with a solid scientific basis can be achieved.12 Notable NuBBE research-related compounds include the Casearia sylvestris-derived cytotoxic clerodane diterpene casearin X,13 the anxiolytic Erythrina alkaloid (+)-erythravine from the medicinal plant Erythrina mulungu,14 and (−)-spectaline, a rare piperidine alkaloid from Senna spectabilis, a valuable raw material for a semisynthetically derived acetylcholinesterase inhibitor,15 utilized as a model in our recent studies16 (Figure 1).

© 2013 American Chemical Society and American Society of Pharmacognosy

Figure 1. Examples of biologically active natural products of Brazilian biomes.

Special Issue: Special Issue in Honor of Lester A. Mitscher
Received: October 3, 2012
Published: January 18, 2013

dx.doi.org/10.1021/np3006875 | J. Nat. Prod. 2013, 76, 439−444

Journal of Natural Products

Note

Figure 2. (A) NuBBEDB search tool. (B) Molecular drawing interface. (C) Search results. (D) Compound description.

Information on Brazilian biodiversity collected over the years and generated by modern methodologies, with an emphasis on chemical data, is fragmented and, thus, very difficult to access readily. Owing to the absence of any database of secondary metabolites from the Brazilian biodiversity, herein is described the design and development of a Web-based, freely available, and easy to access database of natural products and their derivatives from the Brazilian biodiversity called the NuBBE database (NuBBEDB). The NuBBEDB is the result of an effective collaborative project between the NuBBE group and the Laboratory of Computational and Medicinal Chemistry (LQMC, USP−São Carlos), which has experience in the development of innovative databases, such as PK/DB, a robust tool for pharmacokinetic studies and in silico ADME predictive models.17 The scientific community may benefit from this specialized tool in studies of natural products and medicinal chemistry, including dereplication, metabolomics, virtual screening, and the design of new biologically active compounds. Additionally, the NuBBEDB is a valuable starting point for cataloging and accessing all Brazilian biodiversity information.

The data published by the NuBBE research group, a total of 170 scientific papers containing information on pure compounds, were analyzed, and, to date, the NuBBEDB features a total of 640 compounds, which will be continually updated as new information becomes available. The database can be freely accessed via a user-friendly Web interface at http://nubbe.iq.unesp.br/nubbeDB.html.18 The NuBBEDB includes a variety of information for each compound including chemical class and name; code, molecular formula, and mass; and source. Whenever available, biological,


Figure 3. Distribution of properties of the NuBBEDB compounds: (A) molecular mass, (B) molecular volume, (C) cLogP, (D) TPSA, (E) hydrogen-bond donors, (F) hydrogen-bond acceptors, (G) nRotb, (H) Lipinski’s “rule of five”.

pharmacological, and toxicological information (qualitative and quantitative) is also included, with the corresponding references. Other molecular and physicochemical properties provided are the number of rotatable bonds (nRotb), the calculated octanol/water partition coefficient (cLogP), the number of hydrogen-bond donors, the number of hydrogen-bond acceptors, the number of Lipinski’s “rule of five” violations, topological polar surface area (TPSA), and molecular volume. Chemical structures are represented by a SMILES string, and the 3D conformations are available in Mol2 file format (widely employed in molecular modeling studies).19−22 To search compounds in the database, a Web-based search tool is available, incorporating a molecular drawing interface enabling the user to search for compounds by property, chemical structure, or a combination of criteria (Figure 2A and


Figure 4. Physicochemical property scatter plots of the compounds presented in NuBBEDB. (A) cLogP versus molecular mass, (B) TPSA versus molecular mass, (C) nRotb versus molecular mass, (D) hydrogen-bond acceptors versus hydrogen-bond donors.

B, respectively). The results are displayed in the same Web session (Figure 2C), which shows the main properties, the chemical structure of the compound, a link to download the 3D structure in Mol2 file format, and a table of information for each compound in PDF format (or for all compounds in one click). Clicking a compound displays all of its information (Figure 2D). The Web system was designed to allow the scientific community to search, browse, and download molecules, providing a rapid response to specific queries.

To facilitate the analysis of the database, the compounds were grouped by acquisition source: 80% of the compounds were isolated from plants, 7% are semisynthetic, 6% were isolated from fungi/microorganisms, 5% are synthetic (inspired by a natural product), and 2% are products of biotransformation using a plant or fungal extract. The next step of this work, and one of the most striking prospects created by this database, is to add as much information as possible on the secondary metabolites isolated from species of other representative Brazilian ecosystems, including compounds from marine organisms, which have not been studied in the same depth as terrestrial organisms.

The chemical diversity is rather large, with compounds belonging to several different chemical classes, including terpenoids (160), alkaloids (135), flavonoids (80), iridoids (50), lignans (40), phenylpropanoids (34), benzoic acid derivatives (30), pyrans (25), chromanes/enes (22), and other classes (64). The pharmacological properties of the isolated compounds are also diverse, comprising acetylcholinesterase inhibitors, antiangiogenic, antibacterial, antifungal, antioxidant, antitrypanosomal, antiulcerogenic, anxiolytic, and cytotoxic agents, and protease inhibitors. In light of these numbers, it is evident that NuBBEDB

contains mostly natural products, representing compounds with unique and complex chemical scaffolds. The compounds were primarily isolated from the Amazon (Amazônia), Atlantic Rainforest (Mata Atlântica), or Cerrado biomes of Brazil. The source plants were collected using rational approaches including ethnopharmacology and chemosystematics.

Molecular mass, the number of hydrogen-bond donors and acceptors, cLogP, nRotb, and TPSA are useful descriptors for predicting “drug-likeness” of small molecules.23−27 As shown in Figure 3A and B, molecular mass and molecular volume have very similar distributions, as expected. The average molecular mass of the set of compounds is 386.3. The cLogP values, −4 < cLogP < 9, follow an approximately Gaussian distribution and incorporate both highly hydrophilic and hydrophobic compounds (Figure 3C). The mean cLogP value for the set is 2.98 (Figure 3C). TPSA is a suitable parameter for predicting fractional absorption, where compounds with a TPSA > 140 Å² are likely to have poor absorption characteristics.28 The TPSA distribution of NuBBEDB compounds has a peak at values in the range from 50 to 100 Å² (Figure 3D). Remarkably, the fractions of compounds with no more than 10 hydrogen-bond acceptors (529 compounds) and no more than five hydrogen-bond donors (539 compounds) are both above 80% (approximately 83% and 84% of the 640 compounds, respectively; Figure 3E and F), in line with the hydrogen-bond donor and acceptor limits proposed by Lipinski for drug-like compounds.23 The nRotb is a useful parameter for evaluating molecular flexibility and oral bioavailability of drugs. Compounds with more than 10 rotatable bonds (Rotb) have been associated with decreased oral bioavailability.25 The majority of the database compounds (67%) have up to six Rotb (Figure 3G). The distribution drastically decreases for higher values, but extends to account for very flexible compounds possessing up to 24 Rotb. The mean nRotb is 5.8.

Lipinski’s “rule of five” is one of the most common sets of parameters used to evaluate “drug-likeness”.23,25,28 Altogether 485 compounds (75% of the database) violate fewer than two of Lipinski’s four parameters (Figure 3H). Taking into account the considered properties, it is clear that the database is chemically rich and highly diverse in its contents. It also presents a very interesting profile for the evaluation and identification of bioactive compounds for drug design studies.

To scrutinize the correlations between the properties of the compounds provided in NuBBEDB, scatter plots of the different properties were generated. The lack of correlation observed in Figure 4A between molecular mass and cLogP suggests the high chemical diversity of the NuBBE database. Figure 4B shows that the database has representative compounds of both high and low polar character for a specific molecular size. Figure 4C shows that the database possesses large compounds with either high or low flexibility. The scatter plot for the number of hydrogen-bond donors and acceptors (Figure 4D) indicates a direct correlation between these two properties.

To assess its main characteristics, the newly created database was compared with three distinct data sets of the ZINC database:29,30 “drug-like”, “lead-like”, and “fragment-like”.
These data sets are widely used in a series of studies in drug design.31,32 “Drug-like” is defined by ZINC as those compounds having a molecular mass between 150 and 500, cLogP less than or equal to five, nRotb less than eight, TPSA less than 150, and number of hydrogen-bond acceptors less than or equal to 10.29,33 “Lead-like” refers to those compounds having a molecular mass between 250 and 350, cLogP between 2.5 and 3.5, and between five and seven Rotb.29,31 Finally, “fragment-like” identifies those compounds having a molecular mass less than or equal to 250, cLogP less than or equal to 3.5, and less than or equal to five Rotb.33,34 We filtered NuBBEDB with the properties defined by each of these subsets separately and obtained 311 “drug-like”, 11 “lead-like”, and 103 “fragment-like” results, indicating that the NuBBE database contains compounds with primarily “drug-like” characteristics. Without the compilation of the data into NuBBEDB, it would not be possible to access and understand these properties.

In conclusion, the NuBBEDB is an innovative database of compounds from species of the Brazilian biodiversity, especially from the two main biomes, Cerrado and Atlantic Forest. The compounds of the database comprise rich chemical diversity and a wide spectrum of biological and pharmacological activities. This unique information is now available to support drug discovery projects, connecting chemistry and biology. We hope that the NuBBEDB will be useful to the scientific community for studies involving virtual screening, dereplication, metabolomics, and the design of new bioactive compounds. This useful database is an effort to reinforce collaboration between researchers from the areas of natural products, synthesis, chemoinformatics, and medicinal chemistry.
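The property filters quoted above can be illustrated with a minimal Python sketch. It is not the code used to build NuBBEDB. The `Compound` record, the function names, and the three example compounds are hypothetical, and treating "between" as inclusive at both ends is an assumption, since the text does not say how boundary values were handled. The rule-of-five limits used here are the standard ones (molecular mass ≤ 500, cLogP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Compound:
    """Hypothetical record holding the descriptors discussed in the text."""
    name: str
    mass: float   # molecular mass
    clogp: float  # calculated octanol/water partition coefficient
    tpsa: float   # topological polar surface area
    hbd: int      # number of hydrogen-bond donors
    hba: int      # number of hydrogen-bond acceptors
    nrotb: int    # number of rotatable bonds


def lipinski_violations(c: Compound) -> int:
    """Count how many of Lipinski's four limits the compound exceeds."""
    return sum([c.mass > 500, c.clogp > 5, c.hbd > 5, c.hba > 10])


def is_drug_like(c: Compound) -> bool:
    """'Drug-like' subset as quoted in the text (inclusive mass bounds assumed)."""
    return (150 <= c.mass <= 500 and c.clogp <= 5 and c.nrotb < 8
            and c.tpsa < 150 and c.hba <= 10)


def is_lead_like(c: Compound) -> bool:
    """'Lead-like' subset as quoted in the text (inclusive bounds assumed)."""
    return 250 <= c.mass <= 350 and 2.5 <= c.clogp <= 3.5 and 5 <= c.nrotb <= 7


def is_fragment_like(c: Compound) -> bool:
    """'Fragment-like' subset as quoted in the text."""
    return c.mass <= 250 and c.clogp <= 3.5 and c.nrotb <= 5


def zinc_subsets(c: Compound) -> List[str]:
    """Return the names of every subset whose filter the compound passes."""
    checks = [("drug-like", is_drug_like),
              ("lead-like", is_lead_like),
              ("fragment-like", is_fragment_like)]
    return [label for label, passes in checks if passes(c)]


if __name__ == "__main__":
    # Illustrative values only; these are not NuBBEDB entries.
    examples = [
        Compound("hypothetical-1", 300.0, 3.0, 60.0, 2, 4, 6),
        Compound("hypothetical-2", 650.0, 6.2, 180.0, 6, 12, 12),
        Compound("hypothetical-3", 180.0, 1.2, 40.0, 1, 3, 2),
    ]
    for c in examples:
        print(c.name, lipinski_violations(c), zinc_subsets(c))
```

The subsets are applied independently, as in the text, so one compound can pass more than one filter; the reported counts of 311, 11, and 103 need not be disjoint.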



EXPERIMENTAL SECTION

Molecular Structure Preparation. The 3D chemical structure of each compound was generated using the standard tools available in the molecular modeling software package Sybyl 8.0 (Tripos, St. Louis, MO, USA), running on Red Hat Enterprise Linux workstations. The hybridization of every atom was verified using the Sybyl “Atom Types” option. All molecules were considered to have a neutral charge. The single 3D representation of each molecule from the database had its conformational energy minimized using the Tripos force field and Powell’s method.35 Partial atomic charges of the minimized structures were calculated using the Gasteiger−Hückel method, also available in the Sybyl 8.0 package. During these steps, the molecules were considered to be in an implicit aqueous environment (dielectric constant of 80.0). Finally, the 3D conformation of each molecule is available in the database in Mol2 file format.

Physicochemical Property Determination. The physicochemical properties of all molecules in NuBBEDB were predicted using the Web Property Calculation Service available as part of the Web-based Molinspiration software.36,37 These properties include molecular mass, molecular volume, cLogP, TPSA, number of hydrogen-bond acceptors, number of hydrogen-bond donors, nRotb, and number of Lipinski’s “rule of five” violations. The fragment-based approach used by Molinspiration has proven to be reliable and is employed in relevant scientific publications and chemical databases.30,38−40

Database Interface. An integrated system was developed to allow easy and efficient retrieval of the compounds. The NuBBE Web system is installed on a Linux server with Apache Tomcat as the Web server and PostgreSQL as the relational database server, where the set of data is stored. The Web interface was designed using standard Web technologies such as HTML, CSS, and JavaScript (AJAX), while the server itself is implemented using Java/Servlets with Hibernate, an object-relational mapping database framework. All of the software packages used are open source and recognized by the industry and the community as robust and reliable software. The Web interface was designed to work in all browsers equally, with special attention given to the ease of use. The molecular drawing interface WebME provided by Molinspiration36 in association with the substructure search engine provided by CDK (Chemistry Development Kit)41 enables the user to search for compounds by chemical structure, offering a more user-friendly search and retrieval system.

AUTHOR INFORMATION

Corresponding Author

*Tel: +55 16 33019660. Fax: +55 16 33019692. E-mail: [email protected].

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This work was supported by Biota-FAPESP, CAPES, SISBIOTA-CNPq. We are grateful to M. Leonard and Molinspiration for the WebME tool. We also wish to acknowledge A. J. Cavalheiro, A. R. Araújo, D. H. S. Silva, M. N. Lopes, and M. Furlan for their advice and their collaboration at NuBBE. M.V., R.N.S., A.D.A., and V.S.B. acknowledge FAPESP and CNPq for the fellowships.

DEDICATION

Dedicated to Dr. Lester A. Mitscher, of the University of Kansas, for his pioneering work on the discovery of bioactive natural products and their derivatives.

REFERENCES

(1) Koehn, E. F.; Carter, T. G. Nat. Rev. Drug Discovery 2005, 4, 206−220.
(2) Chin, Y. W.; Balunas, M. J.; Chai, H. B.; Kinghorn, A. D. AAPS J. 2006, 8, E239−E253.
(3) Newman, D. J.; Cragg, G. M. J. Nat. Prod. 2007, 70, 461−477.
(4) Newman, D. J. J. Med. Chem. 2008, 51, 2589−2599.
(5) Kinghorn, A. D.; Pan, L.; Fletcher, J. N.; Chai, H. J. Nat. Prod. 2011, 74, 1539−1555.
(6) Newman, D. J.; Cragg, G. M. J. Nat. Prod. 2012, 75, 311−335.
(7) Li, J. W. H.; Vederas, J. C. Science 2009, 325, 161−165.
(8) Bolzani, V. S.; Valli, M.; Pivatto, M.; Viegas, C., Jr. Pure Appl. Chem. 2012, 84, 1837−1846.


(9) De Luca, V.; Salim, V.; Atsumi, S. M.; Yu, F. Science 2012, 336, 1658−1661.
(10) Bolzani, V. S.; Castro-Gamboa, I.; Silva, D. H. S. In Comprehensive Natural Products II Chemistry and Biology; Verpoorte, R., Ed.; Elsevier: Oxford, UK, 2010; Vol. 3, Chapter 3.05, pp 95−133.
(11) Silva, G. H.; Teles, H. L.; Zanardi, L. M.; Young, M. C. M.; Eberlin, M. N.; Hadad, R.; Pfenning, L. H.; Costa-Neto, C. M.; Castro-Gamboa, I.; Bolzani, V. S.; Araujo, A. R. Phytochemistry 2006, 67, 1964−1969.
(12) Joly, C. A.; Rodrigues, R. R.; Metzger, J. P.; Haddad, C. F. B.; Verdade, L. M.; Oliveira, M. C.; Bolzani, V. S. Science 2010, 328, 1358−1359.
(13) Ferreira, P. M. P.; Santos, A. G.; Tininis, A. G.; Costa, P. M.; Cavalheiro, A. J.; Bolzani, V. S.; Moraes, M. O.; Costa-Lotufo, L. V.; Montenegro, R. C.; Pessoa, C. Chem. Biol. Interact. 2010, 188, 497−504.
(14) Flausino, O., Jr.; Santos, L. A.; Verli, H.; Pereira, A. M.; Bolzani, V. S.; Nunes-de-Souza, R. L. J. Nat. Prod. 2007, 70, 48−53.
(15) Viegas, C., Jr.; Bolzani, V. S.; Pimentel, L. S. B.; Castro, N. G.; Cabral, R. F.; Costa, R. S.; Floyd, C.; Rocha, M. S.; Young, M. C. M.; Barreiro, E. J.; Fraga, C. A. M. Bioorg. Med. Chem. 2005, 13, 4184−4190.
(16) Valli, M.; Danuello, A.; Pivatto, M.; Saldaña, J. C.; Heinzen, H.; Domínguez, L.; Campos, V. P.; Marqui, S. R.; Young, M. C. M.; Viegas, C., Jr.; Silva, D. H. S.; Bolzani, V. S. Curr. Med. Chem. 2011, 18, 3423−3430.
(17) Moda, T. L.; Torres, L. G.; Carrara, A. E.; Andricopulo, A. D. Bioinformatics 2008, 24, 2270−2271.
(18) NuBBE: Núcleo de Bioensaios, Biossíntese e Ecofisiologia de Produtos Naturais. NuBBE Database (http://nubbe.iq.unesp.br/nubbeDB.html), accessed September 10, 2012.
(19) Cole, J. C.; Nissink, J. W. M.; Taylor, R. In Virtual Screening in Drug Discovery; Shoichet, B.; Alvarez, J., Eds.; Taylor & Francis CRC Press: Boca Raton, FL, USA, 2005; Vol. 1, Chapter 15, pp 379−415.
(20) Jain, A. N. J. Med. Chem. 2003, 46, 499−511.
(21) Ewing, T. J. A.; Makino, S.; Skillman, G.; Kuntz, I. D. J. Comput.-Aided Mol. Des. 2001, 15, 411−428.
(22) Rarey, M.; Kramer, B.; Lengauer, T.; Klebe, G. J. Mol. Biol. 1996, 261, 470−489.
(23) Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Adv. Drug. Delivery Rev. 1997, 23, 4−25.
(24) Veber, D. F.; Johnson, S. R.; Cheng, H. Y.; Smith, B. R.; Ward, K. W.; Kopple, K. D. J. Med. Chem. 2002, 45, 2615−2623.
(25) Lipinski, C. A. Drug Discovery Today: Technol. 2004, 1, 337−341.
(26) Clark, D. E.; Pickett, S. D. Drug Discovery Today 2000, 5, 49−58.
(27) Lipinski, C. A. J. Pharmacol. Toxicol. 2000, 44, 235−249.
(28) Quinn, R. J.; Carroll, A. R.; Pham, N. B.; Baron, P.; Palframan, M. E.; Suraweera, L.; Pierens, G. K.; Muresan, S. J. Nat. Prod. 2008, 71, 464−468.
(29) ZINC database (http://zinc.docking.org/), accessed September 10, 2012.
(30) Irwin, J. J.; Shoichet, B. K. J. Chem. Inform. Model. 2005, 45, 177−182.
(31) Oprea, T. I.; Allu, T. K.; Fara, D. C.; Rad, R. F.; Ostopovici, L.; Bologa, C. G. J. Comput.-Aided Mol. Des. 2007, 21, 113−119.
(32) Van Deursen, R.; Blum, L. C.; Reymond, J. L. J. Comput.-Aided Mol. Des. 2011, 25, 649−662.
(33) Congreve, M.; Carr, R.; Murray, C.; Jhoti, H. Drug Discovery Today 2003, 8, 876−877.
(34) Carr, R. A. E.; Congreve, M.; Murray, C. W.; Rees, D. C. Drug Discovery Today 2005, 10, 987−992.
(35) Santos, R. N.; Guido, R. V. C.; Oliva, G.; Dias, L. C.; Andricopulo, A. D. Med. Chem. 2011, 7, 155−164.
(36) Molinspiration Cheminformatics (http://molinspiration.com), accessed September 10, 2012.
(37) Ertl, P.; Rohde, B.; Selzer, P. J. Med. Chem. 2000, 43, 3714−3717.
(38) Akhoon, B. A.; Gupta, S. K.; Dhaliwal, G.; Srivastava, M.; Gupta, S. K. J. Mol. Model. 2011, 17, 265−73.
(39) Pieroni, M.; Lilienkampf, A.; Wang, Y.; Wan, B.; Cho, S.; Franzblau, S. G.; Kozikowski, A. P. ChemMedChem 2010, 5, 1667−1672.
(40) Hsin, K. Y.; Morgan, H. P.; Shave, S. R.; Hinton, A. C.; Taylor, P.; Walkinshaw, M. D. Nucleic Acids Res. 2011, 39, D1042.
(41) Steinbeck, C.; Han, Y.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. J. Chem. Inf. Comput. Sci. 2003, 43, 493−500.
