Counting up the compounds in a database should be as easy as, well, 1, 2, 3… But a recent thread on the Chemical Information Sources Discussion List (login required) pointed out that what counts is how (and what) vendors count. In this post, I'd like to comment on this confusion and explain how we count compounds for our premier sourcing databases: the Available Chemicals Directory (ACD) and the Screening Compounds Directory (SCD).
The discussion on CHMINF is far from the first time that scientists have complained about vendors providing apparently misleading database counts. That’s because of the perception in the database market that size matters. If you’re selling a database, it looks good to boast that you have more compounds or reactions or suppliers than everyone else. So it shouldn’t be surprising that vendors will jockey for position.
As this thread indicated, so much goes into a compound database that it’s possible to count in many different ways, which is why things get so confusing. In sourcing, for instance, a chemical supplier may sell a given unique chemical as many different products (HPLC grade, reagent grade, etc.) and offer various packages of each (10 g, 500 g, 1 kg, etc.). So as a database vendor, do you size up your database by counting the number of unique chemicals, or do you count each product of that chemical separately, or do you count each package? All would be accurate! But for any comparison between databases to be meaningful, customers need to know what is being counted.
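To make the difference concrete, here is a minimal sketch in Python that counts the same small set of listings three ways. Everything in it is hypothetical: the `Listing` record, the supplier names, and the sample rows are invented for illustration and do not reflect how ACD or SCD are actually stored or counted. It also identifies a chemical by name for simplicity, whereas a real database would more likely compare structures.

```python
from collections import namedtuple

# Hypothetical record: one row per package that a supplier lists.
Listing = namedtuple("Listing", "chemical supplier grade package")

# Invented sample data, for illustration only.
listings = [
    Listing("acetonitrile", "Supplier A", "HPLC grade", "500 g"),
    Listing("acetonitrile", "Supplier A", "HPLC grade", "1 kg"),
    Listing("acetonitrile", "Supplier A", "reagent grade", "10 g"),
    Listing("acetonitrile", "Supplier B", "reagent grade", "500 g"),
    Listing("toluene", "Supplier B", "reagent grade", "1 kg"),
]


def count_three_ways(rows):
    """Return (unique chemicals, products, packages) for the given rows."""
    # A unique chemical is counted once, no matter who sells it or how.
    unique_chemicals = {r.chemical for r in rows}
    # A product is one chemical, from one supplier, in one grade.
    products = {(r.chemical, r.supplier, r.grade) for r in rows}
    # A package is one pack size of one product.
    packages = {(r.chemical, r.supplier, r.grade, r.package) for r in rows}
    return len(unique_chemicals), len(products), len(packages)


print(count_three_ways(listings))  # prints (2, 4, 5)
```

The same five rows can honestly be described as 2 chemicals, 4 products, or 5 packages, which is why a headline number means little without its unit.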
Sourcing databases present another challenge in the numbers game. If a compound was available once, but isn’t available now, should it still count? With ACD, our intent is to emphasize the word “available.” Sure, we have a few older catalogs in the database, but those are the exception. Our goal is to provide the most comprehensive database of available chemicals. So for us, size may not matter as much as utility.
So with that said, our aim is to provide all the relevant statistics about what our databases contain so that you can see exactly what you are getting. Here is where ACD and SCD stand today:
New Chemicals 286,407
New Catalogs 5
Updated Catalogs 77
With these numbers in hand, you should be able to make the comparisons yourself.