||Note that the names, content, and URLs of Web sites constantly change, especially in health and biomedicine. This update does not update every last site mentioned in the textbook, but focuses on the more important ones.|
||A good overview of metadata has recently been published (Riley, 2017).
||Riley, J (2017). Understanding Metadata:
What Is Metadata, and What Is It For? Baltimore, MD,
National Information Standards Organization. http://www.niso.org/apps/group_public/download.php/17446/Understanding%20Metadata.pdf.
||The 2015 MEDLINE/PubMed baseline contains
over 23 million citations, with an additional 4
million citations in PubMed but not in MEDLINE.
NLM continues to revamp all
(including bibliographic) databases, with former
subject-specific databases (e.g., AIDSLINE, Cancerlit) now
"subsets" within MEDLINE/PubMed. NLM has also retired the
Gateway interface, steering users instead to its other
interfaces. One of those interfaces is GQuery, which is
also sometimes called Entrez, and provides a global entry
point to all NLM databases.
A description of all the NLM databases is published each year in the annual Database Issue of the journal, Nucleic Acids Research (2017).
NLM is also no longer adding scientific meeting abstracts from HIV/AIDS or cancer to its databases. It is, however, still adding meeting papers from informatics conferences, such as AMIA.
Anonymous (2017). Database Resources of the National Center for Biotechnology Information. Nucleic Acids Research. 45: D12-D17.
PubMed Health is a new NLM resource that links to reviews of
comparative effectiveness research. Its database includes
content from sources such as:
||Another Web feed standard is Atom, although
both it and RSS have been declining in use.
||PubMed Central continues to grow, although
only about 1900 journals are full participants. Content can
be viewed in several formats:
||Papers from the AMIA Annual Symposium
Proceedings (and the Symposium for Computer Applications in
Medical Care, or SCAMC, before it) are all available now in
||Other informatics-related journal content
available in full text in PubMed Central includes:
||A growing number of major journals are
scanning archives back to their inception, such as JAMA to
1883 and the BMJ to 1840 (Delamothe, 2009). The AMA has not
only revamped its journal sites but also changed the
names of some journals to include the JAMA moniker, e.g., Archives
of Internal Medicine became JAMA Internal Medicine
in 2013 (Winker et al., 2012).
||Delamothe, T. (2009). The new BMJ online
archive. British Medical Journal, 338: 1025-1026.
Winker, MA, Herron, M, et al. (2012). The JAMA Network Website: today's content on the future of medical publishing. Journal of the American Medical Association. 307: 2321.
||The NLM Bookshelf continues to grow, and now
has over 4500 items. The books are presented in a variety of
formats, including PubReader, which is also used for PubMed Central.
Of note, the famous online genetics textbook, Online Mendelian Inheritance in Man (OMIM), is no longer maintained by the NLM. Although NLM Entrez still allows searching of OMIM, results are passed off to the new OMIM site. In addition, OMIM does not link back to NCBI databases.
||The text of the online encyclopedia Wikipedia can be downloaded. This resource is also making efforts to improve the quality of its health-related content (Heilman, 2013).||http://en.wikipedia.org/wiki/Wikipedia:Database_download
Heilman, J (2013). Online encyclopedia provides free health info for all. Bulletin of the World Health Organization. 91: 8-9.
||A number of reports from early IR researchers have been scanned and made available as part of the SIGIR Museum.||http://sigir.org/resources/museum/
||Some consumer health sites have gone away
(most notably, Intelihealth and Medpedia), while other new
ones have emerged, such as the site from the Mayo Clinic.
||Some of the URLs for clinical practice guidelines have changed, including those from the American College of Cardiology, the American College of Physicians, the American Academy of Pediatrics, the Institute for Clinical Systems Improvement, and the International Diabetes Federation. The guidelines from University of California San Francisco no longer seem to be available.||http://www.acc.org/guidelines
||Among the many blogs is my own, The
||InfoPOEMS and InfoRetriever, along with the
Cochrane Database of Systematic Reviews, are now part of
Essential Evidence Plus. Other evidence-based resources that
provide access to varying types of content
The BrighamRad collection of teaching files no longer
appears to be available. Fortunately a number of other
radiology teaching file collections are available:
||A new image resource is Viziometrics, which
contains diagrams, visualizations, and photographs from
scientific publications (Lee, 2016).
||Lee, P, West, JD, et al. (2016).
Viziometrics: Analyzing Visual Information in the Scientific
Literature. arXiv. https://arxiv.org/abs/1605.04951.
A new and growing type of annotated content is clinical
decision support (CDS) for use in the electronic health
record (EHR), including decision rules and order sets. Some
providers are commercial companies, such as Zynx and Thomson
Reuters, as well as EHR vendors themselves.
Another CDS resource, which began as a dermatology collection and then expanded to other images, is VisualDX, which also has a mobile app version.
||An excellent source of information for omics
and related data is the annual Database Issue of the
journal, Nucleic Acids Research (Galperin, 2017).
A prominent article in each year's issue is an overview of
the database resources from the NLM National Center for
Biotechnology Information (NCBI, 2017). The NLM
continues to evolve and improve its genomics resources in
response to new technologies, data types, and usability
concerns. A key feature is linkage across databases.
Another information source for genomics is Gene Wiki, which is an effort to annotate the human genome within Wikipedia (Hoffmann, 2008; Huss et al., 2008).
One of the most active areas of current work is discovering the clinical effects (phenotype) of genomic variation (genotype). To support this work, a new resource, ClinVar, has been developed (Landrum, 2016).
||Galperin, MY, Fernández-Suárez,
XM, et al. (2017). The 24th annual Nucleic Acids Research
database issue: a look back and upcoming changes. Nucleic
Acids Research. 45: D1-D11.
Anonymous (2017). Database Resources of the National Center for Biotechnology Information. Nucleic Acids Research. 45: D12-D17.
Hoffmann, R. (2008). A wiki for the life sciences where authorship matters. Nature Genetics, 40: 1047-1051.
Huss, JW, Orozco, C, et al. (2008). A gene wiki for community annotation of gene function. PLoS Biology. 6(7): e175. http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.0060175.
Landrum, MJ, Lee, JM, et al. (2016). ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Research. 44: D862-D868.
||The ClinicalTrials.gov database continues to
evolve. Starting from a catalog of clinical trials sponsored
by NIH, it has evolved to become a trials registration
system (Zarin, 2013; Zarin, 2017), and now a means to report
results of trials. A new rule expanding the legal mandate
for sponsors and others responsible for certain clinical
trials of FDA-regulated drug, biologic, and device products
to register their studies and report summary results
information to ClinicalTrials.gov was announced in 2016
(Zarin et al., 2016). A next step under consideration is the
inclusion of individual patient data (Zarin and Tse, 2016).
||Zarin, DA and Tse, T (2013). Trust but
verify: trial registration and determining fidelity to the
protocol. Annals of Internal Medicine. 159: 65-67.
Zarin, DA, Tse, T, et al. (2017). Update on trial registration 11 years after the ICMJE policy was established. New England Journal of Medicine. 376: 383-391.
Zarin, DA, Tse, T, et al. (2016). Trial reporting in ClinicalTrials.gov — the final rule. New England Journal of Medicine. 375: 1998-2004.
Zarin, DA and Tse, T (2016). Sharing individual participant data (IPD) within the context of the trial reporting system (TRS). PLoS Medicine. 13(1): e1001946. http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001946.
||Many believe that the next step in data
transparency and utility in clinical trials is to make
researchers share their raw data. High-impact clinical
journals often do not require research data to be made available,
and when they do, investigators rarely adhere to the
requirement (Alsheikh-Ali et al., 2011). It is argued that
clinical trial data is a "public good" (Rodwin and Abramson,
2012) and that we must usher in a new era of "open science
through data sharing" (Ross and Krumholz, 2013). Eichler et
al. (2012) examine the issues for and against this notion,
expressing concern about patient privacy and data being
"vulnerable to distortion." Other issues have been raised by
the International Consortium of
Investigators for Fairness in Trial Data Sharing, who
express concern over misleading or inaccurate analyses as
well as efforts aimed at discrediting or undermining the
original research (Anonymous, 2016). They also express
concern about the costs, given that there are over 27,000
RCTs performed each year. As such, this group calls for an
embargo on reuse of data for two years plus another
half-year for each year of the length of the RCT.
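The proposed embargo rule can be worked through as simple arithmetic. The following is only an illustrative sketch: the function name `embargo_years` is hypothetical, and it assumes the half-year increment scales linearly with trial length (including fractional years), which the proposal as summarized here does not spell out.

```python
def embargo_years(trial_length_years: float) -> float:
    """Embargo on data reuse: 2 years plus 0.5 year per year of RCT length.

    Assumption: the increment is applied linearly, so a 1.5-year trial
    adds 0.75 years. The source summary does not specify rounding.
    """
    if trial_length_years < 0:
        raise ValueError("trial length must be non-negative")
    return 2.0 + 0.5 * trial_length_years


# Show the embargo for a few trial lengths (in years).
for length in (1, 3, 5):
    print(f"{length}-year RCT: {embargo_years(length):.1f}-year embargo")
```

Under this reading, a three-year RCT would carry an embargo of 3.5 years, so longer trials delay reuse of their data proportionally more.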
If datasets from clinical trials and other research are to be made available, what kinds of databases should be designed and populated? One effort funded by the US NIH is the biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) (Ohno-Machado, 2015). bioCADDIE can be accessed via the DataMed search interface and is based on a data tag suite (DATS) for datasets (Sansone, 2017).
This and related efforts have built upon the FAIR principles of Findability, Accessibility, Interoperability and Reusability (Wilkinson, 2016).
Another source of research data sets is re3data.org, whose metadata schema has been described by Rücknagel et al. (2015).
Alsheikh-Ali, AA, Qureshi, W, et al. (2011). Public
availability of published research data in high-impact
journals. PLoS ONE. 6(9): e24357. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0024357.
Rodwin, MA and Abramson, JD (2012). Clinical trial data as a public good. Journal of the American Medical Association. 308: 871-872.
Ross, JS and Krumholz, HM (2013). Ushering in a new era of open science through data sharing: the wall must come down. Journal of the American Medical Association. 309: 1355-1356.
Eichler, HG, Abadie, E, et al. (2012). Open clinical trial data for all? A view from regulators. PLoS Medicine. 9(4): e1001202. http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001202.
Anonymous (2016). Toward fairness in data sharing. New England Journal of Medicine. 375: 405-407.
Ohno-Machado, L, Alter, G, et al. (2015). bioCADDIE white paper - Data Discovery Index. La Jolla, CA, University of California San Diego. https://figshare.com/articles/bioCADDIE_white_paper_Data_Discovery_Index/1362572.
Sansone, SA, Gonzalez-Beltran, A, et al. (2017). DATS: the data tag suite to enable discoverability of datasets. bioRxiv. http://biorxiv.org/content/early/2017/01/25/103143.
Wilkinson, MD, Dumontier, M, et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data. 3: 160018. http://www.nature.com/articles/sdata201618.
Rücknagel, J, Vierkant, P, et al. (2015). Metadata Schema for the Description of Research Data Repositories: version 3.0. Potsdam, Germany, Helmholtz Centre Potsdam. http://gfzpublic.gfz-potsdam.de/pubman/item/escidoc:1397899:6.
||The US Food and Drug Administration (FDA) has
improved its information availability. Two resources of note
Schwartz, LM, Woloshin, S, et al. (2016). ClinicalTrials.gov and Drugs@FDA: a comparison of results reporting for new drug approval trials. Annals of Internal Medicine. 165: 421-430.
||The CRISP database of grants funded by NIH has been retired and replaced by the NIH RePORTER system.||https://projectreporter.nih.gov/reporter.cfm
||Another interesting type of data (or representation of data) is cartograms, with a good hub source being Cartogram Central.||http://www.ncgia.ucsb.edu/projects/Cartogram_Central/index.html
||More and more publishers are aggregating
their content into large collections that can be marketed as
a single entity. These include the Scitable resource of
Nature Publishing, Sciverse of Elsevier, and several of the
resources described in section 3.2 above. Sciverse has an
application programming interface (API) that allows others
to write interactive apps. The NLM also has eUtilities with
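As a rough illustration of programmatic access to NLM databases, the sketch below builds a search URL for the ESearch endpoint of the NLM E-utilities. It only constructs the URL and makes no network request. The helper name `build_esearch_url` and the example query are assumptions for illustration; the parameters shown (`db`, `term`, `retmax`, `retmode`) are standard ESearch parameters, but the current NCBI documentation should be consulted for usage limits and API key requirements.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term: str, db: str = "pubmed", retmax: int = 20) -> str:
    """Return an ESearch URL for a query against an NLM database.

    Hypothetical helper: it only assembles the query string and does
    not contact the server.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query (illustrative only).
url = build_esearch_url("information retrieval")
print(url)
```

The resulting URL can be fetched with any HTTP client; the identifiers it returns can then be passed to other E-utilities, such as EFetch, to retrieve the records themselves.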
||The URL for DrugBank has changed.
||Another model organism database is the Zebrafish Information Network (ZFIN). I have had a chance to visit the zebrafish colony in Eugene, OR!||https://zfin.org/
||The future? NIH Commons aims to be a shared
space for all digital objects of biomedical research.