first inclusion of ITIS data

main
parent
commit
9c566585b0
  1. DESCRIPTION (5)
  2. NAMESPACE (6)
  3. NEWS.md (40)
  4. R/data.R (46)
  5. R/globals.R (11)
  6. R/key_antibiotics.R (4)
  7. R/misc.R (6)
  8. R/mo.R (384)
  9. R/mo_property.R (170)
  10. README.md (26)
  11. data/microorganisms.old.rda (BIN)
  12. data/microorganisms.rda (BIN)
  13. data/septic_patients.rda (BIN)
  14. man/as.mo.Rd (73)
  15. man/figures/itis_logo.jpg (BIN)
  16. man/microorganisms.Rd (33)
  17. man/microorganisms.old.Rd (33)
  18. man/microorganisms.umcg.Rd (2)
  19. man/mo_property.Rd (67)
  20. tests/testthat/test-count.R (10)
  21. tests/testthat/test-first_isolate.R (18)
  22. tests/testthat/test-freq.R (6)
  23. tests/testthat/test-join_microorganisms.R (14)
  24. tests/testthat/test-key_antibiotics.R (1)
  25. tests/testthat/test-kurtosis.R (6)
  26. tests/testthat/test-mo.R (135)
  27. tests/testthat/test-mo_property.R (32)
  28. tests/testthat/test-portion.R (40)
  29. tests/testthat/test-skewness.R (6)

5
DESCRIPTION

@@ -1,6 +1,6 @@
Package: AMR
Version: 0.3.0.9008
Date: 2018-09-16
Version: 0.3.0.9009
Date: 2018-09-24
Title: Antimicrobial Resistance Analysis
Authors@R: c(
person(
@@ -48,6 +48,7 @@ Imports:
backports,
clipr,
curl,
data.table (>= 1.9.0),
dplyr (>= 0.7.0),
hms,
knitr (>= 1.0.0),

6
NAMESPACE

@@ -90,7 +90,7 @@ export(kurtosis)
export(labels_rsi_count)
export(left_join_microorganisms)
export(like)
export(mo_aerobic)
export(mo_TSN)
export(mo_class)
export(mo_family)
export(mo_fullname)
@@ -101,6 +101,7 @@ export(mo_phylum)
export(mo_property)
export(mo_shortname)
export(mo_species)
export(mo_subkingdom)
export(mo_subspecies)
export(mo_taxonomy)
export(mo_type)
@@ -161,6 +162,9 @@ exportMethods(summary.rsi)
importFrom(clipr,read_clip_tbl)
importFrom(clipr,write_clip)
importFrom(curl,nslookup)
importFrom(data.table,as.data.table)
importFrom(data.table,data.table)
importFrom(data.table,setkey)
importFrom(dplyr,"%>%")
importFrom(dplyr,arrange)
importFrom(dplyr,arrange_at)

40
NEWS.md

@@ -1,6 +1,30 @@
# 0.3.0.90xx (latest development version)
#### New
* The data set `microorganisms` now contains **all microbial taxonomic data from ITIS** (kingdoms Bacteria, Fungi and Protozoa), the Integrated Taxonomic Information System, available via https://itis.gov. The data set now contains more than 18,000 microorganisms with all known bacteria, fungi and protozoa according to ITIS, with genus, species, subspecies, family, order, class, phylum and subkingdom. The new data set `microorganisms.old` contains all previously known taxonomic names from those kingdoms.
* Aliases for existing function `mo_property`
* Taxonomic names: `mo_phylum`, `mo_class`, `mo_order`, `mo_family`, `mo_genus`, `mo_species`, `mo_subspecies`
* Semantic names: `mo_fullname`, `mo_shortname`
* Microbial properties: `mo_type`, `mo_gramstain`.
They also come with support for German, Dutch, French, Italian, Spanish and Portuguese, and default to the system's locale:
```r
mo_gramstain("E. coli")
# [1] "Gram negative"
mo_gramstain("E. coli", language = "de") # "de" = Deutsch / German
# [1] "Gramnegativ"
mo_gramstain("E. coli", language = "es") # "es" = Español / Spanish
# [1] "Gram negativo"
mo_fullname("S. group A") # when run on a Portuguese system
# [1] "Streptococcus grupo A"
```
Furthermore, old taxonomic names can easily be looked up and give a note about the taxonomic change:
```r
mo_fullname("Pseudomonas facilis")
# Note: 'Pseudomonas facilis' was renamed to 'Acidovorax facilis' by Willems et al. in 1990
# [1] "Acidovorax facilis"
```
* Functions `count_R`, `count_IR`, `count_I`, `count_SI` and `count_S` to selectively count resistant or susceptible isolates
* Extra function `count_df` (which works like `portion_df`) to get all counts of S, I and R of a data set with antibiotic columns, with support for grouped variables
* Function `is.rsi.eligible` to check for columns that have valid antimicrobial results, but do not have the `rsi` class yet. Transform the columns of your raw data with: `data %>% mutate_if(is.rsi.eligible, as.rsi)`
@@ -27,21 +51,7 @@
* All old syntaxes will still work with this version, but will throw warnings
* Function `labels_rsi_count` to print datalabels on a RSI `ggplot2` model
* Functions `as.atc` and `is.atc` to transform/look up antibiotic ATC codes as defined by the WHO. The existing function `guess_atc` is now an alias of `as.atc`.
* Aliases for existing function `mo_property` and new data from ITIS (Integrated Taxonomic Information System, https://www.itis.gov)
* Taxonomic names: `mo_phylum`, `mo_class`, `mo_order`, `mo_family`, `mo_genus`, `mo_species`, `mo_subspecies`
* Semantic names: `mo_fullname`, `mo_shortname`
* Microbial properties: `mo_aerobic`, `mo_type`, `mo_gramstain`.
They also come with support for German, Dutch, French, Italian, Spanish and Portuguese, and default to the system's locale:
```r
mo_gramstain("E. coli")
# [1] "Negative rods"
mo_gramstain("E. coli", language = "de") # "de" = Deutsch / German
# [1] "Negative Stäbchen"
mo_gramstain("E. coli", language = "es") # "es" = Español / Spanish
# [1] "Bacilos negativos"
mo_fullname("S. group A") # when run on a Portuguese system
# [1] "Streptococcus grupo A"
```
* Function `ab_property` and its aliases: `ab_name`, `ab_tradenames`, `ab_certe`, `ab_umcg` and `ab_trivial_nl`
* Introduction to AMR as a vignette

46
R/data.R

@@ -120,30 +120,48 @@
#
"antibiotics"
#' Data set with human pathogenic microorganisms
#' Data set with taxonomic data from ITIS
#'
#' A data set containing (potential) human pathogenic microorganisms. MO codes can be looked up using \code{\link{guess_mo}}.
#' @format A \code{\link{tibble}} with 2,642 observations and 14 variables:
#' A data set containing the complete microbial taxonomy of the kingdoms Bacteria, Fungi and Protozoa. MO codes can be looked up using \code{\link{as.mo}}.
#' @inheritSection as.mo ITIS
#' @format A \code{\link{data.frame}} with 18,831 observations and 15 variables:
#' \describe{
#' \item{\code{mo}}{ID of microorganism}
#' \item{\code{bactsys}}{Bactsyscode of microorganism}
#' \item{\code{genus}}{Genus name of microorganism, like \code{"Escherichia"}}
#' \item{\code{species}}{Species name of microorganism, like \code{"coli"}}
#' \item{\code{subspecies}}{Subspecies name of bio-/serovar of microorganism, like \code{"EHEC"}}
#' \item{\code{fullname}}{Full name, like \code{"Escherichia coli (EHEC)"}}
#' \item{\code{gramstain}}{Gram of microorganism, like \code{"Negative rods"}}
#' \item{\code{aerobic}}{Logical whether bacteria is aerobic}
#' \item{\code{tsn}}{Taxonomic Serial Number (TSN), as defined by ITIS}
#' \item{\code{genus}}{Taxonomic genus of the microorganism as found in ITIS, see Source}
#' \item{\code{species}}{Taxonomic species of the microorganism as found in ITIS, see Source}
#' \item{\code{subspecies}}{Taxonomic subspecies of the microorganism as found in ITIS, see Source}
#' \item{\code{fullname}}{Full name, like \code{"Escherichia coli"}}
#' \item{\code{family}}{Taxonomic family of the microorganism as found in ITIS, see Source}
#' \item{\code{order}}{Taxonomic order of the microorganism as found in ITIS, see Source}
#' \item{\code{class}}{Taxonomic class of the microorganism as found in ITIS, see Source}
#' \item{\code{phylum}}{Taxonomic phylum of the microorganism as found in ITIS, see Source}
#' \item{\code{type}}{Type of microorganism, like \code{"Bacteria"} and \code{"Fungus/yeast"}}
#' \item{\code{subkingdom}}{Taxonomic subkingdom of the microorganism as found in ITIS, see Source}
#' \item{\code{gramstain}}{Gram of microorganism, like \code{"Gram negative"}}
#' \item{\code{type}}{Type of microorganism, like \code{"Bacteria"} and \code{"Fungi"}}
#' \item{\code{prevalence}}{A rounded integer based on prevalence of the microorganism. Used internally by \code{\link{as.mo}}, otherwise quite meaningless.}
#' \item{\code{mo.old}}{The old ID for package versions 0.3.0 and lower.}
#' }
#' @source Integrated Taxonomic Information System (ITIS) on-line database, \url{https://www.itis.gov}.
#' @seealso \code{\link{guess_mo}} \code{\link{antibiotics}} \code{\link{microorganisms.umcg}}
#' @source [3] Integrated Taxonomic Information System (ITIS) on-line database, \url{https://www.itis.gov}.
#' @seealso \code{\link{as.mo}} \code{\link{mo_property}} \code{\link{microorganisms.umcg}}
"microorganisms"
#' Data set with old taxonomic data from ITIS
#'
#' A data set containing old, previously valid, taxonomic names. This data set is used internally by \code{\link{as.mo}}.
#' @inheritSection as.mo ITIS
#' @format A \code{\link{data.frame}} with 58 observations and 5 variables:
#' \describe{
#' \item{\code{tsn}}{Old Taxonomic Serial Number (TSN), as defined by ITIS}
#' \item{\code{name}}{Old taxonomic name of the microorganism as found in ITIS, see Source}
#' \item{\code{tsn_new}}{New Taxonomic Serial Number (TSN), as defined by ITIS}
#' \item{\code{authors}}{Authors responsible for renaming as found in ITIS, see Source}
#' \item{\code{year}}{Year in which the literature was published about the renaming as found in ITIS, see Source}
#' }
#' @source [3] Integrated Taxonomic Information System (ITIS) on-line database, \url{https://www.itis.gov}.
#' @seealso \code{\link{as.mo}} \code{\link{mo_property}} \code{\link{microorganisms}}
"microorganisms.old"
#' Translation table for UMCG
#'
#' A data set containing all bacteria codes of UMCG MMB. These codes can be joined to data with an ID from \code{\link{microorganisms}$mo} (using \code{\link{left_join_microorganisms}}). GLIMS codes can also be translated to valid \code{MO}s with \code{\link{guess_mo}}.
@@ -152,7 +170,7 @@
#' \item{\code{umcg}}{Code of microorganism according to UMCG MMB}
#' \item{\code{mo}}{Code of microorganism in \code{\link{microorganisms}}}
#' }
#' @seealso \code{\link{guess_mo}} \code{\link{microorganisms}}
#' @seealso \code{\link{as.mo}} \code{\link{microorganisms}}
"microorganisms.umcg"
#' Data set with 2000 blood culture isolates of septic patients
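
Reviewer note: the `microorganisms.old` documentation above describes a rename table (`tsn`, `name`, `tsn_new`, `authors`, `year`). A minimal base R sketch of how such a table could resolve an old name to a current one follows. The helper `lookup_old_name` and both toy data frames are hypothetical, and the TSN values are placeholders rather than real ITIS identifiers; only the rename itself is taken from the NEWS.md entry in this commit.

```r
# Toy stand-ins for the two data sets; TSN values are placeholders, not real ITIS IDs
old_names <- data.frame(
  tsn = 1L, name = "Pseudomonas facilis", tsn_new = 2L,
  authors = "Willems et al.", year = 1990L,
  stringsAsFactors = FALSE
)
current <- data.frame(tsn = 2L, fullname = "Acidovorax facilis",
                      stringsAsFactors = FALSE)

# Hypothetical helper: resolve an old taxonomic name to the current full name
lookup_old_name <- function(x) {
  hit <- old_names[tolower(old_names$name) == tolower(x), , drop = FALSE]
  if (nrow(hit) == 0) return(NA_character_)   # not a known old name
  new_name <- current$fullname[current$tsn == hit$tsn_new[1]]
  message("Note: '", hit$name[1], "' was renamed to '", new_name,
          "' by ", hit$authors[1], " in ", hit$year[1])
  new_name
}

lookup_old_name("pseudomonas facilis")
```

The lookup is case-insensitive on the old name and joins on `tsn_new`, which is the shape the documented columns suggest; the package's actual lookup lives in `R/mo.R` and uses `data.table`.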

11
R/globals.R

@@ -17,9 +17,11 @@
# ==================================================================== #
globalVariables(c(".",
"..property",
"antibiotic",
"Antibiotic",
"antibiotics",
"authors",
"cnt",
"count",
"cum_count",
@@ -29,6 +31,7 @@ globalVariables(c(".",
"fctlvl",
"first_isolate_row_index",
"Freq",
"fullname",
"genus",
"gramstain",
"Interpretation",
@@ -40,8 +43,11 @@ globalVariables(c(".",
"median",
"mic",
"microorganisms",
"microorganisms.old",
"mo",
"mo.old",
"n",
"name",
"observations",
"other_pat_or_mo",
"Pasted",
@@ -52,6 +58,9 @@ globalVariables(c(".",
"S",
"septic_patients",
"species",
"tsn",
"tsn_new",
"value",
"Value",
"y"))
"y",
"year"))

4
R/key_antibiotics.R

@@ -150,7 +150,7 @@ key_antibiotics <- function(tbl,
# Gram +
tbl <- tbl %>% mutate(key_ab =
if_else(gramstain %like% '^Positive ',
if_else(gramstain == "Gram positive",
apply(X = tbl[, gram_positive],
MARGIN = 1,
FUN = function(x) paste(x, collapse = "")),
@@ -158,7 +158,7 @@ key_antibiotics <- function(tbl,
# Gram -
tbl <- tbl %>% mutate(key_ab =
if_else(gramstain %like% '^Negative ',
if_else(gramstain == "Gram negative",
apply(X = tbl[, gram_negative],
MARGIN = 1,
FUN = function(x) paste(x, collapse = "")),
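
Reviewer note: the switch from a prefix match to an equality test follows from the new `gramstain` values. The sketch below uses a stand-in `like()` built on `grepl()`; treating the package's `%like%` as case-insensitive is an assumption, and `"Positive cocci"` is an illustrative old-style value (only `"Negative rods"` appears in this diff).

```r
# Stand-in for the package's %like% operator (case-insensitivity is an assumption)
like <- function(x, pattern) grepl(pattern, x, ignore.case = TRUE)

old_values <- c("Positive cocci", "Negative rods")  # old-style values, partly illustrative
new_values <- c("Gram positive", "Gram negative")   # values introduced by this commit

like(old_values, "^Positive ")   # TRUE FALSE: the old pattern matched old values
like(new_values, "^Positive ")   # FALSE FALSE: anchored pattern misses "Gram positive"
new_values == "Gram positive"    # TRUE FALSE: the new equality test
```

Because the old pattern is anchored at the start of the string, it can never match a value beginning with "Gram", so leaving it in place would presumably have left every key antibiotic string unassigned.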

6
R/misc.R

@@ -157,6 +157,12 @@ tbl_parse_guess <- function(tbl,
#' @importFrom dplyr case_when
Sys.locale <- function() {
alreadyset <- getOption("AMR_locale")
if (!is.null(alreadyset)) {
if (tolower(alreadyset) %in% c("en", "de", "nl", "es", "fr", "pt", "it")) {
return(tolower(alreadyset))
}
}
sys <- base::Sys.getlocale()
case_when(
sys %like% '(Deutsch|German|de_)' ~ "de",
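
Reviewer note: the added branch lets an `AMR_locale` option take precedence over the system locale. The base R sketch below isolates just that branch; `locale_override` is a hypothetical helper, not a function in the package, and returning `NULL` is a placeholder for falling through to the `Sys.getlocale()` logic.

```r
supported <- c("en", "de", "nl", "es", "fr", "pt", "it")

# Hypothetical helper mirroring only the early-return branch added in this hunk
locale_override <- function() {
  alreadyset <- getOption("AMR_locale")
  if (!is.null(alreadyset) && tolower(alreadyset) %in% supported) {
    return(tolower(alreadyset))
  }
  NULL  # no valid override; the real function continues with Sys.getlocale()
}

old <- options(AMR_locale = "DE")   # set the option, remembering the previous value
locale_override()                   # "de": the value is lower-cased before the check
options(AMR_locale = "xx")
locale_override()                   # NULL: unsupported codes are ignored
options(old)                        # restore the previous setting
```

Unsupported values are silently ignored rather than raising an error, so a typo in the option falls back to system locale detection.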

384
R/mo.R

@@ -18,38 +18,62 @@
#' Transform to microorganism ID
#'
#' Use this function to determine a valid ID based on a genus (and species). Determination is done using Artificial Intelligence (AI), so the input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), an abbreviation known in the field (like \code{"MRSA"}), or just a genus. You could also \code{\link{select}} a genus and species column, see Examples.
#' Use this function to determine a valid microorganism ID (\code{mo}). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms \emph{Bacteria}, \emph{Fungi} and \emph{Protozoa} (see Source), so the input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), an abbreviation known in the field (like \code{"MRSA"}), or just a genus. You could also \code{\link{select}} a genus and species column, see Examples.
#' @param x a character vector or a \code{data.frame} with one or two columns
#' @param Becker a logical to indicate whether \emph{Staphylococci} should be categorised into Coagulase Negative \emph{Staphylococci} ("CoNS") and Coagulase Positive \emph{Staphylococci} ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} [1].
#'
#' This excludes \emph{Staphylococcus aureus} by default; use \code{Becker = "all"} to also categorise \emph{S. aureus} as "CoPS".
#' @param Lancefield a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, i.e. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L.
#' @param Lancefield a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, e.g. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L.
#'
#' This excludes \emph{Enterococci} by default (which are in group D); use \code{Lancefield = "all"} to also categorise all \emph{Enterococci} as group D.
#' @param allow_uncertain a logical to indicate whether empty results should be checked for only a part of the input string. When results are found, a warning will be given about the uncertainty and the result.
#' @rdname as.mo
#' @aliases mo
#' @keywords mo Becker becker Lancefield lancefield guess
#' @details \code{guess_mo} is an alias of \code{as.mo}.
#' @details
#' A microbial ID (class: \code{mo}) typically looks like these examples:\cr
#' \preformatted{
#' Code Full name
#' --------------- --------------------------------------
#' B_KLBSL Klebsiella
#' B_KLBSL_PNE Klebsiella pneumoniae
#' B_KLBSL_PNE_RHI Klebsiella pneumoniae rhinoscleromatis
#' | | | |
#' | | | |
#' | | | ----> subspecies, a 3-4 letter acronym
#' | | ----> species, a 3-4 letter acronym
#' | ----> genus, a 5-7 letter acronym, mostly without vowels
#' ----> taxonomic kingdom, either Bacteria (B), Fungi (F) or Protozoa (P)
#' }
#'
#' Use the \code{\link{mo_property}} functions to get properties based on the returned code, see Examples.
#'
#' Thus function uses Artificial Intelligence (AI) to help getting more logical results, based on type of input and known prevalence of human pathogens. For example:
#' This function uses Artificial Intelligence (AI) to help getting more logical results, based on type of input and known prevalence of human pathogens. For example:
#' \itemize{
#' \item{\code{"E. coli"} will return the ID of \emph{Escherichia coli} and not \emph{Entamoeba coli}, although the latter would alphabetically come first}
#' \item{\code{"H. influenzae"} will return the ID of \emph{Haemophilus influenzae} and not \emph{Haematobacter influenzae} for the same reason}
#' \item{Something like \code{"p aer"} will return the ID of \emph{Pseudomonas aeruginosa} and not \emph{Pasteurella aerogenes}}
#' \item{Something like \code{"stau"} or \code{"S aur"} will return the ID of \emph{Staphylococcus aureus} and not \emph{Staphylococcus auricularis}}
#' }
#' Moreover, this function also supports ID's based on only Gram stain, when the species is not known. \cr
#' For example, \code{"Gram negative rods"} and \code{"GNR"} will both return the ID of a Gram negative rod: \code{GNR}.
#' @source
#' This means that looking up human non-pathogenic microorganisms takes longer than looking up human pathogenic microorganisms.
#'
#' \code{guess_mo} is an alias of \code{as.mo}.
#' @section ITIS:
#' \if{html}{\figure{itis_logo.jpg}{options: height=60px style=margin-bottom:5px} \cr}
#' This \code{AMR} package contains the \strong{complete microbial taxonomic data} from the publicly available Integrated Taxonomic Information System (ITIS, https://www.itis.gov). ITIS is a partnership of U.S., Canadian, and Mexican agencies and taxonomic specialists [3]. The complete taxonomic kingdoms Bacteria, Fungi and Protozoa (from subkingdom to the subspecies level) are included in this package.
# (source as section, so it can be inherited by mo_property:)
#' @section Source:
#' [1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \url{https://dx.doi.org/10.1128/CMR.00109-13}
#'
#' [2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \url{https://dx.doi.org/10.1084/jem.57.4.571}
#'
#' [3] Integrated Taxonomic Information System (ITIS). Retrieved September 2018. \url{http://www.itis.gov}
#' @export
#' @importFrom dplyr %>% pull left_join arrange
#' @importFrom dplyr %>% pull left_join
#' @importFrom data.table as.data.table setkey
#' @return Character (vector) with class \code{"mo"}. Unknown values will return \code{NA}.
#' @seealso \code{\link{microorganisms}} for the dataframe that is being used to determine ID's.
#' @seealso \code{\link{microorganisms}} for the \code{data.frame} with ITIS content that is being used to determine ID's. \cr
#' The \code{\link{mo_property}} functions (like \code{\link{mo_genus}}, \code{\link{mo_gramstain}}) to get properties based on the returned code.
#' @examples
#' # These examples all return "STAAUR", the ID of S. aureus:
#' as.mo("stau")
@@ -61,22 +85,27 @@
#' as.mo("MRSA") # Methicillin Resistant S. aureus
#' as.mo("VISA") # Vancomycin Intermediate S. aureus
#' as.mo("VRSA") # Vancomycin Resistant S. aureus
#' as.mo(369) # Search on TSN (Taxonomic Serial Number), a unique identifier
#' # for the Integrated Taxonomic Information System (ITIS)
#'
#' as.mo("Streptococcus group A")
#' as.mo("GAS") # Group A Streptococci
#' as.mo("GBS") # Group B Streptococci
#'
#' # guess_mo is an alias of as.mo and works the same
#' guess_mo("S. epidermidis") # will remain species: STAEPI
#' guess_mo("S. epidermidis", Becker = TRUE) # will not remain species: STACNS
#' guess_mo("S. epidermidis") # will remain species: B_STPHY_EPI
#' guess_mo("S. epidermidis", Becker = TRUE) # will not remain species: B_STPHY_CNS
#'
#' guess_mo("S. pyogenes") # will remain species: STCPYO
#' guess_mo("S. pyogenes", Lancefield = TRUE) # will not remain species: STCGRA
#' guess_mo("S. pyogenes") # will remain species: B_STRPTC_PYO
#' guess_mo("S. pyogenes", Lancefield = TRUE) # will not remain species: B_STRPTC_GRA
#'
#' # Use mo_* functions to get a specific property based on `mo`
#' Ecoli <- as.mo("E. coli") # returns `ESCCOL`
#' Ecoli <- as.mo("E. coli") # returns `B_ESCHR_COL`
#' mo_genus(Ecoli) # returns "Escherichia"
#' mo_gramstain(Ecoli) # returns "Negative rods"
#' mo_gramstain(Ecoli) # returns "Gram negative"
#' # but it uses as.mo internally too, so you could also just use:
#' mo_genus("E. coli") # returns "Escherichia"
#'
#'
#' \dontrun{
#' df$mo <- as.mo(df$microorganism_name)
@@ -96,7 +125,7 @@
#' df <- df %>%
#' mutate(mo = guess_mo(paste(genus, species)))
#' }
as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
as.mo <- function(x, Becker = FALSE, Lancefield = FALSE, allow_uncertain = FALSE) {
if (NCOL(x) == 2) {
# support tidyverse selection like: df %>% select(colA, colB)
@@ -118,17 +147,33 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
}
}
MOs <- AMR::microorganisms %>%
arrange(prevalence) %>% # more expected result on multiple findings
filter(!mo %like% '^_FAM', # don't search in those
(nchar(mo) > 3 | mo %in% c("GNR", "GPR", "GNC", "GPC"))) # no genera
MOs <- as.data.table(AMR::microorganisms)
setkey(MOs, prevalence, tsn)
MOs_mostprevalent <- MOs[prevalence != 9999,]
MOs_allothers <- NULL # will be set later, if needed
MOs_old <- NULL # will be set later, if needed
if (all(unique(x) %in% MOs[,mo])) {
class(x) <- "mo"
attr(x, 'package') <- 'AMR'
attr(x, 'ITIS') <- TRUE
return(x)
}
if (AMR::is.mo(x) & isTRUE(attributes(x)$ITIS)) {
# check for new mo class, data coming from ITIS
return(x)
}
failures <- character(0)
x_input <- x
# only check the uniques, which is way faster
x <- unique(x)
x_backup <- x
x_backup <- trimws(x, which = "both")
x_species <- paste(x_backup, "species")
# translate to English for supported languages of mo_property
x <- gsub("(Gruppe|gruppe|groep|grupo|gruppo|groupe)", "group", x)
# remove 'empty' genus and species values
@@ -138,6 +183,7 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
# but spaces before and after should be omitted
x <- trimws(x, which = "both")
x_trimmed <- x
x_trimmed_species <- paste(x_trimmed, "species")
# replace space by regex sign
x_withspaces <- gsub(" ", ".* ", x, fixed = TRUE)
x <- gsub(" ", ".*", x, fixed = TRUE)
@@ -148,111 +194,137 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
x_withspaces <- paste0('^', x_withspaces, '$')
# cat(paste0('x "', x, '"\n'))
# cat(paste0('x_species "', x_species, '"\n'))
# cat(paste0('x_withspaces_all "', x_withspaces_all, '"\n'))
# cat(paste0('x_withspaces_start "', x_withspaces_start, '"\n'))
# cat(paste0('x_withspaces "', x_withspaces, '"\n'))
# cat(paste0('x_backup "', x_backup, '"\n'))
# cat(paste0('x_trimmed "', x_trimmed, '"\n'))
# cat(paste0('x_trimmed_species "', x_trimmed_species, '"\n'))
for (i in 1:length(x)) {
if (identical(x_trimmed[i], "")) {
if (identical(x_trimmed[i], "") | is.na(x_trimmed[i])) {
# empty values
x[i] <- NA
next
}
if (toupper(x_backup[i]) %in% AMR::microorganisms$mo) {
# is already a valid MO code
x[i] <- toupper(x_backup[i])
next
}
if (toupper(x_trimmed[i]) %in% AMR::microorganisms$mo) {
# is already a valid MO code
x[i] <- toupper(x_trimmed[i])
next
}
if (tolower(x_backup[i]) %in% tolower(AMR::microorganisms$fullname)) {
# is exact match in fullname
x[i] <- AMR::microorganisms[which(AMR::microorganisms$fullname == x_backup[i]), ]$mo[1L]
next
}
# CoNS/CoPS in different languages (support for German, Dutch, Spanish, Portuguese) ----
if (tolower(x[i]) %like% '[ck]oagulas[ea] negatie?[vf]'
| tolower(x_trimmed[i]) %like% '[ck]oagulas[ea] negatie?[vf]'
| tolower(x[i]) %like% '[ck]o?ns[^a-z]?$') {
# coerce S. coagulase negative
x[i] <- 'STACNS'
next
}
if (tolower(x[i]) %like% '[ck]oagulas[ea] positie?[vf]'
| tolower(x_trimmed[i]) %like% '[ck]oagulas[ea] positie?[vf]'
| tolower(x[i]) %like% '[ck]o?ps[^a-z]?$') {
# coerce S. coagulase positive
x[i] <- 'STACPS'
next
}
# translate known trivial abbreviations to genus + species ----
if (!is.na(x_trimmed[i])) {
if (toupper(x_trimmed[i]) == 'MRSA'
| toupper(x_trimmed[i]) == 'VISA'
| toupper(x_trimmed[i]) == 'VRSA') {
x[i] <- 'STAAUR'
x[i] <- 'B_STPHY_AUR'
next
}
if (toupper(x_trimmed[i]) == 'MRSE') {
x[i] <- 'STAEPI'
x[i] <- 'B_STPHY_EPI'
next
}
if (toupper(x_trimmed[i]) == 'VRE') {
x[i] <- 'ENCSPP'
x[i] <- 'B_ENTRC'
next
}
if (toupper(x_trimmed[i]) == 'MRPA') {
# multi resistant P. aeruginosa
x[i] <- 'PSEAER'
x[i] <- 'B_PDMNS_AER'
next
}
if (toupper(x_trimmed[i]) %in% c('PISP', 'PRSP', 'VISP', 'VRSP')) {
# peni I, peni R, vanco I, vanco R: S. pneumoniae
x[i] <- 'STCPNE'
x[i] <- 'B_STRPTC_PNE'
next
}
if (toupper(x_trimmed[i]) %like% '^G[ABCDFGHK]S$') {
x[i] <- gsub("G([ABCDFGHK])S", "B_STRPTC_GR\\1", x_trimmed[i])
next
}
if (toupper(x_trimmed[i]) %like% '^G[ABCDFHK]S$') {
x[i] <- gsub("G([ABCDFHK])S", "STCGR\\1", x_trimmed[i])
# CoNS/CoPS in different languages (support for German, Dutch, Spanish, Portuguese) ----
if (tolower(x[i]) %like% '[ck]oagulas[ea] negatie?[vf]'
| tolower(x_trimmed[i]) %like% '[ck]oagulas[ea] negatie?[vf]'
| tolower(x[i]) %like% '[ck]o?ns[^a-z]?$') {
# coerce S. coagulase negative
x[i] <- 'B_STPHY_CNS'
next
}
if (tolower(x[i]) %like% '[ck]oagulas[ea] positie?[vf]'
| tolower(x_trimmed[i]) %like% '[ck]oagulas[ea] positie?[vf]'
| tolower(x[i]) %like% '[ck]o?ps[^a-z]?$') {
# coerce S. coagulase positive
x[i] <- 'B_STPHY_CPS'
next
}
}
# try any match keeping spaces ----
found <- MOs[which(MOs$fullname %like% x_withspaces[i]),]$mo
# FIRST TRY FULLNAMES AND CODES
# if only genus is available, don't select species
if (all(!c(x[i], x_trimmed[i]) %like% " ")) {
found <- MOs[tolower(fullname) %in% tolower(c(x_species[i], x_trimmed_species[i])), mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
if (nchar(x_trimmed[i]) > 4) {
# not when abbr is esco, stau, klpn, etc.
found <- MOs[tolower(fullname) %like% gsub(" ", ".*", x_trimmed_species[i], fixed = TRUE), mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
}
}
# search for GLIMS code ----
found <- AMR::microorganisms.umcg[which(toupper(AMR::microorganisms.umcg$umcg) == toupper(x_trimmed[i])),]$mo
if (length(found) > 0) {
x[i] <- found[1L]
x[i] <- MOs[mo.old == found, mo][1L]
next
}
# try the same, now based on genus + species ----
found <- MOs[which(paste(MOs$genus, MOs$species) %like% x_withspaces[i]),]$mo
# TRY FIRST THOUSAND MOST PREVALENT IN HUMAN INFECTIONS ----
found <- MOs_mostprevalent[tolower(fullname) %in% tolower(c(x_backup[i], x_trimmed[i])), mo]
# most probable: is exact match in fullname
if (length(found) > 0) {
x[i] <- found[1L]
next
}
found <- MOs_mostprevalent[tsn == x_trimmed[i], mo]
# is a valid TSN
if (length(found) > 0) {
x[i] <- found[1L]
next
}
found <- MOs_mostprevalent[mo == toupper(x_backup[i]), mo]
# is a valid mo
if (length(found) > 0) {
x[i] <- found[1L]
next
}
found <- MOs_mostprevalent[mo.old == toupper(x_backup[i])
| (substr(x_backup[i], 4, 6) == "SPP" & mo.old == substr(x_backup[i], 1, 3)), mo]
# is a valid old mo
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try any match with genus, keeping spaces, not ending with $ ----
found <- MOs[which(MOs$genus %like% x_withspaces_start[i] & MOs$mo %like% 'SPP$'),]$mo
# try any match keeping spaces ----
found <- MOs_mostprevalent[fullname %like% x_withspaces[i], mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try any match keeping spaces, not ending with $ ----
found <- MOs[which(MOs$fullname %like% x_withspaces_start[i]),]$mo
found <- MOs_mostprevalent[fullname %like% x_withspaces_start[i], mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try any match disregarding spaces ----
found <- MOs[which(MOs$fullname %like% x[i]),]$mo
found <- MOs_mostprevalent[fullname %like% x[i], mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
@@ -260,14 +332,100 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
# try fullname without start and stop regex, to also find subspecies ----
# like "K. pneu rhino" -> "Klebsiella pneumoniae (rhinoscleromatis)" = KLEPNERH
found <- MOs[which(gsub("[\\(\\)]", "", MOs$fullname) %like% x_withspaces_all[i]),]$mo
found <- MOs_mostprevalent[fullname %like% x_withspaces_start[i], mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# search for GLIMS code ----
found <- AMR::microorganisms.umcg[which(toupper(AMR::microorganisms.umcg$umcg) == toupper(x_trimmed[i])),]$mo
# try splitting of characters and then find ID ----
# like esco = E. coli, klpn = K. pneumoniae, stau = S. aureus
x_split <- x
x_length <- nchar(x_trimmed[i])
x_split[i] <- paste0(x_trimmed[i] %>% substr(1, x_length / 2) %>% trimws(),
'.* ',
x_trimmed[i] %>% substr((x_length / 2) + 1, x_length) %>% trimws())
found <- MOs_mostprevalent[fullname %like% paste0('^', x_split[i]), mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try any match with text before and after original search string ----
# so "negative rods" will be "GNR"
# if (x_trimmed[i] %like% "^Gram") {
# x_trimmed[i] <- gsub("^Gram", "", x_trimmed[i], ignore.case = TRUE)
# # remove leading and trailing spaces again
# x_trimmed[i] <- trimws(x_trimmed[i], which = "both")
# }
# if (!is.na(x_trimmed[i])) {
# found <- MOs_mostprevalent[fullname %like% x_trimmed[i], mo]
# if (length(found) > 0) {
# x[i] <- found[1L]
# next
# }
# }
# THEN TRY ALL OTHERS ----
if (is.null(MOs_allothers)) {
MOs_allothers <- MOs[prevalence == 9999,]
}
found <- MOs_allothers[tolower(fullname) == tolower(x_backup[i]), mo]
# most probable: is exact match in fullname
if (length(found) > 0) {
x[i] <- found[1L]
next
}
found <- MOs_allothers[tolower(fullname) == tolower(x_trimmed[i]), mo]
# most probable: is exact match in fullname
if (length(found) > 0) {
x[i] <- found[1L]
next
}
found <- MOs_allothers[tsn == x_trimmed[i], mo]
# is a valid TSN
if (length(found) > 0) {
x[i] <- found[1L]
next
}
found <- MOs_allothers[mo == toupper(x_backup[i]), mo]
# is a valid mo
if (length(found) > 0) {
x[i] <- found[1L]
next
}
found <- MOs_allothers[mo.old == toupper(x_backup[i]), mo]
# is a valid old mo
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try any match keeping spaces ----
found <- MOs_allothers[fullname %like% x_withspaces[i], mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try any match keeping spaces, not ending with $ ----
found <- MOs_allothers[fullname %like% x_withspaces_start[i], mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try any match disregarding spaces ----
found <- MOs_allothers[fullname %like% x[i], mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try fullname without start and stop regex, to also find subspecies ----
# like "K. pneu rhino" -> "Klebsiella pneumoniae (rhinoscleromatis)" = KLEPNERH
found <- MOs_allothers[fullname %like% x_withspaces_start[i], mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
@@ -280,23 +438,52 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
x_split[i] <- paste0(x_trimmed[i] %>% substr(1, x_length / 2) %>% trimws(),
'.* ',
x_trimmed[i] %>% substr((x_length / 2) + 1, x_length) %>% trimws())
found <- MOs[which(MOs$fullname %like% paste0('^', x_split[i])),]$mo
found <- MOs_allothers[fullname %like% paste0('^', x_split[i]), mo]
if (length(found) > 0) {
x[i] <- found[1L]
next
}
# try any match with text before and after original search string ----
# so "negative rods" will be "GNR"
if (x_trimmed[i] %like% "^Gram") {
x_trimmed[i] <- gsub("^Gram", "", x_trimmed[i], ignore.case = TRUE)
# remove leading and trailing spaces again
x_trimmed[i] <- trimws(x_trimmed[i], which = "both")
# # try any match with text before and after original search string ----
# # so "negative rods" will be "GNR"
# if (x_trimmed[i] %like% "^Gram") {
# x_trimmed[i] <- gsub("^Gram", "", x_trimmed[i], ignore.case = TRUE)
# # remove leading and trailing spaces again
# x_trimmed[i] <- trimws(x_trimmed[i], which = "both")
# }
# if (!is.na(x_trimmed[i])) {
# found <- MOs_allothers[fullname %like% x_trimmed[i], mo]
# if (length(found) > 0) {
# x[i] <- found[1L]
# next
# }
# }
# MISCELLANEOUS ----
# look for old taxonomic names ----
if (is.null(MOs_old)) {
MOs_old <- as.data.table(microorganisms.old)
setkey(MOs_old, name, tsn_new)
}
if (!is.na(x_trimmed[i])) {
found <- MOs[which(MOs$fullname %like% x_trimmed[i]),]$mo
if (length(found) > 0) {
x[i] <- found[1L]
found <- MOs_old[tolower(name) == tolower(x_backup[i]) |
tsn == x_trimmed[i],]
if (NROW(found) > 0) {
x[i] <- MOs[tsn == found[1, tsn_new], mo]
message("Note: '", found[1, name], "' was renamed to '",
MOs[tsn == found[1, tsn_new], fullname], "' by ",
found[1, authors], " in ", found[1, year])
next
}
# check for uncertain results ----
# (1) try to strip off one element and check the remains
if (allow_uncertain == TRUE) {
x_strip <- x_backup[i] %>% strsplit(" ") %>% unlist()
x_strip <- x_strip[1:length(x_strip) - 1]
x[i] <- suppressWarnings(suppressMessages(as.mo(x_strip)))
if (!is.na(x[i])) {
warning("Uncertain result: '", x_backup[i], "' -> '", MOs[mo == x[i], fullname], "' (", x[i], ")")
next
}
}
@@ -309,7 +496,7 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
failures <- failures[!failures %in% c(NA, NULL, NaN)]
if (length(failures) > 0) {
warning("These ", length(failures) , " values could not be coerced to a valid mo: ",
warning("These ", length(failures) , " values could not be coerced (try again with allow_uncertain = TRUE):\n",
paste('"', unique(failures), '"', sep = "", collapse = ', '),
".",
call. = FALSE)
@@ -341,43 +528,36 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
"pseudintermedius", "pseudointermedius",
"schleiferi")) %>%
pull(mo)
x[x %in% CoNS] <- "STACNS"
x[x %in% CoPS] <- "STACPS"
x[x %in% CoNS] <- "B_STPHY_CNS"
x[x %in% CoPS] <- "B_STPHY_CPS"
if (Becker == "all") {
x[x == "STAAUR"] <- "STACPS"
x[x == "B_STPHY_AUR"] <- "B_STPHY_CPS"
}
}
# Lancefield ----
if (Lancefield == TRUE | Lancefield == "all") {
# group A
x[x == "STCPYO"] <- "STCGRA" # S. pyogenes
x[x == "B_STRPTC_PYO"] <- "B_STRPTC_GRA" # S. pyogenes
# group B
x[x == "STCAGA"] <- "STCGRB" # S. agalactiae
x[x == "B_STRPTC_AGA"] <- "B_STRPTC_GRB" # S. agalactiae
# group C
S_groupC <- MOs %>% filter(genus == "Streptococcus",
species %in% c("equisimilis", "equi",
"zooepidemicus", "dysgalactiae")) %>%
pull(mo)
x[x %in% S_groupC] <- "STCGRC" # S. agalactiae
x[x %in% S_groupC] <- "B_STRPTC_GRC" # the group C species selected above
if (Lancefield == "all") {
x[substr(x, 1, 3) == "ENC"] <- "STCGRD" # all Enterococci
x[substr(x, 1, 7) == "B_ENTRC"] <- "B_STRPTC_GRD" # all Enterococci
}
# group F
x[x == "STCANG"] <- "STCGRF" # S. anginosus
x[x == "B_STRPTC_ANG"] <- "B_STRPTC_GRF" # S. anginosus
# group H
x[x == "STCSAN"] <- "STCGRH" # S. sanguis
x[x == "B_STRPTC_SAN"] <- "B_STRPTC_GRH" # S. sanguinis
# group K
x[x == "STCSAL"] <- "STCGRK" # S. salivarius
x[x == "B_STRPTC_SAL"] <- "B_STRPTC_GRK" # S. salivarius
}
# for the returned genera without species, add species ----
# like "ESC" -> "ESCSPP", but only where the input contained it
indices <- nchar(unique(x)) == 3 & !x %like% "[A-Z]{3}SPP" & !x %in% c("GNR", "GPR", "GNC", "GPC",
"GNS", "GPS", "GNK", "GPK")
indices <- indices[!is.na(indices)]
x[indices] <- paste0(x[indices], 'SPP')
# left join the found results to the original input values (x_input)
df_found <- data.frame(input = as.character(unique(x_input)),
found = x,
@@ -392,9 +572,11 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
class(x) <- "mo"
attr(x, 'package') <- 'AMR'
attr(x, 'ITIS') <- TRUE
x
}
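The Lancefield step in the hunk above recodes individual species codes to group codes. A minimal base R sketch of that idea follows. It assumes a hypothetical helper `recode_groups` and a named vector built only from codes that appear in this diff, and it leaves out group C and the Enterococci because those need the lookup table. It is not the package's implementation.

```r
# Species code -> Lancefield group code, taken from the new lines in this diff.
lancefield_map <- c(
  B_STRPTC_PYO = "B_STRPTC_GRA",  # S. pyogenes   -> group A
  B_STRPTC_AGA = "B_STRPTC_GRB",  # S. agalactiae -> group B
  B_STRPTC_ANG = "B_STRPTC_GRF",  # S. anginosus  -> group F
  B_STRPTC_SAN = "B_STRPTC_GRH",  # S. sanguinis  -> group H
  B_STRPTC_SAL = "B_STRPTC_GRK"   # S. salivarius -> group K
)

# Hypothetical helper: replace every code found in `map`, keep all others as-is.
recode_groups <- function(x, map) {
  hit <- x %in% names(map)          # NA never matches, so NA stays NA
  x[hit] <- unname(map[x[hit]])
  x
}

codes <- c("B_STRPTC_PYO", "B_ESCHR_COL", NA, "B_STRPTC_SAL")
recoded <- recode_groups(codes, lancefield_map)
print(recoded)
```

A single named vector keeps the mapping in one place, where the diff uses one assignment per group.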
#' @rdname as.mo
#' @export
is.mo <- function(x) {
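
The matching logic in the `as.mo` hunks above tries strict rules first and looser ones later, with an optional uncertain fallback that drops the last word of the input. The sketch below shows that cascade in base R under several assumptions: `lookup` is a made-up three-row table, `find_mo` is a hypothetical name, and plain `grepl` stands in for the package's `%like%` operator. The real function has many more steps.

```r
# Hypothetical miniature lookup table; the package itself uses `microorganisms`.
lookup <- data.frame(
  mo = c("B_ESCHR_COL", "B_KLBSL_PNE", "B_KLBSL_PNE_RHI"),
  fullname = c("Escherichia coli", "Klebsiella pneumoniae",
               "Klebsiella pneumoniae rhinoscleromatis"),
  stringsAsFactors = FALSE
)

# Return the first matching code, trying stricter rules before looser ones.
find_mo <- function(input, allow_uncertain = FALSE) {
  input <- trimws(input)
  # 1. the input already is a valid code
  found <- lookup$mo[lookup$mo == toupper(input)]
  if (length(found) > 0) return(found[1L])
  # 2. case-insensitive regular expression match on the full name
  found <- lookup$mo[grepl(input, lookup$fullname, ignore.case = TRUE)]
  if (length(found) > 0) return(found[1L])
  # 3. optional fallback: drop the last word, retry, and warn about it
  if (allow_uncertain) {
    words <- strsplit(input, " ", fixed = TRUE)[[1]]
    if (length(words) > 1) {
      shorter <- paste(head(words, -1), collapse = " ")
      found <- find_mo(shorter, allow_uncertain = FALSE)
      if (!is.na(found)) {
        warning("Uncertain result: '", input, "' -> ", found, call. = FALSE)
        return(found)
      }
    }
  }
  NA_character_
}

find_mo("b_klbsl_pne")        # matched by rule 1
find_mo("Escherichia coli")   # matched by rule 2
find_mo("Escherichia coli xyz")                                           # no match
suppressWarnings(find_mo("Escherichia coli xyz", allow_uncertain = TRUE)) # rule 3
```

Using `head(words, -1)` to drop the last element avoids relying on operator precedence in an expression such as `1:length(x) - 1`.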

170
R/mo_property.R

@@ -22,21 +22,17 @@
#' @param x any (vector of) text that can be coerced to a valid microorganism code with \code{\link{as.mo}}
#' @param property one of the column names of the \code{\link{microorganisms}} data set, like \code{"mo"}, \code{"bactsys"}, \code{"family"}, \code{"genus"}, \code{"species"}, \code{"fullname"}, \code{"gramstain"} and \code{"aerobic"}
#' @inheritParams as.mo
#' @param language language of the returned text, defaults to the systems language. Either one of \code{"en"} (English), \code{"de"} (German), \code{"nl"} (Dutch), \code{"es"} (Spanish) or \code{"pt"} (Portuguese).
#' @source
#' [1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \url{https://dx.doi.org/10.1128/CMR.00109-13}
#'
#' [2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \url{https://dx.doi.org/10.1084/jem.57.4.571}
#'
#' [3] Integrated Taxonomic Information System (ITIS) on-line database, \url{https://www.itis.gov}.
#' @param language language of the returned text, defaults to the system language but can also be set with \code{\link{getOption}("AMR_locale")}. Either one of \code{"en"} (English), \code{"de"} (German), \code{"nl"} (Dutch), \code{"es"} (Spanish) or \code{"pt"} (Portuguese).
#' @inheritSection as.mo ITIS
#' @inheritSection as.mo Source
#' @rdname mo_property
#' @name mo_property
#' @return A logical (in case of \code{mo_aerobic}), a list (in case of \code{mo_taxonomy}), a character otherwise
#' @export
#' @importFrom dplyr %>% left_join pull
#' @seealso \code{\link{microorganisms}}
#' @examples
#' # All properties
#' mo_subkingdom("E. coli") # "Negibacteria"
#' mo_phylum("E. coli") # "Proteobacteria"
#' mo_class("E. coli") # "Gammaproteobacteria"
#' mo_order("E. coli") # "Enterobacteriales"
@@ -46,42 +42,30 @@
#' mo_subspecies("E. coli") # ""
#' mo_fullname("E. coli") # "Escherichia coli"
#' mo_shortname("E. coli") # "E. coli"
#' mo_gramstain("E. coli") # "Gram negative"
#' mo_TSN("E. coli") # 285
#' mo_type("E. coli") # "Bacteria"
#' mo_gramstain("E. coli") # "Negative rods"
#' mo_aerobic("E. coli") # TRUE
#'
#'
#' # Abbreviations known in the field
#' mo_genus("MRSA") # "Staphylococcus"
#' mo_species("MRSA") # "aureus"
#' mo_shortname("MRSA") # "S. aureus"
#' mo_gramstain("MRSA") # "Positive cocci"
#' mo_gramstain("MRSA") # "Gram positive"
#'
#' mo_genus("VISA") # "Staphylococcus"
#' mo_species("VISA") # "aureus"
#'
#'
#' # Known subspecies
#' mo_genus("EHEC") # "Escherichia"
#' mo_species("EHEC") # "coli"
#' mo_subspecies("EHEC") # "EHEC"
#' mo_fullname("EHEC") # "Escherichia coli (EHEC)"
#' mo_shortname("EHEC") # "E. coli"
#'
#' mo_genus("doylei") # "Campylobacter"
#' mo_species("doylei") # "jejuni"
#' mo_fullname("doylei") # "Campylobacter jejuni (doylei)"
#' mo_fullname("doylei") # "Campylobacter jejuni doylei"
#'
#' mo_fullname("K. pneu rh") # "Klebsiella pneumoniae (rhinoscleromatis)"
#' mo_fullname("K. pneu rh") # "Klebsiella pneumoniae rhinoscleromatis"
#' mo_shortname("K. pneu rh") # "K. pneumoniae"
#'
#'
#' # Anaerobic bacteria
#' mo_genus("B. fragilis") # "Bacteroides"
#' mo_species("B. fragilis") # "fragilis"
#' mo_aerobic("B. fragilis") # FALSE
#'
#'
#' # Becker classification, see ?as.mo
#' mo_fullname("S. epi") # "Staphylococcus epidermidis"
#' mo_fullname("S. epi", Becker = TRUE) # "Coagulase Negative Staphylococcus (CoNS)"
@@ -99,10 +83,9 @@
#' mo_type("E. coli", language = "de") # "Bakterium"
#' mo_type("E. coli", language = "nl") # "Bacterie"
#' mo_type("E. coli", language = "es") # "Bakteria"
#' mo_gramstain("E. coli", language = "de") # "Negative Staebchen"
#' mo_gramstain("E. coli", language = "nl") # "Negatieve staven"
#' mo_gramstain("E. coli", language = "es") # "Bacilos negativos"
#' mo_gramstain("Giardia", language = "pt") # "Parasitas"
#' mo_gramstain("E. coli", language = "de") # "Gramnegativ"
#' mo_gramstain("E. coli", language = "nl") # "Gram-negatief"
#' mo_gramstain("E. coli", language = "es") # "Gram negativo"
#'
#' mo_fullname("S. pyogenes",
#' Lancefield = TRUE,
@@ -112,7 +95,7 @@
#' language = "nl") # "Streptococcus groep A"
#'
#'
#' # Complete taxonomy up to Phylum, returns a list
#' # Complete taxonomy up to Subkingdom, returns a list
#' mo_taxonomy("E. coli")
mo_fullname <- function(x, Becker = FALSE, Lancefield = FALSE, language = NULL) {
mo_property(x, "fullname", Becker = Becker, Lancefield = Lancefield, language = language)
@@ -191,6 +174,12 @@ mo_phylum <- function(x) {
mo_property(x, "phylum")
}
#' @rdname mo_property
#' @export
mo_subkingdom <- function(x) {
mo_property(x, "subkingdom")
}
#' @rdname mo_property
#' @export
mo_type <- function(x, language = NULL) {
@@ -199,33 +188,43 @@ mo_type <- function(x, language = NULL) {
#' @rdname mo_property
#' @export
mo_gramstain <- function(x, language = NULL) {
mo_property(x, "gramstain", language = language)
mo_TSN <- function(x) {
mo_property(x, "tsn")
}
#' @rdname mo_property
#' @export
mo_aerobic <- function(x) {
mo_property(x, "aerobic")
mo_gramstain <- function(x, language = NULL) {
mo_property(x, "gramstain", language = language)
}
#' @rdname mo_property
#' @importFrom data.table data.table as.data.table setkey
#' @export
mo_property <- function(x, property = 'fullname', Becker = FALSE, Lancefield = FALSE, language = NULL) {
property <- tolower(property[1])
if (!property %in% colnames(AMR::microorganisms)) {
stop("invalid property: ", property, " - use a column name of the `microorganisms` data set")
}
result1 <- as.mo(x = x, Becker = Becker, Lancefield = Lancefield) # this will give a warning if x cannot be coerced
result2 <- suppressWarnings(
data.frame(mo = result1, stringsAsFactors = FALSE) %>%
left_join(AMR::microorganisms, by = "mo") %>%
pull(property)
)
if (property != "aerobic") {
if (Becker == TRUE | Lancefield == TRUE | !is.mo(x)) {
# this will give a warning if x cannot be coerced
result1 <- AMR::as.mo(x = x, Becker = Becker, Lancefield = Lancefield)
} else {
result1 <- x
}
A <- data.table(mo = result1, stringsAsFactors = FALSE)
B <- as.data.table(AMR::microorganisms)
setkey(B, mo)
result2 <- B[A, on = 'mo', ..property][[1]]
if (property == "tsn") {
result2 <- as.integer(result2)
} else {
# will else not retain `logical` class
result2[x %in% c("", NA) | result2 %in% c("", NA, "(no MO)")] <- ""
result2 <- mo_translate(result2, language = language)
if (property %in% c("fullname", "shortname", "genus", "species", "subspecies", "type", "gramstain")) {
result2 <- mo_translate(result2, language = language)
}
}
result2
}
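The rewritten `mo_property` above looks up one column of the reference table for each code and keeps the order of the input. The sketch below shows the same result with base R's `match` as a stand-in for the keyed `data.table` join. The table `ref` and the helper `lookup_property` are hypothetical and exist only for illustration.

```r
# Hypothetical two-row reference table standing in for AMR::microorganisms.
ref <- data.frame(
  mo = c("B_ESCHR_COL", "B_STPHY_AUR"),
  genus = c("Escherichia", "Staphylococcus"),
  species = c("coli", "aureus"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one value per input code, NA where the code is unknown.
lookup_property <- function(codes, property, ref) {
  property <- tolower(property[1])
  if (!property %in% colnames(ref)) {
    stop("invalid property: ", property)
  }
  # match() returns the row position in `ref` for each code, in input order
  ref[[property]][match(codes, ref$mo)]
}

genera <- lookup_property(c("B_STPHY_AUR", "UNKNOWN", "B_ESCHR_COL"), "Genus", ref)
print(genera)
```

Both approaches return one value per input element, with `NA` for codes that are absent from the reference table.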
@@ -234,7 +233,8 @@ mo_property <- function(x, property = 'fullname', Becker = FALSE, Lancefield = F
#' @export
mo_taxonomy <- function(x) {
x <- as.mo(x)
base::list(phylum = mo_phylum(x),
base::list(subkingdom = mo_subkingdom(x),
phylum = mo_phylum(x),
class = mo_class(x),
order = mo_order(x),
family = mo_family(x),
@@ -266,15 +266,11 @@ mo_translate <- function(x, language) {
gsub("Coagulase Positive Staphylococcus","Koagulase-positive Staphylococcus", ., fixed = TRUE) %>%
gsub("Beta-haemolytic Streptococcus", "Beta-h\u00e4molytischer Streptococcus", ., fixed = TRUE) %>%
gsub("(no MO)", "(kein MO)", ., fixed = TRUE) %>%
gsub("Negative rods", "Negative St\u00e4bchen", ., fixed = TRUE) %>%
gsub("Negative cocci", "Negative Kokken", ., fixed = TRUE) %>%
gsub("Positive rods", "Positive St\u00e4bchen", ., fixed = TRUE) %>%
gsub("Positive cocci", "Positive Kokken", ., fixed = TRUE) %>%
gsub("Parasites", "Parasiten", ., fixed = TRUE) %>%
gsub("Fungi and yeasts", "Pilze und Hefen", ., fixed = TRUE) %>%
gsub("Bacteria", "Bakterium", ., fixed = TRUE) %>%
gsub("Fungus/yeast", "Pilz/Hefe", ., fixed = TRUE) %>%
gsub("Parasite", "Parasit", ., fixed = TRUE) %>%
gsub("Gram negative", "Gramnegativ", ., fixed = TRUE) %>%
gsub("Gram positive", "Grampositiv", ., fixed = TRUE) %>%
gsub("Bacteria", "Bakterien", ., fixed = TRUE) %>%
gsub("Fungi", "Hefen/Pilze", ., fixed = TRUE) %>%
gsub("Protozoa", "Protozoen", ., fixed = TRUE) %>%
gsub("biogroup", "Biogruppe", ., fixed = TRUE) %>%
gsub("biotype", "Biotyp", ., fixed = TRUE) %>%
gsub("vegetative", "vegetativ", ., fixed = TRUE) %>%
@@ -287,15 +283,11 @@ mo_translate <- function(x, language) {
gsub("Coagulase Positive Staphylococcus","Coagulase-positieve Staphylococcus", ., fixed = TRUE) %>%
gsub("Beta-haemolytic Streptococcus", "Beta-hemolytische Streptococcus", ., fixed = TRUE) %>%
gsub("(no MO)", "(geen MO)", ., fixed = TRUE) %>%
gsub("Negative rods", "Negatieve staven", ., fixed = TRUE) %>%
gsub("Negative cocci", "Negatieve kokken", ., fixed = TRUE) %>%
gsub("Positive rods", "Positieve staven", ., fixed = TRUE) %>%
gsub("Positive cocci", "Positieve kokken", ., fixed = TRUE) %>%
gsub("Parasites", "Parasieten", ., fixed = TRUE) %>%
gsub("Fungi and yeasts", "Schimmels en gisten", ., fixed = TRUE) %>%
gsub("Bacteria", "Bacterie", ., fixed = TRUE) %>%
gsub("Fungus/yeast", "Schimmel/gist", ., fixed = TRUE) %>%
gsub("Parasite", "Parasiet", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram-negatief", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram-positief", ., fixed = TRUE) %>%
gsub("Bacteria", "Bacteri\u00ebn", ., fixed = TRUE) %>%
gsub("Fungi", "Schimmels/gisten", ., fixed = TRUE) %>%
gsub("Protozoa", "protozo\u00ebn", ., fixed = TRUE) %>%
gsub("biogroup", "biogroep", ., fixed = TRUE) %>%
# gsub("biotype", "biotype", ., fixed = TRUE) %>%
gsub("vegetative", "vegetatief", ., fixed = TRUE) %>%
@@ -308,15 +300,11 @@ mo_translate <- function(x, language) {
gsub("Coagulase Positive Staphylococcus","Staphylococcus coagulasa positivo", ., fixed = TRUE) %>%
gsub("Beta-haemolytic Streptococcus", "Streptococcus Beta-hemol\u00edtico", ., fixed = TRUE) %>%
gsub("(no MO)", "(sin MO)", ., fixed = TRUE) %>%
gsub("Negative rods", "Bacilos negativos", ., fixed = TRUE) %>%
gsub("Negative cocci", "Cocos negativos", ., fixed = TRUE) %>%
gsub("Positive rods", "Bacilos positivos", ., fixed = TRUE) %>%
gsub("Positive cocci", "Cocos positivos", ., fixed = TRUE) %>%
gsub("Parasites", "Par\u00e1sitos", ., fixed = TRUE) %>%
gsub("Fungi and yeasts", "Hongos y levaduras", ., fixed = TRUE) %>%
# gsub("Bacteria", "Bacteria", ., fixed = TRUE) %>%
gsub("Fungus/yeast", "Hongo/levadura", ., fixed = TRUE) %>%
gsub("Parasite", "Par\u00e1sito", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram negativo", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram positivo", ., fixed = TRUE) %>%
gsub("Bacteria", "Bacterias", ., fixed = TRUE) %>%
gsub("Fungi", "Hongos", ., fixed = TRUE) %>%
gsub("Protozoa", "Protozoarios", ., fixed = TRUE) %>%
gsub("biogroup", "biogrupo", ., fixed = TRUE) %>%
gsub("biotype", "biotipo", ., fixed = TRUE) %>%
gsub("vegetative", "vegetativo", ., fixed = TRUE) %>%
@@ -329,15 +317,11 @@ mo_translate <- function(x, language) {
gsub("Coagulase Positive Staphylococcus","Staphylococcus coagulase positivo", ., fixed = TRUE) %>%
gsub("Beta-haemolytic Streptococcus", "Streptococcus Beta-hemol\u00edtico", ., fixed = TRUE) %>%
gsub("(no MO)", "(sem MO)", ., fixed = TRUE) %>%
gsub("Negative rods", "Bacilos negativos", ., fixed = TRUE) %>%
gsub("Negative cocci", "Cocos negativos", ., fixed = TRUE) %>%
gsub("Positive rods", "Bacilos positivos", ., fixed = TRUE) %>%
gsub("Positive cocci", "Cocos positivos", ., fixed = TRUE) %>%
gsub("Parasites", "Parasitas", ., fixed = TRUE) %>%
gsub("Fungi and yeasts", "Cogumelos e leveduras", ., fixed = TRUE) %>%
gsub("Bacteria", "Bact\u00e9ria", ., fixed = TRUE) %>%
gsub("Fungus/yeast", "Cogumelo/levedura", ., fixed = TRUE) %>%
gsub("Parasite", "Parasita", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram negativo", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram positivo", ., fixed = TRUE) %>%
gsub("Bacteria", "Bact\u00e9rias", ., fixed = TRUE) %>%
gsub("Fungi", "Fungos", ., fixed = TRUE) %>%
gsub("Protozoa", "Protozo\u00e1rios", ., fixed = TRUE) %>%
gsub("biogroup", "biogrupo", ., fixed = TRUE) %>%
gsub("biotype", "bi\u00f3tipo", ., fixed = TRUE) %>%
gsub("vegetative", "vegetativo", ., fixed = TRUE) %>%
@@ -350,15 +334,11 @@ mo_translate <- function(x, language) {
gsub("Coagulase Positive Staphylococcus","Staphylococcus positivo coagulasi", ., fixed = TRUE) %>%
gsub("Beta-haemolytic Streptococcus", "Streptococcus Beta-emolitico", ., fixed = TRUE) %>%
gsub("(no MO)", "(non MO)", ., fixed = TRUE) %>%
gsub("Negative rods", "Bastoncini Gram-negativi", ., fixed = TRUE) %>%
gsub("Negative cocci", "Cocchi Gram-negativi", ., fixed = TRUE) %>%
gsub("Positive rods", "Bastoncini Gram-positivi", ., fixed = TRUE) %>%
gsub("Positive cocci", "Cocchi Gram-positivi", ., fixed = TRUE) %>%
gsub("Parasites", "Parassiti", ., fixed = TRUE) %>%
gsub("Fungi and yeasts", "Funghi e lieviti", ., fixed = TRUE) %>%
gsub("Bacteria", "Batterio", ., fixed = TRUE) %>%
gsub("Fungus/yeast", "Fungo/lievito", ., fixed = TRUE) %>%
gsub("Parasite", "Parassita", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram negativo", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram positivo", ., fixed = TRUE) %>%
gsub("Bacteria", "Batteri", ., fixed = TRUE) %>%
gsub("Fungi", "Fungo", ., fixed = TRUE) %>%
gsub("Protozoa", "Protozoi", ., fixed = TRUE) %>%
gsub("biogroup", "biogruppo", ., fixed = TRUE) %>%
gsub("biotype", "biotipo", ., fixed = TRUE) %>%
gsub("vegetative", "vegetativo", ., fixed = TRUE) %>%
@@ -371,15 +351,11 @@ mo_translate <- function(x, language) {
gsub("Coagulase Positive Staphylococcus","Staphylococcus \u00e0 coagulase positif", ., fixed = TRUE) %>%
gsub("Beta-haemolytic Streptococcus", "Streptococcus B\u00eata-h\u00e9molytique", ., fixed = TRUE) %>%
gsub("(no MO)", "(pas MO)", ., fixed = TRUE) %>%
gsub("Negative rods", "Bacilles n\u00e9gatif", ., fixed = TRUE) %>%
gsub("Negative cocci", "Cocci n\u00e9gatif", ., fixed = TRUE) %>%
gsub("Positive rods", "Bacilles positif", ., fixed = TRUE) %>%
gsub("Positive cocci", "Cocci positif", ., fixed = TRUE) %>%
# gsub("Parasites", "Parasites", ., fixed = TRUE) %>%
gsub("Fungi and yeasts", "Champignons et levures", ., fixed = TRUE) %>%
gsub("Bacteria", "Bact\u00e9rie", ., fixed = TRUE) %>%
gsub("Fungus/yeast", "Champignon/levure", ., fixed = TRUE) %>%
# gsub("Parasite", "Parasite", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram n\u00e9gatif", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram positif", ., fixed = TRUE) %>%
gsub("Bacteria", "Bact\u00e9ries", ., fixed = TRUE) %>%
gsub("Fungi", "Champignons", ., fixed = TRUE) %>%
gsub("Protozoa", "Protozoaires", ., fixed = TRUE) %>%
gsub("biogroup", "biogroupe", ., fixed = TRUE) %>%
# gsub("biotype", "biotype", ., fixed = TRUE) %>%
gsub("vegetative", "v\u00e9g\u00e9tatif", ., fixed = TRUE) %>%
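
The `mo_translate` hunks above apply a long chain of fixed-string `gsub` calls per language. A compact way to express the same idea is a named vector of replacements applied in order. The sketch below assumes a hypothetical helper `translate_terms` and uses only three German pairs copied from the new lines of this diff.

```r
# English term -> German term, copied from the new lines in this diff.
german <- c(
  "Gram negative" = "Gramnegativ",
  "Gram positive" = "Grampositiv",
  "Bacteria"      = "Bakterien"
)

# Hypothetical helper: apply each replacement in order, as literal text.
translate_terms <- function(x, dictionary) {
  for (i in seq_along(dictionary)) {
    # fixed = TRUE treats the pattern literally, so "(no MO)" needs no escaping
    x <- gsub(names(dictionary)[i], dictionary[[i]], x, fixed = TRUE)
  }
  x
}

translated <- translate_terms(c("Gram negative", "Bacteria", "Escherichia coli"), german)
print(translated)
```

Because the replacements run in sequence, the order of the entries matters whenever one English term is a substring of another.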

26
README.md

@@ -38,13 +38,17 @@ Erwin E.A. Hassing<sup>2</sup>,
* [Copyright](#copyright)
## Why this package?
This R package was intended to make microbial epidemiology easier. Most functions contain extensive help pages to get started.
This R package was intended **to make microbial epidemiology easier**. Most functions contain extensive help pages to get started.
This `AMR` package basically does four important things:
<a href="https://www.itis.gov"><img src="man/figures/itis_logo.jpg" height="50px"></a>
This `AMR` package contains the *complete microbial taxonomic data* from the publicly available Integrated Taxonomic Information System (ITIS, https://www.itis.gov). ITIS is a partnership of U.S., Canadian, and Mexican agencies and taxonomic specialists. The complete taxonomic kingdoms Bacteria, Fungi and Protozoa (from subkingdom to the subspecies level) are included in this package. This allows users to use authoritative taxonomic information for their data analyses on any microorganisms, not only human pathogens.
Combined with several new functions, this `AMR` package basically does four important things:
1. It **cleanses existing data**, by transforming it to reproducible and profound *classes*, making the most efficient use of R. These functions all use artificial intelligence to guess results that you would expect:
* Use `as.mo` to get an ID of a microorganism. The IDs are quite obvious - the ID of *E. coli* is "ESCCOL" and the ID of *S. aureus* is "STAAUR". The function takes almost any text as input that looks like the name or code of a microorganism like "E. coli", "esco" and "esccol". Even `as.mo("MRSA")` will return the ID of *S. aureus*. Moreover, it can group all coagulase negative and positive *Staphylococci*, and can transform *Streptococci* into Lancefield groups. To find bacteria based on your input, this package contains a freely available database of almost 3,000 different (potential) human pathogenic microorganisms.
* Use `as.mo` to get an ID of a microorganism. The IDs are human readable for the trained eye - the ID of *Klebsiella pneumoniae* is "B_KLBSL_PNE" (B stands for Bacteria) and the ID of *S. aureus* is "B_STPHY_AUR". The function takes almost any text as input that looks like the name or code of a microorganism like "E. coli", "esco" and "esccol". Even `as.mo("MRSA")` will return the ID of *S. aureus*. Moreover, it can group all coagulase negative and positive *Staphylococci*, and can transform *Streptococci* into Lancefield groups. To find bacteria based on your input, it uses Artificial Intelligence to look up values in the included ITIS data, consisting of more than 18,000 microorganisms.
* Use `as.rsi` to transform values to valid antimicrobial results. It produces just S, I or R based on your input and warns about invalid values. Even values like "<=0.002; S" (combined MIC/RSI) will result in "S".
* Use `as.mic` to cleanse your MIC values. It produces a so-called factor (called *ordinal* in SPSS) with valid MIC values as levels. A value like "<=0.002; S" (combined MIC/RSI) will result in "<=0.002".
* Use `as.atc` to get the ATC code of an antibiotic as defined by the WHO. This package contains a database with most LIS codes, official names, DDDs and even trade names of antibiotics. For example, the values "Furabid", "Furadantin", "nitro" all return the ATC code of Nitrofurantoine.
@@ -55,7 +59,7 @@ This `AMR` package basically does four important things:
* Use `first_isolate` to identify the first isolates of every patient [using guidelines from the CLSI](https://clsi.org/standards/products/microbiology/documents/m39/) (Clinical and Laboratory Standards Institute).
* You can also identify first *weighted* isolates of every patient, an adjusted version of the CLSI guideline. This takes into account key antibiotics of every strain and compares them.
* Use `MDRO` (abbreviation of Multi Drug Resistant Organisms) to check your isolates for exceptional resistance with country-specific guidelines or EUCAST rules. Currently, national guidelines for Germany and the Netherlands are supported.
* The data set `microorganisms` contains the taxonomic properties of almost 3,000 potential human pathogenic microorganisms (bacteria, fungi/yeasts and parasites). Taxonomic names were downloaded from ITIS (Integrated Taxonomic Information System, http://www.itis.gov). Furhermore, the colloquial name and Gram stain are available, which enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family`, `mo_gramstain` or even `mo_phylum`. As they use `as.mo` internally, they also use artificial intelligence. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. They also come with support for German, Dutch, French, Italian, Spanish and Portuguese. These functions can be used to add new variables to your data.
* The data set `microorganisms` contains the complete taxonomic tree of more than 18,000 microorganisms (bacteria, fungi/yeasts and protozoa). Furthermore, the colloquial name and Gram stain are available, which enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family`, `mo_gramstain` or even `mo_phylum`. As they use `as.mo` internally, they also use artificial intelligence. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. They also come with support for German, Dutch, French, Italian, Spanish and Portuguese. These functions can be used to add new variables to your data.
* The data set `antibiotics` contains the ATC code, LIS codes, official name, trivial name and DDD of both oral and parenteral administration. It also contains a total of 298 trade names. Use functions like `ab_name` and `ab_tradenames` to look up values. The `ab_*` functions use `as.atc` internally so they support AI to guess your expected result. For example, `ab_name("Fluclox")`, `ab_name("Floxapen")` and `ab_name("J01CF05")` will all return `"Flucloxacillin"`. These functions can again be used to add new variables to your data.
3. It **analyses the data** with convenient functions that use well-known methods.
@@ -378,19 +382,19 @@ Learn more about this function with:
```
### Data sets included in package
Datasets to work with antibiotics and bacteria properties.
Data sets to work with antibiotics and bacteria properties.
```r
# Dataset with 2000 random blood culture isolates from anonymised
# Data set with complete taxonomic trees from ITIS, consisting of
# the three kingdoms Bacteria, Fungi and Protozoa
microorganisms # A tibble: 18,831 x 15
# Data set with 2000 random blood culture isolates from anonymised
# septic patients between 2001 and 2017 in 5 Dutch hospitals
septic_patients # A tibble: 2,000 x 49
# Dataset with ATC antibiotics codes, official names, trade names
# Data set with ATC antibiotics codes, official names, trade names
# and DDDs (oral and parenteral)
antibiotics # A tibble: 423 x 18
# Dataset with bacteria codes and properties like gram stain and
# aerobic/anaerobic
microorganisms # A tibble: 2,642 x 14
```
## Copyright

BIN
data/microorganisms.old.rda

Binary file not shown.

BIN
data/microorganisms.rda

Binary file not shown.

BIN
data/septic_patients.rda

Binary file not shown.

73
man/as.mo.Rd

@@ -6,17 +6,13 @@
\alias{is.mo}
\alias{guess_mo}
\title{Transform to microorganism ID}
\source{
[1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \url{https://dx.doi.org/10.1128/CMR.00109-13}
[2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \url{https://dx.doi.org/10.1084/jem.57.4.571}
}
\usage{
as.mo(x, Becker = FALSE, Lancefield = FALSE)
as.mo(x, Becker = FALSE, Lancefield = FALSE, allow_uncertain = FALSE)
is.mo(x)
guess_mo(x, Becker = FALSE, Lancefield = FALSE)
guess_mo(x, Becker = FALSE, Lancefield = FALSE,
allow_uncertain = FALSE)
}
\arguments{
\item{x}{a character vector or a \code{data.frame} with one or two columns}
@@ -25,31 +21,62 @@ guess_mo(x, Becker = FALSE, Lancefield = FALSE)
This excludes \emph{Staphylococcus aureus} at default, use \code{Becker = "all"} to also categorise \emph{S. aureus} as "CoPS".}
\item{Lancefield}{a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, i.e. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L.
\item{Lancefield}{a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, e.g. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L.
This excludes \emph{Enterococci} at default (who are in group D), use \code{Lancefield = "all"} to also categorise all \emph{Enterococci} as group D.}
\item{allow_uncertain}{a logical to indicate whether empty results should be checked for only a part of the input string. When results are found, a warning will be given about the uncertainty and the result.}
}
\value{
Character (vector) with class \code{"mo"}. Unknown values will return \code{NA}.
}
\description{
Use this function to determine a valid ID based on a genus (and species). Determination is done using Artificial Intelligence (AI), so the input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), an abbreviation known in the field (like \code{"MRSA"}), or just a genus. You could also \code{\link{select}} a genus and species column, see Examples.
Use this function to determine a valid microorganism ID (\code{mo}). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms \emph{Bacteria}, \emph{Fungi} and \emph{Protozoa} (see Source), so the input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), an abbreviation known in the field (like \code{"MRSA"}), or just a genus. You could also \code{\link{select}} a genus and species column, see Examples.
}
\details{
\code{guess_mo} is an alias of \code{as.mo}.
A microbial ID (class: \code{mo}) typically looks like these examples:\cr
\preformatted{
Code Full name
--------------- --------------------------------------
B_KLBSL Klebsiella
B_KLBSL_PNE Klebsiella pneumoniae
B_KLBSL_PNE_RHI Klebsiella pneumoniae rhinoscleromatis
| | | |
| | | |