
(v0.7.1.9055) algorithm improvements

pull/67/head
parent
commit
7108454ba5
  1. 8
      DESCRIPTION
  2. 13
      NEWS.md
  3. 6
      R/catalogue_of_life.R
  4. 8
      R/data.R
  5. 108
      R/mo.R
  6. 25
      data-raw/reproduction_of_microorganisms.R
  7. BIN
      data/microorganisms.rda
  8. 4
      docs/LICENSE-text.html
  9. 492
      docs/articles/AMR.html
  10. BIN
      docs/articles/AMR_files/figure-html/plot 1-1.png
  11. BIN
      docs/articles/AMR_files/figure-html/plot 3-1.png
  12. BIN
      docs/articles/AMR_files/figure-html/plot 4-1.png
  13. BIN
      docs/articles/AMR_files/figure-html/plot 5-1.png
  14. 6
      docs/articles/EUCAST.html
  15. 58
      docs/articles/MDR.html
  16. 6
      docs/articles/SPSS.html
  17. 6
      docs/articles/WHONET.html
  18. 74
      docs/articles/benchmarks.html
  19. BIN
      docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png
  20. 4
      docs/articles/index.html
  21. 6
      docs/articles/resistance_predict.html
  22. 10
      docs/authors.html
  23. 34
      docs/index.html
  24. 21
      docs/news/index.html
  25. 4
      docs/reference/AMR-deprecated.html
  26. 4
      docs/reference/AMR.html
  27. 4
      docs/reference/WHOCC.html
  28. 4
      docs/reference/WHONET.html
  29. 4
      docs/reference/ab_property.html
  30. 4
      docs/reference/age.html
  31. 4
      docs/reference/age_groups.html
  32. 4
      docs/reference/antibiotics.html
  33. 4
      docs/reference/as.ab.html
  34. 4
      docs/reference/as.disk.html
  35. 4
      docs/reference/as.mic.html
  36. 16
      docs/reference/as.mo.html
  37. 4
      docs/reference/as.rsi.html
  38. 4
      docs/reference/atc_online.html
  39. 4
      docs/reference/availability.html
  40. 10
      docs/reference/catalogue_of_life.html
  41. 8
      docs/reference/catalogue_of_life_version.html
  42. 4
      docs/reference/count.html
  43. 4
      docs/reference/eucast_rules.html
  44. 4
      docs/reference/extended-functions.html
  45. 4
      docs/reference/filter_ab_class.html
  46. 4
      docs/reference/first_isolate.html
  47. 4
      docs/reference/g.test.html
  48. 4
      docs/reference/ggplot_rsi.html
  49. 4
      docs/reference/guess_ab_col.html
  50. 4
      docs/reference/index.html
  51. 4
      docs/reference/join.html
  52. 4
      docs/reference/key_antibiotics.html
  53. 4
      docs/reference/kurtosis.html
  54. 4
      docs/reference/like.html
  55. 4
      docs/reference/mdro.html
  56. 8
      docs/reference/microorganisms.codes.html
  57. 16
      docs/reference/microorganisms.html
  58. 8
      docs/reference/microorganisms.old.html
  59. 8
      docs/reference/mo_property.html
  60. 4
      docs/reference/mo_source.html
  61. 4
      docs/reference/p.symbol.html
  62. 4
      docs/reference/portion.html
  63. 4
      docs/reference/read.4D.html
  64. 4
      docs/reference/resistance_predict.html
  65. 4
      docs/reference/rsi_translation.html
  66. 4
      docs/reference/septic_patients.html
  67. 4
      docs/reference/skewness.html
  68. 4
      docs/reference/translate.html
  69. 36
      index.md
  70. 12
      man/as.mo.Rd
  71. 6
      man/catalogue_of_life.Rd
  72. 4
      man/catalogue_of_life_version.Rd
  73. 12
      man/microorganisms.Rd
  74. 4
      man/microorganisms.codes.Rd
  75. 4
      man/microorganisms.old.Rd
  76. 4
      man/mo_property.Rd
  77. 3
      tests/testthat/test-mo.R

8
DESCRIPTION

@@ -1,18 +1,18 @@
Package: AMR
Version: 0.7.1.9038
Date: 2019-08-12
Version: 0.7.1.9055
Date: 2019-08-13
Title: Antimicrobial Resistance Analysis
Authors@R: c(
person(role = c("aut", "cre"),
family = "Berends", given = c("Matthijs", "S."), email = "m.s.berends@umcg.nl", comment = c(ORCID = "0000-0001-7620-1800")),
person(role = "aut",
family = "Luz", given = c("Christian", "F."), email = "c.f.luz@umcg.nl", comment = c(ORCID = "0000-0001-5809-5995")),
person(role = c("aut", "ths"),
family = "Glasner", given = "Corinna", email = "c.glasner@umcg.nl", comment = c(ORCID = "0000-0003-1241-1328")),
person(role = c("aut", "ths"),
family = "Friedrich", given = c("Alex", "W."), email = "alex.friedrich@umcg.nl", comment = c(ORCID = "0000-0003-4881-038X")),
person(role = c("aut", "ths"),
family = "Sinha", given = c("Bhanu", "N.", "M."), email = "b.sinha@umcg.nl", comment = c(ORCID = "0000-0003-1634-0010")),
person(role = c("aut", "ths"),
family = "Glasner", given = "Corinna", email = "c.glasner@umcg.nl", comment = c(ORCID = "0000-0003-1241-1328")),
person(role = "ctb",
family = "Hassing", given = c("Erwin", "E.", "A."), email = "e.hassing@certe.nl"),
person(role = "ctb",

13
NEWS.md

@@ -1,4 +1,4 @@
# AMR 0.7.1.9038
# AMR 0.7.1.9055
### Breaking
* Function `freq()` has moved to a new package, [`clean`](https://github.com/msberends/clean) ([CRAN link](https://cran.r-project.org/package=clean)). Creating frequency tables is actually not the scope of this package (never was) and this function has matured a lot over the last two years. Therefore, a new package was created for data cleaning and checking and it perfectly fits the `freq()` function. The [`clean`](https://github.com/msberends/clean) package is available on CRAN and will be installed automatically when updating the `AMR` package, that now imports it. In a later stage, the `skewness()` and `kurtosis()` functions will be moved to the `clean` package too.
@@ -57,12 +57,13 @@
* Removed class `atc` - using `as.atc()` is now deprecated in favour of `ab_atc()` and this will return a character, not the `atc` class anymore
* Removed deprecated functions `abname()`, `ab_official()`, `atc_name()`, `atc_official()`, `atc_property()`, `atc_tradenames()`, `atc_trivial_nl()`
* Fix and speed improvement for `mo_shortname()`
* Algorithm improvements for `as.mo()`:
* Some misspelled input were not understood
* Algorithm improvements for `as.mo()` (by which some additions were made to the `microorganisms` data set:
* Big improvement for misspelled input
* These new trivial names known to the field are now understood: meningococcus, gonococcus, pneumococcus
* Updated to the latest taxonomic data (updated to August 2019, from the International Journal of Systematic and Evolutionary Microbiology
* Added support for Viridans Group Streptococci (VGS) and Milleri Group Streptococci (MGS)
* Added support for 5,000 new fungi
* Added support for unknown yeasts and fungi
* Updated the `microorganisms` data set to contain the latest taxonomic data from the IJSEM journal (now up to date until August 2019)
* Added almost 5,000 new fungi to the `microorganisms` data set
* Fix for using `mo_*` functions where the coercion uncertainties and failures would not be available through `mo_uncertainties()` and `mo_failures()` anymore
* Deprecated the `country` parameter of `mdro()` in favour of the already existing `guideline` parameter to support multiple guidelines within one country
* The `name` of `RIF` is now Rifampicin instead of Rifampin
@@ -144,7 +145,7 @@
#### Changed
* Fixed a critical bug in `first_isolate()` where missing species would lead to incorrect FALSEs. This bug was not present in AMR v0.5.0, but was in v0.6.0 and v0.6.1.
* Fixedd a bug in `eucast_rules()` where antibiotics from WHONET software would not be recognised
* Fixed a bug in `eucast_rules()` where antibiotics from WHONET software would not be recognised
* Completely reworked the `antibiotics` data set:
* All entries now have 3 different identifiers:
* Column `ab` contains a human readable EARS-Net code, used by ECDC and WHO/WHONET - this is the primary identifier used in this package

6
R/catalogue_of_life.R

@@ -24,9 +24,9 @@
#' This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life.
#' @section Catalogue of Life:
#' \if{html}{\figure{logo_col.png}{options: height=40px style=margin-bottom:5px} \cr}
#' This package contains the complete taxonomic tree of almost all microorganisms (~65,000 species) from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). The Catalogue of Life is the most comprehensive and authoritative global index of species currently available.
#' This package contains the complete taxonomic tree of almost all microorganisms (~70,000 species) from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). The Catalogue of Life is the most comprehensive and authoritative global index of species currently available.
#'
#' \link[=catalogue_of_life]{Click here} for more information about the included taxa. The Catalogue of Life releases updates annually; check which version was included in this package with \code{\link{catalogue_of_life_version}()}.
#' \link[=catalogue_of_life]{Click here} for more information about the included taxa. Check which version of the Catalogue of Life was included in this package with \code{\link{catalogue_of_life_version}()}.
#' @section Included taxa:
#' Included are:
#' \itemize{
@@ -38,7 +38,7 @@
#' \item{The responsible author(s) and year of scientific publication}
#' }
#'
#' The Catalogue of Life (\url{http://www.catalogueoflife.org}) is the most comprehensive and authoritative global index of species currently available. It holds essential information on the names, relationships and distributions of over 1.6 million species. The Catalogue of Life is used to support the major biodiversity and conservation information services such as the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL) and the International Union for Conservation of Nature Red List. It is recognised by the Convention on Biological Diversity as a significant component of the Global Taxonomy Initiative and a contribution to Target 1 of the Global Strategy for Plant Conservation.
#' The Catalogue of Life (\url{http://www.catalogueoflife.org}) is the most comprehensive and authoritative global index of species currently available. It holds essential information on the names, relationships and distributions of over 1.9 million species. The Catalogue of Life is used to support the major biodiversity and conservation information services such as the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL) and the International Union for Conservation of Nature Red List. It is recognised by the Convention on Biological Diversity as a significant component of the Global Taxonomy Initiative and a contribution to Target 1 of the Global Strategy for Plant Conservation.
#'
#' The syntax used to transform the original data to a cleansed R format, can be found here: \url{https://gitlab.com/msberends/AMR/blob/master/data-raw/reproduction_of_microorganisms.R}.
#' @inheritSection AMR Read more on our website!

8
R/data.R

@@ -55,7 +55,7 @@
#'
#' A data set containing the microbial taxonomy of six kingdoms from the Catalogue of Life. MO codes can be looked up using \code{\link{as.mo}}.
#' @inheritSection catalogue_of_life Catalogue of Life
#' @format A \code{\link{data.frame}} with 69,854 observations and 16 variables:
#' @format A \code{\link{data.frame}} with 69,855 observations and 16 variables:
#' \describe{
#' \item{\code{mo}}{ID of microorganism as used by this package}
#' \item{\code{col_id}}{Catalogue of Life ID}
@@ -64,14 +64,14 @@
#' \item{\code{rank}}{Text of the taxonomic rank of the microorganism, like \code{"species"} or \code{"genus"}}
#' \item{\code{ref}}{Author(s) and year of concerning scientific publication}
#' \item{\code{species_id}}{ID of the species as used by the Catalogue of Life}
#' \item{\code{source}}{Either \code{"CoL"}, \code{"DSMZ"} (see source) or "manually added"}
#' \item{\code{source}}{Either "CoL", "DSMZ" (see Source) or "manually added"}
#' \item{\code{prevalence}}{Prevalence of the microorganism, see \code{?as.mo}}
#' }
#' @details Manually added were:
#' \itemize{
#' \item{9 entries of \emph{Streptococcus} (beta haemolytic groups A, B, C, D, F, G, H, K and unspecified)}
#' \item{11 entries of \emph{Streptococcus} (beta-haemolytic: groups A, B, C, D, F, G, H, K and unspecified; other: viridans, milleri)}
#' \item{2 entries of \emph{Staphylococcus} (coagulase-negative [CoNS] and coagulase-positive [CoPS])}
#' \item{3 entries of Trichomonas (Trichomonas vaginalis, and its family and genus)}
#' \item{3 entries of \emph{Trichomonas} (\emph{Trichomonas vaginalis}, and its family and genus)}
#' \item{5 other 'undefined' entries (unknown, unknown Gram negatives, unknown Gram positives, unknown yeast and unknown fungus)}
#' \item{8,970 species from the DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen) that are not in the Catalogue of Life}
#' }

108
R/mo.R

@@ -29,7 +29,7 @@
#' @param Lancefield a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [3]. These \emph{Streptococci} will be categorised in their first group, e.g. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L.
#'
#' This excludes \emph{Enterococci} at default (who are in group D), use \code{Lancefield = "all"} to also categorise all \emph{Enterococci} as group D.
#' @param allow_uncertain a logical (\code{TRUE} or \code{FALSE}) or a value between 0 and 3 to indicate whether the input should be checked for less possible results, see Details
#' @param allow_uncertain a number between 0 (or "none") and 3 (or "all"), or TRUE (= 2) or FALSE (= 0) to indicate whether the input should be checked for less possible results, see Details
#' @param reference_df a \code{data.frame} to use for extra reference when translating \code{x} to a valid \code{mo}. See \code{\link{set_mo_source}} and \code{\link{get_mo_source}} to automate the usage of your own codes (e.g. used in your analysis or organisation).
#' @param ... other parameters passed on to functions
#' @rdname as.mo
@@ -50,8 +50,7 @@
#' | | ----> species, a 3-4 letter acronym
#' | ----> genus, a 5-7 letter acronym, mostly without vowels
#' ----> taxonomic kingdom: A (Archaea), AN (Animalia), B (Bacteria),
#' C (Chromista), F (Fungi), P (Protozoa) or
#' PL (Plantae)
#' C (Chromista), F (Fungi), P (Protozoa)
#' }
#'
#' Values that cannot be coered will be considered 'unknown' and will get the MO code \code{UNKNOWN}.
@@ -60,13 +59,14 @@
#'
#' The algorithm uses data from the Catalogue of Life (see below) and from one other source (see \code{?microorganisms}).
#'
# /// THIS PART WAS DELETED FROM THE MAN PAGE
# \strong{Self-learning algoritm} \cr
# The \code{as.mo()} function gains experience from previously determined microbial IDs and learns from it. This drastically improves both speed and reliability. Use \code{clean_mo_history()} to reset the algorithms. Only experience from your current \code{AMR} package version is used. This is done because in the future the taxonomic tree (which is included in this package) may change for any organism and it consequently has to rebuild its knowledge.
#
# Usually, any guess after the first try runs 80-95\% faster than the first try.
#
# For now, learning only works per session. If R is closed or terminated, the algorithms reset. This will probably be resolved in a next version.
#
# ////
#' \strong{Intelligent rules} \cr
#' This function uses intelligent rules to help getting fast and logical results. It tries to find matches in this order:
#' \itemize{
@@ -169,7 +169,8 @@
#'
#' # All mo_* functions use as.mo() internally too (see ?mo_property):
#' mo_genus("E. coli") # returns "Escherichia"
#' mo_gramstain("E. coli") # returns "Gram negative"#'
#' mo_gramstain("E. coli") # returns "Gram negative"
#'
#' }
#' \dontrun{
#' df$mo <- as.mo(df$microorganism_name)
@@ -478,6 +479,7 @@ exec_as.mo <- function(x,
x_species <- paste(x, "species")
# translate to English for supported languages of mo_property
x <- gsub("(gruppe|groep|grupo|gruppo|groupe)", "group", x, ignore.case = TRUE)
x <- gsub("(vergroen)[a-z]*", "viridans", x, ignore.case = TRUE)
x <- gsub("(hefe|gist|gisten|levadura|lievito|fermento|levure)[a-z]*", "yeast", x, ignore.case = TRUE)
x <- gsub("(schimmels?|mofo|molde|stampo|moisissure|fungi)[a-z]*", "fungus", x, ignore.case = TRUE)
x <- gsub("Fungus[ph|f]rya", "Fungiphrya", x, ignore.case = TRUE)
@@ -491,13 +493,13 @@ exec_as.mo <- function(x,
x <- gsub("(alpha|beta|gamma).?ha?emoly", "\\1-haemoly", x)
# remove genus as first word
x <- gsub("^Genus ", "", x)
# allow characters that resemble others ----
# allow characters that resemble others = dyslexia_mode ----
if (dyslexia_mode == TRUE) {
x <- tolower(x)
x <- gsub("[iy]+", "[iy]+", x)
x <- gsub("(c|k|q|qu|s|z|x|ks)+", "(c|k|q|qu|s|z|x|ks)+", x)
x <- gsub("(ph|f|v)+", "(ph|f|v)+", x)
x <- gsub("(th|t)+", "(th|t)+", x)
x <- gsub("(ph|hp|f|v)+", "(ph|hp|f|v)+", x)
x <- gsub("(th|ht|t)+", "(th|ht|t)+", x)
x <- gsub("a+", "a+", x)
x <- gsub("u+", "u+", x)
# allow any ending of -um, -us, -ium, -icum, -ius, -icus, -ica and -a (needs perl for the negative backward lookup):
@@ -512,6 +514,10 @@ exec_as.mo <- function(x,
x <- gsub("(.)\\1+", "\\1+", x)
# allow ending in -en or -us
x <- gsub("e\\+n(?![a-z[])", "(e+n|u+(c|k|q|qu|s|z|x|ks)+)", x, ignore.case = TRUE, perl = TRUE)
# if the input is longer than 10 characters, add a [.] between all characters, as some might have forgotten a character
# this will allow "Pasteurella damatis" to be correctly read as "Pasteurella dagmatis".
x[nchar(x_backup_without_spp) > 10] <- gsub("([a-z])([a-z])", "\\1.*\\2", x[nchar(x_backup_without_spp) > 10], ignore.case = TRUE)
x[nchar(x_backup_without_spp) > 10] <- gsub("[+]", "+.*", x[nchar(x_backup_without_spp) > 10])
}
x <- strip_whitespace(x)
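
The character substitutions in the `dyslexia_mode` branch above can be sketched in isolation. The helper name `make_lenient_pattern` is hypothetical and not part of the AMR package; it applies only three of the substitutions from this hunk, in base R, to show how an input string is turned into a forgiving regular expression.

```r
# Hypothetical helper (an assumption for illustration, not AMR code):
# rewrite look-alike characters into regex alternatives, as the hunk above does.
make_lenient_pattern <- function(x) {
  x <- tolower(x)
  # 'i' and 'y' are treated as interchangeable
  x <- gsub("[iy]+", "[iy]+", x)
  # 'ph', 'hp', 'f' and 'v' are treated as interchangeable
  x <- gsub("(ph|hp|f|v)+", "(ph|hp|f|v)+", x)
  # 'th', 'ht' and 't' are treated as interchangeable
  x <- gsub("(th|ht|t)+", "(th|ht|t)+", x)
  x
}

pattern <- make_lenient_pattern("typhi")
# Anchor the pattern so that the whole input has to match
matches <- grepl(paste0("^", pattern, "$"), c("typhi", "tifi", "coli"))
```

The order of the substitutions matters in such a sketch: each replacement text must not contain characters that a later substitution would rewrite again.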
@@ -764,6 +770,27 @@ exec_as.mo <- function(x,
}
next
}
# streptococcal groups: milleri and viridans
if (x_trimmed[i] %like% 'strepto.* milleri'
| x_backup_without_spp[i] %like% 'strepto.* milleri'
| x_backup_without_spp[i] %like% 'mgs[^a-z]?$') {
# Milleri Group Streptococcus (MGS)
x[i] <- microorganismsDT[mo == 'B_STRPT_MIL', ..property][[1]][1L]
if (initial_search == TRUE) {
set_mo_history(x_backup[i], get_mo_code(x[i], property), 0, force = force_mo_history)
}
next
}
if (x_trimmed[i] %like% 'strepto.* viridans'
| x_backup_without_spp[i] %like% 'strepto.* viridans'
| x_backup_without_spp[i] %like% 'vgs[^a-z]?$') {
# Viridans Group Streptococcus (VGS)
x[i] <- microorganismsDT[mo == 'B_STRPT_VIR', ..property][[1]][1L]
if (initial_search == TRUE) {
set_mo_history(x_backup[i], get_mo_code(x[i], property), 0, force = force_mo_history)
}
next
}
if (x_backup_without_spp[i] %like% 'gram[ -]?neg.*'
| x_backup_without_spp[i] %like% 'negatie?[vf]'
| x_trimmed[i] %like% 'gram[ -]?neg.*') {
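
The matching rules for the two new streptococcal groups can be tried outside the package. The sketch below assumes that `%like%` behaves as a case-insensitive regular expression match and substitutes base R `grepl()` for it; the function name `classify_strep_group` is hypothetical, while the patterns and the codes `B_STRPT_MIL` and `B_STRPT_VIR` are taken from the hunk above.

```r
# Hypothetical helper (not AMR code): return the group code for inputs that
# match the Milleri or Viridans patterns, and NA for everything else.
classify_strep_group <- function(x) {
  like <- function(pattern) grepl(pattern, x, ignore.case = TRUE)
  is_milleri  <- like("strepto.* milleri")  | like("mgs[^a-z]?$")
  is_viridans <- like("strepto.* viridans") | like("vgs[^a-z]?$")
  out <- rep(NA_character_, length(x))
  # Milleri is assigned last so it takes precedence, mirroring the order of
  # the checks in the hunk (Milleri is tested before Viridans)
  out[is_viridans] <- "B_STRPT_VIR"
  out[is_milleri]  <- "B_STRPT_MIL"
  out
}

groups <- classify_strep_group(c("Streptococcus milleri", "VGS", "E. coli"))
```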
@@ -1048,6 +1075,7 @@ exec_as.mo <- function(x,
return(NA_character_)
}
# UNCERTAINTY LEVEL 1 ----
if (uncertainty_level >= 1) {
now_checks_for_uncertainty_level <- 1
@@ -1114,6 +1142,7 @@ exec_as.mo <- function(x,
}
}
# UNCERTAINTY LEVEL 2 ----
if (uncertainty_level >= 2) {
now_checks_for_uncertainty_level <- 2
@@ -1172,9 +1201,37 @@ exec_as.mo <- function(x,
return(found[1L])
}
# (5a) try to strip off half an element from end and check the remains ----
# (5) inverse input ----
if (isTRUE(debug)) {
cat("\n[UNCERTAINLY LEVEL 2] (5) inverse input\n")
}
a.x_backup_inversed <- paste(rev(unlist(strsplit(a.x_backup, split = " "))), collapse = " ")
if (isTRUE(debug)) {
message("Running '", a.x_backup_inversed, "'")
}
# first try without dyslexia mode
found <- suppressMessages(suppressWarnings(exec_as.mo(a.x_backup_inversed, initial_search = FALSE, dyslexia_mode = FALSE, allow_uncertain = FALSE, debug = debug)))
if (empty_result(found)) {
# then with dyslexia mode
found <- suppressMessages(suppressWarnings(exec_as.mo(a.x_backup_inversed, initial_search = FALSE, dyslexia_mode = TRUE, allow_uncertain = FALSE, debug = debug)))
}
if (!empty_result(found) & nchar(g.x_backup_without_spp) >= 6) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
uncertainties <<- rbind(uncertainties,
data.frame(uncertainty = now_checks_for_uncertainty_level,
input = a.x_backup,
fullname = microorganismsDT[mo == found_result[1L], fullname][[1]],
mo = found_result[1L]))
if (initial_search == TRUE) {
set_mo_history(a.x_backup, get_mo_code(found[1L], property), 2, force = force_mo_history)
}
return(found[1L])
}
# (6) try to strip off half an element from end and check the remains ----
if (isTRUE(debug)) {
cat("\n[UNCERTAINLY LEVEL 2] (5a) try to strip off half an element from end and check the remains\n")
cat("\n[UNCERTAINLY LEVEL 2] (6) try to strip off half an element from end and check the remains\n")
}
x_strip <- a.x_backup %>% strsplit(" ") %>% unlist()
if (length(x_strip) > 1) {
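
The new step (5) reverses the word order of the input before retrying the lookup. The expression from the hunk can be checked on its own; the wrapper name `reverse_words` is hypothetical and only used here for illustration.

```r
# Hypothetical wrapper around the expression used in step (5):
# split on single spaces, reverse the pieces, and glue them back together.
reverse_words <- function(x) {
  paste(rev(unlist(strsplit(x, split = " "))), collapse = " ")
}

inversed <- reverse_words("coli Escherichia")
```

In the commit, the reversed string is then passed back into `exec_as.mo()`, first without and then with `dyslexia_mode`.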
@@ -1209,9 +1266,9 @@ exec_as.mo <- function(x,
}
}
}
# (5b) try to strip off one element from end and check the remains ----
# (7) try to strip off one element from end and check the remains ----
if (isTRUE(debug)) {
cat("\n[UNCERTAINLY LEVEL 2] (5b) try to strip off one element from end and check the remains\n")
cat("\n[UNCERTAINLY LEVEL 2] (7) try to strip off one element from end and check the remains\n")
}
if (length(x_strip) > 1) {
for (i in 1:(length(x_strip) - 1)) {
@@ -1242,9 +1299,9 @@ exec_as.mo <- function(x,
}
}
}
# (5c) check for unknown yeasts/fungi ----
# (8) check for unknown yeasts/fungi ----
if (isTRUE(debug)) {
cat("\n[UNCERTAINLY LEVEL 2] (5b) check for unknown yeasts/fungi\n")
cat("\n[UNCERTAINLY LEVEL 2] (8) check for unknown yeasts/fungi\n")
}
if (b.x_trimmed %like% "yeast") {
found <- "F_YEAST"
@@ -1274,9 +1331,9 @@ exec_as.mo <- function(x,
}
return(found[1L])
}
# (6) try to strip off one element from start and check the remains (only allow >= 2-part name outcome) ----
# (9) try to strip off one element from start and check the remains (only allow >= 2-part name outcome) ----
if (isTRUE(debug)) {
cat("\n[UNCERTAINLY LEVEL 2] (6) try to strip off one element from start and check the remains (only allow >= 2-part name outcome)\n")
cat("\n[UNCERTAINLY LEVEL 2] (9) try to strip off one element from start and check the remains (only allow >= 2-part name outcome)\n")
}
x_strip <- a.x_backup %>% strsplit(" ") %>% unlist()
if (length(x_strip) > 1 & nchar(g.x_backup_without_spp) >= 6) {
@@ -1311,12 +1368,13 @@ exec_as.mo <- function(x,
}
}
# UNCERTAINTY LEVEL 3 ----
if (uncertainty_level >= 3) {
now_checks_for_uncertainty_level <- 3
# (7a) try to strip off one element from start and check the remains (any text size) ----
# (10) try to strip off one element from start and check the remains (any text size) ----
if (isTRUE(debug)) {
cat("\n[UNCERTAINLY LEVEL 3] (7a) try to strip off one element from start and check the remains (any text size)\n")
cat("\n[UNCERTAINLY LEVEL 3] (10) try to strip off one element from start and check the remains (any text size)\n")
}
x_strip <- a.x_backup %>% strsplit(" ") %>% unlist()
if (length(x_strip) > 1 & nchar(g.x_backup_without_spp) >= 6) {
@@ -1346,10 +1404,10 @@ exec_as.mo <- function(x,
}
}
}
# (7b) try to strip off one element from end and check the remains (any text size) ----
# (this is in fact 5b but without nchar limit of >=6)
# (11) try to strip off one element from end and check the remains (any text size) ----
# (this is in fact 7 but without nchar limit of >=6)
if (isTRUE(debug)) {
cat("\n[UNCERTAINLY LEVEL 3] (7b) try to strip off one element from end and check the remains (any text size)\n")
cat("\n[UNCERTAINLY LEVEL 3] (11) try to strip off one element from end and check the remains (any text size)\n")
}
if (length(x_strip) > 1) {
for (i in 1:(length(x_strip) - 1)) {
@@ -1379,9 +1437,9 @@ exec_as.mo <- function(x,
}
}
# (8) part of a name (very unlikely match) ----
# (12) part of a name (very unlikely match) ----
if (isTRUE(debug)) {
cat("\n[UNCERTAINLY LEVEL 3] (8) part of a name (very unlikely match)\n")
cat("\n[UNCERTAINLY LEVEL 3] (12) part of a name (very unlikely match)\n")
}
if (isTRUE(debug)) {
message("Running '", f.x_withspaces_end_only, "'")
@@ -1775,9 +1833,11 @@ translate_allow_uncertain <- function(allow_uncertain) {
# default to uncertainty level 2
allow_uncertain <- 2
} else {
allow_uncertain[tolower(allow_uncertain) == "none"] <- 0
allow_uncertain[tolower(allow_uncertain) == "all"] <- 3
allow_uncertain <- as.integer(allow_uncertain)
if (!allow_uncertain %in% c(0:3)) {
stop("`allow_uncertain` must be a number between 0 (none) and 3 (all), or TRUE (= 2) or FALSE (= 0).", call. = FALSE)
stop('`allow_uncertain` must be a number between 0 (or "none") and 3 (or "all"), or TRUE (= 2) or FALSE (= 0).', call. = FALSE)
}
}
allow_uncertain
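
The accepted values of `allow_uncertain` after this change can be summarised in a standalone sketch. Only the tail of `translate_allow_uncertain()` is visible in this diff, so the function below is an assumption about the overall behaviour rather than the package's implementation; the name `normalise_uncertainty` is hypothetical.

```r
# Hypothetical standalone version (not the AMR implementation):
# map TRUE/FALSE, "none"/"all" and 0..3 onto an integer uncertainty level.
normalise_uncertainty <- function(allow_uncertain) {
  if (identical(allow_uncertain, TRUE)) return(2L)   # TRUE  (= 2)
  if (identical(allow_uncertain, FALSE)) return(0L)  # FALSE (= 0)
  value <- tolower(as.character(allow_uncertain))
  value[value == "none"] <- "0"
  value[value == "all"] <- "3"
  level <- suppressWarnings(as.integer(value))
  if (length(level) != 1 || is.na(level) || !level %in% 0:3) {
    stop('`allow_uncertain` must be a number between 0 (or "none") and 3 (or "all"), or TRUE (= 2) or FALSE (= 0).',
         call. = FALSE)
  }
  level
}

levels_found <- c(normalise_uncertainty("none"),
                  normalise_uncertainty(1),
                  normalise_uncertainty(TRUE),
                  normalise_uncertainty("all"))
```

Converting to character before the comparison avoids mixing numeric and character values in one vector, at the cost of an extra conversion back to integer.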

25
data-raw/reproduction_of_microorganisms.R

@@ -302,6 +302,9 @@ MOs <- MOs %>%
# put `mo` in front, followed by the rest
select(mo, everything(), -abbr_other, -abbr_genus, -abbr_species, -abbr_subspecies)
# remove empty fullnames
MOs <- MOs %>% filter(fullname != "")
# add non-taxonomic entries
MOs <- MOs %>%
bind_rows(
@@ -483,6 +486,26 @@ MOs <- MOs %>%
ref = NA_character_,
species_id = "",
source = "manually added"),
# Viridans Streptococci
MOs %>%
filter(genus == "Streptococcus", species == "agalactiae") %>% .[1,] %>%
mutate(mo = gsub("AGA", "VIR", mo),
col_id = NA_integer_,
species = "viridans" ,
fullname = "Viridans Group Streptococcus (VGS)",
ref = NA_character_,
species_id = "",
source = "manually added"),
# Milleri Streptococci
MOs %>%
filter(genus == "Streptococcus", species == "agalactiae") %>% .[1,] %>%
mutate(mo = gsub("AGA", "MIL", mo),
col_id = NA_integer_,
species = "milleri" ,
fullname = "Milleri Group Streptococcus (MGS)",
ref = NA_character_,
species_id = "",
source = "manually added"),
# Trichomonas vaginalis is missing, same order as Dientamoeba
MOs %>%
filter(fullname == "Dientamoeba") %>%
@@ -575,7 +598,7 @@ MOs <- MOs %>%
))
# arrange
MOs <- MOs %>% arrange(fullname)
MOs <- MOs %>% arrange(genus, species, subspecies)
MOs.old <- MOs.old %>% arrange(fullname)
# transform

BIN
data/microorganisms.rda

Binary file not shown.

4
docs/LICENSE-text.html

@@ -78,7 +78,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">0.7.1.9038</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">0.7.1.9055</span>
</span>
</div>
@@ -513,7 +513,7 @@ END OF TERMS AND CONDITIONS
<footer>
<div class="copyright">
<p>Developed by <a href='https://www.rug.nl/staff/m.s.berends/'>Matthijs S. Berends</a>, <a href='https://www.rug.nl/staff/c.f.luz/'>Christian F. Luz</a>, <a href='https://www.rug.nl/staff/c.glasner/'>Corinna Glasner</a>, <a href='https://www.rug.nl/staff/a.w.friedrich/'>Alex W. Friedrich</a>, <a href='https://www.rug.nl/staff/b.sinha/'>Bhanu N. M. Sinha</a>.</p>
<p>Developed by <a href='https://www.rug.nl/staff/m.s.berends/'>Matthijs S. Berends</a>, <a href='https://www.rug.nl/staff/c.f.luz/'>Christian F. Luz</a>, <a href='https://www.rug.nl/staff/a.w.friedrich/'>Alex W. Friedrich</a>, <a href='https://www.rug.nl/staff/b.sinha/'>Bhanu N. M. Sinha</a>, <a href='https://www.rug.nl/staff/c.glasner/'>Corinna Glasner</a>.</p>
</div>
<div class="pkgdown">

492
docs/articles/AMR.html

@@ -40,7 +40,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">0.7.1.9029</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">0.7.1.9055</span>
</span>
</div>
@@ -185,7 +185,7 @@
<h1>How to conduct AMR analysis</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">08 August 2019</h4>
<h4 class="date">13 August 2019</h4>
<div class="hidden name"><code>AMR.Rmd</code></div>
@@ -194,7 +194,7 @@
<p><strong>Note:</strong> values on this page will change with every website update since they are based on randomly created values and the page was written in <a href="https://rmarkdown.rstudio.com/">R Markdown</a>. However, the methodology remains unchanged. This page was generated on 08 August 2019.</p>
<p><strong>Note:</strong> values on this page will change with every website update since they are based on randomly created values and the page was written in <a href="https://rmarkdown.rstudio.com/">R Markdown</a>. However, the methodology remains unchanged. This page was generated on 13 August 2019.</p>
<div id="introduction" class="section level1">
<h1 class="hasAnchor">
<a href="#introduction" class="anchor"></a>Introduction</h1>
@@ -210,21 +210,21 @@
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2019-08-08</td>
<td align="center">2019-08-13</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">S</td>
</tr>
<tr class="even">
<td align="center">2019-08-08</td>
<td align="center">2019-08-13</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">R</td>
</tr>
<tr class="odd">
<td align="center">2019-08-08</td>
<td align="center">2019-08-13</td>
<td align="center">efgh</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
@@ -320,9 +320,9 @@
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2017-01-03</td>
<td align="center">T3</td>
<td align="center">Hospital D</td>
<td align="center">2012-12-02</td>
<td align="center">V1</td>
<td align="center">Hospital B</td>
<td align="center">Staphylococcus aureus</td>
<td align="center">S</td>
<td align="center">I</td>
@@ -331,59 +331,59 @@
<td align="center">F</td>
</tr>
<tr class="even">
<td align="center">2012-11-05</td>
<td align="center">F9</td>
<td align="center">2014-02-14</td>
<td align="center">E4</td>
<td align="center">Hospital D</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">Staphylococcus aureus</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">M</td>
</tr>
<tr class="odd">
<td align="center">2011-12-06</td>
<td align="center">I5</td>
<td align="center">Hospital C</td>
<td align="center">Klebsiella pneumoniae</td>
<td align="center">R</td>
<td align="center">2011-11-09</td>
<td align="center">E3</td>
<td align="center">Hospital A</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="even">
<td align="center">2011-03-10</td>
<td align="center">B9</td>
<td align="center">Hospital D</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">2011-11-19</td>
<td align="center">S8</td>
<td align="center">Hospital B</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
</tr>
<tr class="odd">
<td align="center">2010-06-27</td>
<td align="center">A2</td>
<td align="center">2016-07-28</td>
<td align="center">X9</td>
<td align="center">Hospital D</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">F</td>
</tr>
<tr class="even">
<td align="center">2017-12-11</td>
<td align="center">F8</td>
<td align="center">2010-03-04</td>
<td align="center">T4</td>
<td align="center">Hospital A</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">M</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
</tr>
</tbody>
</table>
@@ -406,8 +406,8 @@
#
# Item Count Percent Cum. Count Cum. Percent
# --- ----- ------- -------- ----------- -------------
# 1 M 10,468 52.3% 10,468 52.3%
# 2 F 9,532 47.7% 20,000 100.0%</code></pre>
# 1 M 10,486 52.4% 10,486 52.4%
# 2 F 9,514 47.6% 20,000 100.0%</code></pre>
<p>So, we can draw at least two conclusions immediately. From a data scientist’s perspective, the data looks clean: only values <code>M</code> and <code>F</code>. From a researcher’s perspective: there are slightly more men. Nothing we didn’t already know.</p>
<p>The data is already quite clean, but we still need to transform some variables. The <code>bacteria</code> column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The <code><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate()</a></code> function of the <code>dplyr</code> package makes this really easy:</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb13-1" data-line-number="1">data &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span></a>
@@ -423,53 +423,53 @@
<a class="sourceLine" id="cb15-4" data-line-number="4"><span class="co"># http://eucast.org/</span></a>
<a class="sourceLine" id="cb15-5" data-line-number="5"><span class="co"># </span></a>
<a class="sourceLine" id="cb15-6" data-line-number="6"><span class="co"># EUCAST Clinical Breakpoints (v9.0, 2019)</span></a>
<a class="sourceLine" id="cb15-7" data-line-number="7"><span class="co"># Aerococcus sanguinicola (no values changed)</span></a>
<a class="sourceLine" id="cb15-8" data-line-number="8"><span class="co"># Aerococcus urinae (no values changed)</span></a>
<a class="sourceLine" id="cb15-9" data-line-number="9"><span class="co"># Anaerobic Gram-negatives (no values changed)</span></a>
<a class="sourceLine" id="cb15-10" data-line-number="10"><span class="co"># Anaerobic Gram-positives (no values changed)</span></a>
<a class="sourceLine" id="cb15-11" data-line-number="11"><span class="co"># Campylobacter coli (no values changed)</span></a>
<a class="sourceLine" id="cb15-12" data-line-number="12"><span class="co"># Campylobacter jejuni (no values changed)</span></a>
<a class="sourceLine" id="cb15-13" data-line-number="13"><span class="co"># Enterobacteriales (Order) (no values changed)</span></a>
<a class="sourceLine" id="cb15-14" data-line-number="14"><span class="co"># Enterococcus (no values changed)</span></a>
<a class="sourceLine" id="cb15-15" data-line-number="15"><span class="co"># Haemophilus influenzae (no values changed)</span></a>
<a class="sourceLine" id="cb15-16" data-line-number="16"><span class="co"># Kingella kingae (no values changed)</span></a>
<a class="sourceLine" id="cb15-17" data-line-number="17"><span class="co"># Moraxella catarrhalis (no values changed)</span></a>
<a class="sourceLine" id="cb15-18" data-line-number="18"><span class="co"># Pasteurella multocida (no values changed)</span></a>
<a class="sourceLine" id="cb15-19" data-line-number="19"><span class="co"># Staphylococcus (no values changed)</span></a>
<a class="sourceLine" id="cb15-20" data-line-number="20"><span class="co"># Streptococcus groups A, B, C, G (no values changed)</span></a>
<a class="sourceLine" id="cb15-21" data-line-number="21"><span class="co"># Streptococcus pneumoniae (1,480 values changed)</span></a>
<a class="sourceLine" id="cb15-22" data-line-number="22"><span class="co"># Viridans group streptococci (no values changed)</span></a>
<a class="sourceLine" id="cb15-7" data-line-number="7"><span class="co"># Aerococcus sanguinicola (no changes)</span></a>
<a class="sourceLine" id="cb15-8" data-line-number="8"><span class="co"># Aerococcus urinae (no changes)</span></a>
<a class="sourceLine" id="cb15-9" data-line-number="9"><span class="co"># Anaerobic Gram-negatives (no changes)</span></a>
<a class="sourceLine" id="cb15-10" data-line-number="10"><span class="co"># Anaerobic Gram-positives (no changes)</span></a>
<a class="sourceLine" id="cb15-11" data-line-number="11"><span class="co"># Campylobacter coli (no changes)</span></a>
<a class="sourceLine" id="cb15-12" data-line-number="12"><span class="co"># Campylobacter jejuni (no changes)</span></a>
<a class="sourceLine" id="cb15-13" data-line-number="13"><span class="co"># Enterobacteriales (Order) (no changes)</span></a>
<a class="sourceLine" id="cb15-14" data-line-number="14"><span class="co"># Enterococcus (no changes)</span></a>
<a class="sourceLine" id="cb15-15" data-line-number="15"><span class="co"># Haemophilus influenzae (no changes)</span></a>
<a class="sourceLine" id="cb15-16" data-line-number="16"><span class="co"># Kingella kingae (no changes)</span></a>
<a class="sourceLine" id="cb15-17" data-line-number="17"><span class="co"># Moraxella catarrhalis (no changes)</span></a>
<a class="sourceLine" id="cb15-18" data-line-number="18"><span class="co"># Pasteurella multocida (no changes)</span></a>
<a class="sourceLine" id="cb15-19" data-line-number="19"><span class="co"># Staphylococcus (no changes)</span></a>
<a class="sourceLine" id="cb15-20" data-line-number="20"><span class="co"># Streptococcus groups A, B, C, G (no changes)</span></a>
<a class="sourceLine" id="cb15-21" data-line-number="21"><span class="co"># Streptococcus pneumoniae (1,452 values changed)</span></a>
<a class="sourceLine" id="cb15-22" data-line-number="22"><span class="co"># Viridans group streptococci (no changes)</span></a>
<a class="sourceLine" id="cb15-23" data-line-number="23"><span class="co"># </span></a>
<a class="sourceLine" id="cb15-24" data-line-number="24"><span class="co"># EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)</span></a>
<a class="sourceLine" id="cb15-25" data-line-number="25"><span class="co"># Table 01: Intrinsic resistance in Enterobacteriaceae (1,293 values changed)</span></a>
<a class="sourceLine" id="cb15-26" data-line-number="26"><span class="co"># Table 02: Intrinsic resistance in non-fermentative Gram-negative bacteria (no values changed)</span></a>
<a class="sourceLine" id="cb15-27" data-line-number="27"><span class="co"># Table 03: Intrinsic resistance in other Gram-negative bacteria (no values changed)</span></a>
<a class="sourceLine" id="cb15-28" data-line-number="28"><span class="co"># Table 04: Intrinsic resistance in Gram-positive bacteria (2,772 values changed)</span></a>
<a class="sourceLine" id="cb15-29" data-line-number="29"><span class="co"># Table 08: Interpretive rules for B-lactam agents and Gram-positive cocci (no values changed)</span></a>
<a class="sourceLine" id="cb15-30" data-line-number="30"><span class="co"># Table 09: Interpretive rules for B-lactam agents and Gram-negative rods (no values changed)</span></a>
<a class="sourceLine" id="cb15-31" data-line-number="31"><span class="co"># Table 11: Interpretive rules for macrolides, lincosamides, and streptogramins (no values changed)</span></a>
<a class="sourceLine" id="cb15-32" data-line-number="32"><span class="co"># Table 12: Interpretive rules for aminoglycosides (no values changed)</span></a>
<a class="sourceLine" id="cb15-33" data-line-number="33"><span class="co"># Table 13: Interpretive rules for quinolones (no values changed)</span></a>
<a class="sourceLine" id="cb15-25" data-line-number="25"><span class="co"># Table 01: Intrinsic resistance in Enterobacteriaceae (1,313 values changed)</span></a>
<a class="sourceLine" id="cb15-26" data-line-number="26"><span class="co"># Table 02: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb15-27" data-line-number="27"><span class="co"># Table 03: Intrinsic resistance in other Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb15-28" data-line-number="28"><span class="co"># Table 04: Intrinsic resistance in Gram-positive bacteria (2,715 values changed)</span></a>
<a class="sourceLine" id="cb15-29" data-line-number="29"><span class="co"># Table 08: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)</span></a>
<a class="sourceLine" id="cb15-30" data-line-number="30"><span class="co"># Table 09: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)</span></a>
<a class="sourceLine" id="cb15-31" data-line-number="31"><span class="co"># Table 11: Interpretive rules for macrolides, lincosamides, and streptogramins (no changes)</span></a>
<a class="sourceLine" id="cb15-32" data-line-number="32"><span class="co"># Table 12: Interpretive rules for aminoglycosides (no changes)</span></a>
<a class="sourceLine" id="cb15-33" data-line-number="33"><span class="co"># Table 13: Interpretive rules for quinolones (no changes)</span></a>
<a class="sourceLine" id="cb15-34" data-line-number="34"><span class="co"># </span></a>
<a class="sourceLine" id="cb15-35" data-line-number="35"><span class="co"># Other rules</span></a>
<a class="sourceLine" id="cb15-36" data-line-number="36"><span class="co"># Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (2,251 values changed)</span></a>
<a class="sourceLine" id="cb15-37" data-line-number="37"><span class="co"># Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (117 values changed)</span></a>
<a class="sourceLine" id="cb15-38" data-line-number="38"><span class="co"># Non-EUCAST: piperacillin = R where piperacillin/tazobactam = R (no values changed)</span></a>
<a class="sourceLine" id="cb15-39" data-line-number="39"><span class="co"># Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no values changed)</span></a>
<a class="sourceLine" id="cb15-40" data-line-number="40"><span class="co"># Non-EUCAST: trimethoprim = R where trimethoprim/sulfa = R (no values changed)</span></a>
<a class="sourceLine" id="cb15-41" data-line-number="41"><span class="co"># Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no values changed)</span></a>
<a class="sourceLine" id="cb15-36" data-line-number="36"><span class="co"># Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (2,240 values changed)</span></a>
<a class="sourceLine" id="cb15-37" data-line-number="37"><span class="co"># Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (123 values changed)</span></a>
<a class="sourceLine" id="cb15-38" data-line-number="38"><span class="co"># Non-EUCAST: piperacillin = R where piperacillin/tazobactam = R (no changes)</span></a>
<a class="sourceLine" id="cb15-39" data-line-number="39"><span class="co"># Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)</span></a>
<a class="sourceLine" id="cb15-40" data-line-number="40"><span class="co"># Non-EUCAST: trimethoprim = R where trimethoprim/sulfa = R (no changes)</span></a>
<a class="sourceLine" id="cb15-41" data-line-number="41"><span class="co"># Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)</span></a>
<a class="sourceLine" id="cb15-42" data-line-number="42"><span class="co"># </span></a>
<a class="sourceLine" id="cb15-43" data-line-number="43"><span class="co"># --------------------------------------------------------------------------</span></a>
<a class="sourceLine" id="cb15-44" data-line-number="44"><span class="co"># EUCAST rules affected 6,550 out of 20,000 rows, making a total of 7,913 edits</span></a>
<a class="sourceLine" id="cb15-44" data-line-number="44"><span class="co"># EUCAST rules affected 6,513 out of 20,000 rows, making a total of 7,843 edits</span></a>
<a class="sourceLine" id="cb15-45" data-line-number="45"><span class="co"># =&gt; added 0 test results</span></a>
<a class="sourceLine" id="cb15-46" data-line-number="46"><span class="co"># </span></a>
<a class="sourceLine" id="cb15-47" data-line-number="47"><span class="co"># =&gt; changed 7,913 test results</span></a>
<a class="sourceLine" id="cb15-48" data-line-number="48"><span class="co"># * 111 test results changed from S to I</span></a>
<a class="sourceLine" id="cb15-49" data-line-number="49"><span class="co"># * 4,763 test results changed from S to R</span></a>
<a class="sourceLine" id="cb15-50" data-line-number="50"><span class="co"># * 1,086 test results changed from I to S</span></a>
<a class="sourceLine" id="cb15-51" data-line-number="51"><span class="co"># * 349 test results changed from I to R</span></a>
<a class="sourceLine" id="cb15-52" data-line-number="52"><span class="co"># * 1,585 test results changed from R to S</span></a>
<a class="sourceLine" id="cb15-53" data-line-number="53"><span class="co"># * 19 test results changed from R to I</span></a>
<a class="sourceLine" id="cb15-47" data-line-number="47"><span class="co"># =&gt; changed 7,843 test results</span></a>
<a class="sourceLine" id="cb15-48" data-line-number="48"><span class="co"># - 111 test results changed from S to I</span></a>
<a class="sourceLine" id="cb15-49" data-line-number="49"><span class="co"># - 4,735 test results changed from S to R</span></a>
<a class="sourceLine" id="cb15-50" data-line-number="50"><span class="co"># - 1,034 test results changed from I to S</span></a>
<a class="sourceLine" id="cb15-51" data-line-number="51"><span class="co"># - 316 test results changed from I to R</span></a>
<a class="sourceLine" id="cb15-52" data-line-number="52"><span class="co"># - 1,612 test results changed from R to S</span></a>
<a class="sourceLine" id="cb15-53" data-line-number="53"><span class="co"># - 35 test results changed from R to I</span></a>
<a class="sourceLine" id="cb15-54" data-line-number="54"><span class="co"># --------------------------------------------------------------------------</span></a>
<a class="sourceLine" id="cb15-55" data-line-number="55"><span class="co"># </span></a>
<a class="sourceLine" id="cb15-56" data-line-number="56"><span class="co"># Use eucast_rules(..., verbose = TRUE) (on your original data) to get a data.frame with all specified edits instead.</span></a></code></pre></div>
@@ -497,7 +497,7 @@
<a class="sourceLine" id="cb17-3" data-line-number="3"><span class="co"># </span><span class="al">NOTE</span><span class="co">: Using column `bacteria` as input for `col_mo`.</span></a>
<a class="sourceLine" id="cb17-4" data-line-number="4"><span class="co"># </span><span class="al">NOTE</span><span class="co">: Using column `date` as input for `col_date`.</span></a>
<a class="sourceLine" id="cb17-5" data-line-number="5"><span class="co"># </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb17-6" data-line-number="6"><span class="co"># =&gt; Found 5,610 first isolates (28.0% of total)</span></a></code></pre></div>
<a class="sourceLine" id="cb17-6" data-line-number="6"><span class="co"># =&gt; Found 5,643 first isolates (28.2% of total)</span></a></code></pre></div>
<p>So only about 28% of the isolates are suitable for resistance analysis! We can now filter on it with the <code><a href="https://dplyr.tidyverse.org/reference/filter.html">filter()</a></code> function, also from the <code>dplyr</code> package:</p>
<div class="sourceCode" id="cb18"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb18-1" data-line-number="1">data_1st &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb18-2" data-line-number="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(first <span class="op">==</span><span class="st"> </span><span class="ot">TRUE</span>)</a></code></pre></div>
@@ -508,7 +508,7 @@
<div id="first-weighted-isolates" class="section level2">
<h2 class="hasAnchor">
<a href="#first-weighted-isolates" class="anchor"></a>First <em>weighted</em> isolates</h2>
<p>We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient W2, sorted on date:</p>
<p>We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient Q10, sorted on date:</p>
<table class="table">
<thead><tr class="header">
<th align="center">isolate</th>
@@ -524,74 +524,74 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-02-15</td>
<td align="center">W2</td>
<td align="center">2010-01-12</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-05-10</td>
<td align="center">W2</td>
<td align="center">2010-01-16</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-08-30</td>
<td align="center">W2</td>
<td align="center">2010-03-09</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-09-27</td>
<td align="center">W2</td>
<td align="center">2010-03-31</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2010-10-28</td>
<td align="center">W2</td>
<td align="center">2010-04-06</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2010-12-01</td>
<td align="center">W2</td>
<td align="center">2010-05-24</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2011-01-20</td>
<td align="center">W2</td>
<td align="center">2010-05-25</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -601,8 +601,8 @@
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2011-02-09</td>
<td align="center">W2</td>
<td align="center">2010-07-08</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -612,21 +612,21 @@
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2011-03-08</td>
<td align="center">W2</td>
<td align="center">2010-10-22</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">TRUE</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-04-14</td>
<td align="center">W2</td>
<td align="center">2010-11-30</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -634,7 +634,7 @@
</tr>
</tbody>
</table>
<p>Only 2 isolates are marked as ‘first’ according to the CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are clearly different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The <code><a href="../reference/key_antibiotics.html">key_antibiotics()</a></code> function adds a vector with 18 key antibiotics: 6 broad-spectrum ones, 6 narrow-spectrum ones for Gram negatives and 6 narrow-spectrum ones for Gram positives. These can be defined by the user.</p>
<p>Only 1 isolate is marked as ‘first’ according to the CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are clearly different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The <code><a href="../reference/key_antibiotics.html">key_antibiotics()</a></code> function adds a vector with 18 key antibiotics: 6 broad-spectrum ones, 6 narrow-spectrum ones for Gram negatives and 6 narrow-spectrum ones for Gram positives. These can be defined by the user.</p>
<p>If a column exists with a name like ‘key(…)ab’, the <code><a href="../reference/first_isolate.html">first_isolate()</a></code> function will automatically use it and determine the first weighted isolates. Mind the NOTEs in the output below:</p>
<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb20-1" data-line-number="1">data &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb20-2" data-line-number="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">keyab =</span> <span class="kw"><a href="../reference/key_antibiotics.html">key_antibiotics</a></span>(.)) <span class="op">%&gt;%</span><span class="st"> </span></a>
@@ -645,7 +645,7 @@
<a class="sourceLine" id="cb20-7" data-line-number="7"><span class="co"># </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb20-8" data-line-number="8"><span class="co"># </span><span class="al">NOTE</span><span class="co">: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this.</span></a>
<a class="sourceLine" id="cb20-9" data-line-number="9"><span class="co"># [Criterion] Inclusion based on key antibiotics, ignoring I.</span></a>
<a class="sourceLine" id="cb20-10" data-line-number="10"><span class="co"># =&gt; Found 15,032 first weighted isolates (75.2% of total)</span></a></code></pre></div>
<a class="sourceLine" id="cb20-10" data-line-number="10"><span class="co"># =&gt; Found 15,004 first weighted isolates (75.0% of total)</span></a></code></pre></div>
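<p>To make the idea of weighting concrete, here is a minimal base R sketch for a single patient and a single species. It is not the package's implementation: the helper names <code>profile_differs()</code> and <code>flag_first()</code>, the toy data, the 365-day episode length and the choice to compare each isolate with the one directly before it are all assumptions made for illustration. The only rule taken from the output above is that positions holding <code>I</code> are ignored when comparing key antibiotics.</p>

```r
# Toy data for one patient and one species (made-up values, for illustration only)
isolates <- data.frame(
  date = as.Date(c("2010-01-12", "2010-01-16", "2010-03-09", "2011-02-01")),
  keyab = c("SSRS", "RSRS", "RSRS", "RSRS"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when two equally long antibiogram strings differ
# between S and R on any position; positions holding I are ignored
profile_differs <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  keep <- x %in% c("S", "R") & y %in% c("S", "R")
  any(x[keep] != y[keep])
}

# Hypothetical helper: flag first and first weighted isolates.
# episode_days = 365 is an assumption of this sketch.
flag_first <- function(df, episode_days = 365) {
  df <- df[order(df$date), ]
  n <- nrow(df)
  first <- logical(n)
  first_weighted <- logical(n)
  last_first <- as.Date(NA)
  for (i in seq_len(n)) {
    # A new episode starts at the very first isolate, or when enough
    # days have passed since the last isolate flagged as 'first'
    new_episode <- is.na(last_first) ||
      as.numeric(df$date[i] - last_first) > episode_days
    if (new_episode) {
      first[i] <- TRUE
      last_first <- df$date[i]
    }
    # Weighting: also include the isolate when its key antibiotic
    # profile differs from the previous isolate
    changed <- i > 1 && profile_differs(df$keyab[i - 1], df$keyab[i])
    first_weighted[i] <- new_episode || changed
  }
  df$first <- first
  df$first_weighted <- first_weighted
  df
}

result <- flag_first(isolates)
print(result)
```

<p>In this toy example the second isolate falls within the same episode as the first, so it is not a ‘first’ isolate, but its profile changed from S to R on one key antibiotic, so the sketch still flags it as a first weighted isolate.</p>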
<table class="table">
<thead><tr class="header">
<th align="center">isolate</th>
@@ -662,59 +662,59 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-02-15</td>
<td align="center">W2</td>
<td align="center">2010-01-12</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-05-10</td>
<td align="center">W2</td>
<td align="center">2010-01-16</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-08-30</td>
<td align="center">W2</td>
<td align="center">2010-03-09</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-09-27</td>
<td align="center">W2</td>
<td align="center">2010-03-31</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2010-10-28</td>
<td align="center">W2</td>
<td align="center">2010-04-06</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
@@ -722,32 +722,32 @@
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2010-12-01</td>
<td align="center">W2</td>
<td align="center">2010-05-24</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2011-01-20</td>
<td align="center">W2</td>
<td align="center">2010-05-25</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2011-02-09</td>
<td align="center">W2</td>
<td align="center">2010-07-08</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -758,35 +758,35 @@
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2011-03-08</td>
<td align="center">W2</td>
<td align="center">2010-10-22</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">TRUE</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-04-14</td>
<td align="center">W2</td>
<td align="center">2010-11-30</td>
<td align="center">Q10</td>
<td align="center">B_ESCHR_COL</td>
<td align="center