unknown codes, rsi fix

main
parent
commit
e835525cf6
DESCRIPTION | 2
NEWS.md | 15
R/data.R | 5
R/mo.R | 70
R/mo_property.R | 68
R/rsi.R | 3
R/zzz.R | 1
data/microorganisms.rda | BIN
docs/articles/AMR.html | 392
docs/articles/AMR_files/figure-html/plot 1-1.png | BIN
docs/articles/AMR_files/figure-html/plot 3-1.png | BIN
docs/articles/AMR_files/figure-html/plot 4-1.png | BIN
docs/articles/AMR_files/figure-html/plot 5-1.png | BIN
docs/articles/EUCAST.html | 2
docs/articles/G_test.html | 2
docs/articles/WHONET.html | 2
docs/articles/atc_property.html | 2
docs/articles/benchmarks.html | 70
docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png | BIN
docs/articles/freq.html | 2
docs/articles/mo_property.html | 2
docs/articles/resistance_predict.html | 2
docs/news/index.html | 145
docs/reference/as.mo.html | 3
docs/reference/microorganisms.html | 3
docs/reference/microorganisms.old.html | 2
man/as.mo.Rd | 2
man/microorganisms.Rd | 3
man/microorganisms.old.Rd | 2
reproduction_of_microorganisms.R | 81
tests/testthat/test-mo.R | 4

2
DESCRIPTION

@@ -1,6 +1,6 @@
Package: AMR
Version: 0.5.0.9020
Date: 2019-03-01
Date: 2019-03-02
Title: Antimicrobial Resistance Analysis
Authors@R: c(
person(

15
NEWS.md

@@ -78,11 +78,18 @@ We've got a new website: [https://msberends.gitlab.io/AMR](https://msberends.git
* Functions `atc_ddd()` and `atc_groups()` have been renamed `atc_online_ddd()` and `atc_online_groups()`. The old functions are deprecated and will be removed in a future version.
* Function `guess_mo()` is now deprecated in favour of `as.mo()` and will be removed in future versions
* Function `guess_atc()` is now deprecated in favour of `as.atc()` and will be removed in future versions
* Improvements for `as.mo()`:
* Improvements for `as.mo()`:\
* Incoercible results will now be considered 'unknown', MO code `UNKNOWN`. Properties of these will be translated on foreign systems into all previously supported languages: German, Dutch, French, Italian, Spanish and Portuguese:
```r
mo_genus("qwerty", language = "es")
# Warning:
# one unique value (^= 100.0%) could not be coerced and is considered 'unknown': "qwerty". Use mo_failures() to review it.
#> [1] "(género desconocido)"
```
* Fix for vector containing only empty values
* Finds better results when input is in other languages
* Better handling for subspecies
* Better handling for *Salmonellae*
* Better handling for *Salmonellae*, especially the 'city like' serovars like *Salmonella London*
* Understanding of highly virulent *E. coli* strains like EIEC, EPEC and STEC
* Uncertain results are now looked for by default; these results will be returned with an informative warning
* Manual (help page) now contains more info about the algorithms
@@ -102,7 +109,9 @@ We've got a new website: [https://msberends.gitlab.io/AMR](https://msberends.git
* New colours for `scale_rsi_colours()`
* Summaries of class `mo` will now return the top 3 and the unique count, e.g. using `summary(mo)`
* Small text updates to summaries of class `rsi` and `mic`
* Function `as.rsi()` now gives a warning when inputting MIC values
* Function `as.rsi()`:
* Now gives a warning when inputting MIC values
* Now accepts high and low resistance: `"HIGH S"` will return `S`
* Frequency tables (`freq()` function):
* Support for tidyverse quasiquotation! Now you can create frequency tables of function outcomes:
```r

5
R/data.R

@@ -134,7 +134,7 @@
#'
#' A data set containing the microbial taxonomy of six kingdoms from the Catalogue of Life. MO codes can be looked up using \code{\link{as.mo}}.
#' @inheritSection catalogue_of_life Catalogue of Life
#' @format A \code{\link{data.frame}} with 57,158 observations and 14 variables:
#' @format A \code{\link{data.frame}} with 59,985 observations and 15 variables:
#' \describe{
#' \item{\code{mo}}{ID of microorganism as used by this package}
#' \item{\code{col_id}}{Catalogue of Life ID}
@@ -150,6 +150,7 @@
#' \item{\code{rank}}{Taxonomic rank of the microorganism, like \code{"species"} or \code{"genus"}}
#' \item{\code{ref}}{Author(s) and year of concerning scientific publication}
#' \item{\code{species_id}}{ID of the species as used by the Catalogue of Life}
#' \item{\code{prevalence}}{Prevalence of the microorganism, see \code{?as.mo}}
#' }
#' @source Catalogue of Life: Annual Checklist (public online database), \url{www.catalogueoflife.org}.
#' @details Manually added were:
@@ -172,7 +173,7 @@ catalogue_of_life <- list(
#'
#' A data set containing old (previously valid or accepted) taxonomic names according to the Catalogue of Life. This data set is used internally by \code{\link{as.mo}}.
#' @inheritSection catalogue_of_life Catalogue of Life
#' @format A \code{\link{data.frame}} with 14,487 observations and 4 variables:
#' @format A \code{\link{data.frame}} with 17,069 observations and 4 variables:
#' \describe{
#' \item{\code{col_id}}{Catalogue of Life ID}
#' \item{\code{tsn_new}}{New Catalogue of Life ID}

70
R/mo.R

@@ -51,6 +51,8 @@
#' F (Fungi), P (Protozoa), PL (Plantae) or V (Viruses)
#' }
#'
#' Values that cannot be coerced will be considered 'unknown' and have an MO code \code{UNKNOWN}.
#'
#' Use the \code{\link{mo_property}} functions to get properties based on the returned code, see Examples.
#'
#' \strong{Artificial Intelligence} \cr
@@ -275,7 +277,8 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
# only check the uniques, which is way faster
x <- unique(x)
# remove empty values (to later fill them in again with NAs)
x <- x[!is.na(x) & !is.null(x) & !identical(x, "")]
# ("xxx" is WHONET code for 'no growth')
x <- x[!is.na(x) & !is.null(x) & !identical(x, "") & !identical(x, "xxx")]
# conversion of old MO codes from v0.5.0 (ITIS) to later versions (Catalogue of Life)
if (any(x %like% "^[BFP]_[A-Z]{3,7}") & !all(x %in% microorganisms$mo)) {
@@ -367,8 +370,6 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
x_species <- paste(x, "species")
# translate to English for supported languages of mo_property
x <- gsub("(Gruppe|gruppe|groep|grupo|gruppo|groupe)", "group", x, ignore.case = TRUE)
# remove 'empty' genus and species values
x <- gsub("(no MO)", "", x, fixed = TRUE)
# remove non-text in case of "E. coli" except dots and spaces
x <- gsub("[^.a-zA-Z0-9/ \\-]+", "", x)
# replace minus by a space
@@ -419,12 +420,17 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
next
}
if (tolower(x_trimmed[i]) %in% c("", "xxx", "other", "none", "unknown")) {
# empty and nonsense values, ignore without warning ("xxx" is WHONET code for 'no growth')
if (any(x_trimmed[i] %in% c(NA, ""))) {
x[i] <- NA_character_
next
}
if (tolower(x_trimmed[i]) %in% c("xxx", "other", "none", "unknown")) {
# empty and nonsense values, ignore without warning
x[i] <- microorganismsDT[mo == "UNKNOWN", ..property][[1]]
next
}
if (nchar(gsub("[^a-zA-Z]", "", x_trimmed[i])) < 3) {
# check if search term was like "A. species", then return first genus found with ^A
if (x_backup[i] %like% "[a-z]+ species" | x_backup[i] %like% "[a-z] spp[.]?") {
@@ -441,14 +447,14 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
}
}
# fewer than 3 chars and not looked for species, add as failure
x[i] <- NA_character_
x[i] <- microorganismsDT[mo == "UNKNOWN", ..property][[1]]
failures <- c(failures, x_backup[i])
next
}
if (x_trimmed[i] %like% "virus") {
# there is no fullname like virus, so don't try to coerce it
x[i] <- NA_character_
x[i] <- microorganismsDT[mo == "UNKNOWN", ..property][[1]]
failures <- c(failures, x_backup[i])
next
}
@@ -667,7 +673,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
d.x_withspaces_start_end = x_withspaces_start_end[i],
e.x_withspaces_start_only = x_withspaces_start_only[i],
f.x_withspaces_end_only = x_withspaces_end_only[i])
if (!is.na(x[i])) {
if (!empty_result(x[i])) {
next
}
# THEN TRY PREVALENT IN HUMAN INFECTIONS ----
@@ -678,7 +684,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
d.x_withspaces_start_end = x_withspaces_start_end[i],
e.x_withspaces_start_only = x_withspaces_start_only[i],
f.x_withspaces_end_only = x_withspaces_end_only[i])
if (!is.na(x[i])) {
if (!empty_result(x[i])) {
next
}
# THEN UNPREVALENT IN HUMAN INFECTIONS ----
@@ -689,7 +695,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
d.x_withspaces_start_end = x_withspaces_start_end[i],
e.x_withspaces_start_only = x_withspaces_start_only[i],
f.x_withspaces_end_only = x_withspaces_end_only[i])
if (!is.na(x[i])) {
if (!empty_result(x[i])) {
next
}
@@ -765,7 +771,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
# (3) not yet implemented taxonomic changes in Catalogue of Life ----
found <- suppressMessages(suppressWarnings(exec_as.mo(TEMPORARY_TAXONOMY(b.x_trimmed), clear_options = FALSE, allow_uncertain = FALSE)))
if (!is.na(found)) {
if (!empty_result(found)) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
uncertainties <<- rbind(uncertainties,
@@ -780,7 +786,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
a.x_backup_stripped <- gsub("( *[(].*[)] *)", " ", a.x_backup)
a.x_backup_stripped <- trimws(gsub(" +", " ", a.x_backup_stripped))
found <- suppressMessages(suppressWarnings(exec_as.mo(a.x_backup_stripped, clear_options = FALSE, allow_uncertain = FALSE)))
if (!is.na(found) & nchar(b.x_trimmed) >= 6) {
if (!empty_result(found) & nchar(b.x_trimmed) >= 6) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
uncertainties <<- rbind(uncertainties,
@@ -797,7 +803,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
for (i in 1:(length(x_strip) - 1)) {
x_strip_collapsed <- paste(x_strip[1:(length(x_strip) - i)], collapse = " ")
found <- suppressMessages(suppressWarnings(exec_as.mo(x_strip_collapsed, clear_options = FALSE, allow_uncertain = FALSE)))
if (!is.na(found)) {
if (!empty_result(found)) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
uncertainties <<- rbind(uncertainties,
@@ -816,7 +822,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
for (i in 2:(length(x_strip))) {
x_strip_collapsed <- paste(x_strip[i:length(x_strip)], collapse = " ")
found <- suppressMessages(suppressWarnings(exec_as.mo(x_strip_collapsed, clear_options = FALSE, allow_uncertain = FALSE)))
if (!is.na(found)) {
if (!empty_result(found)) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
uncertainties <<- rbind(uncertainties,
@@ -833,13 +839,15 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
found <- microorganismsDT[fullname %like% f.x_withspaces_end_only]
if (nrow(found) > 0) {
found_result <- found[["mo"]]
found <- microorganismsDT[mo == found_result[1L], ..property][[1]]
uncertainties <<- rbind(uncertainties,
data.frame(uncertainty = 3,
input = a.x_backup,
fullname = microorganismsDT[mo == found_result[1L], fullname][[1]],
mo = found_result[1L]))
return(found[1L])
if (!empty_result(found_result)) {
found <- microorganismsDT[mo == found_result[1L], ..property][[1]]
uncertainties <<- rbind(uncertainties,
data.frame(uncertainty = 3,
input = a.x_backup,
fullname = microorganismsDT[mo == found_result[1L], fullname][[1]],
mo = found_result[1L]))
return(found[1L])
}
}
# not found among the uncertain results either
@@ -847,13 +855,13 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
}
x[i] <- uncertain_fn(x_backup[i], x_trimmed[i], x_withspaces_start_end[i], x_withspaces_start_only[i], x_withspaces_end_only[i])
if (!is.na(x[i])) {
if (!empty_result(x[i])) {
next
}
}
# not found ----
x[i] <- NA_character_
x[i] <- microorganismsDT[mo == "UNKNOWN", ..property][[1]]
failures <- c(failures, x_backup[i])
}
}
@@ -862,15 +870,15 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
failures <- failures[!failures %in% c(NA, NULL, NaN)]
if (length(failures) > 0 & clear_options == TRUE) {
options(mo_failures = sort(unique(failures)))
plural <- c("value", "it")
plural <- c("value", "it", "is")
if (n_distinct(failures) > 1) {
plural <- c("values", "them")
plural <- c("values", "them", "are")
}
total_failures <- length(x_input[x_input %in% failures & !x_input %in% c(NA, NULL, NaN)])
total_n <- length(x_input[!x_input %in% c(NA, NULL, NaN)])
msg <- paste0("\n", nr2char(n_distinct(failures)), " unique input ", plural[1],
msg <- paste0("\n", nr2char(n_distinct(failures)), " unique ", plural[1],
" (^= ", percent(total_failures / total_n, round = 1, force_zero = TRUE),
") could not be coerced to a valid MO code")
") could not be coerced and ", plural[3], " considered 'unknown'")
if (n_distinct(failures) <= 10) {
msg <- paste0(msg, ": ", paste('"', unique(failures), '"', sep = "", collapse = ', '))
}
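The pluralisation added in this hunk can be tried in isolation. The following is a minimal base R sketch under stated assumptions, not the package's code: `failure_message_sketch()` is a hypothetical name, a plain count stands in for the internal helpers `nr2char()` and `percent()`, and the closing sentence is taken from the warning quoted in NEWS.md.

```r
# Hypothetical helper: builds the 'could not be coerced' message for a
# character vector of failed inputs, switching between singular and plural.
failure_message_sketch <- function(failures) {
  n <- length(unique(failures))
  # element 1: noun, element 2: pronoun, element 3: verb
  plural <- if (n > 1) c("values", "them", "are") else c("value", "it", "is")
  msg <- paste0(n, " unique ", plural[1],
                " could not be coerced and ", plural[3], " considered 'unknown'")
  # only list the inputs themselves when there are few of them
  if (n <= 10) {
    msg <- paste0(msg, ": ", paste0('"', unique(failures), '"', collapse = ", "))
  }
  paste0(msg, ". Use mo_failures() to review ", plural[2], ".")
}

failure_message_sketch("qwerty")
failure_message_sketch(c("qwerty", "asdf"))
```

Keeping noun, pronoun and verb in one vector means a single `if` decides all three word forms, which is the design the hunk extends by adding the third element.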
@@ -887,7 +895,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
if (NROW(uncertainties) > 1) {
plural <- c("values", "them")
}
msg <- paste0("\nResults of ", nr2char(NROW(uncertainties)), " input ", plural[1],
msg <- paste0("\nResults of ", nr2char(NROW(uncertainties)), " ", plural[1],
" was guessed with uncertainty. Use mo_uncertainties() to review ", plural[2], ".")
warning(red(msg),
call. = FALSE,
@@ -951,7 +959,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
# Wrap up ----------------------------------------------------------------
# comply to x, which is also unique and without empty values
x_input_unique_nonempty <- unique(x_input[!is.na(x_input) & !is.null(x_input) & !identical(x_input, "")])
x_input_unique_nonempty <- unique(x_input[!is.na(x_input) & !is.null(x_input) & !identical(x_input, "") & !identical(x_input, "xxx")])
# left join the found results to the original input values (x_input)
df_found <- data.frame(input = as.character(x_input_unique_nonempty),
@@ -984,6 +992,10 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
x
}
empty_result <- function(x) {
x %in% c(NA, "UNKNOWN")
}
TEMPORARY_TAXONOMY <- function(x) {
x[x %like% 'Cutibacterium'] <- gsub('Cutibacterium', 'Propionibacterium', x[x %like% 'Cutibacterium'])
x
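
Two base R behaviours are relevant when reading the hunks above. The sketch below is illustrative only and is not part of the commit: the input vector and the code `"SOME_MO_CODE"` are made-up placeholders. It shows that `identical()` compares whole objects and returns a single logical, so it does not filter individual elements, and that `%in%` treats `NA` as an ordinary value, which is what lets `empty_result()` replace the former `is.na()` checks.

```r
x <- c("E. coli", "", "xxx", NA)

# identical() returns one TRUE/FALSE for the whole vector, so for any input
# longer than one element these conditions are recycled and remove nothing
whole_object <- !is.na(x) & !identical(x, "") & !identical(x, "xxx")
x[whole_object]
#> [1] "E. coli" ""        "xxx"

# %in% compares element by element
elementwise <- !is.na(x) & !x %in% c("", "xxx")
x[elementwise]
#> [1] "E. coli"

# empty_result() as defined in the commit: %in% never returns NA
empty_result <- function(x) {
  x %in% c(NA, "UNKNOWN")
}
empty_result(c(NA, "UNKNOWN", "SOME_MO_CODE"))
#> [1]  TRUE  TRUE FALSE

# a direct comparison would propagate the missing value instead
NA == "UNKNOWN"
#> [1] NA
```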

68
R/mo_property.R

@@ -364,7 +364,7 @@ mo_translate <- function(x, language) {
}
x_tobetranslated <- grepl(x = x,
pattern = "(Coagulase Negative Staphylococcus|Coagulase Positive Staphylococcus|Beta-haemolytic Streptococcus|unknown Gram negatives|unknown Gram positives|CoNS|CoPS|no MO|Gram negative|Gram positive|Bacteria|Fungi|Protozoa|biogroup|biotype|vegetative|group|Group)")
pattern = "(Coagulase Negative Staphylococcus|Coagulase Positive Staphylococcus|Beta-haemolytic Streptococcus|unknown Gram negatives|unknown Gram positives|unknown name|unknown kingdom|unknown phylum|unknown class|unknown order|unknown family|unknown genus|unknown species|unknown subspecies|unknown rank|CoNS|CoPS|Gram negative|Gram positive|Bacteria|Fungi|Protozoa|biogroup|biotype|vegetative|group|Group)")
if (sum(x_tobetranslated, na.rm = TRUE) == 0) {
return(x)
@@ -379,9 +379,18 @@ mo_translate <- function(x, language) {
gsub("Beta-haemolytic Streptococcus", "Beta-h\u00e4molytischer Streptococcus", ., fixed = TRUE) %>%
gsub("unknown Gram negatives", "unbekannte Gramnegativen", ., fixed = TRUE) %>%
gsub("unknown Gram positives", "unbekannte Grampositiven", ., fixed = TRUE) %>%
gsub("unknown name", "unbekannter Name", ., fixed = TRUE) %>%
gsub("unknown kingdom", "unbekanntes Reich", ., fixed = TRUE) %>%
gsub("unknown phylum", "unbekannter Stamm", ., fixed = TRUE) %>%
gsub("unknown class", "unbekannte Klasse", ., fixed = TRUE) %>%
gsub("unknown order", "unbekannte Ordnung", ., fixed = TRUE) %>%
gsub("unknown family", "unbekannte Familie", ., fixed = TRUE) %>%
gsub("unknown genus", "unbekannte Gattung", ., fixed = TRUE) %>%
gsub("unknown species", "unbekannte Art", ., fixed = TRUE) %>%
gsub("unknown subspecies", "unbekannte Unterart", ., fixed = TRUE) %>%
gsub("unknown rank", "unbekannter Rang", ., fixed = TRUE) %>%
gsub("(CoNS)", "(KNS)", ., fixed = TRUE) %>%
gsub("(CoPS)", "(KPS)", ., fixed = TRUE) %>%
gsub("(no MO)", "(kein MO)", ., fixed = TRUE) %>%
gsub("Gram negative", "Gramnegativ", ., fixed = TRUE) %>%
gsub("Gram positive", "Grampositiv", ., fixed = TRUE) %>%
gsub("Bacteria", "Bakterien", ., fixed = TRUE) %>%
@@ -401,7 +410,16 @@ mo_translate <- function(x, language) {
gsub("Beta-haemolytic Streptococcus", "Beta-hemolytische Streptococcus", ., fixed = TRUE) %>%
gsub("unknown Gram negatives", "onbekende Gram-negatieven", ., fixed = TRUE) %>%
gsub("unknown Gram positives", "onbekende Gram-positieven", ., fixed = TRUE) %>%
gsub("(no MO)", "(geen MO)", ., fixed = TRUE) %>%
gsub("unknown name", "onbekende naam", ., fixed = TRUE) %>%
gsub("unknown kingdom", "onbekend koninkrijk", ., fixed = TRUE) %>%
gsub("unknown phylum", "onbekend fylum", ., fixed = TRUE) %>%
gsub("unknown class", "onbekende klasse", ., fixed = TRUE) %>%
gsub("unknown order", "onbekende orde", ., fixed = TRUE) %>%
gsub("unknown family", "onbekende familie", ., fixed = TRUE) %>%
gsub("unknown genus", "onbekend geslacht", ., fixed = TRUE) %>%
gsub("unknown species", "onbekende soort", ., fixed = TRUE) %>%
gsub("unknown subspecies", "onbekende ondersoort", ., fixed = TRUE) %>%
gsub("unknown rank", "onbekende rang", ., fixed = TRUE) %>%
gsub("(CoNS)", "(CNS)", ., fixed = TRUE) %>%
gsub("(CoPS)", "(CPS)", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram-negatief", ., fixed = TRUE) %>%
@@ -423,7 +441,16 @@ mo_translate <- function(x, language) {
gsub("Beta-haemolytic Streptococcus", "Streptococcus Beta-hemol\u00edtico", ., fixed = TRUE) %>%
gsub("unknown Gram negatives", "Gram negativos desconocidos", ., fixed = TRUE) %>%
gsub("unknown Gram positives", "Gram positivos desconocidos", ., fixed = TRUE) %>%
gsub("(no MO)", "(sin MO)", ., fixed = TRUE) %>%
gsub("unknown name", "nombre desconocido", ., fixed = TRUE) %>%
gsub("unknown kingdom", "reino desconocido", ., fixed = TRUE) %>%
gsub("unknown phylum", "filo desconocido", ., fixed = TRUE) %>%
gsub("unknown class", "clase desconocida", ., fixed = TRUE) %>%
gsub("unknown order", "orden desconocido", ., fixed = TRUE) %>%
gsub("unknown family", "familia desconocida", ., fixed = TRUE) %>%
gsub("unknown genus", "g\u00e9nero desconocido", ., fixed = TRUE) %>%
gsub("unknown species", "especie desconocida", ., fixed = TRUE) %>%
gsub("unknown subspecies", "subespecie desconocida", ., fixed = TRUE) %>%
gsub("unknown rank", "rango desconocido", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram negativo", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram positivo", ., fixed = TRUE) %>%
gsub("Bacteria", "Bacterias", ., fixed = TRUE) %>%
@@ -443,7 +470,16 @@ mo_translate <- function(x, language) {
gsub("Beta-haemolytic Streptococcus", "Streptococcus Beta-emolitico", ., fixed = TRUE) %>%
gsub("unknown Gram negatives", "Gram negativi sconosciuti", ., fixed = TRUE) %>%
gsub("unknown Gram positives", "Gram positivi sconosciuti", ., fixed = TRUE) %>%
gsub("(no MO)", "(non MO)", ., fixed = TRUE) %>%
gsub("unknown name", "nome sconosciuto", ., fixed = TRUE) %>%
gsub("unknown kingdom", "regno sconosciuto", ., fixed = TRUE) %>%
gsub("unknown phylum", "phylum sconosciuto", ., fixed = TRUE) %>%
gsub("unknown class", "classe sconosciuta", ., fixed = TRUE) %>%
gsub("unknown order", "ordine sconosciuto", ., fixed = TRUE) %>%
gsub("unknown family", "famiglia sconosciuta", ., fixed = TRUE) %>%
gsub("unknown genus", "genere sconosciuto", ., fixed = TRUE) %>%
gsub("unknown species", "specie sconosciuta", ., fixed = TRUE) %>%
gsub("unknown subspecies", "sottospecie sconosciuta", ., fixed = TRUE) %>%
gsub("unknown rank", "grado sconosciuto", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram negativo", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram positivo", ., fixed = TRUE) %>%
gsub("Bacteria", "Batteri", ., fixed = TRUE) %>%
@@ -462,7 +498,16 @@ mo_translate <- function(x, language) {
gsub("Beta-haemolytic Streptococcus", "Streptococcus B\u00eata-h\u00e9molytique", ., fixed = TRUE) %>%
gsub("unknown Gram negatives", "Gram n\u00e9gatifs inconnus", ., fixed = TRUE) %>%
gsub("unknown Gram positives", "Gram positifs inconnus", ., fixed = TRUE) %>%
gsub("(no MO)", "(pas MO)", ., fixed = TRUE) %>%
gsub("unknown name", "nom inconnu", ., fixed = TRUE) %>%
gsub("unknown kingdom", "r\u00e8gne inconnu", ., fixed = TRUE) %>%
gsub("unknown phylum", "embranchement inconnu", ., fixed = TRUE) %>%
gsub("unknown class", "classe inconnue", ., fixed = TRUE) %>%
gsub("unknown order", "ordre inconnu", ., fixed = TRUE) %>%
gsub("unknown family", "famille inconnue", ., fixed = TRUE) %>%
gsub("unknown genus", "genre inconnu", ., fixed = TRUE) %>%
gsub("unknown species", "esp\u00e8ce inconnue", ., fixed = TRUE) %>%
gsub("unknown subspecies", "sous-esp\u00e8ce inconnue", ., fixed = TRUE) %>%
gsub("unknown rank", "rang inconnu", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram n\u00e9gatif", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram positif", ., fixed = TRUE) %>%
gsub("Bacteria", "Bact\u00e9ries", ., fixed = TRUE) %>%
@@ -482,7 +527,16 @@ mo_translate <- function(x, language) {
gsub("Beta-haemolytic Streptococcus", "Streptococcus Beta-hemol\u00edtico", ., fixed = TRUE) %>%
gsub("unknown Gram negatives", "Gram negativos desconhecidos", ., fixed = TRUE) %>%
gsub("unknown Gram positives", "Gram positivos desconhecidos", ., fixed = TRUE) %>%
gsub("(no MO)", "(sem MO)", ., fixed = TRUE) %>%
gsub("unknown name", "nome desconhecido", ., fixed = TRUE) %>%
gsub("unknown kingdom", "reino desconhecido", ., fixed = TRUE) %>%
gsub("unknown phylum", "filo desconhecido", ., fixed = TRUE) %>%
gsub("unknown class", "classe desconhecida", ., fixed = TRUE) %>%
gsub("unknown order", "ordem desconhecida", ., fixed = TRUE) %>%
gsub("unknown family", "fam\u00edlia desconhecida", ., fixed = TRUE) %>%
gsub("unknown genus", "g\u00eanero desconhecido", ., fixed = TRUE) %>%
gsub("unknown species", "esp\u00e9cie desconhecida", ., fixed = TRUE) %>%
gsub("unknown subspecies", "subesp\u00e9cie desconhecida", ., fixed = TRUE) %>%
gsub("unknown rank", "classifica\u00e7\u00e3o desconhecida", ., fixed = TRUE) %>%
gsub("Gram negative", "Gram negativo", ., fixed = TRUE) %>%
gsub("Gram positive", "Gram positivo", ., fixed = TRUE) %>%
gsub("Bacteria", "Bact\u00e9rias", ., fixed = TRUE) %>%
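
The translation hunks above all follow one pattern: a fixed-string `gsub()` per English term. As a hedged illustration of that pattern only, the sketch below drives the same replacements from a named character vector instead of a `%>%` chain. `translate_unknown_sketch()` and the `es` dictionary are hypothetical names, and the three Spanish entries are copied from the Spanish hunk in this diff.

```r
# Hypothetical helper: applies each dictionary entry as a literal
# (fixed = TRUE) replacement, in the order the entries are given.
translate_unknown_sketch <- function(x, dictionary) {
  for (i in seq_along(dictionary)) {
    x <- gsub(names(dictionary)[i], dictionary[[i]], x, fixed = TRUE)
  }
  x
}

# English term = Spanish translation, as in the Spanish hunk of this diff
es <- c("unknown genus"      = "g\u00e9nero desconocido",
        "unknown species"    = "especie desconocida",
        "unknown subspecies" = "subespecie desconocida")

translate_unknown_sketch("(unknown genus)", es)
translate_unknown_sketch("Escherichia coli", es)  # no match, returned unchanged
```

With `fixed = TRUE` the parentheses and dots in the terms are matched literally, so no regular expression escaping is needed.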

3
R/rsi.R

@@ -76,6 +76,9 @@ as.rsi <- function(x) {
x <- gsub(' +', '', x)
# remove all MIC-like values: numbers, operators and periods
x <- gsub('[0-9.,;:<=>]+', '', x)
# remove everything between brackets, and 'high' and 'low'
x <- gsub("([(].*[)])", "", x)
x <- gsub("(high|low)", "", x, ignore.case = TRUE)
# disallow more than 3 characters
x[nchar(x) > 3] <- NA
# set to capitals
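
The effect of the three added lines can be checked with a standalone sketch of the cleaning steps visible in this hunk. This is an approximation under assumptions, not the real `as.rsi()`: `clean_rsi_sketch()` is a hypothetical name, the final `toupper()` is inferred from the "set to capitals" comment, and everything the function does outside this hunk is left out.

```r
# Hypothetical helper that mirrors only the steps shown in the hunk above
clean_rsi_sketch <- function(x) {
  # remove all spaces
  x <- gsub(" +", "", x)
  # remove all MIC-like values: numbers, operators and periods
  x <- gsub("[0-9.,;:<=>]+", "", x)
  # remove everything between brackets, and 'high' and 'low'
  x <- gsub("([(].*[)])", "", x)
  x <- gsub("(high|low)", "", x, ignore.case = TRUE)
  # disallow more than 3 characters
  x[nchar(x) > 3] <- NA
  # set to capitals (assumed from the comment that closes the hunk)
  toupper(x)
}

clean_rsi_sketch(c("HIGH S", "low R", "S (tested twice)", "resistant"))
#> [1] "S" "R" "S" NA
```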

1
R/zzz.R

@@ -55,6 +55,7 @@ make <- function() {
mutate(prevalence = case_when(
class == "Gammaproteobacteria"
| genus %in% c("Enterococcus", "Staphylococcus", "Streptococcus")
| mo == "UNKNOWN"
~ 1,
phylum %in% c("Proteobacteria",
"Firmicutes",

BIN
data/microorganisms.rda

Binary file not shown.

392
docs/articles/AMR.html

@@ -192,7 +192,7 @@
<h1>How to conduct AMR analysis</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">01 March 2019</h4>
<h4 class="date">02 March 2019</h4>
<div class="hidden name"><code>AMR.Rmd</code></div>
@@ -201,7 +201,7 @@
<p><strong>Note:</strong> values on this page will change with every website update since they are based on randomly created values and the page was written in <a href="https://rmarkdown.rstudio.com/">RMarkdown</a>. However, the methodology remains unchanged. This page was generated on 01 March 2019.</p>
<p><strong>Note:</strong> values on this page will change with every website update since they are based on randomly created values and the page was written in <a href="https://rmarkdown.rstudio.com/">RMarkdown</a>. However, the methodology remains unchanged. This page was generated on 02 March 2019.</p>
<div id="introduction" class="section level1">
<h1 class="hasAnchor">
<a href="#introduction" class="anchor"></a>Introduction</h1>
@@ -217,21 +217,21 @@
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2019-03-01</td>
<td align="center">2019-03-02</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">S</td>
</tr>
<tr class="even">
<td align="center">2019-03-01</td>
<td align="center">2019-03-02</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">R</td>
</tr>
<tr class="odd">
<td align="center">2019-03-01</td>
<td align="center">2019-03-02</td>
<td align="center">efgh</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
@@ -327,69 +327,69 @@
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2017-06-13</td>
<td align="center">Z3</td>
<td align="center">Hospital D</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
<td align="center">2012-06-30</td>
<td align="center">W5</td>
<td align="center">Hospital A</td>
<td align="center">Staphylococcus aureus</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
</tr>
<tr class="even">
<td align="center">2013-09-13</td>
<td align="center">A1</td>
<td align="center">Hospital D</td>
<td align="center">2012-07-07</td>
<td align="center">T4</td>
<td align="center">Hospital B</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">F</td>
</tr>
<tr class="odd">
<td align="center">2015-06-10</td>
<td align="center">L8</td>
<td align="center">2011-02-19</td>
<td align="center">H3</td>
<td align="center">Hospital B</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="even">
<td align="center">2010-08-05</td>
<td align="center">A7</td>
<td align="center">Hospital D</td>
<td align="center">2012-12-15</td>
<td align="center">G10</td>
<td align="center">Hospital C</td>
<td align="center">Klebsiella pneumoniae</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="odd">
<td align="center">2011-08-25</td>
<td align="center">X4</td>
<td align="center">Hospital C</td>
<td align="center">2010-09-11</td>
<td align="center">L4</td>
<td align="center">Hospital D</td>
<td align="center">Staphylococcus aureus</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
<td align="center">M</td>
</tr>
<tr class="even">
<td align="center">2011-03-09</td>
<td align="center">B7</td>
<td align="center">Hospital D</td>
<td align="center">2011-03-27</td>
<td align="center">H5</td>
<td align="center">Hospital A</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">M</td>
</tr>
</tbody>
@@ -411,8 +411,8 @@
#&gt;
#&gt; Item Count Percent Cum. Count Cum. Percent
#&gt; --- ----- ------- -------- ----------- -------------
#&gt; 1 M 10,311 51.6% 10,311 51.6%
#&gt; 2 F 9,689 48.4% 20,000 100.0%</code></pre>
#&gt; 1 M 10,433 52.2% 10,433 52.2%
#&gt; 2 F 9,567 47.8% 20,000 100.0%</code></pre>
<p>So, we can draw at least two conclusions immediately. From a data scientist perspective, the data looks clean: only values <code>M</code> and <code>F</code>. From a researcher perspective: there are slightly more men. Nothing we didn’t already know.</p>
<p>The data is already quite clean, but we still need to transform some variables. The <code>bacteria</code> column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The <code><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate()</a></code> function of the <code>dplyr</code> package makes this really easy:</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb12-1" title="1">data &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span></a>
@@ -443,10 +443,10 @@
<a class="sourceLine" id="cb14-19" title="19"><span class="co">#&gt; Kingella kingae (no changes)</span></a>
<a class="sourceLine" id="cb14-20" title="20"><span class="co">#&gt; </span></a>
<a class="sourceLine" id="cb14-21" title="21"><span class="co">#&gt; EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)</span></a>
<a class="sourceLine" id="cb14-22" title="22"><span class="co">#&gt; Table 1: Intrinsic resistance in Enterobacteriaceae (1364 changes)</span></a>
<a class="sourceLine" id="cb14-22" title="22"><span class="co">#&gt; Table 1: Intrinsic resistance in Enterobacteriaceae (1323 changes)</span></a>
<a class="sourceLine" id="cb14-23" title="23"><span class="co">#&gt; Table 2: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-24" title="24"><span class="co">#&gt; Table 3: Intrinsic resistance in other Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-25" title="25"><span class="co">#&gt; Table 4: Intrinsic resistance in Gram-positive bacteria (2659 changes)</span></a>
<a class="sourceLine" id="cb14-25" title="25"><span class="co">#&gt; Table 4: Intrinsic resistance in Gram-positive bacteria (2834 changes)</span></a>
<a class="sourceLine" id="cb14-26" title="26"><span class="co">#&gt; Table 8: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)</span></a>
<a class="sourceLine" id="cb14-27" title="27"><span class="co">#&gt; Table 9: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)</span></a>
<a class="sourceLine" id="cb14-28" title="28"><span class="co">#&gt; Table 10: Interpretive rules for B-lactam agents and other Gram-negative bacteria (no changes)</span></a>
@@ -462,9 +462,9 @@
<a class="sourceLine" id="cb14-38" title="38"><span class="co">#&gt; Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)</span></a>
<a class="sourceLine" id="cb14-39" title="39"><span class="co">#&gt; Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)</span></a>
<a class="sourceLine" id="cb14-40" title="40"><span class="co">#&gt; </span></a>
<a class="sourceLine" id="cb14-41" title="41"><span class="co">#&gt; =&gt; EUCAST rules affected 7,366 out of 20,000 rows</span></a>
<a class="sourceLine" id="cb14-41" title="41"><span class="co">#&gt; =&gt; EUCAST rules affected 7,524 out of 20,000 rows</span></a>
<a class="sourceLine" id="cb14-42" title="42"><span class="co">#&gt; -&gt; added 0 test results</span></a>
<a class="sourceLine" id="cb14-43" title="43"><span class="co">#&gt; -&gt; changed 4,023 test results (0 to S; 0 to I; 4,023 to R)</span></a></code></pre></div>
<a class="sourceLine" id="cb14-43" title="43"><span class="co">#&gt; -&gt; changed 4,157 test results (0 to S; 0 to I; 4,157 to R)</span></a></code></pre></div>
</div>
<div id="adding-new-variables" class="section level1">
<h1 class="hasAnchor">
@@ -489,8 +489,8 @@
<a class="sourceLine" id="cb16-3" title="3"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `bacteria` as input for `col_mo`.</span></a>
<a class="sourceLine" id="cb16-4" title="4"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `date` as input for `col_date`.</span></a>
<a class="sourceLine" id="cb16-5" title="5"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb16-6" title="6"><span class="co">#&gt; =&gt; Found 5,641 first isolates (28.2% of total)</span></a></code></pre></div>
<p>So only 28.2% is suitable for resistance analysis! We can now filter on it with the <code><a href="https://dplyr.tidyverse.org/reference/filter.html">filter()</a></code> function, also from the <code>dplyr</code> package:</p>
<a class="sourceLine" id="cb16-6" title="6"><span class="co">#&gt; =&gt; Found 5,698 first isolates (28.5% of total)</span></a></code></pre></div>
<p>So only 28.5% is suitable for resistance analysis! We can now filter on it with the <code><a href="https://dplyr.tidyverse.org/reference/filter.html">filter()</a></code> function, also from the <code>dplyr</code> package:</p>
<div class="sourceCode" id="cb17"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb17-1" title="1">data_1st &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb17-2" title="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(first <span class="op">==</span><span class="st"> </span><span class="ot">TRUE</span>)</a></code></pre></div>
<p>For future use, the two steps above can be combined into one with the <code><a href="../reference/first_isolate.html">filter_first_isolate()</a></code> function:</p>
@@ -516,10 +516,10 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-01-14</td>
<td align="center">K8</td>
<td align="center">2010-01-18</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -527,30 +527,30 @@
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-02-17</td>
<td align="center">K8</td>
<td align="center">2010-02-27</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-03-01</td>
<td align="center">K8</td>
<td align="center">2010-04-22</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-03-11</td>
<td align="center">K8</td>
<td align="center">2010-06-09</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -560,54 +560,54 @@
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2010-04-13</td>
<td align="center">K8</td>
<td align="center">2011-04-13</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">FALSE</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2010-08-30</td>
<td align="center">K8</td>
<td align="center">2011-04-25</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2010-11-05</td>
<td align="center">K8</td>
<td align="center">2011-08-02</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2010-12-21</td>
<td align="center">K8</td>
<td align="center">2011-10-19</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2010-12-21</td>
<td align="center">K8</td>
<td align="center">2011-10-23</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -615,14 +615,14 @@
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-03-20</td>
<td align="center">K8</td>
<td align="center">2011-11-10</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
</tbody>
</table>
@@ -637,7 +637,7 @@
<a class="sourceLine" id="cb19-7" title="7"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb19-8" title="8"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this.</span></a>
<a class="sourceLine" id="cb19-9" title="9"><span class="co">#&gt; [Criterion] Inclusion based on key antibiotics, ignoring I.</span></a>
<a class="sourceLine" id="cb19-10" title="10"><span class="co">#&gt; =&gt; Found 15,738 first weighted isolates (78.7% of total)</span></a></code></pre></div>
<a class="sourceLine" id="cb19-10" title="10"><span class="co">#&gt; =&gt; Found 15,826 first weighted isolates (79.1% of total)</span></a></code></pre></div>
<table class="table">
<thead><tr class="header">
<th align="center">isolate</th>
@@ -654,10 +654,10 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-01-14</td>
<td align="center">K8</td>
<td align="center">2010-01-18</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -666,94 +666,94 @@
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-02-17</td>
<td align="center">K8</td>
<td align="center">2010-02-27</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-03-01</td>
<td align="center">K8</td>
<td align="center">2010-04-22</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-03-11</td>
<td align="center">K8</td>
<td align="center">2010-06-09</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2010-04-13</td>
<td align="center">K8</td>
<td align="center">2011-04-13</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">FALSE</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2010-08-30</td>
<td align="center">K8</td>
<td align="center">2011-04-25</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2010-11-05</td>
<td align="center">K8</td>
<td align="center">2011-08-02</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2010-12-21</td>
<td align="center">K8</td>
<td align="center">2011-10-19</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2010-12-21</td>
<td align="center">K8</td>
<td align="center">2011-10-23</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -762,23 +762,23 @@
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-03-20</td>
<td align="center">K8</td>
<td align="center">2011-11-10</td>
<td align="center">C7</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
</tbody>
</table>
<p>Instead of 2, 9 isolates are now flagged. In total, 78.7% of all isolates are marked ‘first weighted’ - 50.5 percentage points more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.</p>
<p>Instead of 2, 7 isolates are now flagged. In total, 79.1% of all isolates are marked ‘first weighted’ - 50.6 percentage points more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.</p>
<p>As with <code><a href="../reference/first_isolate.html">filter_first_isolate()</a></code>, there’s a shortcut for this new algorithm too:</p>
<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb20-1" title="1">data_1st &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb20-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/first_isolate.html">filter_first_weighted_isolate</a></span>()</a></code></pre></div>
<p>So we end up with 15,738 isolates for analysis.</p>
<p>So we end up with 15,826 isolates for analysis.</p>
<p>We can remove unneeded columns:</p>
<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb21-1" title="1">data_1st &lt;-<span class="st"> </span>data_1st <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb21-2" title="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="op">-</span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(first, keyab))</a></code></pre></div>
@@ -803,14 +803,14 @@
</tr></thead>
<tbody>
<tr class="odd">
<td>1</td>
<td align="center">2017-06-13</td>
<td align="center">Z3</td>
<td align="center">Hospital D</td>
<td>2</td>
<td align="center">2012-07-07</td>
<td align="center">T4</td>
<td align="center">Hospital B</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">F</td>
<td align="center">Gram negative</td>
@@ -820,12 +820,12 @@
</tr>
<tr class="even">
<td>3</td>
<td align="center">2015-06-10</td>
<td align="center">L8</td>
<td align="center">2011-02-19</td>
<td align="center">H3</td>
<td align="center">Hospital B</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
@@ -835,69 +835,69 @@
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td>4</td>
<td align="center">2012-12-15</td>
<td align="center">G10</td>
<td align="center">Hospital C</td>
<td align="center">B_KLBSL_PNE</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">Gram negative</td>
<td align="center">Klebsiella</td>
<td align="center">pneumoniae</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td>6</td>
<td align="center">2011-03-09</td>
<td align="center">B7</td>
<td align="center">Hospital D</td>
<td align="center">2011-03-27</td>
<td align="center">H5</td>
<td align="center">Hospital A</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">M</td>
<td align="center">Gram negative</td>
<td align="center">Escherichia</td>
<td align="center">coli</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td>8</td>
<td align="center">2015-03-15</td>
<td align="center">O1</td>
<tr class="odd">
<td>7</td>
<td align="center">2012-06-22</td>
<td align="center">Q8</td>
<td align="center">Hospital A</td>
<td align="center">B_STPHY_AUR</td>
<td align="center">R</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">F</td>
<td align="center">Gram positive</td>
<td align="center">Staphylococcus</td>
<td align="center">aureus</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td>9</td>
<td align="center">2012-09-07</td>
<td align="center">K5</td>
<td align="center">Hospital C</td>
<tr class="even">
<td>8</td>
<td align="center">2015-06-27</td>
<td align="center">Q2</td>
<td align="center">Hospital B</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R<<