Browse Source

(v1.3.0.9022) mo_matching_score(), poorman update, as.rsi() fix

new-mo-algorithm
parent
commit
4e40e42011
  1. 4
      DESCRIPTION
  2. 1
      NAMESPACE
  3. 12
      NEWS.md
  4. 68
      R/aa_helper_functions.R
  5. 775
      R/aa_helper_functions_dplyr.R
  6. 1589
      R/aa_helper_pm_functions.R
  7. 80
      R/ab.R
  8. 4
      R/ab_from_text.R
  9. 14
      R/ab_property.R
  10. 2
      R/age.R
  11. 36
      R/atc_online.R
  12. 6
      R/availability.R
  13. 56
      R/bug_drug_combinations.R
  14. 6
      R/catalogue_of_life.R
  15. 2
      R/count.R
  16. 22
      R/data.R
  17. 8
      R/disk.R
  18. 66
      R/eucast_rules.R
  19. 12
      R/filter_ab_class.R
  20. 30
      R/first_isolate.R
  21. 26
      R/ggplot_rsi.R
  22. 6
      R/guess_ab_col.R
  23. 14
      R/join_microorganisms.R
  24. 18
      R/key_antibiotics.R
  25. 2
      R/kurtosis.R
  26. 2
      R/like.R
  27. 8
      R/mdro.R
  28. 38
      R/mic.R
  29. 122
      R/mo.R
  30. 63
      R/mo_matching_score.R
  31. 14
      R/mo_property.R
  32. 10
      R/mo_source.R
  33. 4
      R/pca.R
  34. 4
      R/proportion.R
  35. 8
      R/resistance_predict.R
  36. 83
      R/rsi.R
  37. 27
      R/rsi_calc.R
  38. 2
      R/skewness.R
  39. 19
      R/zzz.R
  40. 9
      _pkgdown.yml
  41. 2
      data-raw/ab.md5
  42. BIN
      data-raw/antibiotics.dta
  43. BIN
      data-raw/antibiotics.rds
  44. BIN
      data-raw/antibiotics.sas
  45. BIN
      data-raw/antibiotics.sav
  46. 26
      data-raw/antibiotics.txt
  47. BIN
      data-raw/antibiotics.xlsx
  48. 12
      data-raw/poorman_prepend.R
  49. 4
      data-raw/reproduction_of_antibiotics.R
  50. 67
      data-raw/reproduction_of_poorman.R
  51. BIN
      data/antibiotics.rda
  52. 2
      docs/404.html
  53. 2
      docs/LICENSE-text.html
  54. 2
      docs/articles/index.html
  55. 2
      docs/authors.html
  56. 2
      docs/index.html
  57. 20
      docs/news/index.html
  58. 2
      docs/pkgdown.yml
  59. 6
      docs/reference/WHONET.html
  60. 9
      docs/reference/ab_from_text.html
  61. 12
      docs/reference/ab_property.html
  62. 9
      docs/reference/age_groups.html
  63. 8
      docs/reference/antibiotics.html
  64. 8
      docs/reference/as.ab.html
  65. 6
      docs/reference/as.disk.html
  66. 14
      docs/reference/as.mic.html
  67. 33
      docs/reference/as.mo.html
  68. 17
      docs/reference/atc_online.html
  69. 13
      docs/reference/availability.html
  70. 6
      docs/reference/bug_drug_combinations.html
  71. 6
      docs/reference/catalogue_of_life_version.html
  72. 17
      docs/reference/count.html
  73. 9
      docs/reference/eucast_rules.html
  74. 6
      docs/reference/example_isolates.html
  75. 6
      docs/reference/example_isolates_unclean.html
  76. 15
      docs/reference/first_isolate.html
  77. 17
      docs/reference/ggplot_rsi.html
  78. 9
      docs/reference/guess_ab_col.html
  79. 36
      docs/reference/index.html
  80. 9
      docs/reference/intrinsic_resistant.html
  81. 9
      docs/reference/join.html
  82. 6
      docs/reference/kurtosis.html
  83. 6
      docs/reference/like.html
  84. 15
      docs/reference/mdro.html
  85. 6
      docs/reference/microorganisms.codes.html
  86. 6
      docs/reference/microorganisms.html
  87. 6
      docs/reference/microorganisms.old.html
  88. 308
      docs/reference/mo_matching_score.html
  89. 14
      docs/reference/mo_property.html
  90. 23
      docs/reference/mo_source.html
  91. 4
      docs/reference/plot.html
  92. 17
      docs/reference/proportion.html
  93. 17
      docs/reference/resistance_predict.html
  94. 6
      docs/reference/rsi_translation.html
  95. 6
      docs/reference/skewness.html
  96. 2
      docs/reference/translate.html
  97. 3
      docs/sitemap.xml
  98. 2
      docs/survey.html
  99. 2
      man/WHONET.Rd
  100. 2
      man/ab_from_text.Rd
  101. Some files were not shown because too many files have changed in this diff Show More

4
DESCRIPTION

@ -1,6 +1,6 @@ @@ -1,6 +1,6 @@
Package: AMR
Version: 1.3.0.9021
Date: 2020-09-14
Version: 1.3.0.9022
Date: 2020-09-18
Title: Antimicrobial Resistance Analysis
Authors@R: c(
person(role = c("aut", "cre"),

1
NAMESPACE

@ -165,6 +165,7 @@ export(mo_genus) @@ -165,6 +165,7 @@ export(mo_genus)
export(mo_gramstain)
export(mo_info)
export(mo_kingdom)
export(mo_matching_score)
export(mo_name)
export(mo_order)
export(mo_phylum)

12
NEWS.md

@ -1,5 +1,5 @@ @@ -1,5 +1,5 @@
# AMR 1.3.0.9021
## <small>Last updated: 14 September 2020</small>
# AMR 1.3.0.9022
## <small>Last updated: 18 September 2020</small>
Note: some changes in this version were suggested by anonymous reviewers from the journal we submitted our manuscipt to. We are those reviewers very grateful for going through our code so thoroughly!
@ -16,6 +16,7 @@ Note: some changes in this version were suggested by anonymous reviewers from th @@ -16,6 +16,7 @@ Note: some changes in this version were suggested by anonymous reviewers from th
pull(microorganism)
#> [1] "Enterococcus casseliflavus" "Enterococcus gallinarum"
```
* Support for veterinary ATC codes
### Changed
* Although advertised that this package should work under R 3.0.0, we still had a dependency on R 3.6.0. This is fixed, meaning that our package should now work under R 3.0.0.
@ -32,6 +33,7 @@ Note: some changes in this version were suggested by anonymous reviewers from th @@ -32,6 +33,7 @@ Note: some changes in this version were suggested by anonymous reviewers from th
```
* Big speed improvement for interpreting MIC values and disk zone diameters. When interpreting 5,000 MIC values of two antibiotics (10,000 values in total), our benchmarks showed a total run time going from 80.7-85.1 seconds to 1.8-2.0 seconds.
* Added parameter 'add_intrinsic_resistance' (defaults to `FALSE`), that considers intrinsic resistance according to EUCAST
* Fixed a bug where in EUCAST rules the breakpoint for R would be interpreted as ">=" while this should have been "<"
* Added intelligent data cleaning to `as.disk()`, so numbers can also be extracted from text and decimal numbers will always be rounded up:
```r
as.disk(c("disk zone: 23.4 mm", 23.4))
@ -39,7 +41,7 @@ Note: some changes in this version were suggested by anonymous reviewers from th @@ -39,7 +41,7 @@ Note: some changes in this version were suggested by anonymous reviewers from th
#> [1] 24 24
```
* Improvements for `as.mo()`:
* Any user input value that could mean more than one taxonomic entry is now considered 'uncertain'. Instead of a warning, a message will be thrown and the accompanying `mo_uncertainties()` has been changed completely; it now prints all possible candidates with their matching score.
* A completely new matching score for ambiguous user input, using `mo_matching_score()`. Any user input value that could mean more than one taxonomic entry is now considered 'uncertain'. Instead of a warning, a message will be thrown and the accompanying `mo_uncertainties()` has been changed completely; it now prints all possible candidates with their matching score.
* Big speed improvement for already valid microorganism ID. This also means an significant speed improvement for using `mo_*` functions like `mo_name()` on microoganism IDs.
* Added parameter `ignore_pattern` to `as.mo()` which can also be given to `mo_*` functions like `mo_name()`, to exclude known non-relevant input from analysing. This can also be set with the option `AMR_ignore_pattern`.
* `get_locale()` now uses at default `Sys.getenv("LANG")` or, if `LANG` is not set, `Sys.getlocale()`. This can be overwritten by setting the option `AMR_locale`.
@ -50,6 +52,10 @@ Note: some changes in this version were suggested by anonymous reviewers from th @@ -50,6 +52,10 @@ Note: some changes in this version were suggested by anonymous reviewers from th
* Added a feature from AMR 1.1.0 and earlier again, but now without other package dependencies: `tibble` printing support for classes `<rsi>`, `<mic>`, `<disk>`, `<ab>` and `<mo>`. When using `tibble`s containing antimicrobial columns (class `<rsi>`), "S" will print in green, "I" will print in yellow and "R" will print in red. Microbial IDs (class `<mo>`) will emphasise on the genus and species, not on the kingdom.
* Names of antiviral agents in data set `antivirals` now have a starting capital letter, like it is the case in the `antibiotics` data set
* Updated the documentation of the `WHONET` data set to clarify that all patient names are fictitious
* Small `as.ab()` algorithm improvements
* Fix for combining MIC values with raw numbers, i.e. `c(as.mic(2), 2)` previously failed but now returns a valid MIC class
* `ggplot_rsi()` and `geom_rsi()` gained parameters `minimum` and `language`, to influence the internal use of `rsi_df()`
* Added abbreviation "piptazo" to piperacillin/tazobactam (TZP)
### Other
* Removed unnecessary references to the `base` package

68
R/aa_helper_functions.R

@ -19,41 +19,12 @@ @@ -19,41 +19,12 @@
# Visit our website for more info: https://msberends.github.io/AMR. #
# ==================================================================== #
# functions from dplyr, will perhaps become poorman
distinct <- function(.data, ..., .keep_all = FALSE) {
check_is_dataframe(.data)
if ("grouped_data" %in% class(.data)) {
distinct.grouped_data(.data, ..., .keep_all = .keep_all)
} else {
distinct.default(.data, ..., .keep_all = .keep_all)
}
}
distinct.default <- function(.data, ..., .keep_all = FALSE) {
names <- rownames(.data)
rownames(.data) <- NULL
if (length(deparse_dots(...)) == 0) {
selected <- .data
} else {
selected <- select(.data, ...)
}
rows <- as.integer(rownames(unique(selected)))
if (isTRUE(.keep_all)) {
res <- .data[rows, , drop = FALSE]
} else {
res <- selected[rows, , drop = FALSE]
}
rownames(res) <- names[rows]
res
}
distinct.grouped_data <- function(.data, ..., .keep_all = FALSE) {
apply_grouped_function(.data, "distinct", ..., .keep_all = .keep_all)
}
# faster implementation of left_join than using merge() by poorman - we use match():
left_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) {
pm_left_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) {
if (is.null(by)) {
by <- intersect(names(x), names(y))[1L]
if (is.na(by)) {
stop_("no common column found for left_join()")
stop_("no common column found for pm_left_join()")
}
join_message(by)
} else if (!is.null(names(by))) {
@ -77,17 +48,28 @@ left_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) { @@ -77,17 +48,28 @@ left_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) {
rownames(merged) <- NULL
merged
}
filter_join_worker <- function(x, y, by = NULL, type = c("anti", "semi")) {
type <- match.arg(type, choices = c("anti", "semi"), several.ok = FALSE)
if (is.null(by)) {
by <- intersect(names(x), names(y))
join_message(by)
# pm_filter_join_worker <- function(x, y, by = NULL, type = c("anti", "semi")) {
# type <- match.arg(type, choices = c("anti", "semi"), several.ok = FALSE)
# if (is.null(by)) {
# by <- intersect(names(x), names(y))
# join_message(by)
# }
# rows <- interaction(x[, by]) %in% interaction(y[, by])
# if (type == "anti") rows <- !rows
# res <- x[rows, , drop = FALSE]
# rownames(res) <- NULL
# res
# }
quick_case_when <- function(...) {
vectors <- list(...)
split <- lapply(vectors, function(x) unlist(strsplit(paste(deparse(x), collapse = ""), "~", fixed = TRUE)))
for (i in seq_len(length(vectors))) {
if (eval(parse(text = split[[i]][1]), envir = parent.frame())) {
return(eval(parse(text = split[[i]][2]), envir = parent.frame()))
}
}
rows <- interaction(x[, by]) %in% interaction(y[, by])
if (type == "anti") rows <- !rows
res <- x[rows, , drop = FALSE]
rownames(res) <- NULL
res
return(NA)
}
# No export, no Rd
@ -165,7 +147,7 @@ search_type_in_df <- function(x, type) { @@ -165,7 +147,7 @@ search_type_in_df <- function(x, type) {
if (any(colnames(x) %like% "^(specimen date|specimen_date|spec_date)")) {
# WHONET support
found <- sort(colnames(x)[colnames(x) %like% "^(specimen date|specimen_date|spec_date)"])[1]
if (!any(class(pull(x, found)) %in% c("Date", "POSIXct"))) {
if (!any(class(pm_pull(x, found)) %in% c("Date", "POSIXct"))) {
stop(font_red(paste0("ERROR: Found column `", font_bold(found), "` to be used as input for `col_", type,
"`, but this column contains no valid dates. Transform its values to valid dates first.")),
call. = FALSE)
@ -461,7 +443,7 @@ font_stripstyle <- function(x) { @@ -461,7 +443,7 @@ font_stripstyle <- function(x) {
gsub("(?:(?:\\x{001b}\\[)|\\x{009b})(?:(?:[0-9]{1,3})?(?:(?:;[0-9]{0,3})*)?[A-M|f-m])|\\x{001b}[A-M]", "", x, perl = TRUE)
}
progress_estimated <- function(n = 1, n_min = 0, ...) {
progress_ticker <- function(n = 1, n_min = 0, ...) {
if (!interactive() || n < n_min) {
pb <- list()
pb$tick <- function() {

775
R/aa_helper_functions_dplyr.R

@ -1,775 +0,0 @@ @@ -1,775 +0,0 @@
# ==================================================================== #
# TITLE #
# Antimicrobial Resistance (AMR) Analysis #
# #
# SOURCE #
# https://github.com/msberends/AMR #
# #
# LICENCE #
# (c) 2018-2020 Berends MS, Luz CF et al. #
# #
# This R package is free software; you can freely use and distribute #
# it for both personal and commercial purposes under the terms of the #
# GNU General Public License version 2.0 (GNU GPL-2), as published by #
# the Free Software Foundation. #
# #
# We created this package for both routine data analysis and academic #
# research and it was publicly released in the hope that it will be #
# useful, but it comes WITHOUT ANY WARRANTY OR LIABILITY. #
# Visit our website for more info: https://msberends.github.io/AMR. #
# ==================================================================== #
# ------------------------------------------------
# THIS FILE WAS CREATED AUTOMATICALLY!
# Source file: data-raw/reproduction_of_poorman.R
# ------------------------------------------------
# Poorman: a package to replace all dplyr functions with base R so we can lose dependency on dplyr.
# These functions were downloaded from https://github.com/nathaneastwood/poorman,
# from this commit: https://github.com/nathaneastwood/poorman/tree/7d76d77f8f7bc663bf30fb5a161abb49801afa17
#
# All code below was released under MIT license, that permits 'free of charge, to any person obtaining a
# copy of the software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software
# is furnished to do so', given that a copyright notice is given in the software.
#
# Copyright notice as found on https://github.com/nathaneastwood/poorman/blob/master/LICENSE on 2 May 2020:
# YEAR: 2020
# COPYRIGHT HOLDER: Nathan Eastwood
arrange <- function(.data, ...) {
check_is_dataframe(.data)
if ("grouped_data" %in% class(.data)) {
arrange.grouped_data(.data, ...)
} else {
arrange.default(.data, ...)
}
}
arrange.default <- function(.data, ...) {
rows <- eval.parent(substitute(with(.data, order(...))))
.data[rows, , drop = FALSE]
}
arrange.grouped_data <- function(.data, ...) {
apply_grouped_function(.data, "arrange", ...)
}
between <- function(x, left, right) {
if (!is.null(attr(x, "class")) && !inherits(x, c("Date", "POSIXct"))) {
warning("`between()` called on numeric vector with S3 class")
}
if (!is.double(x)) x <- as.numeric(x)
x >= as.numeric(left) & x <= as.numeric(right)
}
count <- function(x, ..., wt = NULL, sort = FALSE, name = NULL) {
groups <- get_groups(x)
if (!missing(...)) x <- group_by(x, ..., .add = TRUE)
wt <- deparse_var(wt)
res <- do.call(tally, list(x, wt, sort, name))
if (length(groups) > 0L) res <- do.call(group_by, list(res, as.name(groups)))
res
}
tally <- function(x, wt = NULL, sort = FALSE, name = NULL) {
name <- check_name(x, name)
wt <- deparse_var(wt)
res <- do.call(summarise, set_names(list(x, as.name(tally_n(x, wt))), c(".data", name)))
res <- ungroup(res)
if (isTRUE(sort)) res <- do.call(arrange, list(res, call("desc", as.name(name))))
rownames(res) <- NULL
res
}
add_count <- function(x, ..., wt = NULL, sort = FALSE, name = NULL) {
name <- check_name(x, name)
row_names <- rownames(x)
wt <- deparse_var(wt)
if (!missing(...)) x <- group_by(x, ..., .add = TRUE)
res <- do.call(add_tally, list(x, wt, sort, name))
res[row_names, ]
}
add_tally <- function(x, wt = NULL, sort = FALSE, name = NULL) {
wt <- deparse_var(wt)
n <- tally_n(x, wt)
name <- check_name(x, name)
res <- do.call(mutate, set_names(list(x, as.name(n)), c(".data", name)))
if (isTRUE(sort)) {
do.call(arrange, list(res, call("desc", as.name(name))))
} else {
res
}
}
tally_n <- function(x, wt) {
if (is.null(wt) && "n" %in% colnames(x)) {
message("Using `n` as weighting variable")
wt <- "n"
}
context$.data <- x
on.exit(rm(list = ".data", envir = context))
if (is.null(wt)) {
"n()"
} else {
paste0("sum(", wt, ", na.rm = TRUE)")
}
}
check_name <- function(df, name) {
if (is.null(name)) {
if ("n" %in% colnames(df)) {
stop(
"Column 'n' is already present in output\n",
"* Use `name = \"new_name\"` to pick a new name"
)
}
return("n")
}
if (!is.character(name) || length(name) != 1) {
stop("`name` must be a single string")
}
name
}
desc <- function(x) -xtfrm(x)
select_env <- new.env()
peek_vars <- function() {
get(".col_names", envir = select_env)
}
context <- new.env()
n <- function() {
do.call(nrow, list(quote(.data)), envir = context)
}
filter <- function(.data, ...) {
check_is_dataframe(.data)
if ("grouped_data" %in% class(.data)) {
filter.grouped_data(.data, ...)
} else {
filter.default(.data, ...)
}
}
filter.default <- function(.data, ...) {
conditions <- paste(deparse_dots(...), collapse = " & ")
context$.data <- .data
on.exit(rm(.data, envir = context))
.data[do.call(with, list(.data, str2lang(unname(conditions)))), ]
}
filter.grouped_data <- function(.data, ...) {
rows <- rownames(.data)
res <- apply_grouped_function(.data, "filter", ...)
res[rows[rows %in% rownames(res)], ]
}
group_by <- function(.data, ..., .add = FALSE) {
check_is_dataframe(.data)
pre_groups <- get_groups(.data)
groups <- deparse_dots(...)
if (isTRUE(.add)) groups <- unique(c(pre_groups, groups))
unknown <- !(groups %in% colnames(.data))
if (any(unknown)) stop("Invalid groups: ", groups[unknown])
structure(.data, class = c("grouped_data", class(.data)), groups = groups)
}
ungroup <- function(x, ...) {
check_is_dataframe(x)
rm_groups <- deparse_dots(...)
groups <- attr(x, "groups")
if (length(rm_groups) == 0L) rm_groups <- groups
attr(x, "groups") <- groups[!(groups %in% rm_groups)]
if (length(attr(x, "groups")) == 0L) {
attr(x, "groups") <- NULL
class(x) <- class(x)[!(class(x) %in% "grouped_data")]
}
x
}
get_groups <- function(x) {
attr(x, "groups", exact = TRUE)
}
has_groups <- function(x) {
groups <- get_groups(x)
if (is.null(groups)) FALSE else TRUE
}
set_groups <- function(x, groups) {
attr(x, "groups") <- groups
x
}
apply_grouped_function <- function(.data, fn, ...) {
groups <- get_groups(.data)
grouped <- split_into_groups(.data, groups)
res <- do.call(rbind, unname(lapply(grouped, fn, ...)))
if (any(groups %in% colnames(res))) {
class(res) <- c("grouped_data", class(res))
attr(res, "groups") <- groups[groups %in% colnames(res)]
}
res
}
split_into_groups <- function(.data, groups) {
class(.data) <- "data.frame"
group_factors <- lapply(groups, function(x, .data) as.factor(.data[, x]), .data)
res <- split(x = .data, f = group_factors)
res
}
print.grouped_data <- function(x, ..., digits = NULL, quote = FALSE, right = TRUE, row.names = TRUE, max = NULL) {
class(x) <- "data.frame"
print(x, ..., digits = digits, quote = quote, right = right, row.names = row.names, max = max)
cat("\nGroups: ", paste(attr(x, "groups", exact = TRUE), collapse = ", "), "\n\n")
}
if_else <- function(condition, true, false, missing = NULL) {
if (!is.logical(condition)) stop("`condition` must be a logical vector.")
cls_true <- class(true)
cls_false <- class(false)
cls_missing <- class(missing)
if (!identical(cls_true, cls_false)) {
stop("The class of `true` <", class(true), "> is not the same as the class of `false` <", class(false), ">")
}
if (!is.null(missing) && !identical(cls_true, cls_missing)) {
stop("`missing` must be a ", cls_true, " vector, not a ", cls_missing, " vector.")
}
res <- ifelse(condition, true, false)
if (!is.null(missing)) res[is.na(res)] <- missing
attributes(res) <- attributes(true)
res
}
inner_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) {
join_worker(x = x, y = y, by = by, suffix = suffix, sort = FALSE)
}
# left_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) {
# join_worker(x = x, y = y, by = by, suffix = suffix, all.x = TRUE)
# }
right_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) {
join_worker(x = x, y = y, by = by, suffix = suffix, all.y = TRUE)
}
full_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) {
join_worker(x = x, y = y, by = by, suffix = suffix, all = TRUE)
}
join_worker <- function(x, y, by = NULL, suffix = c(".x", ".y"), ...) {
x[, ".join_id"] <- seq_len(nrow(x))
if (is.null(by)) {
by <- intersect(names(x), names(y))
join_message(by)
merged <- merge(x = x, y = y, by = by, suffixes = suffix, ...)[, union(names(x), names(y))]
} else if (is.null(names(by))) {
merged <- merge(x = x, y = y, by = by, suffixes = suffix, ...)
} else {
merged <- merge(x = x, y = y, by.x = names(by), by.y = by, suffixes = suffix, ...)
}
merged <- merged[order(merged[, ".join_id"]), colnames(merged) != ".join_id"]
rownames(merged) <- NULL
merged
}
join_message <- function(by) {
if (length(by) > 1L) {
message("Joining, by = c(\"", paste0(by, collapse = "\", \""), "\")\n", sep = "")
} else {
message("Joining, by = \"", by, "\"\n", sep = "")
}
}
anti_join <- function(x, y, by = NULL) {
filter_join_worker(x, y, by, type = "anti")
}
semi_join <- function(x, y, by = NULL) {
filter_join_worker(x, y, by, type = "semi")
}
# filter_join_worker <- function(x, y, by = NULL, type = c("anti", "semi")) {
# type <- match.arg(type, choices = c("anti", "semi"), several.ok = FALSE)
# if (is.null(by)) {
# by <- intersect(names(x), names(y))
# join_message(by)
# }
# rows <- interaction(x[, by]) %in% interaction(y[, by])
# if (type == "anti") rows <- !rows
# res <- x[rows, ]
# rownames(res) <- NULL
# res
# }
lag <- function (x, n = 1L, default = NA) {
if (inherits(x, "ts")) stop("`x` must be a vector, not a `ts` object, do you want `stats::lag()`?")
if (length(n) != 1L || !is.numeric(n) || n < 0L) stop("`n` must be a nonnegative integer scalar")
if (n == 0L) return(x)
tryCatch(
storage.mode(default) <- typeof(x),
warning = function(w) {
stop("Cannot convert `default` <", typeof(default), "> to `x` <", typeof(x), ">")
}
)
xlen <- length(x)
n <- pmin(n, xlen)
res <- c(rep(default, n), x[seq_len(xlen - n)])
attributes(res) <- attributes(x)
res
}
lead <- function (x, n = 1L, default = NA) {
if (length(n) != 1L || !is.numeric(n) || n < 0L) stop("n must be a nonnegative integer scalar")
if (n == 0L) return(x)
tryCatch(
storage.mode(default) <- typeof(x),
warning = function(w) {
stop("Cannot convert `default` <", typeof(default), "> to `x` <", typeof(x), ">")
}
)
xlen <- length(x)
n <- pmin(n, xlen)
res <- c(x[-seq_len(n)], rep(default, n))
attributes(res) <- attributes(x)
res
}
mutate <- function(.data, ...) {
check_is_dataframe(.data)
if ("grouped_data" %in% class(.data)) {
mutate.grouped_data(.data, ...)
} else {
mutate.default(.data, ...)
}
}
mutate.default <- function(.data, ...) {
conditions <- deparse_dots(...)
cond_names <- names(conditions)
unnamed <- which(nchar(cond_names) == 0L)
if (is.null(cond_names)) {
names(conditions) <- conditions
} else if (length(unnamed) > 0L) {
names(conditions)[unnamed] <- conditions[unnamed]
}
not_matched <- names(conditions)[!names(conditions) %in% names(.data)]
.data[, not_matched] <- NA
context$.data <- .data
on.exit(rm(.data, envir = context))
for (i in seq_along(conditions)) {
.data[, names(conditions)[i]] <- do.call(with, list(.data, str2lang(unname(conditions)[i])))
}
.data
}
mutate.grouped_data <- function(.data, ...) {
rows <- rownames(.data)
res <- apply_grouped_function(.data, "mutate", ...)
res[rows, ]
}
n_distinct <- function(..., na.rm = FALSE) {
res <- c(...)
if (is.list(res)) return(nrow(unique(as.data.frame(res, stringsAsFactors = FALSE))))
if (isTRUE(na.rm)) res <- res[!is.na(res)]
length(unique(res))
}
`%>%` <- function(lhs, rhs) {
lhs <- substitute(lhs)
rhs <- substitute(rhs)
eval(as.call(c(rhs[[1L]], lhs, as.list(rhs[-1L]))), envir = parent.frame())
}
pull <- function(.data, var = -1) {
var_deparse <- deparse_var(var)
col_names <- colnames(.data)
if (!(var_deparse %in% col_names) & grepl("^[[:digit:]]+L|[[:digit:]]", var_deparse)) {
var <- as.integer(gsub("L", "", var_deparse))
var <- if_else(var < 1L, rev(col_names)[abs(var)], col_names[var])
} else if (var_deparse %in% col_names) {
var <- var_deparse
}
.data[, var]
}
relocate <- function(.data, ..., .before = NULL, .after = NULL) {
check_is_dataframe(.data)
data_names <- colnames(.data)
col_pos <- select_positions(.data, ...)
.before <- deparse_var(.before)
.after <- deparse_var(.after)
has_before <- !is.null(.before)
has_after <- !is.null(.after)
if (has_before && has_after) {
stop("You must supply only one of `.before` and `.after`")
} else if (has_before) {
where <- min(match(.before, data_names))
col_pos <- c(setdiff(col_pos, where), where)
} else if (has_after) {
where <- max(match(.after, data_names))
col_pos <- c(where, setdiff(col_pos, where))
} else {
where <- 1L
col_pos <- union(col_pos, where)
}
lhs <- setdiff(seq(1L, where - 1L), col_pos)
rhs <- setdiff(seq(where + 1L, ncol(.data)), col_pos)
col_pos <- unique(c(lhs, col_pos, rhs))
col_pos <- col_pos[col_pos <= length(data_names)]
res <- .data[col_pos]
if (has_groups(.data)) res <- set_groups(res, get_groups(.data))
res
}
rename <- function(.data, ...) {
check_is_dataframe(.data)
new_names <- names(deparse_dots(...))
if (length(new_names) == 0L) {
warning("You didn't give any new names")
return(.data)
}
col_pos <- select_positions(.data, ...)
old_names <- colnames(.data)[col_pos]
new_names_zero <- nchar(new_names) == 0L
if (any(new_names_zero)) {
warning("You didn't provide new names for: ", paste0("`", old_names[new_names_zero], collapse = ", "), "`")
new_names[new_names_zero] <- old_names[new_names_zero]
}
colnames(.data)[col_pos] <- new_names
.data
}
rownames_to_column <- function(.data, var = "rowname") {
check_is_dataframe(.data)
col_names <- colnames(.data)
if (var %in% col_names) stop("Column `", var, "` already exists in `.data`")
.data[, var] <- rownames(.data)
rownames(.data) <- NULL
.data[, c(var, setdiff(col_names, var))]
}
select <- function(.data, ...) {
map <- names(deparse_dots(...))
col_pos <- select_positions(.data, ..., group_pos = TRUE)
res <- .data[, col_pos, drop = FALSE]
to_map <- nchar(map) > 0L
colnames(res)[to_map] <- map[to_map]
if (has_groups(.data)) res <- set_groups(res, get_groups(.data))
res
}
starts_with <- function(match, ignore.case = TRUE, vars = peek_vars()) {
grep(pattern = paste0("^", paste0(match, collapse = "|^")), x = vars, ignore.case = ignore.case)
}
ends_with <- function(match, ignore.case = TRUE, vars = peek_vars()) {
grep(pattern = paste0(paste0(match, collapse = "$|"), "$"), x = vars, ignore.case = ignore.case)
}
contains <- function(match, ignore.case = TRUE, vars = peek_vars()) {
matches <- lapply(
match,
function(x) {
if (isTRUE(ignore.case)) {
match_u <- toupper(x)
match_l <- tolower(x)
pos_u <- grep(pattern = match_u, x = toupper(vars), fixed = TRUE)
pos_l <- grep(pattern = match_l, x = tolower(vars), fixed = TRUE)
unique(c(pos_l, pos_u))
} else {
grep(pattern = x, x = vars, fixed = TRUE)
}
}
)
unique(matches)
}
matches <- function(match, ignore.case = TRUE, perl = FALSE, vars = peek_vars()) {
grep(pattern = match, x = vars, ignore.case = ignore.case, perl = perl)
}
num_range <- function(prefix, range, width = NULL, vars = peek_vars()) {
if (!is.null(width)) {
range <- sprintf(paste0("%0", width, "d"), range)
}
find <- paste0(prefix, range)
if (any(duplicated(vars))) {
stop("Column names must be unique")
} else {
x <- match(find, vars)
x[!is.na(x)]
}
}
all_of <- function(x, vars = peek_vars()) {
x_ <- !x %in% vars
if (any(x_)) {
which_x_ <- which(x_)
if (length(which_x_) == 1L) {
stop("The column ", x[which_x_], " does not exist.")
} else {
stop("The columns ", paste(x[which_x_], collapse = ", "), " do not exist.")
}
} else {
which(vars %in% x)
}
}
any_of <- function(x, vars = peek_vars()) {
which(vars %in% x)
}
everything <- function(vars = peek_vars()) {
seq_along(vars)
}
last_col <- function(offset = 0L, vars = peek_vars()) {
if (!is_wholenumber(offset)) stop("`offset` must be an integer")
n <- length(vars)
if (offset && n <= offset) {
stop("`offset` must be smaller than the number of `vars`")
} else if (n == 0) {
stop("Can't select last column when `vars` is empty")
} else {
n - offset
}
}
select_positions <- function(.data, ..., group_pos = FALSE) {
cols <- eval(substitute(alist(...)))
data_names <- colnames(.data)
select_env$.col_names <- data_names
on.exit(rm(list = ".col_names", envir = select_env))
exec_env <- parent.frame(2L)
pos <- unlist(lapply(cols, eval_expr, exec_env = exec_env))
if (isTRUE(group_pos)) {
groups <- get_groups(.data)
missing_groups <- !(groups %in% cols)
if (any(missing_groups)) {
message("Adding missing grouping variables: `", paste(groups[missing_groups], collapse = "`, `"), "`")
pos <- c(match(groups[missing_groups], data_names), pos)
}
}
unique(pos)
}
eval_expr <- function(x, exec_env) {
type <- typeof(x)
switch(
type,
"integer" = x,
"double" = as.integer(x),
"character" = select_char(x),
"symbol" = select_symbol(x, exec_env = exec_env),
"language" = eval_call(x),
stop("Expressions of type <", typeof(x), "> cannot be evaluated for use when subsetting.")
)
}
select_char <- function(expr) {
pos <- match(expr, select_env$.col_names)
if (is.na(pos)) stop("Column `", expr, "` does not exist")
pos
}
select_symbol <- function(expr, exec_env) {
res <- try(select_char(as.character(expr)), silent = TRUE)
if (inherits(res, "try-error")) {
res <- tryCatch(
select_char(eval(expr, envir = exec_env)),
error = function(e) stop("Column ", expr, " does not exist.")
)
}
res
}
eval_call <- function(x) {
type <- as.character(x[[1]])
switch(
type,
`:` = select_seq(x),
`!` = select_negate(x),
`-` = select_minus(x),
`c` = select_c(x),
`(` = select_bracket(x),
select_context(x)
)
}
select_seq <- function(expr) {
x <- eval_expr(expr[[2]])
y <- eval_expr(expr[[3]])
x:y
}
select_negate <- function(expr) {
x <- if (is_negated_colon(expr)) {
expr <- call(":", expr[[2]][[2]], expr[[2]][[3]][[2]])
eval_expr(expr)
} else {
eval_expr(expr[[2]])
}
x * -1L
}
is_negated_colon <- function(expr) {
expr[[1]] == "!" && length(expr[[2]]) > 1L && expr[[2]][[1]] == ":" && expr[[2]][[3]][[1]] == "!"
}
select_minus <- function(expr) {
x <- eval_expr(expr[[2]])
x * -1L
}
select_c <- function(expr) {
lst_expr <- as.list(expr)
lst_expr[[1]] <- NULL
unlist(lapply(lst_expr, eval_expr))
}
select_bracket <- function(expr) {
eval_expr(expr[[2]])
}
select_context <- function(expr) {
eval(expr, envir = context$.data)
}
slice <- function(.data, ...) {
check_is_dataframe(.data)
if ("grouped_data" %in% class(.data)) {
slice.grouped_data(.data, ...)
} else {
slice.default(.data, ...)
}
}
slice.default <- function(.data, ...) {
rows <- c(...)
stopifnot(is.numeric(rows) | is.integer(rows))
if (all(rows > 0L)) rows <- rows[rows <= nrow(.data)]
.data[rows, ]
}
slice.grouped_data <- function(.data, ...) {
apply_grouped_function(.data, "slice", ...)
}
summarise <- function(.data, ...) {
check_is_dataframe(.data)
if ("grouped_data" %in% class(.data)) {
summarise.grouped_data(.data, ...)
} else {
summarise.default(.data, ...)
}
}
summarise.default <- function(.data, ...) {
fns <- vapply(substitute(...()), deparse, NA_character_)
context$.data <- .data
on.exit(rm(.data, envir = context))
if (has_groups(.data)) {
group <- unique(.data[, get_groups(.data), drop = FALSE])
if (nrow(group) == 0L) return(NULL)
}
res <- lapply(fns, function(x) do.call(with, list(.data, str2lang(x))))
res <- as.data.frame(res)
fn_names <- names(fns)
colnames(res) <- if (is.null(fn_names)) fns else fn_names
if (has_groups(.data)) res <- cbind(group, res)
res
}
summarise.grouped_data <- function(.data, ...) {
groups <- get_groups(.data)
res <- apply_grouped_function(.data, "summarise", ...)
res <- res[do.call(order, lapply(groups, function(x) res[, x])), ]
rownames(res) <- NULL
res
}
summarize <- summarise
summarize.default <- summarise.default
summarize.grouped_data <- summarise.grouped_data
transmute <- function(.data, ...) {
check_is_dataframe(.data)
if ("grouped_data" %in% class(.data)) {
transmute.grouped_data(.data, ...)
} else {
transmute.default(.data, ...)
}
}
transmute.default <- function(.data, ...) {
conditions <- deparse_dots(...)
mutated <- mutate(.data, ...)
mutated[, names(conditions), drop = FALSE]
}
transmute.grouped_data <- function(.data, ...) {
rows <- rownames(.data)
res <- apply_grouped_function(.data, "transmute", ...)
res[rows, ]
}
deparse_dots <- function(...) {
vapply(substitute(...()), deparse, NA_character_)
}
deparse_var <- function(var) {
sub_var <- eval(substitute(substitute(var)), parent.frame())
if (is.symbol(sub_var)) var <- as.character(sub_var)
var
}
check_is_dataframe <- function(.data) {
parent_fn <- all.names(sys.call(-1L), max.names = 1L)
if (!is.data.frame(.data)) stop(parent_fn, " must be given a data.frame")
invisible()
}
is_wholenumber <- function(x) {
x %% 1L == 0L
}
set_names <- function(object = nm, nm) {
names(object) <- nm
object
}
cume_dist <- function(x) {
rank(x, ties.method = "max", na.last = "keep") / sum(!is.na(x))
}
dense_rank <- function(x) {
match(x, sort(unique(x)))
}
min_rank <- function(x) {
rank(x, ties.method = "min", na.last = "keep")
}
ntile <- function (x = row_number(), n) {
if (!missing(x)) x <- row_number(x)
len <- length(x) - sum(is.na(x))
n <- as.integer(floor(n))
if (len == 0L) {
rep(NA_integer_, length(x))
} else {
n_larger <- as.integer(len %% n)
n_smaller <- as.integer(n - n_larger)
size <- len / n
larger_size <- as.integer(ceiling(size))
smaller_size <- as.integer(floor(size))
larger_threshold <- larger_size * n_larger
bins <- if_else(
x <= larger_threshold,
(x + (larger_size - 1L)) / larger_size,
(x + (-larger_threshold + smaller_size - 1L)) / smaller_size + n_larger
)
as.integer(floor(bins))
}
}
percent_rank <- function(x) {
(min_rank(x) - 1) / (sum(!is.na(x)) - 1)
}
row_number <- function(x) {
if (missing(x)) seq_len(n()) else rank(x, ties.method = "first", na.last = "keep")
}

1589
R/aa_helper_pm_functions.R

File diff suppressed because it is too large Load Diff

80
R/ab.R

@ -47,9 +47,9 @@ @@ -47,9 +47,9 @@
#'
#' European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: \url{http://ec.europa.eu/health/documents/community-register/html/atc.htm}
#' @aliases ab
#' @return Character (vector) with class [`ab`]. Unknown values will return `NA`.
#' @return A [character] [vector] with additional class [`ab`]
#' @seealso
#' * [antibiotics] for the dataframe that is being used to determine ATCs
#' * [antibiotics] for the [data.frame] that is being used to determine ATCs
#' * [ab_from_text()] for a function to retrieve antimicrobial drugs from clinical text (from health care records)
#' @inheritSection AMR Reference data publicly available
#' @inheritSection AMR Read more on our website!
@ -101,23 +101,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -101,23 +101,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
x <- gsub('"', "", x, fixed = TRUE)
x_bak_clean <- x
if (already_regex == FALSE) {
# remove suffices
x_bak_clean <- gsub("_(MIC|RSI|DIS[CK])$", "", x_bak_clean)
# remove disk concentrations, like LVX_NM -> LVX
x_bak_clean <- gsub("_[A-Z]{2}[0-9_.]{0,3}$", "", x_bak_clean)
# remove part between brackets if that's followed by another string
x_bak_clean <- gsub("(.*)+ [(].*[)]", "\\1", x_bak_clean)
# keep only max 1 space
x_bak_clean <- trimws(gsub(" +", " ", x_bak_clean))
# non-character, space or number should be a slash
x_bak_clean <- gsub("[^A-Z0-9 -]", "/", x_bak_clean)
# spaces around non-characters must be removed: amox + clav -> amox/clav
x_bak_clean <- gsub("(.*[A-Z0-9]) ([^A-Z0-9].*)", "\\1\\2", x_bak_clean)
x_bak_clean <- gsub("(.*[^A-Z0-9]) ([A-Z0-9].*)", "\\1\\2", x_bak_clean)
# remove hyphen after a starting "co"
x_bak_clean <- gsub("^CO-", "CO", x_bak_clean)
# replace text 'and' with a slash
x_bak_clean <- gsub(" AND ", "/", x_bak_clean)
x_bak_clean <- generalise_antibiotic_name(x_bak_clean)
}
x <- unique(x_bak_clean)
@ -133,7 +117,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -133,7 +117,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
}
if (initial_search == TRUE) {
progress <- progress_estimated(n = length(x), n_min = ifelse(isTRUE(info), 25, length(x) + 1)) # start if n >= 25
progress <- progress_ticker(n = length(x), n_min = ifelse(isTRUE(info), 25, length(x) + 1)) # start if n >= 25
on.exit(close(progress))
}
@ -161,7 +145,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -161,7 +145,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
}
# exact name
found <- antibiotics[which(toupper(antibiotics$name) == x[i]), ]$ab
found <- antibiotics[which(AB_lookup$generalised_name == x[i]), ]$ab
if (length(found) > 0) {
x_new[i] <- found[1L]
next
@ -189,8 +173,8 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -189,8 +173,8 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
}
# exact LOINC code
loinc_found <- unlist(lapply(antibiotics$loinc,
function(s) x[i] %in% s))
loinc_found <- unlist(lapply(AB_lookup$generalised_loinc,
function(s) generalise_antibiotic_name(x[i]) %in% s))
found <- antibiotics$ab[loinc_found == TRUE]
if (length(found) > 0) {
x_new[i] <- note_if_more_than_one_found(found, i, from_text)
@ -198,8 +182,8 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -198,8 +182,8 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
}
# exact synonym
synonym_found <- unlist(lapply(antibiotics$synonyms,
function(s) x[i] %in% toupper(s)))
synonym_found <- unlist(lapply(AB_lookup$generalised_synonyms,
function(s) generalise_antibiotic_name(x[i]) %in% s))
found <- antibiotics$ab[synonym_found == TRUE]
if (length(found) > 0) {
x_new[i] <- note_if_more_than_one_found(found, i, from_text)
@ -207,8 +191,8 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -207,8 +191,8 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
}
# exact abbreviation
abbr_found <- unlist(lapply(antibiotics$abbreviations,
function(a) x[i] %in% toupper(a)))
abbr_found <- unlist(lapply(AB_lookup$generalised_abbreviations,
function(s) generalise_antibiotic_name(x[i]) %in% s))
found <- antibiotics$ab[abbr_found == TRUE]
if (length(found) > 0) {
x_new[i] <- note_if_more_than_one_found(found, i, from_text)
@ -246,21 +230,21 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -246,21 +230,21 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
}
# try if name starts with it
found <- antibiotics[which(antibiotics$name %like% paste0("^", x_spelling)), ]$ab
found <- antibiotics[which(AB_lookup$generalised_name %like% paste0("^", x_spelling)), ]$ab
if (length(found) > 0) {
x_new[i] <- note_if_more_than_one_found(found, i, from_text)
next
}
# try if name ends with it
found <- antibiotics[which(antibiotics$name %like% paste0(x_spelling, "$")), ]$ab
found <- antibiotics[which(AB_lookup$generalised_name %like% paste0(x_spelling, "$")), ]$ab
if (nchar(x[i]) >= 4 & length(found) > 0) {
x_new[i] <- note_if_more_than_one_found(found, i, from_text)
next
}
# and try if any synonym starts with it
synonym_found <- unlist(lapply(antibiotics$synonyms,
function(s) any(s %like% paste0("^", x_spelling))))
synonym_found <- unlist(lapply(AB_lookup$generalised_synonyms,
function(s) any(generalise_antibiotic_name(s) %like% paste0("^", x_spelling))))
found <- antibiotics$ab[synonym_found == TRUE]
if (length(found) > 0) {
x_new[i] <- note_if_more_than_one_found(found, i, from_text)
@ -291,7 +275,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -291,7 +275,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
}
# transform back from other languages and try again
x_translated <- paste(lapply(strsplit(x[i], "[^A-Z0-9 ]"),
x_translated <- paste(lapply(strsplit(x[i], "[^A-Z0-9]"),
function(y) {
for (i in seq_len(length(y))) {
y[i] <- ifelse(tolower(y[i]) %in% tolower(translations_file$replacement),
@ -299,7 +283,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -299,7 +283,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
!isFALSE(translations_file$fixed)), "pattern"],
y[i])
}
y
generalise_antibiotic_name(y)
})[[1]],
collapse = "/")
x_translated_guess <- suppressWarnings(as.ab(x_translated, initial_search = FALSE))
@ -317,7 +301,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -317,7 +301,7 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
y_name,
y[i])
}
y
generalise_antibiotic_name(y)
})[[1]],
collapse = "/")
x_translated_guess <- suppressWarnings(as.ab(x_translated, initial_search = FALSE))
@ -449,9 +433,9 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) { @@ -449,9 +433,9 @@ as.ab <- function(x, flag_multiple_results = TRUE, info = TRUE, ...) {
call. = FALSE)
}
x_result <- data.frame(x = x_bak_clean, stringsAsFactors = FALSE) %>%
left_join(data.frame(x = x, x_new = x_new, stringsAsFactors = FALSE), by = "x") %>%
pull(x_new)
x_result <- data.frame(x = x_bak_clean, stringsAsFactors = FALSE) %pm>%
pm_left_join(data.frame(x = x, x_new = x_new, stringsAsFactors = FALSE), by = "x") %pm>%
pm_pull(x_new)
if (length(x_result) == 0) {
x_result <- NA_character_
@ -538,3 +522,25 @@ c.ab <- function(x, ...) { @@ -538,3 +522,25 @@ c.ab <- function(x, ...) {
attributes(y) <- attributes(x)
class_integrity_check(y, "antimicrobial code", antibiotics$ab)
}
generalise_antibiotic_name <- function(x) {
x <- toupper(x)
# remove suffices
x <- gsub("_(MIC|RSI|DIS[CK])$", "", x)
# remove disk concentrations, like LVX_NM -> LVX
x <- gsub("_[A-Z]{2}[0-9_.]{0,3}$", "", x)
# remove part between brackets if that's followed by another string
x <- gsub("(.*)+ [(].*[)]", "\\1", x)
# keep only max 1 space
x <- trimws(gsub(" +", " ", x))
# non-character, space or number should be a slash
x <- gsub("[^A-Z0-9 -]", "/", x)
# spaces around non-characters must be removed: amox + clav -> amox/clav
x <- gsub("(.*[A-Z0-9]) ([^A-Z0-9].*)", "\\1\\2", x)
x <- gsub("(.*[^A-Z0-9]) ([A-Z0-9].*)", "\\1\\2", x)
# remove hyphen after a starting "co"
x <- gsub("^CO-", "CO", x)
# replace operators with a space
x <- gsub("(/| AND | WITH | W/|[+]|[-])+", " ", x)
x
}

4
R/ab_from_text.R

@ -47,7 +47,7 @@ @@ -47,7 +47,7 @@
#' With using `collapse`, this function will return a [character]:\cr
#' `df %>% mutate(abx = ab_from_text(clinical_text, collapse = "|"))`
#' @export
#' @return A [list], or a [character] if `collapse` is not `NULL`
#' @return A [list], or a [character] if `collapse` is not `NULL`
#' @inheritSection AMR Read more on our website!
#' @examples
#' # mind the bad spelling of amoxicillin in this line,
@ -97,7 +97,7 @@ ab_from_text <- function(text, @@ -97,7 +97,7 @@ ab_from_text <- function(text,
text <- tolower(as.character(text))
text_split_all <- strsplit(text, "[ ;.,:\\|]")
progress <- progress_estimated(n = length(text_split_all), n_min = 5)
progress <- progress_ticker(n = length(text_split_all), n_min = 5)
on.exit(close(progress))
if (type %like% "(drug|ab|anti)") {

14
R/ab_property.R

@ -38,10 +38,10 @@ @@ -38,10 +38,10 @@
#' @rdname ab_property
#' @name ab_property
#' @return
#' - An [`integer`] in case of [ab_cid()]
#' - A named [`list`] in case of [ab_info()] and multiple [ab_synonyms()]/[ab_tradenames()]
#' - A [`double`] in case of [ab_ddd()]
#' - A [`character`] in all other cases
#' - An [integer] in case of [ab_cid()]
#' - A named [list] in case of [ab_info()] and multiple [ab_synonyms()]/[ab_tradenames()]
#' - A [double] in case of [ab_ddd()]
#' - A [character] in all other cases
#' @export
#' @seealso [antibiotics]
#' @inheritSection AMR Reference data publicly available
@ -231,9 +231,9 @@ ab_validate <- function(x, property, ...) { @@ -231,9 +231,9 @@ ab_validate <- function(x, property, ...) {
error = function(e) stop(e$message, call. = FALSE))
x_bak <- x
if (!all(x %in% antibiotics[, property])) {
x <- data.frame(ab = as.ab(x, ...), stringsAsFactors = FALSE) %>%
left_join(antibiotics, by = "ab") %>%
pull(property)
x <- data.frame(ab = as.ab(x, ...), stringsAsFactors = FALSE) %pm>%
pm_left_join(antibiotics, by = "ab") %pm>%
pm_pull(property)
}
if (property == "ab") {
return(structure(x, class = property))

2
R/age.R

@ -135,7 +135,7 @@ age <- function(x, reference = Sys.Date(), exact = FALSE, na.rm = FALSE) { @@ -135,7 +135,7 @@ age <- function(x, reference = Sys.Date(), exact = FALSE, na.rm = FALSE) {
#' filter(mo == as.mo("E. coli")) %>%
#' group_by(age_group = age_groups(age)) %>%
#' select(age_group, CIP) %>%
#' ggplot_rsi(x = "age_group")
#' ggplot_rsi(x = "age_group", minimum = 0)
#' }
age_groups <- function(x, split_at = c(12, 25, 55, 75), na.rm = FALSE) {
stop_ifnot(is.numeric(x), "`x` must be numeric, not ", paste0(class(x), collapse = "/"))

36
R/atc_online.R

@ -21,12 +21,13 @@ @@ -21,12 +21,13 @@
#' Get ATC properties from WHOCC website
#'
#' Gets data from the WHO to determine properties of an ATC (e.g. an antibiotic), such as the name, defined daily dose (DDD) or standard unit.
#' @inheritSection lifecycle Stable lifecycle
#' @description Gets data from the WHO to determine properties of an ATC (e.g. an antibiotic) like name, defined daily dose (DDD) or standard unit.
#' @param atc_code a character or character vector with ATC code(s) of antibiotic(s)
#' @param property property of an ATC code. Valid values are `"ATC"`, `"Name"`, `"DDD"`, `"U"` (`"unit"`), `"Adm.R"`, `"Note"` and `groups`. For this last option, all hierarchical groups of an ATC code will be returned, see Examples.
#' @param administration type of administration when using `property = "Adm.R"`, see Details
#' @param url url of website of the WHO. The sign `%s` can be used as a placeholder for ATC codes.
#' @param url url of website of the WHOCC. The sign `%s` can be used as a placeholder for ATC codes.
#' @param url_vet url of website of the WHOCC for veterinary medicine. The sign `%s` can be used as a placeholder for ATC_vet codes (that all start with "Q").
#' @param ... parameters to pass on to `atc_property`
#' @details
#' Options for parameter `administration`:
@ -74,7 +75,8 @@ @@ -74,7 +75,8 @@
atc_online_property <- function(atc_code,