Note: some changes in this version were suggested by anonymous reviewers from the journal we submitted our manuscipt to. We are those reviewers very grateful for going through our code so thoroughly!
### New
* A new vignette and website page with info about all our public and freely available data sets, that can be downloaded as flat files or in formats for use in R, SPSS, SAS, Stata and Excel: https://msberends.github.io/AMR/articles/datasets.html
@ -16,6 +18,7 @@
```
### Changed
* Although advertised that this package should work under R 3.0.0, we still had a dependency on R 3.6.0. This is fixed, meaning that our package should now work under R 3.0.0.
* Improvements for `as.rsi()`:
* Support for using `dplyr`'s `across()` to interpret MIC values or disk zone diameters, which also automatically determines the column with microorganism names or codes.
```r
@ -35,6 +38,10 @@
#> Class <disk>
#> [1] 24 24
```
* Improvements for `as.mo()`:
* Big speed improvement for already valid microorganism ID. This also means an significant speed improvement for using `mo_*` functions like `mo_name()` on microoganism IDs.
* Added parameter `ignore_pattern` to `as.mo()` which can also be given to `mo_*` functions like `mo_name()`, to exclude known non-relevant input from analysing. This can also be set with the option `AMR_ignore_pattern`.
* `get_locale()` now uses `Sys.getlocale()` instead of `Sys.getlocale("LC_COLLATE")`
* Speed improvement for `eucast_rules()`
* Overall speed improvement by tweaking joining functions
* Function `mo_shortname()` now returns the genus for input where the species is unknown
@ -42,6 +49,10 @@
* Added a feature from AMR 1.1.0 and earlier again, but now without other package dependencies: `tibble` printing support for classes `<rsi>`, `<mic>`, `<disk>`, `<ab>` and `<mo>`. When using `tibble`s containing antimicrobial columns (class `<rsi>`), "S" will print in green, "I" will print in yellow and "R" will print in red. Microbial IDs (class `<mo>`) will emphasise on the genus and species, not on the kingdom.
* Names of antiviral agents in data set `antivirals` now have a starting capital letter, like it is the case in the `antibiotics` data set
### Other
* Removed unnecessary references to the `base` package
* Added packages that could be useful for some functions to the `Suggests` field of the `DESCRIPTION` file
#' Use this function to determine the antibiotic code of one or more antibiotics. The data set [antibiotics] will be searched for abbreviations, official names and synonyms (brand names).
#' Use these functions to return a specific property of an antibiotic from the [antibiotics] data set. All input values will be evaluated internally with [as.ab()].
#' @description Gets data from the WHO to determine properties of an ATC (e.g. an antibiotic) like name, defined daily dose (DDD) or standard unit.
#'
#' **This function requires an internet connection.**
#' @param atc_code a character or character vector with ATC code(s) of antibiotic(s)
#' @param property property of an ATC code. Valid values are `"ATC"`, `"Name"`, `"DDD"`, `"U"` (`"unit"`), `"Adm.R"`, `"Note"` and `groups`. For this last option, all hierarchical groups of an ATC code will be returned, see Examples.
#' @param administration type of administration when using `property = "Adm.R"`, see Details
@ -54,6 +52,8 @@
#' - `"MU"` = million units
#' - `"mmol"` = millimole
#' - `"ml"` = milliliter (e.g. eyedrops)
#'
#' **N.B. This function requires an internet connection and only works if the following packages are installed: `curl`, `rvest`, `xml2`.**
#' @details The function [format()] calculates the resistance per bug-drug combination. Use `combine_IR = FALSE` (default) to test R vs. S+I and `combine_IR = TRUE` to test R+I vs. S.
#'
#' The language of the output can be overwritten with `options(AMR_locale)`, please see [translate].
#' @export
#' @rdname bug_drug_combinations
#' @return The function [bug_drug_combinations()] returns a [`data.frame`] with columns "mo", "ab", "S", "I", "R" and "total".
#' Translation table with `r format(nrow(microorganisms.codes), big.mark = ",")` common microorganism codes
#' Data set with `r format(nrow(microorganisms.codes), big.mark = ",")` common microorganism codes
#'
#' A data set containing commonly used codes for microorganisms, from laboratory systems and WHONET. Define your own with [set_mo_source()]. They will all be searched when using [as.mo()] and consequently all the [`mo_*`][mo_property()] functions.
#' @format A [`data.frame`] with `r format(nrow(microorganisms.codes), big.mark = ",")` observations and `r ncol(microorganisms.codes)` variables:
@ -194,17 +194,17 @@ catalogue_of_life <- list(
#' Data set with `r format(nrow(WHONET), big.mark = ",")` isolates - WHONET example
#'
#' This example data set has the exact same structure as an export file from WHONET. Such files can be used with this package, as this example data set shows. The data itself was based on our [example_isolates] data set.
#' This example data set has the exact same structure as an export file from WHONET. Such files can be used with this package, as this example data set shows. The antibiotic results are based on our [example_isolates] data set. All patient names are created using online surname generators and are only in place for practice purposes.
#' @format A [`data.frame`] with `r format(nrow(WHONET), big.mark = ",")` observations and `r ncol(WHONET)` variables:
#' - `Identification number`\cr ID of the sample
#' - `Specimen number`\cr ID of the specimen
#' - `Organism`\cr Name of the microorganism. Before analysis, you should transform this to a valid microbial class, using [as.mo()].
#' - `Country`\cr Country of origin
#' - `Laboratory`\cr Name of laboratory
#' - `Last name`\cr Last name of patient
#' - `First name`\cr Initial of patient
#' - `Sex`\cr Gender of patient
#' - `Age`\cr Age of patient
#' - `Last name`\cr Fictitious last name of patient
#' - `First name`\cr Fictitious initial of patient
#' - `Sex`\cr Fictitious gender of patient
#' - `Age`\cr Fictitious age of patient
#' - `Age category`\cr Age group, can also be looked up using [age_groups()]
#' - `Date of admission`\cr Date of hospital admission
#' - `Specimen date`\cr Date when specimen was received at laboratory
#' This transforms a vector to a new class [`disk`], which is a growth zone size (around an antibiotic disk) in millimetres between 6 and 50.
#' This transforms a vector to a new class [`disk`], which is a disk diffusion growth zone size (around an antibiotic disk) in millimetres between 6 and 50.
#' * Is caseinsensitive (use `%like_case%` for case-sensitive matching)
#' * Is case-insensitive (use `%like_case%` for case-sensitive matching)
#' * Supports multiple patterns
#' * Checks if `pattern` is a regular expression and sets `fixed = TRUE` if not, to greatly improve speed
#' * Tries again with `perl = TRUE` if regex fails
#'
#' Using RStudio? This function can also be inserted from the Addins menu and can have its own Keyboard Shortcut like `Ctrl+Shift+L` or `Cmd+Shift+L` (see `Tools` > `Modify Keyboard Shortcuts...`).
#' @source Idea from the [`like` function from the `data.table` package](https://github.com/Rdatatable/data.table/blob/master/R/like.R)
#' Transform input to minimum inhibitory concentrations
#'
#' This transforms a vector to a new class [`mic`], which is an ordered [`factor`] with valid MIC values as levels. Invalid MIC values will be translated as `NA` with a warning.
#' This transforms a vector to a new class [`mic`], which is an ordered [`factor`] with valid minimum inhibitory concentrations (MIC) as levels. Invalid MIC values will be translated as `NA` with a warning.
#' Use this function to determine a valid microorganism ID ([`mo`]). Determination is done using intelligent rules and the complete taxonomic kingdoms Bacteria, Chromista, Protozoa, Archaea and most microbial species from the kingdom Fungi (see Source). The input can be almost anything: a full name (like `"Staphylococcus aureus"`), an abbreviated name (like `"S. aureus"`), an abbreviation known in the field (like `"MRSA"`), or just a genus. Please see *Examples*.
#' @inheritSection lifecycle Stable lifecycle
@ -32,6 +32,7 @@
#' This excludes *Enterococci* at default (who are in group D), use `Lancefield = "all"` to also categorise all *Enterococci* as group D.
#' @param allow_uncertain a number between `0` (or `"none"`) and `3` (or `"all"`), or `TRUE` (= `2`) or `FALSE` (= `0`) to indicate whether the input should be checked for less probable results, please see *Details*
#' @param reference_df a [`data.frame`] to be used for extra reference when translating `x` to a valid [`mo`]. See [set_mo_source()] and [get_mo_source()] to automate the usage of your own codes (e.g. used in your analysis or organisation).
#' @param ignore_pattern a regular expression (case-insensitive) of which all matches in `x` must return `NA`. This can be convenient to exclude known non-relevant input and can also be set with the option `AMR_ignore_pattern`, e.g. `options(AMR_ignore_pattern = "(not reported|contaminated flora)")`.
#' @param ... other parameters passed on to functions
#' @rdname as.mo
#' @aliases mo
@ -39,7 +40,7 @@
#' @details
#' ## General info
#'
#' A microorganism ID from this package (class: [`mo`]) typically looks like these examples:
#' A microorganism ID from this package (class: [`mo`]) is human readable and typically looks like these examples:
#' Use these functions to return a specific property of a microorganism. All input values will be evaluated internally with [as.mo()], which makes it possible to use microbial abbreviations, codes and names as input. Please see *Examples*.
#' Use these functions to return a specific property of a microorganism based on the latest accepted taxonomy. All input values will be evaluated internally with [as.mo()], which makes it possible to use microbial abbreviations, codes and names as input. Please see *Examples*.
#' @inheritSection lifecycle Stable lifecycle
#' @param x any (vector of) text that can be coerced to a valid microorganism code with [as.mo()]
#' @param property one of the column names of the [microorganisms] data set or `"shortname"`
#' @param language language of the returned text, defaults to system language (see [get_locale()]) and can also be set with `getOption("AMR_locale")`. Use `language = NULL` or `language = ""` to prevent translation.
#' @param ... other parameters passed on to [as.mo()]
#' @param language language of the returned text, defaults to system language (see [get_locale()]) and can be overwritten by setting the option `AMR_locale`, e.g. `options(AMR_locale = "de")`, see [translate]. Use `language = NULL` or `language = ""` to prevent translation.
#' @param ... other parameters passed on to [as.mo()], such as 'allow_uncertain' and 'ignore_pattern'
#' @param open browse the URL using [utils::browseURL()]
#' @details All functions will return the most recently known taxonomic property according to the Catalogue of Life, except for [mo_ref()], [mo_authors()] and [mo_year()]. Please refer to this example, knowing that *Escherichia blattae* was renamed to *Shimwellia blattae* in 2010:
#' - `mo_name("Escherichia blattae")` will return `"Shimwellia blattae"` (with a message about the renaming)
stop_ifnot(interactive(),"This function can only be used in interactive mode, since it must ask for the user's permission to write a file to their home folder.")
stop_ifnot(length(path)==1,"`path` must be of length 1")
#' Interpret minimum inhibitory concentration (MIC) values and disk diffusion diameters according to EUCAST or CLSI, or clean up existing R/SI values. This transforms the input to a new class [`rsi`], which is an ordered factor with levels `S < I < R`. Values that cannot be interpreted will be returned as `NA` with a warning.
#' For language-dependent output of AMR functions, like [mo_name()], [mo_gramstain()], [mo_type()] and [ab_name()].
#' @inheritSection lifecycle Stable lifecycle
#' @details Strings will be translated to foreign languages if they are defined in a local translation file. Additions to this file can be suggested at our repository. The file can be found here: <https://github.com/msberends/AMR/blob/master/data-raw/translations.tsv>.
#' @details Strings will be translated to foreign languages if they are defined in a local translation file. Additions to this file can be suggested at our repository. The file can be found here: <https://github.com/msberends/AMR/blob/master/data-raw/translations.tsv>. This file will be read by all functions where a translated output can be desired, like all [mo_property()] functions ([mo_name()], [mo_gramstain()], [mo_type()], etc.).
#'
#' Currently supported languages are (besides English): `r paste(sort(gsub(";.*", "", ISOcodes::ISO_639_2[which(ISOcodes::ISO_639_2$Alpha_2 %in% unique(AMR:::translations_file$lang)), "Name"])), collapse = ", ")`. Please note that currently not all these languages have translations available for all antimicrobial agents and colloquial microorganism names.
#' Currently supported languages are: `r paste(sort(gsub(";.*", "", ISOcodes::ISO_639_2[which(ISOcodes::ISO_639_2$Alpha_2 %in% LANGUAGES_SUPPORTED), "Name"])), collapse = ", ")`. Please note that currently not all these languages have translations available for all antimicrobial agents and colloquial microorganism names.
#'
#' Please suggest your own translations [by creating a new issue on our repository](https://github.com/msberends/AMR/issues/new?title=Translations).
#'
#' This file will be read by all functions where a translated output can be desired, like all [mo_property()] functions ([mo_name()], [mo_gramstain()], [mo_type()], etc.).
#'
#' The system language will be used at default, if that language is supported. The system language can be overwritten with `Sys.setenv(AMR_locale = yourlanguage)`.
#' The system language will be used at default (as returned by [Sys.getlocale()]), if that language is supported. The language to be used can be overwritten by setting the option `AMR_locale`, e.g. `options(AMR_locale = "de")`.
#' @inheritSection AMR Read more on our website!
#' @rdname translate
#' @name translate
@ -66,10 +64,16 @@
#' #> "Staphylococcus coagulase negativo (CoNS)"
get_locale<-function(){
if (!is.null(getOption("AMR_locale",default=NULL))){
return(getOption("AMR_locale"))
if (!language%in%LANGUAGES_SUPPORTED){
stop_("unsupported language: '",language,"' - use one of: ",
Functions for conducting AMR analysis, like counting isolates, calculating
resistance or susceptibility, or make plots.
With `as.mic()` and `as.disk()` you can transform your raw input to valid MIC or disk diffusion values.
Use `as.rsi()` for cleaning raw data to let it only contain "R", "I" and "S", or to interpret MIC or disk diffusion values as R/SI based on the lastest EUCAST and CLSI guidelines.
Afterwards, you can extend antibiotic interpretations by applying [EUCAST rules](http://www.eucast.org/expert_rules_and_intrinsic_resistance/) with `eucast_rules()`.
Use these function for the analysis part. You can use `susceptibility()` or `resistance()` on any antibiotic column.
Be sure to first select the isolates that are appropiate for analysis, by using `first_isolate()`.
You can also filter your data on certain resistance in certain antibiotic classes (`filter_ab_class()`), or determine multi-drug resistant microorganisms (MDRO, `mdro()`).
contents:
- "`proportion`"
- "`count`"
- "`availability`"
- "`first_isolate`"
- "`key_antibiotics`"
- "`mdro`"
- "`count`"
- "`ggplot_rsi`"
- "`bug_drug_combinations`"
- "`resistance_predict`"
- "`pca`"
- "`antibiotic_class_selectors`"
- "`filter_ab_class`"
- "`g.test`"
- "`ggplot_rsi`"
- "`ggplot_pca`"
- "`kurtosis`"
- "`skewness`"
- title:"Included data sets"
desc:>
Scientifically reliable references for microorganisms and
antibiotics, and example data sets to use for practise.
contents:
- "`microorganisms`"
- "`antibiotics`"
- "`intrinsic_resistant`"
- "`example_isolates`"
- "`example_isolates_unclean`"
- "`rsi_translation`"
- "`microorganisms.codes`"
- "`microorganisms.old`"
- "`WHONET`"
- title:"Background information"
desc:>
Some pages about our package and its external sources. Be sure to read our [How To's](./../articles/index.html)
for more information about how to work with functions in this package.
contents:
- "`AMR`"
- "`catalogue_of_life`"
- "`catalogue_of_life_version`"
- "`WHOCC`"
- "`lifecycle`"
- title:"Other functions"
- "`resistance_predict`"
- "`guess_ab_col`"
- title:"Other: miscellaneous functions"
desc:>
These functions are mostly for internal use, but some of
them may also be suitable for your analysis. Especially the
@ -167,7 +166,23 @@ reference:
contents:
- "`get_locale`"
- "`like`"
- title:"Deprecated functions"
- "`age_groups`"
- "`age`"
- "`join`"
- "`availability`"
- "`pca`"
- "`ggplot_pca`"
- title:"Other: statistical tests"
desc:>
Some statistical tests or methods are not part of base R and are added to this package for convenience.
contents:
- "`g.test`"
- "`kurtosis`"
- "`skewness`"
- "`p_symbol`"
- title:"Other: deprecated functions"
desc:>
These functions are deprecated, meaning that they will still
work but show a warning with every use and will be removed
<p>A data set with 67,151 rows and 16 columns, containing the following column names:<br><em>‘mo’, ‘fullname’, ‘kingdom’, ‘phylum’, ‘class’, ‘order’, ‘family’, ‘genus’, ‘species’, ‘subspecies’, ‘rank’, ‘ref’, ‘species_id’, ‘source’, ‘prevalence’, ‘snomed’</em>.</p>
<p>This data set is in R available as <code>microorganisms</code>, after you load the <code>AMR</code> package.</p>
<p>It was last updated on 28 July 2020 20:52:40 CEST. Find more info about the structure of this data set <ahref="https://msberends.github.io/AMR/reference/microorganisms.html">here</a>.</p>
<p>It was last updated on 1 September 2020 11:07:11 CEST. Find more info about the structure of this data set <ahref="https://msberends.github.io/AMR/reference/microorganisms.html">here</a>.</p>