* Functions `is_gram_negative()` and `is_gram_positive()` as wrappers around `mo_gramstain()`. They always return `TRUE` or `FALSE`, and therefore return `FALSE` for species outside the taxonomic kingdom of Bacteria.
* Functions `%not_like%` and `%like_perl%` as wrappers around `%like%`.
* Functions `%not_like%` and `%not_like_case%` as wrappers around `%like%` and `%like_case%`. The RStudio addin provided in this package to insert the text " %like% " now iterates over all 'like' variants. If you have assigned the keyboard shortcut Ctrl/Cmd + L to this addin, pressing it first inserts ` %like% `, and pressing it again replaces that with ` %not_like% `, etc.
### Changed
* For all function parameters in the code, it is now defined what the exact type of user input should be (inspired by the [`typed`](https://github.com/moodymudskipper/typed) package). If the user input for a certain function does not meet the requirements for a specific parameter (such as the class or length), an informative error will be thrown. This makes the package more robust and the use of it more reproducible and reliable. In total, more than 400 arguments were defined.
* Deprecated function `p_symbol()`, which does not really fit the scope of this package. It will be removed in a future version. See [here](https://github.com/msberends/AMR/blob/v1.4.0/R/p_symbol.R) for the source code if you want to preserve it.
* Better determination of disk zones and MIC values when running `as.rsi()` on a data.frame
* Updated coagulase-negative staphylococci with Becker *et al.* 2020 (PMID 32056452), meaning that the species *S. argensis*, *S. caeli*, *S. debuckii*, *S. edaphicus* and *S. pseudoxylosus* are now all considered CoNS
* Fix for using parameter `reference_df` in `as.mo()` and `mo_*()` functions that contain old microbial codes (from previous package versions)
### Other
* All messages thrown by this package now have correct line breaks
#' Convenient wrapper around [grep()] to match a pattern: `x %like% pattern`. It always returns a [`logical`] vector and is always case-insensitive (use `x %like_case% pattern` for case-sensitive matching). Also, `pattern` can be as long as `x` to compare items of each index in both vectors, or they both can have the same length to iterate over all cases.
#' @inheritSection lifecycle Stable lifecycle
@@ -41,9 +41,9 @@
#' * Checks if `pattern` is a regular expression and sets `fixed = TRUE` if not, to greatly improve speed
#' * Tries again with `perl = TRUE` if regex fails
#'
#' Using RStudio? This function can also be inserted from the Addins menu and can have its own Keyboard Shortcut like `Ctrl+Shift+L` or `Cmd+Shift+L` (see `Tools` > `Modify Keyboard Shortcuts...`).
#' Using RStudio? This function can also be inserted in your code from the Addins menu and can have its own Keyboard Shortcut like `Ctrl+Shift+L` or `Cmd+Shift+L` (see `Tools` > `Modify Keyboard Shortcuts...`). This addin iterates over all 'like' variants. If you have assigned the keyboard shortcut Ctrl/Cmd + L to this addin, pressing it first inserts ` %like% `, and pressing it again replaces that with ` %not_like% `, then ` %like_case% `, then ` %not_like_case% ` and then back to ` %like% `.
#'
#' The `"%not_like%"` and `"%like_perl%"` functions are wrappers around `"%like%"`.
#' The `"%not_like%"` and `"%not_like_case%"` functions are wrappers around `"%like%"` and `"%like_case%"`.
#' @source Idea from the [`like` function from the `data.table` package](https://github.com/Rdatatable/data.table/blob/master/R/like.R)
# Developed at the University of Groningen, the Netherlands, in #
# collaboration with non-profit organisations Certe Medical #
# Diagnostics & Advice, and University Medical Center Groningen. #
# #
# This R package is free software; you can freely use and distribute #
# it for both personal and commercial purposes under the terms of the #
@@ -29,7 +29,7 @@
#' @inheritSection lifecycle Stable lifecycle
#' @param x any character (vector) that can be coerced to a valid microorganism code with [as.mo()]
#' @param property one of the column names of the [microorganisms] data set or `"shortname"`
#' @param language language of the returned text, defaults to system language (see [get_locale()]) and can be overwritten by setting the option `AMR_locale`, e.g. `options(AMR_locale = "de")`, see [translate]. Use `language = NULL` or `language = ""` to prevent translation.
#' @param language language of the returned text, defaults to system language (see [get_locale()]) and can be overwritten by setting the option `AMR_locale`, e.g. `options(AMR_locale = "de")`, see [translate]. Also used to translate text like "no growth". Use `language = NULL` or `language = ""` to prevent translation.
#' @param ... other parameters passed on to [as.mo()], such as 'allow_uncertain' and 'ignore_pattern'
#' @param open browse the URL using [utils::browseURL()]
#' @details All functions will return the most recently known taxonomic property according to the Catalogue of Life, except for [mo_ref()], [mo_authors()] and [mo_year()]. Please refer to this example, knowing that *Escherichia blattae* was renamed to *Shimwellia blattae* in 2010:
@@ -38,7 +38,7 @@
#' - `mo_ref("Shimwellia blattae")` will return `"Priest et al., 2010"` (without a message)
#'
#' The short name - [mo_shortname()] - almost always returns the first character of the genus and the full species, like `"E. coli"`. Exceptions are abbreviations of staphylococci (like *"CoNS"*, Coagulase-Negative Staphylococci) and beta-haemolytic streptococci (like *"GBS"*, Group B Streptococci). Please bear in mind that e.g. *E. coli* could mean *Escherichia coli* (kingdom of Bacteria) as well as *Entamoeba coli* (kingdom of Protozoa). Returning to the full name will be done using [as.mo()] internally, giving priority to bacteria and human pathogens, i.e. `"E. coli"` will be considered *Escherichia coli*. In other words, `mo_fullname(mo_shortname("Entamoeba coli"))` returns `"Escherichia coli"`.
#'
#'
#' Since the top-level of the taxonomy is sometimes referred to as 'kingdom' and sometimes as 'domain', the functions [mo_kingdom()] and [mo_domain()] return the exact same results.
#'
#' The Gram stain - [mo_gramstain()] - will be determined based on the taxonomic kingdom and phylum. According to Cavalier-Smith (2002, [PMID 11837318](https://pubmed.ncbi.nlm.nih.gov/11837318)), who defined subkingdoms Negibacteria and Posibacteria, only these phyla are Posibacteria: Actinobacteria, Chloroflexi, Firmicutes and Tenericutes. These bacteria are considered Gram-positive - all other bacteria are considered Gram-negative. Species outside the kingdom of Bacteria will return a value `NA`. Functions [is_gram_negative()] and [is_gram_positive()] always return `TRUE` or `FALSE`, even for species outside the kingdom of Bacteria.
shortnames[shortnames %like% "S. group [ABCDFGHK]"] <- paste0("G", gsub("S. group ([ABCDFGHK])", "\\1", shortnames[shortnames %like% "S. group [ABCDFGHK]"]), "S")
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(order = <span class="fu"><a href="../reference/mo_property.html">mo_order</a></span>(<span class="kw">mo</span>), <span class="co"># group on anything, like order</span>
genus = <span class="fu"><a href="../reference/mo_property.html">mo_genus</a></span>(<span class="kw">mo</span>)) <span class="op">%>%</span> <span class="co"># and genus as we do here</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/summarise_all.html">summarise_if</a></span>(<span class="kw">is.rsi</span>, <span class="kw">resistance</span>) <span class="op">%>%</span> <span class="co"># then get resistance of all drugs</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span><span class="op">(</span>order <span class="op">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_order</a></span><span class="op">(</span><span class="va">mo</span><span class="op">)</span>, <span class="co"># group on anything, like order</span>
genus <span class="op">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_genus</a></span><span class="op">(</span><span class="va">mo</span><span class="op">)</span><span class="op">)</span> <span class="op">%>%</span> <span class="co"># and genus as we do here</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/summarise_all.html">summarise_if</a></span><span class="op">(</span><span class="va">is.rsi</span>, <span class="va">resistance</span><span class="op">)</span> <span class="op">%>%</span> <span class="co"># then get resistance of all drugs</span>
<span class="va">CAZ</span>, <span class="va">GEN</span>, <span class="va">TOB</span>, <span class="va">TMP</span>, <span class="va">SXT</span><span class="op">)</span> <span class="co"># and select only relevant columns</span>
<a href="#perform-principal-component-analysis" class="anchor"></a>Perform principal component analysis</h1>
<p>The new <code><a href="../reference/pca.html">pca()</a></code> function will automatically filter on rows that contain numeric values in all selected variables, so we now only need to do:</p>
<p>Good news. The first two components explain a total of 93.3% of the variance (see the PC1 and PC2 values of the <em>Proportion of Variance</em>). We can create a so-called biplot with the base R <code><a href="https://rdrr.io/r/stats/biplot.html">biplot()</a></code> function, to see which antimicrobial resistance per drug explains the difference per microorganism.</p>
<p>But we can't see the explanation of the points. Perhaps this works better with our new <code><a href="../reference/ggplot_pca.html">ggplot_pca()</a></code> function, which automatically adds the right labels and even groups:</p>