* Function `is_new_episode()` to determine patient episodes which are not necessarily based on microorganisms. It also supports grouped variables with e.g. `mutate()` and `summarise()` of the `dplyr` package:
* Functions `mo_is_gram_negative()` and `mo_is_gram_positive()` as wrappers around `mo_gramstain()`. They always return `TRUE` or `FALSE` (except when the input is `NA` or the MO code is `UNKNOWN`), thus always return `FALSE` for species outside the taxonomic kingdom of Bacteria. If you have the `dplyr` package installed, they can even determine the column with microorganisms themselves when used inside `dplyr` verbs:
#' Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type.
#' Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. To determine patient episodes not necessarily based on microorganisms, use [is_new_episode()] that also supports grouping with the `dplyr` package, see *Examples*.
#' @inheritSection lifecycle Stable lifecycle
#' @param x a [data.frame] containing isolates.
#' @param col_date column name of the result date (or date that is was received on the lab), defaults to the first column of with a date class
#' @param x,.data a [data.frame] containing isolates.
#' @param col_date column name of the result date (or date that is was received on the lab), defaults to the first column with a date class
#' @param col_patient_id column name of the unique IDs of the patients, defaults to the first column that starts with 'patient' or 'patid' (case insensitive)
#' @param col_mo column name of the IDs of the microorganisms (see [as.mo()]), defaults to the first column of class [`mo`]. Values will be coerced using [as.mo()].
#' @param col_testcode column name of the test codes. Use `col_testcode = NULL` to **not** exclude certain test codes (like test codes for screening). In that case `testcodes_exclude` will be ignored.
@ -45,17 +45,26 @@
#' @param info print progress
#' @param include_unknown logical to determine whether 'unknown' microorganisms should be included too, i.e. microbial code `"UNKNOWN"`, which defaults to `FALSE`. For WHONET users, this means that all records with organism code `"con"` (*contamination*) will be excluded at default. Isolates with a microbial ID of `NA` will always be excluded as first isolate.
#' @param ... parameters passed on to the [first_isolate()] function
#' @details **WHY THIS IS SO IMPORTANT** \cr
#' @details The [is_new_episode()] function is a wrapper around the [first_isolate()] function and can be used for data sets without isolates to just determine patient episodes based on any combination of grouping variables (using `dplyr`), please see *Examples*. Since it runs [first_isolate()] for every group, it is quite slow.
#'
#' All isolates with a microbial ID of `NA` will be excluded as first isolate.
#'
#' ### Why this is so important
#' To conduct an analysis of antimicrobial resistance, you should only include the first isolate of every patient per episode [(ref)](https:/pubmed.ncbi.nlm.nih.gov/17304462/). If you would not do this, you could easily get an overestimate or underestimate of the resistance of an antibiotic. Imagine that a patient was admitted with an MRSA and that it was found in 5 different blood cultures the following week. The resistance percentage of oxacillin of all *S. aureus* isolates would be overestimated, because you included this MRSA more than once. It would be [selection bias](https://en.wikipedia.org/wiki/Selection_bias).
#'
#' All isolates with a microbial ID of `NA` will be excluded as first isolate.
#' ### `filter_*()` shortcuts
#'
#' The functions [filter_first_isolate()] and [filter_first_weighted_isolate()] are helper functions to quickly filter on first isolates. The function [filter_first_isolate()] is essentially equal to either:
#' The functions [filter_first_isolate()] and [filter_first_weighted_isolate()] are helper functions to quickly filter on first isolates.
#'
#' The function [filter_first_isolate()] is essentially equal to either:
#'
#' ```
#' x[first_isolate(x, ...), ]
#' x %>% filter(first_isolate(x, ...))
#' ```
#'
#' The function [filter_first_weighted_isolate()] is essentially equal to:
message_("Using combined columns `",font_bold("First name"),"`, `",font_bold("Last name"),"` and `",font_bold("Sex"),"` as input for `col_patient_id`")
}
}else{
col_patient_id<-search_type_in_df(x=.data,
type="patient_id",
info=first_group)
}
stop_if(is.null(col_patient_id),"`col_patient_id` must be set")
}
# create any random mo, so first isolates can be calculated
<li><p>Function <code><ahref="../reference/first_isolate.html">is_new_episode()</a></code> to determine patient episodes which are not necessarily based on microorganisms. It also supports grouped variables with e.g.<code><ahref="https://dplyr.tidyverse.org/reference/mutate.html">mutate()</a></code> and <code><ahref="https://dplyr.tidyverse.org/reference/summarise.html">summarise()</a></code> of the <code>dplyr</code> package: <code>r example_isolates %>% group_by(hospital_id) %>% summarise(patients = n_distinct(patient_id), n_episodes_365 = sum(is_new_episode(episode_days = 365)), n_episodes_60 = sum(is_new_episode(episode_days = 60)))</code></p></li>
<li>
<p>Functions <code><ahref="../reference/mo_property.html">mo_is_gram_negative()</a></code> and <code><ahref="../reference/mo_property.html">mo_is_gram_positive()</a></code> as wrappers around <code><ahref="../reference/mo_property.html">mo_gramstain()</a></code>. They always return <code>TRUE</code> or <code>FALSE</code> (except when the input is <code>NA</code> or the MO code is <code>UNKNOWN</code>), thus always return <code>FALSE</code> for species outside the taxonomic kingdom of Bacteria. If you have the <code>dplyr</code> package installed, they can even determine the column with microorganisms themselves when used inside <code>dplyr</code> verbs:</p>
<metaproperty="og:title"content="Determine first (weighted) isolates — first_isolate"/>
<metaproperty="og:description"content="Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type."/>
<metaproperty="og:description"content="Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. To determine patient episodes not necessarily based on microorganisms, use is_new_episode() that also supports grouping with the dplyr package, see Examples."/>
<spanclass="version label label-default"data-toggle="tooltip"data-placement="bottom"title="Latest development version">1.4.0.9000</span>
<spanclass="version label label-default"data-toggle="tooltip"data-placement="bottom"title="Latest development version">1.4.0.9024</span>
</span>
</div>
@ -239,7 +239,7 @@
</div>
<divclass="ref-description">
<p>Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type.</p>
<p>Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. To determine patient episodes not necessarily based on microorganisms, use <code>is_new_episode()</code> that also supports grouping with the <code>dplyr</code> package, see <em>Examples</em>.</p>
To conduct an analysis of antimicrobial resistance, you should only include the first isolate of every patient per episode <ahref='https:/pubmed.ncbi.nlm.nih.gov/17304462/'>(ref)</a>. If you would not do this, you could easily get an overestimate or underestimate of the resistance of an antibiotic. Imagine that a patient was admitted with an MRSA and that it was found in 5 different blood cultures the following week. The resistance percentage of oxacillin of all <em>S. aureus</em> isolates would be overestimated, because you included this MRSA more than once. It would be <ahref='https://en.wikipedia.org/wiki/Selection_bias'>selection bias</a>.</p>
<p>All isolates with a microbial ID of <code>NA</code> will be excluded as first isolate.</p>
<p>The functions <code>filter_first_isolate()</code> and <code>filter_first_weighted_isolate()</code> are helper functions to quickly filter on first isolates. The function <code>filter_first_isolate()</code> is essentially equal to either:</p><pre><spanclass='va'>x</span><spanclass='op'>[</span><spanclass='fu'>first_isolate</span><spanclass='op'>(</span><spanclass='va'>x</span>, <spanclass='va'>...</span><spanclass='op'>)</span>, <spanclass='op'>]</span>
<p>The <code>is_new_episode()</code> function is a wrapper around the <code>first_isolate()</code> function and can be used for data sets without isolates to just determine patient episodes based on any combination of grouping variables (using <code>dplyr</code>), please see <em>Examples</em>. Since it runs <code>first_isolate()</code> for every group, it is quite slow.</p>
<p>All isolates with a microbial ID of <code>NA</code> will be excluded as first isolate.</p><h3class='hasAnchor'id='arguments'><aclass='anchor'href='#arguments'></a>Why this is so important</h3>
<p>To conduct an analysis of antimicrobial resistance, you should only include the first isolate of every patient per episode <ahref='https:/pubmed.ncbi.nlm.nih.gov/17304462/'>(ref)</a>. If you would not do this, you could easily get an overestimate or underestimate of the resistance of an antibiotic. Imagine that a patient was admitted with an MRSA and that it was found in 5 different blood cultures the following week. The resistance percentage of oxacillin of all <em>S. aureus</em> isolates would be overestimated, because you included this MRSA more than once. It would be <ahref='https://en.wikipedia.org/wiki/Selection_bias'>selection bias</a>.</p>
<p>The functions <code>filter_first_isolate()</code> and <code>filter_first_weighted_isolate()</code> are helper functions to quickly filter on first isolates.</p>
<p>The function <code>filter_first_isolate()</code> is essentially equal to either:</p><pre><spanclass='va'>x</span><spanclass='op'>[</span><spanclass='fu'>first_isolate</span><spanclass='op'>(</span><spanclass='va'>x</span>, <spanclass='va'>...</span><spanclass='op'>)</span>, <spanclass='op'>]</span>
\item{col_date}{column name of the result date (or date that is was received on the lab), defaults to the first column of with a date class}
\item{col_date}{column name of the result date (or date that is was received on the lab), defaults to the first column with a date class}
\item{col_patient_id}{column name of the unique IDs of the patients, defaults to the first column that starts with 'patient' or 'patid' (case insensitive)}
@ -90,15 +98,22 @@ filter_first_weighted_isolate(
A \code{\link{logical}} vector
}
\description{
Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type.
Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. To determine patient episodes not necessarily based on microorganisms, use \code{\link[=is_new_episode]{is_new_episode()}} that also supports grouping with the \code{dplyr} package, see \emph{Examples}.
}
\details{
\strong{WHY THIS IS SO IMPORTANT} \cr
To conduct an analysis of antimicrobial resistance, you should only include the first isolate of every patient per episode \href{https:/pubmed.ncbi.nlm.nih.gov/17304462/}{(ref)}. If you would not do this, you could easily get an overestimate or underestimate of the resistance of an antibiotic. Imagine that a patient was admitted with an MRSA and that it was found in 5 different blood cultures the following week. The resistance percentage of oxacillin of all \emph{S. aureus} isolates would be overestimated, because you included this MRSA more than once. It would be \href{https://en.wikipedia.org/wiki/Selection_bias}{selection bias}.
The \code{\link[=is_new_episode]{is_new_episode()}} function is a wrapper around the \code{\link[=first_isolate]{first_isolate()}} function and can be used for data sets without isolates to just determine patient episodes based on any combination of grouping variables (using \code{dplyr}), please see \emph{Examples}. Since it runs \code{\link[=first_isolate]{first_isolate()}} for every group, it is quite slow.
All isolates with a microbial ID of \code{NA} will be excluded as first isolate.
\subsection{Why this is so important}{
To conduct an analysis of antimicrobial resistance, you should only include the first isolate of every patient per episode \href{https:/pubmed.ncbi.nlm.nih.gov/17304462/}{(ref)}. If you would not do this, you could easily get an overestimate or underestimate of the resistance of an antibiotic. Imagine that a patient was admitted with an MRSA and that it was found in 5 different blood cultures the following week. The resistance percentage of oxacillin of all \emph{S. aureus} isolates would be overestimated, because you included this MRSA more than once. It would be \href{https://en.wikipedia.org/wiki/Selection_bias}{selection bias}.
}
\subsection{\verb{filter_*()} shortcuts}{
The functions \code{\link[=filter_first_isolate]{filter_first_isolate()}} and \code{\link[=filter_first_weighted_isolate]{filter_first_weighted_isolate()}} are helper functions to quickly filter on first isolates. The function \code{\link[=filter_first_isolate]{filter_first_isolate()}} is essentially equal to either:\preformatted{ x[first_isolate(x, ...), ]
The functions \code{\link[=filter_first_isolate]{filter_first_isolate()}} and \code{\link[=filter_first_weighted_isolate]{filter_first_weighted_isolate()}} are helper functions to quickly filter on first isolates.
The function \code{\link[=filter_first_isolate]{filter_first_isolate()}} is essentially equal to either:\preformatted{ x[first_isolate(x, ...), ]
x \%>\% filter(first_isolate(x, ...))
}
@ -110,6 +125,7 @@ The function \code{\link[=filter_first_weighted_isolate]{filter_first_weighted_i
select(-only_weighted_firsts, -keyab)
}
}
}
\section{Key antibiotics}{
There are two ways to determine whether isolates can be included as first \emph{weighted} isolates which will give generally the same results:
@ -143,21 +159,22 @@ On our website \url{https://msberends.github.io/AMR/} you can find \href{https:/