<p>So, we can draw at least two conclusions immediately. From a data scientist's perspective, the data looks clean: only the values <code>M</code> and <code>F</code> occur. From a researcher's perspective, there are slightly more men. Nothing we didn’t already know.</p>
<p>The data is already quite clean, but we still need to transform some variables. The <code>bacteria</code> column currently contains text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The <code><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate()</a></code> function of the <code>dplyr</code> package makes this really easy:</p>
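<p>As a minimal sketch of that transformation, assuming the <code>AMR</code> and <code>dplyr</code> packages are installed: the two-row <code>data</code> object below is a hypothetical stand-in for the example data set, and the exact ID codes returned may differ between package versions.</p>

```r
library(dplyr)
library(AMR)

# Hypothetical two-row stand-in for the example data set
data <- tibble(bacteria = c("Escherichia coli", "Staphylococcus aureus"))

# Replace the free-text names with microbial IDs
data <- data %>%
  mutate(bacteria = as.mo(bacteria))

# The column now carries the package's microorganism class
class(data$bacteria)
```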
<p>Finally, we will apply <a href="http://www.eucast.org/expert_rules_and_intrinsic_resistance/">EUCAST rules</a> to our antimicrobial results. In Europe, most medical microbiological laboratories already apply these rules. Our package features their latest insights on intrinsic resistance and exceptional phenotypes. Moreover, the <code><a href="../reference/eucast_rules.html">eucast_rules()</a></code> function can also apply additional rules, like forcing <help title="ATC: J01CA01">ampicillin</help> = R when <help title="ATC: J01CR02">amoxicillin/clavulanic acid</help> = R.</p>
<p>Because the amoxicillin (column <code>amox</code>) and amoxicillin/clavulanic acid (column <code>amcl</code>) results in our data were generated randomly, some rows will undoubtedly contain amox = S and amcl = R, which is technically impossible. The <code><a href="../reference/eucast_rules.html">eucast_rules()</a></code> function fixes this:</p>
<a class="sourceLine" id="cb14-3" title="3"><span class="co">#> Rules by the European Committee on Antimicrobial Susceptibility Testing (EUCAST)</span></a>
<a class="sourceLine" id="cb14-5" title="5"><span class="co">#> Rules by the European Committee on Antimicrobial Susceptibility Testing (EUCAST)</span></a>
<a class="sourceLine" id="cb14-24" title="24"><span class="co">#> Table 1: Intrinsic resistance in Enterobacteriaceae (1230 changes)</span></a>
<a class="sourceLine" id="cb14-25" title="25"><span class="co">#> Table 2: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-26" title="26"><span class="co">#> Table 3: Intrinsic resistance in other Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-27" title="27"><span class="co">#> Table 4: Intrinsic resistance in Gram-positive bacteria (2700 changes)</span></a>
<a class="sourceLine" id="cb14-28" title="28"><span class="co">#> Table 8: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)</span></a>
<a class="sourceLine" id="cb14-29" title="29"><span class="co">#> Table 9: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)</span></a>
<a class="sourceLine" id="cb14-30" title="30"><span class="co">#> Table 10: Interpretive rules for B-lactam agents and other Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-31" title="31"><span class="co">#> Table 11: Interpretive rules for macrolides, lincosamides, and streptogramins (no changes)</span></a>
<a class="sourceLine" id="cb14-32" title="32"><span class="co">#> Table 12: Interpretive rules for aminoglycosides (no changes)</span></a>
<a class="sourceLine" id="cb14-33" title="33"><span class="co">#> Table 13: Interpretive rules for quinolones (no changes)</span></a>
<a class="sourceLine" id="cb14-35" title="35"><span class="co">#> Other rules</span></a>
<a class="sourceLine" id="cb14-36" title="36"><span class="co">#> Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (no changes)</span></a>
<a class="sourceLine" id="cb14-37" title="37"><span class="co">#> Non-EUCAST: piperacillin = R where piperacillin/tazobactam = R (no changes)</span></a>
<a class="sourceLine" id="cb14-38" title="38"><span class="co">#> Non-EUCAST: trimethoprim = R where trimethoprim/sulfa = R (no changes)</span></a>
<a class="sourceLine" id="cb14-39" title="39"><span class="co">#> Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (no changes)</span></a>
<a class="sourceLine" id="cb14-40" title="40"><span class="co">#> Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)</span></a>
<a class="sourceLine" id="cb14-41" title="41"><span class="co">#> Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)</span></a>
<a class="sourceLine" id="cb14-43" title="43"><span class="co">#> => EUCAST rules affected 7,267 out of 20,000 rows</span></a>
<a class="sourceLine" id="cb14-44" title="44"><span class="co">#> -> added 0 test results</span></a>
<a class="sourceLine" id="cb14-45" title="45"><span class="co">#> -> changed 3,930 test results (0 to S; 0 to I; 3,930 to R)</span></a></code></pre></div>
<a class="sourceLine" id="cb14-22" title="22"><span class="co">#> Table 1: Intrinsic resistance in Enterobacteriaceae (1291 changes)</span></a>
<a class="sourceLine" id="cb14-23" title="23"><span class="co">#> Table 2: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-24" title="24"><span class="co">#> Table 3: Intrinsic resistance in other Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-25" title="25"><span class="co">#> Table 4: Intrinsic resistance in Gram-positive bacteria (2705 changes)</span></a>
<a class="sourceLine" id="cb14-26" title="26"><span class="co">#> Table 8: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)</span></a>
<a class="sourceLine" id="cb14-27" title="27"><span class="co">#> Table 9: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)</span></a>
<a class="sourceLine" id="cb14-28" title="28"><span class="co">#> Table 10: Interpretive rules for B-lactam agents and other Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-29" title="29"><span class="co">#> Table 11: Interpretive rules for macrolides, lincosamides, and streptogramins (no changes)</span></a>
<a class="sourceLine" id="cb14-30" title="30"><span class="co">#> Table 12: Interpretive rules for aminoglycosides (no changes)</span></a>
<a class="sourceLine" id="cb14-31" title="31"><span class="co">#> Table 13: Interpretive rules for quinolones (no changes)</span></a>
<a class="sourceLine" id="cb14-33" title="33"><span class="co">#> Other rules</span></a>
<a class="sourceLine" id="cb14-34" title="34"><span class="co">#> Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (no changes)</span></a>
<a class="sourceLine" id="cb14-35" title="35"><span class="co">#> Non-EUCAST: piperacillin = R where piperacillin/tazobactam = R (no changes)</span></a>
<a class="sourceLine" id="cb14-36" title="36"><span class="co">#> Non-EUCAST: trimethoprim = R where trimethoprim/sulfa = R (no changes)</span></a>
<a class="sourceLine" id="cb14-37" title="37"><span class="co">#> Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (no changes)</span></a>
<a class="sourceLine" id="cb14-38" title="38"><span class="co">#> Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)</span></a>
<a class="sourceLine" id="cb14-39" title="39"><span class="co">#> Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)</span></a>
<a class="sourceLine" id="cb14-41" title="41"><span class="co">#> => EUCAST rules affected 7,376 out of 20,000 rows</span></a>
<a class="sourceLine" id="cb14-42" title="42"><span class="co">#> -> added 0 test results</span></a>
<a class="sourceLine" id="cb14-43" title="43"><span class="co">#> -> changed 3,996 test results (0 to S; 0 to I; 3,996 to R)</span></a></code></pre></div>
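<p>To make the idea behind the "amoxicillin = R where amoxicillin/clavulanic acid = R" correction concrete, here is a simplified base R illustration. It is not the package's implementation: the three-row data frame is hypothetical, only the column names <code>amox</code> and <code>amcl</code> come from the text, and the real function applies many more rules that depend on the microorganism.</p>

```r
# Hypothetical three-row example; the real data set has 20,000 rows
data <- data.frame(
  amox = c("S", "S", "R"),
  amcl = c("R", "S", "R"),
  stringsAsFactors = FALSE
)

# Rows where amoxicillin is S although amoxicillin/clavulanic acid is R
impossible <- data$amox == "S" & data$amcl == "R"
sum(impossible)

# Force amoxicillin to R wherever the combination drug is R
data$amox[data$amcl == "R"] <- "R"
data
```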
<a class="sourceLine" id="cb16-3" title="3"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `bacteria` as input for `col_mo`.</span></a>
<a class="sourceLine" id="cb16-4" title="4"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `date` as input for `col_date`.</span></a>
<a class="sourceLine" id="cb16-5" title="5"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb16-6" title="6"><span class="co">#> => Found 5,663 first isolates (28.3% of total)</span></a></code></pre></div>
<p>So only 28.3% is suitable for resistance analysis! We can now filter on it with the <code><a href="https://dplyr.tidyverse.org/reference/filter.html">filter()</a></code> function, also from the <code>dplyr</code> package:</p>
<a class="sourceLine" id="cb16-6" title="6"><span class="co">#> => Found 5,641 first isolates (28.2% of total)</span></a></code></pre></div>
<p>So only 28.2% is suitable for resistance analysis! We can now filter on it with the <code><a href="https://dplyr.tidyverse.org/reference/filter.html">filter()</a></code> function, also from the <code>dplyr</code> package:</p>
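<p>The flag-then-filter pattern can be sketched with <code>dplyr</code> alone. The rule used here (earliest date per patient and species) is a deliberate simplification and not the package's algorithm, which may also restart the count after an episode window. The four-row data set and the column name <code>first</code> are assumptions for illustration; the other column names follow the NOTE lines above.</p>

```r
library(dplyr)

# Hypothetical isolates; column names follow the NOTE lines in the output
data <- tibble(
  patient_id = c("A4", "A4", "A4", "S8"),
  date       = as.Date(c("2010-01-24", "2010-03-30", "2011-02-03", "2010-04-19")),
  bacteria   = c("B_ESCHR_COL", "B_ESCHR_COL", "B_ESCHR_COL", "B_ESCHR_COL")
)

# Simplified rule: the first isolate is the earliest date per patient and species
data <- data %>%
  group_by(patient_id, bacteria) %>%
  mutate(first = date == min(date)) %>%
  ungroup()

# Keep only the rows flagged as first isolates
data_1st <- data %>%
  filter(first == TRUE)
```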
<p>For future use, the two steps above can be shortened to a single call with the <code><a href="../reference/first_isolate.html">filter_first_isolate()</a></code> function:</p>
<a class="sourceLine" id="cb19-4" title="4"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `bacteria` as input for `col_mo`.</span></a>
<a class="sourceLine" id="cb19-7" title="7"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `bacteria` as input for `col_mo`.</span></a>
<a class="sourceLine" id="cb19-8" title="8"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `date` as input for `col_date`.</span></a>
<a class="sourceLine" id="cb19-9" title="9"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb19-10" title="10"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this.</span></a>
<a class="sourceLine" id="cb19-11" title="11"><span class="co">#> [Criterion] Inclusion based on key antibiotics, ignoring I.</span></a>
<a class="sourceLine" id="cb19-12" title="12"><span class="co">#> => Found 15,865 first weighted isolates (79.3% of total)</span></a></code></pre></div>
<a class="sourceLine" id="cb19-5" title="5"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `bacteria` as input for `col_mo`.</span></a>
<a class="sourceLine" id="cb19-6" title="6"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `date` as input for `col_date`.</span></a>
<a class="sourceLine" id="cb19-7" title="7"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb19-8" title="8"><span class="co">#> </span><span class="al">NOTE</span><span class="co">: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this.</span></a>
<a class="sourceLine" id="cb19-9" title="9"><span class="co">#> [Criterion] Inclusion based on key antibiotics, ignoring I.</span></a>
<a class="sourceLine" id="cb19-10" title="10"><span class="co">#> => Found 15,939 first weighted isolates (79.7% of total)</span></a></code></pre></div>
<table class="table">
<thead><tr class="header">
<th align="center">isolate</th>
@@ -662,11 +654,11 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-04-19</td>
<td align="center">S8</td>
<td align="center">2010-01-24</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
@@ -674,46 +666,46 @@
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-08-08</td>
<td align="center">S8</td>
<td align="center">2010-03-30</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-10-31</td>
<td align="center">S8</td>
<td align="center">2010-07-21</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">I</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-11-11</td>
<td align="center">S8</td>
<td align="center">2010-09-23</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2011-04-04</td>
<td align="center">S8</td>
<td align="center">2010-10-05</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@@ -722,47 +714,47 @@
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2011-05-22</td>
<td align="center">S8</td>
<td align="center">2010-10-26</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2011-08-15</td>
<td align="center">S8</td>
<td align="center">2011-02-03</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">I</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2011-08-20</td>
<td align="center">S8</td>
<td align="center">2011-02-16</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2011-08-25</td>
<td align="center">S8</td>
<td align="center">2011-04-19</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
@@ -770,23 +762,23 @@
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-12-16</td>
<td align="center">S8</td>
<td align="center">2011-05-17</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
</tbody>
</table>
<p>Instead of 2, now 9 isolates are flagged. In total, 79.3% of all isolates are marked ‘first weighted’, which is 51 percentage points more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.</p>
<p>Instead of 2, now 8 isolates are flagged. In total, 79.7% of all isolates are marked ‘first weighted’, which is 51.5 percentage points more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.</p>
<p>As with <code><a href="../reference/first_isolate.html">filter_first_isolate()</a></code>, there’s a shortcut for this new algorithm too:</p>
<p>The functions <code>portion_R</code>, <code>portion_RI</code>, <code>portion_I</code>, <code>portion_IS</code> and <code>portion_S</code> can be used to determine the portion of a specific antimicrobial outcome. They can be used on their own:</p>
<p>Or they can be used in conjunction with <code><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by()</a></code> and <code><a href="https://dplyr.tidyverse.org/reference/summarise.html">summarise()</a></code>, both from the <code>dplyr</code> package:</p>
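<p>What such a portion represents can be sketched with <code>dplyr</code> and a hand-written helper. The helper <code>prop_R()</code>, the five-row data set and the <code>hospital</code> column are hypothetical, and the sketch only computes the plain share of R among tested isolates; the package functions may apply extra safeguards, such as a minimum number of isolates, that are omitted here.</p>

```r
library(dplyr)

# Hypothetical first isolates; NA means the isolate was not tested
data_1st <- tibble(
  hospital = c("A", "A", "A", "B", "B"),
  amox     = c("R", "S", "R", "S", NA)
)

# Share of resistant results among tested (non-missing) isolates
prop_R <- function(x) mean(x == "R", na.rm = TRUE)

# On its own: one overall proportion
overall <- prop_R(data_1st$amox)

# Grouped: one proportion per hospital
by_hospital <- data_1st %>%
  group_by(hospital) %>%
  summarise(amoxicillin = prop_R(amox))
```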
<p>In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 10 milliseconds means it can determine 100 input values per second. In case of 50 milliseconds, this is only 20 input values per second. The second input is the only one that has to be looked up thoroughly. All the others are known codes (the first is a WHONET code), common laboratory codes, or common full organism names like the last one.</p>
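<p>The conversion from a timing in milliseconds to a throughput is simply 1000 divided by the timing. The helper name <code>values_per_second()</code> below is made up for this illustration:</p>

```r
# Convert a per-value timing in milliseconds to input values per second
values_per_second <- function(ms) 1000 / ms

# 10 ms and 50 ms per value, as in the text
throughput <- values_per_second(c(10, 50))
throughput
```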
<p>To achieve this speed, the <code>as.mo</code> function also takes into account the prevalence of human pathogenic microorganisms. The downside is, of course, that less prevalent microorganisms will be determined more slowly. See this example for the ID of <em>Mycoplasma leonicaptivi</em> (<code>B_MYCPL_LEO</code>), a bug probably never found before in humans:</p>