<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>How to import data from SPSS / SAS / Stata • AMR (for R)</title>
<!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png">
<link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png">
<link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png">
<link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png">
<link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png">
<!-- jquery --><script src="" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script><!-- Bootstrap --><link href="" rel="stylesheet" crossorigin="anonymous">
<script src="" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script><!-- Font Awesome icons --><link rel="stylesheet" href="" integrity="sha256-eZrrJcwDc/3uDhsdt61sL2oOBY362qM3lon1gyExkL0=" crossorigin="anonymous">
<!-- clipboard.js --><script src="" integrity="sha256-FiZwavyI2V6+EXO1U+xzLG3IKldpiTFf3153ea9zikQ=" crossorigin="anonymous"></script><!-- sticky kit --><script src="" integrity="sha256-c4Rlo1ZozqTPE2RLuvbusY3+SU1pQaJC0TjuhygMipw=" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script><!-- docsearch --><script src="../docsearch.js"></script><link rel="stylesheet" href="" integrity="sha256-QOSRU/ra9ActyXkIBbiIB144aDBdtvXBcNc3OTNuX/Q=" crossorigin="anonymous">
<link href="../docsearch.css" rel="stylesheet">
<script src="" integrity="sha256-4HLtjeVgH0eIB3aZ9mLYF6E8oU5chNdjU6p6rrXpl9U=" crossorigin="anonymous"></script><link href="../extra.css" rel="stylesheet">
<script src="../extra.js"></script><meta property="og:title" content="How to import data from SPSS / SAS / Stata">
<meta property="og:description" content="">
<meta property="og:image" content="">
<meta name="twitter:card" content="summary">
<!-- mathjax --><script src="" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src=""></script>
<script src=""></script><![endif]-->
<div class="container template-article">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version"></span>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<a href="../index.html">
<span class="fa fa-home"></span>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
<span class="fa fa-question-circle"></span>
How to
<span class="caret"></span>
<ul class="dropdown-menu" role="menu">
<a href="../articles/AMR.html">
<span class="fa fa-directions"></span>
Conduct AMR analysis
<a href="../articles/resistance_predict.html">
<span class="fa fa-dice"></span>
Predict antimicrobial resistance
<a href="../articles/WHONET.html">
<span class="fa fa-globe-americas"></span>
Work with WHONET data
<a href="../articles/SPSS.html">
<span class="fa fa-file-upload"></span>
Import data from SPSS/SAS/Stata
<a href="../articles/EUCAST.html">
<span class="fa fa-exchange-alt"></span>
Apply EUCAST rules
<a href="../reference/mo_property.html">
<span class="fa fa-bug"></span>
Get properties of a microorganism
<a href="../reference/atc_property.html">
<span class="fa fa-capsules"></span>
Get properties of an antibiotic
<a href="../articles/freq.html">
<span class="fa fa-sort-amount-down"></span>
Create frequency tables
<a href="../articles/G_test.html">
<span class="fa fa-clipboard-check"></span>
Use the G-test
<a href="../articles/benchmarks.html">
<span class="fa fa-shipping-fast"></span>
Other: benchmarks
<a href="../reference/">
<span class="fa fa-book-open"></span>
<a href="../authors.html">
<span class="fa fa-users"></span>
<a href="../news/">
<span class="far fa far fa-newspaper"></span>
<ul class="nav navbar-nav navbar-right">
<a href="">
<span class="fab fa fab fa-gitlab"></span>
Source Code
<a href="../LICENSE-text.html">
<span class="fa fa-book"></span>
<form class="navbar-form navbar-right" role="search">
<div class="form-group">
<input type="search" class="form-control" name="search-input" id="search-input" placeholder="Search..." aria-label="Search for..." autocomplete="off">
<!--/.nav-collapse -->
<!--/.container -->
<!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1>How to import data from SPSS / SAS / Stata</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">14 February 2019</h4>
<div class="hidden name"><code>SPSS.Rmd</code></div>
<div id="spss-sas-stata" class="section level2">
<h2 class="hasAnchor">
<a href="#spss-sas-stata" class="anchor"></a>SPSS / SAS / Stata</h2>
<p>SPSS (Statistical Package for the Social Sciences) is probably the best-known software package for statistical analysis. SPSS is easier to learn than R, because in SPSS you only have to click a menu to run parts of your analysis. Because of its user-friendliness, it is taught at universities and is particularly useful for students who are new to statistics. From my experience, I would guess that pretty much all (bio)medical students know it by the time they graduate. SAS and Stata are statistical packages popular in big industries.</p>
<div id="compared-to-r" class="section level2">
<h2 class="hasAnchor">
<a href="#compared-to-r" class="anchor"></a>Compared to R</h2>
<p>As said, SPSS is easier to learn than R. But SPSS, SAS and Stata come with major downsides compared with R:</p>
<p><strong>R is highly modular.</strong></p>
<p>The <a href="">official R network (CRAN)</a> features almost 14,000 packages at the time of writing, our <code>AMR</code> package being one of them. All these packages were peer-reviewed before publication. Aside from this official channel, there are also developers who choose not to submit to CRAN, but rather keep it on their own public repository, like GitLab or GitHub. So there may even be a lot more than 14,000 packages out there.</p>
<p>The bottom line is that you can really extend R yourself or ask somebody to do this for you. Take for example our <code>AMR</code> package. SPSS, SAS and Stata will never know what a valid MIC value is (so data might not be clean) or what the Gram stain of <em>E. coli</em> is. Or the fact that all species of <em>Klebsiella</em> are resistant to amoxicillin.</p>
<p><strong>R is extremely flexible.</strong></p>
<p>Because you write the syntax yourself, you can do anything you want. The flexibility in transforming, gathering, grouping, summarising and drawing plots is endless - with SPSS, SAS or Stata you are bound to their algorithms and styles. They may be somewhat flexible, but you can never create that very specific publication-ready plot without using other (paid) software.</p>
<p><strong>R can be easily automated.</strong></p>
<p>Over the last few years, <a href="">R Markdown</a> has really matured. With R Markdown, you can very easily reproduce your reports, whether the output is Word, PowerPoint, a website, a PDF document or just the raw data in Excel. I use this a lot to generate monthly reports automatically. Just write the code once and enjoy the automatically updated reports at any interval you like.</p>
<p>For an even more professional environment, you could create <a href="">Shiny apps</a>: live manipulation of data using a custom-made website. The web design knowledge needed (JavaScript, CSS, HTML) is almost <em>zero</em>.</p>
<p><strong>R has a huge community.</strong></p>
<p>Many R users just ask questions on websites like <a href="">Stack Overflow</a>, the largest online community for programmers. At the time of writing, around <a href="">275,000 R questions</a> have been asked on this platform (which covers questions and answers for any programming language). In my own experience, most questions are answered within a couple of minutes.</p>
<p><strong>R understands any data type, including SPSS/SAS/Stata.</strong></p>
<p>And that’s not vice versa, I’m afraid. You can import data from any source into R. As said, from SPSS/SAS/Stata (<a href="">link</a>), but also from Excel (<a href="">link</a>), from flat files like CSV, TXT or TSV (<a href="">link</a>), or directly from databases or data warehouses anywhere in the world (<a href="">link</a>). You can even scrape websites to download tables that are live on the internet (<a href="">link</a>).</p>
<p>And the best part - you can export from R to all data formats as well. So you can import an SPSS file, do your analysis neatly in R and export back to SPSS. Although you might omit that very last step.</p>
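<p>As a minimal sketch of such a round trip with a flat file, the lines below use only base R functions. The data frame <code>patients</code> and its values are made up for illustration, and the file is written to a temporary location:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># a small, made-up data set (hypothetical values, for illustration only)
patients &lt;- data.frame(id = c(1, 2, 3),
                       sex = c("M", "F", "M"),
                       stringsAsFactors = FALSE)

# export to a CSV file in a temporary location
csv_file &lt;- tempfile(fileext = ".csv")
write.csv(patients, csv_file, row.names = FALSE)

# import the same file again
imported &lt;- read.csv(csv_file, stringsAsFactors = FALSE)</code></pre></div>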
<p><strong>R is completely free and open-source.</strong></p>
<p>No strings attached. It was created and is being maintained by volunteers who believe that (data) science should be open and publicly available to everybody. SPSS, SAS and Stata are quite expensive. IBM SPSS Statistics only comes with subscriptions nowadays, varying <a href="">between USD 1,300 and USD 8,500</a> per computer <em>per year</em>. SAS Analytics Pro costs <a href="">around USD 10,000</a> per computer. Stata also has a business model with subscription fees, varying <a href="">between USD 600 and USD 1,200</a> per computer per year, but lower prices come with a limitation of the number of variables you can work with.</p>
<p>If you are working at a midsized or small company, you can save it tens of thousands of dollars by using R instead of SPSS - gaining even more functions and flexibility. And all R enthusiasts can do as much PR as they want (like I do here), because nobody is officially associated or affiliated with R. It is really free.</p>
<p>If you sometimes write syntaxes in SPSS to run a complete analysis or to ‘automate’ some of your work, you should perhaps do this in R. You will notice that writing syntaxes in R is a lot more nifty and clever than in SPSS.</p>
<div id="import-data-from-spsssasstata" class="section level2">
<h2 class="hasAnchor">
<a href="#import-data-from-spsssasstata" class="anchor"></a>Import data from SPSS/SAS/Stata</h2>
<div id="rstudio" class="section level3">
<h3 class="hasAnchor">
<a href="#rstudio" class="anchor"></a>RStudio</h3>
<p>To work with R, probably the best option is to use <a href="">RStudio</a>. It is an open-source and free desktop environment which not only allows you to run R code, but also supports project management, version management, package management and a convenient import menu to work with other data sources. You can also run <a href="">RStudio Server</a>, which is nothing less than the complete RStudio software available as a website (e.g. in your corporate network or at home).</p>
<p>To import a data file, just click <em>Import Dataset</em> in the Environment tab:</p>
<p><img src="../import1.png"></p>
<p>If additional packages are needed, RStudio will ask you if they should be installed beforehand.</p>
<p>In the window that opens, you can define all options (parameters) that should be used for import and you’re ready to go:</p>
<p><img src="../import2.png"></p>
<p>If you want named variables to be imported as factors so it resembles SPSS more, use <code><a href="">as_factor()</a></code>.</p>
<p>The difference is this:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" title="1">SPSS_data</a>
<a class="sourceLine" id="cb1-2" title="2"><span class="co"># # A tibble: 4,203 x 4</span></a>
<a class="sourceLine" id="cb1-3" title="3"><span class="co"># v001 sex status statusage</span></a>
<a class="sourceLine" id="cb1-4" title="4"><span class="co"># &lt;dbl&gt; &lt;dbl+lbl&gt; &lt;dbl+lbl&gt; &lt;dbl&gt;</span></a>
<a class="sourceLine" id="cb1-5" title="5"><span class="co"># 1 10002 1 1 76.6</span></a>
<a class="sourceLine" id="cb1-6" title="6"><span class="co"># 2 10004 0 1 59.1</span></a>
<a class="sourceLine" id="cb1-7" title="7"><span class="co"># 3 10005 1 1 54.5</span></a>
<a class="sourceLine" id="cb1-8" title="8"><span class="co"># 4 10006 1 1 54.1</span></a>
<a class="sourceLine" id="cb1-9" title="9"><span class="co"># 5 10007 1 1 57.7</span></a>
<a class="sourceLine" id="cb1-10" title="10"><span class="co"># 6 10008 1 1 62.8</span></a>
<a class="sourceLine" id="cb1-11" title="11"><span class="co"># 7 10010 0 1 63.7</span></a>
<a class="sourceLine" id="cb1-12" title="12"><span class="co"># 8 10011 1 1 73.1</span></a>
<a class="sourceLine" id="cb1-13" title="13"><span class="co"># 9 10017 1 1 56.7</span></a>
<a class="sourceLine" id="cb1-14" title="14"><span class="co"># 10 10018 0 1 66.6</span></a>
<a class="sourceLine" id="cb1-15" title="15"><span class="co"># # … with 4,193 more rows</span></a>
<a class="sourceLine" id="cb1-16" title="16"></a>
<a class="sourceLine" id="cb1-17" title="17"><span class="kw">as_factor</span>(SPSS_data)</a>
<a class="sourceLine" id="cb1-18" title="18"><span class="co"># # A tibble: 4,203 x 4</span></a>
<a class="sourceLine" id="cb1-19" title="19"><span class="co"># v001 sex status statusage</span></a>
<a class="sourceLine" id="cb1-20" title="20"><span class="co"># &lt;dbl&gt; &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt;</span></a>
<a class="sourceLine" id="cb1-21" title="21"><span class="co"># 1 10002 Male alive 76.6</span></a>
<a class="sourceLine" id="cb1-22" title="22"><span class="co"># 2 10004 Female alive 59.1</span></a>
<a class="sourceLine" id="cb1-23" title="23"><span class="co"># 3 10005 Male alive 54.5</span></a>
<a class="sourceLine" id="cb1-24" title="24"><span class="co"># 4 10006 Male alive 54.1</span></a>
<a class="sourceLine" id="cb1-25" title="25"><span class="co"># 5 10007 Male alive 57.7</span></a>
<a class="sourceLine" id="cb1-26" title="26"><span class="co"># 6 10008 Male alive 62.8</span></a>
<a class="sourceLine" id="cb1-27" title="27"><span class="co"># 7 10010 Female alive 63.7</span></a>
<a class="sourceLine" id="cb1-28" title="28"><span class="co"># 8 10011 Male alive 73.1</span></a>
<a class="sourceLine" id="cb1-29" title="29"><span class="co"># 9 10017 Male alive 56.7</span></a>
<a class="sourceLine" id="cb1-30" title="30"><span class="co"># 10 10018 Female alive 66.6</span></a>
<a class="sourceLine" id="cb1-31" title="31"><span class="co"># # … with 4,193 more rows</span></a></code></pre></div>
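<p>The same conversion can be tried without an SPSS file at hand. The sketch below assumes the <code>haven</code> package is installed and builds a small labelled vector with <code>labelled()</code>; the values and labels are hypothetical. The <code>levels</code> argument of <code>as_factor()</code> controls whether the labels, the codes or both end up in the factor levels:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(haven)

# a hypothetical labelled vector, similar to a column returned by read_sav()
sex &lt;- labelled(c(1, 0, 1), labels = c(Female = 0, Male = 1))

# replace the numeric codes by their value labels
sex_factor &lt;- as_factor(sex)
levels(sex_factor)

# keep both the code and the label in the factor levels
sex_both &lt;- as_factor(sex, levels = "both")
levels(sex_both)</code></pre></div>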
<div id="base-r" class="section level3">
<h3 class="hasAnchor">
<a href="#base-r" class="anchor"></a>Base R</h3>
<p>To import data from SPSS, SAS or Stata, you can use the <a href="">great <code>haven</code> package</a> yourself:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb2-1" title="1"><span class="co"># download and install the latest version:</span></a>
<a class="sourceLine" id="cb2-2" title="2"><span class="kw"><a href="">install.packages</a></span>(<span class="st">"haven"</span>)</a>
<a class="sourceLine" id="cb2-3" title="3"><span class="co"># load the package you just installed:</span></a>
<a class="sourceLine" id="cb2-4" title="4"><span class="kw"><a href="">library</a></span>(haven) </a></code></pre></div>
<p>You can now import files as follows:</p>
<div id="spss" class="section level4">
<h4 class="hasAnchor">
<a href="#spss" class="anchor"></a>SPSS</h4>
<p>To read files from SPSS into R:</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb3-1" title="1"><span class="co"># read any SPSS file based on file extension (best way):</span></a>
<a class="sourceLine" id="cb3-2" title="2"><span class="kw"><a href="">read_spss</a></span>(<span class="dt">file =</span> <span class="st">"path/to/file"</span>)</a>
<a class="sourceLine" id="cb3-3" title="3"></a>
<a class="sourceLine" id="cb3-4" title="4"><span class="co"># read .sav or .zsav file:</span></a>
<a class="sourceLine" id="cb3-5" title="5"><span class="kw"><a href="">read_sav</a></span>(<span class="dt">file =</span> <span class="st">"path/to/file"</span>)</a>
<a class="sourceLine" id="cb3-6" title="6"></a>
<a class="sourceLine" id="cb3-7" title="7"><span class="co"># read .por file:</span></a>
<a class="sourceLine" id="cb3-8" title="8"><span class="kw"><a href="">read_por</a></span>(<span class="dt">file =</span> <span class="st">"path/to/file"</span>)</a></code></pre></div>
<p>Do not forget about <code><a href="">as_factor()</a></code>, as mentioned above.</p>
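<p>To try this without your own data, a sketch like the following could be used. It assumes the example file <code>iris.sav</code> that ships with the <code>haven</code> package, and uses the <code>n_max</code> argument to read only the first rows for a quick look:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(haven)

# example file shipped with haven (a stand-in for your own file)
example_file &lt;- system.file("examples", "iris.sav", package = "haven")

# read only the first 10 rows to have a quick look at the file
iris_spss &lt;- read_sav(file = example_file, n_max = 10)

# convert the labelled column(s) to factors, as described above
iris_spss &lt;- as_factor(iris_spss)</code></pre></div>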
<p>To export your R objects to the SPSS file format:</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb4-1" title="1"><span class="co"># save as .sav file:</span></a>
<a class="sourceLine" id="cb4-2" title="2"><span class="kw"><a href="">write_sav</a></span>(<span class="dt">data =</span> yourdata, <span class="dt">path =</span> <span class="st">"path/to/file"</span>)</a>
<a class="sourceLine" id="cb4-3" title="3"></a>
<a class="sourceLine" id="cb4-4" title="4"><span class="co"># save as compressed .zsav file:</span></a>
<a class="sourceLine" id="cb4-5" title="5"><span class="kw"><a href="">write_sav</a></span>(<span class="dt">data =</span> yourdata, <span class="dt">path =</span> <span class="st">"path/to/file"</span>, <span class="dt">compress =</span> <span class="ot">TRUE</span>)</a></code></pre></div>
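<p>A quick way to check an export is to read the file back. The sketch below uses a made-up data frame and a temporary file; it illustrates that a factor is stored as a labelled numeric variable in the SPSS file and can be restored with <code>as_factor()</code>:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(haven)

# hypothetical data; the factor is written as a labelled numeric variable
yourdata &lt;- data.frame(id = 1:3,
                       sex = factor(c("Male", "Female", "Male")))

# export to a temporary .sav file
sav_file &lt;- tempfile(fileext = ".sav")
write_sav(data = yourdata, path = sav_file)

# read the file back; 'sex' is now a labelled column
check &lt;- read_sav(file = sav_file)

# as_factor() restores the original categories
restored &lt;- as_factor(check)</code></pre></div>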
<div id="sas" class="section level4">
<h4 class="hasAnchor">
<a href="#sas" class="anchor"></a>SAS</h4>
<p>To read files from SAS into R:</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb5-1" title="1"><span class="co"># read .sas7bdat + .sas7bcat files:</span></a>
<a class="sourceLine" id="cb5-2" title="2"><span class="kw"><a href="">read_sas</a></span>(<span class="dt">data_file =</span> <span class="st">"path/to/file"</span>, <span class="dt">catalog_file =</span> <span class="ot">NULL</span>)</a>
<a class="sourceLine" id="cb5-3" title="3"></a>
<a class="sourceLine" id="cb5-4" title="4"><span class="co"># read SAS transport files (version 5 and version 8):</span></a>
<a class="sourceLine" id="cb5-5" title="5"><span class="kw"><a href="">read_xpt</a></span>(<span class="dt">file =</span> <span class="st">"path/to/file"</span>)</a></code></pre></div>
<p>To export your R objects to the SAS file format:</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb6-1" title="1"><span class="co"># save as regular SAS file:</span></a>
<a class="sourceLine" id="cb6-2" title="2"><span class="kw"><a href="">write_sas</a></span>(<span class="dt">data =</span> yourdata, <span class="dt">path =</span> <span class="st">"path/to/file"</span>)</a>
<a class="sourceLine" id="cb6-3" title="3"></a>
<a class="sourceLine" id="cb6-4" title="4"><span class="co"># the SAS transport format is an open format </span></a>
<a class="sourceLine" id="cb6-5" title="5"><span class="co"># (required for submission of the data to the FDA)</span></a>
<a class="sourceLine" id="cb6-6" title="6"><span class="kw"><a href="">write_xpt</a></span>(<span class="dt">data =</span> yourdata, <span class="dt">path =</span> <span class="st">"path/to/file"</span>, <span class="dt">version =</span> <span class="dv">8</span>)</a></code></pre></div>
<div id="stata" class="section level4">
<h4 class="hasAnchor">
<a href="#stata" class="anchor"></a>Stata</h4>
<p>To read files from Stata into R:</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb7-1" title="1"><span class="co"># read .dta file:</span></a>
<a class="sourceLine" id="cb7-2" title="2"><span class="kw"><a href="">read_stata</a></span>(<span class="dt">file =</span> <span class="st">"/path/to/file"</span>)</a>
<a class="sourceLine" id="cb7-3" title="3"></a>
<a class="sourceLine" id="cb7-4" title="4"><span class="co"># works exactly the same:</span></a>
<a class="sourceLine" id="cb7-5" title="5"><span class="kw"><a href="">read_dta</a></span>(<span class="dt">file =</span> <span class="st">"/path/to/file"</span>)</a></code></pre></div>
<p>To export your R objects to the Stata file format:</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb8-1" title="1"><span class="co"># save as .dta file, Stata version 14:</span></a>
<a class="sourceLine" id="cb8-2" title="2"><span class="co"># (supports Stata v8 until v15 at the time of writing)</span></a>
<a class="sourceLine" id="cb8-3" title="3"><span class="kw"><a href="">write_dta</a></span>(<span class="dt">data =</span> yourdata, <span class="dt">path =</span> <span class="st">"/path/to/file"</span>, <span class="dt">version =</span> <span class="dv">14</span>)</a></code></pre></div>
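<p>Keep in mind that Stata is stricter about variable names than R: a dot in a name, for instance, is not allowed, so an export with such names is expected to fail. As a sketch, assuming the built-in <code>iris</code> data set as stand-in data, the dots can be replaced by underscores before exporting:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(haven)

# stand-in data with dots in the column names (e.g. Sepal.Length)
iris_stata &lt;- iris

# replace dots by underscores to get valid Stata variable names
names(iris_stata) &lt;- gsub(".", "_", names(iris_stata), fixed = TRUE)

# export to a temporary .dta file and read it back as a check
dta_file &lt;- tempfile(fileext = ".dta")
write_dta(data = iris_stata, path = dta_file, version = 14)
check &lt;- read_dta(file = dta_file)</code></pre></div>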
<div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
<div id="tocnav">
<h2 class="hasAnchor">
<a href="#tocnav" class="anchor"></a>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#spss-sas-stata">SPSS / SAS / Stata</a></li>
<li><a href="#compared-to-r">Compared to R</a></li>
<li><a href="#import-data-from-spsssasstata">Import data from SPSS/SAS/Stata</a></li>
<footer><div class="copyright">
<p>Developed by <a href="">Matthijs S. Berends</a>, <a href="">Christian F. Luz</a>, <a href="">Corinna Glasner</a>, <a href="">Alex W. Friedrich</a>, <a href="">Bhanu N. M. Sinha</a>.</p>
<div class="pkgdown">
<p>Site built with <a href="">pkgdown</a> 1.3.0.</p>
<script src="" integrity="sha256-GKvGqXDznoRYHCwKXGnuchvKSwmx9SRMrZOTh2g4Sb0=" crossorigin="anonymous"></script><script>
apiKey: 'f737050abfd4d726c63938e18f8c496e',
indexName: 'amr',
inputSelector: 'input#search-input.form-control',
transformData: function(hits) {
return (hit) {
hit.url = updateHitURL(hit);
return hit;