This function imports files exported by LimeSurvey, a powerful open source online survey application that is used for, among other things, psychological experiments and other research.
importLimeSurveyData(datafile = NULL, dataPath = NULL, datafileRegEx = NULL,
  scriptfile = NULL,
  limeSurveyRegEx.varNames = "names\\(data\\)\\[\\d*\\] <- ",
  limeSurveyRegEx.toChar = "data\\[, \\d*\\] <- as.character\\(data\\[, \\d*\\]\\)",
  limeSurveyRegEx.varLabels = "attributes\\(data\\)\\$variable.labels\\[\\d*\\] <- \".*\"",
  limeSurveyRegEx.toFactor = paste0("data\\[, \\d*\\] <- factor\\(data\\[, \\d*\\], ",
    "levels=c\\(.*\\),.*labels=c\\(.*\\)\\)"),
  limeSurveyRegEx.varNameSanitizing = list(list(pattern = "#", replacement = "."),
    list(pattern = "\\$", replacement = ".")),
  setVarNames = TRUE, setLabels = TRUE, convertToCharacter = FALSE,
  convertToFactor = FALSE, categoricalQuestions = NULL,
  massConvertToNumeric = TRUE, dataHasVarNames = TRUE, encoding = "NULL",
  dataEncoding = "unknown", scriptEncoding = "ASCII")
datafile | The path and filename of the file containing the data (comma separated values). |
---|---|
dataPath, datafileRegEx | Path containing datafiles: this can be used to read multiple datafiles, if the data is split between those. This is useful when downloading the entire datafile isn't possible because of server restrictions, for example when the processing time for the script in LimeSurvey that generates the datafiles is limited. In that case, the data can be downloaded in portions, and specifying a path here enables reading all datafiles in one go. Use the regular expression to indicate which files in the path should be read. |
scriptfile | The path and filename of the file containing the R script to import the data. |
limeSurveyRegEx.varNames | The regular expression used to extract the variable names from the script file. The first capture group (i.e. the first expression between parentheses) will be extracted as the variable name. |
limeSurveyRegEx.toChar | The regular expression to detect the lines in the import script where variables are converted to the character type. |
limeSurveyRegEx.varLabels | The regular expression used to detect the lines in the import script where variable labels are set. |
limeSurveyRegEx.toFactor | The regular expression used to detect the lines in the import script where vectors are converted to factors. |
limeSurveyRegEx.varNameSanitizing | A list of regular expression patterns and their replacements to sanitize the variable names (e.g. replace hashes/pound signs ('#') by something that is not considered the comment symbol by R). |
setVarNames, setLabels, convertToCharacter, convertToFactor | Whether to set variable names, set variable labels, convert to character, or convert to factor, using the code extracted from the script file with the corresponding regular expressions. |
categoricalQuestions | Which variables (specified using LimeSurvey variable names) are considered categorical questions; for these, the script to convert the variables to factors, as extracted from the LimeSurvey import file, is applied. |
massConvertToNumeric | Whether to convert all variables to numeric. |
dataHasVarNames | Whether the variable names are included as header (first line) in the comma separated values file (data file). |
encoding, dataEncoding, scriptEncoding | The encoding of the files. |
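To make the regular expression arguments more concrete, the following sketch applies the default patterns to a few script lines and then applies the default sanitizing list to some variable names. The script lines and variable names are hypothetical, written to resemble what the default patterns expect, and the loop is only one plausible way to apply the sanitizing list; it is not taken from the function's source.

```r
# Hypothetical script lines, shaped like those the default patterns expect.
scriptLines <- c(
  'names(data)[1] <- "id"',
  'names(data)[2] <- "q1#other"',
  'data[, 1] <- as.character(data[, 1])',
  'attributes(data)$variable.labels[2] <- "Other, please specify"',
  'data[, 3] <- factor(data[, 3], levels=c(1,2),labels=c("Yes", "No"))'
)

# The default patterns, copied from the function signature.
regexes <- list(
  varNames  = "names\\(data\\)\\[\\d*\\] <- ",
  toChar    = "data\\[, \\d*\\] <- as.character\\(data\\[, \\d*\\]\\)",
  varLabels = "attributes\\(data\\)\\$variable.labels\\[\\d*\\] <- \".*\"",
  toFactor  = paste0("data\\[, \\d*\\] <- factor\\(data\\[, \\d*\\], ",
                     "levels=c\\(.*\\),.*labels=c\\(.*\\)\\)")
)

# For each pattern, the indices of the script lines it matches.
matchedLines <- lapply(regexes, grep, x = scriptLines)

# The default sanitizing list: each entry is a pattern/replacement pair.
sanitizers <- list(list(pattern = "#", replacement = "."),
                   list(pattern = "\\$", replacement = "."))

# Hypothetical variable names; apply every pair in turn (an assumed approach).
rawNames <- c("q1#other", "price$usd", "gender")
cleanNames <- rawNames
for (s in sanitizers) {
  cleanNames <- gsub(s$pattern, s$replacement, cleanNames)
}
```

Each pattern picks out a different kind of line, so a custom pattern is only needed when the export script deviates from this layout.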
This function was intended to make importing data from LimeSurvey a bit easier. The default settings used by LimeSurvey are not always convenient, and this function provides a bit more control.
The function returns the imported data as a data frame.
# NOT RUN {
### Of course, you need valid LimeSurvey files. This is an example of
### what you'd do if you have them, assuming you specified the path
### containing the data in 'dataPath', the name of the datafile in
### 'dataFileName', the name of the script file in 'dataLoadScriptName',
### and that you only want variables 'informedConsent', 'gender', 'hasJob',
### 'currentEducation', 'prevEducation', and 'country' to be converted to
### factors.
dat <- importLimeSurveyData(datafile = file.path(dataPath, dataFileName),
                            scriptfile = file.path(dataPath, dataLoadScriptName),
                            categoricalQuestions = c('informedConsent',
                                                     'gender',
                                                     'hasJob',
                                                     'currentEducation',
                                                     'prevEducation',
                                                     'country'));
# }
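When the export was downloaded in portions, 'dataPath' and 'datafileRegEx' together select the files to read. The base R sketch below illustrates the general idea of selecting files by regular expression and stacking them. The file names, the pattern, and the use of read.csv with rbind are assumptions made for illustration, not the function's actual implementation.

```r
# Create two hypothetical partial exports in a temporary directory.
dataPath <- file.path(tempdir(), "limesurvey-parts")
dir.create(dataPath, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, gender = c("F", "M")),
          file.path(dataPath, "survey_part1.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, gender = c("M", "F")),
          file.path(dataPath, "survey_part2.csv"), row.names = FALSE)

# A file in the same directory that the regular expression should skip.
writeLines("not data", file.path(dataPath, "notes.txt"))

# Select only the partial data files (the pattern is a hypothetical example).
datafileRegEx <- "^survey_part[0-9]+\\.csv$"
partFiles <- list.files(dataPath, pattern = datafileRegEx, full.names = TRUE)

# Read every selected file and stack the rows into one data frame.
parts <- lapply(partFiles, read.csv, stringsAsFactors = FALSE)
combined <- do.call(rbind, parts)
```

Anchoring the pattern with '^' and '$' keeps unrelated files in the same directory from being picked up.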