pcpattL {pcpattL} | R Documentation |
The function pcpattL
converts a set of Likert-type responses measured on a common scale into
paired comparison patterns, returning a new data frame containing the design matrix for a loglinear
paired comparison model. Additionally, the
frequencies of these patterns are computed and stored in the first
column of the data frame. Optionally,
the function provides all necessary structures (commands, data/design
files) to fit the loglinear paired comparison pattern model in GLIM,
which is often more efficient at fitting large loglinear models of this type.
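The idea of turning Likert responses into paired comparison patterns can be sketched as follows. This is only an illustration, not the code used by pcpattL: the helper likert_to_pattern, the coding 1 / 0 / -1, the example responses, and the assumption that a lower response means stronger preference are all hypothetical.

```r
# Illustrative only: this is NOT the pcpattL implementation.
# Assumption: a lower response means stronger preference for an item.
# Assumed coding: 1 = first item preferred, 0 = undecided, -1 = second item preferred.
likert_to_pattern <- function(resp) {
  n <- length(resp)
  pattern <- numeric(0)
  for (b in 2:n) {
    for (a in 1:(b - 1)) {
      value <- sign(resp[b] - resp[a])   # compare items a and b
      names(value) <- paste0(a, b)       # name the comparison "ab"
      pattern <- c(pattern, value)
    }
  }
  pattern
}

# Hypothetical responses of four subjects to three Likert items
responses <- rbind(c(1, 3, 3), c(2, 2, 5), c(1, 3, 3), c(4, 1, 2))

# One pattern per subject (rows), one column per comparison
patterns <- t(apply(responses, 1, likert_to_pattern))

# Frequency of each distinct pattern
pattern_counts <- table(apply(patterns, 1, paste, collapse = " "))
print(patterns)
print(pattern_counts)
```

Subjects with identical patterns collapse into a single count, which corresponds to the frequency column described above.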
pcpattL(ctrl, dfr = NULL)
ctrl |
a list of control parameters. If
ctrl is not correctly specified, an error message is printed. |
dfr |
a dataframe conforming to the same specifications as the input
data file (see below). The default is NULL, i.e., no dataframe
is supplied in the call of pcpattL and the data are instead supplied
through the datafile element in the ctrl list. |
Prior to the call of pcpattL the user has to provide a control list ctrl
(see The Control List) and data (either in the form of a
dataframe or an external file) which has to conform to a certain structure
(see Input Data).
The typical usage is
desmat <- pcpattL(ctrl)
or just
pcpattL(ctrl)
if only the GLIM output is wanted.
The function pcpattL allows for different scenarios, mainly concerning
dependency covariates, which are requested via blnIntcovs = TRUE, and
GLIM output. If blnGLIMcmds = TRUE,
two files are generated, one of which contains all GLIM commands to fit
a basic loglinear paired comparison pattern model. The other contains
the design matrix, optionally including subject covariates. If
dependency covariates are requested they are written to a third file.
(Please note that the corresponding part of the design matrix is transposed
in the interactions output file to allow GLIM to use the $array
facility in case of a large number of parameters to be estimated.)
The output is a dataframe. Each row represents a unique response pattern.
If subject covariates are specified, each row instead represents a particular
combination of a unique covariate combination with a response pattern. All
possible combinations are generated.
The first column contains the counts for the paired
comparison response patterns and is labelled with Y. The next columns
are the covariates for the items and the undecided category effects (one for
each comparison). These are labelled Uab, where ab denotes the
comparison between items a and b. Optionally,
covariates for dependencies between comparisons follow. The columns are labelled
Ia.bc, denoting the interaction of the comparisons between items (a,b)
and (a,c), where the common item is a. If subject covariates are
present they are in the rightmost columns and defined to be factors.
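The labelling scheme for the undecided and dependency columns can be illustrated with a small sketch. The helpers undecided_labels and dependency_labels are hypothetical and not part of the package. The ordering of the U labels follows the model formulas in the examples; the ordering of the I labels is an assumption.

```r
# Hypothetical helpers, not part of the package.

# Labels Uab for all comparisons (a, b) with a < b (requires nitems >= 2).
undecided_labels <- function(nitems) {
  labels <- character(0)
  for (b in 2:nitems) {
    for (a in 1:(b - 1)) {
      labels <- c(labels, paste0("U", a, b))
    }
  }
  labels
}

# Labels Ia.bc for pairs of comparisons (a, b) and (a, c) sharing item a
# (requires nitems >= 3; the ordering shown here is an assumption).
dependency_labels <- function(nitems) {
  labels <- character(0)
  for (a in 1:nitems) {
    others <- setdiff(1:nitems, a)
    partners <- combn(others, 2)   # columns are pairs (b, c) with b < c
    for (k in seq_len(ncol(partners))) {
      labels <- c(labels, paste0("I", a, ".", partners[1, k], partners[2, k]))
    }
  }
  labels
}

print(undecided_labels(4))
print(dependency_labels(3))
```

With n items there are n(n-1)/2 undecided columns and n(n-1)(n-2)/2 dependency columns, so the design matrix widens quickly as items are added.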
Alternatively, the function pcpattL does not produce output in R if GLIM output
is requested via blnGLIMcmds = TRUE. The output is then written to
the corresponding files (see The Control List below).
The argument ctrl is a list with elements described below. It must be defined prior
to the call of pcpattL.
datafile
nitems
blnRevert
blnRevert should be specified to be FALSE. Otherwise
set blnRevert = TRUE.
blnIntcovs
blnIntcovs = TRUE.
cov.sel
(e.g., cov.sel = c("SEX", "AGE")). If all covariates are to be
included the specification can be abbreviated to cov.sel = "ALL".
For no covariates specify cov.sel = "".
blnGLIMcmds
TRUE, if GLIM output is wanted. If blnGLIMcmds = FALSE,
the following items can be set to any value (such as a null text string) and are ignored.
Please note that if blnGLIMcmds = TRUE there is no output in R;
the output instead goes to the following files.
glimCmdFile
outFile
intFile
blnIntcovs = TRUE.
Input data is specified either through an external file (as specified through
datafile in ctrl) or through a dataframe via the argument dfr.
The input data file, if specified, must be a plain
text file with variable names in the first row, as readable via the command
read.table(datafile, header = TRUE). The leftmost columns must be the
responses to the Likert items, optionally followed by columns for
categorical subject covariates. These have to be specified such that the categories are represented
by consecutive integers starting with 1. Missing values are treated such that
rows with one or more NAs are removed from the data and a message is printed.
For an example see xmpl or the file xmpl.dat in the package's data/ directory.
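A minimal sketch of preparing data in this layout is given below. The variable names, values, and temporary file are hypothetical. The NA handling only mirrors the behaviour described above and is not taken from the package code.

```r
# Hypothetical data: three Likert items followed by one categorical covariate.
dat <- data.frame(
  I1  = c(1, 3, 2, NA, 5),
  I2  = c(2, 2, 4, 1, 5),
  I3  = c(3, 1, 5, 2, 4),
  SEX = factor(c("f", "m", "m", "f", "f"))
)

# Recode the covariate to consecutive integers starting with 1 (f -> 1, m -> 2).
dat$SEX <- as.integer(dat$SEX)

# Drop rows with one or more NAs and report how many were removed.
n_before <- nrow(dat)
dat <- dat[complete.cases(dat), ]
message(n_before - nrow(dat), " row(s) with missing values removed")

# Write a plain text file with variable names in the first row ...
tmp <- tempfile(fileext = ".dat")
write.table(dat, tmp, row.names = FALSE, quote = FALSE)

# ... and check that it can be read back as required.
check <- read.table(tmp, header = TRUE)
print(check)
```

The file name in tmp could then be supplied as the datafile element of ctrl, or dat could be passed directly via dfr.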
Care has to be taken in case of a larger number of Likert items (>= 7) and/or of covariate level combinations. The tractable size of the design matrix depends on the working memory available to R.
Reinhold Hatzinger
Dittrich, R., Francis, B.J., Hatzinger, R., Katzenbeisser, W. (2007), A Paired Comparison Approach for the Analysis of Sets of Likert Scale Responses. Statistical Modelling, 7(1):3-28.
## Not run: 
## EXAMPLE 1: TYPICAL USAGE
## not run because input data file does not exist

# defining the ctrl list
# would be typically read from file using source()
testex1<-list(
   datafile    = "test/test.dat",
   nitems      = 5,
   blnRevert   = FALSE,
   blnIntcovs  = FALSE,
   cov.sel     = c("SEX","URB"),
   blnGLIMcmds = TRUE,
   glimCmdFile = "test/test.gli",
   outFile     = "test/test.design",
   intFile     = ""                  # since blnIntcovs = FALSE
)

# call
pcpattL(testex1)

## End(Not run)

## EXAMPLE 2: WITH LOADED DATAFRAME
data(xmpl)                           # example data in package
testex2<-list(
   datafile    = "",                 # dataframe used
   nitems      = 3,
   blnRevert   = FALSE,
   blnIntcovs  = TRUE,
   cov.sel     = "ALL",
   blnGLIMcmds = FALSE,              # no GLIM output
   glimCmdFile = "",
   outFile     = "",
   intFile     = ""
)
dsgnmat <- pcpattL(testex2, xmpl)
print(head(dsgnmat))

## EXAMPLE 3: ILLUSTRATING THE ISSP2000 EXAMPLE
## simplified version of the analysis as
## given in Dittrich et.al.(2007)
data(issp2000)
testex3<-list(
   datafile    = "",
   nitems      = 6,
   blnRevert   = FALSE,
   blnIntcovs  = FALSE,
   cov.sel     = c("SEX","EDU"),
   blnGLIMcmds = FALSE,
   glimCmdFile = "",
   outFile     = "",
   intFile     = ""
)
design <- pcpattL(testex3, issp2000)

# - fit null multinomial model (basic model for items without
#   subject covariates) through Poisson distribution.
# - SEX:EDU parameters are nuisance parameters
# - the last item (GENE) becomes a reference item
#   in the model and is aliased; all other items
#   are compared to this last item

# item parameters with undecided effects and no covariate effects.
summary(glm(y~SEX:EDU
   + CAR+IND+FARM+WATER+TEMP+GENE
   + U12+U13+U23+U14+U24+U34+U15+U25+U35+U45+U16+U26+U36+U46+U56,
   family=poisson, data=design))

# now add main effect of SEX on items
summary(glm(y~SEX:EDU
   + CAR+IND+FARM+WATER+TEMP+GENE
   + (CAR+IND+FARM+WATER+TEMP+GENE):SEX
   + U12+U13+U23+U14+U24+U34+U15+U25+U35+U45+U16+U26+U36+U46+U56,
   family=poisson, data=design))