Shape perception:
Hypothesis testing example

Introduction

Data from nine participants in a shape perception study conducted by Vandekerckhove, Panis and Wagemans (2007; see experiment 3). In this task, participants were instructed to compare pairs of irregular shapes shown sequentially to indicate whether they were “Same” or “Different”.

The experimental design included three dichotomous factors:

  1. Change presence: Was there a change between the shapes?
  2. Change type: If there was a change, was it a change in convexity or concavity?
  3. Change quality: If there was a change, was it a qualitative or quantitative change?

The incompletely crossed design leads to five experimental conditions, distinguished by the configuration of three dummy variables \(A\), \(B\), and \(C\):

Condition Change (A) Change quality (B) Change type (C)
k = 1 Yes (A = 1) Qualitative (B = 0) Convexity (C = 0)
k = 2 Yes (A = 1) Quantitative (B = 1) Convexity (C = 0)
k = 3 Yes (A = 1) Qualitative (B = 0) Concavity (C = 1)
k = 4 Yes (A = 1) Quantitative (B = 1) Concavity (C = 1)
k = 5 No (A = 0) n/a n/a
# Load necessary libraries/packages
library(R2jags)
library(here)
seed <- 15

Loading and cleaning the data

Load the data

# Load the data from nine participants
data_raw <- read.csv(here("demos", "applications", "shape_perception", "data", "vpw08.csv"))
colnames(data_raw) <- c("sub", "change_quality", "change_type", "noChange", "response", "rt")
head(data_raw)
##   sub change_quality change_type noChange response      rt
## 1   1              0           0        1        0 1.39940
## 2   1              0           0        1        1 1.43930
## 3   1              0           0        1        1 0.85313
## 4   1              1           0        0        0 0.50655
## 5   1              0           0        1        0 1.09290
## 6   1              1           0        0        0 0.50652
# No. of observations
nrow(data_raw)
## [1] 5760

Clean the data

# Create a copy of the data preserving only RTs <= 3 seconds
tmp <- data_raw[which(data_raw$rt<=3),]

# Add a column with the condition index
change <- 1-tmp$noChange
cond <- rep(0,nrow(tmp))
cond[which(tmp$change_quality==0&tmp$change_type==0)] <- 1
cond[which(tmp$change_quality==1&tmp$change_type==0)] <- 2
cond[which(tmp$change_quality==0&tmp$change_type==1)] <- 3
cond[which(tmp$change_quality==1&tmp$change_type==1)] <- 4
cond[which(change==0)] <- 5

# Prepare final data set to use for our analysis
data <- data.frame("sub" = tmp$sub, "cond" = cond, "change" = change, 
                   "change_quality" = tmp$change_quality, "change_type" = tmp$change_type,
                   "response" = tmp$response, "rt" = tmp$rt)
# No. of observations
nrow(data)
## [1] 5722

Get summary statistics

Note: Since the three-parameters considered by the EZDDM formulation don’t predict any difference in shape between the correct and incorrect RT distributions, we use all trials to compute the mean and variance of the RT.

Write custom function ez_summaries

# Define a function to compute the summary statistics used by EZ-DDM
ez_summaries <- function(data){
  # Identify condition and subject ID
  cond <- unique(data$cond)
  sub <- unique(data$sub)
  # Prepare EZ statistics
  output<- c()
  for(i in sort(sub)){
    # For each subject...
    for(k in sort(cond)){
      # On each condition...
      # Isolate the data
      subset <- data[which(data$sub==i&data$cond==k),]
      # And prepare summary statistics
      output <- rbind(output, c("sub" = unique(subset$sub), "cond" = unique(subset$cond),
                                "change" = unique(subset$change), 
                                "change_quality" = unique(subset$change_quality),
                                "change_type" = unique(subset$change_type),
                                "nTrials" = nrow(subset), "score" = sum(subset$response),
                                "meanRT"  = mean(subset$rt), "varRT"= var(subset$rt)))
    }
  }
  # Return data frame with summary statistics per condition and subject
  return(as.data.frame(output))
}

Compute summary statistics from data

ezdata <- ez_summaries(data)
ezdata$acc_rate <- ezdata$score/ezdata$nTrials

The model

EZHBDM

The first component of the model is what we called ‘the proxy’ model: We use the sampling distributions for the three summary statistics computed from the data, (i.e., the number of correct responses \(\dot{T}\) and the mean \(\dot{M}\) and variance \(\dot{V}\) of the response times), which are built from the forward-system of equations in the EZDDM that characterize the proportion of correct responses PC, and the mean and precision of the RTs MRT, PRT.

\[\begin{align*} \dot{T} &\sim \mbox{Binomial}(PC,N)\\ \dot{V} &\sim \mbox{Normal}(\frac{1}{PRT},\frac{N-1}{2(PRT)^2})\\ \dot{M} &\sim \mbox{Normal}(MRT, PRT \times N)\\ PC &= \frac{1}{\exp(-\alpha\nu)+1}\\ MRT &= \tau + \frac{\alpha}{2\nu}\frac{\exp(-\alpha\nu)-1}{\exp(-\alpha\nu)+1}\\ PRT &= \frac{\alpha}{2\nu^3}\{\frac{1-2\alpha\nu\exp(-\alpha\nu)-\exp(-\alpha\nu)^2}{(\exp(-\alpha\nu)+1)^2}\} \end{align*}\]

Metaregression structure with dummy variables

We model the variability in the drift rate \(\nu\) parameter across conditions \(k\) by fitting a multiple linear regression

\[\begin{align*} \nu^{pred}_k &= \mu_\nu+X_k(\gamma_1Z_k+\gamma_2Y_k+\gamma_3Y_kZ_k)+\gamma_4(1-X_k)\\ \nu_k &\sim \mbox{Normal}(\nu^{pred}_k,\sigma_\nu) \end{align*}\]

so that the drift rate predicted for every condition \(\nu^{pred}_k\) depends on the configuration of the three dummy variables and their interaction with five different regression coefficients.

# Is there a change?
# Yes (1) / No (0)
A <- ezdata$change

# Change quality
# Quantitative (1) / Qualitative (0)
B <- ezdata$change_quality

# Change type
# Concavity (1) / Convexity (0)
C <- ezdata$change_type

Write the model in JAGS

modelFile <- here("output", "BUGS-models", "demo_shape_model.bug")
model <- write("
    model {
              ####### Priors
              drift_mu ~ dnorm(0,1)   # Baseline
              drift_lambda ~ dgamma(2,1)
              drift_sigma = pow(drift_lambda, -0.5)
              nondt ~ dexp(1)
              for(i in 1:4){
                  gamma[i] ~ dnorm(0,1)   
              }      
              
              for(j in 1:5){
                  drift_pred[j] = drift_mu + A[j]*(gamma[1]*B[j]+gamma[2]*C[j]+gamma[3]*B[j]*C[j]) + (1-A[j])*gamma[4]
              }
              
              ####### Sampling model
              for (k in 1:length(nTrials)) {
                  # Person-by-condition parameters for DM parameters
                  bound[k] ~ dgamma(2,1)
                  drift[k] ~ dnorm(drift_pred[cond[k]],drift_lambda)
          
                  # Forward equations from EZ Diffusion
                  ey[k]  = exp(-bound[k] * drift[k])
                  Pc[k]  = 1 / (1 + ey[k])
                  PRT[k] = 2 * pow(drift[k], 3) / bound[k] * pow(ey[k] + 1, 2) / (2 * -bound[k] * 
                           drift[k] * ey[k] - ey[k] * ey[k] + 1)
                  MDT[k] = (bound[k] / (2 * drift[k])) * (1 - ey[k]) / (1 + ey[k])
                  MRT[k] = MDT[k] + nondt
          
                  # Sampling distributions for summary statistics
                  correct[k] ~ dbin(Pc[k], nTrials[k])
                  varRT[k]   ~ dnorm(1/PRT[k], 0.5*(nTrials[k]-1) * PRT[k] * PRT[k])
                  meanRT[k]  ~ dnorm(MRT[k], PRT[k] * nTrials[k])
                }
    }", modelFile)

JAGS

Specify JAGS setup

set.seed(seed)

# General setup
n.chains  <- 4;      n.iter    <- 2500
n.burnin  <- 250;    n.thin    <- 1

# Pass data to JAGS
data_toJAGS <- list("nTrials"  =  ezdata$nTrials,
                    "meanRT"   =  ezdata$meanRT,
                    "varRT"    =  ezdata$varRT,
                    "correct"  =  ezdata$score,
                    "cond"     =  ezdata$cond,
                    "A" = A, "B" = B, "C" = C)

# Specify parameters to keep track of
parameters <- c('gamma', 'drift_mu', 'drift_lambda', 'drift_sigma', 'drift_pred',
                'drift','bound', 'nondt', "Pc", "PRT", "MRT")

# Prepare initial values
myinits <- rep(list(list()), n.chains)
for(i in 1:n.chains){
    myinits[[i]] <- list(drift = rnorm(nrow(ezdata),0,1))
}

Run JAGS

set.seed(seed)

start <- Sys.time()
samples <- jags(data=data_toJAGS,
                parameters.to.save=parameters,
                model=modelFile,
                n.chains=n.chains,  n.iter=1000,
                n.burnin=100,  n.thin=n.thin,
                DIC=T, inits=myinits)
## module glm loaded
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 135
##    Unobserved stochastic nodes: 97
##    Total graph size: 1624
## 
## Initializing model
end <- Sys.time()
## [1] "No Rhat greater than 1.05"

Extract posterior samples

##### Drift rate parameters
# Recovered drift rates
drift <- samples$BUGSoutput$sims.list$drift
# Regression coefficients
mu <- samples$BUGSoutput$sims.list$drift_mu
gamma <- samples$BUGSoutput$sims.list$gamma
# Fitted values / Predicted drift rates
drift_pred <- samples$BUGSoutput$sims.list$drift_pred

##### Summary statistics computed from the recovered drift and boundary
Pc   <- samples$BUGSoutput$sims.list$Pc
PRT  <- samples$BUGSoutput$sims.list$PRT
MRT  <- samples$BUGSoutput$sims.list$MRT

Results

Bayes Factors!

epsilon <- 0.1
prior_constant <- pnorm(epsilon) - pnorm(-epsilon)
for(i in 1:3){
    this.gamma <- gamma[,i]
    post_mass <- mean(this.gamma > -epsilon & this.gamma < epsilon)
    this.BF <- prior_constant/post_mass
    this.BF[post_mass==0] <- 0
    cat("The B.F. for gamma", i, " in favor of an effect is", this.BF, ".", sep="")
    if(this.BF < 1){
      cat(" Evidence in favor of the null effect!\n")
    } else if (this.BF < 1.2){
      cat(" Weak evidence in favor of the effect!\n")
    } else if (this.BF < 5){
      cat(" Moderate evidence in favor of the effect!\n")
    } else {
      cat(" Strong evidence in favor of the effect!\n")
    }
}
## The B.F. for gamma1 in favor of an effect is0.7899736. Evidence in favor of the null effect!
## The B.F. for gamma2 in favor of an effect is28.67604. Strong evidence in favor of the effect!
## The B.F. for gamma3 in favor of an effect is0.6779206. Evidence in favor of the null effect!

Drift predicted per condition

Posterior regression coefficients

Posterior predictive checks