Which Game is the Scariest? Alien: Isolation, Dead Space, Dead Space 2, or Silent Hill 2? An R Halloween Analysis!

I wanted to get into the Halloween spirit by doing some kind of horror-themed analytics post. The idea of combining R, data analytics, and the macabre isn't as straightforward as some may think (yes, that was a joke).

While I don't care for horror movies, for some reason I enjoy survival horror video games. Not playing them, of course. I'm far too squeamish for that. I usually watch YouTube videos of other people playing them to spare myself a panic attack. I'm the kind of guy who would start playing the game and, once the atmosphere became intense, just go "NOPE", turn off the game, and walk away.


Of the survival horror games I've seen, the Dead Space franchise is up there. I also love the Alien franchise, though that franchise has suffered from a number of awful releases (including movies). Alien: Isolation is a gem whose intense atmosphere makes every footstep nerve-racking. Lastly, I wanted to include another game that I haven't seen but should. I have heard good things about Silent Hill and checked Metacritic for the best one of the franchise, which was Silent Hill 2. Other editorial sites rank this particular game as the best survival horror game out there.

If you aren't interested in web scraping for data, statistical programming, Bayesian models, or R, I would recommend skipping ahead to the results and conclusion.

Data

To fuel this analysis, I wanted data from both critics and the fan base. To acquire it, I will be scraping review information from the aforementioned Metacritic website, taking the first 100 critic reviews and the first 100 user reviews sorted by helpfulness. Note that there are typically fewer than 100 critic reviews, so all critic reviews are usually captured. I'm not interested in positive reviews so much as in some metric of "scariness" or atmospheric intensity. I do plan on using the scores from critics, because I think they correlate with that fear factor. With all that information, I'll assemble a metric that I feel could be useful in answering the question of which game is the scariest. Code for extracting the information is provided below.


I do the same for each of the other three games, changing only the initial URL.
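Since only the URL changes from game to game, the scraping step could be wrapped in a small function. The following is a minimal sketch, not the code used for the analysis: `get_review_text()` and `review_url()` are hypothetical helpers, the CSS selectors are copied from the R code at the end of the post and assume the Metacritic page layout at the time of writing, and `trimws()` stands in for the `scrubber()` cleanup from qdap. The demonstration parses a tiny hand-written page so that it runs offline.

```r
library(rvest)

# Hypothetical helper: pull review text out of an already-parsed page.
# The selectors are assumptions about Metacritic's markup at the time.
get_review_text <- function(page, block = ".critic_reviews") {
  page %>%
    html_nodes(block) %>%
    html_nodes(".review_body") %>%
    html_text() %>%
    trimws() %>%
    tolower()
}

# Hypothetical helper: build the review URL from a game "slug".
review_url <- function(slug, type = c("critic", "user")) {
  type <- match.arg(type)
  paste0("https://www.metacritic.com/game/pc/", slug, "/", type, "-reviews")
}

# Offline demonstration on made-up HTML (not real review data).
demo_page <- read_html(
  '<div class="critic_reviews">
     <div class="review_body"> Tense and TERRIFYING. </div>
     <div class="review_body">A bit repetitive.</div>
   </div>'
)
demo_text <- get_review_text(demo_page)
demo_text
```

With a live connection, something like `get_review_text(read_html(review_url("dead-space")))` would replace the hand-written page, provided the site still uses the same class names.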

Fear Factor Metric

My plan is to search the reviews for key fear words or descriptors, which I list at the end of this post along with all of my code. I'm intentionally not stemming the words because that has led to issues every time I've done it. Certainly, this setup is not infallible, and other people could come up with other or better words, but this is just a "for fun" analysis so I'm not too worried.

To better capture context, I used enhancer words that add a bonus. Conversely, I have detractor words that penalize negative aspects of the review. It's not perfect, of course, but it's a way to consider the more and the less favorable ideas that people express in these reviews. Each fear factor word and each enhancer earns 1 point, and each detractor word costs 1 point. All words are shown in the R code at the end.



One may be inclined to sum up all the points and declare whichever game has the most the victor. The problem is that the games don't all have the same number of reviews. Consequently, I need to use an average score rather than a total score.
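The scoring and averaging described above can be sketched in a few lines of base R. This is only an illustration under simplifying assumptions: the word lists are shortened, the three reviews are invented, and `score_review()` is a hypothetical helper that splits on non-letters instead of using the qdap word-frequency table from the actual analysis.

```r
# Shortened illustrative word lists; the full lists are in the R code at the end.
fear_terms <- c("scary", "tense", "atmosphere", "terrifying")
enhancers  <- c("great", "excellent", "best")
detractors <- c("repetitive", "boring", "long")

# Look-up vector: word -> point value (+1 or -1).
points <- c(
  setNames(rep(1, length(fear_terms)), fear_terms),
  setNames(rep(1, length(enhancers)), enhancers),
  setNames(rep(-1, length(detractors)), detractors)
)

# Hypothetical helper: score one review by summing the points of every
# key-term occurrence (a repeated word counts each time it appears).
score_review <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  sum(points[words], na.rm = TRUE)
}

# Invented reviews, for illustration only.
reviews <- c(
  "Great atmosphere, tense and scary throughout.",
  "Scary at first but too long and repetitive.",
  "I liked the soundtrack."
)
scores <- vapply(reviews, score_review, numeric(1), USE.NAMES = FALSE)
scores        # one fear factor score per review
mean(scores)  # the average, not the total, is comparable across games
```

A review with no key terms scores zero, so it still pulls the game's average toward zero rather than being dropped.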

Model Fitting

Non-statistical folks are generally satisfied with a point estimate and move on. However, I am also interested in how consistently these games are rated, so I care about the variability of the fear factor score as well. To consider this aspect of performance, I will use posterior predictive distributions to see what to expect if we resampled the population. You guessed it, we are using a Bayesian model! The website is lazybayesian.com, after all.

Another advantage of a Bayesian model is that the prior distribution can incorporate additional information. You may have been wondering how I was going to use the critic scores. I am going to incorporate them into the prior distributions. This requires some thought, because I want narrower variances for the more consistently rated games and higher prior means for the better rated games. The summary statistics for the critic scores across the four games give a good idea of what is going on.


Priors

The idea is to use the summary information to propose priors that are consistent with the above trends. Games with higher average scores should be given higher priors for the mean, and those with greater variability should have more dispersed priors for the variance or precision parameters. The problem is that the critic scores are on a totally different scale than the fear factor scores.

I ended up dividing the critic scores by 20 to scale them to a range I thought was reasonable. I then used the median value for the mean prior and set the precision to 0.25, which amounts to a standard deviation of 2. FYI, I'm using precisions for priors because that is what JAGS uses, as does WinBUGS. I also want the estimated precisions of each likelihood to be influenced by the critic scores. I used the Gamma distribution for the precision priors and set the shape parameter to 1.5 so there wouldn't be much sampling at zero, but still allow for skew. I based the second parameter on the ranges of the critic scores, using 1/range; note that JAGS reads the second parameter of dgamma as a rate rather than a scale. If you have other ideas about setting up the priors to accomplish this goal, I would love to hear about them in the comments!
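To make the arithmetic behind these priors concrete, here is a small sketch in base R. The critic scores below are made up for illustration (they are not the scraped data), and drawing from the Gamma prior with `rgamma()` is just one way to check what spread the prior implies; it is not part of the original analysis.

```r
# Made-up critic scores on Metacritic's 0-100 scale (not the scraped data).
critic_scores <- c(60, 75, 80, 83, 85, 90, 100)

# Prior mean for mu: the median critic score after dividing by 20.
prior_mean <- median(critic_scores / 20)

# Prior precision for mu and the standard deviation it implies.
prior_prec <- 0.25
prior_sd   <- 1 / sqrt(prior_prec)   # precision = 1 / variance

# Gamma prior on the likelihood precision: shape 1.5 and second parameter
# 1 / range of the critic scores (JAGS treats this parameter as a rate).
gamma_shape <- 1.5
gamma_rate  <- 1 / diff(range(critic_scores))

# Draw from the prior to see which standard deviations it favors.
set.seed(123)
prec_draws <- rgamma(10000, shape = gamma_shape, rate = gamma_rate)
quantile(1 / sqrt(prec_draws), c(0.05, 0.5, 0.95))

c(prior_mean = prior_mean, prior_sd = prior_sd, gamma_rate = gamma_rate)
```

Because the second parameter enters as a rate, a wider critic range gives a smaller rate and therefore a more spread-out prior on the precision.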

You'll find all my model details in my JAGS code.

Results

The posterior distributions of the mean fear factor scores are shown below. If we use the mean of the posterior distribution for our point estimate, we would declare Alien: Isolation the winner. Though, I don't think the story is that simple.


If we look at the overall posterior distribution, you'll see there is plenty of overlap between the mean estimates for Alien, Dead Space, and Dead Space 2. I'm not going to talk in more depth about that until we have a chance to look at the posterior predictive distributions.

Moving on to the estimates of standard deviation, it is worth noting that the more highly rated games also tend to have the larger estimates for standard deviation. The estimated standard deviation for Alien: Isolation is larger than all the rest, which tells us there is more variability in how people felt about and reacted to the game.


Given the closeness of some of the estimates along with their variability, it is difficult to tell which game is really coming out on top as far as being the scariest is concerned. Consequently, looking at the posterior predictive distributions may give us a better understanding of what is going on in terms of raw fear factor scores.


The above graphic represents what we would expect if we sampled fear factor scores from the populations, given what we learned from our model. You can tell that Alien: Isolation extends further right (higher fear factor scores) than the rest, but at the same time it also extends further left (lower scores) than the rest. While it is a generally higher rated game, it also generates a broader range of sentiment from the public. That is, it has higher highs and lower lows than the other 3 games.
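For readers curious how posterior predictive draws are produced, the idea can be sketched in base R. The posterior draws below are simulated stand-ins with arbitrary values (in the post they come from the JAGS fit), so the numbers say nothing about any of the four games.

```r
set.seed(123)

# Stand-ins for MCMC output: made-up posterior draws of the mean and the
# standard deviation for a single game.
n_draws  <- 20000
mu_draws <- rnorm(n_draws, mean = 3, sd = 0.3)
sd_draws <- sqrt(1 / rgamma(n_draws, shape = 50, rate = 200))

# Posterior predictive draws: one simulated fear factor score per posterior
# draw, so uncertainty in both the mean and the spread carries through.
pp_draws <- rnorm(n_draws, mean = mu_draws, sd = sd_draws)

# The predictive distribution is much wider than the posterior of the mean.
c(sd_of_mean = sd(mu_draws), sd_of_prediction = sd(pp_draws))
quantile(pp_draws, c(0.025, 0.5, 0.975))
```

This is why two games with overlapping mean estimates can still differ visibly in the predictive plot: a larger standard deviation stretches both tails.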

Conclusion

As with most opinions, it all depends on who you ask. There is plenty of overlap among all 4 distributions. However, if you want to crown a winner as the scariest, then look no further than Alien: Isolation.

If you were to ask me, the results make perfect sense. I, along with many people who reviewed the game, think the first half is intense, atmospheric, spine-tingling, fear-inspiring action. Afterward, it starts to feel repetitive and unnecessarily long, as the additional gameplay generates fewer scares and adds no depth.

As far as Dead Space is concerned, I would recommend watching the first one, well, first. It best introduces you to the ideas and storyline while providing one of my all-time favorite introductory scenes in a game. I can see people generally appreciating Dead Space 2 more because there is more diversity in tasks and superior gameplay, while it maintains the same atmospheric intensity and sense of dread and anticipation as you move about.

One way or another, you can't go wrong with any of these games. I plan to rewatch YouTube videos of the gameplay for all of them in order to get into the Halloween spirit.

Thank you for reading this post and hope you have a wonderfully spooky Halloween!

R Code

####### Halloween - Horror gaming comparison Post ########
library(rvest);library(qdap)

# Alien Critic Reviews #
alCritInit<-read_html("https://www.metacritic.com/game/pc/alien-isolation/critic-reviews")
alCritBlockT<-alCritInit %>% html_nodes(".critic_reviews") %>% html_nodes(".review_body")
alCritText<-html_text(alCritBlockT) %>% unlist()
alCritText<-scrubber(alCritText) %>% tolower()
# scores
alCritBlockS<-alCritInit %>% html_nodes(".critic_reviews") %>% html_nodes(".indiv")
alCritScores<-html_text(alCritBlockS) %>% unlist() %>% as.numeric()

# Alien User Reviews #
alUserInit<-read_html("https://www.metacritic.com/game/pc/alien-isolation/user-reviews")
alUserBlock<-alUserInit %>% html_nodes(".user_reviews") %>% html_nodes(".review_body")
alUserText<-html_text(alUserBlock) %>% unlist()
alUserText<-scrubber(alUserText) %>% tolower()

rm(alCritInit,alCritBlockT,alCritBlockS,alUserBlock,alUserInit)

# Dead Space Critic Reviews #
dsCritInit<-read_html("https://www.metacritic.com/game/pc/dead-space/critic-reviews")
dsCritBlockT<-dsCritInit %>% html_nodes(".critic_reviews") %>% html_nodes(".review_body")
dsCritText<-html_text(dsCritBlockT)
dsCritText<-scrubber(dsCritText) %>% tolower()
# scores
dsCritBlockS<-dsCritInit %>% html_nodes(".critic_reviews") %>% html_nodes(".indiv")
dsCritScores<-html_text(dsCritBlockS) %>% unlist() %>% as.numeric()

# Dead Space User Reviews #
dsUserBlock<-read_html("https://www.metacritic.com/game/pc/dead-space/user-reviews")
dsUserBlock<-dsUserBlock %>% html_nodes(".user_reviews") %>% html_nodes(".review_body")
dsUserText<-html_text(dsUserBlock)
dsUserText<-scrubber(dsUserText) %>% tolower()

rm(dsCritInit,dsCritBlockT,dsCritBlockS,dsUserBlock)

# Dead Space 2 Critic Reviews #
ds2CritInit<-read_html("https://www.metacritic.com/game/pc/dead-space-2/critic-reviews")
ds2CritBlockT<-ds2CritInit %>% html_nodes(".critic_reviews") %>% html_nodes(".review_body")
ds2CritText<-html_text(ds2CritBlockT) %>% unlist()
ds2CritText<-scrubber(ds2CritText) %>% tolower()
# scores
ds2CritBlockS<-ds2CritInit %>% html_nodes(".critic_reviews") %>% html_nodes(".indiv")
ds2CritScores<-html_text(ds2CritBlockS) %>% unlist() %>% as.numeric()

# Dead Space 2 User Reviews #
ds2UserBlock<-read_html("https://www.metacritic.com/game/pc/dead-space-2/user-reviews")
ds2UserBlock<-ds2UserBlock %>% html_nodes(".user_reviews") %>% html_nodes(".review_body")
ds2UserText<-html_text(ds2UserBlock) %>% unlist()
ds2UserText<-scrubber(ds2UserText) %>% tolower()

rm(ds2CritInit,ds2CritBlockT,ds2CritBlockS,ds2UserBlock)

# Silent Hill 2 Critic Reviews #
sh2CritInit<-read_html("https://www.metacritic.com/game/pc/silent-hill-2/critic-reviews")
sh2CritBlockT<-sh2CritInit %>% html_nodes(".critic_reviews") %>% html_nodes(".review_body")
sh2CritText<-html_text(sh2CritBlockT) %>% unlist()
sh2CritText<-scrubber(sh2CritText) %>% tolower()
# scores
sh2CritBlockS<-sh2CritInit %>% html_nodes(".critic_reviews") %>% html_nodes(".indiv")
sh2CritScores<-html_text(sh2CritBlockS) %>% unlist() %>% as.numeric()

# Silent Hill 2 User Reviews#
sh2UserInit<-read_html("https://www.metacritic.com/game/pc/silent-hill-2/user-reviews")
sh2UserBlock<-sh2UserInit %>% html_nodes(".user_reviews") %>% html_nodes(".review_body")
sh2UserText<-html_text(sh2UserBlock) %>% unlist()
sh2UserText<-scrubber(sh2UserText) %>% tolower()

rm(sh2CritInit,sh2UserInit,sh2CritBlockS,sh2CritBlockT)
## Key Terms ##
enhancers<-c("superb","excellent","perfect","perfectly","best","satisfying","unique","outstanding","masterpiece","strong","incredible","amazing","amazed","great","inspired","legendary","good")


fearTerms<-c("atmosphere","environment","fear","intense","scared","scariest","scary","terrify","emotion","emotional","mood","nervewrecking","nerve","nerves","oppressive","edge","terror","terrifying","immersive","immersed","adrenaline","ambience","thrill","thrilling","tense","tension","despair","haunting","anxiety","anxious","afraid","creative","creepy","creepiest","fright","frightening","pants","eerie")

detractors<-c("repetitive","repetition","repititious","repeat","bored","boring","meandering","bugs","poor","poorly","distracted","disctracting","predictable","issues","long","problems","disappointing","disappoint","dissapointed","pointless","arbitrary","weak","awful","confusing","bizarre","disjointed","disorganized","rambling","dull","clunky","aggrevating","tiresome","frustrating","unnecesary")

wordScore<-data.frame(keyTerm=c(enhancers,fearTerms,detractors),Point=c(rep(1,length(enhancers)),rep(1,length(fearTerms)),rep(-1,length(detractors))),termType=c(rep("Enhancer",length(enhancers)),rep("Fear",length(fearTerms)),rep("Detractor",length(detractors))))
# ,"hair-raising" ,"too long","over and over"
rm(enhancers,fearTerms,detractors)
## extract data from review text ##

# Alien: Isolation #
alienWfxRev<-wfdf(text.var=c(alCritText,alUserText),grouping.var=1:length(c(alCritText,alUserText)))
alienWfxRev<-merge(alienWfxRev,wordScore,by.x="Words",by.y="keyTerm")
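# keep only the per-review count columns (selected by name rather than hard-coded positions), then score each review
alRevCols<-setdiff(names(alienWfxRev),c("Words","Point","termType"))
pointsXrevAI<-apply(alienWfxRev[,alRevCols,drop=FALSE],2,function(x) x%*%alienWfxRev$Point )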

rm(alCritText,alUserText)

# Dead Space
dsWfxRev<-wfdf(text.var=c(dsCritText,dsUserText),grouping.var=1:length(c(dsCritText,dsUserText)))
dsWfxRev<-merge(dsWfxRev,wordScore,by.x="Words",by.y="keyTerm")
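# keep only the per-review count columns (selected by name rather than hard-coded positions), then score each review
dsRevCols<-setdiff(names(dsWfxRev),c("Words","Point","termType"))
pointsXrevDS<-apply(dsWfxRev[,dsRevCols,drop=FALSE],2,function(x) x%*%dsWfxRev$Point )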

rm(dsUserText,dsCritText)

# Dead Space 2
ds2WfxRev<-wfdf(text.var=c(ds2CritText,ds2UserText),grouping.var=1:length(c(ds2CritText,ds2UserText)))
ds2WfxRev<-merge(ds2WfxRev,wordScore,by.x="Words",by.y="keyTerm")
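# keep only the per-review count columns (selected by name rather than hard-coded positions), then score each review
ds2RevCols<-setdiff(names(ds2WfxRev),c("Words","Point","termType"))
pointsXrevDS2<-apply(ds2WfxRev[,ds2RevCols,drop=FALSE],2,function(x) x%*%ds2WfxRev$Point )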

# Silent Hill 2
sh2WfxRev<-wfdf(text.var=c(sh2CritText,sh2UserText),grouping.var=1:length(c(sh2CritText,sh2UserText)))
sh2WfxRev<-merge(sh2WfxRev,wordScore,by.x="Words",by.y="keyTerm")
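# keep only the per-review count columns (selected by name rather than hard-coded positions), then score each review
sh2RevCols<-setdiff(names(sh2WfxRev),c("Words","Point","termType"))
pointsXrevSH2<-apply(sh2WfxRev[,sh2RevCols,drop=FALSE],2,function(x) x%*%sh2WfxRev$Point )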

summary(alCritScores/20) # Alien: Isolation
summary(dsCritScores/20) # Dead Space
summary(ds2CritScores/20) # Dead Space 2
summary(sh2CritScores/20) # Silent Hill 2

range(alCritScores)
range(dsCritScores)
range(ds2CritScores)
range(sh2CritScores)

#zCritScores<-cbind(alCritScores,dsCritScores,ds2CritScores,sh2CritScores) %>% sweep(2,c(mean(alCritScores),mean(dsCritScores),mean(ds2CritScores),mean(sh2CritScores),FUN="-")) %>% sweep(2,c(sd(alCritScores),sd(dsCritScores),sd(ds2CritScores),sd(sh2CritScores),FUN="/"))

zalScore<-(alCritScores-mean(alCritScores))/sd(alCritScores)
zdsScore<-(dsCritScores-mean(dsCritScores))/sd(dsCritScores)
zds2Score<-(ds2CritScores-mean(ds2CritScores))/sd(ds2CritScores)
zsh2Score<-(sh2CritScores-mean(sh2CritScores))/sd(sh2CritScores)

### Bayesian analysis of fear factor scores ###
library(R2jags)

modat<-data.frame(FearScore=c(pointsXrevAI,pointsXrevDS,pointsXrevDS2,pointsXrevSH2),Game=c(rep(1,length(pointsXrevAI)),rep(2,length(pointsXrevDS)),rep(3,length(pointsXrevDS2)),rep(4,length(pointsXrevSH2))))

# Propose Bayesian Model

FearFactorMod<-"
model{
  for(i in 1:n){
FearScore[i]~dnorm(mu[Game[i]],prec[Game[i]])
  }
 #set priors
mu[1]~dnorm(4.15,0.25) # Alien: Isolation
mu[2]~dnorm(4.35,0.25) # Dead Space
mu[3]~dnorm(4.475,0.25) # Dead Space 2
mu[4]~dnorm(4,0.25) # Silent Hill 2
# set precisions
prec[1]~dgamma(1.5,1/41) # Alien: Isolation
prec[2]~dgamma(1.5,1/30) # Dead Space
prec[3]~dgamma(1.5,1/30) # Dead Space 2
prec[4]~dgamma(1.5,1/50) # Silent Hill 2

sd[1]<-sqrt(1/prec[1])
sd[2]<-sqrt(1/prec[2])
sd[3]<-sqrt(1/prec[3])
sd[4]<-sqrt(1/prec[4])
}
"
writeLines(FearFactorMod,'FearFactorMod.txt')

FearScore<-modat$FearScore
Game<-modat$Game
n<-nrow(modat)

jagdat<-c('FearScore','Game','n')
parms<-c('mu','sd')

Fear.sim<-jags(data=jagdat,inits=NULL,parameters.to.save=parms,model.file='FearFactorMod.txt',jags.seed=123,n.iter=25000,n.burnin=5000,n.chains=5,n.thin=1)

library(MCMCvis)
options(scipen = 999)

MCMCsummary(Fear.sim)
MCMCplot(Fear.sim,params = "mu",main="Posterior Distribution for Mean Fear Factor Score Estimates",labels=c("Alien: Isolation","Dead Space","Dead Space 2","Silent Hill 2"),xlab="Fear Factor Score")
MCMCplot(Fear.sim,params = "sd",main="Posterior Distributions for Std. Dev. Fear Factor Estimates",labels=c("Alien: Isolation","Dead Space","Dead Space 2","Silent Hill 2"),xlab="Fear Factor Score")

MCMCtrace(Fear.sim)

head(Fear.sim$BUGSoutput$sims.matrix)
fearsamp<-as.mcmc(Fear.sim)
# Derive Posterior Predictive Distributions
ppAI<-rnorm(nrow(fearsamp[[1]]),mean=fearsamp[[1]][,2],sd=fearsamp[[1]][,6])
ppDS<-rnorm(nrow(fearsamp[[1]]),mean=fearsamp[[1]][,3],sd=fearsamp[[1]][,7])
ppDS2<-rnorm(nrow(fearsamp[[1]]),mean=fearsamp[[1]][,4],sd=fearsamp[[1]][,8])
ppSH2<-rnorm(nrow(fearsamp[[1]]),mean=fearsamp[[1]][,5],sd=fearsamp[[1]][,9])

# label each block of draws with its game; block size follows the number of draws instead of a hard-coded 20000
postPred<-data.frame(postPredDraws=c(ppAI,ppDS,ppDS2,ppSH2),Game=rep(c("Alien: Isolation","Dead Space","Dead Space 2","Silent Hill 2"),each=length(ppAI)))

library(ggplot2)

ppPlot<-ggplot(aes(x=postPredDraws,group=Game,color=Game,fill=Game),data=postPred) + geom_density(alpha=0.33) + labs(title="Posterior Predictive Distributions for Alien: Isolation, Dead Space, Dead Space 2, and Silent Hill 2",x="Fear Factor Score") + theme(axis.ticks.y = element_blank(),axis.text.y = element_blank())
ppPlot
