Sentiment Analysis: What is it and how does goShadow use it?

Sentiment analysis is the contextual mining of text to identify the emotions (positive, negative, or neutral) expressed in it. Most sentiment analysis has been performed on large bodies of text rather than on survey responses, largely because that is easier. However, open-text questions elicit less constrained, more thoughtful responses than multiple-choice or Likert-scale questions. Performing sentiment analysis on open-text responses gives a more accurate understanding of respondents’ intentions, opinions, and motivations.

Manual sentiment analysis occurs when analysts subjectively categorize responses by sentiment. If resources are available, multiple analysts should code the responses to reduce bias. Even with multiple coders, however, a large volume of text can make the project infeasible because of the time and money it requires.
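One simple way to combine the work of several coders is a majority vote, with any disagreements flagged for review. The sketch below is only an illustration of that idea: the analyst names, labels, and helper functions are hypothetical and do not describe goShadow's actual process.

```python
from collections import Counter

# Hypothetical labels from three analysts for the same five responses.
analyst_labels = {
    "analyst_a": ["positive", "negative", "negative", "positive", "negative"],
    "analyst_b": ["positive", "negative", "positive", "positive", "negative"],
    "analyst_c": ["positive", "negative", "negative", "positive", "positive"],
}


def majority_label(labels):
    """Return the most common label, or None when first place is tied."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def consolidate(labels_by_analyst):
    """Combine per-analyst labels into one consensus entry per response."""
    results = []
    # zip(*...) regroups the labels so each tuple holds one response's labels.
    for labels in zip(*labels_by_analyst.values()):
        results.append({
            "consensus": majority_label(labels),
            "unanimous": len(set(labels)) == 1,
        })
    return results


consensus = consolidate(analyst_labels)
for index, entry in enumerate(consensus):
    print(index, entry["consensus"], "unanimous" if entry["unanimous"] else "needs review")
```

Responses that are not unanimous are the ones worth a second look, since they are where individual interpretation is most likely to have influenced the label.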

Within the past year, goShadow has begun conducting manual sentiment analysis on open-text survey questions. An analyst reads each response and classifies it as positive or negative based on an educated interpretation. An important factor in obtaining an accurate sentiment is determining whether the question itself is biased toward one sentiment or the other. For example, the question “What would make you happier?” prompts negative sentiment: the respondent is describing something she or he lacks, which makes the answer a negative response. Although the resulting classifications are fairly accurate, the process is tedious and time-consuming.

Computer-aided text analysis (CATA) is an emerging methodology designed to make sentiment analysis more efficient. There are two main CATA methods: lexicon-based and learning-based. Lexicon-based methods determine sentiment using a predetermined dictionary of words, each assigned to a sentiment category. Learning-based methods train an algorithm on text that has already been labeled and then use it to predict the sentiment of unlabeled text. Wijngaards and colleagues demonstrated the reliability of lexicon-based CATA on text-based job satisfaction data (Wijngaards et al., 2019). Nonetheless, whichever method is chosen, manual analysis is often used as the gold standard or benchmark.
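To make the lexicon-based approach concrete, here is a minimal sketch in Python. The eight-word lexicon, the sample responses, and the manual labels are all invented for illustration. Real lexicons contain thousands of scored entries, and this is not the method used in the cited study or by goShadow. The second function shows how automated labels might be compared against manual labels used as the benchmark.

```python
import re

# Tiny illustrative lexicon (an assumption; real lexicons are far larger).
LEXICON = {
    "friendly": 1, "helpful": 1, "clean": 1, "great": 1,
    "rude": -1, "slow": -1, "dirty": -1, "confusing": -1,
}


def lexicon_sentiment(text):
    """Sum the scores of known words and map the total to a label."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(LEXICON.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def agreement_with_manual(responses, manual_labels):
    """Share of responses where the automated label matches the analyst's."""
    matches = sum(
        lexicon_sentiment(text) == label
        for text, label in zip(responses, manual_labels)
    )
    return matches / len(responses)


# Hypothetical survey responses and the labels an analyst assigned to them.
responses = [
    "The staff were friendly and helpful.",
    "Check-in was slow and the signs were confusing.",
    "I parked in the garage.",
    "Not helpful at all.",
]
manual_labels = ["positive", "negative", "neutral", "negative"]

print(agreement_with_manual(responses, manual_labels))
```

On this toy data the automated labels match three of the four manual labels. The miss is “Not helpful at all,” which the word-counting approach scores as positive because it ignores negation. Comparing against manual labels is how limitations like this come to light.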

goShadow has begun using Python to automate sentiment analysis. The analysis relies on the Natural Language Toolkit (NLTK), a Python platform for working with human-language data, along with other libraries. NLTK provides access to over fifty corpora and lexical resources against which the text can be compared. The other libraries allow the text to be cleaned of spelling and grammatical errors.
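As a rough picture of what the cleaning step involves, the sketch below normalizes a response and corrects near-miss spellings against a word list. It uses only Python's standard library (re and difflib) as a stand-in. It is not NLTK code and not goShadow's pipeline, and the vocabulary, the function name, and the 0.8 similarity cutoff are assumptions chosen for the example.

```python
import difflib
import re

# Small hypothetical vocabulary; a real pipeline would use a full word list.
VOCABULARY = ["the", "nurse", "was", "very", "friendly", "and", "helpful", "wait", "long"]


def clean_response(text, vocabulary=VOCABULARY, cutoff=0.8):
    """Lowercase the text, drop punctuation, and fix near-miss spellings."""
    tokens = re.findall(r"[a-z]+", text.lower())
    cleaned = []
    for token in tokens:
        if token in vocabulary:
            cleaned.append(token)
            continue
        # get_close_matches returns the best candidates at or above the cutoff.
        candidates = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        # Keep the original token when nothing in the vocabulary is close enough.
        cleaned.append(candidates[0] if candidates else token)
    return cleaned


print(clean_response("The nurse was verry freindly!!"))
```

Cleaning before scoring matters because a misspelled word such as “freindly” would otherwise not be found in a sentiment lexicon and would contribute nothing to the result.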

Want to learn more or design your own custom survey with the goShadow team? Email us to get started.

Wijngaards, I., Burger, M., & van Exel, J. (2019). The promise of open survey questions: The validation of text-based job satisfaction measures. PLoS ONE, 14(12), e0226408. https://doi.org/10.1371/journal.pone.0226408

Posted on July 14, 2021