Transforming the Voice of the Customer into High-Performance Advertising
1. Context & Business Problem: How to Turn Listening into Conversion?
The objective of this project was to address a fundamental business challenge: how to design a hyper-effective advertising campaign for a back pain solution. Rather than relying on marketing intuition, my premise was that a rigorously data-driven approach would allow me to build a message with surgical precision.
The real question wasn't "what should we tell our customers?" but rather "what are our customers already telling each other?" To answer this, I decided to analyze the most authentic source available: the spontaneous conversations of hundreds of people on public forums, specifically Reddit.
2. My Data-Driven Methodology: An End-to-End Process
To transform raw discussions into a quantifiable marketing strategy, I deployed a complete processing pipeline within the R programming environment, a choice driven by its statistical power and high-quality visualization capabilities.
Data Source: Public discussions about back pain, scraped from Reddit using the Apify tool. Collecting the data myself gave me a purpose-built dataset of spontaneous, unprompted user language.
Analysis Tools:
R: For the entire process, from data cleaning to semantic analysis and visualization creation.
Google Slides: For the final communication of findings, in order to build a clear and direct narrative.
Data Collection & Cleaning: Raw text from the internet is inherently "dirty." I therefore applied a rigorous cleaning script in R, validating each step iteratively with the print() function to ensure data integrity. This process included:
tolower(): Standardizing case so that "Pain," "pain," and "PAIN" were treated as a single concept.
gsub("[[:punct:]]", "", x) & gsub("[[:digit:]]", "", x): Removing punctuation and numbers from the text x, since both are noise for semantic analysis.
removeWords(stopwords("english")): Removing common words (stopwords like "the," "a," "is") to retain only meaningful terms.
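The cleaning steps listed above can be sketched as one small function. This is a minimal illustration, not the original script: the sample comments, the clean_text() name, and the final whitespace-collapsing step are assumptions added for the example, while removeWords() and stopwords() come from the tm package.

```r
library(tm)  # provides removeWords() and stopwords()

# Hypothetical sample comments standing in for the scraped Reddit text.
comments <- c(
  "My lower back PAIN is the worst in the morning!!!",
  "I have had pain for 3 years, and the fear of surgery is real."
)

# clean_text() is an illustrative helper name, not part of any package.
clean_text <- function(x) {
  x <- tolower(x)                            # "Pain", "PAIN" -> "pain"
  x <- gsub("[[:punct:]]", "", x)            # drop punctuation
  x <- gsub("[[:digit:]]", "", x)            # drop numbers
  x <- removeWords(x, stopwords("english"))  # drop common English words
  x <- gsub("\\s+", " ", x)                  # assumption: collapse leftover spaces
  trimws(x)                                  # remove leading/trailing spaces
}

cleaned <- clean_text(comments)
print(cleaned)  # inspect the result after cleaning
```

One design point worth checking in a real run: because punctuation is removed before stopwords, contractions such as "don't" become "dont" and no longer match the stopword list, so they may survive the filter.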
Analysis Technique: I used Term Frequency analysis to identify the most recurring themes. This is the most direct method for quantifying the main concerns of users.
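As a rough illustration of Term Frequency analysis, the sketch below counts how often each word appears across a few already-cleaned comments using base R only. The sample comments and variable names are hypothetical, and the original analysis may have used a different tokenizer or package.

```r
# Hypothetical cleaned comments (lowercase, no punctuation, no stopwords).
cleaned <- c(
  "lower back pain worst morning",
  "pain years fear surgery real",
  "sciatica pain keeps awake night"
)

# Split every comment on whitespace and pool the words into one vector.
tokens <- unlist(strsplit(cleaned, "\\s+"))
tokens <- tokens[nzchar(tokens)]  # drop empty strings, if any

# Count each term and sort from most to least frequent.
term_freq <- sort(table(tokens), decreasing = TRUE)

# Show the ten most frequent terms.
print(head(term_freq, 10))
```

The sorted table can be passed directly to barplot() to produce a quick frequency chart.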
3. Analysis & Insights: Moving from "What" to "Why"
Once the data was ready, I could begin to make it talk. My analysis unfolded in two phases.
A. Descriptive Analysis: Identifying the "What"
The first step was to quantify the topics of discussion. What was the number one concern for people suffering from back pain?
My Method: An In-Depth Analysis to Reveal the Unseen
To transform this raw text into a clear strategy, I followed a rigorous process using the R programming language, known for its statistical power.
Text Cleaning: The Foundation of a Sound Analysis. Text from the internet is "dirty." I therefore prepared the data using essential functions:
tolower(): To standardize the text to lowercase, ensuring that "Pain," "pain," and "PAIN" are treated as the same word.
gsub("[[:punct:]]", "", x) and gsub("[[:digit:]]", "", x): To remove punctuation and numbers from the text x, since both are "noise" for this kind of analysis.
removeWords(stopwords("english")): To remove common stop words (e.g., "the," "is," "a") that lack semantic weight, allowing me to focus on meaningful words.
Sentiment and Emotion Analysis: Quantifying Opinion. I used the R library sentimentr, a specialized tool for this type of analysis.
sentiment_by(): This function assigned each comment an average polarity score (below zero for negative, near zero for neutral, above zero for positive), giving me an overview of the general tone of the discussions.
emotion_by(): Even more powerfully, this function detected the presence of 8 fundamental emotions (joy, sadness, anger, fear, etc.). It was this function that enabled me to understand the emotional charge behind the words.
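The two sentimentr calls described above can be sketched as follows. The sample comments are hypothetical, and the aggregation step at the end is one possible way to summarize the output rather than the original script. The sketch passes text with its punctuation intact, on the assumption that sentimentr needs sentence boundaries for get_sentences().

```r
library(sentimentr)

# Hypothetical comments standing in for the scraped Reddit text.
comments <- c(
  "I am scared this pain will never go away.",
  "Physical therapy helped and I finally feel hopeful again.",
  "Another sleepless night. I feel so sad and frustrated."
)

# sentimentr works sentence by sentence, so split the text first.
sentences <- get_sentences(comments)

# One row per comment, with the average polarity in ave_sentiment.
sentiment_scores <- sentiment_by(sentences)
print(sentiment_scores)

# One row per comment and emotion type, with counts in emotion_count.
emotions <- as.data.frame(emotion_by(sentences))

# Sum the counts over all comments to see which emotions dominate.
totals <- aggregate(emotion_count ~ emotion_type, data = emotions, FUN = sum)
totals <- totals[order(-totals$emotion_count), ]
print(head(totals, 8))
```

Note that emotion_by() also reports negated variants of each emotion (for example "fear_negated"), so a real summary may want to filter or handle those rows separately.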
The 'Aha Moment': Beyond the Pain, What's the Real Story?
The overall analysis immediately revealed a critical insight. While general sentiment is mixed, the emotional palette is dominated by negativity, fear, and sadness.
Key Observation: The problem of back pain is not just a physical issue. It is, first and foremost, an emotional suffering that profoundly impacts daily life.
Distribution of Emotions in Comments: Sadness and fear are the emotions the audience expresses most frequently.