Strava App Review Sentiment Analysis
What do 2,000 Google Play reviews actually say about Strava? Using TF-IDF and Random Forest models in R, I identified the language patterns that separate satisfied users from frustrated ones — and what product teams should do about it.
Model Performance
Three-Model Comparison
| Model | Features | Accuracy | AUC |
|---|---|---|---|
| Random Forest | Behavioral only (length, date, version…) | 75.5% | 0.822 |
| TF-IDF + LASSO | Text tokens (logistic regression) | 82.0% | 0.893 |
| Combination Model (best) | TF-IDF score + behavioral (Random Forest) | 83.25% | 0.9124 |
Combining the TF-IDF sentiment score with behavioral features gave the strongest results. Each step down the table improved both accuracy and AUC, with the combination model reaching 83.25% accuracy and an AUC of 0.9124.
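For readers less familiar with the two metrics in the table, the sketch below shows how accuracy and AUC can be computed from predicted scores and true labels. It is an illustration in Python, not the R code behind this analysis; the function names, the 0.5 threshold, and the toy data are all assumptions made for the example.

```python
def accuracy(labels, scores, threshold=0.5):
    """Share of reviews whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def auc(labels, scores):
    """Probability that a random Good review outranks a random Bad one.

    Ties count as half a win. This pairwise form is O(n^2) but easy to read.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data invented for illustration: 1 = Good review, 0 = Bad review.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = auc(toy_labels, toy_scores)
print(f"accuracy={toy_accuracy:.3f}, auc={toy_auc:.3f}")
```

Accuracy depends on the chosen threshold, while AUC only depends on how well the scores rank Good reviews above Bad ones, which is why the two numbers can move at different rates between models.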
Combination Model
Feature Importance
What the Random Forest weighted most when combining text and behavioral signals
The TF-IDF sentiment score dominates with an importance of 350.9, nearly 5× the next feature, which indicates that the wording of a review is by far the strongest predictor of its sentiment label. Secondary features such as hour of day, review length, and LDA topic scores add a smaller but meaningful signal.
LASSO Coefficients
Top 20 Sentiment Tokens
Larger bars = stronger predictive weight toward that sentiment class
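To make the link between token weights and predictions concrete, here is a minimal Python sketch of TF-IDF weighting followed by a logistic score. It is not the R pipeline used for this analysis. The tokenizer, the three example reviews, and the coefficient values are hypothetical and chosen only for illustration; the real coefficients come from the fitted LASSO model.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,!?\"'()") for word in text.lower().split())
    return [t for t in tokens if t]


def tf_idf(documents):
    """Return one {token: weight} dict per document, using tf * ln(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many reviews each token appears at least once.
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            token: (count / total) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return weights


def sentiment_score(weights, coefficients, intercept=0.0):
    """Logistic score: sigmoid(intercept + sum of coefficient * tf-idf weight)."""
    linear = intercept + sum(
        coefficients.get(token, 0.0) * weight for token, weight in weights.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical reviews and coefficients, invented for this example only.
reviews = [
    "Love the community, friends keep me motivated",
    "App crashes on login, server error again",
    "Great tracking on the trail",
]
hypothetical_coefficients = {
    "community": 4.0, "friends": 3.0,   # positive tokens push the score up
    "crashes": -5.0, "login": -3.0,     # negative tokens push it down
}

weights = tf_idf(reviews)
scores = [sentiment_score(w, hypothetical_coefficients) for w in weights]
for review, score in zip(reviews, scores):
    print(f"{score:.2f}  {review}")
```

Because LASSO shrinks most coefficients to exactly zero, tokens missing from the coefficient table contribute nothing, so a review with no selected tokens lands at the intercept-only score (0.5 here).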
LDA Topic Modeling
6 Latent Topics
Latent Dirichlet Allocation surfaced six distinct conversation clusters across all 2,000 reviews. Topic scores were used as features in the Random Forest and combination models.
Topic 1
Bug Reports & Account Issues
Topic 2
Performance Tracking
Topic 3
Social & Wellness
Topic 4
GPS & Activity Recording
Topic 5
Monetization & Paywall
Topic 6
Positive Experience & Sharing
So What
What This Means for Strava
The data points to a clear strategic gap: the fitness experience is loved, but infrastructure failures are destroying it.
Product
Fix infrastructure first
Auth flow, upload reliability, and crash reduction dominate negative reviews. These aren't feature gaps — they're broken foundations that undermine the entire experience for every user.
Marketing
Double down on social fitness
"Friends", "motivates", and "community" are among the strongest positive predictors. Strava's unique angle isn't tracking — it's the social accountability layer. Lead with that.
Insight
Complaints aren't about fitness
Not a single fitness-related word appears in the top negative tokens. Users aren't unhappy with the workout experience — they're frustrated by sign-in errors, server failures, and crashes.
Insight
The core value prop works
"Tracking", "fitness", "exercise", and "trail" all drive positive sentiment. When Strava works, users love exactly what it's supposed to do. The product vision is validated — execution is the issue.
Methodology
Data Collection
2,000 Google Play reviews scraped and labeled (Bad: 1–3 stars, Good: 4–5 stars)
Feature Engineering
Review length, word count, time/date features, app version, and season extracted
Random Forest
Behavioral features only — accuracy 75.5%, AUC 0.822
TF-IDF + LASSO
Text vectorization + logistic regression — accuracy 82.0%, AUC 0.893
Topic Modeling (LDA)
Latent Dirichlet Allocation applied to review text — 6 latent topics extracted and used as features
Combination Model
TF-IDF sentiment score fed into Random Forest alongside behavioral features — accuracy 83.25%, AUC 0.9124
Findings
Feature importance and LASSO coefficients reveal what language and behavioral signals drive each sentiment class
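The labeling rule and the behavioral features listed above can be sketched as follows. This is a Python illustration under stated assumptions, not the original R code: the function names, the example review, the version string, and the season definition (meteorological, Northern Hemisphere) are all hypothetical, since the write-up does not specify them.

```python
from datetime import datetime


def label_review(stars):
    """Map a star rating to the binary label: 1-3 stars = Bad, 4-5 stars = Good."""
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected star rating: {stars!r}")
    return "Good" if stars >= 4 else "Bad"


def season_of(month):
    """Assumed definition: meteorological seasons, Northern Hemisphere."""
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "autumn"


def behavioral_features(text, posted_at, app_version):
    """Features that describe the review without using its vocabulary."""
    return {
        "review_length": len(text),        # characters
        "word_count": len(text.split()),   # whitespace-separated words
        "hour": posted_at.hour,            # hour of day, 0-23
        "month": posted_at.month,
        "season": season_of(posted_at.month),
        "app_version": app_version,
    }


# Hypothetical review, timestamp, and version string for illustration.
example = behavioral_features(
    "Keeps logging me out after every update",
    datetime(2024, 7, 14, 6, 30),
    "1.0.0",
)
example["label"] = label_review(2)
print(example)
```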
Analysis conducted in R using the tidytext, randomForest, and glmnet packages. 2,000 Google Play reviews collected and labeled by star rating.