NLP · Sentiment Analysis · R · TF-IDF · Random Forest

Strava App Review Sentiment Analysis

What do 2,000 Google Play reviews actually say about Strava? Using TF-IDF and Random Forest models in R, I identified the language patterns that separate satisfied users from frustrated ones — and what product teams should do about it.

Model Performance

Model Comparison

Model	Features	Accuracy	AUC
Random Forest	Behavioral only (length, date, version…)	75.5%	0.822
TF-IDF + LASSO	Text tokens (logistic regression)	82.0%	0.893
Combination Model (best)	TF-IDF score + behavioral (Random Forest)	83.25%	0.9124

Combining TF-IDF sentiment scores with behavioral features yielded the strongest results. Accuracy and AUC both improved at each step, from the behavioral-only Random Forest to TF-IDF + LASSO to the combination model, which reached 83.25% accuracy and an AUC of 0.9124.
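For readers less familiar with the two metrics in the table, here is a minimal sketch of how accuracy and AUC can be computed from predicted scores. The analysis itself was done in R; this uses plain Python only to make the definitions concrete. The toy labels, the scores, and the 0.5 decision threshold are all assumptions for illustration, not values from the Strava data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = Good, 0 = Bad.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]   # predicted probability of "Good"
y_pred = [int(s >= 0.5) for s in scores]   # 0.5 threshold is an assumption

print(accuracy(y_true, y_pred))
print(auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC only depends on how the scores rank the reviews, which is why the two are reported side by side.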

Combination Model

Feature Importance

What the Random Forest weighted most when combining text and behavioral signals

Feature	Importance
TF-IDF Score	350.9
Hour of Day	70.5
Review Length	58.7
Topic 1	48.7
Word Count	45.9
Time of Day	25.8
Topic 2	21.6
Topic 5	19.6
App Version	18.4
Topic 6	10.9
Topic 4	9.6
Topic 3	9.0
Weekend	3.3
Season	2.2

The TF-IDF sentiment score dominates at 350.9 — nearly 5× the next feature — confirming that text language is the primary driver of review sentiment. Behavioral features like hour of day, review length, and topic clusters provide meaningful secondary signal.
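The "nearly 5×" figure can be checked directly from the reported importance values. The short sketch below (plain Python, used only for the arithmetic) also computes the TF-IDF score's share of the summed importance. The source does not say which importance measure the Random Forest reported, so the share is simple arithmetic on the published numbers, not a statement about explained variance.

```python
# Importance values copied from the feature-importance chart above.
importance = {
    "TF-IDF Score": 350.9, "Hour of Day": 70.5, "Review Length": 58.7,
    "Topic 1": 48.7, "Word Count": 45.9, "Time of Day": 25.8,
    "Topic 2": 21.6, "Topic 5": 19.6, "App Version": 18.4,
    "Topic 6": 10.9, "Topic 4": 9.6, "Topic 3": 9.0,
    "Weekend": 3.3, "Season": 2.2,
}

ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
(top_name, top_value), (next_name, next_value) = ranked[0], ranked[1]

ratio_to_next = top_value / next_value                 # 350.9 / 70.5
share_of_total = top_value / sum(importance.values())  # 350.9 / 695.1

print(f"{top_name} vs {next_name}: {ratio_to_next:.2f}x")
print(f"{top_name} share of summed importance: {share_of_total:.1%}")
```

The ratio comes out at about 4.98, and the TF-IDF score accounts for roughly half of the summed importance across all fourteen features.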

LASSO Coefficients

Top 20 Sentiment Tokens

Larger bars = stronger predictive weight toward that sentiment class

LDA Topic Modeling

6 Latent Topics

Latent Dirichlet Allocation surfaced six distinct conversation clusters across all 2,000 reviews. Topic scores were used as features in the Random Forest and combination models.
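To show what "topic scores as features" can look like in practice, here is a small collapsed Gibbs sampler for LDA in plain Python. It is a teaching sketch under stated assumptions, not the implementation used in this project: the source does not name the LDA package or its hyperparameters, so `alpha`, `beta`, `n_iter`, and the toy documents are all hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]              # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics                            # tokens per topic
    assign = []                                             # topic of every token

    # Give every token a random starting topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document: one "topic score" per topic.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


# Hypothetical, pre-tokenized reviews for illustration only.
docs = [
    "login account error crash".split(),
    "crash login error account".split(),
    "run pace track friends".split(),
    "friends track run pace".split(),
]
theta = lda_gibbs(docs, n_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Each row of `theta` sums to 1 and can be appended to a review's feature vector, one column per topic, which is the role the six topic scores play in the models described here.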

Topic 1

Bug Reports & Account Issues

phone · account · record · activity · issue · data · fix · update

Topic 2

Performance Tracking

track · running · accurate · tracking · progress · activities · easy · pace

Topic 3

Social & Wellness

nice · social · fitness · health · application · tracking · exercise · connect

Topic 4

GPS & Activity Recording

time · distance · running · tracking · activity · recording · gps · rides

Topic 5

Monetization & Paywall

subscription · pay · features · free · paid · paywall · premium · data

Topic 6

Positive Experience & Sharing

love · free · features · garmin · trial · map · money · share

So What

What This Means for Strava

The data points to a clear strategic gap: the fitness experience is loved, but infrastructure failures are destroying it.

Product

Fix infrastructure first

Auth flow, upload reliability, and crash reduction dominate negative reviews. These aren't feature gaps — they're broken foundations that undermine the entire experience for every user.

Marketing

Double down on social fitness

"Friends", "motivates", and "community" are among the strongest positive predictors. Strava's unique angle isn't tracking — it's the social accountability layer. Lead with that.

Insight

Complaints aren't about fitness

Not a single fitness-related word appears in the top negative tokens. Users aren't unhappy with the workout experience — they're frustrated by sign-in errors, server failures, and crashes.

Insight

The core value prop works

"Tracking", "fitness", "exercise", and "trail" all drive positive sentiment. When Strava works, users love exactly what it's supposed to do. The product vision is validated — execution is the issue.

Methodology

Data Collection

2,000 Google Play reviews scraped and labeled (Bad: 1–3 stars, Good: 4–5 stars)
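The labeling rule is simple enough to state as code. This is a minimal Python sketch of the star-to-label mapping described above; the function name and the range check are illustrative additions, not part of the original R pipeline.

```python
def label_review(stars):
    """Map a 1-5 star rating to the binary label: 1-3 is Bad, 4-5 is Good."""
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    return "Good" if stars >= 4 else "Bad"


labels = [label_review(s) for s in [1, 2, 3, 4, 5]]
print(labels)  # ['Bad', 'Bad', 'Bad', 'Good', 'Good']
```

Note that three-star reviews are grouped with the negative class, so "Bad" here means "not clearly satisfied" as much as "unhappy".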

Feature Engineering

Review length, word count, time/date features, app version, and season extracted
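As a rough illustration of this step, the sketch below builds one row of behavioral features in plain Python. The source lists the features but not how they were defined, so the time-of-day cut points, the Northern Hemisphere season mapping, the field names, and the sample review are all assumptions.

```python
from datetime import datetime


def time_of_day(hour):
    """Bucket an hour (0-23) into a coarse part of day (cut points are assumed)."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def season(month):
    """Northern Hemisphere meteorological season (an assumption)."""
    return {12: "winter", 1: "winter", 2: "winter",
            3: "spring", 4: "spring", 5: "spring",
            6: "summer", 7: "summer", 8: "summer"}.get(month, "fall")


def behavioral_features(text, posted_at, app_version):
    """Build one row of non-text features for a review."""
    return {
        "review_length": len(text),            # characters
        "word_count": len(text.split()),
        "hour_of_day": posted_at.hour,
        "time_of_day": time_of_day(posted_at.hour),
        "weekend": posted_at.weekday() >= 5,   # Saturday = 5, Sunday = 6
        "season": season(posted_at.month),
        "app_version": app_version,
    }


# Hypothetical review and version string, for illustration only.
row = behavioral_features(
    "App keeps crashing on upload",
    datetime(2024, 7, 6, 21, 15),
    "360.10",
)
print(row)
```

None of these features look at what the review says, which is why the behavioral-only model serves as a useful baseline for the text models.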

Random Forest

Behavioral features only — accuracy 75.5%, AUC 0.822

TF-IDF + LASSO

Text vectorization + logistic regression — accuracy 82.0%, AUC 0.893
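The text-vectorization half of this step can be sketched in a few lines. The Python below uses the definition that tidytext documents for `bind_tf_idf` (term frequency as a share of the document, inverse document frequency as a natural log); the source does not say which function was actually called, and the tokenizer and sample reviews are assumptions. The LASSO-penalized logistic regression that consumes these weights is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / total terms in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    n_docs = len(docs)
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        weights.append({
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        })
    return weights


# Hypothetical reviews, for illustration only.
reviews = [
    "Love the app, love the community",
    "Login error, the app keeps crashing",
    "Great tracking, the app motivates me",
]
weights = tf_idf(reviews)
for w in weights:
    print({t: round(v, 3) for t, v in sorted(w.items())})
```

Words that appear in every review, such as "the" and "app" here, get a weight of zero, so the model's attention goes to the terms that distinguish one review from another.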

Topic Modeling (LDA)

Latent Dirichlet Allocation applied to review text — 6 latent topics extracted and used as features

Combination Model

TF-IDF sentiment score fed into Random Forest alongside behavioral features — accuracy 83.25%, AUC 0.9124
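When one model's output becomes another model's input, a common precaution is to generate that output out-of-fold, so the second model never trains on scores the first model produced for reviews it had already seen. The sketch below shows the idea in plain Python. It assumes the "TF-IDF sentiment score" is the text model's predicted score per review, which the source implies but does not spell out, and the source does not say whether out-of-fold scoring was used. `fit_word_rates` and `predict_word_rates` are hypothetical stand-ins for the real text model.

```python
import random
from collections import Counter


def fit_word_rates(texts, labels):
    """Stand-in text model: share of Good reviews among those using each word."""
    seen, good = Counter(), Counter()
    for text, label in zip(texts, labels):
        for word in set(text.lower().split()):
            seen[word] += 1
            good[word] += label
    return {w: good[w] / seen[w] for w in seen}


def predict_word_rates(model, text):
    """Average the per-word rates; words unseen in training count as 0.5."""
    words = text.lower().split()
    return sum(model.get(w, 0.5) for w in words) / len(words)


def out_of_fold_scores(rows, labels, fit, predict, n_folds=5, seed=0):
    """Score every row with a model that never saw that row during fitting."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    scores = [None] * len(rows)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        for i in held_out:
            scores[i] = predict(model, rows[i])
    return scores


# Hypothetical reviews; 1 = Good, 0 = Bad.
texts = ["love the community", "login error again", "great tracking app",
         "app keeps crashing", "friends motivate me", "server error on upload"]
labels = [1, 0, 1, 0, 1, 0]

text_scores = out_of_fold_scores(texts, labels, fit_word_rates,
                                 predict_word_rates, n_folds=3)

# Each stacked row pairs the text score with behavioral features; rows like
# these would be the input to the second-stage Random Forest (not shown).
stacked = [
    {"text_score": s, "review_length": len(t), "word_count": len(t.split())}
    for s, t in zip(text_scores, texts)
]
for row in stacked:
    print(row)
```

If the text score were instead computed on the same reviews the text model was trained on, the Random Forest could lean on it more heavily than it deserves, which would inflate both its importance and the reported accuracy.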

Findings

Feature importance and LASSO coefficients reveal what language and behavioral signals drive each sentiment class

Analysis conducted in R using the tidytext, randomForest, and glmnet packages. 2,000 Google Play reviews collected and labeled by star rating.