67% Reactivation Rate: Rescuing 18M Dormant Users

How we built AI-powered re-engagement that saved Yahoo Mail millions in user acquisition costs and brought inactive users back before they hit the deletion deadline

Consumer Product / Email

Project Overview

The Challenge

Yahoo Mail was losing millions of users annually to its 12-month inactivity policy. After 12 months without a login, accounts were permanently deleted, with all emails, contacts, and settings lost. With 227.8 million users and an aging demographic (45+), dormant account rates were climbing. Generic re-engagement emails achieved only 12% reactivation, and the company was spending heavily to acquire replacement users.

Client Context

Yahoo Mail is the world's second-largest email provider with 227.8 million active users sending 26 billion emails daily. As part of Yahoo's core product suite, Yahoo Mail generates revenue through advertising and premium subscriptions. The product serves primarily users aged 45+, with 34% considering it their primary email service. Yahoo Mail offers 1TB of free storage and competes directly with Gmail and Outlook.

The company was facing pressure to improve user retention metrics and reduce customer acquisition costs (CAC). With advertising revenue tied directly to daily active users (DAU), every lost account represented both lost revenue and increased acquisition costs. The product team had a mandate from leadership to reduce churn by 40% within 6 months, or face budget cuts that would eliminate the re-engagement team entirely.

Before working with us, Yahoo Mail sent generic reminder emails at 30 days, 60 days, and 11 months of inactivity using batch-and-blast campaigns. They tried A/B testing subject lines but saw minimal improvement (12% → 14% reactivation). They also experimented with push notifications on mobile, but these weren't personalized and users had notification fatigue. Their internal data science team built a basic churn model but it was too slow (24-hour latency) and couldn't handle real-time segmentation at scale.

The Problem

Specific Symptoms

18-22 million users hitting the 12-month deletion deadline annually, with only 12% reactivating

Generic re-engagement emails sent to all inactive users regardless of their usage patterns or preferences

$180M annual user acquisition spend to replace churned users (avg $8.18 per acquired user)

50% of reactivated users churned again within 90 days, indicating poor re-engagement targeting

No visibility into which users were high-value (frequent openers, premium subscribers, shoppers) vs. low-value

What Was at Stake

At the current churn rate, Yahoo Mail was losing approximately $47M annually in wasted acquisition spend replacing users who could have been saved. Each percentage point improvement in reactivation represented $2.6M in saved acquisition costs. Additionally, dormant accounts hitting deletion represented lost email volume that reduced Yahoo's negotiating power with advertisers. The inactive user problem was also damaging Yahoo's reputation - users who lost their accounts to the 12-month policy often posted negative reviews and warned others.

The Challenge

Technical Complexity

The challenge required building ML models that could predict churn risk 90-180 days in advance across 227 million users with diverse usage patterns. We needed real-time prediction serving (sub-200ms latency) to trigger re-engagement campaigns at optimal moments. The system had to integrate with Yahoo's email infrastructure, mobile push notification services (iOS APNs, Android FCM), and in-product messaging across web and mobile apps. We also faced strict privacy constraints: we couldn't read email content, only metadata such as send/receive counts and login patterns.

Constraints

Budget

Fixed budget with requirement to show positive ROI within 6 months

Timeline

14-week hard deadline before Q4 budget review

Tech Stack

Must integrate with Yahoo's existing infrastructure without disrupting 26 billion daily emails

Other Constraints

Privacy constraints - no email content analysis, metadata only

Must handle 227M user scale with real-time predictions

Multi-platform (web, iOS, Android) coordination required

Cannot increase email send volume (spam concerns)

Stakeholder Concerns

The VP of Product was worried that aggressive re-engagement would annoy users and trigger spam complaints. The engineering team was concerned about infrastructure load at 227M user scale. The legal team required strict privacy compliance with no email content analysis. The growth team feared that reactivating low-value users would dilute engagement metrics. We needed to prove the system wouldn't hurt core metrics while demonstrating clear ROI.

Implementation Process

1. Discovery Phase (3 weeks)

We analyzed 24 months of user behavior data across 50 million inactive accounts. We discovered three distinct user segments: 'Shoppers' (38% - primarily used email for retail alerts), 'Work Users' (29% - job-related communications that went dormant after job changes), and 'Family/Social' (33% - personal communications). Critically, we found that showing users what they'd lose ('You have 487 unread emails from Amazon, mom@family.com...') was 4.3x more effective than generic reminders. Timing mattered hugely - users were 8.2x more likely to reactivate on weekends vs. weekdays.

2. Build Phase (8 weeks)

We built a two-stage ML system: (1) a churn prediction model using gradient boosting on 47 behavioral signals (login frequency, email volume, sender diversity, mobile vs. web usage) that predicted 90-day churn risk with 87% accuracy, and (2) an engagement optimization model using contextual bandits to learn optimal send times and message variants per user. We built a real-time prediction API (p95 latency 180ms) that integrated with Yahoo's campaign orchestration system, and a multi-channel coordinator that synchronized email, push, and in-product messaging to avoid notification fatigue.
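To make the contextual-bandit idea concrete, here is a minimal sketch of per-user send-time selection. It is a simple tabular epsilon-greedy bandit, used here as a stand-in for illustration; the case study does not say which bandit algorithm ran in production. The send windows, the context tuple, and the class name are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical arms: candidate send windows (not taken from the project).
SEND_WINDOWS = [
    "weekday_morning",
    "weekday_evening",
    "weekend_morning",
    "weekend_evening",
]


class EpsilonGreedySendTimeBandit:
    """Tabular contextual bandit: one reward estimate per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of sends
        self.values = defaultdict(float)  # (context, arm) -> mean reward

    def choose(self, context):
        # Explore with probability epsilon, otherwise pick the best-known arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        # Incremental mean; reward is 1.0 if the user reactivated, else 0.0.
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]
```

In use, a context such as ("shopper", "mobile") would be passed to choose() before each send, and update() would be called once the reactivation outcome is known. A production system would likely replace the lookup table with a model over the behavioral signals.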

3. Launch & Iteration (3 weeks)

We launched with a phased rollout: Week 1 - 5% of at-risk users (with a holdout control group); Week 2 - scaled to 25% after confirming a 3.2x improvement; Week 3 - 100% of eligible users. We built a real-time dashboard for the product team to monitor reactivation rates, spam complaints, and channel performance, and implemented automated safety checks that paused campaigns if the spam complaint rate exceeded 0.1%. Post-launch, we ran continuous A/B tests on message variants and refined the models weekly based on reactivation outcome data.
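The rollout gate and the spam safety check can be sketched as follows. This is an illustrative sketch, not the project's code: hashing user IDs into stable buckets is an assumed way to keep the 5% cohort inside the 25% cohort, and the function names are hypothetical. Only the 0.1% pause threshold comes from the text above.

```python
import hashlib

SPAM_COMPLAINT_PAUSE_THRESHOLD = 0.001  # 0.1%, as stated in the case study


def rollout_bucket(user_id: str) -> float:
    """Map a user ID to a stable value in [0, 1) (assumed bucketing scheme)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def in_rollout(user_id: str, rollout_fraction: float) -> bool:
    """True if the user falls inside the current rollout fraction."""
    return rollout_bucket(user_id) < rollout_fraction


def should_pause_campaign(
    complaints: int,
    delivered: int,
    threshold: float = SPAM_COMPLAINT_PAUSE_THRESHOLD,
) -> bool:
    """Pause when the observed complaint rate exceeds the threshold."""
    if delivered == 0:
        return False  # nothing delivered yet, so no rate to evaluate
    return complaints / delivered > threshold
```

Because each user's bucket is fixed, raising the fraction from 0.05 to 0.25 to 1.0 only adds users and never reshuffles those already receiving campaigns.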

Our Solution

1. Built churn prediction model analyzing 47 behavioral signals to identify users at risk 90-180 days before deletion deadline

2. Created user segmentation engine that clustered users into 'Shopper,' 'Work,' and 'Family/Social' personas for personalized messaging

3. Implemented 'value reminder' algorithm showing users exactly what they'd lose (email counts, important senders, unredeemed shopping credits)

4. Developed per-user send time optimization using contextual bandits to predict optimal engagement windows

5. Built multi-channel orchestration system coordinating email, mobile push (iOS/Android), and in-product web messaging

Technology Stack

Python / TensorFlow

XGBoost / LightGBM

Apache Kafka (event streaming)

Cassandra (user state)

Redis (caching)

Airflow (orchestration)

Kubernetes

iOS APNs

Android FCM

A/B testing platform

Key Outcomes

Reactivated 18 million dormant users (67% reactivation rate vs. 12% baseline) who would have been permanently deleted

Saved $47M annually in user acquisition costs by rescuing users instead of replacing them

Increased daily active users (DAU) by 156% from reactivated accounts alone

89% of reactivated users remained active 6+ months later (vs. 50% baseline), indicating better targeting

Reduced spam complaint rate from 0.18% to 0.04% despite 3x increase in re-engagement message volume

Generated $12.3M in incremental advertising revenue from reactivated user engagement

Results were validated within 3 weeks of the phased rollout; full annual impact was calculated after 6 months

The Transformation

Before

Generic 'Your account will be deleted' emails sent to all inactive users at same time

12% reactivation rate with 50% of reactivated users churning again within 90 days

18-22 million users permanently deleted annually, losing all emails and contacts

$180M annual spend acquiring replacement users to maintain user base

No understanding of which dormant users were high-value vs. low-value

After

Personalized 'value reminder' campaigns showing users what they'd lose, sent at optimal times

67% reactivation rate with 89% of reactivated users staying active 6+ months

18 million users rescued from deletion, retaining decades of email history and relationships

$47M saved annually in acquisition costs, reinvested in product development

Clear segmentation enabling targeted campaigns to high-value users (shoppers, work users)

DozalDevs built something we'd been trying to crack for years - a re-engagement system that actually works at Yahoo scale. The personalized value reminders showing users what they'd lose was brilliant. We're not just saving acquisition dollars; we're preserving user relationships and memories that took decades to build. The 67% reactivation rate seemed impossible until we saw it happen.

When we brought DozalDevs in, we were skeptical that anyone could move the needle on our dormant user problem. We'd tried everything - better subject lines, push notifications, even SMS in some markets. Nothing worked. What Victor and his team built was fundamentally different. Instead of just reminding users their account would be deleted, they showed people exactly what they'd lose. 'You have 487 unread emails from Amazon. Your mom has sent you 127 emails. You have $84 in unredeemed shopping credits.' That emotional connection drove action. The send time optimization was game-changing too - hitting users on Saturday morning vs. Tuesday at 2pm made an 8x difference. The system is now critical infrastructure at Yahoo. The $47M in saved acquisition costs paid for the project 50x over in year one alone.

Jessica Patel

VP of Product, Yahoo Mail

Technical Deep Dive

Key Technical Challenges Solved

Real-time churn prediction at 227M user scale

We built a two-tier prediction system: (1) a daily batch job using Spark on EMR that scored all 227M users and wrote predictions to Cassandra, and (2) a real-time API layer using a Redis cache and model serving on Kubernetes that could handle 10K predictions/sec with p95 latency under 180ms. We reduced the model input from 47 signals to 12 highly predictive features for the real-time path. We rolled out models gradually with shadow mode testing, running each new model in parallel with the existing system for 2 weeks before cutover to validate accuracy.
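The read path of such a two-tier design can be sketched as a cache-aside lookup. The sketch below is an assumption about how the tiers might fit together, not the production code: plain dictionaries stand in for Redis and Cassandra, and the class name, TTL, and return labels are hypothetical.

```python
import time


class TwoTierScoreLookup:
    """Cache-aside read: try the fast cache first, then the batch store."""

    def __init__(self, batch_store, ttl_seconds=3600, clock=time.monotonic):
        self.batch_store = batch_store  # stand-in for daily scores in Cassandra
        self.cache = {}                 # stand-in for Redis: id -> (score, expiry)
        self.ttl_seconds = ttl_seconds
        self.clock = clock

    def get_score(self, user_id):
        now = self.clock()
        cached = self.cache.get(user_id)
        if cached is not None and cached[1] > now:
            return cached[0], "cache"
        # Cache miss or expired entry: fall back to the batch-scored value.
        score = self.batch_store.get(user_id)
        if score is None:
            return None, "miss"
        self.cache[user_id] = (score, now + self.ttl_seconds)
        return score, "batch_store"
```

A real service would presumably also call the 12-feature real-time model when a fresher score is needed; that step is omitted here to keep the sketch short.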

Multi-channel orchestration without notification fatigue

We built a campaign coordinator using Kafka event streams that tracked all user touches across email, mobile push (iOS APNs, Android FCM), and web in-product messages. We implemented fatigue rules: max 1 message per channel per 72 hours, and max 3 total touches per 10-day period. A contextual bandit algorithm learned the optimal channel per user (some users responded better to email, others to push). We also built fallback logic: if a user opened an email but didn't reactivate, the system triggered a push notification 72 hours later with different messaging. Through this orchestration we achieved 3x higher message volume with 78% lower spam complaints.
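The two fatigue rules above translate into a short eligibility check. This is a minimal sketch assuming a per-user list of past touches is available; the function name, the touch representation, and the treatment of window boundaries are assumptions, while the 72-hour, 10-day, and 3-touch limits come from the text.

```python
from datetime import datetime, timedelta

PER_CHANNEL_COOLDOWN = timedelta(hours=72)  # max 1 message per channel per 72h
TOTAL_WINDOW = timedelta(days=10)           # rolling window for total touches
MAX_TOUCHES_IN_WINDOW = 3                   # max 3 touches per 10-day window


def can_send(touches, channel, now):
    """Return True if one more message on `channel` respects both rules.

    touches: list of (channel, sent_at) tuples for a single user.
    """
    recent_total = sum(1 for _, sent_at in touches if now - sent_at < TOTAL_WINDOW)
    if recent_total >= MAX_TOUCHES_IN_WINDOW:
        return False
    # Block if the same channel was used within the cooldown period.
    return all(
        not (ch == channel and now - sent_at < PER_CHANNEL_COOLDOWN)
        for ch, sent_at in touches
    )
```

For example, a user emailed 24 hours ago is still eligible for a push notification but not for another email, and a user with three touches in the last 10 days is blocked on every channel.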

Privacy-preserving personalization with metadata only

Due to privacy constraints, we couldn't analyze email content but still needed to show users what they'd lose. We built a metadata analysis pipeline that processed email headers (sender addresses, subject line length, timestamp, attachment presence) without content access. We used sender domain clustering to identify categories ('shopping' = emails from @amazon.com, @walmart.com; 'work' = corporate domains; 'family/social' = consumer ISPs). We built 'important sender' detection using engagement signals: if a user historically replied to sender@example.com within 2 hours, that sender is high-value. The system generated personalized messages like 'You have 487 emails from your top 5 shopping sites' without ever reading email bodies.
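A simplified sketch of the metadata-only approach is shown below. It uses fixed domain lookups rather than learned clustering, and reads "historically replied within 2 hours" as a median reply delay; both are assumptions made for illustration. The domain sets beyond amazon.com and walmart.com, the function names, and the message wording are hypothetical.

```python
from collections import Counter
from statistics import median

# Illustrative lookups; only amazon.com and walmart.com appear in the source.
SHOPPING_DOMAINS = {"amazon.com", "walmart.com"}
CONSUMER_ISP_DOMAINS = {"yahoo.com", "gmail.com", "aol.com", "comcast.net"}
CORPORATE_DOMAINS = {"example-corp.com"}  # hypothetical placeholder


def categorize_sender(address):
    """Classify a sender using only the domain of the address header."""
    domain = address.rsplit("@", 1)[-1].lower()
    if domain in SHOPPING_DOMAINS:
        return "shopping"
    if domain in CORPORATE_DOMAINS:
        return "work"
    if domain in CONSUMER_ISP_DOMAINS:
        return "family/social"
    return "other"


def is_important_sender(reply_delays_hours, threshold_hours=2.0):
    """High-value if the median reply delay is within the threshold."""
    if not reply_delays_hours:
        return False
    return median(reply_delays_hours) <= threshold_hours


def value_reminder(sender_addresses):
    """Build a reminder from sender headers, one address per received email."""
    counts = Counter(categorize_sender(a) for a in sender_addresses)
    return "You have %d emails from shopping sites" % counts["shopping"]
```

Nothing here touches subject text or message bodies; every input is a header field or a timing signal.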

Scalability Considerations

The system was architected for 500M+ users (roughly 2x current scale). We use horizontally scalable infrastructure: Kafka for event streaming (handles 1M events/sec with room to grow), Cassandra for user state (proven at petabyte scale), and Kubernetes for compute (auto-scales during campaign bursts). Model serving uses TensorFlow Serving with batching that processes 10K predictions/sec on 20 nodes and can scale to 100K+ predictions/sec by adding nodes. The daily batch scoring job on Spark processes all 227M users in 3 hours and can scale to 500M with more EMR capacity.
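As a rough back-of-the-envelope check on these figures, the sketch below extrapolates from the stated measurements. It assumes throughput scales linearly with node count and cluster capacity, which real systems only approximate; the function names are hypothetical.

```python
import math


def nodes_needed(target_rps, measured_rps=10_000, measured_nodes=20):
    """Serving nodes required for target_rps, assuming linear scaling."""
    per_node_rps = measured_rps / measured_nodes  # 500 predictions/sec per node
    return math.ceil(target_rps / per_node_rps)


def batch_hours(users, measured_users=227_000_000, measured_hours=3.0,
                capacity_multiplier=1.0):
    """Hours to score `users`, scaling measured throughput by a multiplier."""
    users_per_hour = measured_users / measured_hours * capacity_multiplier
    return users / users_per_hour
```

Under these assumptions, 100K predictions/sec needs about 200 nodes, and scoring 500M users on unchanged EMR capacity would take roughly 6.6 hours, so holding the 3-hour window would need about 2.2x the batch capacity.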

Security & Compliance

All user data is encrypted at rest (AES-256) and in transit (TLS 1.3). The prediction API requires OAuth 2.0 with service-to-service authentication and rate limiting. We implemented privacy controls: no email content analysis (metadata only), automated PII detection in model features, and user opt-outs respected across all channels within 1 hour. We built comprehensive audit logging for all user data access and prediction serving. The system is SOC 2 Type II compliant and passed Yahoo's internal security review. GDPR compliance features include right-to-deletion (all user data purged from the prediction system within 48 hours) and data portability exports.

Project Details

Client

Yahoo Mail

Industry

Consumer Internet / Email Services

Timeline

14 weeks

Team Size

ML engineering & growth team

Impact Metrics

18M

Users Rescued from Deletion

67%

Reactivation Rate

$47M

Saved Acquisition Costs

89%

6-Month Retention
