What Finally Closed My English Speaking Gap
A non-native founder's honest story: two years of apps, courses, Netflix, and dictionaries that barely moved the needle — and the specific 90-day routine that did. The full playbook with what to do, what to skip, and what to build.
Quick answer: Two years of grammar apps, vocab apps, Netflix immersion, and YouTube polyglot advice barely moved my speaking. Then I switched to one simple daily protocol — 20 minutes of unscripted AI conversation with honest phoneme feedback, plus voice memo journaling — and broke through in 90 days. The tool that made the daily protocol practical at zero friction is what I ended up building: SpeakShark. This is the honest before-and-after, with the exact routine.
I'm Duy, the founder of SpeakShark. This post is the truth about my own English speaking journey — what wasted my time, what worked, and how I'd compress it now if I started over.
Where I started: B2 reading, A2 speaking
In early 2024 I had a problem most non-native English speakers know intimately: my English looked fine on paper.
- I could read tech news, Reddit threads, GitHub PRs without a dictionary
- I could write coherent emails and Slack messages in English
- I scored B2 on online placement tests
- I'd done 18 months of Duolingo, 6 months of Babbel, 4 months of Pimsleur
- My streak across all apps combined exceeded 600 days
And yet: on US client calls, I froze inside the first 30 seconds. I'd rehearse opening lines in my head, get them out fine, then collapse when the conversation went off-script. I'd switch to writing in chat because speaking was so much harder. My team's perception was that I "didn't speak English well" — despite years of "study."
The gap between what I could read/write (B2) and what I could speak (A2) was embarrassing. Worse, it was career-limiting. US clients didn't want to write everything down for me. Senior remote roles required calls. I was capping my own earning power by avoiding speaking.
The wasted year: what I tried first
For most of 2024 I attacked the problem the wrong way. Here's what I spent time on and what each actually delivered.
1. More grammar drills (3 months, ~60 hours)
I bought a Babbel year subscription and ground through their "college-level grammar course." Tense rules, articles, prepositions, conditionals.
Result: Better grammar scores on written tests. Zero improvement in spoken fluency. I could explain when to use past perfect; I still couldn't produce it in conversation.
Why it failed: Receptive knowledge ≠ productive ability. The grammar was already in my head — I just couldn't access it at speaking speed.
2. Watching English Netflix with subtitles (4 months daily, ~120 hours)
The internet's favorite recommendation. I watched everything: dramas, sitcoms, documentaries, news.
Result: Listening comprehension improved noticeably (from B1 to B2+). Speaking unchanged. Vocabulary expanded somewhat.
Why it failed: Watching is receptive. My mouth never produced English while watching. You can watch English 8 hours a day and never speak a sentence — and your speaking won't improve.
3. Reading English books and articles (continuous, ~300 hours over the year)
I read 12 English books in 2024. Plus daily Hacker News, Twitter, newsletters.
Result: Vocabulary jumped massively. Written English felt natural. Speaking still locked.
Why it failed: Same as Netflix — reception only. Reading expands what you can understand. It doesn't train what you can produce.
4. Vocabulary flashcards (2 months, ~40 hours)
Anki decks. Spaced repetition. "20 new words a day."
Result: I "knew" 2000+ new words. In speaking, I couldn't retrieve any of them. They sat in passive memory.
Why it failed: Flashcards train recognition, not retrieval-under-pressure. The vocab needed to be in production loops to become accessible.
5. Polyglot YouTube advice (sporadic, ~10 hours)
"How I learned English in 6 months" videos. Comprehensible input theory. Krashen.
Result: I felt informed. My speaking didn't improve.
Why it failed: Information ≠ action. Watching someone explain how to learn isn't learning. I watched a video about the importance of speaking practice and then... didn't speak.
Total time wasted: ~500 hours over 12 months. Speaking gap basically unchanged. Streak counter: high. Actual progress: minimal.
The realization that changed everything
In early 2025 a US client interrupted me on a call to ask if I wanted to switch to text. I felt the shame in real time. That night I sat down and made an honest list of what I'd actually done that year.
The list was almost entirely receptive activities. Reading. Watching. Listening. Studying.
The activity I'd done least: actually producing English out of my mouth, under cognitive load, for sustained periods.
I'd been doing the wrong sport. Grammar and vocab study trained me to be great at English tests. They didn't train me to speak, because speaking is a different physical and neurological skill from understanding.
This sounds obvious in retrospect. It was not obvious during the year I was doing it wrong. Every app, every YouTube video, every textbook recommendation pointed me toward more reception. The path felt productive because completion was rewarded constantly. The actual gap — production — got no attention from any of those resources.
The 90-day protocol that worked
In February 2025 I designed and ran a simple protocol on myself. Three components.
Component 1: 20 minutes of unscripted AI conversation, daily
This is non-negotiable. Twenty minutes of open speaking practice with an AI partner, every day, no exceptions.
At the time I used ChatGPT Voice (which I had via Plus). It was good enough for raw conversation but had no pronunciation feedback. I'd talk about whatever — my work, what I read that morning, a problem I was thinking through, a topic the AI picked.
The first week was brutal. I'd run out of things to say at 5 minutes. By week 3 the 20-minute sessions felt natural. By week 6 I was wishing they were longer.
Key rule I imposed on myself: don't restart when you make a mistake. Push forward. Recovery is the skill.
Component 2: Daily voice memo journaling, 5 minutes
Every evening: open phone voice memo app, hit record, summarize the day in English for 5 minutes.
Listen back the next morning. Take notes on what I heard myself doing wrong.
This was the highest-leverage feedback loop I had at the time. Hearing my own filler words ("um... like... you know...") made them impossible to ignore. Within 30 days my filler density dropped 60%.
Component 3: Weekly shadowing, 30 minutes
Once a week, 30 minutes of shadowing the All-In Podcast. (Yes, the All-In Pod — judge if you want, but the cadence is gold and the topics interested me.)
Play 30 seconds. Pause. Repeat aloud matching pace and intonation. Continue.
Shadowing didn't move my fluency the way Components 1 and 2 did. But it moved my intonation — the rhythm and stress patterns that signal "fluent native-like speaker." After 90 days of weekly shadowing, native speakers started saying "your English sounds different lately." It was the intonation, not the grammar.
What I cut
Equally important — what I stopped doing:
- Babbel cancelled in week 2
- Anki vocab decks paused (still paused)
- Netflix in English continued but I stopped pretending it was "practice"
- Grammar drills cut completely
- Polyglot YouTube videos cut completely
Total time investment of the new protocol: 25 minutes daily + 30 minutes weekly = about 3.4 hours/week. Roughly a third of the ~10 hours a week I'd been spending on receptive activities in 2024.
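For anyone who wants to check the time budget or plug in their own numbers, here is a minimal Python sketch of the arithmetic. The minute counts come from the three components above; the function name and the assumption of 12 shadowing sessions in 90 days are mine, for illustration only.

```python
# Time budget for the protocol, using the figures from this post.
DAILY_MINUTES = 20 + 5          # AI conversation + voice memo journal
WEEKLY_SHADOWING_MINUTES = 30   # one shadowing session per week


def weekly_hours(daily_minutes: int, weekly_minutes: int) -> float:
    """Hours per week for a daily habit plus one weekly session."""
    return (daily_minutes * 7 + weekly_minutes) / 60


def total_hours(days: int, daily_minutes: int,
                weekly_sessions: int, weekly_minutes: int) -> float:
    """Total hours over a run of `days` with a fixed number of weekly sessions."""
    return (days * daily_minutes + weekly_sessions * weekly_minutes) / 60


per_week = weekly_hours(DAILY_MINUTES, WEEKLY_SHADOWING_MINUTES)
# Assumption: 12 shadowing sessions fit into the 90 days.
ninety_days = total_hours(90, DAILY_MINUTES, 12, WEEKLY_SHADOWING_MINUTES)

print(f"{per_week:.1f} hours/week")        # 3.4 hours/week
print(f"{ninety_days:.1f} hours in total")  # 43.5 hours in total
```

Change the constants if your daily or weekly blocks are a different length.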
The 90-day results
Week-by-week, what changed.
Week 1: Brutal. I could barely sustain 5-minute sessions. Filler word density was embarrassing on the voice memos. The first time I listened back I almost stopped doing it.
Week 3: Sessions extended to 15-20 minutes. Filler words dropped by maybe 30%. I started recognizing my own pronunciation patterns ("My /θ/ is consistently coming out as /t/. My final consonants drop when I'm tired.").
Week 6: First real-conversation breakthrough. A US client meeting where I spoke for 8 continuous minutes without freezing. Not perfect — but no freeze. First time in two years.
Week 9: Native speakers stopped asking me to repeat myself in casual conversations. Intonation patterns started feeling natural. I caught myself thinking in English about a technical problem at work — first time consciously.
Week 12 (day 90): Mock interview I recorded for comparison. 30 minutes of unscripted speaking on tech topics. Listened back. Couldn't recognize the voice as "me" from January. The fluency was actually there.
Concrete metrics I tracked:
- Filler word density: from 1 every 6 seconds → 1 every 25 seconds
- Sustained speaking: from 4 minutes max → 30 minutes comfortable
- Freeze frequency in real conversations: from "every meeting" → "rare, only in high-stakes situations"
- US client meetings: from "I switched to text" → "I led 30-min calls"
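If you want to track filler density yourself, one rough way is to count fillers in a transcript of a voice memo and divide the recording length by that count. The sketch below is an illustration under assumptions, not the method behind the numbers above: it assumes you already have a text transcript from some speech-to-text tool, and the filler list and function names are hypothetical.

```python
import re

# Hypothetical filler list. "like" also has legitimate uses, so this
# crude count will overestimate; adjust the list to your own habits.
FILLERS = ("you know", "um", "uh", "like")


def count_fillers(transcript: str) -> int:
    """Count whole-word occurrences of each filler, case-insensitively."""
    text = transcript.lower()
    return sum(
        len(re.findall(rf"\b{re.escape(filler)}\b", text))
        for filler in FILLERS
    )


def seconds_per_filler(transcript: str, duration_seconds: float) -> float:
    """Average seconds between fillers; infinity if there are none."""
    fillers = count_fillers(transcript)
    return float("inf") if fillers == 0 else duration_seconds / fillers


memo = "Um, so today I, like, shipped the feature, you know, and um it went fine."
print(count_fillers(memo))              # 4
print(seconds_per_filler(memo, 24.0))   # 6.0
```

A higher number is better here: going from one filler every 6 seconds to one every 25 seconds means about 76% fewer fillers per minute of speech.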
90 days. 25 minutes a day. Plus weekly shadowing. That was it.
What I'd do differently if I started over
Looking back at the 12 months of wasted effort plus the 90 days that worked, here's the protocol I'd run from day one.
Day 1: Diagnose the asymmetry
Honest test: can you read English news without a dictionary? Can you write a coherent paragraph in English? If yes, your reception is fine. The gap is production. Don't add more reception.
Days 1-30: Build the production habit
20 minutes of unscripted AI conversation, daily. I'd use SpeakShark's Daily Talk mode (the free tier covers this: 3 sessions/day, no card required). It's what I wish I'd had: a combination of unscripted volume and phoneme-level feedback.
Voice memo journal, 5 minutes daily. Listen back next morning. Cringe. Continue.
Don't add anything else this month. Reception is already enough.
Days 31-60: Add variety and feedback layer
Continue daily SpeakShark sessions. Now rotate topics aggressively — never repeat a topic two days in a row. Force fresh sentence assembly.
Use the phoneme-level feedback SpeakShark gives you. Pick one specific pronunciation error and target it for the week (e.g., "this week I'm fixing my /θ/").
Add weekly shadowing (30 min, one podcast you actually like).
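The "never repeat a topic two days in a row" rule is easy to automate if you'd rather not decide each morning. A minimal Python sketch, assuming a topic list of your own (the topics and function names here are placeholders, not a SpeakShark feature):

```python
import random

# Placeholder topics; replace with your own.
TOPICS = ["work", "news", "a problem I'm solving", "a book", "weekend plans"]


def pick_topic(topics, yesterday=None, rng=random):
    """Pick a random topic that differs from yesterday's."""
    candidates = [topic for topic in topics if topic != yesterday]
    if not candidates:
        raise ValueError("need at least two distinct topics to rotate")
    return rng.choice(candidates)


def plan(topics, days, seed=0):
    """Build a schedule in which no topic appears on consecutive days."""
    rng = random.Random(seed)  # seeded so the plan is reproducible
    schedule = []
    previous = None
    for _ in range(days):
        previous = pick_topic(topics, previous, rng)
        schedule.append(previous)
    return schedule


for day, topic in enumerate(plan(TOPICS, 7), start=1):
    print(f"Day {day}: {topic}")
```

Picking at random rather than cycling in a fixed order keeps the next topic unpredictable, which fits the goal of forcing fresh sentence assembly.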
Days 61-90: Real-world transfer
Continue daily AI sessions. Add 1-2 real-human interactions in English per week — language exchange, Cambly trial, or just asking your English-speaking colleagues to speak English with you instead of switching to your L1.
The AI practice is your gym. Real conversations are the game. You need both, but the gym is what gets you to game-ready.
Day 90 audit
Record a 5-minute monologue on a topic you couldn't have spoken about on day 1. Listen back. Compare to your week-1 voice memos. The progress is visible and audible. If it's not, double the volume.
Total cost: SpeakShark free tier ($0) for the first month, Pro $12/mo afterward if you want unlimited. Plus a podcast subscription if you don't have one. Under $50 total for 90 days.
What this protocol replaces
If you're currently doing any of these, you can drop them with no progress loss:
- General language apps (Babbel, Duolingo, Pimsleur, Rosetta Stone) for English specifically
- Vocabulary flashcards
- Grammar drill subscriptions
- "Watch English Netflix to learn" guilt-time
- Polyglot YouTube binges
Keep:
- Reading whatever you'd read anyway (it doesn't hurt)
- Watching English content for enjoyment (just stop calling it "practice")
- Anything that's working for you specifically, that this post hasn't accounted for
The product I built — and why it's not a coincidence
After 90 days of the protocol working on me, I had a problem: ChatGPT Voice was a generic chat partner, not a language coach. It would tell me "great job!" when I made errors, it had no phoneme-level feedback, and it changed voices randomly between sessions, which broke any accent-target practice.
I needed a tool that delivered:
- Unscripted open conversation (✅ ChatGPT Voice had this)
- Phoneme-level pronunciation scoring (❌ ChatGPT Voice didn't)
- Consistent accent target (❌ ChatGPT Voice didn't)
- Score trends over time (❌ ChatGPT Voice didn't)
- Cheap enough for daily use ($20/mo ChatGPT Plus was overpriced for the speaking subset)
So I built SpeakShark. Four native-accent AI teachers (American, British, Australian, Canadian — pick one and commit). Per-utterance pronunciation scoring with phoneme-level errors highlighted. Score trends across pronunciation, grammar, fluency, vocabulary so you can see week-over-week movement. Free tier: 3 conversations/day, no card. Pro $12/mo for unlimited.
This is biased, obviously. But the bias comes from a real problem the existing tools didn't solve for me. The tool I built is the one I wish I'd had on day one of the 90-day protocol.
What I'd tell my 2024 self
If I could go back to early 2024 and give myself one paragraph of advice:
You don't need more grammar. You don't need more vocabulary. You don't need more Netflix. Your reception is already enough. Stop adding inputs. Start producing outputs. Twenty minutes of unscripted English speaking daily, voice memo journal for self-awareness, and weekly shadowing for intonation. That's it. Do this for 90 days and the gap closes. Everything else is procrastination dressed up as study.
That paragraph would have saved me 12 months. If it saves you any time, the post served its purpose.
Bottom line
The English speaking gap most learners have isn't a knowledge gap — it's a production gap. The fix isn't more study. It's structured production: 20 min daily unscripted speaking + 5 min voice memo + weekly shadowing. Run for 90 days. Measure on day 90.
SpeakShark is the tool I built specifically to make this protocol practical at scale. Free tier (3 sessions/day, no card) covers the foundation. Pro $12/mo if you need unlimited.
Or use any other tool that delivers sustained unscripted speaking with honest feedback. The protocol matters more than the tool. But you need some tool, and most language apps aren't built for this.
I closed my own gap in 90 days after wasting 12 months. The math on speaking improvement is straightforward once you understand what to do. The path is just much narrower than most people think — and it doesn't run through grammar drills.
Good luck.