Technology

Real-Time On-Device Voice Reconstruction AI

Whispp’s Proprietary Voice AI

At Whispp, we’re pioneering on-device audio-to-audio AI, a novel technology that transforms how people communicate, whether they’re whispering, speaking in noisy environments, or living with a voice condition.

Traditional tools
Traditional voice processing tools such as noise suppression, spectral masking, and voice isolation are good at removing unwanted sound, but they can’t restore what’s missing. Whispp goes a step further. Whispp’s proprietary voice AI doesn’t just clean audio; it recreates lost or degraded speech information, delivering natural, expressive, and intelligible communication in real time where it wasn’t possible before.
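To illustrate why masking-style tools can remove sound but not restore it, here is a minimal sketch of a spectral-subtraction-style gain mask applied to a few made-up magnitude values. The function name `subtractive_mask`, the flat noise estimate, and the numbers are illustrative assumptions; this is not Whispp’s method or any specific product’s algorithm.

```python
def subtractive_mask(observed, noise_estimate):
    """Apply a per-bin gain in [0, 1] to hypothetical spectral magnitudes.

    A gain can only attenuate: every output bin is <= its input bin,
    and a bin that is already empty stays empty.
    """
    cleaned = []
    for magnitude in observed:
        if magnitude <= 0.0:
            gain = 0.0  # nothing to keep, and nothing a mask can add
        else:
            gain = max(0.0, 1.0 - noise_estimate / magnitude)
        cleaned.append(magnitude * gain)
    return cleaned


# Made-up magnitudes: the first bin stands for speech content that is missing.
observed = [0.0, 0.2, 1.0, 0.5]
cleaned = subtractive_mask(observed, noise_estimate=0.2)
print(cleaned)
```

Because each gain lies between 0 and 1, no output bin can exceed its input, so content that is absent from the recording stays absent after masking.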

Whispp’s advanced voice AI reconstructs and restores speech in real time, reviving lost pitch, tone, and natural intonation.

Voiced speech
Figure 1(a) shows the spectrogram of regular voiced speech. Its key features are:

  • Fundamental frequency (F₀): visible as evenly spaced horizontal striations representing vocal fold vibrations (pitch).
  • Harmonic structure: multiple frequency bands stacked above the F₀, giving richness and timbre to the voice.
  • Formants (F₁, F₂, F₃, …): darker bands that correspond to resonances in the vocal tract, shaping vowel sounds.
  • Energy concentration: clear patterns of energy that vary dynamically across time and frequency, reflecting natural rhythm and prosody.
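The relationship between F₀ and its harmonics can be sketched in a few lines. The example below builds a synthetic frame from a 125 Hz fundamental plus harmonics at integer multiples, then recovers the pitch with a plain autocorrelation search. The sample rate, frame length, harmonic amplitudes, and function names are all assumptions chosen for illustration; real speech is far less regular, and this is not Whispp’s method.

```python
import math


def harmonic_frame(f0_hz, sample_rate=8000, n_samples=400, n_harmonics=5):
    """Sum of sinusoids at integer multiples of f0, with 1/k amplitudes."""
    return [
        sum(math.sin(2 * math.pi * k * f0_hz * t / sample_rate) / k
            for k in range(1, n_harmonics + 1))
        for t in range(n_samples)
    ]


def normalized_autocorr(frame, lag):
    """Autocorrelation at `lag`, divided by the frame energy."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / energy


def estimate_f0(frame, sample_rate=8000, fmin=60.0, fmax=400.0):
    """Return (f0 estimate in Hz, periodicity strength) from the best lag."""
    shortest = int(sample_rate / fmax)
    longest = int(sample_rate / fmin)
    best_lag = max(range(shortest, longest + 1),
                   key=lambda lag: normalized_autocorr(frame, lag))
    return sample_rate / best_lag, normalized_autocorr(frame, best_lag)


frame = harmonic_frame(125.0)
f0, strength = estimate_f0(frame)
print(f"estimated F0: {f0:.1f} Hz, periodicity strength: {strength:.2f}")
```

The autocorrelation peaks at the lag matching one pitch period because all harmonics realign there, which is the time-domain counterpart of the evenly spaced striations in the spectrogram.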

Whispered speech
Figure 1(b) shows the spectrogram of whispered speech. You can see the loss of:

  • Fundamental frequency (F₀): no visible pitch band because the vocal folds do not vibrate.
  • Harmonic structure: replaced by diffuse, noisy energy.
  • Natural intonation: speech sounds flat and monotone since there’s no pitch variation.

Whispered speech retains some formant information, allowing words to remain intelligible, but loses the voicing cues that make speech sound natural and expressive.
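The missing voicing cue can be made concrete with a simple periodicity check. In the sketch below, a harmonic frame stands in for voiced speech and white noise stands in for a whisper; this is a crude simplification, since real whispers are noise shaped by formants. The 0.5 threshold, the signal parameters, and the function names are illustrative assumptions, not part of Whispp’s technology.

```python
import math
import random


def normalized_autocorr(frame, lag):
    """Autocorrelation at `lag`, divided by the frame energy."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / energy


def peak_periodicity(frame, sample_rate=8000, fmin=60.0, fmax=400.0):
    """Largest normalized autocorrelation over the plausible pitch lags."""
    shortest = int(sample_rate / fmax)
    longest = int(sample_rate / fmin)
    return max(normalized_autocorr(frame, lag)
               for lag in range(shortest, longest + 1))


def looks_voiced(frame, threshold=0.5):
    """Hypothetical voicing test: a strong periodic peak suggests an F0."""
    return peak_periodicity(frame) >= threshold


sample_rate, n_samples = 8000, 400

# Voiced stand-in: 125 Hz fundamental plus four harmonics.
voiced_like = [
    sum(math.sin(2 * math.pi * k * 125.0 * t / sample_rate) / k
        for k in range(1, 6))
    for t in range(n_samples)
]

# Whisper stand-in: seeded white noise, so there is no repeating pitch period.
rng = random.Random(0)
whisper_like = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

print("voiced-like frame looks voiced: ", looks_voiced(voiced_like))
print("whisper-like frame looks voiced:", looks_voiced(whisper_like))
```

The noise frame has no lag at which it lines up with itself, so the check finds no pitch to report, mirroring the absent F₀ band in the whispered spectrogram.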

Whispp Reconstructed Speech
Figure 1(c) shows the spectrogram of speech reconstructed with Whispp’s proprietary voice AI. Whispp restores the key features of natural spoken speech from a whispered sample:

  • Reconstructed fundamental frequency and harmonics, bringing back natural pitch and tone.
  • Enhanced formant clarity, improving intelligibility and timbral accuracy.
  • Reintroduced prosody and emotional expressiveness, allowing speech to sound authentically human again.

Figure 1. Spectrograms of various speech forms showcasing Whispp’s reconstruction AI technology: (a) regular voiced speech, (b) whispered speech, and (c) Whispp reconstructed speech. © Whispp.

Personalizing voices

When you provide recordings, your Whispp voice will sound like your own healthy voice!

In the Whispp app you can use your Personal Whispp voice for your video or audio calls and messages. Stay connected with family, friends, and others in a way that feels familiar and comfortable.

Whispp reconstructs the key features of natural spoken speech using its proprietary voice AI technology. It does this on-device, in real time, and in any language.

Whispp’s technology can also be applied to affected voices and to very noisy environments. The result is speech that not only sounds clear but also feels authentic, expressive, and true to the speaker’s original voice.

Accessible calls

Make clear calls no matter your voice condition

Private calls

Make whispered calls in quiet spaces

Noisy calls

Make calls in high background noise
