Fig. 1 Spectrogram of the first recording showing the audio feedback at about 1 kHz and the anomalous voice (AV1) after 11 seconds. The number of audio samples for each FFT segment is 256.
But, soft: behold! lo where it comes again!
I'll cross it, though it blast me. - Stay, illusion!
If thou hast any sound, or use of voice,
Speak to me!
W. Shakespeare, Hamlet, 1.1
My nephew XK lives in the US and is a young American who speaks both Portuguese and English. Driven by his love of singing, he was home alone recording songs with a karaoke system. During one of the recordings, he carelessly left the recorder on a little longer than needed. To his surprise, he noticed an unexpected voice at the end of one of the files. After he asked the unseen to give him and his mother a reply, the unusual voice appeared a second time. Fearing the invisible, he has given up his karaoke sessions since then.
Fig. 2 Filtered waveforms of AV1 in the indicated interval showing the anomalous voice.
My sister sent me two of the raw MP4 files, sampled at 48 kHz. The first recording was made on June 18, 2019 at 4:03 pm and the second about one hour later. The total lengths of recordings 1 and 2 are 13.76 s and 17.62 s, respectively. In the first recording, the karaoke speaker was on and the characteristic audio feedback is clearly heard. In the second file, recorded with the speaker off, no feedback is present, but the output volume is much lower.
Sounds can be analyzed visually by calculating and displaying their spectrograms. These are frequency-time diagrams showing how the intensity of each tone that composes the original sound is distributed over time. Spectrograms are the result of applying the well-known Fast Fourier Transform (FFT for short) to the sampled audio signal. The spectrogram of the first recording is seen in Fig. 1. The “anomalous voice” (AV1) is present after about 11 seconds, with a spectrum spread over 1-8 kHz, above the characteristic audio feedback (in the ranges 0.75-1.3 kHz and 2.8-3.1 kHz). Applying a 76 dB flat filter in these ranges reduces the audio noise significantly. The filtered waveform of the anomalous signal in the time interval [11.6, 12.75] s is seen in Fig. 2. AV1 sounds like a whispering male voice pronouncing “I see”.
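For readers who want to try this kind of analysis themselves, here is a minimal Python sketch using only NumPy. It is not the tool used for the figures in this post (that tool is not named here) and only illustrates the idea: a spectrogram built from 256-sample FFT segments at 48 kHz, and a flat 76 dB attenuation of the two feedback bands. The function names `spectrogram` and `attenuate_bands`, the Hann window, the non-overlapping segments, and the synthetic test tones are all assumptions made for the example.

```python
import numpy as np

FS = 48_000      # sampling rate stated in the post (Hz)
NFFT = 256       # samples per FFT segment, as in Fig. 1
# Feedback bands quoted in the post (Hz)
FEEDBACK_BANDS = [(750.0, 1300.0), (2800.0, 3100.0)]
ATTEN_DB = 76.0  # flat attenuation applied inside each band


def spectrogram(x, fs=FS, nfft=NFFT):
    """Magnitude spectrogram from non-overlapping, Hann-windowed segments."""
    n_seg = len(x) // nfft
    frames = x[: n_seg * nfft].reshape(n_seg, nfft) * np.hanning(nfft)
    mag = np.abs(np.fft.rfft(frames, axis=1)).T   # shape: (freqs, times)
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    times = (np.arange(n_seg) + 0.5) * nfft / fs  # segment centers (s)
    return freqs, times, mag


def attenuate_bands(x, fs=FS, bands=FEEDBACK_BANDS, atten_db=ATTEN_DB):
    """Scale the listed bands down by atten_db using one full-length FFT."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    gain = 10.0 ** (-atten_db / 20.0)
    for lo, hi in bands:
        spec[(freqs >= lo) & (freqs <= hi)] *= gain
    return np.fft.irfft(spec, n=len(x))


# Synthetic stand-in for a recording (an assumption, not the real data):
# a strong 1 kHz "feedback" tone plus a weaker 5 kHz tone, one second long.
t = np.arange(FS) / FS
signal = np.sin(2 * np.pi * 1000 * t) + 0.1 * np.sin(2 * np.pi * 5000 * t)
filtered = attenuate_bands(signal)

freqs, times, mag = spectrogram(signal)
_, _, mag_filtered = spectrogram(filtered)

# Strongest frequency bin, averaged over time, before and after filtering
peak_before = freqs[mag.mean(axis=1).argmax()]
peak_after = freqs[mag_filtered.mean(axis=1).argmax()]
print(peak_before, peak_after)
```

With 256 samples per segment at 48 kHz, each frequency bin is 187.5 Hz wide, which is why the feedback appears as a broad band rather than a sharp line. On the synthetic signal, the strongest bin moves from the region around 1 kHz to the region around 5 kHz once the feedback bands are attenuated.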
Fig. 4 Audio waveforms of AV2 ("yes") in the indicated interval showing the anomalous voice modulation.
Fig. 3 is the corresponding spectrogram of the second sample, where other features are indicated. In particular, there is an initial "invocation" in Portuguese, which can be translated as "tell something to mom", in the interval 1-2 s. Cracking sounds can be heard between 10 and 14 s. After 14.5 s, the silence is broken by an unmistakable but very low “yes”, distributed mostly in the frequency range 1-2 kHz and whispered by the same male voice as in the first recording. A zoomed view of the AV2 audio signal in the time interval [14.75, 15.6] s is seen in Fig. 4.
The original single-channel versions were converted to WAV format and can be listened to here (1) and here (2). The complete filtered file can be accessed here.
Next post: EVP II (Electronic Voice Phenomena).