SCIENTISTS COULDN'T DO IT IN TWO CENTURIES ---------- AN ENGINEER DID IT IN TWO MONTHS ---------- A TIME-DOMAIN COCHLEAR IMPLANT ---------- RULES ARE MEANT TO BE BROKEN

 

“If at first you don’t succeed, try, try again. Then quit. There’s no point in being a damn fool about it.”

– W.C. Fields

If you can’t change an industry – change its foundation.

– A. Doolittle

The Bates Cochlear Implant is changing directions – literally. We have decided to release the core of our cochlear implant, The Bates Function, to the public.

 

The Bates Function is a small, blazingly fast piece of code that uses a tapped delay line and third-grade arithmetic to replace mathematically intensive spectrum-based waveform-analysis techniques such as Fourier analysis. Frequency-domain tools know frequencies precisely; time-domain tools know time precisely. 
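The Bates Function itself has not been published, but the ingredients named above — a tapped delay line plus simple arithmetic — are enough to sketch the flavor of time-domain period measurement. The following is an illustrative sketch, not the Bates Function: a classic average-magnitude-difference period estimator that uses only subtraction, absolute value, and comparison.

```python
import math

def amdf_period(samples, min_lag, max_lag):
    """Estimate a waveform's period with a tapped delay line: for each
    candidate tap (delay), sum |x[n] - x[n-lag]| and pick the lag at
    which the delayed copy best matches the signal. Only subtraction,
    absolute value, and comparison are used -- no spectrum analysis."""
    best_lag, best_score = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(abs(samples[n] - samples[n - lag])
                    for n in range(lag, len(samples)))
        if score < best_score:
            best_lag, best_score = lag, score
    return best_lag

# A 200 Hz sine sampled at 8 kHz repeats every 40 samples.
fs, f = 8000, 200
x = [math.sin(2 * math.pi * f * n / fs) for n in range(400)]
print(amdf_period(x, 20, 60))  # -> 40
```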

 

The Bates Function and the ear are in the time domain, which is the domain of movement—the movement of sound pressure waves traveling through air and water, the movement of a speaker cone, and the movement of electricity through wires. The frequency domain is the domain of mathematical manipulation.

 

Fourier analysis attempts to find the combination of sinusoidal waveforms that, summed together, recreate a sound’s spectrum envelope. 

 

The frequency domain uses filterbanks to divide a spectrum into individual frequency bands; however, filters have sidebands that interfere with their neighbors, so adjacent filter frequencies cannot be placed too close together. This limits the number of usable filters and, in practice, caps the number of electrodes at around twenty-four. 
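The interference described above is easy to demonstrate: when a tone falls between two analysis bands, its energy leaks into the neighboring bands. Here is a minimal sketch using a plain DFT as a stand-in for a filterbank; the sample rate, block size, and bin choices are arbitrary illustration values.

```python
import math

def dft_mag(x, k):
    """Magnitude of the k-th DFT bin of a real signal x."""
    N = len(x)
    re = sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
    im = -sum(x[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
    return math.hypot(re, im)

N, fs = 256, 8000
bin_hz = fs / N                      # 31.25 Hz per analysis band
f = 10.5 * bin_hz                    # a tone midway between bins 10 and 11
x = [math.sin(2 * math.pi * f * n / fs) for n in range(N)]

# The tone's energy is not confined to its own band: it spills into
# the neighbouring bins, which is why bands cannot sit too close.
for k in (8, 9, 10, 11, 12, 13):
    print(k, round(dft_mag(x, k), 1))
```

The printout shows the largest magnitudes in bins 10 and 11, with substantial leakage remaining several bins away.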

 

 

The Bates Function deconstructs a sound, extracting and separating the independent waveforms inside it. It sorts the extracted waveforms by frequency and distributes each to the appropriate frequency channel. There is no channel overlap, no limit on how close channels can be, and no upper limit on the number of electrodes. 
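The mechanics of the Bates Function are unpublished. As a toy illustration of sorting time-domain measurements into frequency channels, the sketch below converts the intervals between successive rising zero crossings into frequencies and tallies them into hypothetical channel bands. (Zero-crossing intervals identify the frequency of a single tone; separating mixtures is the harder problem the Bates Function claims to solve.)

```python
import math

def rising_zero_crossings(x):
    """Sample indices where the waveform crosses zero going upward."""
    return [n for n in range(1, len(x)) if x[n - 1] < 0 <= x[n]]

def sort_into_channels(x, fs, edges):
    """Convert each interval between rising zero crossings to a frequency
    and tally it into the channel whose band contains it.
    edges: ascending channel boundaries in Hz."""
    counts = [0] * (len(edges) - 1)
    zc = rising_zero_crossings(x)
    for a, b in zip(zc, zc[1:]):
        freq = fs / (b - a)          # one full cycle between crossings
        for ch in range(len(edges) - 1):
            if edges[ch] <= freq < edges[ch + 1]:
                counts[ch] += 1
    return counts

fs = 8000
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(800)]
edges = [100, 200, 400, 800, 1600]    # four hypothetical channels
counts = sort_into_channels(tone, fs, edges)
print(counts)                         # 440 Hz lands in the 400-800 Hz channel
```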

 

Spectrum analysis does not “hear” impulse sounds because they lack a spectrum. It must view sounds through a window and buffer them first; therefore, spectrum analysis is not a real-time processor.

 

The Bates Function records impulse sounds and functions as a stream processor. With its small size, it is a blazingly fast real-time processor. 
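The latency argument can be illustrated directly: a buffered (block) processor cannot report an event before its buffer fills, while a sample-by-sample stream processor reports it the moment it happens. The toy comparison below is not the Bates code; the block size and threshold are arbitrary illustration values.

```python
def block_detector(samples, block=256, thresh=0.5):
    """Buffered processing: results are only available at block
    boundaries, so an event is reported no earlier than the end of
    the block that contains it."""
    out = []
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]
        if any(abs(s) > thresh for s in chunk):
            out.append(start + block)      # earliest possible report time
    return out

def stream_detector(samples, thresh=0.5):
    """Sample-by-sample processing: reports the instant it happens."""
    return [n for n, s in enumerate(samples) if abs(s) > thresh]

x = [0.0] * 1000
x[100] = 1.0                               # an impulse at sample 100
print(block_detector(x))                   # -> [256]
print(stream_detector(x))                  # -> [100]
```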

 

 

The time domain knows the precise moment when the energy in a waveform crosses zero; the frequency domain does not. J.C.R. Licklider* demonstrated that infinitely clipped speech (essentially zero crossings with no spectrum) could be understood 60% to 90% of the time, depending on the subject’s familiarity with the test words. This suggests the brain can rebuild a spectrum from zero crossings alone. With no precise time markers, this ability is not available in the frequency domain, which could explain why cochlear-implant wearers cannot follow conversation in a noisy environment. 
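The manipulation Licklider and Pollack studied — infinite peak clipping — replaces every sample with its sign, destroying the spectrum envelope while leaving every zero-crossing time untouched. A minimal demonstration (the two-tone "speech-like" test signal is only an illustration):

```python
import math

def infinite_clip(x):
    """Replace every sample with its sign (+1 or -1), discarding the
    spectrum envelope but keeping the zero-crossing times."""
    return [1.0 if s >= 0 else -1.0 for s in x]

def zero_crossings(x):
    """Indices where the waveform changes sign."""
    return [n for n in range(1, len(x)) if (x[n - 1] < 0) != (x[n] < 0)]

fs = 8000
# A crude two-component stand-in for a speech waveform.
x = [math.sin(2 * math.pi * 220 * n / fs)
     + 0.5 * math.sin(2 * math.pi * 660 * n / fs) for n in range(800)]
clipped = infinite_clip(x)
print(zero_crossings(x) == zero_crossings(clipped))  # -> True
```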

 

 

 

So, the BCIP will soon release a small, blazingly fast replacement for Fourier analysis – Stay Tuned!

 

*J.C.R. Licklider and I. Pollack, “Effects of differentiation, integration, and infinite peak clipping upon the intelligibility of speech,” J. Acoust. Soc. Am., vol. 20, pp. 42–51, 1948

A video of a real-time PSM (Periodicity Sorting Matrix) output. The PSM is the preprocessor of the Bates cochlear implant and replaces the filterbanks traditionally used by other implants. The actual pitch is being displayed. 


A Partial List of Unpublished Papers by John Bates

A Computational Auditory Model Based on Evolutionary Principle
A Modern Atomist’s Theory of Hearing: It began with Epicurus 300 B.C
A Robust Signal Processor for Cochlear Implants
A Selectionist’s Approach to Auditory Perception
A Signal Processor for Cochlear Implants – An application for interstitial waveform sampling
A Systems Approach for Auditory Modeling
A Time-Domain Processing Experiment to Test Fundamental Auditory Principles
Acoustic Source Separation and Localization
An Auditory Model Based on Principles of Survival
An Auditory Theory, the Helmholtzian Mistake, the Cocktail Party Problem
An Experiment on Direction-of-Arrival Finding of Moving Vehicles
Appendix to “How to hear everything and listen to anything”
Can a Zeros-Based Waveform Encoding Explain Two-Tone Interference?
Decoding Hearing: From Cocktail Party to Fundamental Principles
Experiments in Direction Finding
Experiments on Interstitial Waveform Sampling
Hearing Sound as Particles of Meaning
Higher-level auditory processing that leads to robust speech processing and other auditory applications
How to hear everything and listen to anything
Interpolater Between PRF Periodicity Recognition Gates
Modeling the HAAS Effect – A First Step for Solving the CASA Problem
Monaural Separation of Sounds by Their Meanings
My Engineering Mind: How I invented an auditory theory using engineering principles instead of science
Progress Report on AUTONOM, an Autonomic Acoustic Perception System
Solving the Cocktail Party Problem: Unthinkable ideas, luck, and pluck
Solving the Mystery of Hearing: Basic Principles, Ancient Algorithm
The Aural Retina: Hearing Sound as Particles of Meaning
The Microgranule System
The Story of the Aural Retina – Hearing Sound as Particles of Meaning
Time and Frequency: A Closer Look at Filtering and Time-Frequency Analysis
Tonal perception and periodicities
Using Attention and Awareness in a Computational Auditory Model
Zeros-Based Waveform Encoding Experiments in Two-Tone Interference

Future Projects (Unfunded)

Singer Vocal Fault Finder (Being Updated)

UPDATE: We have assigned a programmer to create an iOS version as a “thank you” for your contribution. Contributors will be notified when it becomes available. 

 

The processor in the first-generation cochlear implant was ingeniously used to create a smartphone app to assist singers in visually locating and correcting vocal faults. The app gave the singer a real-time display of their voice’s pitch superimposed on a musical staff. Within the display were fault markers.
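The app’s internals are not described here, but a common way to build such a display is to convert each measured frequency to the nearest equal-tempered note plus an offset in cents, which places the voice on (or between) the lines of a staff. A hedged sketch, assuming A4 = 440 Hz and standard MIDI note numbering:

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq, a4=440.0):
    """Map a frequency to the nearest equal-tempered note and the error
    in cents (100 cents = one semitone) -- the kind of value a pitch
    display needs to place a voice on a musical staff."""
    semis = 12 * math.log2(freq / a4)        # semitones above A4
    nearest = round(semis)
    cents = 100 * (semis - nearest)          # signed deviation from pitch center
    midi = 69 + nearest                      # MIDI note number of A4 is 69
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, round(cents, 1)

print(freq_to_note(440.0))    # -> ('A4', 0.0)
print(freq_to_note(452.0))    # noticeably sharp of A4
```

A singer watching such a display sees both the note name and how far off-center the pitch sits, which matches the “center of the space or line equals center of the pitch” description in the review below.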

How good was the vocal fault detector? Two reviews follow:

“The most significant characteristic of the application is the visible manifestation of singing sound properties in a convincing mathematical way. You may control almost everything: from the exact pitch of the voice and the shaky voice (wrong vibrato frequency), from the annoying “voice caprile” (“he-goat voice,” high frequency) to the unacceptable “ballare la voce” (“dancing voice,” low frequency and big pitch intervals), up to realizing the differentiation of simple legato, tenuto, portando, portato, and glissando. The students can easily understand how to control their music phrasing, avoiding exaggerations, merely because they can observe what they sing.”

Zachos Terzakis – Opera Tenor, Vocal Teacher, Athens, Greece

“I have used this application in my studio to visually show my students whether they are singing on pitch. Once they realize that the center of the space or line equals the center of the pitch, it’s easy for them to see their own accuracy and train their ear as well. The accuracy of the program is incredible. I highly recommend it.”

 Mark Kent – Vocal Teacher, High Point, North Carolina

Visual Speech Enunciation

A PROPOSED PROJECT:

This is an adaptation of the singer’s vocal fault finder. The application is intended to be an enunciation coach for those with limited hearing. The app scrolls the script of a predetermined lesson plan across the screen. As the user reads the script, the engine in the Bates cochlear implant deconstructs the speech in real time and displays the individual elements in a format suggested by the radar plot shown. Synchronized with the user’s voice is a plot display taken from a reference speaker using the same script. The reference speaker will be of similar gender, age, and register. The user corrects their speaking voice by having their plot match the shape of the reference voice.

Empowering the user with control over their learning process, the application allows them to scroll back and forth through the script, highlight areas for practice, and create loops to focus on difficult sections. This user-centric approach ensures a personalized and effective learning experience.  

Looking ahead, a ‘Pro’ version of the application could offer even more advanced features. This version might include recording capability, allowing users to track their progress over time. Additionally, it could provide a method for downloading recordings for review by speech therapists, enhancing the application’s potential for professional use. 

A “Therapist” version (different platform?) would be able to store recordings from multiple users and annotate each as needed.  

And yes, we know this website needs improvement. We welcome anyone willing to volunteer their website-design services.