I do not know much about speech recognition, and I do not know what the state of the art is. But years ago I played with it a bit, and I would like to throw an idea out there; maybe somebody picks it up, maybe it turns out useful, or maybe it is already being used. Please tell me. It applies not just to speech recognition, but to general audio pattern recognition, or any signal pattern recognition.
The basic idea is to observe how human hearing works: the cochlea first performs a frequency transform physically. Hairs of different lengths resonate to different frequencies in the audio input, and the stronger a particular frequency is in the input, the stronger the signal for that hair will be. In neurons, a stronger signal does not mean a larger amplitude of the action potential, but more action potentials: more impulses travel over that neuron, which means a higher frequency of those impulses. So the brain learns not directly from the input audio, but from the changes in the impulse rate carried for each frequency present in the input audio. If the brain recognizes patterns from that representation, we should too.
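A minimal sketch of what such a front end could look like, under my own assumptions: a short-time Fourier transform stands in for the cochlea, each bin's magnitude is mapped linearly to an impulse rate (the linear mapping and the `max_rate` ceiling are both arbitrary choices, not anything from biology), and the features handed to the recognizer are the frame-to-frame changes of those rates rather than the audio itself.

```python
import numpy as np

def spike_rate_features(audio, frame_len=1024, hop=512):
    """Sketch: frequency transform (like the cochlea), map each bin's
    magnitude to an impulse rate (rate coding), then return the
    *changes* of those rates over time as the learning input."""
    # Short-time Fourier transform: each row is one time frame.
    n_frames = 1 + (len(audio) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([audio[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))  # (frames, bins)

    # Rate coding: a stronger frequency component means more impulses
    # per second on that hair's neuron. A linear mapping up to a
    # hypothetical ceiling is assumed here for illustration.
    max_rate = 500.0  # impulses per second, arbitrary
    rates = max_rate * magnitudes / (magnitudes.max() + 1e-12)

    # The proposed input to the recognizer: not the audio, but the
    # change of each frequency bin's impulse rate between frames.
    return np.diff(rates, axis=0)
```

For example, one second of a 440 Hz tone at 16 kHz yields a feature matrix with one row per frame transition and one column per frequency bin; a steady tone produces near-zero rate changes everywhere except at onset, which is exactly the kind of sparsity this representation is after.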