23 July 2008

NuPIC Limitations

If you’ve spent time working on recognition / classification problems using Numenta’s NuPIC framework, the idea of multiple high-level causes might seem a bit strange. Aren’t we trying to distil the main invariant feature from the input, to discover a label that summarises a group of patterns or sequences? The problem with that approach is that, in general, there isn’t one main cause; there are several.

The NuPIC algorithms are biased towards grouping particular kinds of patterns together, because of the choice of distance metric (the measure of similarity). This makes NuPIC efficient on the tasks for which it is designed, at the expense of generality. NuPIC cannot learn language, because the algorithms’ simplifying assumptions do not apply there.

Let me give an example. If you train a NuPIC network on movies of dogs and cats walking and jumping, NuPIC will learn to distinguish dogs from cats, and the label that emerges at the top of the network will reflect the kind of animal, not its movement. To alter your network so that it identifies movement (walking versus jumping) at the top-level node, you would need to change the learning algorithm to bias it in a different way, so that it passes information about movement up the network while discarding differences in visual structure. And to identify both movement and animal, you need nodes with multiple parents, in a network that is not a pyramid. Perhaps that can be added to NuPIC, but we haven’t seen it yet.
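To make the difference between a pyramid and a multi-parent network concrete, here is a toy sketch in Python. It is not NuPIC code and does not use any NuPIC API. The node names and the `top_level_causes` helper are made up for illustration, and the sketch only models the shape of the network, not any learning.

```python
def top_level_causes(parents, node):
    """Follow parent links upwards from `node` and return the set of
    top-level nodes (nodes with no parents) that it feeds into."""
    node_parents = parents.get(node, [])
    if not node_parents:
        return {node}
    causes = set()
    for parent in node_parents:
        causes |= top_level_causes(parents, parent)
    return causes


# Pyramid: every node has exactly one parent, so all input
# converges on a single label at the top.
pyramid = {
    "pixels_left": ["mid"],
    "pixels_right": ["mid"],
    "mid": ["animal"],
}

# Not a pyramid: the mid-level node has two parents, one per cause.
multi_parent = {
    "pixels_left": ["mid"],
    "pixels_right": ["mid"],
    "mid": ["animal", "movement"],
}

print(sorted(top_level_causes(pyramid, "pixels_left")))
print(sorted(top_level_causes(multi_parent, "pixels_left")))
```

In the pyramid the same input can only ever contribute to one top-level label. Once a node is allowed more than one parent, the same input can contribute to both the animal and the movement.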

This is not a criticism of Numenta. The folks there are friends: I worked with some of them at Palm, I was privileged to visit their offices last year, and I have heaps of respect for their work. Jeff’s book and the original NuPIC release inspired me to work on HTMs in the first place. I’d be thrilled to see Numenta incorporate ideas from Blerpl. And NuPIC does a good job at what it does. Blerpl cannot yet handle a 32x32 pixel image, because the implementation uses exhaustive greedy search and runs out of memory building the network. But if we are to develop algorithms that are capable of general learning and intelligence, we have to recognise the current limitations.

Blerpl is an HTM algorithm, although it is very different from NuPIC. Blerpl focusses on language (in a very general sense) while NuPIC focusses on vision. While I don’t yet have all the answers, I think Blerpl is a way to learn something about the aspects of HTM theory that are missing from other implementations. I hope you’ll take the time to keep reading as I do my best to explain what I’ve learned, and ask questions along the way.
