
Taking Advantage of Paradigm Shifts

What's the most efficient way of getting to AI?

Sometimes simply waiting for technology change is the quickest way of moving forwards.

For some computation-intensive tasks, the quickest way to finish a given computation (for a given dollar amount of spending) is simply to wait until a faster machine is available: Moore's law will bring the cost of a suitable machine down so quickly that waiting and then buying finishes sooner than starting the task on a machine available at today's prices.
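As a rough illustration of that trade-off, here is a minimal sketch. It assumes (my assumption, not a measured figure) that the speed a fixed budget buys doubles every 18 months, and that the task would take `work_months` on a machine bought today. The function names are made up for this example.

```python
import math


def finish_time(wait_months, work_months, doubling_months=18.0):
    """Months until the task is done if we wait, then buy a machine and run.

    Assumes speed per dollar doubles every `doubling_months` (an assumption).
    """
    speedup = 2.0 ** (wait_months / doubling_months)
    return wait_months + work_months / speedup


def best_wait(work_months, doubling_months=18.0):
    """Waiting time that minimises finish_time; zero if waiting never helps.

    Setting the derivative of finish_time to zero gives
    2 ** (wait / doubling) = work * ln(2) / doubling.
    """
    ratio = work_months * math.log(2.0) / doubling_months
    if ratio <= 1.0:
        return 0.0  # the task is short enough that starting now is best
    return doubling_months * math.log2(ratio)
```

Under this toy model, waiting only pays once the task would take longer than `doubling_months / ln(2)` on today's machine, which is about 26 months for an 18-month doubling period. Shorter jobs should simply start now.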

Not just computers...

The pace of sequencing grew enormously during the lifespan of the genome project. The methods used didn't just get incrementally better: there was an entire paradigm shift in the technologies used for sequencing. A strategic outsider might have realised that the most efficient way to get the project done was to wait until the paradigm shift occurred, and then (only after that point) spend wildly to get to the finish line first.

The stuck-in-a-rut feeling

After an initial rush of excitement when it was first devised, the neural network training technique of backpropagation (BP) was soon found to present a large number of computational hurdles.

For one, as a gradient descent method, BP proved slow to converge. For another, networks tend to over-learn, becoming brittle when presented with unseen data. A large number of techniques have been devised to circumvent these problems - by adding momentum terms, scaling / clamping weights, or 'early stopping'. But at some point, these 'tricks of the trade' become more of a hindrance to understanding - despite the fact that they seem to help with the problem at hand.
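To make those tricks concrete, here is a minimal sketch of gradient descent on a one-weight model y = w * x, with a momentum term, weight clamping and early stopping bolted on. The function name, the hyperparameter values and the toy data are all assumptions chosen for illustration, not a recommended recipe.

```python
def train(train_pts, val_pts, lr=0.05, momentum=0.9, clip=10.0,
          patience=5, max_epochs=500):
    """Fit y = w * x by gradient descent with three common 'tricks'."""
    w, velocity = 0.0, 0.0
    best_w, best_val, bad_epochs = w, float("inf"), 0
    epochs_run = 0
    for _ in range(max_epochs):
        epochs_run += 1
        # Gradient of the mean squared error on the training points.
        grad = sum(2.0 * (w * x - y) * x for x, y in train_pts) / len(train_pts)
        # Trick 1: momentum, which blends the previous step into this one.
        velocity = momentum * velocity - lr * grad
        w += velocity
        # Trick 2: clamp the weight into [-clip, clip].
        w = max(-clip, min(clip, w))
        # Trick 3: early stopping, which keeps the weight that did best on
        # held-out data and gives up after `patience` epochs without progress.
        val = sum((w * x - y) ** 2 for x, y in val_pts) / len(val_pts)
        if val < best_val:
            best_w, best_val, bad_epochs = w, val, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_w, best_val, epochs_run


# Toy data (made up): noisy samples of y = 2x, plus a clean validation set.
train_pts = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
val_pts = [(1.5, 3.0), (2.5, 5.0)]
best_w, best_val, epochs_run = train(train_pts, val_pts)
```

Even in this tiny case there are already five hyperparameters wrapped around a single weight, which is the sense in which the fixes start to obscure what the method is actually doing.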

To some extent, the proliferation of new techniques to solve a given problem, none of which directly addresses the core issues (e.g. that the chain rule dilutes information about how to change weights), should act as a signal that we're busily exploring a cul-de-sac.
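One way to read the chain-rule remark is as the shrinking-gradient effect in deep sigmoid networks. The sketch below assumes a chain of single-unit sigmoid layers with every weight fixed at 1.0 (a deliberately artificial setup, and my interpretation of the point rather than the only one) and multiplies out the chain rule from output back to input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(depth, x=0.5, weight=1.0):
    """d(output)/d(input) for a chain of `depth` single-unit sigmoid layers."""
    # Forward pass: record the activation after each layer.
    activations = [x]
    for _ in range(depth):
        activations.append(sigmoid(weight * activations[-1]))
    # Backward pass: the chain rule multiplies one local derivative per layer.
    # For a sigmoid unit that derivative is a * (1 - a) * weight, and
    # a * (1 - a) can never exceed 0.25.
    grad = 1.0
    for a in reversed(activations[1:]):
        grad *= a * (1.0 - a) * weight
    return grad


shallow = input_gradient(2)
deep = input_gradient(10)
```

With unit weights each layer scales the signal by at most 0.25, so `deep` is bounded by 0.25 ** 10: by the time the error signal reaches the early layers, very little of it is left to say how their weights should change.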

Knowing that the brain has actually solved the problem of consciousness is a very encouraging piece of data. It tells us that there must almost certainly be a few paradigm shifts ahead: we can make a decent guess at whether the sum of the current (and prospective) set of 'hacks' could possibly get us to our goal, and it seems that we're going to fall short (since we're likely entering a new plateau phase).

We know there's a better way: we just don't know where to jump next (yet).

New paradigms

The current resurgence of neural network research has been spurred by the application of ever larger amounts of CPU/GPU power to problems that had previously been 'stuck'. Not only has computer power grown rapidly, but there has been a reassessment of the overall tactics the brain uses to solve problems.

Previously, people were interested in solving problems using minimal representations and efficient networks. Now, a new take on the brain's use of simple neurons is that nature has chosen a mechanism that is inherently wasteful of computing resources, but very efficient in terms of power and robustness.

So, making use of this new perspective, the commoditization of networked computers, and the availability of vast datasets, people can once again tackle old problems with renewed vigour.

New ruts

But let's not imagine that blindly throwing computer resources at these problems is the last innovation that will be required. Soon (if it has not already started), these methods will evolve principally through the accumulation of 'fixes', each of which increases efficiency marginally, but at the expense of clarity of purpose.

What fundamental assumption are we currently making that will be discovered to be holding us back 5 years from now?