Google’s DeepMind AI division has tackled everything from StarCraft to protein folding. So it’s probably no surprise that its creators have eventually turned to what is undoubtedly a personal interest: computer programming. In the Thursday issue of Science, the company describes a system it developed that generates code in response to programming challenges like those used in human programming competitions.
On a typical challenge, the AI system manages to rank near the top half of participants. But it had some trouble scaling, in that it was less likely to produce a successful program on problems where more code is typically required. Still, the fact that it works at all without being given any structural information about algorithms or programming languages is something of a surprise.
Taking on the challenge
Computer programming challenges are fairly straightforward: people are given a task to complete and must produce code that performs that task. In an example from the new paper, programmers were given two strings and asked to determine whether you could obtain the shorter of the two by typing the longer one but substituting the backspace key for some of its keystrokes. Submitted programs are then checked to see whether they provide a general solution to the problem or fail when additional examples are tested.
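To make the example concrete, the problem above has a well-known greedy solution that scans both strings from the end. This is an illustrative human-written solution, not AlphaCode's output:

```python
# Can string t be obtained by typing string s, but pressing backspace
# instead of some of the keystrokes? Scan both strings from the end:
# a match consumes one character of each; a mismatch means s[i] must have
# been replaced by a backspace, which also erases the previously typed
# character, so two characters of s are consumed.

def can_obtain(s: str, t: str) -> bool:
    i, j = len(s) - 1, len(t) - 1
    while j >= 0:
        if i < 0:
            return False       # ran out of s before matching all of t
        if s[i] == t[j]:
            i -= 1
            j -= 1             # characters match; consume one of each
        else:
            i -= 2             # backspace pressed at s[i]: erases s[i-1] too
    return True                # any leftover prefix of s can be erased
```

For instance, `can_obtain("ababa", "ba")` is `True` (press backspace instead of the third and fourth keystrokes), while `can_obtain("ababa", "bb")` is `False`.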
Given enough examples of programs that solve a single problem, an AI system could probably infer the algorithmic structure needed to succeed. But that wouldn't be a general solution to arbitrary problems; an AI trained on one category of challenge would fail when asked to take on an unrelated one.
To make something more generalizable, the DeepMind team treated it a bit like a language problem. To an extent, the description of a challenge is one expression of what the algorithm should do, while the code is an expression of the same thing, just in a different language. So the AI in question was designed to have two parts: one that takes the description and converts it into an internal representation, and a second that uses that internal representation to generate working code.
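A toy sketch of that two-part design, for illustration only: the "internal representation" here is a bag-of-words count and the decoder is a hard-coded rule, standing in for the learned neural components of the real model.

```python
# Toy two-part pipeline: description -> internal representation -> code.
# Both functions are trivial stand-ins for trained model components.

def encode(description: str) -> dict:
    """Map a natural-language task description to an internal representation
    (here, just token counts)."""
    rep = {}
    for token in description.lower().split():
        rep[token] = rep.get(token, 0) + 1
    return rep

def decode(representation: dict) -> str:
    """Generate code from the internal representation (rule-based stand-in
    for a learned decoder)."""
    if "sort" in representation:
        return "def solve(xs):\n    return sorted(xs)"
    return "def solve(xs):\n    return list(xs)"

generated = decode(encode("Sort the given list of integers"))
```

The point of the structure, rather than the toy internals, is that the two halves meet only at the internal representation, so the description side and the code side can be trained on different material.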
Training the system was also a two-stage process. In the first stage, the system was simply asked to process a snapshot of the material on GitHub, over 700GB of code in total. (In an era when you can fit that on a thumb drive, that might not seem like much, but remember that code is just raw text, so you get a lot of lines per gigabyte.) Note that this data will also include comments, which ought to use natural language to explain what the adjacent code is doing, and so should help with both the input and output tasks.
Once the system was trained, it went through a period of tuning. DeepMind set up its own programming contests and then fed the results into the system: problem description, working code, failing code, and the test cases used to check them.
Similar approaches have been tried before, but DeepMind reports that it was simply able to throw more resources at training. The paper states that "one of the key factors in improving the performance of AlphaCode was the increase in the number of model samples by orders of magnitude compared with previous work."
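That large-scale sampling only pays off if bad samples can be weeded out, which is where the contest test cases come in. Here is an illustrative sketch of the "sample many, then filter on example tests" idea; the candidate generator is a random pick from hard-coded templates, standing in for samples drawn from a trained model.

```python
import random

# Sketch of sampling many candidate programs and keeping only those that
# pass the example input/output tests. Templates are stand-ins for model
# samples; a real system would draw thousands of distinct programs.

random.seed(0)  # deterministic for the example

TEMPLATES = [
    "def solve(x):\n    return x + 1",
    "def solve(x):\n    return x * 2",
    "def solve(x):\n    return x - 1",
]

def generate_candidates(n: int) -> list:
    """Stand-in for drawing n program samples from the model."""
    return [random.choice(TEMPLATES) for _ in range(n)]

def passes_examples(src: str, examples) -> bool:
    """Keep only candidates whose output matches every example test case."""
    ns = {}
    exec(src, ns)
    return all(ns["solve"](inp) == out for inp, out in examples)

examples = [(1, 2), (3, 6)]  # both satisfied only by solve(x) == x * 2
survivors = {c for c in generate_candidates(1000) if passes_examples(c, examples)}
```

Note that a single example test is not enough here: `x + 1` also maps 1 to 2, and only the second example eliminates it. More samples raise the odds that at least one candidate survives filtering.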