Artificial intelligence (AI) has long been a hot topic for companies to explore. But what about AI that designs the very hardware and software it runs on? The idea of using software to automate design is not new; it has been around for a long time. What is new is that companies are now seriously experimenting with AI in the role of designer.
Last week, Google Brain scientists presented a deep reinforcement learning technique for floorplanning, the process of arranging the components of a computer chip. Their paper was published in the peer-reviewed journal Nature.
Google’s Tensor Processing Units (TPUs), the company’s specialized artificial intelligence processors, are already being designed with the help of this reinforcement learning technique.
Designing chips using software isn’t a new innovation. According to Google researchers, the new reinforcement learning model is capable of producing chip layouts similar to or better than those produced by humans for several key metrics, including power consumption, performance, and chip area. And it does so in a fraction of the time a human would.
The work has drawn significant media attention for its claims of superior performance over human designers. Outlets have called it “artificial intelligence software that can design computer chips faster than humans can” and written that “a chip that would take humans months to design can be devised in less than six hours by [Google’s] new AI.”
A second outlet wrote, “AI is designing chips for AI, so things seem to be just getting warmed up.”
Reading the paper, however, what surprised me wasn’t the intricacy of the artificial intelligence used to design computer chips, but its synergy with human intelligence.
The value of analogies, intuitions, and rewards
According to the paper, chip floorplanning involves placing netlists onto chip canvases (two-dimensional grids), ensuring that performance metrics (such as power consumption, timing, area, and wire length) are optimized within hard density and routing constraints.
The goal is to place the components as optimally as possible. As the number of chip components grows, finding an optimal design becomes harder, as with many other combinatorial optimization problems.
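To make the optimization target concrete, here is a minimal sketch of one standard proxy metric for placement quality, half-perimeter wirelength (HPWL). The function name, the toy netlist, and the coordinates are my own illustrations, not taken from the paper:

```python
def hpwl(netlist, positions):
    """Half-perimeter wirelength: a common proxy for the wiring cost of
    a placement. `netlist` maps net names to lists of component names;
    `positions` maps each component to an (x, y) cell on the canvas."""
    total = 0
    for net, components in netlist.items():
        xs = [positions[c][0] for c in components]
        ys = [positions[c][1] for c in components]
        # Half-perimeter of the bounding box around all pins on the net.
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Hypothetical 3-component design on a small grid.
nets = {"n1": ["a", "b"], "n2": ["b", "c"], "n3": ["a", "c"]}
placement = {"a": (0, 0), "b": (3, 0), "c": (0, 4)}
print(hpwl(nets, placement))  # 3 + 7 + 4 = 14
```

An optimizer, human or machine, searches over `positions` to drive metrics like this down while respecting density and routing constraints.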
As target chips become more complex, existing software can help speed up the discovery process, but it falls short when it comes to exploring the vast space of possible chip arrangements. The researchers turned to reinforcement learning, which has proven capable of navigating enormous search spaces in games such as Go.
“The chip floorplanner is analogous [emphasis mine] to a game with varying pieces (for example, network topologies, macro counts, macro sizes and aspect ratios), canvas sizes and win conditions (different evaluative metrics or different density and routing congestion constraints),” the researchers wrote.
Analogy-making is one of the most important aspects of human intelligence. When we solve a problem, we can abstract certain ideas from it and apply them to other problems. This ability to transfer knowledge is a large part of what makes us efficient learners. By recasting chip floorplanning as a board game, the researchers could approach it the same way scientists had solved Go.
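The board-game analogy can be sketched as a toy environment: each turn places one macro on a grid, and a single reward arrives when the “game” ends. Every name and the scoring rule below are illustrative stand-ins, not the paper’s actual formulation:

```python
import random

class FloorplanEnv:
    """Toy game-style environment: place one macro per turn on an
    N x N grid; when all macros are placed, the episode ends and a
    (negative) placement cost is returned as the terminal reward."""

    def __init__(self, n_macros, grid=8):
        self.n_macros, self.grid = n_macros, grid
        self.reset()

    def reset(self):
        self.placed = []  # (x, y) of each macro placed so far
        return tuple(self.placed)

    def legal_actions(self):
        taken = set(self.placed)
        return [(x, y) for x in range(self.grid)
                for y in range(self.grid) if (x, y) not in taken]

    def step(self, action):
        self.placed.append(action)
        done = len(self.placed) == self.n_macros
        if not done:
            return tuple(self.placed), 0.0, done
        # Terminal reward: negative sum of pairwise Manhattan distances,
        # a crude stand-in for wirelength/congestion objectives.
        cost = sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
                   for i, a in enumerate(self.placed)
                   for b in self.placed[i + 1:])
        return tuple(self.placed), -float(cost), done

env = FloorplanEnv(n_macros=3)
env.reset()
done = False
while not done:
    _, reward, done = env.step(random.choice(env.legal_actions()))
print(reward)  # negative placement cost for this random episode
```

A reinforcement learning agent would replace `random.choice` with a learned policy that picks placements to maximize the terminal reward.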
What our brains are physically incapable of, however, is searching very large solution spaces, a feat that deep reinforcement learning models excel at. Still, the scientists were confronted with a far more complex problem than Go: “[T]he state space of placing 1,000 clusters of nodes on a grid with 1,000 cells is of the order of 1,000! (greater than 10^2,500), whereas Go has a state space of 10^360,” the researchers wrote. And the chips they were planning contain millions of nodes.
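The gap between those two numbers is easy to check. The snippet below computes the order of magnitude of 1,000! via the log-gamma function (since lgamma(n+1) = ln(n!)):

```python
import math

# log10(1,000!) via the log-gamma function: lgamma(n + 1) = ln(n!)
log10_factorial_1000 = math.lgamma(1001) / math.log(10)
print(round(log10_factorial_1000))  # 2568, i.e. 1,000! is about 10^2,568

# Placing 1,000 clusters on 1,000 cells thus dwarfs Go's ~10^360 states.
print(log10_factorial_1000 > 2500)  # True
```

This confirms the paper’s “greater than 10^2,500” bound: the placement state space exceeds Go’s by more than two thousand orders of magnitude.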
By encoding chip designs into vector representations, the researchers made the problem space much easier to explore. According to the paper, “We had the intuition [emphasis mine] that a policy that is capable of general chip placement should also be able to decode the state associated with a new unseen chip at inference time.” Their goal was to use this architecture as the encoding layer of their policy, capable of predicting rewards upon placement of new netlists.
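The key property of such an encoding is that chips of any size map to a vector of fixed length, which is what lets one policy handle unseen chips. Here is a deliberately simple illustration of that idea using hand-picked graph statistics; the paper uses a learned graph neural network encoder, not these features:

```python
def encode_netlist(adjacency):
    """Map a netlist of any size to a fixed-length vector using simple
    graph statistics (node count, mean degree, max degree). These
    features are illustrative stand-ins for a learned graph encoder."""
    n = len(adjacency)
    degrees = [len(neighbors) for neighbors in adjacency]
    return [n, sum(degrees) / n, max(degrees)]

# Two hypothetical netlists of different sizes -> same-length vectors.
small = [[1], [0, 2], [1]]              # 3 nodes in a chain
large = [[1, 2, 3], [0], [0], [0], []]  # 5 nodes: a star plus an isolate
print([round(v, 2) for v in encode_netlist(small)])  # [3, 1.33, 2]
print([round(v, 2) for v in encode_netlist(large)])  # [5, 1.2, 3]
```

Because both outputs have the same shape, a single downstream network can consume either one, regardless of how big the chip is.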
There is no universally accepted definition of intuition. It is an extremely complex and poorly understood process, involving experience, unconscious knowledge, and pattern recognition, among other things. We usually develop intuitions by working in a field for years, but they can also come from experience in other domains. Fortunately, high-performance computing and machine learning tools now let us put those intuitions to the test.
In addition, reinforcement learning systems require well-designed rewards. Some scientists even believe that, with the right reward function, reinforcement learning could reach artificial general intelligence. Without the right reward, however, an RL agent can get stuck in endless loops, doing meaningless and stupid things. A well-known example is an RL agent playing Coast Runners that maximized its points while abandoning the main goal of winning the race.
During the development and training of the reinforcement learning model, Google’s scientists adjusted the weights of proxy wirelength, congestion, and density.
Taking advantage of its computational power, the reinforcement learning model was able to find a variety of ways to design floorplans that maximized reward.
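A minimal sketch of such a weighted proxy reward is shown below. The linear combination and the specific weight values are my assumptions for illustration; the paper describes its own weighting of wirelength, congestion, and density:

```python
def proxy_reward(wirelength, congestion, density,
                 w_congestion=0.5, w_density=0.5):
    """Turn a weighted proxy cost into a reward. The linear form and
    the default weights are illustrative assumptions, not the paper's
    exact formulation."""
    cost = wirelength + w_congestion * congestion + w_density * density
    return -cost  # higher reward = lower combined cost

print(proxy_reward(wirelength=10.0, congestion=4.0, density=2.0))  # -13.0
```

Tuning `w_congestion` and `w_density` changes which trade-offs the agent is pushed toward, which is exactly why the researchers had to adjust these weights during development.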
The system also includes a deep neural network developed with supervised learning, in which a model’s parameters are tuned on labeled data during training. Google’s scientists created a dataset of 10,000 chip placements, each labeled with the reward associated with that placement.
To avoid creating every single floorplan manually, the researchers used a mix of human-designed and computer-generated floorplans. The paper makes no reference to how much human effort went into evaluating the algorithm-generated examples, but a supervised learning model without good training data will end up making poor inferences.
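The supervised pretraining step can be sketched in a few lines: fit a reward predictor on (placement features, reward) pairs. The synthetic data, the linear model, and the training hyperparameters below are all assumptions for illustration, standing in for the paper’s 10,000-placement dataset and neural network:

```python
import random

random.seed(0)

# Hypothetical dataset: each placement is summarized by a feature
# vector and paired with a measured reward label (200 samples here).
def make_example():
    features = [random.random() for _ in range(3)]
    true_w = [-2.0, -1.0, -0.5]  # hidden "ground truth" weights
    reward = sum(w * f for w, f in zip(true_w, features))
    return features, reward

data = [make_example() for _ in range(200)]

# Minimal supervised loop: fit a linear reward predictor with SGD.
w = [0.0, 0.0, 0.0]
lr = 0.1
for epoch in range(50):
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]

print([round(wi, 1) for wi in w])  # recovers roughly [-2.0, -1.0, -0.5]
```

The point of the sketch is the failure mode the paragraph warns about: if the labels (here, the `reward` values) were wrong or noisy, the fitted predictor would faithfully learn those wrong labels.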
This makes the AI system distinct from pure reinforcement learning systems such as AlphaZero, which needed no human input to devise its gameplay policy. Google’s researchers may eventually develop an agent that learns without supervision, but given the complexity of the problem, I believe solving it will still require a combination of human intuition, machine learning, and high-performance computing.
Human design vs. reinforcement learning
One particularly interesting aspect of Google’s research is the layout of the chips themselves. We humans use all kinds of shortcuts to overcome the limits of our brains. We can’t tackle an entire complex problem at once, so we divide and conquer complexity through modular, hierarchical design. Our ability to think in and design top-down architectures has been instrumental in developing systems that perform very complex tasks.
Chip design follows a similar pattern: human engineers tend to design chips with clean divisions between modules. Google’s reinforcement learning agent, however, is not bound by those conventions, and it discovered floorplan layouts that perform well without the tidy, modular structure humans favor.
Human intelligence and artificial intelligence
Innovation in AI hardware and software will continue to require abstract thinking, finding the right problems to solve, developing intuitions about solutions, and choosing the right data to validate those solutions. Those are the kinds of skills that better AI chips can enhance, not replace.
Ultimately, I do not believe this is a story of AI outsmarting humans, creating smarter AI, or developing the capability of recursive self-improvement. Rather, it is a story of humans using AI to overcome their own cognitive limitations and extend their capabilities. If there is a virtuous cycle here, it is one of ever-better cooperation between humans and AI.
A new way of writing software?
Perhaps, then, this points to a new way of writing software: humans framing the problem and supplying the intuitions, and machine learning searching the space of solutions. One day, computers may make more of these decisions without being told exactly what to do or how to do it. For now, the industry is taking baby steps into that future.