Artificial Intelligence: Deliverance or Subjugation? Google Tests DeepMind's AI on the Prisoner's Dilemma. Food for Thought.

December 6, 2016, 00:41

DeepMind Opens Free Access to Virtual Machine Learning Environment

  • Popular science,
  • Artificial intelligence,
  • Games and game consoles

Recently, representatives of DeepMind (now part of the Alphabet holding) announced that developers would get free access to the source code of the DeepMind Lab platform. It is a machine learning environment built on the Quake III Arena game engine and designed for training artificial intelligence, specifically, for teaching it to solve problems in three-dimensional space without human intervention.

Inside the game world, the AI takes the form of a sphere that can fly and study the surrounding space. The goal the developers have set for themselves is to teach a weak form of AI to "understand" what is happening and react to various situations in the virtual world. The "character" can perform a number of actions: move through the labyrinth and study its immediate environment.

“We are trying to develop various forms of AI that can perform a range of tasks from the usual study of the game world to performing any actions with analysis of their consequences,” says Shane Legg, chief scientist at DeepMind.

Experts hope that AI can learn through trial and error, and games are almost ideal for this. For example, DeepMind previously used (and still uses) Atari games to teach a neural network the sequential steps required to play them.

But an open, modifiable 3D world is a far more promising environment for training AI than the flat world of graphically simple Atari games. In a three-dimensional world, the AI is given clear tasks that change sequentially, so that the experience gained from solving each task proves useful for solving the next one.

The advantage of a 3D environment is that it can be used to train computer systems to respond to the kinds of problems a robot might encounter in the real world. Industrial robots, for instance, can be trained in such a simulator, and working in a virtual environment is in some cases far easier than teaching such systems "by hand".

Moreover, most modern neural networks are developed to solve one specific problem (image processing, for example). The developers of the new platform promise that it will help create a universal form of AI capable of solving a large number of tasks, and without needing human assistance. The environment for the neural network is randomly generated each time.


According to the developers, the platform helps AI learn in much the same way children do. "The way you or I studied the world as a child," said one DeepMind employee. "The machine learning community has always been very open. We publish about 100 articles a year, and we also open-source many of our projects."

Now Google DeepMind has opened the DeepMind Lab source code and uploaded it to GitHub. Anyone can download the platform code and modify it to suit their needs. Project representatives say that participating specialists can create new game levels on their own and upload their own projects to GitHub, which can help the whole community work toward its goal faster and more efficiently.

This is not DeepMind's only such project. Last month, its representatives entered into a partnership agreement with Activision Blizzard Inc. with the goal of turning the StarCraft II environment into a testbed for artificial intelligence. Other game developers may soon join this project. Notably, the AI gains no special advantage over its opponent in the game environment: like a human player, it acts only on what it can observe.

In practice, this means that Google's AI will need to predict what the opponent is doing at any given moment in order to respond adequately to the "enemy's" actions, and to react quickly when events deviate from the plan. All of this will test the next level of artificial intelligence capabilities. "Ultimately, we want to apply these abilities to solving global problems," said Demis Hassabis, founder of DeepMind (which Google bought in 2014 and whose achievements now underpin its AI development).

AI professionals are cautiously endorsing the project. "The great thing is that they provide a wide variety of environment types," said OpenAI co-founder Ilya Sutskever. "The more types of environments a system encounters, the faster it will evolve," he continued. Indeed, the 3D learning environment contains over 1000 levels and environment types.

Zoubin Ghahramani, a Cambridge professor, believes that DeepMind Lab and similar platforms drive progress in AI by letting researchers explore rich environments.

It looks very likely that artificial intelligence (AI) will be the harbinger of the next technological revolution. If AI evolves to the point where it can learn, think, and even “feel”, all without any human intervention, everything we know about the world will change almost overnight. The era of truly intelligent artificial intelligence will come.

DeepMind

This is why we are so interested in tracking the major milestones in AI that are happening today, including the development of Google's DeepMind neural network. This neural network has already been able to defeat humans in the gaming world, and a new study from Google shows that the creators of DeepMind are not yet sure whether AI prefers more aggressive or cooperative behavior.

The Google team has created two relatively simple scenarios that can be used to test whether neural networks can work together, or will begin to destroy each other when faced with a lack of resources.

Collecting resources

In the first scenario, called Gathering, two DeepMind agents, red and blue, were tasked with collecting green "apples" inside an enclosed space. But the researchers were interested in more than just who would gather the most. Both agents were armed with lasers, which they could fire at any time to temporarily disable the opponent. These conditions allowed two main outcomes: one agent could disable the other and collect all the apples, or the two could let each other gather roughly equal amounts.

Running the simulation at least a thousand times, Google researchers found that the agents were quite peaceful and cooperative as long as many apples remained in the enclosure. But as resources dwindled, the red and blue agents began to attack and disable each other, a situation much like the real life of most animals, including humans.

More fundamentally, smaller and less capable networks consistently preferred close collaboration, while larger, more complex networks tended to favor betrayal and selfishness across the series of experiments.

Search for "victim"

In the second scenario, called Wolfpack, the red and blue agents were asked to track down a nondescript "prey" shape. They could try to catch it separately, but doing so together was more rewarding: it is much easier to corner the prey as a pair.

While the results were mixed for smaller networks, the larger versions quickly realized that collaboration rather than competition would be more beneficial in this situation.

The Prisoner's Dilemma

So what are these two simple versions of the prisoner's dilemma showing us? DeepMind knows it's best to collaborate when it comes to tracking down a target, but when resources are limited, betrayal works well.
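The underlying game is easy to state in code. Below is a sketch using the textbook payoff values (these are illustrative numbers, not the reward settings DeepMind actually used): defection dominates any single encounter, yet mutual cooperation beats mutual defection.

```python
# Classic prisoner's-dilemma payoff matrix (textbook values,
# not the rewards from DeepMind's experiments).
# Each entry: (row player's payoff, column player's payoff).
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

def best_response(opponent_action):
    """Action maximizing the row player's payoff against a fixed opponent."""
    return max(("cooperate", "defect"),
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

# Defection is the dominant one-shot strategy...
assert best_response("cooperate") == "defect"
assert best_response("defect") == "defect"
# ...yet both players do better under mutual cooperation than mutual defection.
assert PAYOFFS[("cooperate", "cooperate")][0] > PAYOFFS[("defect", "defect")][0]
```

This tension, individually rational defection producing a collectively worse outcome, is exactly what the Gathering experiment probes when apples become scarce.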

Probably the worst thing about these results is that the "instincts" of artificial intelligence are too similar to human ones, and we know very well what they sometimes lead to.

Many companies are currently developing artificial intelligence (AI). The simplest forms, capable of performing primitive mental operations, have already been created.

Internet giant Google is actively involved in AI development. In 2014, the company acquired the startup DeepMind Technologies for $400 million. Notably, it was DeepMind Technologies that developed a device combining the properties of a neural network with the computing capabilities of a conventional computer. Scientists are confident this development brings humanity closer to creating full-fledged artificial intelligence.

The DeepMind Technologies device is a computer that reproduces the way the human brain stores and manages information, specifically its short-term memory. The basis of the device is a kind of neural network whose structure resembles that of the brain: interconnected neurons. The peculiarity of this AI is that after performing a number of simple tasks, the computer can use the stored data to perform more complex ones. Thus the AI is self-learning and driven to evolve, which, it is argued, could ultimately lead to confrontation between AI and humans.

According to the world-famous physicist Stephen Hawking, artificial intelligence poses a threat to humanity. He stated this in an interview with the BBC: "The primitive forms of artificial intelligence that exist today have proven their usefulness. However, I think the development of full artificial intelligence could spell the end of the human race. Sooner or later, man will create a machine that will get out of control and surpass its creator. Such a mind will take the initiative and begin to improve itself at an ever-increasing speed. Human possibilities are limited by our too-slow evolution; we will not be able to compete with the speed of machines, and we will lose."

Hawking's view is shared by other scientists and specialists, including Elon Musk, the well-known American IT entrepreneur and creator of Tesla and SpaceX. Musk has said that AI could be more dangerous than nuclear weapons and poses a serious threat to the existence of humanity.

Google has set itself the goal of creating a superintelligence by 2030, embedded in a computer system, in particular the Internet. When a user searches for information, the superintelligence will analyze that person's psychological profile and return the information it considers appropriate. Eric Schmidt, chairman of Google's board of directors, writes about this in his book, and suggests treating those who refuse to connect to the system as potentially dangerous to the state. It is assumed that a legislative basis for operating such a system will be prepared at the state level.

Thus the superintelligence, once developed, would become a global instrument of control over humanity. With its advent, humans would cease to do science; that would be done by the superintelligence, which would surpass the human brain many times over in every respect.

Reference:

Superintelligence: any intellect that significantly surpasses the best human minds in almost every area, including scientific research, social skills, and other fields.

The result of creating a superintelligence would be that the human species ceases to be the most intelligent form of life in the known part of the universe. Some researchers believe that creating a superintelligence would be the last step in human evolution, as well as the last invention humanity would ever need to make, since the superintelligence is assumed to be able to take care of subsequent scientific and technological progress far more efficiently than humans.

Food for thought:

Since 2007, a British hotel has hosted the annual Google Zeitgeist conference. Interestingly, this meeting is attended not only by high-tech specialists but also by representatives of transnational corporations and international banks. One may conclude that the leaders of these corporations and banks are interested in creating a superintelligence, and possibly finance the project.

Rasul Girayalaev

Google DeepMind researchers have presented a new type of artificial intelligence system, the so-called differentiable neural computer (DNC). The system combines the trainability of neural networks with the deductive abilities of traditional AI. A description of the system was published in the journal Nature; the same issue carries a commentary on the work, and a short summary can be read on the DeepMind blog.

The simplest neural networks are prediction (regression) systems whose task is to map input data to a corresponding answer. For example, a simple neural network can recognize characters from their images. In this sense, a neural network can be viewed as a mathematical function, and moreover a differentiable one. To train a neural network in this paradigm means to optimize that function using standard mathematical methods.
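The "network as a differentiable function" framing can be made concrete with a toy example: a one-parameter model y = w * x fitted by gradient descent on squared error. The data and learning rate below are invented purely for illustration.

```python
# A one-parameter "network" y = w * x, trained by gradient descent on
# mean squared error: a toy instance of optimizing a differentiable function.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # points lying on y = 2x
w = 0.0
lr = 0.02
for _ in range(500):
    # d/dw of mean squared error over the dataset
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad

assert abs(w - 2.0) < 1e-6  # gradient descent recovers the true slope
```

Real networks have millions of parameters instead of one, but the training loop, compute the gradient of a differentiable loss and step against it, is the same idea.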

The ability to learn from data without direct human programming is the main advantage of neural networks. However, the simplest neural networks are not Turing complete; that is, they cannot do everything that traditional algorithmic programs can (which does not mean they cannot do some of those things better than programs can). One reason is that neural networks lack a memory with which to operate on input data and store local variables.

Relatively recently, a more complex type of neural network appeared in which this drawback is mitigated: the recurrent neural network. Such networks store not only the learned state (the matrix of neuron weights) but also the previous state of the neurons themselves. As a result, the output of such a network is influenced not only by the input data and the weight matrix, but also by its immediate history. The simplest network of this type can, for example, "intelligently" predict the next character in a text: trained on a corpus, it can answer "l" when fed the character "l" preceded by "h" and "e", but answer "o" when the same "l" is preceded by "h", "e", "l" (spelling out the word "hello").
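The history-dependence described above can be sketched with a minimal, untrained recurrent step in numpy. The weights here are random, so the outputs are meaningless; the point is only that the same input character produces different outputs depending on the hidden state accumulated from the preceding characters.

```python
import numpy as np

# Minimal character-level RNN forward pass (untrained, random weights).
rng = np.random.default_rng(0)
chars = "helo"
idx = {c: i for i, c in enumerate(chars)}
V, H = len(chars), 8                      # vocabulary size, hidden size
W_xh = rng.normal(size=(H, V)) * 0.5      # input-to-hidden weights
W_hh = rng.normal(size=(H, H)) * 0.5      # hidden-to-hidden (the recurrent part)
W_hy = rng.normal(size=(V, H)) * 0.5      # hidden-to-output weights

def run(sequence):
    h = np.zeros(H)
    for c in sequence:
        x = np.zeros(V); x[idx[c]] = 1.0  # one-hot encode the character
        h = np.tanh(W_xh @ x + W_hh @ h)  # the state carries the history
    return W_hy @ h                       # logits over the next character

# The final input character is "l" in both cases, but the histories differ,
# so the hidden states, and hence the predictions, differ too.
out_after_he = run("hel")
out_after_hel = run("hell")
assert not np.allclose(out_after_he, out_after_hel)
```

A trained network exploits exactly this mechanism to answer "l" in one context and "o" in the other.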

An example of a recurrent neural network with one hidden layer, showing how feeding in data changes the state of the network. The trained neuron weights are stored in the matrices W_xh and W_hy, and in the matrix W_hh, which is specific to recurrent networks.

Andrej Karpathy blog

Recurrent neural networks have performed very well at generating music or text "in the style" of a particular author on whose corpus they were trained, as well as, more recently, in a range of other systems.

Formally speaking, even the simplest recurrent neural networks are Turing complete, but they have an important drawback: the implicit nature of their memory use. In a Turing machine the memory and the processor are separate (which allows either to be redesigned independently), whereas in recurrent neural networks, even the most advanced of them (LSTMs), the size of the memory and the way it is handled are determined by the architecture of the network itself.

To correct this inherent limitation of LSTM networks, scientists at DeepMind (all of them among the authors of the new article) recently proposed an architecture called the Neural Turing Machine. In it, the controller and memory are separated, as in an ordinary Turing machine, yet the system remains a differentiable function, which means it can be trained on examples (using error backpropagation) rather than explicitly programmed. The new system, the differentiable neural computer (DNC), is based on the same architecture, but the communication between controller and memory is organized far more flexibly: it implements not only memorization but also contextual recall and forgetting (a separate section of the new article is devoted to comparing the two systems).

Simply put, the DNC works as follows. The system consists of a controller, which can be practically any recurrent neural network, and a memory. The controller has special modules for accessing the memory, and on top of the memory sits a special "overlay" in the form of a matrix that stores the history of its use (more on this below). The memory itself is an N × M matrix whose N rows are the cells into which data is written (as vectors of length M).


DNC architecture: data paths are shown as lines of black and white squares, which simply represent positive and negative numbers in a vector. The read heads have three modes of operation, C, B and F, i.e. content-based (associative), backward and forward: these are ways of comparing an input vector with a vector in a memory cell. The memory has dimensions N × M. On the far right is a schematic of the N × N "meta-memory" matrix storing the sequence of memory accesses.

The main difference between the DNC and similar systems lies in how it handles memory. It simultaneously implements several new or only recently introduced concepts: selective attention, contextual search, recall by association, and forgetting. For example, whereas ordinary computers access memory explicitly ("write such-and-such data to such-and-such cell"), in the DNC a write formally touches all cells at once, with the degree to which new data displaces old determined by attention weights over the cells. This implementation of the concept is called "soft attention", and it is precisely what ensures differentiability: systems with hard attention do not satisfy the continuity requirement and cannot be trained by error backpropagation (reinforcement learning is used instead). In practice, however, even the DNC's "soft attention" is implemented "rather sharply", so one can still speak of writing to or reading from a particular row of the memory matrix.
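The soft-attention idea can be illustrated with a minimal numpy sketch. This is a simplification, not DeepMind's actual update equations: every write and read touches all N rows, weighted by a normalized attention distribution, so the whole operation stays differentiable.

```python
import numpy as np

N, M = 4, 3                      # N memory rows of width M
memory = np.zeros((N, M))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_write(memory, weights, vector):
    # Every row moves toward `vector` in proportion to its attention weight.
    return memory * (1 - weights[:, None]) + weights[:, None] * vector

def soft_read(memory, weights):
    # The read result is a weighted blend of all rows.
    return weights @ memory

# Attention sharply peaked on row 0, yet nonzero everywhere.
w = softmax(np.array([5.0, 0.0, 0.0, 0.0]))
memory = soft_write(memory, w, np.array([1.0, 2.0, 3.0]))
r = soft_read(memory, w)

# Row 0 dominates the write, but every row was touched a little:
assert np.argmax(memory[:, 0]) == 0
assert (memory[1:] > 0).all()
```

When the attention distribution is as peaked as here, the write behaves almost like a hard write to one row, which is why one can still talk about "the" row being written, while gradients flow through the whole operation.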

"Soft attention" is implemented in three modes. The first is contextual search, which allows the DNC to complete partial data: when a fragment of some sequence resembling one already stored in memory is fed to the controller, the read operator in context-search mode finds the closest matching row and "mixes" it with the input data.

Second, attention to different parts of memory can be driven by the history of its use. This history is stored in an N × N matrix in which each cell (i, j) holds a value close to 1 if a write to row i was followed by a write to row j (and close to zero otherwise). This "meta-memory matrix" is one of the fundamental differences between the new DNC and the older NTM: it lets the system "recall" blocks of data in sequence when they frequently occur in each other's context.
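A toy version of such a history matrix can be written in a few lines. This is a deliberate simplification of the DNC's temporal link matrix, with hard 0/1 entries instead of the smooth, decaying values the real system maintains.

```python
import numpy as np

# Toy "meta-memory": link[i, j] is set to 1 when a write to row i
# is immediately followed by a write to row j.
N = 4
link = np.zeros((N, N))
last_written = None

def record_write(row):
    global last_written
    if last_written is not None:
        link[last_written, row] = 1.0
    last_written = row

for row in (0, 2, 1):            # writes go to rows 0, then 2, then 1
    record_write(row)

assert link[0, 2] == 1.0         # a write to row 0 was followed by row 2
assert link[2, 1] == 1.0         # then row 2 was followed by row 1
assert link[0, 1] == 0.0         # row 0 was never directly followed by row 1
```

Following the 1-entries of such a matrix forward (or backward) is what lets the system replay data in the order it was stored: the F and B read modes in the architecture diagram.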

Third, a special attention mode lets the system manage writes to different rows of memory: keeping the important and erasing the unimportant. A row is considered the more "used" the more often it has been written to, while reading from a row can, on the contrary, lead to its gradual erasure. The usefulness of this function is obvious in the example of training a simple repeater on the DNC (the network must exactly reproduce the sequence of data fed to it): with the possibility of erasure, even a small memory suffices to repeat an unlimited amount of data. Note that a repeater is trivial to implement programmatically, but getting a neural network to learn the same behavior from examples is a much harder task.
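One reason the repeater (copy) task is a popular benchmark is that training data for it costs nothing to generate: the target sequence is simply the input sequence. A minimal sketch, with pattern sizes chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def copy_task_example(length, width):
    """One (input, target) pair for the copy task: the network must
    reproduce a random binary pattern exactly."""
    pattern = rng.integers(0, 2, size=(length, width)).astype(float)
    return pattern, pattern.copy()

x, y = copy_task_example(length=5, width=10)
assert np.array_equal(x, y)      # the target is the input itself
```

Millions of such examples can be produced on the fly, which is exactly the property, discussed further below, that makes a task practical for training by example.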


A diagram of a DNC-based repeater in operation. Time runs from left to right. The top row shows the data the controller receives as input: first a column of ten black bars (all zeros), then several white and black bars, then again several white and black bars in a different sequence. Below, where the controller's output is displayed in the same way, we first see black bars and then an almost exact reproduction of the input patterns. A new sequence is then fed in and, after a delay, played back at the output. The middle panel shows what happens to the memory cells during this time: green squares are writes, pink are reads, and saturation shows the "strength of attention" on each cell. The system first writes the received patterns to cell 0, then 1, and so on up to 4. At the next step it again receives only zeros (the black field), so it stops recording and starts playing the patterns back, reading them from the cells in the same order in which they were written. At the very bottom, the activation of the gates controlling the freeing of memory is shown.

Alex Graves et al., Nature, 2016

Scientists tested the resulting system on several benchmark problems. The first is bAbI, a standardized text-comprehension test recently developed by Facebook researchers. In it, the AI system is given a short text in which several characters act, and it must then answer a question about the text ("John went to the garden. Mary took a bottle of milk. John returned to the house. Question: where is John?").

On this synthetic test, the new system showed a record-low error rate: 3.8 percent against the previous record of 7.5 percent, outperforming both LSTM networks and the NTM. Remarkably, all the system received as input was a sequence of words that initially meant nothing to the untrained network, whereas the traditional AI systems that had previously passed this test were given explicitly formalized statements with a rigid structure (action, actor, and so on). The recurrent network with dedicated memory worked out the roles of the words in those sentences entirely on its own.

A much harder test was graph comprehension. It, too, was presented as a sequence of sentences, but this time they described the structure of a network: the real London Underground, or a typical family tree. The similarity to bAbI is that the characters in a standardized text can also be represented as the nodes of a graph, with their relations as its edges. In the bAbI texts, however, the graph is rather primitive, nothing like the size of the London Underground. (The difficulty of a neural network understanding the Underground map becomes clearer when you recall that its description is given in words, not as an image: try memorizing the metro map of any large city from text alone and then answering questions about it.)

After training on a million examples, the DNC learned to answer questions about the Underground map with 98.8 percent accuracy, while the LSTM-based system barely coped with the task at all, giving only 37 percent correct answers. (These figures are for the simplest kind of task: "Where do I end up if I travel so many stations on such-and-such a line, transfer, and travel so many more stations?" The problem of the shortest path between two stations proved harder, but the DNC coped with it as well.)

A similar experiment was carried out with a family tree: the program was given a sequence of formal statements about the relationships in a large family and had to answer questions like "who is Masha's second cousin on her mother's side?" Both tasks come down to finding a path in a graph, which is simple to solve in the traditional way. The value of the work lies in the fact that here the neural network found the solution entirely on its own, relying not on algorithms known from mathematics but on examples and a reward signal during training.

Learning curves for the SHRDLU task: the DNC system (green) versus LSTM (blue).

The third test was a slightly simplified version of the classic SHRDLU task, in which virtual objects must be moved around a virtual space to reach a specified final configuration. The DNC again received a description of the current state of the virtual space as formalized sentences, was given the task in the same form, and answered with sequential text describing how to move the objects. As in the other tests, the DNC proved significantly more efficient than LSTM systems, as the learning-rate graphs clearly show.

At the risk of repeating the obvious, I cannot stress enough that the simplicity of the tasks on which the DNC was tested is only apparent, in the sense that it does not reflect the complexity of the real problems a system like the DNC will be able to handle in the future. From the standpoint of existing algorithms, of course, finding a route in the metro is trivial: anyone can download a phone app that does it, complete with transfer times and advice on which carriage to board. But all such programs have so far been written by people, whereas in the DNC the solution is "born" by itself, in the process of learning from examples.

In fact, there is one very important point to make about the simplicity of the test problems. One of the central problems in machine learning is where to get the data on which a system can be trained. Producing this data "by hand", by creating it yourself or with hired help, is too expensive. Any machine learning project needs a simple algorithm that can easily and cheaply generate gigabytes of new training data (or else access to ready-made databases). A classic example: to test character-recognition systems, people do not write new letters by hand; they use a simple program that distorts existing images. If you have no good algorithm for producing a training sample (or if such an algorithm cannot be created in principle), your development will fare about as well as medical bioinformatics, whose practitioners are forced to work only with real, and therefore genuinely "gold-standard", data (in a nutshell: not very well).
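The image-distortion trick mentioned above can be sketched in a few lines. The one-pixel shifts here are a toy stand-in for the richer distortions (rotations, elastic warps, noise) that real augmentation pipelines apply: each distorted copy is a new labeled example at zero annotation cost.

```python
import numpy as np

def shifted_variants(image):
    """All one-pixel shifts of an image (including the original position).
    np.roll wraps pixels around the edge; fine for a toy demo."""
    return [np.roll(np.roll(image, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

glyph = np.zeros((8, 8))
glyph[2:6, 3] = 1.0              # a crude vertical stroke as the "letter"
variants = shifted_variants(glyph)

assert len(variants) == 9        # the original position plus 8 shifts
# Shifting moves pixels but creates and destroys nothing:
assert all(v.sum() == glyph.sum() for v in variants)
```

Nine labeled examples from one, for free; scaling this up is how recognition datasets are stretched into the millions of examples that training requires.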

This is where ready-made algorithms for solving graph problems came in handy for the authors: as a way to generate millions of correct question-answer pairs. There is little doubt that the ease of creating a training sample determined the choice of tests for the new system. It is important to remember, however, that the DNC architecture itself has nothing to do with the simplicity of these tests. After all, even primitive recurrent neural networks can not only translate texts and describe images but also generate text and music in a given style. All the more can be expected from advanced, genuinely "smart" systems like the DNC.

Alexander Ershov

Google is buying London-based artificial intelligence company DeepMind. Sources say the deal is worth more than $500 million, and Google representatives have officially confirmed the purchase.


What will Google gain from the acquisition? First, it will allow Google to compete with other big technology companies through its focus on deep learning. Facebook, for example, recently hired Professor Yann LeCun to lead its own AI effort; IBM's Watson supercomputer team is now focusing on deep learning; and Yahoo recently acquired the photo-analysis startup LookFlow, which is moving in the same direction.

DeepMind was founded by neuroscientist Demis Hassabis, former chess prodigy, Skype and Kazaa developer Jaan Tallinn and researcher Shane Legg.

The move will let the tech giant fill out its own ranks of artificial intelligence experts, and sources say the acquisition was personally overseen by Google CEO Larry Page. If all three founders go to work for Google, they will join inventor, entrepreneur, author, and futurist Ray Kurzweil, who in 2012 became head of Google's machine learning and language processing efforts.

Kurzweil stated that he wants to build a search engine so perfect that it can become a true "cybernetic friend".

Following the acquisition of Nest earlier this month, critics raised concerns about how much user data would be sent to Google. The Boston Dynamics purchase last month also sparked a debate over whether Google plans to become a robot maker.

However, Google is well prepared to allay our concerns about its latest acquisitions. Sources say Google has decided to establish an ethics council that will oversee the development of artificial intelligence within DeepMind.

However, the company will have to clarify what exactly DeepMind's artificial intelligence does. DeepMind's website currently carries a relatively vague landing page describing it as a "cutting-edge" company building learning algorithms for simulation, e-commerce, and gaming. As of December, the startup had 75 employees.

The main sponsors of the startup are Founders Fund and Horizons Ventures. DeepMind was founded three years ago.

In 2012, Carnegie Mellon professor Larry Wasserman wrote that the "startup is going to build a system that thinks. I thought this was sheer madness until I found out how many famous billionaires had invested in the company."