The term was coined by Eliezer Yudkowsky,[1] who is best known for popularizing the idea,[2][3] to discuss superintelligent artificial agents that reliably implement human values.
Kevin LaGrandeur showed that the dangers specific to AI can be seen in ancient literature concerning artificial humanoid servants such as the golem, or the proto-robots of Gerbert of Aurillac and Roger Bacon.
In those stories, the extreme intelligence and power of these humanoid creations clash with their status as slaves (which by nature are seen as sub-human) and cause disastrous conflict.[12]
In 2014, Luke Muehlhauser and Nick Bostrom underlined the need for 'friendly AI';[13] nonetheless, the difficulties in designing a 'friendly' superintelligence, for instance via programming counterfactual moral thinking, are considerable.[18]
In his book Human Compatible, AI researcher Stuart J. Russell lists three principles to guide the development of beneficial machines: the machine's only objective is to maximize the realization of human preferences; the machine is initially uncertain about what those preferences are; and the ultimate source of information about those preferences is human behavior.
John McGinnis, who calls on governments to accelerate friendly AI research, urges AI researchers to convene a meeting similar to the Asilomar Conference on Recombinant DNA, which discussed the risks of biotechnology.
McGinnis notes that his proposal stands in contrast to that of the Machine Intelligence Research Institute, which generally aims to avoid government involvement in friendly AI.[21]
Boyles and Joaquin, on the other hand, argue that Muehlhauser and Bostrom's proposal to create friendly AIs appears bleak.
Adam Keiper and Ari N. Schulman, editors of the technology journal The New Atlantis, say that it will be impossible ever to guarantee "friendly" behavior in AIs because problems of ethical complexity will not yield to software advances or increases in computing power.[23]
The inner workings of advanced AI systems may be complex and difficult to interpret, leading to concerns about transparency and accountability.