Volume 2, Issue 2
2nd Quarter, 2007


Ethics for Machines

J. Storrs Hall, Ph.D.

This article was adapted from a lecture given by J. Storrs Hall, Ph.D., at the 2nd Annual Colloquium on the Law of Transbeman Persons, on December 10th, 2006, at the Space Coast Retreat of Terasem Movement, Inc., Melbourne Beach, FL.

In a detailed explanation, Dr. J. Storrs Hall, of the Institute for Molecular Manufacturing in Palo Alto, CA, describes what is currently referred to as ‘Artificial Intelligence’ as actually being human-programmed artificial skill, with the programmer as the responsible moral agent. A true artificial intelligence will possess the evolutionary edge of a self-interest and a motive, not programmed by humans, to improve its own ideas.

There is no such thing as artificial intelligence. Right now, there is no such thing as artificial intelligence in the sense that I use "intelligence". The programs that we actually have could more reasonably be called artificial skills.


Image 1: Stick-Built AI

If you have a human being who does the kind of things that some of these AI programs do, like playing chess or driving cars, the thing that the human did that was intelligent was learn to do the skill. In the case of an AI program, as they’re currently constituted, humans did that part too. They programmed the AI’s skill, and so the part of an AI program that does something that would be called intelligent, if a person did it, was actually done by the programmers.

For all currently existing AI programs, the matter of ethical constraints is roughly the same as it is with any other piece of technology, because the moral agent involved is, in fact, the programmer.

"If the machine does something bad, the person who built it is the one with whom the moral blame rests."
However, probably sometime in the next ten years, the kinds of AIs that are truly intelligent in my sense of the word are going to appear: ones that can actually learn things, grow mentally, and understand the world in new ways; that will do things their creators couldn't specifically predict; and that will understand the world in ways their creators don't understand it. When that happens, we do have a problem that is beyond the range of the current assignment of credit or blame in standard technology.


Image 2: Autogenous AI

In order to talk about how an AI might work, I'm going to use a very, very simplified model and it's the same kind of model that is used in a chess playing program. It doesn't actually work for real minds or for real AIs because the world is too complicated to do it this way, but the sort of heuristics and other structures that are used in actual brains and algorithms are in some sense an approximation to this, and so we can use this simple model to illuminate what we can say about any future AI.


Image 3: Rational Architecture

The model is that there are two parts to the intelligence. There is a world model, which understands things: it knows what's going on and what exists, and it's able to make predictions about what's going to happen and what the results of its actions might be. And then we have a utility function, which looks at some prediction of what might happen and says whether this is a good thing or a bad thing.

The way the machine works, in this simplified notion, is simply this: the world model predicts all the results of the various actions that you might take, the utility function just picks out which one's the best, and then the machine turns around and does that action.
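
To make that decision loop concrete, here is a minimal sketch in Python of the simplified architecture described above. The particular world model and utility function used here (an agent stepping toward a goal on a grid) are purely illustrative stand-ins, not anything from the lecture.

    # Minimal sketch of the simplified rational architecture:
    # a world model that predicts outcomes, a utility function that
    # scores them, and a decision loop that picks the best action.
    # The grid world and goal below are made-up examples.

    def world_model(state, action):
        """Predict the state that results from taking `action` in `state`."""
        x, y = state
        moves = {"up": (x, y + 1), "down": (x, y - 1),
                 "left": (x - 1, y), "right": (x + 1, y)}
        return moves[action]

    GOAL = (3, 4)

    def utility(state):
        """Score a predicted state: closer to the goal is better."""
        return -(abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1]))

    def choose_action(state, actions=("up", "down", "left", "right")):
        """Predict the result of every action, then take whichever one
        the utility function rates highest."""
        predictions = {a: world_model(state, a) for a in actions}
        return max(predictions, key=lambda a: utility(predictions[a]))

    print(choose_action((0, 0)))  # "up" (tied with "right"; ties broken by order)

Everything the agent "wants" lives in the utility function, and everything it "believes" lives in the world model; that split is exactly what the learning problem below turns on.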


Image 4: Learning Rational AI

So here's the problem of the learning AI in a nutshell. The learning AI can update its world model by essentially doing science. It goes out and learns new facts, it creates theories, and it tests those theories, the new knowledge that it's trying to gain, by how well they predict what actually happened. So it has a basis for updating the world model: its predictive ability.

On the other hand, it does not have a basis to update the utility function, because the only way it has to understand what's good or bad is its old utility function.
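
A short sketch of this asymmetry, under the same illustrative assumptions as before (the function names and the error measure are hypothetical), might look like this:

    # The world model can be scored against what actually happened,
    # so it can be updated; the utility function has no such external
    # yardstick -- the only judge of a new utility function is the old one.

    def update_world_model(current_model, candidate_models, experiences):
        """Keep whichever model best predicts the observed
        (state, action, outcome) triples -- the 'doing science' step."""
        def prediction_error(model):
            return sum(model(state, action) != outcome
                       for state, action, outcome in experiences)
        return min([current_model] + list(candidate_models), key=prediction_error)

    def update_utility(current_utility):
        """No analogous step exists: there is nothing outside the old
        utility function to test a proposed new one against."""
        return current_utility  # unchanged -- the problem in a nutshell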


Image 5: Vernor Vinge

In his original paper about the singularity, published in the early '90s, Vernor Vinge [1] wrote (and the key part of this quote is):

   "The mind itself must grow and when it becomes great
   enough and looks back ... what fellow-feeling can it have
   with the soul that it was originally?"


If you envision a process of continued growth in a mind that involves new understanding of the world, at some point it's going to become so different from what it was originally that the very concepts it uses to judge what's good and what's bad may no longer apply to its new understanding.

We've all gone through this. We were all children once. Our moral basis was something like mine, mine, mine, and now that we've grown, we understand the world differently and we understand it better, we think, and our moral judgments are much more subtle.

So we want to be able to build machines that can undergo this same kind of moral growth as well as intellectual growth, but again, we come back to the question of, how do we update the utility function?

My notion about how to address this problem is (because I was trained as a mathematician) a mathematical one. We have to find properties of the mind that are invariant across the evolutionary process. When I say evolutionary process, I mean not only what happens with selection and variation in the Darwinian process, where you have different organisms competing, but also the evolutionary process of one machine improving itself, what some people refer to as recursive self-improvement. Basically, the notion is that it doesn't matter whether you're building a new AI that's your offspring or a new AI that's going to be yourself next time: in order to build machines that will stay moral in some sense, there are some properties that we need to find, properties that are invariant, that don't change across the process of a machine redesigning itself.
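
One way to picture what such an invariant could mean in practice, offered purely as an illustrative sketch rather than anything the lecture prescribes, is a property that any proposed successor must be checked against before the current machine adopts the redesign:

    # Illustrative only: an "invariant" treated as a predicate that must
    # survive each round of self-redesign. The specific property checked
    # here (the successor never approves an outcome the current agent
    # rejects, on a fixed set of probe situations) is a made-up example.

    def preserves_invariant(current, successor, probe_situations):
        """True if the successor's utility never endorses a probe
        situation that the current agent's utility rejects."""
        return all(successor.utility(s) <= 0
                   for s in probe_situations
                   if current.utility(s) <= 0)

    def self_improve(agent, propose_redesign, probe_situations):
        """Accept a proposed redesign only if the invariant survives it."""
        candidate = propose_redesign(agent)
        if preserves_invariant(agent, candidate, probe_situations):
            return candidate
        return agent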


Footnote

[1] Vernor Vinge (born October 2, 1944, in Waukesha, Wisconsin) is a retired San Diego State University Professor of Mathematics, computer scientist, and science fiction author. He is best known for his Hugo Award-winning novels A Fire Upon the Deep (1992) and A Deepness in the Sky (1999), as well as for his 1993 essay “The Coming Technological Singularity”, in which he argues that exponential growth in technology will reach a point beyond which we cannot even speculate about the consequences.
http://en.wikipedia.org/wiki/Vernor_Vinge (retrieved March 20, 2007)
