SIAI's Guidelines for building 'Friendly' AI

The Singularity Institute for Artificial Intelligence (SIAI) recently published v1.0 of its guidelines for designing 'Friendliness' into intelligent systems.

AI systems, and especially self-improving systems ('Seed AI'), may well achieve a 'critical mass' of intelligence from which their capabilities could grow hyper-exponentially over a short period of time. Such sudden growth obviously poses risks. In fact, even without a 'hard take-off', we should carefully consider the possibility of future, more autonomous AI systems acting in unanticipated goal-directed ways inimical to our well-being.

The question of our relationship with truly intelligent machines is a crucial one, and I applaud and support SIAI for its work in this area.

Several issues come to mind in evaluating the Guidelines:

Most fundamentally: Are the Guidelines necessary? Will machines ever have 'a will of their own', or will they always remain subservient to our purposes and goals? (Neither the Guidelines nor these comments address the substantial problem of AIs specifically programmed or instructed to be malevolent.)

My present view on this is agnostic. It may well turn out that any super-intelligence will be inherently benevolent towards us, or that it will remain neutral, with no goals other than those of its designer/operator. On the other hand, I do acknowledge the possibility of an independent rogue AI. Obviously, we should err on the side of caution.

The next question, then: Are SIAI's Guidelines theoretically sound? I have grave reservations about several assumptions and conclusions of the AI design underlying the Guidelines, as well as about the Guidelines themselves. However, allowing for a convergence of ideas, I want to move on to a more practical question:

Can the Guidelines be implemented? Currently only a few dozen AI researchers/teams (worldwide!) are actually focusing on theoretical or practical aspects of achieving general, human-level machine intelligence. Even fewer claim to have a reasonably comprehensive theoretical framework for achieving it.

The practicality of implementing the Guidelines must be assessed in the context of specific AI design proposals. It would be valuable to have feedback from all the various players: both their overall view of the need for (and approach towards) 'Friendliness', and whether implementing SIAI's guidelines would be compatible with their own designs.

Here are the Guidelines' eight design recommendations, considered in the light of my theory of mind/intelligence:

1) Friendliness-topped goal system – Not possible: my design does not allow for such a high-level 'supergoal'.

2) Cleanly causal goal system – Not possible: this presupposes 1).

3) Probabilistic supergoal content – Inherent in my design: all knowledge and goals are subject to revision. (This point, together with points 6 and 8, is illustrated in the sketch following this list.)

4) Acquisition of Friendliness sources – While I certainly encourage the AI to acquire knowledge (including ethical theory) compatible with what I consider moral, this will not necessarily agree with what others regard as desirable ethics/Friendliness.

5) Causal validity semantics – Inherent in my design: one of the key functions of the AI is to (help me) review and improve its premises, inferences, conclusions, etc. at all levels. Unfortunately, this ability only becomes really effective once a significant level of intelligence has already been reached.

6) Injunctions – This seems like a good recommendation; however, it is not clear which specific injunctions should be implemented, how to implement them effectively, and to what extent they would conflict with the other recommendations/features.

7) Self-modeling of fallibility – Inherent in my design. This seems to be an abstract expression of point 3).

8) Controlled ascent – Good idea, but may be difficult to implement: it may be hard to distinguish between rapid knowledge acquisition, improvements in learning, and overall self-improvement (i.e. substantial increases in intelligence).
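
To make points 3), 6) and 8) more concrete, here is a minimal sketch, in Python, of what probabilistically held goals, injunction checks and a controlled-ascent guard might look like. All names, actions and thresholds are hypothetical illustrations of the concepts only; this is neither SIAI's specification nor my own design.

    from dataclasses import dataclass

    @dataclass
    class Goal:
        # Point 3: goal content carries an explicit degree of belief,
        # rather than being held as a fixed axiom.
        description: str
        confidence: float  # degree of belief that the goal, as stated, is correct

        def revise(self, description: str, confidence: float) -> None:
            # Any goal, including the top-level one, remains open to revision.
            self.description = description
            self.confidence = confidence

    # Point 6: injunctions as hard pre-action checks. Which injunctions to
    # adopt, and how to enforce them, is exactly what remains unclear.
    INJUNCTIONS = [
        lambda action: action != "rewrite_goal_system_unreviewed",  # hypothetical
    ]

    def permitted(action: str) -> bool:
        return all(rule(action) for rule in INJUNCTIONS)

    # Point 8: a crude controlled-ascent guard. The hard part is the metric
    # itself: a capability score that separates mere knowledge acquisition
    # from a genuine increase in intelligence.
    ASCENT_LIMIT = 1.5  # hypothetical maximum capability ratio per cycle

    def ascent_ok(capability_before: float, capability_after: float) -> bool:
        return capability_after / capability_before <= ASCENT_LIMIT

    supergoal = Goal("act benevolently towards humans", confidence=0.9)
    supergoal.revise("act benevolently towards humans", confidence=0.95)
    print(permitted("rewrite_goal_system_unreviewed"))  # False: blocked
    print(ascent_ok(1.0, 1.4))                          # True: within the limit

Even in this toy form, the difficulty of point 8) is apparent: everything hinges on the 'capability' measurement that the sketch simply takes as given.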



Peter Voss, June 2001