Article

The big red button is too late: an alternative model for the ethical evaluation of AI systems

Journal

ETHICS AND INFORMATION TECHNOLOGY
Volume 20, Issue 1, Pages 59-69

Publisher

SPRINGER
DOI: 10.1007/s10676-018-9447-7

Keywords

Artificial intelligence; Ethics; Computational architecture

Funding

  1. Office of Naval Research [N00014-16-1-2278]

As a way to address both ominous and ordinary threats of artificial intelligence (AI), researchers have started proposing ways to stop an AI system before it has a chance to escape outside control and cause harm. A so-called big red button would enable human operators to interrupt or divert a system while preventing the system from learning that such an intervention is a threat. Though an emergency button for AI seems to make intuitive sense, that approach ultimately concentrates on the point when a system has already gone rogue and seeks to obstruct interference. A better approach would be to make ongoing self-evaluation and testing an integral part of a system's operation, to diagnose how the system is in error, and to prevent chaos and risk before they start. In this paper, we describe the demands that recent big red button proposals have not addressed, and we offer a preliminary model of an approach that could better meet them. We argue for an ethical core (EC) that consists of a scenario-generation mechanism and a simulation environment that are used to test a system's decisions in simulated worlds, rather than the real world. This EC would be kept opaque to the system itself: through careful design of memory and the character of the scenario, the system's algorithms would be prevented from learning about its operation and its function, and ultimately its presence. By monitoring and checking for deviant behavior, we conclude, a continual testing approach will be far more effective, responsive, and vigilant toward a system's learning and action in the world than an emergency button which one might not get to push in time.
