An organization dedicated to the safe development of artificial intelligence released a “breakthrough paper” it said would help humans better control the technology as it spreads.
“We can’t trust AIs if we don’t know what they are thinking or how they work on the inside,” Dan Hendrycks, director of the Center for AI Safety, told Fox News Digital.
Hendrycks made the comments after the Center for AI Safety (CAIS) released a paper this week diving into the inner workings of AI systems, looking for ways humans could better understand and control AI technologies and mitigate some of the risks they pose.
According to CAIS, the paper demonstrated ways humans can detect and control when AI systems are telling truths or lies, when they behave morally or immorally, whether they act with emotions such as anger, fear and joy, and how to make them less biased. The paper also looked at ways to develop systems that can resist jailbreaks, a practice in which users exploit vulnerabilities in AI systems to potentially use them outside desired protocols.
“Our research develops ways to read the inner thoughts of AIs, allowing us to detect when they are lying or malfunctioning in various ways,” Hendrycks said, noting current AI systems are “capable of deception and will lie or try to trick humans if given a reason to.”
“We show examples of this in our paper, and we develop tools for monitoring and controlling the internal activity of AIs to prevent this from happening,” Hendrycks said.
CAIS notes that modern AI systems have been notoriously difficult for humans to understand, which also makes it hard for users to understand AI decision-making. Those concerns have been shared by Congress, with Senate Majority Leader Chuck Schumer, D-N.Y., calling AI explainability “one of the most important and most difficult technical issues in all of AI” in remarks at the Center for Strategic & International Studies earlier this year.
Hendrycks echoed those concerns, arguing that an important aspect of AI’s continued development is ensuring humans have the tools to control the technology.
“We’re forming a sort of ‘internal surveillance’ for AI systems, ensuring they aren’t trying to trick us,” Hendrycks said. “Deception in AI is a real concern, and our research is a key step towards providing tools to prevent these risks.”