The just-released AI Safety Index graded six leading AI companies on their risk assessment efforts and safety procedures… and the top of class was Anthropic, with an overall score of C. The other five companies—Google DeepMind, Meta, OpenAI, xAI, and Zhipu AI—received grades of D+ or lower, with Meta flat out failing.
“The purpose of this is not to shame anybody,” says Max Tegmark, an MIT physics professor and president of the Future of Life Institute, which put out the report. “It’s to provide incentives for companies to improve.” He hopes that company executives will view the index the way universities view the U.S. News & World Report rankings: They may not enjoy being graded, but if the grades are out there and getting attention, they’ll feel driven to do better next year.
He also hopes to help researchers working in those companies’ safety teams. If a company isn’t feeling external pressure to meet safety standards, Tegmark says, “then other people in the company will just view you as a nuisance, someone who’s trying to slow things down and throw gravel in the machinery.” But if those safety researchers are suddenly responsible for improving the company’s reputation, they’ll get resources, respect, and influence.
The Future of Life Institute is a nonprofit dedicated to helping humanity ward off truly bad outcomes from powerful technologies, and in recent years it has focused on AI. In 2023, the group put out what came to be known as “the pause letter,” which called on AI labs to pause development of advanced models for six months, and to use that time to develop safety standards. Big names like Elon Musk and Steve Wozniak signed the letter (and to date, a total of 33,707 people have signed), but the companies did not pause.
This new report may also be disregarded by the companies in question. IEEE Spectrum reached out to all the companies for comment, but only Google DeepMind responded, providing the following statement: “While the index includes some of Google DeepMind’s AI safety efforts, and reflects industry-adopted benchmarks, our comprehensive approach to AI safety extends beyond what’s captured. We remain committed to continuously evolving our safety measures alongside our technological advancements.”
How the AI Safety Index graded the companies
The Index graded the companies on how well they’re doing in six categories: risk assessment, current harms, safety frameworks, existential safety strategy, governance and accountability, and transparency and communication. It drew on publicly available information, including related research papers, policy documents, news articles, and industry reports. The reviewers also sent a questionnaire to each company, but only xAI and the Chinese company Zhipu AI (which currently has the most capable Chinese-language LLM) filled theirs out, boosting those two companies’ scores for transparency.
The grades were given by seven independent reviewers, including big names like UC Berkeley professor Stuart Russell and Turing Award winner Yoshua Bengio, who have said that superintelligent AI could pose an existential risk to humanity. The reviewers also included AI leaders who have focused on near-term harms of AI like algorithmic bias and harmful language, such as Carnegie Mellon University’s Atoosa Kasirzadeh and Sneha Revanur, the founder of Encode Justice.
And overall, the reviewers were not impressed. “The findings of the AI Safety Index project suggest that although there is a lot of activity at AI companies that goes under the heading of ‘safety,’ it is not yet very effective,” says Russell. “In particular, none of the current activity provides any kind of quantitative guarantee of safety; nor does it seem possible to provide such guarantees given the current approach to AI via giant black boxes trained on unimaginably vast quantities of data. And it’s only going to get harder as these AI systems get bigger. In other words, it’s possible that the current technology direction can never support the necessary safety guarantees, in which case it’s really a dead end.”
Anthropic got the best scores overall and the best specific score, earning the only B- for its work on current harms. The report notes that Anthropic’s models have received the highest scores on leading safety benchmarks. The company also has a “responsible scaling policy” mandating that the company will assess its models for their potential to cause catastrophic harms, and will not deploy models that the company judges too risky.
All six companies scored particularly badly on their existential safety strategies. The reviewers noted that all of the companies have declared their intention to build artificial general intelligence (AGI), but only Anthropic, Google DeepMind, and OpenAI have articulated any kind of strategy for ensuring that AGI remains aligned with human values. “The truth is, nobody knows how to control a new species that’s much smarter than us,” Tegmark says. “The review panel felt that even the [companies] that had some sort of early-stage strategies, they were not adequate.”
While the report does not issue any recommendations for either AI companies or policymakers, Tegmark feels strongly that its findings show a clear need for regulatory oversight—a government entity equivalent to the U.S. Food and Drug Administration that would approve AI products before they reach the market.
“I feel that the leaders of these companies are trapped in a race to the bottom that none of them can get out of, no matter how kind-hearted they are,” Tegmark says. Today, he says, companies are unwilling to slow down for safety tests because they don’t want competitors to beat them to market. “Whereas if there are safety standards, then instead there’s commercial pressure to see who can meet the safety standards first, because then they get to sell first and make money first.”