Search results

  1. May 21, 2024 · Anthropic wants to make models safe in a broad sense, including everything from mitigating bias to ensuring an AI is acting honestly to preventing misuse - including in scenarios of catastrophic risk. It’s therefore particularly interesting that, in addition to the aforementioned scam emails feature, we found features corresponding to:

  2. May 22, 2024 · The company's researchers have identified the combinations of artificial neurons that signify traits as disparate as burritos, semicolons in programming code, and – to a great...

  3. May 21, 2024 · Researchers at the A.I. company Anthropic claim to have found clues about the inner workings of large language models, possibly helping to prevent their misuse and to curb their potential threats.

  4. May 23, 2024 · Anthropic's Generative AI Research Reveals More About How LLMs Affect Security and Bias. Published May 23, 2024. Written By Megan Crouse. Anthropic opened a window into the ‘black box’...

  5. 3 days ago · Now though, a team from Anthropic has made a significant advance in our ability to parse what’s going on inside these models. They’ve shown they can not only link particular patterns of activity in a large language model to both concrete and abstract concepts, but they can also control the behavior of the model by dialing this activity up or down. (A steering sketch follows these results.)

  6. May 21, 2024 · Since then, scaling sparse autoencoders has been a major priority of the Anthropic interpretability team, and we're pleased to report extracting high-quality features from Claude 3 Sonnet, Anthropic's medium-sized production model. We find a diversity of highly abstract features. They both respond to and behaviorally cause abstract behaviors. (An autoencoder sketch follows these results.)

  7. 1 day ago · Anthropic is the smallest, youngest, and least well-financed of all the “frontier” AI labs. It’s also nurturing a reputation as the safest.
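
The sparse autoencoders mentioned in result 6 decompose a model's internal activations into a much larger set of features, only a few of which are active on any given input. Below is a minimal PyTorch sketch of the idea; the layer sizes, L1 coefficient, and random stand-in activations are illustrative assumptions, not Anthropic's actual configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model, d_features):
            super().__init__()
            # Encoder maps activations into a larger, sparsely active feature space.
            self.encoder = nn.Linear(d_model, d_features)
            # Decoder reconstructs the original activations from those features.
            self.decoder = nn.Linear(d_features, d_model)

        def forward(self, x):
            features = F.relu(self.encoder(x))  # non-negative feature activations
            return self.decoder(features), features

    def sae_loss(x, recon, features, l1_coeff=1e-3):
        # Reconstruction error keeps the features faithful to the activations;
        # the L1 penalty pushes most features to zero on any single input.
        return F.mse_loss(recon, x) + l1_coeff * features.abs().sum(dim=-1).mean()

    # Stand-in for activations captured from one layer of a language model.
    sae = SparseAutoencoder(d_model=512, d_features=8192)
    x = torch.randn(64, 512)
    recon, feats = sae(x)
    loss = sae_loss(x, recon, feats)
    loss.backward()

After training, each decoder column is a candidate "feature direction" whose activation can be inspected or, as in result 5, manipulated.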
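
Result 5's "dialing this activity up or down" can be read as adding a scaled copy of a feature's direction to the model's activations mid-forward-pass. A minimal sketch under that assumption; the dimensions, direction, and strength here are hypothetical stand-ins, not values from Anthropic's experiments.

    import torch

    def steer(activations, feature_direction, strength):
        # Positive strength amplifies the concept the feature represents;
        # negative strength suppresses it.
        direction = feature_direction / feature_direction.norm()
        return activations + strength * direction

    acts = torch.randn(1, 10, 512)   # stand-in (batch, seq, d_model) activations
    feature = torch.randn(512)       # hypothetical feature direction, e.g. a decoder column
    steered = steer(acts, feature, strength=8.0)

In practice the steered activations would be written back into the layer (for example via a forward hook) so that all downstream computation sees the modified values.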