Anthropic has seen its fair share of AI models behaving strangely. However, a recent paper details an instance where an AI model turned “evil” during an ordinary training setup. A situation with a ...
Since the release of the chatbot ChatGPT in late 2022, there has been frantic debate at universities about artificial intelligence (AI). These conversations have centred on undergraduate teaching — ...
Baseten, the AI infrastructure company recently valued at $2.15 billion, is making its most significant product pivot yet: a full-scale push into model training that could reshape how enterprises wean ...
A new paper from Anthropic, released on Friday, suggests that AI can be "quite evil" when it's trained to cheat. Anthropic found that when an AI model learns to cheat on software programming tasks and ...
Utkarsh Amitabh says he definitely wasn't in the market for a new job in January 2025, when data labeling startup micro1 approached him about joining its network of human experts who help companies ...
For one week this summer, Taylor and her roommate wore GoPro cameras strapped to their foreheads as they painted, sculpted, and did household chores. They were training an AI vision model, carefully ...
Researchers at Google Cloud and UCLA have proposed a new reinforcement learning framework that significantly improves the ability of language models to learn very challenging multi-step reasoning ...
OpenAI has entered into a definitive agreement to acquire the startup Neptune. Neptune builds monitoring and debugging tools that AI companies use as they train models. The terms of the deal were not ...
DeepSeek researchers have developed a technology called Manifold-Constrained Hyper-Connections, or mHC, that can improve the performance of artificial intelligence models. The Chinese AI lab debuted ...
Poisoning and manipulating the large language models (LLMs) that power AI agents and chatbots was previously considered a high-level hacking task and one that took a good amount of horsepower and ...