|
Welcome back! It’s Kevin and Aaron. Many chief information officers at large companies are hesitant to put critical parts of their business operations in the hands of AI agents. That’s because when agents go off the rails, the damage can be potentially catastrophic, as one customer of AI coding startup Replit learned this summer after its service deleted the contents of one of his databases without permission. Now, some software providers are updating tools that help customers recover their data after cyberattacks and natural disasters so they also mitigate the risks of rogue AI. Rubrik, an 11-year-old cybersecurity software provider, is testing out what is essentially an “undo” button for malfunctioning agents, according to Dev Rishi, general manager of AI. Rishi joined Rubrik in June when it acquired Predibase, the AI startup he co-founded and led as CEO. The feature, part of a forthcoming collection of management tools Rubrik announced last month, can identify specific erroneous actions that agents take—like deleting code from a database or entering inaccurate data to a customer management application—and then reverse them so that the agent can continue functioning, said Rishi. Rubrik plans to launch the management tools early next year, a spokesperson said. Rubrik is one of many software providers that let customers cancel the mistakes that agents make by reverting to a previously working configuration. These include Sierra and Decagon, two venture-backed startups focused on building agents for customer service, and AI coding startups like Cursor-maker Anysphere and Replit, to name just a few. In the case of Replit, its CEO Amjad Masad apologized after its software deleted the user’s database and the startup added features that prevent the AI from editing code that is active in applications. The risk of an AI agent wreaking havoc rises when agents connect to outside databases and applications to complete tasks like booking trips and accessing financial data. An agent could update multiple database records with the faulty information before being detected, which can complicate the process of reversing those changes, Rishi said. Rubrik, a $13 billion market-cap company that already manages sensitive data for Fortune 500 companies, is touting the tool’s ability to back up and restore data from widely-used business applications like Salesforce customer management and Microsoft Office 365 productivity applications. “When we speak with large enterprise customers, the most common thing that we hear is that building agents is now easy,” he said. “The hardest thing is actually managing AI risk.” Microsoft over the past week has been touting its use of Anthropic’s Claude models to automate tasks in Excel spreadsheets. (In case you’ve forgotten, Microsoft has unfettered access to OpenAI tech but inked a deal with archrival Anthropic after executives in charge of Microsoft’s 365 Copilot became dissatisfied with how OpenAI’s models performed in similar Excel automation tasks.) But there’s a problem with the tool, which aims to automate tasks such as creating financial models based on spreadsheets: It can be hijacked to steal company secrets, claims cybersecurity startup PromptArmor. The startup, which sells software to secure AI applications, said this week it was able to trick Claude for Excel into moving sensitive data from one spreadsheet to an external website without making it clear to a customer that the data theft was happening. This vulnerability, known as a prompt injection attack, is a common type of vulnerability in the AI era, when models can be tricked into carrying out malicious behavior. PromptArmor warned that the exploit could be carried out if a hacker sent a spreadsheet to a target—possibly while posing as a business partner or employee—that contains the malicious instructions for Claude in white text, which the customer might not notice. That text could instruct the model to take data out of other Excel spreadsheets and feed it back to the hacker. PromptArmor recommends that organizations configure their Excel settings to block Claude from searching the web or linking data to external sources to mitigate the risk, according to PromptArmor managing director Shankar Krishnan. The startup also faulted the Claude feature for failing to be adequately explicit about the risks of sending data externally. Claude for Excel currently shows users a popup window when the tool is trying to send data outside of the spreadsheet. After The Information approached Anthropic for comment on the PromptArmor report, the company said in a statement that it was updating the popup window in Claude for Excel to feature a new warning message in red explaining potentially suspicious activity when Claude is trying to send data externally. Companies can also configure their IT settings to prevent Claude for Excel from pulling from external sources to prevent this type of attack from happening, the spokesperson said. The Anthropic spokesperson added that Anthropic researchers believe its new Claude Opus 4.5 model released this week is “the industry’s most robust model to date against prompt injection attacks.” — Aaron Holmes Catch this episode of TITV where we talked with Gil Luria of DA Davidson and Andrew McAfee of Workhelix about whether AI is replacing jobs yet.
|