ZDNET Highlights
- Establishing and evaluating governance is key to designing agents.
- Start small with agents rather than trying to replace the entire workflow.
- Clean, well-organized data makes all agentic work smoother.
Computing is on the threshold of “almost human-level agents,” according to Microsoft AI CEO Mustafa Suleyman, writing in a recent opinion column for MIT Technology Review.
But there are many obstacles in the way. Businesses are struggling to redesign their workflows and to decide what information agentic AI programs should have access to.
Also: 5 Myths of the Agentic Coding Apocalypse
As a result of those challenges, database technology giant Databricks noted in its recent AI agent status report that “only 19% of organizations have deployed AI agents, and mostly to a limited extent.”
“If you talk to many chief financial officers, they’ll tell you, ‘I have three concerns,'” Craig Wiley, head of AI for Databricks, told ZDNET.
“Can you control it, can you tell me if it’s good (meaning, does what comes out of the model actually provide value), and how much does it cost?”
To address those concerns, Wiley said, enterprises deploying agents should think ahead and follow three best practices:
- Control it (govern)
- Evaluate for accuracy
- Start small to maximize efficiency and profit
Also: MIT study shows AI agents are fast, loose, and out of control
Can you control it?
“Can you control it?” comes down to the practice of governance, which starts with controlling what data an agent can access.
An AI agent is an artificial intelligence program that can go beyond the simple turn-by-turn prompting offered by ChatGPT and similar bots. An agent can plug into corporate resources such as databases. It can execute computer code outside of the code contained in a large language model. It can invoke external programs such as email systems. And it can chain together multiple actions of different types to execute an entire workflow.
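The chaining described above can be sketched in a few lines. This is a minimal, illustrative loop with stubbed tools; every name here is hypothetical, and in a real agent the choice of which tool to call next would come from a large language model rather than being hard-coded.

```python
# Minimal sketch of an agent workflow: gather data from a corporate
# resource, then act on the result through an external system.
# All tool names and data are invented for illustration.

def query_database(question):
    # Stand-in for a corporate data source (e.g., a SQL warehouse).
    return {"open_tickets": 3} if "tickets" in question else {}

def send_email(to, body):
    # Stand-in for an external program the agent can drive.
    return f"sent to {to}"

TOOLS = {"query_database": query_database, "send_email": send_email}

def run_agent(goal):
    """Chain multiple actions of different types into one workflow."""
    steps = []
    # Step 1: gather data (an LLM planner would pick this from the goal).
    data = TOOLS["query_database"]("open tickets")
    steps.append(("query_database", data))
    # Step 2: act on the result through an external system.
    receipt = TOOLS["send_email"](
        "ops@example.com", f"{data['open_tickets']} tickets open")
    steps.append(("send_email", receipt))
    return steps

for name, result in run_agent("report open tickets"):
    print(name, "->", result)
```

The point of the sketch is the shape, not the stubs: each step's output feeds the next, which is exactly what makes governance of each step's data access matter.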
Also: How to build better AI agents for your business – without creating trust issues
The first rule in data access is to do no harm. Women’s health app Flow, a Databricks client, has 75 million users who use the app for personalized assessments and advice.
“The challenge they face is that they want to provide stronger and stronger feedback and advice and guidance and insight to their app users,” Wiley explains. “But they need to be incredibly careful because it’s very sensitive data, so the last thing they would want is for an app user to get a response that includes some other app user’s information.”
Also: I built an app for work in 5 minutes with Tasklets – and watched my no-code dreams come true
To protect against such data leakage, Wiley said, a governance system “must be able to very selectively say, ‘Hey, this device or this data, this is data that everyone can use; this is data that should only be used by that user.’”
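The rule Wiley describes can be enforced in code rather than in a prompt: filter the data by user identity before anything reaches the model, so cross-user leakage is impossible by construction. A minimal sketch, with invented records and field names:

```python
# Hedged sketch of per-user data isolation: rows are filtered by the
# requesting user's identity before being handed to the agent's model.
# The data and schema here are illustrative, not any real app's.

RECORDS = [
    {"user_id": "alice", "note": "personalized insight A"},
    {"user_id": "bob",   "note": "personalized insight B"},
    {"user_id": None,    "note": "general health guidance"},  # shared
]

def records_for(user_id):
    """Return shared rows plus this user's own rows, nothing else."""
    return [r for r in RECORDS if r["user_id"] in (None, user_id)]

def build_context(user_id):
    # Only the filtered rows are ever placed in the model's context,
    # so another user's data cannot appear in a response.
    return [r["note"] for r in records_for(user_id)]

print(build_context("alice"))
```

Because the filter runs outside the model, it is deterministic: no prompt wording can cause Bob's rows to appear in Alice's context.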
Asset manager Franklin Templeton takes the same level of care when sending portfolio reports to clients. “The last thing I want (as a fund client) is to get an email from my financial advisor that’s about someone else’s information,” Wiley said.
“Often, we see customers get really excited about a use case, they start running it, and then they run into one of these walls where they say, oh, our questions or our responses should vary by user,” he said. “And that needs to be enforced, not just suggested in the prompt; it needs to be deterministically enforced.”
Connecting the dots in the data
The next part of governance is to define the question and identify the data that should answer it.
As Wiley framed the challenge, “How do I align my question with the right data, and with the right model, to get that response?” The goal is to avoid making agentic AI programs a “transactional” experience like a chatbot, where the person is expected to keep asking one question after another.
Also: I Asked 5 Data Leaders How They Use AI to Automate – and End Integration Nightmares
Design the agent so that it can find multiple connected pieces of data, automatically delving deeper into the topic on the human user’s behalf.
Wiley cited Edmunds, the online car-buying operation, which created an internal agentic information tool called Edmunds Mind to manage car sales efficiently. It was designed to combine many more aspects of a potential purchase.
Wiley explained, “Instead of just asking which car is the best convertible for sale and how much it costs, they can ask in a more comprehensive way which car dealerships are underserved, by looking at traffic data and demographic data on top of listing data and pricing data.”
Such an agent “potentially takes a number of steps to ensure that the responses are high-quality responses,” he said, so that “I (as the user) am not responsible for giving all the information to the model.”
To enforce governance, a tool called a data catalog does two things. First, it is a “single pane of glass” that lets an IT administrator see everything the agent has access to, including structured and unstructured data, Model Context Protocol (MCP) servers for external tool calling, and the tools being deployed.
Also: Building an agentic AI strategy that pays off without the risk of business failure
Second, a catalog implements identity, which includes the identity of an agent and the information it accesses, as well as the identity of a user. The catalog tracks those identities throughout the agent’s activity to keep data partitioned, so it can be accessed by the agent and the user only to the extent their identities allow.
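The identity check described above can be sketched as a catalog lookup that gates every access on both the agent's identity and the end user's identity. The catalog structure and resource names below are invented for illustration; a real catalog product would expose this through its own policy API.

```python
# Illustrative sketch of a data catalog's identity check: every request
# is resolved against both the agent's and the user's identity before
# any data flows. Resource names and policies here are hypothetical.

CATALOG = {
    "portfolio_db": {"allowed_agents": {"report_agent"},
                     "allowed_users":  {"client_42"}},
    "fund_docs":    {"allowed_agents": {"report_agent", "chat_agent"},
                     "allowed_users":  "any"},
}

def can_access(resource, agent_id, user_id):
    """Both identities must pass; unknown resources are denied."""
    entry = CATALOG.get(resource)
    if entry is None:
        return False  # deny by default
    if agent_id not in entry["allowed_agents"]:
        return False
    users = entry["allowed_users"]
    return users == "any" or user_id in users

# The same agent gets different answers depending on who is asking:
print(can_access("portfolio_db", "report_agent", "client_42"))
print(can_access("portfolio_db", "report_agent", "client_99"))
```

The deny-by-default stance is the design choice that matters: an agent gaining a new tool or data source sees nothing until an administrator explicitly grants it.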
Being mindful of governance from the beginning “as a first-class principle in your design” makes clients “more likely to involve agents in production than people who are just kind of freewheeling these things,” he said. “It really depends on that thoughtfulness of the design.”
How do you know it’s right?
The second element is to think carefully about how to evaluate what comes out of the model.
When Flow’s app developers were working to ensure accuracy, “the people who were evaluating whether or not these agents were saying what they should be saying were actually physicians, not programmers. Software programmers write what’s called the orchestration system that manages the agents, but it’s the physicians who were saying, ‘This response here needs additional context or color or what have you,’” Wiley said.
Evaluation continues throughout the life of the program and at multiple levels, Wiley said. “Not only what was asked of the agent and what answer it gave, but at each intermediate stage of its reasoning, what was it actually doing, and was it on track to get the right answer?”
Also: This AI expert says the job apocalypse isn’t going to happen, even if you’re a coder – here’s why
If something goes wrong, bring the agent back to the evaluation phase, redeploy, and “continue that loop so we can build the automated learning types of agents that I think people are really hungry for.”
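The multi-level evaluation Wiley describes, judging each intermediate step of the agent's trace rather than only the final answer, can be sketched as follows. The trace format and checks are hypothetical; in practice the "judge" for each step may be a domain expert (as with Flow's physicians) or a second model.

```python
# Sketch of step-level evaluation: run a named check against each
# stage of an agent's trace and collect the stages that fail.
# Trace entries and check logic are invented for illustration.

trace = [
    {"step": "retrieve", "output": "found 3 relevant manuals"},
    {"step": "reason",   "output": "manual B matches the error code"},
    {"step": "answer",   "output": "replace the compressor relay"},
]

def evaluate(trace, checks):
    """Return the names of steps whose check failed."""
    failures = []
    for entry in trace:
        check = checks.get(entry["step"])
        if check and not check(entry["output"]):
            failures.append(entry["step"])
    return failures

checks = {
    "retrieve": lambda out: "manual" in out,  # did grounding happen?
    "answer":   lambda out: len(out) > 0,     # non-empty final answer
}
print(evaluate(trace, checks))  # an empty list means all checks passed
```

When `evaluate` returns a non-empty list, that is the signal to pull the agent back into the evaluation phase and redeploy, closing the loop the article describes.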
Wiley said this rigor has enabled Flow to deliver an application to the market that is distinguished by the quality of its user experience. Broadly speaking, he said, as with governance, companies that can evaluate agents’ output are six times more likely to get agents into production.
Small is beautiful
The third concern, cost, is easier because it follows from doing the first two things, governance and evaluation, correctly. “Honestly, once you get those two things done, the rest of it becomes implementation details,” Wiley said.
But cost has to be considered from the beginning.
Also: The Rise and Risk of Agent Management Platforms
“It’s something we spend a lot of time talking to customers about,” Wiley said. “Is this something we can solve today inside a reasonable cost envelope? And assuming we can solve it inside a reasonable cost envelope, is this really going to move the needle at your company?”
“There is an important consideration with implementation,” Wiley continued, “and that is to start small and build at a pace at which agents can be controlled and verified. We are seeing companies of different levels of ambition, and ambition is great. (However), as with all software projects, the smaller and more atomic the individual pieces I create, pieces I can test and verify work, the better I can build them into a larger kind of federation of capabilities that can do great work.”
As an example of focus, Wiley cited the convenience store chain 7-Eleven, whose service technicians have to go on-site to repair equipment. When they don’t have the right manual, it’s either a wasted trip or a more complicated job than it should be.
By giving agents access to a ton of documents, the company can provide a “super assistant” to technicians, Wiley said, “where they can look up every single issue that’s been filed against these machines, and every single manual and specification, and they’re no longer asking their friend, ‘Have you seen this problem before?’”
Also: True agentic AI is years away – here’s why and how we get there
Another example is Baylor University, which uses agents to review the recording of each call with a prospective student, analyzing elements such as the factors driving a student’s decision about a school, since the humans taking the calls do not have the time or energy to take extensive notes.
“They are now able to learn a lot about their organization by listening to their customers at a depth they were never able to before,” Wiley said.
Attempts to replace entire workflows with agents will likely be less successful, he said.
“If I were trying to replace my ERP or SaaS system that my organization uses, the last thing I would do is start with a prompt that said, ‘Hey, I need a new general ledger system,’” Wiley said. “I’ll go after it component by component.”
What is the payoff?
It is still too early to have concrete figures for the industry’s financial return on investment from agents, Wiley said. “We’re probably sitting in the equivalent of 2001 on the Web, where companies are investing in their Web pages but haven’t really understood the purpose of it all yet.”
There are encouraging real examples. The automation of Franklin Templeton’s investment portfolio analysis enabled the firm to identify more than $15 million in new product opportunities such as gaps in a client’s portfolio.
Also: How This Travel Company’s AI Rollout Boosted 73% Satisfaction: A 5-Step Playbook for Your Business
Companies see their KPIs (key performance indicators) moving in the right direction, such as 7-Eleven seeing a 25% increase in first-time fix rates for appliances and a 40% drop in repair time, which can lead to cost savings.
The final element is the time required to conceive, build, and deploy. From Wiley’s perspective, this goes back to “making sure your data is clean and in the right place” before the agentic AI work begins.
Organizing the data at the outset increases the project’s “momentum,” he said. “Then your software developers, data scientists, agent developers… they’ll be able to move faster. If your data is in good shape, we can do it this afternoon (meaning, building and deploying an agentic system). If your data is in bad shape, the real problem is how long it will take us to get your data in order.”
