Meta just launched its AI assistant on Instagram, WhatsApp, Messenger and Facebook as if firing a missile into a corporate battlefield. And the media is playing along, recounting a “battle with ChatGPT.” But this chatbot war won’t make the history books like an actual robot war. When Meta, OpenAI, Microsoft and others make ostensibly competitive moves, they aren’t fighting for control over a source of great power. Rather, it’s largely theatrical, a battle for attention and stature.

After all, the field of generative AI is nothing if not a cosmetic play. Instead of demonstrating concrete, proven value, it mostly promotes itself with grand visions of limitless potential.

But even though chatbots like Meta’s AI assistant and ChatGPT are easier to use than other forms of AI, they’re harder to use well—in a way that generates measurable value for an enterprise. Other kinds of AI may not enjoy the same degree of user-friendliness, but some, such as predictive AI, often deliver higher returns than genAI.

Given today’s abundance of sizzle without sufficient steak, consternation is growing. The Washington Post says, “The AI hype bubble is deflating,” journalists “struggle to find examples of transformative change,” investors perceive a backlash against genAI’s overpromises and plenty of others agree.

Even studies intended to demonstrate genAI’s value sometimes discover that it fails. Stanford researchers, bullish on the technology, studied the productivity of teams using genAI to work on business problems such as how to scale up B2B sales, and compared them to teams working without an AI assistant. Much to their surprise, the researchers found that using genAI led to ideas that were, on average, more ordinary—in part because most of the data it’s trained on naturally reflects common, “inside the box” thinking, and in part because humans using genAI may overly trust it and put in less cognitive effort themselves.

But these researchers remain bullish, suggesting new guidance for how to best use this new technology.

Benchmark GenAI To Establish Its Concrete Value

With genAI, we usually don’t know what its returns might be, because genAI projects simply neglect to benchmark. But if you aren’t measuring value, you aren’t pursuing value. Only by assessing the business gains—or lack thereof—do you receive the feedback needed to navigate the project successfully. Companies can measure gains as efficiency improvements, such as time savings or increased productivity.
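The efficiency metrics above reduce to simple before-and-after comparisons. As a minimal sketch—the function names and example figures are illustrative, not drawn from any company’s actual methodology—a benchmark might compare a baseline measurement against a post-genAI one:

```python
def time_savings_pct(baseline_hours: float, with_ai_hours: float) -> float:
    """Percent reduction in time to complete a task after adopting genAI."""
    return (baseline_hours - with_ai_hours) / baseline_hours * 100


def productivity_uplift_pct(baseline_rate: float, with_ai_rate: float) -> float:
    """Percent increase in output per unit time, e.g. support issues resolved per hour."""
    return (with_ai_rate - baseline_rate) / baseline_rate * 100


# Hypothetical figures: a campaign that took 6 weeks now takes 4,
# and a support team resolves 5.0 instead of 4.4 issues per hour.
print(round(time_savings_pct(6, 4), 1))           # time saved, as a percentage
print(round(productivity_uplift_pct(4.4, 5.0), 1))  # throughput gain, as a percentage
```

The point is not the arithmetic but the discipline: without a baseline number recorded before the genAI rollout, neither percentage can be computed at all.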

One factor delaying benchmarking is that, if you do stress test your project, you might be seen as a party pooper. AI hype can be intoxicating, and nothing kills the buzz of fantasizing about immense future potential more than a sober read on today’s value. In gauging its performance, you might discover that genAI is only good rather than great, a far cry from the dream of game-changing AI.

But a proven win is better than a pipe dream. Hold yourself to the rare standard set by, for example, Ally Financial, the largest all-digital bank in the U.S. It reported that “marketers were able to reduce the time needed to produce creative campaigns and content by up to 2-3 weeks, resulting in an average time savings of 34%.”

Or follow the lead of one uncommon Fortune 500 software firm studied by MIT Sloan. By using a conversational assistant, the company’s customer support team increased the number of issues resolved per hour by 14%—and the increase was higher, 34%, for novice and low-skilled workers.

This benchmarking is unusual. Other companies like Airbnb, Intuit and Motorola report that they’re just beginning to measure genAI’s value, but have yet to report what they’ve found.

Such successes come in part by applying genAI prudently. For example, it can often generate useful first drafts for rote tasks, such as certain customer support activities. In contrast, it frequently results in generic content that’s too obvious or cliché to help with higher-order writing tasks such as in journalism, where it may instead be best used for copyediting or preliminary research (so long as it’s manually fact-checked). In general, any attempt to utilize genAI involves an ad hoc, experimental process. We live in the Wild West of genAI, which is untamed and unpredictable. Its value is not guaranteed.

And yet, even if your genAI initiative proves valuable, you’re likely to find that it still doesn’t deliver the revolutionary wins that industry leaders would have us believe possible. The current hype wave continues a longstanding tradition of AI theatrics. AI has always hung its hat on the tantalizing yet relentlessly nebulous word “intelligence.” AI overall, and genAI more specifically, leverages a baked-in excitement seeded by decades of science fiction and breathless AI speculation.

And genAI offers an instant appeal that’s arguably broader than that of any other technology: Anyone can intuitively interact with it in English or other languages (though critics say it should support more). But the often-heard story that the technology is well on its way to general human-level capabilities is unfounded.

The best way to defend against genAI disillusionment is to benchmark its business performance. Rather than indulging in the intoxicating narrative of machine “intelligence,” focus on credible use cases that deliver concrete value—and measure that value to keep your projects on course.
