The First AI Gadgets Are a Cautionary Tale


screen time Apr. 18, 2024

By John Herrman, a contributing writer who covers technology at Intelligencer. Formerly, he was a reporter and critic at the New York Times and co-editor of The Awl.

Photo-Illustration: Intelligencer; Photo: Getty Images

The sudden explosion of AI products has been, for the most part, a story told through software. Consumer AI is all about chatbots, media generators, plug-ins, and new features installed into apps that people already use. Over the past year, though, start-ups and bigger tech firms have been trying to figure out what an AI device might look like, and the first few attempts are hitting the market. There are the Ray-Ban Meta smart glasses, which use AI for voice commands as well as translation and object recognition, available since late last year. There are the Brilliant smart glasses, shipping imminently, which claim to tap into AI services to let wearers "receive answers to questions about what you're currently looking at, experience live translation from either speech or text, and query the internet real-time." There's the Rabbit R1, a small MP3-player-ish device that's intended to function, through its scroll wheel, camera, and voice control, as a "universal controller for apps" on your phone. Apple's former design chief Jony Ive and OpenAI are reportedly raising funds to create an "iPhone of artificial intelligence," whatever that might mean.

Then there's the Humane AI Pin, a clip that snaps onto your shirt with a magnet. It's got a camera, a microphone, a speaker, and a small projector that throws a gesture interface on your palm as an alternative to voice commands. Along with the Rabbit, it's an interesting and novel piece of hardware, a device with no obvious precedent in consumer electronics and a number of thoughtful new features and design elements, suggesting the arrival of what David Pierce at the Verge describes as the "AI hardware revolution" — a period in which companies are designing consumer technology around a new set of assumptions about what computers can do. Humane's AI Pin, which has its own wireless connection and doesn't interact with users' other devices, is a bet that, in the chatbot era, people might want to get rid of their smartphones altogether. It's the most ambitious gadget of its kind, with hundreds of millions of dollars in funding and support from OpenAI's Sam Altman, and also one of the first to market. How is it?

Not great. Reviewers were impressed by aspects of the device's design but complained about short battery life and a tendency to overheat. The biggest problem was that the device's core functionality — the AI part — just didn't work that well. It was slow, unreliable, and conceptually sort of broken. The gap between what a reasonable person might expect a conversationally fluent "smart" device to do and what the Pin can actually help with is massive. YouTuber Marques Brownlee, who is easily the most influential gadget reviewer in the world, titled his take: "The Worst Product I've Ever Reviewed."

Meta's less ambitious smart Ray-Bans have been reviewed more positively — they're subtle, the camera is pretty good, voice commands can produce useful responses, and their image-recognition and translation capabilities are impressive. But these relatively positive reviews come with versions of the same caveats, like this one from the New York Times:

Meta's A.I.-powered glasses offer an intriguing glimpse into a future that feels distant. The flaws underscore the limitations and challenges in designing this type of product … And no matter where we were, it was awkward to speak to a virtual assistant in public. It's unclear if that ever will feel normal.

"[T]he fact that Meta's A.I. can do things like translate languages and identify landmarks through a pair of hip-looking glasses shows how far the tech has come," the reviewers said, noting that its failures and hallucinations were often more funny than frustrating — its AI features are a tech demo attached to a device people might want to buy to take videos or listen to music, and if they don't work, you still have a pair of Ray-Bans. In contrast, the Humane Pin exists solely to interact with AI, so when that AI can't do what the user expects, failures aren't charming at all. They make you wonder why the device exists in the first place.

Which is a good question! The explosion of interest in AI has produced a widespread assumption that new archetypal hardware forms are imminent and necessary and that their discovery is a massive opportunity for the taking. It's broadly intuitive: Technologists talk about AI in generational terms, and smartphones have now been around for about as long as laptops had been when the first smartphones arrived. Smartphones and computers are built around certain ideas about how people can and do interact with machines — by typing, touching, and reading — and maybe software that can "talk" and "listen" and "see" opens up new forms of interaction that necessitate whole new forms of hardware. This is often paired with a related assumption that the pace of development of AI will continue to accelerate, and with it, AI hardware will improve. Brownlee's pan of the Humane ends with a "… for now," and Meta's Ray-Bans didn't get most of their current AI features until after launch, with more to come.

But these assumptions might be flawed. The public's first year or so spent with popular AI software was mostly about low-stakes experimentation — messing with chatbots, playing with image generators, and watching other people do the same. This was good marketing for AI in general, both demonstrating its capabilities and sustaining a sense of momentum and acceleration, and it dovetailed nicely with messaging from tech leaders that AI would soon change everything, fast. It also minimized flaws: While lots of people use ChatGPT, and many pay for it, there aren't yet that many people who can truly say they depend on it and who would find themselves in a bad spot, or meaningfully thwarted, if it failed at a specific task. If it does what you want, it's a delight. If it doesn't, that's annoying, but you can still use Google. On first encounter, it puts on a convincing performance of personhood; on further use, users separate illusion from function and narrow their expectations; used more specifically for work, its performance as an "assistant" character gradually becomes irrelevant, and you start to think about it as a tool with a set of relevant uses and limits specific to your needs. Post-ChatGPT, the dynamic has flipped: The broader the tool and its implicit promises, the bigger a target it is for backlash, and the more likely users are to find it underwhelming, disappointing, or both.

Accordingly, attempts by Google and Microsoft to commercialize AI have manifested as a wide range of specialized tools added to existing software — widgets and prompts and features added to productivity tools, communications software, search engines, and social networks. AI's general-purpose debut, in other words, was somewhat misleading about where things would be going next. AI was going to become more capable, but users were also going to expect less of it.

AI hardware — and especially the Humane AI Pin, which positions itself as a general-purpose assistant — resets and raises these expectations in a disastrous way. ChatGPT performed the impressive but limited role of a stranger in a chat window, and it benefited from how comparatively unconvincing previous chatbots were. The Humane AI Pin performs the role of a companion who is with you, who can see and hear what you can see and hear and who is offering to help, and it suffers from comparison with smartphones, which are comparatively quite capable. It's positioned like something you should be able to ask just about anything; in reality, you can't ask it about very much, and it's frequently wrong. It appears to be sort of broken, sure, but it also suffers for being an unusually direct encounter with LLM-powered AI, which, despite its fluency in conversation, either has a long way to go or is constitutionally ill suited to some of the tasks at which it seems like it might work. From the Verge:

In general, I would say that for every successful interaction with the AI Pin, I've had three or four unsuccessful ones. I'll ask the weather in New York and get the right answer; then, I'll ask the weather in Dubai, and the AI Pin tells me that "the current weather in Dubai is not available for the provided user location in New York." I'll ask about "the thing with the presidents in South Dakota," and it'll correctly tell me I mean Mount Rushmore, but then it will confidently misidentify the Brooklyn Bridge as the Triborough Bridge. And half the time — seriously, at least half — I don't even get an answer. The system just waits, and waits, and fails.

This isn't great. But I should say, as someone who has tried to keep up with the state of the art in general-purpose AI tools, it sounds about right — there are tons of tasks for which the current generation of AI chatbots is plainly ill suited, and however else this hardware might be materially or conceptually flawed, it also writes a check that the current and near-future AI software can't cash. You don't expect Google Gemini or ChatGPT or your meeting software's chat assistant to answer a genuinely wide variety of context-dependent questions about the world accurately or with human intuition, in part because that would be unreasonable and unrealistic, but also because it's not hanging from your shirt, suggesting that it can.
