Sitting in the lobby of the auto body shop waiting for a repair estimate, I realized I’d forgotten my earbuds. Normally, that’s not a major issue, but I was talking to my phone. And I wasn’t talking to another person. I was talking to ChatGPT. It felt as embarrassing as asking Siri a question from across the room or joining a Zoom meeting sans headphones in an open office.
I’m testing the advanced voice mode that comes with GPT-5, OpenAI’s latest version of the generative AI model behind ChatGPT. GPT-5 dropped this summer after many months of speculation and delays, promising AI users a faster and smarter chatbot experience. The jury’s still out on whether or not OpenAI has delivered. (Disclosure: Ziff Davis, CNET’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
GPT-5 includes improvements to its advanced voice mode, which is essentially a way for you to literally talk to ChatGPT and have it respond in the voice of your choosing. Free users like me now have access to the advanced version (free users previously only had access to basic voice mode), and paying subscribers will receive higher usage limits. Another new GPT-5 feature allows you to choose what kind of personality you want your AI to mimic, including sassy, nerdy and robotic avatars.
To use voice mode, open ChatGPT, tap the audio button next to the prompt window where you would enter an instruction, and begin chatting. You can change which voice ChatGPT uses by tapping the settings icon in the upper-right corner of the mobile app (two bars stacked on top of each other with circles on them).
More human AI voices? How my experience went
I decided to try to speak to ChatGPT like I would a friend, like a more enthusiastic version of myself. The AI laughed when I started the call with a spirited “Heyyyy girlfriend!” which felt both funny and condescending.
ChatGPT’s voice flowed very naturally in a familiar cadence, similar to the way I would talk to a particularly friendly customer service agent. That made sense as the chatbot itself told me that the upgraded advanced voice mode helped make it sound more human.
The voice I used, Ember, would often take pauses for breaths, like a human would during a longer sentence. I thought that was kind of weird, since while ChatGPT was doing its best impression of a human, we both knew it didn’t actually need to pause to catch its breath.
In my conversation with ChatGPT, it was more empathetic than I expected. It asked me how I was doing, and I said not well and told it about my car accident. In our five-minute chat, it would bookend many of its responses with empathetic statements, like saying it was sorry I was having a bad week and agreeing that dealing with insurance can be a headache. (Has ChatGPT ever had to call an insurance agent or even experienced a headache? I think not.)
While a sympathetic robot ear might not seem like a big deal, it can be a sign of a bigger problem. Sycophantic AI, the term used to describe when AI is overly affectionate or emotional, can be frustrating for users just looking for information. It can also be dangerous for people who use AI as therapists or mental health counselors, something OpenAI CEO Sam Altman has warned ChatGPT users against. Previous versions of ChatGPT have been pulled and re-released after issues with sycophantic tendencies.
I also asked ChatGPT more factual questions, like the average cost of car repair labor in North Carolina and where I could go to get a second repair estimate. It responded more like a friend would than a chatbot, which may not be the most helpful. For example, when I typed the same request into ChatGPT on my laptop, it pulled up a map with a list of shops, along with more details like pricing and store hours. But when I was chatting with ChatGPT voice mode, it brought up fewer options and described them based on what I assume is the shops’ marketing language and customer reviews, using phrases like “They’ve been around for quite a while” and saying that one shop is “known for quality service.” You also don’t get any links or sources with voice mode, which I don’t love.
ChatGPT automatically transcribes voice chats, so you can see the difference in the level of detail given in regular text prompts (left) and voice chats (right).
Using ChatGPT voice as a sounding board
One of the things voice mode is well-suited for is being a brainstorming partner, a literal wall to bounce ideas off of. I asked it to help me plan a skydiving-themed birthday party, and it both helped me develop new ideas and refine the ones I already had.
I interrupted ChatGPT while it was speaking a couple of times, and it was able to pivot quickly. I also tend to talk quickly, and the chatbot kept up and didn’t miss any of my thoughts. I let myself ramble and steer the conversation off track, and ChatGPT didn’t blink a virtual eye. Most importantly, when I asked it a question about an earlier topic, it could pick up where we left off. Improvements to ChatGPT’s memory are to thank for that.
Should you use ChatGPT voice mode?
Overall, I think voice mode is nice as another way to use ChatGPT, but it’s only situationally useful. If you need in-depth research and more detailed information, voice mode isn’t going to be right for you. But if you just want to talk to someone (rather, something) or work through a problem out loud, voice mode is a nice alternative to having to articulate your thoughts and type them out.
I still believe that we haven’t normalized talking to AIs in public spaces, especially without headphones. But it can be a useful alternative for people who think better aloud. For more, check out how AI is changing search engines and the best AI image generators.