Microsoft-backed startup OpenAI on Monday announced new features for its generative AI-based chatbot ChatGPT. The chatbot is getting voice and image capabilities, which will allow users to hear ChatGPT's answers in any of five different voices and to ask questions about images they submit.
In a post on X announcing the new features of the viral chatbot, OpenAI said, “ChatGPT can now see, hear, and speak. Rolling out over next two weeks, Plus users will be able to have voice conversations with ChatGPT (iOS & Android) and to include images in conversations (all platforms).”
“They (voice and image capabilities) offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about,” the Sam Altman-led company said in a subsequent blog post.
ChatGPT will now be able to answer users' questions in five different voices, which can be selected according to user preference. OpenAI says it enlisted professional voice actors to create each voice, and that it uses its open-source Whisper speech recognition system to transcribe users' spoken words into text.
ChatGPT's new voice capabilities are powered by a new text-to-speech model that OpenAI claims is capable of generating human-like audio from just text and a few seconds of speech samples, opening the door to many "creative and accessibility-focused applications".
OpenAI is also working with other companies to harness this new technology. Spotify, for instance, has partnered with the AI startup to translate podcasts into additional languages in the podcaster's own voice.
OpenAI is using the multimodal abilities of GPT-3.5 and GPT-4 to power ChatGPT's image understanding. Users can now upload one or more images and ask ChatGPT to, for example, explore the contents of their fridge to plan a meal or analyze a complex graph for work-related data.
The new features will be available to Plus and Enterprise users over the next two weeks, followed by developers ‘soon after’.
To activate the voice feature, users will have to go to the ‘Settings’ menu in the ChatGPT mobile app and tap ‘New Features’. They will then have to opt into voice conversations and tap the headphone button in the top-right corner of the home screen to select their preferred voice.
The voice feature will only be available to ChatGPT app users on an opt-in beta basis. The image feature, however, will be turned on by default on all platforms.