OpenAI's Big Update: ChatGPT Can Now 'Speak,' Listen and Process Images

28 September 2023

OpenAI announced on Monday a significant update to ChatGPT that lets it understand spoken language, respond in a synthetic voice, and process images. It is the most substantial enhancement to the chatbot since the introduction of GPT-4.

Users can now hold voice conversations with ChatGPT through its mobile app and choose from five synthetic voices for the bot’s responses. They can also share images with ChatGPT and highlight specific areas of interest or request analysis, such as identifying cloud types.

OpenAI plans to roll out these changes to paying users within the next two weeks. The voice functionality will be available only through the iOS and Android apps, while image processing will be available on all platforms.

The significant feature expansion coincides with the escalating competition in the artificial intelligence race involving leading chatbot companies like OpenAI, Microsoft, Google, and Anthropic.

These companies are racing not only to launch new chatbot applications but also to roll out new features, a push that was especially visible over the summer. Google, for instance, has unveiled a series of updates for its Bard chatbot, while Microsoft has added visual search to Bing.


OpenAI Investments and Worries About Fake Voices

Earlier this year, Microsoft invested an additional $10 billion in OpenAI, the largest AI investment of the year according to PitchBook. In April, OpenAI reportedly closed a $300 million share sale that valued the company at roughly $27 billion to $29 billion, with backing from prominent firms such as Sequoia Capital and Andreessen Horowitz.

Experts have raised concerns about AI-generated synthetic voices, which in this case could give users a more lifelike experience but could also enable more convincing deepfakes.

Threat actors and security researchers alike have already begun exploring how deepfakes could be used to breach cybersecurity systems.

In its Monday announcement, OpenAI addressed these concerns by confirming that the synthetic voices were “developed in collaboration with voice actors with whom we have direct working relationships,” rather than being sourced from unknown individuals.

The release offered few details on how OpenAI intends to use consumer voice inputs or how it would safeguard that data if it were used. OpenAI’s terms of service state that consumers own their inputs “to the extent allowed by relevant laws.”

OpenAI pointed to its guidelines on voice interactions, which state that OpenAI does not retain audio clips and that the clips themselves are not used to improve its models.

However, the guidelines also specify that transcriptions are treated as inputs and may be used to improve OpenAI’s large language models.

Conclusion

The latest update to OpenAI’s ChatGPT marks a major advance, adding speech recognition, spoken responses, and image processing. Microsoft’s large investment raises the stakes in the chatbot competition.

But concerns remain about deepfakes and fraud enabled by AI-generated voices. OpenAI has addressed some of these concerns but leaves open the question of how voice data will be used.

James Robert