OpenAI Group PBC is set to launch a new AI model focused on audio generation by the end of March, aiming to produce more natural-sounding speech and improve real-time interaction. The upcoming model may adopt a new architecture, diverging from the transformer-based design used in the current GPT-realtime model. Earlier OpenAI audio models such as Whisper process audio as spectrograms and are offered in several editions of varying quality, which suggests the new model could also be released in multiple versions.
Led by Kundan Kumar, the initiative combines engineering and research efforts to strengthen OpenAI’s position in the consumer electronics market, including plans for an audio-first personal device within a year. The growing market for AI-generated music gives OpenAI a further opportunity to expand its consumer base. The company is also seeking to run audio models on-device, mirroring Google’s approach, which would allow cost-efficient local processing. Overall, the model reflects OpenAI’s ambition to innovate in both AI and consumer technology.