When ChatGPT became widely available in 2022, Google reportedly declared a “code red” internally. With the AI chatbot readily providing answers to user queries, the company was left to wonder how much the chatbot would affect its search business. Since then, however, Google has made several moves to emphasize that it’s still focused on AI. Building upon the launch of its own AI-powered chatbot, Bard, earlier this year, the company is continuing to integrate machine learning technology into its products. Now, a new report suggests that its largest AI-centered product yet will be launched this fall, spearheaded by team leads from Google Brain and DeepMind.
An anonymous source involved with the product, known as Gemini, recently provided new details on Google’s plans, as reported by The Information. Rather than simply compete with products such as ChatGPT, Google intends to surpass its competition with Gemini. The source specifies that Google is focused on combining the text capabilities of its large language models (LLMs) with AI image generation to create a multifunctional product. This means that instead of only being able to generate text, like ChatGPT, Gemini would be able to create contextual images, and Google is reportedly looking into adding other features as well. For instance, you might eventually be able to use Gemini to analyze a flow chart or control software with your voice.
Given Gemini’s expansive capabilities, Google will likely turn to it to power its suite of products, including enterprise apps like Google Docs. The source adds that developers will need to pay for access to Gemini through Google Cloud, the company’s server-rental unit. More details will come when Google reveals Gemini to app developers by the end of the year, but the company will likely start using Gemini-based products before then.
The source noted that several former members of the Google Brain and DeepMind teams are currently working on Gemini. These include Paul Barham, a senior Google researcher, and Tom Hennigan of DeepMind, who is focusing on Gemini’s infrastructure. Perhaps the most notable team member, however, is Google co-founder Sergey Brin. Brin reportedly began coming into Google offices more frequently at the end of 2022. It was thought that he was focusing on hiring for Gemini after Google lost researchers to OpenAI around the same time. Now, the source claims that he is playing an instrumental role in evaluating and training the Gemini models.
Similar to other machine learning models, Gemini analyzes large volumes of text and images to identify patterns and provide answers to specific questions. Google has been using YouTube video transcripts to train Gemini, according to the source. However, the company’s lawyers are keeping a close eye on the materials used during training to ensure that the model’s training doesn’t infringe on copyrighted material.