Google Introduced Its Multimodal Technology ‘Gemini’ and Added It to Bard

IBL News | New York

Google yesterday introduced Gemini, its long-awaited answer to ChatGPT: an AI model designed and pre-trained to be natively multimodal, with reasoning capabilities.

Other multimodal offerings exist (multimodal meaning a model can analyze text, audio, video, images, and code), but Google’s CEO Sundar Pichai described Gemini as the company’s “most capable and general model yet.”

“Our first version, Gemini 1.0, is optimized for different sizes: Ultra, Pro, and Nano.”

Demis Hassabis, CEO and Co-Founder of Google DeepMind, explained that “Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.”

“This makes it especially good at explaining reasoning in complex subjects like math and physics.”

Gemini can understand, explain, and generate high-quality code in Python, Java, C++, and Go. “Its ability to work across languages and reason about complex information makes it one of the leading foundation models for coding in the world,” said Demis Hassabis.

Google said that Gemini 1.0 is now rolling out across a range of its products and platforms.

For example, the chatbot Bard was upgraded with Gemini Pro, while Gemini Ultra will be applied early next year in a new experience called Bard Advanced.

Google is also bringing Gemini to Pixel. The Pixel 8 Pro is engineered to run Gemini Nano, which powers new features like Summarize in the Recorder app and is rolling out in Smart Reply in Gboard, starting with WhatsApp.

In the coming months, Gemini will be available in more of Google’s products and services, such as Search, Ads, Chrome, and Duet AI.

Starting on December 13, developers and enterprise customers will be able to access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI.

(Google AI Studio is a free, web-based developer tool to prototype and launch apps quickly with an API key.)

Chart: Gemini Ultra’s performance on common text benchmarks compared to GPT-4 (API numbers calculated where reported numbers were missing).

Chart: Gemini Ultra’s performance on multimodal benchmarks compared to GPT-4V, with previous SOTA models listed where capabilities are not supported in GPT-4V.


