Google recently introduced the world to Gemini, its latest foundation model, which it presents as a breakthrough in the generative artificial intelligence (AI) space. Described by Google as its largest and most capable AI model, and claimed to outperform other leading models, Gemini is the collaborative effort of many teams across Google, including Google Research, Google DeepMind and Google AI.
With Gemini, Google has created a flexible model that can efficiently run on both data centres and mobile devices. Depending on the scale and complexity of tasks, Gemini 1.0 is available in three different versions – Ultra, Pro and Nano. The Nano version can be run on Pixel 8 Pro. Gemini Pro has already been integrated into Bard, significantly upgrading the chatbot. Gemini Ultra, designed for highly complex tasks, will be launched early next year.
According to Google’s technical report, Gemini models are built on Transformer decoders. Beyond text, they can accept image, audio and video inputs, and can produce both text and image outputs. Bard and Gemini are not two competing models: Bard is Google’s chatbot, and Gemini is the model that now powers it. Bard could already understand challenging text, provide summaries and facts, and translate between languages. With Gemini Pro underneath, it has been taken to the next level, with more advanced reasoning, understanding and planning capabilities.
Gemini vs GPT-4
Gemini has been designed for multimodality, which means it can seamlessly interact with different kinds of information, including text, images, code and audio. In Bard, Gemini Pro is currently available for text-based prompts, and other modalities are expected to be rolled out soon. According to Google, Gemini Pro outperformed GPT-3.5 in six out of eight benchmarks.
Gemini’s state-of-the-art capabilities, in the form of Gemini Ultra, will be demonstrated through Bard Advanced early next year. Gemini Ultra is designed to take on its biggest competitor, GPT-4. According to Google, it has surpassed GPT-4 on a range of benchmarks spanning text and coding. It scored 90 per cent on the Massive Multitask Language Understanding (MMLU) benchmark, compared with 86.4 per cent for GPT-4, a gap of 3.6 percentage points. Further, in line with Google’s commitment to “advancing bold and responsible AI”, Gemini has been built with “the most comprehensive safety evaluations”.
What it means for the future of telecom
Sundar Pichai, CEO of Google and Alphabet, has emphasised the impact this model can have on all industries, creating significant opportunities for developers. Gemini has the potential to transform the telecom industry. For instance, it can help address more complex customer queries. Its ability to translate languages in real time can enable seamless communication across language barriers. It can provide more personalised support to customers by analysing their past interactions and preferences. In the customer service space, it can also help prevent employee burnout by handling service requests that are mundane and repetitive.
Telecom players can design their marketing strategy with inputs from Gemini. Further, Gemini can help companies determine pricing for various services by analysing market trends and consumer data. Its capabilities can inspire new and innovative applications and services in the telecom space. Gemini will help telecom companies achieve personalisation in their marketing strategy, services and products, enabling them to create more “human” experiences.
Debate around security and replacement of humans
The first of the two key concerns around generative AI is ethical. On the one hand, generative AI can help prevent fraud and identify suspicious activity; on the other, it may pose a threat to individual privacy and open the door to misuse. In addition, there is concern about biases creeping into the system. To address this, Google has put in place several safety checks. “We’re approaching this work boldly and responsibly,” shared Pichai in a recent blog post announcing the launch of Gemini. The company has followed its AI principles while developing Gemini and added new protections to make the new launch safer and more inclusive. It is also collaborating with external parties from the industry to define best practices and set security benchmarks.
The other major concern is Gemini’s potential to overshadow human capabilities and talent. It promises breakthroughs in several fields, but critics fear these could come at the cost of millions of jobs. Others argue that it cannot replace human interaction: while Gemini provides speed and efficiency, it lacks empathy, and human judgement cannot be easily automated. That said, Gemini has not reached its full potential yet, with more advanced versions set to launch in the coming years.
Interestingly, while researching this article, I wanted to understand what sets Gemini apart from GPT, and whether Bard and Gemini are two different products or two sides of the same coin. Instead of sifting through multiple articles, I fired my questions at Bard, which is now powered by Gemini Pro. What followed was a brainstorming session with a supportive friend!
Sugandha Khurana