Google Releases Gemini 1.5 Pro AI Model in Public Preview, Adds New Features

Google released Gemini 1.5 Pro, its artificial intelligence (AI) model with the largest context window, in public preview on Tuesday. The tech giant first announced the AI model in February, and for the next two months it was available in Google AI Studio for developers to try out. Now, it is open to general users as well. Enthusiasts can also create or access API keys to build with the large language model (LLM). In opening it to the public, the tech giant has also added multiple new capabilities to Gemini 1.5 Pro.

The AI model was introduced in public preview during the company’s annual Google Cloud Next event. The standard version of Gemini 1.5 Pro comes with a 128,000-token context window. In comparison, Gemini 1.0 had a context window of 32,000 tokens. There is also a special variant of the model with a massive context window of one million tokens. Tokens are the basic units of data that the model processes, and can be understood as syllables, words, or subsections of words. The context window is the amount of information, measured in tokens, that an AI model can take into account at once when generating a response.

To put it in perspective, a context window of one million tokens corresponds to about 700,000 words, roughly ten average-sized books of 300 pages each. This breadth of information enables the AI to understand the wider context and respond with an answer that is more relevant to the user. This capability is especially useful when a user wants the AI to analyse a large file to find a particular piece of information.
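The comparison above can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: the ratio of 7 words per 10 tokens is the one implied by the figures in this article, not a property of any particular tokenizer, and the function names are hypothetical.

```python
# Rough token/word arithmetic based on the figures quoted in the article.
# Assumption: 1,000,000 tokens ~ 700,000 words, i.e. 7 words per 10 tokens.
# Real tokenizers vary by language and content, so treat this as an estimate.

WORDS_PER_10_TOKENS = 7  # assumed ratio, taken from the article's comparison


def tokens_to_words(tokens: int) -> int:
    """Estimate how many words fit into a given number of tokens."""
    return tokens * WORDS_PER_10_TOKENS // 10


def words_per_page(total_words: int, books: int = 10, pages_per_book: int = 300) -> float:
    """Average words per page if the words are spread over the given books."""
    return total_words / (books * pages_per_book)


def fits_in_context(word_count: int, context_tokens: int) -> bool:
    """Check whether a document of word_count words fits in the context window."""
    # Integer form of: word_count / 0.7 <= context_tokens
    return word_count * 10 <= context_tokens * WORDS_PER_10_TOKENS


if __name__ == "__main__":
    for window in (32_000, 128_000, 1_000_000):
        print(f"{window:>9,} tokens ~ {tokens_to_words(window):,} words")
    print(f"Words per page: {words_per_page(700_000):.0f}")
    print("700,000-word file fits in 128,000 tokens:", fits_in_context(700_000, 128_000))
    print("700,000-word file fits in 1,000,000 tokens:", fits_in_context(700_000, 1_000_000))
```

Under this assumed ratio, the ten-book comparison works out to a little over 230 words per page, and a file of that size fits only in the one-million-token variant, not in the standard 128,000-token window.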

X (formerly known as Twitter) user Rowan Cheung was able to get early access to the Gemini AI model and posted about his findings from using it. In a post, he said, “I uploaded the entire NBA dunk contest from last night and asked which dunk had the highest score. Gemini 1.5 was incredibly able to find the specific perfect 50 dunk and details from just its long context video understanding!”

The AI model comes with several new features as well. Google has added native audio (speech) support, so Gemini 1.5 Pro can understand spoken prompts. Alongside this, a File API for handling files, system instructions, and a JSON mode have been added to give developers better control over the model. The model is also multimodal and can analyse images and videos. It is currently available in more than 180 countries, including India.

