The power of
Unfortunately for Google, the controversy that derailed Gemini’s launch meant that one important fact about Gemini was not widely appreciated: that the new Gemini 1.5 model is actually rather powerful indeed.
According to the company, Gemini’s enterprise offering can accept up to a million tokens as context. Tokens are the units of input the system processes, such as words, fragments of words or other small pieces of data. This means that users can feed Gemini 700,000 words, 30,000 lines of code, or even 11 hours of audio or an hour of video, and ask the AI to interrogate it. That’s roughly eight times the 128,000-token context window of OpenAI’s GPT-4 Turbo model.
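To get a feel for what those figures imply, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope illustration only: it assumes the whole million-token window is spent on a single type of input, and the variable names are invented for the example. Real token counts vary with the content.

```python
# Figures quoted by Google for a one-million-token context window.
CONTEXT_TOKENS = 1_000_000
WORDS = 700_000
LINES_OF_CODE = 30_000
AUDIO_SECONDS = 11 * 60 * 60   # 11 hours of audio
VIDEO_SECONDS = 1 * 60 * 60    # 1 hour of video

# Implied averages, assuming the full window is used for one input type.
words_per_token = WORDS / CONTEXT_TOKENS
tokens_per_code_line = CONTEXT_TOKENS / LINES_OF_CODE
tokens_per_audio_second = CONTEXT_TOKENS / AUDIO_SECONDS
tokens_per_video_second = CONTEXT_TOKENS / VIDEO_SECONDS

print(f"{words_per_token:.2f} words per token")
print(f"{tokens_per_code_line:.0f} tokens per line of code")
print(f"{tokens_per_audio_second:.0f} tokens per second of audio")
print(f"{tokens_per_video_second:.0f} tokens per second of video")
```

On these numbers a token works out at a little under one word, and a second of video costs roughly ten times as many tokens as a second of audio.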
The company claims it is more efficient than GPT-4, too, meaning that it can compute responses much more quickly, thanks to a “Mixture-of-Experts” architecture, which routes each input to a small number of specialised sub-models rather than running the whole network, speeding up results.
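Google has not published the details of how Gemini routes its inputs, so the following is only a generic sketch of the Mixture-of-Experts idea, not Gemini’s implementation. It assumes a simple “top-k” gate: every expert gets a score, only the highest-scoring few are actually run, and their outputs are blended. The function names, the toy experts and the scores are all hypothetical.

```python
import math

def softmax(scores):
    """Turn raw gate scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_scores, experts, x, top_k=2):
    """Run only the top_k highest-weighted experts and blend their outputs."""
    weights = softmax(gate_scores)
    ranked = sorted(range(len(experts)), key=lambda i: weights[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalise so the chosen experts' weights sum to 1.
    norm = sum(weights[i] for i in chosen)
    output = sum(weights[i] / norm * experts[i](x) for i in chosen)
    return output, chosen

# Toy "experts": in a real model these would be neural sub-networks.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
# Hypothetical gate scores; a real gate computes these from the input.
gate_scores = [0.1, 2.0, 1.5, -1.0]

output, chosen = route(gate_scores, experts, 3.0)
print(f"experts used: {chosen}, blended output: {output:.3f}")
```

The efficiency gain comes from the experts that are skipped: here only two of the four toy functions are evaluated for the input.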
That means Gemini should conceivably be capable of even more advanced computation than rivals such as ChatGPT. And although it’s early days, we’re starting to see some signs of what it can do.
For example, the legendary web technologist Simon Willison wrote a blog post describing how he used Gemini to parse the contents of a video clip. To test the system, he uploaded a clip a mere seven seconds long, with the camera quickly and erratically panning across his bookshelf. He then asked Gemini to look at the video and generate a JSON file (a common format for structured data) listing all the books on the shelf.
“Honestly, I’m pretty astonished by this,” Willison wrote. “It didn’t get all of them, but it did about as good a job as I could have done given the same video.”
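For readers unfamiliar with JSON, here is a rough idea of what output of that kind could look like and how a program might read it. The reply below is invented for illustration: the titles and authors are placeholders rather than the books on Willison’s shelf, and the "title" and "author" field names are an assumption about the structure.

```python
import json

# Hypothetical model reply: placeholder entries, not Willison's actual books.
reply = """
[
  {"title": "Example Book One", "author": "A. Author"},
  {"title": "Example Book Two", "author": "B. Writer"}
]
"""

# Parse the JSON text into ordinary Python lists and dictionaries.
books = json.loads(reply)

for book in books:
    print(f'{book["title"]} by {book["author"]}')
```

Because the result is structured rather than free text, it can be loaded straight into a spreadsheet, database or script.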
However, in another example of Gemini’s rather restrictive guardrails, his second attempt, with a video scanning cookbooks, was less successful, with Gemini outright refusing his request. “It looks like the safety filter may have taken offence to the word ‘cocktail’!” wrote Willison.
Nonetheless, he remains impressed. “This really does feel like another one of those glimpses of a future that’s suddenly far closer than I expected it to be.”