Is Google’s Gemini really smarter than OpenAI’s GPT-4? Community sleuths find out

December 7, 2023

Google launched its latest artificial intelligence (AI) model Gemini on Dec. 6, announcing it as the most advanced AI model currently available on the market, surpassing OpenAI’s GPT-4.

Gemini is multimodal, which means it was built to understand and combine different types of information. It comes in three versions (Ultra, Pro, Nano) to serve different use cases, and one area in which it appears to beat GPT-4 is its ability to perform advanced math and specialized coding.

On its debut, Google released multiple benchmark tests that compared Gemini with GPT-4. The Gemini Ultra version achieved “state-of-the-art performance” in 30 out of 32 academic benchmarks that were used in large language model (LLM) development.

*Gemini vs. ChatGPT performance comparison. Source: Google*

However, critics across the internet have been questioning the methods used in the benchmark tests that suggest Gemini’s superiority, along with Google’s marketing of the product.

“Misleading” Gemini promotion

One user on the social media platform X, who works in machine learning development, questioned whether Gemini’s claimed superiority over GPT-4 was real.

He pointed out that Google may be hyping up Gemini or “cherry-picking” examples of its superiority. Still, he concluded, “my bet is that Gemini is very competitive and will give GPT-4 a run for its money” and that competition in the space is good.

However, shortly afterward, he made a second post saying Google should be “embarrassed” by the “misleading” promotional video it created for the release of Gemini.

Google, this is embarrassing.

You published an impressive video showing Gemini answering your questions. It looked awesome. It looked real-time.

But it was a lie. None of that happened as recorded and presented to the public.

Instead, you cherry-picked frames and edited a… pic.twitter.com/GjyqWPyaIu

— Santiago (@svpino) December 6, 2023

In response to his tweet, other X users spoke out about feeling deceived by Google’s portrayal of Gemini. One user said claims that Gemini would end the era of GPT-4 are “canceled.”

Another user, a computer scientist, agreed, and called Google’s portrayal of Gemini’s superiority “disingenuous.”

Botching benchmarks

Users pointed out that Google had included benchmarks that used an outdated version of GPT-4 rather than the current one, and argued that the comparisons were therefore moot.

Another area of concern for social media sleuths was the parameters Google used to compare Gemini with GPT-4. The prompts given to the two models were not identical, which could have major implications for the outcomes.

this is pretty weird

usually when you benchmark… you compare the results of the same exact test…

Took someone else mentioning this for me to notice

— bryankyritz.eth (@kyritzb) December 6, 2023

The user also pointed out that the results were achieved using tests carried out on a model that “isn’t publicly available” at the moment. Another user noted that the scores could differ if the advanced version of Gemini were tested against the advanced version of GPT-4, known as “turbo.”
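The like-for-like point raised by these users can be illustrated with a small sketch. Everything in it is hypothetical: the toy "models," the two prompt templates and the four questions are invented for illustration and do not represent Gemini, GPT-4 or the setup Google used. It only shows why scoring one model with a richer prompt and the other with a plain prompt can change which one appears to come out ahead.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for a model: takes a prompt, returns an answer string.
Model = Callable[[str], str]

# Toy question/answer pairs (illustrative only).
ITEMS: List[Tuple[str, str]] = [("2 + 2", "4"), ("3 * 3", "9"), ("10 - 7", "3"), ("8 / 2", "4")]
ANSWERS: Dict[str, str] = dict(ITEMS)

# Two assumed prompting styles: a plain question and a "reason first" variant.
PLAIN = "Q: {question}"
STEP_BY_STEP = "Think step by step. Q: {question}"


def make_toy_model(plain_correct: Set[str]) -> Model:
    """Build a toy model that is always right with the step-by-step prompt,
    but with the plain prompt is right only on `plain_correct` questions."""
    def model(prompt: str) -> str:
        question = prompt.rsplit("Q:", 1)[-1].strip()
        if "step by step" in prompt or question in plain_correct:
            return ANSWERS[question]
        return "?"
    return model


def accuracy(model: Model, template: str, items: List[Tuple[str, str]]) -> float:
    """Fraction of items answered correctly when every question uses `template`."""
    correct = sum(model(template.format(question=q)).strip() == a for q, a in items)
    return correct / len(items)


def compare(models: Dict[str, Model], template: str,
            items: List[Tuple[str, str]]) -> Dict[str, float]:
    """Score every model on the same items with the same prompt template."""
    return {name: accuracy(m, template, items) for name, m in models.items()}


model_a = make_toy_model({"2 + 2", "3 * 3"})
model_b = make_toy_model({"2 + 2", "3 * 3", "10 - 7"})
models = {"a": model_a, "b": model_b}

if __name__ == "__main__":
    # Mismatched setup: model A gets the richer prompt, model B the plain one.
    print("mismatched:", accuracy(model_a, STEP_BY_STEP, ITEMS), accuracy(model_b, PLAIN, ITEMS))
    # Like-for-like setups: both models see the same template.
    print("plain:", compare(models, PLAIN, ITEMS))
    print("step-by-step:", compare(models, STEP_BY_STEP, ITEMS))
```

In this toy setup, the mismatched comparison makes model A look stronger, while the two like-for-like comparisons show model B ahead with the plain prompt and the two tied with the step-by-step prompt.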

To the test

Other social media users have decided to dismiss the benchmarks published by Google, and instead have been describing their own experiences with Gemini in comparison to GPT-4.

Anne Moss, who works in web publishing services and claims to be a regular user of AI, particularly GPT-4, said she used Gemini through Google’s Bard tool and felt “underwhelmed by the experience.”

She concluded that she would stick to GPT-4 for now, explaining that the differences she noted included Gemini/Bard refusing to answer political questions and “lying” about knowing personal information.

Well, well, well… Google finally launched Gemini. You can test it using the Bard interface, so they say. Bard says so too, but I don’t trust Bard too much.

Have been playing with it and so far, I’m underwhelmed. Sticking to ChatGPT Plus for now.

Here’s why –

1. Bard is… pic.twitter.com/4uyQt2fy7G

— Anne Moss (@AnneMossYeys) December 6, 2023

Another user working in app development posted screenshots in which he asked both models, via the same prompt, to generate code based on a photo. He pointed out Gemini/Bard’s underwhelming response in comparison to GPT-4.

Gemini “Pro” vs ChatGPT (GPT-4) @Google ??? pic.twitter.com/P0lyXZGhqC

— Terry Tan (@terrytjw) December 7, 2023

According to Google, it plans to roll out Gemini more broadly to the public in early 2024. The model will also be integrated with Google’s suite of apps and services.
