Peter Gostev’s Post

Head of AI @ Moonpig

How have companies' LLMs compared over the last 12 months? I selected the top-ranked LLM from each company at any one time and created a timelapse of LMSYS Elo scores over that period.

Some highlights:
- OpenAI stayed on top for almost the whole year (briefly overtaken by Anthropic)
- Open source is kept in the game by Meta; other vendors mostly dropped off during 2024
- Google was not very convincing during the whole year, but Gemini 1.5 may have cemented it in the top 3 (for now)

As usual, data credit goes to LMSYS Chatbot Arena: https://chat.lmsys.org/

The visualisation is here: https://lnkd.in/etb9gRAg
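For anyone wanting to reproduce the selection step, here is a minimal sketch of picking each company's top-ranked model per leaderboard snapshot. It is not the code behind the visualisation: the row layout, the function name `top_model_per_company`, and the sample companies, models, and scores are all hypothetical placeholders, not real leaderboard data.

```python
from collections import defaultdict

# Hypothetical rows: (snapshot_date, company, model, elo).
# The names and scores are made up for illustration only.
SNAPSHOTS = [
    ("2024-01", "CompanyA", "a-small", 1100),
    ("2024-01", "CompanyA", "a-large", 1150),
    ("2024-01", "CompanyB", "b-base", 1120),
    ("2024-02", "CompanyA", "a-large", 1155),
    ("2024-02", "CompanyB", "b-base", 1118),
    ("2024-02", "CompanyB", "b-pro", 1160),
]


def top_model_per_company(rows):
    """For each snapshot date, keep each company's highest-rated model."""
    best = defaultdict(dict)  # date -> company -> (model, elo)
    for date, company, model, elo in rows:
        current = best[date].get(company)
        if current is None or elo > current[1]:
            best[date][company] = (model, elo)
    # Return plain dicts, ordered by snapshot date, one frame per date.
    return {date: dict(companies) for date, companies in sorted(best.items())}


if __name__ == "__main__":
    for date, companies in top_model_per_company(SNAPSHOTS).items():
        print(date, companies)
```

Each date in the result can then be drawn as one frame of the timelapse, with one bar or line per company.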

Nelson W. Daniel, PhD

Champion of Explainable and Actionable AI | Open to Collaboration in AI/ML Transformation, Decision Intelligence & Knowledge Discovery in Healthcare & Clinical Science | PhD in AI/ML, Fuzzy/Soft Computing & Data Science

10mo

Peter Gostev Great visual of the LMSys Elo leaderboard! Something I haven't seen is a discussion of why there is an Elo score gap between the proprietary and "open weight" models. Given the cost of compute, is it purely economic, since GPT-4-class models are estimated to have 1000B parameters while open models top out at (maybe) 400B? Or is it training corpus size and diversity, again favoring "deeper pocket" LLM companies?

Gireesh Ramji

DAAS - Decisions are a Science!

10mo

Thanks for sharing. Is there a place where I can read about the methodology behind the benchmark?

Tyler Suard

Senior AI Researcher & Developer. Author of "Enterprise RAG: Scaling Retrieval Augmented Generation", available on Manning.com on March 27. Ex-Apple, Ex-Meta. Stanford affiliate. Interested in longevity, AI +Bio.

10mo

This is great! How do you make these amazing visualizations?

Jonathan Chew

LinkedIn AI Top Voice (2023-24) | AI & Revenue Strategist @ Brandrev | AI Insider Newsletter | Executive MBA | MSc AI & ML Mgmt | PostGrad Data Science & Solutions Architecture | PCert Marketing Science & IP Strategy

10mo

Great share buddy

Mark Hinkle

I publish a network of AI newsletters for business under The Artificially Intelligent Enterprise Network and I run a B2B AI Consultancy Peripety Labs. I love dogs and Brazilian Jiu Jitsu.

10mo

I love this chart, Peter. I am hoping that the real OSS-licensed models come back up in the rankings.

Scott Penberthy

CTO | AI for Cancer | Applied AI | Board Member | Customer Engineering | Developer Relations | SWE | Entrepreneur

10mo

Slick! What tool did you use to create the animation?

Peter Seeberg

Industrial AI Consultant, Moderator and Podcaster

10mo

Thanks Peter, this confirms my gut feel 😉 Great to see the Chatbot Arena, which you promoted at an early stage, gain traction.
