How do companies' LLMs compare over the last 12 months? I have selected the top-ranked LLM from each company at any one time and created a timelapse of LMSys Elo scores over the last 12 months. Some highlights:
- OpenAI stayed on top for almost the whole year (briefly overtaken by Anthropic)
- Open source was kept in the game by Meta; other vendors mostly dropped off during 2024
- Google was not very convincing for most of the year, but Gemini 1.5 may have cemented it in the top 3 (for now)

As usual, data credit goes to LMSYS Chatbot Arena: https://chat.lmsys.org/ The visualisation is here: https://lnkd.in/etb9gRAg
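For anyone unfamiliar with the scores being plotted: Chatbot Arena ratings are Elo-style, derived from pairwise human votes between anonymised models (LMSYS has also used a Bradley-Terry fit; see their site for the exact methodology). As a rough sketch only, the classic online Elo update after a single head-to-head vote looks like this, with a hypothetical k-factor of 32:

```python
def elo_update(r_winner, r_loser, k=32):
    """One online Elo update after a single head-to-head vote.

    The winner's expected win probability follows the logistic
    curve with a 400-point scale; both ratings move by the same
    amount in opposite directions.
    """
    expected_win = 1 / (1 + 10 ** ((r_loser - r_winner) / 400))
    delta = k * (1 - expected_win)
    return r_winner + delta, r_loser - delta

# Two models start level at 1000; model A wins one matchup.
a, b = elo_update(1000, 1000)  # a -> 1016.0, b -> 984.0
```

An upset (a low-rated model beating a high-rated one) moves the ratings more than an expected result, which is why the leaderboard converges as votes accumulate.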
Thanks for sharing. Is there a place where I can read about the methodology behind the benchmark?
The Fate Of Humanity 📈
This is great! How do you make these amazing visualizations?
Great share buddy
I love this chart Peter, I am hoping that the real OSS licensed models come back up in the rankings.
🙃
Slick! What tool did you use to create the animation?
Thanks Peter / confirms my gut feel😉 Great to see the Chatbot Arena - promoted by yourself at an early stage - gaining traction.
Peter Gostev Great visual of the LMSys Elo leaderboard! Something I haven't seen is a discussion of why there is an Elo score gap between the proprietary and "open weight" models. Given the cost of compute, is it purely economic, since the GPT-4 class of models are estimated to have 1000B parameters while open models top out at (maybe) 400B? Or is it training corpus size and diversity ... again favoring the "deeper pocket" LLM companies?