09 Mar 17:14

caf16a8

v1.7 Latest

LLM Tournament v1.7 Release Notes

Release Date: March 10, 2025

Overview

Version 1.7 introduces a tiered scoring system for finer-grained evaluation of LLMs. This release adds an 11-tier classification with spiritual names, a revised color scheme, and improved score distribution for more accurate model ranking.

New Features

Tiered Scoring System

Implemented an 11-tier classification system with spiritual names (Divine, Legendary, Mythical, etc.)
Updated the scoring scale to a 0-3000+ Elo range
Introduced a color scheme with reversed colors for a clearer visual hierarchy
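
To illustrate how a score on the 0-3000+ scale might map onto 11 tiers, here is a minimal sketch. It assumes equal 300-point bands with everything at or above 3000 in the top tier; the release notes do not state the actual boundaries, so `bandWidth`, `tierCount`, and `tierIndex` are hypothetical names, and tiers are numbered rather than named because only three of the tier names are listed above.

```go
package main

import "fmt"

// Assumed for illustration only: equal 300-point bands. The release notes
// give the tier count (11) and the 0-3000+ range, not the boundaries.
const (
	tierCount = 11
	bandWidth = 300
)

// tierIndex maps a score to a tier number, 0 (lowest) through 10 (highest).
// Negative scores clamp to tier 0; scores of 3000 and above share the top tier.
func tierIndex(score int) int {
	if score < 0 {
		return 0
	}
	idx := score / bandWidth
	if idx > tierCount-1 {
		idx = tierCount - 1
	}
	return idx
}

func main() {
	for _, s := range []int{0, 299, 300, 2999, 3000, 4500} {
		fmt.Printf("score %4d -> tier %d\n", s, tierIndex(s))
	}
}
```

Clamping at both ends keeps the open-ended "3000+" part of the scale in a single top tier instead of producing an out-of-range index.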

Mock Data Generation

Added even distribution functionality to mock results generation
Updated score weights for random score generation
Implemented proper random number generator seeding for more consistent testing
Updated mock scores generation to support the 11-tier system
Added specific score distribution for tier testing
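
As a rough sketch of seeded, evenly distributed mock scores, the example below cycles through equal-width bands and draws each score from a generator created with a fixed seed. It assumes that "proper seeding" means a repeatable seed; the function name `mockScores`, its parameters, and the band layout are hypothetical and not taken from the project.

```go
package main

import (
	"fmt"
	"math/rand"
)

// mockScores returns n scores in [0, maxScore), cycling through `bands`
// equal-width bands so every band receives a near-equal share of scores.
// The same seed always yields the same slice.
func mockScores(seed int64, n, bands, maxScore int) []int {
	if n <= 0 || bands <= 0 || maxScore < bands {
		return nil
	}
	rng := rand.New(rand.NewSource(seed)) // private generator with a fixed seed
	width := maxScore / bands             // at least 1 because maxScore >= bands
	scores := make([]int, n)
	for i := range scores {
		band := i % bands
		scores[i] = band*width + rng.Intn(width)
	}
	return scores
}

func main() {
	a := mockScores(42, 22, 11, 3300)
	b := mockScores(42, 22, 11, 3300)
	fmt.Println("scores:", a)
	fmt.Println("repeatable:", fmt.Sprint(a) == fmt.Sprint(b))
}
```

Using a private `*rand.Rand` rather than the package-level generator keeps test runs independent of any other code that draws random numbers.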

Model Updates

Added phi-4-mini to the model roster

Bug Fixes

Fixed UI flicker and state issues with "Generate Random Mock Scores" button
Ensured consistent score calculation and display throughout the application
Fixed sorting consistency in stats chart
Ensured total score chart sorts in descending order
Added a guard that ensures arguments passed to the random number generator are positive
Fixed client-side handling of sorted data to prevent display inconsistencies
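
Two of the fixes above can be sketched in a few lines: a descending sort with a deterministic tie-break, and a guard around `rand.Intn`, which panics when its argument is not positive. The type `modelTotal`, the helper names, and the model names other than phi-4-mini are hypothetical; this shows the general technique, not the project's actual code.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// modelTotal is a hypothetical row in a total-score chart.
type modelTotal struct {
	Name  string
	Total int
}

// sortDescending orders totals from highest to lowest. Ties fall back to the
// model name so repeated renders produce the same order.
func sortDescending(totals []modelTotal) {
	sort.SliceStable(totals, func(i, j int) bool {
		if totals[i].Total != totals[j].Total {
			return totals[i].Total > totals[j].Total
		}
		return totals[i].Name < totals[j].Name
	})
}

// newSeededRNG returns a private generator with a fixed seed.
func newSeededRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// safeIntn wraps rng.Intn and returns 0 instead of panicking when n <= 0.
func safeIntn(rng *rand.Rand, n int) int {
	if n <= 0 {
		return 0
	}
	return rng.Intn(n)
}

func main() {
	totals := []modelTotal{{"model-b", 1200}, {"phi-4-mini", 1800}, {"model-a", 1200}}
	sortDescending(totals)
	fmt.Println(totals)

	rng := newSeededRNG(1)
	fmt.Println(safeIntn(rng, 0), safeIntn(rng, -3)) // guarded calls, no panic
}
```

Returning 0 for a non-positive argument is one possible policy; returning an error would be equally reasonable if callers need to notice the bad input.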

Under the Hood

Removed unused math import
Updated tier names to single words for clarity and consistency
Various code optimizations for better performance

Breaking Changes

None. All changes are backward compatible with existing data.

Upgrade Instructions

Standard update procedure:

Pull the latest code
Restart the application

No data migration is required for this update.

06 Mar 19:30

lavantien

v1.6

a77b401

Changelog - v1.6 (March 2025)

🚀 New Features

Added screenshots to README showcasing the program
Enhanced UI with emoji button replacements for improved intuitiveness
Added "Previous" button in results page to restore previous state after changes
Improved evaluate page with markdown-formatted prompt and solution display
Added copy button to capture raw markdown prompt text
Added Previous/Next navigation buttons for prompt browsing
Added math functions to template system for advanced calculations
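
For readers unfamiliar with template math helpers, the sketch below shows one common way to expose arithmetic to Go's `text/template` through a `FuncMap`. The helper names `add`, `sub`, and `mul` and the `render` function are assumptions for illustration; the changelog does not list which functions were actually added. The `Index` and `TotalPrompts` keys are used here only as sample data.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// mathFuncs holds hypothetical helper names; the changelog does not list
// the functions the project actually registers.
var mathFuncs = template.FuncMap{
	"add": func(a, b int) int { return a + b },
	"sub": func(a, b int) int { return a - b },
	"mul": func(a, b int) int { return a * b },
}

// render parses and executes a template string with the math helpers available.
func render(tmpl string, data any) (string, error) {
	t, err := template.New("demo").Funcs(mathFuncs).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := map[string]int{"Index": 0, "TotalPrompts": 20}
	out, err := render("Prompt {{add .Index 1}} of {{.TotalPrompts}}", data)
	if err != nil {
		fmt.Println("render error:", err)
		return
	}
	fmt.Println(out)
}
```

Helpers like these let a template turn a zero-based index into a one-based label without extra fields in the template data.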

🛠️ Fixes

Fixed various UI rendering issues in results page:
- Ensured table rows display correctly with improved debug logging
- Corrected template type mismatch in score button coloring
- Added nil check to ModelFilter in results template
- Fixed score buttons that were displaying too large
- Ensured raw markdown is copied instead of processed HTML
Resolved WebSocket handling issues:
- Improved error logging
- Enhanced connection stability
- Fixed initial table rendering on first load
Fixed JSON parsing error in "Generate Random Mock Scores" functionality
Fixed prompt index display in evaluate page
Added TotalPrompts field to template data for proper rendering
Optimized button arrangement for better usability

💅 UI Improvements

Refactored overall UI for improved user experience
Enhanced results page layout and design
Center-aligned model and prompt headers for better readability
Reduced progress bar region width to 20% of screen for more balanced display

📦 Updates

Updated model definitions and configurations
Updated Anthropic Claude Thinking 96K pipeline to v0.4

01 Mar 18:46

lavantien

v1.5

aa07152

add more tools: openwebui/pipes/anthropic-claude-thinking-96k
add tools section to readme
fix chart overflow bug
beautify the UI
update screenshots

28 Feb 20:12

lavantien

v1.4

b44b9ce

modularize the code base
granular scoring schemes and matching color scheme for cells
comprehensive stats page with tiered ranking
random persistent mock scores generator and ensure live update
fix prompt suite rename issue
import/export to json instead of csv
enhance websockets logic and stability
enhance readme quality
streamlined the prompt and contestant counts to 20
support aider configs for o1 high, o3-mini high, v3, r1, 3.7 sonnet, codestral

19 Jan 05:30

lavantien

v1.3

795791f

prompts page now has another filter: by profile
full set of system prompts in XML for different purposes
tools - good and lightweight local TTS powered by Kokoro 82M and ONNX
enhance prompts quality and readme quality
update contestant list to 33
update default prompt list to 33

18 Jan 21:10

lavantien

v1.2

8a14fef

optimize project structure and simplify the code base
a change to a profile's name is now also reflected in the rendering and selection of related prompts
full text search in profile page can now properly handle xml

17 Jan 18:43

lavantien

v1.1

51f4161

added copy button to profile
update contestant list to 32
update default prompt list to 32

15 Jan 08:03

lavantien

v1.0

62b0185

all functionalities finished
performance optimized
prepared contestant list (30 models)
prepared prompt suite (30 quality prompts)
prepared profiles (chain-of-thought + ReAct reasoning system-prompt, pali-vietnamese-translating system-prompt)

Releases: lavantien/llm-tournament
