Releases: xorbitsai/inference

v1.3.0.post2

22 Feb 15:30
378a47a

What's new in 1.3.0.post2 (2025-02-22)

These are the changes in inference v1.3.0.post2.

Bug fixes

Full Changelog: v1.3.0.post1...v1.3.0.post2

v1.3.0.post1

21 Feb 16:14
b2004d4

What's new in 1.3.0.post1 (2025-02-21)

These are the changes in inference v1.3.0.post1.

New features

Enhancements

  • ENH: add GPU utilization info by @amumu96 in #2852
  • ENH: Update Kokoro model by @codingl2k1 in #2843
  • ENH: cmdline supports --n-worker, add --model-path and make it compatible with --model_path by @qinxuye in #2890
  • BLD: update sglang to v0.4.2.post4 and vllm to v0.7.2 by @qinxuye in #2838
  • BLD: fix flashinfer installation in dockerfile by @qinxuye in #2844
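
The --model-path / --model_path entry above describes a command-line option that accepts two spellings. A minimal sketch of that idea, assuming a plain argparse parser: the parser name, the `build_parser` and `parse` helpers, and the defaults are hypothetical and are not taken from the project's own command-line code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration only; not the project's actual CLI.
    parser = argparse.ArgumentParser(prog="launch-sketch")
    # Register both spellings as aliases of one option with a shared destination,
    # so existing scripts that use the underscore form keep working.
    parser.add_argument("--model-path", "--model_path", dest="model_path", default=None)
    # Assumed integer option mirroring the --n-worker flag named in the entry above.
    parser.add_argument("--n-worker", dest="n_worker", type=int, default=1)
    return parser


def parse(argv):
    """Parse a list of command-line tokens and return the resulting namespace."""
    return build_parser().parse_args(argv)
```

Either spelling then populates the same attribute, for example `parse(["--model_path", "/tmp/m"]).model_path`.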

Bug fixes

Tests

Documentation

Others

  • CHORE: Xavier now supports vLLM >= 0.7.0, drops support for older versions by @ChengjieLi28 in #2886

New Contributors

Full Changelog: v1.2.2...v1.3.0.post1

v1.2.2

08 Feb 09:28
ac97a13

What's new in 1.2.2 (2025-02-08)

These are the changes in inference v1.2.2.

New features

Bug fixes

  • BUG: fix llama-cpp when some quantizations have multiple parts by @qinxuye in #2786
  • BUG: Use Cache class instead of raw tuple for transformers continuous batching, compatible with latest transformers by @ChengjieLi28 in #2820
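
The llama-cpp entry above concerns quantizations whose weights are split across several files. A rough sketch of how such parts could be grouped and ordered, assuming the `<stem>-00001-of-00003.gguf` naming pattern produced by llama.cpp's split tool: the pattern, the `PART_RE` regex, and the `collect_parts` helper are assumptions for illustration, not the fix made in #2786.

```python
import re

# Assumed naming pattern for split GGUF files: <stem>-<index>-of-<total>.gguf
PART_RE = re.compile(r"^(?P<stem>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def collect_parts(filenames):
    """Group GGUF files by stem and return each group's files ordered by part index."""
    groups = {}
    for name in filenames:
        match = PART_RE.match(name)
        if match is None:
            # Not a split file: treat it as a single-part group keyed by its own name.
            groups.setdefault(name, []).append((1, name))
            continue
        groups.setdefault(match["stem"], []).append((int(match["index"]), name))
    # Sort each group by part index and drop the index from the result.
    return {stem: [n for _, n in sorted(parts)] for stem, parts in groups.items()}
```

A loader would typically be handed the first part of a group and be expected to locate the rest itself, which is why the ordering matters.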

Documentation

New Contributors

Full Changelog: v1.2.1...v1.2.2

v1.2.1

24 Jan 08:59
a57b99b

What's new in 1.2.1 (2025-01-24)

These are the changes in inference v1.2.1.

New features

Enhancements

Bug fixes

Tests

Documentation

  • DOC: update new models in README and doc by @qinxuye in #2761
  • DOC: using discord instead of slack & updating model to qwen2.5 in getting started doc by @qinxuye in #2775

Others

  • FIX: [UI] normalize language input to ensure consistent array format. by @yiboyasss in #2771

New Contributors

Full Changelog: v1.2.0...v1.2.1

v1.2.0

10 Jan 09:34
df45f11

What's new in 1.2.0 (2025-01-10)

These are the changes in inference v1.2.0.

New features

Enhancements

  • ENH: [UI] Update Button Style and Interaction Logic for Editing Cache in Model Card. by @yiboyasss in #2746
  • ENH: Improve error message by @codingl2k1 in #2738

Bug fixes

Others

New Contributors

Full Changelog: v1.1.1...v1.2.0

v1.1.1

27 Dec 10:21
d342869

What's new in 1.1.1 (2024-12-27)

These are the changes in inference v1.1.1.

New features

Enhancements

Bug fixes

New Contributors

Full Changelog: v1.1.0...v1.1.1

v1.1.0

13 Dec 10:29
b132fca

What's new in 1.1.0 (2024-12-13)

These are the changes in inference v1.1.0.

New features

Enhancements

  • ENH: Optimize error message when user parameters are passed incorrectly by @namecd in #2623
  • ENH: pass the sampling parameter skip_special_tokens through to the vLLM backend by @zjuyzj in #2655
  • ENH: unify prompt_text as cosyvoice for fish speech by @qinxuye in #2658
  • ENH: Update glm4 chat model to new weights by @codingl2k1 in #2660
  • ENH: upgrade sglang in Docker by @amumu96 in #2668

Bug fixes

Documentation

Others

New Contributors

Full Changelog: v1.0.1...v1.1.0

v1.0.1

29 Nov 10:22
8dd5715

What's new in 1.0.1 (2024-11-29)

These are the changes in inference v1.0.1.

New features

Enhancements

Bug fixes

Documentation

New Contributors

Full Changelog: v1.0.0...v1.0.1

v1.0.0

15 Nov 10:15
4c96475

What's new in 1.0.0 (2024-11-15)

These are the changes in inference v1.0.0.

New features

Enhancements

Bug fixes

Documentation

Full Changelog: v0.16.3...v1.0.0

v0.16.3

08 Nov 05:47
85ab86b

What's new in 0.16.3 (2024-11-08)

These are the changes in inference v0.16.3.

New features

Enhancements

Bug fixes

Full Changelog: v0.16.2...v0.16.3