SOTA 2-bit quants - part 2 #4856

Merged
merged 11 commits into from
Jan 11, 2024

Commits on Jan 9, 2024

  1. iq2_xs: basics

    Kawrakow committed Jan 9, 2024
    3569fa3
  2. 9f21b82
  3. 9b6e38d
  4. iq2_xs: WIP Metal

    Kawrakow committed Jan 9, 2024
    0aacd55
  5. iq2_xs: Metal now works

    Kawrakow committed Jan 9, 2024
    55e2cae
  6. ff49d87
  7. iq2_xs: better ARM_NEON dot product

    We are now at 19.5 t/s for TG-128 and 61 t/s for PP-512 when
    running on the CPU.
    Kawrakow committed Jan 9, 2024
    52ea3f7

Commits on Jan 10, 2024

  1. 3198e94
  2. iq2_xs: faster AVX2 dot product

    We are now at 21.4 t/s for TG-128 and 59.2 t/s for PP-512.
    The PP-512 figure is 2x that of the previous version.
    Kawrakow committed Jan 10, 2024
    8299b03
  3. a1610b0

Commits on Jan 11, 2024

  1. Add llama enum for IQ2_XS

    Kawrakow committed Jan 11, 2024
    9bfcb16