[Inference] Refactor modeling attention layer by abstracting attention backends #5771

Merged: 6 commits, Jun 10, 2024

Commits on Jun 7, 2024

  1. Refactor modeling by adding attention backend

    Signed-off-by: char-1ee <xingjianli59@gmail.com>
    char-1ee committed Jun 7, 2024
    04386d9
  2. Fix tests and naming

    Signed-off-by: char-1ee <xingjianli59@gmail.com>
    char-1ee committed Jun 7, 2024
    eec77e5
  3. Pass inference model shard configs for module init

    Signed-off-by: char-1ee <xingjianli59@gmail.com>
    char-1ee committed Jun 7, 2024
    5f398fc
  4. Clean up

    Signed-off-by: char-1ee <xingjianli59@gmail.com>
    char-1ee committed Jun 7, 2024
    ceba662
  5. Remove flash attention backend

    Signed-off-by: char-1ee <xingjianli59@gmail.com>
    char-1ee committed Jun 7, 2024
    f5981e8

Commits on Jun 10, 2024

  1. Fix test import

    Signed-off-by: char-1ee <xingjianli59@gmail.com>
    char-1ee committed Jun 10, 2024
    b303976