
Consistent AssertionError when running intro notebooks with multiple gen calls in f-string #556

Closed
jgordley opened this issue Dec 21, 2023 · 4 comments

jgordley commented Dec 21, 2023

The bug
While running the guidance_acceleration.ipynb and guaranteeing_valid_syntax.ipynb examples, I consistently hit an assertion error after several tokens are generated. Am I doing something wrong with my setup?

--------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/home/telnyxuser/guidance/notebooks/tutorials/guidance_acceleration.ipynb
File home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:300, in Model.__add__(self, value)
    296     out = lm._run_stateless(value)
    298 # run stateful functions
    299 else:
--> 300     out = value(lm)
    301     if out is None:
    302         raise Exception(f"A guidance function did not return a model object! Did you forget to return the new lm at the end of your function?")

File home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/_grammar.py:45, in StatefulFunction.__call__(self, model)
     44 def __call__(self, model):
---> 45     return self.f(model, *self.args, **self.kwargs)

File home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:296, in Model.__add__(self, value)
    294 # run stateless functions (grammar nodes)
    295 elif isinstance(value, StatelessFunction):
--> 296     out = lm._run_stateless(value)
    298 # run stateful functions
    299 else:
    300     out = value(lm)

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:459, in Model._run_stateless(lm, stateless_function, temperature, top_p, n)
    457 delayed_bytes = b""
    458 # last_is_generated = False
--> 459 for new_bytes, is_generated, new_bytes_prob, capture_groups, capture_group_log_probs, new_token_count in gen_obj:
    460 
    461     # we make everything full probability if we are not computing uncertainty
    462     if not lm.compute_log_probs:
    463         new_bytes_prob = 1.0

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:774, in Model.__call__(self, grammar, max_tokens, n, top_p, temperature, ensure_bos_token)
    769 # otherwise we need to compute the logits and sample a valid token
    770 else:
    771 
    772     # if we were forced we might need to clean up the greedy tokenization to match the global tokenization behavior as seen in training
    773     if was_forced:
--> 774         token_ids,token_byte_positions = self._cleanup_tokens(token_ids, token_byte_positions)
    775         was_forced = False
    776     grammar_temp = parser.next_byte_temperature()

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:605, in Model._cleanup_tokens(self, token_ids, token_byte_positions)
    603         for i in range(1, len(token_byte_positions)):
    604             token_byte_positions[i] -= 1
--> 605     assert token_byte_positions[-1] == last_pos
    607 return token_ids, token_byte_positions

AssertionError:
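For context on what is failing: the assertion in `_cleanup_tokens` checks that, after guidance re-tokenizes the forced text to match the tokenizer's canonical segmentation, the cumulative byte offsets still end at the same position. A toy sketch of that invariant (the segmentations below are made up for illustration, not guidance internals):

```python
def byte_positions(tokens):
    """Cumulative end-of-token byte offsets, analogous to token_byte_positions."""
    pos, out = 0, []
    for tok in tokens:
        pos += len(tok.encode("utf-8"))
        out.append(pos)
    return out

# Two hypothetical segmentations of the same forced text "Do you want":
forced    = ["Do", " you", " wan", "t"]   # greedy split produced while forcing bytes
canonical = ["Do", " you", " want"]       # split the tokenizer would produce normally

# The cleanup step reassigns tokens to the canonical split; the invariant the
# failing assert checks is that the final byte offset is unchanged (11 here):
assert byte_positions(forced)[-1] == byte_positions(canonical)[-1]
```

When the model's generated tokens and the re-tokenized text disagree about total byte length, that final assertion trips, which is what the traceback above shows.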

To Reproduce
This error occurs without changing anything on the guidance_acceleration.ipynb notebook, and occurs after I introduce the HF transformers model code into the guaranteeing_valid_syntax.ipynb notebook like so:

from guidance import models

model = 'mistralai/Mistral-7B-v0.1'
device = 'cuda:1'
lm = models.Transformers(model, device=device)

The easiest example is from the README, using multiple generation calls (using just one call in an f-string works great):

from guidance import models, gen, select

model = 'mistralai/Mistral-7B-v0.1'
device = 'cuda'
lm = models.Transformers(model, device=device)

lm + f'''\
Do you want a joke or a poem? A {select(['joke', 'poem'])}.
Okay, here is a one-liner: "{gen(stop='"')}"'''

Output:

Do
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:288, in Model.__add__(self, value)
    286                 partial_grammar += string(part)
    287             is_id = not is_id
--> 288         out = lm + partial_grammar
    290 # if we find a null value we do nothing
    291 elif isinstance(value, Null):

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:296, in Model.__add__(self, value)
    294 # run stateless functions (grammar nodes)
    295 elif isinstance(value, StatelessFunction):
--> 296     out = lm._run_stateless(value)
    298 # run stateful functions
    299 else:
    300     out = value(lm)

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:459, in Model._run_stateless(lm, stateless_function, temperature, top_p, n)
...
    604             token_byte_positions[i] -= 1
--> 605     assert token_byte_positions[-1] == last_pos
    607 return token_ids, token_byte_positions

AssertionError:

System info (please complete the following information):

  • OS (e.g. Ubuntu, Windows 11, Mac OS, etc.): Ubuntu 22.04
  • Guidance Version (guidance.__version__): 0.1.8
  • GPU: MI100
@jgordley jgordley changed the title Consistent AssertionError when running intro notebooks with HF Transformers models Consistent AssertionError when running intro notebooks with multiple gen calls in f-string Dec 21, 2023
@slundberg (Collaborator) commented:

I can't seem to reproduce the issue on my side:

[attached screenshot]

I am on an A100, not an MI100, so I'm not sure if that makes a difference. If you have access to both AMD and NVIDIA hardware it might be worth checking (I don't have an MI100 handy). You might also share any other short examples that crop up, in case it is just a rounding-error issue between the devices.
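To illustrate the rounding-error hypothesis: when two token logits are nearly tied, device-dependent floating-point noise can flip the argmax and send generation down a different token path. The logit values below are made up for illustration:

```python
def argmax(logits):
    """Index of the largest logit, i.e. the greedily chosen token."""
    return max(range(len(logits)), key=logits.__getitem__)

# Hypothetical logits for the same prompt computed on two devices; they
# differ only by accumulated rounding noise on the order of 1e-6.
logits_device_a = [10.0000010, 10.0000000]
logits_device_b = [10.0000000, 10.0000012]

print(argmax(logits_device_a))  # 0
print(argmax(logits_device_b))  # 1 -- same prompt, different next token
```

A divergence like this early in a generation yields an entirely different token stream on each device, which is why a bug might reproduce on one GPU but not another.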

@slundberg (Collaborator) commented:

I think I fixed the error based on #555 (in v0.1.9) so I am closing this for now. Let me know if it crops back up!

@jgordley (Author) commented:

Great, thanks so much! I'll try it out later today :)

SuperMasterBlasterLaser commented Dec 30, 2023

@slundberg This error still happens for me on v0.1.10. I tried checking a few ideas.
My code is something like this:

from typing import List

import guidance
from guidance import gen, select

@guidance
def narrate(lm, location_name, location_description, genre, connected_locations):
    generation = gen(max_tokens=128, stop='\n', name='narrator')

    available_traversals = ''
    for cl in connected_locations:
        available_traversals += \
            f'''
            
            {cl['name']}
            '''

    narration_body = \
        f'''
        LOCATION: {location_name}
        DESCRIPTION: {location_description}
        GENRE: {genre}
        
        AVAILABLE OTHER TRAVERSALS:
        {available_traversals}
        
        
        Describe current situation in third person: {generation}
        '''
    lm += narration_body
    return lm

@guidance
def next_turn(lm):
    turn_body = f'''
    This is text adventure game.
    
    According to current narration. Decide to who's context to switch.
    
    Next turn is for {select(options=['NARRATOR', 'USER'], name='next_turn')}
    '''

    return lm + turn_body

@guidance
def background_image(lm, genre: List[str], location_name: str, location_description):
    generation = gen(max_tokens=128, stop='\n', name='image_result')

    location_body = \
        f'''LOCATION: {location_name}
        DESCRIPTION: {location_description}
        GENRE: {genre}
        
        Describe as a image. Use description and genre for describing: {generation}
        '''
    lm += location_body
    return lm

Then I call these functions from an infinite loop like this:

while True:
    location = codex.get_location(current_location_id)

    if current_turn == 'NARRATOR':
        lm += narrate(location['name'], location['description'], codex.genre, location['connected_locations'])
        print('[ NARRATOR ]')
        print(str(lm['narrator']))
        print()
        lm += background_image(codex.genre, location['name'], location['description'])
        print('[ BACKGROUND_IMAGE ]')
        print('Background', lm['image_result'])
        print()

    elif current_turn == 'USER':
        user_input = input('Input what to do:')
        lm += user_input

    print()
    print('+++++++++++')
    input("Press Enter to continue...")
    print()
    print('Selecting context...')
    lm += next_turn()

    current_turn = lm['next_turn']
    print(current_turn)

    continue

This produces the following error:

Traceback (most recent call last):
  File "D:\llm_testbed\main.py", line 103, in <module>
    lm += background_image(codex.genre, location['name'], location['description'])
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 306, in __add__
    out = value(lm)
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\_grammar.py", line 45, in __call__
    return self.f(model, *self.args, **self.kwargs)
  File "D:\llm_testbed\main.py", line 30, in background_image
    lm += location_body
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 294, in __add__
    out = lm + partial_grammar
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 302, in __add__
    out = lm._run_stateless(value)
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 465, in _run_stateless
    for new_bytes, is_generated, new_bytes_prob, capture_groups, capture_group_log_probs, new_token_count in gen_obj:
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 638, in __call__
    token_ids,token_byte_positions = self._cleanup_tokens(token_ids,token_byte_positions)
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 611, in _cleanup_tokens
    assert token_byte_positions[-1] == last_pos
AssertionError

The interesting thing is that, before background_image is called, narrate returns weird text:

[ NARRATOR ]
Th<|im_start|> assistant

This error happens on the SECOND cycle of the loop.

I'm using a locally downloaded TheBloke_OpenHermes-2.5-Mistral-7B-16k-AWQ model.
