
Consistent AssertionError when running intro notebooks with multiple gen calls in f-string #556

Closed
jgordley opened this issue Dec 21, 2023 · 4 comments

jgordley commented Dec 21, 2023

The bug
While running the guidance_acceleration.ipynb and guaranteeing_valid_syntax.ipynb examples, I consistently hit an assertion error after several tokens are generated. Am I doing something wrong with my setup?

--------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/home/telnyxuser/guidance/notebooks/tutorials/guidance_acceleration.ipynb
File home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:300, in Model.__add__(self, value)
    296     out = lm._run_stateless(value)
    298 # run stateful functions
    299 else:
--> 300     out = value(lm)
    301     if out is None:
    302         raise Exception(f"A guidance function did not return a model object! Did you forget to return the new lm at the end of your function?")

File home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/_grammar.py:45, in StatefulFunction.__call__(self, model)
     44 def __call__(self, model):
---> 45     return self.f(model, *self.args, **self.kwargs)

File home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:296, in Model.__add__(self, value)
    294 # run stateless functions (grammar nodes)
    295 elif isinstance(value, StatelessFunction):
--> 296     out = lm._run_stateless(value)
    298 # run stateful functions
    299 else:
    300     out = value(lm)

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:459, in Model._run_stateless(lm, stateless_function, temperature, top_p, n)
    457 delayed_bytes = b""
    458 # last_is_generated = False
--> 459 for new_bytes, is_generated, new_bytes_prob, capture_groups, capture_group_log_probs, new_token_count in gen_obj:
    460 
    461     # we make everything full probability if we are not computing uncertainty
    462     if not lm.compute_log_probs:
    463         new_bytes_prob = 1.0

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:774, in Model.__call__(self, grammar, max_tokens, n, top_p, temperature, ensure_bos_token)
    769 # otherwise we need to compute the logits and sample a valid token
    770 else:
    771 
    772     # if we were forced we might need to clean up the greedy tokenization to match the global tokenization behavior as seen in training
    773     if was_forced:
--> 774         token_ids,token_byte_positions = self._cleanup_tokens(token_ids, token_byte_positions)
    775         was_forced = False
    776     grammar_temp = parser.next_byte_temperature()

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:605, in Model._cleanup_tokens(self, token_ids, token_byte_positions)
    603         for i in range(1, len(token_byte_positions)):
    604             token_byte_positions[i] -= 1
--> 605     assert token_byte_positions[-1] == last_pos
    607 return token_ids, token_byte_positions

AssertionError:
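For context on what is failing: the assertion in `_cleanup_tokens` checks that, after guidance re-tokenizes the forced text to match the tokenizer's canonical segmentation, the cumulative byte offsets still end at the same position. A toy sketch of that invariant (the segmentations below are made up for illustration, not guidance internals):

```python
def byte_positions(tokens):
    """Cumulative end-of-token byte offsets, analogous to token_byte_positions."""
    pos, out = 0, []
    for tok in tokens:
        pos += len(tok.encode("utf-8"))
        out.append(pos)
    return out

# Two hypothetical segmentations of the same forced text "Do you want":
forced    = ["Do", " you", " wan", "t"]   # greedy split produced while forcing bytes
canonical = ["Do", " you", " want"]       # split the tokenizer would produce normally

# The cleanup step reassigns tokens to the canonical split; the invariant the
# failing assert checks is that the final byte offset is unchanged (11 here):
assert byte_positions(forced)[-1] == byte_positions(canonical)[-1]
```

When the model's generated tokens and the re-tokenized text disagree about total byte length, that final assertion trips, which is what the traceback above shows.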

To Reproduce
This error occurs without changing anything on the guidance_acceleration.ipynb notebook, and occurs after I introduce the HF transformers model code into the guaranteeing_valid_syntax.ipynb notebook like so:

from guidance import models

model = 'mistralai/Mistral-7B-v0.1'
device = 'cuda:1'
lm = models.Transformers(model, device=device)

The easiest example is from the README, using multiple generation calls (using just one call in an f-string works great):

from guidance import models, gen, select

model = 'mistralai/Mistral-7B-v0.1'
device = 'cuda'
lm = models.Transformers(model, device=device)

lm + f'''\
Do you want a joke or a poem? A {select(['joke', 'poem'])}.
Okay, here is a one-liner: "{gen(stop='"')}"'''

Output:

Do
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:288, in Model.__add__(self, value)
    286                 partial_grammar += string(part)
    287             is_id = not is_id
--> 288         out = lm + partial_grammar
    290 # if we find a null value we do nothing
    291 elif isinstance(value, Null):

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:296, in Model.__add__(self, value)
    294 # run stateless functions (grammar nodes)
    295 elif isinstance(value, StatelessFunction):
--> 296     out = lm._run_stateless(value)
    298 # run stateful functions
    299 else:
    300     out = value(lm)

File /home/telnyxuser/guidance/notebooks/tutorials/~/.local/lib/python3.10/site-packages/guidance/models/_model.py:459, in Model._run_stateless(lm, stateless_function, temperature, top_p, n)
...
    604             token_byte_positions[i] -= 1
--> 605     assert token_byte_positions[-1] == last_pos
    607 return token_ids, token_byte_positions

AssertionError:

System info (please complete the following information):

  • OS (e.g. Ubuntu, Windows 11, Mac OS, etc.): Ubuntu 22.04
  • Guidance Version (guidance.__version__): 0.1.8
  • GPU: MI100
@jgordley jgordley changed the title Consistent AssertionError when running intro notebooks with HF Transformers models Consistent AssertionError when running intro notebooks with multiple gen calls in f-string Dec 21, 2023
@slundberg (Collaborator) commented:

I can't seem to reproduce the issue on my side:

[attached screenshot]

I am on an A100, not an MI100, so I'm not sure if that makes a difference. If you have access to both AMD and NVIDIA hardware it might be worth checking (I don't have an MI100 handy). You might also share any other short examples that crop up, in case it is just a rounding-error issue between the devices.
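To illustrate the rounding-error hypothesis: when two token logits are nearly tied, device-dependent floating-point noise can flip the argmax and send generation down a different token path. The logit values below are made up for illustration:

```python
def argmax(logits):
    """Index of the largest logit, i.e. the greedily chosen token."""
    return max(range(len(logits)), key=logits.__getitem__)

# Hypothetical logits for the same prompt computed on two devices; they
# differ only by accumulated rounding noise on the order of 1e-6.
logits_device_a = [10.0000010, 10.0000000]
logits_device_b = [10.0000000, 10.0000012]

print(argmax(logits_device_a))  # 0
print(argmax(logits_device_b))  # 1 -- same prompt, different next token
```

A divergence like this early in a generation yields an entirely different token stream on each device, which is why a bug might reproduce on one GPU but not another.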

@slundberg (Collaborator) commented:

I think I fixed the error based on #555 (in v0.1.9) so I am closing this for now. Let me know if it crops back up!

@jgordley (Author) commented:

Great, thanks so much! I'll try it out later today :)

SuperMasterBlasterLaser commented Dec 30, 2023

@slundberg This error still happens for me on v0.1.10. I tried checking a few ideas.
My code is something like this:

from typing import List

import guidance
from guidance import gen, select

@guidance
def narrate(lm, location_name, location_description, genre, connected_locations):
    generation = gen(max_tokens=128, stop='\n', name='narrator')

    available_traversals = ''
    for cl in connected_locations:
        available_traversals += \
            f'''
            
            {cl['name']}
            '''

    narration_body = \
        f'''
        LOCATION: {location_name}
        DESCRIPTION: {location_description}
        GENRE: {genre}
        
        AVAILABLE OTHER TRAVERSALS:
        {available_traversals}
        
        
        Describe current situation in third person: {generation}
        '''
    lm += narration_body
    return lm

@guidance
def next_turn(lm):
    turn_body = f'''
    This is text adventure game.
    
    According to current narration. Decide to who's context to switch.
    
    Next turn is for {select(options=['NARRATOR', 'USER'], name='next_turn')}
    '''

    return lm + turn_body

@guidance
def background_image(lm, genre: List[str], location_name: str, location_description):
    generation = gen(max_tokens=128, stop='\n', name='image_result')

    location_body = \
        f'''LOCATION: {location_name}
        DESCRIPTION: {location_description}
        GENRE: {genre}
        
        Describe as a image. Use description and genre for describing: {generation}
        '''
    lm += location_body
    return lm

Then I call these functions from an infinite loop like this:

while True:
    location = codex.get_location(current_location_id)

    if current_turn == 'NARRATOR':
        lm += narrate(location['name'], location['description'], codex.genre, location['connected_locations'])
        print('[ NARRATOR ]')
        print(str(lm['narrator']))
        print()
        lm += background_image(codex.genre, location['name'], location['description'])
        print('[ BACKGROUND_IMAGE ]')
        print('Background', lm['image_result'])
        print()

    elif current_turn == 'USER':
        user_input = input('Input what to do:')
        lm += user_input

    print()
    print('+++++++++++')
    input("Press Enter to continue...")
    print()
    print('Selecting context...')
    lm += next_turn()

    current_turn = lm['next_turn']
    print(current_turn)

    continue

This produces the following error:

Traceback (most recent call last):
  File "D:\llm_testbed\main.py", line 103, in <module>
    lm += background_image(codex.genre, location['name'], location['description'])
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 306, in __add__
    out = value(lm)
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\_grammar.py", line 45, in __call__
    return self.f(model, *self.args, **self.kwargs)
  File "D:\llm_testbed\main.py", line 30, in background_image
    lm += location_body
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 294, in __add__
    out = lm + partial_grammar
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 302, in __add__
    out = lm._run_stateless(value)
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 465, in _run_stateless
    for new_bytes, is_generated, new_bytes_prob, capture_groups, capture_group_log_probs, new_token_count in gen_obj:
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 638, in __call__
    token_ids,token_byte_positions = self._cleanup_tokens(token_ids,token_byte_positions)
  File "D:\virtualenv310\llm_testbed\lib\site-packages\guidance\models\_model.py", line 611, in _cleanup_tokens
    assert token_byte_positions[-1] == last_pos
AssertionError

The interesting thing is that, before background_image is called, narrate returns weird text:

[ NARRATOR ]
Th<|im_start|> assistant

This error happens on the SECOND cycle of the loop.

I'm using a locally downloaded TheBloke_OpenHermes-2.5-Mistral-7B-16k-AWQ model.
