Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(task): if finish task output is None, it should not be processed #155

Merged
merged 2 commits into from
Aug 6, 2021

Conversation

anjakefala
Copy link
Contributor

@anjakefala anjakefala commented Aug 4, 2021

  • fixes bug where pipeline exceptions, if a task returns a None as a
    result to .finish
59
 5360.3s [MPI  89/128] - INFO     draco.analysis.sidereal.SiderealGrouper: self._timestream_list: [<draco.core.containers.TimeStream object at 0x2b3ac5d29220>, <draco.core.containers.TimeStream object at 0x2b3ac5cc61f0>]                                                                                                                                                                
   5360.3s [MPI   0/128] - INFO     draco.analysis.transform.CollateProducts: Profiling pipeline: 6 cores available.                                                                                                                                                                                                                                                                          
 Traceback (most recent call last):                                                                                                                                                                                                                                                                                                                                                           
   File "/scratch/anja/code/caput/caput/scripts/runner.py", line 555, in <module>                                                                                                                                                                                                                                                                                                             
     cli()                                                                                                                                                                                                                                                                                                                                                                                    
   File "/project/rpp-chime/chime/chime_env/modules/chime/python/2021.03/lib/python3.8/site-packages/click/core.py", line 829, in __call__                                                                                                                                                                                                                                                    
     return self.main(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                        
   File "/project/rpp-chime/chime/chime_env/modules/chime/python/2021.03/lib/python3.8/site-packages/click/core.py", line 782, in main                                                                                                                                                                                                                                                        
     rv = self.invoke(ctx)                                                                                                                                                                                                                                                                                                                                                                    
   File "/project/rpp-chime/chime/chime_env/modules/chime/python/2021.03/lib/python3.8/site-packages/click/core.py", line 1259, in invoke                                                                                                                                                                                                                                                     
     return _process_result(sub_ctx.command.invoke(sub_ctx))                                                                                                                                                                                                                                                                                                                                  
   File "/project/rpp-chime/chime/chime_env/modules/chime/python/2021.03/lib/python3.8/site-packages/click/core.py", line 1066, in invoke                                                                                                                                                                                                                                                     
     return ctx.invoke(self.callback, **ctx.params)                                                                                                                                                                                                                                                                                                                                           
   File "/project/rpp-chime/chime/chime_env/modules/chime/python/2021.03/lib/python3.8/site-packages/click/core.py", line 610, in invoke                                                                                                                                                                                                                                                      
     return callback(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                         
   File "/scratch/anja/code/caput/caput/scripts/runner.py", line 161, in run                                                                                                                                                                                                                                                                                                                  
     P.run()                                                                                                                                                                                                                                                                                                                                                                                  
   File "/scratch/anja/code/caput/caput/pipeline.py", line 633, in run                                                                                                                                                                                                                                                                                                                        
     out = task._pipeline_next()                                                                                                                                                                                                                                                                                                                                                              
   File "/scratch/anja/code/caput/caput/pipeline.py", line 1094, in _pipeline_next                                                                                                                                                                                                                                                                                                            
     out = self.finish()                                                                                                                                                                                                                                                                                                                                                                      
   File "/scratch/anja/code/draco/draco/core/task.py", line 359, in finish                                                                                                                                                                                                                                                                                                                    
     output = self._process_output(output)                                                                                                                                                                                                                                                                                                                                                    
   File "/scratch/anja/code/draco/draco/core/task.py", line 369, in _process_output                                                                                                                                                                                                                                                                                                           
     count=self._count, tag=output.attrs.get("tag", input_tag or self._count)                                                                                                                                                                                                                                                                                                                 
 AttributeError: 'NoneType' object has no attribute 'attrs'    

I saw at least a few places where the pipeline would exception, if output was a None. In another part of task.py, the output check is done before it is passed to _process_output. I repeated that for finish(), but it also did not make sense to me for _process_output to generally exception on a None. So I added the check there too.

A None should be returned in these cases, because then _check_task_output does the right thing:

     def _check_task_output(out, task):                                                                                                                                                                            
    ....                                                                                                                                                                                            
         if out is None:  # This iteration supplied no output                                                                                                                                                      
             return None

@anjakefala anjakefala requested a review from jrs65 August 4, 2021 22:55
@@ -364,6 +369,10 @@ def finish(self):

def _process_output(self, output, input_tag=None):

# when output is None, there is nothing to write or check
if output is None:
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something more dramatic should happen here. This routine should only ever be called with a valid memh5 container, and so it's probably more appropriate to raise a specific exception than silently return None. Probably this test should be something more specific like not isinstance(output, memh5.MemDiskGroup) (I am not entirely sure that's the appropriate class to test for).

The same is true of _nan_process_output.

Copy link
Contributor Author

@anjakefala anjakefala Aug 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used memh5.MemDiskGroup and memh5.MemGroup, but got this exception with the tests:

        if not isinstance(output, memh5.MemDiskGroup) or not isinstance(
            output, memh5.MemGroup
        ):
            raise pipeline.PipelineRuntimeError(
>               f"Task must output a valid memh5 container; given {type(output)}"
            )
E           caput.pipeline.PipelineRuntimeError: Task must output a valid memh5 container; given <class 'draco.core.containers.SiderealStream'>

Is the test realistically depicting the standard pipeline by returning a SiderealStream container?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the test realistically depicting the standard pipeline by returning a SiderealStream container?

Yes.

The bug is in the boolean logic. You almost certainly want not (isinstance(output, memh5.MemDiskGroup) or isinstance(output, memh5.MemGroup) or the more succinct not isinstance(output, (memh5.MemDiskgroup, memh5.MemGroup)).

What you've written is equivalent to not (isinstance(output, MemDiskgroup) and isinstance(output, MemGroup))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man, you are totally right, what was I doing, I was thinking in English.

Copy link
Contributor

@jrs65 jrs65 Aug 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH you probably don't want to check for MemGroup at least within the branch that deals with saving. Ultimately it calls output.save(...) (within BasicContMixin.write_output) and that only works on subclasses of MemDiskGroup , but will fail if given a MemGroup as it doesn't guarantee a save method exists.

@anjakefala anjakefala force-pushed the handle-None branch 3 times, most recently from e753419 to 5938eb8 Compare August 5, 2021 22:57
- fixes bug where pipeline exceptions, if a task returns a None as a
result to `.finish`
@anjakefala anjakefala requested a review from jrs65 August 5, 2021 23:41
@jrs65 jrs65 merged commit 0f1b503 into master Aug 6, 2021
@jrs65 jrs65 deleted the handle-None branch August 6, 2021 06:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants