Postmodeling weird error #691

nanounanue · 2019-05-09T00:11:40Z

This code used to work:

audited_models_class.plot_prec_across_time(param_type='rank_pct',
                                           param=10,
                                           baseline=True,
                                           baseline_query=params.baseline_query,
                                           metric='precision@',
                                           figsize=params.figsize)

(This is from dirtyduck)

But with the new version of triage, I now get this:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-a829ddd494bf> in <module>
      3                                            baseline=True,
      4                                            baseline_query=params.baseline_query,
----> 5                                            metric='precision@')

/usr/local/lib/python3.6/site-packages/triage/component/postmodeling/contrast/model_group_evaluator.py in plot_prec_across_time(self, param_type, param, metric, baseline, baseline_query, df, figsize, fontsize)
    273         model_metrics[['param', 'param_type']] = \
    274                 model_metrics['parameter'].str.split('_', 1, expand=True)
--> 275         model_metrics['param'] =  model_metrics['param'].astype(str).astype(float)
    276         model_metrics['param_type'] = model_metrics['param_type'].apply(lambda x: 'rank_'+x)
    277 

/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in astype(self, dtype, copy, errors, **kwargs)
   5689             # else, only a single dtype is given
   5690             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 5691                                          **kwargs)
   5692             return self._constructor(new_data).__finalize__(self)
   5693 

/usr/local/lib/python3.6/site-packages/pandas/core/internals/managers.py in astype(self, dtype, **kwargs)
    529 
    530     def astype(self, dtype, **kwargs):
--> 531         return self.apply('astype', dtype=dtype, **kwargs)
    532 
    533     def convert(self, **kwargs):

/usr/local/lib/python3.6/site-packages/pandas/core/internals/managers.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
    393                                             copy=align_copy)
    394 
--> 395             applied = getattr(b, f)(**kwargs)
    396             result_blocks = _extend_blocks(applied, result_blocks)
    397 

/usr/local/lib/python3.6/site-packages/pandas/core/internals/blocks.py in astype(self, dtype, copy, errors, values, **kwargs)
    532     def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
    533         return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 534                             **kwargs)
    535 
    536     def _astype(self, dtype, copy=False, errors='raise', values=None,

/usr/local/lib/python3.6/site-packages/pandas/core/internals/blocks.py in _astype(self, dtype, copy, errors, values, **kwargs)
    631 
    632                     # _astype_nansafe works fine with 1-d only
--> 633                     values = astype_nansafe(values.ravel(), dtype, copy=True)
    634 
    635                 # TODO(extension)

/usr/local/lib/python3.6/site-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
    700     if copy or is_object_dtype(arr) or is_object_dtype(dtype):
    701         # Explicit copy, or required since NumPy can't view from / to object.
--> 702         return arr.astype(dtype, copy=True)
    703 
    704     return arr.view(dtype)

ValueError: could not convert string to float:

:(


thcrock · 2019-05-09T00:20:00Z

It's hard to read this on my phone, but I wonder if the recent changes of some prediction/evaluations columns from float to decimal could cause some downstream processes to load them as strings.

thcrock · 2019-05-09T16:02:04Z

On closer look, this shouldn't be one of those problems. The column being converted, 'param', is split out of 'parameter', which has always been a string (e.g. 10_abs). The line before does the split, and it assumes that whatever comes before the underscore is a number, which is in no way guaranteed. The parameters we generally use match that pattern, but there are metrics like fbeta where the parameter is a string (e.g. beta), and I'm not sure what an empty string (e.g. from an unthresholded metric) would do here. This code doesn't seem to handle these cases, and it should. I think the query that builds this in def metrics should filter only to records that look like they are thresholded.
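To make the failure mode concrete, here is a minimal sketch of the kind of check described above. It is not triage's code: the helper name `parse_threshold` and the regular expression are assumptions for illustration, and the sample values are made up. It only assumes that a thresholded parameter looks like a number, an underscore, and a suffix.

```python
import re

# Assumed shape of a thresholded parameter: a number, an underscore, then a
# suffix such as "abs" or "pct". This pattern is illustrative, not triage's.
THRESHOLDED = re.compile(r"^(\d+(?:\.\d+)?)_(\w+)$")


def parse_threshold(parameter):
    """Return (param, param_type) for thresholded values, else None.

    Hypothetical helper: values such as "beta" or "" are skipped instead of
    being passed to float(), which is where the ValueError comes from.
    """
    match = THRESHOLDED.match(parameter or "")
    if match is None:
        return None
    # Mirror the 'rank_' prefix that plot_prec_across_time adds to the suffix.
    return float(match.group(1)), "rank_" + match.group(2)


# Made-up sample values covering the cases mentioned in the comment above.
samples = ["10_abs", "5.0_pct", "beta", "", None]
parsed = {sample: parse_threshold(sample) for sample in samples}
```

With pandas, the same pattern could probably be applied with `Series.str.match` to drop non-thresholded rows before the split and the cast, though filtering in the query itself, as suggested above, would avoid loading those rows at all.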

nanounanue added postmodeling urgent labels May 9, 2019

nanounanue added a commit that referenced this issue May 9, 2019

Fixes #691

759df4e

nanounanue closed this as completed in 6cb43f9 May 9, 2019
