Postmodeling weird error #691

Closed
nanounanue opened this issue May 9, 2019 · 2 comments

@nanounanue
Contributor

This code used to work:

audited_models_class.plot_prec_across_time(param_type='rank_pct',
                                           param=10,
                                           baseline=True,
                                           baseline_query=params.baseline_query,
                                           metric='precision@',
                                           figsize=params.figsize)

(This is from dirtyduck)

But with the new version of triage, I now get this:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-a829ddd494bf> in <module>
      3                                            baseline=True,
      4                                            baseline_query=params.baseline_query,
----> 5                                            metric='precision@')

/usr/local/lib/python3.6/site-packages/triage/component/postmodeling/contrast/model_group_evaluator.py in plot_prec_across_time(self, param_type, param, metric, baseline, baseline_query, df, figsize, fontsize)
    273         model_metrics[['param', 'param_type']] = \
    274                 model_metrics['parameter'].str.split('_', 1, expand=True)
--> 275         model_metrics['param'] =  model_metrics['param'].astype(str).astype(float)
    276         model_metrics['param_type'] = model_metrics['param_type'].apply(lambda x: 'rank_'+x)
    277 

/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in astype(self, dtype, copy, errors, **kwargs)
   5689             # else, only a single dtype is given
   5690             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 5691                                          **kwargs)
   5692             return self._constructor(new_data).__finalize__(self)
   5693 

/usr/local/lib/python3.6/site-packages/pandas/core/internals/managers.py in astype(self, dtype, **kwargs)
    529 
    530     def astype(self, dtype, **kwargs):
--> 531         return self.apply('astype', dtype=dtype, **kwargs)
    532 
    533     def convert(self, **kwargs):

/usr/local/lib/python3.6/site-packages/pandas/core/internals/managers.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
    393                                             copy=align_copy)
    394 
--> 395             applied = getattr(b, f)(**kwargs)
    396             result_blocks = _extend_blocks(applied, result_blocks)
    397 

/usr/local/lib/python3.6/site-packages/pandas/core/internals/blocks.py in astype(self, dtype, copy, errors, values, **kwargs)
    532     def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
    533         return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 534                             **kwargs)
    535 
    536     def _astype(self, dtype, copy=False, errors='raise', values=None,

/usr/local/lib/python3.6/site-packages/pandas/core/internals/blocks.py in _astype(self, dtype, copy, errors, values, **kwargs)
    631 
    632                     # _astype_nansafe works fine with 1-d only
--> 633                     values = astype_nansafe(values.ravel(), dtype, copy=True)
    634 
    635                 # TODO(extension)

/usr/local/lib/python3.6/site-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
    700     if copy or is_object_dtype(arr) or is_object_dtype(dtype):
    701         # Explicit copy, or required since NumPy can't view from / to object.
--> 702         return arr.astype(dtype, copy=True)
    703 
    704     return arr.view(dtype)

ValueError: could not convert string to float: 

:(

@thcrock
Contributor

thcrock commented May 9, 2019

It's hard to read this on my phone, but I wonder whether the recent change of some prediction/evaluation columns from float to decimal could cause some downstream processes to load them as strings.

@thcrock
Contributor

thcrock commented May 9, 2019

On closer look, this shouldn't be one of those problems. The column in question is 'param', which is split out of 'parameter', a column that has always been a string (e.g. 10_abs). The line before does the split, but the data can easily confound it: the code assumes that whatever comes before the underscore is a number, which is not guaranteed. The parameters we generally use match that pattern, but there are metrics like fbeta where the param is a string (e.g. beta). And I'm not sure what an empty string (e.g. from an unthresholded metric) would do here. The code doesn't attempt to handle these cases, and it should. I think the query that builds this in def metrics should filter to only the records that look thresholded.
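
To make the failure mode concrete, here is a minimal sketch of it and of one defensive way to parse the column. It is not the fix that landed in triage. The sample values and the helper name `split_thresholded` are made up for illustration, and only the column names `parameter`, `param`, and `param_type` come from the traceback. The empty message after `could not convert string to float:` is consistent with an empty `parameter` value reaching the float cast.

```python
import pandas as pd

# Made-up sample of 'parameter' values: two thresholded entries, one with a
# non-numeric prefix, and one empty string (as an unthresholded metric might have).
parameters = pd.Series(["10_pct", "100_abs", "beta_pct", ""], name="parameter")

# Casting an empty string to float raises ValueError, as in the traceback.
try:
    pd.Series([""], dtype=object).astype(float)
    empty_string_fails = False
except ValueError:
    empty_string_fails = True


def split_thresholded(parameter: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: keep only '<number>_<type>' rows and split them."""
    # Named groups become the column names; rows that do not match yield NaN.
    parts = parameter.str.extract(
        r"^(?P<param>\d+(?:\.\d+)?)_(?P<param_type>\w+)$"
    )
    # Drop everything that does not look thresholded, then convert.
    return parts.dropna(subset=["param"]).assign(
        param=lambda d: d["param"].astype(float),
        param_type=lambda d: "rank_" + d["param_type"],
    )


thresholded = split_thresholded(parameters)
print(thresholded)
```

Filtering in pandas like this only treats the symptom. Filtering in the SQL query, as suggested above, would avoid loading the unthresholded rows in the first place.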

nanounanue added a commit that referenced this issue May 9, 2019