I think the new pandas (1.5.3) is breaking the reading of the catalogs.
I have updated the libraries in my env and I can't read catalogs anymore :'(.
It still works in another env that has pandas 1.4.3.
Steps To Reproduce
xs.DataCatalog('simulation.json')
.../site-packages/intake_esm/cat.py:262: FutureWarning:
Use pd.to_datetime instead.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/period.pyx:1485, in pandas._libs.tslibs.period._extract_ordinal()
AttributeError: 'str' object has no attribute 'ordinal'
During handling of the above exception, another exception occurred:
OutOfBoundsDatetime Traceback (most recent call last)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py:1080, in _make_date_converter.<locals>.converter(*date_cols)
1078 try:
1079 result = tools.to_datetime(
-> 1080 date_parser(*date_cols), errors="ignore", cache=cache_dates
1081 )
1082 if isinstance(result, datetime.datetime):
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/xscen/catalog.py:131, in _parse_dates(elem)
129 # Only where we have NaT (parser errors and empty fields), parse into a Period
130 # This will raise DateParseError as expected if the string is not parsable.
--> 131 time[nat] = pd.PeriodIndex(elem[nat], freq="H")
132 return pd.PeriodIndex(time)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/core/indexes/period.py:273, in PeriodIndex.__new__(cls, data, ordinal, freq, dtype, copy, name, **fields)
271 else:
272 # don't pass copy here, since we copy later.
--> 273 data = period_array(data=data, freq=freq)
275 if copy:
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/core/arrays/period.py:977, in period_array(data, freq, copy)
975 data = ensure_object(arrdata)
--> 977 return PeriodArray._from_sequence(data, dtype=dtype)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/core/arrays/period.py:274, in PeriodArray._from_sequence(cls, scalars, dtype, copy)
273 freq = freq or libperiod.extract_freq(periods)
--> 274 ordinals = libperiod.extract_ordinals(periods, freq)
275 return cls(ordinals, freq=freq)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/period.pyx:1459, in pandas._libs.tslibs.period.extract_ordinals()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/period.pyx:1494, in pandas._libs.tslibs.period._extract_ordinal()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/period.pyx:2579, in pandas._libs.tslibs.period.Period.__new__()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/parsing.pyx:367, in pandas._libs.tslibs.parsing.parse_time_string()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/parsing.pyx:416, in pandas._libs.tslibs.parsing.parse_datetime_string_with_reso()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/timestamps.pyx:1698, in pandas._libs.tslibs.timestamps.Timestamp.__new__()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/conversion.pyx:249, in pandas._libs.tslibs.conversion.convert_to_tsobject()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/conversion.pyx:523, in pandas._libs.tslibs.conversion._convert_str_to_tsobject()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/conversion.pyx:506, in pandas._libs.tslibs.conversion._convert_str_to_tsobject()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/np_datetime.pyx:212, in pandas._libs.tslibs.np_datetime.check_dts_bounds()
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2291-01-01 00:00:00
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py:1088, in _make_date_converter.<locals>.converter(*date_cols)
1086 try:
1087 return tools.to_datetime(
-> 1088 parsing.try_parse_dates(
1089 parsing.concat_date_cols(date_cols),
1090 parser=date_parser,
1091 dayfirst=dayfirst,
1092 ),
1093 errors="ignore",
1094 )
1095 except Exception:
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/_libs/tslibs/parsing.pyx:718, in pandas._libs.tslibs.parsing.try_parse_dates()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/xscen/catalog.py:127, in _parse_dates(elem)
125 # Cast to normal datetime as this is much faster than to period for in-bounds dates
126 # errors are coerced to NaT, we convert to a PeriodIndex and then to a (mutable) series
--> 127 time = pd.to_datetime(elem, errors="coerce").astype(pd.PeriodDtype("H")).to_series()
128 nat = time.isnull()
AttributeError: 'Timestamp' object has no attribute 'astype'
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
Input In [19], in <cell line: 7>()
5 print(pd.__version__)
6 print(sys.version)
----> 7 dcat = xs.DataCatalog('/tank/scenario/catalogues/simulation.json')
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/xscen/catalog.py:171, in DataCatalog.__init__(self, check_valid, drop_duplicates, *args, **kwargs)
166 kwargs["read_csv_kwargs"] = recursive_update(
167 csv_kwargs.copy(), kwargs.get("read_csv_kwargs", {})
168 )
169 args = args_as_str(args)
--> 171 super().__init__(*args, **kwargs)
172 if check_valid:
173 self.check_valid()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/intake_esm/core.py:94, in esm_datastore.__init__(self, obj, progressbar, sep, registry, read_csv_kwargs, storage_options, intake_kwargs)
92 self.esmcat = ESMCatalogModel.from_dict(obj)
93 else:
---> 94 self.esmcat = ESMCatalogModel.load(
95 obj, storage_options=self.storage_options, read_csv_kwargs=read_csv_kwargs
96 )
98 self.derivedcat = registry or default_registry
99 self._entries = {}
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/intake_esm/cat.py:262, in ESMCatalogModel.load(cls, json_file, storage_options, read_csv_kwargs)
260 csv_path = f'{os.path.dirname(_mapper.root)}/{cat.catalog_file}'
261 cat.catalog_file = csv_path
--> 262 df = pd.read_csv(
263 cat.catalog_file,
264 storage_options=storage_options,
265 **read_csv_kwargs,
266 )
267 else:
268 df = pd.DataFrame(cat.catalog_dict)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/util/_decorators.py:211, in deprecate_kwarg.<locals>._deprecate_kwarg.<locals>.wrapper(*args, **kwargs)
209 else:
210 kwargs[new_arg_name] = new_arg_value
--> 211 return func(*args, **kwargs)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/util/_decorators.py:331, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
325 if len(args) > num_allow_args:
326 warnings.warn(
327 msg.format(arguments=_format_argument_list(allow_args)),
328 FutureWarning,
329 stacklevel=find_stack_level(),
330 )
--> 331 return func(*args, **kwargs)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/readers.py:950, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
935 kwds_defaults = _refine_defaults_read(
936 dialect,
937 delimiter,
(...)
946 defaults={"delimiter": ","},
947 )
948 kwds.update(kwds_defaults)
--> 950 return _read(filepath_or_buffer, kwds)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/readers.py:611, in _read(filepath_or_buffer, kwds)
608 return parser
610 with parser:
--> 611 return parser.read(nrows)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/readers.py:1778, in TextFileReader.read(self, nrows)
1771 nrows = validate_integer("nrows", nrows)
1772 try:
1773 # error: "ParserBase" has no attribute "read"
1774 (
1775 index,
1776 columns,
1777 col_dict,
-> 1778 ) = self._engine.read( # type: ignore[attr-defined]
1779 nrows
1780 )
1781 except Exception:
1782 self.close()
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py:320, in CParserWrapper.read(self, nrows)
316 self._check_data_length(names, alldata)
318 data = {k: v for k, (i, v) in zip(names, data_tups)}
--> 320 names, date_data = self._do_date_conversions(names, data)
321 index, column_names = self._make_index(date_data, alldata, names)
323 return index, column_names, date_data
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py:822, in ParserBase._do_date_conversions(self, names, data)
814 def _do_date_conversions(
815 self,
816 names: Sequence[Hashable] | Index,
817 data: Mapping[Hashable, ArrayLike] | DataFrame,
818 ) -> tuple[Sequence[Hashable] | Index, Mapping[Hashable, ArrayLike] | DataFrame]:
819 # returns data, columns
821 if self.parse_dates is not None:
--> 822 data, names = _process_date_conversion(
823 data,
824 self._date_conv,
825 self.parse_dates,
826 self.index_col,
827 self.index_names,
828 names,
829 keep_date_col=self.keep_date_col,
830 )
832 return names, data
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py:1183, in _process_date_conversion(data_dict, converter, parse_spec, index_col, index_names, columns, keep_date_col)
1180 continue
1181 # Pyarrow engine returns Series which we need to convert to
1182 # numpy array before converter, its a no-op for other parsers
-> 1183 data_dict[colspec] = converter(np.asarray(data_dict[colspec]))
1184 else:
1185 new_name, col, old_names = _try_convert_dates(
1186 converter, colspec, data_dict, orig_names
1187 )
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py:1096, in _make_date_converter.<locals>.converter(*date_cols)
1087 return tools.to_datetime(
1088 parsing.try_parse_dates(
1089 parsing.concat_date_cols(date_cols),
(...)
1093 errors="ignore",
1094 )
1095 except Exception:
-> 1096 return generic_parser(date_parser, *date_cols)
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/pandas/io/date_converters.py:106, in generic_parser(parse_func, *cols)
104 for i in range(N):
105 args = [c[i] for c in cols]
--> 106 results[i] = parse_func(*args)
108 return results
File /exec/jlavoie/.conda/espo-g/lib/python3.9/site-packages/xscen/catalog.py:127, in _parse_dates(elem)
124 """Parse an array of dates (strings) into a PeriodIndex of hourly frequency."""
125 # Cast to normal datetime as this is much faster than to period for in-bounds dates
126 # errors are coerced to NaT, we convert to a PeriodIndex and then to a (mutable) series
--> 127 time = pd.to_datetime(elem, errors="coerce").astype(pd.PeriodDtype("H")).to_series()
128 nat = time.isnull()
129 # Only where we have NaT (parser errors and empty fields), parse into a Period
130 # This will raise DateParseError as expected if the string is not parsable.
AttributeError: 'Timestamp' object has no attribute 'astype'
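Reading the traceback from the bottom up: after the first attempt fails, pandas falls back to `generic_parser`, which calls `_parse_dates` once per element instead of once per column. With a single string as input, `pd.to_datetime` returns a scalar rather than an index, which is consistent with the final `'Timestamp' object has no attribute 'astype'` message. A minimal sketch of that difference, using a made-up date string and nothing from xscen:

```python
import pandas as pd

# Scalar input: pd.to_datetime hands back a single Timestamp.
scalar_result = pd.to_datetime("2000-01-01", errors="coerce")

# List-like input: pd.to_datetime hands back a DatetimeIndex,
# which is what the chained .astype(...) call in _parse_dates expects.
array_result = pd.to_datetime(["2000-01-01"], errors="coerce")

print(type(scalar_result).__name__)
print(type(array_result).__name__)
print(hasattr(scalar_result, "astype"), hasattr(array_result, "astype"))
```

This only illustrates why the last error appears. The earlier `OutOfBoundsDatetime` raised inside `pd.PeriodIndex` is what sends pandas down the element-wise fallback path in the first place.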
Additional context
No response
Contribution
I would be willing/able to open a Pull Request to address this bug.
I think something is wrong in the conda Linux build. Hopefully a new version of pandas will simply make this disappear because, with 3555 open issues, the pandas dev team might not have time to look into this.
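For reference, the first exception in the traceback is `OutOfBoundsDatetime` for `2291-01-01 00:00:00`. Nanosecond timestamps stored as signed 64-bit integers counted from 1970-01-01 cannot go that far. The sketch below reproduces the arithmetic with the standard library only; `fits_in_ns_timestamp` is a hypothetical helper written for this illustration, not a pandas or xscen function, and it checks the upper bound only.

```python
from datetime import datetime, timedelta

INT64_MAX = 2**63 - 1          # largest signed 64-bit integer, in nanoseconds
EPOCH = datetime(1970, 1, 1)   # Unix epoch

# datetime only has microsecond precision, so the sub-microsecond part is dropped.
upper_bound = EPOCH + timedelta(microseconds=INT64_MAX // 1000)


def fits_in_ns_timestamp(dt):
    """Hypothetical helper: True if dt is at or below the int64 nanosecond limit."""
    return dt <= upper_bound


print(upper_bound)                                  # 2262-04-11 23:47:16.854775
print(fits_in_ns_timestamp(datetime(2291, 1, 1)))   # False
print(fits_in_ns_timestamp(datetime(2100, 1, 1)))   # True
```

Dates past that limit are the ones `_parse_dates` tries to handle through `pd.PeriodIndex`, according to the comments shown in the traceback.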