DDG-DA Nested Run Error #996

hetankevin · 2022-03-20T20:13:19Z

🐛 Bug Description

Running DDG-DA almost instantly results in an error:

Run with UUID fc707f91f6c44ce6b73c383dd1fa2db4 is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True

To Reproduce

Steps to reproduce the behavior:

Run the script examples/benchmarks_dynamic/DDG-DA/workflow.py with the run_all command (python workflow.py run_all).

Screenshot

(base) hetankevin@MacBook-Pro-2 ddg-da % python workflow.py
2022-03-20 16:10:28.265 | WARNING | qlib.tests.data:qlib_data:150 - Data already exists: ~/.qlib/qlib_data/cn_data, the data download will be skipped
If downloading is required: exists_skip=False or change target_dir
[26784:MainThread](2022-03-20 16:10:28,266) INFO - qlib.Initialization - [config.py:402] - default_conf: client.
[26784:MainThread](2022-03-20 16:10:28,267) INFO - qlib.Initialization - [__init__.py:73] - qlib successfully initialized based on client settings.
[26784:MainThread](2022-03-20 16:10:28,267) INFO - qlib.Initialization - [__init__.py:75] - data_path={'__DEFAULT_FREQ': PosixPath('/Users/hetankevin/.qlib/qlib_data/cn_data')}
(base) hetankevin@MacBook-Pro-2 ddg-da % python workflow.py run_all
2022-03-20 16:11:06.772 | WARNING | qlib.tests.data:qlib_data:150 - Data already exists: ~/.qlib/qlib_data/cn_data, the data download will be skipped
If downloading is required: exists_skip=False or change target_dir
[26791:MainThread](2022-03-20 16:11:06,772) INFO - qlib.Initialization - [config.py:402] - default_conf: client.
[26791:MainThread](2022-03-20 16:11:06,773) INFO - qlib.Initialization - [__init__.py:73] - qlib successfully initialized based on client settings.
[26791:MainThread](2022-03-20 16:11:06,773) INFO - qlib.Initialization - [__init__.py:75] - data_path={'__DEFAULT_FREQ': PosixPath('/Users/hetankevin/.qlib/qlib_data/cn_data')}
[26791:MainThread](2022-03-20 16:11:25,954) INFO - qlib.timer - [log.py:113] - Time cost: 19.174s | Loading data Done
[26791:MainThread](2022-03-20 16:11:26,283) INFO - qlib.timer - [log.py:113] - Time cost: 0.145s | DropnaLabel Done
/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pandas/core/frame.py:3641: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
[26791:MainThread](2022-03-20 16:11:28,057) INFO - qlib.timer - [log.py:113] - Time cost: 1.774s | CSZScoreNorm Done
[26791:MainThread](2022-03-20 16:11:28,065) INFO - qlib.timer - [log.py:113] - Time cost: 2.110s | fit & process data Done
[26791:MainThread](2022-03-20 16:11:28,065) INFO - qlib.timer - [log.py:113] - Time cost: 21.285s | Init data Done
Please install necessary libs for CatBoostModel.
Training until validation scores don't improve for 50 rounds
[20] train's l2: 0.959017 valid's l2: 0.992225
[40] train's l2: 0.940128 valid's l2: 0.995036
[60] train's l2: 0.923942 valid's l2: 0.998785
Early stopping, best iteration is:
[12] train's l2: 0.968307 valid's l2: 0.991277
[26791:MainThread](2022-03-20 16:11:46,886) WARNING - qlib.workflow - [expm.py:198] - No valid experiment found. Create a new experiment with name Experiment.
[26791:MainThread](2022-03-20 16:11:46,888) INFO - qlib.workflow - [exp.py:257] - Experiment 1 starts running ...
[26791:MainThread](2022-03-20 16:11:46,947) INFO - qlib.workflow - [recorder.py:293] - Recorder a1751be3870141c4a3d12750ec7d9b6d starts running under Experiment 1 ...
[26791:MainThread](2022-03-20 16:12:06,009) INFO - qlib.timer - [log.py:113] - Time cost: 18.886s | Loading data Done
/Users/hetankevin/miniforge3/lib/python3.9/site-packages/numpy/lib/nanfunctions.py:1096: RuntimeWarning: All-NaN slice encountered
result = np.apply_along_axis(_nanmedian1d, axis, a, overwrite_input)
[26791:MainThread](2022-03-20 16:12:10,618) INFO - qlib.timer - [log.py:113] - Time cost: 4.432s | RobustZScoreNorm Done
[26791:MainThread](2022-03-20 16:12:10,764) INFO - qlib.timer - [log.py:113] - Time cost: 0.146s | Fillna Done
[26791:MainThread](2022-03-20 16:12:10,945) INFO - qlib.timer - [log.py:113] - Time cost: 0.130s | DropnaLabel Done
/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pandas/core/frame.py:3641: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
[26791:MainThread](2022-03-20 16:12:11,229) INFO - qlib.timer - [log.py:113] - Time cost: 0.283s | CSRankNorm Done
[26791:MainThread](2022-03-20 16:12:11,236) INFO - qlib.timer - [log.py:113] - Time cost: 5.226s | fit & process data Done
[26791:MainThread](2022-03-20 16:12:11,236) INFO - qlib.timer - [log.py:113] - Time cost: 24.114s | Init data Done
[26791:MainThread](2022-03-20 16:12:14,180) INFO - qlib.timer - [log.py:113] - Time cost: 0.024s | Loading data Done
[26791:MainThread](2022-03-20 16:12:14,181) INFO - qlib.timer - [log.py:113] - Time cost: 0.000s | fit & process data Done
[26791:MainThread](2022-03-20 16:12:14,181) INFO - qlib.timer - [log.py:113] - Time cost: 0.024s | Init data Done
[26791:MainThread](2022-03-20 16:12:14,531) WARNING - qlib.workflow - [expm.py:198] - No valid experiment found. Create a new experiment with name data_sim_s20.
train tasks: 0%| | 0/154 [00:00<?, ?it/s][26791:MainThread](2022-03-20 16:12:14,533) INFO - qlib.workflow - [expm.py:318] - <mlflow.tracking.client.MlflowClient object at 0x2872760d0>
[26791:MainThread](2022-03-20 16:12:14,534) INFO - qlib.workflow - [exp.py:257] - Experiment 2 starts running ...
train tasks: 0%| | 0/154 [00:00<?, ?it/s]
[26791:MainThread](2022-03-20 16:12:14,534) ERROR - qlib.workflow - [utils.py:41] - An exception has been raised[Exception: Run with UUID a1751be3870141c4a3d12750ec7d9b6d is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True].
File "/Users/hetankevin/Library/CloudStorage/OneDrive-Personal/Documents/UMich/Trading/genPerms/macrohard/qlib/examples/benchmarks_dynamic/DDG-DA/workflow.py", line 261, in <module>
fire.Fire(DDGDA)
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/Users/hetankevin/Library/CloudStorage/OneDrive-Personal/Documents/UMich/Trading/genPerms/macrohard/qlib/examples/benchmarks_dynamic/DDG-DA/workflow.py", line 249, in run_all
self.dump_meta_ipt()
File "/Users/hetankevin/Library/CloudStorage/OneDrive-Personal/Documents/UMich/Trading/genPerms/macrohard/qlib/examples/benchmarks_dynamic/DDG-DA/workflow.py", line 122, in dump_meta_ipt
internal_data.setup(trainer=TrainerR)
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pyqlib-0.8.4.99-py3.9-macosx-11.0-arm64.egg/qlib/contrib/meta/data_selection/dataset.py", line 81, in setup
trainer.train(gen_task)
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pyqlib-0.8.4.99-py3.9-macosx-11.0-arm64.egg/qlib/model/trainer.py", line 248, in train
rec = train_func(task, experiment_name, **kwargs)
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pyqlib-0.8.4.99-py3.9-macosx-11.0-arm64.egg/qlib/model/trainer.py", line 116, in task_train
with R.start(experiment_name=experiment_name, recorder_name=recorder_name):
File "/Users/hetankevin/miniforge3/lib/python3.9/contextlib.py", line 119, in __enter__
return next(self.gen)
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pyqlib-0.8.4.99-py3.9-macosx-11.0-arm64.egg/qlib/workflow/__init__.py", line 69, in start
run = self.start_exp(
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pyqlib-0.8.4.99-py3.9-macosx-11.0-arm64.egg/qlib/workflow/__init__.py", line 125, in start_exp
return self.exp_manager.start_exp(
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pyqlib-0.8.4.99-py3.9-macosx-11.0-arm64.egg/qlib/workflow/expm.py", line 346, in start_exp
self.active_experiment.start(recorder_id=recorder_id, recorder_name=recorder_name, resume=resume)
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pyqlib-0.8.4.99-py3.9-macosx-11.0-arm64.egg/qlib/workflow/exp.py", line 270, in start
self.active_recorder.start_run()
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/pyqlib-0.8.4.99-py3.9-macosx-11.0-arm64.egg/qlib/workflow/recorder.py", line 287, in start_run
run = mlflow.start_run(self.id, self.experiment_id, self.name)
File "/Users/hetankevin/miniforge3/lib/python3.9/site-packages/mlflow/tracking/fluent.py", line 226, in start_run
raise Exception(
Exception: Run with UUID a1751be3870141c4a3d12750ec7d9b6d is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True
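
Note that the UUID in the exception (a1751be3870141c4a3d12750ec7d9b6d) is the same recorder that the log shows starting under Experiment 1, so that run appears to still be active when the next training task tries to start its own run under Experiment 2. The sketch below is a toy model of that situation, not MLflow or Qlib code: the `RunTracker` class and its methods are hypothetical stand-ins that only mimic the "one active top-level run" rule and the two remedies named in the error message (end the current run, or start a nested one). It makes no claim about how the actual fix was implemented.

```python
class RunTracker:
    """Hypothetical stand-in for a tracking API that keeps a stack of active runs."""

    def __init__(self):
        self._active = []  # stack of active run IDs, innermost last

    def start_run(self, run_id, nested=False):
        # Refuse a second top-level run while another run is still active.
        if self._active and not nested:
            raise RuntimeError(f"Run with UUID {self._active[-1]} is already active.")
        self._active.append(run_id)
        return run_id

    def end_run(self):
        # End the most recently started run, if there is one.
        if self._active:
            self._active.pop()

    def active_run(self):
        return self._active[-1] if self._active else None


tracker = RunTracker()
tracker.start_run("a1751be3")  # first recorder starts and is never ended

# A second top-level start now fails, as in the traceback above.
try:
    tracker.start_run("second")
    failed = False
    message = ""
except RuntimeError as exc:
    failed = True
    message = str(exc)

# Remedy 1: end the stale run, then start the new one.
tracker.end_run()
tracker.start_run("second")

# Remedy 2: keep the outer run active and start the next one as nested.
tracker.start_run("child", nested=True)
```

In this toy model the first remedy leaves only the new run active, while the second keeps the outer run open and stacks the child on top of it.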


you-n-g · 2022-04-02T08:11:16Z

@hetankevin

Thanks for reporting this.

It is fixed in #1031.

Please check it, and contact me if you encounter any further errors.

hetankevin added the bug label Mar 20, 2022

SunsetWolf closed this as completed Oct 23, 2023
