dump_fix mode of dump_bin.py #1231

Closed
2young-2simple-sometimes-naive opened this issue Jul 25, 2022 · 0 comments
Labels
bug Something isn't working

Comments

Contributor

2young-2simple-sometimes-naive commented Jul 25, 2022

🐛 Bug Description

If the input CSV files passed to dump_fix mode cover only part of the dates in the calendar, the database that dump_fix generates is accessible only for the dates present in those CSV files, even though all the other dates are still in the calendar.

To Reproduce

Steps to reproduce the behavior:
Use the following script to check which data is available in the database:

import pandas as pd
import qlib
from qlib.data import D

# Point Qlib at the dumped database, with dataset and expression caches turned off.
qlib.init(provider_uri="qlib_data/day", dataset_cache=None, custom_ops=[], expression_cache=None, region=qlib.config.REG_US)

# Load the July 2022 closing prices for SPY.
close = D.features(instruments=["SPY"], fields=["$close"], start_time="2022-07-01", end_time="2022-07-31", freq="day")

# Flatten to a frame indexed by datetime and print the last five rows.
df = pd.DataFrame(index=close.reset_index().datetime)
df["close"] = close.reset_index(drop=True).values
print(df.tail(5))

Build the database from CSV files that cover the full calendar (say, from 2000-01-01 to 2022-07-25), then run dump_fix on the database with a CSV that contains only one day (say, only 2022-07-25). The resulting database has data only for 2022-07-25.
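To make the symptom easier to confirm, the check below lists the calendar dates for which a series has no value. It is a minimal pandas-only sketch, not part of Qlib: `missing_dates` is a hypothetical helper, and the sample calendar and series are made-up stand-ins for what the script above would return after the one-day dump_fix.

```python
import pandas as pd


def missing_dates(calendar, series):
    """Return the calendar dates for which `series` has no value (absent or NaN)."""
    # Align the series to the full calendar; dates it lacks become NaN.
    aligned = series.reindex(pd.DatetimeIndex(calendar))
    return aligned[aligned.isna()].index


# Hypothetical stand-ins: four business days, data present only on the last one.
calendar = pd.date_range("2022-07-20", "2022-07-25", freq="B")
after_fix = pd.Series([4.0], index=pd.DatetimeIndex(["2022-07-25"]))

gaps = missing_dates(calendar, after_fix)
print(gaps)
```

With real data, the series would presumably be the `$close` column of the `D.features` result with the instrument level dropped, and the calendar would come from the database itself; both of those are assumptions about how the check would be wired up.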

Expected Behavior

dump_fix should not affect data for dates that are not in the input CSV.
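The expected semantics can be sketched with plain pandas: values from the partial input overwrite matching dates, and every other date keeps its existing value. This is only an illustration of the desired outcome, not how dump_bin.py works internally; `merge_fix` and the sample data are hypothetical.

```python
import pandas as pd


def merge_fix(existing: pd.Series, fix: pd.Series) -> pd.Series:
    """Overlay `fix` on `existing`, keeping dates that `fix` does not mention."""
    # combine_first takes values from `fix` where it has them and falls back
    # to `existing` everywhere else (including where `fix` is NaN).
    return fix.combine_first(existing).sort_index()


# Hypothetical data: a four-day history and a one-day correction.
calendar = pd.date_range("2022-07-20", "2022-07-25", freq="B")
existing = pd.Series([1.0, 2.0, 3.0, 4.0], index=calendar, name="close")
fix = pd.Series([40.0], index=pd.DatetimeIndex(["2022-07-25"]), name="close")

merged = merge_fix(existing, fix)
print(merged)
```

In this sketch, the three earlier dates survive unchanged and only 2022-07-25 takes the new value, which is the behavior the report asks for.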

Screenshot

Environment

Note: User could run cd scripts && python collect_info.py all under project directory to get system information
and paste them here directly.
Linux
x86_64
Linux-4.18.0-147.el8.x86_64-x86_64-with-glibc2.2.5
#1 SMP Wed Dec 4 21:51:45 UTC 2019

Python version: 3.8.6 (default, Oct 22 2020, 17:03:03) [GCC 9.3.0]

Qlib version: 0.8.6.99
numpy==1.23.1
pandas==1.3.5
scipy==1.8.1
requests==2.28.1
sacred==0.8.2
python-socketio==5.7.1
redis==4.3.4
python-redis-lock==3.7.0
schedule==1.1.0
cvxpy==1.2.1
hyperopt==0.1.2
fire==0.4.0
statsmodels==0.13.2
xlrd==2.0.1
plotly==5.9.0
matplotlib==3.5.2
tables==3.7.0
pyyaml==6.0
mlflow==1.27.0
tqdm==4.64.0
loguru==0.6.0
lightgbm==3.3.2
tornado==6.2
joblib==1.1.0
fire==0.4.0
ruamel.yaml==0.17.21

Additional Notes
