FileNotFoundError while downloading DataComp-1B #27

Open
xfgao opened this issue Jul 7, 2023 · 7 comments

@xfgao

xfgao commented Jul 7, 2023

Thanks for the great work.
I encountered the following issue while downloading the DataComp-1B dataset:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/envs/datacomp/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/opt/conda/envs/datacomp/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/opt/conda/envs/datacomp/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
@GeorgiosSmyrnis
Collaborator

Hi @xfgao!

To help identify the issue, could you share the command with which you are running download_upstream.py? Also, is the above the full error message?

Thanks!

@xfgao
Author

xfgao commented Jul 7, 2023

We were running the following command to download data:
python download_upstream.py --scale datacomp_1b --data_dir DATA_DIR
We were able to download all the metadata and a number of tar files at the beginning, but after a certain point we kept getting this error message:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/envs/datacomp/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/opt/conda/envs/datacomp/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/opt/conda/envs/datacomp/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory

@rom1504

rom1504 commented Jul 8, 2023

The error seems to come from an OS-level limit on the number of threads you can open.

Either decrease the number of threads you are using, increase the limit, or use a different machine.

If you can share some details about your environment, that would help.
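
As a rough way to check the limits mentioned above from Python, here is a minimal sketch that assumes a Linux or other Unix host. It only uses the standard-library `resource` module; the helper names `current_limits` and `raise_soft_limit` are made up for illustration and are not part of DataComp or img2dataset.

```python
import resource  # standard library, available on Unix only


def current_limits() -> dict[str, tuple[int, int]]:
    """Return (soft, hard) limits for resources a large parallel download can hit."""
    tracked = {
        "open file descriptors": resource.RLIMIT_NOFILE,
        "processes and threads per user": resource.RLIMIT_NPROC,
    }
    return {label: resource.getrlimit(res) for label, res in tracked.items()}


def raise_soft_limit(res: int) -> tuple[int, int]:
    """Raise the soft limit of `res` to its hard limit (allowed without root)."""
    soft, hard = resource.getrlimit(res)
    if soft != hard:
        resource.setrlimit(res, (hard, hard))
    return resource.getrlimit(res)


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        # A value equal to resource.RLIM_INFINITY means "no limit".
        print(f"{label}: soft={soft}, hard={hard}")
```

Calling `raise_soft_limit(resource.RLIMIT_NOFILE)` changes the limit only for the current process and the children it starts afterwards, so it would have to run in the same process that launches the download. Whether these limits are actually the cause here is not confirmed in this thread.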

@xfgao
Author

xfgao commented Jul 10, 2023

Thanks for the response. For the data download, I'm using Ubuntu 20.04 on an AWS EC2 g5.12xlarge instance (48 CPU cores). After reducing processes_count to 8 and thread_count to 8, I'm still getting the same FileNotFoundError.

@rom1504

rom1504 commented Jul 10, 2023

Can you try using a virtual env instead of conda?

@xfgao
Author

xfgao commented Jul 10, 2023

Is there a requirements.txt file for setting up the virtual env?

@GeorgiosSmyrnis
Collaborator

You can try installing the packages listed under pip in environment.yml. If I am not mistaken, that should give you something close to the intended environment, provided your system Python is the correct version (though this still needs to be verified). You should still train with the original environment to avoid other issues, but for the data download alone it should be fine.
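
One way to act on the suggestion above is to pull the entries under `- pip:` out of environment.yml and write them to a requirements file. The sketch below is a simple line-based reader, not a full YAML parser, and it assumes the usual conda layout where pip packages are listed as an indented block under `- pip:`. The function names and the `requirements.txt` output name are illustrative assumptions, not something the repository provides.

```python
from pathlib import Path


def pip_requirements(env_text: str) -> list[str]:
    """Return the entries nested under the `- pip:` item of a conda environment file."""
    requirements = []
    pip_indent = None  # indentation of the `- pip:` line, once it has been seen
    for raw in env_text.splitlines():
        line = raw.split(" #", 1)[0].rstrip()  # drop trailing comments
        stripped = line.lstrip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and full-line comments
        indent = len(line) - len(stripped)
        if pip_indent is None:
            if stripped == "- pip:":
                pip_indent = indent
            continue
        # The pip block ends at the first line that is not indented deeper.
        if indent <= pip_indent or not stripped.startswith("- "):
            break
        requirements.append(stripped[2:].strip())
    return requirements


def write_requirements(env_path: str, out_path: str) -> int:
    """Write the pip entries of `env_path` to `out_path`; return how many were written."""
    reqs = pip_requirements(Path(env_path).read_text())
    Path(out_path).write_text("\n".join(reqs) + "\n")
    return len(reqs)
```

After `write_requirements("environment.yml", "requirements.txt")`, the resulting file can be installed into a fresh virtual env created with `python -m venv` using `pip install -r requirements.txt`. Conda-only dependencies listed outside the pip block are not covered by this.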
