How to use --bucket parameter in colab? #4634

kuonumber · 2021-09-01T08:51:33Z

❔Question

I tried to save evolve.csv to GCS, and I thought I had set up the environment correctly.
Could you give me some tips to get it done?
Thank you.

from google.colab import auth
auth.authenticate_user()
project_id = 'my-project'
!gcloud config set project {project_id}

Additional context

I got this error:
BadRequestException: 400 Invalid bucket name: 'gs:'

glenn-jocher · 2021-09-01T14:30:03Z

@kuonumber you can evolve from multiple machines in parallel using the --bucket argument:

python train.py --bucket BUCKET --evolve

BUCKET should be an open (public read/write permissions) bucket/directory string locatable on GCP, e.g. gs://bucket/dir/subdir
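
A malformed bucket string tends to surface only once the copy to GCS is attempted, as with the "Invalid bucket name: 'gs:'" error quoted earlier in this thread. The following is a minimal sketch of a pre-flight check, assuming you want to split a bucket/directory string into its bucket name and directory prefix before launching. split_gcs_path is a hypothetical helper, not part of YOLOv5, and it does not decide which form (with or without the gs:// prefix) your YOLOv5 version expects.

```python
from typing import Tuple


def split_gcs_path(path: str) -> Tuple[str, str]:
    """Split 'gs://bucket/dir/subdir' into ('bucket', 'dir/subdir').

    Hypothetical helper for illustration only. Accepts the string with or
    without a leading 'gs://' and raises ValueError when no usable bucket
    name can be extracted.
    """
    scheme = "gs://"
    # Drop a single leading scheme if present, then trim stray slashes.
    rest = path[len(scheme):] if path.startswith(scheme) else path
    rest = rest.strip("/")
    if not rest:
        raise ValueError(f"no bucket name in {path!r}")
    bucket, _, prefix = rest.partition("/")
    # A colon left in the bucket name (e.g. 'gs:') means the scheme was
    # doubled or truncated.
    if ":" in bucket:
        raise ValueError(f"invalid bucket name {bucket!r} in {path!r}")
    return bucket, prefix
```

For example, split_gcs_path("gs://bucket/dir/subdir") returns ("bucket", "dir/subdir"), while a doubled prefix such as "gs://gs://bucket/dir" raises ValueError because the extracted bucket name would be 'gs:'.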

kuonumber · 2021-09-03T07:26:49Z

@glenn-jocher
thanks a lot

kuonumber · 2021-09-07T02:34:16Z

@glenn-jocher
After evolving, there was an empty weights folder on my local machine.
Is that normal, or should it contain best.pt?

glenn-jocher · 2021-09-07T11:27:06Z

@kuonumber evolution output is hyperparameters, not weights.

rhysdg · 2021-09-15T09:55:47Z

Hey there! Loving this functionality. I have a quick question, however: is it possible to resume after an evo run? I've managed to successfully store evolve.csv and hyp_evolve.yaml given the above command. However, after trying to continue I receive a "not a directory" error with regard to the empty weights folder:

NotADirectoryError: [Errno 20] Not a directory: 'runs/evolve/exp4/weights'

For your reference I'm using the following to kick off hyperparameter evolution:

python train.py --img 640 --batch 64 --epochs 10 --data traffic.yaml --weights yolov5s.pt --evolve 10 --bucket yolo-evo/evolve/lisa/v5s

Is it just a matter of emulating the original directory and creating an empty weights placeholder?

Cheers!

glenn-jocher · 2021-09-15T10:21:11Z

@rhysdg 👋 Hello! Thanks for asking about resuming evolution.

Resuming YOLOv5 🚀 evolution is a bit different than resuming a normal training run with python train.py --resume. If you started an evolution run which was interrupted, or finished normally, and you would like to continue for additional generations where you left off, then you pass --resume and specify the --name of the evolution you want to resume, i.e.:

Start Evolution

Assume you evolve YOLOv5s on COCO128 for 10 epochs for 3 generations:

python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --evolve 3

If this is your first evolution a new directory runs/evolve/exp will be created to save your results.

# ├── yolov5
#     └── runs
#         └── evolve
#             └── exp  ← evolution saved here

Start a Second Evolution

Now assume you want to start a completely separate evolution: YOLOv5s on VOC for 5 epochs for 3 generations. You simply start evolving, and your new evolution will again be logged to a new directory runs/evolve/exp2:

python train.py --epochs 5 --data VOC.yaml --weights yolov5s.pt --evolve 3

You will now have two evolution runs saved:

# ├── yolov5
#     └── runs
#         └── evolve
#             ├── exp  ← first evolution (COCO128)
#             └── exp2  ← second evolution (VOC)
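
The exp, exp2 naming shown above can be sketched as a simple "first free name" search. This is a simplified illustration under the assumption that a new run takes the first unused directory name. next_run_dir is a hypothetical helper, and YOLOv5's own path-incrementing logic may differ in detail.

```python
from pathlib import Path


def next_run_dir(root, name="exp"):
    """Return the first unused run directory under root: exp, exp2, exp3, ...

    Hypothetical helper for illustration; it only computes the path and
    does not create the directory.
    """
    root = Path(root)
    candidate = root / name
    n = 2  # the second run is suffixed with 2, matching exp -> exp2
    while candidate.exists():
        candidate = root / f"{name}{n}"
        n += 1
    return candidate
```

With only runs/evolve/exp present, next_run_dir("runs/evolve") returns the path runs/evolve/exp2.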

Resume an Evolution

If you want to resume the first evolution (COCO128 saved to runs/evolve/exp), then you use the same exact command you started with plus --resume --name exp, passing the additional number of generations you want, i.e. --evolve 30 for 30 more generations:

python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --evolve 30 --resume --name exp

Evolution will run for an additional 30 generations and all new results will be added to the existing runs/evolve/exp/evolve.csv.
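
To see how many generations an evolution has logged before and after resuming, you can count the rows in evolve.csv. This is a minimal sketch, assuming the file holds one header row followed by one row per generation. That layout is an assumption, so check your own file. count_generations is a hypothetical helper, not part of YOLOv5.

```python
import csv
from pathlib import Path


def count_generations(csv_path):
    """Count logged generations in an evolve.csv file.

    Assumes (not confirmed by this thread) one header row followed by one
    row per generation. Returns 0 if the file does not exist.
    """
    path = Path(csv_path)
    if not path.is_file():
        return 0
    with path.open(newline="") as f:
        # Skip rows that are empty or contain only whitespace.
        rows = [r for r in csv.reader(f) if any(cell.strip() for cell in r)]
    # Subtract the header row; never report a negative count.
    return max(len(rows) - 1, 0)
```

Calling count_generations("runs/evolve/exp/evolve.csv") before and after a resumed run should show the count growing by the number of additional generations, if the layout assumption holds.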

Good luck and let us know if you have any other questions!

glenn-jocher · 2021-09-15T11:13:41Z

@rhysdg good news 😃! We fixed a small bug ✅ in resuming evolution in PR #4802. With this PR merged, resuming evolution should work correctly per the instructions in my previous post.

To receive this update:

Git – git pull from within your yolov5/ directory or git clone https://github.com/ultralytics/yolov5 again
PyTorch Hub – Force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
Notebooks – View updated notebooks
Docker – sudo docker pull ultralytics/yolov5:latest to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!

rhysdg · 2021-09-22T13:46:33Z

Ah great @glenn-jocher, thanks so much for the swift response! I can confirm that everything works well as per your instructions, except for a minor bug when resuming within a new instance, which throws a file exists error. It's easily solved, however, by creating an empty directory corresponding to the previous session's name. For --name v5x, as an example, !mkdir -p runs/evolve/v5x allows the session to continue without hassle after copying evolve.csv from a shared bucket. Cheers for everything and once again, absolutely loving working with YOLOv5!
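
The workaround above (recreate the run directory, then place evolve.csv in it) can be sketched in Python for use in a fresh instance. This assumes evolve.csv has already been downloaded from the bucket to a local path, and that the resumed run looks for it at runs/evolve/NAME/evolve.csv, as the earlier reply suggests. prepare_resume_dir is a hypothetical helper, not part of YOLOv5.

```python
import shutil
from pathlib import Path


def prepare_resume_dir(name, evolve_csv, root="runs/evolve"):
    """Recreate runs/evolve/<name> and copy a local evolve.csv into it.

    Hypothetical helper for illustration. `evolve_csv` is the local path of
    a file already fetched from the shared bucket.
    """
    run_dir = Path(root) / name
    # Equivalent of `mkdir -p`: create parents, ignore an existing directory.
    run_dir.mkdir(parents=True, exist_ok=True)
    target = run_dir / "evolve.csv"
    shutil.copy2(evolve_csv, target)
    return target
```

For --name v5x this would be called as prepare_resume_dir("v5x", "evolve.csv") before rerunning the original command with --resume --name v5x.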

glenn-jocher · 2021-09-22T14:19:38Z

@rhysdg hmm interesting. If it seems like a reproducible bug you might want to consider submitting a PR with a proposed fix to help others in the future.

kuonumber added the "question" (Further information is requested) label Sep 1, 2021

glenn-jocher linked a pull request Sep 15, 2021 that will close this issue

Evolution --resume fix #4802

Merged

glenn-jocher closed this as completed in #4802 Sep 15, 2021
