Image preprocessing pipeline #323

Open
QiJune opened this issue Sep 11, 2020 · 2 comments

Comments

@QiJune
Collaborator

QiJune commented Sep 11, 2020

Image preprocessing pipeline in GoTorch

a jpg image file ---> image.YCbCr format ---> image.NRGBA format ---> Go array ---> GoTorch Tensor

  1. Use the image.Decode function to read a jpg image file into image.YCbCr format
  2. Use the imaging.scan method to convert the YCbCr image to NRGBA format (high cost)
  3. Apply transformations, such as ResizeCrop
  4. Copy the NRGBA image into a contiguous local Go array (high cost)
  5. Create a tensor view of the Go array, using the FromBlob method
  6. Make a deep copy of the tensor, because the Go array may be freed by the garbage collector (high cost)
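
Steps 1, 2 and 4 above can be sketched with the standard library alone. This is a minimal illustration, not the GoTorch implementation: the helper names decodeToNRGBA, toCHWFloat32 and sampleJPEG are hypothetical, draw.Draw stands in for the imaging conversion, and the FromBlob and deep-copy steps are left out because they need GoTorch itself.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/draw"
	"image/jpeg"
)

// decodeToNRGBA (hypothetical helper) decodes JPEG bytes and converts the
// result to *image.NRGBA. The conversion visits every pixel.
func decodeToNRGBA(data []byte) (*image.NRGBA, error) {
	src, err := jpeg.Decode(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	b := src.Bounds()
	dst := image.NewNRGBA(image.Rect(0, 0, b.Dx(), b.Dy()))
	draw.Draw(dst, dst.Bounds(), src, b.Min, draw.Src)
	return dst, nil
}

// toCHWFloat32 (hypothetical helper) copies the R, G and B bytes into a
// contiguous float32 slice in channel-major (C, H, W) order, scaled to [0, 1].
// The alpha byte of each pixel is skipped.
func toCHWFloat32(img *image.NRGBA) []float32 {
	w, h := img.Bounds().Dx(), img.Bounds().Dy()
	out := make([]float32, 3*h*w)
	for y := 0; y < h; y++ {
		row := img.Pix[y*img.Stride : y*img.Stride+4*w]
		for x := 0; x < w; x++ {
			for c := 0; c < 3; c++ {
				out[c*h*w+y*w+x] = float32(row[4*x+c]) / 255
			}
		}
	}
	return out
}

// sampleJPEG encodes a small solid-colour image so the sketch needs no file.
func sampleJPEG(w, h int) ([]byte, error) {
	img := image.NewNRGBA(image.Rect(0, 0, w, h))
	fill := image.NewUniform(color.NRGBA{R: 200, G: 100, B: 50, A: 255})
	draw.Draw(img, img.Bounds(), fill, image.Point{}, draw.Src)
	var buf bytes.Buffer
	err := jpeg.Encode(&buf, img, nil)
	return buf.Bytes(), err
}

func main() {
	data, err := sampleJPEG(8, 4)
	if err != nil {
		panic(err)
	}
	img, err := decodeToNRGBA(data)
	if err != nil {
		panic(err)
	}
	chw := toCHWFloat32(img)
	// NRGBA holds 4 bytes per pixel; the tensor data holds 3 floats per pixel.
	fmt.Println(len(img.Pix), len(chw))
}
```

Both the conversion to NRGBA and the copy into the float32 slice walk the whole image, which is where the "high cost" notes in the list come from.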

Image preprocessing pipeline in PyTorch

a jpg image file ---> PIL image of RGB format ---> PyTorch Tensor

  1. Read a jpg image file into PIL RGB format
  2. Do some transformations, such as ResizeCrop
  3. Create a tensor view of the PIL image, using as_tensor method

Problems

  • We need to read a jpg image file into RGB format directly.
  • We cannot use the disintegration/imaging library, since it always converts an image into NRGBA format before preprocessing. NRGBA stores four bytes per pixel (R, G, B, A), so the RGB data is not contiguous: an alpha byte that we do not need is interleaved with every pixel.
  • We need a preprocessing library that handles RGB images directly, so that the memory layout stays contiguous all the time.

Conclusions

It seems that we cannot use the Go image library or the disintegration/imaging library for this pipeline at all.

We need an independent, efficient image preprocessing library.

gocv may be an option.

@QiJune
Collaborator Author

QiJune commented Sep 14, 2020

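I find that OpenCV (through gocv) is more than 2 times faster than the Go image pipeline: roughly 0.15 to 0.17 s versus 0.38 to 0.43 s for 100 iterations of decode, resize, and tensor conversion.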

package transforms

import (
	"fmt"
	"image"
	"image/jpeg"
	"os"
	"testing"
	"time"
	"unsafe"

	torch "github.com/wangkuiyi/gotorch"
	"gocv.io/x/gocv"
)

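func TestJPG(t *testing.T) {
	fileName := "188242.jpg"
	size := 224

	// Path 1: Go image library plus the existing transforms.
	startTime := time.Now()
	for i := 0; i < 100; i++ {
		file, err := os.Open(fileName)
		if err != nil {
			t.Fatal(err)
		}
		img, err := jpeg.Decode(file)
		// Close inside the loop; a defer here would keep every file open
		// until the test function returns.
		file.Close()
		if err != nil {
			t.Fatal(err)
		}
		trans1 := Resize(size, size)
		o1 := trans1.Run(img)

		trans2 := ToTensor()
		// Clone makes a deep copy of the tensor.
		_ = trans2.Run(o1).Clone()
	}
	fmt.Println(time.Since(startTime).Seconds())

	// Path 2: OpenCV through gocv.
	startTime = time.Now()
	for i := 0; i < 100; i++ {
		imgCv := gocv.IMRead(fileName, gocv.IMReadColor)
		// OpenCV decodes to BGR, so swap to RGB first.
		gocv.CvtColor(imgCv, &imgCv, gocv.ColorBGRToRGB)
		gocv.Resize(imgCv, &imgCv, image.Point{X: size, Y: size}, 0, 0, gocv.InterpolationLinear)
		// Convert to float32 and scale pixel values to [0, 1].
		imgCv.ConvertTo(&imgCv, gocv.MatTypeCV32FC3)
		imgCv.MultiplyFloat(1.0 / 255.0)
		view, _ := imgCv.DataPtrFloat32()
		// FromBlob only creates a view of the Mat's memory (HWC layout);
		// Permute reorders the dimensions to CHW.
		tensor := torch.FromBlob(unsafe.Pointer(&view[0]),
			torch.Float, []int64{int64(size), int64(size), 3})
		tensor.Permute([]int64{2, 0, 1})
		// Release the native Mat in each iteration instead of deferring.
		imgCv.Close()
	}

	fmt.Println(time.Since(startTime).Seconds())
}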

go test github.com/wangkuiyi/gotorch/vision/transforms -v -run JPG -count=1
=== RUN   TestJPG
0.399983358
0.166660883
--- PASS: TestJPG (0.58s)
PASS
ok  	github.com/wangkuiyi/gotorch/vision/transforms	1.318s

go test github.com/wangkuiyi/gotorch/vision/transforms -v -run JPG -count=1
=== RUN   TestJPG
0.42601828
0.149734561
--- PASS: TestJPG (0.59s)
PASS
ok  	github.com/wangkuiyi/gotorch/vision/transforms	0.876s

go test github.com/wangkuiyi/gotorch/vision/transforms -v -run JPG -count=1
=== RUN   TestJPG
0.384203016
0.156132424
--- PASS: TestJPG (0.55s)
PASS
ok  	github.com/wangkuiyi/gotorch/vision/transforms	0.786s
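
As a quick check of the timings above, the per-run ratio between the two paths can be computed like this. It is only arithmetic on the six printed numbers; the run type and the speedup and meanSpeedup helpers are hypothetical names introduced for the sketch.

```go
package main

import "fmt"

// run holds the two timings (seconds per 100 images) printed by one test run.
type run struct {
	goImage float64 // image/jpeg + transforms path
	openCV  float64 // gocv path
}

// speedup returns how many times faster the gocv path was in one run.
func speedup(r run) float64 {
	return r.goImage / r.openCV
}

// meanSpeedup averages the per-run ratios; it returns 0 for an empty slice.
func meanSpeedup(runs []run) float64 {
	if len(runs) == 0 {
		return 0
	}
	sum := 0.0
	for _, r := range runs {
		sum += speedup(r)
	}
	return sum / float64(len(runs))
}

func main() {
	// Timings copied from the three test runs above.
	runs := []run{
		{goImage: 0.399983358, openCV: 0.166660883},
		{goImage: 0.42601828, openCV: 0.149734561},
		{goImage: 0.384203016, openCV: 0.156132424},
	}
	for i, r := range runs {
		fmt.Printf("run %d: %.2fx\n", i+1, speedup(r))
	}
	fmt.Printf("mean: %.2fx\n", meanSpeedup(runs))
}
```

Three runs of 100 iterations is a small sample, so the ratio should be read as a rough indication rather than a precise measurement.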

@QiJune
Collaborator Author

QiJune commented Sep 14, 2020

We have decided to use gocv for transforms in GoTorch. The following are some basic operations provided by gocv:
