Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ROIRandomCrop operator #2638

Merged
merged 6 commits into from
Feb 1, 2021
Merged

Conversation

jantonguirao
Copy link
Contributor

@jantonguirao jantonguirao commented Jan 26, 2021

Signed-off-by: Joaquin Anton janton@nvidia.com

Why we need this PR?

Pick one, remove the rest

  • It adds new feature needed to perform biased random crop in segmentation applications

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    Added a new operator ROIRandomCrop that takes a fixed cropping window shape and a region of interest and produces a cropping window anchor (start coordinates) so that the resulting cropping window contains as much of the ROI as possible.
  • Affected modules and functionalities:
    New op
  • Key points relevant for the review:
    Calculation of crop anchor.
    Test coverage? corner cases?
  • Validation and testing:
    Python tests
  • Documentation (including examples):
    Operator documentation

JIRA TASK: [DALI-1816]

Comment on lines 47 to 51
Note: Using ``roi_end`` is mutually exclusive with ``roi_shape``.)code", nullptr, true)
.AddOptionalArg<std::vector<int>>("roi_shape",
R"code(ROI shape.

Note: Using ``roi_shape`` is mutually exclusive with ``roi_end``.)code", nullptr, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Note: Using ``roi_end`` is mutually exclusive with ``roi_shape``.)code", nullptr, true)
.AddOptionalArg<std::vector<int>>("roi_shape",
R"code(ROI shape.
Note: Using ``roi_shape`` is mutually exclusive with ``roi_end``.)code", nullptr, true)
.. Note::
Using ``roi_end`` is mutually exclusive with ``roi_shape``.)code", nullptr, true)
.AddOptionalArg<std::vector<int>>("roi_shape",
R"code(ROI shape.
.. Note::
Using ``roi_shape`` is mutually exclusive with ``roi_end``.)code", nullptr, true)

If provided, the cropping window start will be selected so that the cropping window is within the
bounds of the input.

Note: Providing ``in_shape`` is incompatible with feeding the input data directly as a positional input.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Note: Providing ``in_shape`` is incompatible with feeding the input data directly as a positional input.
.. Note::
Providing ``in_shape`` is incompatible with feeding the input data directly as a positional input.

Comment on lines 92 to 93
if ((roi_end_.IsDefined() + roi_shape_.IsDefined()) != 1)
DALI_FAIL("Either ROI end or ROI shape should be defined, but not both");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if ((roi_end_.IsDefined() + roi_shape_.IsDefined()) != 1)
DALI_FAIL("Either ROI end or ROI shape should be defined, but not both");
DALI_ENFORCE(roi_end_.IsDefined() + roi_shape_.IsDefined()) == 1,
"Either ROI end or ROI shape should be defined, but not both");

if (roi_end_.IsDefined()) {
roi_end_.Acquire(spec_, ws, nsamples, sh);
} else {
assert(roi_shape_.IsDefined());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need that assert as you already have one check in the constructor?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it one of the reasons to have asserts - to check for inconsistent internal state?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it there mostly for documentation, but also to double-check that the logic is sane

Comment on lines 115 to 117
if (in_shape_arg_.IsDefined() && (ws.NumInput() == 1)) {
DALI_FAIL("``in_shape`` argument is incompatible with providing an input.")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (in_shape_arg_.IsDefined() && (ws.NumInput() == 1)) {
DALI_FAIL("``in_shape`` argument is incompatible with providing an input.")
}
DALI_ENFORCE(in_shape_arg_.IsDefined() && (ws.NumInput() == 1) == false, "``in_shape`` argument is incompatible with providing an input.")

for (int d = 0; d < ndim; d++) {
DALI_ENFORCE(roi_start_[s].data[d] >= 0 &&
sample_sh[d] >= (roi_start_[s].data[d] + roi_shape_[s].data[d]),
"ROI can't be out of bounds.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you print the offending shape and roi bounds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

shape=(ndim,), dtype=types.INT32, device='cpu')
roi_start = fn.random.uniform(range=(roi_min_start, roi_max_start + 1),
shape=(ndim,), dtype=types.INT32, device='cpu')
crop_start = fn.roi_random_crop(*inputs, crop_shape=crop_shape,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should test roi_end argument as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

roi_start = fn.random.uniform(range=(roi_min_start, roi_max_start + 1),
shape=(ndim,), dtype=types.INT32, device='cpu')
crop_start = fn.roi_random_crop(*inputs, crop_shape=crop_shape,
roi_start=roi_start, roi_shape=roi_shape,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a negative cases when both use_in_shape_arg and use_shape_like_in are provided, as well as roi_end with roi_shape?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

outputs += [in_shape_out]
pipe.set_outputs(*outputs)
pipe.build()
for iter in range(niter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for iter in range(niter):
for _ in range(niter):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

for iter in range(niter):
outputs = pipe.run()
for idx in range(batch_size):
roi_start = outputs[0].at(idx).tolist()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you not use at as it is deprecated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure


for d in range(ndim):
if in_shape is not None:
assert crop_start[d] >= 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't always be crop_start[d] >= 0 no matter if in_shape is provided?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. Out of bounds slicing is allowed. We are OK with a negative anchor as long as the cropping window contains the ROI (could happen for a big cropping window and a small ROI near the edge)

Comment on lines +118 to +136
if (in_shape_arg_.IsDefined()) {
in_shape_.resize(nsamples, ndim);
in_shape_arg_.Acquire(spec_, ws, nsamples, sh);
for (int s = 0; s < nsamples; s++) {
auto sample_sh = in_shape_.tensor_shape_span(s);
for (int d = 0; d < ndim; d++) {
sample_sh[d] = in_shape_arg_[s].data[d];
}
}
} else {
auto &in = ws.template InputRef<CPUBackend>(0);
in_shape_ = in.shape();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if no input nor in_shape_arg_ are provided?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing special, we just don't bound the cropping window to be within the image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I see. I missed that L121 condition covers the L138 loop as well.

}
} else {
for (int d = 0; d < ndim; d++) {
DALI_ENFORCE(roi_start_[s].data[d] >= 0 && sample_sh[d] >= roi_end_[s].data[d],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to bother with roi_end_ here. Maybe it should be just roi_start_ and roi_shape_, and values from roi_end should be used to infer roi_shape_ if not provided directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd involve extra storage, and the code accessing this storage would look different than the arguments directly provided (because those are tensor views). This is why I preferred to do it on the fly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easier to check end than start+shape?

ws.ArgumentInput("crop_shape").size() :
ws.GetRequestedBatchSize(0);
crop_shape_.Acquire(spec_, ws, nsamples, true);
int ndim = crop_shape_[0].shape[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you should check if crop_shape_ is uniform?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what the true in Acquire does

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't have to - each sample can be different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the shape of a shape we are talking about. Uniformity in crop_shape means same number of dimensions

int64_t roi_extent = -1;
int64_t roi_start = roi_start_[sample_idx].data[d];
int64_t crop_extent = crop_shape_[sample_idx].data[d];
if (roi_end_.IsDefined()) {
Copy link
Contributor

@JanuszL JanuszL Jan 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do this in setup (roi_extent calculation).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could but it'd involve extra storage to keep those.

Comment on lines 184 to 186
if (roi_extent == crop_extent) {
crop_start[sample_idx].data[d] = roi_start;
} else if (roi_extent > crop_extent) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
if (roi_extent == crop_extent) {
crop_start[sample_idx].data[d] = roi_start;
} else if (roi_extent > crop_extent) {
if (roi_extent == crop_extent) {
crop_start[sample_idx].data[d] = roi_start;
} else {
int64_t range = roi_extent - crop_extent;
if (sample_sh) {
range = std::max<int64_t>(range, -roi_start); // make sure that if roi_extent < crop_extent we won't be < 0
}
int64_t range_start = std::min<int64_t>(roi_start, roi_start + range);
int64_t range_end = std::max<int64_t>(roi_start, roi_start + range)
auto dist = std::uniform_int_distribution<int64_t>(range_start, range_end);
crop_start[sample_idx].data[d] = dist(rngs_[sample_idx]);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can get rid of if at all.

} else if (roi_extent > crop_extent) {
int64_t range_end = roi_start + roi_extent - crop_extent;
if (sample_sh)
range_end = std::min<int64_t>(sample_sh[d] - crop_extent, range_end);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you check if ROI is inside shape already in setup. So this is not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't depend on the ROI but on the cropping window. Imagine a large cropping window, it would go easily out of bounds if we want to cover all the different anchors that cover the whole ROI. In the case that the cropping window is bounded the range of possible anchors is limited. Otherwise it isn't

provided region of interest (ROI) is contained in it.

If the ROI is bigger than the cropping window, the cropping window will be a subwindow of the ROI.
If the ROI is smaller than the cropping window, the whole ROI should be contained in the cropping window.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If the ROI is smaller than the cropping window, the whole ROI should be contained in the cropping window.
If the ROI is smaller than the cropping window, the whole ROI shall be contained in the cropping window.

If the ROI is smaller than the cropping window, the whole ROI should be contained in the cropping window.

If an input shape (``in_shape``) is given, the resulting cropping window is selected to be within the
bounds of that input shape. Alternatively, the input data subject to cropping can be fed directly to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bounds of that input shape. Alternatively, the input data subject to cropping can be fed directly to
bounds of that input shape. Alternatively, the input data subject to cropping can be passed to the operator, in
which case the cropping window will be constrained to the shape of the data.

fed suggests that the data itself is consumed - and I think we should make it very clear that the operator doesn't touch the data.

Comment on lines +147 to +150
DALI_ENFORCE(sample_sh[d] >= crop_shape_[s].data[d],
make_string("Cropping shape can't be bigger than the input shape. "
"Got: crop_shape[", crop_shape_[s].data[d], "] and sample_shape[",
sample_sh[d], "] for d=", d));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what we should do in this case...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should allow (with an extra parameter) to clip the cropping window if it's larger than input? Or, alternatively, keep out-of-bounds window (pad)? Just thinking aloud.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should complicate this even more unless we have a specific use in mind.

auto& thread_pool = ws.GetThreadPool();

for (int sample_idx = 0; sample_idx < nsamples; sample_idx++) {
thread_pool.AddWork(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if asynchronous execution is justified here - after all this operator performs ndim loops per sample - not much compared to the overhead of scheduling work to thread pool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get rid of it

for idx in range(6, 10):
check_crop_start(np.array(outputs[idx][s]).tolist(), roi_start, roi_shape, crop_shape, in_shape)

def test_random_mask_pixel():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_random_mask_pixel():
def test_roi_random_cropl():

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, copy-paste :)

Comment on lines 32 to 37
crop_shape = fn.random.uniform(range=(crop_min_extent, crop_max_extent + 1),
shape=(ndim,), dtype=types.INT32, device='cpu')
roi_shape = fn.random.uniform(range=(roi_min_extent, roi_max_extent + 1),
shape=(ndim,), dtype=types.INT32, device='cpu')
roi_start = fn.random.uniform(range=(roi_min_start, roi_max_start + 1),
shape=(ndim,), dtype=types.INT32, device='cpu')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that scalar arguments are not tested. Perhaps you could do sth like:

def make_arg(lo, hi):
  return [lo] * ndim if lo == hi else fn.random.uniform(range=(lo, hi+1), shape=[ndim], dtype=types.INT32)

Comment on lines 28 to 29
shape_like_in = dali.fn.external_source(lambda: np.zeros(shape_gen_f()),
device='cpu', batch=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try variable batch size?

int nsamples = crop_start.shape.size();
int ndim = crop_start[0].shape[0];

auto& thread_pool = ws.GetThreadPool();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
Signed-off-by: Joaquin Anton <janton@nvidia.com>
roi_start = np.array([(roi_min_start + roi_max_start) // 2] * ndim, dtype=np.int32) if random.choice([True, False]) \
else fn.random.uniform(range=(roi_min_start, roi_max_start + 1),
shape=(ndim,), dtype=types.INT32, device='cpu')
roi_end = roi_start + roi_shape
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this works, then it's by chance - if roi_start is a numpy.ndarray and roi_shape is a DataNode, this wouldn't work.

shape_like_in = dali.fn.external_source(data_gen_f, device='cpu')
in_shape = dali.fn.shapes(shape_like_in, dtype=types.INT32)

crop_shape = [(crop_min_extent + crop_max_extent) // 2] * ndim if random.choice([True, False]) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Numpy arrays are passed as argument inputs, so this still does not test non-tensor arguments. It must be a plain list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I fixed that now

Signed-off-by: Joaquin Anton <janton@nvidia.com>
@jantonguirao jantonguirao merged commit a5d34b3 into NVIDIA:master Feb 1, 2021
@JanuszL JanuszL mentioned this pull request May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants