
Training on Human3.6M dataset #2

Open · hliuav opened this issue Aug 10, 2018 · 6 comments

@hliuav commented Aug 10, 2018

Can I use this code to train on the Human3.6M dataset (16 landmarks) by simply replacing the dataset and partition.txt? The results I get do not look as good as those in the paper.

@YutingZhang (Owner)

For Human3.6M, we also used optical flow as self-supervision (see Appendix C). The results may not be as good as those in the paper if optical flow is not used.
The optical-flow-based loss is already implemented here:

# optical flow (excerpt from the model definition; tf, np, math, and the
# repo's tmf/tgu/ptu/keypoints_2d helpers are assumed to be in scope)
of_condition = None
if condition_tensor is not None:
    for v in condition_tensor:
        if v["type"] == "optical_flow":
            of_condition = v

# resolve loss weights from the options, falling back to the keypoint
# transform weight when a dedicated flow-transform weight is not given
optical_flow_transform_loss_weight = None
if "optical_flow_transform_loss_weight" in self.options:
    optical_flow_transform_loss_weight = self.options["optical_flow_transform_loss_weight"]
if optical_flow_transform_loss_weight is None:
    if of_condition is not None and "keypoint_transform_loss_weight" in self.options:
        optical_flow_transform_loss_weight = self.options["keypoint_transform_loss_weight"]
optical_flow_strength_loss_weight = None
if "optical_flow_strength_loss_weight" in self.options:
    optical_flow_strength_loss_weight = self.options["optical_flow_strength_loss_weight"]

if ptu.default_phase() == pt.Phase.train and \
        (rbool(optical_flow_transform_loss_weight) or rbool(optical_flow_strength_loss_weight)):
    assert of_condition is not None, "need optical flow condition"
    # keypoint coordinates before padding
    pre_keypoint_param = keypoint_param[:, :, :2]
    scaling_factor = np.array(self.target_input_size) / np.array(self.input_size)
    pre_keypoint_param = keypoints_2d.scale_keypoint_param(
        pre_keypoint_param, scaling_factor, src_aspect_ratio=full_a)
    # only use valid samples (offset 0 means no paired frame)
    ind_offset = tf.reshape(of_condition["offset"], [-1])
    flow_map = of_condition["flow"]  # [batch_size, h, w, 2]
    valid_mask = tf.not_equal(ind_offset, 0)
    # interpolation mask
    flow_h, flow_w = tmf.get_shape(flow_map)[1:3]
    if rbool(optical_flow_transform_loss_weight):
        pre_interp_weights = keypoints_2d.gaussian_coordinate_to_keypoint_map(tf.concat([
            pre_keypoint_param,
            tf.ones_like(pre_keypoint_param[:, :, -1:]) / math.sqrt(flow_h * flow_w)
        ], axis=2), km_h=flow_h, km_w=flow_w)  # [batch_size, h, w, keypoint_num]
        pre_interp_weights /= tf.reduce_sum(
            pre_interp_weights, axis=[1, 2], keep_dims=True) + tmf.epsilon
        # pointwise flow: dense flow interpolated at each keypoint location
        next_ind = np.arange(batch_size) + ind_offset
        next_keypoint_param = tf.gather(pre_keypoint_param, next_ind)
        pointwise_flow = tf.reduce_sum(
            tf.expand_dims(flow_map, axis=3) * tf.expand_dims(pre_interp_weights, axis=4),
            axis=[1, 2]
        )
        # flow transform constraint: flow-propagated keypoints should match
        # the keypoints detected in the paired frame
        next_keypoint_param_2 = pre_keypoint_param + pointwise_flow
        kp_of_trans_loss = tf.reduce_mean(tf.boolean_mask(
            tmf.sum_per_sample(tf.square(next_keypoint_param_2 - next_keypoint_param)),
            mask=valid_mask
        ))
        optical_flow_transform_loss = kp_of_trans_loss * optical_flow_transform_loss_weight
        tgu.add_to_aux_loss(optical_flow_transform_loss, "flow_trans")
    if rbool(optical_flow_strength_loss_weight):
        pre_interp_weights = keypoints_2d.gaussian_coordinate_to_keypoint_map(tf.concat([
            pre_keypoint_param,
            tf.ones_like(pre_keypoint_param[:, :, -1:]) * (1.0 / 16)  # self.base_gaussian_stddev
        ], axis=2), km_h=flow_h, km_w=flow_w)  # [batch_size, h, w, keypoint_num]
        pre_interp_weights /= tf.reduce_sum(
            pre_interp_weights, axis=[1, 2], keep_dims=True) + tmf.epsilon
        kp_of_strength_loss = tf.reduce_mean(tmf.sum_per_sample(
            tf.boolean_mask(pre_interp_weights, mask=valid_mask) *
            tf.sqrt(tf.reduce_sum(
                tf.square(tf.boolean_mask(flow_map, mask=valid_mask)),
                axis=3, keep_dims=True))
        ))
        # negate so that larger flow magnitude under a keypoint lowers the loss
        kp_of_strength_loss = -kp_of_strength_loss
        optical_flow_strength_loss = kp_of_strength_loss * optical_flow_strength_loss_weight
        tgu.add_to_aux_loss(optical_flow_strength_loss, "flow_strength")
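
To make the flow-transform constraint above concrete: the dense flow is interpolated at each keypoint location through the Gaussian weight map, and the flow-propagated keypoint should land on the keypoint detected in the paired frame. A minimal NumPy sketch of that computation (not from the repository; names and shapes are illustrative):

import numpy as np

def flow_transform_residual(kp_t, kp_next, flow, interp_weights):
    # kp_t:           [K, 2] keypoint coordinates in frame t (flow-map grid)
    # kp_next:        [K, 2] keypoint coordinates in the paired frame
    # flow:           [H, W, 2] dense optical flow from frame t to the paired frame
    # interp_weights: [H, W, K] per-keypoint Gaussian weights, normalized over H, W
    pointwise_flow = np.einsum('hwc,hwk->kc', flow, interp_weights)  # flow at each keypoint
    predicted_next = kp_t + pointwise_flow  # propagate keypoints along the flow
    return np.sum((predicted_next - kp_next) ** 2)  # squared transform residual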

However, we have not released the code for data loading and the (OpenCV-based) optical-flow computation for Human3.6M. We plan to do that soon.
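
In the meantime, a minimal sketch of how such OpenCV-based flow could be computed, assuming Farneback dense flow on grayscale frame pairs (the parameters and the BGR input convention are assumptions, not the released pipeline):

import cv2

def compute_flow(frame_t, frame_next):
    # dense optical flow between two BGR frames, returned as an [H, W, 2] array
    prev_gray = cv2.cvtColor(frame_t, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(frame_next, cv2.COLOR_BGR2GRAY)
    # Farneback parameters: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
    return cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)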

@hliuav (Author) commented Aug 10, 2018

Thank you for your quick reply. I also find that if the network is trained on images with background, the landmarks tend to form a circle and each landmark varies only a little within its local region. Most of the cases shown in the paper are also trained on images with similar poses (cars, animals, etc.); only the Human3.6M dataset has a wide variety of poses. Is that why the background of the Human3.6M images needs to be removed, i.e., to make sure the network won't learn landmarks from the background? (I have tried training on Human3.6M with background, and the network learns almost nothing.)
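
For anyone attempting the same preprocessing, a minimal sketch of background removal given a precomputed foreground mask (the mask source is an assumption; Human3.6M's background-subtraction data could serve here):

import numpy as np

def remove_background(image, fg_mask, fill_value=255):
    # image:   [H, W, 3] uint8 frame
    # fg_mask: [H, W] boolean foreground mask (assumed precomputed, e.g. from
    #          Human3.6M's background-subtraction data)
    mask = fg_mask.astype(bool)[..., None]        # [H, W, 1] for broadcasting
    background = np.full_like(image, fill_value)  # uniform replacement background
    return np.where(mask, image, background)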

@YutingZhang (Owner)

Sorry for the delayed response due to my recent job transition.
The method is not robust to background variations for human-body images (though it works for faces).
The human body is more complicated than other objects in terms of pose variation and viewpoints of interest, so the foreground object structure is also harder to capture. I think that is why a simpler background is needed.

@ender1001

Thank you for your great work. It has helped me a lot in my current project. I have encountered a similar background problem: I extracted only the foreground from a video, but the method still recognized part of the foreground as background and therefore missed some important landmarks. I am wondering whether I can turn off the background channel in both encoding and decoding. I found some related options in your code but failed to enable them. Do you have any suggestions? Thank you.

@jojolee123

Thank you for your nice work!
Can you provide download links for the Simplified Human3.6M dataset and the full Human3.6M dataset?
Waiting for your reply, thanks!

@YutingZhang (Owner) commented Jul 31, 2022 via email
