
Computation of penalty terms #41

Closed
rgemulla opened this issue Aug 19, 2019 · 11 comments
Labels: enhancement (New feature or request)

@rgemulla (Member)

Commit e667cf9 changes how penalties are interpreted for many models. The penalty term used to be computed only once per embedding, but with this change it is computed twice if the subject and object embedder are the same (a common case). Instead of calling penalty twice, the code should check whether the two embedders are identical and, if so, call penalty only once.
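A minimal sketch of such a check (embedder accessors and the penalty() method are named along the lines of the thread; this is illustrative, not the repository's code):

```python
import torch

def collect_penalties(p_embedder, s_embedder, o_embedder, **kwargs):
    # Sum the penalty terms of all embedders, calling penalty() only once
    # per distinct embedder object: if subject and object share the same
    # embedder (a common case), its term is added a single time.
    terms = [p_embedder.penalty(**kwargs)]
    if s_embedder is o_embedder:
        terms.append(s_embedder.penalty(**kwargs))
    else:
        terms.append(s_embedder.penalty(**kwargs))
        terms.append(o_embedder.penalty(**kwargs))
    return sum(terms)
```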

@samuelbroscheit (Member)

What is the reasoning? They are used twice, hence the penalty is computed for each factor matrix of each mode. This matters in two cases: (1) when the computed penalty is weighted, and (2) when the penalty weight comes from a shared lookup_embedder config setting, such that we cannot learn/tune the scaling separately.
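A toy illustration of the difference (my own sketch, not code from the repository): with a per-usage L2 penalty, an entity appearing in both the subject and the object slot contributes to two factor matrices.

```python
import torch

def l2(x: torch.Tensor) -> torch.Tensor:
    return (x ** 2).sum()

entities = torch.randn(5, 4)    # toy entity embedding matrix
s_ids = torch.tensor([0, 1, 2])  # subject column of a batch
o_ids = torch.tensor([2, 3, 4])  # object column of the same batch

# Per mode: entity 2 is penalized twice because it is used twice.
per_mode = l2(entities[s_ids]) + l2(entities[o_ids])

# Once per embedder: each distinct entity is penalized a single time.
once = l2(entities[torch.cat([s_ids, o_ids]).unique()])
```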

@rgemulla (Member Author)

I can see arguments for both ways of doing it.

  1. Keep it as before: do it like everywhere else (all related work). More intuitive/natural, and better performance.

  2. Change as in this commit: no special treatment of the case where embedders are shared. Anything else?

The weights are not a big deal and can be handled either way: in (1) by simply passing along a list of ids (as suggested in #39), in (2) by calling penalty twice (once for s, once for o). (BTW: currently, the penalty function in KgeModel does not correctly pass along ids.)
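A sketch of option (1) with the id list (the penalty_ids keyword is hypothetical, following the suggestion from #39):

```python
import torch

S, P, O = 0, 1, 2  # slot positions in an (n, 3) batch of triples

def entity_penalty(embedder, triples: torch.Tensor):
    # One penalty call for a shared subject/object embedder; passing the
    # ids of both slots lets an id-dependent weighting still see every
    # usage of every embedding.
    ids = torch.cat([triples[:, S], triples[:, O]])
    return embedder.penalty(penalty_ids=ids)
```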

@samuelbroscheit (Member)

> currently, the penalty function in KgeModel does not correctly pass along ids.

How is that?

@rgemulla (Member Author)

> > currently, the penalty function in KgeModel does not correctly pass along ids.
>
> How is that?

All fine, I must have been looking at outdated code.

@rgemulla (Member Author)

The current implementation of these penalty terms for lookup_embedders uses embed:

```python
parameters = self.embed(kwargs['batch']['triples'][:, kwargs['slot']])
```

This may be flawed, since embed will run dropout (but shouldn't).
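A minimal sketch of the distinction, assuming embed() stacks dropout on top of a plain nn.Embedding lookup (class, attribute, and method names here are assumptions, not the exact lookup_embedder API):

```python
import torch

class LookupEmbedderSketch(torch.nn.Module):
    def __init__(self, num_embeddings: int, dim: int, dropout: float = 0.1):
        super().__init__()
        self._embeddings = torch.nn.Embedding(num_embeddings, dim)
        self._dropout = torch.nn.Dropout(dropout)

    def embed(self, ids: torch.Tensor) -> torch.Tensor:
        # Training-time lookup: applies dropout to the embedding vectors.
        return self._dropout(self._embeddings(ids))

    def penalty(self, ids: torch.Tensor) -> torch.Tensor:
        # The penalty should see the parameters themselves, not a
        # dropped-out view of them, so use the raw lookup here.
        parameters = self._embeddings(ids)
        return (parameters ** 2).sum()
```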

@samuelbroscheit (Member)

samuelbroscheit commented Aug 21, 2019 via email

@rgemulla (Member Author)

Looks fine now. The only suggested change I still have is the API change from #39:

I suggest changing self.get_s_embedder().penalty(slot=0, **kwargs) to self.get_s_embedder().penalty(penalty_ids=triples[slot], **kwargs) or so. This way, the embedder can also be used later on when the weights do not come from triples.
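A sketch of how the id-based call could look on the model side (this follows the suggestion above, not the current API):

```python
import torch

def model_penalty(model, batch: dict, **kwargs):
    # The model resolves the ids per slot and hands them to the embedders,
    # so the same penalty() also works when the ids do not come from a
    # batch of triples.
    triples = batch['triples']
    return (
        model.get_s_embedder().penalty(penalty_ids=triples[:, 0], **kwargs)
        + model.get_p_embedder().penalty(penalty_ids=triples[:, 1], **kwargs)
        + model.get_o_embedder().penalty(penalty_ids=triples[:, 2], **kwargs)
    )
```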

@rgemulla (Member Author)

rgemulla commented Oct 21, 2019

The change above is still open.

I also think that it would be helpful (for understanding the model) if the penalty terms were named (e.g., entity_embedder.l2) and traced/printed both with and without the regularization weight being applied.
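For example, the penalty could be returned as named (key, value) pairs so both the raw and the weighted term can be traced (a sketch of the idea, not the implemented interface):

```python
import torch

def named_penalties(embeddings: torch.Tensor, weight: float):
    # Return (name, value) pairs so each term can be logged both before
    # and after the regularization weight is applied.
    raw = (embeddings ** 2).sum()
    return [
        ("entity_embedder.l2", raw),
        ("entity_embedder.l2_weighted", weight * raw),
    ]

for name, value in named_penalties(torch.randn(100, 32), weight=0.1):
    print(f"{name}: {value.item():.4f}")
```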

@rgemulla added the enhancement (New feature or request) label on Dec 18, 2019
@samuelbroscheit (Member)

Is this already being worked on?

@rgemulla (Member Author)

Yes, by @Nzteb

@rgemulla (Member Author)

Closed with #101
