How does 'srilm gen num ' work?
The idea was that it chooses the most probable word for the given context when generating a sentence.
The actual implementation is different. `LM.cc` lines 1097–1117 (`LM::generateWord()`) are the most important. The conditional probabilities are laid end to end on the [0,1] interval, and the word whose subinterval contains a randomly generated number (from [0,1)) is chosen as the continuation.
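The interval scheme described above can be sketched as follows. This is a minimal Python illustration, not the code from `LM.cc`: the function name `generate_word`, its parameters, and the returned count of tested words are all hypothetical, and the rounding fallback at the end is an assumption of this sketch.

```python
import random
from typing import Optional, Sequence, Tuple


def generate_word(
    words: Sequence[str],
    probs: Sequence[float],
    x: Optional[float] = None,
) -> Tuple[str, int]:
    """Pick the word whose subinterval of [0, 1) contains x.

    Hypothetical helper for illustration only. Returns the chosen word
    and the number of words that had to be tested to find it.
    """
    if x is None:
        x = random.random()  # uniform draw from [0, 1)

    cumulative = 0.0
    for tested, (word, p) in enumerate(zip(words, probs), start=1):
        cumulative += p  # right end of this word's subinterval
        if x < cumulative:
            return word, tested

    # Assumption of this sketch: if rounding leaves the total slightly
    # below x, fall back to the last word.
    return words[-1], len(words)
```

For example, with words `a`, `b`, `c` and probabilities 0.5, 0.3, 0.2, a draw of `x = 0.6` lands in the second subinterval, so `b` is returned after two tests. The result is a sample from the distribution, not the most probable word.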
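I did a simple calculation: the number of words tested depends on the distribution of those conditional probabilities. Let $M(X)$ be the number of words tested when the random number $X$ is drawn uniformly from [0,1), and let $p_m$ be the conditional probability of the $m$-th of the $N$ words in the order they are tested. The $m$-th word is chosen exactly when $X$ falls in its subinterval, so $P[M(X) = m] = p_m$. Therefore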
$$
\mathbb{E}M(X) = \sum_{m=1}^N P[M(X) = m] m = \sum_{m=1}^N p_m m
$$
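To make the formula concrete, here is a small Python sketch that evaluates the sum for a given ordering of probabilities. The function name `expected_tests` is hypothetical, and the input is assumed to list the probabilities in the order the words are tested.

```python
from typing import Sequence


def expected_tests(probs: Sequence[float]) -> float:
    """Expected number of words tested: the sum over m of m * p_m.

    Hypothetical helper; probs[m - 1] is the conditional probability
    of the m-th word in testing order.
    """
    return sum(m * p for m, p in enumerate(probs, start=1))
```

With probabilities 0.5, 0.3, 0.2 the sum is 0.5 + 0.6 + 0.6 = 1.7, while the reversed order 0.2, 0.3, 0.5 gives 0.2 + 0.6 + 1.5 = 2.3. For a uniform distribution over N words the sum reduces to (N + 1) / 2, so the cost grows linearly with the vocabulary size.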
Conclusion: this is not what we want for a "nice" implementation of getAllPossibilities.