
[WIP] Ch 7 #42

Open · wants to merge 4 commits into master
Conversation

canyon289
Collaborator

I have a question about exercises 2 and 4 in the notebook. The rest are up for review.

@AlexAndorra

Thank you Ravin! Here are my thoughts, from what I understood:

  • Exercise 2: Yes, I think the generated values are the range of Y values from the multivariate normal. The goal of the exercise is probably to show that the range of Y values increases as the number of samples drawn from the GP prior grows, because the variation across samples mechanically goes up (which is the case here: the range roughly doubles compared to the book's example)

  • Exercise 4: I think there is a typo in the book's question, and my guess is that the intended question is: "Re-run model_reg and get new plots, but using X_new = np.linspace(np.floor(x.min()), 20, 100)[:,None] as test points". In other words, get posterior predictive samples from the fitted model (and plot them), but instead of stopping at the data's maximum, compute them up to x = 20, where we don't have any observed data:

import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm

# test points extending beyond the observed data, up to x = 20
X_new = np.linspace(np.floor(x.min()), 20, 100)[:, None]

with model_reg:
    # conditional distribution of the GP evaluated at the new input locations
    f_pred = gp.conditional("f_pred", X_new)
    # samples from the posterior predictive distribution at the X_new values
    pred_samples = pm.sample_posterior_predictive(trace_reg, vars=[f_pred])

_, ax = plt.subplots(figsize=(12, 5))
ax.plot(X_new, pred_samples["f_pred"].T, "C1-", alpha=0.3)
ax.plot(X, y, "ko")

The resulting plot gives us an estimate of the uncertainty in our model, which depends on the data and on the model's specification.

  • Exercises 7 and 8: LGTM. Just one question: what is the purpose of the ϵ parameter? I don't see it used anywhere in the model.

Everything else LGTM! Thank you for this hard and useful work 👏
PyMCheers!

@aloctavodia
Owner

Adding a few comments to @AlexAndorra's answers

Exercise 2: The range does not really increase with the number of samples; instead, the true range becomes more evident. The range of allowed values always comes from the same multivariate Gaussian, but this is not easy to see when the number of samples is low. I would recommend using semitransparent lines and the same color for all the lines.
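A minimal NumPy sketch of that suggestion (the grid, length-scale, and number of draws are illustrative assumptions, not the book's values): all draws come from the same multivariate Gaussian, and same-color semitransparent lines make the shared range visible.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch
import matplotlib.pyplot as plt

# illustrative grid and squared-exponential (RBF) covariance
X = np.linspace(0, 1, 100)
ell = 0.1  # assumed length-scale
cov = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / ell**2)
cov += 1e-6 * np.eye(len(X))  # jitter for numerical stability

rng = np.random.default_rng(42)
# many draws from the *same* multivariate Gaussian (the GP prior)
samples = rng.multivariate_normal(np.zeros_like(X), cov, size=50)

# same color, semitransparent lines: the common range becomes evident
_, ax = plt.subplots(figsize=(12, 5))
ax.plot(X, samples.T, "C0-", alpha=0.2)
```

With only a handful of draws the apparent range looks narrow; increasing `size` just fills in the same envelope.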

Exercise 4: Alex's interpretation is right; sorry about the typo. Even knowing about it, it was really hard for me to find! This is why proofreading by "others" is so important when writing something.

One more comment

Exercise 7: notice that for this example it is not possible to compute a good linear decision boundary.

@AlexAndorra

Oh ok! Thanks for correcting my misunderstanding Osvaldo 😃

@canyon289
Collaborator Author

I added Exercise 3. If someone has time, it could use a double check to make sure it's correct!

@aloctavodia
Owner

Exercise 3 is OK!

3 participants