diffNuisances.py: adding per-nuisance delta NLL #827

kcormi · 2023-03-22T08:21:13Z

This PR makes a few changes to diffNuisances.py. The largest is the ability to print and plot, for each parameter, the change in log-likelihood between the background-only and signal+background fits.

Because the likelihood factorizes into a Poisson term for each bin and a constraint term for each nuisance, the contribution of each bin and each nuisance can be determined directly. These contributions sum to the total delta NLL.
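Written out, the factorization looks roughly like the following. The notation is my own shorthand, not anything defined in the code: $n_b$ and $\nu_b$ are the observed and expected yields in bin $b$, $p_j$ is the constraint pdf of nuisance $\theta_j$, and hats mark the best-fit values of the two fits. The sign convention (background-only minus S+B) is an assumption.

$$
\mathrm{NLL}(\mu, \theta) = -\sum_{b} \ln \mathrm{Pois}\big(n_b \mid \nu_b(\mu, \theta)\big) - \sum_{j} \ln p_j(\theta_j)
$$

$$
\Delta\mathrm{NLL} = \mathrm{NLL}(0, \hat{\theta}^{\,b}) - \mathrm{NLL}(\hat{\mu}, \hat{\theta}^{\,s+b}) = \sum_{b} \Delta_b + \sum_{j} \Delta_j,
\qquad
\Delta_j = -\ln p_j(\hat{\theta}^{\,b}_j) + \ln p_j(\hat{\theta}^{\,s+b}_j)
$$

The per-nuisance numbers discussed in this PR correspond to the $\Delta_j$ terms; the $\Delta_b$ terms are the per-bin Poisson contributions.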

When a workspace is passed, we evaluate the constraint pdf for each nuisance at its background-only best-fit point and at its S+B best-fit point to get that nuisance's delta NLL. This is optional: if no workspace is passed, diffNuisances.py runs as it did before.
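To illustrate the idea for the simplest case, here is a minimal sketch that assumes a unit Gaussian constraint and does not touch the workspace at all. The function names and the sign convention (background-only minus S+B) are hypothetical, chosen for illustration; the script itself evaluates whatever constraint pdf the workspace holds.

```python
import math


def gaussian_constraint_nll(theta, mean=0.0, sigma=1.0):
    """Negative log of a Gaussian constraint pdf evaluated at theta."""
    z = (theta - mean) / sigma
    return 0.5 * z * z + math.log(sigma * math.sqrt(2.0 * math.pi))


def constraint_delta_nll(theta_b, theta_sb, mean=0.0, sigma=1.0):
    """Constraint-term delta NLL for one nuisance.

    Assumed convention: NLL at the background-only best fit minus
    NLL at the S+B best fit. The normalisation constant cancels.
    """
    return (gaussian_constraint_nll(theta_b, mean, sigma)
            - gaussian_constraint_nll(theta_sb, mean, sigma))
```

With this convention, a nuisance pulled to 1 sigma in the background-only fit and sitting at its nominal value in the S+B fit contributes 0.5 units of delta NLL.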

A plot of the delta NLL is also made, ordered from largest to smallest delta NLL and showing a cumulative line. This helps to quickly identify whether the change in post-fit nuisances is contributing to a significance, and which nuisances in particular are responsible.
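The ordering and cumulative line can be sketched as below. This is only an illustration: `rank_delta_nll` and the nuisance names are hypothetical, and sorting by signed value rather than absolute value is an assumption.

```python
from itertools import accumulate


def rank_delta_nll(delta_nll):
    """Sort nuisances by delta NLL (largest first) and attach a running sum.

    delta_nll maps nuisance name -> delta NLL contribution.
    Returns a list of (name, delta_nll, cumulative_delta_nll) tuples.
    """
    ordered = sorted(delta_nll.items(), key=lambda item: item[1], reverse=True)
    running = accumulate(value for _, value in ordered)
    return [(name, value, total) for (name, value), total in zip(ordered, running)]


# Hypothetical nuisance names and values, for illustration only.
ranked = rank_delta_nll({"lumi": 0.25, "jes": 0.5, "btag": -0.125})
for name, value, total in ranked:
    print(f"{name:10s} {value:+.3f} {total:+.3f}")
```

The last cumulative value is the total constraint-term contribution to the delta NLL, which is what the cumulative line converges to.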

Other smaller changes:

Added a parameter --max-nuis which limits the number of nuisances per plot. If there are more nuisances than that, multiple plots of each type are created, each containing at most --max-nuis nuisances. I've also increased the bottom margin of the plots to help keep the nuisance names visible.
I've moved the diffNuisances.py script from test/ to scripts/ and made it executable, so that it can be run as a command-line tool. I've also removed the copy under data/tutorial/longexercise/, which had become slightly out of date, and updated the documentation in the exercise to call the script without the explicit python invocation.
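For the --max-nuis behaviour described above, the splitting amounts to chunking the list of nuisance names. A minimal sketch, where `chunk_nuisances` is a hypothetical helper and not necessarily how the script does it:

```python
def chunk_nuisances(names, max_nuis):
    """Split nuisance names into consecutive groups of at most max_nuis each."""
    if max_nuis <= 0:
        raise ValueError("max_nuis must be a positive integer")
    return [names[i:i + max_nuis] for i in range(0, len(names), max_nuis)]


# Five hypothetical nuisances with at most two per plot give three plots.
groups = chunk_nuisances(["a", "b", "c", "d", "e"], 2)
```

Each group would then be drawn on its own canvas, so the last plot may hold fewer nuisances than the others.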

A few thoughts for future PRs:

I'd like to add a similar plot of the dNLL contribution per bin, including a cumulative (per-region) line. That is probably better suited to live somewhere else: it could go into FitDiagnostics directly, but I think PostFitShapesFromWorkspace is the better place, to avoid jamming everything into FitDiagnostics. I'm open to suggestions.

For future developments, it might be useful to separate the table formatting into functions that are more general and reusable. OTOH, if we adopt more modern tools like pandas, they are already set up to do things like this.

nucleosynthesis · 2023-10-17T08:37:11Z

Does this only work for binned (template-based) analyses, or also for parametric and unbinned ones? If the former, the tool should probably halt when it is given anything else.

kcormi added 11 commits March 21, 2023 12:31

First pass at adding nuisance delta nll plot

210fa7b

merge tutorial diffNuisances to test diffNuisances

7c91db9

move diffNuisances to scripts, make executable

7072001

Update all formatting print options

0d191b7

Remove separate version of diffNusiances in tutorial, update doc

d35b6e6

Fix linting problems

3dad6ac

Missed some linting issues

4af3707

really fix linting

d38fa92

more verbose help, (also linting)

4436e4b

remove whitespace

c5c85b4

:( okay, should be last linting issue

4466ce5

Will improve my checks in future
