Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSP component order selection #8151

Merged
merged 27 commits into from
Sep 2, 2020
Merged

CSP component order selection #8151

merged 27 commits into from
Sep 2, 2020

Conversation

mbillingr
Copy link
Contributor

@mbillingr mbillingr commented Aug 24, 2020

Reference issue

Resolves #8074.

What does this implement/fix?

  1. Add new parameter to CSP to select how to order components.
  2. Add additional tests to make sure nothing broke.
  3. (optional) Refactoring
    • Moved duplicated data type check into _check_Xy.
    • Split .fit() into smaller functions for improved readability.

Additional information

I left the refactoring in for now because I already made it.
You may to prefer not to have these changes in order to keep the code closer to the original from pyRiemann.
Let me know :)

@mbillingr mbillingr changed the title Csp [Wip] Csp component order selection Aug 24, 2020
@cbrnr
Copy link
Contributor

cbrnr commented Aug 24, 2020

I'll take a detailed look soon, but here are already three questions:

  1. I don't quite understand the difference between 'new' and 'mutual_info' - aren't both methods sorting by mutual information? Would it make sense to combine these methods into one?
  2. Why do we need a None value for the component_order parameter?
  3. Does the behavior of CSP change with this PR? If so, we need a deprecation cycle, but I'd prefer if the default behavior doesn't change.

@mbillingr
Copy link
Contributor Author

  1. 'new' sorts by abs(eig - 0.5) and 'mutual_info' sorts by mutual information. In the two-class case they seem to result in the same order but in the multi-class case you can't use 'new'. I'd rather not combine them because pyRiemann distinguishes these two cases.

  2. None is the current behavior, which remains the default. It chooses between 'new' and 'mutual_info' based on the number of classes. Maybe 'auto' would feel more consistent than None?

  3. The default behavior of CSP does not change. This PR does not touch any of the existing tests and they still pass.

@larsoner larsoner added this to the 0.21 milestone Aug 24, 2020
@larsoner
Copy link
Member

'new'

I would maybe use extrema or extreme or deviation or something, because it's more about which are closer to the extremes of the possible values of the eigh result (0 and 1).

Also if mutual_info is expected to generally be better, we should just make it the default and call it a bug fix

@cbrnr
Copy link
Contributor

cbrnr commented Aug 25, 2020

There are too many choices for my taste. Previously, we didn't have a parameter to select the sorting order - the function used MI-based sorting for both two- and multi-class cases, even though the implementation differed. The problem with that was that the "classic" approach of sorting patterns by alternating sorted eigenvalues was not available. IMO, it would be sufficient to have two possible values for the parameter, "mutual_info" and "alternating" or something like that.

@agramfort
Copy link
Member

agramfort commented Aug 25, 2020 via email

@cbrnr
Copy link
Contributor

cbrnr commented Aug 25, 2020

@agramfort see my comment here.

Currently we have no options, but we're already doing two different things for the two- and multi-class case. Why do you want to introduce a new argument for something we already did without?

Also, am I correct in assuming that the distance to 0.5 is not equal (not even conceptually) to sorting by mutual information?

Even if these are not equal, I'd still summarize them as one method (the advanced), whereas the alternating one (the naive) should be there for practical reasons (it has been published, and many people in the BCI community have been using that - not sure if this is still the case though).

@agramfort
Copy link
Member

agramfort commented Aug 25, 2020 via email

@cbrnr
Copy link
Contributor

cbrnr commented Aug 25, 2020

the question is do we want mutual info in the 2d case? if so we need a parameter.

I don't think so, this PR just adds the classic alternating sorting. I think it's still useful because the question how we sort using the new method(s) comes up frequently. However, we could also go with @larsoner's suggestion and just implement the new approach. This would mean no new parameter, no change to the code, just add links to the literature to explain the current sorting. I'm fine with either approach, but I am 👎 on introducing a new parameter with three or four possible values.

@agramfort
Copy link
Member

@cbrnr so what you suggest in the 2d case? what is called old or new here?

@cbrnr
Copy link
Contributor

cbrnr commented Aug 27, 2020

@agramfort I think your comment about the distance to 0.5 equalling variance is not quite correct. After all, CSP is designed to maximize the variance difference between/within conditions, so all criteria will somehow work on the variance (at least indirectly).

Therefore, I suggest that we keep things simple (at the risk of being inaccurate) and use the following values for the component_order parameter:

  1. mutual_info: this is the current behavior (and the default); it uses the distance from 0.5 in the 2D case and MI in the ND case.
  2. alternate: this is the classic Blankertz 2008 behavior that can be enabled for the 2D case. If this argument is used for more than 2 classes, we throw an error.

@cbrnr
Copy link
Contributor

cbrnr commented Aug 27, 2020

In addition, I would rename component_order to order_components or order_filters.

@agramfort
Copy link
Member

agramfort commented Aug 27, 2020 via email

@cbrnr
Copy link
Contributor

cbrnr commented Aug 27, 2020

That's the inaccuracy I was talking about 😄 - I think it's more useful to keep it simple, and people can always read the docstring to find the exact methods with references. Also, we still haven't ruled out that the distance to 0.5 is somehow related to the mutual information as a special case. @alexandrebarachant maybe?

@agramfort
Copy link
Member

agramfort commented Aug 27, 2020 via email

@mbillingr
Copy link
Contributor Author

use the following values for the component_order parameter:

  1. mutual_info: this is the current behavior (and the default); it uses the distance from 0.5 in the 2D case and MI in the ND case.
  2. alternate: this is the classic Blankertz 2008 behavior that can be enabled for the 2D case. If this argument is used for more than 2 classes, we throw an error.

Does everybody agree now it should be done like that?

@cbrnr
Copy link
Contributor

cbrnr commented Aug 27, 2020

@mbillingr and @agramfort? Plus I suggest using order_components - also OK with you?

@mbillingr
Copy link
Contributor Author

So we'll have order_components := 'mutual_info' | 'alternate'. OK with me.

@agramfort
Copy link
Member

agramfort commented Aug 27, 2020 via email

@cbrnr cbrnr changed the title [Wip] Csp component order selection [Wip] CSP component order selection Aug 27, 2020
@cbrnr
Copy link
Contributor

cbrnr commented Sep 2, 2020

I implemented the changes, let's see if tests still come back green.

@cbrnr cbrnr changed the title [Wip] CSP component order selection CSP component order selection Sep 2, 2020
@cbrnr
Copy link
Contributor

cbrnr commented Sep 2, 2020

@larsoner I've implemented all suggestions, please check if the examples are OK and if the footbibliography works.

Copy link
Member

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larsoner merge if happy

Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbrnr
Copy link
Contributor

cbrnr commented Sep 2, 2020

The title "Decoding in time-frequency space data using the Common Spatial Pattern (CSP)" could be shortened to "Decoding in time-frequency space using Common Spatial Patterns (CSP)".

@larsoner
Copy link
Member

larsoner commented Sep 2, 2020

Feel free to push this change if you want

@cbrnr
Copy link
Contributor

cbrnr commented Sep 2, 2020

By "scores changed slightly" I'm assuming you mean coverage - where is our coverage CI?

@larsoner
Copy link
Member

larsoner commented Sep 2, 2020

By "scores changed slightly" I'm assuming you mean coverage

No I mean the decoding scores in the bar plots. master:

PR:

For coverage, codecov comments inline (eventually) when there are un-covered lines.

@cbrnr
Copy link
Contributor

cbrnr commented Sep 2, 2020

Got it. The scores changed quite a lot - @mbillingr do you have an idea what could have caused this change? I thought that this PR didn't change the previous (default) behavior?

@mbillingr
Copy link
Contributor Author

Run-to-run variability due to cross-validation shuffling?

@larsoner
Copy link
Member

larsoner commented Sep 2, 2020

Could be -- @cbrnr or @mbillingr please check and if so set the random state so it's reproducible

@larsoner
Copy link
Member

larsoner commented Sep 2, 2020

Indeed there is a missing random_state. Setting it to 42 I get the same image on master and this PR:

Screenshot from 2020-09-02 11-47-01

Screenshot from 2020-09-02 11-46-28

doc/references.bib Outdated Show resolved Hide resolved
@larsoner larsoner merged commit 9f4ace5 into mne-tools:master Sep 2, 2020
@larsoner
Copy link
Member

larsoner commented Sep 2, 2020

@mbillingr @cbrnr !

@mbillingr mbillingr deleted the csp branch September 3, 2020 06:45
larsoner added a commit to libertyh/mne-python that referenced this pull request Sep 9, 2020
* upstream/master: (489 commits)
  MRG, DOC: Fix ICA docstring, add whitening (mne-tools#8227)
  MRG: Extract measurement date and age for NIRX files (mne-tools#7891)
  Nihon Kohden EEG file reader WIP (mne-tools#6017)
  BUG: Fix scaling for src_mri_t in coreg (mne-tools#8223)
  MRG: Set pyvista as default 3d backend (mne-tools#8220)
  MRG: Recreate our helmet graphic (mne-tools#8116)
  [MRG] Adding get_montage for montage to BaseRaw objects (mne-tools#7667)
  ENH: Allow setting tqdm backend (mne-tools#8177)
  [MRG, IO] Persyst reader into Raw object (mne-tools#8176)
  MRG, BUG: Fix errors in IO/loading/projectors (mne-tools#8210)
  MAINT: vectorize _read_annotations_edf (mne-tools#8214)
  FIX : events_from_annotation when annotations.orig_time is None and f… (mne-tools#8209)
  FIX: do not project to sphere; DOC - explain how to get EEGLAB-like topoplots (mne-tools#7455)
  [MRG, DOC] Added linear algebra of transform to doc (mne-tools#7087)
  FIX: Travis failure on python3.8.1 (mne-tools#8207)
  BF: String formatting in exception message (mne-tools#8206)
  BUG: Fix STC limit bug (mne-tools#8202)
  MRG, DOC: fix ica tutorial (mne-tools#8175)
  CSP component order selection (mne-tools#8151)
  MRG, ENH: Add on_missing to plot_events (mne-tools#8198)
  ...
marsipu pushed a commit to marsipu/mne-python that referenced this pull request Oct 14, 2020
* move data type check into function that checks input data

* allow selection of csp decomposition algorithm

* replace selection of decomposition by selection of ordering

* remove redundant comments

* flake8

* remove unused function

* pydocstyle

* Simplify component_order argument

* Update test

* Update docstring

* Fix component_order attribute error

* Add whats_new entry

* fix default in component_order docstring

* test for exception in case of wrong component_order

* Minor improvements to examples

* Test invalid component_order/number of classes combination

* Use utils functions to validate types and check options

* Use footbibliography

* Merge tests

* Better example title

* Remove empty field in bib entry

* Add DOI

* FIX: Seed

* FIX: Test

* FIX: Dup

* DOC: DOI

* FIX: Test

Co-authored-by: Clemens Brunner <clemens.brunner@gmail.com>
Co-authored-by: Eric Larson <larson.eric.d@gmail.com>
@larsoner larsoner mentioned this pull request Feb 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pairing arrange of CSP filters removed
4 participants