MAINT: vectorize _read_annotations_edf #8214

Merged: 3 commits, Sep 4, 2020

Conversation

Contributor

@jvdd jvdd commented Sep 4, 2020

Speed up the processing of channel data in EDF+ TAL format by vectorizing the code.
This addresses the time bottleneck when a RawEDF object is created.

Note: experimental results on private datasets yielded a speedup of 100-150x on EDF+C and EDF+D files of 20-360 MB when calling mne.io.read_raw_edf.
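
As a rough illustration of the kind of change described here, the sketch below contrasts a per-sample Python loop with a vectorized NumPy equivalent for unpacking TAL bytes. It assumes each annotation sample is an unsigned 16-bit value carrying two TAL bytes, low byte first. The function names and the sample record are hypothetical, and this is not the code from the PR.

```python
import numpy as np


def tal_bytes_loop(samples):
    """Unpack two bytes per sample, one Python iteration at a time."""
    out = bytearray()
    for s in samples:
        i = int(s)
        out.append(i % 256)   # low byte
        out.append(i // 256)  # high byte
    return out


def tal_bytes_vectorized(samples):
    """Same result, computed on the whole array at once."""
    samples = np.asarray(samples, dtype=np.int64)
    # Column 0 holds the low bytes, column 1 the high bytes.
    pairs = np.stack([samples % 256, samples // 256], axis=1)
    # C-order serialization interleaves them as low, high, low, high, ...
    return bytearray(pairs.astype(np.uint8).tobytes())


# Hypothetical 12-byte TAL record packed into six 16-bit samples.
raw = b"+0\x14\x14Start\x14\x00\x00"
samples = np.frombuffer(raw, dtype="<u2")

assert tal_bytes_loop(samples) == tal_bytes_vectorized(samples) == bytearray(raw)
```

The vectorized version replaces one Python-level iteration per sample with a handful of array operations, which is where a speedup on large files would come from. The actual gain depends on the file.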

Author: Jeroen Van Der Donckt (IDLab Ghent University - imec)

Member

@larsoner larsoner left a comment

LGTM with or without my suggested change. @cbrnr do you want to look and merge if you're happy?

@@ -1354,6 +1355,9 @@ def _read_annotations_edf(annotations):
            triggers = re.findall(pat, annot_file.read())
    else:
        tals = bytearray()
        annotations = np.array(annotations) # To make sure that annotations is a ndarray
Member

I would just do annotations = np.atleast_2d(annotations) (or, below, if annotations.ndim == 1: annotations = annotations[np.newaxis]) rather than going through expand_dims.

Contributor Author

Great remark, glad I learned a new numpy method today 👍! I will make this change.
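
For reference, the three options mentioned in this thread (np.atleast_2d, indexing with np.newaxis, and np.expand_dims) give the same result for 1-D input. A minimal sketch with made-up sample values, not taken from the PR:

```python
import numpy as np

# Hypothetical input: a single annotation channel given as a 1-D array.
single = np.array([11051, 5140, 0])

via_atleast_2d = np.atleast_2d(single)            # shape (1, 3)
via_newaxis = single[np.newaxis]                  # shape (1, 3)
via_expand_dims = np.expand_dims(single, axis=0)  # shape (1, 3)

assert via_atleast_2d.shape == (1, 3)
assert np.array_equal(via_atleast_2d, via_newaxis)
assert np.array_equal(via_atleast_2d, via_expand_dims)

# Input that is already 2-D passes through np.atleast_2d unchanged,
# so no explicit ndim check is needed.
multi = np.zeros((4, 3))
assert np.atleast_2d(multi).shape == (4, 3)

# Plain lists are accepted as well.
assert np.atleast_2d([1, 2, 3]).shape == (1, 3)
```

np.atleast_2d also converts list input to an array, so it can stand in for the np.array call plus a separate dimension check.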

Member

larsoner commented Sep 4, 2020

... oh and please add an entry to doc/changes/latest.inc with your name in names.inc so you get credit for the speedup!

Member

larsoner commented Sep 4, 2020

Azure style complains:

Running flake8
./mne/io/edf/edf.py:10:80: E501 line too long (95 > 79 characters)
./mne/io/edf/edf.py:1358:44: E261 at least two spaces before inline comment
./mne/io/edf/edf.py:1358:80: E501 line too long (88 > 79 characters)
./mne/io/edf/edf.py:1359:34: E261 at least two spaces before inline comment
./mne/io/edf/edf.py:1359:80: E501 line too long (115 > 79 characters)
./mne/io/edf/edf.py:1369:39: E261 at least two spaces before inline comment
./mne/io/edf/edf.py:1369:80: E501 line too long (100 > 79 characters)
./mne/io/edf/edf.py:1372:80: E501 line too long (122 > 79 characters)
./mne/io/edf/edf.py:1372:88: E261 at least two spaces before inline comment
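
Both kinds of complaint are usually fixed together: E261 wants at least two spaces before an inline comment, and E501 wants lines of at most 79 characters, so a long trailing comment is often simplest to move onto its own line. A small sketch using the line from the diff as the example (the input value is hypothetical, and this is not necessarily how the PR resolved it):

```python
import numpy as np

annotations = [43, 48, 20, 0]  # hypothetical input; note the two spaces

# Flagged form (one space before '#', 88 characters once indented):
#     annotations = np.array(annotations) # To make sure that ...
# Compliant form: the comment goes on its own short line.
# Make sure that annotations is an ndarray
annotations = np.array(annotations)

assert isinstance(annotations, np.ndarray)
```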

Member

@agramfort agramfort left a comment

besides LGTM

doc/changes/latest.inc (outdated, resolved)
Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>
Contributor

cbrnr commented Sep 4, 2020

Very nice improvement (although the speedup with my test files ranges from 1x to 10x; I guess this depends heavily on the file length and other file properties). I'll merge as soon as CIs come back green!

@cbrnr cbrnr merged commit 3d16144 into mne-tools:master Sep 4, 2020
Contributor

cbrnr commented Sep 4, 2020

Thanks @jvdd!

larsoner added a commit to libertyh/mne-python that referenced this pull request Sep 9, 2020
* upstream/master: (489 commits)
  MRG, DOC: Fix ICA docstring, add whitening (mne-tools#8227)
  MRG: Extract measurement date and age for NIRX files (mne-tools#7891)
  Nihon Kohden EEG file reader WIP (mne-tools#6017)
  BUG: Fix scaling for src_mri_t in coreg (mne-tools#8223)
  MRG: Set pyvista as default 3d backend (mne-tools#8220)
  MRG: Recreate our helmet graphic (mne-tools#8116)
  [MRG] Adding get_montage for montage to BaseRaw objects (mne-tools#7667)
  ENH: Allow setting tqdm backend (mne-tools#8177)
  [MRG, IO] Persyst reader into Raw object (mne-tools#8176)
  MRG, BUG: Fix errors in IO/loading/projectors (mne-tools#8210)
  MAINT: vectorize _read_annotations_edf (mne-tools#8214)
  FIX : events_from_annotation when annotations.orig_time is None and f… (mne-tools#8209)
  FIX: do not project to sphere; DOC - explain how to get EEGLAB-like topoplots (mne-tools#7455)
  [MRG, DOC] Added linear algebra of transform to doc (mne-tools#7087)
  FIX: Travis failure on python3.8.1 (mne-tools#8207)
  BF: String formatting in exception message (mne-tools#8206)
  BUG: Fix STC limit bug (mne-tools#8202)
  MRG, DOC: fix ica tutorial (mne-tools#8175)
  CSP component order selection (mne-tools#8151)
  MRG, ENH: Add on_missing to plot_events (mne-tools#8198)
  ...
marsipu pushed a commit to marsipu/mne-python that referenced this pull request Oct 14, 2020
* MAINT: vectorize _read_annotations_edf

* MAINT: fix style complaints to + use np.atleast_2d vectorized _read_annotations_edf

* Update doc/changes/latest.inc

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>