MAINT: vectorize _read_annotations_edf #8214

jvdd · 2020-09-04T12:57:49Z

Speed up the processing of channel data in EDF+ TAL format by vectorizing the code.
This removes the main time bottleneck when a RawEDF object is created.

Note: experiments on private datasets yielded a speedup of 100-150x on EDF+C and EDF+D files of 20-360 MB when calling mne.io.read_raw_edf.

Author: Jeroen Van Der Donckt (IDLab Ghent University - imec)
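
To illustrate the kind of change described above, here is a minimal sketch of turning an annotation channel into a TAL byte stream, once with a Python loop and once vectorized. It is not the code from this PR: the function names `samples_to_bytes_loop` and `samples_to_bytes_vectorized` are hypothetical, and it assumes that each 16-bit sample carries two TAL bytes with the low byte first.

```python
import numpy as np


def samples_to_bytes_loop(samples):
    """Reference version: one Python-level iteration per sample."""
    out = bytearray()
    for s in samples:
        s = int(s)
        # Split each 16-bit value into its low and high byte.
        out.extend([s % 256, s // 256])
    return out


def samples_to_bytes_vectorized(samples):
    """Same result, computed on whole arrays at once."""
    samples = np.asarray(samples, dtype=np.int64)
    low = samples % 256
    high = samples // 256
    # Stack as (n_samples, 2) and flatten row by row so the bytes
    # interleave as low0, high0, low1, high1, ...
    interleaved = np.stack([low, high], axis=1).astype(np.uint8).ravel()
    return bytearray(interleaved.tobytes())


# A small hand-made TAL string (16 bytes, assumed layout for illustration).
tal = b"+0\x14\x14\x00" + b"+1.5\x14Stim\x14\x00"
samples = np.frombuffer(tal, dtype="<u2")

loop_result = samples_to_bytes_loop(samples)
vectorized_result = samples_to_bytes_vectorized(samples)
```

Both functions return the same bytes; the vectorized one avoids per-sample Python overhead, which is where a speedup on large files would be expected to come from.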

larsoner

LGTM with or without my suggested change. @cbrnr do you want to look and merge if you're happy?

larsoner · 2020-09-04T12:59:40Z

mne/io/edf/edf.py

@@ -1354,6 +1355,9 @@ def _read_annotations_edf(annotations):
            triggers = re.findall(pat, annot_file.read())
    else:
        tals = bytearray()
+        annotations = np.array(annotations) # To make sure that annotations is a ndarray


I would just do annotations = np.atleast_2d(annotations) (or, below, if annotations.ndim == 1: annotations = annotations[np.newaxis]) rather than going through expand_dims.

Great remark, glad I learned a new numpy method today 👍! I will insert this change.
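
For reference, a small sketch of the alternatives discussed in this thread. The array contents are made up for illustration; only the NumPy calls themselves come from the discussion.

```python
import numpy as np

one_channel = np.array([1, 2, 3])                  # 1-D: a single channel
many_channels = np.array([[1, 2, 3], [4, 5, 6]])   # 2-D: already (n_chan, n_samp)

# Suggested approach: promotes 1-D input to 2-D, leaves 2-D input unchanged.
a = np.atleast_2d(one_channel)

# Explicit alternative mentioned in the review comment.
b = one_channel[np.newaxis] if one_channel.ndim == 1 else one_channel

# The expand_dims route that the suggestion replaces.
c = np.expand_dims(one_channel, axis=0)

# atleast_2d also accepts plain lists, so a separate np.array() call
# beforehand is not needed.
d = np.atleast_2d([1, 2, 3])
unchanged = np.atleast_2d(many_channels)
```

All three variants give an array of shape (1, 3) for the 1-D input; `np.atleast_2d` covers the conversion and the reshaping in one call.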

larsoner · 2020-09-04T13:01:36Z

... oh and please add an entry to doc/changes/latest.inc with your name in names.inc so you get credit for the speedup!

larsoner · 2020-09-04T13:02:07Z

Azure style complains:

Running flake8
./mne/io/edf/edf.py:10:80: E501 line too long (95 > 79 characters)
./mne/io/edf/edf.py:1358:44: E261 at least two spaces before inline comment
./mne/io/edf/edf.py:1358:80: E501 line too long (88 > 79 characters)
./mne/io/edf/edf.py:1359:34: E261 at least two spaces before inline comment
./mne/io/edf/edf.py:1359:80: E501 line too long (115 > 79 characters)
./mne/io/edf/edf.py:1369:39: E261 at least two spaces before inline comment
./mne/io/edf/edf.py:1369:80: E501 line too long (100 > 79 characters)
./mne/io/edf/edf.py:1372:80: E501 line too long (122 > 79 characters)
./mne/io/edf/edf.py:1372:88: E261 at least two spaces before inline comment
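
As a hedged illustration of how complaints like these are usually resolved (this is not the actual fix commit, and the variable contents are made up), long inline comments can be moved onto their own lines, and any remaining inline comment gets two spaces before the `#`:

```python
import numpy as np

annotations = [[1, 2], [3, 4]]

# Make sure that annotations is an ndarray. Putting the comment on its
# own line keeps the statement itself well under 79 characters (E501).
annotations = np.array(annotations)

total = annotations.sum()  # two spaces before an inline comment (E261)
```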

agramfort

besides LGTM

doc/changes/latest.inc

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>

cbrnr · 2020-09-04T14:11:55Z

Very nice improvement (although speedup with my test files ranges from 1x to 10x, but I guess this heavily depends on the length and other file properties). I'll merge as soon as CIs come back green!

cbrnr · 2020-09-04T14:56:12Z

Thanks @jvdd!

* upstream/master: (489 commits)
  MRG, DOC: Fix ICA docstring, add whitening (mne-tools#8227)
  MRG: Extract measurement date and age for NIRX files (mne-tools#7891)
  Nihon Kohden EEG file reader WIP (mne-tools#6017)
  BUG: Fix scaling for src_mri_t in coreg (mne-tools#8223)
  MRG: Set pyvista as default 3d backend (mne-tools#8220)
  MRG: Recreate our helmet graphic (mne-tools#8116)
  [MRG] Adding get_montage for montage to BaseRaw objects (mne-tools#7667)
  ENH: Allow setting tqdm backend (mne-tools#8177)
  [MRG, IO] Persyst reader into Raw object (mne-tools#8176)
  MRG, BUG: Fix errors in IO/loading/projectors (mne-tools#8210)
  MAINT: vectorize _read_annotations_edf (mne-tools#8214)
  FIX : events_from_annotation when annotations.orig_time is None and f… (mne-tools#8209)
  FIX: do not project to sphere; DOC - explain how to get EEGLAB-like topoplots (mne-tools#7455)
  [MRG, DOC] Added linear algebra of transform to doc (mne-tools#7087)
  FIX: Travis failure on python3.8.1 (mne-tools#8207)
  BF: String formatting in exception message (mne-tools#8206)
  BUG: Fix STC limit bug (mne-tools#8202)
  MRG, DOC: fix ica tutorial (mne-tools#8175)
  CSP component order selection (mne-tools#8151)
  MRG, ENH: Add on_missing to plot_events (mne-tools#8198)
  ...

* MAINT: vectorize _read_annotations_edf
* MAINT: fix style complaints to + use np.atleast_2d vectorized _read_annotations_edf
* Update doc/changes/latest.inc

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>

MAINT: vectorize _read_annotations_edf

2e53c07

larsoner approved these changes Sep 4, 2020

MAINT: fix style complaints to + use np.atleast_2d vectorized _read_annotations_edf

ba2ee5d

agramfort reviewed Sep 4, 2020

Update doc/changes/latest.inc

98f4de2

Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>

agramfort approved these changes Sep 4, 2020

cbrnr merged commit 3d16144 into mne-tools:master Sep 4, 2020
