Optimised eif_new.py #24

lpryszcz · 2020-08-31T15:54:38Z

I've optimised the Python version so that its performance matches the C++ version, and it now allows saving models.
A runtime example has been added to Notebooks/comparison_py_cxx.ipynb.
The code was rewritten entirely. Some functions are optimised with numba.
The iForest is now a numpy array, which allows fast computation and model dumps with a low storage footprint.
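
To illustrate why an array-backed forest is cheap to dump, here is a minimal sketch. The node layout (`left_child`, `right_child`, `size`) is a hypothetical one chosen for the example and is not necessarily the layout used in eif_new.py.

```python
import io
import numpy as np

# Hypothetical layout: one row per node, columns = [left_child, right_child, size].
# A child index of -1 marks a leaf. This is NOT necessarily the eif_new.py layout.
tree = np.array([
    [1, 2, 5],    # root: 5 samples, children stored at rows 1 and 2
    [-1, -1, 2],  # leaf holding 2 samples
    [-1, -1, 3],  # leaf holding 3 samples
], dtype=np.int32)

# A forest is then a single 3-D array: (n_trees, n_nodes, n_fields).
forest = np.stack([tree, tree])

# Dump and reload through an in-memory buffer; a file path works the same way.
buf = io.BytesIO()
np.save(buf, forest)
buf.seek(0)
restored = np.load(buf)
```

Because the whole model is one contiguous array, saving it is a single `np.save` call and the on-disk size is essentially the raw array size plus a small header.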

wundermahn · 2021-06-14T18:36:21Z

Is this still an active project?

lpryszcz · 2021-07-01T09:29:21Z

That's a good question, @wundermahn. If you want the optimised Python version, you can get it directly from my fork.

psmgeelen · 2021-12-05T20:45:24Z

Hi there, this would fix my problem as well, wouldn't it? I am currently trying to pickle the isolationForest model, and it fails due to some Cython issue:

File "stringsource", line 2, in eif.iForest.__reduce_cython__
TypeError: no default __reduce__ due to non-trivial __cinit__

lpryszcz · 2021-12-06T02:41:12Z

Hi @psmgeelen, yes, you can't save models from the Cython version. Try my fork: its performance is similar to the Cython version, but it is implemented in Python (with Numba optimisations).
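
For context, the error above comes from Cython declining to generate a default `__reduce__` for an extension type with a non-trivial `__cinit__`. A plain Python object whose state is just numpy arrays does not hit this. The sketch below uses `types.SimpleNamespace` as a hypothetical stand-in for such a model; the attribute names are made up for the example and are not taken from the fork.

```python
import pickle
from types import SimpleNamespace

import numpy as np

# Hypothetical stand-in for a pure-Python model: its state is only plain
# attributes and numpy arrays, all of which pickle supports by default.
model = SimpleNamespace(
    ntrees=2,
    nodes=np.arange(6, dtype=np.float64).reshape(2, 3),
)

blob = pickle.dumps(model)   # serialise to bytes
clone = pickle.loads(blob)   # rebuild an equivalent object
```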

psmgeelen · 2021-12-06T17:53:16Z

@lpryszcz, you are the best! I will get on it now! So I really only need the eif_new.py file and that's it? Maybe it would be worthwhile to have your version integrated into scikit-learn. I recommended it anyway in scikit-learn/scikit-learn#16517.

EDIT: It works out of the box, I love the script! One small question though: does it make sense to have a threshold that is always 0.5? Instead, you could just push the values directly.
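
A small sketch of the difference between the two options, using made-up scores. The score values, the variable names, and the 80% quantile are assumptions for illustration only and do not reflect the actual eif_new.py interface.

```python
import numpy as np

# Hypothetical anomaly scores in [0, 1]; higher means more anomalous.
scores = np.array([0.35, 0.42, 0.48, 0.61, 0.77])

# Option 1: a fixed cut-off at 0.5.
fixed_labels = scores > 0.5

# Option 2: keep the raw scores and let the caller choose the cut-off,
# here by flagging only scores above the 80th percentile.
cutoff = np.quantile(scores, 0.8)
top_labels = scores > cutoff
```

With raw scores available, the caller can tune the cut-off to the expected contamination rate instead of relying on a single fixed value.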

lpryszcz · 2022-02-04T13:11:42Z

I'm glad it works for you :) And thanks for the recommendation, @psmgeelen. I'd be more than happy to contribute to scikit-learn if there is interest from their side.

lpryszcz added 6 commits August 25, 2020 18:37

iTree_array: cpu & memory optimised; single-thread only

57bcd95

further cpu optimisation; forced to single-thread

28b2434

further cpu optimisation; forced to single-thread

435490d

performance matching c++

c3dce92

performance matching c++

58233c6

performance matching c++

cf50816

lpryszcz mentioned this pull request Aug 31, 2020

Model saving #18

Open

solaris

5e0bebd

psmgeelen mentioned this pull request Dec 6, 2021

Implement Extended Isolation Forest scikit-learn/scikit-learn#16517

Open
