
reworking the FFT in power_spectrum #2

Closed · skewballfox opened this issue Jan 13, 2022 · 9 comments

@skewballfox (Collaborator) commented Jan 13, 2022

Okay, so I've been tackling this translation this morning, and I realized that this part is going to be a bit more complicated. The r in np.fft.rfft doesn't stand for reverse: it's an FFT computed for real input, and it outputs complex numbers (see the docs). For more information on the difference between np.fft.fft and np.fft.rfft, please see this Stack Overflow question.

There are only two crates that handle this:

  1. realfft
  2. ndrustfft
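
For reference, here's a minimal sketch of how realfft exposes the same contract as np.fft.rfft (n real samples in, n/2 + 1 complex bins out). This is just an illustration of the API, not a drop-in for our code:

```rust
use realfft::RealFftPlanner;

fn main() {
    let n = 320; // one frame's worth of real samples
    let mut planner = RealFftPlanner::<f64>::new();
    let r2c = planner.plan_fft_forward(n);

    let mut input = r2c.make_input_vec();   // n real values, zero-initialized
    let mut output = r2c.make_output_vec(); // n/2 + 1 complex bins
    input[0] = 1.0; // a unit impulse has a flat magnitude spectrum

    r2c.process(&mut input, &mut output).unwrap();
    assert_eq!(output.len(), n / 2 + 1); // 161 complex bins for n = 320
}
```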

Also, it seems that numpy's FFTs normalize a portion of the returned matrices: the third argument (norm=None) actually indicates that the default normalization option ("backward") should be used, which appears to be distinct from an inverse FFT (see the list of real FFTs).

ndrustfft may be a better fit here, but I'm still trying to figure out whether it has options for normalization.

@skewballfox (Collaborator, Author)

After reading over the docs a couple more times, it looks like the output of this particular function isn't normalized; normalization only happens for inverse FFTs (unless I'm misinterpreting):

The default normalization ("backward") has the direct (forward) transforms unscaled and the inverse (backward) transforms scaled by $1/n$.

It's happening along the last axis, but the number of output elements isn't determined by the number of elements in that vector, and it looks like the output should be a 2D matrix.
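
To make that convention concrete, here's a small round-trip sketch with realfft. The assumption (worth verifying) is that realfft's inverse, like rustfft's, is unnormalized, so matching numpy's default "backward" norm means leaving the forward output alone and scaling the inverse output by 1/n ourselves:

```rust
use realfft::RealFftPlanner;

fn main() {
    let n = 8;
    let mut planner = RealFftPlanner::<f64>::new();
    let r2c = planner.plan_fft_forward(n);
    let c2r = planner.plan_fft_inverse(n);

    let mut signal: Vec<f64> = (0..n).map(|i| (i as f64).sin()).collect();
    let original = signal.clone();

    // forward transform: left unscaled under the "backward" convention
    let mut spectrum = r2c.make_output_vec();
    r2c.process(&mut signal, &mut spectrum).unwrap();

    // the DC and Nyquist bins of a real signal's spectrum are purely real
    spectrum[0].im = 0.0;
    spectrum[n / 2].im = 0.0;

    // inverse transform: scale by 1/n, which is what numpy's default does
    let mut roundtrip = c2r.make_output_vec();
    c2r.process(&mut spectrum, &mut roundtrip).unwrap();
    for x in roundtrip.iter_mut() {
        *x /= n as f64;
    }

    assert!(original
        .iter()
        .zip(&roundtrip)
        .all(|(a, b)| (a - b).abs() < 1e-9));
}
```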

@StuartIanNaylor

There is an example in C, if that helps:
https://github.com/42io/dataset/tree/master/google_speech_commands
Also hidden away in tensorflow is https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/microfrontend

@skewballfox (Collaborator, Author)

Thanks for the links. It's definitely useful to have some other implementations to reference. I'm going to look at this again sometime Monday.

@StuartIanNaylor commented May 28, 2022

The 42io/dataset repo is quite a good example, as dumping to stdout is a good method for creating a pipeline.

Also, SpecAugment is now quite commonly used and is an optional part of MFCC creation.
https://github.com/DemisEom/SpecAugment

@skewballfox There's https://github.com/HEnquist/realfft, but compared to, say, pocketfft or FFTW, I don't think there is a NEON implementation, and Rust itself, from a complete noob's perspective, looks at first glance to be lacking in embedded Arm optimisation.
The C implementations that do support NEON could be 2x+ faster.

@skewballfox (Collaborator, Author)

I don't think there is a NEON implementation, and Rust itself, from a complete noob's perspective, looks at first glance to be lacking in embedded Arm optimisation.
The C implementations that do support NEON could be 2x+ faster.

I'm still pretty new to this stuff: is this supposed to be handled directly by the library, or more as a compile-time optimization? I think the language has support for (most) NEON instruction sets, going off this issue and the last corresponding PR.

Also, a bit off topic, but what are some good resources for learning about this?

@StuartIanNaylor commented May 31, 2022

Dunno, as I'm in the same boat with C: I can hack bits and pieces together. I had a little look at Rust out of interest, but apart from 'let mut' sounding like it's about bringing the dog in, the only thing I can see is that you either create your own crates or you're limited to what Rust supplies.

I have been wondering, especially with voice AI being so Arm-centric, whether maybe even the Arm libs might be the way to go. With FFT it's usually FFTW, but the new one that TensorFlow and Torch have employed is pocketfft.
https://docs.rs/rustfft/latest/rustfft/ seems to be AVX/SSE4.1 with no mention of NEON?
pocketfft & FFTW are NEON-optimised, but again I'm wondering what gains you actually get from Arm-optimised libs on Arm hardware; Arm makes some big claims, but I've never checked, as FFT generally twists my melon.
https://developer.arm.com/documentation/101004/2100/Fast-Fourier-Transforms-FFTs/Fast-Fourier-Transforms-FFTs-Introduction
https://developer.arm.com/documentation/102620/latest/

I doubt I'm any further along than you, apart from that when you mentioned the real FFT, reverse (iFFT) also got a mention.
There are a lot of algorithms in the input stream of a voice AI pipeline, and they are often standalone stages where the final step uses an iFFT to get back to a PCM stream, only for the next stage to convert to FFT again for processing.
I think the biggest optimisation could be to compute the FFT once, treat the algorithm pipeline as a whole, and stay in the frequency domain all the way through to MFCC.
The thing is, with the math these algorithms employ, you need to be a rocket scientist to have a clue what is going on, and that's where I'm a big fail.

I still think plain old C is a better option, as it's so much more portable across platforms in a general way (yeah, you will get arguments against that), and the existing codebase and set of examples are much larger.
Still, even if you have the language and the libs, the math can be a huge barrier: if you hit bugs with only a shallow understanding of the math, it's probably impossible to debug.

@sheosi (Collaborator) commented May 31, 2022

Some nuggets of info about SIMD, C, Rust, and embedded.

Neither C nor Rust provides architecture-agnostic SIMD; both rely on libraries that take advantage of the intrinsics to offer something more comfortable. Not long ago, Rust (1.59.0) added stable support for some SIMD on AArch64 (it was nightly-only before that).

Overall, C support is much more mature (especially in embedded), while Rust's is improving rapidly. Other than ready-made libs (which are important, but can be wrapped), I can't see a huge impact in Rust vs Clang C embedded-wise. And the funny thing is that Rust has opportunities to be faster than C: ownership enables writing non-allocating code even in complex codebases, and pointer aliasing doesn't happen in Rust, which is another source of potential optimizations.

And on the topic of RustFFT and NEON, there's a PR working on exactly that, now that it's finally in Rust stable.
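
For anyone following along, here's a tiny sketch of what those stabilized intrinsics look like (std::arch::aarch64, stable since Rust 1.59.0). The function is purely illustrative; the NEON path only compiles on aarch64 targets:

```rust
#[cfg(target_arch = "aarch64")]
fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::aarch64::*;
    // Safety: NEON is a mandatory feature on aarch64, so these
    // intrinsics are always available on this target.
    unsafe {
        let va = vld1q_f32(a.as_ptr()); // load 4 f32 lanes
        let vb = vld1q_f32(b.as_ptr());
        let vsum = vaddq_f32(va, vb); // lane-wise add
        let mut out = [0.0f32; 4];
        vst1q_f32(out.as_mut_ptr(), vsum);
        out
    }
}

#[cfg(not(target_arch = "aarch64"))]
fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    // scalar fallback for non-Arm targets
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}

fn main() {
    assert_eq!(add4([1.0; 4], [2.0; 4]), [3.0; 4]);
}
```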

@StuartIanNaylor

And on the topic of RustFFT and NEON, there's a PR working on exactly that

That is exactly what I was saying: 'working on' still doesn't mean it's available, which is what I was pointing out.
I'm not here to play fanboy for a favourite programming language; I was just pointing out that what you're currently doing with FFT likely isn't optimised for NEON, which can be a pretty big deal.

All the complex math libs, FFT, OpenBLAS, and various others, are highly optimised after years of development, and the likes of Intel & Arm create API-identical optimised versions. It's likely that a fresh generalised lib such as Rust's is not as performant as, say, FFTW or pocketfft, and definitely not as performant as the platform-optimised vendor versions, which Rust lacks.

I'm not bothered about what you use, but that is very much the situation. It's probably no bother, though, as this is already part of torch.audio & tf.signal, so you probably don't even need to provide it unless you're fanning out to multiple models with the same MFCC.

@skewballfox (Collaborator, Author)

Coming back to this, as it's actually the current source of failure for the MFCC test.

In both the Python version and this implementation, the shape of frames is [6248, 320]. On the Python side, the shape of SPECTRUM_VECTOR inside the fft_spectrum function is [6248, 257]; currently the Rust code panics due to incompatible shapes. I could provide a view into a correctly sized slice of frames, or pad it with zeros, but I'm not sure whether either would be correct.

My intuition says that zero padding is probably correct: this is a 512-point FFT, which produces 512/2 + 1 = 257 output bins, and there are only 320 entries along the specified axis. But I need to confirm.
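
If it helps, here's a minimal sketch of the zero-padding route for a single frame, assuming realfft and a fixed 512-point transform (the frame contents are just a stand-in):

```rust
use realfft::RealFftPlanner;

fn main() {
    const N_FFT: usize = 512; // 512/2 + 1 = 257 output bins
    let frame = vec![0.5f32; 320]; // stand-in for one row of `frames`

    let mut planner = RealFftPlanner::<f32>::new();
    let r2c = planner.plan_fft_forward(N_FFT);

    let mut padded = r2c.make_input_vec(); // zero-initialized, length 512
    padded[..frame.len()].copy_from_slice(&frame); // 320 real samples, rest stay 0

    let mut spectrum = r2c.make_output_vec();
    r2c.process(&mut padded, &mut spectrum).unwrap();

    assert_eq!(spectrum.len(), N_FFT / 2 + 1); // 257, matching the Python shape
}
```

Applied row by row, that would turn the [6248, 320] frames matrix into the [6248, 257] spectrum the Python side produces (and in real code the planner would be created once, outside the loop).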
