Multithreaded Streams #4

Open
jscheiny opened this issue Sep 4, 2014 · 2 comments


Comments

@jscheiny
Owner

jscheiny commented Sep 4, 2014

Allow parallelized streams (with a more limited set of operators).

@oldboldpilot

Great! Jonah! Can't wait for the parallelised streams 👍 :)

@rressi

rressi commented May 23, 2015

I think your work on this project is great!

Parallelism is an obvious next big step for a functional framework, since one of the biggest selling points of functional programming is that it gives developers robust and cheap access to parallel processing.

The reasons OO programming is losing the 'aura of holiness' it has had in past years are:

  • Objects are often stateful, and this doesn't play well with multithreading: you often need to create a lot of interlocks. CPUs stopped evolving vertically a while ago and are now only evolving horizontally. We now have machines with 20 real cores (40 hardware threads) that spend most of their time starved for data.
  • Classes tend to mix data and logic, and this doesn't play well with distributed projects, where the data needs to be transmitted (and saved) efficiently again and again. Network cables have evolved much more slowly than CPUs. We still have Gigabit cables, and 10Gb is the standard, which is nothing compared to how much CPUs have evolved.
  • The number of users a service has to serve has exploded far beyond the capabilities of a single machine. Before, we had at most 10000K users; now we have hundreds of millions, thanks to the mobile revolution, social networks, the internet of things....

Here comes the big U-turn of many developers on OO, myself included.

Do you have ideas on how to implement multithreading simply?

What about the following?

// This sample takes a huge number of words and puts them into a normalized, sorted and deduplicated vector:
auto myWords = load_from_somewhere();

auto my_normalize = [](const std::string& word) -> std::string {
    ...
    return normalizedWord;
};
auto maxWorkers = 4;     // Use up to 4 CPUs
auto bucketSize = 10000; // Every internal parallel task will have 10k elements.

auto myVocabulary = myWords | parallel(map_(my_normalize) | distinct(),
                                       maxWorkers, bucketSize)
                            | distinct()
                            | to_vector();
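
To make the bucketing idea concrete, here is a rough sketch of what a stage like the proposed parallel(...) might do internally, written with the standard library only. It does not use the Streams API, and parallel(...) itself is only a proposal, not an existing operator. The names normalize, process_bucket and parallel_vocabulary are hypothetical, and lowercasing is just an assumed stand-in for the elided body of my_normalize. For simplicity it launches up to maxWorkers buckets at a time and waits for the whole batch before starting the next one, which a real implementation would probably replace with a thread pool.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <future>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for my_normalize: lowercases a word.
std::string normalize(const std::string& word) {
    std::string out = word;
    std::transform(out.begin(), out.end(), out.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return out;
}

// Normalizes and deduplicates one bucket [first, last) of the input.
std::set<std::string> process_bucket(const std::vector<std::string>& words,
                                     std::size_t first, std::size_t last) {
    std::set<std::string> local;
    for (std::size_t i = first; i < last; ++i) {
        local.insert(normalize(words[i]));
    }
    return local;
}

// Hypothetical helper: splits the input into buckets of bucketSize elements,
// processes up to maxWorkers buckets concurrently, then merges the partial
// results into one sorted, deduplicated vector.
std::vector<std::string> parallel_vocabulary(const std::vector<std::string>& words,
                                             std::size_t maxWorkers,
                                             std::size_t bucketSize) {
    if (maxWorkers == 0) maxWorkers = 1;
    if (bucketSize == 0) bucketSize = 1;

    std::set<std::string> merged;  // final distinct() step; std::set keeps it sorted
    std::size_t next = 0;
    while (next < words.size()) {
        // Launch one batch of at most maxWorkers buckets.
        std::vector<std::future<std::set<std::string>>> batch;
        for (std::size_t w = 0; w < maxWorkers && next < words.size(); ++w) {
            const std::size_t first = next;
            const std::size_t last = std::min(first + bucketSize, words.size());
            batch.push_back(std::async(std::launch::async, [&words, first, last] {
                return process_bucket(words, first, last);
            }));
            next = last;
        }
        // Wait for the batch and merge the per-bucket results.
        for (auto& result : batch) {
            const std::set<std::string> part = result.get();
            merged.insert(part.begin(), part.end());
        }
    }
    return std::vector<std::string>(merged.begin(), merged.end());
}
```

With this sketch, a call such as parallel_vocabulary(myWords, maxWorkers, bucketSize) would play the role of the whole pipeline above, assuming myWords is a std::vector<std::string>. The per-bucket std::set corresponds to the inner distinct(), and the merge corresponds to the outer one.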

If you want, I can help you with this, or we can work together to design the best solution.
