I tried to run "process_data.py" with the same word2vec binary file (i.e. GoogleNews-vectors-negative300.bin), but it didn't work. The process was killed after approximately 30 minutes.
At first I thought it might be a memory problem, but I also tried on a server (256GB RAM and 16GB GPU) and unfortunately got the same result (i.e. the program was killed after running for approximately 30 minutes).
What could be the possible reasons?
Any response would be highly appreciated.
usama6832 changed the title from "How much RAM memory do I need to process Goole News dataset bin file (i.e. GoogleNews-vectors-negative300.bin)" to "How much memory do I need to process bin file (i.e. GoogleNews-vectors-negative300.bin)" on May 22, 2019
Yes, I solved my problem and successfully ran this file in my data center on a machine with 256GB RAM. It may have been a Python version compatibility problem rather than a memory problem.
If you are attempting this under Python 3 and are running into memory limits, the issue likely lies in the string processing. Python 2 and Python 3 handle binary files differently: in Python 3, reading a file opened in binary mode returns bytes, so any literal compared against that data must be a bytes literal, written with a lowercase b prefix, for the comparison to succeed.
Here is an example:
    with open(fname, "rb") as f:
        for line in range(foo):
            ch = f.read(1)
            if ch == b' ':
                pass  # do something
Notice the space ' ' has a b before it: b' '
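Without this b, the comparison is always false in Python 3, even when the character read from the binary file is a space, because a bytes object never compares equal to a str. If a loop relies on that comparison to stop reading a word, it never stops, and whatever it accumulates keeps growing until the process runs out of memory and is killed.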
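To make the point concrete, here is a minimal sketch of a Python 3 reader for a word2vec-style binary file. It is not the code from process_data.py: the function name `load_bin_vectors` and its `vocab` parameter are made up for illustration, and it assumes the usual layout of a text header line with the vocabulary size and dimension, followed by each word, a space, and that many little-endian float32 values. Every comparison uses a bytes literal, and the loop also stops on end of file instead of appending empty reads forever.

```python
import struct


def load_bin_vectors(fname, vocab=None):
    """Read vectors from a word2vec-style binary file (hypothetical helper).

    Assumed layout: a header line b"<vocab_size> <dim>\n", then for each
    entry the word, a single space, and <dim> little-endian float32 values.
    """
    word_vecs = {}
    with open(fname, "rb") as f:
        vocab_size, dim = map(int, f.readline().split())
        vec_bytes = 4 * dim  # each float32 takes 4 bytes
        for _ in range(vocab_size):
            chars = []
            while True:
                ch = f.read(1)
                if ch == b"":
                    # End of file: without this check the loop would spin
                    # forever, appending empty bytes objects.
                    raise EOFError("file ended in the middle of a word")
                if ch == b" ":  # bytes literal, not ' '
                    break
                if ch != b"\n":  # skip the newline left after the previous vector
                    chars.append(ch)
            word = b"".join(chars).decode("utf-8", errors="replace")
            data = f.read(vec_bytes)
            if len(data) != vec_bytes:
                raise EOFError("file ended in the middle of a vector")
            # Keep only the words asked for, to limit memory use.
            if vocab is None or word in vocab:
                word_vecs[word] = struct.unpack("<%df" % dim, data)
    return word_vecs
```

Passing a set of words as `vocab` keeps only the vectors you actually need, which matters far more for memory than the machine size when the file holds millions of entries.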