I don't really see vaex supporting generators in the way you describe (although you never know...). If your goal is to have columns with constant values, then we should indeed have a dedicated method for that, and make those columns use no memory. There are already a couple of tricks for doing this for both numerical and string columns, scattered around the issues. Btw, there is already a [...] I hope this helps a bit.
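To make the "constant column that uses no memory" idea concrete, here is a minimal pure-Python sketch. It is not vaex code: `ConstantColumn` is a hypothetical name invented for this illustration, and it only shows that a value plus a length is enough to describe such a column.

```python
from collections.abc import Sequence


class ConstantColumn(Sequence):
    """Hypothetical read-only column that repeats one value `length` times.

    Only the value and the length are stored, so memory use does not grow
    with the number of rows. This is an illustration, not a vaex class.
    """

    def __init__(self, value, length):
        if length < 0:
            raise ValueError("length must be non-negative")
        self.value = value
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice of a constant column is a shorter constant column.
            start, stop, step = index.indices(self.length)
            return ConstantColumn(self.value, len(range(start, stop, step)))
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("column index out of range")
        return self.value


# One million "rows", but only a single string is held in memory.
flag = ConstantColumn("missing", 1_000_000)
```

Values are only produced when a row is actually read, which is roughly what a dedicated method in vaex would need to do internally.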
Hi,
Possibly this should be a feature request.
When vaex creates a DataFrame from a dict, and this dict contains only generators, can it generate the data in memory only when it is needed?
I am aware that one cannot know the length of a generator in advance (or so I think), so an additional parameter specifying the length of the DataFrame may be required.
Something like:
My use case (I should have started here) consists of loading data from files, joining it with vaex, and then writing it back out as a single file.
I know in advance the number of columns, the column names, the number of rows and so on for the data being loaded.
When a file does not exist, I have to use default values like the ones in the example above (either constant values or values generated by a range() function).
My current method is to create the array with pandas, then convert it to a vaex DataFrame.
But vaex could probably help make smarter use of memory here as well?
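As a rough sketch of the "generators plus an explicit length" idea, the standard library can already cut each generator to a known row count. The helper name `take_columns` and the column names below are hypothetical, chosen only for this illustration; nothing here is part of vaex.

```python
from itertools import count, islice, repeat


def take_columns(columns, length):
    """Hypothetical helper: consume exactly `length` items from each iterable.

    `columns` maps column names to iterables (generators included).
    Raises ValueError if an iterable runs out before `length` rows.
    """
    result = {}
    for name, values in columns.items():
        data = list(islice(values, length))
        if len(data) != length:
            raise ValueError(
                f"column {name!r} produced {len(data)} rows, expected {length}"
            )
        result[name] = data
    return result


n_rows = 5
defaults = take_columns(
    {
        "index": count(),                          # unbounded counter
        "value": repeat(0.0),                      # constant default value
        "square": (i * i for i in range(n_rows)),  # ordinary generator
    },
    n_rows,
)
```

The resulting lists could then be converted to NumPy arrays and handed to `vaex.from_dict` or `vaex.from_arrays`. Note that this still materializes every value, so it only removes the pandas step; it does not give the lazy, memory-free behaviour asked about.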
Last but not least,
If this is considered, single values in the dict could be interpreted as constant values for the whole column, which would allow the user to write shorter code:
Should I file this as a feature request?
Thanks for your feedback, bests!