cudf::column redesign #2207

Merged
merged 282 commits into from
Oct 2, 2019
Changes from 250 commits
4f4ee73
Made the null mask into a column.
jrhemstad Jul 15, 2019
846382a
Add definitions for column and column view.
jrhemstad Jul 15, 2019
1cdcacc
add missing system includes.
jrhemstad Jul 15, 2019
d38e2ae
Copy old type_dispatcher to make new type_dispatcher.
jrhemstad Jul 15, 2019
49fcf9c
Made the bitmask into another device_buffer.
jrhemstad Jul 15, 2019
5bfcf62
Removed building of bitmask tests.
jrhemstad Jul 15, 2019
8ccddd2
Added offset to column view.
jrhemstad Jul 15, 2019
1f23cda
Added doc to column_view.
jrhemstad Jul 15, 2019
1f84167
Clarify usage of head<T>().
jrhemstad Jul 15, 2019
3098923
Add column_device_view.
jrhemstad Jul 17, 2019
ad5eea3
Moved child access to header.
jrhemstad Jul 17, 2019
7b30632
Added const and non-const functions to column_view.
jrhemstad Jul 17, 2019
d87f2cc
Added concrete data_type wrapper class.
jrhemstad Jul 17, 2019
e44621f
Remove files no longer used.
jrhemstad Jul 17, 2019
955868f
Remove unused bitmask files.
jrhemstad Jul 17, 2019
a3427d5
Doc.
jrhemstad Jul 18, 2019
2e6458f
Add bit.cuh utilities.
jrhemstad Jul 18, 2019
ace7347
Added new type dispatcher.
jrhemstad Jul 18, 2019
a1099fc
Add type to id mapping utilities.
jrhemstad Jul 18, 2019
6c8bb12
use a macro for the mapping of types to ids.
jrhemstad Jul 18, 2019
8c5cb94
Doc.
jrhemstad Jul 18, 2019
70d3d97
Doc.
jrhemstad Jul 18, 2019
22e2de0
Add return to avoid warning.
jrhemstad Jul 18, 2019
a04b732
Update doc on CUDF_FAIL.
jrhemstad Jul 19, 2019
fe4618e
minor updates.
jrhemstad Jul 19, 2019
548b7b6
Removed duplicate create function.
jrhemstad Jul 20, 2019
e6e4335
Corrected variable names in bit utils.
jrhemstad Jul 20, 2019
b52094e
Type dispatcher doc.
jrhemstad Jul 20, 2019
711ecf1
Doc.
jrhemstad Jul 22, 2019
1756fe8
Add const to bit_is_set.
jrhemstad Jul 22, 2019
fc56e80
Added mutable column views.
jrhemstad Jul 22, 2019
718f867
Corrected num_children in mutable_column_view.
jrhemstad Jul 22, 2019
f19e15a
Created new column_view_base from which column_view and mutable_colum…
jrhemstad Jul 22, 2019
dd20a6a
Moved column_view constructors to .cpp file.
jrhemstad Jul 22, 2019
f153007
Updated int to size_type for UNKNOWN_NULL_COUNT.
jrhemstad Jul 22, 2019
3ded4fa
Added column_device_view_base from which column_device_view and mutab…
jrhemstad Jul 22, 2019
ee7ce4e
Added num_children to mutable_device_view.
jrhemstad Jul 22, 2019
44c167d
Merge remote-tracking branch 'origin/branch-0.9' into fea-ext-column-…
jrhemstad Jul 25, 2019
e45f443
Add unknown null count to types.hpp.
jrhemstad Jul 25, 2019
c454423
Moved constructors for column_view_base to be protected.
jrhemstad Jul 25, 2019
e56cd1c
Updated accessors for column.
jrhemstad Jul 25, 2019
782ffce
Doc.
jrhemstad Jul 26, 2019
8406d49
Added table_view and mutable_table_view.
jrhemstad Jul 26, 2019
85be5d0
Begin adding new table.
jrhemstad Jul 26, 2019
9fab018
Merge branch 'fea-ext-move-table-legacy' into fea-ext-column-redesign
jrhemstad Jul 26, 2019
d504a29
Add conversion for mutable_table_view to table_view.
jrhemstad Jul 26, 2019
97b62a9
Add conversion from mutable table view to immutable.
jrhemstad Jul 26, 2019
247edca
Add table implementation.
jrhemstad Jul 26, 2019
109d633
Implement memebers of table.
jrhemstad Jul 26, 2019
6906bd1
Updated/added checks for columns to make sure they are nullable when …
jrhemstad Jul 26, 2019
7a3e966
Added traits for fixed_width and complex/simple types.
jrhemstad Jul 30, 2019
ab19562
Add null_mask header and functions for creating a device_buffer nullm…
jrhemstad Jul 30, 2019
47ae45b
Add the null mask files.
jrhemstad Jul 30, 2019
7fa51f5
Merge remote-tracking branch 'origin/branch-0.9' into fea-ext-column-…
jrhemstad Jul 30, 2019
0baf445
Doc.
jrhemstad Jul 30, 2019
a71c73f
Added device/host default error case to new type-dispatcher.
jrhemstad Jul 31, 2019
b0bcdd6
Changed column.cpp to column.cu
jrhemstad Jul 31, 2019
c3bafcd
Disabled column constructor from view.
jrhemstad Jul 31, 2019
8c715ac
Added is_numeric trait.
jrhemstad Aug 1, 2019
17480f3
Simplified column constructor in favor of type specific factories
jrhemstad Aug 1, 2019
7d116a5
Add column_factory.cpp to build.
jrhemstad Aug 1, 2019
fcbdede
Add release_assert to type_dispatcher.
jrhemstad Aug 1, 2019
ebea3b0
Simplified and cleaned up docs for traits.
jrhemstad Aug 1, 2019
ca4d065
Added function to compute null count for a given mask state
jrhemstad Aug 1, 2019
04bca77
Implemented factory for numeric columns
jrhemstad Aug 1, 2019
41f013e
Move fill value.
jrhemstad Aug 1, 2019
7a8940b
Made column children into unique_ptr.
jrhemstad Aug 1, 2019
7775d8a
Added forward declarations for rmm utilities.
jrhemstad Aug 1, 2019
37401cf
Updated forward decls.
jrhemstad Aug 1, 2019
cbb87c6
Updated docs for make_numeric_column
jrhemstad Aug 1, 2019
ffb2726
Added stub for new sorted_order.
jrhemstad Aug 1, 2019
3454786
Doc updates for invalidating null count w/ mutable view
jrhemstad Aug 2, 2019
46ac1a0
Add stub for table device view
jrhemstad Aug 2, 2019
c1902b1
Updated doc for table_view
jrhemstad Aug 2, 2019
071d7c2
Add API stubs for interoperability w/ gdf_column
jrhemstad Aug 2, 2019
3e2bd6c
Doc
jrhemstad Aug 2, 2019
5c04740
Implemented interop APIs.
jrhemstad Aug 2, 2019
f5cf8bf
Removed unneccessary const_cast
jrhemstad Aug 2, 2019
acf8334
Moved count_descendants to external header
jrhemstad Aug 5, 2019
80ad711
Unsigned to signed comparison
jrhemstad Aug 5, 2019
512fff0
Initial table_device_view
jrhemstad Aug 5, 2019
7a68b93
Add table device view construction.
jrhemstad Aug 7, 2019
c0a2d8a
Simplified device_table_view creation.
jrhemstad Aug 7, 2019
fda42bf
Moved device table creation to derived classes.
jrhemstad Aug 7, 2019
1c9863e
Added device table creation to sort.
jrhemstad Aug 7, 2019
11404b8
Moved CUDA keyword macros to separate header.
jrhemstad Aug 7, 2019
de4f1ac
Add trait for comparability of types.
jrhemstad Aug 7, 2019
a2891d9
Added initial row relational comparison operator.
jrhemstad Aug 7, 2019
bbf6d21
Add a const column accessor to table device view.
jrhemstad Aug 8, 2019
286ab6b
Implemented the row lexicographic comparator.
jrhemstad Aug 8, 2019
ebea5bc
Doc for row comparator.
jrhemstad Aug 8, 2019
0b56e6f
Separated out the sorted_order to a detail namespace with default str…
jrhemstad Aug 8, 2019
939c911
Added std namespace to begin/end
jrhemstad Aug 9, 2019
4610629
Added begin/end members to column views
jrhemstad Aug 9, 2019
f78890e
Added a has_nulls for table_view
jrhemstad Aug 9, 2019
6eb5680
Fixed the is_relationally_comparable enable_if call.
jrhemstad Aug 9, 2019
2cb3c57
Finished implementation of sorted_order
jrhemstad Aug 9, 2019
f9f50aa
Update doc.
jrhemstad Aug 9, 2019
4aed0e5
Added checking for empty input.
jrhemstad Aug 9, 2019
1955a6f
Clarified doc.
jrhemstad Aug 9, 2019
e05378f
Rename null_size
jrhemstad Aug 20, 2019
936a596
Make has_nulls invoke null_count()
jrhemstad Aug 20, 2019
5fb3a74
Update null_count to store and return zero if bitmask is null
jrhemstad Aug 20, 2019
14d87ea
Add missing numeric include.
jrhemstad Aug 26, 2019
ee3986c
Add mutable to column_view null_count
jrhemstad Aug 26, 2019
490042e
Apply suggestions from code review
jrhemstad Aug 28, 2019
2597557
Apply suggestions from code review
jrhemstad Aug 28, 2019
52fe7ed
Apply suggestions from code review
jrhemstad Aug 28, 2019
853e842
Merge branch 'branch-0.10' into fea-ext-column-redesign
jrhemstad Sep 9, 2019
43cbabf
Fix extra parenthesis.
jrhemstad Sep 9, 2019
50b93e3
Merge branch 'fea-ext-column-redesign' of github.com:jrhemstad/cudf i…
jrhemstad Sep 9, 2019
edf1b61
Moved contents of utils/ to utilities/.
jrhemstad Sep 9, 2019
b75b4ef
Updated doc.
jrhemstad Sep 9, 2019
c1b03f4
Add stub for traits tests.
jrhemstad Sep 9, 2019
14bf61f
Removed resolution specific TIMESTAMP types.
jrhemstad Sep 9, 2019
10ccfa0
Added tests for numeric and relationally_comparable traits
jrhemstad Sep 9, 2019
04c8b4f
Add doc for future tests.
jrhemstad Sep 9, 2019
1567cb1
Add stub for factory tests.
jrhemstad Sep 9, 2019
c880652
Added centralized abstraction for typed test lists
jrhemstad Sep 9, 2019
73f641c
Update traits tests to use new centralized types.
jrhemstad Sep 9, 2019
956845b
Updated docs.
jrhemstad Sep 9, 2019
9f26c69
Added element access template to device view.
jrhemstad Sep 10, 2019
ac5bc77
Explicit type.
jrhemstad Sep 11, 2019
e8b541d
Stub test file with includes.
jrhemstad Sep 11, 2019
23bbb99
Began conversion from C++17 to C++14
jrhemstad Sep 12, 2019
e3bce38
Clarified documentation of null mask state.
jrhemstad Sep 12, 2019
c2dca8f
Add additional documentation.
jrhemstad Sep 12, 2019
892c21a
Update doc.
jrhemstad Sep 13, 2019
ae014da
Updated doc and removed Values.
jrhemstad Sep 14, 2019
c975add
Update doc.
jrhemstad Sep 14, 2019
fb4a9fc
Added tests for type list utilities.
jrhemstad Sep 14, 2019
94f24bf
Updated documentation.
jrhemstad Sep 14, 2019
5b32ea9
Replaced header guards with pragma once
jrhemstad Sep 14, 2019
50ceb66
Created macro for compile time test of equality between types.
jrhemstad Sep 14, 2019
d03d703
Update tests.
jrhemstad Sep 14, 2019
6c71cb5
Doc for RemoveIf
jrhemstad Sep 14, 2019
dda3f14
Update tests: Transform, Append, Remove
jrhemstad Sep 14, 2019
a8135a3
Update docs: Transform, Repeat, Append, Remove
jrhemstad Sep 14, 2019
0fd3292
Updated all tests.
jrhemstad Sep 14, 2019
66d3f51
Remove unneeded macros.
jrhemstad Sep 14, 2019
28d8d43
Update license
jrhemstad Sep 14, 2019
2596613
Moved typed test utilities.
jrhemstad Sep 15, 2019
a43ae3e
Moved type list tests to utilities test.
jrhemstad Sep 15, 2019
55fba2b
Updated centralized type lists to use new Type lists.
jrhemstad Sep 15, 2019
6eaefb8
Added GTest type list utilities and tests.
jrhemstad Sep 16, 2019
c79f234
Added type_list_tests.cpp.
jrhemstad Sep 16, 2019
3c1b96b
Renamed files.
jrhemstad Sep 16, 2019
af668b8
Merge remote-tracking branch 'origin/branch-0.10' into fea-ext-column…
jrhemstad Sep 17, 2019
c5c0c88
Merge branch 'fea-ext-column-redesign' into type_list
jrhemstad Sep 17, 2019
4b8a060
Merge branch 'fea-ext-gtest-type-list-utils' into type_list
jrhemstad Sep 17, 2019
19f57fd
Rename and update docs.
jrhemstad Sep 17, 2019
4178282
testing -> test
jrhemstad Sep 17, 2019
9acfb1f
Update naming
jrhemstad Sep 17, 2019
af18d5f
Update includes.
jrhemstad Sep 17, 2019
630f00d
Moved mask_state to types.hpp
jrhemstad Sep 17, 2019
6a3a3d7
Added equality operator for data_type
jrhemstad Sep 17, 2019
471373f
Added base test fixture
jrhemstad Sep 17, 2019
a82e26c
Updated AllTypes to concat with numeric types.
jrhemstad Sep 17, 2019
5bd7c1a
Added tests for numeric factory.
jrhemstad Sep 17, 2019
afc4989
Stubbed out type parameterization of column test
jrhemstad Sep 17, 2019
80e70e9
Add copy constructor with explicit stream and mr.
jrhemstad Sep 17, 2019
28c2c4d
Clean up column docs.
jrhemstad Sep 17, 2019
2762641
Added element equality comparator
jrhemstad Sep 18, 2019
7371380
Update construction of element relational comparator
jrhemstad Sep 18, 2019
b1b1f3d
reorder ctors
jrhemstad Sep 19, 2019
352eaab
Explicit return type for create.
jrhemstad Sep 19, 2019
6db8521
Add row equality comparator
jrhemstad Sep 19, 2019
9a862c3
struct to class
jrhemstad Sep 19, 2019
76655f5
Don't throw when making an empty table device view.
jrhemstad Sep 19, 2019
b3ee825
Add utility for equality between columns.
jrhemstad Sep 19, 2019
09cdd5b
Initial column test.
jrhemstad Sep 19, 2019
e3a1757
Apply suggestions from code review
jrhemstad Sep 23, 2019
fc93c01
Apply suggestions from code review
jrhemstad Sep 23, 2019
319db7e
Apply suggestions from code review
jrhemstad Sep 23, 2019
5c1dbd9
Apply suggestions from code review
jrhemstad Sep 23, 2019
f9a32a3
Merge remote-tracking branch 'personal/fea-ext-column-redesign' into …
jrhemstad Sep 23, 2019
4100c48
Clarified doc.
jrhemstad Sep 23, 2019
4d9c018
Fix typo.
jrhemstad Sep 23, 2019
9297df3
Add include guard.
jrhemstad Sep 23, 2019
29f0ae2
Add column copy ctor w/ explicit stream/mr.
jrhemstad Sep 23, 2019
10f09a8
Add check in row_equality for equal number of columns.
jrhemstad Sep 23, 2019
dbde479
Merge remote-tracking branch 'origin/branch-0.10' into type_list
jrhemstad Sep 23, 2019
020d393
Remove merge message from type_list.
jrhemstad Sep 23, 2019
1e0a686
Remove experimental tests.
jrhemstad Sep 23, 2019
22ce2d9
typo
jrhemstad Sep 23, 2019
521307f
Removed experimental tests from cmake
jrhemstad Sep 23, 2019
46cb971
Added tests for new type dispatcher.
jrhemstad Sep 24, 2019
ccdf131
Move old tests to legacy/
jrhemstad Sep 24, 2019
5cda050
Updated type lists to type_id instead of data_type
jrhemstad Sep 24, 2019
e77ec2a
Made data_type::id host/device callable.
jrhemstad Sep 24, 2019
07b9459
Default in class init
jrhemstad Sep 24, 2019
2eee48f
Removed print statement.
jrhemstad Sep 24, 2019
19c797c
Add device sync.
jrhemstad Sep 24, 2019
cca9a6c
Split column_utilities into header and source.
jrhemstad Sep 24, 2019
88e5f52
Update cpp/include/cudf/table/row_operators.cuh
jrhemstad Sep 24, 2019
1390e4a
Made column_test into cpp
jrhemstad Sep 24, 2019
551aed6
Moved null_order and order enums to types.hpp
jrhemstad Sep 24, 2019
30d3e95
Merge branch 'fea-ext-column-redesign' of github.com:jrhemstad/cudf i…
jrhemstad Sep 24, 2019
f3c3e0f
Add utility to verify bitwise equality of two buffers.
jrhemstad Sep 24, 2019
3627c1a
Made column_test into a cu file.
jrhemstad Sep 24, 2019
d627455
Added mr argument to sorted_order.
jrhemstad Sep 24, 2019
fcd6e83
Add feature for counting non-zero bits.
jrhemstad Sep 24, 2019
3a05c85
Add column test for copy ctor.
jrhemstad Sep 24, 2019
67bb2d7
Use rmm::device_scalar.
jrhemstad Sep 24, 2019
9525ccf
Rename bitmask "element" to "word".
jrhemstad Sep 25, 2019
882d81d
Add count bitmask tests.
jrhemstad Sep 25, 2019
6213ee3
Add functions for setting most/least significant bits.
jrhemstad Sep 25, 2019
e999142
Working count_set_bits_kernel.
jrhemstad Sep 25, 2019
227c256
Added more tests.
jrhemstad Sep 25, 2019
2622a31
Add check for negative start index.
jrhemstad Sep 25, 2019
71f6833
Update exception documentation
jrhemstad Sep 25, 2019
a696f70
const correctness.
jrhemstad Sep 25, 2019
3e0385e
Add on-demand null counting to column.
jrhemstad Sep 25, 2019
eff07cb
Add on-demand null count to column_view.
jrhemstad Sep 25, 2019
af4e4ea
Correct null count to use offset.
jrhemstad Sep 26, 2019
842c5e2
Remove noexcept from nullcount.
jrhemstad Sep 26, 2019
5bf666c
More count bitmask tests. Found bug.
jrhemstad Sep 26, 2019
87cb5c3
Add cudf::size_of to types.hpp
jrhemstad Sep 26, 2019
bb3f62f
Move size_of
jrhemstad Sep 26, 2019
7ff07ee
Overhaul and add new column tests w/ null_count.
jrhemstad Sep 26, 2019
7e91014
Removed incorrect noexcept.
jrhemstad Sep 26, 2019
9dfdb72
Updated iterators.
jrhemstad Sep 26, 2019
130af56
Remove noexcept.
jrhemstad Sep 26, 2019
935d87b
Add new test that breaks bit counting.
jrhemstad Sep 26, 2019
b5f59c7
Fixed count null mask.
jrhemstad Sep 26, 2019
7e972e3
Update naming for inclusive range.
jrhemstad Sep 26, 2019
3d66f89
Fix doc and thread_word_index.
jrhemstad Sep 27, 2019
d24c6b7
Added function for counting unset bits.
jrhemstad Sep 27, 2019
f43405d
Updated null counting to use count_unset_bits.
jrhemstad Sep 27, 2019
bfa99e6
Add tests for counting unset bits.
jrhemstad Sep 27, 2019
e5c3f25
More column tests.
jrhemstad Sep 27, 2019
1bcd68d
Move old table tests to legacy.
jrhemstad Sep 27, 2019
501579c
Move old table tests to legacy.
jrhemstad Sep 27, 2019
fa46581
Add column_view tests.
jrhemstad Sep 27, 2019
e7d690e
Update doc.
jrhemstad Sep 27, 2019
63dd7c9
CHANGELOG.
jrhemstad Sep 27, 2019
ce39c3c
Merge remote-tracking branch 'origin/branch-0.10' into fea-ext-column…
jrhemstad Sep 27, 2019
01f111a
Apply suggestions from code review
jrhemstad Sep 30, 2019
1e6d8f8
Update cpp/include/cudf/utilities/bit.cuh
jrhemstad Sep 30, 2019
bd1a419
Simplify null_count().
jrhemstad Sep 30, 2019
cb8b505
Merge branch 'fea-ext-column-redesign' of github.com:jrhemstad/cudf i…
jrhemstad Sep 30, 2019
b62921b
Update cpp/include/cudf/column/column.hpp
jrhemstad Sep 30, 2019
6f94ada
Use member functions instead of accessing private members.
jrhemstad Sep 30, 2019
606c9db
Merge branch 'fea-ext-column-redesign' of github.com:jrhemstad/cudf i…
jrhemstad Sep 30, 2019
ad6903b
Merge branch 'branch-0.10' into fea-ext-column-redesign
jrhemstad Sep 30, 2019
e578176
Update null_count to avoid creating view.
jrhemstad Sep 30, 2019
e1d9088
Correct function call for null_count().
jrhemstad Oct 1, 2019
745e792
Conclass -> Construct.
jrhemstad Oct 1, 2019
7fb7979
Merge remote-tracking branch 'origin/branch-0.10' into fea-ext-column…
jrhemstad Oct 1, 2019
f7fc79e
Merge branch 'branch-0.10' into fea-ext-column-redesign
jrhemstad Oct 1, 2019
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -29,10 +29,12 @@
- PR #2836 Add nvstrings.code_points method
- PR #2844 Add Series/DataFrame notnull
- PR #2858 Add GTest type list utilities
- PR #2207 Beginning of libcudf overhaul: introduce new column and table types
- PR #2838 CSV Reader: Support ARROW_RANDOM_FILE input
- PR #2655 CuPy-based Series and Dataframe .values property
- PR #2803 Added `edit_distance_matrix()` function to calculate pairwise edit distance for each string on a given nvstrings object.


## Improvements

- PR #2578 Update legacy_groupby to use libcudf group_by_without_aggregation
12 changes: 11 additions & 1 deletion cpp/CMakeLists.txt
@@ -417,7 +417,17 @@ add_library(cudf
src/copying/copy_range.cu
src/filling/fill.cu
src/filling/repeat.cu
src/search/search.cu)
src/search/search.cu
src/column/column.cu
src/column/column_view.cpp
src/column/column_device_view.cu
src/column/column_factories.cpp
src/table/table_view.cpp
src/table/table_device_view.cu
src/table/table.cpp
src/bitmask/null_mask.cu
src/sort/sort.cu
src/column/legacy/interop.cpp)

# Override RPATH for nvstrings
set_target_properties(libNVStrings PROPERTIES BUILD_RPATH "\$ORIGIN")
255 changes: 255 additions & 0 deletions cpp/include/cudf/column/column.hpp
@@ -0,0 +1,255 @@
/*
* Copyright (c) 2019, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <cudf/types.hpp>
#include "column_view.hpp"

#include <rmm/device_buffer.hpp>
#include <rmm/mr/device_memory_resource.hpp>

#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

namespace cudf {

class column {
public:
column() = default;
~column() = default;
column& operator=(column const& other) = delete;
column& operator=(column&& other) = delete;

/**---------------------------------------------------------------------------*
* @brief Construct a new column by deep copying the contents of `other`.
*
* All device memory allocation and copying is done using the
* `device_memory_resource` and `stream` from `other`.
*
* @param other The column to copy
*---------------------------------------------------------------------------**/
column(column const& other);

/**---------------------------------------------------------------------------*
* @brief Construct a new column object by deep copying the contents of
*`other`.
*
* Uses the specified `stream` and device_memory_resource for all allocations
* and copies.
*
* @param other The `column` to copy
* @param stream The stream on which to execute all allocations and copies
* @param mr The resource to use for all allocations
*---------------------------------------------------------------------------**/
column(column const& other, cudaStream_t stream,
rmm::mr::device_memory_resource* mr = rmm::mr::get_default_resource());

/**---------------------------------------------------------------------------*
* @brief Move the contents from `other` to create a new column.
*
* After the move, `other.size() == 0` and `other.type() == {EMPTY}`
*
* @param other The column whose contents will be moved into the new column
*---------------------------------------------------------------------------**/
column(column&& other);

/**---------------------------------------------------------------------------*
* @brief Construct a new column from existing device memory.
*
* @note This constructor is primarily intended for use in column factory
* functions.
*
* @param[in] dtype The element type
* @param[in] size The number of elements in the column
* @param[in] data The column's data
* @param[in] null_mask Optional, column's null value indicator bitmask. May
* be empty if `null_count` is 0 or `UNKNOWN_NULL_COUNT`.
* @param null_count Optional, the count of null elements. If unknown, specify
* `UNKNOWN_NULL_COUNT` to indicate that the null count should be computed on
* the first invocation of `null_count()`.
* @param children Optional, vector of child columns
*---------------------------------------------------------------------------**/
template <typename B1, typename B2 = rmm::device_buffer>
column(data_type dtype, size_type size, B1&& data, B2&& null_mask = {},
size_type null_count = UNKNOWN_NULL_COUNT,
std::vector<std::unique_ptr<column>>&& children = {})
: _type{dtype},
_size{size},
_data{std::forward<B1>(data)},
_null_mask{std::forward<B2>(null_mask)},
_null_count{null_count},
_children{std::move(children)} {}

/**---------------------------------------------------------------------------*
* @brief Construct a new column by deep copying the contents of a
* `column_view`.
*
* This accounts for the `column_view`'s offset.
*
* @param view The view to copy
* @param stream The stream on which all allocations and copies will be
* executed
* @param mr The resource to use for all allocations
*---------------------------------------------------------------------------**/
explicit column(
column_view view, cudaStream_t stream = 0,
rmm::mr::device_memory_resource* mr = rmm::mr::get_default_resource());

/**---------------------------------------------------------------------------*
* @brief Returns the column's logical element type
*---------------------------------------------------------------------------**/
data_type type() const noexcept { return _type; }

/**---------------------------------------------------------------------------*
* @brief Returns the number of elements
*---------------------------------------------------------------------------**/
size_type size() const noexcept { return _size; }

/**---------------------------------------------------------------------------*
* @brief Returns the count of null elements.
*
* @note If the column was constructed with `UNKNOWN_NULL_COUNT`, or if at any
* point `set_null_count(UNKNOWN_NULL_COUNT)` was invoked, then the
* first invocation of `null_count()` will compute and store the count of null
Review comment (Member):

For this and other column methods that launch kernels underneath, should they use the stream specified in the constructor or an optional stream argument?

Reply (Contributor, PR author):

Hm, the fact that null_count() can potentially launch a kernel isn't something I want people to usually have to think about. That said, a column doesn't have a stream member; its device_buffer members do, and I think it would be odd to use those.

So I guess it makes sense to add a defaulted stream parameter to this function, but it makes me feel a bit uncomfortable since null_count() is supposed to be masquerading as just a simple const accessor.

* elements indicated by the `null_mask` (if it exists).
*---------------------------------------------------------------------------**/
size_type null_count() const;

/**---------------------------------------------------------------------------*
* @brief Updates the count of null elements.
*
* @note `UNKNOWN_NULL_COUNT` can be specified as `new_null_count` to force
* the next invocation of `null_count()` to recompute the null count from the
* null mask.
*
* @throws cudf::logic_error if `new_null_count > 0 and nullable() == false`
*
* @param new_null_count The new null count.
*---------------------------------------------------------------------------**/
void set_null_count(size_type new_null_count);

/**---------------------------------------------------------------------------*
* @brief Indicates whether it is possible for the column to contain null
* values, i.e., it has an allocated null mask.
*
* This may return `false` iff `null_count() == 0`.
*
* May return true even if `null_count() == 0`. This function simply indicates
* whether the column has an allocated null mask.
*
* @return true The column can hold null values
* @return false The column cannot hold null values
*---------------------------------------------------------------------------**/
bool nullable() const noexcept { return (_null_mask.size() > 0); }

/**---------------------------------------------------------------------------*
* @brief Indicates whether the column contains null elements.
*
* @return true One or more elements are null
* @return false Zero elements are null
*---------------------------------------------------------------------------**/
bool has_nulls() const noexcept { return (null_count() > 0); }

/**---------------------------------------------------------------------------*
* @brief Returns the number of child columns
*---------------------------------------------------------------------------**/
size_type num_children() const noexcept { return _children.size(); }

/**---------------------------------------------------------------------------*
* @brief Returns a reference to the specified child
*
* @param child_index Index of the desired child
* @return column& Reference to the desired child
*---------------------------------------------------------------------------**/
column& child(size_type child_index) noexcept {
return *_children[child_index];
};

/**---------------------------------------------------------------------------*
* @brief Returns a const reference to the specified child
*
* @param child_index Index of the desired child
* @return column const& Const reference to the desired child
*---------------------------------------------------------------------------**/
column const& child(size_type child_index) const noexcept {
return *_children[child_index];
};

/**---------------------------------------------------------------------------*
* @brief Creates an immutable, non-owning view of the column's data and
* children.
*
* @return column_view The immutable, non-owning view
*---------------------------------------------------------------------------**/
column_view view() const;

/**---------------------------------------------------------------------------*
* @brief Implicit conversion operator to a `column_view`.
*
* This allows passing a `column` object directly into a function that
* requires a `column_view`. The conversion is automatic.
*
* @return column_view Immutable, non-owning `column_view`
*---------------------------------------------------------------------------**/
operator column_view() const { return this->view(); };

/**---------------------------------------------------------------------------*
* @brief Creates a mutable, non-owning view of the column's data and
* children.
*
* @note Creating a mutable view of a `column` invalidates the `column`'s
* `null_count()` by setting it to `UNKNOWN_NULL_COUNT`. The user can
* either explicitly update the null count with `set_null_count()`, or
* if not, the null count will be recomputed on the next invocation of
*`null_count()`.
*
* @return mutable_column_view The mutable, non-owning view
*---------------------------------------------------------------------------**/
mutable_column_view mutable_view();

/**---------------------------------------------------------------------------*
* @brief Implicit conversion operator to a `mutable_column_view`.
*
* This allows passing a `column` object into a function that accepts a
* `mutable_column_view`. The conversion is automatic.
*
* @note Creating a mutable view of a `column` invalidates the `column`'s
* `null_count()` by setting it to `UNKNOWN_NULL_COUNT`. For best performance,
* the user should explicitly update the null count with `set_null_count()`.
* Otherwise, the null count will be recomputed on the next invocation of
* `null_count()`.
*
* @return mutable_column_view Mutable, non-owning `mutable_column_view`
*---------------------------------------------------------------------------**/
operator mutable_column_view() { return this->mutable_view(); };

private:
data_type _type{EMPTY}; ///< Logical type of elements in the column
cudf::size_type _size{}; ///< The number of elements in the column
rmm::device_buffer _data{}; ///< Dense, contiguous, type erased device memory
///< buffer containing the column elements
rmm::device_buffer _null_mask{}; ///< Bitmask used to represent null values.
///< May be empty if `null_count() == 0`
mutable size_type _null_count{
UNKNOWN_NULL_COUNT}; ///< The number of null elements
std::vector<std::unique_ptr<column>>
_children{}; ///< Depending on element type, child
///< columns may contain additional data
};

} // namespace cudf
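
The documentation above describes a null count that is computed lazily: `null_count()` is a `const` accessor, yet it may compute and cache the count when the stored value is `UNKNOWN_NULL_COUNT`, and handing out a mutable view resets the cache. A minimal host-only sketch of that pattern follows. It is not the libcudf implementation: `toy_column`, `mutable_mask()`, and the private `count_unset_bits()` helper are hypothetical names, the mask is a `std::vector` rather than an `rmm::device_buffer`, the value `-1` for `UNKNOWN_NULL_COUNT` is assumed, and the bit convention (a set bit means "valid", least significant bit first) is also an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical host-only stand-ins; these are NOT the libcudf types.
using size_type = std::int32_t;
constexpr size_type UNKNOWN_NULL_COUNT{-1};  // assumed sentinel value

class toy_column {
 public:
  toy_column(size_type size, std::vector<std::uint32_t> null_mask = {},
             size_type null_count = UNKNOWN_NULL_COUNT)
      : _size{size},
        _null_mask{std::move(null_mask)},
        _null_count{null_count} {}

  // A column is nullable only if a null mask has been allocated.
  bool nullable() const noexcept { return !_null_mask.empty(); }

  // Looks like a plain accessor, but does O(size) work the first time it is
  // called with an unknown count. The result is cached in a mutable member.
  size_type null_count() const {
    if (_null_count == UNKNOWN_NULL_COUNT) {
      _null_count = nullable() ? count_unset_bits() : 0;
    }
    return _null_count;
  }

  // Lets a caller that already knows the count skip the recomputation.
  void set_null_count(size_type new_null_count) {
    _null_count = new_null_count;
  }

  // Handing out write access means the cached count can no longer be
  // trusted, so it is reset to "unknown".
  std::vector<std::uint32_t>& mutable_mask() {
    _null_count = UNKNOWN_NULL_COUNT;
    return _null_mask;
  }

 private:
  // Counts elements whose validity bit is 0 among the first _size bits.
  size_type count_unset_bits() const {
    assert(static_cast<std::size_t>((_size + 31) / 32) <= _null_mask.size());
    size_type unset{0};
    for (size_type i = 0; i < _size; ++i) {
      std::uint32_t const word = _null_mask[i / 32];
      if (((word >> (i % 32)) & 1u) == 0u) { ++unset; }
    }
    return unset;
  }

  size_type _size{};
  std::vector<std::uint32_t> _null_mask{};
  mutable size_type _null_count{UNKNOWN_NULL_COUNT};
};
```

In this sketch, a caller that modifies the mask through `mutable_mask()` can either call `set_null_count()` with a value it already knows or let the next `null_count()` call recompute it, which mirrors the trade-off described in the `mutable_view()` note.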