
Show partition metrics for custom shuffler reader #1060

Merged: 7 commits merged into NVIDIA:branch-0.3 on Nov 20, 2020

Conversation

@andygrove (Contributor) commented on Nov 3, 2020

This PR adds metrics to GpuCustomShuffleReader so that we can see the number and size of the resulting partitions.

[Screenshot: shuffle-reader-stats]
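For context, here is a minimal sketch, not taken from this PR's diff, of how partition count and size metrics can be declared and updated through Spark's SQLMetrics, the same accumulator-backed mechanism SparkPlan nodes report metrics with. The object name and the hard-coded partition sizes are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.metric.SQLMetrics

object PartitionMetricsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("partition-metrics-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Declare the two metrics much as a SparkPlan node would in its `metrics` map.
    val numPartitions = SQLMetrics.createMetric(sc, "number of partitions")
    val partitionSize = SQLMetrics.createSizeMetric(sc, "partition data size")

    // Hypothetical per-partition byte sizes; a real reader would obtain these
    // from the shuffle map output statistics rather than hard-coding them.
    val partitionBytes = Seq(1024L, 4096L, 16384L)
    numPartitions.add(partitionBytes.length)
    partitionBytes.foreach(partitionSize.add)

    println(s"partitions=${numPartitions.value}, totalBytes=${partitionSize.value}")
    spark.stop()
  }
}
```

In a real exec node the metric values would be filled from the reader's partition specs and the map output sizes rather than from a literal sequence.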

andygrove added the SQL (part of the SQL/Dataframe plugin) label on Nov 3, 2020
andygrove self-assigned this on Nov 3, 2020
andygrove changed the title from "[WIP] Show metrics for custom shuffler reader" to "[WIP] Show # input/outpout partitions for custom shuffler reader" on Nov 3, 2020
andygrove changed the title from "[WIP] Show # input/outpout partitions for custom shuffler reader" to "[WIP] Show # input/output partitions for custom shuffler reader" on Nov 3, 2020
andygrove changed the title from "[WIP] Show # input/output partitions for custom shuffler reader" to "[WIP] Show partitions metrics for custom shuffler reader" on Nov 4, 2020
andygrove changed the title from "[WIP] Show partitions metrics for custom shuffler reader" to "[WIP] Show partition metrics for custom shuffler reader" on Nov 4, 2020
andygrove changed the title from "[WIP] Show partition metrics for custom shuffler reader" to "Show partition metrics for custom shuffler reader" on Nov 4, 2020
jlowe previously approved these changes on Nov 10, 2020

@jlowe (Member) commented on Nov 10, 2020

build

abellina previously approved these changes on Nov 11, 2020

@sameerz (Collaborator) commented on Nov 15, 2020

build

@andygrove (Contributor, Author)

build

@andygrove (Contributor, Author)

Failed due to a timeout this time.

@andygrove (Contributor, Author)

build

@andygrove (Contributor, Author)

The ORC write test failed this time:

16:53:08  _________________________ test_buckets_write_fallback __________________________
16:53:08  [gw1] linux -- Python 3.6.12 /usr/bin/python
16:53:08  
16:53:08  spark_tmp_path = '/tmp/pyspark_tests//573108/'
16:53:08  spark_tmp_table_factory = <conftest.TmpTableFactory object at 0x7f3a20192240>
16:53:08  
16:53:08      @allow_non_gpu('DataWritingCommandExec')
16:53:08      def test_buckets_write_fallback(spark_tmp_path, spark_tmp_table_factory):
16:53:08          data_path = spark_tmp_path + '/ORC_DATA'
16:53:08          assert_gpu_fallback_write(
16:53:08                  lambda spark, path: spark.range(10e4).write.bucketBy(4, "id").sortBy("id").format('orc').mode('overwrite').option("path", path).saveAsTable(spark_tmp_table_factory.get()),
16:53:08                  lambda spark, path: spark.read.orc(path),
16:53:08                  data_path,
16:53:08  >               'DataWritingCommandExec')
16:53:08  
16:53:08  ../../src/main/python/orc_write_test.py:119: 
16:53:08  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
16:53:08  ../../src/main/python/asserts.py:275: in assert_gpu_fallback_write
16:53:08      assert_equal(from_cpu, from_gpu)
16:53:08  ../../src/main/python/asserts.py:86: in assert_equal
16:53:08      _assert_equal(cpu, gpu, float_check=get_float_check(), path=[])
16:53:08  ../../src/main/python/asserts.py:38: in _assert_equal
16:53:08      _assert_equal(cpu[index], gpu[index], float_check, path + [index])
16:53:08  ../../src/main/python/asserts.py:31: in _assert_equal
16:53:08      _assert_equal(cpu[field], gpu[field], float_check, path + [field])
16:53:08  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
16:53:08  
16:53:08  cpu = 6250, gpu = 75003

@jlowe (Member) commented on Nov 18, 2020

Need to upmerge this PR. That test was recently fixed.

@andygrove (Contributor, Author)

build

jlowe previously approved these changes on Nov 18, 2020

@jlowe (Member) left a comment:

Protip: If you merge to the latest on branch-0.3 rather than rebasing and force-pushing, GitHub is smart enough to not lose the PR approval. See #1145 as an example.

@andygrove (Contributor, Author)

Now failing to build against Spark 3.1.0 due to changes to the MapOutputTracker API. I may need to do some shim work. I will look at this tomorrow.
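For illustration, a rough sketch of the shim-layer idea mentioned above, assuming a small trait hides the version-specific MapOutputTracker differences behind a single method. The trait, class, and method names here are hypothetical and are not the plugin's actual shim API.

```scala
import org.apache.spark.SparkEnv

// Hypothetical shim interface: each supported Spark version supplies its own
// implementation so the reader code never touches a version-specific
// MapOutputTracker API directly.
trait ShuffleStatsShim {
  /** Size in bytes of each reduce partition in the given range, if available. */
  def getPartitionSizes(shuffleId: Int, startPartition: Int, endPartition: Int): Option[Array[Long]]
}

// Illustrative implementation for one Spark release.
class Spark310ShuffleStatsShim extends ShuffleStatsShim {
  override def getPartitionSizes(
      shuffleId: Int,
      startPartition: Int,
      endPartition: Int): Option[Array[Long]] = {
    val tracker = SparkEnv.get.mapOutputTracker
    // Placeholder result; a real shim would call the version-appropriate
    // MapOutputTracker method here to obtain per-partition output sizes.
    if (tracker == null) None else Some(Array.fill(endPartition - startPartition)(0L))
  }
}
```

The point of the pattern is that only the per-version class changes when a new Spark release alters the tracker API; the reader itself only depends on the trait.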

@andygrove (Contributor, Author)

build

@andygrove (Contributor, Author)

build

@andygrove (Contributor, Author)

Failed due to test_broadcast_nested_loop_join_special_case, which is now fixed in master. I will merge the fix here and rebuild.

@andygrove (Contributor, Author)

build

andygrove merged commit f66b41d into NVIDIA:branch-0.3 on Nov 20, 2020
andygrove deleted the shuffle-reader-metrics branch on Nov 20, 2020 at 03:38
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* Add partition metrics to custom shuffle reader

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* revert change

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* refactor to combine metrics and reader in single match statement

Signed-off-by: Andy Grove <andygrove@nvidia.com>

* use shim layer to get map output sizes

Signed-off-by: Andy Grove <andygrove@nvidia.com>
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
…IDIA#1060)

Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
Labels: SQL (part of the SQL/Dataframe plugin)
4 participants