
Seeing performance differences of multi-threaded/coalesce/perfile Parquet reader type for a single file #1366

Closed
abellina opened this issue Dec 10, 2020 · 3 comments
Assignees
Labels
performance A performance related task/issue

Comments

@abellina
Collaborator

We ran into a case where a single Parquet file loaded an order of magnitude slower (800 ms vs. 80 ms) comparing the 0.3 snapshot of the plugin against 0.2.

But when we set spark.rapids.sql.format.parquet.reader.type=MULTITHREADED or spark.rapids.sql.format.parquet.reader.type=COALESCING, we see consistent or better results (<= 80 ms) in 0.3.

This seems interesting because it's a single 128 MB Parquet file, and a single batch comes out of it. So it looks like the reader has some overhead when left at the default AUTO. Note that the file was being read from hdfs://.
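The config above can also be set from an interactive Spark session; a minimal sketch, assuming a session named spark and an illustrative path:

```scala
// Override the RAPIDS plugin's Parquet reader type instead of the default AUTO.
// Valid values include MULTITHREADED, COALESCING, and PERFILE.
spark.conf.set("spark.rapids.sql.format.parquet.reader.type", "MULTITHREADED")

// Re-run the read that was slow under AUTO (path is illustrative, not the real one).
val df = spark.read.parquet("hdfs://namenode:9000/some/parquet/dir")
```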

The file in question was one of the Parquet parts of the web_sales table for TPC-DS at 1 TB, and the query where we saw a significant effect was q88. The file has decimals in it, and the test projected out the integer columns so that the GPU Parquet scan would be enabled:

val test = spark.read.parquet("hdfs://.../web_sales/ws_sold_date_sk=2450816")
test.select("ws_web_page_sk", "ws_order_number").write.mode("overwrite").parquet("test")
@abellina abellina added feature request New feature or request ? - Needs Triage Need team to review and classify labels Dec 10, 2020
@sameerz sameerz added performance A performance related task/issue and removed ? - Needs Triage Need team to review and classify feature request New feature or request labels Dec 15, 2020
@sameerz sameerz changed the title [FEA] Seeing performance differences of multi-threaded/coalesce/perfile Parquet reader type for a single file Seeing performance differences of multi-threaded/coalesce/perfile Parquet reader type for a single file Dec 15, 2020
@tgravescs
Collaborator

So when I run this on YARN and just look at the Parquet read stats on our YARN cluster, I see the opposite: the multi-threaded reader has a total scan time larger than AUTO = COALESCING. But when I run the entire TPC-DS query 9 on the YARN cluster using partitioned data, which from the comments I assume is what was done in raplab, the multi-threaded reader is faster. The issue here is the partitioning. If I pull in the PR to fix the coalescing reader with partitioning (#1200), then AUTO = COALESCING is much faster.

This was running the query with 10 GB of TPC-DS data that is partitioned:

  • multi-threaded = 288s
  • AUTO=coalescing = 420s
  • coalescing with the partitioning fix for issue #1200 = 164s

I ran q88 with similar results as well. Going by the path:
spark.read.parquet("hdfs:/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/web_sales/ws_sold_date_sk=2450816")

it certainly looks like partitioned data. I'm guessing this is what is happening in raplab as well, but we need to verify.

@tgravescs tgravescs self-assigned this Dec 17, 2020
@tgravescs
Collaborator

One thing we could do for the 0.3 release is default to the multi-threaded reader if we are worried that too many people use partitioned data.

@sameerz sameerz added this to the Dec 7 - Dec 18 milestone Dec 17, 2020
@tgravescs
Collaborator

tgravescs commented Dec 17, 2020

Looking at the logs of one application that was worse on raplab, we see that it's reading many partitioned files:

20/12/09 23:55:28 INFO GpuParquetMultiFilePartitionReaderFactory: Using the coalesce multi-file parquet reader, files: hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451111/part-00016-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452169/part-00041-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452147/part-00052-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451469/part-00068-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452209/part-00045-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451485/part-00188-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452184/part-00068-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451063/part-00005-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452501/part-00086-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452213/part-0005
2-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451815/part-00152-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452546/part-00108-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet task attemptid: 12158
20/12/09 23:55:28 INFO GpuParquetMultiFilePartitionReaderFactory: Using the coalesce multi-file parquet reader, files: hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451027/part-00135-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452467/part-00198-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2450886/part-00084-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452327/part-00143-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2450842/part-00051-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451003/part-00107-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451243/part-00078-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452407/part-00053-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451238/part-00038-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451661/part-0006
9-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451208/part-00133-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452337/part-00089-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2450835/part-00133-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452092/part-00194-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451555/part-00143-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451282/part-00157-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452122/part-00157-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451959/part-00084-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452051/part-00107-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451252/part-00198-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,h
dfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452487/part-00107-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451380/part-00084-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452296/part-00008-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet,hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2450930/part-00023-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet task attemptid: 12189

These end up being split into separate batches and thus are not efficient to transfer:

20/12/09 23:55:29 INFO MultiFileParquetPartitionReader: Partition values for the next file hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2450842/part-00051-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet doesn't match current hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2452327/part-00143-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet, splitting it into another batch!
20/12/09 23:55:29 INFO MultiFileParquetPartitionReader: Partition values for the next file hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2451003/part-00107-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet doesn't match current hdfs://host1:9000/data/tpcds_sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/store_sales/ss_sold_date_sk=2450842/part-00051-2e3057df-fc7e-4ba1-9a05-2147079e088e.c000.snappy.parquet, splitting it into another batch!
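The splitting behavior in the logs above can be illustrated with a small Scala sketch (not the plugin's actual code): a coalescing reader can only merge consecutive files into one batch while their partition values match, so interleaved partition keys force one small batch per file.

```scala
// Illustrative sketch only: FileSplit and coalesceBatches are hypothetical names.
case class FileSplit(path: String, partitionValue: String)

def coalesceBatches(files: Seq[FileSplit]): Seq[Seq[FileSplit]] =
  files.foldLeft(Vector.empty[Vector[FileSplit]]) { (batches, f) =>
    batches.lastOption match {
      case Some(last) if last.head.partitionValue == f.partitionValue =>
        batches.init :+ (last :+ f) // same partition value: extend current batch
      case _ =>
        batches :+ Vector(f)        // value changed: start a new batch
    }
  }
```

With the interleaved ss_sold_date_sk values seen in the logs, every file starts a new batch, which matches the repeated "splitting it into another batch!" messages.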

So I think this is a duplicate of #1200.
