Z-order is no-op for strings with identical prefix of length >= 14 #2844

cjolowicz · 2024-09-04T15:51:07Z

Environment

Delta-rs version:

0.19.1

Binding:
Python and Rust

Environment:

Cloud provider: local filesystem and R2
OS: Linux
Other:

Bug

What happened:

Apply z-order to a Delta Table on a column that contains strings with identical prefixes of at least 14 characters. The records in the new Parquet files retain their original order.

I initially witnessed this when z-ordering a large partition on ISO 8601 timestamps using delta-rs in Rust. I've since reproduced this with Python bindings and a small data frame using strings containing zero-padded integers (see repro below).

What you expected to happen:

The resulting Parquet files are ordered by the column specified for z-ordering.

How to reproduce it:

# test_zorder.py
import shutil
import pandas
from deltalake import write_deltalake, DeltaTable

def test_zorder() -> None:
    table = "a"
    field = "b"
    items = [f"{item:015}" for item in [2, 3, 1]]

    shutil.rmtree(table, ignore_errors=True)

    write_deltalake(table, pandas.DataFrame({field: items}))

    DeltaTable(table).optimize.z_order([field])

    sorted_items = DeltaTable(table).to_pyarrow_table().to_pydict()[field]

    assert sorted(items) == sorted_items

Run this with uv:

# caveat: this removes a directory named `a` from the current directory
uvx --with deltalake --with pandas pytest -vv test_zorder.py

Output:

========================= test session starts ==========================
platform linux -- Python 3.12.5, pytest-8.3.2, pluggy-1.5.0 -- /home/claudio/.cache/uv/archive-v0/A-uQ68p-4BWRUFltJ5Mv2/bin/python
cachedir: .pytest_cache
rootdir: ...
collected 1 item                                                       

test_zorder.py::test_zorder FAILED                               [100%]

=============================== FAILURES ===============================
_____________________________ test_zorder ______________________________

...
    
>       assert sorted(items) == sorted_items
E       AssertionError: assert ['000000000000001', '000000000000002', '000000000000003'] == ['000000000000002', '000000000000003', '000000000000001']
E         
E         At index 0 diff: '000000000000001' != '000000000000002'
E         
E         Full diff:
E           [
E         +     '000000000000001',
E               '000000000000002',
E               '000000000000003',
E         -     '000000000000001',
E           ]

test_zorder.py:20: AssertionError
======================= short test summary info ========================
FAILED test_zorder.py::test_zorder - AssertionError: assert ['000000000000001', '000000000000002', '000000000000003'] == ['000000000000002', '000000000000003', '000000000000001']
  
  At index 0 diff: '000000000000001' != '000000000000002'
  
  Full diff:
    [
  +     '000000000000001',
        '000000000000002',
        '000000000000003',
  -     '000000000000001',
    ]
========================== 1 failed in 0.36s ===========================

More details:

N/A

The text was updated successfully, but these errors were encountered:

cjolowicz · 2024-09-05T10:22:21Z

This seems to be an issue with how we compute the z-order key. Here's the output of df.show() during read_zorder when running the failing test above:

+-----------------+----------------------------------+
| b               | __zorder_key                     |
+-----------------+----------------------------------+
| 000000000000002 | 023030303030303030ff303030303030 |
| 000000000000003 | 023030303030303030ff303030303030 |
| 000000000000001 | 023030303030303030ff303030303030 |
+-----------------+----------------------------------+

Every row gets the same z-order key even though they have distinct values in the z-order column b.

Update: This happens because we only look at the first 16 bytes of each column to compute the z-order key.

delta-rs/crates/core/src/operations/optimize.rs

Lines 1426 to 1431 in de11e6b

    
           /// Creates a new binary array containing the zorder keys for the given columns 
        
           /// 
        
           /// Each value is 16 bytes * number of columns. Each column is converted into 
        
           /// its row binary representation, and then the first 16 bytes are taken. 
        
           /// These truncated values are interleaved in the array values. 
        
           pub fn zorder_key(columns: &[ArrayRef]) -> Result<ArrayRef, ArrowError> {

I've updated the issue description ("with identical prefixes of at least 14 characters").

This limitation is mentioned in the PR for the original z-order implementation: #1429 (comment)

cjolowicz · 2024-09-05T12:26:18Z

@wjones127 The z-order design document recommends an implementation that would avoid the issue of dropping bytes for long strings (decision 1, option 3). It seems we ended up going with option 1 instead, which does have that issue. Do you think option 3 is still a viable approach for us? Any pointers for how to implement this?

Another, simpler option would be to make the number of significant bytes per z-order column configurable.

cjolowicz added the bug Something isn't working label Sep 4, 2024

cjolowicz changed the title ~~Z-order is no-op for single column of strings with length >= 15~~ Z-order is no-op for single column of strings with identical prefix of length >= 14 Sep 5, 2024

cjolowicz changed the title ~~Z-order is no-op for single column of strings with identical prefix of length >= 14~~ Z-order is no-op for strings with identical prefix of length >= 14 Sep 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Z-order is no-op for strings with identical prefix of length >= 14 #2844

Z-order is no-op for strings with identical prefix of length >= 14 #2844

cjolowicz commented Sep 4, 2024 •

edited

Loading

cjolowicz commented Sep 5, 2024 •

edited

Loading

cjolowicz commented Sep 5, 2024 •

edited

Loading

Z-order is no-op for strings with identical prefix of length >= 14 #2844

Z-order is no-op for strings with identical prefix of length >= 14 #2844

Comments

cjolowicz commented Sep 4, 2024 • edited Loading

Environment

Bug

cjolowicz commented Sep 5, 2024 • edited Loading

cjolowicz commented Sep 5, 2024 • edited Loading

cjolowicz commented Sep 4, 2024 •

edited

Loading

cjolowicz commented Sep 5, 2024 •

edited

Loading

cjolowicz commented Sep 5, 2024 •

edited

Loading