Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Z-order is no-op for strings with identical prefix of length >= 14 #2844

Open
cjolowicz opened this issue Sep 4, 2024 · 2 comments
Open
Labels
bug Something isn't working

Comments

@cjolowicz
Copy link

cjolowicz commented Sep 4, 2024

Environment

Delta-rs version:

0.19.1

Binding:
Python and Rust

Environment:

  • Cloud provider: local filesystem and R2
  • OS: Linux
  • Other:

Bug

What happened:

Apply z-order to a Delta Table on a column that contains strings with identical prefixes of at least 14 characters. The records in the new Parquet files retain their original order.

I initially witnessed this when z-ordering a large partition on ISO 8601 timestamps using delta-rs in Rust. I've since reproduced this with Python bindings and a small data frame using strings containing zero-padded integers (see repro below).

What you expected to happen:

The resulting Parquet files are ordered by the column specified for z-ordering.

How to reproduce it:

# test_zorder.py
import shutil
import pandas
from deltalake import write_deltalake, DeltaTable

def test_zorder() -> None:
    table = "a"
    field = "b"
    items = [f"{item:015}" for item in [2, 3, 1]]

    shutil.rmtree(table, ignore_errors=True)

    write_deltalake(table, pandas.DataFrame({field: items}))

    DeltaTable(table).optimize.z_order([field])

    sorted_items = DeltaTable(table).to_pyarrow_table().to_pydict()[field]

    assert sorted(items) == sorted_items

Run this with uv:

# caveat: this removes a directory named `a` from the current directory
uvx --with deltalake --with pandas pytest -vv test_zorder.py

Output:

========================= test session starts ==========================
platform linux -- Python 3.12.5, pytest-8.3.2, pluggy-1.5.0 -- /home/claudio/.cache/uv/archive-v0/A-uQ68p-4BWRUFltJ5Mv2/bin/python
cachedir: .pytest_cache
rootdir: ...
collected 1 item                                                       

test_zorder.py::test_zorder FAILED                               [100%]

=============================== FAILURES ===============================
_____________________________ test_zorder ______________________________

...
    
>       assert sorted(items) == sorted_items
E       AssertionError: assert ['000000000000001', '000000000000002', '000000000000003'] == ['000000000000002', '000000000000003', '000000000000001']
E         
E         At index 0 diff: '000000000000001' != '000000000000002'
E         
E         Full diff:
E           [
E         +     '000000000000001',
E               '000000000000002',
E               '000000000000003',
E         -     '000000000000001',
E           ]

test_zorder.py:20: AssertionError
======================= short test summary info ========================
FAILED test_zorder.py::test_zorder - AssertionError: assert ['000000000000001', '000000000000002', '000000000000003'] == ['000000000000002', '000000000000003', '000000000000001']
  
  At index 0 diff: '000000000000001' != '000000000000002'
  
  Full diff:
    [
  +     '000000000000001',
        '000000000000002',
        '000000000000003',
  -     '000000000000001',
    ]
========================== 1 failed in 0.36s ===========================

More details:

N/A

@cjolowicz cjolowicz added the bug Something isn't working label Sep 4, 2024
@cjolowicz
Copy link
Author

cjolowicz commented Sep 5, 2024

This seems to be an issue with how we compute the z-order key. Here's the output of df.show() during read_zorder when running the failing test above:

+-----------------+----------------------------------+
| b               | __zorder_key                     |
+-----------------+----------------------------------+
| 000000000000002 | 023030303030303030ff303030303030 |
| 000000000000003 | 023030303030303030ff303030303030 |
| 000000000000001 | 023030303030303030ff303030303030 |
+-----------------+----------------------------------+

Every row gets the same z-order key even though they have distinct values in the z-order column b.

Update: This happens because we only look at the first 16 bytes of each column to compute the z-order key.

/// Creates a new binary array containing the zorder keys for the given columns
///
/// Each value is 16 bytes * number of columns. Each column is converted into
/// its row binary representation, and then the first 16 bytes are taken.
/// These truncated values are interleaved in the array values.
pub fn zorder_key(columns: &[ArrayRef]) -> Result<ArrayRef, ArrowError> {

I've updated the issue description ("with identical prefixes of at least 14 characters").

This limitation is mentioned in the PR for the original z-order implementation: #1429 (comment)

@cjolowicz cjolowicz changed the title Z-order is no-op for single column of strings with length >= 15 Z-order is no-op for single column of strings with identical prefix of length >= 14 Sep 5, 2024
@cjolowicz cjolowicz changed the title Z-order is no-op for single column of strings with identical prefix of length >= 14 Z-order is no-op for strings with identical prefix of length >= 14 Sep 5, 2024
@cjolowicz
Copy link
Author

cjolowicz commented Sep 5, 2024

@wjones127 The z-order design document recommends an implementation that would avoid the issue of dropping bytes for long strings (decision 1, option 3). It seems we ended up going with option 1 instead, which does have that issue. Do you think option 3 is still a viable approach for us? Any pointers for how to implement this?

Another, simpler option would be to make the number of significant bytes per z-order column configurable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant