feat: add postpone bucket (bucket=-2) write support for primary-key tables by JingsongLi · Pull Request #252 · apache/paimon-rust

JingsongLi · 2026-04-16T06:50:57Z

Purpose

Postpone bucket mode writes data in KV format without sorting or deduplication, deferring bucket assignment to background compaction. Files are written to the `bucket-postpone` directory and are invisible to normal reads until compacted.
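As a rough illustration of the layout described above, the sketch below maps a bucket value to a directory name and marks the postpone bucket as hidden from normal reads. The helper names `bucket_dir` and `visible_to_normal_reads` are hypothetical and are not taken from this PR, and the `bucket-N` naming for regular buckets is an assumption; only the value `-2` and the `bucket-postpone` directory come from the description.

```rust
/// Bucket value that selects postpone mode (from the PR title).
const POSTPONE_BUCKET: i32 = -2;

/// Hypothetical helper: directory name for a bucket inside a partition.
/// The `bucket-N` form for regular buckets is an assumption.
fn bucket_dir(bucket: i32) -> String {
    if bucket == POSTPONE_BUCKET {
        "bucket-postpone".to_string()
    } else {
        format!("bucket-{}", bucket)
    }
}

/// Hypothetical helper: postpone files stay hidden until compaction
/// rewrites them into real buckets.
fn visible_to_normal_reads(bucket: i32) -> bool {
    bucket != POSTPONE_BUCKET
}

fn main() {
    for bucket in [POSTPONE_BUCKET, 0, 3] {
        println!(
            "{} -> {} (visible: {})",
            bucket,
            bucket_dir(bucket),
            visible_to_normal_reads(bucket)
        );
    }
}
```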

Brief change log

Tests

API and Format

Documentation


jerry-024 · 2026-04-16T07:38:17Z

I found two behavior-compatibility issues compared with Java's postpone-bucket implementation:

1. In `crates/paimon/src/table/table_write.rs`, postpone files are always named with `...-s-0-w-...`. This is not compatible with Java. Java assigns a distinct `writeId` per writer and encodes it in the file name; the postpone compaction path later parses that `writeId` and keeps files from the same writer on the same reader so replay order matches writer-local production order. Hard-coding `s-0` removes that writer boundary entirely. Once multiple writers produce files for the same postpone partition, compaction / replay can no longer reconstruct the ordering assumptions used by Java, so conflicting PK records may resolve differently.
2. In `crates/paimon/src/table/postpone_file_writer.rs`, rolled files are closed asynchronously and `creation_time` is assigned with `Utc::now()` when the async close finishes. This is also incompatible with Java's ordering semantics. Java's postpone compaction sorts files by `DataFileMeta.creationTime` before replaying them, so `creationTime` is part of the effective replay-order contract. Here an earlier file can easily end up with a later `creation_time` than a later file, depending on close timing. That makes replay order nondeterministic and can again change the final result for PK conflicts.

So these are not just implementation differences; they change the behavioral assumptions that Java relies on for postpone-bucket replay / compaction.
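To make the second point concrete, here is a minimal simulation of how stamping at async-close time can invert replay order. Everything in it is hypothetical: `FileMeta`, the file names, and the millisecond values are made up for illustration, and stamping at roll time is only one possible remedy, not necessarily what this PR ends up doing.

```rust
/// Hypothetical, simplified stand-in for a data file's metadata.
#[derive(Debug, Clone)]
struct FileMeta {
    name: String,
    creation_time_ms: u64,
}

/// Replay order as described above: sort files by creation time.
fn replay_order(mut files: Vec<FileMeta>) -> Vec<String> {
    files.sort_by_key(|f| f.creation_time_ms);
    files.into_iter().map(|f| f.name).collect()
}

/// Simulate two rolled files whose async closes finish out of order.
/// `stamp_at_roll` selects when the timestamp is captured.
fn simulate(stamp_at_roll: bool) -> Vec<String> {
    // (name, time the file was rolled, time its async close finished)
    let events = [("file-a", 100u64, 250u64), ("file-b", 200u64, 220u64)];
    let files: Vec<FileMeta> = events
        .iter()
        .map(|&(name, rolled, closed)| FileMeta {
            name: name.to_string(),
            creation_time_ms: if stamp_at_roll { rolled } else { closed },
        })
        .collect();
    replay_order(files)
}

fn main() {
    // file-a was produced first, but its close finished last.
    println!("stamped at close: {:?}", simulate(false));
    println!("stamped at roll:  {:?}", simulate(true));
}
```

In this simulation, stamping at close time replays `file-b` before `file-a`, while stamping at roll time preserves production order.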

JingsongLi · 2026-04-16T07:45:00Z

Regarding the first question, writeId identifies which worker it is. Since Rust only has one worker, it's hardcoded as 0.
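The writer-boundary argument can be sketched as a grouping step. The struct and its fields below (`write_id`, `seq`) are hypothetical and stand in for whatever the compaction path recovers from the file name; the real name format is not reproduced here. With a single writer every file carries id 0 and lands in one group, which is the situation described in this reply.

```rust
use std::collections::BTreeMap;

/// Hypothetical view of a postpone file: which writer produced it and
/// its position in that writer's local production order.
struct PostponeFile {
    name: &'static str,
    write_id: u32,
    seq: u32,
}

/// Group files by writer, keeping each group in writer-local order.
fn group_by_writer(files: &[PostponeFile]) -> BTreeMap<u32, Vec<&'static str>> {
    let mut groups: BTreeMap<u32, Vec<(u32, &'static str)>> = BTreeMap::new();
    for f in files {
        groups.entry(f.write_id).or_default().push((f.seq, f.name));
    }
    groups
        .into_iter()
        .map(|(id, mut entries)| {
            // Sort by the writer-local sequence number.
            entries.sort();
            (id, entries.into_iter().map(|(_, name)| name).collect())
        })
        .collect()
}

fn main() {
    let files = [
        PostponeFile { name: "f2", write_id: 0, seq: 1 },
        PostponeFile { name: "f1", write_id: 0, seq: 0 },
        PostponeFile { name: "g1", write_id: 1, seq: 0 },
    ];
    for (id, names) in group_by_writer(&files) {
        println!("writer {}: {:?}", id, names);
    }
}
```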

JingsongLi · 2026-04-16T07:45:20Z

Regarding the second question, there is indeed a problem; I will fix it.

JingsongLi · 2026-04-16T07:52:16Z

Thanks @jerry-024, comment addressed.

jerry-024

+1

JingsongLi force-pushed the postpone branch from f5f99f0 to 4890e21 · April 16, 2026 06:53

Fix comment

b393c46

jerry-024 approved these changes Apr 16, 2026


JingsongLi merged commit d5dd8fc into apache:main Apr 16, 2026
8 checks passed

JingsongLi merged 2 commits into apache:main from JingsongLi:postpone
