@ryyhan ryyhan commented Jan 8, 2026

Description

Resolves #410.

Adds an optional page_no attribute to the TableCell class to support cross-page table data, as requested in #410. This lets upstream parsers record the page on which each cell of a multi-page table appears.

Changes

  • Modified TableCell in docling_core/types/doc/document.py to include an optional page_no field.
  • Updated the regression test gold file test/data/docling_document/unit/TableItem.yaml.
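
A minimal sketch of the shape of this change, using a simplified stand-in rather than the real class. The actual TableCell in docling_core is a pydantic model with more fields. The dataclass, the name TableCellSketch, the text/row/col fields, and the Optional[int] type are all assumptions made for illustration. Only the name page_no and its being optional come from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableCellSketch:
    """Hypothetical, simplified stand-in for docling_core's TableCell."""

    text: str
    row: int  # illustrative field name, not taken from the PR
    col: int  # illustrative field name, not taken from the PR
    # New optional attribute: the page on which this cell appears.
    # Defaults to None so callers that never set it keep working.
    page_no: Optional[int] = None


# A table spanning two pages: each cell records its own page.
cells = [
    TableCellSketch(text="Header", row=0, col=0, page_no=1),
    TableCellSketch(text="Continued row", row=12, col=0, page_no=2),
    TableCellSketch(text="Legacy cell", row=1, col=0),  # page unknown
]

# Collect the distinct pages the table touches, skipping unknown ones.
pages = sorted({c.page_no for c in cells if c.page_no is not None})
```

Defaulting the field to None means cells created without page information remain valid, which would be consistent with the change being listed as non-breaking.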

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • New and existing unit tests pass locally with my changes

@github-actions
Contributor

github-actions bot commented Jan 8, 2026

DCO Check Passed

Thanks @ryyhan, all your commits are properly signed off. 🎉

@dosubot

dosubot bot commented Jan 8, 2026

Related Documentation

Checked 6 published document(s) in 1 knowledge base(s). No updates required.


@mergify

mergify bot commented Jan 8, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviewers for test updates

This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
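
To check a title locally before pushing, the pattern from the rule above can be tried with Python's re module. This is a sketch under the assumption that the rule's ~= operator behaves like a regular-expression search. The helper name is_conventional and the sample titles are hypothetical and are not taken from this PR.

```python
import re

# Pattern copied verbatim from the merge-protection rule.
TITLE_PATTERN = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return TITLE_PATTERN.search(title) is not None


# Hypothetical titles, for illustration only.
examples = [
    "feat: add page_no to TableCell",   # plain type prefix
    "fix(types)!: correct cell span",   # with scope and breaking marker
    "Add page_no to TableCell",         # no type prefix
]
results = {title: is_conventional(title) for title in examples}
```

The first two sample titles match the pattern and the third does not, since it lacks a type prefix followed by a colon.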

@ceberam
Member

ceberam commented Jan 9, 2026

@ryyhan Thanks for this PR. I think it makes sense to keep track of the page number (when available) in table cells.
Could you please check that the PR passes the code and style checks?
Simply run uv run pre-commit run --all-files. You may also want to install pre-commit locally with uv run pre-commit install to set up the git hook scripts.

@ryyhan ryyhan force-pushed the fix/issue-410-add-page-no-to-table-cell branch from 54fd775 to 67365e1 on January 9, 2026 11:35

@codecov

codecov bot commented Jan 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Member

@ceberam ceberam left a comment

LGTM

@ryyhan
Author

ryyhan commented Jan 9, 2026

> LGTM

Thank you for the review and approval, @ceberam! I'm glad the changes look good.

@ryyhan
Author

ryyhan commented Jan 12, 2026

@dolfim-ibm / @PeterStaar-IBM could you also please take a look when you have a moment? We need a second approval for the test updates to merge this.

@PeterStaar-IBM
Member

@ryyhan I am not sure about this; it looks like a patch rather than a proper solution.

Development

Successfully merging this pull request may close these issues.

[Bee] need page_no on TableCell to support cross page table data
