@ryyhan ryyhan commented Jan 8, 2026

Resolves #350

Fixes the loss of trailing zeros in Markdown table export by disabling automatic number parsing in tabulate.

Verified with a reproduction script.

docling-project#465)

* feat(DocItem): Add comments field for linking annotations to document items

Implements support for linking comments (from Word/PPT documents) to their
annotated content using the established FloatingItem/RefItem pattern.

Changes:
- Add `comments: List[RefItem]` field to DocItem class
- Update `_update_breadth_first_with_lookup()` to handle comment references on deletion
- Bump CURRENT_VERSION to 1.9.0
- Fix version comparison bug (string vs integer for minor version)
- Add 4 new tests for comments functionality
- Update test data files for new schema
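
The version comparison item in the list above describes a common class of bug. The sketch below is only an illustration of string versus integer comparison of a minor version; the function names are hypothetical and this is not the actual docling-core code.

```python
def minor_is_newer_buggy(a: str, b: str) -> bool:
    """Compare minor versions as strings (lexicographic, so '10' sorts before '9')."""
    return a.split(".")[1] > b.split(".")[1]


def minor_is_newer_fixed(a: str, b: str) -> bool:
    """Compare minor versions as integers."""
    return int(a.split(".")[1]) > int(b.split(".")[1])


print(minor_is_newer_buggy("1.10.0", "1.9.0"))  # False: "10" < "9" as strings
print(minor_is_newer_fixed("1.10.0", "1.9.0"))  # True: 10 > 9 as integers
```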

Closes: docling-project/docling#464
Related: docling-project/docling#2834

Signed-off-by: s1v4-d <[email protected]>

* improve comment Pydantic serialization

Signed-off-by: Panos Vagenas <[email protected]>

* add add_comment, update tests

Signed-off-by: Panos Vagenas <[email protected]>

* introduce fine-granular references with span ranges

Signed-off-by: Panos Vagenas <[email protected]>

* simplify last test

Signed-off-by: Panos Vagenas <[email protected]>

---------

Signed-off-by: s1v4-d <[email protected]>
Signed-off-by: Panos Vagenas <[email protected]>
Co-authored-by: Panos Vagenas <[email protected]>

github-actions bot commented Jan 8, 2026

DCO Check Failed

Hi @ryyhan, your pull request has failed the Developer Certificate of Origin (DCO) check.

This repository supports remediation commits, so you can fix this without rewriting history — but you must follow the required message format.


🛠 Quick Fix: Add a remediation commit

Run this command:

# Unable to auto-generate remediation message. Please check the DCO check details.
git push

🔧 Advanced: Sign off each commit directly

For the latest commit:

git commit --amend --signoff
git push --force-with-lease

For multiple commits:

git rebase --signoff origin/main
git push --force-with-lease

More info: DCO check report

mergify bot commented Jan 8, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviewers for test updates

This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
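
To check a PR title locally before pushing, the rule above can be approximated with Python's `re` module. This is a sketch that assumes Mergify's `~=` operator behaves like a regular-expression search; the helper name and sample titles are hypothetical.

```python
import re

# Pattern copied from the merge-protection rule above.
TITLE_RE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return TITLE_RE.search(title) is not None


for title in [
    "fix: preserve trailing zeros in Markdown tables",
    "feat(DocItem): Add comments field for linking annotations to document items",
    "Update table export",
]:
    print(f"{title!r}: {is_conventional(title)}")
```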

dosubot bot commented Jan 8, 2026

Related Documentation

Checked 6 published document(s) in 1 knowledge base(s). No updates required.

@ryyhan ryyhan force-pushed the fix/issue-350-trailing-zeros branch from 05a346c to 2a0b348 Compare January 8, 2026 19:31
@cau-git cau-git left a comment

@ryyhan Thanks for attempting a fix for #350. However, we must ensure it has no unwanted side effects in cases other than the ones observed in that issue. The original logic was surely written with some intention.

The branch may also need a rebase onto main: I see some unrelated changes in the diff, such as add_comment, which is already present on main.

Comment on lines 420 to 432
-    table_text = tabulate(rows[1:], headers=rows[0], tablefmt="github")
+    table_text = tabulate(
+        rows[1:],
+        headers=rows[0],
+        tablefmt="github",
+        disable_numparse=True,
+    )
 except ValueError:
     table_text = tabulate(
         rows[1:],
         headers=rows[0],
         tablefmt="github",
         disable_numparse=True,
     )

Now the try and except blocks call tabulate with exactly the same parameters, which no longer makes sense.
It looks like the original intention was to try tabulate with number parsing enabled, and to disable it only as a fallback when a ValueError occurs.

Development

Successfully merging this pull request may close these issues.

Issue with loss of trailing zeros in formatted numbers during PDF parsing