fix: lazy load transformers in HybridChunker #468

ryyhan · 2026-01-08T15:27:59Z

Description

Resolves #459.

Implements lazy loading for transformers and semchunk in HybridChunker. This prevents a RuntimeError when using docling-core with only the chunking-openai extra, which does not depend on transformers.

Note: This applies the same lazy-loading pattern accepted in docling PR #2826 (docling-project/docling#2826, "fix: transformers models lazy-loaded").

Changes

Moved semchunk import to _split_using_plain_text method.
Replaced explicit isinstance check for PreTrainedTokenizerBase with a module path string check (tokenizer.__module__) to avoid importing transformers.
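
For reviewers, here is a minimal sketch of the two techniques listed above. It is not the code from this PR: `lazy_import` and `is_hf_tokenizer` are hypothetical helper names, the error message is illustrative, and the check uses `type(tokenizer).__module__` (the class's module), which I am assuming is equivalent to the `tokenizer.__module__` check described here.

```python
import importlib
from types import ModuleType
from typing import Any


def lazy_import(name: str, hint: str = "") -> ModuleType:
    """Import a module only when it is first needed.

    Hypothetical helper: calling it inside a method (instead of importing
    at module top level) keeps the import cost and the dependency optional.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # Surface a clear error only at the point where the dependency is used.
        raise RuntimeError(f"'{name}' is required for this code path. {hint}".strip()) from exc


def is_hf_tokenizer(tokenizer: Any) -> bool:
    """Detect a transformers tokenizer by module path, without importing transformers."""
    module = getattr(type(tokenizer), "__module__", "") or ""
    return module == "transformers" or module.startswith("transformers.")


# Inside a method, the deferred import would then look like:
#     semchunk = lazy_import("semchunk", "Install the matching chunking extra.")
```

One trade-off of the string check: unlike `isinstance`, it does not match a tokenizer subclass defined in a user's own module, so that case would need separate handling.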

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
New and existing unit tests pass locally with my changes

github-actions · 2026-01-08T15:28:09Z

✅ DCO Check Passed

Thanks @ryyhan, all your commits are properly signed off. 🎉

dosubot · 2026-01-08T15:28:12Z

Related Documentation

Checked 6 published document(s) in 1 knowledge base(s). No updates required.

mergify · 2026-01-08T15:28:34Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

Signed-off-by: ryyhan <[email protected]>

codecov · 2026-01-12T14:49:28Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

ryyhan · 2026-01-12T15:21:28Z

Hi team,

This PR has passed all CI checks and is ready for review.

Summary:

Implements lazy loading for transformers in HybridChunker to prevent a RuntimeError when transformers is not installed.
Resolves #459 (Using docling[chunking-openai] raises import RuntimeError).

@dolfim-ibm Could you please take a look? Thanks!

ryyhan mentioned this pull request Jan 8, 2026

Using docling[chunking-openai] raises import RuntimeError #459

Open

ryyhan force-pushed the fix/issue-459-lazy-load-transformers branch from 879c4f9 to 0ed9445 · January 8, 2026 18:23

fix: lazy load transformers in HybridChunker

addf8bf

Signed-off-by: ryyhan <[email protected]>

ryyhan force-pushed the fix/issue-459-lazy-load-transformers branch from 0ed9445 to addf8bf · January 8, 2026 18:48
