ryyhan commented Jan 8, 2026

Description

Resolves #459.

Implements lazy loading for transformers and semchunk in HybridChunker. This prevents RuntimeError when using docling-core with only the chunking-openai extra (which does not depend on transformers).

Note: This applies the same lazy-loading pattern that was accepted in docling PR #2826 (docling-project/docling#2826, "fix: transformers models lazy-loaded").

Changes

  • Moved semchunk import to _split_using_plain_text method.
  • Replaced explicit isinstance check for PreTrainedTokenizerBase with a module path string check (tokenizer.__module__) to avoid importing transformers.
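For reviewers, the two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `looks_like_hf_tokenizer` and `split_plain_text_lazily` are hypothetical, walking the class MRO is an assumption of this sketch (the PR itself checks `tokenizer.__module__`), and the `semchunk.chunkerify(...)` call assumes semchunk's documented entry point.

```python
import importlib


def looks_like_hf_tokenizer(tokenizer: object) -> bool:
    """Guess whether `tokenizer` comes from transformers without importing it.

    Hypothetical helper: inspects the module path of the object's class and
    its base classes instead of doing an isinstance check against
    PreTrainedTokenizerBase, which would require importing transformers.
    """
    for cls in type(tokenizer).__mro__:
        module = getattr(cls, "__module__", "") or ""
        # Match "transformers" or any submodule, but not e.g. "transformers_extra".
        if module == "transformers" or module.startswith("transformers."):
            return True
    return False


def split_plain_text_lazily(text: str, token_counter, chunk_size: int) -> list:
    """Import semchunk only when plain-text splitting is actually requested.

    Hypothetical stand-in for the method-level import described above.
    """
    try:
        # Deferred import: importing this module never pulls in semchunk.
        semchunk = importlib.import_module("semchunk")
    except ImportError as exc:
        raise RuntimeError(
            "Plain-text splitting needs the optional 'semchunk' dependency."
        ) from exc
    # Assumes semchunk.chunkerify(token_counter, chunk_size) returns a callable chunker.
    chunker = semchunk.chunkerify(token_counter, chunk_size)
    return chunker(text)
```

With this shape, importing the chunker module succeeds even when neither optional package is installed; a failure can only surface at the point where the optional feature is used.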

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • New and existing unit tests pass locally with my changes

github-actions bot commented Jan 8, 2026

DCO Check Passed

Thanks @ryyhan, all your commits are properly signed off. 🎉

dosubot bot commented Jan 8, 2026

Related Documentation

Checked 6 published document(s) in 1 knowledge base(s). No updates required.

mergify bot commented Jan 8, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
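To check a title locally before pushing, the rule above can be approximated with a short script. This is only a sketch: it assumes the `~=` operator behaves like a regular-expression search, and the helper name `is_conventional_title` and the sample titles are made up for illustration.

```python
import re

# Pattern copied from the merge protection rule above.
TITLE_PATTERN = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional_title(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return TITLE_PATTERN.search(title) is not None


if __name__ == "__main__":
    # Hypothetical titles, not taken from this repository.
    for title in ("fix: lazy-load transformers", "Update chunker"):
        print(title, "->", is_conventional_title(title))
```

The pattern is case-sensitive and requires the colon, so a title such as `Fix: typo` or `fix lazy loading` would not pass under this approximation.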

ryyhan force-pushed the fix/issue-459-lazy-load-transformers branch from 0ed9445 to addf8bf on January 8, 2026 18:48


codecov bot commented Jan 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

ryyhan commented Jan 12, 2026

Hi team,

This PR has passed all CI checks and is ready for review.

Summary:

@dolfim-ibm Could you please take a look? Thanks!

Development

Successfully merging this pull request may close these issues.

Using docling[chunking-openai] raises import RuntimeError