Fix TypeError in LongCatImageEditPipeline truncation warning #13526
Open
Ricardo-M-L wants to merge 1 commit into huggingface:main
`_encode_prompt` in `LongCatImageEditPipeline` calls `len()` twice on
`all_tokens` when logging the truncation warning:
    f" {self.tokenizer_max_length} input token nums : {len(len(all_tokens))}"
`len(all_tokens)` already returns an `int`, so the outer `len()` raises
`TypeError: object of type 'int' has no len()`. The failure occurs in
the only branch where this warning is emitted (prompts longer
than `tokenizer_max_length`, default 512), turning the intended
informational warning into a hard crash.
The sibling `LongCatImagePipeline._encode_prompt` has the correct
`{len(all_tokens)}` at line 291, so this is a typo local to the edit
pipeline. Minimal fix: drop the extra `len()` call.
Reproduces with any prompt whose tokenization exceeds 512 tokens:
    from diffusers import LongCatImageEditPipeline

    pipe = LongCatImageEditPipeline.from_pretrained(...)
    pipe(prompt="a very long prompt ..." * 200, image=...)
    # TypeError: object of type 'int' has no len()
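The same failure can be shown without loading any model weights. The sketch below is a stand-in, not the pipeline's code: `truncate_with_warning`, `buggy_warning`, `fixed_warning`, and the module-level `TOKENIZER_MAX_LENGTH` are hypothetical names, and the truncate-after-warning step is an assumption about what the surrounding branch does. Only the two f-string forms are taken from this PR.

```python
import logging

logger = logging.getLogger("longcat_truncation_demo")

TOKENIZER_MAX_LENGTH = 512  # default cited in the PR description


def buggy_warning(all_tokens):
    # len(all_tokens) is an int; the outer len() raises TypeError.
    return f" {TOKENIZER_MAX_LENGTH} input token nums : {len(len(all_tokens))}"


def fixed_warning(all_tokens):
    return f" {TOKENIZER_MAX_LENGTH} input token nums : {len(all_tokens)}"


def truncate_with_warning(all_tokens, build_message):
    """Warn and truncate when the token list exceeds the limit (assumed behaviour)."""
    if len(all_tokens) > TOKENIZER_MAX_LENGTH:
        # The message is built eagerly, so a formatting error surfaces here.
        logger.warning(build_message(all_tokens))
        all_tokens = all_tokens[:TOKENIZER_MAX_LENGTH]
    return all_tokens


long_prompt_tokens = list(range(600))
short_prompt_tokens = list(range(10))
```

With the buggy message builder, short prompts pass through untouched because the branch never runs, which is why the typo only shows up for prompts over the limit.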
What this PR does
`LongCatImageEditPipeline._encode_prompt` wraps `len(all_tokens)` in a second `len()` when building its truncation warning message: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/longcat_image/pipeline_longcat_image_edit.py#L284-L289
`len(all_tokens)` already returns an `int`, so the outer `len()` raises `TypeError: object of type 'int' has no len()` inside the f-string formatting every time the branch fires.
Why this is a real bug
The warning only executes when `len(all_tokens) > self.tokenizer_max_length` (default 512). That is the exact scenario the warning is supposed to inform the user about. Instead of informing them, the pipeline crashes with a `TypeError` during prompt encoding.
Minimal repro: any call to the pipeline with a prompt that tokenizes to more than 512 tokens.
The sibling `LongCatImagePipeline._encode_prompt` has the correct form at line 291; this is a local typo in the edit pipeline only.
Fix
Drop the extraneous `len()` call to match the sibling pipeline, so the message formats `{len(all_tokens)}`.
Before submitting
Who can review?
@yiyixuxu @sayakpaul