fix(modeling): remove tuple builtin shadow and raise on empty dtype in get_parameter_dtype #13963
Open
adhavan18 wants to merge 2 commits into
added 2 commits
June 15, 2026 12:44
…-shift (huggingface#13243)

FlowMatchEulerDiscreteScheduler.__init__ computed sigma_min and sigma_max from the already-shifted sigmas. When set_timesteps regenerated the sigma grid from those bounds via _sigma_to_t -> linspace -> /num_train_timesteps, it recovered the shifted values and then applied the shift formula a second time, producing a doubly-shifted (and therefore incorrect) schedule.

Fix: record sigma_min and sigma_max from the raw linear sigmas (timesteps / num_train_timesteps) before the shift formula is applied, so set_timesteps starts from the correct unshifted bounds and the shift is applied exactly once.

Regression test: test_set_timesteps_no_double_shift verifies that set_timesteps(num_inference_steps=1000) reproduces the same sigma grid that __init__ stored, for a scheduler with shift=3.0.
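The double shift described in this commit can be illustrated with a small pure-Python sketch. It is not the scheduler's code: `apply_shift` and `linspace` are hypothetical helpers, the shift formula `shift * s / (1 + (shift - 1) * s)` is an assumption about what the scheduler applies, and the grid is built in ascending order for simplicity.

```python
def apply_shift(sigmas, shift):
    # Assumed shift formula: shift * s / (1 + (shift - 1) * s).
    return [shift * s / (1 + (shift - 1) * s) for s in sigmas]


def linspace(start, stop, num):
    # Evenly spaced values from start to stop inclusive (pure-Python stand-in).
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


num_train_timesteps = 1000
shift = 3.0

# Raw linear sigmas (timesteps / num_train_timesteps) and the shifted grid
# that __init__ would store.
raw = linspace(1 / num_train_timesteps, 1.0, num_train_timesteps)
stored = apply_shift(raw, shift)

# Before the fix: the bounds come from the shifted sigmas, so regenerating the
# grid and shifting again applies the formula to already-shifted bounds.
buggy = apply_shift(linspace(min(stored), max(stored), num_train_timesteps), shift)

# After the fix: the bounds come from the raw sigmas, so the shift is applied once.
fixed = apply_shift(linspace(min(raw), max(raw), num_train_timesteps), shift)

max_err_buggy = max(abs(a - b) for a, b in zip(buggy, stored))
max_err_fixed = max(abs(a - b) for a, b in zip(fixed, stored))
print(max_err_buggy, max_err_fixed)
```

Under these assumptions the regenerated grid matches the stored one only when the bounds are taken before the shift, which is the property the regression test checks.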
…n get_parameter_dtype

`get_parameter_dtype` had two problems in its DataParallel fallback path:

1. The loop variable was named `tuple`, silently shadowing Python's built-in. Under some linters / runtime inspectors this causes unexpected behaviour and masks the real type annotation `list[tuple[str, Tensor]]` on `find_tensor_attributes`.
2. When all three search paths (layerwise hooks, named_parameters/buffers, and __dict__ tensor inspection) are exhausted without finding any tensors, the function falls off the end and returns `None` implicitly. Callers like `UNet2DModel.forward` then pass `dtype=None` to `tensor.to()`, which raises a cryptic `TypeError` or `UnboundLocalError` that is hard to trace back to `get_parameter_dtype`.

Fixes:
- Rename the loop variable `tuple` → `t` (and `last_tuple` → `last_t`) to un-shadow the built-in.
- Add an explicit `raise ValueError` with an actionable message when no dtype is found, instead of returning `None`. The message hints at the most common cause (model not moved to device before wrapping with `nn.DataParallel`).

Closes huggingface#13789
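As a general illustration of why a loop variable named `tuple` is hazardous, the sketch below uses hypothetical functions (`shadowed`, `renamed`, `outcome`) that are not part of the library. It does not claim the original function hit these errors; it only shows that the built-in becomes unreachable anywhere in a function that binds the name.

```python
def shadowed(pairs):
    # Binding `tuple` in the loop makes it a local name for the whole function.
    for tuple in pairs:
        pass
    # This no longer refers to the built-in type.
    return tuple(["x", 1])


def renamed(pairs):
    # With a neutral loop variable, the built-in stays reachable.
    for t in pairs:
        pass
    return tuple(["x", 1])


def outcome(fn, pairs):
    # Return the function's result, or the name of the exception it raised.
    try:
        return fn(pairs)
    except (TypeError, UnboundLocalError) as exc:
        return type(exc).__name__


print(outcome(shadowed, [("weight", 0)]))  # local `tuple` is a pair, not callable
print(outcome(shadowed, []))               # local `tuple` was never assigned
print(outcome(renamed, []))                # built-in works as expected
```

Renaming the variable removes both failure modes without changing what the loop does.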
Fixes #13789.
Problem

In `get_parameter_dtype`, the loop variable is named `tuple`, which shadows Python's built-in type. Also, if `gen` is exhausted without a floating-point tensor, the function falls off the end and implicitly returns `None` instead of raising.

Fix

- Rename `tuple`/`last_tuple` to `t`/`last_t`
- `raise ValueError` when no floating-point or complex dtype is found

Test

Existing tests pass.
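The shape of the fix can be sketched without PyTorch. Everything here is a stand-in: `FakeTensor` and `pick_dtype` are hypothetical names, the fallback to the last tensor's dtype is an assumption about how `last_t` is used, and the error message is illustrative rather than the PR's wording.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    # Hypothetical stand-in for a tensor: a dtype label and a floating-point flag.
    dtype: str
    floating: bool


def pick_dtype(named_tensors):
    last_t = None
    for t in named_tensors:  # `t`, not `tuple`, so the built-in stays usable
        last_t = t
        if t[1].floating:
            return t[1].dtype
    if last_t is not None:
        # Assumed fallback: use the dtype of the last tensor seen.
        return last_t[1].dtype
    # Raise explicitly instead of falling off the end and returning None.
    raise ValueError(
        "No tensors found, so no dtype could be determined. "
        "Was the model moved to a device before wrapping it?"
    )


print(pick_dtype([("ids", FakeTensor("int64", False)),
                  ("weight", FakeTensor("float16", True))]))
```

With the explicit raise, a caller sees the failure at the point where the dtype lookup comes up empty, rather than later when `None` is passed on as a dtype.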