diffengine.datasets.transforms
BaseTransform

Base class for all transformations.

Call function to transform data.
    Args:
        results (dict)
    Returns:
        dict | tuple[list, list] | None

Transform the data.
    The transform function. All subclasses of BaseTransform should override this method.
    This function takes the result dict as input and can add new items to the dict or modify existing items. The result dict is returned at the end, which allows multiple transforms to be concatenated into a pipeline.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None: The result dict.
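A minimal sketch of a custom transform built on this base class. It assumes the override point is the transform() method described above; the class name, the suffix, and the 'text' key are illustrative, and registration with the project's transform registry is omitted.

from diffengine.datasets.transforms import BaseTransform

class AddSuffixCaption(BaseTransform):
    # Hypothetical example: append a fixed suffix to the caption stored
    # under the 'text' key, then return the modified result dict so the
    # transform can be chained into a pipeline.
    def __init__(self, suffix: str = ' in szn style') -> None:
        self.suffix = suffix

    def transform(self, results: dict) -> dict:
        results['text'] = results['text'] + self.suffix
        return results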
Dump the image processed by the pipeline.

Args:
    max_imgs (int): Maximum number of images to dump.
    dump_dir (str): Dump output directory.

Dump the input image to the specified directory. No changes will be made to the data.
    Args:
        results (dict): Result dict from the loading pipeline.
    Returns:
        dict: Result dict from the loading pipeline (same as the input).
Dump the masked image processed by the pipeline.

Args:
    max_imgs (int): Maximum number of images to dump.
    dump_dir (str): Dump output directory.

Dump the input image to the specified directory. No changes will be made to the data.
    Args:
        results (dict): Result dict from the loading pipeline.
    Returns:
        dict: Result dict from the loading pipeline (same as the input).
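A minimal debugging sketch using the two dump transforms above, written in the config style shown elsewhere on this page. The registered names 'DumpImage' and 'DumpMaskedImage' are inferred from the descriptions and should be treated as assumptions:

debug_transforms = [
    dict(type='DumpImage', max_imgs=10, dump_dir='work_dirs/dump_img'),
    dict(type='DumpMaskedImage', max_imgs=10, dump_dir='work_dirs/dump_masked_img'),
]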
Bases: diffengine.datasets.transforms.BaseTransform

Pack the inputs data.

Required Keys:
    input_key
Deleted Keys:
    All other keys in the dict.

Args:
    input_keys (List[str]): The keys of the elements to feed into the model during forwarding. Defaults to ['img', 'text'].
    skip_to_tensor_key (List[str]): The keys of elements for which to_tensor is skipped. Defaults to ['text'].

Transform the data.
    Args:
        results (dict)
    Returns:
        dict
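A minimal config sketch for the packing transform above, assuming it is registered as 'PackInputs' (the registered name is not shown on this page); it is typically the last entry in a pipeline:

pack_inputs = dict(type='PackInputs', input_keys=['img', 'text'], skip_to_tensor_key=['text'])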
Bases: diffengine.datasets.transforms.base.BaseTransform

Load Mask for multiple types.

Copied from https://github.com/open-mmlab/mmagic/blob/main/mmagic/utils/trans_utils.py
Reference: mmagic.datasets.transforms.loading.LoadMask

For different types of mask, users need to provide the corresponding config dict.

Example config for bbox:
    config = dict(max_bbox_shape=128)

Example config for irregular:
    config = dict(
        num_vertices=(4, 12),
        max_angle=4.,
        length_range=(10, 100),
        brush_width=(10, 40),
        area_ratio_range=(0.15, 0.5))

Example config for ff:
    config = dict(
        num_vertices=(4, 12),
        mean_angle=1.2,
        angle_range=0.4,
        brush_width=(12, 40))

Args:
    mask_mode (str): Mask mode in ['bbox', 'irregular', 'ff', 'set', 'whole']. Default: 'bbox'.
        * bbox: square bounding box masks.
        * irregular: irregular holes.
        * ff: free-form holes from DeepFillv2.
        * set: randomly get a mask from a mask set.
        * whole: use the whole image as mask.
    mask_config (dict): Params for creating masks. Each type of mask needs different configs. Default: None.

Transform function.
    Args:
        results (dict): A dict containing the necessary information and data for augmentation.
    Returns:
        dict: A dict containing the processed data and information.
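A minimal pipeline-entry sketch for the bbox mode, using the example config shown above; the registered name 'LoadMask' is inferred from the mmagic reference and is an assumption:

load_mask = dict(type='LoadMask', mask_mode='bbox', mask_config=dict(max_bbox_shape=128))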
Bases: diffengine.datasets.transforms.base.BaseTransform

AddConstantCaption.

Append a constant caption to the text. Example: "a dog." -> "a dog. in szn style"

Args:
    constant_caption (str): The constant caption to add.
    keys (List[str], optional): Keys in results to apply the augmentation to. Defaults to None.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
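A minimal config sketch matching the caption example above; the 'text' key is an assumption:

add_caption = dict(type='AddConstantCaption', constant_caption='in szn style', keys=['text'])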
Bases: diffengine.datasets.transforms.base.BaseTransform

CenterCrop.

Saves the crop top-left as 'crop_top_left' and the crop bottom-right as 'crop_bottom_right' in results.

Args:
    size (sequence or int): Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made. If a sequence of length 1 is provided, it is interpreted as (size[0], size[0]).
    keys (List[str]): Keys in results to apply the augmentation to.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None: 'crop_top_left' key is added as the crop point.
Bases: diffengine.datasets.transforms.base.BaseTransform

CLIPImageProcessor.

Args:
    key (str): Key in results to apply the augmentation to. Defaults to 'img'.
    output_key (str): Output key after applying the augmentation to results. Defaults to 'clip_img'.
    pretrained (str | None): Pretrained model name or path.
    subfolder (str | None): Subfolder of the pretrained model to load the image processor from.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
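A minimal config sketch for CLIPImageProcessor with the documented defaults; the pretrained checkpoint name is only an illustrative assumption, not a value from this page:

clip_image_processor = dict(
    type='CLIPImageProcessor',
    key='img',
    output_key='clip_img',
    pretrained='openai/clip-vit-large-patch14')  # illustrative checkpoint name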
Bases: diffengine.datasets.transforms.base.BaseTransform

Compute aMUSEd micro_conds as 'micro_conds' in results.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None: 'micro_conds' key is added.
Bases: diffengine.datasets.transforms.base.BaseTransform

Compute original height and width plus aspect ratio.

Returns 'resolution' and 'aspect_ratio' in results.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None: 'resolution' and 'aspect_ratio' keys are added.
Bases: diffengine.datasets.transforms.base.BaseTransform

Compute time ids as 'time_ids' in results.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None: 'time_ids' key is added.
Bases: diffengine.datasets.transforms.base.BaseTransform

ConcatMultipleImgs.

Args:
    keys (List[str], optional): Keys in results to apply the augmentation to. Defaults to None.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
Bases: diffengine.datasets.transforms.base.BaseTransform

GetMaskedImage.

Args:
    key (str): Output key. Defaults to 'masked_image'.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
Bases: diffengine.datasets.transforms.base.BaseTransform

MaskToTensor.

Convert the mask to a tensor and transpose it from (H, W, 1) to (1, H, W).

Args:
    key (str): Key in results to apply the augmentation to. Defaults to 'mask'.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
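A minimal inpainting-style sub-pipeline sketch combining the mask transforms documented above (LoadMask, MaskToTensor, GetMaskedImage); the registered names, the ordering, and the assumption that GetMaskedImage reads the existing 'img' and 'mask' entries are inferences, not a reference configuration:

mask_pipeline = [
    dict(type='LoadMask', mask_mode='irregular', mask_config=dict(
        num_vertices=(4, 12), max_angle=4., length_range=(10, 100),
        brush_width=(10, 40), area_ratio_range=(0.15, 0.5))),
    dict(type='MaskToTensor', key='mask'),
    dict(type='GetMaskedImage', key='masked_image'),
]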
Bases: diffengine.datasets.transforms.base.BaseTransform

Multi Aspect Ratio Resize and Center Crop.

Args:
    sizes (List[sequence]): List of desired output sizes of the crop, each a sequence like (h, w).
    keys (List[str]): Keys in results to apply the augmentation to.
    interpolation (str): Desired interpolation enum defined by torchvision.transforms.InterpolationMode. Defaults to 'bilinear'.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
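A minimal config sketch for the multi-aspect-ratio transform; the registered name 'MultiAspectRatioResizeCenterCrop' is inferred from the title above, and the bucket sizes are illustrative values, not values from this page:

multi_aspect_crop = dict(
    type='MultiAspectRatioResizeCenterCrop',
    sizes=[(1024, 1024), (896, 1152), (1152, 896)],  # illustrative (h, w) buckets
    keys=['img'],
    interpolation='bilinear')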
Bases: diffengine.datasets.transforms.base.BaseTransform

RandomCrop.

1. Save the crop top-left as 'crop_top_left' and the crop bottom-right as 'crop_bottom_right' in results.
2. Apply the same random parameters to multiple keys, e.g. ['img', 'condition_img'].

Args:
    size (sequence or int): Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made. If a sequence of length 1 is provided, it is interpreted as (size[0], size[0]).
    keys (List[str]): Keys in results to apply the augmentation to.
    force_same_size (bool): Force the same size for all keys. Defaults to True.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None: 'crop_top_left' and 'crop_bottom_right' keys are added as the crop points.
Bases: diffengine.datasets.transforms.base.BaseTransform

RandomHorizontalFlip.

1. Update 'crop_top_left' and 'crop_bottom_right' if they exist.
2. Apply the same random parameters to multiple keys, e.g. ['img', 'condition_img'].

Args:
    p (float): Probability of the image being flipped. Default value is 0.5.
    keys (List[str]): Keys in results to apply the augmentation to.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None: The 'crop_top_left' key is updated to account for the flip.
Bases: diffengine.datasets.transforms.base.BaseTransform

RandomTextDrop. Replace the text with an empty string.

Args:
    p (float): Probability of the text being dropped. Default value is 0.5.
    keys (List[str]): Keys in results to apply the augmentation to.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
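A minimal config sketch for caption dropping, commonly used so the model also sees empty prompts during training (e.g. for classifier-free guidance); the 'text' key and the probability value are illustrative:

random_text_drop = dict(type='RandomTextDrop', p=0.1, keys=['text'])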
Bases: diffengine.datasets.transforms.base.BaseTransform

Save the image shape as 'ori_img_shape' in results.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None: 'ori_img_shape' key is added as the original image shape.
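A minimal end-to-end pipeline sketch tying together several transforms from this page: the saved 'ori_img_shape' and 'crop_top_left' feed into the time-ids computation, and the packing transform closes the pipeline. The registered names 'SaveImageShape', 'ComputeTimeIds', and 'PackInputs' are inferred from the descriptions above, and the exact transform order and size values are assumptions rather than a reference configuration:

train_pipeline = [
    dict(type='SaveImageShape'),                         # adds 'ori_img_shape'
    dict(type='torchvision/Resize', size=1024),          # wrapped torchvision transform
    dict(type='RandomCrop', size=1024),                  # adds 'crop_top_left'
    dict(type='RandomHorizontalFlip', p=0.5),            # updates 'crop_top_left'
    dict(type='ComputeTimeIds'),                         # adds 'time_ids'
    dict(type='torchvision/ToTensor'),
    dict(type='PackInputs', input_keys=['img', 'text']),
]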
Bases: diffengine.datasets.transforms.base.BaseTransform

T5 Text Preprocess.

Args:
    keys (List[str]): Keys in results to apply the augmentation to.
    clean_caption (bool): Whether to clean the caption. Defaults to False.

Clean caption.
    Copied from diffusers.pipelines.deepfloyd_if.pipeline_if.IFPipeline._clean_caption
    Args:
        caption (str)
    Returns:
        str

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
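A minimal config sketch; the registered name 'T5TextPreprocess' is inferred from the title above, and the 'text' key is an assumption:

t5_text_preprocess = dict(type='T5TextPreprocess', keys=['text'], clean_caption=True)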
Bases: diffengine.datasets.transforms.base.BaseTransform

TransformersImageProcessor.

Args:
    pretrained (str): Pretrained model name.
    key (str): Key in results to apply the augmentation to. Defaults to 'img'.
    output_key (str): Output key after applying the augmentation to results. Defaults to 'clip_img'.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
TorchVisonTransformWrapper.

We can use torchvision.transforms via configs like dict(type='torchvision/Resize', size=512).

Args:
    transform (str): The name of the transform, for example torchvision/Resize.
    keys (List[str]): Keys in results to apply the augmentation to.

Call transform.
    Args:
        results (dict)
    Returns:
        dict

Repr.
    Returns:
        str
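A minimal sketch of chaining wrapped torchvision transforms; the Resize entry mirrors the example above, while the ToTensor and Normalize entries (and their values) are assumptions about which torchvision transforms can be wrapped this way:

torchvision_transforms = [
    dict(type='torchvision/Resize', size=512),
    dict(type='torchvision/ToTensor'),
    dict(type='torchvision/Normalize', mean=[0.5], std=[0.5]),
]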
Bases: diffengine.datasets.transforms.base.BaseTransform

TransformersImageProcessor.

Args:
    key (str): Key in results to apply the augmentation to. Defaults to 'img'.
    output_key (str): Output key after applying the augmentation to results. Defaults to 'clip_img'.
    pretrained (str | None): Pretrained model name or path.

Transform.
    Args:
        results (dict): The result dict.
    Returns:
        dict | tuple[list, list] | None
Bases: diffengine.datasets.transforms.base.BaseTransform

Process data with a randomly chosen transform from given candidates.

Copied from mmcv/transforms/wrappers.py.

Args:
    transforms (list[list]): A list of transform candidates, each of which is a sequence of transforms.
    prob (list[float], optional): The probabilities associated with each pipeline. The length should equal the number of pipelines and the sum should be 1. If not given, a uniform distribution is assumed.

Examples:
    >>> # config
    >>> pipeline = [
    >>>     dict(type='RandomChoice',
    >>>         transforms=[
    >>>             [dict(type='RandomHorizontalFlip')],  # subpipeline 1
    >>>             [dict(type='RandomRotate')],          # subpipeline 2
    >>>         ]
    >>>     )
    >>> ]

Iterate over transforms.
    Returns:
        collections.abc.Iterator

Return a random transform index.
    Returns:
        int

Randomly choose a transform to apply.
    Args:
        results (dict)
    Returns:
        dict | None