
Showing content from https://diffengine.readthedocs.io/en/latest/autoapi/diffengine/datasets/index.html below:


diffengine.datasets — diffengine 1.0.0 documentation

Package Contents

Classes

class diffengine.datasets.HFControlNetDataset(dataset, image_column='image', condition_column='condition', caption_column='text', csv='metadata.csv', pipeline=(), cache_dir=None)[source]

Bases: torch.utils.data.Dataset

Dataset for huggingface datasets.

Args:
  • dataset (str): Dataset name or path to dataset.

  • image_column (str): Image column name. Defaults to ‘image’.

  • condition_column (str): Condition column name for ControlNet. Defaults to ‘condition’.

  • caption_column (str): Caption column name. Defaults to ‘text’.

  • csv (str): Caption csv file name when loading local folder. Defaults to ‘metadata.csv’.

  • pipeline (Sequence): Processing pipeline. Defaults to an empty tuple.

  • cache_dir (str, optional): The directory where the downloaded datasets will be stored. Defaults to None.

__len__()[source]

Get the length of dataset.

Returns:

The length of the filtered dataset.

Return type:

int

__getitem__(idx)[source]

Get item.

Get the idx-th image and data information of the dataset after applying self.pipeline.

Args:

idx (int): The index of self.data_list.

Returns:

dict: The idx-th image and data information of dataset after self.pipeline.

Parameters:

idx (int) –

Return type:

dict

Parameters:
  • dataset (str) –

  • image_column (str) –

  • condition_column (str) –

  • caption_column (str) –

  • csv (str) –

  • pipeline (collections.abc.Sequence) –

  • cache_dir (str | None) –
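
The interface documented above (__len__ plus a __getitem__ that runs each sample through pipeline) can be illustrated with a small stand-in. This is only a sketch: ToyControlNetDataset, uppercase_caption, and the output keys img, condition_img, and text are hypothetical names chosen for illustration, not diffengine's implementation, and plain strings stand in for images.

```python
from collections.abc import Callable, Sequence


class ToyControlNetDataset:
    """Hypothetical stand-in that mimics the documented interface only."""

    def __init__(
        self,
        records: list,
        image_column: str = "image",
        condition_column: str = "condition",
        caption_column: str = "text",
        pipeline: Sequence[Callable[[dict], dict]] = (),
    ) -> None:
        self.records = records
        self.image_column = image_column
        self.condition_column = condition_column
        self.caption_column = caption_column
        self.pipeline = pipeline

    def __len__(self) -> int:
        return len(self.records)

    def __getitem__(self, idx: int) -> dict:
        record = self.records[idx]
        # Output key names are assumptions made for this sketch.
        result = {
            "img": record[self.image_column],
            "condition_img": record[self.condition_column],
            "text": record[self.caption_column],
        }
        # Apply each transform in order; each takes and returns a dict.
        for transform in self.pipeline:
            result = transform(result)
        return result


def uppercase_caption(result: dict) -> dict:
    """Toy transform standing in for a real processing step."""
    return {**result, "text": result["text"].upper()}


toy = ToyControlNetDataset(
    records=[{"image": "a.png", "condition": "a_edge.png", "text": "a cat"}],
    pipeline=(uppercase_caption,),
)
sample = toy[0]
```

The point of the sketch is the ordering: columns are read by name first, and the pipeline sees the assembled dict afterwards.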

class diffengine.datasets.HFDataset(dataset, image_column='image', caption_column='text', csv='metadata.csv', pipeline=(), cache_dir=None)[source]

Bases: torch.utils.data.Dataset

Dataset for huggingface datasets.

Args:
  • dataset (str): Dataset name or path to dataset.

  • image_column (str): Image column name. Defaults to ‘image’.

  • caption_column (str): Caption column name. Defaults to ‘text’.

  • csv (str): Caption csv file name when loading local folder. Defaults to ‘metadata.csv’.

  • pipeline (Sequence): Processing pipeline. Defaults to an empty tuple.

  • cache_dir (str, optional): The directory where the downloaded datasets will be stored. Defaults to None.

__len__()[source]

Get the length of dataset.

Returns:

The length of the filtered dataset.

Return type:

int

__getitem__(idx)[source]

Get item.

Get the idx-th image and data information of the dataset after applying self.pipeline.

Args:

idx (int): The index of self.data_list.

Returns:

dict: The idx-th image and data information of dataset after self.pipeline.

Parameters:

idx (int) –

Return type:

dict

Parameters:
  • dataset (str) –

  • image_column (str) –

  • caption_column (str) –

  • csv (str) –

  • pipeline (collections.abc.Sequence) –

  • cache_dir (str | None) –
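
For the local-folder case, the csv argument names a caption file inside the dataset folder. The exact layout diffengine expects is not stated on this page, so the following is a sketch under one assumption: the CSV has a header row whose column names match image_column and caption_column, with the image column holding a path relative to the folder. The helper names write_metadata and read_metadata and the sample rows are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def write_metadata(folder: Path, rows: list, csv_name: str = "metadata.csv") -> Path:
    """Write a caption CSV whose headers match the default column names."""
    path = folder / csv_name
    with path.open("w", newline="", encoding="utf-8") as f:
        # Assumed headers: the defaults of image_column and caption_column.
        writer = csv.DictWriter(f, fieldnames=["image", "text"])
        writer.writeheader()
        writer.writerows(rows)
    return path


def read_metadata(path: Path) -> list:
    """Read the CSV back as a list of dicts, one per image."""
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


with tempfile.TemporaryDirectory() as tmp:
    metadata_path = write_metadata(
        Path(tmp),
        [
            {"image": "images/0001.png", "text": "a red bicycle"},
            {"image": "images/0002.png", "text": "a blue kettle"},
        ],
    )
    loaded = read_metadata(metadata_path)
```

If the columns in an existing file are named differently, the image_column and caption_column arguments are the documented way to point the dataset at them.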

class diffengine.datasets.HFDatasetPreComputeEmbs(*args, model='stabilityai/stable-diffusion-xl-base-1.0', text_hasher='text', device='cuda', proportion_empty_prompts=0.0, **kwargs)[source]

Bases: HFDataset

Dataset for huggingface datasets.

It differs from HFDataset in one respect:
  1. Text encoder embeddings are pre-computed to save memory.

Args:
  • model (str): Pretrained model name of Stable Diffusion XL. Defaults to ‘stabilityai/stable-diffusion-xl-base-1.0’.

  • text_hasher (str): Text embeddings hasher name. Defaults to ‘text’.

  • device (str): Device used to compute embeddings. Defaults to ‘cuda’.

  • proportion_empty_prompts (float): The probability of replacing a caption with empty text. Defaults to 0.0.

__getitem__(idx)[source]

Get item.

Get the idx-th image and data information of the dataset after applying self.train_transforms.

Args:

idx (int): The index of self.data_list.

Returns:

dict: The idx-th image and data information of dataset after self.train_transforms.

Parameters:

idx (int) –

Return type:

dict

Parameters:
  • model (str) –

  • text_hasher (str) –

  • device (str) –

  • proportion_empty_prompts (float) –
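
The two ideas behind this class, dropping captions with a fixed probability and encoding each caption once up front, can be sketched in plain Python. This is an illustration only: maybe_empty, precompute, and fake_encode are hypothetical names, fake_encode stands in for a real text encoder, and caching by caption text is an assumption about how the embeddings are keyed.

```python
import random


def maybe_empty(caption: str, proportion_empty_prompts: float, rng: random.Random) -> str:
    """Return "" with the given probability, otherwise the caption unchanged."""
    return "" if rng.random() < proportion_empty_prompts else caption


def precompute(captions, encode) -> dict:
    """Encode every distinct caption once and cache the result by its text."""
    cache = {}
    for caption in captions:
        if caption not in cache:
            cache[caption] = encode(caption)
    return cache


def fake_encode(text: str) -> list:
    """Placeholder encoder: a real one would return text-encoder embeddings."""
    return [len(text)]


rng = random.Random(0)
captions = ["a cat", "a dog", "a cat"]

# Probability 0.0 keeps every caption; probability 1.0 empties every caption.
kept = [maybe_empty(c, 0.0, rng) for c in captions]
dropped = [maybe_empty(c, 1.0, rng) for c in captions]

# The empty string is cached too, so dropped captions need no encoder call.
cache = precompute(captions + [""], fake_encode)
```

Once the cache exists, the text encoder no longer has to stay loaded during training, which is where the memory saving mentioned above would come from.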

class diffengine.datasets.HFDPODataset(dataset, image_columns=None, caption_column='text', label_column='label_0', csv='metadata.csv', pipeline=(), split='train', cache_dir=None)[source]

Bases: torch.utils.data.Dataset

DPO Dataset for huggingface datasets.

Args:
  • dataset (str): Dataset name or path to dataset.

  • image_columns (list[str]): Image column names. Defaults to [‘image’].

  • caption_column (str): Caption column name. Defaults to ‘text’.

  • label_column (str): Label column name of whether image_columns[0] is better than image_columns[1]. Defaults to ‘label_0’.

  • csv (str): Caption csv file name when loading local folder. Defaults to ‘metadata.csv’.

  • pipeline (Sequence): Processing pipeline. Defaults to an empty tuple.

  • split (str): Dataset split. Defaults to ‘train’.

  • cache_dir (str, optional): The directory where the downloaded datasets will be stored. Defaults to None.

__len__()[source]

Get the length of dataset.

Returns:

The length of the filtered dataset.

Return type:

int

__getitem__(idx)[source]

Get item.

Get the idx-th image and data information of the dataset after applying self.pipeline.

Args:

idx (int): The index of self.data_list.

Returns:

dict: The idx-th image and data information of dataset after self.pipeline.

Parameters:

idx (int) –

Return type:

dict

Parameters:
  • dataset (str) –

  • image_columns (list[str] | None) –

  • caption_column (str) –

  • label_column (str) –

  • csv (str) –

  • pipeline (collections.abc.Sequence) –

  • split (str) –

  • cache_dir (str | None) –
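
The role of label_column can be made concrete with a short sketch. It assumes that a truthy label means image_columns[0] is the preferred image and a falsy label means image_columns[1] is; how diffengine actually arranges the pair, and how it treats ties, is not stated here. The function name order_pair and the column names image_0 and image_1 are hypothetical.

```python
def order_pair(record: dict, image_columns: list, label_column: str = "label_0") -> tuple:
    """Return (preferred, rejected) images for one preference record."""
    first = record[image_columns[0]]
    second = record[image_columns[1]]
    # Assumption: a truthy label means image_columns[0] is the better image.
    if record[label_column]:
        return first, second
    return second, first


columns = ["image_0", "image_1"]
win_first = order_pair({"image_0": "a.png", "image_1": "b.png", "label_0": 1}, columns)
win_second = order_pair({"image_0": "a.png", "image_1": "b.png", "label_0": 0}, columns)
```

Putting the preferred image in a fixed position lets later steps treat position, rather than the label, as the preference signal.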

class diffengine.datasets.HFDreamBoothDataset(dataset, instance_prompt, image_column='image', dataset_sub_dir=None, class_image_config=None, class_prompt=None, pipeline=(), csv=None, cache_dir=None)[source]

Bases: torch.utils.data.Dataset

DreamBooth Dataset for huggingface datasets.

Args:
  • dataset (str): Dataset name.

  • instance_prompt (str): The prompt with identifier specifying the instance.

  • image_column (str): Image column name. Defaults to ‘image’.

  • dataset_sub_dir (str, optional): Dataset sub directory name.

  • class_image_config (dict): Settings for generating class images, with the following keys:

    • model (str): Pretrained model name of Stable Diffusion used to create training data of class images. Defaults to ‘runwayml/stable-diffusion-v1-5’.

    • data_dir (str): A folder containing the training data of class images. Defaults to ‘work_dirs/class_image’.

    • num_images (int): Minimum number of class images for the prior preservation loss. If there are not enough images already present in class_data_dir, additional images will be sampled with class_prompt. Defaults to 200.

    • recreate_class_images (bool): Whether to recreate all class images. Defaults to True.

  • class_prompt (Optional[str]): The prompt to specify images in the same class as provided instance images. Defaults to None.

  • pipeline (Sequence): Processing pipeline. Defaults to an empty tuple.

  • csv (str, optional): Image path csv file name when loading local folder. If None, the dataset will be loaded from image folders. Defaults to None.

  • cache_dir (str, optional): The directory where the downloaded datasets will be stored. Defaults to None.

default_class_image_config: dict

generate_class_image(class_image_config)[source]

Generate class images for prior preservation loss.

Parameters:

class_image_config (dict) –

Return type:

None

__len__()[source]

Get the length of dataset.

Returns:

The length of the filtered dataset.

Return type:

int

__getitem__(idx)[source]

Get item.

Get the idx-th image and data information of the dataset after applying self.pipeline.

Args:

idx (int): The index of self.data_list.

Returns:

dict: The idx-th image and data information of dataset after self.pipeline.

Parameters:

idx (int) –

Return type:

dict

Parameters:
  • dataset (str) –

  • instance_prompt (str) –

  • image_column (str) –

  • dataset_sub_dir (str | None) –

  • class_image_config (dict | None) –

  • class_prompt (str | None) –

  • pipeline (collections.abc.Sequence) –

  • csv (str | None) –

  • cache_dir (str | None) –
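
The class_image_config defaults and the "sample more images if there are not enough" rule described above can be sketched as follows. The default values are taken from the argument descriptions; everything else is an assumption for illustration, including the names DEFAULTS, resolve_config, and images_to_generate, the merge-over-defaults behaviour, and the reading that recreate_class_images=True regenerates the full set.

```python
from typing import Optional

# Defaults as listed in the class_image_config description.
DEFAULTS = {
    "model": "runwayml/stable-diffusion-v1-5",
    "data_dir": "work_dirs/class_image",
    "num_images": 200,
    "recreate_class_images": True,
}


def resolve_config(user_config: Optional[dict]) -> dict:
    """Overlay user-supplied keys on the defaults (assumed merge behaviour)."""
    return {**DEFAULTS, **(user_config or {})}


def images_to_generate(config: dict, existing: int) -> int:
    """Number of class images still to sample for the prior preservation loss."""
    # Assumption: recreating means all num_images are generated again.
    if config["recreate_class_images"]:
        return config["num_images"]
    # Otherwise only top up to the minimum, never below zero.
    return max(config["num_images"] - existing, 0)


default_config = resolve_config(None)
reuse_config = resolve_config({"recreate_class_images": False})

full_regen = images_to_generate(default_config, existing=150)
top_up = images_to_generate(reuse_config, existing=150)
nothing_needed = images_to_generate(reuse_config, existing=250)
```

Under these assumptions, setting recreate_class_images to False is what makes an existing folder of class images reusable across runs.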

class diffengine.datasets.HFESDDatasetPreComputeEmbs(forget_caption, model='stabilityai/stable-diffusion-xl-base-1.0', device='cuda', pipeline=())[source]

Bases: torch.utils.data.Dataset

Huggingface Erasing Concepts from Diffusion Models Dataset.

Dataset of huggingface datasets for Erasing Concepts from Diffusion Models.

Args:
  • forget_caption (str): The caption used to forget.

  • model (str): Pretrained model name of Stable Diffusion XL. Defaults to ‘stabilityai/stable-diffusion-xl-base-1.0’.

  • device (str): Device used to compute embeddings. Defaults to ‘cuda’.

  • pipeline (Sequence): Processing pipeline. Defaults to an empty tuple.

__len__()[source]

Get the length of dataset.

Returns:

The length of the filtered dataset.

Return type:

int

__getitem__(idx)[source]

Get the dataset item after applying self.pipeline.

Args:

idx (int): The index.

Returns:

dict: The idx-th data information of dataset after self.pipeline.

Parameters:

idx (int) –

Return type:

dict

Parameters:
  • forget_caption (str) –

  • model (str) –

  • device (str) –

  • pipeline (collections.abc.Sequence) –

