aiuNN

Adaptive Image Upscaler using Neural Networks

aiuNN is an adaptive image upscaling model built on top of the Adaptive Image Intelligence Architecture (AIIA). This project provides fine-tuned versions of AIIA models specifically designed for high-quality image upscaling. By leveraging neural networks, aiuNN can significantly enhance the resolution and detail of images.

Features

  • High-Quality Upscaling: Achieve superior image quality with detailed and sharp outputs.
  • Fine-Tuned Models: Pre-trained on a diverse dataset to ensure optimal performance.
  • Easy Integration: Simple API for integrating upscaling capabilities into your applications.
  • Customizable: Fine-tune the models further on your own datasets for specific use cases.

Installation

Install aiuNN directly from the repository with pip:

pip install git+https://gitea.fabelous.app/Machine-Learning/aiuNN.git

Usage

Basic Example

Here's a basic example of how to fine-tune an aiuNN upscaler on top of a pretrained AIIA base model:

from aiia import AIIABase, AIIAConfig
# aiuNNConfig is used below; importing it from the package root is an assumption.
from aiunn import aiuNN, aiuNNConfig, aiuNNTrainer
import pandas as pd
from torchvision import transforms

# Create a configuration and build a base model.
config = AIIAConfig()
ai_config = aiuNNConfig()

base_model = AIIABase(config)
upscaler = aiuNN(config=ai_config)

# Load your base model and upscaler
pretrained_model_path = "path/to/aiia/model"
base_model = AIIABase.from_pretrained(pretrained_model_path)
upscaler.load_base_model(base_model)

# Create trainer with your dataset class
# (UpscaleDataset is defined in the "Dataset Class" section of this README)
trainer = aiuNNTrainer(upscaler, dataset_class=UpscaleDataset)

# Load data using parameters for your dataset
dataset_params = {
    'parquet_files': [
        "path/to/dataset1",
        "path/to/dataset2"
    ],
    'transform': transforms.Compose([transforms.ToTensor()]),
    'samples_per_file': 5000 # Your training samples you want to load per file
}
trainer.load_data(dataset_params=dataset_params, batch_size=1)

# Fine-tune the model
trainer.finetune(output_path="trained_model")
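
Each file listed in `parquet_files` needs the two columns that the `UpscaleDataset` class in this README reads: `image_410` for the low-resolution image and `image_820` for the high-resolution one, each stored either as raw encoded image bytes or as a base64 string. The following is a minimal sketch of preparing such rows. The helper names `build_row` and `write_parquet` are hypothetical and not part of aiuNN, and the placeholder bytes stand in for real image files.

```python
import base64

LOW_RES_COLUMN = "image_410"
HIGH_RES_COLUMN = "image_820"


def build_row(low_res_bytes, high_res_bytes, as_base64=False):
    """Build one dataset row from two encoded image files (e.g. PNG bytes)."""
    if as_base64:
        # Store the images as base64 text instead of raw bytes.
        low = base64.b64encode(low_res_bytes).decode("ascii")
        high = base64.b64encode(high_res_bytes).decode("ascii")
    else:
        low, high = low_res_bytes, high_res_bytes
    return {LOW_RES_COLUMN: low, HIGH_RES_COLUMN: high}


def write_parquet(rows, path):
    """Write rows to a Parquet file (needs pandas and an engine such as pyarrow)."""
    import pandas as pd  # imported lazily so build_row works without pandas

    pd.DataFrame(rows).to_parquet(path)


# Placeholder bytes stand in for real encoded image files here.
row = build_row(b"low-res image bytes", b"high-res image bytes", as_base64=True)
print(base64.b64decode(row[LOW_RES_COLUMN]))  # b'low-res image bytes'
```

In practice the two byte strings would come from reading a matching low-resolution and high-resolution image file, and the collected rows would be passed to `write_parquet` once per output file.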

Dataset Class

The UpscaleDataset class is designed to handle Parquet files containing image data. It loads a subset of images from each file and validates the data types to ensure consistency.

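import base64
import io

import pandas as pd
from PIL import Image, ImageFile
from torch.utils.data import Dataset

class UpscaleDataset(Dataset):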
    def __init__(self, parquet_files: list, transform=None, samples_per_file=10_000):
        combined_df = pd.DataFrame()
        for parquet_file in parquet_files:
            # Load a subset from each parquet file
            df = pd.read_parquet(parquet_file, columns=['image_410', 'image_820']).head(samples_per_file)
            combined_df = pd.concat([combined_df, df], ignore_index=True)

        # Validate rows (ensuring each value is bytes or str)
        self.df = combined_df.apply(self._validate_row, axis=1)
        self.transform = transform
        self.failed_indices = set()

    def _validate_row(self, row):
        for col in ['image_410', 'image_820']:
            if not isinstance(row[col], (bytes, str)):
                raise ValueError(f"Invalid data type in column {col}: {type(row[col])}")
        return row

    def _decode_image(self, data):
        try:
            if isinstance(data, str):
                return base64.b64decode(data)
            elif isinstance(data, bytes):
                return data
            raise ValueError(f"Unsupported data type: {type(data)}")
        except Exception as e:
            raise RuntimeError(f"Decoding failed: {str(e)}")

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        # If previous call failed for this index, use a different index
        if idx in self.failed_indices:
            return self[(idx + 1) % len(self)]
        try:
            row = self.df.iloc[idx]
            low_res_bytes = self._decode_image(row['image_410'])
            high_res_bytes = self._decode_image(row['image_820'])
            ImageFile.LOAD_TRUNCATED_IMAGES = True
            # Open image bytes with Pillow and convert to RGBA first
            low_res_rgba = Image.open(io.BytesIO(low_res_bytes)).convert('RGBA')
            high_res_rgba = Image.open(io.BytesIO(high_res_bytes)).convert('RGBA')

            # Create a new RGB image with black background
            low_res_rgb = Image.new("RGB", low_res_rgba.size, (0, 0, 0))
            high_res_rgb = Image.new("RGB", high_res_rgba.size, (0, 0, 0))

            # Composite the original image over the black background
            low_res_rgb.paste(low_res_rgba, mask=low_res_rgba.split()[3])
            high_res_rgb.paste(high_res_rgba, mask=high_res_rgba.split()[3])

            # Now we have true 3-channel RGB images with transparent areas converted to black
            low_res = low_res_rgb
            high_res = high_res_rgb

            # If a transform is provided (e.g. conversion to Tensor), apply it
            if self.transform:
                low_res = self.transform(low_res)
                high_res = self.transform(high_res)
            return low_res, high_res
        except Exception as e:
            print(f"\nError at index {idx}: {str(e)}")
            self.failed_indices.add(idx)
            return self[(idx + 1) % len(self)]
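
The fallback in `__getitem__` swaps a sample that fails to load for the next index and remembers the failure. The toy class below mirrors that logic with plain Python values to make the effect visible. `ToySkippingDataset` is a hypothetical stand-in, not part of aiuNN, and `None` marks a sample that cannot be decoded.

```python
class ToySkippingDataset:
    """Toy stand-in for UpscaleDataset: None marks an undecodable sample."""

    def __init__(self, samples):
        self.samples = samples
        self.failed_indices = set()

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Known-bad index: go straight to the next one, wrapping at the end.
        if idx in self.failed_indices:
            return self[(idx + 1) % len(self)]
        try:
            value = self.samples[idx]
            if value is None:
                raise ValueError("corrupt sample")
            return value
        except ValueError:
            # Remember the failure, then fall back to the neighbouring index.
            self.failed_indices.add(idx)
            return self[(idx + 1) % len(self)]


toy = ToySkippingDataset(["a", None, "c", None])
items = [toy[i] for i in range(len(toy))]
print(items)                       # ['a', 'c', 'c', 'a']
print(sorted(toy.failed_indices))  # [1, 3]
```

Two consequences follow from this pattern. The neighbour of a broken sample is returned more than once per pass over the data, and if every sample fails there is no valid index to land on, so the recursion ends in a `RecursionError`.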

Trainers

aiuNN provides two trainers: the standard aiuNNTrainer and the memory-optimized MemoryOptimizedTrainer. Choose between them based on your hardware and memory constraints.

Standard Trainer (aiuNNTrainer)

Use the standard trainer when you have sufficient memory resources and do not need advanced optimizations. This trainer is straightforward and easy to use for most basic training tasks.

from aiunn import aiuNNTrainer

# Create trainer with your dataset class
trainer = aiuNNTrainer(upscaler, dataset_class=UpscaleDataset)

# Load data using parameters for your dataset
dataset_params = {
    'parquet_files': [
        "path/to/dataset1",
        "path/to/dataset2"
    ],
    'transform': transforms.Compose([transforms.ToTensor()]),
    'samples_per_file': 5000 # Your training samples you want to load per file
}
trainer.load_data(dataset_params=dataset_params, batch_size=4)

# Fine-tune the model
trainer.finetune(output_path="trained_model")

Memory-Efficient Trainer (MemoryOptimizedTrainer)

Use the MemoryOptimizedTrainer when you have limited memory resources and need to optimize training for better efficiency. This trainer includes features like gradient accumulation, memory profiling, and aggressive memory cleanup.

from aiunn import MemoryOptimizedTrainer

# Replace your existing trainer with the optimized version
trainer = MemoryOptimizedTrainer(
    upscaler_model=upscaler,  # the aiuNN model built in the basic example
    dataset_class=UpscaleDataset,
    use_gradient_accumulation=True,
    accumulation_steps=4,  # Effective batch size = batch_size * 4
    use_memory_profiling=True,
    use_model_compilation=True
)

# Load data with memory optimizations
trainer.load_data(
    dataset_params=dataset_params,  # same dict as in the standard trainer example
    batch_size=2,  # Use smaller batch size with gradient accumulation
    val_batch_size=1  # Even smaller validation batches
)

# Train with optimizations
trainer.finetune(
    output_path="./output",
    epochs=10,
    lr=1e-4
)

# Get memory usage summary
memory_summary = trainer.get_memory_summary()
print(memory_summary)
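
The comment `Effective batch size = batch_size * 4` can be checked with a small, framework-free sketch. It assumes the usual formulation of gradient accumulation, in which each micro-batch gradient is scaled by `1 / accumulation_steps` and summed before a single optimizer step. Whether `MemoryOptimizedTrainer` scales its loss in exactly this way is not stated here, and the one-parameter model and data are made up for illustration.

```python
def mse_gradient(w, batch):
    """Gradient of the mean squared error for the model y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


# Illustrative (x, y) pairs; eight samples in total.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2),
        (5.0, 9.8), (6.0, 12.1), (7.0, 14.0), (8.0, 15.9)]
w = 0.5
batch_size = 2
accumulation_steps = 4

# Gradient of one large batch holding batch_size * accumulation_steps samples.
full_gradient = mse_gradient(w, data)

# The same gradient built from small batches: each micro-batch gradient is
# scaled by 1 / accumulation_steps and summed before the optimizer would step.
accumulated = 0.0
for step in range(accumulation_steps):
    micro_batch = data[step * batch_size:(step + 1) * batch_size]
    accumulated += mse_gradient(w, micro_batch) / accumulation_steps

effective_batch_size = batch_size * accumulation_steps
print(effective_batch_size)                     # 8
print(abs(full_gradient - accumulated) < 1e-9)  # True
```

Only one micro-batch has to be held in memory at a time, which is why a smaller `batch_size` combined with accumulation can stand in for a larger batch on constrained hardware.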

When to Use Which

  • Use aiuNNTrainer when:
    • You have sufficient GPU and RAM resources.
    • You prefer a straightforward training process without additional optimizations.
    • Your dataset is not excessively large, and memory constraints are not a concern.
  • Use MemoryOptimizedTrainer when:
    • You have limited GPU or RAM resources.
    • You need to train on larger datasets that may exceed your hardware capabilities.
    • You want to monitor memory usage and optimize training for better efficiency.