
Support for image masks? #6

Open
gradeeterna opened this issue Feb 20, 2025 · 12 comments

@gradeeterna

Hey, are image masks currently supported? I would like to try training circular fisheye datasets which use OPENCV_FISHEYE model, but the black edges around the image circle need to be masked out for training.

I have binary mask .pngs in masks, masks_2 etc folders next to my images and sparse folders, with matching filenames to the images.

Thanks!

[image attachments]

@half-potato
Owner

There is some loading support for image masks based on the alpha channel of the image. However, some of the losses might need to be changed:

```python
Ll1 = l1_loss(image, gt_image)
```
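For example, restricting that L1 term to valid pixels could look like this minimal numpy sketch (the `masked_l1_loss` helper and its convention of 1 = valid pixel are assumptions for illustration, not the repo's actual code):

```python
import numpy as np

def masked_l1_loss(pred, gt, mask):
    """Mean absolute difference over valid (mask == 1) pixels only.

    pred, gt: float arrays of the same shape
    mask:     binary array of the same shape, 1 = keep pixel
    """
    diff = np.abs(pred - gt) * mask
    # Normalize by the number of valid pixels, not the full image area
    return diff.sum() / max(mask.sum(), 1)
```

The key detail is the denominator: dividing by the valid-pixel count keeps the loss magnitude comparable between images with large and small masked regions.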

Also, I’m not entirely sure whether that projection is supported by default. The area that handles different camera types is here:

```python
if camtype == ProjectionType.FISHEYE:
```
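For reference, OPENCV_FISHEYE is based on the equidistant model; a rough sketch of that projection (ignoring the k1..k4 distortion coefficients, with hypothetical intrinsics `fx, fy, cx, cy`) is:

```python
import math

def fisheye_project(X, Y, Z, fx, fy, cx, cy):
    """Equidistant fisheye projection of a camera-space point (distortion terms omitted)."""
    r = math.hypot(X, Y)
    theta = math.atan2(r, Z)  # angle between the ray and the optical axis
    scale = theta / r if r > 1e-12 else 0.0
    return fx * scale * X + cx, fy * scale * Y + cy
```

Because image radius grows with the ray angle theta rather than with tan(theta), the full field of view maps into a circle, which is exactly what leaves the black corners that need masking.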

@half-potato
Owner

I was procrastinating so I added masking. Tell me if it works. Also tell me if the FISHEYE projection matches your image.

@gradeeterna
Author

Thanks a lot, will test ASAP! Guessing I need to put my separate masks in the alpha channel instead, right?

@half-potato
Owner

Yeah

@gradeeterna
Author

Getting this error when I try to train with or without masks:

AttributeError: 'Camera' object has no attribute 'alpha_mask'

Think I've fixed it by changing line 158 in train.py from:

```python
if viewpoint_cam.alpha_mask is not None:
    alpha_mask = viewpoint_cam.alpha_mask.cuda()
    image *= alpha_mask
```

to:

```python
if viewpoint_cam.gt_alpha_mask is not None:
    alpha_mask = viewpoint_cam.gt_alpha_mask.cuda()
    image *= alpha_mask
```

and adding `self.gt_alpha_mask = gt_alpha_mask` to cameras.py

Will let you know if the masks are working and about this fisheye projection.
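Put together, the masked loss step being described could be sketched like this (numpy stand-ins for the CUDA tensors; the helper name and the idea of masking the ground truth too are assumptions for illustration):

```python
import numpy as np

def apply_alpha_mask(image, gt_image, alpha_mask):
    """Zero out masked-out pixels in both images so the loss ignores them.

    image, gt_image: C x H x W arrays
    alpha_mask:      1 x H x W binary array, or None when no mask exists
    """
    if alpha_mask is not None:
        image = image * alpha_mask
        gt_image = gt_image * alpha_mask
    return image, gt_image
```

Masking both the rendered and the ground-truth image matters: if only one side is zeroed, the loss would still penalize the model inside the black fisheye border.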

@gradeeterna
Author

Seems like both circular fisheye and masks are working! Not sure if the masks are somehow causing this blue weirdness you can see on the ground?

Also any settings you could recommend tweaking for massive scenes like this? It didn't turn out great with default settings, but definitely handled it way better than 3DGS!

@gradeeterna
Author

gradeeterna commented Feb 23, 2025

@half-potato
Owner

half-potato commented Feb 23, 2025

Woah. These are incredible results, to be honest! Especially the fisheye. I'm glad it works out of the box.

Are you using different exposures for different photos? If so, there are some extra steps that could be needed.

I'm not exactly sure what to tune, as you are probably running into the limits of what is possible with current technology. If you have a very large GPU, here is what I recommend to increase detail:

Scenes with many images tend to need a longer training time (--iterations 60000). I would recommend a longer densification time (--densify_until_iter 30000) too. If you start to run out of memory, increase the densification interval (--densification_interval 200).

Compared to the scenes I've been working with, the main difference is the actual distance to objects. This probably mostly impacts the distortion loss, which is likely why you are seeing weird blue colors through the floor. In theory, I believe PRE_MULTI should be scaled based on the scene, or at least on the average distance to objects. Unfortunately, I have not exposed the parameter, so it's buried in this file:
https://github.com/google/ever/blob/fe6811328991ab03edcff4ef0462292e81be5c3f/splinetracers/fast_ellipsoid_splinetracer/slang/spline-machine.slang#L18

I guess I could get around to fixing this, if you really need this feature.

The setting I usually change for larger scenes is --percent_dense, which might need to be lower on some larger scenes. Also, --position_lr_init 1e-5 --position_lr_final 1e-7 might help.
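Put together, the suggestions above could look something like the following illustrative command (the flag names come from the comments above; the scene path is a placeholder, and everything should be tuned per scene):

```shell
python train.py -s /path/to/scene \
    --iterations 60000 \
    --densify_until_iter 30000 \
    --densification_interval 200 \
    --position_lr_init 1e-5 \
    --position_lr_final 1e-7
```

(--percent_dense is scene-dependent with no single suggested value, so it is left at its default here.)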

@half-potato
Owner

Do you think I could share some of these videos on twitter?

@gradeeterna
Author

Thanks, so happy the fisheye and masking are working! I'll try tuning all of these to see if I can get better results on the huge scenes.

Yeah sure, just tried to DM you on there but it wouldn't let me. Follow me @ gradeeterna and I'll send over the videos.

@yuliangguo

@half-potato @gradeeterna May I ask what command and data preparation you ended up using to apply such masks in reconstruction? Any clue how render.py and metric.py should be adapted to apply masks during evaluation?

@gradeeterna
Author

gradeeterna commented Feb 28, 2025

@yuliangguo Masks need to be in the alpha channel of the images, and they are detected without adding anything to the training command. I am exporting separate binary masks from Metashape and then using this script to combine them into the alpha channel of the images. Not sure about the second question.

```
python maskcombiner.py imagespath maskspath combinedpath
```

```python
import os
import sys
from concurrent.futures import ProcessPoolExecutor
from functools import partial

from PIL import Image

def create_rgba_image(image_path, mask_path, output_path):
    # Open the image
    with Image.open(image_path) as image:
        image = image.convert('RGB')
        # Create a new RGBA image
        rgba = Image.new('RGBA', image.size)
        # Paste the RGB image into the RGBA image
        rgba.paste(image)

        if os.path.exists(mask_path):
            # Open the mask
            with Image.open(mask_path) as mask:
                mask = mask.convert('L')
                # Use the mask directly as the alpha channel
                rgba.putalpha(mask)
        else:
            # Create a fully opaque alpha channel
            alpha = Image.new('L', image.size, 255)
            rgba.putalpha(alpha)

        # Save the result
        rgba.save(output_path)

def process_image(image_file, input_folder, mask_folder, output_folder):
    base_name = os.path.splitext(image_file)[0]
    mask_file = base_name + '.png'
    image_path = os.path.join(input_folder, image_file)
    mask_path = os.path.join(mask_folder, mask_file)
    output_path = os.path.join(output_folder, base_name + '.png')

    create_rgba_image(image_path, mask_path, output_path)

    if os.path.exists(mask_path):
        return f"Processed with mask: {image_file}"
    else:
        return f"Processed without mask: {image_file}"

def process_all_images(input_folder, mask_folder, output_folder):
    # Create output folder if it doesn't exist
    os.makedirs(output_folder, exist_ok=True)

    # Get all jpg, png, and tiff files
    image_files = [f for f in os.listdir(input_folder) if f.lower().endswith(('.jpg', '.jpeg', '.png', '.tif', '.tiff'))]

    with ProcessPoolExecutor() as executor:
        process_func = partial(process_image, input_folder=input_folder, mask_folder=mask_folder, output_folder=output_folder)
        results = list(executor.map(process_func, image_files))

    with_mask_count = sum(1 for r in results if "with mask" in r)
    without_mask_count = sum(1 for r in results if "without mask" in r)

    print("\nSummary:")
    print(f"Total images processed: {len(results)}")
    print(f"Images with masks: {with_mask_count}")
    print(f"Images without masks: {without_mask_count}")

if __name__ == "__main__":
    if len(sys.argv) != 4:
        print("Usage: python maskcombiner.py <input_folder> <mask_folder> <output_folder>")
        sys.exit(1)

    input_folder, mask_folder, output_folder = sys.argv[1:4]
    process_all_images(input_folder, mask_folder, output_folder)
```
