fast_ellipsoid_splinetracer errors #5

Open
gradeeterna opened this issue Feb 19, 2025 · 10 comments

Comments

@gradeeterna

gradeeterna commented Feb 19, 2025

Hey, I'm getting this error when I try to train, any ideas?

python train.py data/berlin
Traceback (most recent call last):
  File "/home/grade/src/ever_training/train.py", line 17, in <module>
    from gaussian_renderer.ever import splinerender
  File "/home/grade/src/ever_training/gaussian_renderer/ever.py", line 20, in <module>
    from ever.splinetracers.fast_ellipsoid_splinetracer import trace_rays
  File "/home/grade/src/ever_training/ever/splinetracers/fast_ellipsoid_splinetracer.py", line 27, in <module>
    from build.splinetracer.extension import fast_ellipsoid_splinetracer_cpp_extension as sp
ImportError: /home/grade/src/ever_training/ever/build/splinetracer/extension/fast_ellipsoid_splinetracer_cpp_extension.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

Ubuntu 22.04, CUDA 12.6
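
For readers hitting the same ImportError, the mangled symbol itself carries a hint. The sketch below is only an illustration, not part of the repository: it decodes the leading nested name of the symbol and checks whether it references `std::__cxx11::basic_string`, which is how libstdc++ marks the newer C++11 string ABI (controlled by `_GLIBCXX_USE_CXX11_ABI`). The helper names `qualified_name` and `mentions_cxx11_abi` are made up for this example, and the decoder deliberately handles only the simple `_ZN<len><name>...E` prefix rather than the full Itanium mangling grammar.

```python
import re

# Symbol copied verbatim from the ImportError above.
SYMBOL = (
    "_ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112"
    "basic_stringIcSt11char_traitsIcESaIcEEE"
)


def qualified_name(mangled: str) -> str:
    """Decode only the leading nested name (hypothetical helper).

    Itanium mangling encodes each name component as <length><identifier>,
    so "3c10" means "c10" and "14torchCheckFail" means "torchCheckFail".
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("expected a nested Itanium-mangled name")
    pos, parts = 3, []
    while True:
        match = re.match(r"\d+", mangled[pos:])
        if match is None:  # reached the 'E' that closes the nested name
            break
        length = int(match.group())
        start = pos + match.end()
        parts.append(mangled[start:start + length])
        pos = start + length
    return "::".join(parts)


def mentions_cxx11_abi(mangled: str) -> bool:
    """True if the symbol refers to a type in the std::__cxx11 namespace."""
    return "__cxx11" in mangled


print(qualified_name(SYMBOL))      # c10::detail::torchCheckFail
print(mentions_cxx11_abi(SYMBOL))  # True
```

So the extension expects a `c10::detail::torchCheckFail` overload that takes a C++11-ABI `std::string`. One common cause of that symbol being missing is that the extension and the installed torch build disagree on the ABI setting, though other version mismatches can produce the same error.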

@half-potato
Owner

half-potato commented Feb 19, 2025

What is your torch version? Can you post the result of this command?

import torch
print(torch.__config__.show())

Also, can you send the CMake output?
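
A slightly fuller version of that diagnostic, as a sketch: it gathers the torch version, the CUDA version torch was built against, and whether torch was compiled with the C++11 ABI, which is the detail most relevant to the undefined-symbol error. The function name `collect_build_info` is hypothetical, and the import is guarded so the script still runs where torch is not installed.

```python
import importlib.util
import platform


def collect_build_info() -> dict:
    """Collect version details worth pasting into a bug report (hypothetical helper)."""
    info = {"python": platform.python_version(), "torch_installed": False}
    if importlib.util.find_spec("torch") is None:
        return info  # torch is not importable in this environment

    import torch

    info["torch_installed"] = True
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    info["cxx11_abi"] = torch.compiled_with_cxx11_abi()
    return info


for key, value in collect_build_info().items():
    print(f"{key}: {value}")
```

If `cxx11_abi` disagrees with the `_GLIBCXX_USE_CXX11_ABI` value used when building the extension, that is a likely candidate for the import failure.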

@gradeeterna
Author

gradeeterna commented Feb 19, 2025

Thanks, I was using torch 2.5 but couldn't get it to work by including the C++11 ABI in the CMake build. Downgrading to 2.4 seems to have fixed that issue.

But the new problem is that I run out of memory during "reading camera" at the start of training. I just tried with the latest commit, but it didn't help, and RAM was full after reading a few hundred images.

python train.py -s data/berlin --resolution 4

Changing resolution from 1 to 8 doesn't seem to make any difference.

I only have 32GB RAM and an RTX 3090 24GB here. Should it be usable with that? I can train 3DGS with the berlin dataset at downscale 4.

@gaetan-landreau

Hi,
I got the exact same error as you on the import from build.splinetracer.extension import fast_ellipsoid_splinetracer_cpp_extension as sp. Downgrading torch to 2.4 does indeed help, thanks! 👍

Regarding the reading_camera issue, what's the resolution of your images? I can't make a proper comparison with my own config since I have 64GB RAM, but I successfully loaded 169 images of shape 960x540 with a similar GPU. How many images do you have in your dataset?

@gradeeterna
Author

gradeeterna commented Feb 20, 2025

Hey, glad that worked for you!

I'm using the example zipnerf berlin dataset, which is 1500 7k images. I'm trying to use the downscales with -r 4 or 8, but that doesn't seem to change memory usage during the reading cameras step. I will test with some smaller and lower-res datasets, but most of my scenes are a similar size to the zipnerf ones.

@gaetan-landreau

Yes, 1500 images is quite significant. Even at -r 4 you're dealing with images that are almost 2k resolution. Since all images are loaded at once in the code (see /scene/dataset_readers.py), you're almost always going to hit OOM issues. Worth giving it a shot first with a scene of ~100 images at 2k res, if possible! At least to see if the code is running fine :)
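
A back-of-the-envelope estimate makes the scale concrete. This is a sketch under stated assumptions, not a measurement of the repository's actual memory use: the image dimensions, the float32 RGB storage, and the function name `estimate_image_bytes` are all assumptions chosen for illustration.

```python
def estimate_image_bytes(num_images: int, width: int, height: int,
                         channels: int = 3, bytes_per_value: int = 4,
                         copies: int = 1) -> int:
    """Memory needed to hold every decoded image in RAM at once."""
    return num_images * width * height * channels * bytes_per_value * copies


def to_gib(num_bytes: int) -> float:
    return num_bytes / 2**30


# ASSUMPTION: 1500 images of roughly 1750x1170 pixels after -r 4, held as
# float32 RGB. The real dimensions and dtype in the loader may differ.
single = estimate_image_bytes(1500, 1750, 1170)
doubled = estimate_image_bytes(1500, 1750, 1170, copies=2)

print(f"one copy of each image:   {to_gib(single):.1f} GiB")
print(f"two copies of each image: {to_gib(doubled):.1f} GiB")
```

Under these assumptions even a single copy of every image exceeds 32GB of RAM, and if the loader also keeps the full-resolution image before downscaling, the `-r` flag would not lower the peak at all.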

@gradeeterna
Author

I managed to train berlin at -r 8 by editing dataset_readers.py as mentioned in #4.

I also had to use the SIBR_viewers from 3DGS instead, as I had the same problems mentioned in the other issues.

Looks great even with the 1k downscales, look forward to trying with my own data now!

@half-potato
Owner

Sorry about that. Each image is being loaded twice. I’ll try to fix this ASAP.

@half-potato
Owner

Should be fixed in the latest commit.

@yuliangguo

yuliangguo commented Feb 20, 2025

@gradeeterna @gaetan-landreau @half-potato

Also see my report in #4. My quick fix for the doubled memory usage is:

dataset_readers.py

Image

camera_utils.py

Image
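
The screenshots of the actual change are not preserved in this text, so the following is not that fix. It is only a generic sketch of one way to avoid holding every decoded image in memory: keep the file path and decode on demand, so at most one decoded copy exists per access. The class `LazyImage` and the injected `loader` callable are hypothetical names, not part of the repository.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class LazyImage:
    """Holds only the path; the pixel data is decoded when requested."""
    path: str
    loader: Callable[[str], Any]  # e.g. a function that opens and downsizes the file

    def load(self) -> Any:
        # Nothing is cached here, so the decoded image can be garbage
        # collected as soon as the caller drops its reference.
        return self.loader(self.path)


# Usage with a stand-in loader that just records which paths were decoded.
decoded = []


def fake_loader(path: str) -> str:
    decoded.append(path)
    return f"pixels of {path}"


cameras = [LazyImage(f"img_{i}.jpg", fake_loader) for i in range(3)]
print(len(decoded))       # 0, nothing decoded while building the list
print(cameras[1].load())  # pixels of img_1.jpg
print(decoded)            # ['img_1.jpg']
```

The trade-off is repeated decoding cost on every access, which may or may not be acceptable during training.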

@half-potato
Owner

Cool. That change should be on the master branch now.

4 participants