Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbolic links in DebugLoc.File paths may have unexpected effects #24

Open
abbriggs opened this issue Jan 19, 2023 · 6 comments
Open

Symbolic links in DebugLoc.File paths may have unexpected effects #24

abbriggs opened this issue Jan 19, 2023 · 6 comments

Comments

@abbriggs
Copy link

abbriggs commented Jan 19, 2023

The DebugLoc for a optimization record is created from several sources:

  • the path from the root of the build to this record's associated source file
    • This is effectively controlled by the project's makefiles and how they invoke their build system.
  • the location of an #included file
    • This depends on how the source file invokes #include and how the include directories are specified by the makefiles.

Both of these may include . and .. in their paths.

Working with a few projects, I've noticed a couple of minor issues due to symlinks in DebugLocs:

  1. If a remark exists with a DebugLoc.File that begins with ., the HTML file name will still contain the . at the beginning. On Unix-like systems, this hides the HTML file from typical file explorers.
    • Example: ./common/foo.h ==> ._common_foo.h.html.
  2. If source files in multiple directories include the same header using relative paths, one HTML file will be created for each directory.
    • Example: ./dir1/foo.c and ./dir2/bar.c both contain #include <../lib/baz.h>. Two HTML files will be created: ._dir1_.._lib_baz.h.html and ._dir2_.._lib_baz.h.html.

I think both of these could be resolved by transforming the DebugLoc.File into an absolute path w.r.t the source_dir, but I'm not sure exactly where to do so. It seems like the code partially supports dumping remarks back into YAML, and it's likely that resolving these paths would remove compatibility (since dumping back to YAML would produce absolute paths).

@OfekShilon
Copy link
Owner

@abbriggs good catches and good solution direction - thanks. Will look into it soon.

@OfekShilon
Copy link
Owner

@abbriggs my commit name caused github to automatically close this. Can you please try the latest scripts and verify that it is solved for you?

@abbriggs
Copy link
Author

abbriggs commented Jan 23, 2023

@abbriggs my commit name caused github to automatically close this. Can you please try the latest scripts and verify that it is solved for you?

Both problems discussed in the issue body appear to be correct now - the remarks across multiple uses of the same header look like they're in the same file.

The commit does introduce an issue with remark processing, though:

2023-01-23 10:18:00,847 INFO Reading YAML files...
        32 of 32
2023-01-23 10:18:05,371 INFO Rendering index page...
2023-01-23 10:18:05,371 INFO   0 raw remarks
2023-01-23 10:18:05,371 WARNING Not generating report!
2023-01-23 10:18:05,371 INFO Ran for 0:00:04.523817

Likely due to these changes:

    @property
    def File(self):
-       return self.DebugLoc['File']
+       return os.path.abspath(self.DebugLoc['File'])

...

            if not annotate_external:
-               if os.path.isabs(remark.File):
+               if not remark.File.startswith(source_dir):
                    continue

os.path.abspath uses the current working directory to resolve direct relative paths, which will make the returned paths invalid unless your cwd is identical to the source directory.

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.getcwd()
'/home/abbriggs/optview2'
>>> os.path.abspath(".//foo/foo.c")
'/home/abbriggs/optview2/foo/foo.c'
>>>

Indeed, if I change directory to my source dir before running opt-viewer, the paths are correct:

2023-01-23 10:25:37,631 INFO Reading YAML files...
        32 of 32
2023-01-23 10:25:42,189 INFO Rendering index page...
2023-01-23 10:25:42,189 INFO   22803 raw remarks
2023-01-23 10:25:42,404 INFO   5508 unique source locations
2023-01-23 10:25:42,767 INFO Copying assets
2023-01-23 10:25:42,768 INFO Rendering HTML files...
        38 of 38
2023-01-23 10:25:43,902 INFO Done - check the index page at file:///home/abbriggs/optview2/html/index.html
2023-01-23 10:25:43,902 INFO Ran for 0:00:06.271743

I think it could be resolved if we could do something like:

    @property
    def File(self):
       return os.path.abspath(os.path.join(source_dir, self.DebugLoc['File'])

But Remark would need access to the source_dir for this. Perhaps Remark.canonicalize() could take the source_dir as an argument and modify the DebugLoc before interning the string?

@OfekShilon
Copy link
Owner

@abbriggs Sorry for the delay - this was reproduced and should now be fixed by applying abspath to source_dir (not to each individual DebugLoc.File). Please say if this fixes it for you too.

@abbriggs
Copy link
Author

@abbriggs Sorry for the delay - this was reproduced and should now be fixed by applying abspath to source_dir (not to each individual DebugLoc.File). Please say if this fixes it for you too.

Unfortunately, the aforementioned issue does not seem to be fixed for me.

Here's what my command line looks like:

(venv) abbriggs@[REDACTED]:~/optview2$ python ~/optview2/opt-viewer.py --demangler /usr/bin/llvm-cxxfilt --source-dir /home/abbriggs/zstd/lib --output-dir /home/abbriggs/optview2/after /home/abbriggs/zstd/lib/obj/conf_3
53f6c1de005deab6e57e59d300a06fd/dynamic
2023-02-13 10:23:22,571 INFO Reading YAML files...
        32 of 32
2023-02-13 10:23:26,745 INFO Rendering index page...
2023-02-13 10:23:26,745 INFO   0 raw remarks
2023-02-13 10:23:26,745 WARNING Not generating report! Please verify your --source-dir argument is exactly the path from which the compiler was invoked.
2023-02-13 10:23:26,745 INFO Ran for 0:00:04.174514
(venv) abbriggs@[REDACTED]:~/optview2$

Unless I change directory to the source_dir before running opt-viewer.py (or enable --annotate-external), the script fails to generate HTML files. It seems like the check at optrecord.py#L302 is still the issue.

@OfekShilon OfekShilon reopened this Feb 19, 2023
@OfekShilon OfekShilon reopened this Feb 19, 2023
@OfekShilon
Copy link
Owner

I apologize, my attempts at this were careless and caused the scripts to break - now reverted to working condition. I will take a stab at this with a different approach in the near future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants