Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement workaround for apple-clang@15.x.y linker/loader issue #971

Closed
srherbener opened this issue Jan 26, 2024 · 14 comments · Fixed by #1031
Closed

Implement workaround for apple-clang@15.x.y linker/loader issue #971

srherbener opened this issue Jan 26, 2024 · 14 comments · Fixed by #1031
Assignees
Labels
INFRA JEDI Infrastructure

Comments

@srherbener
Copy link
Collaborator

srherbener commented Jan 26, 2024

Is your feature request related to a problem? Please describe.

Apple has an issue with the linker/loader (ld) that is shipped with apple-clang%15.x compilers of which can be worked around by requesting the older linker/loader functionality via the -Wl,-ld_classic compiler option. Apple is aware of this issue and has an internal ticket: FB13208302: "Building ffmpeg 6.0 with shared libraries broken with Xcode 15's default linker"

For some examples, see:
spack/spack#40187 (ffmpeg build)
spack/spack#42264 (hdf5 build)

This workaround also fixes the issue with apple-clang%15.x reported in #829 which reported errors during the configure step with the openmpi build.

There seems to be number of packages impacted by the apple-clang%15.x linker/loader issue: hdf5, openmpi, ffmpeg so far.

Option 2 has the advantage that we are only fixing known problems, and if it turns out to be a handful of packages it's really not that difficult to remove later on.

Describe the solution you'd like
I thing there are two options:

  1. Globally set LDFLAGS="-Wl,-ld_classic" when compiling with apple-clang%15.x to address anticipated issues
  2. Set LDFLAGS="-Wl,-ld_classic" on a case-by-case basis as issues are found

Option 1 has the advantage that it is easy to remove the workaround once apple has fixed the issue. But has the disadvantage that we are changing build process for a lot of packages that are not broken.

Option 2 has the advantage that we are only fixing the packages that are known to be broken, and if there are not too many of those it will be easy to remove the work around when apple fixes their linker/loader issue.

I'm leaning toward option 2 now. I'm a little bit concerned about the global setting in option 1.

What does everyone think?

Additional context

@mathomp4
Copy link
Collaborator

I'm thinking case-by-case is better. Mainly because it is...confusing as to when this is needed. For example, on my Mac's I've never needed it with ESMF builds, but a coworker does. We've never quite figured out why.

That said, I'm not sure it hurts to use it? I mean, if you don't need it and you enable it, it doesn't seem to make a build not work.

@srherbener
Copy link
Collaborator Author

This issue is related to #938.

@climbfuji
Copy link
Collaborator

I also think option 2 is better. We may want to keep track of the changes, or do a recursive search for this compiler flags to later constrain it to the "bad" version of apple-clang.

@srherbener
Copy link
Collaborator Author

@fabiolrdiniz thanks for closing the related issue (#938). Makes things simpler. Could you give details about the fix you used that successfully built on your Mac. Specifically, did the configure script honor the setting of LD_FLAGS, or did you have to edit the configure script? Thanks!

Everyone, thanks for your responses to the proposed solution. I'll go with option 2 (case-by-case) basis.

@fabiolrdiniz
Copy link
Contributor

@srherbener, thanks for asking. I played with the spack edit openmpi command, resulting in these differences:

$ cd spack-stack/spack
$ git diff
diff --git i/var/spack/repos/builtin/packages/openmpi/package.py w/var/spack/repos/builtin/packages/openmpi/package.py
index 3fcd245847..444f2f67b8 100644
--- i/var/spack/repos/builtin/packages/openmpi/package.py
+++ w/var/spack/repos/builtin/packages/openmpi/package.py
@@ -411,7 +411,7 @@ def patch(self):
         spec = self.spec
         if "+two_level_namespace" in spec and spec.satisfies("platform=darwin"):
             print("Applying configure patch for two_level_namespace on MacOS")
-            filter_file(r"-flat_namespace", "-commons,use_dylibs", "configure")
+            filter_file(r"-flat_namespace", "-ld_classic", "configure")
 
     variant(
         "fabrics",
@@ -514,7 +514,7 @@ def patch(self):
         default=False,
         description="""Build shared libraries and programs
 built with the mpicc/mpifort/etc. compiler wrappers
-with '-Wl,-commons,use_dylibs' and without
+with '-Wl,-ld_classic' and without
 '-Wl,-flat_namespace'.""",
     )

@srherbener
Copy link
Collaborator Author

@fabiolrdiniz thanks for the details. This is very helpful.

Everyone, we've got a PR (JCSDA/spack#401) in the works that syncs up our fork of spack with the authoritative spack repo. Since JCSDA/spack#401 will update the openmpi/package.py script and this fault is only impacting a few isolated cases (that we are aware of now), I think it makes sense to wait until JCSDA/spack#401 is merged, try the updated openmpi build, and if that still breaks then put in this workaround.

Unless there are any objections, I'll take this approach.

Thanks!

@climbfuji
Copy link
Collaborator

@fabiolrdiniz thanks for the details. This is very helpful.

Everyone, we've got a PR (JCSDA/spack#401) in the works that syncs up our fork of spack with the authoritative spack repo. Since JCSDA/spack#401 will update the openmpi/package.py script and this fault is only impacting a few isolated cases (that we are aware of now), I think it makes sense to wait until JCSDA/spack#401 is merged, try the updated openmpi build, and if that still breaks then put in this workaround.

Unless there are any objections, I'll take this approach.

Thanks!

Makes sense to me, thanks @srherbener

@fabiolrdiniz
Copy link
Contributor

@srherbener, nice to read that. Please, let me know if would be useful to test JCSDA/spack#401. I can try it on the side.

@climbfuji climbfuji changed the title Implement workaround for apple-clang%15.x linker/loader issue Implement workaround for apple-clang@15.x.y linker/loader issue Feb 6, 2024
@srherbener
Copy link
Collaborator Author

@fabiolrdiniz JCSDA/spack/pull/401 and jscda/spack-stack/977 have been merged. These PRs got us back in sync with the LLNL spack repo, which includes updates to the openmpi package.py script, and we also moved from openmpi%4.1.6 to openmpi%5.0.1. I don't know if this will fix the issue you are experiencing with the openmpi build, but I think it's worth a try testing it now. You will need to checkout the following branches:

  • jcsda/spack-stack develop
  • jcsda/spack jcsda_emc_spack_stack

Thanks!

@srherbener
Copy link
Collaborator Author

@fabiolrdiniz another note. I ran into this: #995 with the branches I noted in the previous comment. I had to workaround this by disabling the build of cylc through this change in the spack.yaml file:

  definitions:
  - compilers: ['%apple-clang']
  - packages:
    # Disable cylc variant until build issue is resolved
    # https://github.com/JCSDA/spack-stack/issues/995
    # - ewok-env +ecflow +cylc
    - ewok-env +ecflow ~cylc

Note on the ewok-env line: "+cylc" -> "~cylc"

@fabiolrdiniz
Copy link
Contributor

Thanks for providing the details, @srherbener. I'll give it a try and let you know.

@fabiolrdiniz
Copy link
Contributor

Just to document it here... my first tentative with develop didn't work when I tried. I need to come back to it and try again.

@climbfuji
Copy link
Collaborator

@srherbener Can you add the link to the apple ticket in the issue description please? It would be good to track the ticket. We should also update the documentation for spack-stack-1.7.0 to state that when you are using apple-clang@15.x.y (whatever the affected x and y are), you need to update the site compiler flags (i.e. use option 1 in your issue description above). We should be able to do this via spack config ... commands I believe. Thanks!

@srherbener
Copy link
Collaborator Author

The apple ticket number is: FB13208302, but I cannot get access to the ticket report. I am using my UCAR apple id and looking on the apple developer site. I am probably doing something wrong. Could someone help me get access to the apple ticket system so I can see the report? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
INFRA JEDI Infrastructure
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

6 participants