Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pytorch 1.12 #122

Closed
wants to merge 4 commits into from
Closed

Pytorch 1.12 #122

wants to merge 4 commits into from

Conversation

hmaarrfk
Copy link
Contributor

@hmaarrfk hmaarrfk commented Jul 16, 2022

Closes #123 <--- Maybe merge that one instead
Closes #120
Closes #107 (Thank you @ngam)

Builds locally for:

  • CPU and linux64 bit.
  • GPU and linux64 (cuda 11.2)

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@ngam
Copy link
Contributor

ngam commented Jul 16, 2022

Please see #118 for earlier attempt(s). I am happy to help here, though I will likely be limited in my ability to diagnose the problem I faced in #118 ...

Thank you for opening this!

(Also, people like isuruf and others pointed out that we could try to build without differentiating mps variants --- I could not get that to work, but it should be doable. I personally vote for differentiating them until we figure out a better way...)

Edit 2: I expect all these builds to fail with the same issue in #118 except for the MPS ones according in your matrix. That's exactly where I got stuck...

pool:
vmImage: windows-2019
store_build_artifacts: true
vmImage: macos-12
Copy link
Contributor

@ngam ngam Jul 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of this, i.e. macos-12, we get a collection of the most recent macos SDKs exposed (included in this macos-12 image). The way the logic is set up upstream, is that it will ignore your USE_MPS flag in build_pytorch.sh and just automatically turn it on regardless. Hence, if you go through the logs, you will see MPS is on for all the builds...

  --   USE_LEVELDB           : OFF
  --   USE_LITE_PROTO        : OFF
  --   USE_LMDB              : OFF
  --   USE_METAL             : OFF
  --   USE_PYTORCH_METAL     : OFF
  --   USE_PYTORCH_METAL_EXPORT     : OFF
  --   USE_MPS               : ON
  --   USE_FFTW              : OFF
  --   USE_MKL               : ON
  --   USE_MKLDNN            : ON
  --   USE_NCCL              : OFF
  --   USE_NNPACK            : ON
  --   USE_NUMPY    

The logic upstream is here: https://github.com/pytorch/pytorch/blob/b370959da11fe1c5fe9a6a6335a9b10ab6ad0a39/CMakeLists.txt#L90-L124. I cannot follow it all.

I can try to help you with this OSX stuff as we go forward, but the point is: If we build them all with macos-12, we no longer need to differentiate them. Eventually, this means that we are building for MPS and then on the user's devices, it will work if the user has macos12.3 and the associated SDKs.

However, the issue I faced was such that, yes, the MPS flag gets activated, but then it fails in the build if the correct SDK is not exposed (note this is different from the initial exposure, because we control this in our builds as we set exactly the SDK expected). So somehow, the fact that the image is macos-12 indicates to the pytorch logic that it is expecting the recent SDKs (12+) and if we don't have them exposed in our matrix, the build fails.

Obviously, I could've been doing something so wrong, so maybe you will have better luck.

We currently have no mechanism to set different macos images for different builds. At least not that I am aware of... We potentially need to edit the pipelines yaml file ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing this out. I think maybe we can set MKS_FOUND=0 or something like that. I'm going to let these builds run since I think linux should now pass. I think we had been disabling MKLDNN for a while now.... unintentionally.

We will have to start to build with macos-12 conda-forge/conda-forge.github.io#1798 (comment) anyway....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think once i fixed the string parsing it claims to follow the flag:

  --   USE_PYTORCH_METAL_EXPORT     : OFF
  --   USE_MPS               : OFF
  --   USE_FFTW              : OFF
  --   USE_MKL               :
  --   USE_MKLDNN            : 0
  --   USE_NCCL              : OFF
  --   USE_NNPACK            : ON
  --   USE_NUMPY             : ON

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! So the problem was that one needed to negate the default, okay makes more senes. I didn't think of that before 😳

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main section said you built it locally for linux cpu? Did the find_path thing fix it? Wow!!!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think of that before

You did think of that. You just had your code two lines too early.

The main section said you built it locally for linux cpu? Did the find_path thing fix it? Wow!!!

Yeah, builds are quicker and I think it worked trough the docker process.

@hmaarrfk hmaarrfk marked this pull request as draft July 16, 2022 20:35
@hmaarrfk hmaarrfk mentioned this pull request Jul 16, 2022
5 tasks
@hmaarrfk hmaarrfk closed this Jul 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CHECKLIST FOR v1.12
4 participants