amdgpu module crashes the kernel on R9 285 Tonga. #128

Closed
weabot opened this issue Mar 15, 2017 · 15 comments

Comments

@weabot

weabot commented Mar 15, 2017

Hello, I am on an amdgpu R9 285 (Tonga family) and loading the amdgpu.ko kernel module with kldload seems to crash the kernel.

To give some context, I installed the latest FreeBSD 12.0-CURRENT snapshot and installed the GENERIC_DRM-NODEBUG kernel cloned from the drm-next branch. I didn't modify the configuration. I loaded the drm module, then the amdgpu module. My GPU fan throttled for a second or so, then completely stopped. The screen went black.

Here are the logs. My guess is that the brief throttle was powerplay starting, but it then seems to get stuck in a loop where it fails to send messages to the card. On my end the screen is black and the kernel seems unresponsive. Loading the module at boot from the loader results in a page fault (supervisor read data, page not present).

Thank you for your work! :)

@nomadlogic

Hey there - I am not 100% certain about the state of support for the Tonga GPU on drm-next, but I know there is definitely active work happening to support AMD GPUs better.

It sounds like you are just copying the kernel config into the stock 12-CURRENT tree you have locally. This won't work, since the drm-next branch adds quite a bit of code that you'll need. You can follow the instructions here for reference to get your system built properly:

https://github.com/FreeBSDDesktop/freebsd-base-graphics/wiki#building-kernel-from-scratch

Basically, you'll need to check out the drm-next branch, then build a new world and kernel from that repository. We have also added a change that requires you to have llvm40 installed. This will speed up your build by quite a bit, since you will not need to build llvm40 during "buildworld".

@weabot
Author

weabot commented Mar 15, 2017

I cloned the tree directly from GitHub... I put it in /usr/drm instead of /usr/src, surely it can't be because of that?

@iotamudelta
Member

This seems to be a genuine startup problem; I still see a few of those myself on discrete AMD cards. @markjdb would probably be the right contact to tell you what data is needed to debug.

@weabot
Author

weabot commented Mar 15, 2017

However, I didn't do a build/install world because I thought the kernel would be enough. Let me try that and report back.

@markjdb

markjdb commented Mar 15, 2017

Is this a regression?

@iotamudelta
Member

I doubt it. There were still lingering problems for a few discrete cards (including the Fury Nanos and S9150s I have access to). This is likely one of these cases.

@weabot
Author

weabot commented Mar 15, 2017

No changes with the new world. I didn't expect any, since this is definitely not a userspace issue, but I'm just covering my bases.

You guys seem to have an idea what's wrong though.

@nomadlogic

Looking at the logfile from your original message, this looks potentially interesting:

Mar 15 12:34:03 TJULP kernel: [drm:gfx_v8_0_ring_test_ring] amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
Mar 15 12:34:03 TJULP kernel: [drm:amdgpu_init] hw_init of IP block <gfx_v8_0> failed -22
Mar 15 12:34:03 TJULP kernel: drmn0: amdgpu_init failed

I'll defer to others actually hacking on the code though...
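
For anyone triaging similar logs, the three lines quoted above can be picked apart mechanically. The following is a minimal sketch, not part of the driver or of any FreeBSD tool: the function name `summarize_amdgpu_failures`, the two regular expressions, and the `SAMPLE` constant are all hypothetical and only modelled on the message formats shown in this thread. It relies on the usual kernel convention that a negative return value is a negated errno number, so -22 maps to EINVAL. As far as I understand the upstream Linux driver, 0xCAFEDEAD is the placeholder written to the scratch register before the ring test, so reading it back unchanged would suggest the ring never executed the test write; treat that interpretation as an assumption.

```python
import errno
import re

# Matches lines such as:
#   [drm:amdgpu_init] hw_init of IP block <gfx_v8_0> failed -22
HW_INIT_RE = re.compile(
    r"hw_init of IP block <(?P<block>[^>]+)> failed (?P<code>-?\d+)"
)

# Matches lines such as:
#   amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
RING_TEST_RE = re.compile(
    r"ring (?P<ring>\d+) test failed "
    r"\(scratch\((?P<reg>0x[0-9A-Fa-f]+)\)=(?P<value>0x[0-9A-Fa-f]+)\)"
)


def summarize_amdgpu_failures(log_text):
    """Return a list of short, human-readable findings from a kernel log."""
    findings = []
    for line in log_text.splitlines():
        ring = RING_TEST_RE.search(line)
        if ring:
            findings.append(
                "ring {} test failed: scratch register {} read back {}".format(
                    ring["ring"], ring["reg"], ring["value"]
                )
            )
            continue
        hw = HW_INIT_RE.search(line)
        if hw:
            code = int(hw["code"])
            # Negative kernel return values are negated errno numbers.
            name = errno.errorcode.get(abs(code), "unknown")
            findings.append(
                "hw_init failed for IP block {}: {} ({})".format(
                    hw["block"], code, name
                )
            )
    return findings


# The three log lines quoted in the comment above.
SAMPLE = """\
Mar 15 12:34:03 TJULP kernel: [drm:gfx_v8_0_ring_test_ring] amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
Mar 15 12:34:03 TJULP kernel: [drm:amdgpu_init] hw_init of IP block <gfx_v8_0> failed -22
Mar 15 12:34:03 TJULP kernel: drmn0: amdgpu_init failed
"""

if __name__ == "__main__":
    for finding in summarize_amdgpu_failures(SAMPLE):
        print(finding)
```

Run against a full /var/log/messages, this would only surface lines in these two formats; other amdgpu failure messages would need their own patterns.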

@freebsd-nils-level1

Unfortunately, same here most of the time: amdgpu.polaris.crash.txt

Using a RX460.

Once, after several tries, I was able to get "amdgpu.ko" loaded and running:
amdgpu.polaris.loaded.txt - but I have not been able to reproduce it since...

BTW: 3D didn't work because somehow the PCI bus addresses got mixed up; the DRI driver "radeonsi.so" tried to bind to "hostb9@pci0:0:24:2" although the RX460 is located at "vgapci0@pci0:36:0:0". Xorg itself identified the PCI bus addresses correctly for some reason...
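
To make the mismatch concrete, the two selectors can be split into their fields and compared. This is a rough sketch that assumes the strings follow the `name@pciDOMAIN:BUS:SLOT:FUNCTION` layout printed by `pciconf -l`, with decimal fields; `PciSelector`, `parse_selector`, and `same_device` are hypothetical helper names, not part of libdevq, Mesa, or any FreeBSD API.

```python
import re
from typing import NamedTuple


class PciSelector(NamedTuple):
    driver: str    # device name with unit number, e.g. "vgapci0"
    domain: int
    bus: int
    slot: int
    function: int


SELECTOR_RE = re.compile(
    r"^(?P<driver>\w+)@pci"
    r"(?P<domain>\d+):(?P<bus>\d+):(?P<slot>\d+):(?P<function>\d+)$"
)


def parse_selector(text):
    """Parse a pciconf-style selector such as 'vgapci0@pci0:36:0:0'."""
    match = SELECTOR_RE.match(text.strip())
    if match is None:
        raise ValueError("not a pciconf-style selector: {!r}".format(text))
    return PciSelector(
        driver=match["driver"],
        domain=int(match["domain"]),
        bus=int(match["bus"]),
        slot=int(match["slot"]),
        function=int(match["function"]),
    )


def same_device(a, b):
    """True when two selectors name the same domain:bus:slot:function."""
    return a[1:] == b[1:]


if __name__ == "__main__":
    gpu = parse_selector("vgapci0@pci0:36:0:0")
    bound = parse_selector("hostb9@pci0:0:24:2")
    print(gpu)
    print(bound)
    print("same device:", same_device(gpu, bound))
```

With the two addresses from this comment, the GPU sits on bus 36, slot 0, function 0, while the device the DRI driver tried to bind to is on bus 0, slot 24, function 2, so the comparison reports that they are different devices.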

@freebsd-nils-level1

Don't ask me how - but I've managed to get the module running again. I tried four times, and on the fifth attempt I switched to another VTY and executed "kldload amdgpu". As soon as I got my prompt back, I executed "service startkde4 onestart".

The problem with the wrong PCI bus addresses is probably related to a potential "libdevq" bug:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=217886

Before I forget; thanks for all the hard work you guys are doing here. Big kudos for that...

@weabot
Author

weabot commented Mar 30, 2017

What the hell, this worked... I guess since Linux doesn't have tty0 it didn't know what to do? Something like that? Damn. I actually didn't expect this to work.

@gjs278

gjs278 commented Apr 25, 2017

Hi guys, I was considering getting an RX460 and running it on FreeBSD. Does it work for you without issues currently? @nbe-renzel-net

@iotamudelta
Member

@gjs278 I think we had somebody running the RX460. It'll certainly have the same issue all amdgpu cards share at the moment: 3D is not feasible, since the kernel leaks memory for it. That is one of the reasons I have not kept up with the latest Mesa updates, so I don't know whether they even manage to load the required libraries properly. So, if 2D acceleration is enough for you, the card will likely work. 3D: not yet.

@weabot
Author

weabot commented Jul 16, 2017

This is a duplicate of #158: the solutions proposed here and in that thread are different fixes for the same issue, and both work.

weabot closed this as completed Jul 16, 2017

@markjdb

markjdb commented Jul 17, 2017

@gjs278 Sorry for the late reply, but I've been using an RX460 without issues for the past month or so.
