Date:      Tue, 03 Sep 2024 12:53:45 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 276985] crash in LinuxKPI/drm
Message-ID:  <bug-276985-227-q5t48bH5wa@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-276985-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-276985-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276985

--- Comment #34 from Olivier Certner <olce@FreeBSD.org> ---
(In reply to Tomasz "CeDeROM" CEDRO from comment #30)

> Thanks Olivier. This is my last message in this thread.

I hope not, and that you will answer the question I posted for you in comment
#29.

> I am happy that your setup forks fine, and for your friends, but I would not ship a product like this knowing it does not work for me nor for some people. I just got used to comfort of a release being always rock solid.

I'm really amazed that this is your view on FreeBSD, especially on its graphics
stack, after hanging around for so long.  You must have forgotten the early
201x years, where, IIRC, you'd have to stick with >5 years old cards to get
some acceleration.

Of course, everyone supposedly tries their best to ship flawless (or, at least,
functional) pieces of software.  However, in practice, hardware support can be
very difficult (and some hardware especially... uncooperative), and in
particular the graphics stack is a beast of its own (I'm just a newbie here,
gradually learning).  It's not even developed in-house but rather imported from
Linux, which gradually requires implementing the Linux KPI, and AFAIU we are
undermanned for this task *alone*.  Add to that the fact that even Linux DRM
drivers are occasionally buggy (formerly, more often than not; the
situation seems to have improved in the past few years, but perhaps my view is
biased as I avoid very recent hardware), and the numerous card models, even if
based on the same chips.  It's simply impossible to test all combinations.  I
can safely bet that if all amdgpu-supported GPUs really had been broken by
the latest FreeBSD stacks, as you were insinuating, they wouldn't have been
released as is, or they would have gotten a lot more attention.

> Thanks for pointing out its a different issue.

Well, to be crystal clear, while I responded also here for that point, I'm
referring to what you posted in bug #278212, which seems clearly out of place.
By contrast, in your stacks, I see one correspondence with earlier stacks
posted here (bug #276985), one by feh@ and one by Vlad, where the crash happens
in linux_rcu_cleaner_func() (and there's another one on Reddit).  So there may
be something in common, I just don't know at this time (perhaps someone else
would have a hint).

If you intend to post more traces/dumps or more explanations on your scenarios,
it would be wise to open a new bug, yes.

> Also thank you for pointing out I can work with 5.10, 5.15, and 6.10 on 14 release what is not possible on 13 (some documentation on this would be nice).

It's a very useful possibility indeed.  It should have been documented, but can
also be inferred by looking at the available ports and the content of their
Makefile.  However, I've heard that the plan going forward is to integrate DRM
back into base.  If carried out, such a substitution isn't going to be possible
anymore, unfortunately.

> If I knew what the problem is and how to fix it I would send patches not crash logs.

Sending crash logs is not the problem I'm pointing out.  Crash logs are
welcome.  Not being able to fix isn't a problem per se either.  The problem is
where you attached logs and posted comments, and the necessity to describe
things factually, from the scenario of your interactions with the computer to
the problem you're experiencing, with details on the hardware and software
(versions in particular) used, without conflating or extrapolating things.  In
complex matters like this, it is also precious to be able to re-test with
slight changes to be able to spot differences in behavior.  These are areas
where you can actually make progress and help.

> Btw there is no need to use offensive aggressive and arrogant language (i.e. "you are not willing to test", "you're alone", "spreading FUD by over-generalizing your own case", etc).

"not willing to test" was my feeling, although when I first wrote that I us=
ed
"willing" in a broader sense than the one you actually received.  "you're
alone" is simply re-using your own words.  And "spreading FUD by
over-generalizing your own case", as I already explained, is just describin=
g a
factual reality.

I certainly did not intend to be offensive nor arrogant, and I don't think I
was.  I've just been factual, and firm about some principles without which
everybody is losing time, trust, etc.

On the contrary, bragging in multiple, loosely related bugs that 14 with DRM is
generally unsuitable for production use is what is aggressive, and more
importantly, as I showed, wrong and unproductive.  There is no denying that
there are still problems (I already wrote some rough summary above), and my aim
is precisely to nail them to be able to get rid of them, if possible.

> This is not a constructive and motivating language that I am used for to see here.

On the contrary, my messages are (at least, aim to be) very constructive, and
it is exactly in this direction that I'm trying to steer you.

[As a side note, concerning the second part of your sentence, given how long
you've been around, I don't think you can possibly believe what you wrote.
There has certainly been a lot of abusive language in the project, especially
in the old years.  I do not endorse such language, and perhaps contrary to
what you seem to believe, I've not engaged in that here.]

> But I get the point. (...)

Great.

> I am just a bit scared to do this on a production machine you can imagine that.

Surely.  But what will you do if your production machine happens to break?
Don't you have another very similar or identical setup more or less ready to
replace it, especially if downtime is a concern?  Such a setup would
additionally be useful to test stuff without disturbing your main machine.

> My last question - if you are the drm module maintainer / developer - would it be possible to mark following ports with incremental numbers like 510, 515, 610, not 61 for a newest release please (61 < 515)?

I'm not (at least for now).  I understand that, e.g., a name like
'drm-601-kmod' would have been more satisfying, but does this have any
relevance in practice?  Are you scripting things to automatically update your
DRM modules depending on the inferred version (in which case, there are
probably other means)?  And, as said above, I think the plan is that these
ports are going to disappear.  So I doubt there will be any change in this
area.


Before trying to set up a new machine, could you please answer my very simple
question of comment #29: Where did the packages you used to install DRM modules
come from?  Did you build them yourself, or were they official packages, or
...?  If you didn't build them yourself, the first thing to try would be to do
exactly that and see if problems persist.

-- 
You are receiving this mail because:
You are the assignee for the bug.


