Date: Thu, 14 Jan 2021 01:36:27 GMT
From: myfreeweb <greg@unrelenting.technology>
To: John Baldwin <jhb@FreeBSD.org>, Konstantin Belousov <kostikbel@gmail.com>
Cc: Emmanuel Vadot <manu@freebsd.org>, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject: Re: git: 11d62b6f31ab - main - linuxkpi: add kernel_fpu_begin/kernel_fpu_end
Message-ID: <171B7072-9BAE-46BB-82BA-4792AEBAD0EB@unrelenting.technology>
In-Reply-To: <ce860007-4c19-8fb2-05b9-9b9e1bcc0723@FreeBSD.org>
References: <202101121143.10CBh02x095972@gitrepo.freebsd.org>
 <X/2hR9Hi3Jhf5ZNs@kib.kiev.ua>
 <20210113110826.46fbc900b3c375e7215a8195@bidouilliste.com>
 <A7AF80F3-3E01-44DD-B1FF-49BAEFCF4C4A@unrelenting.technology>
 <ce860007-4c19-8fb2-05b9-9b9e1bcc0723@FreeBSD.org>
On January 13, 2021 8:58:58 PM UTC, John Baldwin <jhb@FreeBSD.org> wrote:
>On 1/13/21 3:42 AM, myfreeweb wrote:
>>
>>
>> On January 13, 2021 10:08:26 AM UTC, Emmanuel Vadot <manu@bidouilliste.com> wrote:
>>> On Tue, 12 Jan 2021 15:16:55 +0200
>>> Konstantin Belousov <kostikbel@gmail.com> wrote:
>>>
>>>> On Tue, Jan 12, 2021 at 11:43:00AM +0000, Emmanuel Vadot wrote:
>>>>> The branch main has been updated by manu:
>>>>>
>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=11d62b6f31ab4e99df6d0c6c23406b57eaa37f41
>>>>>
>>>>> commit 11d62b6f31ab4e99df6d0c6c23406b57eaa37f41
>>>>> Author:     Emmanuel Vadot <manu@FreeBSD.org>
>>>>> AuthorDate: 2021-01-12 11:02:38 +0000
>>>>> Commit:     Emmanuel Vadot <manu@FreeBSD.org>
>>>>> CommitDate: 2021-01-12 11:31:00 +0000
>>>>>
>>>>>     linuxkpi: add kernel_fpu_begin/kernel_fpu_end
>>>>>
>>>>>     With newer AMD GPUs (>=Navi,Renoir) there is FPU context usage in the
>>>>>     amdgpu driver.
>>>>>     The `kernel_fpu_begin/end` implementations in drm did not even allow nested
>>>>>     begin-end blocks.
>>>>
>>>> Does Linux allow more than one thread to execute kernel_fpu_begin ?
>>>
>>> I actually have no idea, adding Greg to cc.
>>
>> Looks like they save the context into the current thread state, so yes? (drm doesn't need that)
>>
>> Also they seem to do something FPU_KERN_NOCTX like (??) because they disable preemption inside these blocks.
>> (Where does our NOCTX actually store the state?)
>
>It doesn't store at all because threads aren't allowed to sleep in a critical
>section, so the thread will never give up the CPU while in the FPU section.  If
>threads can voluntarily sleep (cv_wait*, *sleep(), etc.) while using
>kernel_fpu_begin(), then NOCTX won't work and we will need something else.

Hmm but with no storage at all, how would it work from a syscall?
The manpage does mention a "usermode save area" – I was talking about that.

Linux kernel_fpu_begin starts with preempt_disable, so definitely no condvars and the like. No idea about simple time sleeps. But amdgpu doesn't seem to do even that.

>However, the code snippet from the stackoverflow URL I posted earlier looks
>exactly like the NOCTX case where we flush the user FPU state to the thread
>if the FPU state is "dirty" and then load a clean initial state for use by
>the FPU.  It would also seem to never save the kernel FPU state anywhere by
>counting on avoiding context switches.  So, I think you probably should just
>make this use NOCTX.

NOCTX was the first thing I've tried, and it didn't work, but probably just because of the nesting. Haven't retried it with the nesting counter.

Testing a bunch of things would be easier if I had one of the GPUs that use this code instead of having to ask someone else to test…
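
For illustration only, here is a minimal userspace sketch of the "nesting counter" idea mentioned above: only the outermost begin actually enters the FPU section, and only the matching outermost end leaves it. Every name in it (sketch_fpu_begin, sketch_fpu_end, hw_fpu_enter, hw_fpu_leave) is hypothetical. The hw_* functions are counting stubs standing in for whatever the kernel would really call (possibly fpu_kern_enter()/fpu_kern_leave() with FPU_KERN_NOCTX). This is not the code from the commit under discussion.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for the real enter/leave primitives.
 * They only count calls so that the sketch can run in userspace.
 */
static int hw_enter_calls;
static int hw_leave_calls;

void hw_fpu_enter(void) { hw_enter_calls++; }
void hw_fpu_leave(void) { hw_leave_calls++; }

/*
 * Nesting depth of begin/end blocks. Assumption: a kernel version
 * would keep this per thread; one global is enough for the sketch.
 */
static unsigned int fpu_depth;

/* Enter the FPU section only when going from depth 0 to depth 1. */
void sketch_fpu_begin(void)
{
	if (fpu_depth++ == 0)
		hw_fpu_enter();
}

/* Leave the FPU section only when the outermost block ends. */
void sketch_fpu_end(void)
{
	assert(fpu_depth > 0);	/* catch an unbalanced end */
	if (--fpu_depth == 0)
		hw_fpu_leave();
}
```

With this shape, a nested begin/begin/end/end sequence calls the underlying enter and leave exactly once each, which is the property a non-nestable primitive needs from its wrapper.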