Date:      Thu, 6 Aug 2020 20:45:58 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        "freebsd-current@FreeBSD.org" <freebsd-current@FreeBSD.org>
Subject:   can buffer cache pages be used in ext_pgs mbufs?
Message-ID:  <QB1PR01MB3364679FDA14B84DD3BA5CA3DD480@QB1PR01MB3364.CANPRD01.PROD.OUTLOOK.COM>

Hi,

I've been at this game for a while and one of the axioms is...
"Everything is harder than it at first looks."

Currently, when the FreeBSD NFS client does a write, it does:
- VOP_WRITE() copies the data into buffer cache block(s).
--> An nfsiod thread (or sometimes the thread that called VOP_WRITE()),
   copies the data from the buffer cache block into a list of mbuf clusters,
   prepends the NFS and RPC headers, then passes it down to TCP via sosend().

   After the RPC reply is received (or the RPC fails due to timeout):
   - m_freem() is called on the mbuf list.
   - bufdone()/brelse() is called for the buffer cache block.

For TLS, the mbuf list passed into sosend() must be ext_pgs mbufs, so the
mbuf clusters get copied to ext_pgs mbufs with anonymous pages before
the sosend() call.

So, what if the pages associated with the buffer cache block (b_pages)
were entered in the m_epg_pa[] array for the ext_pgs mbufs, instead of
copying the data into mbuf clusters?
- At a glance, this just seems like it would work.
  It looks like the buffer cache pages are wired down until bufdone()/brelse(),
  which happens after m_freem() on the mbuf list.
- There would need to be a custom m_ext.ext_free, but it looks like a no-op.
  (i.e. it does nothing, since the buffer cache code deals with the pages later.)

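The lifetime argument above can be modelled in userspace to make the ordering concrete. This is only a sketch under loud assumptions: struct model_buf, struct model_mbuf and every model_* function are hypothetical stand-ins invented for illustration, not FreeBSD kernel interfaces, and plain pointers stand in for the physical addresses that m_epg_pa[] would hold. It shows just three things: the pages are referenced rather than copied, the free callback leaves the pages alone, and the buffer releases them afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define MODEL_MAX_PAGES 4

/* Hypothetical stand-in for a buffer cache block and its b_pages. */
struct model_buf {
    unsigned char *pages[MODEL_MAX_PAGES];
    int npages;
    int wired;                  /* 1 while the block still pins its pages */
};

/* Hypothetical stand-in for an ext_pgs mbuf: page references, no copy. */
struct model_mbuf {
    unsigned char *epg_pa[MODEL_MAX_PAGES];   /* models m_epg_pa[] */
    int npgs;
    void (*ext_free)(struct model_mbuf *);
};

int model_free_calls;           /* counts ext_free invocations */

/* No-op free: the pages belong to the buffer, so leave them alone. */
static void model_noop_free(struct model_mbuf *m)
{
    (void)m;
    model_free_calls++;
}

/* Allocate zeroed pages and mark the block wired. Returns 0 or -1. */
int model_buf_init(struct model_buf *bp, int npages)
{
    memset(bp, 0, sizeof(*bp));
    if (npages < 1 || npages > MODEL_MAX_PAGES)
        return -1;
    for (int i = 0; i < npages; i++) {
        bp->pages[i] = calloc(1, MODEL_PAGE_SIZE);
        if (bp->pages[i] == NULL) {
            while (i-- > 0)
                free(bp->pages[i]);
            return -1;
        }
    }
    bp->npages = npages;
    bp->wired = 1;
    return 0;
}

/* Enter the buffer's pages into the mbuf instead of copying the data. */
void model_mbuf_attach(struct model_mbuf *m, const struct model_buf *bp)
{
    for (int i = 0; i < bp->npages; i++)
        m->epg_pa[i] = bp->pages[i];
    m->npgs = bp->npages;
    m->ext_free = model_noop_free;
}

/* Models m_freem(): run the free callback, then drop the references. */
void model_m_freem(struct model_mbuf *m)
{
    assert(m->ext_free != NULL);
    m->ext_free(m);
    m->npgs = 0;
}

/* Models bufdone()/brelse(): only now are the pages released. */
void model_brelse(struct model_buf *bp)
{
    for (int i = 0; i < bp->npages; i++)
        free(bp->pages[i]);
    bp->npages = 0;
    bp->wired = 0;
}

/* Walk the proposed write path once. Returns 0 if every check holds. */
int zero_copy_demo(void)
{
    struct model_buf b;
    struct model_mbuf m;
    int rc = 0;

    if (model_buf_init(&b, 2) != 0)
        return 1;
    memset(b.pages[0], 0xAB, MODEL_PAGE_SIZE);  /* the VOP_WRITE() copy */
    model_mbuf_attach(&m, &b);
    if (m.epg_pa[0] != b.pages[0] || m.epg_pa[0][0] != 0xAB)
        rc = 2;                                 /* must share, not copy */
    model_m_freem(&m);                          /* RPC reply received */
    if (rc == 0 && (!b.wired || b.pages[0][0] != 0xAB))
        rc = 3;                                 /* pages must survive */
    model_brelse(&b);                           /* buffer frees the pages */
    if (rc == 0 && b.wired)
        rc = 4;
    return rc;
}
```

Calling zero_copy_demo() from a main() exercises the whole sequence. The model says nothing about wiring, the direct map or CPU caches, so it cannot settle whether the real kernel path is safe.
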
The only thing I can think of (and I don't understand the vm/memory cache
parts of FreeBSD) is that, since the buffer cache pages are written via copying
into their kva addresses and then read via the direct map of their physical
pages, there might be some sort of memory cache flush needed to ensure the
physical pages are up to date (no data still working its way through write-back).
- Is this a problem and how is it handled?

In summary, what am I missing that makes this difficult/impossible to do?

If no one has an answer, I'll just code it up and see what happens.

Thanks for any comments, rick


