Date:      Sun, 19 Jul 2020 23:34:24 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        "freebsd-current@FreeBSD.org" <freebsd-current@FreeBSD.org>
Cc:        "jhb@FreeBSD.org" <jhb@FreeBSD.org>, "gallatin@freebsd.org" <gallatin@freebsd.org>, Gleb Smirnoff <glebius@freebsd.org>
Subject:   RFC: ktls and krpc using M_EXTPG mbufs
Message-ID:  <QB1PR01MB33640E7C88BA0E27A89587DFDD7A0@QB1PR01MB3364.CANPRD01.PROD.OUTLOOK.COM>

I spent a little time chasing a problem in the nfs-over-tls code, where it
would sometimes end up with corrupted data in the file(s) of a mirrored
pNFS configuration.

I think the problem was that the code filled the data to be written into
anonymous page M_EXTPG mbufs, then did an m_copym() { copy by
reference } and used the copies for the mirrored writes.
--> In ktls_encrypt(), the encryption was done in place on the same pages
       and, sometimes, the already encrypted data got encrypted again
       during the sosend() of the other copy.

Although I haven't reproduced it, a regular kernel write RPC could suffer the
same consequences if the RPC is retried (the krpc keeps an m_copym() copy
of the request for an RPC retry).

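A minimal user-space sketch of the hazard described above may help. It is not
kernel code: `struct page`, `struct buf`, `copy_by_ref()` and
`encrypt_in_place()` are hypothetical stand-ins for an anonymous page, an
M_EXTPG mbuf, m_copym() and the in-place step of ktls_encrypt(), and the
"cipher" is a toy byte addition chosen only so that a double application is
visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_LEN 16

/* Hypothetical stand-in for an anonymous page shared by several mbufs. */
struct page {
	uint8_t data[PAGE_LEN];
	int refs;
};

/* Hypothetical stand-in for an M_EXTPG mbuf: it only points at a page. */
struct buf {
	struct page *pg;
};

/* Model of a copy by reference: no bytes move, only a reference is taken. */
static struct buf
copy_by_ref(struct buf *src)
{
	struct buf dup = { src->pg };

	assert(src->pg != NULL);
	src->pg->refs++;
	return (dup);
}

/* Toy "cipher" (adds a key byte) that overwrites the shared page. */
static void
encrypt_in_place(struct buf *b, uint8_t key)
{
	for (size_t i = 0; i < PAGE_LEN; i++)
		b->pg->data[i] = (uint8_t)(b->pg->data[i] + key);
}

/* Returns 1 if the two mirrored sends put different bytes on the wire. */
static int
mirrored_send_corrupts(void)
{
	struct page pg = { .refs = 1 };
	struct buf orig = { &pg }, mirror;
	uint8_t wire1[PAGE_LEN], wire2[PAGE_LEN];

	memset(pg.data, 'A', PAGE_LEN);
	mirror = copy_by_ref(&orig);

	encrypt_in_place(&orig, 7);		/* first send */
	memcpy(wire1, orig.pg->data, PAGE_LEN);
	encrypt_in_place(&mirror, 7);		/* second send, same page */
	memcpy(wire2, mirror.pg->data, PAGE_LEN);

	return (memcmp(wire1, wire2, PAGE_LEN) != 0);
}
```

In this model the second send encrypts bytes that the first send already
encrypted, so the mirror receives different data. A retried RPC that reuses
its saved by-reference copy would hit the same pattern.
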
At this time, the code in projects/nfs-over-tls works correctly, since it
always fills the data to be written into mbuf clusters, m_copym()s those
and then copies those { real copying using memcpy() } via
mb_mapped_to_unmapped() just before calling sosend().
--> This works, but it would be nice to avoid the mb_mapped_to_unmapped()
      copying for all the data being written via an NFS over TLS connection.

For the TCP_TLS_MODE_SW case:
--> The NFS code can fill the written data into anonymous pages on M_EXTPG
       mbufs.
Then, ktls_encrypt() could be modified to
allocate a new set of anonymous pages for the destination side of
the encryption (it already does this for the sendfile case) and put those
in a new mbuf list.
--> This would result in new anonymous pages and mbufs being allocated,
       but would not do memcpy()s.
After encryption, it would just do an m_freem() on the unencrypted list.
--> For the krpc client case, this call would only decrement the reference
      count on the unencrypted list and it could be used for a retry by the krpc
      and then be freed { m_freem() call } after a reply is received.

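The proposal above can be sketched in the same user-space style. Again this
is only a model under stated assumptions: `page_alloc()`, `page_rele()` and
`encrypt_to_new_page()` are hypothetical names, the reference count stands in
for what m_freem() would decrement, and the toy cipher is a byte addition.
Real TLS records for a retry would not be byte-identical to the first send.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_LEN 16

/* Hypothetical refcounted page, standing in for an anonymous page. */
struct page {
	uint8_t data[PAGE_LEN];
	int refs;
};

static struct page *
page_alloc(void)
{
	struct page *pg = calloc(1, sizeof(*pg));

	if (pg != NULL)
		pg->refs = 1;
	return (pg);
}

/* Drop one reference; the page is freed only when the last one goes. */
static void
page_rele(struct page *pg)
{
	assert(pg->refs > 0);
	if (--pg->refs == 0)
		free(pg);
}

/* Toy cipher writing into a NEW page; the source page is only read. */
static struct page *
encrypt_to_new_page(const struct page *src, uint8_t key)
{
	struct page *dst = page_alloc();

	if (dst == NULL)
		return (NULL);
	for (size_t i = 0; i < PAGE_LEN; i++)
		dst->data[i] = (uint8_t)(src->data[i] + key);
	return (dst);
}

/* Returns 1 if a retry after the first send still sees the plaintext. */
static int
retry_sees_plaintext(void)
{
	struct page *plain, *first, *retry;
	int ok;

	plain = page_alloc();
	if (plain == NULL)
		return (0);
	memset(plain->data, 'A', PAGE_LEN);
	plain->refs++;		/* reference kept for a possible retry */

	first = encrypt_to_new_page(plain, 7);
	page_rele(plain);	/* sender's reference dropped after encryption */

	retry = encrypt_to_new_page(plain, 7);
	ok = first != NULL && retry != NULL &&
	    plain->data[0] == 'A' &&
	    retry->data[0] == (uint8_t)('A' + 7);

	if (first != NULL)
		page_rele(first);
	if (retry != NULL)
		page_rele(retry);
	page_rele(plain);	/* reply received: last reference dropped */
	return (ok);
}
```

The cost in this model is one extra page allocation per send and no byte
copy of the plaintext, which matches the trade-off described above.
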
If doing this for all the sosend()s of anonymous page M_EXTPG mbufs seems
like unnecessary overhead, the above could be enabled via a setsockopt()
on the socket.

What do others think of this?

For the hardware offload case:
- Can I assume that the anonymous pages in M_EXTPG mbufs will remain
   unchanged?
--> If so, and it won't change to TCP_TLS_MODE_SW, the NFS code could
       fill the data to be written into M_EXTPG mbufs safely.

- And, if so, can I safely use the ktls_session mode field to decide if offload
  is happening?
  I see the TCP_TXTLS_MODE socket opt which seems to
  switch the mode to TCP_TLS_MODE_SW.
  When does this happen? Or, can this happen to a session once in use?

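For reference, a hedged userland sketch of reading the transmit mode through
the TCP_TXTLS_MODE socket option mentioned above. It does not answer the
in-kernel question about the ktls_session mode field, and a value read this
way is only a snapshot if the mode can change on a session in use. The helper
name `query_tx_tls_mode()` is hypothetical, the option value is assumed to be
an int, and the `#ifdef` fallback exists only so the sketch builds on systems
whose headers lack the option.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <assert.h>
#include <errno.h>

/*
 * Return the transmit TLS mode of a TCP socket (for example
 * TCP_TLS_MODE_SW), or -1 with errno set on failure.
 */
static int
query_tx_tls_mode(int fd)
{
#ifdef TCP_TXTLS_MODE
	int mode = 0;
	socklen_t len = sizeof(mode);

	if (getsockopt(fd, IPPROTO_TCP, TCP_TXTLS_MODE, &mode, &len) == -1)
		return (-1);
	/* Assumption: the kernel reports the mode as an int. */
	assert(len == sizeof(mode));
	return (mode);
#else
	/* Option not provided by these headers. */
	(void)fd;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```

A caller would compare the result against TCP_TLS_MODE_SW to see whether
software encryption is in use at the time of the call.
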
Thanks for any/all comments on this, rick