Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 26 Apr 2023 23:58:27 GMT
From:      Cheng Cui <cc@FreeBSD.org>
To:        src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
Subject:   git: 60167184abd5 - main - siftr: remove barely used hash generation per record
Message-ID:  <202304262358.33QNwR3B042770@gitrepo.freebsd.org>

next in thread | raw e-mail | index | archive | help
The branch main has been updated by cc:

URL: https://cgit.FreeBSD.org/src/commit/?id=60167184abd54ca12c6bf7ab60f2a08b41342f84

commit 60167184abd54ca12c6bf7ab60f2a08b41342f84
Author:     Cheng Cui <cc@FreeBSD.org>
AuthorDate: 2023-04-26 11:18:01 +0000
Commit:     Cheng Cui <cc@FreeBSD.org>
CommitDate: 2023-04-26 19:57:42 +0000

    siftr: remove barely used hash generation per record
    
    Reviewers: rscheff, tuexen
    Approved by: rscheff, tuexen
    Subscribers: imp, melifaro, glebius
    Differential Revision: https://reviews.freebsd.org/D39835
---
 share/man/man4/siftr.4 |  67 ++++++++++++-------------------
 sys/netinet/siftr.c    | 105 +------------------------------------------------
 2 files changed, 28 insertions(+), 144 deletions(-)

diff --git a/share/man/man4/siftr.4 b/share/man/man4/siftr.4
index 78eb755ddef1..6182daddd8b6 100644
--- a/share/man/man4/siftr.4
+++ b/share/man/man4/siftr.4
@@ -29,7 +29,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd April 25, 2023
+.Dd April 26, 2023
 .Dt SIFTR 4
 .Os
 .Sh NAME
@@ -122,15 +122,6 @@ By default, the file /var/log/siftr.log is used.
 The path can be changed at any time, even while the module is enabled.
 .El
 .Bl -tag -offset indent -width Va
-.It Va net.inet.siftr.genhashes
-controls whether a hash is generated for each TCP packet seen by
-.Nm .
-By default, the value is set to 0, which means no hashes are generated.
-The hashes are useful to correlate which TCP packet triggered the generation of
-a particular log message, but calculating them adds additional computational
-overhead into the fast path.
-.El
-.Bl -tag -offset indent -width Va
 .It Va net.inet.siftr.port_filter
 controls on which source or destination port siftr should capture
 .Nm .
@@ -216,120 +207,116 @@ for out.
 .El
 .Bl -tag -offset indent -width Va
 .It Va 2
-Hash of the packet that triggered the log message.
-.El
-.Bl -tag -offset indent -width Va
-.It Va 3
 Time at which the packet that triggered the log message was processed by
 the
 .Xr pfil 9
 hook function, in seconds and microseconds since the UNIX epoch.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 4
+.It Va 3
 The IPv4 or IPv6 address of the local host, in dotted quad (IPv4 packet)
 or colon-separated hex (IPv6 packet) notation.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 5
+.It Va 4
 The TCP port that the local host is communicating via.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 6
+.It Va 5
 The IPv4 or IPv6 address of the foreign host, in dotted quad (IPv4 packet)
 or colon-separated hex (IPv6 packet) notation.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 7
+.It Va 6
 The TCP port that the foreign host is communicating via.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 8
+.It Va 7
 The slow start threshold for the flow, in bytes.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 9
+.It Va 8
 The current congestion window for the flow, in bytes.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 10
+.It Va 9
 The current state of the t_flags2 field for the flow.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 11
+.It Va 10
 The current sending window for the flow, in bytes.
 The post scaled value is reported, except during the initial handshake (first
 few packets), during which time the unscaled value is reported.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 12
+.It Va 11
 The current receive window for the flow, in bytes.
 The post scaled value is always reported.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 13
+.It Va 12
 The current window scaling factor for the sending window.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 14
+.It Va 13
 The current window scaling factor for the receiving window.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 15
+.It Va 14
 The current state of the TCP finite state machine, as defined
 in
 .Aq Pa netinet/tcp_fsm.h .
 .El
 .Bl -tag -offset indent -width Va
-.It Va 16
+.It Va 15
 The maximum segment size for the flow, in bytes.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 17
+.It Va 16
 The current smoothed RTT estimate for the flow, in units of microsecond.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 18
+.It Va 17
 SACK enabled indicator. 1 if SACK enabled, 0 otherwise.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 19
+.It Va 18
 The current state of the TCP flags for the flow.
 See
 .Aq Pa netinet/tcp_var.h
 for information about the various flags.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 20
+.It Va 19
 The current retransmission timeout length for the flow, in units microsecond.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 21
+.It Va 20
 The current size of the socket send buffer in bytes.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 22
+.It Va 21
 The current number of bytes in the socket send buffer.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 23
+.It Va 22
 The current size of the socket receive buffer in bytes.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 24
+.It Va 23
 The current number of bytes in the socket receive buffer.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 25
+.It Va 24
 The current number of unacknowledged bytes in-flight.
 Bytes acknowledged via SACK are not excluded from this count.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 26
+.It Va 25
 The current number of segments in the reassembly queue.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 27
+.It Va 26
 Flowid for the connection.
 A caveat: Zero '0' either represents a valid flowid or a default value when it's
 not being set.
@@ -337,7 +324,7 @@ There is no easy way to differentiate without looking at actual
 network interface card and drivers being used.
 .El
 .Bl -tag -offset indent -width Va
-.It Va 28
+.It Va 27
 Flow type for the connection.
 Flowtype defines which protocol fields are hashed to produce the flowid.
 A complete listing is available in
@@ -760,8 +747,6 @@ does not take into account bytes that have been
 .No SACK Ap ed
 by the receiving host.
 .It
-Packet hash generation does not currently work for IPv6 based TCP packets.
-.It
 Compressed notation is not used for IPv6 address representation.
 This consumes more bytes than is necessary in log output.
 .El
diff --git a/sys/netinet/siftr.c b/sys/netinet/siftr.c
index 05104627910d..ce895ba77b82 100644
--- a/sys/netinet/siftr.c
+++ b/sys/netinet/siftr.c
@@ -179,8 +179,6 @@ struct pkt_node {
 	}			direction;
 	/* IP version pkt_node relates to; either INP_IPV4 or INP_IPV6. */
 	uint8_t			ipver;
-	/* Hash of the pkt which triggered the log message. */
-	uint32_t		hash;
 	/* Local/foreign IP address. */
 #ifdef SIFTR_IPV6
 	uint32_t		ip_laddr[4];
@@ -270,7 +268,6 @@ DPCPU_DEFINE_STATIC(struct siftr_stats, ss);
 static volatile unsigned int siftr_exit_pkt_manager_thread = 0;
 static unsigned int siftr_enabled = 0;
 static unsigned int siftr_pkts_per_log = 1;
-static unsigned int siftr_generate_hashes = 0;
 static uint16_t     siftr_port_filter = 0;
 /* static unsigned int siftr_binary_log = 0; */
 static char siftr_logfile[PATH_MAX] = "/var/log/siftr.log";
@@ -310,10 +307,6 @@ SYSCTL_UINT(_net_inet_siftr, OID_AUTO, ppl, CTLFLAG_RW,
     &siftr_pkts_per_log, 1,
     "number of packets between generating a log message");
 
-SYSCTL_UINT(_net_inet_siftr, OID_AUTO, genhashes, CTLFLAG_RW,
-    &siftr_generate_hashes, 0,
-    "enable packet hash generation");
-
 SYSCTL_U16(_net_inet_siftr, OID_AUTO, port_filter, CTLFLAG_RW,
     &siftr_port_filter, 0,
     "enable packet filter on a TCP port");
@@ -448,11 +441,10 @@ siftr_process_pkt(struct pkt_node * pkt_node)
 		/* Construct an IPv6 log message. */
 		log_buf->ae_bytesused = snprintf(log_buf->ae_data,
 		    MAX_LOG_MSG_LEN,
-		    "%c,0x%08x,%zd.%06ld,%x:%x:%x:%x:%x:%x:%x:%x,%u,%x:%x:%x:"
+		    "%c,%zd.%06ld,%x:%x:%x:%x:%x:%x:%x:%x,%u,%x:%x:%x:"
 		    "%x:%x:%x:%x:%x,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,"
 		    "%u,%u,%u,%u,%u,%u,%u,%u,%u,%u\n",
 		    direction[pkt_node->direction],
-		    pkt_node->hash,
 		    pkt_node->tval.tv_sec,
 		    pkt_node->tval.tv_usec,
 		    UPPER_SHORT(pkt_node->ip_laddr[0]),
@@ -508,10 +500,9 @@ siftr_process_pkt(struct pkt_node * pkt_node)
 		/* Construct an IPv4 log message. */
 		log_buf->ae_bytesused = snprintf(log_buf->ae_data,
 		    MAX_LOG_MSG_LEN,
-		    "%c,0x%08x,%jd.%06ld,%u.%u.%u.%u,%u,%u.%u.%u.%u,%u,%u,%u,"
+		    "%c,%jd.%06ld,%u.%u.%u.%u,%u,%u.%u.%u.%u,%u,%u,%u,"
 		    "%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u,%u\n",
 		    direction[pkt_node->direction],
-		    pkt_node->hash,
 		    (intmax_t)pkt_node->tval.tv_sec,
 		    pkt_node->tval.tv_usec,
 		    pkt_node->ip_laddr[0],
@@ -629,36 +620,6 @@ siftr_pkt_manager_thread(void *arg)
 	kthread_exit();
 }
 
-static uint32_t
-hash_pkt(struct mbuf *m, uint32_t offset)
-{
-	uint32_t hash;
-
-	hash = 0;
-
-	while (m != NULL && offset > m->m_len) {
-		/*
-		 * The IP packet payload does not start in this mbuf, so
-		 * need to figure out which mbuf it starts in and what offset
-		 * into the mbuf's data region the payload starts at.
-		 */
-		offset -= m->m_len;
-		m = m->m_next;
-	}
-
-	while (m != NULL) {
-		/* Ensure there is data in the mbuf */
-		if ((m->m_len - offset) > 0)
-			hash = hash32_buf(m->m_data + offset,
-			    m->m_len - offset, hash);
-
-		m = m->m_next;
-		offset = 0;
-        }
-
-	return (hash);
-}
-
 /*
  * Check if a given mbuf has the SIFTR mbuf tag. If it does, log the fact that
  * it's a reinjected packet and return. If it doesn't, tag the mbuf and return.
@@ -925,68 +886,6 @@ siftr_chkpkt(struct mbuf **m, struct ifnet *ifp, int flags,
 
 	siftr_siftdata(pn, inp, tp, INP_IPV4, dir, inp_locally_locked);
 
-	if (siftr_generate_hashes) {
-		if ((*m)->m_pkthdr.csum_flags & CSUM_TCP) {
-			/*
-			 * For outbound packets, the TCP checksum isn't
-			 * calculated yet. This is a problem for our packet
-			 * hashing as the receiver will calc a different hash
-			 * to ours if we don't include the correct TCP checksum
-			 * in the bytes being hashed. To work around this
-			 * problem, we manually calc the TCP checksum here in
-			 * software. We unset the CSUM_TCP flag so the lower
-			 * layers don't recalc it.
-			 */
-			(*m)->m_pkthdr.csum_flags &= ~CSUM_TCP;
-
-			/*
-			 * Calculate the TCP checksum in software and assign
-			 * to correct TCP header field, which will follow the
-			 * packet mbuf down the stack. The trick here is that
-			 * tcp_output() sets th->th_sum to the checksum of the
-			 * pseudo header for us already. Because of the nature
-			 * of the checksumming algorithm, we can sum over the
-			 * entire IP payload (i.e. TCP header and data), which
-			 * will include the already calculated pseduo header
-			 * checksum, thus giving us the complete TCP checksum.
-			 *
-			 * To put it in simple terms, if checksum(1,2,3,4)=10,
-			 * then checksum(1,2,3,4,5) == checksum(10,5).
-			 * This property is what allows us to "cheat" and
-			 * checksum only the IP payload which has the TCP
-			 * th_sum field populated with the pseudo header's
-			 * checksum, and not need to futz around checksumming
-			 * pseudo header bytes and TCP header/data in one hit.
-			 * Refer to RFC 1071 for more info.
-			 *
-			 * NB: in_cksum_skip(struct mbuf *m, int len, int skip)
-			 * in_cksum_skip 2nd argument is NOT the number of
-			 * bytes to read from the mbuf at "skip" bytes offset
-			 * from the start of the mbuf (very counter intuitive!).
-			 * The number of bytes to read is calculated internally
-			 * by the function as len-skip i.e. to sum over the IP
-			 * payload (TCP header + data) bytes, it is INCORRECT
-			 * to call the function like this:
-			 * in_cksum_skip(at, ip->ip_len - offset, offset)
-			 * Rather, it should be called like this:
-			 * in_cksum_skip(at, ip->ip_len, offset)
-			 * which means read "ip->ip_len - offset" bytes from
-			 * the mbuf cluster "at" at offset "offset" bytes from
-			 * the beginning of the "at" mbuf's data pointer.
-			 */
-			th->th_sum = in_cksum_skip(*m, ntohs(ip->ip_len),
-			    ip_hl);
-		}
-
-		/*
-		 * XXX: Having to calculate the checksum in software and then
-		 * hash over all bytes is really inefficient. Would be nice to
-		 * find a way to create the hash and checksum in the same pass
-		 * over the bytes.
-		 */
-		pn->hash = hash_pkt(*m, ip_hl);
-	}
-
 	mtx_lock(&siftr_pkt_queue_mtx);
 	STAILQ_INSERT_TAIL(&pkt_queue, pn, nodes);
 	mtx_unlock(&siftr_pkt_queue_mtx);



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?202304262358.33QNwR3B042770>