From: Navdeep Parhar
To: "Andrey V. Elsukov", Harsh Jain, freebsd-net@freebsd.org
Subject: Re: [freebsd-current]Who should reset M_PKTHDR flag in m_buf when IP packets are fragmented. m_unshare panic throw when IPSec is enabled
Date: Wed, 27 Dec 2017 12:09:39 -0800
References: <73302ead-b2e9-c25b-4d11-475f38dec1a1@chelsio.com> <993c58bb-3bf2-d6a3-9a05-13e1631aec87@yandex.ru>
List-Id: Networking and TCP/IP with FreeBSD

On 12/26/2017 03:33, Andrey V.
Elsukov wrote:
> On 26.12.2017 13:22, Harsh Jain wrote:
>>>> panic: m_unshare: m0 0xfffff80020f82600, m 0xfffff8005d054100 has M_PKTHDR
>>>> cpuid = 15
>>>> time = 1495578455
>>>> KDB: stack backtrace:
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe044e9bb890
>>>> kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe044e9bb960
>>>> vpanic() at vpanic+0x269/frame 0xfffffe044e9bba30
>>>> kassert_panic() at kassert_panic+0xc7/frame 0xfffffe044e9bbac0
>>>> m_unshare() at m_unshare+0x578/frame 0xfffffe044e9bbbc0
>>>> esp_output() at esp_output+0x44c/frame 0xfffffe044e9bbe40
>>>> ipsec4_perform_request() at ipsec4_perform_request+0x5df/frame 0xfffffe044e9bbff0
>>> Hi,
>>>
>>> it seems unusual that IP reassembly happens on outbound path.
>> It can be re-produced with single Ping packet on chelsio(cxgbe) NIC. I tried with Intel NIC. It seems they re-produce M_WRITEABLE() buffer(follows different path in m_unshare) which is not true for cxgbe.
>
> In my view, IP fragmentation should occur in ip_output after IPsec
> encryption. Something like:
>
> 1. rip_output() has mbuf chain where only first mbuf has M_PKTHDR flag
> 2. ip_output() -> IPSEC_OUTPUT() -> esp_output() -> m_unshare(). We
>    should still have only one mbuf with M_PKTHDR flag here.
> 3. esp_output_cb() -> ipsec_process_done() -> ip_output()
> 4. Now IP fragmentation should occur: ip_fragment() creates chain of
>    mbufs to send, where M_PKTHDR flag will be set for each fragment.
>
>>> Do you have some packet normalization using firewall?
>> Default FREEBSD current installation. No explicit firewall.
>> What you think above patch makes sense.
>
> It is not clear to me why it helps. The panic happens on outbound path,
> where mbuf should be allocated by network stack and should be writeable.
> ip_reass() usually used on inbound path. I think the patch just hides
> the problem in another place.
>
> Do you mean that cxgbe can produce !WRITEABLE mbuf for received packet
> and then pass it to the network stack?
>

Yes, cxgbe does that.  But I think the real bug here is in ip_reass
because it doesn't properly get rid of the pkthdr of the fragments while
creating the reassembled datagram.  cxgbe happens to trip on this easily
because it often creates !WRITEABLE mbufs.  This should fix it:
https://people.freebsd.org/~np/ip_reass_demotehdr.diff

It will also fix leaks in configurations where mbuf tags are in use by
default (for example with MAC), ip_reass is involved during rx, and the
mbuf chain never gets m_demote'd elsewhere (meaning ip_reass should have
freed the tags itself).

Regards,
Navdeep
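
To make the invariant concrete, here is a small userland sketch in C. It is only a toy model under stated assumptions, not kernel code and not the contents of the diff linked above: struct toy_mbuf, TOY_M_PKTHDR, toy_demote_pkthdr(), toy_cat(), toy_chain_ok() and toy_stray_tags() are all hypothetical names made up for illustration, and a plain integer stands in for the tag list. It shows why joining fragments without demoting their headers leaves a chain that would trip a "has M_PKTHDR" check like the one in the panic message, and why tags on the joined fragments would be stranded.

```c
/* Toy userland model of the packet-header invariant. NOT FreeBSD kernel code;
 * every name below is hypothetical and exists only for this illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_M_PKTHDR 0x1u       /* stands in for the kernel's M_PKTHDR flag */

struct toy_mbuf {
    struct toy_mbuf *next;      /* next buffer of the same packet */
    unsigned flags;
    int ntags;                  /* stands in for the header's tag list */
};

/* Drop packet-header state from a buffer that will no longer lead a chain. */
static void toy_demote_pkthdr(struct toy_mbuf *m)
{
    m->ntags = 0;               /* a real stack would free the tags here */
    m->flags &= ~TOY_M_PKTHDR;
}

/* Append fragment n to the chain headed by m, optionally demoting n first. */
static void toy_cat(struct toy_mbuf *m, struct toy_mbuf *n, bool demote)
{
    if (demote && (n->flags & TOY_M_PKTHDR))
        toy_demote_pkthdr(n);
    while (m->next != NULL)
        m = m->next;
    m->next = n;
}

/* True if only the first buffer of the chain carries a packet header. */
static bool toy_chain_ok(const struct toy_mbuf *m0)
{
    for (const struct toy_mbuf *m = m0->next; m != NULL; m = m->next)
        if (m->flags & TOY_M_PKTHDR)
            return false;
    return true;
}

/* Tags still attached to non-leading buffers; nothing would ever free them. */
static int toy_stray_tags(const struct toy_mbuf *m0)
{
    int n = 0;
    for (const struct toy_mbuf *m = m0->next; m != NULL; m = m->next)
        n += m->ntags;
    return n;
}
```

In this model, joining two fragments with demote set to false leaves toy_chain_ok() false and one stray tag behind, while joining with demote set to true keeps a single header on the leading buffer and no stray tags.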