Date:      Thu, 15 Sep 2016 17:41:03 +0300
From:      Slawa Olhovchenkov <slw@zxy.spb.ru>
To:        freebsd-stable@freebsd.org
Cc:        John Baldwin <jhb@freebsd.org>
Subject:   Re: nginx and FreeBSD11
Message-ID:  <20160915144103.GB2960@zxy.spb.ru>
In-Reply-To: <20160907191348.GD22212@zxy.spb.ru>
References:  <20160907191348.GD22212@zxy.spb.ru>

On Wed, Sep 07, 2016 at 10:13:48PM +0300, Slawa Olhovchenkov wrote:

> I have a strange issue with nginx on FreeBSD11.
> I have FreeBSD11 installed over STABLE-10.
> nginx built for FreeBSD10 and run without recompiling works fine.
> nginx built for FreeBSD11 crashes inside rbtree lookups: the next node
> is totally corrupted.
> 
> I see the following potential causes:
> 
> 1) clang 3.8 code generation issue
> 2) system library issue
> 
> Maybe I am missing something?
> 
> How can I find the real cause?

I found the real cause, and it looks like a show-stopper for RELEASE.
I use nginx with AIO, and AIO from one nginx process corrupts the
memory of another nginx process. Yes, this is cross-process memory
corruption.

In the latest case, the process with pid 1060 dumped core at 15:45:14.
The corrupted memory is at 0x860697000.
I know that the memory at 0x86067f800 is good.
Dumping this region (from the core) to a file and analyzing it with
hexdump, I found the start of the corrupted region at offset 0000c8c0
from 0x86067f800.
0x86067f800+0xc8c0 = 0x86068c0c0

Beforehand, I had enabled debug logging of every started AIO operation
to the nginx error log (memory address, file name, offset and size of
transfer).

grep -i 86068c0c0 error.log near 15:45:14 gives the target file.
grep ce949665cbcd.hls error.log near 15:45:14 gives the following result:

2016/09/15 15:45:13 [notice] 1055#0: *11659936 AIO_RD 000000082065DB60 start 000000086068C0C0 561b0   2646736 ce949665cbcd.hls
2016/09/15 15:45:14 [notice] 1060#0: *10998125 AIO_RD 000000081F1FFB60 start 000000086FF2C0C0 6cdf0 140016832 ce949665cbcd.hls
2016/09/15 15:45:14 [notice] 1055#0: *11659936 AIO_RD 00000008216B6B60 start 000000086472B7C0 7ff70   2999424 ce949665cbcd.hls

0x860697000-0x86068c0c0 = 0xaf40
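
The address arithmetic above can be cross-checked mechanically. The
following is a minimal sketch, not part of the original debugging
session. It assumes that in each AIO_RD line the field after "start" is
the destination buffer address, the next field is the transfer size in
hex, and the one after that is the decimal file offset. The field
layout, the regular expression and the name reads_covering are
assumptions made for illustration.

```python
import re

LOG = """\
2016/09/15 15:45:13 [notice] 1055#0: *11659936 AIO_RD 000000082065DB60 start 000000086068C0C0 561b0   2646736 ce949665cbcd.hls
2016/09/15 15:45:14 [notice] 1060#0: *10998125 AIO_RD 000000081F1FFB60 start 000000086FF2C0C0 6cdf0 140016832 ce949665cbcd.hls
2016/09/15 15:45:14 [notice] 1055#0: *11659936 AIO_RD 00000008216B6B60 start 000000086472B7C0 7ff70   2999424 ce949665cbcd.hls
"""

# Assumed layout: pid, buffer address (hex), size (hex),
# file offset (decimal), file name.
LINE_RE = re.compile(
    r"\] (?P<pid>\d+)#\d+: .* AIO_RD [0-9A-Fa-f]+ start "
    r"(?P<buf>[0-9A-Fa-f]+) (?P<size>[0-9A-Fa-f]+)\s+(?P<off>\d+) (?P<name>\S+)"
)

def reads_covering(addr, log_text):
    """Return (pid, offset into buffer, file offset) for every logged
    read whose destination buffer spans addr."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        buf, size = int(m["buf"], 16), int(m["size"], 16)
        if buf <= addr < buf + size:
            hits.append((int(m["pid"]), addr - buf, int(m["off"])))
    return hits

hits = reads_covering(0x860697000, LOG)
for pid, delta, file_off in hits:
    print(f"pid {pid}: buffer offset {delta:#x}, file offset of read {file_off}")
```

Under these assumptions, only the read issued by pid 1055 covers
0x860697000, at buffer offset 0xaf40, although the corrupted memory
belongs to pid 1060.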

from memory dump:
0000af00  5c 81 4d 7c 0b b6 81 f2  c8 a5 df 94 08 43 c1 08  |\.M|.........C..|
0000af10  74 00 57 55 5f 15 11 b1  00 d5 29 6a 4e d2 fd fb  |t.WU_.....)jN...|
0000af20  49 d1 fd 98 49 58 b7 66  c2 c9 64 67 30 05 06 c0  |I...IX.f..dg0...|
0000af30  0e b2 64 fa b7 9f 69 69  fc cd 91 82 83 ba c3 f2  |..d...ii........|
0000af40  b7 34 eb 8e 0e 88 40 60  1b a8 71 7a 12 15 26 d3  |.4....@`..qz..&.|
0000af50  7f 3e 80 e9 74 96 30 24  cb 82 88 8a ea e0 45 10  |.>..t.0$......E.|
0000af60  e5 75 b2 f7 5b 7c 83 fa  95 a9 09 80 0a 8c fd a9  |.u..[|..........|
0000af70  ef 30 f6 68 9c b2 3f ae  2e e5 21 79 78 8b 34 36  |.0.h..?...!yx.46|
0000af80  c6 55 16 a2 47 00 ca 13  9c 8e 2c 6b eb c7 4f 51  |.U..G.....,k..OQ|
0000af90  81 80 71 f3 a5 9a 5f 40  54 9c f1 f9 ba 81 b2 82  |..q..._@T.......|

from disk file (offset from 2646736):
0000af00  5c 81 4d 7c 0b b6 81 f2  c8 a5 df 94 08 43 c1 08  |\.M|.........C..|
0000af10  74 00 57 55 5f 15 11 b1  00 d5 29 6a 4e d2 fd fb  |t.WU_.....)jN...|
0000af20  49 d1 fd 98 49 58 b7 66  c2 c9 64 67 30 05 06 c0  |I...IX.f..dg0...|
0000af30  0e b2 64 fa b7 9f 69 69  fc cd 91 82 83 ba c3 f2  |..d...ii........|
0000af40  b7 34 eb 8e 0e 88 40 60  1b a8 71 7a 12 15 26 d3  |.4....@`..qz..&.|
0000af50  7f 3e 80 e9 74 96 30 24  cb 82 88 8a ea e0 45 10  |.>..t.0$......E.|
0000af60  e5 75 b2 f7 5b 7c 83 fa  95 a9 09 80 0a 8c fd a9  |.u..[|..........|
0000af70  ef 30 f6 68 9c b2 3f ae  2e e5 21 79 78 8b 34 36  |.0.h..?...!yx.46|
0000af80  c6 55 16 a2 47 00 ca 13  9c 8e 2c 6b eb c7 4f 51  |.U..G.....,k..OQ|
0000af90  81 80 71 f3 a5 9a 5f 40  54 9c f1 f9 ba 81 b2 82  |..q..._@T.......|
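
The two hexdumps can also be compared programmatically rather than by
eye. This is a minimal sketch, assuming the suspect region has already
been extracted from the core into a file. The helper name matches_file
and any paths passed to it are hypothetical.

```python
def matches_file(dump_path, dump_off, file_path, file_off, length):
    """Return True if `length` bytes at dump_off in the memory dump
    equal the bytes at file_off in the on-disk file."""
    with open(dump_path, "rb") as d, open(file_path, "rb") as f:
        d.seek(dump_off)
        f.seek(file_off)
        a = d.read(length)
        b = f.read(length)
    # Require full-length reads so a short read is never reported as a match.
    return len(a) == length and a == b
```

For the excerpt above, and assuming the dump file begins at the buffer
address 0x86068c0c0, the call would compare 0xa0 bytes at dump offset
0xaf00 against the disk file at offset 2646736 + 0xaf00.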

Bingo!
The file read via AIO by process 1055 was placed at the requested
memory address, but in the address space of process 1060!

This is a kernel bug, and it must block the release.


