Date: Fri, 15 Feb 2019 07:22:24 +0000
From: Eric Bautsch <eric.bautsch@pobox.com>
To: "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>, Roger Pau Monné <roger.pau@citrix.com>
Cc: freebsd-xen@freebsd.org
Subject: Re: Issues with XEN and ZFS
Message-ID: <234bf1db-b9e9-f30d-f966-5b4b6973fee7@pobox.com>
In-Reply-To: <201902111543.x1BFhODs071427@pdx.rh.CN85.dnsmgr.net>
References: <201902111543.x1BFhODs071427@pdx.rh.CN85.dnsmgr.net>
Thanks, all, for your help, and my apologies for the late reply; I was out on a
long weekend and then on a customer site until Wednesday night....
Comments/answers inline.
Thanks again.
Eric
On 11/02/2019 15:43, Rodney W. Grimes wrote:
>> Thanks for the testing!
>>
>> On Fri, Feb 08, 2019 at 07:35:04PM +0000, Eric Bautsch wrote:
>>> Hi.
>>>
>>>
>>> Brief abstract: I'm having ZFS/Xen interaction issues with the disks being
>>> declared unusable by the dom0.
>>>
>>>
>>> The longer bit:
>>>
>>> I'm new to FreeBSD, so my apologies for all the stupid questions. I'm trying
>>> to migrate from Linux as my virtual platform host (very bad experiences with
>>> stability, let's leave it at that). I'm hosting mostly Solaris VMs (that
>>> being my choice of OS, but again, Betamax/VHS, need I say more), as well as
>>> a Windows VM (because I have to) and a Linux VM (as a future desktop via
>>> thin clients as and when I have to retire my SunRay solution which also runs
>>> on a VM for lack of functionality).
>>>
>>> So, I got xen working on FreeBSD now after my newbie mistake was pointed out to me.
>>>
>>> However, I seem to be stuck again:
>>>
>>> I have, in this initial test server, only two disks. They are SATA hanging
>>> off the on-board SATA controller. The system is one of those Shuttle XPC
>>> cubes, an older one I had hanging around with 16GB memory and I think 4
>>> cores.
>>>
>>> I've given the dom0 2GB of memory and 2 cores to start with.
>> 2GB might be too low when using ZFS; I would suggest 4G as a minimum
>> for reasonable performance, or even 8G. ZFS is quite memory hungry.
> 2GB should not be too low; I comfortably run ZFS in 1G. ZFS is a
> "free memory hog": by design it uses all the memory it can. Unfortunately
> the free aspect is often overlooked and ZFS does not return memory when
> it should, leading to OOM kills; those are bugs and need to be fixed.
>
> If you are going to run ZFS at all I do strongly suggest overriding
> the arc memory size with vfs.zfs.arc_max= in /boot/loader.conf to be
> something more reasonable than the default 95% of host memory.
On my machines, I tend to limit it to 2GB where there's plenty of memory about.
As this box only has 2GB, I didn't bother, but thanks for letting me know where
and how to do it, as I will need to know at some point... ;-)
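(For my own notes, and purely as a sketch: I assume the cap goes into
/boot/loader.conf along these lines. The 2048M figure is just the value I tend
to use on boxes with plenty of memory, not something anyone has recommended
for this particular dom0.)

# /boot/loader.conf -- cap the ZFS ARC so the dom0 keeps some memory free
# (example value only; adjust to the dom0's actual memory budget)
vfs.zfs.arc_max="2048M"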
>
> For a DOM0 I would start at 50% of memory (so 1G in this case) and
> monitor the DOM0 internally with top, and slowly increase this limit
> until the free memory dropped to the 256MB region. If the work load
> on DOM0 changes dramatically you may need to readjust.
>
>>> The root filesystem is zfs with a mirror between the two disks.
>>>
>>> The entire thing is dead easy to blow away and re-install as I was very
>>> impressed how easy the FreeBSD automatic installer was to understand and
>>> pick up, so I have it all scripted. If I need to blow stuff away to test, no
>>> problem and I can always get back to a known configuration.
>>>
>>>
>>> As I only have two disks, I have created a zfs volume for the Xen domU thus:
>>>
>>> zfs create -V40G -o volmode=dev zroot/nereid0
>>>
>>>
>>> The domU nereid is defined thus:
>>>
>>> cat - << EOI > /export/vm/nereid.cfg
>>> builder = "hvm"
>>> name = "nereid"
>>> memory = 2048
>>> vcpus = 1
>>> vif = [ 'mac=00:16:3E:11:11:51,bridge=bridge0',
>>>         'mac=00:16:3E:11:11:52,bridge=bridge1',
>>>         'mac=00:16:3E:11:11:53,bridge=bridge2' ]
>>> disk = [ '/dev/zvol/zroot/nereid0,raw,hda,rw' ]
>>> vnc = 1
>>> vnclisten = "0.0.0.0"
>>> serial = "pty"
>>> EOI
>>>
>>> nereid itself also auto-installs, it's a Solaris 11.3 instance.
>>>
>>>
>>> As it tries to install, I get this in the dom0:
>>>
>>> Feb 8 18:57:16 bianca.swangage.co.uk kernel: (ada1:ahcich1:0:0:0):
>>> WRITE_FPDMA_QUEUED. ACB: 61 18 a0 ef 88 40 46 00 00 00 00 00
>>> Feb 8 18:57:16 bianca.swangage.co.uk last message repeated 4 times
>>> Feb 8 18:57:16 bianca.swangage.co.uk kernel: (ada1:ahcich1:0:0:0): CAM
>>> status: CCB request was invalid
>> That's weird, and I would say it's not related to ZFS; the same could
>> likely happen with UFS, since this is an error message from the
>> disk controller hardware.
> CCB invalid, that's not good: we sent a command to the drive/controller that
> it does not like.
> This drive may need to be quirked in some way, or there may be
> some hardware issues here of some kind.
Perhaps I should have pointed out that these two disks are both identical and not SSDs:
Geom name: ada0
Providers:
1. Name: ada0
Mediasize: 1000204886016 (932G)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r2w2e3
descr: ST1000LM035-1RK172
lunid: 5000c5009d4d4c12
ident: WDE0R5LL
rotationrate: 5400
fwsectors: 63
fwheads: 16
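In case it helps with the quirk question, this is roughly how I would pull more
detail off the drive; camcontrol is in base and smartctl comes from the
sysutils/smartmontools port (a sketch only, and I'm assuming ada1 is the disk
throwing the errors):

# Dump the ATA IDENTIFY data (firmware, NCQ queue depth, supported features)
camcontrol identify ada1
# SMART health, attributes and error log, if smartmontools is installed
smartctl -a /dev/ada1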
>> Can you test whether the same happens _without_ Xen running?
>>
>> Ie: booting FreeBSD without Xen and then doing some kind of disk
>> stress test, like fio [0].
I've just run fio thus (sorry, I've not used it before; this seemed like a
reasonable set of options, but tell me if there's a better set):
fio --name=randwrite --iodepth=4 --rw=randwrite --bs=4k --direct=0 --size=512M
--numjobs=10 --runtime=1200 --group_reporting
Leading to this output when I stopped it:
randwrite: (groupid=0, jobs=10): err= 0: pid=68148: Thu Feb 14 09:50:08 2019
write: IOPS=926, BW=3705KiB/s (3794kB/s)(2400MiB/663425msec)
clat (usec): min=10, max=4146.6k, avg=9558.71, stdev=94020.98
lat (usec): min=10, max=4146.6k, avg=9558.97, stdev=94020.98
clat percentiles (usec):
| 1.00th=[ 47], 5.00th=[ 52], 10.00th=[ 100],
| 20.00th=[ 133], 30.00th=[ 161], 40.00th=[ 174],
| 50.00th=[ 180], 60.00th=[ 204], 70.00th=[ 249],
| 80.00th=[ 367], 90.00th=[ 2008], 95.00th=[ 10552],
| 99.00th=[ 160433], 99.50th=[ 566232], 99.90th=[1367344],
| 99.95th=[2055209], 99.99th=[2868904]
bw ( KiB/s): min= 7, max=16383, per=16.36%, avg=606.11, stdev=1379.59,
samples=7795
iops : min= 1, max= 4095, avg=151.06, stdev=344.94, samples=7795
lat (usec) : 20=0.51%, 50=2.53%, 100=6.88%, 250=60.31%, 500=12.97%
lat (usec) : 750=2.16%, 1000=1.64%
lat (msec) : 2=2.98%, 4=2.65%, 10=2.27%, 20=1.16%, 50=1.58%
lat (msec) : 100=0.95%, 250=0.63%, 500=0.22%, 750=0.17%, 1000=0.16%
cpu : usr=0.04%, sys=0.63%, ctx=660907, majf=1, minf=10
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,614484,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=4
Run status group 0 (all jobs):
WRITE: bw=3705KiB/s (3794kB/s), 3705KiB/s-3705KiB/s (3794kB/s-3794kB/s),
io=2400MiB (2517MB), run=663425-663425msec
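If it would be more useful, I could also point fio straight at the zvol the
domU sits on rather than at files on the root pool. Something like the sketch
below is what I have in mind; it would of course overwrite the zvol's
contents, so I would only run it just before re-installing nereid:

# WARNING: writes over the zvol, destroying whatever the domU has on it
fio --name=zvol-randwrite --filename=/dev/zvol/zroot/nereid0 \
    --rw=randwrite --bs=4k --iodepth=4 --ioengine=posixaio \
    --size=4G --runtime=1200 --group_reporting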
I didn't manage to produce any errors in the log files...
Just to be on the safe side, I have changed the dom0 memory to 4GB and limited
ZFS arc to 1GB thus:
xen_cmdline="dom0_mem=4092M dom0_max_vcpus=2 dom0=pvh console=com1,vga
com1=115200,8n1 guest_loglvl=all loglvl=all"
vfs.zfs.arc_max="1024M"
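Once the box is back up, I am assuming I can confirm that both settings took
effect with something along these lines:

# ARC cap the kernel actually picked up (reported in bytes)
sysctl vfs.zfs.arc_max
# Memory Xen has handed to dom0 and what is still free
xl info | grep -E 'total_memory|free_memory'
xl list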
I've now re-created one of my domUs and I have not experienced any issues at all
this time. Of course I now don't know if it was the limiting of ZFS arc, the
increase in memory or both together that fixed it.
I will attempt further tests and update the list....
Thanks again.
Eric
--
____
/ . Eric A. Bautsch
/-- __ ___ ______________________________________
/ / / / /
(_____/____(___(__________________/ email: eric.bautsch@pobox.com