From: egoitz@ramattack.net
To: Stefan Esser
Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org, Rainer Duffner
Subject: Re: {* 05.00 *}Re: Desperate with 870 QVO and ZFS
Date: Wed, 06 Apr 2022 18:34:56 +0200
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

Hi Stefan!

Thank you so much for your answer! I have answered inline below so that my replies are easier to tell apart.

Thanks a lot for all your comments, Stefan! :)

Cheers!

On 2022-04-06 17:43, Stefan Esser wrote:

> On 06.04.22 at 16:36, egoitz@ramattack.net wrote:
>
>> Hi Rainer!
>>
>> Thank you so much for your help :)
>>
>> Well, I assume they are in a datacenter, so there should not be a
>> power outage.
>>
>> About dataset size: yes, ours are big. Each dataset can easily be
>> 3-4 TB.
>>
>> We bought them because they are for mailboxes, and mailboxes grow and
>> grow, so we needed the space to host them.
>
> Which mailbox format (e.g. mbox, maildir, ...) do you use?

I'm running Cyrus IMAP, so it is sort of Maildir: normally lots of small files, and sometimes directories with tons of small files.

>> We knew they had some speed issues, but we thought (as Samsung explains
>> on the QVO site) that those issues only started after exceeding the
>> speed buffer these disks have. We thought that as long as you didn't
>> exceed its capacity (the capacity of the speed buffer), no speed
>> problem would arise. Perhaps we were wrong?
>
> These drives are meant for small loads in a typical PC use case,
> i.e. some installations of software in the few GB range, else only
> files of a few MB being written, perhaps an import of media files
> that range from tens to a few hundred MB at a time, but less often
> than once a day.

We handle lots of small files, with lots of different concurrent modifications from the 1500-2000 concurrent IMAP connections we have.

> As the SSD fills, the space available for the single level write
> cache gets smaller

Is the single level write cache the cache these SSD drives have to compensate for the speed issues caused by using QLC memory? Is that what you are referring to? Sorry, I don't fully understand this paragraph.

> (on many SSDs, I have no numbers for this
> particular device), and thus the amount of data that can be
> written at single cell speed shrinks as the SSD gets full.
>
> I have just looked up the size of the SLC cache, it is specified
> to be 78 GB for the empty SSD, 6 GB when it is full (for the 2 TB
> version, smaller models will have a smaller SLC cache).

Assuming you mean the cache that compensates for the speed issues we discussed earlier, I should say these are 870 QVO drives, but the 8 TB version. So they should have the biggest cache for compensating the speed issues.

> But after writing those few GB at a speed of some 500 MB/s (i.e.
> after 12 to 150 seconds), the drive will need several minutes to
> transfer those writes to the quad-level cells, and will operate
> at a fraction of the nominal performance during that time.
> (QLC writes max out at 80 MB/s for the 1 TB model, 160 MB/s for the
> 2 TB model.)

Well, we have the 8 TB model. I think I have understood what you wrote in the previous paragraph: the drives can be fast, but not constantly, because later they have to move all that data from the cache to their permanent storage, and that is slow. Am I wrong? Do you think this holds even for the 8 TB model, Stefan?

The main problem we are facing is that at some peak moments, when the machine is serving connections for all the instances it hosts (and, as I said, only at peak moments, such as 9 AM or 11 AM), the machine seems to become slower, as if the disks were not able to serve everything they have to serve. At those moments no big files are being moved, but we have 1800-2000 concurrent IMAP connections, each normally making small changes to its mailbox. Do you think these disks are perhaps not appropriate for this kind of usage?

> And cheap SSDs often have no RAM cache (not checked, but I'd be
> surprised if the QVO had one) and thus cannot keep bookkeeping data
> in such a cache, further limiting the performance under load.

The brochure (https://semiconductor.samsung.com/resources/brochure/870_Series_Brochure.pdf) and the datasheet (https://semiconductor.samsung.com/resources/data-sheet/Samsung_SSD_870_QVO_Data_Sheet_Rev1.1.pdf) say, if I have read them properly, that the 8 TB drive has 8 GB of RAM. I assume that is what they call the Turbo Write cache?

> And the resilience (max. amount of data written over its lifetime)
> is also quite low - I hope those drives are used in some kind of
> RAID configuration.

Yes, we use RAIDZ-2.

> The 870 QVO is specified for 370 full capacity
> writes, i.e. 370 TB for the 1 TB model. That's still a few hundred
> GB a day - but only if the write amplification stays in a reasonable
> range ...

Well, yes: 2880 TB in our case. Not bad, is it?
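
To put rough numbers on the figures quoted above, here is a minimal Python sketch. It only redoes the arithmetic from this thread: the 6 GB and 78 GB SLC cache sizes and the 500 MB/s rate are the values quoted for the 2 TB model, and 370 TB and 2880 TB are the endurance figures mentioned above. The function names and the 5-year service life are my own assumptions for illustration, not anything taken from Samsung's documentation.

```python
# Back-of-the-envelope figures from the thread; nothing here is measured.


def burst_seconds(cache_gb: float, write_mb_s: float) -> float:
    """Seconds of sustained writing until an SLC cache of cache_gb is full."""
    return cache_gb * 1000 / write_mb_s


def daily_write_budget_gb(endurance_tb: float, years: float) -> float:
    """Average GB per day that uses up endurance_tb over the given years."""
    return endurance_tb * 1000 / (years * 365)


# SLC cache sizes quoted for the 2 TB model: 6 GB (full) to 78 GB (empty).
print(burst_seconds(6, 500))    # 12.0
print(burst_seconds(78, 500))   # 156.0

# Endurance: 370 TB (1 TB model, as quoted) and 2880 TB (8 TB model).
# ASSUMPTION: a 5-year service life, chosen only for illustration.
print(round(daily_write_budget_gb(370, 5)))    # 203
print(round(daily_write_budget_gb(2880, 5)))   # 1578
```

These are per-drive averages. They ignore write amplification and the RAIDZ-2 parity overhead, so the usable budget for application writes would be lower in practice.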