Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 28 Aug 2014 21:58:12 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        John-Mark Gurney <jmg@funkthat.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, =?ISO-8859-1?Q?Roger_Pau_Monn=E9?= <royger@freebsd.org>, src-committers@freebsd.org
Subject:   Re: svn commit: r269814 - head/sys/dev/xen/blkfront
Message-ID:  <53FF7BC4.6050801@FreeBSD.org>
In-Reply-To: <20140828184515.GV71691@funkthat.com>
References:  <53e8e31e.2179.30c1c657@svn.freebsd.org> <53FF7386.3050804@FreeBSD.org> <20140828184515.GV71691@funkthat.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 28.08.2014 21:45, John-Mark Gurney wrote:
> Alexander Motin wrote this message on Thu, Aug 28, 2014 at 21:23 +0300:
>> Hi, Roger.
>>
>> It looks to me like this commit does not work as it should. I got
>> problem when I just tried `newfs /dev/ada0 ; mount /dev/ada0 /mnt`.
>> Somehow newfs does not produce valid filesystem. Problem is reliably
>> repeatable and reverting this commit fixes it.
>>
>> I found at least one possible cause there: If original data buffer is
>> unmapped, misaligned and not physically contiguous, then present x86
>> bus_dmamap_load_bio() implementation will process each physically
>> contiguous segment separately. Due to the misalignment first and last
>> physical segments may have size not multiple to 512 bytes. Since each
>> segment processed separately, they are not joined together, and
>> xbd_queue_cb() is getting segments not multiple to 512 bytes. Attempt to
>> convert them to exact number of sectors in the driver cause data corruption.
> 
> Are you sure this isn't a problem w/ the tag not properly specifying
> the correct alignement? 

I don't know how to specify it stronger then this:
        error = bus_dma_tag_create(
            bus_get_dma_tag(sc->xbd_dev),       /* parent */
            512, PAGE_SIZE,                     /* algnmnt, boundary */
            BUS_SPACE_MAXADDR,                  /* lowaddr */
            BUS_SPACE_MAXADDR,                  /* highaddr */
            NULL, NULL,                         /* filter, filterarg */
            sc->xbd_max_request_size,
            sc->xbd_max_request_segments,
            PAGE_SIZE,                          /* maxsegsize */
            BUS_DMA_ALLOCNOW,                   /* flags */
            busdma_lock_mutex,                  /* lockfunc */
            &sc->xbd_io_lock,                   /* lockarg */
            &sc->xbd_io_dmat);

> Also, I don't think there is a way for busdma
> to say that you MUST have a segment be a multiple of 512, though you
> could use a 512 boundary, but that would force all segments to only be
> 512 bytes...

As I understand, that is mandatory requirement for this "hardware".
Alike 4K alignment requirement also exist at least for SDHCI, and IIRC
UHCI/OHCI hardware. Even AHCI requires both segment addresses and
lengths to be even.

I may be wrong, but I think it is quite likely that hardware that
requires segment address alignment quite likely will have the same
requirements for segments length.

>> This is a bug of the busdma code, and until it is fixed I don't see
>> solution other then temporary reverting this commit.
>>
>> On 11.08.2014 18:37, Roger Pau Monné   wrote:
>>> Author: royger
>>> Date: Mon Aug 11 15:37:02 2014
>>> New Revision: 269814
>>> URL: http://svnweb.freebsd.org/changeset/base/269814
>>>
>>> Log:
>>>   blkfront: add support for unmapped IO
>>>   
>>>   Using unmapped IO is really beneficial when running inside of a VM,
>>>   since it avoids IPIs to other vCPUs in order to invalidate the
>>>   mappings.
>>>   
>>>   This patch adds unmapped IO support to blkfront. The following tests
>>>   results have been obtained when running on a Xen host without HAP:
>>>   
>>>   PVHVM
>>>        3165.84 real      6354.17 user      4483.32 sys
>>>   PVHVM with unmapped IO
>>>        2099.46 real      4624.52 user      2967.38 sys
>>>   
>>>   This is because when running using shadow page tables TLB flushes and
>>>   range invalidations are much more expensive, so using unmapped IO
>>>   provides a very important performance boost.
>>>   
>>>   Sponsored by:	Citrix Systems R&D
>>>   Tested by:	robak
>>>   MFC after:	1 week
>>>   PR:		191173
>>>   
>>>   dev/xen/blkfront/blkfront.c:
>>>    - Add and announce support for unmapped IO.
>>>
>>> Modified:
>>>   head/sys/dev/xen/blkfront/blkfront.c
>>>
>>> Modified: head/sys/dev/xen/blkfront/blkfront.c
>>> ==============================================================================
>>> --- head/sys/dev/xen/blkfront/blkfront.c	Mon Aug 11 15:06:07 2014	(r269813)
>>> +++ head/sys/dev/xen/blkfront/blkfront.c	Mon Aug 11 15:37:02 2014	(r269814)
>>> @@ -272,8 +272,12 @@ xbd_queue_request(struct xbd_softc *sc, 
>>>  {
>>>  	int error;
>>>  
>>> -	error = bus_dmamap_load(sc->xbd_io_dmat, cm->cm_map, cm->cm_data,
>>> -	    cm->cm_datalen, xbd_queue_cb, cm, 0);
>>> +	if (cm->cm_bp != NULL)
>>> +		error = bus_dmamap_load_bio(sc->xbd_io_dmat, cm->cm_map,
>>> +		    cm->cm_bp, xbd_queue_cb, cm, 0);
>>> +	else
>>> +		error = bus_dmamap_load(sc->xbd_io_dmat, cm->cm_map,
>>> +		    cm->cm_data, cm->cm_datalen, xbd_queue_cb, cm, 0);
>>>  	if (error == EINPROGRESS) {
>>>  		/*
>>>  		 * Maintain queuing order by freezing the queue.  The next
>>> @@ -333,8 +337,6 @@ xbd_bio_command(struct xbd_softc *sc)
>>>  	}
>>>  
>>>  	cm->cm_bp = bp;
>>> -	cm->cm_data = bp->bio_data;
>>> -	cm->cm_datalen = bp->bio_bcount;
>>>  	cm->cm_sector_number = (blkif_sector_t)bp->bio_pblkno;
>>>  
>>>  	switch (bp->bio_cmd) {
>>> @@ -993,7 +995,7 @@ xbd_instance_create(struct xbd_softc *sc
>>>  
>>>  	sc->xbd_disk->d_mediasize = sectors * sector_size;
>>>  	sc->xbd_disk->d_maxsize = sc->xbd_max_request_size;
>>> -	sc->xbd_disk->d_flags = 0;
>>> +	sc->xbd_disk->d_flags = DISKFLAG_UNMAPPED_BIO;
>>>  	if ((sc->xbd_flags & (XBDF_FLUSH|XBDF_BARRIER)) != 0) {
>>>  		sc->xbd_disk->d_flags |= DISKFLAG_CANFLUSHCACHE;
>>>  		device_printf(sc->xbd_dev,
> 


-- 
Alexander Motin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?53FF7BC4.6050801>