Date: Sun, 21 Mar 2021 15:17:28 +0000 (UTC)
From: "Jason W. Bacon" <jwb@FreeBSD.org>
To: ports-committers@freebsd.org, svn-ports-all@freebsd.org, svn-ports-head@freebsd.org
Subject: svn commit: r568922 - in head/biology: . vcf-split
Message-ID: <202103211517.12LFHS9C036761@repo.freebsd.org>

Author: jwb
Date: Sun Mar 21 15:17:27 2021
New Revision: 568922
URL: https://svnweb.freebsd.org/changeset/ports/568922

Log:
  biology/vcf-split: Split a multi-sample VCF into single-sample VCFs

  Vcf-split splits a multi-sample VCF into single-sample VCFs, writing thousands
  of output files simultaneously.  Parsing the TOPMed human chromosome 1 BCF
  with bcftools takes two days, so extracting the 137,977 samples one at a time
  or using thousands of parallel readers of the same file is impractical.
  Vcf-split solves this by generating thousands of single-sample outputs during
  a single sweep through the multi-sample input.

Added:
  head/biology/vcf-split/
  head/biology/vcf-split/Makefile   (contents, props changed)
  head/biology/vcf-split/distinfo   (contents, props changed)
  head/biology/vcf-split/pkg-descr   (contents, props changed)
Modified:
  head/biology/Makefile

Modified: head/biology/Makefile
==============================================================================
--- head/biology/Makefile	Sun Mar 21 15:10:17 2021	(r568921)
+++ head/biology/Makefile	Sun Mar 21 15:17:27 2021	(r568922)
@@ -179,6 +179,7 @@
     SUBDIR += trimadap
     SUBDIR += trimmomatic
     SUBDIR += ugene
+    SUBDIR += vcf-split
     SUBDIR += vcflib
     SUBDIR += vcftools
     SUBDIR += velvet

Added: head/biology/vcf-split/Makefile
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/biology/vcf-split/Makefile	Sun Mar 21 15:17:27 2021	(r568922)
@@ -0,0 +1,23 @@
+# $FreeBSD$
+
+PORTNAME=	vcf-split
+DISTVERSION=	0.1.1
+CATEGORIES=	biology
+
+MAINTAINER=	jwb@FreeBSD.org
+COMMENT=	Split a multi-sample VCF into single-sample VCFs
+
+LICENSE=	BSD2CLAUSE
+LICENSE_FILE=	${WRKSRC}/LICENSE
+
+BUILD_DEPENDS=	biolibc>=0.1.1:biology/biolibc
+
+USE_GITHUB=	yes
+GH_ACCOUNT=	auerlab
+
+PLIST_FILES=	bin/vcf-split man/man1/vcf-split.1.gz
+
+pre-build:
+	(cd ${WRKSRC} && ${MAKE} LOCALBASE=${LOCALBASE} depend)
+
+.include <bsd.port.mk>

Added: head/biology/vcf-split/distinfo
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/biology/vcf-split/distinfo	Sun Mar 21 15:17:27 2021	(r568922)
@@ -0,0 +1,3 @@
+TIMESTAMP = 1616331493
+SHA256 (auerlab-vcf-split-0.1.1_GH0.tar.gz) = 07fb3aff5bf6038b251baa6c0cbff0600487766838b497468ab06d300488f310
+SIZE (auerlab-vcf-split-0.1.1_GH0.tar.gz) = 14226

Added: head/biology/vcf-split/pkg-descr
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/biology/vcf-split/pkg-descr	Sun Mar 21 15:17:27 2021	(r568922)
@@ -0,0 +1,8 @@
+Vcf-split splits a multi-sample VCF into single-sample VCFs, writing thousands
+of output files simultaneously.  Parsing the TOPMed human chromosome 1 BCF
+with bcftools takes two days, so extracting the 137,977 samples one at a time
+or using thousands of parallel readers of the same file is impractical.
+Vcf-split solves this by generating thousands of single-sample outputs during
+a single sweep through the multi-sample input.
+
+WWW: https://github.com/auerlab/vcf-split
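
For readers unfamiliar with the approach, the single-sweep idea in the description can be sketched roughly as follows. This is an illustration only, not the port's actual code: the function name `split_vcf` is hypothetical, in-memory buffers stand in for the per-sample output files, and the input is assumed to be uncompressed, tab-separated VCF text with nine fixed columns before the sample columns.

```python
import io


def split_vcf(lines):
    """Split multi-sample VCF lines into per-sample VCF text in one pass.

    Hypothetical helper for illustration. Returns a dict mapping each
    sample name to the text of its single-sample VCF.
    """
    meta = []      # '##' meta lines, copied into every output
    samples = []   # sample names, in column order
    outputs = {}   # sample name -> buffer standing in for an output file

    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            meta.append(line)
        elif line.startswith("#CHROM"):
            # Column header: 9 fixed columns, then one column per sample.
            fields = line.split("\t")
            fixed, samples = fields[:9], fields[9:]
            for name in samples:
                buf = io.StringIO()
                for m in meta:
                    buf.write(m + "\n")
                buf.write("\t".join(fixed + [name]) + "\n")
                outputs[name] = buf
        elif line:
            # Data line: read once, fan each sample column out to its output.
            fields = line.split("\t")
            fixed = "\t".join(fields[:9])
            for name, genotype in zip(samples, fields[9:]):
                outputs[name].write(fixed + "\t" + genotype + "\n")

    return {name: buf.getvalue() for name, buf in outputs.items()}
```

Each input line is parsed once and written to every sample's output, so the cost of reading the large input is paid a single time rather than once per sample. A tool writing real files would also have to work within the system's open-file limits, which this sketch ignores.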