From: Steve O'Hara-Smith
To: Polytropon
Cc: freebsd-questions@freebsd.org
Subject: Re: Convert PDF to Excel
Date: Sat, 23 Jan 2021 10:35:22 +0000

On Sat, 23 Jan 2021 11:14:41 +0100
Polytropon wrote:

> On Sat, 23 Jan 2021 09:04:21 +0000, Steve O'Hara-Smith wrote:
> > On Sat, 23 Jan 2021 09:40:41 +0100
> > Polytropon wrote:
> >
> > > They contain text, so the OCR problem is out of the way.
> > > Sadly, the text is re-arranged so the optimal solution (one
> > > line in a table equals one line of text, with the columns
> > > being separated by whitespace) does not appear, instead it
> > > is the other way round: one line equals one column.
> >
> > I spy a fun interview question buried in this problem -
> > flipping a text file like that efficiently is far from easy - dead easy
> > if you don't mind eating memory of course.
>
> The lesson to learn for this potential interview question
> simply is RTFM; from "man pdftotext": -layout will try its

Aw, but where's the fun in that? There are very few interview
questions for which the right answer isn't "use the tool that already
exists", but that defeats the purpose of interview questions, which is to
watch the candidate squirm^Wthink. Over the years I've picked up a couple
of gems without off-the-shelf or well-known answers.

-- 
Steve O'Hara-Smith                          |   Directable Mirror Arrays
C:\>WIN                                     | A better way to focus the sun
The computer obeys and wins.                |    licences available see
You lose and Bill collects.                 |    http://www.sohara.org/
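
For reference, the "dead easy if you don't mind eating memory" version of the flip discussed above might look roughly like the sketch below. It reads the whole text into memory, treats each input line as one column, and emits one table row per output line. The function name `transpose_lines`, the whitespace-separated cells, and the tab-separated output are all assumptions made for illustration; nothing in the thread specifies them, and cells that themselves contain spaces would need a different splitter.

```python
from itertools import zip_longest


def transpose_lines(text, fill=""):
    """Flip text in which each line holds one column of a table.

    Assumption: cells are separated by whitespace and contain no spaces.
    The whole input is held in memory, so this is the memory-hungry
    approach, not the efficient one.
    """
    # Each non-blank input line becomes one column (a list of cells).
    columns = [line.split() for line in text.splitlines() if line.strip()]
    # zip_longest pairs up the n-th cell of every column into one row,
    # padding short columns with `fill` so ragged input does not lose data.
    rows = zip_longest(*columns, fillvalue=fill)
    # Tab-separated rows are an arbitrary choice for spreadsheet import.
    return "\n".join("\t".join(row) for row in rows)


if __name__ == "__main__":
    print(transpose_lines("name qty price\nbolt 10 0.25\nnut 12 0.10\n"))
```

Doing the same thing without holding everything in memory (for example by seeking repeatedly through the file) is the part that makes it an interview question.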