Re: [FWP] words words words



In article <20000518234029.A10131@teleport.com>,
Steven Alexander <stevena@teleport.com> wrote:
> What would be a fun (short | elegant | fast) way, given two files
> with nine- and three-letter words one per line, respectively, of
> listing only those nines which are the concatenation of three
> threes?
> 
> Anyone who wants to go so far as to do relative benchmarks
> should use the three- and nine-letter words from
> <http://www.geocities.com/TimesSquare/Castle/5057/TWL98.zip>

Here are a couple (I've dumbed these down for the pre-5.6.0'ers):
--------------------------
#!perl -w
use strict;

my (%threes, @nines, @matches);

open(F,'Twl98.txt') or die "Can't open Twl98.txt: $!";
while (<F>) {
  chomp;
  if (length==3) { undef $threes{$_} }
  elsif (length==9) { push @nines, $_ }
}

print "Read ", scalar keys %threes, " three letter words and ",
      scalar @nines, " nine letter words.\n";

@matches = grep exists $threes{substr($_,0,3)} &&
                exists $threes{substr($_,3,3)} &&
                exists $threes{substr($_,6,3)}, @nines;

print scalar @matches, " nine letter words matched using hash.\n";
--------------------------
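
To see the hash lookup at work without downloading the word list, here is a minimal sketch on a made-up sample. The word lists are assumptions for illustration only (BAGPETCAR is not a real word), and unpack is swapped in for the three substr calls:

```perl
#!perl -w
use strict;

# Hypothetical sample data standing in for the TWL98 word list.
my @threes = qw(CAR PET BAG);
my @nines  = qw(CARPETBAG CARPETING BAGPETCAR);

# Three-letter words become hash keys; the values are never used.
my %threes;
@threes{@threes} = ();

# Keep a nine only if all three of its 3-letter chunks are known threes.
my @matches = grep {
    my @parts = unpack 'A3 A3 A3', $_;
    3 == grep { exists $threes{$_} } @parts;
} @nines;

print "@matches\n";
```

CARPETING drops out because ING is not in the sample list of threes.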

and (this one requires a foray into CPAN):

--------------------------
#!perl -w
use strict;
use Regex::PreSuf;

my ($re, @threes, @nines, @matches);

open(F,'Twl98.txt') or die "Can't open Twl98.txt: $!";
while (<F>) {
  chomp;
  if (length==3) { push @threes, $_ }
  elsif (length==9) { push @nines, $_ }
}

print "Read ", scalar @threes, " three letter words and ",
      scalar @nines, " nine letter words.\n";

$re = '^(?:'.presuf(@threes).'){3}$';
@matches = grep /$re/o, @nines;

print scalar @matches, " nine letter words matched using regex: $re\n";
--------------------------
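
For anyone who would rather skip the trip to CPAN, a plain alternation should match the same words, just without the prefix/suffix factoring that presuf() does. This is a rough sketch on a made-up sample (the word lists are assumptions, and BAGPETCAR is not a real word), not a drop-in replacement for the script above:

```perl
#!perl -w
use strict;

# Hypothetical sample data standing in for the TWL98 word list.
my @threes = qw(CAR PET BAG);
my @nines  = qw(CARPETBAG CARPETING BAGPETCAR);

# Unoptimized alternation: CAR|PET|BAG
my $alt = join '|', map { quotemeta($_) } @threes;

# Exactly three threes, anchored at both ends.
my $re = qr/^(?:$alt){3}$/;

my @matches = grep { $_ =~ $re } @nines;

print "@matches\n";
```

With the full list of threes the alternation gets long, which is where a factored pattern like the one presuf() builds is likely to pay off.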

Ilya had said something about a Text::Trie module that might be
similar to Regex::PreSuf, but I couldn't find it.

For the curious but lazy, the output of this one is:
Read 972 three letter words and 24792 nine letter words.
1178 nine letter words matched using regular expression: (?:A(?:A[HLS]|B[AOSY]|C
[ET]|D[DOSZ]|F[FT]|G[AEO]|HA|I[DLMNRST]|L[ABELPST]|M[AIPU]|N[ADEITY]|P[ET]|R[BCE
FKMST]|S[HKPS]|T[ET]|UK|V[AEO]|W[AELN]|XE|Y[ES]|ZO)|B(?:A[ADGHLMNPRSTY]|E[DEGLNT
Y]|I[BDGNOSTZ]|O[ABDGOPSTWXY]|R[AOR]|U[BDGMNRSTY]|Y[ES])|C(?:A[BDMNPRTWY]|E[ELP]
|HI|IS|O[BDGLNOPRSTWXYZ]|RY|U[BDEMPRT]|WM)|D(?:A[BDGHKLMPWY]|E[BELNVWXY]|I[BDEGM
NPST]|O[CEGLMNRSTW]|RY|U[BDEGINOP]|YE)|E(?:A[RTU]|BB|CU|DH|EL|F[FST]|G[GO]|KE|L[
DFKLMS]|M[EFSU]|N[DGS]|ON|R[AEGNRS]|SS|T[AH]|VE|WE|YE)|F(?:A[DGNRSTXY]|E[DEHMNRT
UWYZ]|I[BDEGLNRTXZ]|L[UY]|O[BEGHNPRUXY]|R[OY]|U[BDGNR])|G(?:A[BDEGLMNPRSTY]|E[DE
LMNTY]|HI|I[BDEGNPT]|NU|O[ABDORTXY]|U[LMNTVY]|Y[MP])|H(?:A[DEGHJMOPSTWY]|E[HMNPR
STWXY]|I[CDEMNPST]|MM|O[BDEGNPTWY]|U[BEGHMNPT]|YP)|I(?:C[EHKY]|DS|F[FS]|L[KL]|MP
|N[KNS]|ON|R[EK]|SM|TS|VY)|J(?:A[BGMRWY]|E[ETUW]|I[BGN]|O[BEGTWY]|U[GNST])|K(?:A
[BEFSTY]|E[AFGNPXY]|HI|I[DFNPRT]|O[ABIPRS]|UE)|L(?:A[BCDGMPRSTVWXY]|E[ADEGIKTUVX
YZ]|I[BDENPST]|O[BGOPTWX]|U[GMVX]|YE)|M(?:A[CDEGNPRSTWXY]|E[DLMNTW]|HO|I[BDGLMRS
X]|O[ABCDGLMNOPRSTW]|U[DGMNST])|N(?:A[BEGHMNPWY]|E[BETW]|I[BLMPTX]|O[BDGHMORSTW]
|TH|U[BNST])|O(?:A[FKRT]|B[EI]|CA|D[DES]|ES|F[FT]|H[MOS]|IL|K[AE]|L[DE]|MS|N[ES]
|O[HT]|P[EST]|R[ABCEST]|SE|U[DRT]|VA|W[ELN]|X[OY])|P(?:A[CDHLMNPRSTWXY]|E[ACDEGH
NPRSTW]|H[IT]|I[ACEGNPSTUX]|LY|O[DHILMPTWX]|R[OY]|SI|U[BDGLNPRST]|Y[AEX])|Q(?:AT
|UA)|R(?:A[DGHJMNPSTWXY]|E[BCDEFGIMPSTVX]|HO|I[ABDFGMNP]|O[BCDEMTW]|U[BEGMNT]|Y[
AE])|S(?:A[BCDEGLPTUWXY]|E[ACEGILNRTWX]|H[AEHY]|I[BCMNPRSTX]|K[AIY]|LY|O[BDLNPST
UWXY]|P[AY]|RI|TY|U[BEMNPQ]|YN)|T(?:A[BDEGJMNOPRSTUVWX]|E[ADEGLNTW]|H[EOY]|I[CEL
NPST]|O[DEGMNOPRTWY]|RY|SK|U[BGINPTX]|W[AO]|YE)|U(?:DO|GH|KE|LU|M[MP]|NS|P[OS]|R
[BDN]|SE|T[AS])|V(?:A[CNRSTUVW]|E[EGTX]|I[AEGMS]|O[EWX]|UG)|W(?:A[BDEGNPRSTWXY]|
E[BDENT]|H[AOY]|I[GNSTZ]|O[EGKNOPSTW]|RY|UD|Y[EN])|XIS|Y(?:A[HKMPRWY]|E[AHNPSTW]
|I[DNP]|O[BDKMNUW]|U[KMP])|Z(?:A[GPX]|E[DEK]|I[GNPT]|O[AO]))
