Current directory: /ibiblio/distributions/CPAN/authors/id/P/PA/PALVARO/
 
Contents of README:
NAME
    Bloom::Faster - Perl extension for the C library libbloom.

INSTALLATION
    see INSTALL

SYNOPSIS
      use Bloom::Faster;
  
      # m = ideal vector size.  
      # k = # of hash functions to use. 

      my $bloom = Bloom::Faster->new({m => 1000000, k => 5});

      # This gives us very tight control of memory usage (a function of m)
      # and performance (a function of k). But in most applications, we won't
      # know the optimal values of either of these. For these cases, it is
      # much easier to supply:
      #
      # n = number of expected elements to check for duplicates,
      # e = acceptable error rate (probability of a false positive)
      #
      # my $bloom = Bloom::Faster->new({n => 1000000, e => 0.00001});

      while (<>) {
            chomp;
            # $bloom->add() returns true when the value is a (probable) duplicate.
            if ($bloom->add($_)) {
                    print "DUP: $_\n";
            }
      }
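
    The link between (n, e) and (m, k) described in the comments above can
    be sketched with the textbook Bloom filter sizing formulas. This is only
    an illustration: the helper name bloom_params is hypothetical, and the
    README does not say how Bloom::Faster derives m and k internally, so the
    module's own numbers may differ.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper, NOT part of Bloom::Faster. Textbook formulas:
#   m = -n * ln(e) / (ln 2)^2    bits in the vector
#   k = (m / n) * ln 2           number of hash functions
sub bloom_params {
    my ($n, $e) = @_;
    my $ln2 = log(2);                            # Perl's log() is the natural log
    my $m   = ceil(-$n * log($e) / ($ln2 ** 2));
    my $k   = int(($m / $n) * $ln2 + 0.5);       # round to the nearest integer
    $k = 1 if $k < 1;                            # always use at least one hash
    return ($m, $k);
}

my ($m, $k) = bloom_params(1_000_000, 0.00001);
printf "m = %d bits (about %.1f MB), k = %d\n", $m, $m / 8 / 2**20, $k;
```

    For n = 1000000 and e = 0.00001 these formulas give roughly 24 million
    bits (about 3 MB) and 17 hash functions.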

DESCRIPTION
    Bloom filters are a lightweight duplicate detection algorithm proposed
    by Burton Bloom
    (http://portal.acm.org/citation.cfm?id=362692&dl=ACM&coll=portal), with
    applications in stream data processing, among others. Bloom filters are
    a very cool thing. They never miss a true duplicate, so where occasional
    false positives are acceptable, Bloom filters give us the ability to
    detect duplicates in a fast and resource-friendly manner.
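
    To make the idea concrete, here is a toy pure-Perl filter. It is a
    sketch only: the ToyBloom class is hypothetical, and its hashing scheme
    (two 32-bit words taken from an MD5 digest and combined by double
    hashing) is an assumption chosen for brevity, not what libbloom does.

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Toy filter for illustration; not the Bloom::Faster implementation.
package ToyBloom;

sub new {
    my ($class, %args) = @_;
    my $self = { m => $args{m}, k => $args{k}, bits => '' };
    vec($self->{bits}, $args{m} - 1, 1) = 0;    # pre-size the bit string
    return bless $self, $class;
}

# Sets the k bit positions for $key. Returns true if every one of them was
# already set, meaning the key has probably been added before.
sub add {
    my ($self, $key) = @_;
    my $m = $self->{m};
    my ($h1, $h2) = unpack 'N2', Digest::MD5::md5($key);
    my $seen = 1;
    for my $i (0 .. $self->{k} - 1) {
        my $pos = (($h1 % $m) + $i * ($h2 % $m)) % $m;
        $seen = 0 unless vec($self->{bits}, $pos, 1);
        vec($self->{bits}, $pos, 1) = 1;
    }
    return $seen;
}

package main;

my $toy = ToyBloom->new(m => 10_000, k => 5);
for my $word (qw(apple banana apple cherry banana)) {
    print "DUP: $word\n" if $toy->add($word);
}
```

    A repeated key always finds its own bits set, so true duplicates are
    always reported. A false positive happens when bits set by other keys
    happen to cover all k positions of a new key, which becomes more likely
    as the vector fills up.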

    The allocation of memory for the bit vector is handled in the C layer,
    but Perl's OO machinery handles the garbage collection. When a
    Bloom::Faster object goes out of scope, the vector pointed to by the C
    structure will be free()d. To do this manually, the DESTROY builtin
    method can be called.
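
    The scope-based cleanup described above relies on Perl calling DESTROY
    when the last reference to an object disappears. The sketch below shows
    that mechanism with a hypothetical stand-in class (FakeVector), which
    only counts DESTROY calls instead of freeing a C vector.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an XS-backed object. The real module would free
# its C vector in DESTROY; here we only record that the hook ran.
package FakeVector;

our $freed = 0;

sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $freed++ }

package main;

{
    my $v = FakeVector->new;
    # ... use $v ...
}   # last reference dropped here, so Perl calls DESTROY
print "freed after scope exit: $FakeVector::freed\n";

my $w = FakeVector->new;
undef $w;   # dropping the reference early triggers DESTROY as well
print "freed after undef: $FakeVector::freed\n";
```

    In everyday Perl, dropping the last reference (for example with undef)
    is the usual way to release an object before its scope ends.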

    A Bloom filter Perl module is currently available on CPAN, but it is
    profoundly slow and cannot handle large vectors. This alternative uses a
    more efficient C library which can handle arbitrarily large vectors, up
    to the maximum size of a "long long" datatype (at least
    9223372036854775807 on supported systems).

  EXPORT
    None by default.

  Exportable constants
      HASHCNT
      PRIME_SIZ
      SIZ

SEE ALSO
    libbloom.so

AUTHOR
    Peter Alvaro and Dmitriy Ryaboy, <palvaro@ask.com>

COPYRIGHT AND LICENSE
    Copyright (C) 2006 by Peter Alvaro and Dmitriy Ryaboy

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself, either Perl version 5.8.5 or, at
    your option, any later version of Perl 5 you may have available.


Name                         Last modified      Size
Parent Directory                                -
Bloom-Faster-1.3.1.meta      16-Mar-2007 20:08  312
Bloom-Faster-1.3.1.readme    10-Mar-2007 00:16  2.5K
Bloom-Faster-1.3.1.tar.gz    17-Mar-2007 07:25  8.5M
Bloom-Faster-1.3.meta        23-Feb-2007 06:48  310
Bloom-Faster-1.3.readme      23-Feb-2007 06:45  2.5K
Bloom-Faster-1.3.tar.gz      23-Feb-2007 06:51  477K
Bloom-Faster-1.4.meta        17-Mar-2007 08:44  310
Bloom-Faster-1.4.readme      10-Mar-2007 00:16  2.5K
Bloom-Faster-1.4.tar.gz      17-Mar-2007 08:48  602K
Bloom-Faster-1.6.2.meta      12-Jun-2010 23:05  312
Bloom-Faster-1.6.2.readme    22-Jun-2009 02:19  2.5K
Bloom-Faster-1.6.2.tar.gz    12-Jun-2010 23:16  21K
Bloom-Faster-1.6.meta        23-Jun-2009 04:41  307
Bloom-Faster-1.6.readme      22-Jun-2009 02:31  2.5K
Bloom-Faster-1.6.tar.gz      23-Jun-2009 04:42  22K
Bloom-Faster-1.7.meta        13-Jun-2010 00:06  310
Bloom-Faster-1.7.readme      22-Jun-2009 02:19  2.5K
Bloom-Faster-1.7.tar.gz      13-Jun-2010 00:17  21K
CHECKSUMS                    21-Nov-2021 23:55  4.5K
README                       23-Feb-2007 18:54  2.5K
