=head1 NAME

perlfilter - Source Filters

=head1 DESCRIPTION

This article is about a little-known feature of Perl called
I<source filters>. Source filters alter the program text of a module
before Perl sees it, much as a C preprocessor alters the source text of
a C program before the compiler sees it. This article tells you more
about what source filters are, how they work, and how to write your
own.

The original purpose of source filters was to let you encrypt your
program source to prevent casual piracy. This isn't all they can do, as
you'll soon learn. But first, the basics.

=head1 CONCEPTS

Before the Perl interpreter can execute a Perl script, it must first
read it from a file into memory for parsing and compilation. If that
script itself includes other scripts with a C<use> or C<require>
statement, then each of those scripts will have to be read from their
respective files as well.

Now think of each logical connection between the Perl parser and an
individual file as a I<source stream>. A source stream is created when
the Perl parser opens a file; it continues to exist as the source code
is read into memory, and it is destroyed when Perl is finished parsing
the file. If the parser encounters a C<require> or C<use> statement in
a source stream, a new and distinct stream is created just for that
file.

The diagram below represents a single source stream, with the flow of
source from a Perl script file on the left into the Perl parser on the
right. This is how Perl normally operates.

    file -------> parser

There are two important points to remember:

=over 5

=item 1.

Although there can be any number of source streams in existence at any
given time, only one will be active.

=item 2.

Every source stream is associated with only one file.

=back

A source filter is a special kind of Perl module that intercepts and
modifies a source stream before it reaches the parser. A source filter
changes our diagram like this:

    file ----> filter ----> parser

If that doesn't make much sense, consider the analogy of a command
pipeline. Say you have a shell script stored in the compressed file
I<trial.gz>. The simple pipeline command below runs the script without
needing to create a temporary file to hold the uncompressed file.

    gunzip -c trial.gz | sh

In this case, the data flow from the pipeline can be represented as
follows:

    trial.gz ----> gunzip ----> sh

With source filters, you can store the text of your script compressed
and use a source filter to uncompress it for Perl's parser:

     compressed           gunzip
    Perl program ---> source filter ---> parser

=head1 USING FILTERS

So how do you use a source filter in a Perl script? Above, I said that
a source filter is just a special kind of module. Like all Perl
modules, a source filter is invoked with a use statement.

Say you want to pass your Perl source through the C preprocessor before
execution. As it happens, the source filters distribution comes with a C
preprocessor filter module called Filter::cpp.

Below is an example program, C<cpp_test>, which makes use of this filter.
Line numbers have been added to allow specific lines to be referenced
easily.

    1: use Filter::cpp;
    2: #define TRUE 1
    3: $a = TRUE;
    4: print "a = $a\n";

When you execute this script, Perl creates a source stream for the
file. Before the parser processes any of the lines from the file, the
source stream looks like this:

    cpp_test ---------> parser

Line 1, C<use Filter::cpp>, includes and installs the C<cpp> filter
module. All source filters work this way. The use statement is compiled
and executed at compile time, before any more of the file is read, and
it attaches the cpp filter to the source stream behind the scenes. Now
the data flow looks like this:

    cpp_test ----> cpp filter ----> parser

As the parser reads the second and subsequent lines from the source
stream, it feeds those lines through the C<cpp> source filter before
processing them. The C<cpp> filter simply passes each line through the
real C preprocessor. The output from the C preprocessor is then
inserted back into the source stream by the filter.

                  .-> cpp --.
                  |         |
                  |         |
                  |       <-'
   cpp_test ----> cpp filter ----> parser

The parser then sees the following code:

    use Filter::cpp;
    $a = 1;
    print "a = $a\n";

Let's consider what happens when the filtered code includes another
module with use:

    1: use Filter::cpp;
    2: #define TRUE 1
    3: use Fred;
    4: $a = TRUE;
    5: print "a = $a\n";

The C<cpp> filter does not apply to the text of the Fred module, only
to the text of the file that used it (C<cpp_test>). Although the use
statement on line 3 will pass through the cpp filter, the module that
gets included (C<Fred>) will not. The source streams look like this
after line 3 has been parsed and before line 4 is parsed:

    cpp_test ---> cpp filter ---> parser (INACTIVE)

    Fred.pm ----> parser

As you can see, a new stream has been created for reading the source
from C<Fred.pm>. This stream will remain active until all of C<Fred.pm>
has been parsed. The source stream for C<cpp_test> will still exist,
but is inactive. Once the parser has finished reading Fred.pm, the
source stream associated with it will be destroyed. The source stream
for C<cpp_test> then becomes active again and the parser reads line 4
and subsequent lines from C<cpp_test>.

You can use more than one source filter on a single file. Similarly,
you can reuse the same filter in as many files as you like.

For example, if you have a uuencoded and compressed source file, it is
possible to stack a uudecode filter and an uncompression filter like
this:

    use Filter::uudecode; use Filter::uncompress;
    M'XL(".H<US4''V9I;F%L')Q;>7/;1I;_>_I3=&E=%:F*I"T?22Q/
    M6]9*<IQCO*XFT"0[PL%%'Y+IG?WN^ZYN-$'J.[.JE$,20/?K=_[>
    ...

Once the first line has been processed, the flow will look like this:

    file ---> uudecode ---> uncompress ---> parser
               filter         filter

Data flows through filters in the same order they appear in the source
file. The uudecode filter appeared before the uncompress filter, so the
source file will be uudecoded before it's uncompressed.
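
The ordering rule can be checked on plain strings, without installing
any filter. The sketch below is an illustration only: it uses Perl's
built-in C<pack>/C<unpack> with the C<u> template for uuencoding, and
the core C<IO::Compress::Gzip> and C<IO::Uncompress::Gunzip> modules as
a stand-in for the compression step. That stand-in is an assumption;
gzip is not necessarily the format that C<Filter::uncompress> handles.

    use strict;
    use warnings;
    use IO::Compress::Gzip qw(gzip $GzipError);
    use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

    my $source = qq{print "hello\\n";\n};

    # Encoding: compress first, then uuencode the compressed bytes.
    gzip(\$source => \my $compressed)
        or die "gzip failed: $GzipError\n";
    my $encoded = pack('u', $compressed);

    # Decoding must undo the outermost layer first: uudecode, then
    # uncompress. This is the order in which the filters are stacked.
    my $uudecoded = unpack('u', $encoded);
    gunzip(\$uudecoded => \my $decoded)
        or die "gunzip failed: $GunzipError\n";

    print $decoded eq $source ? "round trip ok\n" : "round trip failed\n";

Swapping the two decoding steps would hand uuencoded text to the
decompressor, which is why the order of the C<use> lines matters.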

=head1 WRITING A SOURCE FILTER

There are three ways to write your own source filter. You can write it
in C, use an external program as a filter, or write the filter in Perl.
I won't cover the first two in any great detail, so I'll get them out
of the way first. Writing the filter in Perl is most convenient, so
I'll devote the most space to it.

=head1 WRITING A SOURCE FILTER IN C

The first of the three available techniques is to write the filter
completely in C. The external module you create interfaces directly
with the source filter hooks provided by Perl.

The advantage of this technique is that you have complete control over
the implementation of your filter. The big disadvantage is the
increased complexity required to write the filter - not only do you
need to understand the source filter hooks, but you also need a
reasonable knowledge of Perl guts. One of the few times it is worth
going to this trouble is when writing a source scrambler. The
C<decrypt> filter (which unscrambles the source before Perl parses it)
included with the source filter distribution is an example of a C
source filter (see Decryption Filters, below).

=over 5

=item B<Decryption Filters>

All decryption filters work on the principle of "security through
obscurity." Regardless of how well you write a decryption filter and
how strong your encryption algorithm is, anyone determined enough can
retrieve the original source code. The reason is quite simple - once
the decryption filter has decrypted the source back to its original
form, fragments of it will be stored in the computer's memory as Perl
parses it. The source might only be in memory for a short period of
time, but anyone possessing a debugger, skill, and lots of patience can
eventually reconstruct your program.

That said, there are a number of steps that can be taken to make life
difficult for the potential cracker. The most important: Write your
decryption filter in C and statically link the decryption module into
the Perl binary. For further tips to make life difficult for the
potential cracker, see the file I<decrypt.pm> in the source filters
distribution.

=back

=head1 CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE

An alternative to writing the filter in C is to create a separate
executable in the language of your choice. The separate executable
reads from standard input, does whatever processing is necessary, and
writes the filtered data to standard output. C<Filter::cpp> is an
example of a source filter implemented as a separate executable - the
executable is the C preprocessor bundled with your C compiler.

The source filter distribution includes two modules that simplify this
task: C<Filter::exec> and C<Filter::sh>. Both allow you to run any
external executable. Both use a coprocess to control the flow of data
into and out of the external executable. (For details on coprocesses,
see Stevens, W.R., "Advanced Programming in the UNIX Environment."
Addison-Wesley, ISBN 0-201-56317-7, pages 441-445.) The difference
between them is that C<Filter::exec> spawns the external command
directly, while C<Filter::sh> spawns a shell to execute the external
command. (Unix uses the Bourne shell; NT uses the cmd shell.) Spawning
a shell allows you to make use of the shell metacharacters and
redirection facilities.

Here is an example script that uses C<Filter::sh>:

    use Filter::sh 'tr XYZ PQR';
    $a = 1;
    print "XYZ a = $a\n";

The output you'll get when the script is executed:

    PQR a = 1

Writing a source filter as a separate executable works fine, but a
small performance penalty is incurred. For example, if you execute the
small example above, a separate subprocess will be created to run the
Unix C<tr> command. Each use of the filter requires its own subprocess.
If creating subprocesses is expensive on your system, you might want to
consider one of the other options for creating source filters.
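
As a minimal sketch of such a standalone filter, the script below does
in Perl what the C<tr XYZ PQR> command does in the example above. The
file name I<xyz2pqr.pl> and the C<translate_line> helper are
hypothetical names chosen for this illustration; they are not part of
the source filters distribution.

    #!/usr/bin/perl
    # xyz2pqr.pl (hypothetical name): a filter written as a separate
    # executable. It reads source text on standard input and writes
    # the filtered text to standard output.
    use strict;
    use warnings;

    # Map X, Y and Z to P, Q and R, as "tr XYZ PQR" would.
    sub translate_line {
        my ($line) = @_;
        $line =~ tr/XYZ/PQR/;
        return $line;
    }

    # Run the read/transform/write loop only when executed as a
    # program, so the helper above can also be loaded on its own.
    unless (caller) {
        while (my $line = <STDIN>) {
            print translate_line($line);
        }
    }

Assuming the script is saved as I<xyz2pqr.pl> in the current directory,
it could take the place of the C<tr> command in the example, as in
C<use Filter::sh 'perl xyz2pqr.pl';>.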

=head1 WRITING A SOURCE FILTER IN PERL

The easiest and most portable option available for creating your own
source filter is to write it completely in Perl. To distinguish this
from the previous two techniques, I'll call it a Perl source filter.

To help understand how to write a Perl source filter we need an example
to study. Here is a complete source filter that performs rot13
decoding. (Rot13 is a very simple encryption scheme used in Usenet
postings to hide the contents of offensive posts. It moves every letter
forward thirteen places, so that A becomes N, B becomes O, and Z
becomes M.)

   package Rot13;

   use Filter::Util::Call;

   sub import {
      my ($type) = @_;
      my ($ref) = [];
      filter_add(bless $ref);
   }

   sub filter {
      my ($self) = @_;
      my ($status);

      tr/n-za-mN-ZA-M/a-zA-Z/
         if ($status = filter_read()) > 0;
      $status;
   }

   1;

All Perl source filters are implemented as Perl classes and have the
same basic structure as the example above.

First, we include the C<Filter::Util::Call> module, which exports a
number of functions into your filter's namespace. The filter shown
above uses two of these functions, C<filter_add()> and
C<filter_read()>.

Next, we create the filter object and associate it with the source
stream by defining the C<import> function. If you know Perl well
enough, you know that C<import> is called automatically every time a
module is included with a use statement. This makes C<import> the ideal
place to both create and install a filter object.
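
The following sketch shows why C<import> is a convenient hook: C<use>
runs at compile time and calls the module's C<import> method before the
rest of the file is compiled. The package name C<ImportDemo> and the
C<@calls> array are made up for this illustration. The module is
defined inline, and C<%INC> is set by hand so that C<use> does not look
for a file on disk.

    use strict;
    use warnings;

    our @calls;    # records every call to ImportDemo->import

    BEGIN {
        package ImportDemo;    # hypothetical module, defined inline

        # Pretend the module is already loaded so that "use" skips
        # the search for ImportDemo.pm.
        $INC{'ImportDemo.pm'} = __FILE__;

        sub import {
            my ($type, @args) = @_;
            push @main::calls, "$type(@args)";
        }
    }

    use ImportDemo;                # calls ImportDemo->import()
    use ImportDemo qw(one two);    # calls ImportDemo->import('one', 'two')

    print "$_\n" for @calls;

Each C<use> line adds one entry to C<@calls>, and both entries are
recorded before the final C<print> statement runs.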

In the example filter, the object (C<$ref>) is blessed just like any
other Perl object. Our example uses an anonymous array, but this isn't
a requirement. Because this example doesn't need to store any context
information, we could have used a scalar or hash reference just as
well. The next section demonstrates context data.

The association between the filter object and the source stream is made
with the C<filter_add()> function. This takes a filter object as a
parameter (C<$ref> in this case) and installs it in the source stream.

Finally, there is the code that actually does the filtering. For this
type of Perl source filter, all the filtering is done in a method
called C<filter()>. (It is also possible to write a Perl source filter
using a closure. See the C<Filter::Util::Call> manual page for more
details.) It's called every time the Perl parser needs another line of
source to process. The C<filter()> method, in turn, reads lines from
the source stream using the C<filter_read()> function.

If a line was available from the source stream, C<filter_read()>
returns a status value greater than zero and appends the line to C<$_>.
A status value of zero indicates end-of-file; a value less than zero
means an error. The filter function itself is expected to return its
status in the same way, and put the filtered line it wants written to
the source stream in C<$_>. The use of C<$_> accounts for the brevity
of most Perl source filters.
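
The closure form mentioned above can be sketched as follows. This is a
variant of the Rot13 filter, not the version shown earlier, and the
package name C<Rot13Closure> is hypothetical. Instead of a blessed
object with a C<filter()> method, C<filter_add()> receives an anonymous
subroutine, which reads a line with C<filter_read()> and returns the
status in the same way.

    package Rot13Closure;    # hypothetical name for this sketch

    use strict;
    use warnings;
    use Filter::Util::Call;

    sub import {
        my ($type) = @_;

        # Install an anonymous sub as the filter. Perl calls it each
        # time the parser needs another line from the source stream.
        filter_add(
            sub {
                # filter_read() appends the next line to $_
                my $status = filter_read();

                # decode the line only if one was actually read
                tr/n-za-mN-ZA-M/a-zA-Z/ if $status > 0;
                return $status;
            }
        );
    }

    1;

A closure filter needs no C<filter()> method; any context it needs can
live in lexical variables captured by the anonymous sub.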

In order to make use of the rot13 filter we need some way of encoding
the source file in rot13 format. The script below, C<mkrot13>, does
just that.

    # mkrot13: rot13-encode a Perl source file in place
    die "usage: mkrot13 filename\n" unless @ARGV;
    my $in = $ARGV[0];
    my $out = "$in.tmp";
    open(my $in_fh, '<', $in) or die "Cannot open file $in: $!\n";
    open(my $out_fh, '>', $out) or die "Cannot open file $out: $!\n";

    # the first line of the output installs the decoding filter
    print $out_fh "use Rot13;\n";
    while (<$in_fh>) {
       tr/a-zA-Z/n-za-mN-ZA-M/;
       print $out_fh $_;
    }

    close $in_fh;
    close $out_fh;

    # replace the original file with the encoded copy
    unlink $in;
    rename $out, $in;

If we encrypt this with C<mkrot13>:

    print "hello fred\n";

the result will be this:

    use Rot13;
    cevag "uryyb serq\a";

Running it produces this output:

    hello fred
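
To try the whole round trip without touching real files, the sketch
below writes a condensed copy of the C<Rot13> module from this section
and a rot13-encoded one-line script into a temporary directory, then
runs the script with the current perl. The temporary directory, the
file name I<hello.pl>, and the C<write_file> helper are assumptions
made for this illustration, and the sketch assumes that
C<Filter::Util::Call> is installed.

    use strict;
    use warnings;
    use File::Spec;
    use File::Temp qw(tempdir);

    my $dir = tempdir(CLEANUP => 1);

    # Hypothetical helper: create a file in the temporary directory.
    sub write_file {
        my ($name, $text) = @_;
        my $path = File::Spec->catfile($dir, $name);
        open(my $fh, '>', $path) or die "Cannot open $path: $!\n";
        print $fh $text;
        close($fh) or die "Cannot close $path: $!\n";
        return $path;
    }

    # A condensed copy of the Rot13 filter module.
    write_file('Rot13.pm', q{
        package Rot13;
        use Filter::Util::Call;
        sub import { filter_add(bless []) }
        sub filter {
            my ($self) = @_;
            my $status;
            tr/n-za-mN-ZA-M/a-zA-Z/
                if ($status = filter_read()) > 0;
            return $status;
        }
        1;
    });

    # Encode a one-line program the way mkrot13 does, and put the
    # "use Rot13;" line in front of it.
    my $program = qq{print "hello fred\\n";\n};
    (my $encoded = $program) =~ tr/a-zA-Z/n-za-mN-ZA-M/;
    my $script = write_file('hello.pl', "use Rot13;\n" . $encoded);

    # Run the encoded script with the same perl and capture its output.
    open(my $pipe, '-|', $^X, "-I$dir", $script)
        or die "Cannot run $script: $!\n";
    my $output = do { local $/; <$pipe> };
    close($pipe);

    print $encoded;    # the scrambled source line
    print $output;     # what the filtered script printed

The first line printed is the scrambled source; the second should be
the C<hello fred> greeting produced once the filter has decoded it.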

=head1 USING CONTEXT: THE DEBUG FILTER

The rot13 example was a trivial example. Here's another demonstration
that shows off a few more features.

Say you wanted to include a lot of debugging code in your Perl script
during development, but you didn't want it available in the released
product. Source filters offer a solution. In order to keep the example
simple, let's say you wanted the debugging output to be controlled by
an environment variable, C<DEBUG>. Debugging code is enabled if the
variable exists; otherwise it is disabled.

Two special marker lines will bracket debugging code, like this:

    ## DEBUG_BEGIN
    if ($year > 1999) {
       warn "Debug: millennium bug in year $year\n";
    }
    ## DEBUG_END

The filter ensures that Perl parses the code between the C<DEBUG_BEGIN>
and C<DEBUG_END> markers only when the C<DEBUG> environment variable
exists. That means that when C<DEBUG> does exist, the code above
should be passed through the filter unchanged. The marker lines can
also be passed through as-is, because the Perl parser will see them as
comment lines. When C<DEBUG> isn't set, we need a way to disable the
debug code. A simple way to achieve that is to convert the lines
between the two markers into comments:

    ## DEBUG_BEGIN
    #if ($year > 1999) {
    #     warn "Debug: millennium bug in year $year\n";
    #}
    ## DEBUG_END

Here is the complete Debug filter:

    package Debug;

    use strict;
    use warnings;
    use Filter::Util::Call;

    use constant TRUE => 1;
    use constant FALSE => 0;

    sub import {
       my ($type) = @_;
       my (%context) = (
         Enabled => defined $ENV{DEBUG},
         InTraceBlock => FALSE,
         Filename => (caller)[1],
         LineNo => 0,
         LastBegin => 0,
       );
       filter_add(bless \%context);
    }

    sub Die {
       my ($self) = shift;
       my ($message) = shift;
       my ($line_no) = shift || $self->{LastBegin};
       die "$message at $self->{Filename} line $line_no.\n"
    }

    sub filter {
       my ($self) = @_;
       my ($status);
       $status = filter_read();
       ++ $self->{LineNo};

       # deal with EOF/error first
       if ($status <= 0) {
           $self->Die("DEBUG_BEGIN has no DEBUG_END")
               if $self->{InTraceBlock};
           return $status;
       }

       if ($self->{InTraceBlock}) {
          if (/^\s*##\s*DEBUG_BEGIN/ ) {
              $self->Die("Nested DEBUG_BEGIN", $self->{LineNo})
          } elsif (/^\s*##\s*DEBUG_END/) {
              $self->{InTraceBlock} = FALSE;
          }

          # comment out the debug lines when the filter is disabled
          s/^/#/ if ! $self->{Enabled};
       } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) {
          $self->{InTraceBlock} = TRUE;
          $self->{LastBegin} = $self->{LineNo};
       } elsif ( /^\s*##\s*DEBUG_END/ ) {
          $self->Die("DEBUG_END has no DEBUG_BEGIN", $self->{LineNo});
       }
       return $status;
    }

    1;

The big difference between this filter and the previous example is the
use of context data in the filter object. The filter object is based on
a hash reference, and is used to keep various pieces of context
information between calls to the filter function. All but two of the
hash fields are used for error reporting. The first of those two,
Enabled, is used by the filter to determine whether the debugging code
should be given to the Perl parser. The second, InTraceBlock, is true
when the filter has encountered a C<DEBUG_BEGIN> line, but has not yet
encountered the following C<DEBUG_END> line.

If you ignore all the error checking that most of the code does, the
essence of the filter is as follows:

    sub filter {
       my ($self) = @_;
       my ($status);
       $status = filter_read();

       # deal with EOF/error first
       return $status if $status <= 0;
       if ($self->{InTraceBlock}) {
          if (/^\s*##\s*DEBUG_END/) {
             $self->{InTraceBlock} = FALSE
          }

          # comment out debug lines when the filter is disabled
          s/^/#/ if ! $self->{Enabled};
       } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) {
          $self->{InTraceBlock} = TRUE;
       }
       return $status;
    }

Be warned: just as the C-preprocessor doesn't know C, the Debug filter
doesn't know Perl. It can be fooled quite easily:

    print <<EOM;
    ##DEBUG_BEGIN
    EOM

Such things aside, you can see that a lot can be achieved with a modest
amount of code.
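
Because the heart of the filter is a small line-by-line state machine,
it can be tried out on a list of lines without installing a filter at
all. The sketch below restates the simplified logic as an ordinary
function. C<comment_out_debug> and its arguments are hypothetical names
introduced here, and the error checking of the full filter is omitted.

    use strict;
    use warnings;

    # Returns a copy of @lines in which the lines of every debug block
    # are commented out unless $enabled is true. The DEBUG_BEGIN marker
    # is passed through unchanged; as in the filter code above, the
    # DEBUG_END marker is still inside the block when it is seen, so
    # it gets the extra "#" as well.
    sub comment_out_debug {
        my ($enabled, @lines) = @_;
        my $in_block = 0;
        for my $line (@lines) {
            if ($in_block) {
                $in_block = 0 if $line =~ /^\s*##\s*DEBUG_END/;
                $line =~ s/^/#/ unless $enabled;
            } elsif ($line =~ /^\s*##\s*DEBUG_BEGIN/) {
                $in_block = 1;
            }
        }
        return @lines;
    }

    my @source = map { "$_\n" } (
        '## DEBUG_BEGIN',
        'warn "Debug: checking\n";',
        '## DEBUG_END',
        'print "done\n";',
    );

    # Show the text the parser would see with debugging disabled.
    print comment_out_debug(0, @source);

Feeding the function the heredoc from the warning above shows the same
weakness: the marker inside the heredoc is treated as the start of a
debug block.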

=head1 CONCLUSION

You now have a better understanding of what a source filter is, and you
might even have a possible use for them. If you feel like playing with
source filters but need a bit of inspiration, here are some extra
features you could add to the Debug filter.

First, an easy one. Rather than having debugging code that is
all-or-nothing, it would be much more useful to be able to control
which specific blocks of debugging code get included. Try extending the
syntax for debug blocks to allow each to be identified. The contents of
the C<DEBUG> environment variable can then be used to control which
blocks get included.

Once you can identify individual blocks, try allowing them to be
nested. That isn't difficult either.

Here is an interesting idea that doesn't involve the Debug filter.
Currently Perl subroutines have fairly limited support for formal
parameter lists. You can specify the number of parameters and their
type, but you still have to manually take them out of the C<@_> array
yourself. Write a source filter that allows you to have a named
parameter list. Such a filter would turn this:

    sub MySub ($first, $second, @rest) { ... }

into this:

    sub MySub($$@) {
       my ($first) = shift;
       my ($second) = shift;
       my (@rest) = @_;
       ...
    }

Finally, if you feel like a real challenge, have a go at writing a
full-blown Perl macro preprocessor as a source filter. Borrow the
useful features from the C preprocessor and any other macro processors
you know. The tricky bit will be choosing how much knowledge of Perl's
syntax you want your filter to have.

=head1 LIMITATIONS

Source filters work only at the string level, so they are highly
limited in their ability to change source code on the fly. A filter
cannot detect comments, quoted strings, or heredocs; it is no
replacement for a real parser.
The only stable uses for source filters are encryption, compression,
and the byteloader, which translates binary code back to source code.

See, for example, the limitations of L<Switch>, which uses source
filters and thus does not work inside a string eval. In addition,
regexes with embedded newlines that are specified with raw C</.../>
delimiters and lack the C<//x> modifier are indistinguishable from
code chunks beginning with the division operator C</>. As a workaround
you must use C<m/.../> or C<m?...?> for such patterns. Also, the
presence of regexes specified with raw C<?...?> delimiters may cause
mysterious errors. The workaround is to use C<m?...?> instead.  See
L<http://search.cpan.org/perldoc?Switch#LIMITATIONS>

Currently the content of the C<__DATA__> block is not filtered.

Currently internal buffer lengths are limited to 32-bit only.

=head1 THINGS TO LOOK OUT FOR

=over 5

=item Some Filters Clobber the C<DATA> Handle

Some source filters use the C<DATA> handle to read the calling program.
When using these source filters you cannot rely on this handle, nor expect
any particular kind of behavior when operating on it.  Filters based on
Filter::Util::Call (and therefore Filter::Simple) do not alter the C<DATA>
filehandle, but on the other hand totally ignore the text after C<__DATA__>.

=back

=head1 REQUIREMENTS

The Source Filters distribution is available on CPAN, in

    CPAN/modules/by-module/Filter

Starting from Perl 5.8 Filter::Util::Call (the core part of the
Source Filters distribution) is part of the standard Perl distribution.
Also included is a friendlier interface called Filter::Simple, by
Damian Conway.

=head1 AUTHOR

Paul Marquess E<lt>Paul.Marquess@btinternet.comE<gt>

Reini Urban E<lt>rurban@cpan.orgE<gt>

=head1 COPYRIGHTS

The first version of this article originally appeared in The Perl
Journal #11, and is copyright 1998 The Perl Journal. It appears
courtesy of Jon Orwant and The Perl Journal.  This document may be
distributed under the same terms as Perl itself.