diff --git a/CHANGES b/CHANGES new file mode 100644 index 0000000..611edd0 --- /dev/null +++ b/CHANGES @@ -0,0 +1,306 @@ +0.80 Wed Jan 18 23:14:32 GMT 2017 +* Change Linux unzip heuristic to match FreeBSD's [rt#119905] + +0.78 Wed Jul 27 20:40:15 2016 +* CVE-2016-1238: avoid loading optional modules from default . +* Add additional heuristics for unzip on FreeBSD + +0.76 Sat Jul 4 18:44:42 2015 +* Resolve RT#105425, putting refs in $/ has been + fatal since v5.20.0 + +0.74 Thu Nov 20 10:03:50 2014 +* NetBSD's tar does not like some archives + on CPAN, prefer gtar if it is available. + +0.72 Sun Jan 26 16:21:07 2014 + +* On FreeBSD favour info-unzip if it is found +* Treat DragonflyBSD as a FreeBSD + +0.68 Fri Feb 15 22:57:09 2013 + +* Solaris' tar doesn't like some archives on + CPAN, prefer gtar if it is available + +0.66 Sat Feb 9 18:21:52 2013 + +* Treat midnightbsd the same as freebsd for + the purposes of finding Info-ZIP unzip + +Chnages for 0.64 Sat Jan 26 21:07:03 2013 + +* Add deprecation usage to warn if the + module is loaded from corelib. + Archive::Extract is leaving core with + v5.20.0, but will remain available from + CPAN. + +0.62 Wed Jan 9 21:38:10 2013 + +* OpenBSD's tar does not like some archives + on CPAN, prefer gtar if it is available. +* Correctly detect lack of /bin/tar (Smylers) + +0.60 Mon Feb 20 22:28:10 2012 + +* Work around an edge-case on Linux with + Busybox's unzip + +0.58 Sat Oct 22 20:25:00 2011 + +* Apply patch from Craig A. Berry [rt#71846] + make _untar_bin use Unix-syntax archive names + on VMS + +0.56 Tue Aug 23 15:55:52 2011 + +* Amend the MSWin32 fixes for 'unzip' to + work with Cygwin-based tools too. + +0.54 Mon Aug 22 11:52:18 2011 + +* Resolve issues on MSWin32 when 'unzip' is + found in PATH + +0.52 Thu Apr 28 20:56:27 2011 + +* Only use unzip from /usr/local/bin when on + FreeBSD which will be the Info-ZIP version + +0.50 Tue Apr 12 19:17:23 2011 + +* Resolve RT #56208 reported by Apocalypse + +0.48 Fri Jan 7 20:45:37 2011 + +* upstream blead patches from Peter Acklam + +0.46 Tue Oct 26 23:45:52 2010 + +* Correct speeling errors in docs, spotted by + Avar +* Only use unzip from /usr/pkg/bin when on + NetBSD which will be the Info-ZIP version + +0.44 Wed Sep 29 15:51:26 2010 + +* Apply a patch from brian d foy that adds a + debug() method for $DEBUG output. + +0.42 Mon Jun 28 19:35:17 2010 + +* Apply a patch from Robin Barker RT #56927 + "Unnecessary chdir/INC manipulation in test" + +0.40 Fri May 14 13:31:32 2010 + +* Add support for TZ files, Paul Marquess provided + the patches in RT #57387 +* modified the lzma logic to favour IO::Uncompress::Unlzma + Paul Marquess via RT #57387 + +0.38 Wed Jan 6 23:48:52 2010 + +* Apply a patch from Michael G Schwern RT #53246 + extract() is vulnerable to print globals. + +0.36 Tue Nov 24 10:27:16 2009 + +* Explicitly use Cwd's chdir + +0.34 Sat Jun 27 14:23:54 2009 + +* Attempt to address #46948: unable to install install on win32. + Binaries on Win32 are notoriously unreliable and Strawberry perl + ships with all the needed perl modules, so skip failed binary + tests on Win32. +* Address #47053: Use Archive::Tar iter to keep from read the + tar into memory. Thanks to Robert Krimen and Doug Wilson for + their patches. + +0.32 Fri Jun 12 13:23:11 2009 + +* Some tars have extra newlines in their '-x' output, + so grep for 'length' when parsing the file list, or + errors like this one can crop up: + http://www.nntp.perl.org/group/perl.cpan.testers/2009/03/msg3476639.html +* Add docs that A::E is not always thread safe. See RT #45671 + +0.31_03 Wed Mar 11 12:38:43 2009 + +* Newer versions of 'tar' (1.21 and up) now print record size + to STDERR as well if v OR t is given (used to be both). This + is a 'feature' according to the changelog, so we must now only + inspect STDOUT, otherwise, failures like these occur: + nntp.perl.org/group/perl.cpan.testers/2009/02/msg3230366.html + Thanks to tcallawa from RedHat for tracking this down. + +0.31_02 Tue Mar 3 17:01:26 2009 + +* Address: #43278: Explicitly tell Archive::Zip where to put the files + In certain edge cases, archive::zip would extract to dirname($0) instead + of cwd(); Now we tell Archive::Zip explicitly where to extract to. + +0.31_01 Fri Feb 6 15:39:13 2009 + +* Address #40138; .tar files on Win32 would not extract properly with + gnu tar (default on cygwin) under Cmd.exe/Native Win32 system calls. + Gnu tar interprets file names with a : in them, like C:\tmp\foo.tar + as a remote path, and attempts to execute rsh instead. + Archive::Extract now checks for gnu tar on Win32 and adds the + '--force-local' flag to disable this behaviour. +* Improved diagnostics from the test suite +* Enabled debugging in the test suite for development releases. + +0.30 Tue Dec 16 17:32:13 2008 + +* This is a test suite fix; users of 0.28 need not upgrade. +* Archive::Extract would not reliably skip binary extraction + tests when no binaries were found in $PATH, which causes + test suite failures. + +0.28 Fri Oct 10 15:22:51 2008 + +* Address: #39554: Must set $Archive::Tar::CHOWN, not + $Archive::Tar::Constant::CHOWN to control the chown() + functionality when using Archive::Tar. +* Various test suite improvements to test warnings + and errors emitted by Archive::Extract. + +0.27_02 Wed Sep 24 17:37:52 2008 + +* Test & build improvements: users of 0.27_01 need + not upgrade. +* Test suite would die if no files were reported + to be extracted, because Test::More::is() would + choke on a negative array index on an empty array + ref. +* Up dependency of IPC::Cmd to 0.41_05 to handle + pipes and whitespace better on shell out commands. + +0.27_01 Mon Sep 22 14:20:20 2008 + +* Address: #32751: incorrect error message + Multiple error messages are now joined and will only be + reset upon next ->extract() +* Test suite improvement to split binary from pure perl + solutions to test implementations without fallback. + +0.26 Sun Jan 27 16:31:20 2008 + +* Address #32370: Archive::Extract will hang if stdin + is a pipe. /bin/bunzip2 tried to read STDIN when printing + out it's '--version'. +* Address rt #32336 by a.r.ferreira@gmail.com to add .lzma + support to archive::extract +* Older versions of bunzip2 insist on a .bz2 suffix to extract + things. Guard against this, adapt the test suite and mention + it in the caveats. + +0.24 Thu Sep 20 15:05:55 2007 + +* Older versions of /bin/bunzip2 only support archives + with a .bz2 suffix. Attempt to detect this with a new + method ->have_old_bunzip2, and skip tests accordingly. + Also document this in the CAVEATS section of the docs. +* Apply core change 31904: Clean up test files left + behind (on Win32, at least). + +0.23_01 Sun Sep 9 12:39:08 2007 + +* Address Update ticket #27991 (Archive::Extract and + very large tar.gz files). Add a CAVEATS entry explianing + how to deal with large files & memory consumption +* VMS shell out & test fixes by John Malmberg + +0.22 Tue May 29 21:18:46 2007 + +* Manifest didn't include one of the source files + for tests. If you have 0.20 installed there is no + need to upgrade. This is just a test suite fix. + +0.20 Mon May 28 12:06:33 2007 + +* Address #27195: Implement support for .Z files. +* Small POD fixes +* Disable some test diagnostics under perl core tests + +0.18 Wed Apr 11 21:20:53 2007 + +* Starting 5.9.5, this module installs itself in + the core-perl dir, rather than site-perl, as it's + now part of core. +* Reverse '$PREFER_BIN' conditional for tar/tar.gz + files. The current logic was actually exactly the + wrong way around. +* Small regex change to tar/tar.gz detection. + +0.16 Fri Jan 26 11:07:24 2007 + +* address #24578: Wrong check for `type' argument +* add new method ->types that returns a list of + all supported archive types +* promote 0.15_03 to stable + +0.15_03 Thu Jan 4 14:46:58 2007 + +* address #23999: Attempt to generate Makefile.PL + gone awry where paths starting with a '.' + are sometimes reported to be extracted in '.' + as well rather than their actual extraction dir. + +0.15_02 Sun Nov 26 18:06:02 2006 + +* fix testing logic error, that assumed extracted + files were always logged when we did not prefer + binaries to extract with. However, that does not + mean the perl module is available, so we might + not have buffers after all. +* make sure we dont have test count mismatches + between buffers/non-buffers + +0.15_01 Fri Nov 24 15:07:44 2006 + +* Apply patch from Paul Marquess to add pure + perl support for bzip2 encoded files. This + support requires you to have the module + IO::Uncompress::Bunzip2 installed. +* Error storage is now per object like it should + be rather than class wide. +* Update tests to not skip unnecessarily +* Up required version of IPC::Cmd to 0.36 for + 5.6.x compatibility +* Improve diagnostics in the test suite when + run verbosely + +0.14 Fri Oct 20 14:37:36 2006 + +* Tweak tests to play nicer on win32 +* Up required version of IPC::Cmd + +0.12 Sun Aug 13 14:55:54 2006 + +* Add diagnostics during the test suite informing + the user that File::Spec warnings may be + generated by Archive::Zip (see rt #19713). + These are safe to ignore. + +0.11_02 Thu Aug 3 14:18:16 2006 + +* attempt to address #19815: make test fail under Cygwin + again. Unzip giving trouble this time. Adding -o + option to force overwriting of old files +* address #20829: test failure on Win32: no bunzip2? + bunzip2 detection wasn't working properly in the + test suite so tests weren't skipped on missing + bunzip2 + +0.11_01 Wed Aug 2 13:04:37 2006 + +* attempt to address #19815: make test fail under Cygwin + by adding more sanity checks about buffers + +0.11 Wed Aug 2 11:15:48 2006 + +* add tentative bzip2 support diff --git a/MANIFEST b/MANIFEST new file mode 100644 index 0000000..cc3b41d --- /dev/null +++ b/MANIFEST @@ -0,0 +1,37 @@ +CHANGES +lib/Archive/Extract.pm +Makefile.PL +MANIFEST This list of files +README +t/01_Archive-Extract.t +t/src/double_dir.zip +t/src/x.bz2 +t/src/x.gz +t/src/x.ear +t/src/x.jar +t/src/x.par +t/src/x.tar +t/src/x.tar.gz +t/src/x.tar.xz +t/src/x.tgz +t/src/x.txz +t/src/x.war +t/src/x.xz +t/src/x.Z +t/src/x.Z +t/src/x.zip +t/src/x.lzma +t/src/y.ear +t/src/y.jar +t/src/y.par +t/src/y.tar +t/src/y.tar.bz2 +t/src/y.tar.gz +t/src/y.tar.xz +t/src/y.tbz +t/src/y.tgz +t/src/y.txz +t/src/y.war +t/src/y.zip +META.yml Module YAML meta-data (added by MakeMaker) +META.json Module JSON meta-data (added by MakeMaker) diff --git a/META.json b/META.json new file mode 100644 index 0000000..4264d23 --- /dev/null +++ b/META.json @@ -0,0 +1,55 @@ +{ + "abstract" : "Generic archive extracting mechanism", + "author" : [ + "Jos Boumans " + ], + "dynamic_config" : 1, + "generated_by" : "ExtUtils::MakeMaker version 7.24, CPAN::Meta::Converter version 2.150010", + "license" : [ + "perl_5" + ], + "meta-spec" : { + "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec", + "version" : "2" + }, + "name" : "Archive-Extract", + "no_index" : { + "directory" : [ + "t", + "inc" + ] + }, + "prereqs" : { + "build" : { + "requires" : { + "ExtUtils::MakeMaker" : "0" + } + }, + "configure" : { + "requires" : { + "ExtUtils::MakeMaker" : "0" + } + }, + "runtime" : { + "requires" : { + "File::Basename" : "0", + "File::Path" : "0", + "File::Spec" : "0.82", + "IPC::Cmd" : "0.64", + "Locale::Maketext::Simple" : "0", + "Module::Load::Conditional" : "0.66", + "Params::Check" : "0.07", + "Test::More" : "0", + "if" : "0" + } + } + }, + "release_status" : "stable", + "resources" : { + "repository" : { + "url" : "https://github.com/jib/archive-extract" + } + }, + "version" : "0.80", + "x_serialization_backend" : "JSON::PP version 2.27400" +} diff --git a/META.yml b/META.yml new file mode 100644 index 0000000..03b5252 --- /dev/null +++ b/META.yml @@ -0,0 +1,33 @@ +--- +abstract: 'Generic archive extracting mechanism' +author: + - 'Jos Boumans ' +build_requires: + ExtUtils::MakeMaker: '0' +configure_requires: + ExtUtils::MakeMaker: '0' +dynamic_config: 1 +generated_by: 'ExtUtils::MakeMaker version 7.24, CPAN::Meta::Converter version 2.150010' +license: perl +meta-spec: + url: http://module-build.sourceforge.net/META-spec-v1.4.html + version: '1.4' +name: Archive-Extract +no_index: + directory: + - t + - inc +requires: + File::Basename: '0' + File::Path: '0' + File::Spec: '0.82' + IPC::Cmd: '0.64' + Locale::Maketext::Simple: '0' + Module::Load::Conditional: '0.66' + Params::Check: '0.07' + Test::More: '0' + if: '0' +resources: + repository: https://github.com/jib/archive-extract +version: '0.80' +x_serialization_backend: 'CPAN::Meta::YAML version 0.018' diff --git a/Makefile.PL b/Makefile.PL new file mode 100644 index 0000000..509e74a --- /dev/null +++ b/Makefile.PL @@ -0,0 +1,51 @@ +use ExtUtils::MakeMaker; +use strict; + +WriteMakefile1( + LICENSE => 'perl', + META_MERGE => { + resources => { + repository => 'https://github.com/jib/archive-extract', + }, + }, + NAME => 'Archive::Extract', + VERSION_FROM => 'lib/Archive/Extract.pm', # finds $VERSION + dist => { COMPRESS => 'gzip -9f', SUFFIX => 'gz' }, + PREREQ_PM => { + 'if' => 0, + 'IPC::Cmd' => 0.64, # pipe fix + 'Test::More' => 0, + 'File::Spec' => 0.82, + 'File::Path' => 0, + 'File::Basename' => 0, + 'Params::Check' => 0.07, + 'Module::Load::Conditional' => 0.66, + 'Locale::Maketext::Simple' => 0, + }, + INSTALLDIRS => ( $] >= 5.009005 && $] < 5.012 ? 'perl' : 'site' ), + AUTHOR => 'Jos Boumans ', + ABSTRACT => 'Generic archive extracting mechanism' +); + +sub WriteMakefile1 { #Written by Alexandr Ciornii, version 0.21. Added by eumm-upgrade. + my %params=@_; + my $eumm_version=$ExtUtils::MakeMaker::VERSION; + $eumm_version=eval $eumm_version; + die "EXTRA_META is deprecated" if exists $params{EXTRA_META}; + die "License not specified" if not exists $params{LICENSE}; + if ($params{BUILD_REQUIRES} and $eumm_version < 6.5503) { + #EUMM 6.5502 has problems with BUILD_REQUIRES + $params{PREREQ_PM}={ %{$params{PREREQ_PM} || {}} , %{$params{BUILD_REQUIRES}} }; + delete $params{BUILD_REQUIRES}; + } + delete $params{CONFIGURE_REQUIRES} if $eumm_version < 6.52; + delete $params{MIN_PERL_VERSION} if $eumm_version < 6.48; + delete $params{META_MERGE} if $eumm_version < 6.46; + delete $params{META_ADD} if $eumm_version < 6.46; + delete $params{LICENSE} if $eumm_version < 6.31; + delete $params{AUTHOR} if $] < 5.005; + delete $params{ABSTRACT_FROM} if $] < 5.005; + delete $params{BINARY_LOCATION} if $] < 5.005; + + WriteMakefile(%params); +} diff --git a/README b/README new file mode 100644 index 0000000..267bbe1 --- /dev/null +++ b/README @@ -0,0 +1,40 @@ +This is the README file for Archive::Extract, a perl module for +generic archive extraction + +Please refer to 'perldoc Archive::Extract' after installation for details. + +##################################################################### + +* Description + +Archive::Extract + + Archive::Extract is a generic archive extraction mechanism. + + It allows you to extract .tgz, .tar, .gz and .zip files, using + either perl modules or commandline tools + +##################################################################### + +* Installation + +Archive::Extract follows the standard perl module install process + +perl Makefile.PL +make +make test +make install + +The module uses no C or XS parts, so no c-compiler is required. + +###################################################################### + +AUTHOR + This module by Jos Boumans . + +COPYRIGHT + This module is copyright (c) 2002 Jos Boumans . All + rights reserved. + + This library is free software; you may redistribute and/or modify it + under the same terms as Perl itself. diff --git a/lib/Archive/Extract.pm b/lib/Archive/Extract.pm new file mode 100644 index 0000000..4348dee --- /dev/null +++ b/lib/Archive/Extract.pm @@ -0,0 +1,1731 @@ +package Archive::Extract; +use if $] > 5.017, 'deprecate'; + +use strict; + +use Cwd qw[cwd chdir]; +use Carp qw[carp]; +use IPC::Cmd qw[run can_run]; +use FileHandle; +use File::Path qw[mkpath]; +use File::Spec; +use File::Basename qw[dirname basename]; +use Params::Check qw[check]; +use Module::Load::Conditional qw[can_load check_install]; +use Locale::Maketext::Simple Style => 'gettext'; + +### solaris has silly /bin/tar output ### +use constant ON_SOLARIS => $^O eq 'solaris' ? 1 : 0; +use constant ON_NETBSD => $^O eq 'netbsd' ? 1 : 0; +use constant ON_OPENBSD => $^O =~ m!^(openbsd|bitrig)$! ? 1 : 0; +use constant ON_FREEBSD => $^O =~ m!^(free|midnight|dragonfly)(bsd)?$! ? 1 : 0; +use constant ON_LINUX => $^O eq 'linux' ? 1 : 0; +use constant FILE_EXISTS => sub { -e $_[0] ? 1 : 0 }; + +### VMS may require quoting upper case command options +use constant ON_VMS => $^O eq 'VMS' ? 1 : 0; + +### Windows needs special treatment of Tar options +use constant ON_WIN32 => $^O eq 'MSWin32' ? 1 : 0; + +### we can't use this extraction method, because of missing +### modules/binaries: +use constant METHOD_NA => []; + +### If these are changed, update @TYPES and the new() POD +use constant TGZ => 'tgz'; +use constant TAR => 'tar'; +use constant GZ => 'gz'; +use constant ZIP => 'zip'; +use constant BZ2 => 'bz2'; +use constant TBZ => 'tbz'; +use constant Z => 'Z'; +use constant LZMA => 'lzma'; +use constant XZ => 'xz'; +use constant TXZ => 'txz'; + +use vars qw[$VERSION $PREFER_BIN $PROGRAMS $WARN $DEBUG + $_ALLOW_BIN $_ALLOW_PURE_PERL $_ALLOW_TAR_ITER + ]; + +$VERSION = '0.80'; +$PREFER_BIN = 0; +$WARN = 1; +$DEBUG = 0; +$_ALLOW_PURE_PERL = 1; # allow pure perl extractors +$_ALLOW_BIN = 1; # allow binary extractors +$_ALLOW_TAR_ITER = 1; # try to use Archive::Tar->iter if available + +# same as all constants +my @Types = ( TGZ, TAR, GZ, ZIP, BZ2, TBZ, Z, LZMA, XZ, TXZ ); + +local $Params::Check::VERBOSE = $Params::Check::VERBOSE = 1; +local $Module::Load::Conditional::FORCE_SAFE_INC = 1; + +=pod + +=head1 NAME + +Archive::Extract - A generic archive extracting mechanism + +=head1 SYNOPSIS + + use Archive::Extract; + + ### build an Archive::Extract object ### + my $ae = Archive::Extract->new( archive => 'foo.tgz' ); + + ### extract to cwd() ### + my $ok = $ae->extract; + + ### extract to /tmp ### + my $ok = $ae->extract( to => '/tmp' ); + + ### what if something went wrong? + my $ok = $ae->extract or die $ae->error; + + ### files from the archive ### + my $files = $ae->files; + + ### dir that was extracted to ### + my $outdir = $ae->extract_path; + + + ### quick check methods ### + $ae->is_tar # is it a .tar file? + $ae->is_tgz # is it a .tar.gz or .tgz file? + $ae->is_gz; # is it a .gz file? + $ae->is_zip; # is it a .zip file? + $ae->is_bz2; # is it a .bz2 file? + $ae->is_tbz; # is it a .tar.bz2 or .tbz file? + $ae->is_lzma; # is it a .lzma file? + $ae->is_xz; # is it a .xz file? + $ae->is_txz; # is it a .tar.xz or .txz file? + + ### absolute path to the archive you provided ### + $ae->archive; + + ### commandline tools, if found ### + $ae->bin_tar # path to /bin/tar, if found + $ae->bin_gzip # path to /bin/gzip, if found + $ae->bin_unzip # path to /bin/unzip, if found + $ae->bin_bunzip2 # path to /bin/bunzip2 if found + $ae->bin_unlzma # path to /bin/unlzma if found + $ae->bin_unxz # path to /bin/unxz if found + +=head1 DESCRIPTION + +Archive::Extract is a generic archive extraction mechanism. + +It allows you to extract any archive file of the type .tar, .tar.gz, +.gz, .Z, tar.bz2, .tbz, .bz2, .zip, .xz,, .txz, .tar.xz or .lzma +without having to worry how it +does so, or use different interfaces for each type by using either +perl modules, or commandline tools on your system. + +See the C section further down for details. + +=cut + + +### see what /bin/programs are available ### +$PROGRAMS = {}; +CMD: for my $pgm (qw[tar unzip gzip bunzip2 uncompress unlzma unxz]) { + if ( $pgm eq 'unzip' and ON_FREEBSD and my $unzip = can_run('info-unzip') ) { + $PROGRAMS->{$pgm} = $unzip; + next CMD; + } + if ( $pgm eq 'unzip' and ( ON_FREEBSD || ON_LINUX ) ) { + local $IPC::Cmd::INSTANCES = 1; + ($PROGRAMS->{$pgm}) = grep { _is_infozip_esque($_) } can_run($pgm); + next CMD; + } + if ( $pgm eq 'unzip' and ON_NETBSD ) { + local $IPC::Cmd::INSTANCES = 1; + ($PROGRAMS->{$pgm}) = grep { m!/usr/pkg/! } can_run($pgm); + next CMD; + } + if ( $pgm eq 'tar' and ( ON_OPENBSD || ON_SOLARIS || ON_NETBSD ) ) { + # try gtar first + next CMD if $PROGRAMS->{$pgm} = can_run('gtar'); + } + $PROGRAMS->{$pgm} = can_run($pgm); +} + +### mapping from types to extractor methods ### +my $Mapping = { # binary program # pure perl module + is_tgz => { bin => '_untar_bin', pp => '_untar_at' }, + is_tar => { bin => '_untar_bin', pp => '_untar_at' }, + is_gz => { bin => '_gunzip_bin', pp => '_gunzip_cz' }, + is_zip => { bin => '_unzip_bin', pp => '_unzip_az' }, + is_tbz => { bin => '_untar_bin', pp => '_untar_at' }, + is_bz2 => { bin => '_bunzip2_bin', pp => '_bunzip2_bz2'}, + is_Z => { bin => '_uncompress_bin', pp => '_gunzip_cz' }, + is_lzma => { bin => '_unlzma_bin', pp => '_unlzma_cz' }, + is_xz => { bin => '_unxz_bin', pp => '_unxz_cz' }, + is_txz => { bin => '_untar_bin', pp => '_untar_at' }, +}; + +{ ### use subs so we re-generate array refs etc for the no-override flags + ### if we don't, then we reuse the same arrayref, meaning objects store + ### previous errors + my $tmpl = { + archive => sub { { required => 1, allow => FILE_EXISTS } }, + type => sub { { default => '', allow => [ @Types ] } }, + _error_msg => sub { { no_override => 1, default => [] } }, + _error_msg_long => sub { { no_override => 1, default => [] } }, + }; + + ### build accessors ### + for my $method( keys %$tmpl, + qw[_extractor _gunzip_to files extract_path], + ) { + no strict 'refs'; + *$method = sub { + my $self = shift; + $self->{$method} = $_[0] if @_; + return $self->{$method}; + } + } + +=head1 METHODS + +=head2 $ae = Archive::Extract->new(archive => '/path/to/archive',[type => TYPE]) + +Creates a new C object based on the archive file you +passed it. Automatically determines the type of archive based on the +extension, but you can override that by explicitly providing the +C argument. + +Valid values for C are: + +=over 4 + +=item tar + +Standard tar files, as produced by, for example, C. +Corresponds to a C<.tar> suffix. + +=item tgz + +Gzip compressed tar files, as produced by, for example C. +Corresponds to a C<.tgz> or C<.tar.gz> suffix. + +=item gz + +Gzip compressed file, as produced by, for example C. +Corresponds to a C<.gz> suffix. + +=item Z + +Lempel-Ziv compressed file, as produced by, for example C. +Corresponds to a C<.Z> suffix. + +=item zip + +Zip compressed file, as produced by, for example C. +Corresponds to a C<.zip>, C<.jar> or C<.par> suffix. + +=item bz2 + +Bzip2 compressed file, as produced by, for example, C. +Corresponds to a C<.bz2> suffix. + +=item tbz + +Bzip2 compressed tar file, as produced by, for example C. +Corresponds to a C<.tbz> or C<.tar.bz2> suffix. + +=item lzma + +Lzma compressed file, as produced by C. +Corresponds to a C<.lzma> suffix. + +=item xz + +Xz compressed file, as produced by C. +Corresponds to a C<.xz> suffix. + +=item txz + +Xz compressed tar file, as produced by, for example C. +Corresponds to a C<.txz> or C<.tar.xz> suffix. + +=back + +Returns a C object on success, or false on failure. + +=cut + + ### constructor ### + sub new { + my $class = shift; + my %hash = @_; + + ### see above why we use subs here and generate the template; + ### it's basically to not re-use arrayrefs + my %utmpl = map { $_ => $tmpl->{$_}->() } keys %$tmpl; + + my $parsed = check( \%utmpl, \%hash ) or return; + + ### make sure we have an absolute path ### + my $ar = $parsed->{archive} = File::Spec->rel2abs( $parsed->{archive} ); + + ### figure out the type, if it wasn't already specified ### + unless ( $parsed->{type} ) { + $parsed->{type} = + $ar =~ /.+?\.(?:tar\.gz|tgz)$/i ? TGZ : + $ar =~ /.+?\.gz$/i ? GZ : + $ar =~ /.+?\.tar$/i ? TAR : + $ar =~ /.+?\.(zip|jar|ear|war|par)$/i ? ZIP : + $ar =~ /.+?\.(?:tbz2?|tar\.bz2?)$/i ? TBZ : + $ar =~ /.+?\.bz2$/i ? BZ2 : + $ar =~ /.+?\.Z$/ ? Z : + $ar =~ /.+?\.lzma$/ ? LZMA : + $ar =~ /.+?\.(?:txz|tar\.xz)$/i ? TXZ : + $ar =~ /.+?\.xz$/ ? XZ : + ''; + + } + + bless $parsed, $class; + + ### don't know what type of file it is + ### XXX this *has* to be an object call, not a package call + return $parsed->_error(loc("Cannot determine file type for '%1'", + $parsed->{archive} )) unless $parsed->{type}; + return $parsed; + } +} + +=head2 $ae->extract( [to => '/output/path'] ) + +Extracts the archive represented by the C object to +the path of your choice as specified by the C argument. Defaults to +C. + +Since C<.gz> files never hold a directory, but only a single file; if +the C argument is an existing directory, the file is extracted +there, with its C<.gz> suffix stripped. +If the C argument is not an existing directory, the C argument +is understood to be a filename, if the archive type is C. +In the case that you did not specify a C argument, the output +file will be the name of the archive file, stripped from its C<.gz> +suffix, in the current working directory. + +C will try a pure perl solution first, and then fall back to +commandline tools if they are available. See the C +section below on how to alter this behaviour. + +It will return true on success, and false on failure. + +On success, it will also set the follow attributes in the object: + +=over 4 + +=item $ae->extract_path + +This is the directory that the files where extracted to. + +=item $ae->files + +This is an array ref with the paths of all the files in the archive, +relative to the C argument you specified. +To get the full path to an extracted file, you would use: + + File::Spec->catfile( $to, $ae->files->[0] ); + +Note that all files from a tar archive will be in unix format, as per +the tar specification. + +=back + +=cut + +sub extract { + my $self = shift; + my %hash = @_; + + ### reset error messages + $self->_error_msg( [] ); + $self->_error_msg_long( [] ); + + my $to; + my $tmpl = { + to => { default => '.', store => \$to } + }; + + check( $tmpl, \%hash ) or return; + + ### so 'to' could be a file or a dir, depending on whether it's a .gz + ### file, or basically anything else. + ### so, check that, then act accordingly. + ### set an accessor specifically so _gunzip can know what file to extract + ### to. + my $dir; + { ### a foo.gz file + if( $self->is_gz or $self->is_bz2 or $self->is_Z or $self->is_lzma or $self->is_xz ) { + + my $cp = $self->archive; $cp =~ s/\.(?:gz|bz2?|Z|lzma|xz)$//i; + + ### to is a dir? + if ( -d $to ) { + $dir = $to; + $self->_gunzip_to( basename($cp) ); + + ### then it's a filename + } else { + $dir = dirname($to); + $self->_gunzip_to( basename($to) ); + } + + ### not a foo.gz file + } else { + $dir = $to; + } + } + + ### make the dir if it doesn't exist ### + unless( -d $dir ) { + eval { mkpath( $dir ) }; + + return $self->_error(loc("Could not create path '%1': %2", $dir, $@)) + if $@; + } + + ### get the current dir, to restore later ### + my $cwd = cwd(); + + my $ok = 1; + EXTRACT: { + + ### chdir to the target dir ### + unless( chdir $dir ) { + $self->_error(loc("Could not chdir to '%1': %2", $dir, $!)); + $ok = 0; last EXTRACT; + } + + ### set files to an empty array ref, so there's always an array + ### ref IN the accessor, to avoid errors like: + ### Can't use an undefined value as an ARRAY reference at + ### ../lib/Archive/Extract.pm line 742. (rt #19815) + $self->files( [] ); + + ### find out the dispatch methods needed for this type of + ### archive. Do a $self->is_XXX to figure out the type, then + ### get the hashref with bin + pure perl dispatchers. + my ($map) = map { $Mapping->{$_} } grep { $self->$_ } keys %$Mapping; + + ### add pure perl extractor if allowed & add bin extractor if allowed + my @methods; + push @methods, $map->{'pp'} if $_ALLOW_PURE_PERL; + push @methods, $map->{'bin'} if $_ALLOW_BIN; + + ### reverse it if we prefer bin extractors + @methods = reverse @methods if $PREFER_BIN; + + my($na, $fail); + for my $method (@methods) { + $self->debug( "# Extracting with ->$method\n" ); + + my $rv = $self->$method; + + ### a positive extraction + if( $rv and $rv ne METHOD_NA ) { + $self->debug( "# Extraction succeeded\n" ); + $self->_extractor($method); + last; + + ### method is not available + } elsif ( $rv and $rv eq METHOD_NA ) { + $self->debug( "# Extraction method not available\n" ); + $na++; + } else { + $self->debug( "# Extraction method failed\n" ); + $fail++; + } + } + + ### warn something went wrong if we didn't get an extractor + unless( $self->_extractor ) { + my $diag = $fail ? loc("Extract failed due to errors") : + $na ? loc("Extract failed; no extractors available") : + ''; + + $self->_error($diag); + $ok = 0; + } + } + + ### and chdir back ### + unless( chdir $cwd ) { + $self->_error(loc("Could not chdir back to start dir '%1': %2'", + $cwd, $!)); + } + + return $ok; +} + +=pod + +=head1 ACCESSORS + +=head2 $ae->error([BOOL]) + +Returns the last encountered error as string. +Pass it a true value to get the C output instead. + +=head2 $ae->extract_path + +This is the directory the archive got extracted to. +See C for details. + +=head2 $ae->files + +This is an array ref holding all the paths from the archive. +See C for details. + +=head2 $ae->archive + +This is the full path to the archive file represented by this +C object. + +=head2 $ae->type + +This is the type of archive represented by this C +object. See accessors below for an easier way to use this. +See the C method for details. + +=head2 $ae->types + +Returns a list of all known C for C's +C method. + +=cut + +sub types { return @Types } + +=head2 $ae->is_tgz + +Returns true if the file is of type C<.tar.gz>. +See the C method for details. + +=head2 $ae->is_tar + +Returns true if the file is of type C<.tar>. +See the C method for details. + +=head2 $ae->is_gz + +Returns true if the file is of type C<.gz>. +See the C method for details. + +=head2 $ae->is_Z + +Returns true if the file is of type C<.Z>. +See the C method for details. + +=head2 $ae->is_zip + +Returns true if the file is of type C<.zip>. +See the C method for details. + +=head2 $ae->is_lzma + +Returns true if the file is of type C<.lzma>. +See the C method for details. + +=head2 $ae->is_xz + +Returns true if the file is of type C<.xz>. +See the C method for details. + +=cut + +### quick check methods ### +sub is_tgz { return $_[0]->type eq TGZ } +sub is_tar { return $_[0]->type eq TAR } +sub is_gz { return $_[0]->type eq GZ } +sub is_zip { return $_[0]->type eq ZIP } +sub is_tbz { return $_[0]->type eq TBZ } +sub is_bz2 { return $_[0]->type eq BZ2 } +sub is_Z { return $_[0]->type eq Z } +sub is_lzma { return $_[0]->type eq LZMA } +sub is_xz { return $_[0]->type eq XZ } +sub is_txz { return $_[0]->type eq TXZ } + +=pod + +=head2 $ae->bin_tar + +Returns the full path to your tar binary, if found. + +=head2 $ae->bin_gzip + +Returns the full path to your gzip binary, if found + +=head2 $ae->bin_unzip + +Returns the full path to your unzip binary, if found + +=head2 $ae->bin_unlzma + +Returns the full path to your unlzma binary, if found + +=head2 $ae->bin_unxz + +Returns the full path to your unxz binary, if found + +=cut + +### paths to commandline tools ### +sub bin_gzip { return $PROGRAMS->{'gzip'} if $PROGRAMS->{'gzip'} } +sub bin_unzip { return $PROGRAMS->{'unzip'} if $PROGRAMS->{'unzip'} } +sub bin_tar { return $PROGRAMS->{'tar'} if $PROGRAMS->{'tar'} } +sub bin_bunzip2 { return $PROGRAMS->{'bunzip2'} if $PROGRAMS->{'bunzip2'} } +sub bin_uncompress { return $PROGRAMS->{'uncompress'} + if $PROGRAMS->{'uncompress'} } +sub bin_unlzma { return $PROGRAMS->{'unlzma'} if $PROGRAMS->{'unlzma'} } +sub bin_unxz { return $PROGRAMS->{'unxz'} if $PROGRAMS->{'unxz'} } + +=head2 $bool = $ae->have_old_bunzip2 + +Older versions of C, from before the C release, +require all archive names to end in C<.bz2> or it will not extract +them. This method checks if you have a recent version of C +that allows any extension, or an older one that doesn't. + +=cut + +sub have_old_bunzip2 { + my $self = shift; + + ### no bunzip2? no old bunzip2 either :) + return unless $self->bin_bunzip2; + + ### if we can't run this, we can't be sure if it's too old or not + ### XXX stupid stupid stupid bunzip2 doesn't understand --version + ### is not a request to extract data: + ### $ bunzip2 --version + ### bzip2, a block-sorting file compressor. Version 1.0.2, 30-Dec-2001. + ### [...] + ### bunzip2: I won't read compressed data from a terminal. + ### bunzip2: For help, type: `bunzip2 --help'. + ### $ echo $? + ### 1 + ### HATEFUL! + + ### double hateful: bunzip2 --version also hangs if input is a pipe + ### See #32370: Archive::Extract will hang if stdin is a pipe [+PATCH] + ### So, we have to provide *another* argument which is a fake filename, + ### just so it wont try to read from stdin to print its version.. + ### *sigh* + ### Even if the file exists, it won't clobber or change it. + my $buffer; + scalar run( + command => [$self->bin_bunzip2, '--version', 'NoSuchFile'], + verbose => 0, + buffer => \$buffer + ); + + ### no output + return unless $buffer; + + my ($version) = $buffer =~ /version \s+ (\d+)/ix; + + return 1 if $version < 1; + return; +} + +################################# +# +# Untar code +# +################################# + +### annoying issue with (gnu) tar on win32, as illustrated by this +### bug: https://rt.cpan.org/Ticket/Display.html?id=40138 +### which shows that (gnu) tar will interpret a file name with a : +### in it as a remote file name, so C:\tmp\foo.txt is interpreted +### as a remote shell, and the extract fails. +{ my @ExtraTarFlags; + if( ON_WIN32 and my $cmd = __PACKAGE__->bin_tar ) { + + ### if this is gnu tar we are running, we need to use --force-local + push @ExtraTarFlags, '--force-local' if `$cmd --version` =~ /gnu tar/i; + } + + + ### use /bin/tar to extract ### + sub _untar_bin { + my $self = shift; + + ### check for /bin/tar ### + ### check for /bin/gzip if we need it ### + ### if any of the binaries are not available, return NA + { my $diag = !$self->bin_tar ? + loc("No '%1' program found", '/bin/tar') : + $self->is_tgz && !$self->bin_gzip ? + loc("No '%1' program found", '/bin/gzip') : + $self->is_tbz && !$self->bin_bunzip2 ? + loc("No '%1' program found", '/bin/bunzip2') : + $self->is_txz && !$self->bin_unxz ? + loc("No '%1' program found", '/bin/unxz') : + ''; + + if( $diag ) { + $self->_error( $diag ); + return METHOD_NA; + } + } + + ### XXX figure out how to make IPC::Run do this in one call -- + ### currently i don't know how to get output of a command after a pipe + ### trapped in a scalar. Mailed barries about this 5th of june 2004. + + ### see what command we should run, based on whether + ### it's a .tgz or .tar + + ### GNU tar can't handled VMS filespecs, but VMSTAR can handle Unix filespecs. + my $archive = $self->archive; + $archive = VMS::Filespec::unixify($archive) if ON_VMS; + + ### XXX solaris tar and bsdtar are having different outputs + ### depending whether you run with -x or -t + ### compensate for this insanity by running -t first, then -x + { my $cmd = + $self->is_tgz ? [$self->bin_gzip, '-c', '-d', '-f', $archive, '|', + $self->bin_tar, '-tf', '-'] : + $self->is_tbz ? [$self->bin_bunzip2, '-cd', $archive, '|', + $self->bin_tar, '-tf', '-'] : + $self->is_txz ? [$self->bin_unxz, '-cd', $archive, '|', + $self->bin_tar, '-tf', '-'] : + [$self->bin_tar, @ExtraTarFlags, '-tf', $archive]; + + ### run the command + ### newer versions of 'tar' (1.21 and up) now print record size + ### to STDERR as well if v OR t is given (used to be both). This + ### is a 'feature' according to the changelog, so we must now only + ### inspect STDOUT, otherwise, failures like these occur: + ### http://www.cpantesters.org/cpan/report/3230366 + my $buffer = ''; + my @out = run( command => $cmd, + buffer => \$buffer, + verbose => $DEBUG ); + + ### command was unsuccessful + unless( $out[0] ) { + return $self->_error(loc( + "Error listing contents of archive '%1': %2", + $archive, $buffer )); + } + + ### no buffers available? + if( !IPC::Cmd->can_capture_buffer and !$buffer ) { + $self->_error( $self->_no_buffer_files( $archive ) ); + + } else { + ### if we're on solaris we /might/ be using /bin/tar, which has + ### a weird output format... we might also be using + ### /usr/local/bin/tar, which is gnu tar, which is perfectly + ### fine... so we have to do some guessing here =/ + my @files = map { chomp; + !ON_SOLARIS ? $_ + : (m|^ x \s+ # 'xtract' -- sigh + (.+?), # the actual file name + \s+ [\d,.]+ \s bytes, + \s+ [\d,.]+ \s tape \s blocks + |x ? $1 : $_); + + ### only STDOUT, see above. Sometimes, extra whitespace + ### is present, so make sure we only pick lines with + ### a length + } grep { length } map { split $/, $_ } join '', @{$out[3]}; + + ### store the files that are in the archive ### + $self->files(\@files); + } + } + + ### now actually extract it ### + { my $cmd = + $self->is_tgz ? [$self->bin_gzip, '-c', '-d', '-f', $archive, '|', + $self->bin_tar, '-xf', '-'] : + $self->is_tbz ? [$self->bin_bunzip2, '-cd', $archive, '|', + $self->bin_tar, '-xf', '-'] : + $self->is_txz ? [$self->bin_unxz, '-cd', $archive, '|', + $self->bin_tar, '-xf', '-'] : + [$self->bin_tar, @ExtraTarFlags, '-xf', $archive]; + + my $buffer = ''; + unless( scalar run( command => $cmd, + buffer => \$buffer, + verbose => $DEBUG ) + ) { + return $self->_error(loc("Error extracting archive '%1': %2", + $archive, $buffer )); + } + + ### we might not have them, due to lack of buffers + if( $self->files ) { + ### now that we've extracted, figure out where we extracted to + my $dir = $self->__get_extract_dir( $self->files ); + + ### store the extraction dir ### + $self->extract_path( $dir ); + } + } + + ### we got here, no error happened + return 1; + } +} + + +### use archive::tar to extract ### +sub _untar_at { + my $self = shift; + + ### Loading Archive::Tar is going to set it to 1, so make it local + ### within this block, starting with its initial value. Whatever + ### Achive::Tar does will be undone when we return. + ### + ### Also, later, set $Archive::Tar::WARN to $Archive::Extract::WARN + ### so users don't have to even think about this variable. If they + ### do, they still get their set value outside of this call. + local $Archive::Tar::WARN = $Archive::Tar::WARN; + + ### we definitely need Archive::Tar, so load that first + { my $use_list = { 'Archive::Tar' => '0.0' }; + + unless( can_load( modules => $use_list ) ) { + + $self->_error(loc("You do not have '%1' installed - " . + "Please install it as soon as possible.", + 'Archive::Tar')); + + return METHOD_NA; + } + } + + ### we might pass it a filehandle if it's a .tbz file.. + my $fh_to_read = $self->archive; + + ### we will need Compress::Zlib too, if it's a tgz... and IO::Zlib + ### if A::T's version is 0.99 or higher + if( $self->is_tgz ) { + my $use_list = { 'Compress::Zlib' => '0.0' }; + $use_list->{ 'IO::Zlib' } = '0.0' + if $Archive::Tar::VERSION >= '0.99'; + + unless( can_load( modules => $use_list ) ) { + my $which = join '/', sort keys %$use_list; + + $self->_error(loc( + "You do not have '%1' installed - Please ". + "install it as soon as possible.", $which) + ); + + return METHOD_NA; + } + + } elsif ( $self->is_tbz ) { + my $use_list = { 'IO::Uncompress::Bunzip2' => '0.0' }; + unless( can_load( modules => $use_list ) ) { + $self->_error(loc( + "You do not have '%1' installed - Please " . + "install it as soon as possible.", + 'IO::Uncompress::Bunzip2') + ); + + return METHOD_NA; + } + + my $bz = IO::Uncompress::Bunzip2->new( $self->archive ) or + return $self->_error(loc("Unable to open '%1': %2", + $self->archive, + $IO::Uncompress::Bunzip2::Bunzip2Error)); + + $fh_to_read = $bz; + } elsif ( $self->is_txz ) { + my $use_list = { 'IO::Uncompress::UnXz' => '0.0' }; + unless( can_load( modules => $use_list ) ) { + $self->_error(loc( + "You do not have '%1' installed - Please " . + "install it as soon as possible.", + 'IO::Uncompress::UnXz') + ); + + return METHOD_NA; + } + + my $xz = IO::Uncompress::UnXz->new( $self->archive ) or + return $self->_error(loc("Unable to open '%1': %2", + $self->archive, + $IO::Uncompress::UnXz::UnXzError)); + + $fh_to_read = $xz; + } + + my @files; + { + ### $Archive::Tar::WARN is 1 by default in Archive::Tar, but we've + ### localized $Archive::Tar::WARN already. + $Archive::Tar::WARN = $Archive::Extract::WARN; + + ### only tell it it's compressed if it's a .tgz, as we give it a file + ### handle if it's a .tbz + my @read = ( $fh_to_read, ( $self->is_tgz ? 1 : 0 ) ); + + ### for version of Archive::Tar > 1.04 + local $Archive::Tar::CHOWN = 0; + + ### use the iterator if we can. it's a feature of A::T 1.40 and up + if ( $_ALLOW_TAR_ITER && Archive::Tar->can( 'iter' ) ) { + + my $next; + unless ( $next = Archive::Tar->iter( @read ) ) { + return $self->_error(loc( + "Unable to read '%1': %2", $self->archive, + $Archive::Tar::error)); + } + + while ( my $file = $next->() ) { + push @files, $file->full_path; + + $file->extract or return $self->_error(loc( + "Unable to read '%1': %2", + $self->archive, + $Archive::Tar::error)); + } + + ### older version, read the archive into memory + } else { + + my $tar = Archive::Tar->new(); + + unless( $tar->read( @read ) ) { + return $self->_error(loc("Unable to read '%1': %2", + $self->archive, $Archive::Tar::error)); + } + + ### workaround to prevent Archive::Tar from setting uid, which + ### is a potential security hole. -autrijus + ### have to do it here, since A::T needs to be /loaded/ first ### + { no strict 'refs'; local $^W; + + ### older versions of archive::tar <= 0.23 + *Archive::Tar::chown = sub {}; + } + + { local $^W; # quell 'splice() offset past end of array' warnings + # on older versions of A::T + + ### older archive::tar always returns $self, return value + ### slightly fux0r3d because of it. + $tar->extract or return $self->_error(loc( + "Unable to extract '%1': %2", + $self->archive, $Archive::Tar::error )); + } + + @files = $tar->list_files; + } + } + + my $dir = $self->__get_extract_dir( \@files ); + + ### store the files that are in the archive ### + $self->files(\@files); + + ### store the extraction dir ### + $self->extract_path( $dir ); + + ### check if the dir actually appeared ### + return 1 if -d $self->extract_path; + + ### no dir, we failed ### + return $self->_error(loc("Unable to extract '%1': %2", + $self->archive, $Archive::Tar::error )); +} + +################################# +# +# Gunzip code +# +################################# + +sub _gunzip_bin { + my $self = shift; + + ### check for /bin/gzip -- we need it ### + unless( $self->bin_gzip ) { + $self->_error(loc("No '%1' program found", '/bin/gzip')); + return METHOD_NA; + } + + my $fh = FileHandle->new('>'. $self->_gunzip_to) or + return $self->_error(loc("Could not open '%1' for writing: %2", + $self->_gunzip_to, $! )); + + my $cmd = [ $self->bin_gzip, '-c', '-d', '-f', $self->archive ]; + + my $buffer; + unless( scalar run( command => $cmd, + verbose => $DEBUG, + buffer => \$buffer ) + ) { + return $self->_error(loc("Unable to gunzip '%1': %2", + $self->archive, $buffer)); + } + + ### no buffers available? + if( !IPC::Cmd->can_capture_buffer and !$buffer ) { + $self->_error( $self->_no_buffer_content( $self->archive ) ); + } + + $self->_print($fh, $buffer) if defined $buffer; + + close $fh; + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + +sub _gunzip_cz { + my $self = shift; + + my $use_list = { 'Compress::Zlib' => '0.0' }; + unless( can_load( modules => $use_list ) ) { + $self->_error(loc("You do not have '%1' installed - Please " . + "install it as soon as possible.", 'Compress::Zlib')); + return METHOD_NA; + } + + my $gz = Compress::Zlib::gzopen( $self->archive, "rb" ) or + return $self->_error(loc("Unable to open '%1': %2", + $self->archive, $Compress::Zlib::gzerrno)); + + my $fh = FileHandle->new('>'. $self->_gunzip_to) or + return $self->_error(loc("Could not open '%1' for writing: %2", + $self->_gunzip_to, $! )); + + my $buffer; + $self->_print($fh, $buffer) while $gz->gzread($buffer) > 0; + $fh->close; + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + +################################# +# +# Uncompress code +# +################################# + +sub _uncompress_bin { + my $self = shift; + + ### check for /bin/gzip -- we need it ### + unless( $self->bin_uncompress ) { + $self->_error(loc("No '%1' program found", '/bin/uncompress')); + return METHOD_NA; + } + + my $fh = FileHandle->new('>'. $self->_gunzip_to) or + return $self->_error(loc("Could not open '%1' for writing: %2", + $self->_gunzip_to, $! )); + + my $cmd = [ $self->bin_uncompress, '-c', $self->archive ]; + + my $buffer; + unless( scalar run( command => $cmd, + verbose => $DEBUG, + buffer => \$buffer ) + ) { + return $self->_error(loc("Unable to uncompress '%1': %2", + $self->archive, $buffer)); + } + + ### no buffers available? + if( !IPC::Cmd->can_capture_buffer and !$buffer ) { + $self->_error( $self->_no_buffer_content( $self->archive ) ); + } + + $self->_print($fh, $buffer) if defined $buffer; + + close $fh; + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + + +################################# +# +# Unzip code +# +################################# + + +sub _unzip_bin { + my $self = shift; + + ### check for /bin/gzip if we need it ### + unless( $self->bin_unzip ) { + $self->_error(loc("No '%1' program found", '/bin/unzip')); + return METHOD_NA; + } + + ### first, get the files.. it must be 2 different commands with 'unzip' :( + { ### on VMS, capital letter options have to be quoted. This is + ### reported by John Malmberg on P5P Tue 21 Aug 2007 05:05:11 + ### Subject: [patch@31735]Archive Extract fix on VMS. + my $opt = ON_VMS ? '"-Z"' : '-Z'; + my $cmd = [ $self->bin_unzip, $opt, '-1', $self->archive ]; + + my $buffer = ''; + unless( scalar run( command => $cmd, + verbose => $DEBUG, + buffer => \$buffer ) + ) { + return $self->_error(loc("Unable to unzip '%1': %2", + $self->archive, $buffer)); + } + + ### no buffers available? + if( !IPC::Cmd->can_capture_buffer and !$buffer ) { + $self->_error( $self->_no_buffer_files( $self->archive ) ); + + } else { + ### Annoyingly, pesky MSWin32 can either have 'native' tools + ### which have \r\n line endings or Cygwin-based tools which + ### have \n line endings. Jan Dubois suggested using this fix + my $split = ON_WIN32 ? qr/\r?\n/ : "\n"; + $self->files( [split $split, $buffer] ); + } + } + + ### now, extract the archive ### + { my $cmd = [ $self->bin_unzip, '-qq', '-o', $self->archive ]; + + my $buffer; + unless( scalar run( command => $cmd, + verbose => $DEBUG, + buffer => \$buffer ) + ) { + return $self->_error(loc("Unable to unzip '%1': %2", + $self->archive, $buffer)); + } + + if( scalar @{$self->files} ) { + my $files = $self->files; + my $dir = $self->__get_extract_dir( $files ); + + $self->extract_path( $dir ); + } + } + + return 1; +} + +sub _unzip_az { + my $self = shift; + + my $use_list = { 'Archive::Zip' => '0.0' }; + unless( can_load( modules => $use_list ) ) { + $self->_error(loc("You do not have '%1' installed - Please " . + "install it as soon as possible.", 'Archive::Zip')); + return METHOD_NA; + } + + my $zip = Archive::Zip->new(); + + unless( $zip->read( $self->archive ) == &Archive::Zip::AZ_OK ) { + return $self->_error(loc("Unable to read '%1'", $self->archive)); + } + + my @files; + + + ### Address: #43278: Explicitly tell Archive::Zip where to put the files: + ### "In my BackPAN indexing, Archive::Zip was extracting things + ### in my script's directory instead of the current working directory. + ### I traced this back through Archive::Zip::_asLocalName which + ### eventually calls File::Spec::Win32::rel2abs which on Windows might + ### call Cwd::getdcwd. getdcwd returns the wrong directory in my + ### case, even though I think I'm on the same drive. + ### + ### To fix this, I pass the optional second argument to + ### extractMember using the cwd from Archive::Extract." --bdfoy + + ## store cwd() before looping; calls to cwd() can be expensive, and + ### it won't change during the loop + my $extract_dir = cwd(); + + ### have to extract every member individually ### + for my $member ($zip->members) { + push @files, $member->{fileName}; + + ### file to extract to, to avoid the above problem + my $to = File::Spec->catfile( $extract_dir, $member->{fileName} ); + + unless( $zip->extractMember($member, $to) == &Archive::Zip::AZ_OK ) { + return $self->_error(loc("Extraction of '%1' from '%2' failed", + $member->{fileName}, $self->archive )); + } + } + + my $dir = $self->__get_extract_dir( \@files ); + + ### set what files where extract, and where they went ### + $self->files( \@files ); + $self->extract_path( File::Spec->rel2abs($dir) ); + + return 1; +} + +sub __get_extract_dir { + my $self = shift; + my $files = shift || []; + + return unless scalar @$files; + + my($dir1, $dir2); + for my $aref ( [ \$dir1, 0 ], [ \$dir2, -1 ] ) { + my($dir,$pos) = @$aref; + + ### add a catdir(), so that any trailing slashes get + ### take care of (removed) + ### also, a catdir() normalises './dir/foo' to 'dir/foo'; + ### which was the problem in bug #23999 + my $res = -d $files->[$pos] + ? File::Spec->catdir( $files->[$pos], '' ) + : File::Spec->catdir( dirname( $files->[$pos] ) ); + + $$dir = $res; + } + + ### if the first and last dir don't match, make sure the + ### dirname is not set wrongly + my $dir; + + ### dirs are the same, so we know for sure what the extract dir is + if( $dir1 eq $dir2 ) { + $dir = $dir1; + + ### dirs are different.. do they share the base dir? + ### if so, use that, if not, fall back to '.' + } else { + my $base1 = [ File::Spec->splitdir( $dir1 ) ]->[0]; + my $base2 = [ File::Spec->splitdir( $dir2 ) ]->[0]; + + $dir = File::Spec->rel2abs( $base1 eq $base2 ? $base1 : '.' ); + } + + return File::Spec->rel2abs( $dir ); +} + +################################# +# +# Bunzip2 code +# +################################# + +sub _bunzip2_bin { + my $self = shift; + + ### check for /bin/gzip -- we need it ### + unless( $self->bin_bunzip2 ) { + $self->_error(loc("No '%1' program found", '/bin/bunzip2')); + return METHOD_NA; + } + + my $fh = FileHandle->new('>'. $self->_gunzip_to) or + return $self->_error(loc("Could not open '%1' for writing: %2", + $self->_gunzip_to, $! )); + + ### guard against broken bunzip2. See ->have_old_bunzip2() + ### for details + if( $self->have_old_bunzip2 and $self->archive !~ /\.bz2$/i ) { + return $self->_error(loc("Your bunzip2 version is too old and ". + "can only extract files ending in '%1'", + '.bz2')); + } + + my $cmd = [ $self->bin_bunzip2, '-cd', $self->archive ]; + + my $buffer; + unless( scalar run( command => $cmd, + verbose => $DEBUG, + buffer => \$buffer ) + ) { + return $self->_error(loc("Unable to bunzip2 '%1': %2", + $self->archive, $buffer)); + } + + ### no buffers available? + if( !IPC::Cmd->can_capture_buffer and !$buffer ) { + $self->_error( $self->_no_buffer_content( $self->archive ) ); + } + + $self->_print($fh, $buffer) if defined $buffer; + + close $fh; + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + +### using cz2, the compact versions... this we use mainly in archive::tar +### extractor.. +# sub _bunzip2_cz1 { +# my $self = shift; +# +# my $use_list = { 'IO::Uncompress::Bunzip2' => '0.0' }; +# unless( can_load( modules => $use_list ) ) { +# return $self->_error(loc("You do not have '%1' installed - Please " . +# "install it as soon as possible.", +# 'IO::Uncompress::Bunzip2')); +# } +# +# my $bz = IO::Uncompress::Bunzip2->new( $self->archive ) or +# return $self->_error(loc("Unable to open '%1': %2", +# $self->archive, +# $IO::Uncompress::Bunzip2::Bunzip2Error)); +# +# my $fh = FileHandle->new('>'. $self->_gunzip_to) or +# return $self->_error(loc("Could not open '%1' for writing: %2", +# $self->_gunzip_to, $! )); +# +# my $buffer; +# $fh->print($buffer) while $bz->read($buffer) > 0; +# $fh->close; +# +# ### set what files where extract, and where they went ### +# $self->files( [$self->_gunzip_to] ); +# $self->extract_path( File::Spec->rel2abs(cwd()) ); +# +# return 1; +# } + +sub _bunzip2_bz2 { + my $self = shift; + + my $use_list = { 'IO::Uncompress::Bunzip2' => '0.0' }; + unless( can_load( modules => $use_list ) ) { + $self->_error(loc("You do not have '%1' installed - Please " . + "install it as soon as possible.", + 'IO::Uncompress::Bunzip2')); + return METHOD_NA; + } + + IO::Uncompress::Bunzip2::bunzip2($self->archive => $self->_gunzip_to) + or return $self->_error(loc("Unable to uncompress '%1': %2", + $self->archive, + $IO::Uncompress::Bunzip2::Bunzip2Error)); + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + +################################# +# +# UnXz code +# +################################# + +sub _unxz_bin { + my $self = shift; + + ### check for /bin/unxz -- we need it ### + unless( $self->bin_unxz ) { + $self->_error(loc("No '%1' program found", '/bin/unxz')); + return METHOD_NA; + } + + my $fh = FileHandle->new('>'. $self->_gunzip_to) or + return $self->_error(loc("Could not open '%1' for writing: %2", + $self->_gunzip_to, $! )); + + my $cmd = [ $self->bin_unxz, '-c', '-d', '-f', $self->archive ]; + + my $buffer; + unless( scalar run( command => $cmd, + verbose => $DEBUG, + buffer => \$buffer ) + ) { + return $self->_error(loc("Unable to unxz '%1': %2", + $self->archive, $buffer)); + } + + ### no buffers available? + if( !IPC::Cmd->can_capture_buffer and !$buffer ) { + $self->_error( $self->_no_buffer_content( $self->archive ) ); + } + + $self->_print($fh, $buffer) if defined $buffer; + + close $fh; + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + +sub _unxz_cz { + my $self = shift; + + my $use_list = { 'IO::Uncompress::UnXz' => '0.0' }; + unless( can_load( modules => $use_list ) ) { + $self->_error(loc("You do not have '%1' installed - Please " . + "install it as soon as possible.", + 'IO::Uncompress::UnXz')); + return METHOD_NA; + } + + IO::Uncompress::UnXz::unxz($self->archive => $self->_gunzip_to) + or return $self->_error(loc("Unable to uncompress '%1': %2", + $self->archive, + $IO::Uncompress::UnXz::UnXzError)); + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + + +################################# +# +# unlzma code +# +################################# + +sub _unlzma_bin { + my $self = shift; + + ### check for /bin/unlzma -- we need it ### + unless( $self->bin_unlzma ) { + $self->_error(loc("No '%1' program found", '/bin/unlzma')); + return METHOD_NA; + } + + my $fh = FileHandle->new('>'. $self->_gunzip_to) or + return $self->_error(loc("Could not open '%1' for writing: %2", + $self->_gunzip_to, $! )); + + my $cmd = [ $self->bin_unlzma, '-c', $self->archive ]; + + my $buffer; + unless( scalar run( command => $cmd, + verbose => $DEBUG, + buffer => \$buffer ) + ) { + return $self->_error(loc("Unable to unlzma '%1': %2", + $self->archive, $buffer)); + } + + ### no buffers available? + if( !IPC::Cmd->can_capture_buffer and !$buffer ) { + $self->_error( $self->_no_buffer_content( $self->archive ) ); + } + + $self->_print($fh, $buffer) if defined $buffer; + + close $fh; + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + +sub _unlzma_cz { + my $self = shift; + + my $use_list1 = { 'IO::Uncompress::UnLzma' => '0.0' }; + my $use_list2 = { 'Compress::unLZMA' => '0.0' }; + + if (can_load( modules => $use_list1 ) ) { + IO::Uncompress::UnLzma::unlzma($self->archive => $self->_gunzip_to) + or return $self->_error(loc("Unable to uncompress '%1': %2", + $self->archive, + $IO::Uncompress::UnLzma::UnLzmaError)); + } + elsif (can_load( modules => $use_list2 ) ) { + + my $fh = FileHandle->new('>'. $self->_gunzip_to) or + return $self->_error(loc("Could not open '%1' for writing: %2", + $self->_gunzip_to, $! )); + + my $buffer; + $buffer = Compress::unLZMA::uncompressfile( $self->archive ); + unless ( defined $buffer ) { + return $self->_error(loc("Could not unlzma '%1': %2", + $self->archive, $@)); + } + + $self->_print($fh, $buffer) if defined $buffer; + + close $fh; + } + else { + $self->_error(loc("You do not have '%1' or '%2' installed - Please " . + "install it as soon as possible.", 'Compress::unLZMA', 'IO::Uncompress::UnLzma')); + return METHOD_NA; + } + + ### set what files where extract, and where they went ### + $self->files( [$self->_gunzip_to] ); + $self->extract_path( File::Spec->rel2abs(cwd()) ); + + return 1; +} + +##################################### +# +# unzip heuristics for FreeBSD-alikes +# +##################################### + +sub _is_infozip_esque { + my $unzip = shift; + + my @strings; + my $buf = ''; + + { + open my $file, '<', $unzip or die "$!\n"; + binmode $file; + local $/ = \1; + local $_; + while(<$file>) { + if ( m![[:print:]]! ) { + $buf .= $_; + next; + } + if ( $buf and m![^[:print:]]! ) { + push @strings, $buf if length $buf >= 4; + $buf = ''; + next; + } + } + } + push @strings, $buf if $buf; + foreach my $part ( @strings ) { + if ( $part =~ m!ZIPINFO! or $part =~ m!usage:.+?Z1! ) { + return $unzip; + } + } + return; +} + +################################# +# +# Error code +# +################################# + +# For printing binaries that avoids interfering globals +sub _print { + my $self = shift; + my $fh = shift; + + local( $\, $", $, ) = ( undef, ' ', '' ); + return print $fh @_; +} + +sub _error { + my $self = shift; + my $error = shift; + my $lerror = Carp::longmess($error); + + push @{$self->_error_msg}, $error; + push @{$self->_error_msg_long}, $lerror; + + ### set $Archive::Extract::WARN to 0 to disable printing + ### of errors + if( $WARN ) { + carp $DEBUG ? $lerror : $error; + } + + return; +} + +sub error { + my $self = shift; + + ### make sure we have a fallback aref + my $aref = do { + shift() + ? $self->_error_msg_long + : $self->_error_msg + } || []; + + return join $/, @$aref; +} + +=head2 debug( MESSAGE ) + +This method outputs MESSAGE to the default filehandle if C<$DEBUG> is +true. It's a small method, but it's here if you'd like to subclass it +so you can so something else with any debugging output. + +=cut + +### this is really a stub for subclassing +sub debug { + return unless $DEBUG; + + print $_[1]; +} + +sub _no_buffer_files { + my $self = shift; + my $file = shift or return; + return loc("No buffer captured, unable to tell ". + "extracted files or extraction dir for '%1'", $file); +} + +sub _no_buffer_content { + my $self = shift; + my $file = shift or return; + return loc("No buffer captured, unable to get content for '%1'", $file); +} +1; + +=pod + +=head1 HOW IT WORKS + +C tries first to determine what type of archive you +are passing it, by inspecting its suffix. It does not do this by using +Mime magic, or something related. See C below. + +Once it has determined the file type, it knows which extraction methods +it can use on the archive. It will try a perl solution first, then fall +back to a commandline tool if that fails. If that also fails, it will +return false, indicating it was unable to extract the archive. +See the section on C to see how to alter this order. + +=head1 CAVEATS + +=head2 File Extensions + +C trusts on the extension of the archive to determine +what type it is, and what extractor methods therefore can be used. If +your archives do not have any of the extensions as described in the +C method, you will have to specify the type explicitly, or +C will not be able to extract the archive for you. + +=head2 Supporting Very Large Files + +C can use either pure perl modules or command line +programs under the hood. Some of the pure perl modules (like +C and Compress::unLZMA) take the entire contents of the archive into memory, +which may not be feasible on your system. Consider setting the global +variable C<$Archive::Extract::PREFER_BIN> to C<1>, which will prefer +the use of command line programs and won't consume so much memory. + +See the C section below for details. + +=head2 Bunzip2 support of arbitrary extensions. + +Older versions of C do not support arbitrary file +extensions and insist on a C<.bz2> suffix. Although we do our best +to guard against this, if you experience a bunzip2 error, it may +be related to this. For details, please see the C +method. + +=head1 GLOBAL VARIABLES + +=head2 $Archive::Extract::DEBUG + +Set this variable to C to have all calls to command line tools +be printed out, including all their output. +This also enables C errors, instead of the regular +C errors. + +Good for tracking down why things don't work with your particular +setup. + +Defaults to C. + +=head2 $Archive::Extract::WARN + +This variable controls whether errors encountered internally by +C should be C'd or not. + +Set to false to silence warnings. Inspect the output of the C +method manually to see what went wrong. + +Defaults to C. + +=head2 $Archive::Extract::PREFER_BIN + +This variables controls whether C should prefer the +use of perl modules, or commandline tools to extract archives. + +Set to C to have C prefer commandline tools. + +Defaults to C. + +=head1 TODO / CAVEATS + +=over 4 + +=item Mime magic support + +Maybe this module should use something like C to determine +the type, rather than blindly trust the suffix. + +=item Thread safety + +Currently, C does a C to the extraction dir before +extraction, and a C back again after. This is not necessarily +thread safe. See C bug C<#45671> for details. + +=back + +=head1 BUG REPORTS + +Please report bugs or other issues to Ebug-archive-extract@rt.cpan.orgE. + +=head1 AUTHOR + +This module by Jos Boumans Ekane@cpan.orgE. + +=head1 COPYRIGHT + +This library is free software; you may redistribute and/or modify it +under the same terms as Perl itself. + +=cut + +# Local variables: +# c-indentation-style: bsd +# c-basic-offset: 4 +# indent-tabs-mode: nil +# End: +# vim: expandtab shiftwidth=4: + diff --git a/t/01_Archive-Extract.t b/t/01_Archive-Extract.t new file mode 100644 index 0000000..cb67d27 --- /dev/null +++ b/t/01_Archive-Extract.t @@ -0,0 +1,557 @@ +BEGIN { chdir 't' if -d 't' }; +BEGIN { mkdir 'out' unless -d 'out' }; + +### left behind, at least on Win32. See core patch #31904 +END { rmtree('out') }; + +use strict; +use lib qw[../lib]; + +use constant IS_WIN32 => $^O eq 'MSWin32' ? 1 : 0; +use constant IS_CYGWIN => $^O eq 'cygwin' ? 1 : 0; +use constant IS_VMS => $^O eq 'VMS' ? 1 : 0; + +use Cwd qw[cwd]; +use Test::More qw[no_plan]; +use File::Spec; +use File::Spec::Unix; +use File::Path; +use Data::Dumper; +use File::Basename qw[basename]; +use Module::Load::Conditional qw[check_install]; + +### uninitialized value in File::Spec warnings come from A::Zip: +# t/01_Archive-Extract....ok 135/0Use of uninitialized value in concatenation (.) or string at /opt/lib/perl5/5.8.3/File/Spec/Unix.pm line 313. +# File::Spec::Unix::catpath('File::Spec','','','undef') called at /opt/lib/perl5/site_perl/5.8.3/Archive/Zip.pm line 473 +# Archive::Zip::_asLocalName('') called at /opt/lib/perl5/site_perl/5.8.3/Archive/Zip.pm line 652 +# Archive::Zip::Archive::extractMember('Archive::Zip::Archive=HASH(0x9679c8)','Archive::Zip::ZipFileMember=HASH(0x9678fc)') called at ../lib/Archive/Extract.pm line 753 +# Archive::Extract::_unzip_az('Archive::Extract=HASH(0x966eac)') called at ../lib/Archive/Extract.pm line 674 +# Archive::Extract::_unzip('Archive::Extract=HASH(0x966eac)') called at ../lib/Archive/Extract.pm line 275 +# Archive::Extract::extract('Archive::Extract=HASH(0x966eac)','to','/Users/kane/sources/p4/other/archive-extract/t/out') called at t/01_Archive-Extract.t line 180 +#BEGIN { $SIG{__WARN__} = sub { require Carp; Carp::cluck(@_) } }; + +if ((IS_WIN32 or IS_CYGWIN) && ! $ENV{PERL_CORE}) { + diag( "Older versions of Archive::Zip may cause File::Spec warnings" ); + diag( "See bug #19713 in rt.cpan.org. It is safe to ignore them" ); +} + +my $Me = basename( $0 ); +my $Class = 'Archive::Extract'; + +use_ok($Class); + +### debug will always be enabled on dev versions +my $Debug = (not $ENV{PERL_CORE} and + ($ARGV[0] or $Archive::Extract::VERSION =~ /_/)) + ? 1 + : 0; + +my $Self = File::Spec->rel2abs( + IS_WIN32 ? &Win32::GetShortPathName( cwd() ) : cwd() + ); +my $SrcDir = File::Spec->catdir( $Self,'src' ); +my $OutDir = File::Spec->catdir( $Self,'out' ); + +### stupid stupid silly stupid warnings silly! ### +$Archive::Extract::DEBUG = $Archive::Extract::DEBUG = $Debug; +$Archive::Extract::WARN = $Archive::Extract::WARN = $Debug; + +diag( "\n\n*** DEBUG INFORMATION ENABLED ***\n\n" ) if $Debug; + +# Be as evil as possible to print +$\ = "ORS_FLAG"; +$, = "OFS_FLAG"; +$" = "LISTSEP_FLAG"; + +my $tmpl = { + ### plain files + 'x.bz2' => { programs => [qw[bunzip2]], + modules => [qw[IO::Uncompress::Bunzip2]], + method => 'is_bz2', + outfile => 'a', + }, + 'x.tgz' => { programs => [qw[gzip tar]], + modules => [qw[Archive::Tar IO::Zlib]], + method => 'is_tgz', + outfile => 'a', + }, + 'x.tar.gz' => { programs => [qw[gzip tar]], + modules => [qw[Archive::Tar IO::Zlib]], + method => 'is_tgz', + outfile => 'a', + }, + 'x.tar' => { programs => [qw[tar]], + modules => [qw[Archive::Tar]], + method => 'is_tar', + outfile => 'a', + }, + 'x.gz' => { programs => [qw[gzip]], + modules => [qw[Compress::Zlib]], + method => 'is_gz', + outfile => 'a', + }, + 'x.Z' => { programs => [qw[uncompress]], + modules => [qw[Compress::Zlib]], + method => 'is_Z', + outfile => 'a', + }, + 'x.zip' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'a', + }, + 'x.jar' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'a', + }, + 'x.ear' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'a', + }, + 'x.war' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'a', + }, + 'x.par' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'a', + }, + 'x.lzma' => { programs => [qw[unlzma]], + modules => [qw[Compress::unLZMA]], + method => 'is_lzma', + outfile => 'a', + }, + 'x.xz' => { programs => [qw[unxz]], + modules => [qw[IO::Uncompress::UnXz]], + method => 'is_xz', + outfile => 'a', + }, + 'x.txz' => { programs => [qw[unxz tar]], + modules => [qw[Archive::Tar + IO::Uncompress::UnXz]], + method => 'is_txz', + outfile => 'a', + }, + 'x.tar.xz'=> { programs => [qw[unxz tar]], + modules => [qw[Archive::Tar + IO::Uncompress::UnXz]], + method => 'is_txz', + outfile => 'a', + }, + ### with a directory + 'y.tbz' => { programs => [qw[bunzip2 tar]], + modules => [qw[Archive::Tar + IO::Uncompress::Bunzip2]], + method => 'is_tbz', + outfile => 'z', + outdir => 'y', + }, + 'y.tar.bz2' => { programs => [qw[bunzip2 tar]], + modules => [qw[Archive::Tar + IO::Uncompress::Bunzip2]], + method => 'is_tbz', + outfile => 'z', + outdir => 'y' + }, + 'y.txz' => { programs => [qw[unxz tar]], + modules => [qw[Archive::Tar + IO::Uncompress::UnXz]], + method => 'is_txz', + outfile => 'z', + outdir => 'y', + }, + 'y.tar.xz' => { programs => [qw[unxz tar]], + modules => [qw[Archive::Tar + IO::Uncompress::UnXz]], + method => 'is_txz', + outfile => 'z', + outdir => 'y' + }, + 'y.tgz' => { programs => [qw[gzip tar]], + modules => [qw[Archive::Tar IO::Zlib]], + method => 'is_tgz', + outfile => 'z', + outdir => 'y' + }, + 'y.tar.gz' => { programs => [qw[gzip tar]], + modules => [qw[Archive::Tar IO::Zlib]], + method => 'is_tgz', + outfile => 'z', + outdir => 'y' + }, + 'y.tar' => { programs => [qw[tar]], + modules => [qw[Archive::Tar]], + method => 'is_tar', + outfile => 'z', + outdir => 'y' + }, + 'y.zip' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'z', + outdir => 'y' + }, + 'y.par' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'z', + outdir => 'y' + }, + 'y.jar' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'z', + outdir => 'y' + }, + 'y.ear' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'z', + outdir => 'y' + }, + 'y.war' => { programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'z', + outdir => 'y' + }, + ### with non-same top dir + 'double_dir.zip' => { + programs => [qw[unzip]], + modules => [qw[Archive::Zip]], + method => 'is_zip', + outfile => 'w', + outdir => 'x' + }, +}; + +### XXX special case: on older solaris boxes (8), +### bunzip2 is version 0.9.x. Older versions (pre 1), +### only extract files that end in .bz2, and nothing +### else. So remove that test case if we have an older +### bunzip2 :( +{ if( $Class->have_old_bunzip2 ) { + delete $tmpl->{'y.tbz'}; + diag "Old bunzip2 detected, skipping .tbz test"; + } +} + +### show us the tools IPC::Cmd will use to run binary programs +if( $Debug ) { + diag( "IPC::Run enabled: $IPC::Cmd::USE_IPC_RUN " ); + diag( "IPC::Run available: " . IPC::Cmd->can_use_ipc_run ); + diag( "IPC::Run vesion: $IPC::Run::VERSION" ); + diag( "IPC::Open3 enabled: $IPC::Cmd::USE_IPC_OPEN3 " ); + diag( "IPC::Open3 available: " . IPC::Cmd->can_use_ipc_open3 ); + diag( "IPC::Open3 vesion: $IPC::Open3::VERSION" ); +} + +### test all type specifications to new() +### this tests bug #24578: Wrong check for `type' argument +{ my $meth = 'types'; + + can_ok( $Class, $meth ); + + my @types = $Class->$meth; + ok( scalar(@types), " Got a list of types" ); + + for my $type ( @types ) { + my $obj = $Class->new( archive => $Me, type => $type ); + ok( $obj, " Object created based on '$type'" ); + ok( !$obj->error, " No error logged" ); + } + + ### test unknown type + { ### must turn on warnings to catch error here + local $Archive::Extract::WARN = 1; + + my $warnings; + local $SIG{__WARN__} = sub { $warnings .= "@_" }; + + my $ae = $Class->new( archive => $Me ); + ok( !$ae, " No archive created based on '$Me'" ); + ok( !$Class->error, " Error not captured in class method" ); + ok( $warnings, " Error captured as warning" ); + like( $warnings, qr/Cannot determine file type for/, + " Error is: unknown file type" ); + } +} + +### test multiple errors +### XXX whitebox test +{ ### grab a random file from the template, so we can make an object + my $ae = Archive::Extract->new( + archive => File::Spec->catfile($SrcDir,[keys %$tmpl]->[0]) + ); + ok( $ae, "Archive created" ); + ok( not($ae->error), " No errors yet" ); + + ### log a few errors + { local $Archive::Extract::WARN = 0; + $ae->_error( $_ ) for 1..5; + } + + my $err = $ae->error; + ok( $err, " Errors retrieved" ); + + my $expect = join $/, 1..5; + is( $err, $expect, " As expected" ); + + ### this resets the errors + ### override the 'check' routine to return false, so we bail out of + ### extract() early and just run the error reset code; + { no warnings qw[once redefine]; + local *Archive::Extract::check = sub { return }; + $ae->extract; + } + ok( not($ae->error), " Errors erased after ->extract() call" ); +} + +### XXX whitebox test +### test __get_extract_dir +SKIP: { my $meth = '__get_extract_dir'; + + ### get the right separator -- File::Spec does clean ups for + ### paths, so we need to join ourselves. + my $sep = [ split '', File::Spec->catfile( 'a', 'b' ) ]->[1]; + + ### bug #23999: Attempt to generate Makefile.PL gone awry + ### showed that dirs in the style of './dir/' were reported + ### to be unpacked in '.' rather than in 'dir'. here we test + ### for this. + for my $prefix ( '', '.' ) { + skip "Prepending ./ to a valid path doesn't give you another valid path on VMS", 2 + if IS_VMS && length($prefix); + + my $dir = basename( $SrcDir ); + + ### build a list like [dir, dir/file] and [./dir ./dir/file] + ### where the dir and file actually exist, which is important + ### for the method call + my @files = map { length $prefix + ? join $sep, $prefix, $_ + : $_ + } $dir, File::Spec->catfile( $dir, [keys %$tmpl]->[0] ); + + my $res = $Class->$meth( \@files ); + $res = &Win32::GetShortPathName( $res ) if IS_WIN32; + + ok( $res, "Found extraction dir '$res'" ); + is( $res, $SrcDir, " Is expected dir '$SrcDir'" ); + } +} + +### configuration to run in: allow perl or allow binaries +for my $switch ( [0,1], [1,0] ) { + my $cfg = "PP: $switch->[0] Bin: $switch->[1]"; + + local $Archive::Extract::_ALLOW_PURE_PERL = $switch->[0]; + local $Archive::Extract::_ALLOW_BIN = $switch->[1]; + + diag("Running extract with configuration: $cfg") if $Debug; + + for my $archive (keys %$tmpl) { + diag("Archive : $archive") if $Debug; + + ### check first if we can do the proper + + my $ae = Archive::Extract->new( + archive => File::Spec->catfile($SrcDir,$archive) ); + + ### Do an extra run with _ALLOW_TAR_ITER = 0 if it's a tar file of some + ### sort + my @with_tar_iter = ( 1 ); + push @with_tar_iter, 0 if grep { $ae->$_ } qw[is_tbz is_tgz is_txz is_tar]; + + for my $tar_iter (@with_tar_iter) { SKIP: { + + ### Doesn't matter unless .tar, .tbz, .tgz, .txz + local $Archive::Extract::_ALLOW_TAR_ITER = $tar_iter; + + diag("Archive::Tar->iter: $tar_iter") if $Debug; + + isa_ok( $ae, $Class ); + + my $method = $tmpl->{$archive}->{method}; + ok( $ae->$method(), "Archive type $method recognized properly" ); + + my $file = $tmpl->{$archive}->{outfile}; + my $dir = $tmpl->{$archive}->{outdir}; # can be undef + my $rel_path = File::Spec->catfile( grep { defined } $dir, $file ); + my $abs_path = File::Spec->catfile( $OutDir, $rel_path ); + my $abs_dir = File::Spec->catdir( + grep { defined } $OutDir, $dir ); + my $nix_path = File::Spec::Unix->catfile( + grep { defined } $dir, $file ); + + ### check if we can run this test ### + my $pgm_fail; my $mod_fail; + for my $pgm ( @{$tmpl->{$archive}->{programs}} ) { + ### no binary extract method + $pgm_fail++, next unless $pgm; + + ### we dont have the program + $pgm_fail++ unless $Archive::Extract::PROGRAMS->{$pgm} && + $Archive::Extract::PROGRAMS->{$pgm}; + + } + + for my $mod ( @{$tmpl->{$archive}->{modules}} ) { + ### no module extract method + $mod_fail++, next unless $mod; + + ### we dont have the module + $mod_fail++ unless check_install( module => $mod ); + } + + ### where to extract to -- try both dir and file for gz files + ### XXX test me! + #my @outs = $ae->is_gz ? ($abs_path, $OutDir) : ($OutDir); + my @outs = $ae->is_gz || $ae->is_bz2 || $ae->is_Z || $ae->is_lzma || $ae->is_xz + ? ($abs_path) + : ($OutDir); + + ### 10 tests from here on down ### + if( ($mod_fail && ($pgm_fail || !$Archive::Extract::_ALLOW_BIN)) + || + ($pgm_fail && ($mod_fail || !$Archive::Extract::_ALLOW_PURE_PERL)) + ) { + skip "No binaries or modules to extract ".$archive, + (10 * scalar @outs); + } + + ### we dont warnings spewed about missing modules, that might + ### be a problem... + local $IPC::Cmd::WARN = 0; + local $IPC::Cmd::WARN = 0; + + for my $use_buffer ( IPC::Cmd->can_capture_buffer , 0 ) { + + ### test buffers ### + my $turn_off = !$use_buffer && !$pgm_fail && + $Archive::Extract::_ALLOW_BIN; + + ### whitebox test ### + ### stupid warnings ### + local $IPC::Cmd::USE_IPC_RUN = 0 if $turn_off; + local $IPC::Cmd::USE_IPC_RUN = 0 if $turn_off; + local $IPC::Cmd::USE_IPC_OPEN3 = 0 if $turn_off; + local $IPC::Cmd::USE_IPC_OPEN3 = 0 if $turn_off; + + + ### try extracting ### + for my $to ( @outs ) { + + diag("Extracting to: $to") if $Debug; + diag("Buffers enabled: ".!$turn_off) if $Debug; + + my $rv = $ae->extract( to => $to ); + + SKIP: { + my $re = qr/^No buffer captured/; + my $err = $ae->error || ''; + + ### skip buffer tests if we dont have buffers or + ### explicitly turned them off + skip "No buffers available", 8 + if ( $turn_off || !IPC::Cmd->can_capture_buffer) + && $err =~ $re; + + ### skip tests if we dont have an extractor + skip "No extractor available", 8 + if $err =~ /Extract failed; no extractors available/; + + ### win32 + bin utils is notorious, and none of them are + ### officially supported by strawberry. So if we + ### encounter an error while extracting while running + ### with $PREFER_BIN on win32, just skip the tests. + ### See rt#46948: unable to install install on win32 + ### for details on the pain + skip "Binary tools on Win32 are very unreliable", 8 + if $err and $Archive::Extract::_ALLOW_BIN + and IS_WIN32; + + ok( $rv, "extract() for '$archive' reports success ($cfg)"); + + diag("Extractor was: " . $ae->_extractor) if $Debug; + + ### if we /should/ have buffers, there should be + ### no errors complaining we dont have them... + unlike( $err, $re, + "No errors capturing buffers" ); + + ### might be 1 or 2, depending whether we extracted + ### a dir too + my $files = $ae->files || []; + my $file_cnt = grep { defined } $file, $dir; + is( scalar @$files, $file_cnt, + "Found correct number of output files (@$files)" ); + + ### due to prototypes on is(), if there's no -1 index on + ### the array ref, it'll give a fatal exception: + ### "Modification of non-creatable array value attempted, + ### subscript -1 at -e line 1." So wrap it in do { } + is( do { $files->[-1] }, $nix_path, + "Found correct output file '$nix_path'" ); + + ok( -e $abs_path, + "Output file '$abs_path' exists" ); + ok( $ae->extract_path, + "Extract dir found" ); + ok( -d $ae->extract_path, + "Extract dir exists" ); + is( $ae->extract_path, $abs_dir, + "Extract dir is expected '$abs_dir'" ); + } + + SKIP: { + skip "Unlink tests are unreliable on Win32", 3 if IS_WIN32; + + 1 while unlink $abs_path; + ok( !(-e $abs_path), "Output file successfully removed" ); + + SKIP: { + skip "No extract path captured, can't remove paths", 2 + unless $ae->extract_path; + + ### if something went wrong with determining the out + ### path, don't go deleting stuff.. might be Really Bad + my $out_re = quotemeta( $OutDir ); + + ### VMS directory layout is different. Craig Berry + ### explains: + ### the test is trying to determine if C + ### is part of C. Except in VMS + ### syntax, that would mean trying to determine whether + ### C is part of C + ### Because we have both a directory delimiter + ### (dot) and a directory spec terminator (right + ### bracket), we have to trim the right bracket from + ### the first one to make it successfully match the + ### second one. Since we're asserting the same truth -- + ### that one path spec is the leading part of the other + ### -- it seems to me ok to have this in the test only. + ### + ### so we strip the ']' of the back of the regex + $out_re =~ s/\\\]// if IS_VMS; + + if( $ae->extract_path !~ /^$out_re/ ) { + ok( 0, "Extractpath WRONG (".$ae->extract_path.")"); + skip( "Unsafe operation -- skip cleanup!!!" ), 1; + } + + eval { rmtree( $ae->extract_path ) }; + ok( !$@, " rmtree gave no error" ); + ok( !(-d $ae->extract_path ), + " Extract dir successfully removed" ); + } + } + } + } + } } + } +} diff --git a/t/src/double_dir.zip b/t/src/double_dir.zip new file mode 100644 index 0000000..57e5d2f Binary files /dev/null and b/t/src/double_dir.zip differ diff --git a/t/src/x.Z b/t/src/x.Z new file mode 100644 index 0000000..c40bb0d Binary files /dev/null and b/t/src/x.Z differ diff --git a/t/src/x.bz2 b/t/src/x.bz2 new file mode 100644 index 0000000..b56f3b9 Binary files /dev/null and b/t/src/x.bz2 differ diff --git a/t/src/x.ear b/t/src/x.ear new file mode 100644 index 0000000..84d5b54 Binary files /dev/null and b/t/src/x.ear differ diff --git a/t/src/x.gz b/t/src/x.gz new file mode 100644 index 0000000..43b5a02 Binary files /dev/null and b/t/src/x.gz differ diff --git a/t/src/x.jar b/t/src/x.jar new file mode 100644 index 0000000..84d5b54 Binary files /dev/null and b/t/src/x.jar differ diff --git a/t/src/x.lzma b/t/src/x.lzma new file mode 100644 index 0000000..060724c Binary files /dev/null and b/t/src/x.lzma differ diff --git a/t/src/x.par b/t/src/x.par new file mode 100644 index 0000000..84d5b54 Binary files /dev/null and b/t/src/x.par differ diff --git a/t/src/x.tar b/t/src/x.tar new file mode 100644 index 0000000..204b24b Binary files /dev/null and b/t/src/x.tar differ diff --git a/t/src/x.tar.gz b/t/src/x.tar.gz new file mode 100644 index 0000000..00f012d Binary files /dev/null and b/t/src/x.tar.gz differ diff --git a/t/src/x.tar.xz b/t/src/x.tar.xz new file mode 100644 index 0000000..531eee8 Binary files /dev/null and b/t/src/x.tar.xz differ diff --git a/t/src/x.tgz b/t/src/x.tgz new file mode 100644 index 0000000..00f012d Binary files /dev/null and b/t/src/x.tgz differ diff --git a/t/src/x.txz b/t/src/x.txz new file mode 100644 index 0000000..531eee8 Binary files /dev/null and b/t/src/x.txz differ diff --git a/t/src/x.war b/t/src/x.war new file mode 100644 index 0000000..84d5b54 Binary files /dev/null and b/t/src/x.war differ diff --git a/t/src/x.xz b/t/src/x.xz new file mode 100644 index 0000000..ea28d9e Binary files /dev/null and b/t/src/x.xz differ diff --git a/t/src/x.zip b/t/src/x.zip new file mode 100644 index 0000000..84d5b54 Binary files /dev/null and b/t/src/x.zip differ diff --git a/t/src/y.ear b/t/src/y.ear new file mode 100644 index 0000000..a645d7c Binary files /dev/null and b/t/src/y.ear differ diff --git a/t/src/y.jar b/t/src/y.jar new file mode 100644 index 0000000..a645d7c Binary files /dev/null and b/t/src/y.jar differ diff --git a/t/src/y.par b/t/src/y.par new file mode 100644 index 0000000..a645d7c Binary files /dev/null and b/t/src/y.par differ diff --git a/t/src/y.tar b/t/src/y.tar new file mode 100644 index 0000000..21173cb Binary files /dev/null and b/t/src/y.tar differ diff --git a/t/src/y.tar.bz2 b/t/src/y.tar.bz2 new file mode 100644 index 0000000..3d1a715 Binary files /dev/null and b/t/src/y.tar.bz2 differ diff --git a/t/src/y.tar.gz b/t/src/y.tar.gz new file mode 100644 index 0000000..836e9b0 Binary files /dev/null and b/t/src/y.tar.gz differ diff --git a/t/src/y.tar.xz b/t/src/y.tar.xz new file mode 100644 index 0000000..dfca273 Binary files /dev/null and b/t/src/y.tar.xz differ diff --git a/t/src/y.tbz b/t/src/y.tbz new file mode 100644 index 0000000..3d1a715 Binary files /dev/null and b/t/src/y.tbz differ diff --git a/t/src/y.tgz b/t/src/y.tgz new file mode 100644 index 0000000..836e9b0 Binary files /dev/null and b/t/src/y.tgz differ diff --git a/t/src/y.txz b/t/src/y.txz new file mode 100644 index 0000000..dfca273 Binary files /dev/null and b/t/src/y.txz differ diff --git a/t/src/y.war b/t/src/y.war new file mode 100644 index 0000000..a645d7c Binary files /dev/null and b/t/src/y.war differ diff --git a/t/src/y.zip b/t/src/y.zip new file mode 100644 index 0000000..a645d7c Binary files /dev/null and b/t/src/y.zip differ