<?xml version="1.0"?>
<!DOCTYPE xml PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" 
"http://www.w3.org/TR/MathML2/dtd/xhtml-math11-f.dtd" [ 
<!ENTITY mathml "http://www.w3.org/1998/Math/MathML">]>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Wiki: MiscPerl</title><link type="text/css" rel="stylesheet" href="http://hardcarve.com/muse/wiki.css" /><meta name="robots" content="INDEX,NOFOLLOW" /><link rel="alternate" type="application/rss+xml" title="Wiki" href="http://www.hardcarve.com/muse/muse.pl?action=rss" /><link rel="alternate" type="application/rss+xml" title="Wiki: MiscPerl" href="http://www.hardcarve.com/muse/muse.pl?action=rss;rcidonly=MiscPerl" /></head><body class="http://www.hardcarve.com/muse/muse.pl"><div class="header"><span class="gotobar bar"><a class="local" href="http://www.hardcarve.com/muse/muse.pl/HomePage">HomePage</a> <a class="local" href="http://www.hardcarve.com/muse/muse.pl/RecentChanges">RecentChanges</a> </span><h1><a title="Click to search for references to this page" href="http://www.hardcarve.com/muse/muse.pl?search=MiscPerl">MiscPerl</a></h1></div><div class="content browse"><h1>Acrobat 6.0 error</h1><p>from <a class="url outside" href="http://www.deerbrook.com/downloads/sysreqs/Bridging/new_page_1.htm">http://www.deerbrook.com/downloads/sysreqs/Bridging/new_page_1.htm</a>: </p><h3>&lt;%%%&gt;</h3><p>Issue</p><p>When you try to open Adobe Reader or Adobe Acrobat, it freezes when you try to open it, or it opens and then immediately closes.</p><p>Details You may also receive the error "Can't find Acrobat Plug in. The file name, directory or volume label syntax is incorrect."</p><p>The process "acrobat.exe" uses 95% or more of the capacity of the system processor.</p><p>Solutions</p><p>Do one or more of the following:</p><p>Solution 1: Delete temporary files, and update to Acrobat 6.0.1 or Adobe Reader 6.0.1. </p><ul><li>Enable Windows to show hidden files and folders. 
For instructions, see the documentation included with Windows.</li><li>Delete all temporary files from the following folders:<ul><li>Windows\Temp</li><li>Documents and Settings\[user profile]\Local Settings\Temp</li></ul></li><li>Update to Acrobat 6.0.1 or Adobe Reader 6.0.1:<ul><li>To update to Acrobat 6.0.1 Standard or Professional, install the update from the Adobe Web site at www.adobe.com/support/downloads/main.html.</li><li>To update to Adobe Reader 6.0.1, remove Adobe Reader 6.0 from the computer, and then install the 6.0.1 version from the Adobe Web site at www.adobe.com/products/acrobat/readstep2.html. </li></ul></li></ul><h3>&lt;%%%&gt;</h3><p>To remove the 64k files (a plain select-all and delete would crash Windows Explorer), I wrote a short Perl script. </p><pre class="real">#!/usr/bin/perl -w
use Shell qw(rm); 

# The temp files are named Acr + two hex digits + anything, e.g. Acr2F*.
my @list = split(//, "0123456789ABCDEF"); # each hex digit exactly once
foreach my $hi (@list){
	foreach my $lo (@list){
		my $command = "Acr".$hi.$lo."*"; 
		rm($command); 
		print "rm ".$command."\n"; 
	}
}
</pre><p>It runs two nested loops over the hex digits and asks the shell to remove all matching files (for example, "rm Acr2F*").</p><h1>Renaming files in a directory</h1><p>Say you want to rename files in a directory using regexes. I'm sure there is a way to do this within the shell, but this script will do it too (not the most efficient way, but it works): </p><pre class="real">#!/usr/bin/perl -w
use Shell qw(cp rm ls); 
my $files = ls();
my @fila = split(/\n/, $files); 
print("length fila: " . @fila . "\n"); 
foreach my $f (@fila){
	if($f =~ m/Picture\s\d+\.pdf/){
	    my $old = q(") . $f . q("); # the file name has a space, so quote it for the shell.
	    my $new = $old; 
	    $new =~ s/Picture\s/Aug282005_/; 
	    cp($old . " " . $new); # copy to the new name first ...
	    print("cp " . $old . " " . $new."\n");
	    rm($old); # ... then remove the original. 
	    print("rm ".$old."\n"); 
	}
}
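</pre><p>A minimal sketch of the same rename without the Shell module, assuming the files sit in the current directory. <em>new_name</em> is a hypothetical helper, not part of the original script, and unlike the script above it only accepts names that match the pattern from start to end. Perl's built-in <em>rename</em> takes the two names as separate arguments, so names with spaces need no quoting: </p><pre class="real">#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the new name for one file, or nothing
# if the name does not look like "Picture NN.pdf".
sub new_name {
    my ($name) = @_;
    return unless $name =~ /^Picture\s(\d+)\.pdf$/;
    return "Aug282005_$1.pdf";
}

opendir(my $dh, '.') or die "cannot open directory: $!";
foreach my $old (sort readdir $dh) {
    my $new = new_name($old);
    next unless defined $new;          # not one of the Picture files
    print "rename $old to $new\n";
    rename($old, $new) or warn "could not rename $old: $!";
}
closedir $dh;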
</pre><h1>Removing items from an array</h1><p>Say you have a file, and you want to delete some lines from it. Use <em>Tie::File</em> to handle the file I/O, and remove the offending elements from the resulting array using the Perl function <em>splice</em>: </p><pre class="real">use Tie::File; 
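my $o = tie @lines, 'Tie::File', 'test.txt' or die 'help';
$o-&gt;flock(2); # 2 is LOCK_EX, an exclusive lock
# show proper Tie and flock usage...

@lines = qw( hello 45 there 25 all 3535 you 457 crazye 705 funksters ); 
my $i = 0;
while( $i &lt; scalar(@lines) ){
	if( $lines[$i] =~ /\d/){
		splice @lines, $i, 1; # the next element slides into slot $i, so do not advance
		print scalar(@lines) . "\n"; 
	}else{
		$i++; 
	}
}
undef $o;     # from the Tie::File documentation: the best way to unlock a file
untie @lines; # is to discard the object and untie the array. 
</pre><p>This works because the loop re-checks the length of the array on every pass and only advances <em>$i</em> when nothing was removed. A plain <em>for</em> loop that always increments would skip the element that slides into the freed slot, which goes unnoticed with this data only because no two numeric entries are adjacent. The script prints the shrinking array length after each removal, and test.txt ends up containing: </p><pre class="real">hello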
there
all
you 
crazye
funksters
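</pre><p>A sketch of a variant that stays correct even when two numeric entries sit next to each other, assuming a plain (untied) array for brevity. <em>strip_numeric</em> is a hypothetical helper name. Walking the indices from the end means a removal only shifts elements that have already been checked; with the sample list below it prints "hello there all": </p><pre class="real">#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove every entry that contains a digit, in place.
sub strip_numeric {
    my ($aref) = @_;
    # Go from the last index down to 0 so that a splice never moves
    # an element that has not been examined yet.
    foreach my $i (reverse 0 .. $#{$aref}) {
        splice @{$aref}, $i, 1 if $aref->[$i] =~ /\d/;
    }
    return $aref;
}

my @lines = qw( hello 45 66 there 25 all );
strip_numeric(\@lines);
print "@lines\n";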
</pre><p>Whereas the following does not work: </p><pre class="real">use Tie::File; 
my $o = tie @lines, 'Tie::File', 'test.txt', or die 'help';
$o-&gt;flock(2);

@lines = qw( hello 45 there 25 all 3535 you 457 crazye 705 funksters ); 

foreach $line (@lines){
	if( $line =~ /\d/ ){
		splice @lines, $i, 1; 
		print scalar(@lines) . "\n"; 
	}
}
$o = ''; 
untie @lines; 
</pre><p>It leaves test.txt containing this: </p><pre class="real">3535
you
457
crazye
705
funksters
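</pre><p>The reason is that <em>$i</em> is never set in this version (the loop variable is <em>$line</em>), so it evaluates to 0 and every <em>splice</em> removes the first element of the array instead of the matching one. The loop sees five numeric entries, so the first five elements disappear. On top of that, perlsyn warns that <em>foreach</em> gets confused when elements are added to or removed from the array it is looping over. It would also be possible to replace the unwanted elements with empty strings, and later remove them with join/split operations: </p><pre class="real"> # other code same as above.. 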
my $i;
for( $i = 0; $i &lt; scalar(@lines); $i++){
	if( $lines[$i] =~ /\d/){
		$lines[$i] = '';
	}
}
# remove all the empty entries. 
my $temp = join "\n", @lines;
$temp =~ s/\n{2,}/\n/g; # collapse runs of returns left by adjacent empty entries
$temp =~ s/^\n//;       # a leading empty entry would otherwise survive the split
@lines = split /\n/, $temp; 
# rest of code the same too ...
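</pre><p>Another option, sketched here under the assumption that rebuilding the whole array is acceptable, is to filter with <em>grep</em> and assign the result back. <em>without_digits</em> is a hypothetical helper name, and a plain array stands in for the tied one. The scripts above already assign a list to the tied array, so the same assignment should work there too: </p><pre class="real">#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return only the entries that contain no digit.
sub without_digits {
    my @in = @_;
    return grep { !/\d/ } @in;
}

my @lines = qw( hello 45 there 25 all 3535 you 457 crazye 705 funksters );
@lines = without_digits(@lines);   # no index bookkeeping, nothing to skip
print join("\n", @lines), "\n";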
</pre><div><h2>Importing an IMAP mailbox to MySQL</h2></div><p>IMAP is great for web access to email, but often the server that you request data from is overloaded (like hardcarve.com). So, of course, it would be nice to keep all the data on a local machine for rapid searching. An easy way to do this is with any of the standard email clients: Outlook, Eudora, etc. But what if you don't like other people's products and would like to roll your own while maintaining the simple web interface? It's not so hard with a bit of Perl and MySQL. Plus, once you are done you can aggregate a whole bunch of mailboxes, perform fulltext search on them very quickly, and possibly generate a nice front end out of PHP or Perl (for now my front end is phpMyAdmin).</p><p>The program flow is pretty simple, though it took a solid 7 hours to get working (sorta). Perl can be dense and slow at times. I hope I'll be able to read and understand the code later. Outline: </p><ol><li>Set up MySQL, apache2, and phpMyAdmin on your machine.</li><li>Install Net::IMAP::Simple, Net::POP3, Email::Simple, and DBI through cpan, if you do not already have them (n.b. you need to be su to run cpan).</li><li>Make a table in MySQL to hold the data. Mine looks like this now: </li></ol><pre class="real">+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | int(20)      | NO   | PRI | NULL    | auto_increment |
| Date         | datetime     | NO   |     |         |                |
| DeliveryDate | datetime     | NO   |     |         |                |
| EnvelopeTo   | varchar(255) | NO   |     |         |                |
| From         | varchar(255) | NO   |     |         |                |
| ReplyTo      | varchar(255) | NO   |     |         |                |
| Subject      | varchar(255) | NO   | MUL |         |                |
| To           | text         | NO   |     |         |                |
| CC           | text         | NO   |     |         |                |
| Body         | text         | NO   | MUL |         |                |
| Attachements | varchar(255) | NO   |     |         |                |
+--------------+--------------+------+-----+---------+----------------+
</pre><p>11 rows in set (0.00 sec) </p><ul><li>Don't use the utf8_bin collation: the fulltext index will be case-sensitive. I used utf8_unicode_ci.</li><li>You can edit /etc/mysql/my.cnf to allow fulltext searches of three-letter words by adding a line: <em>ft_min_word_len         = 3 </em> somewhere below the [mysqld] line.</li></ul><ol><li>(within the script from here on): connect to the IMAP (or POP3) server.</li><li>Loop over all entries in a mailbox to retrieve the header and body.</li><li>Get rid of the nonprinting characters in the header (this REALLY annoyed me!).</li><li>See if the message is multipart; if it is, try to extract the separator.<ol><li>You need to escape regex metacharacters such as parentheses, which some mail clients use in the separator.</li></ol></li><li>If the message is multipart, split on the separator and keep only the plaintext version (later I'll keep the attachments too).</li><li>Escape the SQL quote character, '.</li><li>Convert the IMAP dates to SQL dates.</li><li>Generate a query to add it to the database.</li></ol><p>This file is available on subversion, too: <a class="url" href="https://hardm.ath.cx/public/m8ta/m8ta_mail.pl">https://hardm.ath.cx/public/m8ta/m8ta_mail.pl</a> </p><pre class="real">use Net::IMAP::Simple;
use Net::POP3;
use Email::Simple;
use DBI; 
use strict; 
require "imapConvertDate.pl";

my $server;
my $nmessages;  

if(0){
  # http://search.cpan.org/src/CFABER/Net-IMAP-Simple-1.14/README

  # assign the outer $server; a second "my" here would hide it from the loop below
  $server = Net::IMAP::Simple-&gt;new( 'hardcarve.com' );
  $server-&gt;login( 'tim@hardcarve.com', '***' );

  #my @boxes = $server-&gt;mailboxes; 
  #foreach my $l (@boxes){ print "$l \n"; }

  $nmessages = $server-&gt;select( 'familie' );
}else{
  # http://perl.active-venture.com/lib/Net/POP3.html

  $server= Net::POP3-&gt;new('neuro.duke.edu');
  if( defined($server)){
    print "created a POP3 connection successfully!\n"; 
    $nmessages = $server-&gt;login('hanson@neuro.duke.edu', '***'); 
  }
}
print "you have $nmessages messages!\n"; 


# get ready for the large loop: connect to the database. 
my $database = 'DBI:mysql:mail:localhost'; 
my $username = 'root'; 
my $password = '***';

my $query; 
my $dbi = DBI-&gt;connect($database, $username, $password) or die $DBI::errstr;
my $sth; 

for(my $mess=1; $mess &lt;= $nmessages; $mess++){
my $header = $server-&gt;top($mess); 
my $ln = 0;
my %hdr; 
my($key, $value); 
foreach my $lk (@{$header}){
  # print "top-$ln--$lk"; 
  $ln++; 

  # need to remove nonprinting characters (octal 001 through 027). ANNOYING
  $lk =~ tr/\001-\027//d; 

  # need to remove excessive spaces. 
  $lk =~ s/\s{2,}/ /g; 

  if( $lk =~ /^([\w-]+)\:(.*)/){
    $key = $1; 
    $value = $2;
    if(exists $hdr{$key}){
         $hdr{$key} .= $value; 
       }else{
	 $hdr{$key} = $value; 
       }
  }else{
    $hdr{$key} .= $lk; 
  }
}
print "\nand now the contents of the hash!! \n\n"; 
for my $kl (sort keys %hdr){
  print "$kl -- $hdr{$kl}\n"; 
}

my $message = $server-&gt;get($mess);
# it seems that the body is simply prefaced by a blank line. 
# also, if there are multiple parts in the mail, 
# then there will be a 'content-type' string in the header. 
# with a keyword boundary="====="
$ln = 0; 
my $found = 0; 
my $body; 
foreach my $l (@{$message}){
  if($found){ $body .= $l;}
  $ln++; 
  if($l !~ /[^\s\n]/ &amp;&amp; $found == 0){ 
    print "found start of body!\n"; 
    $found = 1; 
  }
}

my $mimesep; 
my $mimekey; 
for my $kl (sort keys %hdr){
  if( $kl =~ /content-type/i){
    $mimekey = $kl; 
  }
}
if(defined $mimekey){
  if($hdr{$mimekey} =~ /boundary=(.+?)$/i){ # ! must be case insensitive !
    # some mailers put quotes around it, others do not. 
    # remove quotes if they exist. 
    $mimesep = $1; 
    $mimesep =~ s/^\s*"//;
    $mimesep =~ s/"\s*$//; 
    print " MIME separator = $mimesep \n";
    # the separator may contain parentheses or other regex metacharacters, 
    # so \Q...\E makes split treat it as a literal string. 
    my @bdy = split(/\Q$mimesep\E/, $body); 

    # by default, include the first part of a multipart -- 
    # sometimes there is no plain text part.
    $body = $bdy[0]; 

    foreach my $multipart (@bdy){
      print "\n multipart! \n"; 
      if($multipart =~ /content-type:\s*text\/plain/i){
        print " found the plain text! \n"; 
        print $multipart; 
        $body = $multipart; 
      }
    }
  }
}

#need to quote the SQL escape character, '
foreach $key (keys %hdr){
  $hdr{$key} =~ s/'/&amp;#39/gm; 
}
$body =~ s/'/&amp;#39/gm; 

#otay, need to break the hash into variables. 
my ($contentType, $date, $deliveryDate, $envelopeTo, $from, $replyTo, $subject, $to, $cc); 
foreach $key (keys %hdr){
  # anchor both ends so that e.g. In-Reply-To is not mistaken for Reply-To
  if($key =~ /^content-type$/i){$contentType = $hdr{$key}; }
  elsif($key =~ /^delivery-date$/i){$deliveryDate =  $hdr{$key}; }
  elsif($key =~ /^date$/i){$date =  $hdr{$key}; }
  elsif($key =~ /^envelope-to$/i){$envelopeTo =  $hdr{$key}; }
  elsif($key =~ /^from$/i){$from =  $hdr{$key}; }
  elsif($key =~ /^reply-to$/i){$replyTo =  $hdr{$key}; }
  elsif($key =~ /^subject$/i){$subject =  $hdr{$key}; }
  elsif($key =~ /^to$/i){$to =  $hdr{$key}; }
  elsif($key =~ /^cc$/i){$cc =  $hdr{$key}; }
}

# need to convert the dates. 
if(defined($date)){ $date = imapConvertDate($date);}
if(defined($deliveryDate)){ $deliveryDate = imapConvertDate($deliveryDate);}

$query = &lt;&lt;"EOT";
INSERT INTO `base` 
(`Date`,`DeliveryDate`,`EnvelopeTo`,`From`,`ReplyTo`,`Subject`,`To`,`CC`,`Body`,`Attachements` )
VALUES 
('$date', '$deliveryDate', '$envelopeTo', '$from', '$replyTo', '$subject', '$to', '$cc', '$body', ''); 
EOT
print "SQL query: $query \n"; 

#$sth = $dbi-&gt;prepare($query) or die "prepare error: " . $dbi-&gt;errstr; # $sth is undefined if prepare fails
#$sth-&gt;execute() or die "execute error:" . $sth-&gt;errstr; 


} #end loop over the mailbox. 

$server-&gt;quit();
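</pre><p>A sketch of the final insert using DBI placeholders instead of pasting the values into the query text, which should make the manual replacement of the quote character unnecessary. <em>build_insert</em> and <em>store_message</em> are hypothetical helpers, not part of the script above; the table and column names are the ones the script uses, and <em>store_message</em> assumes a connected handle like <em>$dbi</em>. Only the statement text is printed here, so the sketch runs without a database: </p><pre class="real">#!/usr/bin/perl
use strict;
use warnings;

# Column order matches the INSERT in the script above.
my @COLUMNS = qw(Date DeliveryDate EnvelopeTo From ReplyTo Subject To CC Body Attachements);

# Hypothetical helper: build the statement text and the list of values to bind.
sub build_insert {
    my ($row) = @_;    # hash reference keyed by column name
    my $cols  = join ',', map { sprintf '`%s`', $_ } @COLUMNS;
    my $marks = join ',', ('?') x @COLUMNS;
    # Missing fields become empty strings, since every column is NOT NULL.
    my @bind  = map { defined $row->{$_} ? $row->{$_} : '' } @COLUMNS;
    return ("INSERT INTO `base` ($cols) VALUES ($marks)", @bind);
}

# Hypothetical helper: $dbh is assumed to be a connected DBI handle.
sub store_message {
    my ($dbh, $row) = @_;
    my ($sql, @bind) = build_insert($row);
    my $sth = $dbh->prepare($sql) or die "prepare error: " . $dbh->errstr;
    $sth->execute(@bind) or die "execute error: " . $sth->errstr;
    return;
}

# The apostrophe in the subject needs no escaping: it travels as a bound value.
my ($sql, @bind) = build_insert({ Subject => "it's here", Body => 'hello' });
print "$sql\n";
print scalar(@bind), " values to bind\n";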
</pre></div><div class="footer"><hr /><span class="time">Last edited 2006-07-29 04:27 UTC by 152.16.229.56</span></div>
</body>
</html>
