ListMailPRO Email Marketing Software Forums
ListMailPRO Email Marketing Software Forums => Server Optimization, Tweaks => Topic started by: DW on October 04, 2006, 01:55:28 pm
-
I've been working on a command-line PHP script to parse the maillogs and otherwise check for blocks against the server.
This script could be run on a regular cron schedule, for example every 12 hours (at 02:00 and 14:00):
0 2,14 * * * /usr/bin/php -q /path/to/spamblockfinder.php > /dev/null 2>&1
There's still a lot to be done (such as BL, spamcop, etc. lookups and ignoring blocks based on time) but here it is so far:
<?php
// find spam blocks - parse maillog to find urls, check hotmail connection
// config
$report_to = 'dean@listmailpro.com';
$maillog = '/usr/local/psa/var/log/maillog';
// end config
// parse the maillog
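$fp = @fopen($maillog,'r');
if(!$fp) die("Cannot open maillog: $maillog\n"); // wrong path or insufficient permissions
$i = 1;
$urls=array();
$out = '';
$aol = 0; // set to 1 when an AOL block URL is seen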
while(!feof($fp)){
$line = fgets($fp,1024);
if($line===false) break; // end of file or read error
$line = rtrim($line,"\r\n"); // strip the line ending only, never a real character
if(!strstr($line,'http://') // make sure logline contains a url
|| strstr($line,'AOL_SPAM_COMPLAINT') // my custom aol script
|| stristr($line,'greylist') // greylisting, temporary error
|| stristr($line,'postgrey')
|| strstr($line,'User_unknown') // normal error, let bounce
|| strstr($line,'Your_email_address_is_not_listed_in_the_Address_Book_of') // yahoo 'greenlist' error, let bounce
){ continue; } // skip
// start line at http
$full = substr($line,strpos($line,'http://'));
// strip trailing /
if(substr($full,strlen($full)-1)=='/') $full = substr($full,0,strlen($full)-1);
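$slash = strpos($full,'/',7); // first slash after "http://"
$base = ($slash===false) ? $full : substr($full,0,$slash); // no path: the whole string is the base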
// skip if base url already processed
if(!in_array($base,$urls)){
$urls[]=$base;
// log line
$out .= "LogLine[$i]: $line\n";
// full url
$out .= "FullURL[$i]: $full\n";
// smart url
$patterns = array('/_/','/\//','/>/');
$surl = preg_replace($patterns,' ',substr($full,strpos($full,'/',7)));
$surl = explode(' ',$surl);
$surl = $base.'/'.$surl[1];
$out .= "SmartURL[$i]: $surl\n";
// base url
if(strstr($base,'aol.com')) $aol=1;
$out .= "BaseURL[$i]: $base\n";
$out .= "\n";
}
$i++;
}
// check hotmail
$hot = false;
$mxhosts = array();
getmxrr('hotmail.com', $mxhosts);
foreach($mxhosts as $mx){
if(!$hot){ $sock = @fsockopen($mx,25,$errno,$errstr,10); if($sock){ $hot = true; fclose($sock); } }
}
// get server info
$serv_host = str_replace("\n",'',shell_exec('hostname'));
$serv_ip = gethostbyname($serv_host);
// output
$head = "Spamblock Report for $serv_host ($serv_ip)\n\n";
if(!$hot) $head .= "* HotMail is refusing our connections! (http://postmaster.hotmail.com)\n\n";
if($aol) $head .= "* One or more AOL blocks are in place. It is recommended you phone the AOL postmaster hotline with your IP (seen above) for more information and, most likely, a quick removal of the block. It is also recommended to apply for whitelisting and a feedback loop.\n - http://postmaster.aol.com/contact (List of contact phone numbers)\n - http://postmaster.aol.com (Further information)\n\n";
$out = $head.$out;
echo $out;
if(count($urls)==0&&$hot) exit();
mail($report_to,"$serv_host ($serv_ip)",$out,"From: \"Spam Block Finder\" <blocks@$serv_host>");
?>
-
This seems like it could be interesting...
I ran this and the email came back blank except for saying report for hostname/ip, etc...
I take it this means it didn't find anything?
-
Yes, this would mean it didn't find anything. One reason could be that your maillog is automatically rotated. Currently the script checks only the maillog file itself and not any rotated files. What you might do is unzip the first processed maillog (if it's gzipped) and concatenate it with the current one.
cd /tmp
cp /var/log/maillog .
cp /var/log/maillog.processed.gz .
gunzip *.gz
cat maillog.processed >> ./maillog
Then re-run the script with the new path to the maillog, /tmp/maillog.
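The same steps can be wrapped in a small shell function. This is only a rough sketch: the function name combine_maillogs is made up for illustration, and it assumes rotated files end in .gz when they are gzipped. It uses gzip -cd so the original files are left untouched, and it writes the files in the order given, so listing the oldest file first keeps the combined log in chronological order.

```shell
#!/bin/sh
# combine_maillogs OUTPUT FILE...
# Concatenate the given logs (plain or gzipped) into OUTPUT, in the order given.
# The function name is hypothetical; adjust paths to your own server.
combine_maillogs() {
    out=$1
    shift
    : > "$out"                                  # create or truncate the output file
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -cd "$f" >> "$out" ;;    # decompress to stdout, original stays as-is
            *)    cat "$f" >> "$out" ;;
        esac
    done
}
```

For example, combine_maillogs /tmp/maillog /var/log/maillog.processed.gz /var/log/maillog would build /tmp/maillog from the rotated file followed by the current one.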
The script seems to need quite a bit of work still. The main problem is URLs included in the logs are very difficult to parse due to the addition of certain messages which appear as valid extensions to the URL. If you ever have a lot of blocks you will see what I mean.
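One part that can be pulled out fairly reliably is the base URL, since host names only contain letters, digits, dots and hyphens, so any message text glued on after the path cannot leak into it. A minimal sketch of that idea from the shell, assuming a grep that supports -o (GNU and BSD grep do); the function name base_urls is hypothetical and this is not what the PHP script does internally:

```shell
#!/bin/sh
# base_urls: read log lines on stdin and print each distinct
# "http://host" prefix once. Only host-name characters are matched,
# so trailing text such as "_for_details" is never included.
base_urls() {
    grep -o 'http://[A-Za-z0-9.-]*' | LC_ALL=C sort -u
}
```

Running base_urls < /var/log/maillog gives a quick list of which postmaster sites are mentioned in the log, without the noisy path part.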
Regards
-
Script updated:
- Will not notify if there are no blocks
- Correctly detects hostname when run from cron
- Added a few more ignore strings
-
I've copied the above code into a new script file and when I try to run it, I get the following errors:
Warning: feof(): supplied argument is not a valid stream resource in /home/powerkey/public_html/LM/spamblock.php on line 16
Warning: fgets(): supplied argument is not a valid stream resource in /home/powerkey/public_html/LM/spamblock.php on line 17
I'm guessing that it relates to the URL of the maillog, and would like to know how I can find out what the correct URL is for my server?
-
Try the command
locate maillog
If that fails you can either update the "locate database" and try again
updatedb
or try this alternative
find / -name maillog 2>/dev/null
Note that the script cannot currently handle multiple log files, so if your maillog was recently rotated there won't be much information in it...
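Once you have one or more candidate paths, it is worth confirming that the user running the script can actually read the file, because fopen() fails the same way for a wrong path and for a permissions problem. A small sketch, with a made-up function name first_readable:

```shell
#!/bin/sh
# first_readable PATH...
# Print the first argument that is a regular file readable by the
# current user; return non-zero if none of them qualifies.
first_readable() {
    for p in "$@"; do
        if [ -f "$p" ] && [ -r "$p" ]; then
            printf '%s\n' "$p"
            return 0
        fi
    done
    return 1
}
```

For example, first_readable /var/log/maillog /usr/local/psa/var/log/maillog prints whichever of the two exists and is readable, and that path is what goes into $maillog in the script's config.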
Regards
-
Hi,
I found my maillogs, they're in the /var/log folder - but they're HUGE, they are rotated every week and I have log files over 500 MB in size every week (about 1/2 GB)!
What you might do is unzip the first processed maillog (if it's gzipped) and concatenate it with the current one.
Will this work with these HUGE mail log files?
-
For best results you should probably parse the last week or so of data. Anything more than that and you will probably run into expired blocks, etc.
You can grab the first 500,000 lines of a file like this:
head -n 500000 /var/log/maillog > outputlog
or the last 500,000 lines, which are the most recent entries:
tail -n 500000 /var/log/maillog > outputlog
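If you would rather cut by date than by line count, something along these lines may work. It is only a sketch and makes several assumptions: the log uses the usual syslog timestamp prefix (for example "Oct  4"), GNU date is available (the -d option is GNU-specific), and the function name last_days is made up. Syslog timestamps carry no year, so a log that spans more than a year could match the wrong entries.

```shell
#!/bin/sh
# last_days FILE [DAYS]
# Print the lines of a syslog-style FILE whose "Mon DD" prefix falls
# within the last DAYS days, today included (default 7).
last_days() {
    file=$1
    days=${2:-7}
    stamps=$(
        i=0
        while [ "$i" -lt "$days" ]; do
            LC_ALL=C date -d "-$i day" '+%b %e'   # e.g. "Oct  4" (GNU date only)
            i=$((i + 1))
        done
    )
    # First input (stdin) loads the wanted stamps; the second is the log,
    # filtered on its first six characters.
    printf '%s\n' "$stamps" |
        awk 'NR==FNR { keep[$0] = 1; next } (substr($0, 1, 6) in keep)' - "$file"
}
```

For example, last_days /var/log/maillog 7 > /tmp/maillog, then point the script at /tmp/maillog.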
Regards