If you need to parse HTTP headers from a pcap file, this is a straightforward way to do it. I have seen a lot of hacks for doing the same thing, but in my opinion this is the easiest approach.

Tshark command

tshark -V -O http -r yourfile.pcap -R "tcp && (http.response || http.request)"

(On newer versions of tshark, -R must be combined with -2 for two-pass analysis; alternatively, use -Y to apply the same filter in a single pass.)

Example Perl script

use strict;
use warnings;

# Run tshark and capture its verbose HTTP output
my $result = `tshark -V -O http -r yourfile.pcap -R "tcp && (http.response || http.request)"`;

# tshark separates frames with a blank line
my @chunks = split /\n\n/, $result;

foreach my $chunk (@chunks) {
        # Capture the frame number and the HTTP header block, which ends with
        # a literal "\r\n" on a line of its own in tshark's output
        if ($chunk =~ /^Frame (\d+?):.*Hypertext Transfer Protocol\n(.*)\\r\\n\s+\\r\\n/s) {
                print "## Frame: $1 ##\n$2\n###############\n\n";
        }
}
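
If you want the individual header fields rather than the raw block, the captured text can be split further inside the if block above. This is only a sketch of the idea; the %headers hash and the field pattern are illustrative, and it assumes each header line in tshark's output looks like "Host: www.example.com\r\n" with leading indentation:

my $headers_block = $2;                     # the block captured by the regex above
my %headers;
foreach my $header_line (split /\n/, $headers_block) {
        $header_line =~ s/^\s+//;           # strip tshark's indentation
        $header_line =~ s/\\r\\n\s*$//;     # drop the literal \r\n printed at the end of each line
        if ($header_line =~ /^([\w-]+):\s+(.*)$/) {
                $headers{$1} = $2;          # e.g. $headers{'Host'}, $headers{'User-Agent'}
        }
}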

This tutorial shows how to set up an Ubuntu machine to perform SSL man-in-the-middle interception. The target of the interception will be a Windows machine.
Prerequisites:

# apt-get install python-dev
# apt-get install python-pip
# apt-get install libxml2-dev
# apt-get install libxslt1-dev

1. Install mitmproxy

We will be using mitmproxy (mitmproxy.org) for the interception.

# pip install mitmproxy

This tutorial is written for an Ubuntu installation and requires the following packages:

# apt-get install wget
# apt-get install perl
# apt-get install unzip


The following steps will get you started on building a simple web scraper in Perl.

1. Create a target list

You can create your own custom list, or you can download a file from alexa.com containing the top 1 million most-visited websites:

# wget http://s3.amazonaws.com/alexa-static/top-1m.csv.zip

Unzip the file:

# unzip top-1m.csv.zip

The unzipped file top-1m.csv should have the following format:

1,google.com
2,facebook.com
3,youtube.com
4,yahoo.com
5,baidu.com
...

2. Perl code to read a file

In order to read a file, you need to open it and loop through each line:

open(my $fh, '<', 'top-1m.csv') or die "Cannot open top-1m.csv: $!";
while (my $line = <$fh>) {
  chomp $line;        # remove the trailing newline
  print "$line\n";
}
close $fh;

3. Perl code to parse each line

Inside the while loop, use a regular expression to parse each line:

if ($line =~ /^(\d+),(.*)$/) {
  my $index   = $1;   # rank in the list
  my $website = $2;   # domain name
}

4. Download webpages

Inside the if statement, call wget:

my $command = "wget -P $website/ $website";   # -P saves the download into a directory named after the site
print "$index: $command\n";
`$command`;                                   # run the download

You may not want to scrape all 1 million websites, so you can add the following inside the if block:

if ($index > 100) { exit; }  # this will limit the scrape to the first 100 websites
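
Putting steps 2 through 4 together, a minimal end-to-end version of the scraper might look like the sketch below (it assumes the top-1m.csv file from step 1 is in the current directory and that you only want the first 100 sites):

#!/usr/bin/perl
use strict;
use warnings;

open(my $fh, '<', 'top-1m.csv') or die "Cannot open top-1m.csv: $!";
while (my $line = <$fh>) {
  chomp $line;
  if ($line =~ /^(\d+),(.*)$/) {
    my $index   = $1;
    my $website = $2;
    last if $index > 100;                        # stop after the first 100 sites
    my $command = "wget -P $website/ $website";  # each site is saved under its own directory
    print "$index: $command\n";
    `$command`;
  }
}
close $fh;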

If you are seeing a "MySQL client ran out of memory" error, it is most likely because you are retrieving a very large result set. By default, the MySQL client buffers the entire result set in memory, which is what triggers the error.

To make the client fetch rows from the server one at a time instead, set the mysql_use_result attribute of the database handle to 1:

use DBI;

my $dbh = DBI->connect("DBI:mysql:<database>", "root", "<password>");
$dbh->{'mysql_use_result'} = 1;
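
For context, here is a sketch of how that setting fits into a streaming query loop. The table big_table and its columns are made up for illustration, and <database> and <password> are placeholders as above:

use DBI;

my $dbh = DBI->connect("DBI:mysql:<database>", "root", "<password>",
                       { RaiseError => 1 });
$dbh->{'mysql_use_result'} = 1;   # stream rows from the server instead of buffering them all

# big_table stands in for whatever large table you are reading
my $sth = $dbh->prepare("SELECT id, url FROM big_table");
$sth->execute();

# Rows are fetched one at a time, so memory use stays flat
while (my ($id, $url) = $sth->fetchrow_array()) {
  print "$id $url\n";
}

$sth->finish();
$dbh->disconnect();

One caveat: while a mysql_use_result result set is being read, the same connection cannot issue other statements, so fetch all the rows (or call finish) before reusing the handle.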