Packet Loss?

Discussion in 'The Lounge' started by goldfish, Nov 1, 2006.

  1. goldfish

    goldfish Lt. Sushi.DC

    I've been offered a job here at my uni as a "Student Internet Advocate"... I'm not entirely sure what that's supposed to mean, but I get free internet access for it. (Well, it only cost me £2 in the first place, but y'know.)

    I've been working on some Perl scripts to help me plot network performance. Although I'm still testing them, the results aren't too promising.
    http://img.photobucket.com/albums/v186/goldfish654/ping_plot.png
    This is a measure of ping times from here to one of google.com's local servers. That's a nice little spike there, isn't it? :p From some older graphs you'd get them once every 20 minutes or so. Ah, but the best is to come!
    http://img.photobucket.com/albums/v186/goldfish654/pckloss_plot.png
    WOW. (Ignoring the point recorded in 1970 causing the lines going across)
    That graph shows that most of the time, more than HALF OF MY PACKETS NEVER GET THERE.

    Now, with one person, packet loss is annoying. VoIP sucks. But when you've got 300 or so people using the connection, you end up sending 2 or 3 times more traffic through the network than you need to - which means you send even MORE packets, bringing the whole thing to a horrible grinding halt in a splattered pile of CAT5.
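    To put a rough number on that 2-or-3-times claim: if every packet gets lost independently with probability p and gets resent until it arrives, the average number of sends per delivered packet is 1/(1-p). Quick sanity-check script (made-up loss rates, nothing measured):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Average number of transmissions needed to deliver one packet when
# each attempt is lost independently with probability $p.
# (Mean of a geometric distribution: 1 / (1 - p).)
sub expected_sends {
    my ($p) = @_;
    die "loss rate must be in [0,1)\n" if $p < 0 || $p >= 1;
    return 1 / (1 - $p);
}

printf "50%% loss => %.1f sends per delivered packet\n", expected_sends(0.50);
printf "67%% loss => %.1f sends per delivered packet\n", expected_sends(0.67);
```

    So at the ~50% loss the graph is showing, you're already doubling the traffic, and it only takes two-thirds loss to triple it.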

    This has happened 5 times in the last 3 days.

    Something isn't up to the job here... and it's my mission to work out what it is.
     
  2. theefool

    theefool Geekified

    Very interesting information. Looking forward to your results! :)
     
  3. goldfish

    goldfish Lt. Sushi.DC

    These graphs are getting quite interesting now...

    For logged in users...

    http://img.photobucket.com/albums/v186/goldfish654/users_plot-1.png

    Check out the sharp drop. According to the ISP's website, that was due to:
    01-11-2006 13:12:21 02-11-2006 00:54:55 NAS-Reboot

    Right - so it basically dropped all idle connections. And the effect?

    http://img.photobucket.com/albums/v186/goldfish654/pckloss_plot-1.png

    Packet loss dramatically decreases.

    http://img.photobucket.com/albums/v186/goldfish654/ping_plot-1.png

    As does ping response time.

    Coincidence?? I think not! I think we have a software bug on our hands... I'll be having words with Colubris about this...
     
  4. G.T.

    G.T. R.I.P February 4, 2007. You will be missed.

    Great sleuthing Goldy. They should be paying you more. :)
     
  5. goldfish

    goldfish Lt. Sushi.DC

    Lol, they should be paying me full stop :p

    Looking at yet more graphs (I won't bother uploading them this time...) it seems that the system can quite happily support 60 users. But any more than that and you start getting insane amounts of packet loss, slow ping responses and all sorts of bad stuff.

    I think they need to distribute the network better. It's clear that the fiber connection between each node isn't fast enough to support the number of users that are connected. It could, of course, be that the multiservice controller hasn't got enough RAM to route the packets properly. It's all stuff that I'm going to look into ;)
     
  6. matt.chugg

    matt.chugg MajorGeek

    Nice work with the scripts by the way ;) Don't do perl myself
     
  7. goldfish

    goldfish Lt. Sushi.DC

    Here, I'll post the code :)
    Code:
     #!/usr/bin/perl
     # pckloss_stats.pl
     # Pings a host repeatedly, pulls the "% packet loss" figure out of
     # ping's summary line and appends it to a log file. At the end of
     # each day the log is rotated into ./logs/archive/.
     use POSIX;

     $LOG_FILE    = "./logs/pckloss_stats.dat";
     $host        = "64.233.167.99";   # one of google.com's servers
     $PING_PERIOD = 30;                # how frequently to ping, in seconds

     while (1) {
         # Remember which day we started on, for the log rotation below
         ($sec, $min, $hrs, $day, $month, $year) = (localtime)[0,1,2,3,4,5];
         $year += 1900;
         $month++;
         $startDay   = $day;
         $fstartDate = "$year-$month-$day";

         my $sentinel = 1;
         while ($sentinel) {
             # get the current time
             ($sec, $min, $hour, $day, $mon, $year) = (localtime)[0,1,2,3,4,5];
             $year += 1900;
             $mon++;

             # Run the ping command and read its output
             open PING_RESULTS, "/bin/ping -c 30 $host 2>&1 |"
                 or die "can't fork: $!";

             while ($line = <PING_RESULTS>) {
                 # This is the line that confirms we got a response back:
                 # ping's summary, e.g. "... 30 received, 10% packet loss ..."
                 if ($line =~ /, ([0-9]*)% packet loss/) {
                     open LOG, ">> $LOG_FILE";
                     print LOG "$hour:$min:$sec+$day-$mon-$year $1\n";
                     print "$hour:$min:$sec+$day-$mon-$year $1\n";
                     close LOG;
                 }
             }
             close PING_RESULTS;

             # Once the day rolls over, break out and rotate the log
             if ($day != $startDay) {
                 $sentinel = 0;
             }

             sleep $PING_PERIOD;
         }

         # So, it's the end of the day - archive the log before we restart
         system("mv ./logs/pckloss_stats.dat ./logs/archive/pckloss_stats_$fstartDate.dat");
     }
    
    Code:
     #!/usr/bin/perl
     # ping_stats.pl
     # Pings a host once every $PING_PERIOD seconds, pulls the average
     # round-trip time out of ping's summary line and appends it to a
     # log file. At the end of each day the log is rotated into ./logs/archive/.
     use POSIX;

     $LOG_FILE    = "./logs/ping_stats.dat";
     $host        = "64.233.167.99";
     $PING_PERIOD = 10;   # how frequently to ping, in seconds

     print "Ping test starting\n";

     while (1) {
         ($sec, $min, $hrs, $day, $month, $year) = (localtime)[0,1,2,3,4,5];
         $year += 1900;
         $month++;
         $startDay   = $day;
         $fstartDate = "$year-$month-$day";

         $sentinel = 1;
         while ($sentinel) {
             # get the current time
             ($sec, $min, $hrs, $day, $month, $year) = (localtime)[0,1,2,3,4,5];
             $year += 1900;
             $month++;

             # Run the ping command and read its output
             open PING_RESULTS, "/bin/ping -c 1 $host 2>&1 |"
                 or die "can't fork: $!";

             while ($line = <PING_RESULTS>) {
                 # This is the line that confirms we got a response back:
                 # the "rtt min/avg/max/mdev = a/b/c/d ms" summary ($2 = avg)
                 if ($line =~ / = ([0-9]*\.?[0-9]*)\/([0-9]*\.?[0-9]*)\/([0-9]*\.?[0-9]*)\/([0-9]*\.?[0-9]*) ms/) {
                     open LOG, ">> $LOG_FILE";
                     print LOG "$hrs:$min:$sec+$day-$month-$year $2\n";
                     print "$hrs:$min:$sec+$day-$month-$year $2\n";
                     close LOG;
                 }
             }
             close PING_RESULTS;

             if ($day != $startDay) {
                 $sentinel = 0;
             }
             sleep $PING_PERIOD;
         }

         # So, it's the end of the day - let's clean up after ourselves before we restart
         system("mv ./logs/ping_stats.dat ./logs/archive/ping_stats_$fstartDate.dat");
     }
    
    
    The one for getting Authorised Users I've published on my blog:
    http://www.goldfishsbowl.co.uk/?p=201

    Then a couple of other scripts will get the data that those scripts have saved and archived and use GNUPlot to turn them into a graph. Here's graphs.sh
    Code:
    #!/bin/bash
    # This script is designed to grab the latest data and make a nice graph out of it
    while (true)
    do
    echo "Sending graphs ... "
    mkdir temp
    
     # This part will do the cool funky rolling 6 hour
     # data collection thingy! That'll be nice!
     ./data_grabber.pl
    echo "Plotting Users ...."
    cat plots/user_plot.conf | gnuplot
    echo "Plotting Ping ..."
    cat plots/ping_plot.conf | gnuplot
    #cat plots/bandwidth_plot.conf | gnuplot
    echo "Plotting Packet Loss ...."
    cat plots/pckloss_plot.conf | gnuplot
    
    echo "Uploading Via FTP..."
    # And then send them to the webserver....
    cat ftp_script.ftp | ftp 172.16.31.253 > /dev/null
    
    # rm -rf temp
    sleep 60
    done
    
    The script that gets the data looks like this:
    Code:
    #!/usr/bin/perl -w
    
    # This script will grab 24 hours worth of
    # data from the sample files which can then
    # be graphed and look all pretty.
    
    #       First we work out when yesterday was...
     ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time-86400);
    $year+=1900;
    $mon++;
    $fyesterday = $year.'-'.$mon.'-'.$mday;
    
    my @today = localtime(time);
    # Define the names of the files we need to deal with.
    print $fyesterday."\n";
    my @files = ("ping","pckloss","user");
     system("mkdir -p /home/stats/temp/");  # same place the OUTPUT files below go
    
    foreach $file (@files)
    {
            print "==== STARTING $file ========\n";
            print "OUTPUT: /home/stats/temp/".$file."_stats.dat\n";
            open OUTPUT, ("> /home/stats/temp/".$file."_stats.dat");
            print "INPUT /home/stats/logs/archive/".$file."_stats_".$fyesterday.".dat\n";
            if (-e "/home/stats/logs/archive/".$file."_stats_".$fyesterday.".dat")
            {
                    print "FILE EXISTS\n";
                    open (INPUT, "/home/stats/logs/archive/".$file."_stats_".$fyesterday.".dat");
                    while ($line = <INPUT>)
                    {
                             if ($line =~ /(\d+):(\d+):(\d+)/) { # For valid lines...
                                     # Keep only entries logged after this time
                                     # yesterday, giving a rolling 24-hour window
                                     # ($hour and $min come from the localtime call above)
                                     if ($1 > $hour || ($1 == $hour && $2 >= $min)) {
                                             print OUTPUT $line; # output the line
                                             print $line;
                                     }
                             }
                    }
    
                    close INPUT;
            }
             if (-e "/home/stats/logs/".$file."_stats.dat")
             {
                     open (INPUT2, "/home/stats/logs/".$file."_stats.dat");
                     while ($line = <INPUT2>)
                     {
                             print OUTPUT $line;
                             print $line;
                     }
                     close INPUT2;
             }
             close OUTPUT;
    }
    
    The plot files look something like this:
    Code:
    set terminal png
    cd '/home/stats/images/'
    set output 'pckloss_plot.png'
    set xdata time
    set timefmt '%H:%M:%S+%d-%m-%Y'
    set xlabel 'Time'
    set ylabel 'Packet Loss (%)'
    set title 'Packet loss from Google.com'
    set yrange [0:100]
    plot '/home/stats/temp/pckloss_stats.dat' using 1:2 index 0 title "% packet loss" with lines
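
    For reference, the timefmt there matches the lines the loggers write - each line is a timestamp and a value, e.g. (made-up numbers):

```text
13:12:21+01-11-2006 56
13:12:51+01-11-2006 48
```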
    
    The FTP script simply PUTs the files to an FTP server:
    Code:
     binary
     put ~/images/ping_plot.png ping_plot.png
     put ~/images/users_plot.png users_plot.png
     put ~/images/bandwidth_plot.png bandwidth_plot.png
     put ~/images/pckloss_plot.png pckloss_plot.png
     put ~/reboots.txt reboots.txt
     quit
    
    The reboots.txt file is something I've grabbed from my ISP's history page. I used WWW::Mechanize and HTML::TableContentParser to extract it.
    Code:
    #!/usr/bin/perl -w
    # This script is to get any faults reported on the
    # DV history page.
    
    use WWW::Mechanize;
    use HTML::TableContentParser;
    
    my $username = 'my_username';
    my $password = 'my_password';
    while (1) {
    my $agent = WWW::Mechanize->new();
    my $login = 0;
    
    print "Getting the index...\n";
    $agent->get("https://www.my-isps-site.com/generic/home.aspx");
    $tempcontent = $agent->content();
    
    if ($tempcontent =~ /value=\"Sign-In\"/) {
            $login = 1;
    }
    
    if ($login) {
            print "Navigating to the uber-form\n";
            $agent->form_number(1);  # WWW::Mechanize numbers forms from 1
            print "Entering Login information ...\n";
            $agent->field("tbUserId",$username);
            $agent->field("tbPassword",$password);
            print "Click the button ... \n";
            $agent->click("btnSignin");
    }
    
    print "Following the My History link ...\n";
    $agent->follow("My History");
    
    print "Make an instance of HTML::TableContentParser\n";
    $p = HTML::TableContentParser->new();
    
    print "Loading in the HTML\n";
     $html = $agent->content();
    
    print "Parsing HTML...\n";
    $tables = $p->parse($html);
    
    my $i = 0;
    my $j = 0;
    my @usage;
    
    foreach $t (@$tables) {
            while (($key, $value) = each(%$t)){
                    if ($key eq 'id' && $value eq 'gvUsage') {
                            for $r (@{$t->{rows}}) {
                                    for $c (@{$r->{cells}}) {
                                            $data =$c->{data};
                                            $data =~ s/<[^\<]+>//g;
                                            if ($j == 1) {
                                                    $usage[$i]{'time'} = $data;
                                            } elsif ($j == 5) {
                                                $usage[$i]{'reason'} = $data;
                                            }
                                            $j++;
                                    }
                            $j=0;
                            $i++;
                            }
                    }
            }
    }
    open(LOGFILE,'>/home/stats/reboots.txt');
    print "--------------------------------------\n";
    for (my $k=1; $k < scalar (@usage); $k++)
    {
    #          print "Time: ".$usage[$k]{'time'}." Reason: ".$usage[$k]{'reason'}."\n";
            if ($usage[$k]{'reason'} eq 'NAS-Reboot') {
                    print LOGFILE "NAS-Reboot at $usage[$k]{'time'}\n";
                    print "NAS-Reboot at $usage[$k]{'time'}\n";
            }
    
    }
    close LOGFILE;
    print "--------------------------------------";
    
    sleep(600);
    
    }
    
     
  8. matt.chugg

    matt.chugg MajorGeek

    Hmm, nice, Perl actually looks simpler than I thought it would be lol
     
  9. Phantom

    Phantom Brigadier Britches

    They either need a fatter data line, more RAM, or a better Postal Exchange <for "lost packets", eh>. :D
     
  10. goldfish

    goldfish Lt. Sushi.DC

    Perl is lovely :) It stands for Practical Extraction and Report Language, and that's exactly what I'm using it for!

    The only problem I've been having is getting my head around the prefixes for variables. If you look at the last one, to get the data from a table I have to do some funky things with what I'd normally call a 2D array, but in Perl it's a list containing hashes. Oww, my brain! The whole @$tables / %$t thing confuses the heck out of me.
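
    For anyone else wrestling with it, here's the pattern in miniature - an array whose elements are hash references, like what the table parser returns (made-up data):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# An @usage-style structure: a plain array whose elements are
# references to anonymous hashes (one hash per table row).
my @usage = (
    { time => '01-11-2006 13:12', reason => 'NAS-Reboot' },
    { time => '02-11-2006 00:54', reason => 'Scheduled'  },
);

# $usage[0]          -> a hash reference
# $usage[0]{reason}  -> a value inside that hash
print $usage[0]{reason}, "\n";

# @$aref dereferences an array ref back into a list,
# and %$href unpacks one row's hash ref into key/value pairs.
my $aref = \@usage;
for my $row (@$aref) {
    while (my ($key, $value) = each %$row) {
        print "$key => $value\n";
    }
}
```

    The arrow is optional between subscripts, so $usage[0]{'reason'} and $usage[0]->{'reason'} mean the same thing.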

    I like Ruby too, but Perl seemed the best fit for this particular job.
     
  11. goldfish

    goldfish Lt. Sushi.DC

    Wow that message took a while to get there :p Actually I just left my PC and forgot to press post :D

    All in all, the box is a bit craptastic and I've got evidence to show them that it is. I don't care if they have to upgrade their carrier pigeons to fix it, but they ARE gonna fix it :p
     
  12. matt.chugg

    matt.chugg MajorGeek

    Maybe they forgot to feed the hamster over the weekend?

    /me goes to learn perl
     
