NeDi Community

NeDi Software Specific => Discovery => Topic started by: nschwarz on November 11, 2007, 03:11:00 PM

Title: Concurrent Discovery
Post by: nschwarz on November 11, 2007, 03:11:00 PM
We have a lot of devices (~ 1700). A discovery job ("nedi.pl -cobrA") takes about 8 hours.

Suggestion:

Some sort of concurrent discovery jobs, each of which discovers a different selected (configured) section
of the network.
Title: Re: Concurrent Discovery
Post by: rickli on November 12, 2007, 10:06:37 AM
Threaded discovery has been on the plate for a while. I'll look into this, but it would be good to know someone who's got experience with threaded Perl programming...? Anyone?

In the meantime you can play with -A and different seedfiles. Just make sure the discovery without -A finishes first...
Title: Re: Concurrent Discovery
Post by: duanew on November 13, 2007, 01:42:47 AM
I would like this as well.  I read a bit about multi-threading but wasn't sure how difficult it would be.  Because the whole discovery is kept in memory, I don't know if multiple threads could access the parent thread's variables.
Title: Re: Concurrent Discovery
Post by: rickli on November 13, 2007, 10:38:02 AM
Interesting docs:

http://perldoc.perl.org/threads.html

http://migo.sixbit.org/papers/Perl_Threads/
Title: Re: Concurrent Discovery
Post by: oxo on November 13, 2007, 06:50:10 PM
Well I have been playing with perl threads on and off for a while
-mostly for rrd updates.

In order to implement this, the DB access needs to be changed (maybe it already has been) to update one bit of information at a time.

In older releases, it was delete table and put new values in
- this 'ain't good with threads.
I believe this may have changed in the new release which I'll be checking now with RC4.

Say the word and I'll drive down for a code hacking session :)
- in Essen this weekend with the Cacti European meeting.

br Owen
Title: Re: Concurrent Discovery
Post by: rickli on November 14, 2007, 08:52:54 AM
From what I understand (so far), you can share variables between threads. If this doesn't work, the whole thing needs to be completely rewritten  :o
Title: Re: Concurrent Discovery
Post by: duanew on November 22, 2007, 03:30:33 AM
I have been experimenting with multi-threading NeDi and will share my experience.

For Perl 5.8 we need to "use threads;" and "use threads::shared;".  Starting and joining the threads is easy but there is a lot of work in sharing the variables.

I changed the main discovery loop to:

   
Code:
if ($threadcount > 1){

    # Initialise the worker threads
    my @threads = ();
    for (my $i = 0; $i < $threadcount; $i++){
        push @threads, threads->create(\&misc::DiscoverThreaded);
        #print "Thread $i Process Id $threads[$i]\n";
    }

    # Wait for all threads to end
    for (my $i = 0; $i < $threadcount; $i++){
        $threads[$i]->join;
    }
}

In DiscoverThreaded I had it maintain the todo and done arrays.  This is the start of DiscoverThreaded:

Code:
sub DiscoverThreaded {

    # Take the first device off the shared todo list
    my $p;
    {
        lock(@misc::todo);
        $p = shift(@misc::todo);
    }

    while ($p){
        #my $peer  = $doip{$_[0]};
        my $peer  = $doip{$p};

and this is the end:

Code:
        # We have done this device, update the done lists
        {
            lock(@misc::doneid);
            lock(@misc::doneip);
            lock(@misc::donenam);
            push (@misc::doneid,$p);
            push (@misc::doneip,$peer);
            push (@misc::donenam, $name) if ($name);
        }

        # Now get the next device from the todo list
        {
            lock(@misc::todo);
            $p = shift(@misc::todo);
        }
    }
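
The pattern in the excerpts above (worker threads pulling device names from a shared todo list under a lock) can be tried on its own with a minimal sketch like the following. The arrays @todo and @done and the device names are hypothetical stand-ins for NeDi's @misc::todo and @misc::doneid, and the actual discovery work is left out:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical stand-ins for @misc::todo and @misc::doneid
my @todo :shared;
my @done :shared;
@todo = map { "device$_" } 1 .. 20;

sub worker {
    while (1) {
        my $dev;
        {
            lock(@todo);              # released at the end of this block
            $dev = shift @todo;
        }
        last unless defined $dev;     # list is empty, let the thread finish

        # ... per-device discovery would run here, outside any lock ...

        {
            lock(@done);
            push @done, $dev;
        }
    }
    return;
}

my @threads = map { threads->create(\&worker) } 1 .. 4;
$_->join for @threads;

printf "%d devices processed, %d left\n", scalar @done, scalar @todo;
```

Testing with defined() instead of truth avoids stopping on a device that happens to be called "0". Note that in a real discovery the todo list grows while neighbours are found, so a worker that sees an empty list may exit too early; this sketch does not handle that.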

At the start of the scripts share the variables like this:

Code:
use vars qw($p $now $nediconf $cdp $lldp $oui);
use vars qw(%nod %dev %int %mod %link %vlan %opt %net %usr);

use threads;
use threads::shared;

share(%nod);
share(%dev);
share(%int);
share(%mod);
share(%link);
share(%vlan);
share(%opt);
share(%net);
share(%usr);

But the difficult part is sharing all the hashes.  share() only works on one level of a hash, so every time a new level is added it has to be shared as well.

In the MySQL library module, for example:

Code:
#===================================================================
# Read Links
#===================================================================
sub ReadLink {

    my $nlink = 0;
    my $where = "";
    if($_[0] and $_[1]){$where = "WHERE $_[0] = \"$_[1]\""}

    my $dbh = DBI->connect("DBI:mysql:$misc::dbname:$misc::dbhost", "$misc::dbuser", "$misc::dbpass", { RaiseError => 1, AutoCommit => 1});
    my $sth = $dbh->prepare("SELECT * FROM links $where");
    $sth->execute();
    while ((my @l) = $sth->fetchrow_array) {
        # A shared hash has to be created for every level, but only once:
        # ||= keeps the rows already stored under the same keys
        $main::link{$l[1]} ||= &share({});
        $main::link{$l[1]}{$l[2]} ||= &share({});
        $main::link{$l[1]}{$l[2]}{$l[3]} ||= &share({});
        $main::link{$l[1]}{$l[2]}{$l[3]}{$l[4]} ||= &share({});
        $main::link{$l[1]}{$l[2]}{$l[3]}{$l[4]}{bw} = $l[5];
        $main::link{$l[1]}{$l[2]}{$l[3]}{$l[4]}{ty} = $l[6];
        $main::link{$l[1]}{$l[2]}{$l[3]}{$l[4]}{pr} = $l[7];
        $main::link{$l[1]}{$l[2]}{$l[3]}{$l[4]}{du} = $l[8];
        $main::link{$l[1]}{$l[2]}{$l[3]}{$l[4]}{vl} = $l[9];
        $nlink++;
    }
    $sth->finish if $sth;
    $dbh->disconnect;
    return "$nlink links ($where) read from MySQL:$misc::dbname.links\n";
}


It is quite a bit of work identifying all these new hash initialisations.  This would certainly not be a minor release.
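
One way to cut down on the number of places that need touching might be a small helper that creates the shared levels on demand. This is only a sketch: shared_path is a hypothetical function, not part of NeDi, and the device and interface names are made up.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %link :shared;

# Hypothetical helper: walk down the given keys, creating a shared hash
# at every level that does not exist yet, and return the innermost hash.
sub shared_path {
    my ($hash, @keys) = @_;
    for my $k (@keys) {
        lock(%$hash);                 # held until the end of this iteration
        $hash->{$k} = &share({}) unless exists $hash->{$k};
        $hash = $hash->{$k};
    }
    return $hash;
}

# The parent creates the path and stores one field ...
my $leaf = shared_path(\%link, 'sw1', 'Gi0/1', 'sw2', 'Gi0/2');
$leaf->{bw} = 1_000_000_000;

# ... and a worker thread extends the very same structure.
threads->create(sub {
    shared_path(\%link, 'sw1', 'Gi0/1', 'sw2', 'Gi0/2')->{ty} = 'C';
})->join;

print join(',', sort keys %{ $link{sw1}{'Gi0/1'}{sw2}{'Gi0/2'} }), "\n";
```

With a helper like this, each assignment site becomes a single call instead of four explicit share lines, at the cost of one lock per level.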

I can send my current code but it still needs quite a bit of work.
Title: Re: Concurrent Discovery
Post by: rickli on November 22, 2007, 09:51:23 AM
Are you a programmer? Wow, that's quite impressive...

I wonder though, if the whole discovery should be rewritten for thread support. I'm sure there are more efficient ways to process the collected information when the focus is on multi-threading right from the start (like storing data for each device in the DB right away, to avoid the need to share complex hashes etc).

Thanks a lot for your work! I'll definitely look at this, but I guess it'll be for 2.0 and hopefully with a bunch of developers like you  8)
Title: Re: Concurrent Discovery
Post by: nschwarz on March 20, 2008, 09:17:24 AM
Hi,
I have tried several approaches to executing SNMP requests in parallel:

1.    (Reinventing the wheel ...) Generating a lot of threads, each doing the SNMP requests.
      This results in a lot of system overhead. Requesting more than 120 threads results in "out of memory" problems.
      A typical result after changing the test implementation (collect the results of finished threads and close them
      before creating new threads) is:
      checking 25,000 SNMP variables on 250 Cisco switches took ~ 63 sec real / 183 sec CPU (25-30 threads in parallel)

2.    Trying the Perl module (CPAN) "SNMP::Effective": sorry, no success getting it to work ...

3.    I found a different implementation of parallel SNMP requests, "SNMP::Query::AsynchMulti", at
      http://www.perlmonks.org/?node=664360
      Trying this, I got impressive performance:
      checking 465,000 SNMP variables on 1520 Cisco switches took ~ 35 sec real / 28 sec CPU!

The last implementation could be a good choice for separating the discovery process (run only a few times
a day) from performance gathering (e.g. every 5 minutes),
e.g. just filling the RRD interface data files in parallel to the NeDi discovery process.
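
For the thread-based approach in point 1, the memory problem can be kept in check by starting a fixed number of threads once and feeding them through a queue, instead of creating a thread per device (every Perl thread gets its own copy of the interpreter). A minimal sketch using the core Thread::Queue module; the pool size and the addresses are assumptions, and the SNMP request itself is left out:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $workers = 4;                              # assumed pool size
my @devices = map { "10.0.0.$_" } 1 .. 50;    # hypothetical targets

my $jobs = Thread::Queue->new;
$jobs->enqueue(@devices);
$jobs->enqueue(undef) for 1 .. $workers;      # one stop marker per worker

sub poll_worker {
    my $count = 0;
    while (defined(my $ip = $jobs->dequeue)) {
        # A real worker would issue the SNMP requests for $ip here.
        $count++;
    }
    return $count;                            # handed back through join()
}

my @pool = map { threads->create(\&poll_worker) } 1 .. $workers;

my $total = 0;
for my $thr (@pool) {
    my ($n) = $thr->join;
    $total += $n;
}
printf "%d devices polled by %d threads\n", $total, $workers;
```

This only bounds the number of threads; it says nothing about how it would compare with the non-blocking approach in point 3.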
       
Title: Re: Concurrent Discovery
Post by: rickli on March 25, 2008, 09:03:55 AM
Sounds great! Thanks for your research. As soon as 1.0 is out of the way this (amongst other things) is something to be looked at.
Title: Re: Concurrent Discovery
Post by: rickli on July 13, 2008, 09:45:49 PM
As promised I've started massaging the code towards threaded discovery. I'm nowhere near completion, but DB updates are now done on a per-device basis. A nice side effect is that modules and IFs are no longer lost if a device is unreachable for some reason, and delete requests (from device-status) are carried out immediately. The only things left are nodes and links, which need to be calculated after the discovery. I'll either have to flatten their hashes for easier sharing or store them some other way in order to process the data...
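
Flattening could look roughly like the following sketch, where the four key levels of the link hash plus the field name are joined into one composite key, so a single share is enough. The helper names set_link and get_link, the separator and the sample values are assumptions made for illustration only:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %link :shared;      # one level only, so no per-level share is needed
my $SEP = "\t";        # assumed separator, must not occur in any key part

# Hypothetical helpers: store and fetch one field of one link.
sub set_link {
    my ($dev, $if, $nbr, $nbrif, $field, $value) = @_;
    lock(%link);
    $link{ join $SEP, $dev, $if, $nbr, $nbrif, $field } = $value;
    return;
}

sub get_link {
    my @key = @_;
    lock(%link);
    return $link{ join $SEP, @key };
}

# A worker thread writes, the parent reads the same entry afterwards.
threads->create(\&set_link, 'sw1', 'Gi0/1', 'sw2', 'Gi0/2', 'bw', 1_000_000_000)->join;
print get_link('sw1', 'Gi0/1', 'sw2', 'Gi0/2', 'bw'), "\n";
```

After the discovery the nested form can be rebuilt, if needed, by splitting each key on the separator.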
Title: Re: Concurrent Discovery
Post by: oxo on January 22, 2009, 07:08:49 PM
Hi

Do you have any more thoughts on threading...?

Edit:
Notes for myself
http://perldoc.perl.org/perlthrtut.html
Limitations of lock and share mentioned in earlier posts @ http://perldoc.perl.org/threads/shared.html as
Quote
lock VARIABLE
lock places a lock on a variable until the lock goes out of scope. If the variable is locked by another thread, the lock call will block until it's available. Multiple calls to lock by the same thread from within dynamically nested scopes are safe -- the variable will remain locked until the outermost lock on the variable goes out of scope.

Locking a container object, such as a hash or array, doesn't lock the elements of that container. For example, if a thread does a lock(@a), any other thread doing a lock($a[12]) won't block.

lock() follows references exactly one level. lock(\$a) is equivalent to lock($a), while lock(\\$a) is not.

Note that you cannot explicitly unlock a variable; you can only wait for the lock to go out of scope. This is most easily accomplished by locking the variable inside a block.
Quote
share will traverse up references exactly one level. share(\$a) is equivalent to share($a) , while share(\\$a) is not. This means that you must create nested shared data structures by first creating individual shared leaf nodes, and then adding them to a shared hash or array.
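
The scoping rule quoted above (a lock is only released when it goes out of scope) can be seen in a minimal sketch like this one; the counter and the thread count are arbitrary choices for the demonstration:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter :shared = 0;

sub bump {
    for (1 .. 1000) {
        lock($counter);     # released at the end of each loop iteration
        $counter++;
    }
    return;
}

my @thr = map { threads->create(\&bump) } 1 .. 4;
$_->join for @thr;

print "$counter\n";         # 4 threads x 1000 increments = 4000
```

Because the lock sits inside the loop body, it is taken and dropped on every iteration, which lets the other threads get in between increments.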
Title: Re: Concurrent Discovery
Post by: rickli on January 24, 2009, 05:41:44 PM
Thanks for the info. I'm gearing the next 1.0.x release towards parallel discovery. But actually implementing it will require many different approaches (to link and node handling). I'm thinking of dropping MAC forwarding tables altogether and using more efficient mechanisms to assign nodes. I just don't know when I'll find the time for this...
Title: Re: Concurrent Discovery
Post by: nerdi on February 06, 2009, 05:18:28 PM
This is all good, and I'm watching and hoping a reliable threaded discovery makes its way into NeDi. But as a stopgap, as mentioned in the original post, a method that would allow carving up the network address space between multiple non-concurrent jobs could help with larger network discoveries. While this doesn't speed up overall discovery, it does allow sections of the network that need more frequent discovery to be refreshed. I'm guessing there are ways to structure this (multiple nedi.conf instances etc...), but I wonder if someone has already hacked something like this together.
Title: Re: Concurrent Discovery
Post by: rickli on February 06, 2009, 07:12:28 PM
Just keep in mind this creates higher traffic peaks and stresses the device much more than a single walk at a time. I fear this could crash certain not-so-well engineered devices once in a while. Still amazing how much faster it gets! Of course it would be a great alternative to my slo-mo approach  8)
Title: Re: Concurrent Discovery
Post by: oxo on February 07, 2009, 12:00:49 AM
Here is my 10 cents ...

I have been using NeDi since about 2003 cos I hate making GUIs: last year I went to ver 1.0
The delay in 0.8 -> 1.0 was I needed an infrastructure:
I spent a bit of time with cacti to get automatic graph  (NeDi returned to the Cacti scene with auto cacti)

From the end of December, I began to gear up to (once again) try and see if NeDi could do 5 min RRD
- there was a mock-up of a solution about 2 years ago while I was away, but it wasn't good enough (didn't use NeDi routines enough).

NeDi will soon be writing the Device Table etc as soon as a device is done and not "all at once"
- this removes a lot of work for me.

I have now played about with pre-beta-release code, and now I am sure that:
- NeDi will be able (at some near future stage) to do RRD's within 5 min

Some bullets for what I think that a future version of nedi could contain:

(Don't get me started on multi-NeDi machines ...)

Can NeDi's father do it?
- definitely but time, time, time.

Can we as the million monkeys help him?
- definitely...

I have some ideas I want to work out in the forum.
The code can be worked so that, if the monkeys can type, a general snmp non blocking code with threads could be made and get the specialized NeDi code worked in.
NeDi code is great for cut and paste ( I hate writing something from scratch)

It is important for NeDi's father that a non-threaded, blocking nedi.pl still exists; this should be possible inside the same code base without too many if/else.

NeDi's father is working hard: so keep using his wish list and the PayPal Icon
- it doesn't get the bread on the table, but it helps with a bit of butter.


Title: Re: Concurrent Discovery
Post by: duanew on February 08, 2009, 01:11:21 PM
Quote
Tx for the info. I'm gearing the next 1.0.x release towards parallel discovery. But to actually implement it will require many different approaches (on link and node handling). I'm thinking of dropping MAC forwarding tables altogether and use more efficient mechanisms to assign nodes. I just don't know when I'll find the time for this...

We have a new server (Quad Xeon) and I would like to try your latest code.  Is it in SVN?

I can see 1.0.w-1 and 2.  I haven't used SVN before, how do I get the code out?
Title: Re: Concurrent Discovery
Post by: oxo on February 11, 2009, 12:04:51 AM
(Please note that there is a useful candidate for rrd graphs @ http://forum.nedi.ch/index.php?topic=433.msg1704#msg1704)