
Author Topic: Work the code Monkeys: Concurrent Discovery  (Read 5290 times)

oxo

  • Guest
Work the code Monkeys: Concurrent Discovery
« on: February 07, 2009, 12:26:14 AM »
Work in progress
- a continuation of Concurrent Discovery @ http://forum.nedi.ch/index.php?topic=44.0 for hackers.

Please don't post things like "oh, I want this" here: use http://forum.nedi.ch/index.php?topic=44.0 for that.

I have "reserved" the next few messages for myself to "work the code" ...
« Last Edit: February 07, 2009, 12:30:12 AM by oxo »


oxo

  • Guest
RRD
« Reply #2 on: February 07, 2009, 12:42:26 AM »
Summary of NeDi's usage of SNMP objects for RRD
- what is the least SNMP info that has to be collected so that NeDi routines like &ManageRRD can be used?

Code: [Select]
#===================================================================
# Update or create RRDs if necessary
# Parameters: device name
# Global: -
# Return: -
#===================================================================
sub ManageRRD {

my $dv = $_[0];
my $ok = 0;

This depends on the correct definition file for the device, which is derived from its sysObjectID.

Code: [Select]
system (
$rrdcmd,"update",
"$rrdpath/$dv/system.rrd",
join(":", "N",			# N = timestamp "now"; no newlines inside the argument
$main::dev{$_[0]}{cpu},
$main::dev{$_[0]}{mcp},
$main::dev{$_[0]}{mio},
$main::dev{$_[0]}{tmp})
);
Code: [Select]
system ($rrdcmd,"update",
"$rrdpath/$dv/$irf.rrd",
join(":", "N",			# N = timestamp "now"; no newlines inside the argument
$main::int{$_[0]}{$i}{ioc},
$main::int{$_[0]}{$i}{ooc},
$main::int{$_[0]}{$i}{ier},
$main::int{$_[0]}{$i}{oer})
);

So we need (about)
5 gets per device
4 columns from a/some tables (can be a lot of gets :) )
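
As a rough sketch of what the two update calls above need, the value string can be assembled from whatever was actually collected, with RRDtool's "U" (unknown) standing in for anything missing. The helper name rrd_update_arg and the sample numbers are made up for illustration; only the key names (cpu, mcp, mio, tmp, ioc, ooc, ier, oer) come from the NeDi code quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: build the "N:v1:v2:..." argument for `rrdtool update`.
# Values that were not collected become "U", which RRDtool stores as unknown.
sub rrd_update_arg {
    my ($data, @fields) = @_;
    my @values;
    for my $f (@fields) {
        push @values, defined $data->{$f} ? $data->{$f} : 'U';
    }
    return join(':', 'N', @values);
}

# Sample data, invented for this example.
my %sys = (cpu => 12, mcp => 65536, mio => 2048);    # no temperature collected
my %if  = (ioc => 1000, ooc => 2000, ier => 0, oer => 0);

print rrd_update_arg(\%sys, qw(cpu mcp mio tmp)), "\n";
print rrd_update_arg(\%if,  qw(ioc ooc ier oer)), "\n";
```

With a helper like this, an RRD-only run still produces a valid update even when one of the per-device gets fails.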

For things to go well/quickly:
Don't guess the community/version. They have to be known; if they don't work, too bad: "next".
(The present code works through an array of communities and uses version 1 (?): if there is no reply, for whatever reason, time is wasted going through the communities.)
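
A minimal sketch of the "known or next" idea, under the assumption that community and version were stored by an earlier discovery run. The %known table, the device names and the poll_known helper are all hypothetical, and the probe is a plain code ref standing in for the real SNMP request, so nothing here touches the network.

```perl
use strict;
use warnings;

# Hypothetical per-device settings, e.g. loaded from an earlier discovery.
my %known = (
    'sw-01' => { community => 'public',  version => 2 },
    'sw-02' => { community => 'private', version => 2 },
);

# Poll each device exactly once with its known settings.
# $probe stands in for the real SNMP request: it gets (device, community,
# version) and returns true on success, false on timeout or error.
sub poll_known {
    my ($creds, $probe) = @_;
    my (@ok, @failed);
    for my $dv (sort keys %$creds) {
        my $c = $creds->{$dv};
        if ($probe->($dv, $c->{community}, $c->{version})) {
            push @ok, $dv;
        } else {
            push @failed, $dv;    # too bad: no retry with other communities
        }
    }
    return (\@ok, \@failed);
}

# Fake probe for demonstration: only sw-01 answers.
my ($ok, $failed) = poll_known(\%known, sub { $_[0] eq 'sw-01' });
print "ok: @$ok\n";
print "failed: @$failed\n";
```

The worst case per silent device is then one timeout, instead of one timeout per community in the list.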

An RRD specific run would benefit from:
http://search.cpan.org/~dtown/Net-SNMP-5.2.0/lib/Net/SNMP.pm#get_entries()_-_retrieve_table_entries_from_the_remote_agent
« Last Edit: February 11, 2009, 11:16:02 PM by oxo »

duanew

  • Guest
Re: Work the code Monkeys: Concurrent Discovery
« Reply #3 on: February 08, 2009, 12:46:01 PM »
I tested some concurrent discovery code using multithreading. It is not necessary to use non-blocking SNMP if you have multiple threads. I tried three or four threads and this worked fine. With multiple threads running, other devices can be queried while some devices block.

SNMP and Telnet didn't seem to have any problem with the threads running concurrently.

Rickli says he has already done some work on this. In discovery the main code needs to change so that the results aren't kept in hash variables and only written to the DB at the end. Instead, each device is written to the DB as it is discovered. The main part I had trouble with is the MAC/IP tables, which need to be resolved across multiple devices.

While walking a switch's MAC address table we get each node's MAC address and physical interface, but we cannot resolve the MAC address to an IP address until we walk the ARP table of the layer 3 device.

Maybe as we walk mac address tables we add (or update) the mac address and interface (only if it is a physical interface) to the nodes table.  Similarly, as we walk arp tables we add/update mac address/ip address in the nodes table.

After the discovery is complete we try to resolve the names of all nodes in the nodes table.
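
The add/update idea can be sketched with a hash keyed by MAC address standing in for the nodes table. The sub names, addresses and interface names below are invented for the example; with a real DB and several threads, each update would have to be a single atomic insert-or-update statement.

```perl
use strict;
use warnings;

my %nodes;    # keyed by MAC address; stands in for the nodes table

# Called while walking a switch's MAC address table.
sub seen_on_port {
    my ($mac, $device, $ifname, $is_physical) = @_;
    return unless $is_physical;    # only physical interfaces are recorded
    $nodes{$mac}{device} = $device;
    $nodes{$mac}{ifname} = $ifname;
    return;
}

# Called while walking a layer 3 device's ARP table.
sub seen_in_arp {
    my ($mac, $ip) = @_;
    $nodes{$mac}{ip} = $ip;
    return;
}

# The two walks can arrive in either order.
seen_in_arp('00:11:22:33:44:55', '10.0.0.7');
seen_on_port('00:11:22:33:44:55', 'sw-01', 'Gi0/3', 1);
seen_on_port('00:11:22:33:44:55', 'sw-01', 'Po1', 0);    # skipped, not physical

# After discovery: every node that has an IP can have its name resolved.
for my $mac (sort keys %nodes) {
    next unless defined $nodes{$mac}{ip};
    print "$mac $nodes{$mac}{ip} on $nodes{$mac}{device} $nodes{$mac}{ifname}\n";
}
```

Because each walk only fills in its own columns, no device has to wait for another one to finish.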

oxo

  • Guest
Re: Work the code Monkeys: Concurrent Discovery
« Reply #4 on: February 08, 2009, 03:33:31 PM »
Yep: I think threads are needed, but I will be comparing 850 devices with:
- block
- non-block
- threads, probably with non-block

I have almost written the inc/lib that can allow block or non-block code so NeDi's father is happy.

rickli

  • Administrator
Re: Work the code Monkeys: Concurrent Discovery
« Reply #5 on: February 08, 2009, 09:08:45 PM »
Wow, cool. When NeDi's ready for multiple discovery threads it would make a great feature to choose between number of threads and blocking or non-blocking...
Please consider Other-Invoices on your NeDi installation for an annual contribution, tx!
-Remo

oxo

  • Guest
Progress: 188 sec for 155 devices: supports 5 threads
« Reply #6 on: February 11, 2009, 11:14:46 PM »
http://odp.svn.sourceforge.net/viewvc/odp/Others/nedi/branches/WorkTheCode/?pathrev=272
188 sec for 155 devices: supports 5 threads
Threads: 1: system, 2: if, 3: ipAddr, 4: cdpCache (0 is "main")
- "get"-ing ALL entries ...

Note: the 155 devices use Version 2 get_bulk. Your mileage may vary if you are using Version 1 devices ...

I have tried to make code that can use threads and non-blocking, or not, as NeDi's father wanted.

The next stage is to collect only the ifTable SNMP entries needed for RRD (cpu, temp etc. will have to wait for a while)
- and then try to create RRD files using a version of ManageRRD, which I hope I can call after a small conversion program has turned the libodpsnmp.pl data structure into the NeDi data structure, to maintain compatibility.
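
A possible shape for such a conversion, with two loud assumptions: the input layout (device, then ifIndex, then column name) is invented here, since the libodpsnmp.pl structure is not shown in this thread, and the column-to-key mapping is only a guess from the NeDi key names (ioc/ooc as in/out octets, ier/oer as in/out errors). The target layout $int{device}{index}{key} is the one used in the ManageRRD excerpt.

```perl
use strict;
use warnings;

# ASSUMPTION: mapping guessed from the NeDi key names, not taken from NeDi.
my %col2key = (
    ifInOctets  => 'ioc',
    ifOutOctets => 'ooc',
    ifInErrors  => 'ier',
    ifOutErrors => 'oer',
);

# Convert a hypothetical { device => { ifIndex => { column => value } } }
# structure into the $int{device}{ifIndex}{key} layout ManageRRD reads.
sub to_nedi_int {
    my ($src) = @_;
    my %int;
    for my $dv (keys %$src) {
        for my $i (keys %{ $src->{$dv} }) {
            for my $col (keys %col2key) {
                next unless exists $src->{$dv}{$i}{$col};
                $int{$dv}{$i}{ $col2key{$col} } = $src->{$dv}{$i}{$col};
            }
        }
    }
    return \%int;
}

# Invented sample: columns without a mapping (ifDescr) are dropped.
my $int = to_nedi_int({
    'sw-01' => { 1 => { ifInOctets => 1000, ifOutOctets => 2000, ifDescr => 'Gi0/1' } },
});
for my $key (sort keys %{ $int->{'sw-01'}{1} }) {
    printf "%s=%s\n", $key, $int->{'sw-01'}{1}{$key};
}
```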

It could be "nice" to assign 2 threads to ifTable, but I need some way of doing this so that the threads are assigned their "own" devices (I'll need some form of semaphore or lock per device so that a thread picks up the next unassigned device...)
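
One way to get "each thread picks up the next unassigned device" without a lock per device is a shared work queue. This is only a sketch using the core threads and Thread::Queue modules, not what flint.pl does; run_pool and the device names are invented, the actual ifTable walk is left as a comment, and it needs a perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hand out devices to $nthreads workers; returns { device => times_handled }.
sub run_pool {
    my ($nthreads, @devices) = @_;
    my $todo = Thread::Queue->new(@devices);    # filled before workers start
    my $done = Thread::Queue->new();

    my @workers;
    for my $id (1 .. $nthreads) {
        push @workers, threads->create(sub {
            # dequeue_nb() is thread-safe and returns undef once the queue
            # is empty, so a device is never handed to two workers.
            while (defined(my $dv = $todo->dequeue_nb())) {
                # ... walk the ifTable of $dv here ...
                $done->enqueue($dv);
            }
        });
    }
    $_->join() for @workers;

    my %handled;
    while (defined(my $dv = $done->dequeue_nb())) {
        $handled{$dv}++;
    }
    return \%handled;
}

my @devices = map { "dev$_" } 1 .. 10;    # invented device names
my $handled = run_pool(2, @devices);
printf "%d devices handled\n", scalar keys %$handled;
```

A slow device then only holds up the thread that picked it, while the other thread keeps taking devices from the queue.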

And of course the cam tables need a thread at some stage ...(snmp: I don't use telnet/ssh for these ...)
« Last Edit: February 12, 2009, 12:04:43 AM by oxo »

oxo

  • Guest
Status
« Reply #7 on: February 13, 2009, 01:18:10 PM »
OK: now I can get all interface entries for all devices within 5 min.

RRD graphs ...

The current NeDi seems to update an RRD from the data nedi.pl has and save to DB.

I want to try saving to the DB and having another program make the RRDs.

Why?
Well, I think adding more threads to flint.pl for interface collection when one reaches 1000 + 1 devices is not the answer.
I would prefer to add another machine that takes care of some of the devices, while another machine takes care of the other devices and they save their interface data to DB
- this could scale better.
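
Splitting the devices between machines could be as simple as a stable function of the device name, so every collector can work out its own share without coordination. A sketch, with the sub name and the device names made up; the checksum is just the sum of the byte values, which is good enough for illustration but not an even distribution in general.

```perl
use strict;
use warnings;

# Map a device name to one of $n collectors (0 .. $n-1).
# unpack's "%32C*" template returns a 32-bit checksum (the sum of the
# byte values), so the same name always lands on the same collector.
sub collector_for {
    my ($device, $n) = @_;
    return unpack('%32C*', $device) % $n;
}

my @devices = qw(sw-01 sw-02 rtr-01 rtr-02 core-01);    # invented names
my %share;
for my $dv (@devices) {
    push @{ $share{ collector_for($dv, 2) } }, $dv;
}
for my $c (sort keys %share) {
    print "collector $c: @{ $share{$c} }\n";
}
```

Adding a machine changes $n and therefore moves devices between collectors, which is fine as long as the interface data ends up in the shared DB rather than in local files.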

The creation of RRDs involves disk access / cache etc., and the RRDs have to be "directly" accessible by the web server (or mounted ...).

So, will a DB server stand being updated by one machine and queried by another every 5 min for all devices?

Well, what do you think?

In the meantime, I'll sharpen flint.pl for the next step (whatever that is)
- cam entries are still not done...
« Last Edit: February 13, 2009, 03:49:37 PM by oxo »