Author Topic: slow discovery  (Read 6439 times)

cornua

  • Guest
slow discovery
« on: October 26, 2011, 02:25:40 PM »
hi,

I had v1.0.5 running fine for a while and decided to upgrade to v1.0.7. I did a parallel install in a different directory to have both running at the same time, each with its own database. The problem is that a discovery now takes almost 5 days to complete.

We have about 800 routers/switches in the network and a total of about 9000 devices if I count the APs and the IP phones.

Any idea what would slow down the discovery that much? It used to take a few hours.

thanks,

harry

  • Full Member
  • ***
  • Posts: 131
Re: slow discovery
« Reply #1 on: October 27, 2011, 01:41:59 AM »
I think it should be a separate instance on a separate machine. I guess you are running two NeDi processes with two separate databases, which can impact the discovery process.
Anyone, please correct me if I am wrong.

Harry.

tristanbob

  • Full Member
  • ***
  • Posts: 159
Re: slow discovery
« Reply #2 on: October 28, 2011, 09:35:50 PM »
There may be different settings for which types of devices to ignore. For example, if NeDi tries to connect to lightweight wireless access points, it can waste 30 seconds on each AP. This happened to us, so we had to edit nedi.conf to ignore devices with certain text in the CDP fields.

Please visit "Other"->"Invoices" on your NeDi installation to make an annual contribution and support Nedi!

cornua

  • Guest
Re: slow discovery
« Reply #3 on: October 31, 2011, 03:36:18 PM »
Thanks for the suggestions.
I used pretty much the same nedi.conf settings for the new v1.0.7 installation as I had for 1.0.5 (which was working fine).

I'm now running a single discovery using v1.0.7, and it looks like it will take a long time to complete.

We use only Cisco devices on our network, and I kept the default nosnmpdev string.
nosnmpdev       IP\s(Phone|Telephone)|^ATA|AIR-LAP11|MAP-|AP(\s|_)Controlled
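
As a quick sanity check of what that default pattern covers, here is a minimal sketch that runs the nosnmpdev regex against a few platform strings. It is only an approximation: it uses Python's re module rather than Perl, it assumes the pattern is applied as a case-sensitive search against the CDP platform text, and the sample strings are made up for illustration.

```python
import re

# Default nosnmpdev pattern quoted in the post above.
NOSNMPDEV = re.compile(r"IP\s(Phone|Telephone)|^ATA|AIR-LAP11|MAP-|AP(\s|_)Controlled")

# Hypothetical CDP platform strings, not taken from a real discovery.
SAMPLES = [
    "Cisco IP Phone 7960",
    "cisco AIR-LAP1142N-A-K9",
    "cisco AIR-LAP1242AG-A-K9",
    "cisco WS-C3750-48P",
]

def is_skipped(platform: str) -> bool:
    """True if the pattern matches, i.e. SNMP would be skipped (assumed semantics)."""
    return NOSNMPDEV.search(platform) is not None

if __name__ == "__main__":
    for sample in SAMPLES:
        print(f"{sample:28} skipped={is_skipped(sample)}")
```

In this sketch the 1142 lightweight AP string is matched but the 1242 one is not, so if the network has AP models outside the AIR-LAP11 range, the pattern may need extending, in line with the suggestion in reply #2.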


One thing I did was delete some of the old SNMP community strings I still had configured in nedi.conf. I'll see how it goes, but any other suggestions are welcome.

Also, should anything be changed for PHP and/or Perl, e.g. increasing the memory limit?

Thanks,
~Alex

cornua

  • Guest
Re: slow discovery
« Reply #4 on: November 03, 2011, 07:27:39 PM »
...and now, even when I run a single discovery with either version, it takes days to complete. :-|

cornua

  • Guest
Re: slow discovery
« Reply #5 on: November 09, 2011, 11:56:12 PM »
Does anyone else have any other ideas? PHP config, re-install?

I'm not sure what to look for at this point.

SteffenS

  • Guest
Re: slow discovery
« Reply #6 on: November 11, 2011, 12:22:58 AM »
Hi cornua,

I had the same problem until a year ago.
Since then, I have solved it by splitting the refresh discovery into many discovery jobs running at the same time.

In NeDi 1.0.5, I used many "nedi -AU <different-configfiles>" jobs in crontab, with a different netfilter definition in each config file, one per subnet.
Since 1.0.6, I use "nedi.pl -A <filter>" instead of "-U <different-configfiles>".
That works great!
(Refresh every 6 hours: at 6, 12 and 18 o'clock (faster) and at 0 o'clock (slower, with neighbor finding, backup, ...))
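
The idea of running several filtered discoveries side by side can be sketched roughly as below. This is not Steffen's actual setup: the script path, the number of parallel jobs and especially the filter strings are placeholders, since the real -A filter syntax depends on the NeDi version. By default the sketch only prints the commands instead of running them.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

NEDI = "./nedi.pl"  # assumed path to the NeDi script

# Placeholder filters only; replace with whatever your NeDi version accepts.
FILTERS = ["<filter-for-site-A>", "<filter-for-site-B>", "<filter-for-site-C>"]

def build_commands(filters, nedi=NEDI):
    """One command line per filter, mirroring "nedi.pl -A <filter>" from the post."""
    return [[nedi, "-A", f] for f in filters]

def run_all(commands, max_parallel=3, dry_run=True):
    """Run the commands concurrently; with dry_run=True only print them."""
    if dry_run:
        for cmd in commands:
            print("would run:", " ".join(cmd))
        return [0] * len(commands)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))

if __name__ == "__main__":
    run_all(build_commands(FILTERS))
```

Separate crontab entries, as described in the post, achieve the same thing without a wrapper; the wrapper only makes it easier to cap how many jobs run at once.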

have fun

Steffen

rickli

  • Administrator
  • Hero Member
  • *****
  • Posts: 2777
    • NeDi
Re: slow discovery
« Reply #7 on: November 19, 2011, 11:50:41 AM »
If you use -v you'll see how long each device took. Anything unusual there? The networks I've tested actually got faster than 1.0.5...
Please consider Other-Invoices on your NeDi installation for an annual contribution, tx!
-Remo

cornua

  • Guest
Re: slow discovery
« Reply #8 on: November 28, 2011, 03:54:32 PM »
Hi,

I ran a verbose discovery on a 6503-E and it took 7 minutes. One thing I noticed is that the BridgeFwd section takes quite a long time because the SNMP walk times out.

@@@@@@@@@@@@@@@@@@@@@@@
BridgeFwd (SNMP) --------------------------------------------------------------
SNMP:Connect w.x.y.z snmpstring@953 v2 Tout:11s MaxMS:1472
FWDS:Walking BridgeFwd
ERR :Fp953 No response from remote host 'w.x.y.z'
(...)
@@@@@@@@@@@@@@@@@@@@@@@
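
For a sense of scale, here is a back-of-the-envelope estimate of what 7 minutes per device would add up to. It assumes, purely for illustration, that all of the roughly 800 routers/switches take as long as this one 6503-E and that they are discovered strictly one after another; neither is confirmed in this thread.

```python
DEVICES = 800            # routers/switches, from the first post
MINUTES_PER_DEVICE = 7   # the single 6503-E measurement above

def sequential_days(devices: int, minutes_each: float) -> float:
    """Total wall-clock days if devices are discovered one after another."""
    return devices * minutes_each / 60 / 24

if __name__ == "__main__":
    print(f"{sequential_days(DEVICES, MINUTES_PER_DEVICE):.1f} days")
```

Under those assumptions the total comes to about 3.9 days, which is in the same range as the multi-day runs reported at the start of the thread.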


I'd also guess that a CLI discovery would be faster, but for some reason the username/password pass authentication and then enable fails. Any idea?

@@@@@@@@@@@@@@@@@@@@@@@
Prepare (CLI)  ----------------------------------------------------------------
TEL :z-cwuser:23 Tout:3s OS:IOS EN:(.+?)#\s?$
CLI2:Matched Username: , sending username
CLI3:Username username sent
CLI3:Matched Password:, sending password
CLI3:Password sent
CLI4:Matched switch>, enabling
CLI7:Matched Password:, sending password
ERR :


cornua

  • Guest
Re: slow discovery
« Reply #9 on: December 06, 2011, 12:40:55 PM »
OK, I found out that the previous SNMP string was causing the problem. Does anyone know what the restrictions are on the SNMP string? Are there any special characters that should be avoided?

Our previous SNMP string included an @.

Now, even though the discovery time is back to normal, I still have a couple of problems:
- I can't edit the device definition because our SNMP string has a ! in it, so the "Community" field of the Device definition generator is not fully populated.
- On many devices, even if a device definition exists, only half of the information gets pulled following discovery.
- Strange behavior on some distribution switches: polling one will pull all of its information but remove some of the info from the other one (the serial disappears and the IP address changes). The opposite happens if I poll the 2nd distribution switch.

thanks,
« Last Edit: December 06, 2011, 02:14:52 PM by cornua »

rickli

  • Administrator
  • Hero Member
  • *****
  • Posts: 2777
    • NeDi
Re: slow discovery
« Reply #10 on: December 06, 2011, 10:11:41 PM »
Cisco uses the @ for vlan indexing sometimes, but I'm not aware of other restrictions. I'll check Defgen, though...

Can you find out more details, when discovering with -v? E.g. what exactly fails?
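
To illustrate why an @ inside the community itself could collide with that VLAN indexing, here is a small sketch. The verbose output earlier in the thread shows the community being used in the form community@vlan for the BridgeFwd walk. How a given device splits such a string is an assumption here, not something confirmed in this thread, and the community name is made up.

```python
def index_community(community: str, vlan: int) -> str:
    """Build the per-VLAN community in the community@vlan form seen in the log."""
    return f"{community}@{vlan}"

def split_at_first(indexed: str):
    """Hypothetical parsing: everything after the first @ is the context."""
    base, _, context = indexed.partition("@")
    return base, context

def split_at_last(indexed: str):
    """Alternative parsing: only the text after the last @ is the context."""
    base, _, context = indexed.rpartition("@")
    return base, context

if __name__ == "__main__":
    indexed = index_community("my@community", 559)  # made-up community
    print(split_at_first(indexed))
    print(split_at_last(indexed))
```

If the receiving side splits at the first @, neither the community nor the VLAN context it sees matches what was configured, and the request would plausibly go unanswered until the timeout. Splitting at the last @ recovers both correctly.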
-Remo

cornua

  • Guest
Re: slow discovery
« Reply #11 on: December 07, 2011, 03:20:30 PM »
Hi,

It goes fine for the most part, but it fails at the BridgeFwd walk.
Here's a snapshot of where the problem starts. PM me if you need the full discovery output.

(...)
BridgeFwd (SNMP) --------------------------------------------------------------
SNMP:Connect xxx.xxx.xxx.xxx ROstringwith@in_it@559 v2 Tout:18s MaxMS:1472
FWDS:Walking BridgeFwd
ERR :Fp559 No response from remote host 'xxx.xxx.xxx.xxx'
FWDS:Walking FWD Port to IF index
ERR :Fx559 No response from remote host 'xxx.xxx.xxx.xxx'
SNMP:Connect xxx.xxx.xxx.xxx ROstringwith@in_it@398 v2 Tout:18s MaxMS:1472
FWDS:Walking BridgeFwd
ERR :Fp398 No response from remote host 'xxx.xxx.xxx.xxx'
FWDS:Walking FWD Port to IF index
ERR :Fx398 No response from remote host 'xxx.xxx.xxx.xxx'
SNMP:Connect xxx.xxx.xxx.xxx ROstringwith@in_it@828 v2 Tout:18s MaxMS:1472
FWDS:Walking BridgeFwd
ERR :Fp828 No response from remote host 'xxx.xxx.xxx.xxx'
FWDS:Walking FWD Port to IF index
ERR :Fx828 No response from remote host 'xxx.xxx.xxx.xxx'
SNMP:Connect xxx.xxx.xxx.xxx ROstringwith@in_it@206 v2 Tout:18s MaxMS:1472
FWDS:Walking BridgeFwd
ERR :Fp206 No response from remote host 'xxx.xxx.xxx.xxx'
FWDS:Walking FWD Port to IF index
ERR :Fx206 No response from remote host 'xxx.xxx.xxx.xxx'
SNMP:Connect xxx.xxx.xxx.xxx ROstringwith@in_it@443 v2 Tout:18s MaxMS:1472
FWDS:Walking BridgeFwd
(...)
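
To put a number on output like this, a small script can count the "No response" errors and estimate the time spent waiting. This is a rough sketch: it assumes each error costs exactly one timeout (the Tout value on the preceding SNMP:Connect line), whereas retries could make the real cost higher.

```python
import re

# Excerpt in the same format as the verbose output above (address masked).
LOG = """\
SNMP:Connect xxx.xxx.xxx.xxx ROstringwith@in_it@559 v2 Tout:18s MaxMS:1472
FWDS:Walking BridgeFwd
ERR :Fp559 No response from remote host 'xxx.xxx.xxx.xxx'
FWDS:Walking FWD Port to IF index
ERR :Fx559 No response from remote host 'xxx.xxx.xxx.xxx'
SNMP:Connect xxx.xxx.xxx.xxx ROstringwith@in_it@398 v2 Tout:18s MaxMS:1472
FWDS:Walking BridgeFwd
ERR :Fp398 No response from remote host 'xxx.xxx.xxx.xxx'
FWDS:Walking FWD Port to IF index
ERR :Fx398 No response from remote host 'xxx.xxx.xxx.xxx'
"""

TOUT_RE = re.compile(r"^SNMP:Connect .* Tout:(\d+)s")
ERR_RE = re.compile(r"^ERR\s*:\S+ No response from remote host")

def timeout_summary(log: str):
    """Return (number of 'No response' errors, estimated seconds spent waiting)."""
    tout = 0
    errors = 0
    waited = 0
    for line in log.splitlines():
        m = TOUT_RE.match(line)
        if m:
            tout = int(m.group(1))  # timeout announced for the walks that follow
        elif ERR_RE.match(line):
            errors += 1
            waited += tout          # assumption: each error costs one full timeout
    return errors, waited

if __name__ == "__main__":
    errors, waited = timeout_summary(LOG)
    print(f"{errors} timeouts, roughly {waited}s waiting")
```

For the two VLANs in this excerpt that is 4 timeouts and roughly 72 seconds. A switch with dozens of VLANs would lose proportionally more, which would fit with discoveries taking minutes per device.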

cornua

  • Guest
Re: slow discovery
« Reply #12 on: December 08, 2011, 04:35:29 PM »
Hi,

Maybe I should start a new post for this, but regarding the issue where discovering one device overwrites data of a second one:

Here's the setup:
- We have 2 distribution switches per site.
- One uplink from each distribution switch to every access switch (providing redundancy).
- HSRP for every VLAN interface between the distribution switches.

Problem:
- Polling the distribution switches individually at one specific site works fine.
- The problem is that it removes the serial# and bootimage from the other switch; e.g. poll switch1 OK, poll switch2 OK, but this removes the bootimage and serial# info of switch1. The same happens the other way around.

Not sure if it's related, but I sometimes see "duplicate IP" messages in NeDi's warning messages, sometimes about the HSRP virtual IP address (on both distribution switches), sometimes about an interface/IP address (even though it is admin down on one of the switches).

I thought the primary key was the device name. Am I wrong, or does something rely on a different key?

thanks,

rickli

  • Administrator
  • Hero Member
  • *****
  • Posts: 2777
    • NeDi
Re: slow discovery
« Reply #13 on: December 10, 2011, 05:58:12 PM »
Yes, the name is the primary key. What do those names look like? Are both switches shown in Devices-List or just one?

The duplicate IP won't factor in IF admin status, good point though! I'll look into adjusting the event level accordingly...

I also tried reproducing your problem with a "!" in the community, but it works fine here...
« Last Edit: December 10, 2011, 06:06:06 PM by rickli »
-Remo

cornua

  • Guest
Re: slow discovery
« Reply #14 on: December 12, 2011, 05:31:22 PM »
Hi Rickli,

Thanks again for the reply. Below are the answers to your questions.

Issue #1:
You asked: "What do those names look like? Are both switches shown in Devices-List or just one?"
- They all have unique names, which are the same as their DNS names, e.g. site-dsw1 and site-dsw2.
- They are both shown, except that they overwrite each other's bootimage and serial#.


Issue #2:
You wrote: "I also tried reproducing your problem with a "!" in the community, but it works fine here..."
- The major issue I had was with devices using an "@" in their SNMP string: discovery gets stuck and times out at the BridgeFwd portion.
- The problem with a "!" is seen only when trying to edit the device definition of a device whose SNMP string contains a "!". Those devices do get discovered.


thanks,