This document is in Draft Status
Post Upgrade Functionality Checks
This document lists checks that test the functionality of the BDII services after an upgrade.
- Log in to is1.grid.iu.edu or is2.grid.iu.edu and become root.
- Execute /opt/service-monitor/is/test.sh. If this test finds a problem, you will receive an e-mail.
- Check Web Page Display (This is not a vital part of the BDII Service but allows a manual check of the incoming data.)
- From the Status Page at http://is1.grid.iu.edu/cgi-bin/status.cgi or http://is2.grid.iu.edu/cgi-bin/status.cgi:
- Check the freshness of Raw Incoming Data; most resources should be less than 5 minutes old.
- Check the freshness of Data Feeds to OSG and WLCG; these should also be less than 5 minutes old.
- Checks of LDAP Server Functionality from the Command Line (Do not use Mac OS X 10.6.*; the test will fail there even with a functional BDII.)
- Run ldapsearch -h is1.grid.iu.edu -p 2170 -x -b mds-vo-name=local,o=grid | wc -l
- Run ldapsearch -h is1.grid.iu.edu -p 2180 -x -b o=grid | wc -l
- Repeat these with -h is2.grid.iu.edu. Compare the two -p 2170 results to each other; they should agree within 5%. Do the same for the two -p 2180 results.
- All these are optional and may be obsolete. Maintenance may proceed on the other instance at this point.
- Several scripted checks run on these services, including:
- Check of BDII freshness at the CERN Top Level BDII and SAM BDII, available at http://tinyurl.com/2u3xl8q. Only WLCG resources are listed there, as they are the only resources publishing to the CERN BDIIs. These tests run each hour on the 30-minute mark, so results may lag by up to an hour after an upgrade has completed.
- Email alerts are sent to the GOC-ALERTS mailing list for the following conditions:
- More than 10% of resources are not updating BDII information (either WLCG or OSG)
- FNAL or BNL is not available from the CERN Top Level BDIIs
- RSV probes check timestamps of information in the BDII; failures are reported via mail.
- The BDII Service also reports many system-level metrics via Munin. These should be checked continuously for anomalies after an upgrade.
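The comparison of ldapsearch line counts between is1 and is2 described in the command-line checks can be sketched as a small shell helper. This is only an illustration: the function names `within_pct` and `count_lines` are hypothetical, and measuring the 5% against the larger of the two counts, as well as treating two empty results as a failure, are assumptions rather than documented policy.

```shell
# Hypothetical helper: succeed if two counts differ by at most PCT percent
# of the larger count. Using the larger count as the reference is an assumption.
within_pct() {
    a=$1; b=$2; pct=${3:-5}
    if [ "$a" -ge "$b" ]; then big=$a; diff=$((a - b)); else big=$b; diff=$((b - a)); fi
    # Assumption: two empty results are treated as a failure, not as agreement.
    [ "$big" -gt 0 ] || return 1
    # Integer arithmetic: diff/big <= pct/100  is the same as  diff*100 <= pct*big
    [ $((diff * 100)) -le $((pct * big)) ]
}

# Hypothetical wrapper around the query from the checklist: count output lines.
count_lines() {  # usage: count_lines HOST PORT BASE
    ldapsearch -h "$1" -p "$2" -x -b "$3" | wc -l
}

# Example use (contacts the servers, so it is left commented out):
# n1=$(count_lines is1.grid.iu.edu 2170 mds-vo-name=local,o=grid)
# n2=$(count_lines is2.grid.iu.edu 2170 mds-vo-name=local,o=grid)
# if within_pct "$n1" "$n2" 5; then echo "2170: OK"; else echo "2170: MISMATCH ($n1 vs $n2)"; fi
```

The same call with port 2180 and base o=grid covers the second pair of results.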
LDAP Server Not Running
This error occurs if the LDAP server is not responding.
ldap_bind: Can't contact LDAP server (-1)
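As a rough sketch, a script can recognize this condition by looking for the message above in the error output of ldapsearch. The function name `server_down` is hypothetical, and matching on the message text (rather than on an exit status) is a choice made here for illustration.

```shell
# Hypothetical helper: succeed if the captured ldapsearch error text
# (first argument) contains the "Can't contact LDAP server" message.
server_down() {
    printf '%s\n' "$1" | grep -q "Can't contact LDAP server"
}

# Example use (contacts the server, so it is left commented out).
# "2>&1 >/dev/null" captures only the error stream and discards the results.
# err=$(ldapsearch -h is1.grid.iu.edu -p 2170 -x -b mds-vo-name=local,o=grid 2>&1 >/dev/null)
# if server_down "$err"; then echo "is1: LDAP server not responding"; fi
```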
Data not found
This type of error occurs if no data is found matching your query. First check the ldapsearch syntax; if it is correct, data is missing.
# extended LDIF
# base with scope subtree
# filter: (objectclass=*)
# requesting: ALL
# local, grid
# search result
result: 0 Success
# numResponses: 2
# numEntries: 1
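A script can detect this case by reading the `# numEntries:` trailer from the ldapsearch output. The sketch below is illustrative only: the function names `num_entries` and `has_data` are hypothetical, and treating a count of 1 or less as "no data" is an assumption based on the sample above, where the single entry returned is the search base itself.

```shell
# Hypothetical helper: print the value of the "# numEntries:" trailer read
# from ldapsearch output on stdin, or 0 if the trailer is absent.
num_entries() {
    awk '/^# numEntries:/ { n = $3 } END { print n + 0 }'
}

# Hypothetical helper: succeed only if more than one entry was returned.
# Assumption: a lone entry is just the search base, as in the sample output.
has_data() {
    [ "$1" -gt 1 ]
}

# Example use (contacts the server, so it is left commented out):
# n=$(ldapsearch -h is1.grid.iu.edu -p 2170 -x -b mds-vo-name=local,o=grid | num_entries)
# if has_data "$n"; then echo "is1: $n entries"; else echo "is1: data missing"; fi
```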
- 11 Aug 2010
- 24 May 2011
Topic revision: r6 - 04 May 2016 - 14:47:43 - ScottTeige