Tuesday, June 24, 2008

Truer Words Were Never Spoken

"Nothing in the world is worth having or worth doing unless it means effort, pain, difficulty..." -- Theodore Roosevelt

Apparently, Teddy worked for the System Center Operations Manager application development team.




Subscribe to my feed   StumbleUpon Toolbar

Subscribe to The EXPTA {blog} by Email

Friday, June 6, 2008

New Certifications


May 2008 was a busy month for me.

In addition to writing a book, I passed five exams in the first three weeks and earned my MCITP: Enterprise Messaging Administrator (the premier Exchange 2007 administrator certification) and three MCTS certifications (SCOM 2007, Forefront and Exchange 2007).

That makes 34 exams in a row that I've passed, including my CISSP. Yes!! The streak remains unbroken!

I've put together a certifications page that lists the current certifications that I hold, which I'm rather proud of.

Tomorrow I'm off to TechEd and I can't wait! I'll be blogging at least once a day while I'm there. Check my blog all week. If you're going to TechEd yourself, I might meet you at the TechEd Blogger Ultra Lounge. See you there!


Tuesday, May 20, 2008

Unable to Successfully Promote SCOM RMS Server

If the root management server (RMS) in a System Center Operations Manager 2007 (SCOM 2007) implementation fails or becomes unavailable for some reason, the entire SCOM system will fail. Well, not exactly. The managed agents will still collect performance and alert data and will either queue this data or forward it to their management server. However, the management servers will be unable to forward this information to the SQL database and administrators will be unable to launch either the Operations console or the web console, so it's as good as dead.

There are two ways to rectify this: bring the RMS server back online or promote an existing SCOM management server to an RMS. The Microsoft article "How to Promote a Management Server to a Root Management Server Role in Operations Manager 2007" does a good job of explaining the steps required, so I won't go through them here. But what happens if you get the following error when promoting the new RMS?

The machine managementserver is a server for multiple management groups (not supported)!

This occurs when the registry contains extra "Parent Health Service" or "Send Priority" keys under the Server Management Groups key. Navigate to:

HKLM\Software\Microsoft\Microsoft Operations Manager\3.0\Server Management Groups

Under this key you should see a key that matches the name of your SCOM management group. There should not be any other keys at the same level as the management group name. If there are, back them up and delete them. For example, if a "Send Priority" key sits alongside the management group key, back up and delete it and its subkeys.

Run the same ManagementServerConfigTool.exe PromoteRMS command and it should work now.
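
The registry check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python is my choice for the example, `find_extra_keys` and `list_subkeys` are hypothetical helper names, and the sketch only reports stray keys. Backing them up and deleting them is still a manual step.

```python
# Registry path from the post, relative to HKEY_LOCAL_MACHINE.
SMG_PATH = r"Software\Microsoft\Microsoft Operations Manager\3.0\Server Management Groups"


def find_extra_keys(subkeys, management_group):
    """Return the subkey names that are not the management group key."""
    # Registry key names are case-insensitive, so compare that way.
    wanted = management_group.lower()
    return [name for name in subkeys if name.lower() != wanted]


def list_subkeys(path=SMG_PATH):
    """Read the subkey names under HKLM\\<path>. Works on Windows only."""
    import winreg  # imported here so the rest of the sketch loads anywhere

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        subkey_count = winreg.QueryInfoKey(key)[0]
        return [winreg.EnumKey(key, index) for index in range(subkey_count)]
```

On the management server you would call `find_extra_keys(list_subkeys(), "YourManagementGroup")`, where the group name is a placeholder for your own. Anything it returns is a candidate for the backup-and-delete step.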



Thursday, May 15, 2008

SQL Exceptions during SCOM 2007 RMS Promotion

The Microsoft article "How to Promote a Management Server to a Root Management Server Role in Operations Manager 2007" does a pretty good job of explaining how to promote a SCOM 2007 management server to a root management server.

While performing a disaster recovery test today, I found that I was getting the following SQL exceptions when I ran the ManagementServerConfigTool.exe PromoteRMS command:

The type initializer for 'Microsoft.MOMv3.Setup.MOMv3ManagedCAs' threw an exception.

Turns out this is because I ran the ManagementServerConfigTool.exe PromoteRMS command directly from the SCOM SP1 Support Tools folder, which is missing some of the DLLs required to run the command.

Simply copy the files from the Support Tools folder on the SP1 CD to the local \Program Files\System Center Operations Manager 2007 folder and re-run the command.
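
If you would rather script the copy, here is a minimal Python sketch. The function name `copy_support_tools` is hypothetical, and the `overwrite=False` default is my own cautious assumption (it leaves files that already exist in the install folder alone); the post itself only says to copy the files.

```python
import shutil
from pathlib import Path


def copy_support_tools(source_dir, install_dir, overwrite=False):
    """Copy each file in source_dir into install_dir; return the names copied."""
    source, target = Path(source_dir), Path(install_dir)
    copied = []
    for item in sorted(source.iterdir()):
        if not item.is_file():
            continue  # this sketch ignores subfolders
        destination = target / item.name
        if destination.exists() and not overwrite:
            continue  # assumption: keep files that are already installed
        shutil.copy2(item, destination)
        copied.append(item.name)
    return copied
```

Pass the Support Tools folder on the SP1 CD as `source_dir` and the local SCOM installation folder as `install_dir`, then re-run the PromoteRMS command from the installation folder.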


Wednesday, May 14, 2008

Error Running SecureStorageBackup


When backing up or restoring the RMS keys using the SecureStorageBackup utility in SCOM SP1, you may come across the following error:

Could not load file or assembly 'Microsoft.Mom.Common, Version=6.0.4900.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.

To fix this, copy Microsoft.Mom.Common.dll from C:\Program Files\System Center Operations Manager 2007 to the same folder where SecureStorageBackup.exe is run. Then run SecureStorageBackup again.
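
The same kind of error can name other assemblies, so it helps to pull the file name out of the message before hunting for the DLL. Below is a small Python sketch that does so. `missing_assembly` is a hypothetical helper, and it assumes the DLL file is named after the assembly's simple name, which holds for Microsoft.Mom.Common.dll here.

```python
import re


def missing_assembly(error_text):
    """Return (dll_name, version) from a 'Could not load file or assembly' error.

    Returns None when the text does not contain such an error.
    """
    match = re.search(
        r"Could not load file or assembly '([^,']+), Version=([\d.]+)",
        error_text,
    )
    if match is None:
        return None
    name, version = match.groups()
    # Assumption: the file on disk is the assembly's simple name plus ".dll".
    return name.strip() + ".dll", version
```

Feeding it the error text above yields the DLL name and version to look for in the SCOM installation folder.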


Monday, May 5, 2008

Well, that was painful...

I'm installing a new SCOM 2007 SP1 infrastructure in a test environment.

I built a couple of SQL 2005 database servers and two management servers, with one of each in each of two sites. I installed the SCOM database on the first SQL server and then installed SCOM on the first management server, making it the root management server (RMS).

After SCOM installs, setup asks if you want to run the Operations Console. I cleared that checkbox and immediately began upgrading to SCOM 2007 SP1. Big mistake. Afterward I couldn't log into the console with any account. It seems that SCOM needs to do some more setup when you run the console for the first time.

I ended up completely uninstalling SCOM from the RMS and deleting the OperationsManager database from the SQL server, then I reinstalled everything. This time I launched the console before upgrading to SP1. It worked, but wasted about an hour and a half.

Learn from my mistake.


Tuesday, March 4, 2008

Good article on the SCOM Root Management Server function

The Operations Manager Product Team posted a good article explaining the role and purpose of the SCOM Root Management Server (RMS).

Microsoft could do better in the business continuance/disaster recovery arena by providing a simple wizard to automate the promotion/demotion of the RMS.

In my experience, most DR scenarios involve a site failure (power or network) that simple clustering won't resolve. The steps required to fail over to a remote site (importing the RMS keys and updating the agents) currently require someone with sufficient rights to follow a separate DR procedure document. It would be nice if this could be done from the GUI (where most of the admins live). This would facilitate the DR process when resource and time constraints are most critical.


Monday, March 3, 2008

Temporary fix for "Performance Module could not find a performance counter"

The SCOM Team has posted a temporary fix for the "Performance Module could not find a performance counter" error we've all been seeing after applying SCOM SP1.

Check out this post on the Operations Manager Product Team blog.


Wednesday, February 27, 2008

SCOM 2007 SP1 Upgrade Notes


I upgraded a client's SCOM 2007 infrastructure today from SCOM SP1 RC (build 6246) to SP1 RTM (build 6278).

No real problems encountered, except I should have followed my own #1 rule: Always restart your server before installing a major update. The only issue I ran up against was that the upgrade hung when installing the Management Packs on the Root Management Server (RMS). I reviewed the event logs during the install and found three of these events:

The OpsMgr Config Service service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.

Followed one minute later by:

The Service Control Manager tried to take a corrective action (Restart the service) after the unexpected termination of the OpsMgr Config Service service, but this action failed with the following error:
An instance of the service is already running.

I'm not sure that these caused the hang, but after I canceled the installation, restarted the RMS server and reinstalled SP1, it worked fine with no errors.


My biggest recommendation is to thoroughly read the online version of the SCOM SP1 Upgrade Guide before beginning your upgrade. The online version includes notes that didn't make it into the release notes included in the SP1 package itself. Particularly important are the notes about having to repair all agent installations if you are upgrading from SP1 RC, like I was.

The upgrade path for SP1 is very strict and must be performed in this order:

  1. Complete the prerequisite work: expand the database and logs, disable notification subscriptions (why, oh why, can't we do this against multiple subscriptions at once!), and remove pending agent installations
  2. Upgrade the RMS
  3. Upgrade the Reporting Server
  4. Upgrade stand-alone Management Consoles
  5. Upgrade Management Servers
  6. Upgrade Gateway Servers
  7. Upgrade (or in my case, repair) Agents on managed computers
  8. Upgrade the Audit Collection Service (ACS) server
  9. Reboot the SCOM servers (my suggestion, not required) and re-enable the subscriptions

The entire upgrade of nine SCOM servers and 289 managed computers took about 3 hours.
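
With a larger inventory it can help to sort the servers into that order before you start. The Python sketch below is one way to do it. The role labels in `UPGRADE_ORDER` and the function name `plan_upgrade` are my own hypothetical choices, and the prerequisite and reboot steps are left out because they are not tied to a server role.

```python
# Role labels are illustrative; the order follows steps 2 through 8 above.
UPGRADE_ORDER = [
    "root management server",
    "reporting server",
    "stand-alone console",
    "management server",
    "gateway server",
    "agent",
    "acs server",
]


def plan_upgrade(inventory):
    """Sort (hostname, role) pairs into the SP1 upgrade order."""
    rank = {role: position for position, role in enumerate(UPGRADE_ORDER)}
    unknown = sorted({role for _, role in inventory if role not in rank})
    if unknown:
        raise ValueError("unknown role(s): " + ", ".join(unknown))
    # Sort by role position first, then by hostname for a stable plan.
    return sorted(inventory, key=lambda item: (rank[item[1]], item[0]))
```

For example, `plan_upgrade([("MS02", "management server"), ("RMS01", "root management server")])` puts RMS01 first. The host names are placeholders.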



Tuesday, February 26, 2008

SCOM SP1 Released

In case you didn't know (I didn't until today), Microsoft quietly released System Center Operations Manager 2007 SP1 on February 22nd.

If you're upgrading from SP1 RC, like I am, be sure to read this important information about upgrading from the Operations Manager Product Team Blog:

"If users are upgrading from SP1 RC (6246) to SP1 RTM (6278) then will need to run repair to upgrade the agents rather than approve them from pending management view. This was not called out in the upgrade document we shipped in SP1. We have updated the web version of the upgrade guide as well as the release notes."

I'm hopeful that this release will fix a bunch of issues I've been having.
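
The rule in that note boils down to a check on the agent's build number. Here is a tiny Python sketch of it. The build numbers come from the quoted note, while the function name `agent_upgrade_action` and the returned labels are hypothetical. Treating every build older than the RC as the normal approve path is my assumption.

```python
SP1_RC_BUILD = 6246   # SP1 release candidate
SP1_RTM_BUILD = 6278  # SP1 final release


def agent_upgrade_action(agent_build):
    """Return how an agent at the given build gets to SP1 RTM."""
    if agent_build >= SP1_RTM_BUILD:
        return "none"     # already at SP1 RTM or later
    if agent_build == SP1_RC_BUILD:
        return "repair"   # RC agents must be repaired, per the quoted note
    return "approve"      # assumption: older builds use pending management
```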


Thursday, February 14, 2008

Fix for SCOM Aggregate Health State Errors


Microsoft System Center Operations Manager (SCOM) sometimes shows the aggregate state of the Health Service as unhealthy even though each of the component states is healthy. If you open Health Explorer everything looks healthy, and there doesn't seem to be a way to clear this condition.

At other times the Health Rollup state is unhealthy, but all the child items are healthy.



To fix both of these conditions, you need to put the server, Health Service and Health Service Watcher into maintenance mode for 5 minutes. Here's how to do it:

  • Open the SCOM 2007 Operations Console and configure two new state views. You'll only need to do this once:

    • Open the Monitoring node

    • Right-click Monitoring and create a new state view called Health Service, show data related to: Health Service. Click the Display tab and sort columns by State, Descending

    • Right-click Monitoring and create a new state view called Health Service Watcher, show data related to: Health Service Watcher. Click the Display tab and check Agent. Sort columns by State, Descending

  • Now put the affected servers and their Health Services and Health Service Watchers into maintenance mode for 5 minutes (the minimum duration)

Once the servers come out of maintenance mode the condition will be cleared. This problem is expected to be resolved in SP1, which is due very soon.


Thursday, January 10, 2008

Fix for SCOM 2007 Health Script failures


I had a problem with a couple of Windows Server 2003 domain controllers that were constantly showing as unhealthy in SCOM. The Health Explorer showed that the AD Op Master Roles monitor was failing. The Operations Manager event log would show the following events:

Event Type: Warning
Event Source: Health Service Script
Event Category: None
Event ID: 1
Date: 1/10/2008
Time: 5:50:05 AM
User: N/A
Computer: SADC01
Description:
AD Op Master Response : The script 'AD Op Master Response' failed to create object 'McActiveDir.ActiveDirectory'. This is an unexpected error.
The error returned was: 'The specified module could not be found.' (0x8007007E)

and

Event Type: Warning
Event Source: Health Service Script
Event Category: None
Event ID: 1000
Date: 1/10/2008
Time: 5:55:05 AM
User: N/A
Computer: SADC01
Description:
AD Lost And Found Object Count : The script 'AD Lost And Found Object Count' failed to create object 'McActiveDir.ActiveDirectory'. This is an unexpected error.
The error returned was 'The specified module could not be found.' (0x8007007E)

The solution is to run the OomADs.msi file in the C:\Program Files\System Center Operations Manager 2007\HelperObjects folder on the server having the problem (in my case, the domain controllers). Installation is quick and does not require a reboot. Once that's done, restart the OpsMgr Health Service and you're good to go.
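
If several domain controllers need the fix, the two steps can be scripted. The Python sketch below only builds the command lines, so you can review them before running anything. `helper_object_fix_commands` and `run_all` are hypothetical names, and `HealthService` is my assumption for the short service name of the OpsMgr Health Service, so confirm it on your server first.

```python
import ntpath
import subprocess

DEFAULT_INSTALL_DIR = r"C:\Program Files\System Center Operations Manager 2007"


def helper_object_fix_commands(install_dir=DEFAULT_INSTALL_DIR):
    """Build the commands that install OomADs.msi and restart the agent."""
    # ntpath joins with backslashes regardless of where the sketch runs.
    msi_path = ntpath.join(install_dir, "HelperObjects", "OomADs.msi")
    return [
        ["msiexec", "/i", msi_path],       # install the AD helper objects
        ["net", "stop", "HealthService"],  # assumed service short name
        ["net", "start", "HealthService"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Run it locally on the affected server with `run_all(helper_object_fix_commands())` from an administrative prompt.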




Tuesday, December 4, 2007

How to Forcibly Remove MOM Agents

I'm currently involved with a large System Center Operations Manager (SCOM) 2007 deployment. The client is moving from MOM 2000 to SCOM 2007.

In the course of deployment we're successfully removing the MOM agent via the MOM management console, but for some reason the agent software and OnePoint service remain on the servers. This causes various WMI and other errors on the newly unmanaged servers.

To fix this, I wrote a batch file that will forcibly remove the OnePoint service and all Microsoft Operations Manager 2000 software from the target server.

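The usage is: KILLMOM [TargetComputer]

@echo off
rem KILLMOM.BAT - forcibly removes the MOM 2000 agent from a remote server.
if "%1"=="" goto Syntax
echo.
echo WARNING! This command forcibly removes the MOM agent from the target server.
echo.
echo Press CTRL-C to quit or
pause
echo.
rem Stop the MOM agent (OnePoint) service, then wait about 40 seconds.
sc \\%1 stop onepoint
ping 127.0.0.1 -n 40 > nul
rem Stop SNMP while the agent is removed, then wait about 20 seconds.
sc \\%1 stop snmp
ping 127.0.0.1 -n 20 > nul
rem Delete the service and remove the agent files over the admin share.
sc \\%1 delete onepoint
rd "\\%1\c$\Program Files\Microsoft Operations Manager 2000" /s /q
sc \\%1 start snmp
echo The MOM agent has been removed successfully.
goto End
:Syntax
echo Usage: KILLMOM [TargetComputer]
echo.
:End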

I keep this batch file on the MOM DAS and run it from there. You must be an administrator on the target computer to successfully uninstall the MOM agent.
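
The batch file waits a fixed time after each stop command. As an alternative, here is a Python sketch that polls `sc query` until the service reports STOPPED. Python is my choice for the illustration, `parse_service_state` and `wait_until_stopped` are hypothetical names, and the parsing assumes the usual English `STATE : 1  STOPPED` line in the `sc query` output.

```python
import re
import subprocess
import time


def parse_service_state(sc_output):
    """Return the state word (for example 'STOPPED') from sc query output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def wait_until_stopped(server, service, timeout=60, interval=2, query=None):
    """Poll the service until it stops; return False if the timeout passes."""
    if query is None:
        # Default query: run "sc \\server query service" and capture its text.
        def query():
            result = subprocess.run(
                ["sc", rf"\\{server}", "query", service],
                capture_output=True,
                text=True,
            )
            return result.stdout

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if parse_service_state(query()) == "STOPPED":
            return True
        time.sleep(interval)
    return False
```

A call such as `wait_until_stopped("SERVER01", "onepoint")` could replace the first ping delay, where SERVER01 is a placeholder. The `query` parameter exists so the loop can be exercised without a real server.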
