
SCOM2016 compatibility of my MPs

A few short updates for all users of the MPs that I published on the TechNet Gallery (https://gallery.technet.microsoft.com/site/search?f%5B0%5D.Type=User&f%5B0%5D.Value=Stelian%20Postea):

– All MPs are working fine with SCOM2016 as well as SCOM2012 R2

– All MPs are still needed in SCOM2016, as none of the new features of SCOM2016 renders the additional functionality of my MPs ineffective or obsolete

– AddOn DFS Replication Management is now updated.
DFS-R: Backlog File Count (Monitor) now allows overrides for the checking interval (default 3600 sec) as well as for the backlog count threshold (default 0).
Please keep in mind that, for cookdown considerations, the optimal override configuration is “For all objects of class: DFS-R Replication Connection”.

Lifting limitations for monitoring UNIX/Linux LogFiles using SCOM

If you landed on this article because you found the title appealing, then you are already familiar with the limitations of monitoring UNIX/Linux LogFiles using SCOM.

Whatever method you followed (either monitoring templates or advanced MP authoring) to create monitoring that involves Microsoft.Unix.SCXLog.VarPriv.DataSource, you have most probably felt the pain of the following limitations:

  • It only works well with certain log file behaviors
  • It only works with one log file
  • It doesn’t actually suppress alerts for entries logged during maintenance mode; the alerts still arrive soon after the maintenance window ends

My latest TechNet gallery contribution https://gallery.technet.microsoft.com/UNIXLinux-LogFile-Library-4133064b is a library Management Pack that allows you to create rules to monitor UNIX/Linux LogFiles with all the above limitations lifted:

  • works for any kind of log file, including circular logs
  • allows wildcards when specifying which log files you’d like to monitor
  • is fully aware of maintenance mode

There is a catch – the MP does have a few prerequisites.

There is a good chance you already have all these ingredients in place.

To configure monitoring, please follow the instructions below closely; the syntax is important to get right.

First of all, test your grep/egrep commands on your system(s) before attempting to configure monitoring.
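For example (the pattern and file names below are placeholders; substitute your own), a quick check on the monitored host could be:

grep 'ERROR' /tmp/myapp.log
egrep 'ERROR|FATAL' /tmp/*.log

If these commands return the entries you expect, you are ready to configure the rule.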

Below is a walkthrough of creating a monitoring rule that uses grep under the hood.

Using the Authoring pane in the Operations Manager console, go to Rules and select Create a new rule…; this will bring you to the screen below:

[Screenshot unix1: rule type selection]

Please notice the new entries (grep and egrep) introduced by my library MP. Select the highlighted entry for grep.

[Screenshot unix2: rule name and target selection]

Make your selections for the highlighted fields. I strongly suggest using a Unix Local Application already discovered in your environment as the rule target. You can also go the route of targeting the UNIX/Linux Computer class, but remember to disable the rule by default and use an override to enable it for a group or instance (not elegant, not my favorite).

[Screenshot unix3: schedule]

Select how often the log file check will be performed. Exercise caution: do not configure it to run too often.

[Screenshot unix4: script and script arguments]

It actually doesn’t matter what you type for Script; it will be ignored anyhow.

What matters are the Script Arguments! Here is the Script Arguments entry:

'ERROR' /log:'/tmp/*.log' /temp:/tmp

explained:
  • 'ERROR' = the error pattern to be passed to grep; the expression cannot contain spaces, please use . instead
  • /log:'/tmp/*.log' = /log: followed by the log file path and name in single quotes
  • /temp:/tmp = /temp: followed by a temporary path that can be used to store control files handled automatically by the datasource module; the account used in the Run As profile must be able to write files in the path specified here!

Note: there is only one space between the arguments, and single quotes are used.
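
Two quick manual checks are worth doing at this point (paths and file names below are only examples): in the pattern, . matches any single character, so it also stands in for a space, and the path given after /temp: must be writable by the Run As account.

# '.' matches any single character, so 'Job.status' also matches the text "Job status"
grep -n 'Job.status' /tmp/*.log

# confirm the Run As account can create and remove files under the /temp: path
touch /tmp/logmon_check && rm /tmp/logmon_check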

[Screenshot unix5: alert configuration]

Don’t bother with the alert description; regardless of what you configure there, the alert description will be:
LogFile Error Entry below in format [LogFile]:[LineNumber]:[Entry]
{0}

That is at least what the wizard template will deliver. If this is not satisfying, you are welcome to do some minor editing inside your unsealed MP after the rule is created.
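
Since the rule runs grep under the hood, the [LogFile]:[LineNumber]:[Entry] format mirrors what grep prints when it scans several files with line numbering enabled. Purely as an illustration (the output line below is modeled on the sample entries later in this post):

grep -n 'ERROR' /tmp/*.log
/tmp/myapp.log:42:2017-03-03T12:11:10.000Z ERROR [TEST] - file in use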

If the steps above are followed carefully, you should see something like this inside your Management Pack:

<Rule ID="MomUIGeneratedRulec7749c523f08444abd9ebc8dc5b409a3" Enabled="false" Target="MicrosoftUnixLibrary7510450!Microsoft.Unix.Computer" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
<Category>Alert</Category>
<DataSources>
  <DataSource ID="InvokeDS" TypeID="AddOnUnixLogFileLibrary!AddOn.GREPDIFF.DS">
	<TargetSystem>$Target/Property[Type="MicrosoftUnixLibrary7510450!Microsoft.Unix.Computer"]/NetworkName$</TargetSystem>
	<ErrorCode>'ERROR'</ErrorCode>
	<LogFilePattern>'/tmp/*.log'</LogFilePattern>
	<TempFolder>/tmp</TempFolder>
	<UserName>$RunAs[Name="MicrosoftUnixLibrary7510450!Microsoft.Unix.ActionAccount"]/UserName$</UserName>
	<Password>$RunAs[Name="MicrosoftUnixLibrary7510450!Microsoft.Unix.ActionAccount"]/Password$</Password>
	<Frequency>300</Frequency>
	<Timeout>30</Timeout>
  </DataSource>
</DataSources>
<WriteActions>
  <WriteAction ID="Alert" TypeID="Health!System.Health.GenerateAlert">
	<Priority>1</Priority>
	<Severity>0</Severity>
	<AlertName />
	<AlertDescription />
	<AlertOwner />
	<AlertMessageId>$MPElement[Name="MomUIGeneratedRulec7749c523f08444abd9ebc8dc5b409a3.AlertMessage"]$</AlertMessageId>
	<AlertParameters>
	  <AlertParameter1>$Data/Property[@Name='ErrorLogEntry']$</AlertParameter1>
	</AlertParameters>
  </WriteAction>
</WriteActions>
</Rule>

<DisplayString ElementID="MomUIGeneratedRulec7749c523f08444abd9ebc8dc5b409a3">
  <Name>Type a rule name here</Name>
</DisplayString>
<DisplayString ElementID="MomUIGeneratedRulec7749c523f08444abd9ebc8dc5b409a3.AlertMessage">
  <Name>Type an alert name here</Name>
  <Description>
		  LogFile Error Entry below in format [LogFile]:[LineNumber]:[Entry]

		  {0}
		</Description>
</DisplayString>

To configure the egrep version of the rule, please follow the same steps as for grep, except:

  • in the Unix Log Files Details screen, type egrep for the Script (again, it actually doesn’t matter; it will be ignored)
  • enter something like this in Script Arguments:
'ERROR|^Job.status:.FATAL|EOD.Report$' /log:'/tmp/*.log' /temp:/tmp

Note: remember, no spaces in the regular expression; use . instead! Also use single quotes.

'ERROR|^Job.status:.FATAL|EOD.Report$'

will match entries like the ones below:

2017-03-03T12:11:10.000Z ERROR [TEST] – file in use
Job status: FATAL FAILURE – file not found
2017-03-03T12:11:10.000Z Critical error in EOD Report
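
Before wiring the expression into a rule, you can run it through egrep directly on the host (the log path below is just an example) and confirm it picks up exactly these kinds of entries:

egrep -n 'ERROR|^Job.status:.FATAL|EOD.Report$' /tmp/*.log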

Hope this helps your SCOM Cross-Platform monitoring. Feedback is welcome.