Lifting limitations for monitoring UNIX/Linux LogFiles using SCOM

If the title brought you to this article, you are probably already familiar with the limitations of monitoring UNIX/Linux LogFiles using SCOM.

Whatever method you followed (monitoring templates or advanced MP Authoring) to create monitoring that involves Microsoft.Unix.SCXLog.VarPriv.DataSource, you have most probably felt the pain of the following limitations:

  • It only works well with certain behavior of the log file
  • It only works with one log file
  • It doesn’t actually suppress alerts for entries logged during maintenance mode; the alerts arrive anyway soon after the maintenance window ends

My latest TechNet gallery contribution https://gallery.technet.microsoft.com/UNIXLinux-LogFile-Library-4133064b is a library Management Pack that allows you to create rules to monitor UNIX/Linux LogFiles with all the above limitations lifted:

  • works for any kind of log file, including circular
  • allows wildcards when specifying which log files you’d like to monitor
  • fully aware of maintenance mode

There is a catch – the prerequisites for using the MP:

Chances are good that you already have all of these ingredients.

To configure monitoring, follow the instructions below closely; the syntax has to be exact to get things right.

First of all, test your grep / egrep commands on your system(s) before attempting to configure monitoring.
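Here is a minimal sketch of such a dry run, assuming a POSIX shell. The scratch directory, file names and log lines are made up for illustration. With -n and more than one file, grep prints file:line:entry, which resembles the [LogFile]:[LineNumber]:[Entry] format of the alert description shown later; whether the library MP calls grep with exactly these options is an assumption on my part.

```shell
#!/bin/sh
# Dry run of a grep pattern against sample log files.
# The directory, file names and log lines are invented for illustration.
workdir=$(mktemp -d)
printf '%s\n' 'INFO service started' 'ERROR disk full' > "$workdir/app.log"
printf '%s\n' 'ERROR file in use' 'INFO done' > "$workdir/batch.log"

# -n prints line numbers; with more than one file grep also prints the file name.
matches=$(grep -n 'ERROR' "$workdir"/*.log)
printf '%s\n' "$matches"

# Count the matching lines to confirm the pattern catches what you expect.
match_count=$(printf '%s\n' "$matches" | wc -l | tr -d ' ')
echo "matching lines: $match_count"

# Remove the scratch directory again.
rm -rf "$workdir"
```

On a real system you would point the same grep command at your actual log path, for example '/tmp/*.log', and check that only the entries you want to alert on come back.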

Below is a walkthrough of creating a monitoring rule that uses grep under the hood.

Using the Authoring pane in the Operations Manager console, go to Rules and select Create a new rule…; this brings you to the screen below:

[Screenshot: unix1]

Notice the new entries (grep and egrep) introduced by my library MP. Select the highlighted entry for grep.

[Screenshot: unix2]

Make your selections for the highlighted fields. I strongly suggest that you use your own Unix Local Application, already discovered in your environment, as the rule target. You can also target the UNIX/Linux Computer class, but remember to disable the rule/monitor by default and use an override to enable it for a group or instance (not elegant, not my favorite).

[Screenshot: unix3]

Select how often the log file check will be performed. Be careful not to schedule it too frequently.

[Screenshot: unix4]

It doesn’t actually matter what you type for Script; it will be ignored anyway.

What matters are the Script Arguments! Here is the Script Arguments entry:

'ERROR' /log:'/tmp/*.log' /temp:/tmp

explained:
'ERROR' = the pattern passed to grep; the expression cannot contain spaces, so use . in place of each space
/log:'/tmp/*.log' = /log: followed by the log file path and name between single quotes
/temp:/tmp = /temp: followed by a temporary path where the datasource module stores the control files it handles automatically; the account used in the Run As profile must be able to write files in the path specified here!

Note. Separate the arguments with exactly one space, and use single quotes.
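Two of the constraints above can be checked from a shell before the rule is created. This is only a sketch under assumptions: the sample lines are invented, and /tmp stands in for whatever you pass to /temp:. Run the writability check as the account configured in the Run As profile, otherwise the result tells you nothing about that account.

```shell
#!/bin/sh
# 1) A dot in place of a space: '.' matches any single character, including a space.
sample='Job status: FATAL FAILURE'
if printf '%s\n' "$sample" | grep -q 'Job.status:.FATAL'; then
    dot_matches_space=yes
else
    dot_matches_space=no
fi
echo "dot matches space: $dot_matches_space"

# The dot is not limited to spaces, so this invented line matches as well.
if printf '%s\n' 'Job_status:_FATAL' | grep -q 'Job.status:.FATAL'; then
    dot_matches_other=yes
else
    dot_matches_other=no
fi
echo "dot matches other characters: $dot_matches_other"

# 2) The temp path must be an existing directory the current user can write to.
temp_path=/tmp   # assumed value of the /temp: argument
if [ -d "$temp_path" ] && [ -w "$temp_path" ]; then
    temp_writable=yes
else
    temp_writable=no
fi
echo "temp path writable: $temp_writable"
```

Because the dot matches any character, a pattern written this way is slightly broader than the same text with real spaces. For log monitoring that is usually harmless, but it is worth knowing when an unexpected line shows up in an alert.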

[Screenshot: unix5]

Don’t bother with the alert description; regardless of what you configure there, the alert description will be:
LogFile Error Entry below in format [LogFile]:[LineNumber]:[Entry]
{0}

That’s at least what the wizard template will deliver to you. If this doesn’t satisfy you, you’re welcome to do some minor editing inside your unsealed MP after the rule is created.

If you followed the steps above carefully, you should see something like this inside your Management Pack:

<Rule ID="MomUIGeneratedRulec7749c523f08444abd9ebc8dc5b409a3" Enabled="false" Target="MicrosoftUnixLibrary7510450!Microsoft.Unix.Computer" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
<Category>Alert</Category>
<DataSources>
  <DataSource ID="InvokeDS" TypeID="AddOnUnixLogFileLibrary!AddOn.GREPDIFF.DS">
	<TargetSystem>$Target/Property[Type="MicrosoftUnixLibrary7510450!Microsoft.Unix.Computer"]/NetworkName$</TargetSystem>
	<ErrorCode>'ERROR'</ErrorCode>
	<LogFilePattern>'/tmp/*.log'</LogFilePattern>
	<TempFolder>/tmp</TempFolder>
	<UserName>$RunAs[Name="MicrosoftUnixLibrary7510450!Microsoft.Unix.ActionAccount"]/UserName$</UserName>
	<Password>$RunAs[Name="MicrosoftUnixLibrary7510450!Microsoft.Unix.ActionAccount"]/Password$</Password>
	<Frequency>300</Frequency>
	<Timeout>30</Timeout>
  </DataSource>
</DataSources>
<WriteActions>
  <WriteAction ID="Alert" TypeID="Health!System.Health.GenerateAlert">
	<Priority>1</Priority>
	<Severity>0</Severity>
	<AlertName />
	<AlertDescription />
	<AlertOwner />
	<AlertMessageId>$MPElement[Name="MomUIGeneratedRulec7749c523f08444abd9ebc8dc5b409a3.AlertMessage"]$</AlertMessageId>
	<AlertParameters>
	  <AlertParameter1>$Data/Property[@Name='ErrorLogEntry']$</AlertParameter1>
	</AlertParameters>
  </WriteAction>
</WriteActions>
</Rule>

<DisplayString ElementID="MomUIGeneratedRulec7749c523f08444abd9ebc8dc5b409a3">
  <Name>Type a rule name here</Name>
</DisplayString>
<DisplayString ElementID="MomUIGeneratedRulec7749c523f08444abd9ebc8dc5b409a3.AlertMessage">
  <Name>Type an alert name here</Name>
  <Description>
		  LogFile Error Entry below in format [LogFile]:[LineNumber]:[Entry]

		  {0}
		</Description>
</DisplayString>
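The data source type in the fragment above is named AddOn.GREPDIFF.DS and takes a TempFolder for its control files. The MP's real logic is not shown in this article, so the following is only a hypothetical illustration of how a grep-then-diff cycle can report each matching entry once. The control file name, the function name and the sample log lines are all made up, and it assumes a POSIX shell with sort and comm available.

```shell
#!/bin/sh
# Hypothetical grep-and-diff cycle; an illustration, not the MP's actual code.
log_dir=$(mktemp -d)     # stands in for the directory holding the log files
state_dir=$(mktemp -d)   # stands in for the /temp: path
previous="$state_dir/matches.previous"   # made-up control file name

report_new_matches() {
    current="$state_dir/matches.current"
    # /dev/null as an extra file forces the file name prefix even when the
    # wildcard expands to a single log file.
    grep -n 'ERROR' "$log_dir"/*.log /dev/null | sort > "$current"
    touch "$previous"
    # comm -13 prints lines found only in the second (current) file.
    comm -13 "$previous" "$current"
    # Remember the current matches for the next run.
    mv "$current" "$previous"
}

printf '%s\n' 'ERROR first failure' > "$log_dir/app.log"
first_run=$(report_new_matches)

printf '%s\n' 'INFO still running' 'ERROR second failure' >> "$log_dir/app.log"
second_run=$(report_new_matches)

# A run with no new entries reports nothing.
third_run=$(report_new_matches)

echo "first run:  $first_run"
echo "second run: $second_run"
echo "third run:  [$third_run]"

rm -rf "$log_dir" "$state_dir"
```

This sketch keys each entry on file name and line number, so a rotated or circular file would need more careful handling than shown here. It also shows why the Run As account needs write access to the temp path: the state between two runs lives in a file there.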

To configure the egrep version of the rule, please follow the same steps as for grep, except:

  • in the Unix log Files Details screen, type egrep in Script (again, it doesn’t actually matter; it will be ignored)
  • enter something like this in Script Arguments:
'ERROR|^Job.status:.FATAL|EOD.Report$' /log:'/tmp/*.log' /temp:/tmp

Note. Remember: no spaces in the regular expression; use . instead! Also use single quotes.

'ERROR|^Job.status:.FATAL|EOD.Report$'

will match entries like the ones below:

2017-03-03T12:11:10.000Z ERROR [TEST] – file in use
Job status: FATAL FAILURE – file not found
2017-03-03T12:11:10.000Z Critical error in EOD Report
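A quick way to confirm this on your own system is to feed the sample entries to grep -E, the POSIX spelling of egrep. The sketch below assumes a POSIX shell; the fourth log line is invented to show a non-matching entry, and the dashes in the samples are typed as plain ASCII hyphens.

```shell
#!/bin/sh
# Check the extended regular expression against the three sample entries
# plus one invented line that should not match.
pattern='ERROR|^Job.status:.FATAL|EOD.Report$'
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
2017-03-03T12:11:10.000Z ERROR [TEST] - file in use
Job status: FATAL FAILURE - file not found
2017-03-03T12:11:10.000Z Critical error in EOD Report
2017-03-03T12:11:10.000Z INFO nightly job finished
EOF

# -E enables extended regular expressions; -c counts the matching lines.
matched=$(grep -E -c "$pattern" "$sample_file")
echo "matched $matched of 4 lines"

rm -f "$sample_file"
```

The third entry matches through the EOD.Report$ alternative, not through ERROR: grep is case sensitive by default, so the lowercase "error" in that line does not count.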

Hope this helps your SCOM Cross-Platform monitoring. Feedback is welcome.

7 responses to “Lifting limitations for monitoring UNIX/Linux LogFiles using SCOM”

  5. Michiel April 13, 2017 at 9:15 am

    Great work. Does this also work when you use the latest version of Microsoft.Unix.ShellCommand.Library for SCOM 2012, 7.5.1068.0? Would love to use this in SCOM 2012 environments also. I could mp2xml your MP, but if this only works with SCOM 2016, I won’t even bother.

    • Michiel April 13, 2017 at 9:26 am

      I guess I already found the answer: Microsoft.Unix.ScriptDetailsPage is used in your MP, but that UI Page is not available in the latest version of the MP for SCOM 2012.

      • spostea April 13, 2017 at 2:40 pm

        Thanks for feedback! My MP is working for SCOM2012 as well as for SCOM2016. Indeed, the dependency on UNIX/Linux Shell Command and Script Library MP is just for the Create Rule Wizard UI. I don’t see any reason you cannot have UNIX/Linux Shell Command and Script Library MP version 7.6.1064.0 (just this one, not all UNIX/Linux MPs) imported in SCOM2012 (it’s not for SCOM2016 only).
