How I Learned to Love The Client Health Script

November 1, 2018 / Bryan Dam / 36 Comments

Many moons ago I decided to get serious about client health. First, because I kept finding endpoints that didn’t have the ConfigMgr client. Second, because those that did have the client were failing to install updates. In this post I’m going to walk through how I implemented a health script for my organization. I want to state early on that you may not agree with all my choices and that’s fine. I’m a very special kind of broken, which sometimes makes me draw hard lines where others might not. Another important factor to understand is that we manage all of our servers with ConfigMgr, which led to some learning opportunities.

Sophie’s Choice

The first problem is of course what health script to use. The primary candidates are Jason Sandys’ Client Startup Script and Anders Rødland’s ConfigMgr Client Health Script. I’ve read through both scripts fairly extensively and both are absolutely excellent. I initially implemented Jason’s script in our lab as a startup script and it made some good headway. Jason’s script is VBS, which is a big plus because it will run on basically any OS that calls itself Windows. After a while, however, there were some remediation steps I wanted to try and automate. About that same time Anders released his script and was iterating at a pretty fast pace. So I dove into the code and decided it was something I could not only use but also improve in small ways. When I did, Anders was kind enough to merge those changes.

To reiterate, both scripts are great and using either of them is infinitely better than neither.

Startup, Logon, or Something Else?

When we initially set up Jason’s script we did so using Group Policy to run it at startup according to its namesake. However, we quickly found that to be a bad idea for our servers. If the ConfigMgr client isn’t installed or isn’t working for whatever reason then the device isn’t getting patched. If it’s not getting patched, it’s not getting rebooted. If it’s not getting rebooted, it’s not running the health script that would fix all of that.

We then followed Anders’ guide to configure it as a logon script but ended up with similar results. Taken as a whole, most servers don’t get logged into all that often. Crucially, there are plenty of forgotten servers whose existence and purpose no one can even explain anymore. These devices are ripe for going out to lunch with no one but security noticing.

The above experience forced us to realize that we needed to run the script on a specified schedule rather than on startup or logon. At the same time, we didn’t want all our servers running the script at the exact same time. My organization’s VMware guys already hate me enough (they tell me it’s my face) and I wasn’t eager to give them even more reason to do so. Our solution was to use Group Policy to schedule the script to run daily on all devices but with a randomized start time.

Computer Configuration > Preferences > Control Panel Settings > Scheduled Tasks

Action: Start a Program
Program/script: Powershell
Add arguments (optional): -executionpolicy bypass -File "%ProgramData%\ConfigMgrClientHealth\ConfigMgrClientHealth.ps1"
Start in (optional): %ProgramData%\ConfigMgrClientHealth

The trigger of the scheduled task is daily at 8 AM with a random delay of up to 8 hours.

Note: Make sure the preference action is set to Create. If you use Replace it will reset the randomized start time every time group policy is evaluated. That may prevent the script from running at all, or at the very least make it run much less frequently than intended.
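
For devices that Group Policy can't reach, roughly the same task could be registered with PowerShell. This is only a minimal sketch, not what we deployed: the task name is made up, it assumes the script has already been copied to %ProgramData%\ConfigMgrClientHealth, that the task should run as Local System, and that the ScheduledTasks module (Windows 8 / Server 2012 and later) is available.

```powershell
# Sketch only: run from an elevated PowerShell session.
# Assumes the health script already exists at this path.
$folder = Join-Path $env:ProgramData 'ConfigMgrClientHealth'
$script = Join-Path $folder 'ConfigMgrClientHealth.ps1'

# Same program, arguments and working directory as the Group Policy preference above.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument "-ExecutionPolicy Bypass -File `"$script`"" `
    -WorkingDirectory $folder

# Daily at 8 AM, delayed by a random amount of up to 8 hours.
$trigger = New-ScheduledTaskTrigger -Daily -At '08:00' -RandomDelay (New-TimeSpan -Hours 8)

# Assumption: run as Local System with highest privileges.
$principal = New-ScheduledTaskPrincipal -UserId 'SYSTEM' -LogonType ServiceAccount -RunLevel Highest

# 'ConfigMgr Client Health' is a hypothetical task name; pick your own.
Register-ScheduledTask -TaskName 'ConfigMgr Client Health' `
    -Action $action -Trigger $trigger -Principal $principal
```

As with the Create vs. Replace note above, you'd want to register this once rather than re-register it on every run.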

UNC Path? I Don’t Need No Stinking UNC Path

One of the things I wanted to avoid was needing to read or write to any kind of centralized folder. The typical way to run these scripts is to place them on a UNC path and then make sure every device has read access to it. I’m not a huge fan of that configuration. It makes my health check dependent upon reaching a particular path and every device is going to reach out every single time it runs the script. In the case of logging it means giving every device write access to that folder. The majority of our devices are workgroup devices so that meant totally wide open folder permissions. My security team shot that down with extreme prejudice. Beyond my general aversion to this kind of centralization I wanted the script itself to be local on every box to make it as easy as possible to run manually as part of our support process. Sure, you could go to the UNC path but we found having it local easier. We also created some simple batch files to easily run the script silently or verbosely. To get the script locally on every device we again used Group Policy.

Computer Configuration > Preferences > Windows Settings > Files

Action: Replace
Source Files: \\contoso.com\SysVol\contoso.com\Policies\{Policy GUID}\Machine\Scripts\SCCM Client Health Check\*
Destination Folder: %ProgramData%\ConfigMgrClientHealth
(Common Tab) Apply once and do not reapply: Enabled

The source files point to the policy’s own folder. This is conveniently replicated to all our domain controllers. So if you have remote DCs it’s about as close to distributed as you’re going to get. If you ever need to update the files you can simply modify the policy to disable the ‘Apply once and do not reapply’ setting, wait for the policy to replicate, then re-enable the setting. Below is an image of the files we’re copying to the devices. You’ll notice that I’ve included ccmsetup which is all you need to kick off the client install. The downside is that this should be updated when you update ConfigMgr itself. Note that if you want to distribute updates as well you need to create separate file policies as the wildcard used above will not recurse into subdirectories. In our case we have additional file policies for Win 7/2008 R2 servicing stack and Windows Update Agent updates.

Here are the batch files, one verbose and one silent:

powershell -executionpolicy bypass -File "%~dp0ConfigMgrClientHealth.ps1" -Verbose
pause

powershell -executionpolicy bypass -File "%~dp0ConfigMgrClientHealth.ps1"
pause

Here’s our configuration file for anyone interested. Notice how I’m using PowerShell environment variables in the config file. This was one of the enhancements I made to the script. It will attempt to evaluate these paths so that you do not need to hard-code them.

<?xml version="1.0" encoding="utf-8"?>
<Configuration>
<LocalFiles>$($env:ProgramData)\ConfigMgrClientHealth</LocalFiles> <!-- Path locally on computer for temporary files and local clienthealth.log if LocalLogFile="True" -->
<Client Name="Version">5.00.8498.1711</Client>
<Client Name="SiteCode">###</Client>
<Client Name="Domain">contoso.com</Client>
<Client Name="AutoUpgrade">True</Client>
<Client Name="Share"></Client>
<Client Name="CacheSize" Value="10240" DeleteOrphanedData="True" Enable="True" />
<Client Name="Log" MaxLogSize="4096" MaxLogHistory="1" Enable="True" />
<ClientInstallProperty>SMSSITECODE=###</ClientInstallProperty>
<ClientInstallProperty>DNSSUFFIX=contoso.com</ClientInstallProperty>
<ClientInstallProperty>CCMLOGMAXSIZE=2621440</ClientInstallProperty>
<ClientInstallProperty>/skipprereq:silverlight.exe</ClientInstallProperty>
<ClientInstallProperty>/MP:mp.contoso.com</ClientInstallProperty>
<Log Name="File" Share="$($env:ProgramData)\ConfigMgrClientHealth\Logs" Level="Full" MaxLogHistory="8" LocalLogFile="True" Enable="False" /> <!-- Level: Full = everything. ClientInstall = only if installation of sccm agent fails.  -->
<Log Name="SQL" Server="" Enable="False" />
<Log Name="Time" Format="ClientLocal" /> <!-- Valid formats: ClientLocal / UTC  -->
<Option Name="CcmSQLCELog" Enable="False" /> <!-- Optional check on the ConfigMgr agent if local database is corrupt -->
<Option Name="BITSCheck" Fix="True" Enable="True" />
<Option Name="DNSCheck" Fix="True" Enable="True" />
<Option Name="Drivers" Enable="True" />
<Option Name="Updates" Share="" Fix="True" Enable="True" />
<Option Name="PendingReboot" StartRebootApplication="True"  Enable="False" />
<Option Name="RebootApplication" Application="" Enable="False" />
<Option Name="MaxRebootDays" Days="7" Enable="False" />
<Option Name="OSDiskFreeSpace">10</Option>
<Option Name="HardwareInventory" Days="10" Fix="True" Enable="True" />
<Option Name="SoftwareMetering" Fix="True" Enable="True" />
<Option Name="WMI" Fix="True" Enable="True"/>
<Option Name="RefreshComplianceState" Days="30" Enable="True"/>
<Service Name="BITS" StartupType="Automatic (Delayed Start)" State="Running" Uptime="" />
<Service Name="winmgmt" StartupType="Automatic" State="Running" Uptime="" />
<Service Name="wuauserv" StartupType="Automatic (Delayed Start)" State="Running" Uptime="" />
<Service Name="lanmanserver" StartupType="Automatic" State="Running" Uptime="" />
<Service Name="RpcSs" StartupType="Automatic" State="Running" Uptime="" />
<Service Name="W32Time" StartupType="Automatic" State="Running" Uptime="" />
<Service Name="ccmexec" StartupType="Automatic (Delayed Start)" State="Running" Uptime="7" />
<Remediation Name="AdminShare" Fix="True" />
<Remediation Name="ClientProvisioningMode" Fix="True" />
<Remediation Name="ClientStateMessages" Fix="True" />
<Remediation Name="ClientWUAHandler" Fix="True" Days="" />
<Remediation Name="ClientCertificate" Fix="True" />
</Configuration>
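
To give a feel for how a path like $($env:ProgramData)\ConfigMgrClientHealth in the config above can be evaluated at run time, here is a minimal sketch. It is not lifted from Anders' script and may differ from how the script actually does it; Expand-ConfigValue is a hypothetical helper name, and the XML is a trimmed-down stand-in for the real file.

```powershell
# Hypothetical helper: expand PowerShell expressions embedded in a config value.
function Expand-ConfigValue {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyString()]
        [string]$Value
    )
    # ExpandString treats the text like a double-quoted string, so
    # $($env:ProgramData) is replaced with the variable's current value.
    return $ExecutionContext.InvokeCommand.ExpandString($Value)
}

# Trimmed-down stand-in for the configuration file (single-quoted here-string,
# so nothing is expanded while the XML is being defined).
[xml]$config = @'
<Configuration>
  <LocalFiles>$($env:ProgramData)\ConfigMgrClientHealth</LocalFiles>
</Configuration>
'@

$localFiles = Expand-ConfigValue -Value $config.Configuration.LocalFiles
Write-Output $localFiles
```

One thing to keep in mind with this approach: expanding a string this way will run any subexpression it contains, so the config file needs to be as trusted as the script itself.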

Logging? Who Needs Logging?

When we first implemented Anders’ script it supported writing its results to log files on a centralized UNC share or to a database. In both cases this meant giving write access to every device in your organization. As I mentioned above, my security team had a good laugh at this and when they wiped the tears from their eyes they politely told me ‘no’. Only they used much more colorful language when doing so. Beyond this, the logical place to put such a database would be on the SQL server used by ConfigMgr. However, if you’re using the SQL Standard license that is included with ConfigMgr then doing so is a pretty clear violation of that license based on the recently released FAQ on that topic. I have seen some people put their faith in wishful thinking on this front and decide that it’s ‘ok’ because it helps ConfigMgr. I strongly disagree; if you need convincing go read the feedback on the article for some pretty explicit clarification.

For the reasons above I have not centralized logging. That being said, the most recent version of Anders’ script solves these problems. First, he has created a custom web service that can be used to interact with the database. Instead of every single device needing write access, you now only need a service account for the web service to run as, with the appropriate DB permissions. Further, Anders has clarified that SQL Express is perfectly fine for this use case and it obviously doesn’t need a lot of resources. I plan on investigating this in the near future and highly recommend you do likewise. Being able to use the reports would be a huge bonus.

Stay On Target

Once I had all of this figured out we had to decide how to target this. It seems simple enough but I found myself having to defend my position a lot. In fact, I had to argue with Microsoft itself. My position being: this should apply to every device in the domain because it doubles as our client push. We regularly find endpoints that have not been created using our supported OSD methods and therefore lack the ConfigMgr client. Trying to target a specific OU or security group wasn’t going to help that problem. The whole point is that someone did something stupid and we aren’t going to know about it until security flags the box as vulnerable. As always, your mileage may vary. We only have a single class of devices that are exempt from being managed by ConfigMgr so it was easy to exclude them with a WMI filter on the policy.

The End

So there you have it, that’s how we got semi-serious about improving client health. It absolutely worked and in the first few days we saw a significant number of clients appear in the console. I’m absolutely positive not everyone will agree with my implementation method and that’s just fine. I offer this as merely one option among many; it’s what worked for my particular organization with its particular quirks. With any luck, someone will find this approach useful for their organization.


35 Comments

Navneet
July 5, 2023 at 5:33 am

Hello,
Thanks for the script it is very helpful, Can you please let me know if you used the same powershell script or made changes in powershell script as well as per your changes. Also why there are 2 XML files and also use of Batch files in this?

Allen
July 29, 2021 at 6:07 pm

I’m having an issue with clients not getting their self-signed certs. They are missing the log files when it does this health check. Is there any way to ensure it gets the self-signed cert or to force it to grab them? What would I want to do?

- Bryan Dam (Post author)
  September 15, 2021 at 8:59 am
  
  Allen, I’m not aware of anything in the health script that addresses that. I _do_ remember having a problem at some point where there was a permission issue with the folder where the certs get stored. That could totally be a thing added into the health script but I’m no longer at that org to try and make it a thing.
  
Robert Stein
June 19, 2020 at 6:21 pm

Have you ever encountered this one “CMHEALTH”? I found it running at my new job as a baseline.

Like Jason’s script it’s written in VBS. I looked this guy up on LinkedIn and he’s no longer a PFE and now at Tanium (boo!).

ConfigMgr Client Health
Version 6.22.2011
Jason Johnson
Dedicated Support Engineer
Microsoft Premier Field Engineering

- bryandam
  June 20, 2020 at 10:17 am
  
  I’ve heard of it from a former PFE (who also went to Tanium) but never actually seen it in the wild nor does anything obvious come up via search. Could that be the client health script/solution that Premier can be engaged to deliver/install?
  
hkystar35
June 12, 2020 at 4:28 pm

This is great. Following your logic for the script to run independently, I’m testing out a new function to allow the script to attempt to download a static XML hosted on an internal IIS site.
I know there’s a web service for this already, but it’s more feasible to use an existing file hosting IIS site we already have than implement an new one, so I came up with this Pull Request, if you care to share your thoughts.

https://github.com/AndersRodland/ConfigMgrClientHealth/pull/51

- bryandam
  June 15, 2020 at 8:16 am
  
  Cool, yea, that’d totally work. For my purposes the config file really wouldn’t change all that much. Initially my main concern was updating the script itself in a semi-distributed way and AD’s SYSVOL is a poor-man’s distributed file system.
  
AusA380
January 12, 2020 at 7:47 am

Hi,
I’ve followed the instructions given above and the GPO to copy the folder locally to workstations works, however when configuring the GPO to set a scheduled task on workstations as a Computer policy, the ‘Run As’ option is greyed out due to a security issue Microsoft addressed quite some time ago.

How have people gotten around this, as the script needs to run as ‘NTAUTHORITY\SYSTEM’? Does anyone know what permission the GPO runs as if the ‘Run As’ field cant be filled in? I’m hoping this is still able to be done because the method above is so simple!

Thanks

JHL
September 24, 2019 at 3:51 pm

Is the script safe to run on Domain Controllers?

- bryandam
  September 24, 2019 at 3:52 pm
  
  I can’t say definitively one way or the other. I _can_ say that I didn’t exclude our DCs and it’s never come up as an issue.
  
David Zemdegs
September 19, 2019 at 8:13 pm

Server 2019 update:
Sadly, the OS caption for server 2019 1903 has removed the version number!@#?@!!!. Its still there for earlier versions of server 2019. To support both these server 2019 versions, make the following changes to the client health script:
In function get-operatingsystem:
duplicate the server 2016 line and change to 2019
after the switch statement and before the write-output statement add the following line:
if ($os.version -eq 10.0.18362) {$OSName = “Windows Server 1903 ” + $OSArchitecture }
In function Get-LastInstalledPatches
duplicate the 2016 line twice and change one to 2019 and one to 1903
In function Test-Service
duplicate the ‘OR’ test in the if statement with server 2016 twice and add both 2019 and 1903

Thats it.

- David Zemdegs
  September 19, 2019 at 8:16 pm
  
  ooops forgot the quotes – $os.version -eq ‘10.0.18362’
  
  - bryandam
    September 19, 2019 at 8:47 pm
    
    Cool, I’d submit that as a pull request to Anders’ GitHub repo: https://github.com/AndersRodland/ConfigMgrClientHealth
    
    - David Zemdegs
      September 19, 2019 at 10:34 pm
      
      Would love to but I have no idea how to use git 🙁
      
      - bryandam
        September 19, 2019 at 10:40 pm
        
        No need to at all. Fork the project, upload your altered file, send a pull request back to Anders’ master repo.
        
        
        David Zemdegs
        September 19, 2019 at 10:55 pm
        
        Thanks – I just checked and he has a new release in train – with 2019 support
        
David Zemdegs
March 12, 2019 at 5:37 pm

Thanks for this – Im going through the process of implementing this but I was just wondering what you created the batch files for?

- bryandam
  March 12, 2019 at 6:47 pm
  
  The batch files aren’t really necessary, they just make it easy to run the script on-demand when logged into the endpoint.
  
  - David Zemdegs
    March 12, 2019 at 8:00 pm
    
    Thanks for that. As I do heaps of SQL logging normally with other scripts, Ive decide to rewrite the SQL bit to call a stored procedure instead. Much neater IMHO.
    
  - David Zemdegs
    March 13, 2019 at 4:18 pm
    
    Also I assume your scheduled task has ‘Run whether user is logged on or not’. What account did you use?
    Im guessing you included ccmsetup for any manual kickoffs if required?
    
    - bryandam
      March 13, 2019 at 4:27 pm
      
      It runs as system which I think means the other setting isn’t enabled. If not … I don’t remember and am no longer at that org to check. CCMSETUP is there to install the client if it’s not there or the script determines it needs to be re-installed. We used this technique as our way of enforcing that the client exists on every device.
      
      - David Zemdegs
        March 13, 2019 at 4:42 pm
        
        Thanks – In the original script, the config.xml had a client > share property to point to the client install files. In which case you must have modified the script to point to the local ccmsetup, I’m guessing.
        
        
        bryandam
        March 13, 2019 at 6:30 pm
        
        Anders merged a pull request I sent to support relative paths.
        
        
        David Zemdegs
        March 20, 2019 at 5:27 pm
        
        You have to run it locally dont you? Putting a UNC path for the ccmsetup.exe wont work as the scheduled script is running under the LocalSystem account.
        
        
        Bryan Dam (Post author)
        March 20, 2019 at 7:30 pm
        
        You don’t have to, that’s just a choice I made.
        Assuming the device is joined to a domain then Local System (or really the Machine Account) can access UNC paths just fine … it’s just a matter of permission. With the correct permissions on the policy your endpoints can run the script from the same policy folder I referenced above. In fact, that’s how they’re pulling down the files in my scenario.
        
        
        David Zemdegs
        April 11, 2019 at 5:37 pm
        
        Script is working a treat. And with my mod to call a SQL stored procedure it’s generating lots of goodness in my SQL table. I am now looking at what bits of this script may not be needed for 1902.
        
Kevin Fason
January 15, 2019 at 9:35 pm

Great solution. I’ve been eyeballing Anders clienthealth for a while and your method got me moving on it finally. Did have a few differences. I was unable to get the GPP Copy to work using %PROGRAMDATA% variable so called out C:\ProgramData path. Guessing since this is still 2008R2 functional level. I put the files in netlogon with the client install for Jasons script vs in the GPO itself as you have done. Easier to maintain IMO. For the Updates I have that sourcing from netlogon via UNC as well. Added Windows 10 1809 folders for once it detects those. Ended up setting up Webservice as well so need to push WMF5.1.

Rodrigo
January 10, 2019 at 8:17 am

Great article.
We really have this problem in the SCCM client, and we’re going to deploy something along those lines.
I wonder if you considered creating a baseline (CI) and applying to clients?

- Bryan Dam (Post author)
  January 10, 2019 at 8:33 am
  
  You couldn’t use a CI as the only distribution method because that creates a Catch-22: you need a healthy client to run the CI to make sure the client is healthy. So if you used a CI it would only be in addition to some other method to make sure it gets there on devices missing the client or are broken. Now, what you could do is use a CI to report the results of the health script and thus get some level of reporting.
  
  - Rodrigo
    January 10, 2019 at 2:15 pm
    
    Ok, I got it. Thanks for the answer.
    
Tim Jeffries
November 5, 2018 at 3:14 pm

thanks for this. i’ve thought about this a few times and never looked into getting a script like this to run on the servers. this should help my workstation clients out a ton as well as most users do not logon/off very frequently.

CS
November 2, 2018 at 9:29 pm

Hi,

I have a lot of clients that are in dev/testing environment. They are stand alone workgroup, they do not join to corporate domain.

In this case, what would you suggest to do to achieve this health check purpose?

Thanks.

Regards,
CS

- bryandam
  November 2, 2018 at 10:51 pm
  
  Good question. For those I deliver the health script as an application, I should have outlined that part because it’s a significant portion of our environment as well. That’s not ideal obviously since you need a healthy client to install the health check. So making it part of OSD is important.
  
  - Rasmus
    September 15, 2021 at 6:57 am
    
    Hi Bryan!
    
    First of all, thank you for all the work you do.
    
    Second, we have non-domain joined computers that we want to use this script on. How have you set up the SCCM application to deploy the health check?
    
    Br,
    Rasmus
    
    - Bryan Dam (Post author)
      September 15, 2021 at 7:44 am
      
      For workgroups it’s an app that runs a script to copy the health script files and create the scheduled task. This is part of OSD for those devices as well. Not ideal mind you … but better than not doing it.
      