Dam Good Admin

Or at least not entirely useless

Software Update Maintenance: It’s a Thing. That You Should Do.

If you don’t need convincing to maintain software updates feel free to skip to the good stuff:
Fully Automate Software Update Maintenance in Configuration Manager

In this foul year of our lord 2017 it should go without saying that it is critical to actively maintain Configuration Manager’s software updates.  Yet … and yet … I still see people on the forums, Reddit, Slack, and even now Facebook that are unaware that Windows Server Update Services (WSUS) needs care and feeding.  Being used by Configuration Manager doesn’t change this fact in the slightest.  If for some reason you don’t believe me then go do some reading:

The Problem: Won’t anyone think about the Windows Update Agent!?

The problem described in the articles above is that every client downloads the metadata for every update that is not declined in Windows Server Update Services (WSUS) and then tries to scan for compliance.  This is because the main role for WSUS in a Configuration Manager environment is to provide the update catalog (metadata) and EULAs to the clients.  Every update that is synced from Microsoft becomes part of the catalog whether you deploy it or not.  This is the reason that syncing the entire product list will crush WSUS into oblivion.  The clients must consume the entire catalog, and earlier versions of the Windows Update Agent (WUA) didn’t do this particularly efficiently, causing them to run out of memory and fail to scan.  The result is entries like the following in WindowsUpdate.log:

WARNING: ISusInternal::GetUpdateMetadata2 failed, hr=8007000E

Without successfully scanning, the device isn’t going to get updates.  Sure, over 90% of the updates are not applicable because the client doesn’t match the update’s targeted operating system, but every client still has to load that update’s metadata and run the applicability methods.  This problem has a particularly profound impact on 32-bit systems that are limited to 4 GB of memory.  For more information refer to this post: ConfigMgr 2012 update scan fails and causes incorrect compliance status.
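The hr=8007000E value in that log line is a standard Windows HRESULT, and pulling it apart shows why it points at memory.  Here is a minimal Python sketch of the decoding; the function name decode_hresult is mine, not something from WSUS or Configuration Manager.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its standard bit fields."""
    return {
        "failure": bool((value >> 31) & 0x1),  # severity bit: 1 means failure
        "facility": (value >> 16) & 0x7FF,     # 11-bit facility code
        "code": value & 0xFFFF,                # 16-bit error code
    }


# hr=8007000E from the WindowsUpdate.log entry quoted above
fields = decode_hresult(0x8007000E)
print(fields)
```

Facility 7 is FACILITY_WIN32 and code 14 is the Win32 error ERROR_OUTOFMEMORY, which lines up with the agent running out of memory while loading the catalog.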

The Other Problem: WSUS is Mad as Hell and Isn’t Going to Take it Anymore

Much more recently, organizations have seen their WSUS Internet Information Services (IIS) application pools (WsusPool) consume all available CPU and memory.  To understand the problem you need to know how WSUS works as part of Configuration Manager.  As mentioned above it delivers the update catalog (metadata) and EULAs to the clients.  However, it doesn’t just deliver the entire catalog to each client every time.  Instead, WSUS generates a delta catalog for each client containing only update metadata that the client doesn’t have.  In order to speed things along it caches the generated update metadata in memory for the next client that requests it.  Additionally, the new cumulative update rollups have caused the amount of metadata per update to explode.  Each month, new updates are released with the same metadata as the previous month plus a few new bits.  Putting this all together there are two separate feedback loops that can occur as a result.

The first feedback loop has to do with generating the update metadata.  If WSUS takes longer than ASP.NET’s timeout length (default is 110 seconds) then WSUS will kill the thread and throw out all of its progress.  So when CPU and memory constraints slow down the process of generating the metadata from the database it will make matters worse by failing and having to start over.  Eventually it will succeed but in the meantime it’s just burning resources at a time when a lack of resources is causing it to fail in the first place.
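To get a feel for how that loop burns resources, here is a toy model in Python.  It is only a sketch: the function name, the idea of a single "speed" factor per attempt, and the sample numbers are all my own assumptions, not measurements from a real WSUS server.  The only value taken from the text is the 110 second timeout.

```python
ASPNET_TIMEOUT_SECONDS = 110  # default ASP.NET timeout cited in the text


def simulate_retries(work_seconds, speed_per_attempt):
    """Return (successful_attempt, seconds_wasted_on_failed_attempts).

    work_seconds: time the request needs on an unloaded server (assumed).
    speed_per_attempt: fraction of full speed available on each attempt
    (assumed), e.g. 0.5 means the server is running at half speed.
    """
    wasted = 0.0
    for attempt, speed in enumerate(speed_per_attempt, start=1):
        elapsed = work_seconds / speed
        if elapsed <= ASPNET_TIMEOUT_SECONDS:
            return attempt, wasted
        # The thread is killed at the timeout and its progress is discarded.
        wasted += ASPNET_TIMEOUT_SECONDS
    return None, wasted  # never finished within the attempts given


# A 90 second job on a server running at 50%, then 60%, then full speed.
print(simulate_retries(90, [0.5, 0.6, 1.0]))
```

In this made-up case a job that should take 90 seconds throws away 220 seconds of work before it finally succeeds, and all of that wasted work lands on a server that was already short on resources.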

The second feedback loop has to do with caching the update metadata.  Because of the explosion in the amount of metadata, the memory needed to cache it all has likewise increased.  The WSUS application pool is configured with a private memory limit that defaults to 1843200 KB (1,800 MB, or roughly 1.8 GB), which is inadequate for most environments.  When the application pool hits this limit, CPU and disk usage are going to spike causing two things to happen.  One, WSUS will start returning ‘HTTP status 503 the service is temporarily overloaded’.  In WUAHandler.log you’re going to see things like this:

OnSearchComplete – Failed to end search job. Error = 0x80244022.
Scan failed with error = 0x80244022.

Two, the app pool might recycle its memory by simply clearing it.  That means all the cached metadata is gone and WSUS now needs to rebuild it all from the database taking up CPU and memory.  Pairing this with the timeout problem above makes two separate problems exponentially worse.
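If you want to know how often clients are hitting that 503, you can count the error codes in a copy of the client log.  Below is a minimal Python sketch using the two log lines quoted above as sample input.  The function name and the regular expression are my own; real Configuration Manager logs wrap each entry in extra markup, which this pattern simply ignores.

```python
import re

# Matches "Error = 0x80244022" or "error = 0x80244022" anywhere in a line.
SCAN_ERROR = re.compile(r"error\s*=\s*(0x[0-9A-Fa-f]{8})", re.IGNORECASE)


def count_scan_errors(lines):
    """Return a dict mapping each error code found to its occurrence count."""
    counts = {}
    for line in lines:
        match = SCAN_ERROR.search(line)
        if match:
            code = match.group(1).lower()
            counts[code] = counts.get(code, 0) + 1
    return counts


sample = [
    "OnSearchComplete - Failed to end search job. Error = 0x80244022.",
    "Scan failed with error = 0x80244022.",
    "Some unrelated line without an error code.",
]
print(count_scan_errors(sample))
```

To run it against a real file, pass an open file handle instead of the sample list, since iterating a file yields its lines.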

Note that there is no silver bullet for these WSUS problems.  WSUS is simply going to need the resources that it needs based on client count and scan frequency.  If all you have is 2 or 4 GB of memory to dedicate then no matter what you do it might just fail.

For more information on these issues and configuration suggestions see High CPU/High Memory in WSUS following Update Tuesdays and WSUS sync fails with HTTP 503 errors.  You can alleviate these problems by providing more resources or lowering the amount of resources WSUS needs.  While the hotfixes listed in the articles above will help lower this by being more efficient they don’t magically fix WSUS resource constraints.  However, there is something you can do to drastically lower the amount of resources WSUS needs.

Do the Needful: Maintain WSUS

A major factor in reducing these problems is to maintain your update catalog by declining updates that are no longer needed or that you never intend to deploy.  This is the only way to remove them from the update catalog and prevent WSUS from generating their metadata and clients from scanning against them.  How do you do that?

Manually Declining Updates in the WSUS Console

The most direct way of declining updates in WSUS is to just open its console and do it.  Now wait … I know what you’re thinking … there’s a long-held tradition of never opening the WSUS console on penalty of being labelled ‘unsupported’ by Microsoft.  I had the opportunity to talk to one of the senior support engineers overseeing software updates for Configuration Manager and he called this a myth and challenged me to find this statement in writing by Microsoft.  You may absolutely go into the WSUS console, carefully poke around, run the WSUS cleanup wizard, and decline updates.  You won’t want to though because doing so is tedious and painful.  There is a much better way.

Running the WSUS Cleanup Wizard and Declining Updates with Configuration Manager

With the release of Configuration Manager Current Branch 1511 the product team has added the ability to run the WSUS Cleanup Wizard (see Software updates maintenance).  You can also configure the Software Update Component to decline superseded updates older than a defined number of months (see Supersedence Rules).  A note at the end of that documentation section states:

When the WSUS cleanup task runs, the updates set to Expired in Configuration Manager are set to a status of Declined on the WSUS server and the Windows Update Agent on computers will no longer scan for these updates.

The problem is, that does not appear to actually happen and the reason is simple.  As the documentation for the WSUS Cleanup Wizard states:

Decline superseded updates decline all updates that meet all the following criteria:

  • The superseded update is not mandatory
  • The superseded update has been on the server for thirty days or more
  • The superseded update is not currently reported as needed by any client
  • The superseded update has not been explicitly deployed to a computer group for ninety days or more
  • The superseding update must be approved for install to a computer group

The problem here is that last line.  The WSUS Cleanup Wizard will only decline updates where the newer superseding update has been approved for install.  I have dug into the relevant WSUS stored procedure (spDeclineSupersededUpdates) and verified this requirement.  Guess what Configuration Manager never does?  Approve updates for install in WSUS.  As a result the cleanup wizard will never decline updates based on supersedence in a Configuration Manager environment.  This is why the documentation for supersedence rules is incorrect.  I have twice submitted corrections to the docs team only to have the changes rejected.  I also created a UserVoice item to resolve this issue: When Expiring Updates based on Supersedence Rules also Decline them in WSUS.
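The five criteria read naturally as a single all-or-nothing check.  The Python sketch below is only my paraphrase of the documented list, not the actual logic of spDeclineSupersededUpdates; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SupersededUpdate:
    # All field names are illustrative, not WSUS schema names.
    is_mandatory: bool
    days_on_server: int
    needed_by_any_client: bool
    days_since_last_deployment: Optional[int]  # None = never explicitly deployed
    superseding_update_approved: bool


def wizard_would_decline(update: SupersededUpdate) -> bool:
    """True only when every documented criterion is met."""
    not_recently_deployed = (
        update.days_since_last_deployment is None
        or update.days_since_last_deployment >= 90
    )
    return (
        not update.is_mandatory
        and update.days_on_server >= 30
        and not update.needed_by_any_client
        and not_recently_deployed
        and update.superseding_update_approved
    )


# A long-superseded update in a Configuration Manager environment:
# nothing is ever approved in WSUS, so the last criterion always fails.
in_configmgr = SupersededUpdate(False, 400, False, None, False)
print(wizard_would_decline(in_configmgr))
```

Flip superseding_update_approved to True and the same update qualifies, which is exactly the condition a Configuration Manager environment never satisfies.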

There’s another issue that I have noticed and corroborated with a small number of other Configuration Manager administrators.  The scheduled cleanup via Configuration Manager just doesn’t seem to do anything.  There’s practically no logging for the wizard so it’s hard to tell what’s going on exactly.  However, in every environment where I’ve manually run the wizard for the first time, I have seen thousands of obsolete updates and computers being removed.  Those first runs took hours to perform despite Configuration Manager running the wizard monthly for years.  Your mileage may vary of course but I highly recommend running the WSUS Cleanup Wizard outside of Configuration Manager’s schedule.

Even if the built-in features of Configuration Manager actually worked and declined superseded updates in WSUS that still isn’t enough.  There are hundreds or likely even thousands of non-superseded updates that you will never deploy in your environment.  Do you have Itanium devices?  Are you deploying every channel for Office 365?  Are you deploying every single language of every edition for every version of Windows 10?  Probably not but every single device is requesting and consuming the metadata for all of it.  Configuration Manager is never going to expire those so you would be stuck declining them manually in the WSUS console.  If you do that, you have failed as a modern-day administrator.
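Picking out those never-to-be-deployed updates is mostly a matter of matching on the update title.  Here is a minimal Python sketch of that selection step only; it does not talk to WSUS, the patterns are examples, and the titles and KB numbers in the sample catalog are made up for illustration.

```python
import re

# Example patterns for updates this hypothetical environment never deploys.
NEVER_DEPLOYED = [r"\bItanium\b", r"\bia64\b"]


def select_for_decline(titles, patterns=NEVER_DEPLOYED):
    """Return the titles that match at least one pattern (case-insensitive)."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [t for t in titles if any(c.search(t) for c in compiled)]


# Fictional titles; KB0000000 is a placeholder, not a real article number.
catalog = [
    "Security Update for Windows Server 2008 R2 for Itanium-based Systems (KB0000000)",
    "Cumulative Update for Windows 10 Version 1709 for x64-based Systems (KB0000000)",
]
for title in select_for_decline(catalog):
    print("Would decline:", title)
```

Languages, editions, and Office 365 channels need more involved logic than a title match, but the shape is the same: build a list of what you will never deploy and decline everything on it.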

Using Custom Maintenance Scripts

If manually declining updates in WSUS is failure and Configuration Manager is never going to do everything we want it to then the only option left is to script our way out of this situation.  The articles I referenced above all contain scripts that can help automate WSUS maintenance by declining updates based on certain criteria.  Although I’m sure they work just fine I wasn’t particularly interested in orchestrating multiple different scripts.  Further, they were not particularly extensible in allowing me to decline updates that require more advanced logic to select (e.g., Windows 10 languages and editions).  Lastly, there are other software update maintenance operations that go beyond declining updates, like cleaning Software Update Groups.  As a result I have written and released my own script that performs literally every maintenance task I can think to automate in Configuration Manager: Fully Automate Software Update Maintenance in Configuration Manager.  By running this script I reduced my active update count from 9,768 to 3,333 which is a 66% reduction.  That has a huge impact on the amount of resources needed by WSUS.  To be clear though, I don’t really care what script you use … just do something to decline updates in WSUS.
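For anyone who wants to run the same comparison on their own before and after counts, the arithmetic behind that percentage is short enough to sketch in Python (the variable names are mine):

```python
before, after = 9768, 3333  # active update counts quoted in the text

declined = before - after
reduction = declined / before

print(f"{declined} updates declined, {reduction:.1%} reduction")
```

That works out to 6,435 fewer updates in the catalog that every client has to scan against.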

The Other … Other Kind of WSUS Maintenance

When maintaining the list of updates within WSUS do not forget that WSUS runs on top of a SQL database which itself needs care and feeding outside of just the data it serves.  There’s plenty of info out there on rebuilding WSUS indexes and such but at the end of the day it’s a SQL database like any other and SQL databases should be maintained using Ola Hallengren’s maintenance scripts, in particular his SQL Server Index and Statistics Maintenance script.

5 Comments

  1. Thank you for posting this, more visibility on this is vital for CM Admins imho. Especially the SCCM cleanup gaps, I noticed the exact same… Both in lack of approval causing lack of cleanup but also the “it doesn’t seem to actually do anything” even when it does run.

    We had severe client issues about 6 months into our Win7/CM2012 migration. 30% of all deployments would not occur due to client issues. After a year + of MS support, increasing user policy delays and aggressive client reinstall/fixes we stumbled onto Kents House of Cards post…

    Our problems presented by CM Application policy and deployment portion of the client breaking, package programs worked fine just not apps. Technicians would uninstall/reinstall the client as the only known fix for over a year.

    Our WSUS was a mess since “you don’t touch WSUS”, pretty much overnight our clients and deployments were instantly better after maintaining WSUS now. The horrible part was we mentioned it to our MS reps and they said “oh yeah a wsus engineer posted about that last month”… gee thanks….

    Like you I created my own clean up script… though I found CM does manage the update groups ok unless you are really particular. As long as your decline script is working at least:)

  2. Thanks very much for the script!

    I was wondering what else did you decline to end up with 3333 updates because I’m at ~4500 which is not too bad since it’s running well enough now.

    • Bryan Dam

      December 15, 2017 at 10:24 pm

      Max, check out the post I link to that talks about the script in more detail. There I posted the summary of my first run in production which shows the breakdown. Note that comparing final numbers is pretty hard … environments are likely different ages and we’re probably syncing different products.

  3. “Note that there is no silver bullet for these WSUS problems.”

    Actually, as of October 2017, there IS a silver bullet. Try it for yourself, it really does completely resolve these nagging issues:

    https://blogs.technet.microsoft.com/configurationmgr/2017/08/18/high-cpuhigh-memory-in-wsus-following-update-tuesdays/

    Solution

    A WSUS update is now available that includes improvements for update metadata processing. This update should be applied to all WSUS servers in your environment.

    Windows Server 2016 (KB4039396)
    Windows Server 2012 R2 (KB4041693)
    Windows Server 2012 (KB4041690)
    WSUS 3.0 SP2 (KB4039929)

    • Bryan Dam

      November 30, 2017 at 4:20 pm

      Yep, I link to that very post in the article. It’s not a silver bullet. I just spent the better part of an hour helping someone get my maintenance script running. He’s had that hotfix applied pretty much since it was released and his WSUS server still couldn’t cope. He had 13k+ active updates, 5k clients and 2 GB of RAM … it just wasn’t going to work. I’ve also seen more than one user on Reddit apply that update and still have issues.

© 2018 Dam Good Admin
