This, the foul year of our lord 2020, has really really gone south on us all real quick. We went from masters of our own domain to empty nesters in record time. As our users flee for home we’ve all had to come to grips with that new reality. There’s been a lot of great info put out there in a very short period of time and in this post I’m going to try to summarize it the best I can. Hopefully it will help lead you through the set of decisions you need to make. I suspect most fires have already been put out by now but with any luck you might find something here to further refine your current strategy.
Maybe It’s Just You
Before we get too deep into this I want you to ask yourself a question: could this all be an elaborate ruse to get you to stop coming into the office? Is there some personal or social hygiene problem here? Have you considered that everyone in your organization might still be going to work at the office like normal and high-fiving themselves that you aren’t there to microwave fish again? I, for one, find it very suspicious that this ‘pandemic’ coincides with pretty much all major video-conferencing tools supporting custom backgrounds. For the sake of argument though, let’s just assume that it’s real and that this post is not in itself part of the ruse.
There’s No Such Thing As Free Lunch
The first thing I want to get out of the way is that there are certain situations that you really just can’t recover from in any kind of automated way. If you sent your users home and you hadn’t already configured either a Virtual Private Network (VPN) or a Cloud Management Gateway (CMG) then you are totally and utterly screwed. Without one of those in place before your devices fled your presence, they have no way to connect to your internal infrastructure and you have no way to reach out to those devices to bring them back into a manageable state.
You might think that sounds crazy (who doesn’t have a VPN?!) but I’ve seen several posts on /r/SCCM from people facing that exact scenario. If that’s you I have no comforting words. If it were me I would be looking at how far my WiFi extends into our parking lot and cooking up something via GPO or ConfigMgr so that they could drive by, connect from the parking lot, and be brought into a managed state using one of the solutions below.
Another way to attack this problem, especially if you couldn’t provide laptops for all your users, is to investigate a virtual desktop solution such as Windows Virtual Desktop (WVD). I know a few people working in the contractor space doing nothing but WVD implementations right now. Maybe it’s only a temporary stop-gap but if you didn’t have hardware for everyone it’s an attractive option.
From here on out I’m going to assume you have at least a VPN.
Getting a Grip on VPN Traffic
This is by far the most immediate pain that administrators faced. Users went home and if they had a VPN then things more or less worked just like they always did. Except for the small fact that nearly all the traffic that usually traversed just their LAN was now going across their corporate internet connection. There’s a whole bunch of things you can do to help minimize the traffic that traverses your corporate pipe. I’m going to try to walk you through the options from easiest/quickest to more long term.
LEDBAT: Just One Click
If your Distribution Points (DPs) are running on Windows Server 2016 or better then stop reading this blog right now and go enable LEDBAT on all of them. Listen, I know you’re lazy so here’s a one-liner to do it globally:
Get-CMDistributionPoint | Set-CMDistributionPoint -EnableLedbat $true
TL;DR: LEDBAT is a sender-side congestion management protocol. It will automatically back off as latency increases. The key here is that it doesn’t care why or where that latency comes from and will work flawlessly across a VPN. It’s almost purpose built. Now look, I know. I know. There’s always that guy. If you have an internet connection where latency isn’t impacted by congestion then LEDBAT isn’t going to do squat.
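To make the back-off behavior a bit more concrete, here’s a minimal sketch of a delay-based window update, loosely simplified from the LEDBAT RFC (6817). This is not what Windows actually runs. The function name, the gain of 1, the floor of two segments, and the one-update-per-round-trip simplification are all assumptions for illustration, as is the 60 ms target delay.

```powershell
# Simplified LEDBAT-style window update (illustrative only, NOT the Windows implementation).
# Get-NextLedbatWindow is a hypothetical helper; the window is counted in segments per round trip.
function Get-NextLedbatWindow {
    param(
        [double]$CurrentWindow,        # current congestion window, in segments
        [double]$QueuingDelayMs,       # measured delay above the lowest delay seen so far
        [double]$TargetDelayMs = 60,   # assumed target delay
        [double]$Gain = 1.0            # assumed gain
    )
    # Positive while we are under the target, negative once we are over it.
    $offTarget = ($TargetDelayMs - $QueuingDelayMs) / $TargetDelayMs
    $next = $CurrentWindow + ($Gain * $offTarget)
    # Never drop below a small minimum window.
    [math]::Max(2.0, $next)
}

Get-NextLedbatWindow -CurrentWindow 10 -QueuingDelayMs 0     # 11: link is idle, speed up
Get-NextLedbatWindow -CurrentWindow 10 -QueuingDelayMs 60    # 10: at target, hold steady
Get-NextLedbatWindow -CurrentWindow 10 -QueuingDelayMs 120   # 9: latency is up, back off
```

The point of the sketch is that the only input is measured delay, which is why it doesn’t matter whether the congestion is on your side of the VPN or the user’s.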
NOT JUST FOR DPs ANYMORE
LEDBAT is an OS-level feature that’s not tied to ConfigMgr in any way. While the product team has given us that ‘one click’ solution for DPs there’s no reason to stop there. Literally any outbound TCP traffic can be managed by LEDBAT. While DP content is the obvious big hitter, what about your Software Update Points (SUPs), Management Points (MPs), or Primary Site Servers (PSSs)?
From Phil Wilcock’s post (here) about enabling LEDBAT on SUPs we get this PowerShell code for enabling LEDBAT on pretty much anything:
Set-NetTCPSetting -SettingName InternetCustom -CongestionProvider LEDBAT
New-NetTransportFilter -SettingName InternetCustom -LocalPortStart #### -LocalPortEnd #### -RemotePortStart 0 -RemotePortEnd 65535
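As a filled-in sketch of those two lines for a SUP, assuming your WSUS site listens on the default ports 8530 and 8531 (that’s an assumption, check your own bindings first). The last two commands just read the settings back so you can see they stuck. Run it from an elevated session on the server.

```powershell
# Assumption: the SUP's WSUS site uses the default ports 8530 (HTTP) and 8531 (HTTPS).
$firstPort = 8530
$lastPort  = 8531

# Point the InternetCustom TCP template at LEDBAT ...
Set-NetTCPSetting -SettingName InternetCustom -CongestionProvider LEDBAT

# ... and send traffic leaving those local ports through that template.
New-NetTransportFilter -SettingName InternetCustom `
    -LocalPortStart $firstPort -LocalPortEnd $lastPort `
    -RemotePortStart 0 -RemotePortEnd 65535

# Read the configuration back to confirm it took effect.
Get-NetTCPSetting -SettingName InternetCustom | Select-Object SettingName, CongestionProvider
Get-NetTransportFilter -SettingName InternetCustom
```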
NOT ALWAYS A CONFETTI FARTING UNICORN
There’s two downsides to LEDBAT that are worth mentioning.
First, if your internet connection is always at capacity then LEDBAT traffic may never happen and your devices will not receive any ConfigMgr content. That’d be ‘bad’.
Second, by design LEDBAT maximizes the use of all available bandwidth. Not 70%, 90%, or even 98%. It will try to keep your internet pipe at as near to 100% as it can get without increasing latency by 60 milliseconds. Your networking team with those alerts set at some stupidly low number? Yea, they’re going to be flipping out. When faced with this, ask them to document the business impact reported by actual users.
Wait, I’m the Network Admin and The Alerts Burns Me!
Sigh. I hate you. Seriously, just get over it. Unless people are complaining that YouTube is slow why are we still talking?
Not over it? Fine. Fine. This is fine.
The German PFEs (Jonas, Roland Spindeler and Stefan Röll) got you covered: Mastering Configuration Manager Bandwidth limitations for VPN connected Clients.
You essentially have two options: setting a max bandwidth limit in IIS or setting up a local Quality of Service (QoS) policy. Setting the max bandwidth in IIS is dead-simple to do but this will impact all traffic. If you still have endpoints running in the office (ex. servers) they’ll be impacted by that same limit. Implementing a local QoS policy is arguably more complex but it allows you to only limit data being sent to the subnets used by your VPN clients.
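For a rough idea of what the QoS route looks like, here’s a minimal sketch to run on the DP. The VPN subnet, the 100 Mbit/s ceiling, and the policy name are all made-up placeholders, and the PFE article may well configure it differently, so treat this as a starting point rather than their recipe.

```powershell
# Assumptions: VPN clients get addresses in 10.99.0.0/16 and 100 Mbit/s is an acceptable ceiling.
$vpnSubnet       = '10.99.0.0/16'
$limitBitsPerSec = 100000000
$policyName      = 'Throttle DP to VPN clients'

# Only traffic headed to the VPN subnet is throttled; in-office endpoints are untouched.
New-NetQosPolicy -Name $policyName `
    -IPDstPrefixMatchCondition $vpnSubnet `
    -ThrottleRateActionBitsPerSecond $limitBitsPerSec

# Read the policy back to confirm it was created.
Get-NetQosPolicy -Name $policyName
```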
Branch Cache: Not Just For P2P Anymore
When I bring up Branch Cache (BC) you might get all worked up about your VPN clients sharing content with each other. Admittedly that’d be terrible, and while it’s unlikely unless you’re letting broadcast traffic flow between VPN clients, that’s not what I’m talking about here. If you enable Branch Cache in local caching mode (docs) then the endpoint will not attempt to contact peers for content nor try to serve content to them. Instead, it will maintain its own local BC cache of de-duped data. When downloading new content from a Branch Cache enabled distribution point it will only download the data not already in the local cache. This can lead to a significant reduction in data without a single endpoint talking to another. It’s dedupe across the wire without any P2P.
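If you want to see what local caching mode looks like on a single test box before rolling it out, the built-in BranchCache cmdlets are enough. This is a manual sketch only; at scale you’d presumably push the setting through whatever mechanism you already use for client configuration.

```powershell
# Run from an elevated session on a test client.
# Enable-BCLocal switches BranchCache on in local caching mode (no peer discovery, no serving peers).
Enable-BCLocal

# Review the resulting state; the client configuration section reports the current mode.
Get-BCStatus
```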
So Much For The Easy Stuff
The things I’ve mentioned above are things that, as a ConfigMgr/MEMCM admin, should be within your control. They should be easy wins you can implement in hours without too much hassle from other departments. Combined they should be more than enough to stop the bleeding.
The things that follow are longer-term solutions that you need to plan for and will likely need to involve other departments in.
Always On VPN
You might have an existing VPN solution but it requires users to log on and as a result it might not be reliable enough to count on for management purposes. If there’s no reason for the user to connect the VPN then they simply won’t. The solution here is to take the user out of the equation and implement a VPN system that always connects when the device is running. Every major VPN vendor I’m aware of has some version of Always On VPN (AoV). Microsoft’s own AoV solution was Direct Access but with the release of Windows 10 it has transitioned to Always On VPN™. Note that most AoV solutions are based on machine certs for authentication so you are likely going to need Public Key Infrastructure (PKI).
One of the reasons I love AoV and consider it the first line of defense is that systems management is probably not your only problem. If you have any on-prem ‘line of business’ apps or security tools their problems get solved too. Need Remote Control Viewer? CMG isn’t going to cut it (yet!) but you should be fine using AoV.
The One Trick ‘BIG PIPE’ Doesn’t Want You To Know: Split-Tunneling
Split-tunneling your VPN allows internet traffic to go over the users’ own internet connection while still tunneling data to your on-prem resources through the secured VPN connection. If your endpoint can get the content from the internet then it will just do that instead of pulling it over your corporate connection … twice.
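Every VPN product does this differently, but if you happen to be on the Windows built-in VPN client a minimal sketch looks like the following. The profile name and the on-prem prefix are placeholders I made up, and third-party clients will have their own knobs entirely.

```powershell
# Assumptions: a Windows built-in VPN profile named 'Corp VPN' and on-prem networks inside 10.0.0.0/8.
$vpnName    = 'Corp VPN'
$corpPrefix = '10.0.0.0/8'

# Stop sending everything through the tunnel ...
Set-VpnConnection -Name $vpnName -SplitTunneling $true

# ... and route only the corporate prefix over the VPN.
Add-VpnConnectionRoute -ConnectionName $vpnName -DestinationPrefix $corpPrefix

# Confirm the profile now reports split tunneling as enabled.
(Get-VpnConnection -Name $vpnName).SplitTunneling
```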
For some more in-depth detail about split-tunneling with ConfigMgr Rob York has another great post: Managing Patch Tuesday with Configuration Manager in a remote work world. If you didn’t read it above, the German PFEs’ article has some great stuff too: Mastering Configuration Manager Bandwidth limitations for VPN connected Clients.
Actually implementing split-tunneling is a very VPN-specific thing so I can’t really provide guidance on how to configure your VPN. If you only want to split off certain ‘trusted’ internet traffic versus all of it then you can use the following URLs provided by the ConfigMgr docs team. If your VPN only supports split-tunneling configuration based on IP addresses you’re out of luck: the content networks used for things such as software updates just don’t make that feasible.
Windows 10 Servicing
Some network administrators are going to shut this conversation down hard. Split-tunneling in effect becomes a bridge between the wild-wild-west of the internet and their safe, secure, and comforting intranet. You honestly may not be able to change their minds but know that on the whole the world has moved on. Here is Microsoft’s Dr. Tom Shinder (current Azure Security PM) talking about this: More on DirectAccess Split Tunneling and Force Tunneling. In short: it’s 2020 and you shouldn’t be trusting that laptop at all. There are better, more modern, ways to protect the endpoint and your company without offering your internet connection as a sacrifice to the gods.
If you can’t or won’t enable split-tunneling in your organization then stop reading right now; nothing I discuss beyond this point will help. Everything that follows is an attempt to pull content from the internet instead of from your on-prem infrastructure. If you don’t split-tunnel then internet traffic is going over your corporate internet connection not once but twice. In which case doing the following might actually make things worse.
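Some back-of-the-napkin math shows why the ‘twice’ matters. The client count and update size below are hypothetical numbers, picked only to make the comparison easy to read.

```powershell
# Hypothetical numbers: 500 remote clients each pulling a 2 GB update.
$clients  = 500
$updateGB = 2

# On-prem DP over the VPN: content leaves your internet connection once.
$fromOnPremDP = $clients * $updateGB

# Internet source with NO split-tunneling: content comes in from the internet,
# then goes straight back out through the tunnel to the client.
$forceTunneled = $clients * $updateGB * 2

# Internet source WITH split-tunneling: content never touches your connection.
$splitTunneled = 0

[pscustomobject]@{
    OnPremDPGB      = $fromOnPremDP    # 1000
    ForceTunneledGB = $forceTunneled   # 2000
    SplitTunneledGB = $splitTunneled   # 0
}
```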
Pull Update Content from the Internet
If you have a VPN and have split-tunneling configured then a very easy kill is to configure your clients and infrastructure to have them download first-party (ie: Microsoft) update content from the internet instead of from your distribution points. This means that multiple GBs of data per client will simply side-step your org’s internet connection.
Rob York wrote up an excellent post that outlines the needed configurations: Managing remote machines with cloud management gateway in Microsoft Endpoint Configuration Manager. Don’t let the mention of CMG throw you off here. All of the configuration Rob talks about except for the whole ‘assign the CMG to your Boundary Group (BG)’ thing directly applies to VPN-only clients as well.
High-level, here’s what you need:
- Be on Current Branch 1902+.
- Define a dedicated Boundary Group for your VPN clients.
- Configure that group to ‘Prefer cloud distribution points over distribution points’ (docs).
- Configure your update deployments to enable ‘If software updates are not available on distribution point in current, neighbor or site boundary groups, download content from Microsoft Update’ (docs).
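The boundary group half of that list can be sketched with the ConfigMgr cmdlets. Everything named here (the address range, the boundary name, the group name) is a placeholder, and parameter names have shifted between module versions, so check Get-Help against your own console before trusting this. The ‘download content from Microsoft Update’ option still has to be set on the deployments themselves.

```powershell
# Run from a ConfigMgr PowerShell session with the site drive selected.
# Assumptions: VPN clients get addresses in 10.99.0.1-10.99.255.254; all names are placeholders.
$boundaryName = 'VPN address pool'
$groupName    = 'VPN Clients'

# Create the boundary that covers the VPN address pool.
New-CMBoundary -Name $boundaryName -Type IPRange -Value '10.99.0.1-10.99.255.254'

# Create the dedicated boundary group and put the boundary in it.
New-CMBoundaryGroup -Name $groupName
Add-CMBoundaryToGroup -BoundaryName $boundaryName -BoundaryGroupName $groupName

# Prefer cloud sources over on-prem DPs for this group.
# (Verify this parameter exists in your module version.)
Set-CMBoundaryGroup -Name $groupName -PreferCloudDPOverDP $true
```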
Note: I’ve seen this not work as intended. The concept here is that when presented with an option between on-prem DPs and Microsoft Update the clients will prefer/prioritize Microsoft Update locations. If you’re not seeing that happen you may need to take slightly more drastic steps. First, contact Premier if you can to ‘fix the problem’ but if you’re desperate there’s another option. Spin up a new DP exclusively for your VPN clients, make sure that DP doesn’t have update content distributed to it, and make sure the deployments do not allow the clients to fail over to neighbor or default boundary groups. Mike Terrill goes over this in detail here: Forcing Configuration Manager VPN Clients to get patches from Microsoft Update.
What About Applications and 3rd party Updates?
If getting first-party Microsoft update content off of your org’s internet pipe isn’t enough then the next step is to implement a Cloud Management Gateway (CMG) (docs) to offload content for applications and 3rd party updates as well. Note that the Cloud Distribution Point (CDP) role is deprecated (docs) in favor of enabling your CMG to act as a CDP by serving data stored in Azure Storage (docs). This will allow you to push your app and 3rd party update content up into Azure and have clients download it from the cloud over the internet instead of from your on-prem DPs.
There’s a very niche benefit to using a CMG to deliver content, possibly first-party update content too: the IP ranges for your CMG aren’t static but the ranges are documented. Sort of. If you are in the ‘I can only configure split-tunneling based on IPs’ camp then Ken Wygant has the goods: Get Azure IP Ranges for Your Cloud Management Gateway.
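As a rough sketch of what pulling those ranges can look like (not necessarily how Ken’s post does it): Microsoft publishes a downloadable ‘Azure IP Ranges and Service Tags’ JSON file, and the regional prefixes can be read out of it. The helper name is hypothetical, and I’m assuming that the AzureCloud tag for your CMG’s region is the one you care about; check that assumption against your own deployment.

```powershell
# Get-AzureRegionPrefix is a hypothetical helper, not part of any Microsoft module.
# Assumption: the published service tags JSON has already been downloaded to $JsonPath.
function Get-AzureRegionPrefix {
    param(
        [Parameter(Mandatory)][string]$JsonPath,   # path to the downloaded service tags file
        [Parameter(Mandatory)][string]$Region      # region short name, e.g. 'eastus'
    )
    $tags = Get-Content -Path $JsonPath -Raw | ConvertFrom-Json

    # Each entry under 'values' is one service tag with its own list of prefixes.
    $tag = $tags.values | Where-Object { $_.name -eq "AzureCloud.$Region" }
    if (-not $tag) { return }

    # Keep only the IPv4 prefixes.
    $tag.properties.addressPrefixes | Where-Object { $_ -notmatch ':' }
}
```

Calling it with something like Get-AzureRegionPrefix -JsonPath .\ServiceTags_Public.json -Region eastus (file name and region are placeholders) gives you a list of prefixes to feed into your VPN’s split-tunnel configuration. The file changes regularly, so whatever you build needs to be re-run on a schedule.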
For regulatory or political reasons (BUT NOT PRICE DAMMIT) you may not be able to implement a cloud-based solution in which case you might look into Internet Based Client Management (IBCM) (docs). Note however that IBCM will not offload application and 3rd party update content from your organization’s internet connection. If you can get IBCM working you can get Windows 10’s Always On VPN working so I have a hard time arguing for IBCM in these situations.
Wait? Doesn’t The New CMG Token Solve All Our Problems?!
One of the Current Branch 2002 features I was excited to see was token-based authentication for cloud management gateway (docs). In particular, the capability to create a bulk-token that can be used to register clients that don’t reliably connect to your network. At first this sounded like a loop-hole for the ‘No Free Lunch’ scenarios I described above. Alas, this new feature understandably requires that the client be Current Branch 2002. If you can update the client then you’ve already ‘fixed the problem’. It’ll come in mighty handy when the next pandemic comes around though.
Tangentially related, if you want a great run-down on your configuration options for CMG I highly recommend this post from Jason Sandys: Cloud Management Gateway Choices.
Intune and Co-Management
I’ve seen several questions on Reddit asking about solving the problem with co-management/Intune. Another version is moving the update or application sliders in an existing co-management configuration. While this is a legitimate strategy, know that it doesn’t get you around the ‘no free lunch’ rule. In order to co-manage a device it needs to be enrolled in Intune which, to do at scale, requires that the box be managed so you can apply a GPO, script, or configuration (docs). If it’s already co-managed the device needs to be able to reach a Management Point to get the new machine policy that tells the client that you’ve moved the workload to Intune. If you can do those things then the box is already manageable. In which case these solutions should be looked at from a longer-term perspective for their own sakes rather than as a fix for your immediate issues.
Can’t I Just Open Port 80 To the Internet?
No. No, man. Shit, no, man. That is the worst idea I’ve ever heard in my life. It’s horrible … this idea. Seriously, just save everyone the hassle and when you do this use it to distribute the ransomware of your choice. ConfigMgr is a very powerful attack vector … at least make them try.
Remember, Keep Your Stick On the Ice
Ok, that’s all I’ve got. I suspect nearly everyone is implementing some of this already and only Adam Gross is implementing it all at the same time (he’s on a #MEMCM Pokémon kick right now). However, I hope you picked up something you can use to refine your current processes.