Hi, my name is Jeff Milla. I’m a principal program manager with the Office 365 engineering team, and I’m joined by my colleague Paul today. Hi, I’m Paul Kovitch. I’m a senior program manager in the Office 365 engineering team, and we’re here to talk about understanding, optimizing and securing enterprise network connectivity to Office 365.

One of the key things that we’ve learned from customers over the years is that the connectivity strategy they choose is a very important determining factor in the level of quality they receive when consuming Office 365. The connectivity strategy can have a very high impact on the availability of the service, the performance of the service and, essentially, the end-user perception of how good Office 365 is. It’s super important to be aware that, with services that are considered mission-critical to an organization, like phone services that might be provided by Skype for Business or email services that are expected to always be available, the connectivity strategy that an enterprise chooses is going to be a very predictive factor of what end users actually perceive. So today we’re going to spend quite a bit of time talking about our recommendations on connectivity strategy and how to best consume Office 365 when thinking about network architecture within an enterprise.

There’s been a fairly significant shift when we think about moving to cloud services from a more traditional approach, where you might have deployed similar services on-premises in past years. In an on-premises solution you’re typically looking at a self-contained architecture, where you might have a single data center or maybe a couple of data centers that are all owned, operated and managed by the enterprise. All of the networking and connectivity is very much controlled by that same organization, and the considerations are very different compared to
what we’re looking at today in terms of moving to cloud services. There’s been a bit of a shift over the years towards more of an outsourced model, where many customers have taken services like email, SharePoint or maybe VoIP telephony and moved them into a separate data center that might be provided by a third-party service provider. In those types of situations you’ll often have a one-to-one connectivity model, where you’re connecting the enterprise network into that remote data center. Again, it’s a fairly simple process to get that connectivity established, and a fairly simple process to manage and maintain that architecture.

Modern cloud services are generally optimized for delivery over the Internet, so when we start to think about ways to consume a service like Office 365, which is very much optimized for delivery over the Internet, some different considerations come into play. Much of an enterprise’s Internet connectivity is likely through a very specialized security stack.
That stack is managing untrusted traffic to random internet sites. We’ll talk quite a bit about what that does to Office 365 traffic, and about other ways to think about and manage that traffic. Many of the considerations that begin to come into play with cloud connectivity are very different from the historic architecture models, so we’ll talk more about that over the next few minutes.

As I mentioned, moving towards software-as-a-service products like Office 365 is very significantly disrupting traditional connectivity models, the types of architectures that you would have seen in prior years with on-premises deployments or outsourced models. We also see security controls shifting into the cloud and moving away from on-premises deployments, with some of our third-party partners in this space providing things like cloud proxies. We’re also seeing the distance between the Internet and a cloud service provider continue to become shorter and shorter. In other words, cloud service providers like Microsoft are shifting their service front doors closer to the edges of our networks and making the distance between your users and our cloud dramatically shorter. We’re also seeing many more connectivity choices, things like SD-WAN and the cloud proxy options that I mentioned previously.

Certainly, direct connectivity models also come into play, and you need to consider their impact. And we’re also seeing, as I mentioned with some of these cloud proxy and SD-WAN providers, Internet connectivity itself moving to the cloud, and the way in which you consume Internet connectivity changing over time.

What we’ve tried to do is distill our guidance in this space down to four simple connectivity principles. We’re going to walk through these in some detail over the next few minutes, but in summary, the first one is that we find it very important to be able to identify and differentiate Office 365 traffic. In other words, amongst all of the traffic egressing from your network onto the Internet, to other software-as-a-service providers or to Office 365, we want you to be able to detect which traffic is associated with Office 365 so that you can treat it differently. We’ll talk through what we mean by treating it differently as we move on today.

We want you to egress your Office 365 traffic as close to the end user as possible. In other words, get it off the enterprise network and onto that Internet connectivity or, if there’s a direct peering model in place, have that also be as close to the end user as possible. We want that paired with local DNS resolution, for various reasons that we’ll go into in some detail, but essentially so that our cloud services have a good way to detect where that traffic is coming from.
We also want you to avoid network hairpins and ideally ensure that the traffic egressing the enterprise network is jumping onto Microsoft’s network as close as possible to that egress point. Lastly, and very much associated with the first connectivity principle where we’re identifying traffic, we want to be able to treat that Office 365 traffic a little bit differently and, wherever possible, bypass proxies, traffic inspection devices and any potentially extraneous security solutions on the edge of your network, using similar technology within Office 365 to provide the same service. So with that, I’ll hand over to Paul to talk through principle number one in some detail.

So principle number one, as it says: identify and differentiate the Office 365 traffic using the published endpoints data. What does that mean? Well, Microsoft publishes the URLs required for the service, and the associated IPs where possible, through our URL and IP page, and here’s a screenshot of the front of that page. The vast majority of customers will be on the worldwide instance. There are other versions of this: 21Vianet, which is our China version of Office 365, Office 365 Germany, and some US government versions. But, as I say, the vast majority of customers will be using the worldwide instance here. So this is the page to go to in order to understand what is required in terms of URLs and IPs to access the service, and this is also where we publish changes. Understanding the endpoints in here, and how the data is presented, is key to understanding how to handle that traffic.

So here’s an example section, and you can see at the top there the first few rows. Most services require some shared services: portal, authentication, etc.
So those are always listed on a per-service basis here, and in the box at the bottom there’s a key difference between row four and row five. In column two we have the description of what the service is. In column three we have a description of the authentication that’s going to be required and where the traffic is going from and to. The vast majority of these go from the client to Office 365, but there are some reverse flows, and this is where you look for that information. Column four lists the URLs required for that particular element of the service. Column five is where we list whether that service is routable via ExpressRoute. If it says yes, those URLs, when resolved to a public IP address, can be sent via ExpressRoute if your organization is using it for Office 365. The next column indicates the IPs provided; in row four there you can see the Exchange Online IP addresses. Finally, there are the ports required for those endpoints.

Now, there’s a key difference between row four and row five. Row five says no for ExpressRoute for Office 365, because those endpoints are CDNs. They may not live on Microsoft’s infrastructure, and consequently you’ll notice N/A for the IP ranges, because we can’t provide them for those endpoints. CDNs are endpoints where we hold scripts and generic images, non-customer data.

But because of that, they don’t necessarily live on Microsoft infrastructure and we don’t provide IPs. So that’s the differentiation point: we might want to send the ExpressRoute-routable traffic, or the traffic which we provide IPs for, via a direct path, but the URLs in the bottom row there may need to go via a proxy where we can let that traffic out.

So in terms of what you need to understand from that page: rows marked as required are, obviously, required for that particular service to work. Other rows might be marked as optional, because you may or may not need to use that element. An example might be ADFS: if you’re not using it, then you wouldn’t need to open the endpoints for it. As mentioned, if IPs are provided, they are Microsoft-owned endpoints and they live within our infrastructure. If they’re not, then that’s not necessarily the case. All the IPs listed for a row are required for the service to work. We don’t operate regional IP lists in Office 365, so for the example you saw earlier, the Exchange IP range in its entirety needs to be available for that endpoint to be reachable. The same goes for the ports listed there. If a row is marked as a CDN, as mentioned, it might not live on Microsoft infrastructure, and therefore there won’t be IPs for it. The endpoints which we don’t provide IPs for are going to need an unrestricted direct path, or a pathway where the URL is permitted, for example via a proxy.
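The split described here, between rows that come with IP ranges and rows that are URL-only, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `EndpointRow` fields, the sample host names and the 192.0.2.0/24 range (a documentation-only range) are invented for the example and are not the format of the published page.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EndpointRow:
    """One row of endpoint data. Field names are hypothetical, not the published schema."""
    service: str
    urls: List[str]
    ips: List[str] = field(default_factory=list)  # empty for CDN-style rows
    required: bool = True


def plan_paths(rows: List[EndpointRow]) -> Tuple[List[EndpointRow], List[EndpointRow]]:
    """Split required rows into IP-addressable rows and URL-only rows."""
    by_ip, by_url = [], []
    for row in rows:
        if not row.required:
            continue  # optional rows are opened only if that feature is in use
        (by_ip if row.ips else by_url).append(row)
    return by_ip, by_url


rows = [
    EndpointRow("Exchange Online", ["outlook.office365.com"], ["192.0.2.0/24"]),
    EndpointRow("Shared CDN", ["cdn.example.net"]),  # no IPs published
    EndpointRow("Federation", ["sts.example.com"], required=False),
]
direct, url_only = plan_paths(rows)
print([r.service for r in direct])    # candidates for a direct, IP-based path
print([r.service for r in url_only])  # need a URL-permitting path such as a proxy
```

Rows in the first list could be allowed by IP range on a direct path, while rows in the second would need a path where the URL itself is permitted.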
Understanding what those endpoints do allows us to differentiate them and deal with them appropriately.

So how do we manage change in this space? Well, this space changes on a very regular basis. We aim for a cadence of monthly updates, with one month’s notice before those endpoints go live, so a process needs to be in place to monitor those changes and enact them within the environment, for example updating a PAC file with additional URLs. To do that, there’s an RSS feed off that page which allows you to be notified when the page changes; you can then consume those changes and make whichever changes you need within the environment. We also publish example PAC files off the URL page, which give you a list of which URLs are ExpressRoute-routable, a list of all the URLs required for all services, or a split between those that we provide IPs for and those that we don’t. That may make it easy for you to just extract that data, and we update those PAC files as the URLs and IPs are updated. So, in summary: the XML file linked off the URL and IP page is used for device configuration, the RSS feed is used for change notifications, and the HTML page we looked at is for review of those details.

This is how we’ve published this data since Office 365 was released. Customers have reported to us that this URL and IP page is not quite as dynamic as it could be; it’s not something that is easy to script against to ease the administrative challenge of applying those changes to a corporate network. So here at Microsoft we’ve been looking at ways to improve this area. In the near future, early 2018, we will publish a beta version of a tool which will allow customers to consume those changes in a more friendly format, for example in an XML file, in a JSON file and in various other formats, so that your firewall and security teams can ingest that information in the format that’s best for you, perhaps to script and automate changes, or even
allow firewalls to automatically pull in those changes. A pilot for this is aimed at early 2018. If you have any particular feedback on the types of files you’d like, you can feed back directly to us via the current URL and IP page.

Just another point of note where we’re making improvements in this space: work is also ongoing to consolidate the IP ranges that are required for the service into wider ranges. The aim is to reduce the number of IP ranges you have to open to connect to the service, and also to reduce the rate at which you have to make changes within firewalls, etc. As an example, Skype has reduced the number of IPs required by over 50% in the last few months, and this will be ongoing as we move forward through 2018. And with that, we’ll move on to principle number two.

So principle number two involves

egressing Office 365 network traffic as close to the end user as practical, and also ensuring that the way in which DNS is resolved for those users matches that egress point as closely as possible.

Microsoft operates a very extensive network of data centers, as well as the network infrastructure that connects those data centers and connects to the Internet. Within Office 365 we have a number of global regions, which the vast majority of our customers are contained within, and that list of global regions continues to grow. As you’ll see on this slide, there are some announced regions that have not yet launched. We also have a number of sovereign regions: Paul mentioned the China instance operated by 21Vianet, and we also have the US government regions within the US and our Germany region, which are treated a bit differently in terms of how they’re managed and maintained. When we think about how you access your Office 365 service within one of those regions, all of that access is accomplished over the Microsoft global network. We have a very well built network.
It’s one of the top networks in the world. It has extremely high amounts of bandwidth in order to support the demands of not only Office 365 but every Internet-facing service that Microsoft provides, whether those are commercial services like Office 365 or Azure, or consumer services like Xbox Live, Bing, etc. Much of this infrastructure includes Microsoft’s privately owned dark fiber across many of these regions, and in many cases we’re talking about multiple-terabit connections between data centers in order to support things like data replication to ensure high availability of our services across the board. Our overall aim is to get our customers’ network traffic onto the Microsoft global network as quickly as possible, so that we can use all of the investments we’ve made in this network infrastructure to provide the very best possible end-user experience.

This is a visual diagram of Microsoft’s global network. This is not the Internet, even though it may look like some diagrams of the Internet you may have seen in the past. This is Microsoft owned and operated infrastructure around the world that we use to provide a great end-user experience to all of our customers. You may have read in the press in recent months about the launch of the Marea transatlantic subsea cable. It’s a joint venture between Microsoft and some other cloud providers and ISPs to add a very significant amount of bandwidth between the United States and Europe. This is one example of the types of infrastructure investments that Microsoft is making in order to provide a great experience with our services across the board.

So Microsoft’s global network, which we refer to as AS8075 (that’s the autonomous system number used for Internet routing, for those of you who have some familiarity with ASNs and BGP routing), is used to provide presence, peering and backhaul to our customers.
That means presence in terms of our presence around the world, as we showed in some of those visuals a moment ago; peering in terms of our peering strategy with ISPs around the world, as well as some of the direct peering options that we make available; and backhaul in terms of utilizing that network infrastructure to backhaul customer network traffic wherever it needs to go to reach the various portions of our infrastructure that process requests and provide the services we make available as part of Office 365 and other Microsoft services.

We also have a distributed infrastructure that we call front doors. What we mean by that is a set of service capacity that is the first hop into our network that clients connect to. Depending on the workload that we’re talking about, whether it’s Exchange, Skype, SharePoint or some of the other workloads within Office 365, there are different types of service front door infrastructure that exist around the world, essentially on the edge of our network, and we have different ways to ensure that network traffic from our customers is hitting the closest possible service front door, again depending on where that customer network traffic egresses onto our network. We continue to invest heavily in that area as well, with the goal of moving that edge closer and closer to our end users. The reason is that by moving that infrastructure closer we minimize latency and provide a much better experience from an end-user performance perspective.

So when we think about how we deal with enterprises that have multiple sites, obviously, depending on the number of sites that we’re talking about, it may not be feasible to actually have every single site

egress traffic onto the Internet and get that traffic onto Microsoft’s network as close to the user as possible. So it’s important to assess this from a data-driven perspective, and the way that can be done is by looking at latency from the remote location, and at the availability of connectivity from that location to the Internet and then onto Microsoft’s network. It’s certainly advised to do some form of network assessment, whether that’s looking at traceroutes from those remote sites to an endpoint on Microsoft’s network, or even having Microsoft Services come in and do a formal network assessment to advise on the best path forward. The key goal across the board is to get that traffic onto Microsoft’s network as quickly as possible, but obviously you need to balance that with various other aspects that come into play: cost, availability of those links, etc.

Okay. So here we have Contoso, which has three sites in the US, overlaid on Microsoft’s global network. The dots here indicate Microsoft’s local peering points, and the dashed red line is the Contoso corporate network. This, as it looks now, is an ideal scenario: each of those three sites has a yellow dotted line, being the local Internet breakout for its San Francisco office, its Chicago office and its New York office. Therefore the traffic is on Microsoft’s network locally, very quickly, and we can backhaul it to where it needs to go from there. But a level of consolidation is to be expected within a corporate network; it’s not always feasible to have a managed network egress at every location. So the network assessment that Jeff mentioned before would be used to understand: okay, where does it make sense for us to invest in a local egress?
For example, the New York office here may be very well connected to the head office in Chicago, and we only have maybe 50 users in our New York office. Therefore it doesn’t make sense to have a local breakout; the additional latency to backhaul to Chicago is not going to impact our performance. Whereas if we look over on the west coast, at the San Francisco office, there’s a fairly lengthy trip over to the head office to egress. And that data location, where you see that cloud, could be anywhere in the US. It’s important to remember that Office 365 data is not in a single place. Your data is in many places, and getting that traffic onto Microsoft’s network as quickly as possible allows us to route it to where it needs to go. So for that San Francisco office, as an example, it may well be worth investing in a local network egress there rather than backhauling that traffic across the US to egress in Chicago. But obviously that should be a data-driven decision based on the number of users, network capability, where that particular site is, and so on.

When planning network egress to Office 365, it’s also critical to consider local DNS resolution. What we mean by this is that wherever traffic is egressing onto the Internet and onto Microsoft’s global network, customers need to consider having local DNS infrastructure, or potentially utilizing DNS infrastructure that’s associated with the Internet service provider used in that location. If you choose to use ISP DNS infrastructure, it’s also important to work with that ISP to ensure that the DNS servers you’re pointed at are actually fairly close to the geographical location where that egress is occurring. The reason for this is that some Office 365 services use geo-DNS lookup technology to determine where inbound DNS requests to Microsoft are coming from geographically, and then we use that information to point the user to service front doors that
are geographically close to the location of that traffic egressing onto our network. If local DNS infrastructure is not being used, we may make a very poor decision about where that traffic is originating and cause a fair amount of additional latency on each one of those requests coming from each customer. It’s also important to understand that global DNS providers, providers that might advertise that they’re making anycast DNS services available, as an example, in order to route incoming DNS requests to infrastructure around the world, may not always do local lookup. Even though it’s a global service, they may not have service capacity or infrastructure in every physical location where you are egressing traffic onto the Internet or onto Microsoft’s network. So be aware of that, and make sure that wherever possible you have some local DNS infrastructure close to those egress points.
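The data-driven egress decision from the Contoso example can be sketched as a simple rule in Python. Everything numeric here is an assumption made for illustration: the talk only gives the figure of roughly 50 users in New York, so the San Francisco user count, the latency values and both thresholds are hypothetical, and a real assessment would also weigh cost and link availability.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    users: int
    backhaul_ms: float  # extra latency from egressing at a central site instead of locally


def recommend_local_egress(site: Site, max_backhaul_ms: float = 30.0, min_users: int = 100) -> bool:
    """Hypothetical rule: invest in local egress only when the backhaul penalty is
    large and enough users are affected. Thresholds are illustrative, not guidance."""
    return site.users >= min_users and site.backhaul_ms > max_backhaul_ms


sites = [
    Site("New York", users=50, backhaul_ms=20.0),        # small office, well connected
    Site("San Francisco", users=400, backhaul_ms=55.0),  # long trip to the head office
]
decisions = {s.name: recommend_local_egress(s) for s in sites}
print(decisions)
```

Wherever the rule recommends a local egress, the DNS guidance above applies as well: that site would also want DNS resolution close to the same egress point.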

So I’ll hand this over to Paul to walk through principle number three.

So principle number three: avoid network hairpins and optimize connectivity directly onto Microsoft’s global network. So there’s that map again, giving you an example of Microsoft’s global network, the dots being those peering locations where data from your ISP can get onto Microsoft’s global backbone. There’s actually a list you can go to, via an aka.ms link, which will take you to PeeringDB, which lists every location where Microsoft has peering and also who else is at that exchange. There’s a list there, current to late 2017, that shows all the locations around the world where we have those peering points, and many cities in this list will have multiple locations. For example, in London we have more than one location where traffic can ingress to and egress from our network.

So it’s fairly easy to do a quick test and understand how your particular network is peering onto this global network with a simple traceroute to any of the Microsoft-owned endpoints. By Microsoft-owned I mean the ones that we provide IPs for; a good example would be your tenant name followed by .sharepoint.com, or outlook.office365.com. A traceroute from your network will show where the traffic leaves your ISP and gets onto Microsoft’s global backbone. Here’s an example from a home broadband connection in the UK. You can see there that the peering actually happens in hop seven, but the important thing we’re looking for here is that it’s peered onto Microsoft’s network, which is msn.net there in hop eight, in 35 milliseconds. The other highlighted piece is lon04, so that means it’s on Microsoft’s network in London in a reasonable amount of time.
And that’s the important thing I’m looking for: it’s in an expected location with an expected amount of latency. Here’s an example of the peering occurring in France. A colleague of mine took a trace in Paris, and you can see there that in hop seven, in eight milliseconds, at par02, Paris, we’re on that msn.net network, which is Microsoft’s global backbone. Here’s a trace from Miami in Florida. You can see the trace bounces between a number of ISPs, which is perfectly normal; that’s how the Internet works. Then in hop 10 we can see that in 24 milliseconds we’re on Microsoft’s network in mia, Miami, and then we backhaul that traffic across to Europe, which is where this particular endpoint was.

Here’s an example where that hasn’t quite worked as it should. This is a trace from a customer in Scotland, so the peering should happen through London, very similar to the first one we looked at in the UK. You can see in hop six that we’re in London, but we’re still on the ISP’s network. Then in the next hop we see a jump in latency and we’re in New York, still on the ISP’s network. We bounce around New York a little bit, and then by hop twelve, in 87 milliseconds, we hit Microsoft’s network in NYC, and then we have to backhaul that traffic back across the Atlantic again, because the endpoint we’re hitting is in Europe. So that’s an unnecessary trip across the Atlantic. Why? This particular ISP, at the time, wasn’t peering correctly with Microsoft; the only place they knew to do that was New York. So if you see something like this, you can speak to your ISP and ask them to look into it, or use your Microsoft account team and we can go and talk to the ISP for you. In this instance, within a week we changed this peering so that the ISP was handing that traffic to Microsoft in London, thereby reducing the latency and the performance issue that this was causing.
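The check being done by eye in these traces, finding the first hop whose host name is on msn.net and reading off its latency, can be sketched in Python. This is a rough illustration only: it assumes a simplified one-line-per-hop format (hop number, host name, optional bracketed IP, one latency in ms), which is not what every traceroute tool prints, and the host names and addresses in the sample are made up using documentation ranges.

```python
import re
from typing import Optional, Tuple

# Assumed simplified hop format: "<hop> <host> [<ip>] <latency> ms"
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\S+)(?:\s+\[[^\]]*\])?\s+(\d+(?:\.\d+)?)\s*ms")


def first_microsoft_hop(trace: str, suffix: str = "msn.net") -> Optional[Tuple[int, str, float]]:
    """Return (hop number, host name, latency in ms) for the first hop on the given domain."""
    for line in trace.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # skips timeouts ("* * *") and any header lines
        hop, host, latency = int(match.group(1)), match.group(2).lower().rstrip("."), float(match.group(3))
        if host == suffix or host.endswith("." + suffix):
            return hop, host, latency
    return None


sample = """\
 1  192.0.2.1  1 ms
 2  core1.isp.example.net [198.51.100.1]  9 ms
 3  * * *
 4  ae1-0.lon04.ntwk.msn.net [203.0.113.5]  15 ms
 5  be-3-0.ams.ntwk.msn.net [203.0.113.9]  22 ms
"""
result = first_microsoft_hop(sample)
print(result)
```

With a result in hand, the judgment from the talk still applies: check that the location code in the host name and the latency are what you would expect for the site the trace was taken from.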
It’s important to note here as well that, when looking at peering, it’s not necessarily going to be the perfect location for you. It really depends on where your ISP has connectivity. For example, you might be in New York, and we have peering locations in New York, but your ISP may choose to peer in Washington, DC. As long as it’s within a reasonable distance, then that’s to be expected. What we don’t want is the example here, where we’re crossing continents to put the traffic onto Microsoft’s network.

Part of this principle is to avoid hairpinning traffic. In essence, that means sending the traffic to a place only for it to come back again, kind of like the example we just looked at. But in the example here, from a customer, we have a particular connection that goes from San Francisco that

gets out to the Internet in Orlando, which hits the service front door near Miami, and then that connects into the data, in this example in 65 milliseconds. That’s a reasonable amount of time; the service will work well at this level of latency, but it’s not optimal. We have peering locations, as mentioned, all over the globe, so there’s a nearer one to San Francisco: we have one in San Jose. If we had a local network egress we could use that and use a service front door closer to San Jose. In this instance it’s in Seattle, and that brings the latency down to 25 milliseconds. But, as I say, these service front doors are growing in number and location, and a service front door may well become available closer to the user, in San Jose in this example, which further brings the latency down to a possible five milliseconds. Another thing to think about is that Microsoft makes micro-improvements and adjustments where possible, so we may even recognize where you’re connecting from and move the user’s data, within the geopolitical region where your tenant is, closer to where you’re getting the traffic onto our network. So in this instance, there’s a big difference between 65 and five milliseconds to get to your data.

Another example of where we see this happening is when customers use ExpressRoute. In this case, the ExpressRoute circuit might be in Washington, DC, and therefore we’re backhauling from Orlando, in this instance, all the way across to Washington to get to our data, and this pushes the latency above any of those options we’ve seen before. So by putting in a premium network connection like ExpressRoute, but in a limited number of locations, we’re actually increasing the latency and reducing the performance compared to that local egress; this is hairpinning traffic. Another example might be using a forced VPN tunnel to send traffic into our corporate network for our roaming users and then back out again. So where possible, avoid this and get the traffic
out onto Microsoft’s network as quickly as possible. And with that, we’ll look at principle number four.

Principle number four is very much related to principle number one, where we talked about identifying traffic. In principle number four, we’re going to talk about what you can do once you’ve identified that Office 365 traffic to potentially treat it a bit differently from other traffic that you might consider to be just generic Internet traffic egressing the enterprise network and accessing random internet sites, as opposed to accessing actual customer data and using that data in the way that you would within Office 365.

When we think about traditional Internet egress security, it’s actually a fairly significant problem for customers that look to onboard Office 365. The reason is that many of these customers choose to use whatever egress method they have in place to access Office 365. That’s aligned with what we’ve been talking about so far, having that local Internet egress, and it’s certainly the method that’s been used historically by the customer to access anything over the Internet. The challenge is that enterprises typically do not trust Internet traffic by default.
There are varying levels of security solutions applied at the edge of the network in order to better understand what’s actually contained within that traffic, potentially to limit that traffic, to enforce policy on which websites or other Internet resources can or cannot be accessed, and even to monitor the payload of some of those requests. When considering onboarding Office 365 traffic through that security stack, we start to get very concerned about scalability, given the number of long-lived connections that are typically associated with things like Exchange Online, and the volume of requests and bandwidth utilization associated with Office 365 all up. At most of the customers we work with, we see that the Internet security stack deployed at the edge of the customer network is often not already scaled to meet those demands, and costly upgrades are required.

We also find that the proxy infrastructure that historically has been used to provide access to websites, enforce policy and provide some level of monitoring often doesn’t handle UDP traffic properly. It’s extremely important that UDP traffic can pass through to the Internet for Skype media: video, voice, any sort of media traffic associated with Skype. Without the ability to pass UDP traffic to Microsoft’s network through that security stack, the Skype client must actually

Failover to TCP and there’s a very significant very measurable performance impact in doing so One that will be noticed by all the end users within an organization so it’s important to be able to identify that that UDP traffic or that Skype traffic and allow it to to bypass proxy if at all possible obviously as traffic goes through this security stack at the edge of the network any device or any additional Processing that those network packets must go through as the potential to add jitter and latency, which also certainly from a media perspective can can have a measurable impact but even when thinking about things like an outlook connection cache mode Additional latency can have a measurable impact even though much of that That activity that the outlook client is performing behind the scenes You know is is in the background there are aspects of it that will still come to the forefront and can cause end-user complaints So very very important to understand, you know what this traffic is and wherever possible Bypass that that security stack in order to to ensure a high level of performance and a high level of quality particularly for the media traffic When we think about Bypassing the security stack. 
We’re typically talking about needing to trust this traffic Differently than we would trust traditional internet traffic Trusted service services are Generally simpler to to connect to and when we think about the the older connectivity models that I discussed at the beginning of this presentation where we were considering on-premises solutions or Traditional outsourcer type solutions where you have a maybe a single workload And a single site that you’re connecting to All of that is totally trusted because it’s a single implementation of that workload It’s very specific to the to the customer and you essentially trust that as if you would trust any Locally developed application that’s deployed on your network When talking about trust and accessing services on the Internet we often like to use this this diagram that displays visually the the level of complexity typically associated with that that Security stack on the edge of a network as opposed to the the level of trust of the the site that’s being accessed so on the left side of this chart what we’re referring to is Something like a generic internet site any site that a user might be trying to access where you have no control whatsoever over the data that’s presented via that site or you know any sort of malicious activity that might be associated with that site and typically the level of complexity or The amount of control that you want to have over that that traffic is very high so you know as we look at that curve through the the chart very high level of Complexity associated with the security stack for things like generic internet sites the other side of the chart as the level of trust increases that complexity should At least in theory come down as we begin to trust that that traffic or the the payload of that traffic more We need to apply less security at the edge of the network to that. 
So with something like an on-premises solution, obviously you’re going to be very far down towards the trust side. With something like a generic internet site, you’re obviously going to be very high up the chart in terms of complexity. There are different types of technology available to apply to this problem at the edge of the network; these are some examples of the types of devices that are typically applied at various points on that curve of trust.

Office 365 we would like to see somewhere towards the trusted side of that graph. This varies from customer to customer in terms of where they choose to place it on the graph and what that risk assessment looks like within a given IT department or security organization. In general, we want to see that Office 365 traffic as close as possible to the fully trusted, on-premises side of that trust graph, because we provide quite a bit of detail about how that traffic is managed on our network and how we secure it as that traffic moves across the internet. Given that, we’re able to provide quite a bit of assurance that the traffic is not in any way malicious or associated with the things that you might associate with generic internet requests.

As you start to move up that graph, it’s important to remember that things like moving that traffic through proxies, or potentially doing SSL decryption, break and inspect, of that traffic, can all add potentially very high levels of latency that, as I mentioned, can cause quite a bit of degradation in terms of end-user experience.

So what we want you to do is some due diligence as you think about where you’re going to place Office 365 on that curve of trust, and understand all of the different technologies that we make available in Office 365 to secure and manage your data and the traffic associated with accessing your data.

When we think about the features within Office 365, it’s also very important to think through the outcomes that you’re trying to achieve, not necessarily the implementation. When you think about the implementation, the potential ways in which you’re trying to prevent threats, it’s often tempting to apply security both at the edge and within Office 365 itself wherever possible. What we want you to do is think about the overall outcome of applying these security technologies as it affects the actual client experience, the end-user experience, of consuming Office 365, and realize that by adding that additional layer of complexity at the edge of your network, which in many cases duplicates the security that’s already provided within Office 365, you’re potentially having a very negative impact on the overall experience.

These are just some examples of the technologies that we make available as part of the Office 365 suite. We encourage you to take a look at them and fully understand, with your risk management teams, exactly what they are, what benefits they provide, and where they may overlap with technology that you have deployed at the edge of your network today.

So overall, the recommendation: identify the traffic that’s associated with Office 365, secure that traffic with the features within Office 365 that are already part of the suite and already providing that level of assurance to you, and continue to treat generic internet traffic in the same way that you do today.
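
As a rough illustration of differentiating traffic at the edge, here is a minimal Python sketch. It assumes a hypothetical suffix list built only from namespaces mentioned in this talk (outlook.office365.com, sharepoint.com, lync.com); it is not the published endpoint list, and the function names and the "direct" / "security-stack" labels are made up for the example.

```python
# Hypothetical sketch: split Office 365 traffic from generic internet traffic
# at the network edge. The suffix list is illustrative only and is NOT the
# published Office 365 endpoint list.
O365_SUFFIXES = (
    "outlook.office365.com",
    "sharepoint.com",
    "lync.com",
)


def is_office365_host(host: str) -> bool:
    """Return True if host equals, or is a subdomain of, a listed suffix."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in O365_SUFFIXES)


def egress_path(host: str) -> str:
    """Pick an egress treatment (hypothetical labels) for a destination host."""
    if is_office365_host(host):
        # Trusted service: send direct and rely on in-service security.
        return "direct"
    # Generic internet traffic keeps the existing inspection path.
    return "security-stack"
```

In practice the list would come from the published endpoint data rather than being hard-coded, and the decision would live in a PAC file, firewall, or proxy rule rather than in application code.
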
Just differentiate that traffic at your network edge and do the right thing to ensure a high level of end-user experience and a high level of performance.

The challenge with Office 365 is that the overall service is not a single endpoint. It’s not a single service that all of the clients associated with Office 365 access. Office 365 is made up of multiple workloads, like Exchange, SharePoint, and Skype, and all of these have their own separate endpoints that clients access. Additionally, portions of these workloads are provided via things like content delivery networks that are not actually Microsoft owned and that we don’t publish IP addresses for. We know from working with many enterprise customers, as well as smaller customers, that it’s very difficult to consume the broad set of IP addresses that make up all of these services, as well as the fairly long list of namespaces associated with them, keep that up to date, and bypass all of that traffic at the edge of a network.

So in response to that feedback, what we’ve come up with is a URL classification system. What we’re moving towards is a more simplified list of what we consider a very small set of Microsoft-owned core services, or core endpoints, that are associated with the vast majority of the traffic used by an Office 365 customer and that we really want our customers to optimize. What we mean by optimize is: bypass that edge network infrastructure as much as possible. These are things like the core endpoints associated with the big workloads, Exchange,
SharePoint, and Skype, that represent the vast majority of that traffic or the vast majority of the connection volume. There are other portions of the traffic where we want you to allow that traffic, send it direct wherever possible, and not SSL intercept it. And then the broad set of remaining endpoints we consider to be a sort of default category. What we want you to do there is just follow whatever the standard company policy is to get that traffic out onto the internet. Obviously we need to be able to make those connections, but you can apply whatever security is appropriate given existing company policy and not necessarily manage those any differently than you would today.

We’re targeting early 2018 for adjusting our publication of these endpoints, and this will be an adjustment to the publication format that we have for those endpoints compared to what’s out there today. By moving to this new classification system, our goal is to ensure that the endpoints that really impact end-user experience and the performance of the overall Office 365 suite can be accessed in the best possible way. For all those other endpoints that are not as latency sensitive, we want our customers to have the flexibility to manage that traffic in whatever way they need to. What this means is that the set of endpoints or namespaces that need to be treated differently should be a very small set and shouldn’t change that frequently compared to the change rate that we have today in our published endpoint data. So we’re hoping to make it an easier process for our customers to manage this traffic as it egresses the network, and also to provide a much better end-user experience by optimizing that latency-sensitive traffic.

So we’ve looked at those four principles. Let’s see how that applies to real-world traffic. We’ll walk through the three main services here, Skype, Exchange, and SharePoint/OneDrive for Business, and see how these services have changed how they connect to improve performance over the past year or so. By following those principles, you would have been able to abstract yourself from any of those changes and also consume the benefits of those changes instantly.

We’ll start with Skype for Business. We’ll walk quickly through how it signs in and connects, and then we’ll walk through the improvements that have been made. When the Skype client signs in, it connects to a pool in the location of your tenant, and that informs the client of the relay to use for media services for its calls, meetings, and so on. That sign-in is over TCP port 443. It’s not necessarily latency sensitive because it’s not real-time media, but it tells the client where it needs to go to make calls. For the media
traffic, the client will actually make multiple attempts to connect over different methods. It’s quite adaptable as a product in being able to connect out, because obviously users change networks and have different egress models. So the Skype client will try to connect a media call in three ways at the same time.

First, the optimal method is UDP direct, where we’ll send a UDP connection to that relay server, and this is an area where we’ve made improvements over the past year or so. Previously, we’d require ports 50,000 to 59,999 to be open for that to work. Again, if you go to your security team and say, “I want to open these ranges on our firewall,” you may get a funny look from them, and this is where it puts a blocker on that optimal performance. So in recent months the Skype team have re-engineered the product to allow it to connect over just four UDP ports, which again makes it easier to have the optimal connectivity method available. So that’s what Skype tries, and if it gets a response from this, this is what it uses.

The second preferred connectivity model is direct over port 443. By direct, I mean connecting to the public IP address of the relay over port 443. And the least preferred method, as Jeff said before, is going via a proxy over TCP. It’s not something that the service prefers, but it will work, and if the top two methods don’t get a response, then that’s the method that is used. That allows us to connect out in very different network environments, but using the best method possible where it is available.

And just as a reference here, again referring back to that URL and IP page, this is Skype specifically.
We list some of the URLs, the ports required, and what they’re for. You can see there in row two, audio/video and desktop sharing, the URL required, *.lync.com, and then the ports required. It lists the old UDP ports as optional. We prefer it if they’re open; it actually gives some very slight performance improvements on call connectivity, but it’s not necessary. All you need for UDP connectivity is the four ports listed there.

In terms of calls between two users, it’s important to note that in many cases in a corporate network, where we’ve got direct connectivity between two machines on the same flat network, that call will often be peer-to-peer. It would never go out to the Office 365 service, so it doesn’t need to use the internet egress. An example: with two offices connected on a WAN, a call between two users there should just go peer-to-peer between those two.

But if those two users are on different networks split by firewalls and so on, or, in this example, one user is in a hotel in Ibiza and the other user is in the head office in Germany, then this is how Skype connects currently, in early 2018. If we need to make a call across two disparate networks, we need to bounce that call off a media relay server in the location of our tenant. In this instance we’re a US-based customer with a tenant based in the USA, so a call here needs to bounce off a relay server in the USA. This is how the service is designed to work with local egress, and you can see both of those locations have local egress, get onto Microsoft’s network locally, and so on. The service will work fine, but we’ve got an unnecessary trip across the Atlantic here.

So the Skype for Business team are currently, again in early 2018, rolling out something called a transport relay, which allows those two users to have a call, but this time we can bounce the call off a transport relay in the region where those users are. We’ve cut out the need for that transatlantic trip, thereby reducing latency and therefore improving call performance. But for this to work we need those principles of local egress and so on in place, because if that particular customer had an internet breakout in the USA, having a local transport relay in Europe wouldn’t be beneficial to us at all. This is an improvement that is being rolled out slowly by the Skype for Business team as we make sure everything is working as it should, and you should see it applicable to your tenant in the near future.

And a slight segue here, away from how the service connects. We’ve talked about network assessments and how to understand where local egress makes sense. The Skype for Business team have a great tool which outlines the requirements for Skype for Business to deliver a high-quality call or meeting, in terms of latency
to Microsoft’s network, packet loss, inter-arrival jitter, and packet reorder, and they set out clear figures that are needed for the service to work at its highest level. So this is what we need to aim for. The team have a tool, which is available publicly, that essentially mimics a call for 17 seconds, measures those things for you, and tells you whether it passed or failed, with the figures. This is a great tool. For me, if it works for Skype for Business, it’s good for everything else. Skype for Business is the most latency-sensitive, most network-issue-sensitive workload, so if you can get it right for Skype with the tool, everything else will work very nicely over that network as well.

So from a network assessment point of view, we recommend running the tool at your client sites, and also at the network egress for those sites, for one working week, so you can catch the peaks and troughs of user activity, and measuring every five minutes. So run it on a machine at a client site and at the egress. What that allows you to do when you see the results is to say, for example, “We saw high levels of jitter on the client-side trace, but we don’t see that on the egress trace,” which tells us very clearly that the problem exists within our own corporate network and not externally. So again, it allows us to home in on a problem, isolate where it may be, and fix it. So the Skype for Business Network Assessment Tool is a great tool for assessing your network performance.

And with that, we’ll move on to Exchange Online, which connects in a completely different way to Skype.

So when we think about Exchange Online connectivity, there are a couple of different systems that are actually used in tandem, in what the Exchange Online team refers to as active-active.
There’s a geo-DNS system, which I referred to previously when we talked about local DNS resolution associated with those local egress points, which attempts to connect Outlook to a front-end server, or CAFE server (CAFE stands for client access front end), that’s in the region where the user is, ideally as close to the end user as possible. And we’re also now using an anycast system that’s designed to get IP addresses of local front-end servers, but not using geo-DNS. We’ll talk through exactly how this works in a minute in some detail, but the overall point is to get the Outlook client or the Outlook Web App client to connect to a service front door, a CAFE server or another type of front door, as close as geographically possible to the end user, to provide minimal latency and a great experience with the service.

Key for this to work successfully is obviously the fundamental principles that we’ve talked about: getting that traffic out to our network in a way where you’re identifying and differentiating it, egressing it close to the end user, matching DNS resolution for that geo-DNS solution to work, avoiding network hairpins, trying to minimize latency, and wherever possible bypassing proxies and applying only a minimal level of inspection.

So in the case of geo-DNS, this is a visual representation of how the process works. A client on premises, or perhaps a user traveling, pops open the Outlook client and wants to connect to their Exchange Online hosted mailbox. So they’re going to do a DNS query for outlook.office365.com, which is the global namespace that we use for essentially the vast majority of the regions associated with Exchange Online. For some of the sovereign regions within the service we do have a different namespace, but in general outlook.office365.com is the answer. As part of that DNS request, as I mentioned, this is an active-active solution, so some of our responses will end up going to the geo-DNS solution and some will go to the anycast solution. In this particular case we’re doing a geo-DNS resolution, and so we get a DNS response back from internet DNS infrastructure that points the user to a regional namespace for Exchange Online, and that user, who’s currently perceived to be located in the EMEA region, connects to a CAFE server in the EMEA region. Now suppose that user was instead using DNS infrastructure that was in North America.
In that case, we may perceive that user to be physically in North America and send them to a CAFE server in North America, which would obviously add a fair amount of additional latency.

In this particular case, the tenant for the user actually exists in North America, so their data is stored in North America, and from that EMEA-based client access front end (CAFE) server we’re then proxying the connection across Microsoft’s network to the mailbox server in North America where the data is stored. Again, from a client performance perspective, from an end-user experience perspective, the most beneficial thing is for us to have that initial endpoint, that service front door or CAFE server, close to the end user. It’s less important for the mailbox data to be physically close, so we’re optimizing for having that CAFE server regionally close to the end user.

So how do we provide resiliency in this case? Again, in this example we’re doing the same DNS request, and we’re going to try to access a CAFE server in data center two in this diagram. In this particular case, as we’re trying to access CAFE servers in data center two, we’re not getting a response from that initial TCP connection. We’re going to retry that three times. We’re going through the list of IP addresses associated with the regional namespace that we got back from the DNS request, as part of standard behavior within the TCP stack on the user’s workstation and the way the Outlook client takes advantage of that TCP stack. We also received some IP addresses associated with capacity in another data center in that same region. That data center is available, and we are able to connect to it in this example.
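
The failover behavior described above can be sketched roughly as follows. This is an illustration only, not the actual Outlook or TCP stack logic: `connect_with_failover` and `try_connect` are hypothetical names, the addresses are documentation-range placeholders, and applying the three attempts to each address in turn is an assumption about how the retries are distributed.

```python
from typing import Callable, Iterable, Optional


def connect_with_failover(
    addresses: Iterable[str],
    try_connect: Callable[[str], bool],
    attempts_per_address: int = 3,
) -> Optional[str]:
    """Return the first address that accepts a connection, or None.

    Addresses are tried in the order the DNS response listed them; each one
    gets up to `attempts_per_address` attempts before moving to the next.
    """
    for address in addresses:
        for _ in range(attempts_per_address):
            if try_connect(address):
                return address
    return None


# Simulate the front ends in "data center two" being unreachable
# (placeholder addresses from documentation ranges).
unreachable = {"192.0.2.10", "192.0.2.11"}
dns_answer = ["192.0.2.10", "192.0.2.11", "198.51.100.20"]
chosen = connect_with_failover(dns_answer, lambda ip: ip not in unreachable)
```

Here `chosen` ends up as the third address, standing in for capacity in the second data center of the same region. A real client would attempt an actual TCP connection with a timeout instead of the lambda used here.
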
And so we can actually get proxied to the back end via the same process that we described previously, and the end user can connect. So via this system we’re able to lose connectivity to an entire data center’s worth of client access front end servers and still maintain resiliency and availability of the service for our end users.

In the anycast example, it’s slightly different. We’re not using geo-DNS here. We go ahead and do the same DNS request for outlook.office365.com, and the response that comes back actually points us to a single global IP address that’s used via anycast for the solution, and that points us to DNS infrastructure that’s actually co-located with CAFE servers in our data centers. The DNS infrastructure on those CAFE servers will respond with a set of IP addresses that are appropriate for that local region. So instead of detecting where the user is based on a geo-DNS lookup, we’re detecting where they are based on which CAFE server is receiving that DNS request. We get back a set of IP addresses associated with the region, and we connect in via the same process.

When we think about Outlook Web App, it’s a slightly different process compared to the way Outlook connects into the service. In the case of an Outlook Web App client connecting into Office 365, that connectivity is going out to the internet and coming in through a different set of service front doors. These are essentially reverse proxy servers managed as part of what we refer to as the Azure Front Door infrastructure within the Microsoft global network. The endpoint for Outlook Web App is actually a single global IP address that’s resolved via anycast, so the user will be directed to the closest service front door that can be accessed, based on normal internet routing and the way anycast routing works on the internet. The connection is then proxied from there to the location where the end user’s mailbox is being served from. So at a high level it’s a very similar process to the way Outlook connects, but the actual mechanics are all handled via anycast, which we can do because these are short-lived HTTP requests associated with traditional web browsing, as opposed to the long-lived requests that you get with an Outlook client connected into the service, which don’t work quite as well with anycast.

And finally, we’re going to look at SharePoint and how that connects in a completely different way again, but again, by following those principles, we can optimize our connectivity to this service. About a year ago, SharePoint used to work in a very different way to how it works now. You used to have a unicast
connection, where we’d do a DNS lookup for our tenant-name.sharepoint.com and be given the unicast address of the front end of our SharePoint farm. In this instance my user is in EMEA and my SharePoint tenant is in North America, so we’d get the IP address of that SharePoint farm in North America and make a TCP connection all the way from the client to the service in North America. Obviously, if we’ve got the network principles in place, this traffic will flow for the majority of its journey over the Microsoft global backbone, which will deliver optimal performance. But there are issues here. We have to set up a TCP connection over a very high-latency path. We will have slow-start algorithms and so on to run through. If that particular customer or that egress has TCP issues, TCP window scaling disabled for example, they carry the entire way across that long, high-latency connection.

SharePoint Online has now switched globally to an anycast system that uses the same IP globally. So if I look up my tenant, it will come back with that anycast address, 13.107.6.151, which will be the same for any customer tenant. Now when you make a connection to that IP address, the anycast system will attract it, via BGP route advertisement, to the nearest, lowest-cost endpoint: the edge server that is configured for that IP. That then means the front door of the service, where that TCP connection is terminated, should be very close to the user. So in that example from before, I should then hit an anycast front end for SharePoint in EMEA, which means that the long leg back to my tenant in the USA can go over a hot TCP connection. So we’ll walk through how that works here.
The EMEA client does the DNS lookup. We get that anycast IP address back, which will find its way to the nearest front end for the SharePoint service, and then connect through to a hot TCP session between that EMEA location and North America in this instance. So that TCP slow-start algorithm doesn’t have to happen; it’s already done. We’re on an optimized, hot TCP session. We can also fix any suboptimal TCP settings at this location, so that window scaling that was disabled by my proxy is fixed at the edge server, very close to my user, minimizing the impact of that particular problem.
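
To get a feel for why terminating TCP close to the user matters, here is a back-of-the-envelope sketch. It assumes an idealized slow start (congestion window doubling every round trip, no loss), an initial window of 10 segments of 1,460 bytes, and illustrative round-trip times of 150 ms to a distant farm and 10 ms to a nearby edge server; none of these figures come from the talk. It also relies on the fact that without window scaling the TCP window cannot exceed 65,535 bytes, so a single connection is capped at roughly window divided by round-trip time.

```python
import math


def slow_start_round_trips(payload_bytes: int, mss: int = 1460, initial_cwnd: int = 10) -> int:
    """Round trips needed to deliver a payload if cwnd doubles every RTT with no loss."""
    segments = math.ceil(payload_bytes / mss)
    cwnd, sent, rounds = initial_cwnd, 0, 0
    while sent < segments:
        sent += cwnd   # one window's worth of segments per round trip
        cwnd *= 2      # idealized slow start: window doubles each round
        rounds += 1
    return rounds


def window_limited_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection throughput in bytes per second (window / RTT)."""
    return window_bytes / rtt_seconds


payload = 1_000_000  # bytes, illustrative
rounds = slow_start_round_trips(payload)  # 7 round trips under these assumptions
for label, rtt in (("distant farm", 0.150), ("nearby edge", 0.010)):
    ramp_seconds = rounds * rtt
    cap = window_limited_throughput(65_535, rtt)
    print(f"{label}: ~{ramp_seconds:.2f} s to deliver, "
          f"cap ~{cap / 1e6:.2f} MB/s without window scaling")
```

Under these assumed numbers, the same seven round trips cost about fifteen times less wall-clock time against the nearby edge, and the unscaled-window throughput ceiling rises by the same factor. The model ignores the handshake, loss, and receiver limits, so treat it as an order-of-magnitude illustration only.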

So if we actually look at how this works: if I connect to a SharePoint tenant in London, we can see here that it goes through to a SharePoint edge server in Amsterdam. In Sydney, the same thing: an edge server in Sydney. And in Charlotte, we peer in Atlanta and terminate in Ashburn. The point here is that these are the front doors that we mentioned earlier on, which are growing in number. By following those network principles of getting our traffic out locally, as the service evolves, as it improves, as we add capacity in different areas, that front door moves closer and closer to your users. With that change we saw, in some instances, three to ten times increases in download speeds, depending on the particular situation. Across the board we saw improvement with this, and we continue to do so.

So as a customer, if you have those network principles in place, local egress, direct connectivity for the core endpoints, and so on, then these improvements would have instantly become available to you when that change in connectivity was made. And the important thing is that the change was abstracted from you: you wouldn’t have had to do anything to consume it.

Moving away from SharePoint here, just to drive home the point that by following those principles, as the service evolves and improves and we add new capability, the principles will allow you to consume that in an instant. Something we’re piloting at the minute, which should be available later this year, is multi-geo capability, in this example using Exchange Online. Currently, as an enterprise, your mailboxes will be in the location of your tenant.
So for a tenant in North America, all our mailboxes sit there regardless of where our users are. But it’s been a common customer request that users’ data, maybe for our EMEA users, needs to sit in EMEA data centers. Multi-geo will allow that, so you can say, “My users are in EMEA; shift their mailboxes to EMEA.” And by following those network principles, with local network egress, there is no change required to the corporate network infrastructure to facilitate this. So consuming new services, and consuming changes in services, becomes pretty simple for an enterprise to roll out when they need to.

So with that, we’ll wrap up with those four principles. Understand and differentiate those endpoints; as I say, moving forward we’re hopefully going to make it easier for you to do that by highlighting those that are really bandwidth or latency critical, to allow you to optimize those that really matter. Egress your Office 365 data as close to your users as practical, with matching DNS resolution; Outlook is a good example of where that DNS resolution really does need to be where our egress is. Avoid hairpinning, and optimize our connectivity onto Microsoft’s global backbone. And for those core endpoints that we’ve identified in step one, assess where we can bypass our SSL interception, where it really makes sense, where we’ve duplicated that security in the service. Assess whether we can apply DLP, AV scanning, and so on to those particular endpoints, and bypass them away from where they’re going to cause a bottleneck in our security stack.

And with that I’ll wrap up. Thank you very much for watching. I hope it was useful.
