Network Working Group                                        K. Majumdar
Internet Draft                                                 Microsoft
Intended status: Standards Track                               L. Dunbar
Expires: March 18, 2024                                        Futurewei
                                                      V. Kasiviswanathan
                                                                  Arista
                                                           A. Ramchandra
                                                               Microsoft
                                                      September 18, 2023


                   Multi-segment SD-WAN via Cloud DCs
                 draft-dmk-rtgwg-multisegment-sdwan-02

Abstract

   This document describes methods to optimize the stitching of
   multiple SD-WAN segments at Cloud DC gateways.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on March 18, 2024.

xxx, et al.              Expires March 18, 2024                 [Page 1]

Internet-Draft            Multi-segment SD-WAN

Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Use Cases
      3.1. Multi-segment SD-WAN via Single Cloud GW
      3.2. Multi-segment SD-WAN via Cloud Backbone
      3.3. Analysis of Policy-based Traffic Steering
      3.4. End to End Encryption
   4. Data Plane encoding for SD-WAN Transit
      4.1. GENEVE Header Encoding
      4.2. Multi-Segment SD-WAN Option Class
      4.3. SD-WAN Tunnel Endpoint Sub-TLV
      4.4. SD-WAN Tunnel Originator Sub-TLV
      4.5. Egress GW Sub-TLV
      4.6. Include-Transit Sub-TLV
      4.7. Exclude-Transit Sub-TLV
   5. IPsec Flow through Cloud GWs Illustration
      5.1. Single Hop Cloud GW
      5.2. Multi-hop Transit GWs
      5.3. Data Authentication and Integrity Check by Cloud GW
      5.4. Packet Header Processing
      5.5. Error Handling
   6. Illustration of Traffic from Private VPN to IPsec Tunnel
   7. Control Plane considerations
      7.1. Control Plane for CPEs
      7.2. Control Plane between CPEs and Cloud GWs
   8. Observability Consideration
   9. Security Considerations
      9.1. Threat Analysis
      9.2. HMAC-based Integrity and Authentication
      9.3. AH based Integrity and Authentication
   10. Manageability Considerations
   11. IANA Considerations
   12. References
      12.1. Normative References
      12.2. Informative References
   13. Acknowledgments

1. Introduction

   SD-WAN is widely deployed to connect enterprises' on-premises CPEs
   with services in Cloud DCs. As described in [Net2Cloud], there are
   multiple options for enterprises to connect to Cloud DCs:

   -  Direct Interconnect model,
   -  Direct Interconnect model with the enterprise's own virtual
      appliances in the Cloud,
   -  Indirect Interconnect model via SD-WAN paths, and
   -  Managed Hybrid WAN model using the enterprise's existing VPN
      connections.

   For enterprise branches that have private VPN circuits
   interconnecting with a Cloud GW via an IXP (Internet eXchange
   Point), the enterprise can extend into the Cloud DC without having
   to set up IPsec paths between its on-premises CPEs and the Cloud
   GWs.

   This document describes a method for a Cloud DC gateway (GW) to
   connect multiple SD-WAN segments between the Cloud GW and the
   enterprise's CPEs without the Cloud GW decrypting and re-encrypting
   the payloads.

   By integrating with Cloud operators' gateways, enterprises can gain
   advanced visibility through the Cloud providers' global network
   topology, attachment-level performance metrics, and telemetry data.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   The following acronyms and terms are used in this document:

   Cloud DC:  Off-premises Data Center, managed by a third party, that
              hosts applications, services, and workloads for different
              organizations or tenants.

   CPE:       Customer (Edge) Premises Equipment.

   IXP:       Internet exchange points (IXes or IXPs) are common grounds
              of IP networking, allowing participant Internet service
              providers (ISPs) to exchange data destined for their
              respective networks.
              (https://en.wikipedia.org/wiki/Internet_exchange_point).

   OnPrem:    On-premises data centers and branch offices.

   RR:        Route Reflector.

   SD-WAN:    An overlay connectivity service that optimizes transport
              of IP Packets over one or more Underlay Connectivity
              Services by recognizing applications (Application Flows)
              and determining forwarding behavior by applying Policies
              to them. [MEF-70.1]

   VPN:       Virtual Private Network.

3. Use Cases

3.1. Multi-segment SD-WAN via Single Cloud GW

   For enterprise branches that have established SD-WAN paths to a
   Cloud GW for accessing Cloud services, the Cloud GW can be utilized
   to connect those branches, as shown in Figure 1.

   Here are some reasons for connecting those branches via a Cloud GW:

   -  The public internet among those branches might have limited
      bandwidth, unpredictable connection performance, or be prone to
      cyber-attacks. In comparison, the network paths from CPEs to the
      Cloud GW have more reliable connections and are constantly
      monitored by sophisticated network functions.

   -  It is easier to utilize Cloud-based security functions, such as
      Firewall, DDoS, etc., to apply consistent policy enforcement for
      workloads/services to the Cloud and across the branches.

   -  Cloud-based tools and SaaS (Software as a Service) can be easily
      utilized to collect and analyze threats to the traffic.

                     (^^^^^^^^^^^^)
                     (   Cloud    )
                   ( +----+ +----+ )
           + ------(-|Edge| + GW | )
   Direct  |       ( +----+ +/--\+ )
   Connect |        (^^^^^^^/^^^^\^)
         {-+---}           /      \   SD-WAN Path CPE<->GW
         { VPN }          /        \
         {-+---}         /      IPsec Tunnel
           +-------+----/------+    \
                   |   /       |     \
                  ++--/+       |    +-\--+
                  |CPE1|       +----+CPE2|
                  +----+            +----+
   Client Route:  11.1.1.x          10.1.1.x
                  21.1.1.x          20.1.1.x
                                    30.1.1.x

         Figure 1 Multi-Segment SD-WAN stitching via a Cloud GW

3.2. Multi-segment SD-WAN via Cloud Backbone

   For geographically distant enterprise branches that have established
   SD-WAN paths to their corresponding Cloud GWs to access Cloud
   services in different geographic locations, the Cloud backbone can
   connect those branches, as shown in Figure 2.

   The reasons to utilize the Cloud Backbone to interconnect those
   branches are similar to those for interconnecting multiple branches
   via a single Cloud GW, described in the previous section.

                  (^^^^^^^^^^^^^^^)
                  (     Cloud     )
                 ( +----+  +----+ )                 +-----+
            + ---(-|Edge|==| GW1|===================| GW2 |
   Direct   |    ( +----+  +/--\+ )                 +--|--+
   Connect  |      (^^^^^^^/^^^^\^)                    |
          {-+---}         /      \                     |
          { VPN }        /        \                 +-----+
          {-+---}       /    IPsec Tunnel           |CPE10|
            +-------+--/--------+   \               +-----+
                    | /         |    \              10.2.1.x
                   ++/--+       |    +\---+         20.2.1.x
                   |CPE1|       +----+CPE2|         30.2.1.x
                   +----+            +----+
   Client Route:   11.1.1.x          10.1.1.x
                   21.1.1.x          20.1.1.x
                                     30.1.1.x

       Figure 2 Multi-Segment SD-WAN Stitching via Cloud Backbone

3.3. Analysis of Policy-based Traffic Steering

   There are many well-developed methods, such as SRv6 or MPLS-TE, to
   steer traffic through specific nodes. Those traffic steering methods
   are effective when the entire network domain is under one
   administrative control. However, the traffic from on-premises CPEs
   to Cloud GWs via the public internet can only be forwarded based on
   the packets' destination addresses.

   SD-WAN allows for the setup of multiple links (paths), some of which
   are over the public Internet, from the same SD-WAN branch CPE to a
   Cloud GW; each link (or path) represents a dual tunnel connection
   from a unique public IP of the SD-WAN CPE to two different instances
   of the Cloud GW.

   Using a Cloud GW to interconnect those on-premises CPEs eliminates
   the need to manage the multiple ISPs' links/paths between the CPEs.

3.4. End to End Encryption

   To ensure the confidentiality, integrity, and availability of
   communication among CPEs, the traffic between the CPEs should be
   encrypted by the IPsec SAs if traversing the public Internet.

   When the traffic between the enterprise's CPEs doesn't terminate
   within the Cloud DCs, the processing burden on Cloud GWs can be
   significantly reduced if the Cloud GWs don't need to decrypt and
   re-encrypt transit IPsec encrypted traffic among CPEs.

   This document describes the mechanisms for the IPsec encrypted
   traffic between CPEs to traverse the Cloud GWs without being
   decrypted and re-encrypted by the Cloud GWs.

4. Data Plane encoding for SD-WAN Transit

   Cloud GWs need to differentiate packets destined for their internal
   hosts/services, which require decryption, from transit packets to be
   forwarded to the respective destination branch CPEs. This requires
   proper marking in the packet header.

   As the GENEVE Encapsulation [RFC8926] is supported by most Cloud
   Service Providers, GENEVE is chosen as the encapsulation header for
   Cloud GWs to steer IPsec encrypted packets among CPEs without
   decryption.

4.1. GENEVE Header Encoding

   The GENEVE header shown below is specified by [RFC8926]:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|  Opt Len  |O|C|    Rsvd.  |          Protocol Type        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Virtual Network Identifier (VNI)       |    Reserved   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                    Variable-Length Options                    ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                         Figure 3 GENEVE Header

   The VNI (Virtual Network Identifier) is used to represent the
   Customer Identifier.

   The Protocol Type (16 bits) = 50 (ESP) [RFC4303] indicates that
   IPsec ESP encapsulated data follow the GENEVE header.

4.2. Multi-Segment SD-WAN Option Class

   A new GENEVE Option Class (Type value=TBD) is used to indicate that
   the Multi-segment SD-WAN relevant Sub-TLVs are encoded in the GENEVE
   header.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | multi-seg-SD-WAN Option Class |      Type     |R|R|R| Length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                SD-WAN Tunnel Endpoint Sub-TLV                 ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~           Optional SD-WAN Tunnel Originator Sub-TLV           ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                  Optional Egress GW Sub-TLV                   ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   //                                                             //
   //        Optional Type Length Value objects (variable)        //
   //                                                             //
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 4 Multi Segment SD-WAN Option Class

   Type indicates the various types of multi-segment SD-WAN:

   Type = 1:  Single Hop Transit SD-WAN.
   Type = 2:  Multi-Hop Transit SD-WAN with explicitly specified egress
              Cloud GW.
   Type = 3:  Multi-Hop Transit SD-WAN without specified egress Cloud
              GW.

4.3. SD-WAN Tunnel Endpoint Sub-TLV

   The SD-WAN Endpoint Sub-TLV indicates the destination CPE of the
   IPsec Tunnel.

   For example, for the SD-WAN IPsec SA from CPE1 to CPE2 shown in
   Figure 1, the Tunnel Endpoint Sub-TLV of the GENEVE header carries
   CPE2's IP address.
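
   As a rough illustration of the encodings in Figures 3 and 4, the
   following Python sketch packs the fixed GENEVE header and the
   multi-seg-SD-WAN option header. It is not part of the specification.
   The Option Class value is still TBD, so MULTISEG_OPTION_CLASS is a
   hypothetical placeholder, and the function names are illustrative
   only. The Opt Len and Length fields are counted in 4-byte words, as
   defined for GENEVE in [RFC8926]; the Sub-TLV bytes are taken as an
   opaque, already padded input.

```python
import struct

PROTO_ESP = 50                   # Protocol Type value this document uses for ESP
MULTISEG_OPTION_CLASS = 0xFFFF   # HYPOTHETICAL placeholder: the real value is TBD


def pack_geneve_base(vni: int, opt_len_words: int,
                     protocol_type: int = PROTO_ESP) -> bytes:
    """Pack the 8-byte fixed GENEVE header of Figure 3 (Ver, O, C, Rsvd = 0)."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= opt_len_words < (1 << 6):
        raise ValueError("Opt Len must fit in 6 bits")
    # Byte 0: Ver (2 bits, zero) | Opt Len (6 bits, in 4-byte words).
    # Byte 1: O, C and Rsvd bits, all zero in this sketch.
    # Last word: VNI in the upper 24 bits, Reserved in the low 8 bits.
    return struct.pack("!BBHI", opt_len_words, 0, protocol_type, vni << 8)


def pack_multiseg_option(seg_type: int, sub_tlvs: bytes) -> bytes:
    """Pack the 4-byte option header of Figure 4 followed by the Sub-TLVs."""
    if not 0 <= seg_type <= 0xFF:
        raise ValueError("Type must fit in 8 bits")
    if len(sub_tlvs) % 4 != 0:
        raise ValueError("Sub-TLVs must be padded to a multiple of 4 bytes")
    length_words = len(sub_tlvs) // 4
    if length_words >= (1 << 5):
        raise ValueError("option body too long for the 5-bit Length field")
    # The three R bits stay zero, so the last header byte is just Length.
    header = struct.pack("!HBB", MULTISEG_OPTION_CLASS, seg_type, length_words)
    return header + sub_tlvs


def build_geneve(vni: int, seg_type: int, sub_tlvs: bytes,
                 esp_packet: bytes) -> bytes:
    """Concatenate fixed header, option, and the already encrypted ESP data."""
    option = pack_multiseg_option(seg_type, sub_tlvs)
    return pack_geneve_base(vni, len(option) // 4) + option + esp_packet
```

   For instance, build_geneve(vni, 1, sub_tlvs, esp_packet) would
   correspond to a Type = 1 (Single Hop Transit) packet whose result is
   carried as the UDP payload towards the Cloud GW.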

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |SD-WAN Endpoint|     length    |    Reserved   |      TTL      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     SD-WAN Dst Addr Family    |           Address             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+           (variable)          +
   ~                                                               ~
   |                    SD-WAN end point Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 5 SD-WAN Endpoint Sub-TLV

   TTL is set by the SD-WAN Tunnel Originator, e.g., CPE1. Each transit
   node or transit region/zone (visible to the CPEs) SHOULD decrement
   the TTL so that the destination CPE can know the number of logical
   transit nodes (cloud regions or zones) the packet has traversed.
   Enterprises can also use the TTL to limit the maximum number of
   transit nodes/regions that packets traverse.

4.4. SD-WAN Tunnel Originator Sub-TLV

   The SD-WAN Tunnel Originator Sub-TLV is an optional Sub-TLV inside
   the multi-seg-SD-WAN Option Class that indicates the originating CPE
   of the IPsec Tunnel.

   For example, for the SD-WAN IPsec SA from CPE1 to CPE2 shown in
   Figure 1, the Tunnel Originator Sub-TLV inside the GENEVE header of
   the packets indicates CPE1's address.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | SDWAN Origin  |     length    |            reserved           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     SD-WAN Org Addr Family    |           Address             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+           (variable)          +
   ~                                                               ~
   |               SD-WAN Tunnel Originator Address                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 6 SD-WAN Tunnel Originator Sub-TLV

   The Tunnel Originator Sub-TLV in the GENEVE header can assist Cloud
   transit nodes in applying appropriate policies when forwarding the
   packet.

4.5. Egress GW Sub-TLV

   For the multi-segment SD-WAN via Cloud Backbone scenario, the
   originator CPE can use the Egress GW Sub-TLV to specify the Egress
   Cloud GW for reaching the destination CPE.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |SDWAN EgressGW |     length    |            reserved           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Egress GW Addr Family     |           Address             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+           (variable)          +
   ~                                                               ~
   |                       Egress GW Address                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 7 SD-WAN Egress GW Sub-TLV

   The originator CPE can get the Egress GW address by configuration or
   by a control plane protocol exchange with the destination CPEs. The
   control plane protocol is out of the scope of this document.

4.6. Include-Transit Sub-TLV

   The Include-Transit Sub-TLV is an optional Sub-TLV for explicitly
   including a list of Cloud Availability Regions or Zones for reasons
   like:

   -  Those regions have certain OAM and security functions for
      improved visibility.
   -  To comply with regulations, etc.

4.7. Exclude-Transit Sub-TLV

   The Exclude-Transit Sub-TLV is an optional Sub-TLV for explicitly
   excluding a list of Cloud Availability Regions or Zones for reasons
   like:

   -  To comply with regulations,
   -  To avoid regions that impose certain risks.

5. IPsec Flow through Cloud GWs Illustration

   This section illustrates how Cloud GWs relay traffic flows carried
   by IPsec tunnels.

5.1. Single Hop Cloud GW

   Assume that all CPEs are under one administrative control (e.g.,
   iBGP). Using Figure 1 as an example:

   -  There is a bidirectional IPsec tunnel between CPE1 and the Cloud
      GW, with IPsec SA1 for the traffic from CPE1 to the Cloud GW and
      IPsec SA2 for the traffic from the Cloud GW to CPE1.

   -  There is a bidirectional IPsec tunnel between CPE2 and the Cloud
      GW, with IPsec SA3 for the traffic from CPE2 to the Cloud GW and
      IPsec SA4 for the traffic from the Cloud GW to CPE2.

   -  All the CPEs are under one iBGP administrative domain, with a
      Route Reflector (RR) as their controller. The CPEs notify their
      peers of their corresponding Cloud GW addresses (the notification
      mechanism is out of the scope of this document).

   When 11.1.1.x and 10.1.1.x need to communicate with each other, CPE1
   and CPE2 establish a bidirectional IPsec Tunnel, with SA5 for the
   traffic from CPE1 to CPE2 and SA6 for the traffic from CPE2 to CPE1.

   Assuming the IPsec ESP Tunnel Mode is used, a packet from 11.1.1.1
   to 10.1.1.2 has the following outer header:

   Outer IP header:
   +---------------------------+
   |    protocol = 17(UDP)     |
   |    src = CPE1             |
   |    dst = Cloud GW         |
   +---------------------------+
   |    Source Port =xxxx      |
   | Dst Port = 6081 (GENEVE)  |
   +===========================+
   |       GENEVE Header       |
   |  multi-seg-SD-WAN Option  |
   |  GENEVE Proto = 50 (ESP)  |
   +- - - - - - - - - - - - - -+
   |SD-WAN EndPt SubTLV (CPE2) |
   +---------------------------+ <-----------+
   |SPI(Security Parameter Idx)| Authenticated
   +---------------------------+             |
   |      sequence number      |             |
   +---------------------------+ <-+         |
   |    payload IP header:     |   |         |
   |      src = 11.1.1.1       |   |         |
   |      dst = 10.1.1.2       |   |         |
   +---------------------------+ Encrypted   |
   |       TCP header +        |   |         |
   ~    payload (variable)     ~   |         |
   |                           |   |         |
   +===========================+ <-+ --------+
   |    Authentication Data    |
   +---------------------------+

      Figure 8 Packet header illustration of traffic to Cloud GWs

5.2. Multi-hop Transit GWs

   Traffic to/from geographically separated CPEs can cross multiple
   Cloud DCs via the Cloud backbone.

   The on-premises CPEs are under one administrative control (e.g.,
   iBGP).
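
   The layering shown in Figure 8 implies that a receiver can locate
   the ESP data using only the Opt Len field of the fixed GENEVE
   header. The following Python sketch illustrates that split on the
   UDP payload. It is an illustration under stated assumptions, not a
   normative procedure: the names GeneveFrame, split_geneve, and
   is_transit_esp are hypothetical, and treating Protocol Type = 50 as
   the sole transit indicator is a simplification of the checks a
   Cloud GW would apply.

```python
import struct
from typing import NamedTuple

PROTO_ESP = 50  # Protocol Type value this document uses for ESP


class GeneveFrame(NamedTuple):
    vni: int            # customer identifier carried in the VNI field
    protocol_type: int  # 16-bit Protocol Type from the fixed header
    options: bytes      # raw option bytes (Option Class header + Sub-TLVs)
    payload: bytes      # everything after the options, e.g. the ESP packet


def split_geneve(udp_payload: bytes) -> GeneveFrame:
    """Split a UDP payload received on port 6081 into GENEVE parts."""
    if len(udp_payload) < 8:
        raise ValueError("truncated GENEVE header")
    first, _flags, protocol_type, vni_rsvd = struct.unpack(
        "!BBHI", udp_payload[:8])
    if first >> 6 != 0:
        raise ValueError("unsupported GENEVE version")
    # Opt Len counts 4-byte words and excludes the 8-byte fixed header.
    options_end = 8 + (first & 0x3F) * 4
    if len(udp_payload) < options_end:
        raise ValueError("truncated GENEVE options")
    return GeneveFrame(
        vni=vni_rsvd >> 8,  # drop the low 8 Reserved bits
        protocol_type=protocol_type,
        options=udp_payload[8:options_end],
        payload=udp_payload[options_end:],
    )


def is_transit_esp(frame: GeneveFrame) -> bool:
    """True when the payload is marked as ESP data to be relayed as-is."""
    return frame.protocol_type == PROTO_ESP
```

   The payload bytes are returned untouched, which matches the intent
   that the Cloud GW relays the ESP data without decrypting it.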

   Using Figure 2 as an example:

   -  There is a bidirectional IPsec tunnel between CPE1 and the Cloud
      GW1, with IPsec SA1 for the traffic from CPE1 to the Cloud GW1
      and IPsec SA2 for the traffic from the Cloud GW1 to CPE1.

   -  There is a bidirectional IPsec tunnel between CPE10 and the Cloud
      GW2, with IPsec SA3 for the traffic from CPE10 to the Cloud GW2
      and IPsec SA4 for the traffic from the Cloud GW2 to CPE10.

   -  All the CPEs are under one iBGP administrative domain, with a
      Route Reflector (RR) as their controller. CPEs notify their peers
      of their corresponding Cloud GW addresses.

   When 11.1.1.x and 10.2.1.x need to communicate with each other, CPE1
   and CPE10 establish a bidirectional IPsec Tunnel, with SA5 for the
   traffic from CPE1 to CPE10 and SA6 for the traffic from CPE10 to
   CPE1.

   Assuming the IPsec ESP Tunnel Mode is used, a packet from 11.1.1.1
   to 10.2.1.2 has the following outer header:

   Outer IP header:
   +---------------------------+
   |     proto = 17 (UDP)      |
   |    src = CPE1             |
   |    dst = Cloud GW1        |
   +===========================+
   |       GENEVE Header       |
   |  multi-seg-SD-WAN Option  |
   |  GENEVE Proto = 50 (ESP)  |
   +- - - - - - - - - - - - - -+
   |SD-WAN EndPt SubTLV (CPE10)|
   +---------------------------+
   |      EgressGW-SubTLV      |
   +---------------------------+ <-----------+
   |SPI(Security Parameter Idx)| Authenticated
   +---------------------------+             |
   |      sequence number      |             |
   +---------------------------+ <-+         |
   |    payload IP header:     |   |         |
   |      src = 11.1.1.1       |   |         |
   |      dst = 10.2.1.2       |   |         |
   +---------------------------+ Encrypted   |
   |       TCP header +        |   |         |
   ~    payload (variable)     ~   |         |
   |                           |   |         |
   +===========================+ <-+ --------+
   |    Authentication Data    |
   +---------------------------+

            Figure 9 GENEVE header encapsulated IPsec packet

5.3. Data Authentication and Integrity Check by Cloud GW

   Because the IPsec SA already encrypts the client payload between the
   CPEs, the Cloud GW doesn't need to decrypt and re-encrypt the
   payload when relaying it to the destination CPE. However, data
   authentication and integrity checks are needed as the traffic
   traverses an untrusted network.

   [RFC2403] and [RFC2404] define authentication algorithms used in AH
   and ESP. The SHA-2 family (224/256/384/512) provides examples of
   cryptographic hashing algorithms that can be used as part of a
   Hashed Message Authentication Code (HMAC).

5.4. Packet Header Processing

   In the Figure 1 scenario, upon receiving a GENEVE encapsulated
   packet with the GENEVE Protocol Type = 50 (ESP), the Cloud GW does
   the following:

   -  Authenticate the packet using a preconfigured authentication
      method.

   -  Extract the destination CPE address from the SD-WAN Endpoint
      Sub-TLV inside the GENEVE header. Replace the outer IP
      destination address with the destination CPE address.

   -  Optionally replace the outer IP source address with the Cloud GW
      address.

   -  Leave the GENEVE header unchanged.

   -  Forward the packet to the destination CPE.

   To prevent unauthorized users from using the Cloud services, the
   Cloud GW SHOULD drop all packets whose source addresses or GENEVE
   Sub-TLV values are not recognized or registered.

5.5. Error Handling

   As traffic through the Cloud Backbone consumes precious resources,
   the Cloud GW SHOULD drop packets with invalid or unregistered source
   or destination addresses.

   The Cloud GW SHOULD drop packets originated from an unpaid (or
   unregistered) address (CPE).

   The Cloud GW SHOULD validate the value of the SD-WAN Endpoint
   Sub-TLV and drop the packet if that value is an unpaid (or
   unregistered) address.

6. Illustration of Traffic from Private VPN to IPsec Tunnel

   This section illustrates a Cloud GW connecting client traffic from a
   branch CPE via a Private VPN to another CPE via an IPsec tunnel.

   Using Figure 1 as an example:

   -  CPE1 sends traffic via a Private VPN (Direct Connect to the Cloud
      Edge) to the Cloud GW. The traffic is not encrypted.

   -  There is a bidirectional IPsec tunnel between CPE2 and the Cloud
      GW, with IPsec SA1 for the traffic from CPE2 to the Cloud GW and
      IPsec SA2 for the traffic from the Cloud GW to CPE2.

   -  All the CPEs are under one iBGP administrative domain, with a
      Route Reflector (RR) as their controller. CPEs notify their peers
      of their corresponding Cloud GW addresses.

   Assume the IPsec ESP Tunnel Mode is used for the IPsec SA between
   the Cloud GW and CPE2. For a packet from 11.1.1.1 to 10.1.1.2, CPE1
   adds the following header when sending over the Private VPN:

   Outer IP header:
   +---------------------------+
   |     proto = 17 (UDP)      |
   |    src = CPE1             |
   |    dst = Cloud GW         |
   +===========================+
   |       GENEVE Header       |
   |  multi-seg-SD-WAN Option  |
   |GENEVE Proto =TCP/UDP/etc. |
   +- - - - - - - - - - - - - -+
   |SD-WAN EndPt SubTLV (CPE2) |
   +---------------------------+ <-+
   |    payload IP header:     |   |
   |      src = 11.1.1.1       |   |
   |      dst = 10.1.1.2       |   |
   +---------------------------+ Not Encrypted
   |       TCP header +        |   |
   ~    payload (variable)     ~   |
   |                           |   |
   +===========================+ <-+

             Figure 10 Illustration of packet through VPN

   Upon receiving the GENEVE encapsulated packet with the
   "Multi-Segment-SD-WAN" option, the Cloud GW extracts the destination
   CPE from the GENEVE header and encrypts the packet with IPsec SA2 to
   forward it to the destination (i.e., CPE2). The GENEVE header is
   carried to CPE2.

   Outer IP header:
   +---------------------------+
   |     proto = 17 (UDP)      |
   |    src = Cloud GW         |
   |    dst = CPE2             |
   +===========================+
   |       GENEVE Header       |
   |  multi-seg-SD-WAN Option  |
   |  GENEVE Proto = 50 (ESP)  |
   +- - - - - - - - - - - - - -+
   |SD-WAN EndPt SubTLV (CPE2) |
   +---------------------------+ <-----------+
   |SPI(Security Parameter Idx)| Authenticated
   +---------------------------+             |
   |      sequence number      |             |
   +---------------------------+ <-+         |
   |    payload IP header:     |   |         |
   |      src = 11.1.1.1       |   |         |
   |      dst = 10.1.1.2       |   |         |
   +---------------------------+ Encrypted   |
   |       TCP header +        |   |         |
   ~    payload (variable)     ~   |         |
   |                           |   |         |
   +===========================+ <-+ --------+
   |    Authentication Data    |
   +---------------------------+

      Figure 11 Illustration of packet from the Egress Cloud GW

7. Control Plane considerations

7.1. Control Plane for CPEs

   The control plane enables SD-WAN edges to discover each other's
   properties and attached routes. The on-premises CPEs and their vCPEs
   (or Virtual Appliances in the Cloud DC) can be controlled by one
   iBGP instance. [SD-WAN-Edge-Discovery] describes the mechanism for
   SD-WAN edges to discover each other's properties. The IPsec key
   exchange between on-premises CPEs and the vCPE is carried in iBGP
   UPDATE messages through the RR [SD-WAN-Edge-Discovery].

7.2. Control Plane between CPEs and Cloud GWs

   It is common to have eBGP sessions between enterprise CPEs and the
   Cloud GWs.

   An enterprise-owned vCPE can establish an eBGP session with the
   Cloud VPN GW for accessing the workloads hosted in the Cloud DCs. If
   an IPsec tunnel is required between the Cloud DC GW and the vCPE,
   the full IKEv2 exchange must be performed between the vCPE and the
   Cloud GW.

8. Observability Consideration

   This section is intended to describe some metrics that enterprises
   can get from Cloud providers for the transit traffic.

   To be added.

9. Security Considerations

9.1. Threat Analysis

   The information carried in the GENEVE header (Figure 3) is not
   encrypted and is therefore susceptible to Man-in-the-Middle (MitM)
   attacks. An attacker can intercept and potentially alter the
   information in the GENEVE header between the branch CPEs and the
   Cloud GWs without the enterprise's or the Cloud provider's knowledge
   or consent.

   Here is the threat analysis of MitM attacks between CPEs and Cloud
   GWs:

   a) Eavesdropping: Attackers can learn the enterprise's branch
      locations and their respective contracted Cloud GWs. As the
      payload between the CPEs is encrypted, attackers can't get any
      data exchanged between CPEs. This threat is no different from
      direct IPsec SAs between two CPEs.

   b) Data Manipulation: Attackers alter the content (Sub-TLVs) in the
      GENEVE header. As packets with unrecognized source addresses or
      invalid values in the Sub-TLVs of the GENEVE header are dropped
      by Cloud GWs, there might be a higher packet drop rate between
      the CPEs. Packet drop is not a new problem. Transport protocols
      used by applications, such as TCP or QUIC, can handle packet
      drops well.

   c) Potential stealing of Cloud Backbone bandwidth: A threat actor
      might want to leverage Cloud Backbones to transport its own
      traffic between two locations without paying for the services.
      For example, a legitimate Cloud subscriber pays for the Cloud
      Backbone transport services for traffic between CPE-A and CPE-B.
      The attacker, who has two locations far apart (say Node-A and
      Node-B), can use CPE-A's address as the source address and CPE-B
      as the value in the SD-WAN Endpoint Sub-TLV for a packet from
      Node-A to Node-B before it reaches the ingress Cloud GW. When the
      packet is sent from the egress Cloud GW via the Internet towards
      CPE-B, the actor can change the source address back to Node-A and
      the destination address to Node-B.
      By doing so, Node-A and Node-B can maintain the IPsec tunnel via
      the Cloud Backbone without paying for the service.

   Therefore, it is necessary to have some level of data integrity and
   authentication for traffic between CPEs and Cloud GWs even though it
   is not necessary for Cloud GWs to decrypt and re-encrypt the payload
   between CPEs.

9.2. HMAC-based Integrity and Authentication

   HMAC (Hash-Based Message Authentication Code) can be used to ensure
   the integrity and authenticity of data between CPEs and Cloud GWs,
   verifying that the GENEVE header has not been tampered with during
   transmission via the public Internet.

   The basic idea behind HMAC is to combine a secret key and a hash
   function to produce a fixed-size authentication code for the GENEVE
   header between CPEs and the Cloud GW. This authentication code is
   then sent along with the data itself.

   When the Cloud GW and the destination CPEs receive the data and the
   authentication code, they can independently compute the HMAC using
   the same key and hash function. If the computed HMAC matches the
   received authentication code, it indicates that the data has not
   been altered, as long as the secret key remains confidential.

   The HMAC authentication code can be carried by an HMAC Sub-TLV in
   the GENEVE header, as specified below:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |MultiSDWAN-HMAC|     length    |            reserved           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                                                               ~
   |      HMAC Authentication Code for entire GENEVE Header       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Figure 12 Multi Segment SD-WAN HMAC Sub-TLV

   The HMAC Authentication Code, a.k.a. the HMAC hash value, is
   computed over all the bytes of the GENEVE header, with the value
   field of the MultiSDWAN-HMAC Sub-TLV set to 0.

9.3. AH based Integrity and Authentication

   Enterprises or Cloud providers that are concerned about secret HMAC
   keys being compromised can add another layer of AH [RFC4301] or
   ESP-NULL [RFC2410] [RFC6071] protection on top of the IPsec
   encryption between the two CPEs.

   Both AH and ESP-NULL require pairwise IPsec key management between
   Cloud GWs and the CPEs, and therefore require more processing on
   Cloud GWs and CPEs. In addition, AH-protected packets can't traverse
   NAT, because AH authenticates the outer IP addresses that NAT
   changes.

10. Manageability Considerations

   To be added.

11. IANA Considerations

   IANA is requested to assign a new GENEVE Option Class from the IETF
   Review range as shown below:

   Option Class   Description            Assignee/Contact   Reference
   ------------   --------------------   ----------------   ---------------
   tbd            Multi Segment SD-WAN   IETF               [this document]

   IANA is requested to create a registry as below, with the initial
   values shown, in the Multi Segment SD-WAN Geneve Option Class
   registry group:

   Registry: Multi Segment SD-WAN Sub-TLVs
   Assignment Policy: IETF Review
   Reference: [this document]

   Sub-TLV Type   Description          Reference
   ------------   ------------------   ---------------
   0              Reserved
   1              SD-WAN Endpoint      [this document]
   2              SD-WAN Originator    [this document]
   3              SD-WAN Egress GW     [this document]
   4              Multi SD-WAN-HMAC    [this document]
   5-254          Unassigned
   255            Reserved

12. References

12.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
             Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI
             10.17487/RFC4271, January 2006.

   [RFC4301] S. Kent and K. Seo, "Security Architecture for the
             Internet Protocol", RFC 4301, Dec. 2005.

   [RFC4303] S. Kent, "IP Encapsulating Security Payload (ESP)",
             RFC 4303, Dec. 2005.

   [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
             "Multiprotocol Extensions for BGP-4", RFC 4760, DOI
             10.17487/RFC4760, January 2007.

   [RFC7296] C. Kaufman, et al, "Internet Key Exchange Protocol Version
             2 (IKEv2)", RFC 7296, Oct. 2014.

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
             2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
             May 2017.

   [RFC8926] J. Gross, et al, "Geneve: Generic Network Virtualization
             Encapsulation", RFC 8926, Nov 2020.

   [RFC9012] Patel, K., Van de Velde, G., Sangli, S., and J. Scudder,
             "The BGP Tunnel Encapsulation Attribute", RFC 9012, DOI
             10.17487/RFC9012, April 2021.

12.2. Informative References

   [RFC2410] R. Glenn and S. Kent, "The NULL Encryption Algorithm and
             Its Use with IPsec", RFC 2410, Nov. 1998.

   [RFC6071] S. Frankel and S. Krishnan, "IP Security (IPsec) and
             Internet Key Exchange (IKE) Document Roadmap", RFC 6071,
             Feb. 2011.

   [RFC8192] S. Hares, et al, "Interface to Network Security Functions
             (I2NSF) Problem Statement and Use Cases", RFC 8192, July
             2017.

   [RFC5512] P. Mohapatra, E. Rosen, "The BGP Encapsulation Subsequent
             Address Family Identifier (SAFI) and the BGP Tunnel
             Encapsulation Attribute", RFC 5512, April 2009.

   [RFC9061] Marin-Lopez, R., Lopez-Millan, G., and F.
             Pereniguez-Garcia, "A YANG Data Model for IPsec Flow
             Protection Based on Software-Defined Networking (SDN)",
             RFC 9061, DOI 10.17487/RFC9061, July 2021.

   [CONTROLLER-IKE] D. Carrel, et al, "IPsec Key Exchange using a
             Controller", draft-carrel-ipsecme-controller-ike-01,
             work-in-progress.

   [MEF-70.1] MEF 70.1 SD-WAN Service Attributes and Service Framework.
             Nov. 2021.

   [Net2Cloud] L. Dunbar and A. Malis, "Dynamic Networks to Hybrid
             Cloud DCs Problem Statement",
             draft-ietf-rtgwg-net2cloud-problem-statement-29, Aug.
             2023.

   [SD-WAN-Edge-Discovery] L. Dunbar, et al, "BGP UPDATE for SD-WAN
             Edge Discovery", draft-ietf-idr-sdwan-edge-discovery-10,
             June 2023.

13. Acknowledgments

   Acknowledgements to Donald Eastlake, Aseem Choudh, Stephen Farrell
   for their review and suggestions.

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Linda Dunbar
   Futurewei
   Email: ldunbar@futurewei.com

   Kausik Majumdar
   Microsoft
   Email: kmajumdar@microsoft.com

   Venkit Kasiviswanathan
   Arista
   Email: venkit@arista.com

   Ashok Ramchandra
   Microsoft
   Email: aramchandra@microsoft.com

Contributors' Addresses