Because computing and networking differ considerably in resource granularity and in
the stability of their status, hierarchical segment routing has been proposed as an
end-to-end CATS process. However, it introduces potential problems, as illustrated in
[I-D.yuan-cats-end-to-end-problem-requirement]. To solve these problems and refine
the hierarchical solution, this draft discusses corresponding aggregation methods
and proposes hierarchical entries.¶
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF). Note that other groups may also distribute working
documents as Internet-Drafts. The list of current Internet-Drafts is
at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 8 January 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Revised BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Revised BSD License.¶
Compared to non-hierarchical routing methods, a hierarchical segment solution has
unique features and imposes the following additional requirements:¶
Aggregate the explicit, detailed information of multiple service instances
appropriately, to avoid loops caused by improper routing and forwarding
decisions that result from inadequately aggregated information.¶
Solve the microloop problem that occurs in hierarchical segment routing when
decisions are made at multiple points and their convergence times are
inconsistent.¶
In IP networks, because the LSDB of an IGP is distributed, microloops can occur when
the IGP converges out of order. Solutions such as Ordered FIB and Ordered Metric have
been proposed, but because they guarantee orderly convergence by controlling the
order in which network devices converge, the convergence process becomes much more
complex and the convergence time increases significantly. Thus, these schemes have
not been widely deployed in networks. Currently, SR technology is commonly used to
address microloop issues, for example by constructing an acyclic SRv6 Segment List
to eliminate loops.¶
However, in IP routing an explicit destination is determined from the source device
onward, whereas in hierarchical segment service routing a specific service instance
may not be designated during the first segment. As a result, the forwarding
behaviours on multiple devices are not connected to each other. Thus, a conventional
SR-based solution requires incremental designs.¶
Therefore, this draft proposes possible aggregation methods, designs hierarchical
entries including global bases and local bases, and introduces a forwarding
behaviour with Computing Segments.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD
NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are
to be interpreted as described in BCP 14 [RFC2119][RFC8174] when, and only when, they appear in all capitals, as
shown here.¶
Assume that a Service ID identifies a computing-related service, as proposed in
[I-D.ma-intarea-identification-header-of-san]. Different computing-related services may be sensitive to
different attributes in the metadata of computing resources and network capabilities,
such as CPU cores, CPU load, I/O, memory, delay, and bandwidth.¶
For a service instance that can provide a computing-related service, the collected
metadata of the sensitive attributes indicates the instance's current performance.
Furthermore, as illustrated in [I-D.yuan-cats-middle-ware-facility], a Metric
value can be calculated with the metadata of the sensitive attributes as input
variables.¶
Therefore, the aggregation of computing resource status information is divided into
the following two categories.¶
As shown below, the set of meta information to which a computing-related
service is sensitive is recorded as an Attribute Set. For instance, the
Attribute Set collected at Instance 1 for Service ID 1 is Attribute Set 1,1.¶
Multiple instances are located in an edge cloud or a central data center
connected to the corresponding PEs. Based on the metadata describing the
dynamic computing status and network conditions of these service instances at
a given time, the corresponding Metric values representing their capabilities
can be calculated. As shown below, Instances 1, 2 and 3 located at PE 1 can
all provide the services represented by Service IDs 1, 2 and 3. Accordingly,
Metric 1,1 to Metric 3,3 are calculated.¶
Based on a framework of hierarchical computing status awareness and segment
service routing, edge devices apply a corresponding aggregation algorithm to
these Metric values and advertise the aggregated results to the network. For
a computing-related service represented by a Service ID, aggregation
algorithms include but are not limited to:¶
Take the average of the corresponding Metric values calculated for a
specific service among all service instances.¶
Take the weighted average of the corresponding Metric values calculated
for a specific service among all service instances.¶
Take the maximum of the corresponding Metric values calculated for a
specific service among all service instances.¶
Take the minimum of the corresponding Metric values calculated for a
specific service among all service instances.¶
Take the median of the corresponding Metric values calculated for a
specific service among all service instances.¶
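The aggregation options listed above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the `metrics` layout, the sample numbers, the `aggregate` function and the per-instance `weights` mapping are all assumptions introduced here and are not defined by this draft.¶

```python
from statistics import mean, median

# Hypothetical Metric values at one PE, laid out as metrics[instance][service_id].
# The numbers are illustrative only.
metrics = {
    "Instance 1": {"Service ID 1": 10.0, "Service ID 2": 40.0},
    "Instance 2": {"Service ID 1": 30.0, "Service ID 2": 20.0},
    "Instance 3": {"Service ID 1": 20.0, "Service ID 2": 60.0},
}

# Unweighted reducers corresponding to the listed options.
REDUCERS = {"average": mean, "max": max, "min": min, "median": median}


def aggregate(metrics, service_id, method="min", weights=None):
    """Aggregate the Metric values of one Service ID across all instances."""
    values = {
        instance: per_service[service_id]
        for instance, per_service in metrics.items()
        if service_id in per_service
    }
    if not values:
        raise KeyError(f"no instance offers {service_id}")
    if method == "weighted_average":
        # `weights` maps an instance name to its (assumed) weight.
        if not weights:
            raise ValueError("weighted_average requires per-instance weights")
        total = sum(weights[instance] for instance in values)
        return sum(v * weights[instance] for instance, v in values.items()) / total
    if method not in REDUCERS:
        raise ValueError(f"unknown aggregation method: {method}")
    return REDUCERS[method](list(values.values()))


# The value a PE would advertise for Service ID 1 under the "min" option.
advertised = aggregate(metrics, "Service ID 1", method="min")
```

Which option is appropriate depends on how the result is later compared with the Metric of an explicit service instance. For example, if a smaller Metric means better performance, the minimum represents the best instance that the PE can actually offer.¶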
Depending on conditions, the per-service evaluation methods can degenerate into a
unified or service-class-based evaluation method. Specifically, different
Service IDs can correspond to the same Metric value calculated by a unified
evaluation algorithm, or a set of Service IDs can correspond to one Metric.¶
The other aggregation method is shown above. For Instance 1 located at PE 1,
the Attribute Sets of Service ID 1 to Service ID 3 are Attribute Set 1 to
Attribute Set 3, respectively. The union of the meta information in these
Attribute Sets of multiple computing-related services is recorded as the
Metadata Set of Instance 1. Similar to the aggregation of Metric values,
edge devices can aggregate multiple Metadata Sets into a Cohesive Metadata
Set and then advertise the aggregated result to the network. For all service
instances at an edge device, aggregation algorithms include but are not
limited to:¶
Take the average value of the same element across Metadata Sets as the
corresponding element value of the Cohesive Metadata Set.¶
Take the weighted average of the same element across Metadata Sets as the
corresponding element value of the Cohesive Metadata Set.¶
Take the maximum value of the same element across Metadata Sets as the
corresponding element value of the Cohesive Metadata Set.¶
Take the minimum value of the same element across Metadata Sets as the
corresponding element value of the Cohesive Metadata Set.¶
Take the median value of the same element across Metadata Sets as the
corresponding element value of the Cohesive Metadata Set.¶
Apply a corresponding strategy to select an instance and directly use
the Metadata Set of the selected instance as the Cohesive Metadata
Set.¶
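A minimal sketch of element-wise aggregation into a Cohesive Metadata Set is given below. The element names (`cpu_load_pct`, `memory_gb`, `delay_ms`), the sample values and both function names are hypothetical. The weighted average is omitted for brevity, since it would additionally need per-instance weights.¶

```python
from statistics import mean, median

# Hypothetical Metadata Sets of three instances at one PE (illustrative values).
metadata_sets = {
    "Instance 1": {"cpu_load_pct": 20, "memory_gb": 8, "delay_ms": 5},
    "Instance 2": {"cpu_load_pct": 60, "memory_gb": 16, "delay_ms": 9},
    "Instance 3": {"cpu_load_pct": 40, "memory_gb": 4},
}

REDUCERS = {"average": mean, "max": max, "min": min, "median": median}


def cohesive_metadata_set(metadata_sets, method="average"):
    """Reduce each element over every Metadata Set that contains it."""
    reduce_values = REDUCERS[method]
    by_element = {}
    for metadata in metadata_sets.values():
        for element, value in metadata.items():
            by_element.setdefault(element, []).append(value)
    return {element: reduce_values(values) for element, values in by_element.items()}


def select_instance_set(metadata_sets, key):
    """Selection strategy: reuse the Metadata Set of the instance minimising `key`."""
    chosen = min(metadata_sets, key=lambda name: key(metadata_sets[name]))
    return dict(metadata_sets[chosen])


cohesive_avg = cohesive_metadata_set(metadata_sets, "average")
# Assumed strategy for the last option: pick the least-loaded instance.
cohesive_selected = select_instance_set(metadata_sets, key=lambda m: m["cpu_load_pct"])
```

Note that element-wise reduction can produce a Cohesive Metadata Set that matches no real instance, whereas the selection strategy always returns the Metadata Set of an existing instance.¶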
In conclusion, a PE can aggregate multiple Metric values or Metadata Sets and
advertise the resulting coarse-grained and relatively stable information to the
network. As analyzed in [I-D.yuan-cats-end-to-end-problem-requirement], an
aggregation result should remain comparable to the information of any explicit
service instance.¶
Applying appropriate aggregation functions makes the exposed entries essentially
correct. However, because forwarding behaviours are indeterminate and entries are
not separated, a microloop can still occur on sudden failures or dynamic updates.
Therefore, this section proposes a design of hierarchical entries.¶
Taking the aggregation of Metric values as an example, the Metric values of several
service instances at an edge device PE are aggregated and advertised in the network.
By collecting and exchanging these entries, a Global RIB is constructed. In
addition, the PE generates a Local RIB by collecting the running status of its local
service instances. Afterwards, scheduling strategies are applied in the control
plane and the corresponding decisions are made. Suppose the entry with the smallest
Metric value is selected as the optimal entry. It is then distributed to the
forwarding plane on the device, where a Global FIB and a Local FIB are generated
respectively, ultimately instructing the packet forwarding process.¶
Typical forms of the GRIB, GFIB, LRIB and LFIB, taking PE 4 as an example, are
displayed as follows. Note that the computing status of PE 4 itself also appears as
an entry in the GRIB, in aggregated form.¶
Although a design of hierarchical entries separates entries with information of
different granularity, both global and local entries still need to be correlated
with packet features and defined forwarding behaviours.¶
As defined in [I-D.zhou-intarea-computing-segment-san],
a Computing Segment is introduced. With the hierarchical entries described in the
previous sections, a Computing Segment END.C can be further classified as END.CG or
END.CL, associated with the GFIB and the LFIB respectively. Apart from the different
entries they look up, END.CG and END.CL have the same semantics as stated in the
previous work. The form of a packet delivered in the forwarding process is also
shown below.¶
As shown above, END.CG(PE 1) and END.CL(PE 4) are Computing Segments configured at PE
1 and PE 4 respectively. END.CG(PE 1) correlates with GFIB at PE 1 while END.CL(PE
4) correlates with LFIB at PE 4. The forwarding process is determined by the SIDs
and corresponding FIBs.¶
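The relationship between the two Computing Segments and their tables can be sketched as a simple dispatch. This is an illustration only: representing a SID as a `(behaviour, node)` tuple, the table contents and the `lookup` function are assumptions, not the encoding or processing rules of [I-D.zhou-intarea-computing-segment-san].¶

```python
# Hypothetical per-device forwarding tables, keyed by device and Service ID.
gfibs = {"PE 1": {"Service ID 1": "PE 4"}}
lfibs = {"PE 4": {"Service ID 1": "Instance 4,4"}}


def lookup(sid, service_id, gfibs, lfibs):
    """Resolve a Computing Segment: END.CG consults the GFIB, END.CL the LFIB."""
    behaviour, node = sid
    if behaviour == "END.CG":
        return gfibs[node][service_id]
    if behaviour == "END.CL":
        return lfibs[node][service_id]
    raise ValueError(f"not a Computing Segment behaviour: {behaviour}")


# END.CG(PE 1) selects the egress PE; END.CL at that PE selects the instance.
egress = lookup(("END.CG", "PE 1"), "Service ID 1", gfibs, lfibs)
instance = lookup(("END.CL", egress), "Service ID 1", gfibs, lfibs)
```

In this sketch the only difference between the two behaviours is the table that is consulted, which mirrors the statement that END.CG and END.CL otherwise share the same semantics.¶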
The following illustrates a microloop problem with non-hierarchical entries. The
minimum Metric value is taken as the aggregated Metric value published by a PE.
Instance 4,4 located at PE 4 has the minimum Metric value, which represents the best
performance, and is therefore considered the most appropriate service instance when
a service request enters at PE 1. With a hierarchical computing status awareness and
routing scheme, the traffic is first steered to PE 4, and then a sudden failure
happens at Instance 4,4. PE 4 discovers that Instance 4,4 is invalid, recalculates
in the control plane and distributes a new FIB entry. PE 4 selects PE 1 as the
updated next hop, so the traffic is steered back to PE 1. However, the remote PE 1
has not yet been notified that Instance 4,4 is invalid. Therefore, a microloop
exists until PE 1 updates its entries.¶
The same scenario with hierarchical entries is analyzed below. A global choice,
indicated by an END.CG SID, is made at the access device PE 1. PE 4, which has the
minimum aggregated Metric value, is selected as the most appropriate next hop. As
before, a sudden failure happens at Instance 4,4 and PE 1 has not been notified yet.
Unlike in the previous case, PE 4 performs a specific local behaviour, denoted by an
END.CL SID, that looks up the LFIB. Although Instance 4,4 has become invalid,
Instance 4,1, which has a suboptimal Metric value, is selected as the substitute. In
contrast to the earlier analysis, the traffic is not steered back between PEs. Thus,
the design of hierarchical entries and the introduction of Computing Segments
prevent the microloop.¶
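The two scenarios can be contrasted with a small simulation. It is a sketch under stated assumptions: the Metric values, the table layouts and both forwarding functions are hypothetical, and the state shown is the moment after Instance 4,4 has failed but before PE 1 has learned of it.¶

```python
# Non-hierarchical case: each PE holds one table mixing local instances and
# remote PEs. PE 4 has removed Instance 4,4; PE 1 still holds the stale
# aggregated Metric 10 for PE 4.
flat_tables = {
    "PE 1": {"Instance 1,1": 30, "PE 4": 10},
    "PE 4": {"Instance 4,1": 40, "PE 1": 30},
}


def forward_flat(tables, ingress, max_hops=6):
    """Every PE independently picks its smallest-Metric entry."""
    path, node = [ingress], ingress
    for _ in range(max_hops):
        best = min(tables[node], key=tables[node].get)
        path.append(best)
        if best.startswith("Instance"):
            return path, False  # delivered to a service instance
        node = best
    return path, True  # hop budget exhausted: the traffic is looping


# Hierarchical case: the GRIB at PE 1 is equally stale, but PE 4 consults only
# its LRIB (END.CL), which already excludes the failed Instance 4,4.
gribs = {"PE 1": {"PE 1": 30, "PE 4": 10}}
lribs = {"PE 4": {"Instance 4,1": 40}}


def forward_hierarchical(gribs, lribs, ingress):
    """One global decision at the ingress (END.CG), one local decision (END.CL)."""
    egress = min(gribs[ingress], key=gribs[ingress].get)
    instance = min(lribs[egress], key=lribs[egress].get)
    return [ingress, egress, instance]


flat_path, looped = forward_flat(flat_tables, "PE 1")
hier_path = forward_hierarchical(gribs, lribs, "PE 1")
```

With these sample values, the flat tables bounce the traffic between PE 1 and PE 4 until the hop budget is exhausted, while the hierarchical lookup terminates at Instance 4,1 even though the GRIB at PE 1 is stale.¶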
Li, C., Du, Z., Boucadair, M., Contreras, L. M., Drake, J., Huang, D., and G. S. Mishra, "A Framework for Computing-Aware Traffic Steering (CATS)", Work in Progress, Internet-Draft, draft-ldbc-cats-framework-02, <https://datatracker.ietf.org/doc/html/draft-ldbc-cats-framework-02>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.