Many Internet of Things (IoT) applications have requirements that cannot be satisfied by traditional cloud-based systems (i.e., cloud computing). These include time sensitivity, data volume, connectivity cost, operation in the face of intermittent connectivity, privacy, and security. As a result, IoT is driving the Internet toward edge computing. This document outlines the requirements of the emerging IoT Edge and its challenges. It presents a general model and major components of the IoT Edge to provide a common basis for future discussions in the T2TRG and other IRTF and IETF groups. This document is a product of the IRTF Thing-to-Thing Research Group (T2TRG).¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 18 March 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.¶
Currently, many IoT services leverage cloud computing platforms, because they provide virtually unlimited storage and processing power. The reliance of IoT on back-end cloud computing provides additional advantages such as scalability and efficiency. Today's IoT systems are fairly static with respect to integrating and supporting computation. It is not that there is no computation, but that systems are often limited to static configurations (edge gateways and cloud services).¶
However, IoT devices generate large amounts of data at the edges of the network. To meet IoT use case requirements, data is increasingly being stored, processed, analyzed, and acted upon close to the data sources. These requirements include time sensitivity, data volume, connectivity cost, resiliency in the presence of intermittent connectivity, privacy, and security, and they cannot be addressed by centralized cloud computing alone. A more flexible approach is necessary to address these needs effectively: distributing computing (and storage) and seamlessly integrating it into the edge-cloud continuum. We refer to this integration of edge computing and IoT as "IoT edge computing". This document describes the related background, use cases, challenges, system models, and functional components.¶
Owing to the dynamic nature of the IoT edge computing landscape, this document does not list existing projects in this field. Section 4.1 presents a high-level overview of the field, based on a limited review of standards, research, open-source and proprietary products in [I-D.defoy-t2trg-iot-edge-computing-background].¶
This document represents the consensus of the Thing-to-Thing Research Group (T2TRG). It has been reviewed extensively by the Research Group (RG) members who are actively involved in the research and development of the technology covered by this document. It is not an IETF product and is not a standard.¶
Since the term "Internet of Things" (IoT) was coined by Kevin Ashton in 1999 while working on Radio-Frequency Identification (RFID) technology [Ashton], the concept of IoT has evolved. It now reflects a vision of connecting the physical world to the virtual world of computers using (often wireless) networks over which things can send and receive information without human intervention. Recently, the term has become more literal by connecting things to the Internet and converging on Internet and Web technologies.¶
A Thing is a physical item made available in the IoT, thereby enabling digital interaction with the physical world for humans, services, and/or other Things ([I-D.irtf-t2trg-rest-iot]). In this document we will use the term "IoT device" to designate the embedded system attached to the Thing.¶
Resource-constrained Things such as sensors, home appliances, and wearable devices often have limited storage and processing power, which can raise challenges with respect to reliability, performance, energy consumption, security, and privacy [Lin]. Other, less resource-constrained Things can generate voluminous amounts of data. This range of factors has led to IoT designs that integrate Things into larger distributed systems, for example, edge or cloud computing systems.¶
Cloud computing has been defined in [NIST]: "cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction". The low cost and massive availability of storage and processing power enabled the realization of another computing model, in which virtualized resources can be leased in an on-demand fashion and be provided as general utilities. Platform-as-a-Service and cloud computing platforms widely adopted this paradigm for delivering services over the Internet, gaining both economic and technical benefits [Botta].¶
Today, an unprecedented volume and variety of data is generated by Things, and applications deployed at the network edge consume this data. In this context, cloud-based service models are not suitable for some classes of applications which require very short response times, access to local personal data, or generate vast amounts of data. These applications may instead leverage edge computing.¶
Edge computing, also referred to as fog computing in some settings, is a new paradigm in which substantial computing and storage resources are placed at the edge of the Internet, close to mobile devices, sensors, actuators, or machines. Edge computing happens near data sources [Mahadev], as well as close to where decisions are made or where interactions with the physical world take place ("close" here can refer to a distance which is topological, physical, latency-based, etc.). It processes both downstream data (originating from cloud services) and upstream data (originating from end devices or network elements). The term "fog computing" usually represents the notion of multi-tiered edge computing, that is, several layers of compute infrastructure between end devices and cloud services.¶
An edge device is any computing or networking resource residing between end-device data sources and cloud-based data centers. In edge computing, end devices consume and produce data. At the network edge, devices not only request services and information from the Cloud but also handle computing tasks including processing, storage, caching, and load balancing on data sent to and from the Cloud [Shi]. This does not preclude end devices from hosting computation themselves, when possible, independently or as part of a distributed edge computing platform.¶
Several standards developing organizations (SDOs) and industry forums have provided definitions of edge and fog computing:¶
Based on these definitions, we can summarize a general philosophy of edge computing as distributing the required functions close to users and data, while the difference from classic local systems is the use of management and orchestration features adopted from cloud computing.¶
Actors from various industries approach edge computing using different terms and reference models although, in practice, these approaches are not incompatible and may integrate with each other:¶
IoT edge computing can be used in home, industry, grid, healthcare, city, transportation, agriculture, and/or educational scenarios. Here, we discuss only a few examples of such use cases, to identify differentiating requirements, providing references to other use cases.¶
Smart Factory¶
As part of the 4th industrial revolution, smart factories run real-time processes based on IT technologies, such as artificial intelligence and big data. Even a very small environmental change in a smart factory can lead to a situation in which production efficiency decreases or product quality problems occur. Therefore, simple but time-sensitive processing can be performed at the edge, for example, controlling the temperature and humidity in the factory, or operating machines based on the real-time collection of the operational status of each machine. However, data requiring highly precise analysis, such as machine lifecycle management or accident risk prediction, can be transferred to a central data center for processing.¶
The use of edge computing in a smart factory can reduce the cost of network and storage resources by reducing the communication load to the central data center or server. It is also possible to improve process efficiency and facility asset productivity through real-time prediction of failures and to reduce the cost of failure through preliminary measures. In the existing manufacturing field, production facilities are manually run according to a program entered in advance; however, edge computing in a smart factory enables tailoring solutions by analyzing data at each production facility and machine level. Digital twins [Jones] of IoT devices have been jointly used with edge computing in industrial IoT scenarios [Chen].¶
Smart Grid¶
In future smart city scenarios, the Smart Grid will be critical in ensuring highly available/efficient energy control in city-wide electricity management. Edge computing is expected to play a significant role in these systems to improve the transmission efficiency of electricity, to react to, and restore power after a disturbance, to reduce operation costs, and to reuse energy effectively, since these operations involve local decision-making. In addition, edge computing can help monitor power generation and power demand, and make local electrical energy storage decisions in smart grid systems.¶
Smart Agriculture¶
Smart agriculture integrates information and communication technologies with farming technology. Intelligent farms use IoT technology to measure and analyze parameters, such as the temperature, humidity, sunlight, carbon dioxide, and soil quality, in crop cultivation facilities. Depending on the analysis results, control devices are used to set the environmental parameters to an appropriate state. Remote management is also possible through mobile devices such as smartphones.¶
In existing farms, simple systems such as management according to temperature and humidity can be easily and inexpensively implemented using IoT technology. Field sensors gather data on field and crop condition. This data is then transmitted to cloud servers that process data and recommend actions. The use of edge computing can reduce the volume of back-and-forth data transmissions significantly, resulting in cost and bandwidth savings. Locally generated data can be processed at the edge, and local computing and analytics can drive local actions. With edge computing, it is easy for farmers to select large amounts of data for processing, and data can be analyzed even in remote areas with poor access conditions. Other applications include enabling dashboarding, for example, to visualize the farm status, as well as enhancing Extended Reality (XR) applications that require edge audio/video processing. As the number of people working on farming has been decreasing over time, increasing automation enabled by edge computing can be a driving force for future smart agriculture.¶
Smart Construction¶
Safety is critical at construction sites. Every year, many construction workers lose their lives because of falls, collisions, electric shocks, and other accidents. Therefore, solutions have been developed to improve construction site safety, including the real-time identification of workers, monitoring of equipment location, and predictive accident prevention. To deploy these solutions, many cameras and IoT sensors have been installed on construction sites to measure noise, vibration, gas concentration, etc. Typically, the data generated from these measurements is collected in on-site gateways and sent to remote cloud servers for storage and analysis. Thus, an inspector can check the information stored on the cloud server to investigate an incident. However, this approach can be expensive because of transmission costs, for example, of video streams over a mobile network connection, and because of the usage fees of private cloud services.¶
Using edge computing, data generated at the construction site can be processed and analyzed on an edge server located within or near the site. Only the result of this processing needs to be transferred to a cloud server, thus reducing transmission costs. It is also possible to locally generate warnings to prevent accidents in real-time.¶
Self-Driving Car¶
Edge computing plays a crucial role in safety-focused self-driving car systems. With a multitude of sensors, such as high-resolution cameras, radar, LIDAR, sonar sensors, and GPS systems, autonomous vehicles generate vast amounts of real-time data. Local processing utilizing edge computing nodes allows for efficient collection and analysis of this data to monitor vehicle distances and road conditions and respond promptly to unexpected situations. Roadside computing nodes can also be leveraged to offload tasks when necessary, for example, when the local processing capacity of the car is insufficient because of hardware constraints or a large data volume.¶
For instance, when the car ahead slows, a self-driving car adjusts its speed to maintain a safe distance, or when a roadside signal changes, it adapts its behavior accordingly. In another example, cars equipped with self-parking features utilize local processing to analyze sensor data, determine suitable parking spots, and execute precise parking maneuvers without relying on external processing or connectivity. It is also possible to use in-cabin cameras coupled with local processing to monitor the driver's attention level and detect signs of drowsiness or distraction. The system can issue warnings or implement preventive measures to ensure driver safety.¶
Edge computing empowers self-driving cars by enabling real-time processing, reducing latency, enhancing data privacy, and optimizing bandwidth usage. By leveraging local processing capabilities, self-driving cars can make rapid decisions, adapt to changing environments, and ensure safer and more efficient autonomous driving experiences.¶
Digital Twin¶
A digital twin can simulate different scenarios and predict outcomes based on real-time data collected from the physical environment. This simulation capability empowers proactive maintenance, optimization of operations, and the prediction of potential issues or failures. Decision makers can use digital twins to test and validate different strategies, identify inefficiencies, and optimize performance.¶
With edge computing, real-time data is collected, processed, and analyzed directly at the edge, allowing for the accurate monitoring and simulation of physical assets. Moreover, edge computing effectively minimizes latency, enabling rapid responses to dynamic conditions as computational resources are brought closer to the physical object. Running digital twin processing at the edge enables organizations to obtain timely insights and make informed decisions that maximize efficiency and performance.¶
Other Use Cases¶
AI/ML systems at the edge empower real-time analysis, faster decision-making, reduced latency, improved operational efficiency, and personalized experiences across various industries, by bringing artificial intelligence and machine learning capabilities closer to edge devices.¶
In addition, oneM2M has studied several IoT edge computing use cases, which are documented in [oneM2M-TR0001], [oneM2M-TR0018] and [oneM2M-TR0026]. The edge computing related requirements raised through the analysis of these use cases are captured in [oneM2M-TS0002].¶
This section describes the challenges faced by IoT that are motivating the adoption of edge computing. These are distinct from the research challenges applicable to IoT edge computing, some of which are mentioned in Section 4.¶
IoT technology is used with increasingly demanding applications, for example, in industrial, automotive and healthcare domains, leading to new challenges. For example, industrial machines such as laser cutters produce over 1 terabyte of data per hour, and similar amounts can be generated in autonomous cars [NVIDIA]. 90% of IoT data is expected to be stored, processed, analyzed, and acted upon close to the source [Kelly], as cloud computing models alone cannot address these new challenges [Chiang].¶
Below, we discuss IoT use case requirements that are moving cloud capabilities to be more proximate, distributed, and disaggregated.¶
Many industrial control systems, such as manufacturing systems, smart grids, and oil and gas systems, often require stringent end-to-end latency between the sensor and control nodes. While some IoT applications may require latency below a few tens of milliseconds [Weiner], industrial robots and motion control systems have use cases for cycle times in the order of microseconds [_60802]. In some cases, speed-of-light limitations may simply rule out a cloud-based solution; however, this is not the only challenge relative to time sensitivity. Guarantees for bounded latency and jitter ([RFC8578] section 7) are also important for industrial IoT applications. This means that control packets must arrive with as little variation as possible and within a strict deadline. Given the best-effort characteristics of the Internet, this challenge is virtually impossible to address without end-to-end guarantees for individual message delivery and for continuous data flows.¶
Some IoT deployments may not face bandwidth constraints when uploading data to the Cloud. 5G and Wi-Fi 6 networks both theoretically top out at 10 gigabits per second (i.e., 4.5 terabytes per hour), enabling the transfer of large amounts of uplink data. However, the cost of maintaining continuous high-bandwidth connectivity for such usage is unjustifiable and impractical for most IoT applications. In some settings, for example, in aeronautical communication, higher communication costs reduce the amount of data that can be practically uploaded even further. Minimizing reliance on high-bandwidth connectivity is therefore a requirement, for example, by processing data at the edge and deriving summarized or actionable insights that can be transmitted to the Cloud.¶
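The hourly figure quoted above follows directly from the link rate; a quick back-of-the-envelope check (shown here in Python for convenience):¶

   # 10 Gbit/s link, fully utilized for one hour:
   rate_gbit_s = 10
   bytes_per_s = rate_gbit_s * 1e9 / 8           # 1.25e9 bytes/s
   tb_per_hour = bytes_per_s * 3600 / 1e12       # 4.5 TB/h
   print(tb_per_hour)                            # 4.5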
Many IoT devices, such as sensors, actuators, and controllers, have very limited hardware resources and cannot rely solely on their own resources to meet their computing and/or storage needs. They require reliable, uninterrupted, or resilient services to augment their capabilities to fulfill their application tasks. This is difficult, and in some cases impossible, to achieve using cloud services for systems such as vehicles, drones, or oil rigs that have intermittent network connectivity. Conversely, a cloud back-end might want to access device data even when the device is asleep.¶
When IoT services are deployed at home, personal information can be learned from detected usage data. For example, one can extract information about employment, family status, age, and income by analyzing smart-meter data [ENERGY]. Policy makers have begun to provide frameworks that limit the usage of personal data and impose strict requirements on data controllers and processors. Data stored indefinitely in the Cloud also increases the risk of data leakage, for instance, through attacks on rich targets.¶
It is often argued that industrial systems do not have privacy implications, as no personal data is gathered. However, data from such systems is often highly sensitive, as one might be able to infer trade secrets such as the setup of production lines. Hence, owners of these systems are generally reluctant to upload IoT data to the Cloud.¶
Furthermore, passive observers can perform traffic analysis on device-to-cloud paths. Therefore, hiding traffic patterns associated with sensor networks can be another requirement for edge computing.¶
We first look at the current state of IoT edge computing (Section 4.1), and then define a general system model (Section 4.2). This provides a context for IoT edge-computing functions, which are listed in Section 4.3, Section 4.4 and Section 4.5.¶
This section provides an overview of today's IoT edge computing field based on a limited review of standards, research, open-source and proprietary products in [I-D.defoy-t2trg-iot-edge-computing-background].¶
IoT gateways, both open-source (such as EdgeX Foundry or Home Edge) and proprietary products, represent a common class of IoT edge-computing products, where the gateway provides a local service on customer premises and is remotely managed through a cloud service. IoT communication protocols are typically used between IoT devices and the gateway, including CoAP [RFC7252], MQTT [mqtt5], and many specialized IoT protocols (such as OPC UA and DDS in the Industrial IoT space), while the gateway communicates with the distant cloud typically using HTTPS. Virtualization platforms enable the deployment of virtual edge computing functions (using VMs and application containers), including IoT gateway software, on servers in the mobile network infrastructure (at base stations and concentration points), edge data centers (in central offices), and regional data centers located near central offices. End devices are envisioned to become computing devices in forward-looking projects, but are not commonly used today.¶
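As a rough sketch of the gateway pattern described above (not a reference implementation), the fragment below summarizes locally collected readings and forwards the result northbound over HTTPS using only the Python standard library; the cloud endpoint, payload shape, and source of the readings are hypothetical placeholders.¶

   # Sketch of an IoT gateway's northbound leg: readings gathered locally
   # (over whatever southbound protocol is in use, e.g., CoAP or MQTT) are
   # summarized and pushed to a cloud endpoint over HTTPS.
   import json
   import urllib.request

   CLOUD_ENDPOINT = "https://cloud.example.com/iot/telemetry"   # placeholder URL

   def forward_summary(readings: list) -> None:
       summary = {
           "count": len(readings),
           "min": min(readings),
           "max": max(readings),
           "avg": sum(readings) / len(readings),
       }
       req = urllib.request.Request(
           CLOUD_ENDPOINT,
           data=json.dumps(summary).encode("utf-8"),
           headers={"Content-Type": "application/json"},
           method="POST")
       with urllib.request.urlopen(req, timeout=10) as resp:
           resp.read()   # acknowledgement from the cloud service

   # Example (once local readings have been collected):
   # forward_summary([20.5, 21.0, 22.3])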
In addition to open-source and proprietary solutions, a horizontal IoT service layer is standardized by the oneM2M standards body to reduce fragmentation, increase interoperability and promote reuse in the IoT ecosystem. Furthermore, ETSI MEC developed an IoT API [ETSI_MEC_33] that enables the deployment of heterogeneous IoT platforms and provides a means to configure the various components of an IoT system.¶
Physical or virtual IoT gateways can host application programs that are typically built using an SDK to access local services through a programmatic API. Edge cloud system operators host their customers' application VMs or containers on servers located in or near access networks that can implement local edge services. For example, mobile networks can provide edge services for radio-network information, location, and bandwidth management.¶
Resilience in the IoT can entail the ability to operate autonomously in periods of disconnectedness to preserve the integrity and safety of the controlled system, possibly in a degraded mode. IoT devices and gateways are often expected to operate in always-on and unattended modes, using fault detection and unassisted recovery functions.¶
The life cycle management of services and applications on physical IoT gateways is generally cloud-based. Edge cloud management platforms and products (such as StarlingX, Akraino Edge Stack, or proprietary products from major Cloud providers) adapt cloud management technologies (e.g., Kubernetes) to the edge cloud, that is, to smaller, distributed computing devices running outside a controlled data center. The service and application life cycle typically follows an NFV-like management and orchestration model.¶
The platform typically enables advertising or consuming services hosted on the platform (e.g., the Mp1 interface in ETSI MEC supports service discovery and communication), and enables communication with local and remote endpoints (e.g., the message routing function in IoT gateways). The platform is typically extensible by edge applications, since they can advertise a service that other edge applications can consume. The IoT communication services include protocol translation, analytics, and transcoding. Communication between edge-computing devices is enabled in tiered or distributed deployments.¶
An edge cloud platform may enable pass-through without storage or local storage (e.g., on IoT gateways). Some edge cloud platforms use distributed storage such as that provided by a distributed storage platform (e.g., EdgeFS, Ceph) or, in more experimental settings, by an ICN network; for example, systems such as Chipmunk [chipmunk] and Kua [kua] have been proposed as distributed information-centric object stores. External storage, for example, on databases in a distant or local IT cloud, is typically used for filtered data deemed worthy of long-term storage, although in some cases it may be used for all data, for example, when required for regulatory reasons.¶
Stateful computing is supported on platforms that host native programs, VMs, or containers. Stateless computing is supported on platforms providing a "serverless computing" service (also known as function-as-a-service, e.g., using stateless containers), or on systems based on named function networking.¶
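As an illustration of the stateless ("serverless") model mentioned above, a function-as-a-service platform can be reduced to a registry that maps event types to stateless handlers; the sketch below is generic, and the handler and event names are assumptions rather than the API of any particular product.¶

   # Sketch of a function-as-a-service dispatcher: handlers are stateless
   # and selected by event type; all state travels inside the event itself.
   from typing import Callable, Dict

   _handlers: Dict[str, Callable[[dict], dict]] = {}

   def register(event_type: str):
       """Decorator registering a stateless handler for an event type."""
       def wrap(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
           _handlers[event_type] = fn
           return fn
       return wrap

   @register("temperature.reading")           # illustrative event type
   def summarize(event: dict) -> dict:
       values = event.get("values", [])
       return {"count": len(values),
               "avg": sum(values) / len(values) if values else None}

   def invoke(event_type: str, event: dict) -> dict:
       return _handlers[event_type](event)

   print(invoke("temperature.reading", {"values": [20.5, 21.0, 19.8]}))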
In many IoT use cases, a typical network usage pattern is a high volume uplink with some form of traffic reduction enabled by processing over edge-computing devices. Alternatives to traffic reduction include deferred transmission (to off-peak hours or using physical shipping). Downlink traffic includes application control and software updates. Downlink-heavy traffic patterns are not excluded but are more often associated with non-IoT usage (e.g., video CDNs).¶
Edge computing is expected to play an important role in deploying new IoT services integrated with Big Data and AI, enabled by flexible in-network computing platforms. Although there are many approaches to edge computing, in this section, we attempt to lay out a general model and list the associated logical functions. In practice, this model can be mapped to different architectures, such as:¶
In the general model described in Figure 1, the edge computing domain is interconnected with IoT devices (southbound connectivity), possibly with a remote/cloud network (northbound connectivity), and with a service operator's system. Edge-computing nodes provide multiple logical functions or components that may not be present in a given system. They may be implemented in a centralized or distributed fashion, at the network edge, or through interworking between the edge network and remote cloud networks.¶
In the distributed model described in Figure 2, the edge-computing domain is composed of IoT edge gateways and IoT devices, which are also used as computing nodes. Edge computing domains are connected to a remote/cloud network and their respective service operator's system. IoT devices/computing nodes provide logical functions, for example, as part of distributed machine learning or distributed image processing applications. Because the processing capabilities of IoT devices are limited, they require the support of other nodes: in a distributed machine learning application, the training process for AI services can be executed at IoT edge gateways or in cloud networks, while the prediction (inference) service is executed on the IoT devices. In a distributed image processing application, some image processing functions can similarly be executed at the edge or in the cloud, while preprocessing, which helps limit the amount of uploaded data, is performed by the IoT device.¶
In the following, we enumerate major edge computing domain components. They are here loosely organized into OAM (Operations, Administration, and Maintenance), functional, and application components, with the understanding that the distinction between these classes may not always be clear, depending on actual system architectures. Some representative research challenges are associated with those functions. We used input from co-authors, IRTF attendees, and some comprehensive reviews of the field ([Yousefpour], [Zhang2], [Khan]).¶
Edge computing OAM extends beyond the network-related OAM functions listed in [RFC6291]. In addition to infrastructure (network, storage, and computing resources), edge computing systems can also include computing environments (for VMs, software containers, functions), IoT devices, data, and code.¶
Operation-related functions include performance monitoring for service-level agreement measurements, fault management, and provisioning for links, nodes, compute and storage resources, platforms, and services. Administration covers network/compute/storage resources, platforms and services discovery, configuration, and planning. Discovery during normal operation (e.g., discovery of compute or storage nodes by endpoints) is typically not included in OAM; however, in this document, we do not address it separately. Management covers the monitoring and diagnostics of failures, as well as means to minimize their occurrence and take corrective actions. This may include software update management and high service availability through redundancy and multipath communication. Centralized (e.g., SDN) and decentralized management systems can be used. Finally, we arbitrarily chose to address data management as an application component; however, in some systems, data management may be considered similar to a network management function.¶
We further detail a few relevant OAM components.¶
Discovery and authentication may target platforms and infrastructure resources, such as computing, networking, and storage, as well as other resources such as IoT devices, sensors, data, code units, services, applications, and users interacting with the system. Broker-based solutions can be used, for example, using an IoT gateway as a broker to discover IoT resources. More decentralized solutions can also be used as a replacement or complement; for example, CoAP enables multicast discovery of an IoT device, and CoAP service discovery enables obtaining a list of resources made available by this device [RFC7252]. For device authentication, current centralized gateway-based systems rely on the installation of a secret on IoT devices and computing devices (e.g., a device certificate stored in a hardware security module, or a combination of code and data stored in a trusted execution environment).¶
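As an illustration of the CoAP-based discovery mentioned above, the sketch below issues a unicast GET to a device's /.well-known/core resource to obtain the list of resources it exposes; it assumes the third-party aiocoap library, and the device address is a placeholder from the documentation range. A multicast variant would address the "All CoAP Nodes" group instead.¶

   # Sketch: retrieve the resource list advertised by a CoAP device.
   # Assumes the aiocoap library; 192.0.2.1 is a documentation address.
   import asyncio
   from aiocoap import Context, GET, Message

   async def discover(device: str = "192.0.2.1") -> str:
       ctx = await Context.create_client_context()
       request = Message(code=GET, uri=f"coap://{device}/.well-known/core")
       response = await ctx.request(request).response
       # The payload is a CoRE Link Format string, e.g.,
       # '</sensors/temp>;rt="temperature"'
       return response.payload.decode("utf-8")

   if __name__ == "__main__":
       print(asyncio.run(discover()))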
Related challenges include:¶
In a distributed system context, once edge devices have discovered and authenticated each other, they can be organized, or self-organized, into hierarchies or clusters. The organizational structure may range from centralized to peer-to-peer, or it may be closely tied to other systems. Such groups can also form federations with other edges or with remote clouds.¶
Related challenges include:¶
Some IoT edge computing systems make use of virtualized (compute, storage, and networking) resources to address the need for secure multi-tenancy at the edge. This leads to "edge clouds" that share properties with remote clouds and can reuse some of their ecosystems. Virtualization function management is largely covered by ETSI NFV and MEC standards and recommendations. Projects such as [LFEDGE-EVE] further cover virtualization and its management in distributed edge-computing settings.¶
Related challenges include:¶
A core function of IoT edge computing is to enable local computation on a node at the network edge, typically for application-layer processing, such as processing input data from sensors, making local decisions, preprocessing data, or offloading computation on behalf of a device, service, or user. Related functions include orchestrating computation (in a centralized or distributed manner) and managing application life cycles. Support for in-network computation may vary in terms of capability; for example, computing nodes can host virtual machines, software containers, software actors, unikernels running stateful or stateless code, or a rule engine providing an API to register actions in response to conditions such as IoT device ID, sensor values to check, thresholds, etc.¶
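A minimal sketch of such a rule engine is given below; the rule structure (device ID, condition on the reported value, action) follows the description above, while the API shape itself is an assumption.¶

   # Sketch of a rule engine: actions are registered against a device ID
   # and a condition on a sensor value, then evaluated on each reading.
   from dataclasses import dataclass
   from typing import Callable, List

   @dataclass
   class Rule:
       device_id: str
       condition: Callable[[float], bool]
       action: Callable[[str, float], None]

   class RuleEngine:
       def __init__(self) -> None:
           self.rules: List[Rule] = []

       def register(self, device_id: str, condition, action) -> None:
           self.rules.append(Rule(device_id, condition, action))

       def on_reading(self, device_id: str, value: float) -> None:
           for rule in self.rules:
               if rule.device_id == device_id and rule.condition(value):
                   rule.action(device_id, value)

   engine = RuleEngine()
   # Illustrative rule: alert when a temperature sensor exceeds a threshold.
   engine.register("sensor-42", lambda v: v > 80.0,
                   lambda dev, v: print(f"ALERT {dev}: {v} C"))
   engine.on_reading("sensor-42", 85.2)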
Edge offloading includes offloading to and from an IoT device, and to and from a network node. [Cloudlets] offer an example of offloading computation from an end device to a network node. In contrast, oneM2M is an example of a system that allows a cloud-based IoT platform to transfer resources and tasks to a target edge node [oneM2M-TR0052]. Once transferred, the edge node can directly support IoT devices that it serves with the service offloaded by the cloud (e.g., group management, location management, etc.).¶
QoS can be provided in some systems through the combination of network QoS (e.g., traffic engineering or wireless resource scheduling) and compute/storage resource allocations. For example, in some systems, a bandwidth manager service can be exposed to enable allocation of the bandwidth to/from an edge-computing application instance.¶
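One way such a per-instance bandwidth allocation could be enforced is with a token bucket; the sketch below is a generic illustration under that assumption and does not reflect the API of any specific edge platform.¶

   # Sketch: token bucket limiting the data rate granted to one
   # edge-application instance (rate and burst are in bytes).
   import time

   class TokenBucket:
       def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
           self.rate = rate_bytes_per_s
           self.capacity = burst_bytes
           self.tokens = burst_bytes
           self.last = time.monotonic()

       def allow(self, nbytes: int) -> bool:
           now = time.monotonic()
           self.tokens = min(self.capacity,
                             self.tokens + (now - self.last) * self.rate)
           self.last = now
           if nbytes <= self.tokens:
               self.tokens -= nbytes
               return True
           return False

   # Grant an instance 1 MB/s with a 256 KB burst allowance.
   bucket = TokenBucket(1_000_000, 256_000)
   print(bucket.allow(100_000))   # True while tokens remain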
In-network computation can leverage the underlying services, provided using data generated by IoT devices and access networks. Such services include IoT device location, radio network information, bandwidth management and congestion management (e.g., the congestion management feature of oneM2M [oneM2M-TR0052]).¶
Related challenges include:¶
Local storage or caching enables local data processing (e.g., preprocessing or analysis) as well as delayed data transfer to the cloud or delayed physical shipping. An edge node may offer local data storage (in which persistence is subject to retention policies), caching, or both. Caching generally refers to temporary storage to improve performance without persistence guarantees. An edge-caching component manages data persistence; for example, it schedules the removal of data when it is no longer needed. Other related aspects include the authentication and encryption of data. Edge storage and caching can take the form of a distributed storage system.¶
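The retention behavior described above can be pictured as a small time-bounded key-value store; the sketch below uses a per-item time-to-live as an illustrative retention policy and is not a reference design.¶

   # Sketch: edge data store with a simple retention policy; entries
   # older than their time-to-live are treated as expired and purged.
   import time
   from typing import Any, Dict, Optional, Tuple

   class EdgeStore:
       def __init__(self, default_ttl_s: float = 3600.0):
           self.default_ttl = default_ttl_s
           self._items: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)

       def put(self, key: str, value: Any, ttl_s: Optional[float] = None) -> None:
           self._items[key] = (time.monotonic() + (ttl_s or self.default_ttl), value)

       def get(self, key: str) -> Optional[Any]:
           entry = self._items.get(key)
           if entry is None or entry[0] < time.monotonic():
               self._items.pop(key, None)   # expired: enforce retention
               return None
           return entry[1]

       def purge(self) -> None:
           now = time.monotonic()
           for key in [k for k, (exp, _) in self._items.items() if exp < now]:
               del self._items[key]

   store = EdgeStore(default_ttl_s=60.0)
   store.put("sensor-42/latest", {"temp": 21.5})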
Related challenges include:¶
An edge cloud may provide a northbound data plane or management plane interface to a remote network, such as a cloud, home or enterprise network. This interface does not exist in stand-alone (local-only) scenarios. To support such an interface when it exists, an edge computing component needs to expose an API, deal with authentication and authorization, and support secure communication.¶
An edge cloud may provide an API or interface to local or mobile users, for example, to provide access to services and applications, or to manage data published by local/mobile devices.¶
Edge-computing nodes communicate with IoT devices over a southbound interface, typically for data acquisition and IoT device management.¶
Communication brokering is a typical function of IoT edge computing that facilitates communication with IoT devices: it enables clients to register as recipients for data from devices and forwards/routes traffic to or from IoT devices, enabling various data discovery and redistribution patterns, for example, north-south with clouds and east-west with other edge devices [I-D.mcbride-edge-data-discovery-overview]. Another related aspect is dispatching alerts and notifications to interested consumers both inside and outside the edge-computing domain. Protocol translation, analytics, and video transcoding can also be performed when necessary. Communication brokering may be centralized in some systems, for example, using a hub-and-spoke message broker, or distributed with message buses, possibly in a layered bus approach. Distributed systems can leverage direct communication between end devices over device-to-device links. A broker can ensure communication reliability and traceability and, in some cases, transaction management.¶
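At its simplest, the brokering function reduces to a registry of interested recipients per topic; the hub-and-spoke sketch below is illustrative, and the topic naming and callback shape are assumptions.¶

   # Sketch of a hub-and-spoke message broker: clients register as
   # recipients for a topic, and the broker forwards each published message.
   from collections import defaultdict
   from typing import Callable, DefaultDict, List

   Handler = Callable[[str, bytes], None]

   class Broker:
       def __init__(self) -> None:
           self._subs: DefaultDict[str, List[Handler]] = defaultdict(list)

       def subscribe(self, topic: str, callback: Handler) -> None:
           self._subs[topic].append(callback)

       def publish(self, topic: str, payload: bytes) -> None:
           for callback in self._subs[topic]:
               callback(topic, payload)   # forward to each registered recipient

   broker = Broker()
   broker.subscribe("devices/sensor-42/temperature",
                    lambda t, p: print(f"{t}: {p.decode()}"))
   broker.publish("devices/sensor-42/temperature", b"21.5")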
Related challenges include:¶
IoT edge computing can host applications, such as those mentioned in Section 2.4. While describing the components of individual applications is out of our scope, some of those applications share similar functions, such as IoT device management and data management, as described below.¶
IoT device management includes managing information regarding IoT devices, including their sensors, and how to communicate with them. Edge computing addresses the scalability challenges of a large number of IoT devices by separating the scalability domain into edge/local networks and remote networks. For example, in the context of the oneM2M standard, a device management functionality (called "software campaign" in oneM2M) enables the installation, deletion, activation, and deactivation of software functions/services on a potentially large number of edge nodes [oneM2M-TR0052]. Using a dashboard or management software, a service provider issues these requests through an IoT cloud platform supporting the software campaign functionality.¶
Challenges listed in Section 4.3.1 may be applicable to IoT device management as well.¶
Data storage and processing at the edge are major aspects of IoT edge computing, directly addressing the high-level IoT challenges listed in Section 3. Data analysis, for example, through AI/ML tasks performed at the edge, may benefit from specialized hardware support on the computing nodes.¶
Related challenges include:¶
IoT Edge Computing introduces new challenges to the simulation and emulation tools used by researchers and developers. A varied set of applications, networks, and computing technologies can coexist in a distributed system, making modeling difficult. Scale, mobility, and resource management are additional challenges [SimulatingFog].¶
Tools include simulators, where simplified application logic runs on top of a fog network model, and emulators, where actual applications can be deployed, typically in software containers, over a cloud infrastructure (e.g., Docker and Kubernetes) running over a network emulating network edge conditions such as variable delays, throughput and mobility events. To gain in scale, emulated and simulated systems can be used together in hybrid federation-based approaches [PseudoDynamicTesting], whereas to gain in realism, physical devices can be interconnected with emulated systems. Examples of related work and platforms include the publicly accessible MEC sandbox work recently initiated in ETSI [ETSI_Sandbox], and open source simulators and emulators ([AdvantEDGE] emulator and tools cited in [SimulatingFog]). EdgeNet [Senel] is a globally distributed edge cloud for Internet researchers, using nodes contributed by institutions, and based on Docker for containerization and Kubernetes for deployment and node management.¶
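For instance, variable delay and loss on an emulated edge link are commonly introduced with the Linux netem queueing discipline; the minimal wrapper below is a sketch under that assumption (interface name and parameter values are placeholders, and root privileges are required).¶

   # Sketch: impose edge-like conditions on a network interface via
   # Linux tc/netem; interface and values are placeholders, needs root.
   import subprocess

   def emulate_edge_link(dev: str = "eth0") -> None:
       subprocess.run(
           ["tc", "qdisc", "add", "dev", dev, "root", "netem",
            "delay", "100ms", "20ms",   # 100 ms +/- 20 ms jitter
            "loss", "1%"],              # 1% random packet loss
           check=True)

   def reset_link(dev: str = "eth0") -> None:
       subprocess.run(["tc", "qdisc", "del", "dev", dev, "root"], check=True)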
Digital twins are virtual instances of a physical system (twin) that are continually updated with the latter's performance, maintenance, and health status data throughout the life cycle of the physical system [Madni]. In contrast to a traditional emulation or simulated environment, digital twins, once generated, are kept in sync with their physical twin, which can be, among many other instances, an IoT device, an edge device, or an edge network. The benefits of digital twins go beyond those of emulation and include accelerated business processes, enhanced productivity, and faster innovation with reduced costs [I-D.irtf-nmrg-network-digital-twin-arch].¶
Privacy and security are drivers of the adoption of edge computing for the IoT (Section 3.4). As discussed in Section 4.3.1, authentication and trust (among computing nodes, management nodes, and end devices) can be challenging as scale, mobility, and heterogeneity increase. The sometimes disconnected nature of edge resources can prevent reliance on third-party authorities. Distributed edge computing is exposed to reliability issues and denial-of-service attacks. Personal or proprietary IoT data leakage is also a major threat, particularly because of the distributed nature of the systems (Section 4.5.2). Furthermore, blockchain-based distributed IoT edge computing must be designed for privacy, since public blockchain addressing does not guarantee absolute anonymity [Ali].¶
However, edge computing also offers solutions in the security space: maintaining privacy by computing sensitive data closer to data generators is a major use case for IoT edge computing. An edge cloud can be used to perform actions based on sensitive data or to anonymize or aggregate data prior to transmission to a remote cloud server. Edge computing communication brokering functions can also be used to secure communication between edge and cloud networks.¶
IoT edge computing plays an essential role, complementary to the cloud, in enabling IoT systems in certain situations. In this document, we presented use cases and listed the core challenges faced by IoT that drive the need for IoT edge computing. The first part of this document may therefore help focus future research efforts on the aspects of IoT edge computing where it is most useful. The second part of this document presents a general system model and a structured overview of the associated research challenges and related work. The structure, based on the system model, is not meant to be restrictive; it exists to link individual research areas to the parts of an IoT edge computing system where they are applicable.¶
This document has no IANA actions.¶
The authors would like to thank Joo-Sang Youn, Akbar Rahman, Michel Roy, Robert Gazda, Rute Sofia, Thomas Fossati, Chonggang Wang, Marie-José Montpetit, Carlos J. Bernardos, Milan Milenkovic, Dale Seed, JaeSeung Song, Roberto Morabito, Carsten Bormann and Ari Keränen for their valuable comments and suggestions on this document.¶