Internet-Draft Mailing-List Modifications August 2023
Chuang Expires 14 February 2024
Workgroup: Independent Stream
Internet-Draft: draft-chuang-mailing-list-modifications-03
Published: August 2023
Intended Status: Experimental
Expires: 14 February 2024
Author: W. Chuang, Google, Inc.

Tolerating Mailing-List Modifications

Abstract

Mailing-lists distribute email to multiple recipients by forwarding messages, potentially modifying them to document the distribution to the recipients. Unfortunately, forwarding breaks SPF (RFC7208) authentication and message modification breaks DKIM (RFC6376) authentication. This document builds on ARC (RFC8617) to provide a framework that describes forwarding, with extensions to tolerate common mailing-list message modifications. This specification characterizes the mailing-list transforms such that a receiver can reverse them to enable digital signature verification and attribution of the message content. These message modifications are: 1) adding a description string to the Subject header, 2) rewriting the From header, 3) removing the original DKIM-Signature and 4) appending a footer to the message body. This document also specifies how those modifications are made so that they are reversible.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 14 February 2024.

1. Introduction

Mailing-lists have long complicated email authentication. They break SPF [RFC7208] authentication due to forwarding and break DKIM [RFC6376] authentication due to the message modification that is used to identify the mailing-list. This specification provides methods to restore authentication even in the presence of forwarding and message modification. Being able to restore authentication is particularly important because senders may specify a receiver email handling policy, as defined in DMARC [RFC7489], that may prevent delivery of the message. Moreover, malicious content is currently attributed to all parties in the mail flow. This specification permits a receiver to attribute the email content to its author, be it the sender or the mailing-list.

The approach in this document is to be highly opinionated and support only common case mutations, to lessen the implementation burden upon mailing-lists and receivers. It also seeks to eliminate any burden for messages that don't go through mailing-lists. At origination, it is sufficient to sign a message with a DKIM signature as is typically done already. It does ask those mailing-lists that wish to use this specification to characterize the modifications they perform at each forwarding step. The goal is to permit receivers to reverse the modifications so that the message hash can be recovered and the prior signature verified at each forwarding step, be it from DKIM or ARC. This allows the receiver to determine which forwarder, or the originating sender, contributed which content in the received message. This specification uses the ARC [RFC8617] framework to describe each forwarder and to record the mutation performed at each forwarding step. Consequently this attribution enables more precise reputation systems and UI features. The supported modifications are: 1) adding a description string to the Subject header, 2) rewriting the "From" header, 3) rewriting the original DKIM-Signature and 4) appending a footer to the message body. This document specifies the characterization of the modifications and how to apply the modifications.

The validation results in this specification are orthogonal to the results in draft-chuang-replay-resistant-arc. In addition to better supporting DMARC in the presence of mailing-list modifications, this specification enables attribution of malicious content back to the author. However, this specification is vulnerable to replay, much like DKIM and ARC. draft-chuang-replay-resistant-arc validation is tolerant of header and message body modifications but unable to provide attribution. It also detects and prevents replay. A receiver may want to combine the results of these two validations to create a strong authentication result that provides greater certainty that a message was sent by some originating sender through a mailing-list.

1.1. Terminology and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Algorithm

The specification in this document is divided into procedures computed at the mailing-list forwarders and at the receiver, and is based on the procedures in ARC [RFC8617]. The steps at the forwarder are further divided into inbound validation procedures and outbound forwarding procedures. The receiver only performs the inbound validation procedures. The inbound validation procedures are of variable length, corresponding to the depth of the ARC sets. At each ARC set corresponding to some earlier forwarder, the receiver applies any reversing procedure described later for any described mailing-list mutations to obtain the message hash of the message as seen at the inbound of the mailing-list. This hash comes from an ARC-Message-Signature if the sender is a forwarder and a DKIM-Signature (or possibly ARC-Message-Signature) if the sender is the originator. That signature MUST verify correctly, otherwise this is considered a signature verification failure by this specification. Interpretation of that failure is described later.

2.1. Forwarder Characterization

Conformant forwarders characterize message mutations to enable receivers to interpret mutations and potentially verification failures. A description is added as a tag-value to the ARC-Message-Signature. The mutator description tag is "m" and has the following values:

mailinglist

Describes a forwarder that distributes messages to multiple recipients and any modification conforms to the specification in this document. Consequently signature validation failures are described as "fail".

gateway

Optional mutating forwarder that works on behalf of the originating sender or final receiver. The receiver is presumed to have an established trust model with the gateway outside of this document, and the validation failure result is interpreted by the receiver. The receiver with such an arrangement MUST NOT use this description to detect the existence of such a relationship. However, a gateway forwarder may provide a description for human operators to interpret the gateway's actions.

generic

Optional self-description for a generic mutator. This document imposes no judgment on signature validation failures but the description may help a human operator interpret the actions of the forwarder.

none

Conformant forwarders that do not mutate the message declare their participation and lack of mutation with the none value.

For example:

ARC-Message-Signature: i=1; m=mailinglist;...
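As an illustration only, a receiver could extract the mutator description with a small tag-list parser. The sketch below assumes an unfolded header value in the usual "tag=value;" form; the helper names `parse_tags` and `mutator_of` are hypothetical and not defined by this document.

```python
from typing import Dict, Optional

# The four values listed above for the "m" tag.
KNOWN_MUTATORS = {"mailinglist", "gateway", "generic", "none"}

def parse_tags(value: str) -> Dict[str, str]:
    """Split a 'tag=value; tag=value' list into a dict."""
    tags = {}
    for item in value.split(";"):
        # Split on the first "=" only, so base64 padding in values survives.
        name, sep, val = item.partition("=")
        if sep:
            tags[name.strip()] = val.strip()
    return tags

def mutator_of(ams_value: str) -> Optional[str]:
    """Return the 'm' tag if it is one of the known values, else None."""
    m = parse_tags(ams_value).get("m")
    return m if m in KNOWN_MUTATORS else None
```

Returning None for a missing or unknown value is a design choice of this sketch; it lets the caller treat such ARC sets as out of scope for reversal.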

Receivers use this mutator description tag to interpret the forwarder's actions and whether it followed the guidelines in this specification. If the forwarder promised that it did, then it becomes possible to enforce a validation failure that protects that forwarder.

2.2. Header Rewriting Characterization

A mailing-list forwarder that rewrites a Subject, From or DKIM-Signature header can store the prior value under a modified header name, i.e. by prefixing "Prior-" to the original name. Moreover, the header name retains the original capitalization and its place with respect to other headers, and is not ambiguous if there are multiple headers of the same name. First the bottoms-up line count of all headers is computed starting from the 0th position. Then all headers to be rewritten are collected, ordered by position to be prepended to the headers list and given a bottoms-up line count position. The difference between the two positions is the offset from the "Prior-" header to its rewritten counterpart. It is assumed that the prior header has the following format:

<header name>:<value including any whitespaces>

The conserved header name is prefixed with "Prior-". Then an ARC instance number tag "i", i.e. "i=<#>", and a line count tag "l", i.e. "l=<bottoms-up line count offset>", are prepended to the header value and separated by a semicolon and one whitespace. The position of these "i" and "l" tags is important because they may otherwise be ambiguous with the original header's tag-values, e.g. those of DKIM-Signature. Thus the conserved header is:

Prior-<header name>: i=<#>; l=<bottoms-up line count offset>;
     <value including any whitespaces>

For example:

Prior-Dkim-Signature: i=1; l=4; d=example.com; b=...
Prior-Subject: i=1; l=3; Meeting on the 5th
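The conserved header format can be illustrated with a formatter and a parser. This is a minimal sketch that assumes single-line (unfolded) headers; the function names are hypothetical.

```python
from typing import Tuple

def make_prior_header(line: str, instance: int, offset: int) -> str:
    """Turn '<name>:<value>' into 'Prior-<name>: i=#; l=#;<value>'."""
    name, _, value = line.partition(":")
    # The value keeps its own leading whitespace, so nothing is re-spaced.
    return f"Prior-{name}: i={instance}; l={offset};{value}"

def parse_prior_header(line: str) -> Tuple[str, int, int]:
    """Return (original header line, ARC instance, line count offset)."""
    name, _, rest = line.partition(":")
    if not name.startswith("Prior-"):
        raise ValueError("not a Prior- header")
    # Only the first two semicolons delimit "i" and "l"; later ones
    # belong to the original value (e.g. DKIM-Signature tags).
    i_tag, l_tag, value = rest.split(";", 2)
    instance = int(i_tag.partition("=")[2])
    offset = int(l_tag.partition("=")[2])
    return name[len("Prior-"):] + ":" + value, instance, offset
```

Splitting on only the first two semicolons is what keeps the "i" and "l" tags from being confused with tags inside the original value.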

This stores the prior header in a format that SHOULD be ignored by subsequent forwarders; if it is not ignored, verification will fail. The forwarder then rewrites or deletes the header as before, by prepending the rewritten header to the message or, for deletion, by skipping the prepending. To protect the integrity of these headers, they are hashed separately from the ARC-Message-Signature "h" header list. The forwarder generates the header hash by hashing the "Prior-" headers (and any "Content-footer" header described below) in bottoms-up order as found in the headers. This specification calls for the rewritten header hash to be explicitly added to the ARC-Message-Signature with a tag "hh". Conformant forwarders, even those without mailing-list mutations, MUST report the header hash explicitly to better differentiate and tolerate body modifications when verifying headers.
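The header hash computation might look like the following sketch. The text does not fix a hash algorithm, canonicalization or encoding here, so SHA-256 over the raw header lines terminated by CRLF, base64 encoded, is purely an assumption, as is the function name. The instance filter follows the validation text, which hashes headers at the current ARC instance and all prior ones.

```python
import base64
import hashlib
from typing import List

def rewritten_header_hash(headers: List[str], instance: int) -> str:
    """Hash 'Prior-' and 'Content-footer' headers, bottoms-up.

    `headers` is the top-down list of unfolded header lines. Only headers
    whose leading "i" tag is <= `instance` contribute to the hash.
    """
    digest = hashlib.sha256()  # assumed algorithm
    for line in reversed(headers):  # bottoms-up order
        name, _, rest = line.partition(":")
        if not (name.startswith("Prior-") or name.lower() == "content-footer"):
            continue
        i_value = rest.split(";", 1)[0].partition("=")[2]
        if int(i_value) <= instance:
            digest.update(line.encode("utf-8") + b"\r\n")
    return base64.b64encode(digest.digest()).decode("ascii")
```

Under these assumptions the value computed for instance 1 stays the same after a second mailing-list adds its own headers, because headers tagged with a later instance are skipped.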

To reverse the rewriting or deletion of headers, all "Prior-" headers are collected at the given ARC instance number. The prior header name is extracted from the "Prior-" header, and the original value extracted and restored. The rewritten header is found by taking the "Prior-" header line count position and adding the offset in the "l" tag's value. Rewritten headers are deleted in top down order.
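The reversal can be sketched as follows. The sketch assumes unfolded headers held in a top-down list, assumes every "Prior-" header at the given instance has a rewritten counterpart, and restores each original header at the position of its "Prior-" header; the text does not state where the restored header is placed, so that last point is an assumption. The function name is hypothetical.

```python
from typing import Dict, List, Set

def reverse_header_rewrites(headers: List[str], instance: int) -> List[str]:
    """Undo the header rewrites recorded for one ARC instance."""
    n = len(headers)
    restored: Dict[int, str] = {}  # list index of Prior- header -> original line
    doomed: Set[int] = set()       # list indexes of rewritten headers to delete
    for idx, line in enumerate(headers):
        name, _, rest = line.partition(":")
        if not name.startswith("Prior-"):
            continue
        i_tag, l_tag, value = rest.split(";", 2)
        if int(i_tag.partition("=")[2]) != instance:
            continue
        position = n - 1 - idx  # bottoms-up position, 0 is the last header
        target = position + int(l_tag.partition("=")[2])
        if not 0 <= target < n:
            raise ValueError("line count offset points outside the headers")
        doomed.add(n - 1 - target)
        restored[idx] = name[len("Prior-"):] + ":" + value
    return [restored.get(i, line)
            for i, line in enumerate(headers) if i not in doomed]
```

Collecting the indexes first and rebuilding the list once avoids the positions shifting while headers are being deleted.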

2.4. Validation

The receiver verifies prior ARC sets per the procedure described in ARC [RFC8617]. In addition, the receiver validates the ARC sets starting from the largest instance number found to the smallest. First the receiver verifies the given instance ARC-Message-Signature or DKIM-Signature as appropriate. Then the receiver computes the rewritten header hash taking the header hash computed by ARC-Message-Signature at the given instance number or DKIM-Signature. Then the hash is taken for "Prior-" headers and "Content-footer" header if found in the headers at the current ARC set instance number and all prior ones. This is verified against the header hash value associated with the tag "hh", reporting signature failure if it mismatches.

Next the receiver determines whether it needs to reverse any header or footer mutations at that ARC set instance by looking for the ARC-Message-Signature mutation tag "m=". For the value "mailinglist", it attempts to reverse the mutations, keeping the resulting message so that further validations are possible. The receiver attempts to apply the header reversing procedures given in the "Header Rewriting" section and the body reversing procedures given in the "Message Footer" section. For the value "gateway", the receiver MAY apply local policy to interpret subsequent validation failures. For all other mutation tag "m" values, it assumes that no mutations are present or that they are outside the scope of mailing-list modifications.

In addition, the originating sender's DKIM-Signature or ARC-Message-Signature MUST successfully verify. If so and all prior signatures verify, then the result is a "pass". Verification failures are subject to interpretation in the "Footer Characterization" section, and potentially indicate a "fail". The result of this procedure is written in Authentication-Results [RFC8601] and ARC-Authentication-Results [RFC8617] with a method named "reverse" as the REVERSE result.
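The overall walk from the largest ARC instance to the smallest can be sketched as below. Signature verification and mutation reversal are passed in as callables because they are outside the scope of the sketch. Treating every verification failure as "fail", and using instance 0 to stand for the originating sender's signature, are simplifying assumptions; the "hh" check and the per-mutator interpretation of failures are omitted.

```python
from typing import Callable, Dict, TypeVar

M = TypeVar("M")  # hypothetical message representation

def reverse_result(message: M,
                   mutators: Dict[int, str],
                   verify: Callable[[M, int], bool],
                   undo: Callable[[M, int], M]) -> str:
    """Return "pass" or "fail" for the REVERSE method.

    `mutators` maps each ARC instance number to its "m" tag value.
    `verify(message, i)` checks the signature for instance i, with i == 0
    meaning the originator's signature. `undo(message, i)` reverses the
    mailing-list mutations recorded at instance i.
    """
    for instance in sorted(mutators, reverse=True):
        if not verify(message, instance):
            return "fail"
        if mutators[instance] == "mailinglist":
            # Keep the reversed message so earlier instances can be checked.
            message = undo(message, instance)
    return "pass" if verify(message, 0) else "fail"
```

A caller would supply real ARC and DKIM verification for `verify` and the header and footer reversing procedures for `undo`.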

Informationally, if the receiver implements draft-chuang-replay-resistant-arc, this specification suggests modifying draft-chuang-replay-resistant-arc PATH results to take into account the REVERSE result. At each ARC set instance where PATH recursively combines the local DARA (or SeRCi) results, if REVERSE reports "fail" then the PATH result reports "fail". Because REVERSE combined with DARA represents a higher bar of verification than DARA alone, the receiver applies local policy when interpreting the PATH result.

3. Mailing-List Modifications

When a message body or Subject header is modified by a forwarder, the sender's DKIM signature will no longer validate. To mitigate this, forwarders MAY elect to replace the DKIM signature with their own, using a new message hash that takes the modifications into account. Because the new signature is not DMARC aligned with the original From domain, forwarders MAY also rewrite the From header to take ownership of the message. The following sections specify how to make the message body or Subject header modification, replace the DKIM signature, and apply the characterization headers.

3.1. Subject Header Modification

The mailing-list may want to communicate to the recipient that a message went through a mailing-list by modifying the Subject header to include the name of the mailing-list. Typically the name is put between brackets, e.g. "[name]", and prepended to the Subject header content. This specification supports arbitrary text changes by saving the earlier inbound version of the Subject header's content in the "Prior-Subject" header. The change in Subject text will break the DKIM-Signature, so mailing-lists MAY rewrite the DKIM-Signature to update the message header hash and re-sign the message. Similarly they MAY update the From header to DMARC align the DKIM-Signature "d=" domain with the From header domain. Typically the From address is set to the mailing-list address to further identify the mailing-list. This specification supports saving the earlier inbound version of the DKIM-Signature and From headers in the "Prior-DKIM-Signature" and "Prior-From" headers respectively.

For example the original message looks like:

Dkim-Signature: d=example.com; b=...
From: john.doe@example.com
Subject: Meeting on the 5th

A mailing-list rewrites the Subject, From and DKIM-Signature headers and saves the original content in the "Prior-" headers:

Dkim-Signature: d=mailinglist.example.com; b=...
From: mailinglist@mailinglist.example.com
Subject: [mailing-list] Meeting on the 5th
Prior-Dkim-Signature: i=1; l=3; d=example.com; b=...
Prior-From: i=1; l=3; john.doe@example.com
Prior-Subject: i=1; l=3; Meeting on the 5th
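One way a forwarder might produce the headers above is sketched here. It assumes unfolded headers in a top-down list and that each rewritten header name occurs only once; the function name and the `new_lines` argument are hypothetical. The original headers are converted to "Prior-" headers in place and the rewritten headers are prepended in the same relative order, which is the layout the example shows.

```python
from typing import Dict, List

def rewrite_headers(headers: List[str],
                    new_lines: Dict[str, str],
                    instance: int) -> List[str]:
    """Rewrite headers and record the originals as Prior- headers.

    `new_lines` maps a lower-cased header name to its full replacement line.
    """
    hits = [i for i, line in enumerate(headers)
            if line.partition(":")[0].lower() in new_lines]
    k = len(hits)
    out = list(headers)
    rewritten = []
    for j, idx in enumerate(hits):
        name, _, value = headers[idx].partition(":")
        # After prepending k lines the Prior- header sits at index k + idx
        # and its rewritten counterpart at index j, so the bottoms-up
        # distance between them is k + idx - j.
        offset = k + idx - j
        out[idx] = f"Prior-{name}: i={instance}; l={offset};{value}"
        rewritten.append(new_lines[name.lower()])
    return rewritten + out
```

Headers prepended afterwards, such as a new ARC-Message-Signature, do not disturb the offsets because both members of each pair shift together.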

3.2. Body Modification

The mailing-list may want to communicate to the recipient that a message went through a mailing-list by adding a text footer describing the mailing-list. Typically this is done by appending that text description at the bottom of the message body text. This specification supports three different footer organizations, depending on the MIME structure, content type and content transfer encoding of the MIME parts of the message. In particular, the organization depends on whether the MIME parts can be concatenated, meaning whether a text footer can be appended and removed without having to re-encode the original content. For example, appending new content to "base64" or "binary" encoded parts will likely introduce artifacts between the original message and the footer. MIME parts that can be concatenated are defined as having one of the following type and subtype of Content-type:

  • text/plain
  • text/html

and one of the following mechanisms for Content-transfer-encoding:

  • 7bit
  • 8bit
  • quoted-printable
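The two lists above translate directly into a predicate. This is a minimal sketch; the function name is hypothetical, and the default arguments reflect the RFC 2045 defaults for messages without MIME headers.

```python
CONCATENABLE_TYPES = {"text/plain", "text/html"}
CONCATENABLE_ENCODINGS = {"7bit", "8bit", "quoted-printable"}

def can_concatenate(content_type: str = "text/plain",
                    transfer_encoding: str = "7bit") -> bool:
    """True if a text footer can be appended without re-encoding."""
    # Drop parameters such as charset and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    encoding = transfer_encoding.strip().lower()
    return (media_type in CONCATENABLE_TYPES
            and encoding in CONCATENABLE_ENCODINGS)
```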

Similarly, non-text MIME parts and complex multipart MIME parts don't lend themselves to appending text. One exception to this is "multipart/alternative", as will be described shortly. To handle these scenarios, this specification calls for adding a new text MIME part footer that can handle any encoding or any MIME structure but requires altering the MIME tree.

The following procedure describes how the mailing-lists MAY choose from one of those formats to add a footer. Here a mailing-list should evaluate in the same order as these three sections, following the steps in each section. If any footer addition is successful, then the footer algorithm stops.

3.2.1. Unstructured or Text Message Body

The simplest structure applies when the message lacks MIME [RFC2045] structure or the top level MIME part can be concatenated. Here the forwarder appends a text footer with the same content type and content transfer encoding as the message body. When a message lacks MIME structure, the default content type is "text/plain" ([RFC2045], Section 5.2) and the default content transfer encoding is "7bit" ([RFC2045], Section 6.1). Next a "Content-footer" header is prepended describing the start and end octet offsets of the footer within the message body. Reversing this transform is a straightforward deletion of the footer at the offsets given.

For example given an original message of:

Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
From: john.doe@example.com

This is the message body.

A text footer can be appended as follows:

Content-footer: i=1; b=25; e=67
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
From: john.doe@example.com

This is the message body.

============
This is a mailing-list footer.
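Appending and removing a footer by offset can be sketched as follows. The sketch assumes that "b" and "e" are octet offsets into the message body forming a half-open range, and that every octet including line breaks is counted; the text does not pin down these conventions, so the numbers this code produces may differ from those in the example above. The function names are hypothetical.

```python
from typing import Tuple

def append_footer(body: bytes, footer: bytes, instance: int) -> Tuple[str, bytes]:
    """Return the Content-footer header line and the extended body."""
    begin = len(body)
    end = begin + len(footer)
    return f"Content-footer: i={instance}; b={begin}; e={end}", body + footer

def remove_footer(content_footer: str, body: bytes) -> bytes:
    """Delete the octets in [b, e) named by a Content-footer header."""
    tags = {}
    for item in content_footer.partition(":")[2].split(";"):
        name, sep, value = item.partition("=")
        if sep:
            tags[name.strip()] = value.strip()
    begin, end = int(tags["b"]), int(tags["e"])
    if not 0 <= begin <= end <= len(body):
        raise ValueError("footer offsets fall outside the body")
    return body[:begin] + body[end:]
```

When several mailing-lists have appended footers, removing them from the largest ARC instance to the smallest restores each earlier body in turn.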

3.2.2. Multipart/Alternative

When the top level MIME part is Content-type "multipart/alternative", the forwarder checks if any of the immediate child MIME parts can be concatenated. If so, it attempts to append a text footer with the same content type and content transfer encoding as the child MIME part. Next it prepends a Content-footer header in the child MIME part header description with the start and end octet offsets from the beginning of the part. If a footer can be added to a child MIME part, then this is considered success and the algorithm can halt. The reversing algorithm checks whether the top level MIME part has Content-type "multipart/alternative". If so, it looks in the immediate child MIME parts and deletes the text at the offsets given by the Content-footer.

For example given an original message of:

Content-Type: multipart/alternative; boundary="abcd"
From: john.doe@example.com

--abcd
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

This is the message body.

--abcd--

A text footer can be appended as follows:

Content-Type: multipart/alternative; boundary="abcd"
From: john.doe@example.com

--abcd
Content-footer: i=1; b=25; e=67
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

This is the message body.

============
This is a mailing-list footer.

--abcd--
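The child selection step can be sketched with Python's standard email package. The function name is hypothetical, and the sketch only locates the first immediate child that can be concatenated; computing offsets and editing the part are left out.

```python
from email import message_from_string
from email.message import Message
from typing import Optional

TEXT_TYPES = {"text/plain", "text/html"}
TEXT_ENCODINGS = {"7bit", "8bit", "quoted-printable"}

def first_concatenable_child(raw: str) -> Optional[Message]:
    """Return the first immediate child that can take a text footer."""
    msg = message_from_string(raw)
    if msg.get_content_type() != "multipart/alternative" or not msg.is_multipart():
        return None
    for part in msg.get_payload():
        # A missing Content-Transfer-Encoding defaults to 7bit (RFC 2045).
        encoding = part.get("Content-Transfer-Encoding", "7bit").strip().lower()
        if part.get_content_type() in TEXT_TYPES and encoding in TEXT_ENCODINGS:
            return part
    return None
```

Only immediate children are examined, matching the text; nested multipart structures are not descended into.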

4. Examples

These examples are informational.

4.1. Originator ⇒ First Mailing-List ⇒ Second Mailing-List ⇒ Receiver

This message is sent through two mailing-lists to some receiver. The originating sender creates and signs a message as follows:

DKIM-Signature: d=example.com
From: john.doe@example.com
Subject: A really big announcement

It's Jane Doe's birthday tomorrow!

The first mailing-list adds a Subject header prefix and message-body footer, and denotes this using the procedures and headers specified in this document. It denotes that it performed mailing-list mutations in the ARC-Message-Signature.

ARC-Message-Signature: i=1; m=mailinglist...
Content-footer: i=1; b=34; e=78
DKIM-Signature: d=mailinglist.example.com...
From: school@mailinglist.example.com
Subject: [school list] A really big announcement
Prior-DKIM-Signature: i=1; l=3; d=example.com...
Prior-From: i=1; l=3; john.doe@example.com
Prior-Subject: i=1; l=3; A really big announcement

It's Jane Doe's birthday tomorrow!

============
This is the school mailing-list.

The second mailing-list adds a Subject header prefix and message-body footer, and denotes this using the procedures and headers specified in this document. It denotes that it performed mailing-list mutations in the ARC-Message-Signature.

ARC-Message-Signature: i=2; m=mailinglist...
Content-footer: i=2; b=79; e=124
DKIM-Signature: d=mailinglist.example.com...
From: district@mailinglist.example.com
Subject: [district list] [school list] A really big announcement
ARC-Message-Signature: i=1; m=mailinglist...
Content-footer: i=1; b=34; e=78
Prior-DKIM-Signature: i=2; l=5; d=mailinglist.example.com
Prior-From: i=2; l=5; school@mailinglist.example.com
Prior-Subject: i=2; l=5; [school list] A really big announcement
Prior-DKIM-Signature: i=1; l=3; d=example.com
Prior-From: i=1; l=3; john.doe@example.com
Prior-Subject: i=1; l=3; A really big announcement

It's Jane Doe's birthday tomorrow!

============
This is the school mailing-list.

============
This is the district mailing-list.

The receiver sees the above message on inbound delivery, and attempts to verify the ARC message signature at i=2. Upon success, the receiver notices that there were mailing-list mutations at ARC set i=2. It applies the REVERSE validation algorithm to reverse the mutations from the second mailing-list. After applying the reverse procedure, the reversed message looks like:

ARC-Message-Signature: i=2; m=mailinglist...
ARC-Message-Signature: i=1; m=mailinglist...
Content-footer: i=1; b=34; e=78
DKIM-Signature: d=mailinglist.example.com
From: school@mailinglist.example.com
Subject: [school list] A really big announcement
Prior-DKIM-Signature: i=1; l=3; d=example.com
Prior-From: i=1; l=3; john.doe@example.com
Prior-Subject: i=1; l=3; A really big announcement

It's Jane Doe's birthday tomorrow!

============
This is the school mailing-list.

Again the receiver attempts to verify the ARC message signature, now at i=1. Upon success, the receiver notices that there were mailing-list mutations at ARC set i=1. It applies the REVERSE validation algorithm to reverse the mutations from the first mailing-list, obtaining the original message. The reversed message looks like:

ARC-Message-Signature: i=2; m=mailinglist...
ARC-Message-Signature: i=1; m=mailinglist...
DKIM-Signature: d=example.com...
From: john.doe@example.com
Subject: A really big announcement

It's Jane Doe's birthday tomorrow!

The DKIM-Signature verifies against the resulting reversed headers and message body, and the passing REVERSE result is published in the ARC-Authentication-Results:

ARC-Authentication-Results: i=3; mx.example.com; reverse=pass...
ARC-Message-Signature: i=2; m=mailinglist...
Content-footer: i=2; b=79; e=124
DKIM-Signature: d=mailinglist.example.com...
From: district@mailinglist.example.com
Subject: [district list] [school list] A really big announcement
ARC-Message-Signature: i=1; m=mailinglist...
Content-footer: i=1; b=34; e=78
Prior-DKIM-Signature: i=2; l=5; d=mailinglist.example.com...
Prior-From: i=2; l=5; school@mailinglist.example.com
Prior-Subject: i=2; l=5; [school list] A really big announcement
Prior-DKIM-Signature: i=1; l=3; d=example.com...
Prior-From: i=1; l=3; john.doe@example.com
Prior-Subject: i=1; l=3; A really big announcement

It's Jane Doe's birthday tomorrow!

============
This is the school mailing-list.

============
This is the district mailing-list.

5. Security Considerations

Care must be used if the reversed transformed message is used for authentication. The reversed transformed message is vulnerable to replay attacks and "hides" the potentially spammy contribution from the mailing-lists.

6. IANA Considerations

There are no requests at this time.

7. Normative References

[RFC2045]
Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, <https://www.rfc-editor.org/rfc/rfc2045>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC6376]
Crocker, D., Ed., Hansen, T., Ed., and M. Kucherawy, Ed., "DomainKeys Identified Mail (DKIM) Signatures", STD 76, RFC 6376, DOI 10.17487/RFC6376, September 2011, <https://www.rfc-editor.org/rfc/rfc6376>.
[RFC7208]
Kitterman, S., "Sender Policy Framework (SPF) for Authorizing Use of Domains in Email, Version 1", RFC 7208, DOI 10.17487/RFC7208, April 2014, <https://www.rfc-editor.org/rfc/rfc7208>.
[RFC7489]
Kucherawy, M., Ed. and E. Zwicky, Ed., "Domain-based Message Authentication, Reporting, and Conformance (DMARC)", RFC 7489, DOI 10.17487/RFC7489, March 2015, <https://www.rfc-editor.org/rfc/rfc7489>.
[RFC8601]
Kucherawy, M., "Message Header Field for Indicating Message Authentication Status", RFC 8601, DOI 10.17487/RFC8601, May 2019, <https://www.rfc-editor.org/rfc/rfc8601>.
[RFC8617]
Andersen, K., Long, B., Ed., Blank, S., Ed., and M. Kucherawy, Ed., "The Authenticated Received Chain (ARC) Protocol", RFC 8617, DOI 10.17487/RFC8617, July 2019, <https://www.rfc-editor.org/rfc/rfc8617>.

Author's Address

Weihaw Chuang
Google, Inc.