Mathematical Mesh: Architecture
Comodo Group Inc.
philliph@comodo.com
Security
The Mathematical Mesh 'The Mesh' is an end-to-end secure infrastructure that facilitates the exchange of configuration and credential data between multiple user devices. The architecture of the Mesh and examples of typical applications are described.

The Mathematical Mesh is a user centered Public Key Infrastructure that uses cryptography to make computers easier to use.

The Mesh uses cryptography and an untrusted cloud service to make management of computer configuration data transparent to the end user. Each Mesh user has a personal profile that is unique to them and contains a set of public keys for maintaining the user's Mesh profile.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Public Key Cryptography permits Internet applications to be secure but requires an infrastructure for key distribution.

WebPKI has been very successful for E-commerce. Client side PKI has been remarkably less successful.

S/MIME and OpenPGP both have significant user bases but both have been limited to a small community: government for S/MIME, system administrators and security researchers for OpenPGP. Use of PKI for authentication of Web users has been negligible.

One of the chief obstacles any network application has to overcome is the critical mass problem. While S/MIME and OpenPGP both have several million users, this is a small fraction of the number of email users. It is likely that the more significant obstacle to deployment is the difficulty of using client side PKI applications. While S/MIME and OpenPGP both claim to reduce the effort of sending secure email 'to a single click', no security feature that requires the user to make a conscious decision to use it every time it is used can ever hope to achieve ubiquitous deployment.

Attempting to automate the process of sending encrypted mail introduces a new problem.
The fact that a user has configured a client to receive encrypted mail in the past does not mean that they are capable of receiving and decrypting such mail today. And even if they are still capable of receiving the encrypted mail today, this capability may be limited to a single machine that they do not currently have access to.

While such objections have been repeatedly dismissed as trivial and 'easily solved' by protocol designers, to ordinary email users they are anything but trivial. If a change is to be made to an infrastructure they rely on daily, it must be completely transparent. An email security infrastructure that interrupts or disrupts their flow of work is totally unacceptable.

Equally overlooked by application designers is the difficulty of configuring applications that support end-to-end security through cryptography. While working on this project, the author attempted to configure a very popular email client to make use of the built in S/MIME capabilities. Even with 25 years of experience, this took over half an hour and required the user to follow a procedure with 17 different steps!

It is important to note that this complexity is not simply a consequence of one poorly designed application; it is the result of the functions of the PKI being divided across three poorly integrated applications on the user's machine, compounded by a set of network protocols that are not designed to provide a seamless user experience.

A similar problem is illustrated by the configuration of SSH. There is a simple way to configure SSH and there is a secure way, and these are not the same. The simple way to configure SSH is for each user to create a single keypair and copy it to each of the machines they might need terminal access to. While this is straightforward, it means that there is no way to mitigate the possibility of the key being compromised if a machine is lost or stolen. Sharing a private key between machines is as bad as sharing a password between accounts.
But attempting to achieve cryptographic hygiene across a diverse collection of devices requires user effort proportional to the square of the number of devices.

A key principle that guides the design of the Mesh is that any set of instructions that can be written down and given to a user can be written down as code and executed by the computer. Public key cryptography is used to automate the process of managing public keys.

Traditional PKI attempted to solve the problems that were of paramount concern to the designers. The designers of S/MIME were concerned with the problem of exchanging secure email within a hierarchical organization and built a (mostly) hierarchical design. The designers of OpenPGP were concerned with the risk of government subversion of the trust infrastructure for nefarious ends.

But what does the user care about? What is the user's principal concern?

The biggest concern I hear from users is not the risk that someone else might get to see their confidential data; rather, it is the risk that they might lose their precious data through some unintended user error.

Being user centered means considering and addressing the requirements that are set by users regardless of whether they are compatible with the designer's view of optimal security. In particular, a user-centered PKI must address requirements such as:

Guaranteeing that data loss does not happen even in the most extreme cases of total loss or destruction of all hardware they used to store their keys.
Mitigating the consequences of user error or carelessness.
Mitigating the consequences of devices being lost or stolen.
Providing mechanisms that allow a user to grant access to their digital assets after their death.

Traditionally cryptographic applications give the user a bewildering choice of algorithms and options.
They can choose to have one RSA keypair used for encryption and signature or they can have separate keys for each; they can encrypt their messages using 3DES or AES at 128, 192 or 256 bit security. And so on.

The Mesh eliminates such choices as unnecessary. Except where required by an application, the Mesh always uses separate keys for encryption and signature operations and only uses the highest strength on offer. Currently, Mesh profiles are always encrypted using RSA with a 2048 bit key, AES with a 256 bit key and SHA-2-512. (The CFRG ECC curves will be added in the near future when implementations become available.)

For similar reasons, every Mesh master profile has an escrow key. The use of key escrow by applications is optional, but every profile has the capability of using it should circumstances require.

All four of the open standards based PKIs that have been developed in the IETF are based on designs that emerged in the mid-1990s. Performing the computations necessary for public key cryptography without noticeable impact on the speed of user interaction was a constraint for even the fastest machines of the day. Consequently, PKI designs attempted to limit the number of cryptographic operations required to the bare minimum necessary. There were long debates over the question of whether certificate chains of more than 3 certificates were acceptable.

Today a 32 bit computer with two processing cores running at 1.2GHz can be bought for $5 and public key algorithms are available that provide a higher level of security for less computation time. In 1995, the idea that a single user might need a hundred public key pairs and a personal PKI to manage them was seen as an extreme scenario.
Today, when the typical user has a phone, a tablet and a laptop and their home is about to fill up with dozens if not hundreds of network connected devices, the need to manage large numbers of keys for individual users is clear.

Almost any information security requirement has a straightforward solution if you are prepared to commit the necessary resources. In general, each degree of cryptographic separation that is required will introduce an additional layer of hierarchy. Traditionally PKI has focused on the problem of delegating trust from one party to another. Such capabilities have been implicit in the model but only expressed in applications to a limited degree.

In the WebPKI, Certificate Authorities maintain the private keys corresponding to their widely distributed root keys in offline facilities that are never connected to the Internet. These keys are in turn used to sign 'intermediate root certificates' corresponding to the keys used to sign end entity certificates. The CA has this capability but the end entity does not. In the PKIX model it is assumed that if the end entity needs to change their cryptographic configuration, they will go back to their CA and get a new certificate.

In the OpenPGP Web of trust, Alice signs the key of Bob who signs the key of Carol. Since everyone is a trust provider in the OpenPGP model, Alice can sign a key for Alice. This mechanism is used to support key rollover, but the task of distributing her new keys to the devices where Alice needs them is a problem left to Alice.

While it is quite possible for a very capable and experienced PKI expert to configure PKIX and OpenPGP applications in a fashion that supports management of personal keys, such use is far beyond what can reasonably be expected of typical users.

The Mesh applies PKI technology to the problem of making PKI use effortless. Once an initial configuration is established, the user is not required to think about PKI at all. Every PKI operation (e.g.
key and certificate rollover) is performed automatically.

The Mesh is a network infrastructure. As with any such infrastructure it is formed not as a set of things but rather as the relationship between those things.

A Mesh user is a person or organization that has established a Mesh personal profile. A Mesh personal profile describes the configuration of the set of devices and applications that the user uses. Each Mesh profile is identified by a globally unique fingerprint value.

A Mesh user MAY have multiple profiles for the purpose of compartmentalizing their online identity and preventing activity in one network context being linked to activity in another network context. The extent to which such separation provides increased privacy is not currently understood. From the point of view of the Mesh protocols, such profiles are held by separate users.

At present the Mesh specifications are designed to support requirements arising from personal use such as the user transferring application settings from one device they own to another device they own.
To deploy the Mesh in an enterprise environment, features such as the ability to import settings provided by the IT department are highly desirable.

The Mesh may be used on any computer that has the ability to connect to a network and perform public key cryptography.

Every device that uses the Mesh has a unique device profile that specifies public key pairs that are unique to that device.

When a device is connected to a user's personal profile, it may be an Administration Device or a Connected Device depending on whether it has been assigned an Administration key.

Administration Device: A device that has access to an administration key for the user's Mesh Personal Profile and is thus authorized to perform actions such as connecting a new device to the profile, removing devices and creating or removing application profiles.

Connected Device: A device that is connected to the Mesh Personal Profile that is not an administration device.

Note that a device MAY be connected to more than one Personal Profile at the same time. For example, an embedded device such as a thermostat might have a single device profile installed during manufacture. If Alice and Bob share the same accommodations where the thermostat is installed, both users might have connected the device to their personal profile.

Users do not interact with a Mesh directly. All interaction with the Mesh is mediated by a Portal Provider. The portal provider is responsible for protecting the Mesh from abuse such as Denial of Service attacks, resource exhaustion, spam, etc.

Users interact with a portal provider through an account which has an account identifier in the traditional [RFC5322] format:

<username>@<domain>

Where <username> is an account identifier that is unique to that portal service and <domain> is the DNS name of the portal service.

The Uniform Data Fingerprint format (UDF) [draft-hallambaker-udf-03] is used to construct names for Mesh data items.
UDF employs Base32 [RFC4648] encoding and the SHA-2-512 and SHA-3-512 digest functions to construct fingerprints of varying lengths.

The choice of fingerprint length is a balance between security and compactness of the representation. Longer fingerprints offer higher security but are less convenient. The minimum fingerprint size recommended for use in the Mesh is 25 characters; this presents a work factor of 2^117 to an attacker attempting to generate a signature key matching a particular fingerprint, approximately the same work factor as RSA with 2048 bit keys.

In contrast to the URLs resolved by the HTTP protocol, which identify a resource by means of a location and a means of retrieval, a UDF fingerprint only identifies a fixed data object and the data type. A UDF resolution service resolves UDF fingerprints in the same manner that an HTTP server resolves URLs but can only provide a response for the set of fingerprints known to that specific server. Unlike the HTTP service, which the client must trust to return the correct resource, every response returned by a UDF resolution service may be validated against the fingerprint presented in the original request. Thus a user of a UDF resolution service is not required to trust it for the integrity of the result received.

UDF fingerprints provide a probabilistically unique identifier for a static data object but do not provide a direct means of identifying resources that change over time. To identify such resources, digital signatures are used. A public key signature pair is created and the UDF fingerprint of the public key parameters serves as the identifier.
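The length arithmetic behind the 2^117 figure, and the way a client can check a resolver's response against a fingerprint, can be illustrated with a short Python sketch. This is a simplified illustration and not the UDF algorithm itself: it assumes each Base32 character carries 5 bits and that the first 8 bits of a fingerprint are a fixed version identifier rather than digest output (an assumption chosen because it reproduces the 2^117 figure). The function names, the version byte value and the grouping into blocks of five characters are hypothetical; the exact construction is defined in [draft-hallambaker-udf-03].

```python
import base64
import hashlib

BITS_PER_BASE32_CHAR = 5
VERSION_PREFIX_BITS = 8  # assumption: one fixed leading byte identifies the digest


def work_factor_bits(fingerprint_chars: int) -> int:
    """Digest bits an attacker must match for a fingerprint of this length."""
    return fingerprint_chars * BITS_PER_BASE32_CHAR - VERSION_PREFIX_BITS


def simplified_fingerprint(data: bytes, chars: int = 25, version: int = 0x60) -> str:
    """Truncated Base32 SHA-2-512 fingerprint (illustrative, not real UDF)."""
    digest = hashlib.sha512(data).digest()
    raw = bytes([version]) + digest  # hypothetical version byte, then the digest
    text = base64.b32encode(raw).decode("ascii").rstrip("=")[:chars]
    # Group into blocks of five characters for readability.
    return "-".join(text[i:i + 5] for i in range(0, len(text), 5))


def matches(data: bytes, fingerprint: str) -> bool:
    """Check a returned object against the fingerprint that was requested."""
    chars = len(fingerprint.replace("-", ""))
    return simplified_fingerprint(data, chars) == fingerprint


profile_bytes = b"example profile data"
fingerprint = simplified_fingerprint(profile_bytes)
bits = work_factor_bits(25)
```

Under these assumptions a 25 character fingerprint carries 125 bits, of which 117 come from the digest. Because `matches` recomputes the fingerprint locally, a tampered response from a resolution service is detected without the client having to trust the service.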
The private key is then used to sign either the data object itself or a data object containing a further public key.

The application/pkix-keyinfo content type described in [draft-hallambaker-udf-03] is used to create identifiers for public keys.

A Mesh profile is a set of configuration settings that is bound to a persistent identifier (a UDF fingerprint).

The Mesh protocols do not put any limit on the size or complexity of Mesh profiles, but a Mesh Portal SHOULD impose such limits as are appropriate to avoid abuse such as denial of service attacks. JavaScript Object Notation (JSON) [RFC7159] encoding is used to encode all Mesh data objects except for low level cryptographic formats where other encodings are already established.

The Mesh defines two new protocols:

The Mesh Portal Protocol is a client-server protocol that mediates access to a Mesh.

The Intermesh protocol is used to exchange Mesh profile data between portals. It is a flood fill protocol that applies the same principles demonstrated in NNTP [RFC4644].

The DNS SRV mechanism is used for

The principle of transparency was introduced by the Certificate Transparency specification [RFC6962]. Transparency is the ability to audit a system using only information that is available to the users of the system. If the system is a public service, all the data used to audit the service must be public.

The Mesh uses strong encryption and

Device profile:
Is unique to each device. If a device has multiple accounts, each account would typically require a separate device profile.
Has separate keys for encryption, authentication and signature.
Typically generated on the device.
Once generated, is typically constant until the device is reset.
Used to provision application keys out to a device.

Master profile:
Is signed by the Master Signing Key, which is in turn validated by the fingerprint.
Contains a Master Signing Key, a set of Administration Keys and a set of Escrow Keys.
Changes infrequently, usually only when the set of administration devices changes or a new escrow key is added.

Personal profile:
Is signed by an administration key.
For convenience, the master profile is included as an attachment.
Changes when there is a significant change to the configuration, such as the addition of a new device or application.

Application profile:
Is signed by an administration key or an application administration key (if specified for the application).
Contains the application configuration data.
Is encrypted to the device keys.
Changes when the application configuration is changed or when devices are added or removed.

It may be desirable to partition the application profiles so that it is not necessary for every device to download the whole thing. For example, a signed manifest would allow the portal to strip out just the parts of the profile that are relevant to a device.

Not necessarily instantaneous; there may be latency between an update being published and it being available. This is not a priority at the moment.

May be used to support local replication or replication between providers.

It is anticipated that the Intermesh Protocol will operate at a substantially greater latency than the Mesh Portal Protocol, probably resynchronizing on an hourly or even daily basis.

Portals are not required to forward every update to the Intermesh. Only updates that have not been superseded within the time quantum need be published.

Each Portal runs a local append only log of every transaction. This is periodically closed and a new log started.
Some time after the log is closed, a hash structure is calculated across the log entries and broadcast to the other participants in the InterMesh. After a quorum of hash values has been received, each participant in the exchange calculates a new master hash entry which will be added to the log before the next checkpoint occurs.

The participants exchange log records, but this may be on a limited basis. If the InterMesh has a hundred members, it is not necessary for every single node to have every single entry in real time. It is sufficient for each node to have knowledge of a partner that can provide it on demand.

[Account request does not specify the portal in the request body, only the HTTP package includes this information. This is probably a bug.]

A user interacts with a Mesh service through a Mesh portal provider with which she establishes a portal account. For user convenience, a portal account identifier has the familiar <username>@<domain> format established in [RFC822]. For example, Alice selects example.com as her portal provider and chooses the account name alice. Her portal account identifier is alice@example.com.

A user MAY establish accounts with multiple portal providers and/or change their portal provider at any time they choose.

The first step in creating a new account is to check whether the chosen account identifier is available. This allows a client to validate user input and, if necessary, warn the user that they need to choose a new account identifier when the data is first entered.

The ValidateRequest message contains the requested account identifier and an optional language parameter to allow the service to provide informative error messages in a language the user understands. The Language field contains a list of ISO language identifier codes in order of preference, most preferred first.

The ValidateResponse message returns the result of the validation request in the Valid field.
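As an illustration of the exchange just described, the following Python sketch builds a validation request and a matching response. Only the message names ValidateRequest and ValidateResponse and the fields Language and Valid come from the text above; the Account field name, the wrapping of each message in an object keyed by its name, and the in-memory set of taken accounts are assumptions made for the example, not the wire format defined by the Mesh specifications.

```python
import json


def build_validate_request(account: str, languages: list) -> str:
    """Serialize a validation request; 'Account' is a hypothetical field name."""
    body = {"ValidateRequest": {"Account": account, "Language": languages}}
    return json.dumps(body)


def handle_validate_request(raw: str, taken_accounts: set) -> str:
    """Toy service side: an identifier is valid if nobody holds it yet."""
    request = json.loads(raw)["ValidateRequest"]
    valid = request["Account"] not in taken_accounts
    return json.dumps({"ValidateResponse": {"Valid": valid}})


# Alice checks her preferred identifier, asking for English then French messages.
request_text = build_validate_request("alice@example.com", ["en", "fr"])
response_text = handle_validate_request(request_text, {"bob@example.com"})
response = json.loads(response_text)["ValidateResponse"]
```

A client would use the Valid value only as an early hint for the user interface, since the check and the later account creation are separate transactions.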
Note that even if the value true is returned, a subsequent account creation request MAY still fail.

[Note that for the sake of concise presentation, the HTTP binding information is omitted from future examples.]

The first step in creating a new personal profile is to create a Master Profile object. This contains the long term Master Signing Key that will remain constant for the life of the profile, at least one Online Signature Key to be used for administering the personal profile and (optionally) one or more master escrow keys.

For convenience, the descriptions of the Master Signing Key, Online Signing Keys and Escrow Keys typically include PKIX certificates signed by the Master Signing Key. This allows PKIX based applications to make use of PKIX certificate chains to express the same trust relationships described in the Mesh.

The Master Profile is always signed using the Master Signing Key:

Since the device used to create the personal profile is typically connected to the profile, a Device profile entry is created for it. This contains a Device Signing Key, a Device Encryption Key and a Device Authentication Key:

The Device Profile is signed using the Device Signing Key:

A personal profile would typically contain at least one application when first created. For the sake of demonstration, we will do this later. The personal profile thus consists of the master profile and the device profile:

The personal profile is then signed using the Online Signing Key:

Once the signed personal profile is created, the client can finally make the request for the service to create the account. The request object contains the requested account identifier and profile:

The service reports the success (or failure) of the account creation request:

Connecting a device to a profile requires the client on the new device to interact with a client on a device that has administration capabilities, i.e. it has access to an Online Signing Key.
Since clients cannot interact directly with other clients, a service is required to mediate the connection. This service is provided by a Mesh portal provider.

All service transactions are initiated by the clients. First the connecting device posts ConnectStart, after which it may poll for the outcome of the connection request using ConnectStatus. Periodically, the Administration Device polls for a list of pending connection requests using ConnectPending. After processing a pending request, the administration device posts the result using ConnectComplete:

The first step in the process is for the client to generate a device profile. Ideally the device profile is bound to the device in a read-only fashion such that applications running on the device can make use of the decryption and authentication keys but these private keys cannot be extracted from the device:

The device profile is then signed:

One of the main architectural principles of the Mesh is bilateral authentication. Every device that is connected to a Mesh profile MUST authenticate the profile it is connecting to and every Mesh profile administrator MUST authenticate devices that are connected.

Having created the necessary profile, the device MUST verify that it is connecting to the correct Mesh profile. The best mechanism for achieving this purpose depends on the capabilities of the device being connected. The administration device obviously requires some means of communicating with the user to serve its function. But the device being connected may have a limited display capability or no user interaction capability at all.

If the device has user input and display capabilities, it can verify that it is connecting to the correct profile by first requesting the user enter the portal account of the profile they wish to connect to, retrieving the profile associated with the account and displaying the profile fingerprint.
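The order of the four client-initiated transactions named above (ConnectStart, ConnectStatus, ConnectPending, ConnectComplete) can be sketched with an in-memory stand-in for the portal. This is only an illustration of the polling pattern: the ToyPortal class, its method signatures and the status strings are hypothetical, and the signatures, profile data and fingerprint verification that the real transactions carry are omitted entirely.

```python
class ToyPortal:
    """In-memory stand-in for a portal: no networking, signatures or encryption."""

    def __init__(self):
        # Maps a device fingerprint to the account it asked to join and its status.
        self._requests = {}

    def connect_start(self, account, device_fingerprint):
        """Connecting device posts its request."""
        self._requests[device_fingerprint] = {"account": account, "status": "Pending"}

    def connect_status(self, device_fingerprint):
        """Connecting device polls for the outcome."""
        return self._requests[device_fingerprint]["status"]

    def connect_pending(self, account):
        """Administration device polls for requests awaiting a decision."""
        return [
            fingerprint
            for fingerprint, request in self._requests.items()
            if request["account"] == account and request["status"] == "Pending"
        ]

    def connect_complete(self, device_fingerprint, accepted):
        """Administration device posts the user's decision."""
        status = "Accepted" if accepted else "Rejected"
        self._requests[device_fingerprint]["status"] = status


portal = ToyPortal()
portal.connect_start("alice@example.com", "DEVICE-FP-1")   # new device
first_status = portal.connect_status("DEVICE-FP-1")        # new device polls
pending = portal.connect_pending("alice@example.com")      # administration device polls
portal.connect_complete(pending[0], accepted=True)         # administration device decides
final_status = portal.connect_status("DEVICE-FP-1")        # new device polls again
```

Neither client ever contacts the other; each only posts to or polls the portal, which is why the status seen by the connecting device can change only after the administration device has posted a result.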
The client requests the profile for the requested account name:

The response contains the requested profile information.

Having received the profile data, the user can then verify that the device is attempting to connect to the correct profile by verifying that the fingerprint shown by the device attempting to connect is correct.

Connection of an Internet of Things 'IoT' device that does not have the ability to accept user input requires a mechanism by which the user can identify the device they wish to connect to their profile and a mechanism to authenticate the profile to the device.

If the connecting device has a wired communication capability such as a USB port, this MAY be used to effect the device connection using a standardized interaction profile. But an increasing number of constrained IoT devices are only capable of wireless communication. Configuration of such devices for the purpose of the Mesh requires that we also consider configuration of the wireless networking capabilities at the same time. The precise mechanism by which this is achieved is therefore outside the scope of this particular document. However, prototypes have been built and are being considered that make use of some or all of the following communication techniques:

Wired serial connection (RS232, RS485).
DHCP signalling.
Machine readable device identifiers (barcodes, QR codes).
Default device profile installed during manufacture.
Optical communication path using the camera on the administrative device and a status light on the connecting device to communicate the device identifier, challenge nonce and confirm the profile fingerprint.
Speech output on an audio capable connecting device.
After the user verifies the device fingerprint as correct, the client posts a device connection request to the portal:

The portal verifies that the request is acceptable and returns the transaction result:

The client can poll the portal for the status of pending requests at any time (modulo any service throttling restrictions at the service side). But the request status will only change when an update is posted by an administration device.

Since the user is typically connecting a device to their profile, the next step in connecting the device is to start the administration client. When started, the client polls for pending connection requests using ConnectPendingRequest.

The service responds with a list of pending requests:

The device profile is added to the Personal profile, which is then signed by the online signing key. The administration client publishes the updated profile to the Mesh through the portal:

As usual, the service returns the response code:

Having accepted the device and connected it to the profile, the administration client creates and signs a connection completion result which is posted to the portal using ConnectCompleteRequest:

Again, the service returns the response code:

As stated previously, the connecting device polls the portal periodically to determine the status of the pending request using ConnectStatusRequest:

If the response is that the connection status has not changed, the service MAY return a response that specifies a minimum retry interval. In this case however there is a connection result:

[Should probably unpack further.]

Application profiles are published separately from the personal profile to which they are linked. This allows a device to be given administration capability for a particular application without granting administration capability for the profile itself and the ability to connect additional profiles and devices.
Another advantage of this separation is that an application profile might be managed by a separate party. In an enterprise, the application profile for a user's corporate email account could be managed by the corporate IT department.

A user MAY have multiple application profiles for the same application. If a user has three email accounts, they would have three email application profiles, one for each account.

In this example, the user has requested a PasswordProfile to be created. When populated, this records the usernames and passwords for the various Web sites at which the user has created accounts and has requested that the Web browser store in the Mesh.

Unlike a traditional password management service, the data stored in the Password Profile is encrypted end to end and can only be decrypted by the devices that hold a decryption key.

The application profile is published to the Mesh in the same way as any other profile update, via a Publish transaction:

The service returns a status response.

Note that the degree of verification to be performed by the service when an application profile is published is an open question.

Having created the application profile, the administration client adds it to the personal profile and publishes it:

Note that if the publication were to happen in the reverse order, with the personal profile being published before the application profile, the personal profile might be rejected by the portal for inconsistency as it links to a nonexistent application profile. The value of such a check is debatable, however. It might well be preferable not to make such checks, as this permits an application profile to have a degree of anonymity.

The Mesh invites users to put all their data eggs in one cryptographic basket. If the private keys in their master profile are lost, they could lose all their digital assets.

The debate over the desirability of key escrow is a complex one.
Not least because voluntary key escrow by the user to protect the user's digital assets is frequently conflated with mechanisms to support 'Lawful Access' through government managed backdoors.

Accidents happen and so do disasters. For most users and most applications, data loss is a much more important concern than data disclosure. The option of using a robust key recovery mechanism is therefore essential if the use of strong cryptography is to become ubiquitous.

There are of course circumstances in which some users may prefer to risk losing some of their data rather than risk disclosure. Since any key recovery infrastructure necessarily introduces the risk of coercion, the choice of whether to use key recovery or not is left to the user to decide.

The Mesh permits users to escrow their private keys in the Mesh itself in an OfflineEscrowEntry. Such entries are encrypted using the strongest degree of encryption available under a symmetric key. The symmetric key is then in turn split using Shamir secret sharing with an n of m threshold scheme.

The OfflineEscrowEntry identifier is a UDF fingerprint of the symmetric key used to encrypt the data. This guarantees that a party that has the decryption key has the ability to locate the corresponding Escrow entry.

The OfflineEscrowEntry is published using the usual Publish transaction:

The response indicates success or failure:

To recover a profile, the user MUST supply the necessary number of secret shares. These are used to reconstruct the symmetric key, from which the UDF fingerprint to use as the locator in a Get transaction is calculated:

If the transaction succeeds, GetResponse is returned with the requested data.

The client can now decrypt the OfflineEscrowEntry to recover the private key(s).
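The splitting and recovery steps described above can be illustrated with a minimal Python sketch of Shamir secret sharing over a prime field. The choice of prime, the share encoding and the function names are assumptions made for the example, and the locator is a plain truncated SHA-2-512 hex digest standing in for a UDF fingerprint; none of this is the encoding defined by the Mesh specifications.

```python
import hashlib
import secrets

PRIME = 2**521 - 1  # a Mersenne prime, comfortably larger than a 256 bit key


def split_secret(secret: int, threshold: int, shares: int):
    """Return `shares` points on a random polynomial f with f(0) = secret.

    Any `threshold` of the points determine f; fewer reveal nothing about f(0).
    """
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        result = 0
        for coefficient in reversed(coefficients):  # Horner's rule
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points):
    """Recover f(0) by Lagrange interpolation over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -xj) % PRIME
                denominator = (denominator * (xi - xj)) % PRIME
        secret = (secret + yi * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret


def locator(key: bytes) -> str:
    """Stand-in for the UDF fingerprint of the symmetric key (not real UDF)."""
    return hashlib.sha512(key).hexdigest()[:32]


key = secrets.token_bytes(32)                      # 256 bit symmetric key
entry_id = locator(key)                            # identifier the entry is published under
shares = split_secret(int.from_bytes(key, "big"), threshold=3, shares=5)
recovered = recover_secret(shares[1:4]).to_bytes(32, "big")
```

Any three of the five shares reconstruct the key, and recomputing `locator(recovered)` yields the identifier under which the encrypted entry was published, so the holder of enough shares needs no other record of where the escrow entry is stored.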
Can be performed by any party that is a participant in the InterMesh protocol or subsequently in an offline transaction.

Security Considerations are addressed in the companion document [draft-hallambaker-mesh-reference-02].

IANA Considerations are addressed in the companion document [draft-hallambaker-mesh-reference-02].

Comodo Group: Egemen Tas, Melhi Abdulhayoğlu, Rob Stradling, Robin Alden.

[RFC2119] Key words for use in RFCs to Indicate Requirement Levels
[RFC3977] Network News Transfer Protocol (NNTP)
[RFC4644] Network News Transfer Protocol (NNTP) Extension for Streaming Feeds
[RFC4648] The Base16, Base32, and Base64 Data Encodings
[RFC5322] Internet Message Format
[RFC6962] Certificate Transparency
[RFC7159] The JavaScript Object Notation (JSON) Data Interchange Format