Should We Chat, Too? Security Analysis of WeChat’s MMTLS Encryption Protocol



  • We performed the first public analysis of the security and privacy properties of MMTLS, the main network protocol used by WeChat, an app with over one billion monthly active users.
  • We found that MMTLS is a modified version of TLS 1.3, with many of the modifications that WeChat developers made to the cryptography introducing weaknesses.
  • Further analysis revealed that earlier versions of WeChat used a less secure, custom-designed protocol that contains multiple vulnerabilities, which we describe as “Business-layer encryption”. This layer of encryption is still being used in addition to MMTLS in modern WeChat versions.
  • Although we were unable to develop an attack to completely defeat WeChat’s encryption, the implementation is inconsistent with the level of cryptography you would expect in an app used by a billion users, such as its use of deterministic IVs and lack of forward secrecy.
  • These findings contribute to a larger body of work that suggests that apps in the Chinese ecosystem fail to adopt cryptographic best practices, opting instead to invent their own, often problematic systems.
  • We are releasing technical tools and further documentation of our technical methodologies in an accompanying Github repository. These tools and documents, along with this main report, will assist future researchers to study WeChat’s inner workings.

WeChat, with over 1.2 billion monthly active users, stands as the most popular messaging and social media platform in China and third globally. As indicated by market research, WeChat’s network traffic accounted for 34% of Chinese mobile traffic in 2018. WeChat’s dominance has monopolized messaging in China, making it increasingly unavoidable for those in China to use. With an ever-expanding array of features, WeChat has also grown beyond its original purpose as a messaging app.

Despite the universality and importance of WeChat, there has been little study of the proprietary network encryption protocol, MMTLS, used by the WeChat application. This knowledge gap serves as a barrier for researchers in that it hampers additional security and privacy study of such a critical application. In addition, home-rolled cryptography is unfortunately common in many incredibly popular Chinese applications, and there have historically been issues with cryptosystems developed independently of well-tested standards such as TLS.

This work is a deep dive into the mechanisms behind MMTLS and the core workings of the WeChat program. We compare the security and performance of MMTLS to TLS 1.3 and discuss our overall findings. We also provide public documentation and tooling to decrypt WeChat network traffic. These tools and documents, along with our report, will assist future researchers to study WeChat’s privacy and security properties, as well as its other inner workings.

This report consists of a technical description of how WeChat launches a network request and its encryption protocols, followed by a summary of weaknesses in WeChat’s protocol, and finally a high-level discussion of WeChat’s design choices and their impact. The report is intended for privacy, security, or other technical researchers interested in furthering the privacy and security study of WeChat. For non-technical audiences, we have summarized our findings in this FAQ.

Prior work on MMTLS and WeChat transport security

Code internal to the WeChat mobile app refers to its proprietary TLS stack as MMTLS (MM is short for MicroMessenger, which is a direct translation of 微信, the Chinese name for WeChat) and uses it to encrypt the bulk of its traffic.

There is limited public documentation of the MMTLS protocol. This technical document from WeChat developers describes in which ways it is similar to and different from TLS 1.3, and attempts to justify various decisions they made to either simplify or change how the protocol is used. In this document, there are various key differences they identify between MMTLS and TLS 1.3, which help us understand the various modes of usage of MMTLS.

Wan et al. conducted the most comprehensive study of WeChat transport security in 2015 using standard security analysis techniques. However, this analysis was performed before the deployment of MMTLS, WeChat’s upgraded security protocol. In 2019, Chen et al. studied the login process of WeChat and specifically studied packets that are encrypted with TLS and not MMTLS.

As for MMTLS itself, in 2016 WeChat developers published a document describing the design of the protocol at a high level that compares the protocol with TLS 1.3. Other MMTLS publications focus on website fingerprinting-type attacks, but none specifically perform a security evaluation. A few Github repositories and blog posts look briefly into the wire format of MMTLS, though none are comprehensive. Though there has been little work studying MMTLS specifically, previous Citizen Lab reports have discovered security flaws of other cryptographic protocols designed and implemented by Tencent.

We analyzed two versions of the WeChat Android app:

  • Version 8.0.23 (APK “versionCode” 2160) released on May 26, 2022, downloaded from the WeChat website.
  • Version 8.0.21 (APK “versionCode” 2103) released on April 7, 2022, downloaded from Google Play Store.

All findings in this report apply to both of these versions.

We used an account registered to a U.S. phone number for the analysis, which changes the behavior of the application compared to a mainland Chinese number. Our setup may not be representative of all WeChat users, and the full limitations are discussed further below.

For dynamic analysis, we analyzed the application installed on a rooted Google Pixel 4 phone and an emulated Android OS. We used Frida to hook the app’s functions and manipulate and export application memory. We also performed network analysis of WeChat’s network traffic using Wireshark. However, due to WeChat’s use of nonstandard cryptographic libraries like MMTLS, standard network traffic analysis tools that might work with HTTPS/TLS do not work for all of WeChat’s network activity. Our use of Frida was paramount for capturing the data and information flows we detail in this report. These Frida scripts are designed to intercept WeChat’s request data immediately before WeChat sends it to its MMTLS encryption module. The Frida scripts we used are published in our Github repository.

For static analysis, we used Jadx, a popular Android decompiler, to decompile WeChat’s Android Dex files into Java code. We also used Ghidra and IDA Pro to decompile the native libraries (written in C++) bundled with WeChat.

Notation

In this report, we reference a lot of code from the WeChat app. When we reference any code (including file names and paths), we will style the text using monospace fonts to indicate it is code. If a function is referenced, we will add empty parentheses after the function name, like this: somefunction(). The names of variables and functions that we show may come from one of the following three sources:

  1. The original decompiled name.
  2. In cases where the name cannot be decompiled into a meaningful string (e.g., the symbol name was not compiled into the code), we rename it according to how the nearby internal log messages reference it.
  3. In cases where there is not enough information for us to tell the original name, we name it according to our understanding of the code. In such cases, we will note that these names are given by us.

In the cases where both the decompiled name and the log message name of a function are available, they are generally consistent. Bolded or italicized terms can refer to higher-level concepts or parameters we have named.

Utilization of open source components

We also identified open source components being used by the project, the two largest being OpenSSL and Tencent Mars. Based on our analysis of decompiled WeChat code, large parts of its code are identical to Mars. Mars is an “infrastructure component” for mobile applications, providing common features and abstractions that are needed by mobile applications, such as networking and logging.

By compiling these libraries separately with debug symbols, we were able to import function and class definitions into Ghidra for further analysis. This helped tremendously with our understanding of other non-open-source code in WeChat. For instance, when we were analyzing the network functions decompiled from WeChat, we found a lot of them to be highly similar to the open source Mars, so we could just read the source code and comments to understand what a function was doing. What was not included in open source Mars are encryption-related functions, so we still needed to read decompiled code, but even in these cases we were aided by various functions and structures that we already knew from the open source Mars.

Matching decompiled code to its source

In the internal logging messages of WeChat, which contain source file paths, we noticed three top-level directories, which we have highlighted below:

  • /home/android/devopsAgent/workspace/p-e118ef4209d745e1b9ea0b1daa0137ab/src/mars/
  • /home/android/devopsAgent/workspace/p-e118ef4209d745e1b9ea0b1daa0137ab/src/mars-wechat/
  • /home/android/devopsAgent/workspace/p-e118ef4209d745e1b9ea0b1daa0137ab/src/mars-private/

The source files under “mars” can all be found in the open source Mars repository as well, while source files in the other two top-level directories cannot be found in the open source repository. To demonstrate, below is a small section of decompiled code from libwechatnetwork.so:

    XLogger::XLogger((XLogger *)&local_2c8,5,"mars::stn",
                "/home/android/devopsAgent/workspace/p-e118ef4209d745e1b9ea0b1daa0137ab/src/mars/mars/stn/src/longlink.cc"
                ,"Send",0xb2,false,(FuncDef0 *)0x0);
    XLogger::Assert((XLogger *)&local_2c8,"tracker_.get()");
    XLogger::~XLogger((XLogger *)&local_2c8);

From its similarity, it is highly likely that this section of code was compiled from this line in the Send() function, defined in the longlink.cc file from the open source repository:

xassert2(tracker_.get());

Reusing this observation, whenever our decompiler is unable to determine the name of a function, we can use logging messages within the compiled code to determine its name. Moreover, if the source file is from open source Mars, we can read its source code as well.
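The path-matching idea above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in the report: the function name classify_source_paths() and the regular expression are our own, and the sketch assumes the source paths are stored as plain ASCII strings inside the shared object.

```python
import re

# Matches build-time source paths embedded in logging calls, e.g.
# ".../src/mars/mars/stn/src/longlink.cc". Group 1 is the top-level
# directory, group 2 is the path below it. The set of file extensions
# is an assumption made for this sketch.
SOURCE_PATH = re.compile(
    rb"/src/(mars-wechat|mars-private|mars)/([\w./-]+\.(?:cc|cpp|c|h|mm))"
)


def classify_source_paths(binary: bytes) -> dict[str, set[str]]:
    """Group embedded source paths by their top-level directory."""
    found = {"mars": set(), "mars-wechat": set(), "mars-private": set()}
    for top_level, relative in SOURCE_PATH.findall(binary):
        found[top_level.decode()].add(relative.decode())
    return found
```

Feeding the raw bytes of a library such as libwechatnetwork.so into classify_source_paths() would give a rough picture of which log call sites come from the open source tree ("mars") and which do not.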

Three parts of Mars

In a few articles on the Mars wiki, Tencent developers described their motivations for developing Mars.

According to its developers, Mars and its STN module are comparable to networking libraries such as AFNetworking and OkHttp, which are widely used in other mobile apps.

One of the technical articles released by the WeChat development team describes the process of open-sourcing Mars. According to the article, they had to separate WeChat-specific code, which was kept private, from the general-use code, which was open sourced. In the end, three parts were separated from each other:

  • mars-open: to be open sourced, independent repository.
  • mars-private: potentially open sourced, depends on mars-open.
  • mars-wechat: WeChat business logic code, depends on mars-open and mars-private.

These three names match the top-level directories we found earlier if we take “mars-open” to be in the “mars” top-level directory. Using this knowledge, when reading decompiled WeChat code, we could easily tell whether it was WeChat-specific or not. From our reading of the code, mars-open contains basic and generic structures and functions, for instance, buffer structures, config stores, thread management and, most importantly, the module named “STN” responsible for network transmission. (We were unable to determine what STN stands for.) On the other hand, mars-wechat contains the MMTLS implementation, and mars-private is not closely related to the features within our research scope.

As a technical side note, the open source Mars compiles to just one object file named “libmarsstn.so”. However, in WeChat, multiple shared object files reference code within the open source Mars, including the following:

  • libwechatxlog.so
  • libwechatbase.so
  • libwechataccessory.so
  • libwechathttp.so
  • libandromeda.so
  • libwechatmm.so
  • libwechatnetwork.so

Our research focuses on the transport protocol and encryption of WeChat, which is implemented mainly in libwechatmm.so and libwechatnetwork.so. In addition, we inspected libMMProtocalJni.so, which is not part of Mars but contains functions for cryptographic calculations. We did not inspect the other shared object files.

Matching Mars versions

Despite being able to find open source code for parts of WeChat, at the beginning of our research we were unable to pinpoint the specific version of the mars-open source code that was used to build WeChat. Later, we found version strings contained in libwechatnetwork.so. For WeChat 8.0.21, searching for the string “MARS_” yielded the following:

MARS_BRANCH: HEAD
MARS_COMMITID: d92f1a94604402cf03939dc1e5d3af475692b551
MARS_PRIVATE_BRANCH: HEAD
MARS_PRIVATE_COMMITID: 193e2fb710d2bb42448358c98471cd773bbd0b16
MARS_URL:
MARS_PATH: HEAD
MARS_REVISION: d92f1a9
MARS_BUILD_TIME: 2022-03-28 21:52:49
MARS_BUILD_JOB: rb/2022-MAR-p-e118ef4209d745e1b9ea0b1daa0137ab-22.3_1040

The specific MARS_COMMITID (d92f1a…) exists in the open source Mars repository. This version of the source code also matches the decompiled code.
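As a rough sketch of how such version strings could be pulled out of a library programmatically, the snippet below scans raw bytes for "MARS_" fields. The helper name extract_mars_build_info() is ours, and the sketch assumes each field is stored as printable ASCII in the form "NAME: value", terminated by a newline or NUL byte.

```python
import re

# A field name such as MARS_COMMITID, a colon, an optional space, and then
# any run of printable ASCII (which stops at a newline or NUL byte).
MARS_FIELD = re.compile(rb"(MARS_[A-Z_]+): ?([\x20-\x7e]*)")


def extract_mars_build_info(binary: bytes) -> dict[str, str]:
    """Return the MARS_* build fields found in a binary blob."""
    return {
        name.decode(): value.decode().strip()
        for name, value in MARS_FIELD.findall(binary)
    }
```

The resulting MARS_COMMITID value can then be looked up in the open source Mars repository to check out the matching source tree.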

Pinpointing the specific source code version helped us tremendously with Ghidra’s decompilation. Since a lot of the core data structures used in WeChat are from Mars, by importing the known data structures, we can observe the non-open-sourced code accessing structure fields and infer its purpose.

Limitations

This investigation only looks at client behavior and is therefore subject to other common limitations in privacy research that can only perform client analysis. Much of the data that the client transmits to WeChat servers may be required for functionality of the application. For instance, WeChat servers can certainly see chat messages since WeChat can censor them according to their content. We cannot always measure what Tencent is doing with the data that they collect, but we can make inferences about what is possible. Previous work has made certain limited inferences about data sharing, such as that messages sent by non-mainland-Chinese users are used to train censorship algorithms for mainland Chinese users. In this report, we focus on the version of WeChat for non-mainland-Chinese users.

Our investigation was also limited due to legal and ethical constraints. It has become increasingly difficult to obtain Chinese phone numbers for investigation due to the strict phone number and associated government ID requirements. Therefore, we did not test on Chinese phone numbers, which causes WeChat to behave differently. In addition, without a mainland Chinese account, the types of interaction with certain features and Mini Programs were limited. For instance, we did not perform financial transactions on the application.

Our primary analysis was limited to analyzing only two versions of WeChat Android (8.0.21 and 8.0.23). However, we also re-confirmed that our tooling works on WeChat 8.0.49 for Android (released April 2024) and that the MMTLS network format matches that used by WeChat 8.0.49 for iOS. Testing different versions of WeChat, the backwards-compatibility of the servers with older versions of the application, and testing on a variety of Android operating systems with variations in API version are great avenues for future work.

Within the WeChat Android app, we focused on its networking components. Usually, within a mobile application (and in most other programs as well), all other components will defer the work of communicating over the network to the networking components. Our research is not a complete security and privacy audit of the WeChat app, as even if the network communication is properly protected, other parts of the app still need to be secure and private. For instance, an app would not be secure if the server accepts any password to an account login, even if the password is confidentially transmitted.

In the Github repository, we have released tooling that can log keys using Frida and decrypt network traffic that is captured during the same period of time, as well as samples of decrypted payloads. In addition, we have provided additional documentation and our reverse-engineering notes from studying the protocol. We hope that these tools and documentation will further aid researchers in the study of WeChat.

As with any other app, WeChat is composed of various components. Components within WeChat can invoke the networking components to send or receive network transmissions. In this section, we provide a highly simplified description of the process and components surrounding sending a network request in WeChat. The actual process is much more complex, which we explain in more detail in a separate document. The specifics of data encryption are discussed in the next section, “WeChat network request encryption”.

In the WeChat source code, each API is referred to as a different “Scene”. For instance, during the registration process, there is one API that submits all new account information provided by the user, called NetSceneReg. NetSceneReg is referred to by us as a “Scene class”. Other components could start a network request towards an API by calling the particular Scene class. In the case of NetSceneReg, it is usually invoked by a click event of a button UI component.

Upon invocation, the Scene class would prepare the request data. The structure of the request data (as well as the response) is defined in “RR classes”. (We dub them RR classes because they tend to have “ReqResp” in their names.) Usually, one Scene class would correspond to one RR class. In the case of NetSceneReg, it corresponds to the RR class MMReqRespReg2, and contains fields like the desired username and phone number. For each API, its RR class also defines a unique internal URI (usually starting with “/cgi-bin”) and a “request type” number (an approximately 2–4 digit integer). The internal URI and request type number are often used throughout the code to identify different APIs. Once the data is prepared by the Scene class, it is sent to MMNativeNetTaskAdapter.

MMNativeNetTaskAdapter is a task queue manager: it manages and monitors the progress of each network connection and API request. When a Scene class calls MMNativeNetTaskAdapter, it places the new request (a task) onto the task queue, and calls the req2Buf() function. req2Buf() serializes the request Protobuf object that was prepared by the Scene class into bytes, then encrypts the bytes using Business-layer Encryption.

Finally, the resultant ciphertext from Business-layer Encryption is sent to the “STN” module, which is part of Mars. STN then encrypts the data again using MMTLS Encryption. Then, STN establishes the network transport connection, and sends the MMTLS Encryption ciphertext over it. In STN, there are two types of transport connections: Shortlink and Longlink. Shortlink refers to an HTTP connection that carries MMTLS ciphertext. Shortlink connections are closed after one request-response cycle. Longlink refers to a long-lived TCP connection. A Longlink connection can carry multiple MMTLS encrypted requests and responses without being closed.

WeChat network requests are encrypted twice, with different sets of keys. Serialized request data is first encrypted using what we call the Business-layer Encryption, as internal encryption is referred to in this blog post as occurring at the Business-layer. The Business-layer Encryption has two modes: Symmetric Mode and Asymmetric Mode. The resultant Business-layer-encrypted ciphertext is appended to metadata about the Business-layer request. Then, the Business-layer requests (i.e., request metadata and inner ciphertext) are additionally encrypted, using MMTLS Encryption. The final resulting ciphertext is then serialized as an MMTLS Request and sent over the wire.
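The ordering of the two layers can be summarized with a small sketch. It only models the order of operations described above; the function name encrypt_request() is hypothetical, and the two encryption steps are passed in as callables because the real ciphers, keys, and framing are covered in later sections.

```python
from typing import Callable

# Any function that turns plaintext bytes into ciphertext bytes.
Encryptor = Callable[[bytes], bytes]


def encrypt_request(
    serialized_request: bytes,
    metadata: bytes,
    business_layer_encrypt: Encryptor,
    mmtls_encrypt: Encryptor,
) -> bytes:
    """Apply the inner (Business-layer) and outer (MMTLS) layers in order."""
    # 1. The serialized request is encrypted by the Business-layer first.
    inner_ciphertext = business_layer_encrypt(serialized_request)
    # 2. The inner ciphertext is appended to the request metadata.
    business_layer_request = metadata + inner_ciphertext
    # 3. Metadata and inner ciphertext together are encrypted by MMTLS.
    return mmtls_encrypt(business_layer_request)
```

One consequence visible in this sketch is that the request metadata is protected only by the outer MMTLS layer, while the request body is protected by both layers.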

WeChat’s network encryption system is disjointed and seems to still be a combination of at least three different cryptosystems. The encryption process described in the Tencent documentation mostly matches our findings about MMTLS Encryption, but the document does not seem to describe in detail the Business-layer Encryption, whose operation differs when logged-in and when logged-out. Logged-in clients use Symmetric Mode while logged-out clients use Asymmetric Mode. We also observed WeChat utilizing HTTP, HTTPS, and QUIC to transmit large, static resources such as translation strings or transmitted files. The endpoints used for these communications differ from the MMTLS server endpoints. Their domain names also suggest that they belong to CDNs. However, the endpoints that are interesting to us are those that download dynamically generated, often confidential resources (i.e., generated by the server on every request) or endpoints where users transmit, often confidential, data to WeChat’s servers. These types of transmissions are made using MMTLS.

As a final implementation note, WeChat, across all these cryptosystems, uses internal OpenSSL bindings that are compiled into the program. In particular, the libwechatmm.so library seems to have been compiled with OpenSSL version 1.1.1l, though the other libraries that use OpenSSL bindings, namely libMMProtocalJni.so and libwechatnetwork.so, were not compiled with the OpenSSL version strings. We note that OpenSSL internal APIs can be confusing and are often misused by well-intentioned developers. Our full notes about each of the OpenSSL APIs that are used can be found in the Github repository.

In Table 1, we have summarized each of the relevant cryptosystems, how their keys are derived, how encryption and authentication are achieved, and which libraries contain the relevant encryption and authentication functions. We will discuss each cryptosystem’s details in the coming sections.

Cryptosystem | Key derivation | Encryption | Authentication | Library | Functions that perform the symmetric encryption
MMTLS, Longlink | Diffie-Hellman (DH) | AES-GCM | AES-GCM tag | libwechatnetwork.so | Crypt()
MMTLS, Shortlink | DH with session resumption | AES-GCM | AES-GCM tag | libwechatnetwork.so | Crypt()
Business-layer, Asymmetric Mode | Static DH with fresh client keys | AES-GCM | AES-GCM tag | libwechatmm.so | HybridEcdhEncrypt(), AesGcmEncryptWithCompress()
Business-layer, Symmetric Mode | Fixed key from server | AES-CBC | Checksum + MD5 | libMMProtocalJNI.so | pack(), EncryptPack(), genSignature()

Table 1: Overview of different cryptosystems for WeChat network request encryption, how keys are derived, how encryption and authentication are performed, and which libraries perform them.

1. MMTLS Wire Format

Since MMTLS can go over various transports, we refer to an MMTLS packet as a unit of correspondence within MMTLS. Over Longlink, MMTLS packets can be split across multiple TCP packets. Over Shortlink, MMTLS packets are generally contained within an HTTP POST request or response body.[1]

Each MMTLS packet contains one or more MMTLS records (which are similar in structure and purpose to TLS records). Records are units of messages that carry handshake data, application data, or alert/error message data within each MMTLS packet.

1A. MMTLS Records

Records can be identified by different record headers, a fixed 3-byte sequence preceding the record contents. In particular, we observed 4 different record types, with the corresponding record headers:

Record type | Record header
Handshake-Resumption Record | 19 f1 04
Handshake Record | 16 f1 04
Data Record | 17 f1 04
Alert Record | 15 f1 04

Handshake records contain metadata and the key establishment material needed for the other party to derive the same shared session key using Diffie-Hellman. A Handshake-Resumption record contains sufficient metadata for “resuming” a previously established session, by re-using previously established key material. Data records can contain encrypted ciphertext that carries meaningful WeChat request data. Some Data packets simply contain an encrypted no-op heartbeat. Alert records signify errors or signify that one party intends to end a connection. In MMTLS, all non-handshake records are encrypted, but the key material used differs based on which stage of the handshake has been completed.

[Figure: annotated MMTLS packet from the server containing a Handshake record.]

[Figure: example of a Data record sent from the client to the server.]

To give an example of how these records interact, generally the client and server will exchange Handshake records until the Diffie-Hellman handshake is complete and they have established shared key material. Afterwards, they will exchange Data records, encrypted using the shared key material. When either side wants to close the connection, they will send an Alert record. More illustrations of each record type’s usage will be made in the following section.
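The four record headers listed in this section are enough to classify a record by its first three bytes. The following is a minimal sketch under that assumption only; the names RECORD_TYPES and record_type() are ours, and the sketch deliberately does not parse anything beyond the 3-byte header, since the rest of the record layout is not described here.

```python
# The 3-byte record headers observed in this section.
RECORD_TYPES = {
    bytes.fromhex("19f104"): "Handshake-Resumption",
    bytes.fromhex("16f104"): "Handshake",
    bytes.fromhex("17f104"): "Data",
    bytes.fromhex("15f104"): "Alert",
}


def record_type(record: bytes) -> str:
    """Classify an MMTLS record by its fixed 3-byte header."""
    if len(record) < 3:
        raise ValueError("an MMTLS record starts with a 3-byte header")
    return RECORD_TYPES.get(record[:3], "Unknown")
```

For example, a buffer beginning with the bytes 16 f1 04 would be reported as a Handshake record, and any unlisted prefix as "Unknown".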

1B. MMTLS Extensions

As MMTLS’ wire protocol is heavily modeled after TLS, we note that it has also borrowed the wire format of “TLS Extensions” to exchange relevant encryption data during the handshake. Specifically, MMTLS uses the same format as TLS Extensions for the client to communicate its key share (i.e., the client’s public key) for Diffie-Hellman, similar to TLS 1.3’s key_share extension, and to communicate session data for session resumption (similar to TLS 1.3’s pre_shared_key extension). In addition, MMTLS has support for Encrypted Extensions, similar to TLS, but they are currently not used in MMTLS (i.e., the Encrypted Extensions section is always empty).

2. MMTLS Encryption

This section describes the outer layer of encryption, that is, what keys and encryption functions are used to encrypt and decrypt the ciphertexts found in the “MMTLS Wire Format” section, and how the encryption keys are derived.

The encryption and decryption at this layer occur in the STN module, in a separate spawned “com.tencent.mm:push”[2] process on Android. The spawned process ultimately transmits and receives data over the network. The code for all of the MMTLS Encryption and MMTLS serialization was analyzed from the library libwechatnetwork.so. In particular, we studied the Crypt() function, a central function used for all encryption and decryption whose name we derived from debug logging code. We also hooked all calls to HKDF_Extract() and HKDF_Expand(), the OpenSSL functions for HKDF, in order to understand how keys are derived.

When the “:push” process is spawned, it starts an event loop in HandshakeLoop(), which processes all outgoing and incoming MMTLS Records. We hooked all functions called by this event loop to understand how each MMTLS Record is processed. The code for this study, as well as the internal function addresses identified for the particular version of WeChat we studied, can be found in the Github repository.

Figure 1: Network requests: MMTLS encryption connection over Longlink and over Shortlink. Each box is an MMTLS Record, and each arrow represents an “MMTLS packet” sent over either Longlink (i.e., a single TCP packet) or Shortlink (i.e., in the body of HTTP POST). Once both sides have received the DH keyshare, all further records are encrypted.

2A. Handshake and key establishment

In order for Business-layer Encryption to start sending messages and establish keys, it has to use the MMTLS Encryption tunnel. Since the key material for the MMTLS Encryption has to be established first, the handshakes in this section happen before any data can be sent or encrypted via Business-layer Encryption. The end goal of the MMTLS Encryption handshake discussed in this section is to establish a common secret value that is known only to the client and server.

On a fresh startup of WeChat, it tries to complete one MMTLS handshake over Shortlink, and one MMTLS handshake over Longlink, resulting in two MMTLS encryption tunnels, each using different sets of encryption keys. For Longlink, after the handshake completes, the same Longlink (TCP) connection is kept open to transport future encrypted data. For Shortlink, the MMTLS handshake is completed in the first HTTP request-response cycle, then the first HTTP connection closes. The established keys are stored by the client and server, and when data needs to be sent over Shortlink, those established keys are used for encryption, then sent over a newly established Shortlink connection. In the remainder of this section, we describe details of the handshakes.

ClientHello

First, the client generates keypairs on the SECP256R1 elliptic curve. Note that these elliptic curve keys are entirely separate pairs from those generated in the Business-layer Encryption section. The client also reads some Resumption Ticket data from a file stored on local storage named psk.key, if it exists. The psk.key file is written to after the first ServerHello is received, so, on a fresh install of WeChat, the resumption ticket is omitted from the ClientHello.

The client first simultaneously sends a ClientHello message (contained in a Handshake record) over both the Shortlink and Longlink. The first of these two handshakes that completes successfully is the one that the initial Business-layer Encryption handshake occurs over (details of Business-layer Encryption are discussed in Section 4). Both Shortlink and Longlink connections are used afterwards for sending other data.

In both the initial Shortlink and Longlink handshake, each ClientHello packet contains the following data items:

  • ClientRandom (32 bytes of randomness)
  • Resumption Ticket data read from psk.key, if available
  • Client public key

An abbreviated version of the MMTLS ClientHello is shown below.

16 f1 04 (Handshake Record header) . . .
01 04 f1 (ClientHello) . . .
08 cd 1a 18 f9 1c . . . (ClientRandom) . . .
00 0c c2 78 00 e3 . . . (Resumption Ticket from psk.key) . . .
04 0f 1a 52 7b 55 . . . (Client public key) . . .

Note that the client generates a separate keypair for the Shortlink ClientHello and the Longlink ClientHello. The Resumption Ticket sent by the client is the same on both ClientHello packets because it is always read from the same psk.key file. On a fresh install of WeChat, the Resumption Ticket is omitted since there is no psk.key file.
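The three ClientHello data items and the psk.key behavior can be modeled as follows. This is a sketch of the data flow only, not of the wire format: the ClientHelloFields class and both helper functions are hypothetical names, and treating the entire contents of psk.key as the ticket is an assumption made for illustration.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ClientHelloFields:
    client_random: bytes                 # 32 bytes of randomness
    resumption_ticket: Optional[bytes]   # None on a fresh install
    client_public_key: bytes             # fresh per tunnel


def load_resumption_ticket(psk_path: Path) -> Optional[bytes]:
    """Return the stored ticket data, or None if psk.key does not exist."""
    return psk_path.read_bytes() if psk_path.exists() else None


def build_client_hello_fields(
    client_public_key: bytes, psk_path: Path
) -> ClientHelloFields:
    """Collect the data items carried by one ClientHello."""
    return ClientHelloFields(
        client_random=os.urandom(32),
        resumption_ticket=load_resumption_ticket(psk_path),
        client_public_key=client_public_key,
    )
```

Calling build_client_hello_fields() once per transport with different public keys but the same psk_path mirrors the behavior described above: the keypairs differ between the two ClientHello packets, while the ticket is identical or absent in both.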

ServerHello

The client receives a ServerHello packet in response to each ClientHello packet. Each contains:

  • A record containing ServerRandom and Server public key
  • Records containing encrypted server certificate, new resumption ticket, and a ServerFinished message.

An abbreviated version of the MMTLS ServerHello is shown below; a full packet sample with labels can be found in the annotated network capture.

16 f1 04 (Handshake Record header) . . .
02 04 f1 (ServerHello) . . .
2b a6 88 7e 61 5e 27 eb . . . (ServerRandom) . . .
04 fa e3 dc 03 4a 21 d9 . . . (Server public key) . . .
16 f1 04 (Handshake Record header) . . .
b8 79 a1 60 be 6c . . . (ENCRYPTED server certificate) . . .
16 f1 04 (Handshake Record header) . . .
1a 6d c9 dd 6e f1 . . . (ENCRYPTED NEW resumption ticket) . . .
16 f1 04 (Handshake Record header) . . .
b8 79 a1 60 be 6c . . . (ENCRYPTED ServerFinished) . . .

On receiving the server public key, the client generates

secret = ecdh(client_private_key, server_public_key).

Note that since each MMTLS encrypted tunnel uses a different pair of client keys, the shared secret, and any derived keys and IVs, will be different between MMTLS tunnels. This also means that the Longconnect handshake and the Shortconnect handshake each compute a different shared secret.

Then, the shared secret is used to derive several sets of cryptographic parameters via HKDF, a mathematically secure way to transform a short secret value into a long secret value. In this section, we will focus on the handshake parameters. Alongside each set of keys, initialization vectors (IVs) are also generated. The IV is a value that is needed to initialize the AES-GCM encryption algorithm. IVs do not need to be kept secret. However, they need to be random and not reused.

The handshake parameters are generated using HKDF (“handshake key expansion” is a constant string in the program, as are the other monotype double-quoted strings in this section):

key_enc, key_dec, iv_enc, iv_dec = HKDF(secret, 56, “handshake key expansion”)
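
The derivation above can be sketched with the Python standard library. This is a minimal sketch under stated assumptions, not WeChat's implementation: it uses HKDF as defined in RFC 5869 with SHA-256 and an empty salt, passes only the label as `info`, and splits the 56 output bytes into two 16-byte keys followed by two 12-byte IVs. The hash function, salt, exact `info` contents and split order are all assumptions; the source only gives the label and the output length. The helper names are hypothetical.

```python
import hashlib
import hmac


def hkdf_sha256(secret: bytes, length: int, info: bytes, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand to `length` bytes."""
    hash_len = hashlib.sha256().digest_size
    # Extract: an empty salt is treated as hash_len zero bytes (RFC 5869).
    prk = hmac.new(salt or b"\x00" * hash_len, secret, hashlib.sha256).digest()
    # Expand: chain HMAC blocks T(1), T(2), ... until enough output exists.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def handshake_params(secret: bytes):
    """Split 56 derived bytes into (key_enc, key_dec, iv_enc, iv_dec).

    ASSUMPTION: 16-byte AES keys and 12-byte GCM IVs, in this order.
    """
    okm = hkdf_sha256(secret, 56, b"handshake key expansion")
    return okm[0:16], okm[16:32], okm[32:44], okm[44:56]
```

The total of 56 bytes is at least consistent with two AES-128 keys (2 × 16 bytes) and two 96-bit GCM IVs (2 × 12 bytes), which is why the sketch splits the output that way.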

Using key_dec and iv_dec, the client can decrypt the remainder of the ServerHello records. Once decrypted, the client validates the server certificate. Then, the client also saves the new Resumption Ticket to the file psk.key.

At this point, since the shared secret has been established, the MMTLS Encryption Handshake is considered completed. To start encrypting and sending data, the client derives other sets of parameters via HKDF from the shared secret. The details of which keys are derived and used for which connections are fully specified in these notes, where we annotate the keys and connections created on WeChat startup.

2B. Data encryption

After the handshake, MMTLS uses AES-GCM with a particular key and IV, which are tied to the particular MMTLS tunnel, to encrypt data. The IV is incremented by the number of records previously encrypted with this key. This is important because re-using an IV with the same key destroys the confidentiality provided by AES-GCM, as it can lead to a key recovery attack using the known tag.

ciphertext, tag = AES-GCM(input, key, iv+n)
ciphertext = ciphertext | tag
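
The `iv+n` term in the pseudocode above can be illustrated as follows. This is a minimal sketch that assumes the IV is treated as a big-endian integer and that the addition wraps at the IV length; the source does not state the byte order or overflow behaviour, and `record_nonce` is a hypothetical name.

```python
def record_nonce(iv: bytes, n: int) -> bytes:
    """Return iv + n as a nonce of the same length.

    ASSUMPTION: big-endian integer addition that wraps at 2**(8 * len(iv)).
    `n` is the number of records already encrypted under this key.
    """
    modulus = 1 << (8 * len(iv))
    value = (int.from_bytes(iv, "big") + n) % modulus
    return value.to_bytes(len(iv), "big")
```

As long as the counter never wraps, every record under one key gets a distinct nonce, which is the property AES-GCM depends on.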

The 16-byte tag is appended to the end of the ciphertext. This tag is authentication data computed by AES-GCM; it functions as a MAC in that, when verified properly, it provides authentication and integrity. In many cases, if a Data record is being encrypted, input contains metadata and ciphertext that has already been encrypted as described in the Business-layer Encryption section.

We separately discuss data encryption in Longconnect and Shortconnect in the following subsections.

Client-side encryption for Longconnect packets is done using AES-GCM with key_enc and iv_enc derived earlier in the handshake. Client-side decryption uses key_dec and iv_dec. Below is a sample Longconnect (TCP) packet containing a single data record containing an encrypted heartbeat message from the server:

17 f1 04                                           RECORD HEADER (of type “DATA”)
00 20                                              RECORD LENGTH
e6 55 7a d6 82 1d a7 f4 2b 83 d4 b7 78 56 18 f3    ENCRYPTED DATA
1b 94 27 e1 1e c3 01 a6 f6 23 6a bc 94 eb 47 39    TAG (MAC)
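
The record layout in the sample above can be walked with a short parser. This is a sketch based only on the bytes shown in this section: a 1-byte record type, the 2 bytes f1 04, a 2-byte big-endian length, then the payload, with the last 16 bytes of an encrypted payload being the tag. The names `Record`, `parse_records` and `split_tag` are hypothetical.

```python
import struct
from typing import List, NamedTuple

# Record type bytes that appear in the samples in this section.
RECORD_TYPES = {0x15: "alert", 0x16: "handshake", 0x17: "data", 0x19: "handshake_resumption"}
GCM_TAG_LEN = 16


class Record(NamedTuple):
    kind: str
    version: bytes
    payload: bytes


def parse_records(buf: bytes) -> List[Record]:
    """Split a buffer into records: type (1) | version (2) | length (2) | payload."""
    records, offset = [], 0
    while offset < len(buf):
        if len(buf) - offset < 5:
            raise ValueError("truncated record header")
        rtype, version, length = struct.unpack_from(">B2sH", buf, offset)
        offset += 5
        payload = buf[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated record payload")
        offset += length
        records.append(Record(RECORD_TYPES.get(rtype, hex(rtype)), version, payload))
    return records


def split_tag(payload: bytes):
    """For an encrypted record, separate the ciphertext from the trailing tag."""
    return payload[:-GCM_TAG_LEN], payload[-GCM_TAG_LEN:]
```

Applied to the heartbeat sample, the 0x20-byte payload splits into 16 bytes of encrypted data and the 16-byte tag. Decrypting it would still require key_dec and the matching IV.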

Within a long-lived Longconnect connection, the IV is incremented for each record encrypted. If a new Longconnect connection is created, the handshake is restarted and new key material is generated.

Shortconnect connections can only contain a single MMTLS packet request and a single MMTLS packet response (via HTTP POST request and response, respectively). After the initial Shortconnect ClientHello sent on startup, WeChat will send ClientHello with Handshake Resumption packets. These records have the header 19 f1 04 instead of the 16 f1 04 on the regular ClientHello/ServerHello handshake packets.

An abbreviated sample of a Shortconnect request packet containing Handshake Resumption is shown below.

19 f1 04 (Handshake Resumption Record header) . . .
01 04 f1 (ClientHello) . . .
9b c5 3c 42 7a 5b 1a 3b . . . (ClientRandom) . . .
71 ae ce ff d8 3f 29 48 . . . (NEW Resumption Ticket) . . .
19 f1 04 (Handshake Resumption Record header) . . .
47 4c 34 03 71 9e . . . (ENCRYPTED Extensions) . . .
17 f1 04 (Data Record header) . . .
98 cd 6e a0 7c 6b . . . (ENCRYPTED EarlyData) . . .
15 f1 04 (Alert Record header) . . .
8a d1 c3 42 9a 30 . . . (ENCRYPTED Alert (ClientFinished)) . . .

Note that, based on our understanding of the MMTLS protocol, the ClientRandom sent in this packet is not used at all by the server, because there is no need to re-run Diffie-Hellman in a resumed session. The Resumption Ticket is used by the server to identify which previously established shared secret should be used to decrypt the packet content that follows.

Encryption for Shortconnect packets is done using AES-GCM with the handshake parameters key_enc and iv_enc. (Note that, despite their identical names, key_enc and iv_enc here are different from those of the Longconnect, since Shortconnect and Longconnect each complete their own handshake using a different elliptic curve client keypair.) The iv_enc is incremented for each record encrypted. Usually, EarlyData records sent over Shortconnect contain ciphertext that has been encrypted with Business-layer Encryption as well as associated metadata. This metadata and ciphertext will then be additionally encrypted at this layer.

The reason this is referred to as EarlyData internally in WeChat is likely that the term is borrowed from TLS; typically, it refers to data that is encrypted with a key derived from a pre-shared key, before the establishment of a regular session key via Diffie-Hellman. However, in this case, when using Shortconnect, there is no data sent “after the establishment of a regular session key”, so almost all Shortconnect data is encrypted and sent in this EarlyData section.

Finally, ClientFinished indicates that the client has finished its side of the handshake. It is an encrypted Alert record with a fixed message that always follows the EarlyData Record. From our reverse-engineering, we found that the handlers for this message referred to it as ClientFinished.

3. Business-layer Request

MMTLS Data Records carry either a “Business-layer request” or heartbeat messages. In other words, if one decrypts the payload from an MMTLS Data Record, the result will often be one of the messages described below.

This Business-layer request contains several metadata parameters that describe the purpose of the request, including the internal URI and the request type number, which we briefly described in the “Launching a WeChat network request” section.

When logged-in, the format of a Business-layer request looks like the following:

00 00 00 7b                 (total data length)
00 24                       (URI length)
/cgi-bin/micromsg-bin/...   (URI)
00 12                       (hostname length)
sgshort.wechat.com          (hostname)
00 00 00 3D                 (length of rest of data)
BF B6 5F                    (request flags)
41 41 41 41                 (user ID)
42 42 42 42                 (device ID)
FC 03 48 02 00 00 00 00     (cookie)
1F 9C 4C 24 76 0E 00        (cookie)
D1 05 varint                (request_type)
0E 0E 00 02                 (4 more varints)
BD 95 80 BF 0D varint       (signature)
FE                          (flag)
80 D2 89 91
04 00 00                    (marks start of data)
08 A6 29 D1 A4 2A CA F1 ... (ciphertext)

Responses are formatted very similarly:

bf b6 5f                    (flags)
41 41 41 41                 (user ID)
42 42 42 42                 (device ID)
fc 03 48 02 00 00 00 00     (cookie)
1f 9c 4c 24 76 0e 00        (cookie)
fb 02 varint                (request_type)
35 35 00 02 varints
a9 ad 88 e3 08 varint       (signature)
fe
ba da e0 93
04 00 00                    (marks start of data)
b6 f8 e9 99 a1 f4 d1 20 . . . ciphertext

This request then contains another encrypted ciphertext, which is encrypted by what we refer to as Business-layer Encryption. Business-layer Encryption is separate from the system we described in the MMTLS Encryption section. The signature mentioned above is the output of genSignature(), which is discussed in the “Integrity check” section. Pseudocode for the serialization schemes and more samples of WeChat’s encrypted request header can be found in our Github repository.
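
The length-prefixed start of the logged-in request layout shown earlier in this section can be read with a few lines of Python. This is a sketch, not the serialization code from the repository. It assumes the length fields are big-endian and that the fields marked "varint" use the common base-128 encoding (as in Protobuf), neither of which the text states outright. The function names are hypothetical.

```python
import struct


def read_varint(buf: bytes, offset: int):
    """Decode one base-128 varint (ASSUMED Protobuf-style); return (value, next_offset)."""
    value, shift = 0, 0
    while True:
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return value, offset
        shift += 7


def parse_request_prefix(buf: bytes) -> dict:
    """Read total length, URI, hostname and the length of the remaining data."""
    (total_len,) = struct.unpack_from(">I", buf, 0)
    offset = 4
    (uri_len,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    uri = buf[offset:offset + uri_len].decode("ascii")
    offset += uri_len
    (host_len,) = struct.unpack_from(">H", buf, offset)
    offset += 2
    host = buf[offset:offset + host_len].decode("ascii")
    offset += host_len
    (rest_len,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    rest = buf[offset:offset + rest_len]
    # Consistency check implied by the sample: 0x7b == 2 + 0x24 + 2 + 0x12 + 4 + 0x3d.
    if total_len != 2 + uri_len + 2 + host_len + 4 + rest_len:
        raise ValueError("length fields do not add up")
    return {"uri": uri, "host": host, "rest": rest}
```

The length fields in the sample are self-consistent under this reading: the total of 0x7b (123) equals the two 2-byte lengths, the 0x24-byte URI, the 0x12-byte hostname, the 4-byte length field and the 0x3D bytes that follow. If the varint assumption holds, the five signature bytes decode to a value that fits in 32 bits, which matches the size of an Adler32 output.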

4. Business-layer Encryption

WeChat Crypto diagrams (inner layer)

This section describes how the Business-layer requests described in Section 3 are encrypted and decrypted, and how the keys are derived. We note that the set of keys and encryption processes introduced in this section are completely separate from those referred to in the MMTLS Encryption section. Generally, for Business-layer Encryption, much of the protocol logic is handled in the Java code, and the Java code calls out to the C++ libraries for encryption and decryption calculations. For MMTLS Encryption, by contrast, everything is handled in C++ libraries and occurs in a different process entirely. There is very little interplay between these two layers of encryption.

The Business-layer Encryption has two modes using different cryptographic processes: Asymmetric Mode and Symmetric Mode. To transition into Symmetric Mode, WeChat needs to perform an Autoauth request. Upon startup, WeChat typically goes through the following three stages:

  1. Before the user logs in to their account, Business-layer Encryption first uses asymmetric cryptography to derive a shared secret via static Diffie-Hellman (static DH), then uses the shared secret as a key to AES-GCM encrypt the data. We name this Asymmetric Mode. In Asymmetric Mode, the client derives a new shared secret for each request.
  2. Using Asymmetric Mode, WeChat can send an Autoauth request, to which the server would return an Autoauth response, which contains a session_key.
  3. After the client obtains session_key, Business-layer Encryption uses it to AES-CBC encrypt the data. We name this Symmetric Mode since it only uses symmetric cryptography. Under Symmetric Mode, the same session_key can be used for multiple requests.

For Asymmetric Mode, we performed dynamic and static analysis of C++ functions in libwechatmm.so; in particular the HybridEcdhEncrypt() and HybridEcdhDecrypt() functions, which call AesGcmEncryptWithCompress() / AesGcmDecryptWithUncompress(), respectively.

For Symmetric Mode, the requests are handled in the pack(), unpack(), and genSignature() functions in libMMProtocalJNI.so. Generally, pack() handles outgoing requests, and unpack() handles incoming responses to those requests. They also perform encryption/decryption. Finally, genSignature() computes a checksum over the full request. In the Github repository, we’ve uploaded pseudocode for pack, AES-CBC encryption, and the genSignature routine.

The Business-layer Encryption is also tightly integrated with WeChat’s user authentication system. The user needs to log in to their account before the client is able to send an Autoauth request. Clients that have not logged in exclusively use Asymmetric Mode. For clients that have already logged in, the first Business-layer packet would most often be an Autoauth request encrypted using Asymmetric Mode; the second and subsequent Business-layer packets are encrypted using Symmetric Mode.

Figure 2: Business-layer encryption, logged-out, logging-in, and logged-in: Swimlane diagrams showing at a high level what Business-layer Encryption requests look like, including which secrets are used to generate the key material used for encryption. 🔑secret is generated via DH(static server public key, client private key), and 🔑new_secret is DH(server public key, client private key). 🔑session is decrypted from the first response when logged-in. Though it isn’t shown above, 🔑new_secret is also used in genSignature() when logged-in; this signature is sent with request and response metadata.

4A. Business-layer Encryption, Asymmetric Mode

Before the user logs in to their WeChat account, the Business-layer Encryption process uses a static server public key, and generates a new client keypair to agree on a static Diffie-Hellman shared secret for every WeChat network request. The shared secret is run through the HKDF function, and any data is encrypted with AES-GCM and sent alongside the generated client public key so the server can calculate the shared secret.

For each request, the client generates a public/private keypair for use with ECDH. We also note that the client has a static server public key pinned in the application. The client then calculates an initial secret.

secret = ECDH(static_server_pub, client_priv)
hash = sha256(client_pub)
client_random = <32 randomly generated bytes>
derived_key = HKDF(secret)

derived_key is then used to AES-GCM encrypt the data, which we describe in detail in the next section.

4B. Business-layer Encryption, obtaining session_key

If the client is logged-in (i.e., the user has logged in to a WeChat account on a previous app run), the first request will be a very large data packet authenticating the client to the server (referred to as Autoauth in WeChat internals) which also contains key material. We refer to this request as the Autoauth request. In addition, the client pulls a locally stored key, autoauth_key, whose provenance we did not trace, since it does not seem to be used other than in this instance. The key for encrypting this initial request (authrequest_data) is derived_key, calculated in the same way as in Section 4A. The encryption described in the following is the Asymmetric Mode encryption, albeit a special case where the data is the authrequest_data.

Below is an abbreviated version of a serialized and encrypted Autoauth request:

    08 01 12 . . . [Header metadata]
    04 46 40 96 4d 3e 3e 7e [client_publickey] . . .
    fa 5a 7d a7 78 e1 ce 10 . . . [ClientRandom encrypted w secret]
    a1 fb 0c da . . .               [IV]
    9e bc 92 8a 5b 81 . . .         [tag]
    db 10 d3 0f f8 e9 a6 40 . . . [ClientRandom encrypted w autoauth_key]
    75 b4 55 30 . . .               [IV]
    d7 be 7e 33 a3 45 . . .         [tag]
    c1 98 87 13 eb 6f f3 20 . . . [authrequest_data encrypted w derived_key]
    4c ca 86 03 . .                 [IV]
    3c bc 27 4f 0e 7b . . .         [tag]

A full sample of the Autoauth request and response at each layer of encryption can be found in the Github repository. Finally, we note that the autoauth_key above does not seem to be actively used outside of encrypting in this particular request. We suspect this is vestigial from a legacy encryption protocol used by WeChat.

The client encrypts here using AES-GCM with a randomly generated IV, and uses a SHA256 hash of the preceding message contents as AAD. At this stage, the messages (including the ClientRandom messages) are always ZLib compressed before encryption.

iv = <12 random bytes>
compressed = zlib_compress(plaintext)
ciphertext, tag = AESGCM_encrypt(compressed, aad = hash(previous), derived_key, iv)

In the above, previous is the header of the request (i.e. all header bytes preceding the 04 00 00 marker of data start). The client appends the 12-byte IV, then the 16-byte tag, onto the ciphertext. This tag can be used by the server to verify the integrity of the ciphertext, and essentially functions as a MAC.
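
The steps around the AES-GCM call can be sketched with the standard library alone. The sketch below covers only the parts described in the text: compression, the SHA256 hash used as AAD, and the ciphertext | IV | tag layout. The AES-GCM call itself is left out because Python's standard library does not provide it. The function names are hypothetical, and default zlib settings are an assumption.

```python
import hashlib
import zlib

IV_LEN, TAG_LEN = 12, 16


def prepare_plaintext(plaintext: bytes, header: bytes):
    """Return (compressed plaintext, AAD) as they would be fed to AES-GCM.

    ASSUMPTION: default zlib settings; `header` plays the role of `previous`.
    """
    return zlib.compress(plaintext), hashlib.sha256(header).digest()


def frame(ciphertext: bytes, iv: bytes, tag: bytes) -> bytes:
    """Serialize as ciphertext | IV (12 bytes) | tag (16 bytes)."""
    if len(iv) != IV_LEN or len(tag) != TAG_LEN:
        raise ValueError("unexpected IV or tag length")
    return ciphertext + iv + tag


def unframe(blob: bytes):
    """Inverse of frame(): recover (ciphertext, iv, tag) from the end of the blob."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("blob too short")
    body_end = len(blob) - IV_LEN - TAG_LEN
    return blob[:body_end], blob[body_end:body_end + IV_LEN], blob[-TAG_LEN:]
```

Because the IV and tag have fixed lengths and sit at the end, a receiver can split the blob without any extra length field.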

4B1. Obtaining session_key: Autoauth Response

The response to Autoauth is serialized similarly to the request:

08 01 12 . . . [Header metadata]
04 46 40 96 4d 3e 3e 7e [new_server_pub] . . .
c1 98 87 13 eb 6f f3 20 . . . [authresponse_data encrypted w new_secret]
4c ca 86 03 . . [IV]
3c bc 27 4f 0e 7b . . . [tag]

With the newly received server public key (new_server_pub), which is different from the static_server_pub hardcoded in the app, the client then derives a new secret (new_secret). new_secret is then used as the key to AES-GCM decrypt authresponse_data. The client can also verify authresponse_data with the given tag.

new_secret = ECDH(new_server_pub, client_privatekey)
authresponse_data = AESGCM_decrypt(aad = hash(authrequest_data),
                                   new_secret, iv)

authresponse_data is a serialized Protobuf containing a lot of important data for WeChat to start, beginning with a helpful “Everything is ok” status message. A full sample of this Protobuf can be found in the Github repository. Most importantly, authresponse_data contains session_key, which is the key used for future AES-CBC encryption under Symmetric Mode. From here on out, new_secret is only used in genSignature(), which is discussed below in Section 4C1, Integrity check.

We measured the entropy of the session_key provided by the server, as it is used for future encryption. This key exclusively uses printable ASCII characters, and is thus limited to roughly 100 bits of entropy.

The WeChat code refers to three different keys: client_session, server_session, and single_session. Generally, client_session refers to the client_publickey, server_session refers to the shared secret key generated using ECDH (i.e. new_secret), and single_session refers to the session_key provided by the server.

4C. Business-layer Encryption, Symmetric Mode

After the client receives session_key from the server, future data is encrypted using Symmetric Mode. Symmetric Mode encryption is mostly done using AES-CBC instead of AES-GCM, with the exception of some large files being encrypted with AesGcmEncryptWithCompress(). As AesGcmEncryptWithCompress() requests are the exception, we focus on the more common use of AES-CBC.

Specifically, the Symmetric Mode uses AES-CBC with PKCS-7 padding, with the session_key as a symmetric key:

ciphertext = AES-CBC(PKCS7_pad(plaintext), session_key, iv = session_key)

The session_key thus does double duty: it is used both as the encryption key and as the IV.
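
The PKCS7_pad step in the formula above is simple enough to sketch in full. This is a generic PKCS#7 implementation for a 16-byte block size, written for illustration; it is not taken from WeChat's code, and the function names are hypothetical.

```python
BLOCK = 16  # AES block size in bytes


def pkcs7_pad(data: bytes) -> bytes:
    """Append n bytes of value n so the length is a multiple of BLOCK (1 <= n <= 16)."""
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes) -> bytes:
    """Strip PKCS#7 padding, rejecting malformed input."""
    if not padded or len(padded) % BLOCK:
        raise ValueError("length is not a positive multiple of the block size")
    n = padded[-1]
    if not 1 <= n <= BLOCK or padded[-n:] != bytes([n]) * n:
        raise ValueError("bad padding")
    return padded[:-n]
```

Note that input which is already block-aligned still gains a full block of padding, so the padding can always be removed unambiguously.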

4C1. Integrity check

In Symmetric Mode, a function called genSignature() calculates a pseudo-integrity code on the plaintext. This function first calculates the MD5 hash of WeChat’s assigned user ID for the logged-in user (uin), new_secret, and the plaintext length. Then, genSignature() uses Adler32, a checksumming function, on the MD5 hash concatenated with the plaintext.

signature = adler32(md5(uin | new_secret | plaintext_len) |
            plaintext)

The result from Adler32 is concatenated to the ciphertext as metadata (see Section 3 for how it is included in the request and response headers), and is referred to as a signature in WeChat’s codebase. We note that though it is referred to as a signature, it does not provide any cryptographic properties; details can be found in the Security Issues section. The full pseudocode for this function can also be found in the Github repository.
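
The formula above maps directly onto Python's hashlib and zlib modules. This is a minimal sketch rather than the pseudocode from the repository: the text does not say how uin and the plaintext length are serialized before hashing, so 4-byte big-endian integers are assumed here, and `ecdh_key` stands for the ECDH-derived secret that enters the MD5.

```python
import hashlib
import struct
import zlib


def gen_signature(uin: int, ecdh_key: bytes, plaintext: bytes) -> int:
    """adler32(md5(uin | ecdh_key | plaintext_len) | plaintext) as a 32-bit integer.

    ASSUMPTION: uin and plaintext_len are packed as 4-byte big-endian integers.
    """
    prefix = hashlib.md5(
        struct.pack(">I", uin) + ecdh_key + struct.pack(">I", len(plaintext))
    ).digest()
    return zlib.adler32(prefix + plaintext)
```

The only secret input is hashed once into a 16-byte prefix; everything after that is a plain, unkeyed checksum.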

5. Protobuf data payload

The input to Business-layer Encryption is generally a serialized Protobuf, optionally compressed with Zlib. When logged-in, many of the Protobufs sent to the server contain the following header data:

"1": {
    "1": "\u0000",
    "2": "1111111111", # User ID (assigned by WeChat)
    "3": "AAAAAAAAAAAAAAA\u0000", # Device ID (assigned by WeChat)
    "4": "671094583", # Client Version
    "5": "android-34", # Android Version
    "6": "0"
    },

The Protobuf structure is defined in each API’s corresponding RR class, as we previously mentioned in the “Launching a WeChat network request” section.

6. Putting it all together

In the diagram below, we demonstrate the network flow for the most common case of opening the WeChat application. We note that, to avoid further complicating the diagram, HKDF derivations are not shown; for instance, when “🔑mmtls” is used, HKDF is used to derive a key from “🔑mmtls”, and the derived key is used for encryption. The specifics of how keys are derived, and which derived keys are used to encrypt which data, can be found in these notes.

Figure 3: Swimlane diagram demonstrating the encryption setup and network flow of the most common case (user is logged in, opens WeChat application).

We note that other configurations are possible. For instance, we have observed that if the Longconnect MMTLS handshake completes first, the Business-layer “Logging-in” request and response can occur over the Longconnect connection instead of over several Shortconnect connections. In addition, if the user is logged-out, Business-layer requests are simply encrypted with 🔑secret (resembling Shortconnect 2 requests).

In this section, we outline potential security issues and privacy weaknesses we identified with the construction of the MMTLS encryption and Business-layer encryption layers. There could be other issues as well.

Issues with MMTLS encryption

Below we detail the issues we found with WeChat’s MMTLS encryption.

Deterministic IV

The MMTLS encryption process generates a single IV once per connection, then increments the IV for each subsequent record encrypted in that connection. Generally, NIST recommends not using a wholly deterministic derivation for IVs in AES-GCM since it is easy to accidentally re-use IVs. In the case of AES-GCM, reuse of the (key, IV) tuple is catastrophic as it allows key recovery from the AES-GCM authentication tags. Since these tags are appended to AES-GCM ciphertexts for authentication, this enables plaintext recovery from as few as 2 ciphertexts encrypted with the same key and IV pair.

In addition, Bellare and Tackmann have shown that the use of a deterministic IV can make it possible for a powerful adversary to brute-force a particular (key, IV) combination. This type of attack applies to powerful adversaries when the cryptosystem is deployed at a scale where a very large (i.e., Internet-sized) pool of (key, IV) combinations is being chosen. Since WeChat has over a billion users, this order of magnitude puts this attack within the realm of feasibility.

Lack of forward secrecy

Forward secrecy is generally expected of modern communications protocols to reduce the importance of session keys. Generally, TLS itself is forward-secret by design, except in the case of the first packet of a “resumed” session. This first packet is encrypted with a “pre-shared key”, or PSK, established during a previous handshake.

MMTLS makes heavy use of PSKs by design. Since the Shortconnect transport format only supports a single round-trip of communication (via a single HTTP POST request and response), any encrypted data sent via the transport format is encrypted with a pre-shared key. Since leaking the shared `PSK_ACCESS` secret would enable a third party to decrypt any EarlyData sent across multiple MMTLS connections, data encrypted with the pre-shared key is not forward secret. The vast majority of records encrypted via MMTLS are sent via the Shortconnect transport, which means that the majority of network data sent by WeChat is not forward-secret between connections. In addition, when opening the application, WeChat creates a single long-lived Longconnect connection. This long-lived Longconnect connection is open for the duration of the WeChat application, and any encrypted data that needs to be sent is sent over the same connection. Since most WeChat requests are encrypted using either (A) a session-resuming PSK or (B) the application data key of the long-lived Longconnect connection, WeChat’s network traffic often does not retain forward secrecy between network requests.

Issues with Business-layer encryption

On its own, the business-layer encryption construction, and in particular the Symmetric Mode AES-CBC construction, has many severe issues. Since the requests made by WeChat are double-encrypted, and these concerns only affect the inner, business layer of encryption, we did not find an immediate way to exploit them. However, in older versions of WeChat which exclusively used business-layer encryption, these issues would be exploitable.

Metadata leak

Business-layer encryption does not encrypt metadata such as the user ID and request URI, as shown in the “Business-layer request” section. This issue is also acknowledged by the WeChat developers themselves to be one of the motivations to develop MMTLS encryption.

Forgeable genSignature integrity check

While the purpose of the genSignature code is not entirely clear, if it is being used for authentication (since the ecdh_key is included in the MD5) or integrity, it fails on both counts. A valid forgery can be calculated with any known plaintext without knowledge of the ecdh_key. If the client generates the following for some known plaintext message plaintext:

sig = adler32(md5(uin | ecdh_key | plaintext_len) | plaintext)

We can do the following to forge the signature evil_sig for some evil_plaintext with length plaintext_len:

evil_sig = sig - adler32(plaintext) + adler32(evil_plaintext)

Subtracting from and adding to adler32 checksums is achievable by solving a system of equations when the message is short. Code for subtracting from and adding to an adler32 checksum, thereby forging this integrity check, can be found in adler.py in our Github repository.
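
For two messages of the same length, the subtraction and addition in the formula above can be carried out separately on the two 16-bit halves of the checksum, modulo 65521. The sketch below is one way to realize the formula and may differ from the authors' adler.py. It relies on the fact that the unknown MD5 prefix depends only on uin, ecdh_key and the plaintext length, so it is identical for both messages and its contribution carries over unchanged. The function names are hypothetical.

```python
import zlib

MOD = 65521  # largest prime below 2**16, the Adler-32 modulus


def split(checksum: int):
    """Return the (A, B) halves of an Adler-32 value: low and high 16 bits."""
    return checksum & 0xFFFF, checksum >> 16


def forge(sig: int, plaintext: bytes, evil_plaintext: bytes) -> int:
    """Compute adler32(prefix | evil_plaintext) from sig = adler32(prefix | plaintext).

    The prefix is never needed. This sketch only handles equal-length messages.
    """
    if len(plaintext) != len(evil_plaintext):
        raise ValueError("this sketch only handles equal-length messages")
    a_sig, b_sig = split(sig)
    a_old, b_old = split(zlib.adler32(plaintext))
    a_new, b_new = split(zlib.adler32(evil_plaintext))
    # Swap the known message's contribution for the forged one, half by half.
    a = (a_sig - a_old + a_new) % MOD
    b = (b_sig - b_old + b_new) % MOD
    return (b << 16) | a
```

This works because both halves of Adler32 are sums that are linear in the message bytes, so the effect of the secret prefix and the effect of the message can be separated.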

Possible AES-CBC padding oracle

Since AES-CBC is used alongside PKCS7 padding, it is possible that the use of this encryption on its own would be susceptible to an AES-CBC padding oracle, which can lead to recovery of the encrypted plaintext. Earlier this year, we found that another custom cryptography scheme developed by a Tencent company was susceptible to this exact attack.

Key, IV re-use in block cipher mode

Re-using the key as the IV for AES-CBC, as well as re-using the same key for all encryption in a given session (i.e., the length of time that the user has the application open), introduces some privacy issues for encrypted plaintexts. For instance, since the key and the IV provide all the randomness, re-using both means that if two plaintexts are identical, they will encrypt to the same ciphertext. In addition, due to the use of CBC mode in particular, two plaintexts that share an identical prefix of N blocks will encrypt to the same first N ciphertext blocks.

Encryption key issues

It is highly unconventional for the server to choose the encryption key used by the client. In fact, we note that the encryption key generated by the server (the “session key”) exclusively uses printable ASCII characters. Thus, even though the key is 128 bits long, the entropy of this key is at most 106 bits.
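
The 106-bit bound can be reproduced with a short calculation. The sketch assumes that "printable ASCII" means the 95 characters from 0x20 to 0x7e (space included) and that each of the 16 key bytes is drawn uniformly and independently from that set. Neither detail is stated in the text, and any bias would lower the figure further.

```python
import math

KEY_BYTES = 16         # a 128-bit key
PRINTABLE_ASCII = 95   # ASSUMPTION: bytes 0x20..0x7e, space included

# Each byte carries log2(95) bits instead of the full 8.
entropy_bits = KEY_BYTES * math.log2(PRINTABLE_ASCII)
print(f"{entropy_bits:.1f} bits")  # 105.1 bits
```

In other words, restricting every byte to printable characters costs about 1.4 bits per byte, or roughly 23 bits across the whole key.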

No forward secrecy

As mentioned in the previous section, forward secrecy is a standard property for modern network communication encryption. When the user is logged-in, all communication with WeChat, at this encryption layer, is done with the exact same key. The client does not receive a new key until the user closes and restarts WeChat.

To validate our findings, we also tested our decryption code on WeChat 8.0.49 for Android (released April 2024) and found that the MMTLS network format matches that used by WeChat 8.0.49 for iOS.

Previous versions of WeChat nettoil encryption

To understand how WeChat’s complex cryptosystems are tied together, we also briefly reverse-engineered an older version of WeChat that did not use MMTLS. The newest version of WeChat that did not use MMTLS was v6.3.16, released in 2016. Our full notes on this reverse-engineering can be found here.

While logged-out, requests largely used the Business-layer Encryption cryptosystem, using RSA public-key encryption rather than static Diffie-Hellman plus symmetric encryption via AES-GCM. We observed requests to the internal URIs cgi-bin/micromsg-bin/encryptverifyresrefresh and cgi-bin/micromsg-bin/getkvidkeystrategyrsa.

There was also another encryption mode used, DES with a static key. This mode was used for sending crash logs and memory stacks; POST requests to the URI /cgi-bin/mmhelp-bin/stackincreate were encrypted using DES.

We were not able to log in to this version for dynamic analysis, but from our static analysis, we determined that the encryption behaves the same as Business-layer Encryption when logged-in (i.e. using a session_key provided by the server for AES-CBC encryption).

Why does Business-layer encryption matter?

Since Business-layer encryption is wrapped in MMTLS, why should it matter whether or not it is secure? First, from our study of previous versions of WeChat, Business-layer encryption was the sole layer of encryption for WeChat network requests until 2016. Second, given that Business-layer encryption exposes the internal request URI unencrypted, one possible architecture for WeChat would be to host different internal servers to handle different types of network requests (corresponding to different “requestType” values and different cgi-bin request URLs). It could be the case, for instance, that after MMTLS is terminated at the front WeChat servers (which handle MMTLS decryption), the inner WeChat request that is forwarded to the corresponding internal WeChat server is not re-encrypted, and is therefore solely encrypted using Business-layer encryption. A network eavesdropper, or network tap, placed within WeChat’s intranet could then attack the Business-layer encryption on these forwarded requests. However, this scenario is purely conjectural. Tencent’s response to our disclosure addresses the issues in Business-layer encryption and implies that they are slowly migrating from the more problematic AES-CBC to AES-GCM, which suggests that Tencent shares this concern.

Why not employ TLS?

According to uncover recordation and validateed by our own discoverings, MMTLS (the “Outer layer” of encryption) is based heavily on TLS 1.3. In fact, the record shows that the architects of MMTLS have a decent caring of asymmetric cryptography in ambiguous.

The record retains reasoning for not using TLS. It elucidates that the way WeChat employs nettoil seeks necessitates someskinnyg appreciate 0-RTT session resumption, becaemploy the convey inantity of WeChat data transmission necessitates only one seek-response cycle (i.e., Shortconnect). MMTLS only demandd one round-trip handshake to set up the underlying TCP connection before any application data can be sent; according to this record, introducing another round-trip for the TLS 1.2 handshake was a non-commenceer.

Fortunately, TLS1.3 gives a 0-RTT (no compriseitional nettoil procrastinate) method for the protocol handshake. In compriseition, the protocol itself supplys extensibility thcimpolite the version number, CipherSuite, and Extension mechanisms. However, TLS1.3 is still in write phases, and its carry outation may still be far away. TLS1.3 is also a ambiguous-purpose protocol for all apps, given the characteristics of WeChat, there is wonderful room for chooseimization. Therefore, at the finish, we chose to summarize and carry out our own safe carry protocol, MMTLS, based on the TLS1.3 write standard. [originally written in Chinese]

However, even at the time of writing in 2016, TLS 1.2 did provide an option for session resumption. In addition, since WeChat controls both the servers and the clients, it would not have been unreasonable to deploy the fully-fledged TLS 1.3 implementations that were being tested at the time, even if the IETF draft was incomplete.

Despite the best efforts of the architects of MMTLS, the security protocols used by WeChat generally seem both less performant and less secure than TLS 1.3. Designing a secure and performant transport protocol is no easy feat.

The issue of performing an extra round-trip for a handshake has been a perennial one for application developers. The TCP and TLS handshakes each require a single round-trip, meaning that each new connection requires two round-trips before data can be sent. Today, TLS-over-QUIC combines the transport-layer and encryption-layer handshakes, requiring only a single handshake. QUIC provides the best of both worlds: strong, forward-secret encryption, and half the number of round-trips needed for secure communication. Our recommendation would be for WeChat to migrate to a standard QUIC implementation.
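The round-trip arithmetic in the paragraph above can be sketched as follows. This is a simplified illustration, not a measurement: the handshake counts are the ones stated in the text, while the function name and the 100 ms round-trip time are assumptions chosen for the example.

```python
# Handshake round-trips needed before the first application data can be sent.
# The counts follow the simplified description in the text; real deployments
# (session resumption, 0-RTT, TCP Fast Open) can change them.
HANDSHAKE_ROUND_TRIPS = {
    "TCP + TLS": 2,   # one round-trip for TCP, one for TLS
    "TCP + MMTLS": 1, # TCP handshake only, as the MMTLS designers intended
    "QUIC + TLS": 1,  # transport and encryption handshakes combined
}


def time_to_first_response_ms(handshake_round_trips: int, rtt_ms: float) -> float:
    """Handshake round-trips plus one round-trip for the request and response."""
    return (handshake_round_trips + 1) * rtt_ms


if __name__ == "__main__":
    rtt_ms = 100.0  # hypothetical network round-trip time
    for stack, trips in HANDSHAKE_ROUND_TRIPS.items():
        print(f"{stack}: {time_to_first_response_ms(trips, rtt_ms):.0f} ms")
```

Under these assumptions, a short request over TCP plus TLS waits for three round-trips in total, while the combined QUIC handshake brings that down to two.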

Finally, there is also the issue of client-side performance, in addition to network performance. Since WeChat’s encryption scheme performs two layers of encryption per request, the client does double the work to encrypt data compared to using a single standardized cryptosystem.

The trend of home-rolled cryptography in Chinese applications

The findings here add to our prior research, which suggests that home-grown cryptography is popular in Chinese applications. In general, the avoidance of TLS and the preference for proprietary and non-standard cryptography is a departure from cryptographic best practices. While there may have been many legitimate reasons to distrust TLS in 2011 (like EFF and Access Now’s concerns over the certificate authority ecosystem), the TLS ecosystem has largely stabilized since then, and is more auditable and transparent. Like MMTLS, all the proprietary protocols we have researched in the past contain weaknesses relative to TLS, and, in some cases, could even be trivially decrypted by a network adversary. This is a growing, concerning trend unique to the Chinese security landscape as the global Internet progresses towards technologies like QUIC or TLS to protect data in transit.

Anti-DNS-hijacking mechanisms

Similar to how Tencent wrote their own cryptographic system, we found that in Mars they also wrote a proprietary domain lookup system. This system is part of STN and supports domain name to IP address lookups over HTTP. This feature is referred to as “NewDNS” in Mars. Based on our dynamic analysis, this feature is regularly used in WeChat. At first glance, NewDNS duplicates functions already provided by DNS (Domain Name System), which is built into nearly all internet-connected devices.

WeChat is not the only app in China that uses such a system. Major cloud computing providers in China, such as Alibaba Cloud and Tencent Cloud, both offer their own DNS over HTTP service. A VirusTotal search for apps that try to contact Tencent Cloud’s DNS over HTTP service endpoint (119.29.29.98) yielded 3,865 unique results.
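To make the idea of a domain lookup over HTTP concrete, here is a minimal client-side sketch. It does not describe NewDNS or any provider's real API: the resolver host, the `/resolve` path, the `name` parameter, and the semicolon-separated response format are all hypothetical, chosen only to show the shape of such a lookup.

```python
import ipaddress
from typing import List
from urllib.parse import urlencode

# Hypothetical resolver; not the endpoint of any real service.
RESOLVER = "http://resolver.example"


def build_lookup_url(domain: str) -> str:
    """Build the HTTP request URL that asks the resolver for a domain's addresses."""
    return f"{RESOLVER}/resolve?{urlencode({'name': domain})}"


def parse_lookup_response(body: str) -> List[str]:
    """Parse an assumed 'ip;ip;...' response body, rejecting malformed addresses."""
    addresses = []
    for field in body.strip().split(";"):
        field = field.strip()
        if not field:
            continue
        # ip_address raises ValueError if the field is not a valid IP address.
        addresses.append(str(ipaddress.ip_address(field)))
    return addresses


if __name__ == "__main__":
    print(build_lookup_url("example.com"))
    print(parse_lookup_response("192.0.2.10;192.0.2.11\n"))
```

The app sends an ordinary HTTP request to a resolver address it already knows and reads the IP addresses out of the response body, rather than issuing a DNS query through the ISP's resolver.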

One likely reason for adopting such a system is that ISPs in China often implement DNS hijacking to insert ads and redirect web traffic to perform ad fraud. The problem was so serious that six Chinese internet giants issued a joint statement in 2015 urging ISPs to improve. According to the news article, about 1–2% of traffic to Meituan (an online shopping site) suffers from DNS hijacking. Ad fraud by Chinese ISPs seems to have remained a widespread problem in recent years.

Similar to their MMTLS cryptographic system, Tencent’s NewDNS domain lookup system was motivated by trying to meet the needs of the Chinese networking environment. DNS proper has, over the years, proven to have multiple security and privacy issues. Compared to TLS, we found that WeChat’s MMTLS has additional deficiencies. However, it remains an open question whether NewDNS is more or less problematic than DNS proper. We leave this question for future work.

Use of Mars STN outside WeChat

We speculate that there is widespread adoption of Mars (mars-open) outside of WeChat, based on the following observations:

The adoption of Mars outside of WeChat is concerning because Mars by default does not provide any transport encryption. As we mentioned in the “Three Parts of Mars” section, the MMTLS encryption used in WeChat is part of mars-wechat, which is not open source. The Mars developers also have no plans to add support for TLS, and expect other developers using Mars to implement their own encryption in the upper layers. To make matters worse, implementing TLS within Mars seems to require a fair number of architectural changes. Even though it would not be unfair for Tencent to keep MMTLS proprietary, MMTLS is still the main encryption system that Mars was designed for. Leaving MMTLS proprietary means that other developers using Mars have to either devote significant resources to integrating a different encryption system with Mars, or leave everything unencrypted.

Mars is also lacking in documentation. The official wiki contains only a few old articles on how to integrate with Mars. Developers using Mars often resort to asking questions on GitHub. The lack of documentation means that developers are more prone to making mistakes, which ultimately reduces security.

Further research is needed in this area to analyze the security of apps that use Tencent’s Mars library.

“Tinker”, a dynamic code-loading module

In this section, we tentatively refer to the APK downloaded from the Google Play Store as “WeChat APK”, and the APK downloaded from WeChat’s official website as “Weixin APK”. The distinction between WeChat and Weixin seems blurry. The WeChat APK and Weixin APK contain partially different code, as we will discuss later in this section. However, when installing both of these APKs on an English-locale Android Emulator, they both show their app names as “WeChat”. Their application IDs, which are used by the Android system and Google Play Store to identify apps, are also both “com.tencent.mm”. We were also able to log in to our US-number accounts using both APKs.

Unlike the WeChat APK, we found that the Weixin APK contains Tinker, “a hot-fix solution library”. Tinker allows the developer to update the app itself without calling Android’s system APK installer by using a technique called “dynamic code loading”. In an earlier report we found a similar distinction between TikTok and Douyin, where we found Douyin to have a similar dynamic code-loading feature that was not present in TikTok. This feature raises three concerns:

  1. If the process for downloading and loading the dynamic code does not sufficiently authenticate the downloaded code (e.g., that it is cryptographically signed with the correct public key, that it is not out of date, and that it is the code intended to be downloaded and not other cryptographically signed and up-to-date code), an attacker might be able to exploit this process to run malicious code on the device (e.g., by injecting arbitrary code, by performing a downgrade attack, or by performing a sidegrade attack). Back in 2016, we found such instances in other Chinese apps.
  2. Even if the code downloading and loading mechanism contains no weaknesses, the dynamic code loading feature still allows the application to load code without notifying the user, bypassing users’ consent to decide what program can run on their device. For example, the developer may push out an unwanted update, and the users do not have a choice to keep using the old version. Furthermore, a developer may selectively target a user with an update that compromises their security or privacy. In 2016, a Chinese security analyst accused Alibaba of pushing dynamically loaded code to Alipay to surreptitiously take photos and record audio on his device.
  3. Dynamically loading code prevents app store reviewers from reviewing all relevant behavior of an app’s execution. As such, the Google Play Developer Program Policy does not permit apps to use dynamic code loading.
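The checks listed in the first concern (authenticity, freshness, and that the code is the one intended for this app) can be sketched as follows. This is a minimal illustration under stated assumptions and not a description of Tinker's actual update process: the manifest fields and function names are hypothetical, and HMAC-SHA256 with a shared key stands in for the public-key signature a real updater would use.

```python
import hashlib
import hmac
import json


def sign_manifest(key: bytes, manifest: dict) -> bytes:
    """Stand-in for a public-key signature: HMAC over the canonical manifest."""
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_update(key: bytes, manifest: dict, signature: bytes, code: bytes,
                  expected_app_id: str, installed_version: int) -> bool:
    """Return True only if the downloaded code passes every check."""
    # Authenticity: the manifest must carry a valid signature.
    if not hmac.compare_digest(sign_manifest(key, manifest), signature):
        return False
    # Sidegrade protection: the signed manifest must name this app.
    if manifest.get("app_id") != expected_app_id:
        return False
    # Downgrade protection: the update must be newer than what is installed.
    if manifest.get("version", 0) <= installed_version:
        return False
    # Integrity: the code must match the hash inside the signed manifest.
    digest = hashlib.sha256(code).hexdigest()
    return hmac.compare_digest(digest, manifest.get("sha256", ""))


if __name__ == "__main__":
    key, code = b"demo-key", b"patch bytes"
    manifest = {"app_id": "com.example.app", "version": 5,
                "sha256": hashlib.sha256(code).hexdigest()}
    sig = sign_manifest(key, manifest)
    print(verify_update(key, manifest, sig, code, "com.example.app", 4))
    print(verify_update(key, manifest, sig, code, "com.example.app", 5))
```

Binding the app identifier and version into the signed manifest is what lets the client reject validly signed code that is stale or was meant for a different app. Checking the signature alone would not catch either case.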

When analyzing the WeChat APK, we found that it retains some components of Tinker. The component that seems to handle the downloading of app updates is present; however, the core part of Tinker that handles loading and executing the downloaded app updates has been replaced with “no-op” functions, which perform no actions. We did not analyze the WeChat binaries available from other third-party app stores.

Further research is needed to analyze the security of Tinker’s app update process, whether WeChat APKs from other sources contain the dynamic code loading feature, as well as any further differences between the WeChat APK and Weixin APK.

In this section, we make recommendations based on our findings to relevant audiences.

To application developers

Implementing proprietary encryption is more expensive, less performant, and less secure than using well-scrutinized standard encryption suites. Given the sensitive nature of data that can be sent by applications, we encourage application developers to use tried-and-true encryption suites and protocols and to avoid rolling their own crypto. SSL/TLS has seen almost three decades of improvements as a result of rigorous public and academic scrutiny. TLS configuration is now easier than ever before, and the advent of QUIC-based TLS has dramatically improved performance.

To Tencent and WeChat developers

Below is a copy of the recommendations we sent to WeChat and Tencent in our disclosure. The full disclosure correspondence can be found in the Appendix.

In this post from 2016, WeChat developers note that they wished to upgrade their encryption, but the addition of another round-trip for the TLS 1.2 handshake would significantly degrade WeChat network performance, as the application relies on many short bursts of communication. At that time, TLS 1.3 was not yet an RFC (though session resumption extensions were available for TLS 1.2), so they opted to “roll their own” and incorporate TLS 1.3’s session resumption model into MMTLS.

This issue of performing an extra round-trip for a handshake has been a perennial one for application developers around the world. The TCP and TLS handshakes each require a single round-trip, meaning that each new connection requires two round-trips before data can be sent. Today, TLS-over-QUIC combines the transport-layer and encryption-layer handshakes, requiring only a single handshake. QUIC was developed for this express purpose, and can provide strong, forward-secret encryption while halving the number of round-trips needed for secure communication. We also note that WeChat seems to already use QUIC for some large file downloads. Our recommendation would be for WeChat to migrate entirely to a standard TLS or QUIC+TLS implementation.

There is also the issue of client-side performance, in addition to network performance. Since WeChat’s encryption scheme performs two layers of encryption per request, the client does double the work to encrypt data compared to using a single standardized cryptosystem.

To operating systems

On the web, client-side browser security warnings and the use of HTTPS as a ranking factor in search engines contributed to widespread TLS adoption. We can draw loose analogies to the mobile ecosystem’s operating systems and application stores.

Is there any platform or OS-level permission model that can indicate regular usage of standard encrypted network communications? As we mentioned in our prior work studying proprietary cryptography in Chinese IME keyboards, OS developers could consider device permission models that surface whether applications use lower-level system calls for network access.

To high-risk users with privacy concerns

Many WeChat users use it out of necessity rather than choice. For users with privacy concerns who are using WeChat out of necessity, our recommendations from the previous report still hold:

  • Avoid features delineated as “Weixin” services if possible. We note that many core “Weixin” services (such as Search, Channels, Mini Programs), as delineated by the Privacy Policy, perform more tracking than core “WeChat” services.
  • When possible, prefer web or native applications over Mini Programs or other such embedded functionality.
  • Use stricter device permissions and update your software and OS regularly for security features.

In addition, due to the risks introduced by dynamic code loading in WeChat downloaded from the official website, we recommend that users instead download WeChat from the Google Play Store whenever possible. For users who have already installed WeChat from the official website, removing it and installing the Google Play Store version would also mitigate the risk.

To security and privacy researchers

As WeChat has over one billion users, we posit that the number of global MMTLS users is on a similar order of magnitude as the number of global TLS users. Despite this, there is little-to-no third-party analysis or scrutiny of MMTLS, unlike TLS. At this scale of influence, MMTLS deserves similar scrutiny to TLS. We implore future security and privacy researchers to build on this work and continue the study of the MMTLS protocol, since, from our correspondence, Tencent insists on continuing to use and develop MMTLS for WeChat connections.

We would like to thank Jedidiah Crandall, Jakub Dalek, Prateek Mittal, and Jonathan Mayer for their guidance and feedback on this report. Research for this project was supervised by Ron Deibert.

In this appendix, we detail our disclosure to Tencent concerning our findings and their response.

April 24, 2024 — Our disclosure

To Whom It May Concern:

The Citizen Lab is an academic research group based at the Munk School of Global Affairs & Public Policy at the University of Toronto in Toronto, Canada.

We analyzed WeChat v8.0.23 on Android and iOS as part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues. We found that WeChat’s proprietary network encryption protocol, MMTLS, contains weaknesses compared to modern network encryption protocols, such as TLS or QUIC+TLS. For instance, the protocol is not forward-secret and may be susceptible to replay attacks. We plan on publishing documentation of the MMTLS network encryption protocol and strongly suggest that WeChat, which is responsible for the network security of over 1 billion users, switch to a strong and performant encryption protocol like TLS or QUIC+TLS.

For further details, please see the attached document.

Timeline to Public Disclosure

The Citizen Lab is committed to research transparency and will publish details regarding the security vulnerabilities it discovers in the context of its research activities, absent exceptional circumstances, on its website: https://citizenlab.ca/.

The Citizen Lab will publish the details of our analysis no sooner than 45 calendar days from the date of this communication.

Should you have any questions about our findings please let us know. We can be reached at this email address: [email protected].

Sincerely,

The Citizen Lab

May 17, 2024 — Tencent’s response

Thank you for your report. Since receiving your report on April 25th, 2024, we have conducted a careful evaluation. The core of WeChat’s security protocol is outer layer mmtls encryption, currently ensuring that outer layer mmtls encryption is secure. On the other hand, the encryption issues in the inner layer are handled as follows: the core data traffic has been switched to AES-GCM encryption, while other traffic is gradually switching from AES-CBC to AES-GCM. If you have any other questions, please let us know. thanks.
