FAQ – The Citizen Lab https://citizenlab.ca University of Toronto Tue, 15 Oct 2024 19:01:21 +0000 en-CA hourly 1 Should We Chat, Too? FAQ (Simplified Chinese) https://citizenlab.ca/2024/10/%e6%88%91%e4%bb%ac%e7%bb%a7%e7%bb%ad%e8%81%8a%e5%a4%a9%ef%bc%9f%e5%b8%b8%e9%97%ae%e9%97%ae%e9%a2%98-zh-cn/ Tue, 15 Oct 2024 18:59:49 +0000 https://citizenlab.ca/?p=81087 Read the full report: Should We Chat, Too? Security Analysis of WeChat's MMTLS Encryption Protocol

What does this research contribute to what we already know about WeChat?

WeChat is an app with many features. Previously, we studied the privacy issues surrounding its Mini Programs, as well as its surveillance and censorship of text and image messages. This research focuses on WeChat's network encryption protocol and its security.

When information security researchers like us analyze the security of an app, we perform network traffic analysis to study what the app sends and how. This analysis can tell us what data the app collects and with whom that data is shared.

Performing such an analysis on WeChat was initially not straightforward. Most apps today use the industry-standard Transport Layer Security (TLS) protocol to encrypt the content of their network traffic, which normally prevents eavesdroppers from reading the underlying data. Common tools already exist for researchers who wish to decrypt and analyze the traffic sent by their own apps. However, these tools do not work on WeChat, because it uses a proprietary network encryption protocol, different from TLS, called "MMTLS". Prior to this research, little was known about MMTLS, and no existing tools could inspect content encrypted with it.

We reverse engineered the inner workings of WeChat's network encryption and found some minor issues with its security. We found that what was previously referred to as MMTLS is only the outer layer of encryption used by WeChat. Within MMTLS, we found a second layer of encryption, entirely separate from MMTLS, called "Business-layer Encryption". The two encryption systems are "nested" like a Russian doll: the plaintext content is first encrypted with Business-layer Encryption, and the resulting Business-layer ciphertext is then used as the input to MMTLS encryption, producing the MMTLS ciphertext that is ultimately sent over the network.

We found several issues with Business-layer Encryption, the most serious being a metadata leak that leaves the user account ID and some other information unencrypted at this layer. We briefly studied an older version of WeChat and found that it contained only Business-layer Encryption. These findings suggest that Business-layer Encryption predates MMTLS, and that MMTLS was likely designed to remedy the shortcomings of Business-layer Encryption. Because MMTLS encryption wraps around Business-layer Encryption, an attempt to exploit the weaknesses of Business-layer Encryption would generally first have to defeat the protection provided by the MMTLS layer.

So far, we have not found serious security issues with MMTLS. Therefore, despite the vulnerabilities in Business-layer Encryption, these issues cannot be exploited by an attacker and do not affect the overall security of the app's network encryption.

Is my communication on WeChat safe?

Everyone's threat model differs, and so does their definition of "safe." If you are concerned about the contents of your communication with another WeChat user being visible to a network eavesdropper, our research shows that, although WeChat's encryption protocol offers weaker protection against network eavesdropping than industry-standard encryption protocols, it is not vulnerable to any attack techniques known today.

WeChat uses a custom encryption protocol rather than the industry-standard Transport Layer Security (TLS) protocol. Information security experts generally advise against custom-designed encryption protocols, because a well-tested encryption protocol usually takes a multi-year joint effort by many researchers, and a single company is unlikely to invest the same level of effort. We also found some minor issues in WeChat's encryption protocol that do not exist in TLS.

In summary, although we did not find any major weaknesses in WeChat's encryption protocol, we did find some minor issues. These issues do not compromise the confidentiality of user communications; however, the same issues do not exist in industry-standard TLS.

Can the Chinese government read my WeChat messages?

At the regulatory level, because Tencent is headquartered in China, it must comply with local law and respond to the Chinese government's requests for user data.

At the technical level, WeChat's encryption protocol protects the communication between user devices and WeChat's servers. It is not an end-to-end encryption system and does not encrypt data sent between two user devices. WeChat's servers can decrypt and read every transmitted message. In the past, we found that WeChat uses a keyword-based detection system to censor private messages sent and received by users in China. The app also uses files sent by non-Chinese users to train the database it uses to censor files for Chinese users.

What types of user data does WeChat collect?

Please refer to our previous report, which answers this question.

I have a phone with sensitive data. If I install WeChat, could it steal that data?

Modern phone operating systems (OS) typically restrict apps from accessing sensitive user data (such as contacts and photos) and system resources (such as geolocation services), as well as data stored in other apps (such as chat history in a chat app). It is therefore technically difficult for an app to access sensitive data without the user's authorization. However, malicious apps can exploit vulnerabilities in OS protections to bypass these access restrictions. Malicious apps may also trick the user into granting permission to access data.

Testing whether WeChat exhibits such malicious behaviors is beyond the scope of our research. However, we did not observe the malicious behaviors described above during our research.

To protect yourself from this kind of attack, we suggest:

  1. Make sure your phone's operating system version is currently supported by the vendor
  2. Keep your phone's operating system up to date
  3. Install apps from official sources (built-in app stores) rather than unofficial ones
  4. Check an app's reputation before installing it
  5. Use a separate, dedicated device to handle highly sensitive information

Can WeChat harm my phone's security and privacy even after it is uninstalled?

As mentioned above, modern phone operating systems implement system protections that control apps' access to sensitive data. Technically, it is difficult for an app to leave implants on the system after it has been uninstalled. However, some malicious apps may try to achieve this by exploiting system vulnerabilities.

Testing whether WeChat exhibits such malicious behaviors is beyond the scope of our research. However, we did not observe such malicious behavior throughout our research.

What if I have other questions?

Read our full report and see the FAQ in our previous report analyzing WeChat's privacy.

Translation note: This is an informal translation of the original English report and may contain inaccuracies. Its sole purpose is to provide a basic understanding of our research. In the event of any discrepancy or ambiguity, the English version of the report prevails.

]]>
Should We Chat, Too? FAQ (Traditional Chinese) https://citizenlab.ca/2024/10/%e6%88%91%e5%80%91%e7%b9%bc%e7%ba%8c%e8%81%8a%e5%a4%a9%ef%bc%9f%e5%b8%b8%e8%a6%8b%e5%95%8f%e9%a1%8c-zh_tw/ Tue, 15 Oct 2024 18:59:25 +0000 https://citizenlab.ca/?p=81089 Read the full report: Should We Chat, Too? Security Analysis of WeChat's MMTLS Encryption Protocol

What does this research contribute to what we already know about WeChat?

WeChat is an app with many features. Previously, we studied the privacy issues surrounding its Mini Programs, as well as its surveillance and censorship of text and image messages. In this research, we focus on WeChat's network encryption protocol and its security.

When information security researchers like us analyze the security of an app, we perform network traffic analysis to study what the app sends and how. This analysis can tell us what data the app collects and with whom that data is shared.

Performing such an analysis on WeChat was initially not straightforward. Most apps today use the industry-standard Transport Layer Security (TLS) protocol to encrypt the content of their network traffic, which normally prevents eavesdroppers from reading the underlying data. Common tools already exist for researchers who wish to decrypt and analyze the traffic sent by their own apps. However, these tools do not work on WeChat, because it uses a proprietary network encryption protocol, different from TLS, called "MMTLS". Prior to this research, little was known about MMTLS, and no existing tools could inspect content encrypted with it.

We reverse engineered the inner workings of WeChat's network encryption and found some minor issues with its security. We found that what was previously referred to as MMTLS is only the outer layer of encryption used by WeChat. Within MMTLS, we found a second layer of encryption, entirely separate from MMTLS, called "Business-layer Encryption". The two encryption systems are "nested" like a Russian doll: the plaintext content is first encrypted with Business-layer Encryption, and the resulting Business-layer ciphertext is then used as the input to MMTLS encryption, producing the MMTLS ciphertext that is ultimately sent over the network.

We found several issues with Business-layer Encryption, the most serious being a metadata leak that leaves the user account ID and some other information unencrypted at this layer. We briefly studied an older version of WeChat and found that it contained only Business-layer Encryption. These findings suggest that Business-layer Encryption predates MMTLS, and that MMTLS was likely designed to remedy the shortcomings of Business-layer Encryption. Because MMTLS encryption wraps around Business-layer Encryption, an attacker seeking to exploit the weaknesses of Business-layer Encryption would generally first have to defeat the protection provided by the MMTLS layer.

So far, we have not found serious security issues with MMTLS. Therefore, despite the vulnerabilities in Business-layer Encryption, these issues cannot be exploited by an attacker and do not affect the overall security of the app's network encryption.

Is my communication on WeChat safe?

Everyone's threat model differs, and so does their definition of "safe." If you are concerned about the contents of your communication with another WeChat user being read by a network eavesdropper, our research shows that, although WeChat's encryption protocol offers weaker protection against network eavesdropping than industry-standard encryption protocols, it should not be affected by any attack methods known today.

WeChat uses a custom encryption protocol rather than the industry-standard Transport Layer Security (TLS) protocol. Information security experts generally advise against custom-designed encryption, because a well-tested encryption protocol usually takes a multi-year joint effort by many researchers, and a single company is unlikely to invest the same level of effort. We also found some minor issues in WeChat's encryption protocol that do not exist in TLS.

In summary, although we did not find any major flaws in WeChat's encryption protocol, we did find some minor issues. These issues do not compromise the confidentiality of user communications; however, the same issues do not exist in industry-standard TLS.

Can the Chinese government read my WeChat messages?

At the regulatory level, because Tencent is headquartered in China, it must comply with local law and respond to the Chinese government's requests for user data.

At the technical level, WeChat's encryption protocol protects the communication between user devices and WeChat's servers. It is not an end-to-end encryption system and does not encrypt data sent between two user devices. WeChat's servers can and do decrypt and read every transmitted message. Previously, we found that WeChat uses a keyword-based detection system to censor private messages sent and received by users in China. The app also uses file attachments sent by non-Chinese users to train the censorship database it applies to Chinese users.

What user data does WeChat collect?

Our previous report answers this question.

My phone contains sensitive data. If I install WeChat, could it steal that data?

Modern phone operating systems (OS) typically restrict apps from accessing sensitive user data (such as contacts and photos) and system resources (such as location services), as well as data stored in other apps (such as chat history). It is therefore difficult for an app to access sensitive data without the user's authorization. However, malicious apps may exploit vulnerabilities in OS protections to circumvent these access restrictions. Malicious apps may also trick the user into granting permission to access data.

Testing whether WeChat exhibits such malicious behaviors is beyond the scope of our research. However, we did not observe the malicious behaviors described above during our research. (That is, our research did not deliberately look for these malicious behaviors; within the scope of the WeChat code we did examine during our research, we did not find such behaviors either. Whether such behaviors exist in code our research did not explore remains unknown.)

To protect yourself from this kind of attack, we suggest:

  1. Make sure your phone's operating system version is currently supported by the manufacturer
  2. Keep your phone's operating system up to date
  3. Install apps from official sources (built-in app stores) rather than unofficial ones
  4. Check an app's reputation before installing it
  5. Use a dedicated device to handle highly sensitive data

Can WeChat harm my phone's security and privacy even after it is uninstalled?

As mentioned above, modern phone operating systems implement system protections that control apps' access to sensitive data. Technically, it is difficult for an app to maintain implants on the system after it has been uninstalled. However, some malicious apps may try to achieve this by exploiting system vulnerabilities.

Testing whether WeChat exhibits such malicious behaviors is beyond the scope of our research. However, we did not observe such malicious behavior during our research.

What if I have other questions?

Read our full report and the FAQ from our previous report analyzing WeChat's privacy.

Translation note: This is an informal translation of the original English report and may contain inaccuracies. Its sole purpose is to provide a basic explanation of our research. In the event of any discrepancy or ambiguity, the English version of the report prevails.

]]>
Should We Chat, Too? FAQ https://citizenlab.ca/2024/10/should-we-chat-too-faq/ Tue, 15 Oct 2024 18:59:13 +0000 https://citizenlab.ca/?p=81075 Read the full report: Should We Chat, Too? Security Analysis of WeChat’s MMTLS Encryption Protocol

What does this research contribute to what we already know about WeChat?

WeChat is an app with many features. Previously, we studied the privacy issues surrounding its Mini Programs, as well as its surveillance and censorship of text and image messages. In this research, we focus on WeChat’s network encryption protocol and its security.

When information security researchers like us analyze the security of apps, one type of analysis we perform is network traffic analysis, wherein we analyze what is sent by apps and how. This analysis can inform us about what data the app collects and with whom the data is shared.

Performing such an analysis on WeChat was initially not straightforward. Most apps today use industry-standard Transport Layer Security (TLS) to encrypt the content of their network traffic, which normally prevents eavesdroppers from reading the underlying data. Common tools already exist for researchers who wish to decrypt and analyze the traffic that their own apps are sending. However, such tools were inapplicable to WeChat because it uses a proprietary network encryption protocol different from TLS, called "MMTLS". Prior to this research, little was known about MMTLS and there were no pre-existing tools to inspect content encrypted with MMTLS.
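To make the contrast concrete: for an app that does use standard TLS (and does not pin certificates), a researcher can typically route a test device through an interception proxy such as mitmproxy and read the decrypted requests with a few lines of scripting. The sketch below is a generic example of that workflow, not something that works against WeChat, precisely because MMTLS is not TLS.

```python
# Minimal mitmproxy addon (run with: mitmdump -s log_requests.py) that logs every
# decrypted HTTPS request from a test device configured to trust the proxy's CA.
# This standard workflow fails for WeChat's MMTLS traffic, which is why the report
# required custom reverse engineering and tooling.
from mitmproxy import http


class LogRequests:
    def request(self, flow: http.HTTPFlow) -> None:
        # Print the URL and the size of the (already decrypted) request body.
        print(flow.request.pretty_url, len(flow.request.content or b""))


addons = [LogRequests()]
```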

We reverse engineered the inner workings of WeChat’s network encryption and found minor issues with its security. We found that what we previously referred to as MMTLS was only the outer layer of encryption used by WeChat. Within MMTLS, we found a second layer of encryption that works entirely separately from MMTLS, called “Business-layer Encryption”. The two encryption systems are “wrapped” around each other like a Russian doll, i.e., the plaintext content would first be encrypted with the Business-layer Encryption, and the resulting Business-layer ciphertext would be used as input to MMTLS encryption, producing MMTLS ciphertext that would eventually be sent over the network.
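The sketch below illustrates this "Russian doll" structure in ordinary Python, using AES-GCM for both layers. It is purely conceptual: the keys, nonce handling, and cipher choices here are illustrative stand-ins, not WeChat's actual MMTLS or Business-layer algorithms, which the full report documents in detail.

```python
# Conceptual illustration of two nested, independent encryption layers.
# NOT WeChat's real key schedule or record format: it only shows that the inner
# ("Business-layer") ciphertext becomes the plaintext of the outer ("MMTLS") layer,
# so a network observer only ever sees the outer ciphertext.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

business_key = AESGCM.generate_key(bit_length=256)  # inner-layer key (illustrative)
mmtls_key = AESGCM.generate_key(bit_length=256)     # outer-layer key (illustrative)


def send_message(plaintext: bytes) -> bytes:
    inner_nonce, outer_nonce = os.urandom(12), os.urandom(12)
    # Layer 1: "Business-layer" encryption of the application plaintext.
    inner_ct = AESGCM(business_key).encrypt(inner_nonce, plaintext, None)
    # Layer 2: "MMTLS" encryption of the inner nonce + ciphertext.
    outer_ct = AESGCM(mmtls_key).encrypt(outer_nonce, inner_nonce + inner_ct, None)
    return outer_nonce + outer_ct  # what actually travels over the network


wire_bytes = send_message(b"hello")  # eavesdroppers see only the outer ciphertext
```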

We found several issues with the Business-layer Encryption, the most serious being a metadata leak, which leaves the user account ID and some other information unencrypted at this layer. We briefly studied an older WeChat version and found that it only contained the Business-layer Encryption. These findings suggest that Business-layer Encryption is older than MMTLS and that MMTLS was likely designed to remedy the shortcomings of Business-layer Encryption. Since MMTLS Encryption wraps around Business-layer Encryption, attempts to exploit the weaknesses of Business-layer Encryption would typically have to first defeat the protection provided by the MMTLS layer. 

So far, we have not found serious security issues with MMTLS. Therefore, despite the vulnerabilities in the Business-layer Encryption, these issues cannot be exploited by an attacker and do not affect the overall security of the app's network encryption.

Is my communication on WeChat safe?

Everyone identifies with different threat models and thus has a different definition of "safe." If you are concerned about the contents of your communication with another WeChat user being visible to a network eavesdropper, our research has shown that, despite having weaker protection against network eavesdropping compared to industry-standard encryption protocols, WeChat's encryption protocol is not vulnerable to any attack techniques known today.

WeChat uses a custom encryption protocol instead of the industry standard Transport Layer Security (TLS). Using a custom design for encryption is generally not recommended by information security experts, because a well-tested encryption protocol usually takes a multi-year joint effort by many researchers. It is unlikely that a single company is able to invest the same level of effort. We have also found minor issues in WeChat's encryption protocol that are not present in TLS.

In conclusion, although we did not find any significant weaknesses in WeChat's encryption protocol, we did find minor issues with it. These issues do not compromise the confidentiality of user communications; however, the same issues are not present in industry-standard TLS.

Can the Chinese government read my WeChat text messages?

On a regulatory level, because Tencent is headquartered in China, it must comply with local law and respond to the Chinese government's user data requests.

On a technical level, WeChat’s encryption protocol protects the communication between user devices and WeChat’s servers. It is not an end-to-end encryption system, which would encrypt data sent between two user devices. WeChat’s servers can and do decrypt and read each transmitted message. In the past, we found that WeChat uses a keyword-based detection system to censor private messages to and from Chinese users. The app also uses files sent by non-Chinese users to train their database of censored files for Chinese users.

What kind of user data does WeChat collect?

Please refer to our previous report which answers this question.

I have a phone with sensitive data. If I install WeChat, would it be able to steal that data?

Modern phone operating systems (OS) typically restrict apps from accessing sensitive user data (such as contacts and photos) and system resources (such as a geo-location service), as well as data stored in other apps (such as chat history in a chat app). Therefore, it is technically difficult for apps to access sensitive data without the user granting them permission. However, malicious apps could exploit vulnerabilities in OS protections and circumvent these access restrictions. Malicious apps might also trick the user into granting permission to access data.

It is outside our research scope to test whether WeChat exhibits these malicious behaviors. However, during our research we did not observe the malicious behaviors mentioned above.

To protect yourself from this kind of attack, we suggest:

  1. Make sure your phone’s OS version is currently supported by the vendor
  2. Keep your phone’s OS up-to-date
  3. Install apps from official sources (built-in app stores) instead of unofficial ones
  4. Check the app’s reputation before installing
  5. For highly sensitive information, use a separate dedicated device to handle it

Can WeChat harm my phone security and privacy even after uninstallation?

As mentioned above, modern phone operating systems implement system protection to control apps’ access to sensitive data. It is technically difficult for apps to maintain implants on a system after the app’s uninstallation. However, some malicious apps might try to do so by exploiting system vulnerabilities.

It is outside our research scope to test whether WeChat exhibits these malicious behaviors. However, throughout our research, we have not observed such malicious behavior.

What if I have other questions?

Read our full report and check out the FAQ from our previous WeChat report analyzing the app’s privacy.

]]>
Chinese Keyboard App Vulnerabilities Explained https://citizenlab.ca/2024/04/chinese-keyboard-app-vulnerabilities-explained/ Tue, 23 Apr 2024 11:59:21 +0000 https://citizenlab.ca/?p=80502 This is an FAQ for the full report titled “The not-so-silent type: Vulnerabilities across keyboard apps reveal keystrokes to network eavesdroppers.”

What are cloud-based pinyin keyboard apps?

There are various ways to type Chinese on a keyboard. The most popular input method for mainland Chinese users is the pinyin input method, based on the pinyin romanization of Chinese characters. With any Chinese input method, prediction is necessary to determine which character a user intends to type, since there are far more characters than there are keys on a keyboard.

As a result, all Chinese keyboards use some amount of prediction. By default, the prediction capabilities are limited by your phone’s hardware. To overcome this limitation, Chinese keyboards often offer “cloud-based” prediction services which transmit your keystrokes to a server that hosts more powerful prediction models. As many have previously pointed out, this is a massive privacy tradeoff, as “cloud-based” keyboards and input methods can function as vectors for surveillance and essentially behave as keyloggers.
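A hypothetical sketch of why this matters is below: a "cloud-based" input method has to ship each partial input off the device in order to get its candidate list back. The endpoint, payload shape, and field names are invented for illustration and are not taken from any of the keyboards studied here.

```python
# Hypothetical cloud-prediction client: every romanized syllable the user types is
# sent off-device to a remote prediction service.
import json
import urllib.request

PREDICTION_URL = "https://prediction.example.com/candidates"  # invented endpoint


def cloud_candidates(pinyin_so_far: str) -> list:
    payload = json.dumps({"query": pinyin_so_far}).encode()
    req = urllib.request.Request(
        PREDICTION_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:  # the keystrokes leave the device here
        return json.loads(resp.read()).get("candidates", [])


# e.g., cloud_candidates("nihao") might return ["你好", ...]. The prediction server
# necessarily learns everything typed, and anyone who can read this traffic in
# transit does too if the transmission is not properly encrypted.
```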

We note that this report is not about how operators of cloud-based keyboards can read users’ keystrokes, which is a phenomenon that has already been extensively studied and documented. This report is primarily concerned with protecting this keystroke data from network eavesdroppers.

Which keyboard apps were analyzed in this study? And how did you choose which apps to study?

We analyzed third-party keyboard apps Tencent QQ, Baidu, and iFlytek, on the Android, iOS, and Windows platforms. Along with Tencent Sogou, they comprise over 95% of the market share for third-party keyboard apps in China.

We also analyzed the keyboard apps installed by default on Honor, Huawei, OPPO, Vivo, Samsung, and Xiaomi devices sold in China. We chose these since they are all popular phone manufacturers in China. In 2023, Honor, OPPO, and Xiaomi alone comprised nearly 50% of the smartphone market in China.

What types of software vulnerabilities were identified in the keyboard apps you analyzed?

To enable the "cloud-based" prediction features, the keyboards we analyzed transmit user keystrokes to a server on the Internet. We found that these apps' transmission of keystrokes over the Internet was insecure in various ways. This means that if you are using one of these keyboard apps, your ISP, your VPN provider, or even other users on the same WiFi network as you can retrieve the keystrokes you are typing into your device.

Among the nine vendors whose keyboard apps we analyzed, we found that there was only one vendor, Huawei, in whose apps we could not find any security issues regarding the transmission of users’ keystrokes.

We note that we did not perform a full audit of any app or make any attempt to exhaustively find every security vulnerability in any software. Our report concerns analyzing keyboard apps for a particular class of vulnerabilities that we discovered, and the absence of our reporting of other vulnerabilities should not be considered evidence of their absence.

What are the implications of these vulnerability discoveries for users of these keyboard apps?

Keystrokes are a particularly sensitive class of information, as they comprise everything we enter into our devices, including passwords, financial data, and browsing data. We estimate that up to one billion users could be vulnerable to having their keystrokes intercepted, constituting a tremendous risk to user security.

We notified all affected vendors, and in most cases the vendors updated the apps to address the vulnerabilities. We urgently encourage users to update their keyboards, operating systems, or switch to keyboards with only “on-device” prediction (e.g., not “cloud-based”). Keyboards that are not cloud-based include Google’s Gboard and Apple’s default iOS keyboard.

What do researchers recommend for users to do in light of these discoveries?

First, high-risk users or users with privacy concerns should not enable "cloud-based" features on their keyboards or input method editors (IMEs). iOS users can also restrict their keyboards' network access by revoking the "Full Access" permission for their keyboards or IMEs.

Users of QQ Pinyin should switch keyboards immediately. Users of Honor devices should disable the pre-installed Baidu keyboard and use a different third-party keyboard. We also recommend against using Baidu keyboards in general, as their updated network security protocol still contains privacy weaknesses.

Otherwise, users of any Sogou, Baidu, or iFlytek keyboard, including the versions that are bundled or pre-installed on operating systems, should ensure their keyboards and operating systems are up-to-date. At-risk users may consider switching to a keyboard that is not cloud-based, such as Google's Gboard or Apple's default iOS keyboard.

If updates to certain keyboards are not available, how can a user protect themselves?

In some cases, we had trouble updating the keyboards on our test devices. In these cases, we recommend users disable those keyboards and switch to a different keyboard.

What have the vendors done in response to the research findings?

We notified all affected vendors, and in most cases the vendors updated the apps to address the vulnerabilities.

All companies except Baidu, Vivo, and Xiaomi responded to our disclosures.¹ Baidu fixed the most serious issues we reported to them shortly after our disclosure, but Baidu has yet to fix all issues that we reported to them. The mobile device manufacturers whose preinstalled keyboard apps we analyzed fixed issues in their apps except for their Baidu apps, which either only had the most serious issues addressed or, in the case of Honor, did not address any issues (see the table below for the security status of the apps that we analyzed as of April 1, 2024).

Legend
  • ✘✘ — working exploit created to decrypt transmitted keystrokes for both active and passive eavesdroppers
  • ✘ — working exploit created to decrypt transmitted keystrokes for an active eavesdropper
  • ! — weaknesses present in cryptography implementation
  • ✓ — no known issues or all known issues fixed
  • N/A — product not offered or not present on device analyzed

[Table: security status, as of April 1, 2024, of the third-party keyboard apps we analyzed (Tencent, Baidu, iFlytek) on Android, iOS, and Windows, and of the keyboard apps pre-installed by device manufacturers (Samsung, Huawei, Xiaomi, OPPO, Vivo, Honor). Among the entries, Baidu's apps are marked "!" on Android, iOS, and Windows, and Honor's pre-installed Baidu keyboard is marked "✘✘". An asterisk (*) denotes the default keyboard app on our test device.]

Both QQ Pinyin and Sogou IME are developed by Tencent; in this report we analyzed QQ Pinyin and found the same issues as we had found in Sogou IME.

In summary, we no longer have working exploits against any products except Honor’s keyboard app and Tencent’s QQ Pinyin. Baidu’s keyboard apps on other devices continue to contain weaknesses in their cryptography which we are unable to exploit at this time to fully decrypt users’ keystrokes in transit.

  1. After the publication of our report, Baidu responded to our disclosure. We have included this response, as well as our response to Baidu, in the Appendix.
]]>
HKLEAKS Doxxing Explained: Role of Online Harassment Tactics to Repress 2019 Hong Kong Protests https://citizenlab.ca/2023/07/hkleaks-doxxing-explained/ Thu, 13 Jul 2023 13:59:49 +0000 https://citizenlab.ca/?p=79552 Read the full report “Beautiful Bauhinia: “HKLeaks” – The Use of Covert and Overt Online Harassment Tactics to Repress 2019 Hong Kong Protests.”

What is this report about, and what did it find?

The report is an in-depth analysis of the doxxing campaign known as “HKLEAKS”, which began in August 2019 and for at least two years targeted protesters active in the Anti-Extradition Bill 2019-20 Hong Kong protests. 

In February 2019, the Hong Kong government proposed a bill regarding extradition, which would establish a mechanism for the transfer of fugitives to mainland China, Taiwan, and Macau. Critics claimed that it would endanger freedom of speech and civil liberties enjoyed in Hong Kong as people could be subject to arbitrary detention and unfair trials. The proposed bill sparked mass protests in Hong Kong, which the local authorities tried to violently repress.

The online campaign HKLEAKS used doxxing as its weapon of choice. Doxxing is the unauthorized public exposure of Personally Identifiable Information (PII) with the intent to cause harm to the targeted individual.

Previous analyses have examined parts of the HKLEAKS campaign, making varied assumptions on its true nature. Our research took a holistic approach, conducting an in-depth forensic examination of the whole campaign’s online footprint. 

  1. It showed that HKLEAKS was most likely an inorganic, highly coordinated, and well-resourced campaign despite posing as the expression of a grassroots movement. 
  2. The report found that the campaign benefited from a broader support network that included – but was not limited to – overt governmental entities, such as a bounty campaign run by a former Hong Kong Chief Executive, as well as Chinese state media.
  3. It also found multiple indicators suggesting the campaign was carried out by operators from, or with links to, mainland China. 
  4. Additionally,  the report concludes that the broader network’s diversification of tactics – including bounty campaigns, addressing international audiences with alleged grassroots anti-protest content, and overt governmental messaging – contributed to the doxxing campaign’s increased impact.

The outbreak of COVID-19, followed by  the implementation of the National Security Law in Hong Kong in mid-2020, effectively muzzled the street demonstrations and brought an end to HKLEAKS. Nevertheless, this case study offers an important lesson and a potential preview of the type of targeted disinformation and doxxing campaigns that may become more common in the future.

How was this study conducted? 

After collating several pieces of publicly available analysis produced about HKLEAKS over the past few years, we took a holistic assessment of the campaign. We conducted a complete forensic analysis of its footprint, mapping out its relationships with other networks and digital assets, both in Hong Kong and in mainland China.

We then listed all the relevant evidence that we could identify, and utilized an analytical technique known as Analysis of Competing Hypotheses (ACH) to evaluate alternative scenarios answering the research question: what was the nature of the HKLEAKS campaign?

We scored the resulting four alternative scenarios for likelihood, and analyzed the evidence in support of each of them.

We highlighted technical signals that the campaign was not run by a grassroots movement as claimed, but rather by well-resourced and sophisticated actors, consistent with a government or its proxies.

What was the attribution process?

We examined the collected evidence for signals supporting the determination of the campaign operators’ identity or affiliation. We found that the HKLEAKS actors consistently went to great lengths to hide, and that as a result, a conclusive attribution without access to privileged data (i.e. the kind stored by the web hosting or social media platforms that the campaign exploited) was ultimately unattainable.

However, we identified circumstantial evidence indicating that a governmental organization likely conducted, or at a minimum actively supported, the HKLEAKS campaign. Also, we located evidence pointing to the likelihood that such an organization had linkages to mainland China.

How did the doxxers target protestors? 

The actors published individual doxxing cards, each containing varied types of PII for the target person, on proprietary websites utilizing multiple permutations of an “hkleaks” web domain.

The doxxing cards were then distributed over social media and instant messaging channels. We found that the platforms predominantly used to disseminate the doxxing content were Telegram, WeChat and, at a later stage, Twitter.

While we could not identify significant dissemination of the HKLEAKS doxxing content through other broadly accessible social media platforms (notably, those owned by Meta: Facebook and Instagram), we did observe the supporting network promoting certain communities, such as for example specific Facebook Groups, apparently aligned with their anti-protest mission.

Can social media companies be held accountable for doxxing on their platforms? 

Legislation punishing doxxing has only started emerging in a few countries. Prosecution therefore remains difficult, as the available judicial tools often do not address doxxing as a criminal action having specific signatures. Notably, a legislative amendment ostensibly punishing doxxing was made by the Hong Kong authorities in 2022, although as we describe in the report, it has to date not been applied to the doxxing of protesters by HKLEAKS.

Similarly to legislation, social media platforms’ policies prohibiting doxxing also appear as nascent, fragmentary, and inconsistently applied. We found that large amounts of the doxxing content are still freely available on both Telegram and Twitter. Elsewhere, where the content could have been disseminated in past years (for example, over Facebook or Instagram), it is possible that it was and that it has been rapidly removed by the platforms. However, there is no publicly available indication that the responsible network of accounts was enforced on, and barred from subsequent activity, or that the platforms have formulated dedicated policies targeting doxxing as an adversarial behavior that harms their users.

What are the avenues of redress for the victims of doxxing? 

Methods for the victims to mitigate or redress the harm caused by doxxing are generally limited. Inherently, doxxing has the effect of intimidating the targets by exposing them, and their close circles, to the pressure of a sympathetic (to the attacker) public. That impact, when achieved, can be hard to reverse.

In the case of this particular operation, additionally, the legal options available to the targets were further neutralized by the hosting of the doxxing on acquiescent web hosting and social media platforms.

This is why a combination of effective legislation and strictly enforced online content policies against doxxing is necessary to both empower the targets and keep the offenders accountable.

In this environment, it is advisable that protestors – and more broadly, civil society actors – apply heightened standards of online privacy, as well as of digital security hygiene, including using the recommendations provided by tools such as the Consumer Reports’ Security Planner.

 

]]>
Privacy in the WeChat Ecosystem Explained https://citizenlab.ca/2023/06/privacy-in-the-wechat-ecosystem-explained/ Wed, 28 Jun 2023 15:00:07 +0000 https://citizenlab.ca/?p=79484 Read the full report Should We Chat: Privacy in the WeChat Ecosystem.

What is this report about, and what did we learn?

This report analyzes privacy issues with the popular app WeChat by reviewing the data collected by the app and sent to WeChat servers during the regular operation of its various features. We find that WeChat collects more usage data than is disclosed in its privacy policy.

Specifically, we find that WeChat is collecting activity and usage logs when users run Mini Programs. The WeChat privacy policy implies that only third parties collect this data, when in fact WeChat itself, not just the third-party developers of the Mini Programs, collects a vast amount of data. For the average user, this means your identity and activities on Mini Programs are disclosed to WeChat without an informed way to opt out of this data collection. This not only poses a privacy risk; it is also unknown how WeChat might use that information.

Additionally, we find that various operating system protections work to limit the amount of data the WeChat application may gather, and we encourage users to be cautious with sensitive permissions like location. Many new security features in newer Android versions seek to enforce permission boundaries and limit the types of identifiers available to the application.

What is WeChat, and how is it used?

WeChat is the most popular messaging and social media platform in China and third in the world, with over 1.2 billion monthly active users. According to some market research, network traffic from WeChat made up 34% of Chinese mobile traffic in 2018.

Many inside and outside China use WeChat out of necessity. Besides individuals in China, diaspora populations, family members, journalists, international activists, diplomats, people who do business in China, and just about anyone with a relationship in China are also using WeChat out of necessity.

What is Weixin, and its relation to WeChat?

According to the WeChat Terms of Service, if the user registered using a Chinese phone number (country code +86), they are considered a “Weixin user”. Tencent appears to characterize Weixin and WeChat as two “services” provided within the same “app” based on the language of both WeChat and Weixin’s policies. Both “services” are operated by two separate subsidiaries (WeChat International Pte. Ltd. in Singapore and Shenzhen Tencent Computer Systems Company Limited for Weixin).

In the app, the boundary between these two "services" is not clear. There are features operated by Weixin available to WeChat users. From our observation, both services also mostly use the same set of servers. Users of both services can directly communicate with each other.

What are Mini Programs?

Mini Programs are lightweight apps that can be downloaded and launched within the WeChat app. They can also sync and link with users’ WeChat accounts. The breadth and variety of Mini Programs is essentially the same as any other app ecosystem, like the Google Play Store or the Apple App Store. Mini Programs cover e-commerce, health, public services, gaming, and any other service an app may possibly be used for. This also means that many popular Mini Program apps manage sensitive data. Certain apps manage health data, government services, or perform financial transactions on behalf of the user.

How did you conduct this study?

To set the stage for this work, we first developed tools to study WeChat network requests. We then used these tools to identify and analyze data flowing from the WeChat client to the server during the usage of various WeChat features.

What type of data is sent to WeChat servers during Mini Program execution?

The data collection observed on Mini Programs is likely in place to enable the application monitoring and analytics features provided by WeChat, namely, "We分析" or "WeAnalyze". However, from our analysis, we find that all Mini Programs are automatically enrolled into the WeAnalyze program and data collection, and there is no reasonable way to opt out. To put this data collection into perspective, it would be an equivalent privacy violation if the Google Play Store automatically injected Google Analytics tracking scripts into all applications that were available on the platform.
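As a rough illustration of what "activity and usage logs" can mean in practice, an automatically enrolled analytics layer might record events of the following shape whenever a Mini Program page is opened. The field names below are hypothetical; the report documents the behavior, not this exact schema.

```python
# Hypothetical usage event recorded by a platform-level analytics layer when a
# Mini Program page is viewed; reported regardless of developer or user opt-in.
usage_event = {
    "mini_program_id": "wx_example_app",      # invented identifier
    "user_id": "wechat-account-identifier",   # ties the activity to the user's account
    "event": "page_view",
    "page": "pages/checkout/index",
    "timestamp_ms": 1687968000000,
}
```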

What other type of data is sent to WeChat servers?

Generally, WeChat collects device and network metadata on top of whatever other data it needs to implement the app’s functionality.

If your location permission is granted to WeChat, WeChat enables the “People Nearby” feature, which collects your location when you are using the application.

Certain features of WeChat send more usage and tracking data than others. Using Mini Programs or Channels, for instance, collects click/page data and tracks your usage of the app.

For a more comprehensive description, check out the full report.

Where are WeChat servers located?

We observed WeChat reporting to servers that are nominally located in Singapore and Hong Kong. The application also has the capability to contact servers in mainland China. Which servers the app uses may be determined based on your IP address location if you are logged out or your registered phone number if you are logged in.

What happens to the data after WeChat/Tencent collects it?

Using our methodology we cannot definitively say what happens to data after WeChat or Tencent collects it, since we are studying client behaviors. WeChat’s privacy policy specifies retention periods for certain types of data, like location data, log data, and messaging data. The privacy policy also provides conditions in which user data may be shared with Weixin, a service operated in Shenzhen, China, such as by communicating with users with mainland China accounts. However, we note that Weixin’s privacy policy does not specify any retention periods. Furthermore, previous research has observed that even communications entirely among North American accounts were still used to secretly train Weixin’s Chinese political censorship system.

What are the limitations of this work?

This report only looks at the behavior of a recent version of the WeChat mobile Android app. Even though we look at what types of data are sent to WeChat servers, we cannot always definitively say what WeChat servers are doing with that data.

Furthermore, we only investigated the application using a U.S. phone number, which limits the scope of our results to understanding the app’s behavior for users who do not have mainland China accounts. We also cannot test certain features, such as WeChat Pay.

Finally, WeChat is a very large app with many features. Although we do our best to be comprehensive, there may be blind spots in our study in which we may have failed to induce the application conditions necessary for the transmission of certain data.

Does the privacy policy address all of the data that is collected by the application?

Not quite. For certain core features, such as Messaging and Moments, the WeChat privacy policy addresses the data that is collected. However, according to WeChat’s privacy policy, the features with the most invasive tracking behavior, such as Search and Channels, are considered features run by a “third-party entity” named Weixin, a service operated in Shenzhen, China.

Though WeChat makes a separation between “WeChat” and “Weixin” services in the privacy policy, there is no such observable distinction on the application itself. All of the data collected by “WeChat features” and “Weixin features” are transmitted to the same servers.

The WeChat privacy policy also states that it will only share data with Weixin as necessary. However, app usage tracking for analytics is not necessary for the operation of the platform. In addition, we note that prior research found that non-mainland-Chinese user data was being used to train censorship algorithms for mainland-Chinese users.

Additionally, the WeChat privacy policy implies that only third-party privacy practices and policies govern Mini Programs, when in fact, WeChat/Weixin also collect lots of data. In fact, Mini Programs are not listed as subject to the Weixin privacy policy, and are instead listed under "Weixin Open Platform," which is only governed by third-party privacy policies.

What are some recommendations for Tencent?

Since there is no meaningful app distinction between features operated by WeChat or Weixin, WeChat’s privacy policy should cover “Weixin features” so that users may better understand how their data is handled when shared with the Shenzhen-based service.

WeChat should also allow users to opt out of extraneous tracking during usage of “Weixin” services. In particular, WeChat should remove forced enrollment of Mini Program analysis and tracking features and switch to an opt-in model. Currently, both developers and users are automatically enrolled into the WeAnalyze (We分析) data collection program with little notification. There is currently no way to opt out of the program for either developers or users.

For more recommendations, you can read the WeChat recommendations section of our report.

What are some recommendations for users?

For general WeChat users, we can provide a few recommendations:

  • Avoid features delineated as “Weixin services” if possible. Many core “Weixin” services (such as Search and Channels) perform more tracking than core “WeChat” services, and by using “Weixin” services your data is shared with an entity operating in Shenzhen, China.
  • Use stricter permissions. In modern versions of Android, it is possible to restrict certain permissions (like location access) to only when the application is open on screen or to outright deny these permissions.
  • Apply regular security and operating system updates. Many new security features on modern versions of Android are working to enforce permission boundaries and limit certain types of identifiers that are available to the application. We recommend regularly updating for additional security features down the line.

If I am a high risk user, how can I protect myself?

We caution no amount of adjustments can make the app completely “safe” for certain high-risk threat models. We can recommend alternative encrypted or anonymous messaging systems, but we also recognize that most WeChat users are on WeChat out of necessity. For high-risk users, we recommend talking to a security professional about your particular concerns to see what you can do to limit, manage, or reduce your exposure to risk while using the app.

]]>
FAQ: A comparison of search censorship in China https://citizenlab.ca/2023/04/faq-a-comparison-of-search-censorship-in-china/ Wed, 26 Apr 2023 14:00:34 +0000 https://citizenlab.ca/?p=79334 Read the full report Missing Links: A Comparison of Search Censorship in China

What has your study of Chinese search platforms revealed?

Given the strict regulatory environment in which search platforms operate, users in China have limited choice in how they search for information. However, even among those limited choices, we nevertheless found important differences in the levels of censorship and in the availability of information among these search platforms. Across the search platforms that we analyzed, we discovered over 60,000 unique censorship rules used to partially or totally censor search results returned on these platforms based on users’ queries. Most strikingly, we found that, although Baidu — Microsoft’s chief search engine competitor in China — has more censorship rules than Microsoft Bing, Bing’s political censorship rules were broader and affected more search results than Baidu. This finding runs counter to the intuition that North American companies infringe less on their Chinese users’ human rights than their Chinese company counterparts.

What kinds of search platforms did you look at?

We looked at eight popular search platforms operating in China. Our sample included three web search engines: Baidu, Sogou, and Microsoft Bing. It also included four Chinese social media networks: Weibo, a microblogging site similar to Twitter; Douyin, the Chinese version of Tiktok; Bilibili, a video sharing platform similar to Youtube; and Baidu Zhidao, a question and answer site similar to Quora. Finally, we tested Jingdong, a Chinese e-commerce platform similar to Amazon. Microsoft Bing is the only search platform that we analyzed that was not operated by a Chinese company.

What is “soft” censorship?  How does it compare to “hard” censorship?

One way to censor search results for a sensitive query is to simply return zero results for that query. We call this form of censorship “hard” censorship. Another, more subtle form of censorship is to only allow results for that query from certain authorized sources. We call this form of censorship “soft” censorship. On web search engines operating in China, soft-censored queries will generally only show results from Chinese government websites and state-aligned media. On social media sites, soft-censored queries will generally only show results from accounts which have a certain level of approval and verification.
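A simple heuristic sketch of how these two forms can be distinguished when measuring a platform is shown below. The allow-list of "authorized" domains is a placeholder for illustration, not the actual list used by any platform or by our study.

```python
# Heuristic classification of a query's result set: zero results suggests "hard"
# censorship; results drawn only from authorized sources suggests "soft" censorship.
AUTHORIZED_SOURCES = {"gov.cn", "people.com.cn", "xinhuanet.com"}  # illustrative only


def classify_results(result_domains: list) -> str:
    if not result_domains:
        return "hard censorship (zero results)"
    only_authorized = all(
        any(domain == src or domain.endswith("." + src) for src in AUTHORIZED_SOURCES)
        for domain in result_domains
    )
    if only_authorized:
        return "soft censorship (authorized sources only)"
    return "no censorship detected by this heuristic"
```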

What are some examples of search queries that you have discovered censored?

Across all of the platforms which we looked at, we discovered over 60,000 unique censorship rules used to partially or totally censor search queries. Many of the censorship rules were targeting politically sensitive material, including references to Chinese political figures like Xi Jinping, banned religious movements like the Falun Gong, and major historical events like the 1989 Tiananmen Square protests. Other rules targeted illicit activities like gambling, drug use, pornography, and the buying and selling of weapons. Some of the recently discovered censorship rules censor queries containing “中国间谍气球” [Chinese spy balloon], “成为下一个乌克兰 + 台湾” [Be the next Ukraine + Taiwan], and “逮捕令 + 普京 + 习近平” [Arrest Warrant + Putin + Xi Jinping]. Languages censored spanned English, Chinese, and Uyghur. Examples targeting Uyghur language include censorship of queries containing “ئەركىنلىك” [Freedom] and “ۋەتىنىمىز” [Our homeland].

How did you know if search results for a string of text were censored rather than genuinely having no results?

Our method of automatically measuring search censorship tests large swaths of text for content triggering censorship and, if present, isolating the content triggering its censorship. However, such large swaths of text might be too large and therefore too specific to have search results even when they are not censored. Thus, to test such content for censorship, we used special search queries that we call “truisms” to wrap search strings so that they either have a large number of results (if they are not censored) or zero results (if they are censored). These truisms work by taking advantage of advanced search operators that search platforms support. As an example, many search platforms support searching for results that contain either one string or another. Thus, instead of “xi jinping”, you might create a truism by searching for “xi jinping | the”, where the “|” symbol means “or”. Even if there is no content containing “xi jinping”, there is surely content containing “the”. In a more realistic example, “xi jinping” might be a sentence or more of content that we are testing at once, but the principle is the same — this query should always return results unless the query triggered censorship. Due to the diversity of support for advanced search operations, for each search platform we analyzed, we generally needed to find a different way of creating a truism on that search platform.
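A minimal sketch of the truism construction is shown below. The "|" (OR) operator is only one platform's syntax; as noted above, each platform generally required its own way of building an always-matching query.

```python
# Wrap the text under test in an OR query with a term guaranteed to have results,
# so the combined query returns many results unless a censorship rule is triggered.
def truism_query(text_under_test: str, always_matching_term: str = "the") -> str:
    return f"{text_under_test} | {always_matching_term}"


def looks_censored(result_count: int) -> bool:
    # A truism query should never legitimately return zero results.
    return result_count == 0


# e.g., truism_query("xi jinping") -> "xi jinping | the"
```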

All search platforms have some form of content moderation in place. How do the Chinese search platforms you’ve studied compare?

The censorship that we measure differs from typical search platform content moderation in both its methods and the type of content it targets. First, search engines such as Google might delist from their results individual web pages or websites containing content that is illegal to access in a user's jurisdiction. However, the web search engines we analyzed operating in China, including Microsoft Bing, used broad, keyword-based censorship rules to restrict queries for certain types of content to only show results from Chinese government websites and state-aligned media, or even to censor all results for a query altogether. Second, as our report shows, the content being censored on Chinese search platforms, including Bing, is largely political and religious content that is inconvenient to the Chinese Communist Party, whereas Google's content moderation generally is not concerned with such types of censorship.

Are these search results only censored for China-based users of these search platforms?

On only one platform — Microsoft Bing — was the censorship we measured restricted to only users accessing from mainland China. In all other platforms we looked at, censorship applied to all users, regardless of whether they were accessing the search platform from mainland China or not.

What do your findings mean for non-Chinese technology companies operating or desiring to operate search platforms in China?

Among web search engines Microsoft Bing and Baidu, Bing’s chief competitor in China, we found that, although Baidu has more censorship rules than Bing, Bing’s political censorship rules were broader and affected more search results than Baidu. Bing on average also restricted displaying search results from a greater number of website domains. Our work calls into question the ability of non-Chinese technology companies to better resist censorship demands than their Chinese counterparts. Our findings serve as a dismal forecast concerning the ability of other non-Chinese technology companies to introduce search products or other services in China without integrating at least as many restrictions on political and religious expression as their Chinese competitors. In fact, rather than North American companies having a positive influence on the Chinese market, the Chinese market may be having a negative influence on these companies. Previous work has shown how the Chinese censorship systems designed by Microsoft and Apple have affected users outside of China.

 

]]>
Mobility Data and Canadian Privacy Law Explained https://citizenlab.ca/2022/11/mobility-data-and-canadian-privacy-law-explained/ Tue, 22 Nov 2022 10:00:38 +0000 https://citizenlab.ca/?p=78969 On November 22, 2022 Citizen Lab published an analysis and recommendations pertaining to the collection of de-identified mobility data and its use under the socially beneficial and legitimate interest exemptions in Canadian privacy law. In this explainer, we discuss the report and accompanying recommendations with Amanda Cutinha and Christopher Parsons, the report’s authors.

What are the key findings of this report?

In the report, we investigate the collection of mobility data by the federal government, its legality under the existing and proposed commercial privacy regime, and proposed recommendations for the reform of draft Bill C-27 which would address many of the issues in the governance of mobility data.

The federal government obtained de-identified and aggregated mobility data from Telus and BlueDot, beginning  as early as March 2020, but this only came to the public’s attention in December 2021. The Standing Committee on Access to Information, Ethics, and Privacy (ETHI) investigated this data collection and ultimately raised concerns about the federal government’s inadequate consultation with the Office of the Privacy Commissioner, the failure of the government to verify consent had been provided to collect or disclose the mobility information, the broad purposes for data collection, and the unclear timeline for the government’s retention of data.

When we assessed the lawfulness of the collection of mobility data, we found that BlueDot and Telus likely complied with PIPEDA, Canada's current private-sector privacy legislation (the Personal Information Protection and Electronic Documents Act). Specifically, the de-identified information likely did not constitute personal information within the meaning of PIPEDA. This, however, led us to spotlight deficiencies in current privacy legislation. These included:

  • inadequate governance of de-identified data
  • an absence of appropriate transparency and accountability principles
  • a failure to adequately account for harmful impacts of data sharing
  • a neglect of commitments to Indigenous data sovereignty principles
  • insufficient enforcement mechanisms

We found that these deficiencies remain in the Consumer Privacy Protection Act (CPPA). Most pertinently, the proposed legislation contains significant exceptions to knowledge and consent where the purposes of data sharing are deemed either socially beneficial or within a corporation's legitimate interests. The result is that individuals' mobility information may be collected or used without knowledge or consent in the service of legitimate business interests, or disclosed to parties, including the federal government, for purposes deemed socially beneficial.

We make 19 corrective recommendations to the CPPA that would alleviate many of the thematic issues facing PIPEDA and, by extension, the CPPA. However, even were these amendments adopted the legislation should be significantly re-thought to protect individual and collective privacy rights.

What are the key privacy issues with regards to collection of mobility data during the COVID-19 pandemic?

We outline a number of privacy issues surrounding the collection of data during the COVID-19 pandemic.

First, there has been a lack of transparency concerning the collection, use, or disclosure of de-identified mobility data between private sector organizations and the federal government. Though the pandemic required timely and urgent responses, communications from the government were often muddled and did not clearly address whether the government was collecting mobility data. This lack of transparency can fuel distrust amongst members of the Canadian public who already doubt that the federal government respects their privacy rights.

Second, the federal privacy commissioner was not adequately involved in assessing the government’s collection of mobility information. In the case of the disclosure between Telus, BlueDot, and the federal government, the Privacy Commissioner was not engaged. Consequently, the Commissioner could not review the privacy practices linked to the activity in order to confirm the adequacy of de-identification or to ensure consent was obtained where necessary under law.

Third, while the government asserted it established requirements to protect Canadians’ privacy when entering into contracts with Telus and BlueDot, these requirements were not made public or discussed in greater detail.

Fourth, the stated purposes for the collection of data were very broad. They would allow for, in theory, the provision of information or policy advice to relevant provincial and municipal governments to target enforcement actions towards communities with higher-than-average mobility scores. This could have led to enforcement activities being applied to racialized neighborhoods where residents more regularly traveled significant distances for work. The prospect of disproportionate enforcement actions raises equity concerns.

Fifth, the absence of transparency was not limited to the purposes for data collection but continued through to retention timelines. The collection of data was to continue until the end of the pandemic, raising questions as to who decides when the pandemic is ‘over’.

Overall, these issues highlight deficiencies in the existing framework governing private-public data sharing: an absence of governance for de-identified data; a lack of transparency requirements in the sharing of data; inadequate protections to prevent function creep and long retention timelines; and the absence of requirements to consider the equity implications of information sharing with government agencies.

What are the potential negative consequences of collecting and sharing COVID-19 pandemic mobility data with the intention of being ‘socially beneficial’?

Individual privacy rights are at risk when data sharing can occur for socially beneficial purposes, where individuals whose data is being shared are neither aware of the sharing nor consent to it. Socially beneficial purposes can mean different things to differently-situated people: what may be perceived as being socially beneficial to one group may not be to another.

To give one example, consider the context of abortion-care services. One government might analyze de-identified data to assess how far people must travel to obtain abortion-care services and, subsequently, recognize that more services are required. Other governments could use the same de-identified mobility data and come to the opposite conclusion and selectively adopt policies to impair access to such services.

Moreover, the sharing of data for socially beneficial purposes without knowledge or consent may be interpreted as inherently paternalistic. Though the federal government is tasked with making policy that benefits the lives of its citizens, sharing data without knowledge and consent can undermine the data sovereignty of individuals in society. This problem is further pronounced for Indigenous people whose sovereignty has been historically undermined.

Is current privacy legislation adequate in protecting individuals’ privacy interests?

No. We argue that current commercial privacy legislation fails to adequately protect individuals’ privacy interests for the following reasons:

  1. PIPEDA fails to adequately protect the privacy interests at stake with de-identified and aggregated data despite risks that are associated with re-identification.
  2. PIPEDA lacks requirements that individuals be informed of how their data is de-identified or used for secondary purposes.
  3. PIPEDA does not enable individuals or communities to substantively prevent harmful impacts of data sharing with the government.
  4. PIPEDA lacks sufficient checks and balances to ensure that meaningful consent is obtained to collect, use, or disclose de-identified data.
  5. PIPEDA does not account for Indigenous data sovereignty nor does it account for Indigenous sovereignty principles in the United Nations Declaration on the Rights of Indigenous Peoples, which has been adopted by Canada.
  6. PIPEDA generally lacks sufficient enforcement mechanisms.

 Why does the collection of de-identified data matter?

De-identified data runs the risk of being re-identified, especially with the rapid evolution of machine learning technologies, breadth of publicly and commercially available datasets, and regularly evolving statistical methods for analyzing data. Where information which is sensitive is de-identified and not subject to the same privacy protections as identifiable, personal information, re-identification risks are magnified.
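A toy example of the linkage risk is sketched below: records in a "de-identified" mobility dataset can sometimes be matched back to named individuals by joining on quasi-identifiers that also appear in auxiliary data. The data here is invented; real linkage attacks draw on far richer auxiliary datasets and statistical methods.

```python
# Toy re-identification by linkage on quasi-identifiers (invented data).
deidentified_trips = [
    {"home_area": "M5S", "work_area": "M4Y", "avg_daily_km": 12.4},
]
auxiliary_records = [
    {"name": "A. Example", "home_area": "M5S", "work_area": "M4Y"},  # e.g., a public profile
]

for trip in deidentified_trips:
    for person in auxiliary_records:
        if (trip["home_area"], trip["work_area"]) == (person["home_area"], person["work_area"]):
            print(person["name"], "is a candidate match for the 'anonymous' trip record")
```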

Would implementing your recommendations solve issues with privacy law?

We wrote this report, in part, to provide practical solutions to gaps in draft privacy legislation. Our recommendations were drafted in light of this practical aim.

However, as we ultimately conclude, our recommendations are not a panacea – even if all of the changes were implemented, they would not ameliorate all of the issues with the CPPA. In order to adequately protect individual privacy rights, the correct approach would be to take a human rights centric approach to privacy protections.

Which recommendations are the most important?

In drafting recommendations, we sought to ameliorate existing deficiencies in current privacy law. The recommendations of the most concern relate to the exemptions to knowledge and consent for “socially beneficial” purposes and “legitimate interests” of organizations.

The sharing of de-identified mobility data between the private and public sector would be authorized under the socially beneficial purpose exemption to knowledge and consent under the draft CPPA. While socially beneficial activities can have positive characteristics, determining what constitutes a beneficial activity can be political. There is a risk that what is socially beneficial for some is not for others. The failure to narrow this exception may allow for information sharing that disproportionately intrudes on the privacy or equity rights of some individuals. We offer numerous recommendations intended to reduce the risks associated with potentially socially beneficial uses of data while, at the same time, not asserting that such sharing should be barred in its entirety.

While the socially beneficial purposes clause opens the door to sharing de-identified information with third-parties, such as government agencies, the legitimate interest exception enables private organizations to determine whether the collection or use of personal information outweighs the adverse effects of doing so. While the information cannot be used to influence an individual’s behavior or decisions it could be used to create datasets that facilitate business or policy developments. While the Privacy Commissioner could investigate organizations that use the exception, they would first need to know that organizations were collecting or using information under this exception; only then could the Commissioner request the organization’s records. The effect is that unless the Privacy Commissioner is zealously engaged in asking private organizations about whether they are collecting or using personal information under the legitimate interest exception, it will be private organizations that will principally be the judges and juries of whether their collection falls under the legitimate interest exception. We argue that organizations should need to be up front with the Commissioner about the use of this exception while, also, aiming to better empower individuals to control how private organizations collect and use their personal information.


]]>
Privacy and Security Analysis of the IATA Travel Pass Explained https://citizenlab.ca/2022/04/privacy-and-security-analysis-of-the-iata-travel-pass-explained/ Wed, 13 Apr 2022 13:00:25 +0000 https://citizenlab.ca/?p=76337 On April 13, the Citizen Lab published an analysis of the IATA Travel Pass. In this post, we discuss the significance of the report’s findings.

What are the main technical and policy findings of this report?

The registration process of the IATA Travel Pass (ITP) is flawed. The flaw allows an attacker to create an ITP account impersonating any person using only the victim’s passport details, without possessing the physical passport itself. The flaw is currently worked around by requiring users to present their physical passports whenever an ITP account is authenticated at a physical location.

ITP uses a blockchain-based technology, “Sovrin”, to verify the validity and authenticity of user-supplied digital COVID-19 test reports. Sovrin gives entities a way to issue unforgeable digital proofs and a way for others to independently verify them. However, in ITP, most if not all issuers (COVID-19 testing laboratories) rely on the same cloud-based web application, centrally managed by Evernym, a provider of Sovrin technologies. With this design, it is technically possible for Evernym to issue valid digital proofs in the name of the laboratories without their knowledge. This is one of the flaws we identified in which ITP’s system design nullifies the advantages brought by Sovrin, a decentralized blockchain system.
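To illustrate the underlying issue, the following Python sketch (using the widely available cryptography package) shows a generic signature-based credential scheme. It is not the actual Sovrin or Evernym implementation, and the report contents are hypothetical; it simply shows that whoever holds an issuer’s private signing key can produce proofs that verify as coming from that issuer.

# Minimal sketch of signature-based credential issuance; not the actual
# Sovrin/Evernym code. Whoever holds lab_private_key below can issue
# reports that verify as coming from the laboratory.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The laboratory's issuer key pair. Whichever party operates the issuing
# service ends up holding the private half.
lab_private_key = Ed25519PrivateKey.generate()
lab_public_key = lab_private_key.public_key()  # published for verifiers

# A hypothetical test report, issued as a signed credential.
report = b'{"passport_no": "X1234567", "result": "negative", "date": "2022-04-01"}'
signature = lab_private_key.sign(report)

# Any verifier holding the laboratory's public key can check the proof
# independently, with no central authority involved.
try:
    lab_public_key.verify(signature, report)
    print("Credential verifies as issued by the laboratory")
except InvalidSignature:
    print("Credential rejected")

Under such a scheme, a cloud provider that manages the private issuer keys of most laboratories is technically able to sign arbitrary reports in their names, which is the centralization problem described above.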

This report reveals a vulnerability in the IATA Travel Pass app that is the result of an intentional design decision. Does this mean the developers didn’t foresee this issue? Or was convenience prioritized over security?

From our correspondence with IATA, it appears that they were aware of this issue when making the decision.

The developers were faced with two options, each with different drawbacks:

  1. Sending user information (passport details and liveness test captures) to the server for verification would give trustworthy verification results, since the user cannot interfere with the verification process. However, this means that the server has to process highly sensitive user information, which increases the possibility of a data breach.
  2. Verifying user information on the phone itself makes it much easier for the user to interfere with the process and produce a forged result, as we have demonstrated. Since the result cannot be trusted, verifiers have to rely on other sources of verification instead. IATA stated that physical passports are currently required to verify user identity at COVID-19 testing laboratories. Checking physical passports is of course secure, but it also eliminates the need for digital passports, because the two are intended to serve the same purpose (authenticating user identity).

As we have shown, the developers chose the second option.
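The following Python sketch contrasts the two options in simplified form. The function and field names are invented and do not reflect IATA’s actual API; the sketch only illustrates why a verification result computed on the client cannot be trusted by the server.

# Hypothetical sketch of the two verification placements; not IATA's
# actual protocol or API.

def match_face_to_passport(passport_details, liveness_capture):
    # Stand-in for a real biometric comparison performed on the server.
    return True

def server_side_verification(passport_details, liveness_capture):
    """Option 1: the server runs the check itself. The client cannot forge
    the outcome, but the server must handle sensitive personal data."""
    return match_face_to_passport(passport_details, liveness_capture)

def client_side_verification(client_report):
    """Option 2: the phone verifies locally and uploads only a claim.
    The server cannot distinguish a genuine result from a forged one."""
    return bool(client_report.get("verified"))

# Under option 2, a modified client can skip the check entirely and
# simply assert success:
forged_report = {"passport_no": "X1234567", "verified": True}
print(client_side_verification(forged_report))  # True, yet nothing was verified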

ITP uses blockchain technology but part of the verification process is outsourced to a single provider, seemingly nullifying the benefits of a decentralized system. What challenges does this present?

Sovrin, which is a decentralized blockchain ecosystem, allows entities to issue, transmit, and validate digital proofs without needing any centralized authority. Compared to centralized ecosystems (such as mainstream social networking websites), a decentralized system tends to be more resilient against cyber attacks and network outages because there is no single point of failure. Failures (such as outages and data breaches) affect only the node (entity) in question, not the others. Decentralized systems are also less prone to surveillance, because surveilling a single node would not yield data about the others.

In a decentralized system, entities have to keep operation and maintenance in their own hands, because outsourcing them would also cede control, which defeats the purpose of decentralization. This is one of the issues our research identified with ITP.

The decision to outsource the operation and maintenance of laboratory systems was likely driven by resource constraints, as a single laboratory has far less information technology capability than a dedicated technical service provider. A single technical service provider operating systems for multiple laboratories is also likely to bear a lower overall cost than each laboratory operating its own systems.

This dilemma presents a challenge in choosing between decentralized and centralized system architectures. A centralized system consolidates control and responsibility in a central authority, which can leverage economies of scale to operate an efficient system while keeping operational costs low for most users. A decentralized system distributes control and responsibility to each participant: no one has to rely on a central authority, but each participant must bear more responsibility. Weighing these pros and cons is a central challenge when choosing between decentralized and centralized system designs.

ITP’s system design is a hybrid. Its current low-level system architecture is decentralized; however, it is encapsulated by a centralized high-level interface. If operated through the centralized interface (which is currently true for most cases), the system possesses the same set of security and privacy properties as conventional centralized systems.

What implications does this report have for travellers using IATA Travel Pass?

The “digital passport” feature of ITP is only intended to be used when registering with laboratories, not as a replacement for physical passports. When registering with laboratories, physical passports have to be cross-checked against the digital passport, because ITP’s system design flaw allows digital passports to be issued without possession of the physical counterpart.

Travellers might already expect their passport data to be shared with laboratories, because a consent form is shown in the app. They might also expect their data to be processed by the laboratories’ technical provider. However, they might not expect that the laboratories’ technical provider, Evernym, is also in charge of developing the ITP app and is a contractor of IATA. These relationships create trust issues, as it is technically possible for IATA to demand user passport data from Evernym.

What implications does this report have for companies looking to adopt, or that have already adopted, the IATA Travel Pass?

Although ITP uses a decentralized blockchain technology, Sovrin, under the hood, it implements a centralized interface that encapsulates Sovrin. This centralized interface is what laboratories use. If operated through the centralized interface, the system possesses the same set of security and privacy properties as conventional centralized systems.

It is Evernym, rather than the laboratories, that ultimately has the power to issue COVID-19 test reports, because laboratories delegate their private issuer keys to Evernym for easy management.

The software for issuing digital COVID-19 test results and the software for verifying them are implemented by the same vendor, Evernym. This creates a conflict of interest: Evernym has to ensure that all digital test results are issued properly (i.e., that the software does not leak private issuer keys and that results are issued only on the instruction of laboratories), while also being the entity that produces the software used to scrutinize whether the issued results are trustworthy.

The digital passport produced with ITP is not guaranteed to bear the same information as the physical counterpart that it was derived from. A digital passport could also be produced with arbitrary data, without needing a physical passport at all, because of ITP’s design flaw. The digital passport should be treated as an unverified copy of its physical counterpart.

]]>
Pandemic Privacy Explained https://citizenlab.ca/2021/09/pandemic-privacy-explained/ Tue, 28 Sep 2021 13:00:35 +0000 https://citizenlab.ca/?p=75513 On September 28, the Citizen Lab published an analysis of COVID-19 data collection practices. In this post, we discuss the significance of the findings with report authors.

What are the main findings of this report?

This report focused on how data was collected during the COVID-19 pandemic in the United States, United Kingdom, and Canada, the extent to which privacy inhibited pandemic responses in Canada, and how Canadian privacy legislation introduced during the pandemic would problematically have rewritten federal commercial privacy law had it not died on the order paper. 

In analyzing how COVID-19 data has been collected in the United States, United Kingdom, and Canada, we found that the breadth and extent of data collection constituted entirely novel technological responses to a health crisis, even though many of the adopted methods could be mapped onto a trajectory of past collection practices. We also found that the ability of private companies such as Google and Apple to forcefully shape some of the technology-enabled pandemic responses speaks to the significant power of private companies to guide public health measures that rely on contemporary smartphone technologies.

Throughout the pandemic, concerns arose that privacy, or privacy law, would prevent governments from adequately collecting, using, or sharing data to mitigate the spread of COVID-19. We did not find that privacy law was responsible for the problems that arose in Canadian governments’ responses to the pandemic. The privacy, health, and emergencies laws that were in place at the outset of the COVID-19 pandemic ensured that governments and private organizations alike were able to mobilize information to combat the pandemic.

Finally, in assessing potential future privacy legislation that emerged in the wake of the pandemic, we found that the Canadian government’s proposed legislation could have significantly extended the ability of private organizations to collect, use, or disclose personal information without individuals’ consent. Moreover, had the proposed legislation been adopted into law, it would have failed to include a human rights-based focus, with the effect of insufficiently protecting Canadians’ personal information at a time when such protections are sorely needed.

Given the many differences between countries, what do we gain by doing comparative analyses? 

We conducted a comparative analysis of the different technologies which were adopted to combat COVID-19. This approach let us better understand and assess how private organizations influenced state behaviours, while also making clear that the processes we were seeing were not limited to one country but were instead representative of a broader trend. Focusing on how collection technologies were used in a single country, in contrast, would not have afforded us the ability to draw equivalently broad conclusions.

When it comes to our legal and legislative analysis, we focused on a single jurisdiction to assess whether there were common trends in legislation that empowered governments in Canada to collect, use, and disclose personal information to combat the pandemic. By triangulating between three provinces—British Columbia, Ontario, and Quebec—as well as the federal government, we were able to make stronger claims about the relatively unrestricted ability of governments to handle personal information during health emergencies than if we had focused only on federal laws or the laws of one or two provinces.

How have different political cultures informed pandemic responses?

When looking at how collection technologies were used to support the pandemic response, we deliberately conducted a comparative analysis that included the United States, United Kingdom, and Canada. Doing so let us assess the extent to which these countries, which have different political cultures as well as health provisioning systems, would adopt collection technologies, and the extent to which their approaches would differ. Based on our study, we found that common trends in technology uptake occurred despite differences in nations’ political cultures. Specifically, in all jurisdictions we saw a willingness on the part of private industry to repurpose pre-existing consumer surveillance systems for disease surveillance; a push and pull between states and Apple and Google that ended with states capitulating and adopting the data collection system built by the two companies; and a general adoption of privacy-protective means of collecting disease symptom information from individuals who installed symptom checker applications on their smartphones.

Both the technologies deployed to combat COVID-19 as well as the legal rationales underpinning them are often framed as unprecedented, but the report notes that they have “historical legacies.” How do these approaches speak to deeply rooted approaches to health and civil liberties?

In the case of the technologies that we examined—which were built or configured to enable states to surveil, identify, and interrupt the spread of disease—they were part of a lineage of disease surveillance processes, practices, and techniques. However, the actual modes of surveillance, which often re-purposed existing communications infrastructures for mass surveillance to facilitate the mitigation of the pandemic, were unprecedented in scope.

When looking at laws that were in place at the start of the pandemic in Canada, we found that governments had developed an extensive legislative framework to facilitate data collection, use, and sharing. This framework was created, in part, as a response to the 2003 SARS outbreak, after which concerns had been raised that commercial privacy laws might impede responses to future pandemics. The key lessons from the SARS outbreak, which focused significantly on ensuring governments could collect, use, and share information efficiently, were not learned. As a result, there is precedent in Canada for governments failing to address a health crisis, and many of the failures that arose during SARS have been mirrored in the current health crisis.

What role do private companies play in the collection and sharing of public health data?

Private companies hold significant roles in how health care is provisioned and in how governments can mitigate and combat the spread of disease. Given the global nature of the COVID-19 pandemic, many companies were involved in the pandemic response: telecommunications and advertising companies that collect huge volumes of our personal information every day; Google and Apple, which transformed elements of their operating systems to enable decentralized and privacy-protective exposure notification applications; and companies that designed applications to help individuals assess whether they had symptoms of COVID-19. However, much of the data that was collected was already in the hands of private companies, and they either resisted sharing some of it, in part due to privacy concerns, or were involved not just in collecting the data but also in making decisions about what it meant. This put private companies in notable positions of influence over how states were informed of the efficacy of their policies.

In the case of the exposure notification applications that use the Google/Apple exposure notification system, countries were forced to adapt to the limits laid out by Google and Apple, regardless of what their own health care professionals said was needed to mount a public health response to the health emergency. While Google and Apple may have had good reasons for their decisions, such as preventing disease surveillance applications from being turned into mass surveillance tools by repressive governments, the fact remains that private companies dictated how public health workers would respond to the pandemic. This kind of resistance to states’ preferred modes of collecting data about a health emergency is notable, as is the fact that Google and Apple ultimately did transform elements of their operating systems to create the largest decentralized and privacy-protective exposure notification system that has ever existed.

How have Canada’s privacy laws prevented or enabled collecting, sharing, and using personal information throughout the pandemic?

At the outset of the pandemic in Canada, there were concerns that privacy might unduly prevent governments from collecting, using, or sharing personal information, with the effect that governments would be ill prepared to mitigate the spread of COVID-19. In our research, however, we did not find this to be the case: the web of health, emergencies, and privacy law that predated the COVID-19 pandemic ensured that governments and private organizations alike could mobilize data as needed to combat the pandemic.

While applications such as Canada’s COVID Alert were recognized by experts throughout Canada as being highly protective of individuals’ privacy rights, many Canadians still resisted installing the application on the grounds that they feared it would intrude upon their privacy. What does this situation suggest to you about how some Canadians interpret what is a legitimate, or illegitimate, intrusion into their privacy?

Canadian law empowers the government to collect significant amounts of personal information from Canadians, so long as there is a direct connection between the collection and the mandate of a given government agency. The Government of Canada, and their provincial counterparts, declined to exercise these powers under the law and instead ensured that the COVID Alert application would be opt-in, and that individuals would provide consent before any data collection took place. Despite the application being amongst the most privacy-protective government initiatives in recent memory, many Canadians hesitated to install it.

In our report, we discuss the need for governments to update their privacy legislation to ensure that Canadians can better understand how their data might be used once collected, and to ensure that governments will not extend their uses of collected personal data once they have obtained it. At the same time, it behooves governments to publicly consult with Canadians to explain when they will obtain consent before using particular types of data, and to build in processes that make Canadians aware of data collection, use, or disclosure at a time when their decision is linked to the action the government wants to undertake. This might mean that, before an agency used data it had collected, it would either inform the affected individual or obtain their further consent or approval. Doing so would help Canadians better understand how, when, and why their personal information was used, while also, hopefully, building trust that the government is not carelessly handling or using their personal information.

As we look to the future, what are the issues policymakers need to grapple with?

We identify a set of issues policymakers will need to attend to as a result of what has been learned throughout the COVID-19 pandemic. First, the pandemic has shown that private companies can exert considerable influence over the technical systems that are used to monitor for, and subsequently help to mitigate, the spread of disease. Has the balance of influence during the COVID-19 pandemic been appropriate, or should state actors be empowered to compel private companies to more closely adhere to what public health officials believe are necessary actions? 

Second, many of the technical systems that were used to collect data during the pandemic were untested at their launch: governments have never tried to rely on mobile phone data, or exposure notification applications, or even digital symptom checkers at the scale they were employed during the pandemic. Because many of these tools were experimental in nature there is an ongoing, and vigorous, debate over their utility or efficacy, and in particular concerns that some systems and tools may have exacerbated pre-existing social inequities. How can we ensure that future experimental systems and tools can be deployed in equitable ways, and who should be responsible for assessing the bioethical implications of these tools and the associated government policies? 

Third, the pandemic has showcased that there is a divide between the privacy protections expected by some members of the public and those guaranteed in law. Policymakers will need to contemplate how to regain trust that has already been lost, as well as propose privacy law reforms that satisfy individuals’ expectations of privacy while balancing those expectations against the needs of states to be prepared to respond to future health emergencies. The question before government officials is how they can undertake meaningful consultations so that Canadians feel heard about what they believe the government should do to better protect their personal information, and how governments will then act on what they hear.

Finally, proposed reforms to federal privacy legislation in Canada during the pandemic lacked human rights protections and, as a result, would have inadequately protected privacy and could have compounded distrust in how private organizations handle personal information. Policymakers preparing legislation for the next parliament will need to decide whether they want to re-introduce such deeply problematic legislation or whether, instead, they want to follow international consensus and adopt privacy legislation that is grounded in human rights.

]]>