The controversial HiQ v. LinkedIn, 938 F.3d 985 (9th Cir. 2019) opinion on data scraping was vacated and remanded by the Supreme Court in June 2021 for reconsideration in light of the recently decided Van Buren v. United States, 141 S. Ct. 1648 (2021). This post walks through the reasoning of Van Buren and predicts the final outcome of HiQ v. LinkedIn.

To begin, let’s recall the basic facts of HiQ v. LinkedIn. HiQ is a data analytics company that uses automated bots to scrape public information from LinkedIn users’ profiles, including names, job titles, and work histories. It then runs an algorithm over that information to produce “people analytics,” which it sells to business clients. LinkedIn sent a cease-and-desist (C&D) letter demanding that HiQ stop accessing and copying the data, citing its User Agreement. In the litigation that followed, LinkedIn argued that HiQ violated the Computer Fraud and Abuse Act of 1986 (“CFAA”), which imposes criminal liability on anyone who “intentionally accesses a computer without authorization or exceeds authorized access,” and thereby obtains information from a protected computer. 18 U.S.C. §1030(a)(2). The legal issue is whether HiQ’s continued scraping after receiving the C&D letter constitutes access “without authorization” or “exceed[ing] authorized access.”

The interpretation of this clause has long been disputed. Some circuits have held that violating a policy or contract, such as terms of use or a confidentiality agreement, is enough to establish liability under the CFAA, while others have read the clause narrowly. Van Buren now provides more explicit guidance that resolves this circuit split. In Van Buren, a police sergeant ran a license-plate search in his department’s database for an acquaintance in exchange for money, which violated the department’s policy that the database be used only for law enforcement purposes. The Court ruled that Van Buren’s conduct does not fall within the scope of the CFAA, reasoning as follows.

According to 18 U.S.C. §1030(e)(6), the term “exceeds authorized access” means “to access a computer with authorization and to use such access to obtain or alter information in the computer that the accessor is not entitled so to obtain or alter.” Both parties agreed that Van Buren accessed the computer with authorization and obtained the information; the question was whether he was “entitled so to obtain” it. The Court focused on the word “so,” which refers to a manner already stated in the “preceding text.” The clause therefore covers only information that a person is not entitled to obtain “via a computer one is authorized to access,” because that is the only manner the provision has already mentioned. It cannot be read more broadly to include other limitations imposed by contracts or policies. Van Buren, at 1655. Reading “exceeds authorized access” as a whole, the Court noted that in the computing context “access” means entering either a computer system itself or a particular part of that system, such as files, folders, or databases. An authorized computer user may therefore lack access to a particular part of the system, and entering that part is what “exceeds authorized access.” Van Buren, at 1657-58.

In short, the Court construed this CFAA clause narrowly and in a rather technical sense. To hold otherwise would turn millions of otherwise law-abiding computer users who violate computer-use policies into criminals.

Applying this interpretation to HiQ v. LinkedIn, HiQ’s bots had authorized access to LinkedIn’s servers and were therefore entitled so to obtain LinkedIn’s member profiles, because those profiles are public data. Although the User Agreement stated that LinkedIn users own their profile information and that only LinkedIn is licensed to use, copy, publish, and process it, HiQ’s scraping would not be considered to “exceed authorized access” under the CFAA. The result will probably be the same, but this reasoning, which focuses on the statutory terms, differs sharply from the vacated 9th Circuit opinion’s analysis of whether HiQ’s conduct is analogous to “breaking and entering.”

Apart from giving an explicit reading of the CFAA in the computer context, this case also prompts reflection on data law more broadly. There is still no clear legislation on the ownership of data, but the 9th Circuit seemed to accept that data subjects own their data while the platform may be licensed to process it. As for data scraping, keeping the Internet ecosystem orderly and peaceful requires scrapers to abide by a site’s robots.txt rules and not to circumvent any technical measures implemented by the website owner.
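
For readers who build scrapers, the robots.txt point can be illustrated with a minimal Python sketch. It uses the standard library's urllib.robotparser to check whether a URL may be fetched before any request is made. The robots.txt content, the bot name "example-bot", and the example.com URLs are all hypothetical, and passing this check is only a technical courtesy, not a determination of authorization under the CFAA.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/public/profile",
        "https://example.com/private/page",
    ):
        verdict = "allowed" if is_allowed(EXAMPLE_ROBOTS_TXT, "example-bot", url) else "disallowed"
        print(url, verdict)
```

In a real scraper, the rules would be loaded from the target site (for example with the parser's set_url and read methods) rather than from a hard-coded string, and the check would run before each request.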
