Web scraping and copyright in factual content

On April 15, 2019, Justice Southcott of the Federal Court of Canada issued an order that puts to rest a dispute between the Toronto Real Estate Board (TREB) and website operator MongoHouse over the legality of web scraping of TREB’s Multiple Listing Service (MLS). The order is noteworthy because it touches on multiple interesting topics, including legal strategies for combatting web scraping and the limits of copyright in factual content.

The case

The TREB MLS is an online database of current and historic real-estate listings and sale prices that is maintained by the board and is accessible to brokers and agents. MongoHouse operated a website that provided publicly accessible property listing and sale price data. In its action against MongoHouse, TREB asserted that copyright subsists in the MLS and alleged that MongoHouse was infringing TREB’s copyright by accessing and copying data from the MLS and distributing it on the MongoHouse website.

In a final order on consent, Justice Southcott declared that TREB is the owner of the copyright in the MLS and that any unauthorized copying, data scraping, downloading, display or distribution of the MLS data is “a breach of TREB’s proprietary rights and copyrights associated with the TREB MLS.” Justice Southcott also declared that any unauthorized access to the MLS is a breach of section 41.1 of the Copyright Act, which prohibits the circumvention of technological protection measures. Lastly, Justice Southcott issued a permanent injunction enjoining MongoHouse from accessing, copying or distributing MLS data.

What is web scraping?

Web scraping is the practice of using automated computer software to extract data from a website and copy it to another location, such as a database accessible through another website. Web scraping can be an issue for any owner of proprietary or sensitive information, since scraping techniques can be used to gain unauthorized access to such information, thereby undermining the owner’s ability to control or commercialize their data.
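
To make the mechanics concrete, the following is a minimal sketch in Python of how a scraper turns a web page into structured records. It is an illustration only: the page markup, the class names ("listing", "address", "price") and the sample addresses and prices are all hypothetical, and nothing here reflects how the TREB MLS or the MongoHouse website actually worked. The page is held in a string, so the sketch never contacts a real website.

```python
from html.parser import HTMLParser

# Hypothetical listing page. The markup and class names are invented
# for illustration and do not correspond to any real website.
SAMPLE_PAGE = """
<ul>
  <li class="listing"><span class="address">12 Example St</span><span class="price">$500,000</span></li>
  <li class="listing"><span class="address">34 Sample Ave</span><span class="price">$750,000</span></li>
</ul>
"""


class ListingScraper(HTMLParser):
    """Collects the address and price text from each listing element."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per listing found on the page
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "listing":
            self.rows.append({})  # a new listing begins
        elif tag == "span" and css_class in ("address", "price"):
            self._field = css_class  # the next text belongs to this field

    def handle_data(self, data):
        # Store text only while inside an address or price element.
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_listings(html):
    """Return the listings found in the given HTML as a list of dicts."""
    parser = ListingScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


listings = scrape_listings(SAMPLE_PAGE)
for row in listings:
    print(row["address"], row["price"])
```

Once the data is in this structured form, it can be written to a separate database and republished, which is the step that raises the copyright and contractual questions discussed below.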

Strategies for combatting web scraping

TREB’s action against MongoHouse is not the first instance of a real estate entity taking action against another party’s unauthorized use of online real estate data. For example, in a previous blog post we covered two US cases where operators of regional MLSs sued real estate search engine and referral site NeighborCity.com for copyright infringement stemming from web scraping activities. And in 2011, the Supreme Court of BC held that website operator Zoocasa’s scraping of Century 21’s online real estate listings was a breach of the latter’s Terms of Use agreement, which explicitly prohibited copying or scraping of Century 21’s website.

Taken together, the MongoHouse, NeighborCity, and Zoocasa disputes demonstrate that web scraping of real estate websites and/or MLS databases may be actionable under copyright law or, as in Zoocasa, as a breach of contract. However, for web scraping to constitute copyright infringement, it must first be established that copyright subsists in the information that is the subject of the scraping. This can be problematic where the information at issue is factual, since copyright does not protect facts. For this reason, providing an express prohibition on scraping in a website’s Terms of Use can be a prudent safety measure.

Copyright protection over MLS type information

Two principles of copyright law bear on the determination of whether copyright subsists in a database of information, such as an MLS. On the one hand is the general rule that copyright does not subsist in facts. On the other hand, the Copyright Act states that “a work resulting from the selection or arrangement of data” is protectable as a compilation. Thus, while facts are not protectable, a compilation of facts may be if certain requirements are met.

For example, in the Alberta Court of Queen’s Bench decision in Geophysical Service Incorporated v EnCana Corporation, it was held that copyright subsists in compilations of raw and processed seismic data because the act of recording and compiling the data was an exercise of skill and judgment.

That said, copyright does not subsist in a compilation of facts simply because labour was expended in its creation. Indeed, in CCH Canadian Ltd. v Law Society of Upper Canada the Supreme Court of Canada clearly established that the skill and judgment necessary to attract copyright protection “must not be so trivial that it could be characterized as a purely mechanical exercise.”

Therefore, while a compilation of facts may be protected under copyright law, the act of compiling information is not, in and of itself, necessarily sufficient to give rise to copyright protection.

Whether copyright subsists in an MLS database is likely still an open question. On the one hand, the information contained within an MLS is factual and, therefore, does not attract copyright protection in isolation. But if the compilation of such information entailed sufficient skill and judgment, then an MLS could be protected as a compilation. If, on the other hand, the process of compiling information into an MLS database is merely a mechanical exercise, then copyright will not subsist in the ensuing compilation. Indeed, in 2017 the Federal Court of Appeal upheld a decision by the Competition Tribunal denying copyright protection over the TREB MLS on the grounds that “TREB’s specific compilation of data from real estate listings amounts to a mechanical exercise.”

Regardless, TREB has, at least for the time being, secured an order from the Federal Court establishing its ownership over the copyright in the TREB MLS. 

Key takeaways

The Federal Court’s order in TREB’s action against MongoHouse is an interesting development in the law surrounding web scraping of commercially valuable information. The order demonstrates that a plaintiff may seek relief under copyright law for harm flowing from another party’s web scraping activities.

However, because the order was issued on consent and was not accompanied by reasons, it does not conclusively settle whether copyright subsists in an MLS or other compilation of commercially valuable information.

For all the foregoing reasons, protecting online databases of proprietary or sensitive information remains a tricky task. Therefore, the operator of such a database would be wise to rely on multiple forms of protection, including expressly prohibiting web scraping in an applicable Terms of Use agreement.
