Think Big! The Need for Patent Rights in the Era of Big Data and Machine Learning

By: Hyunjong Ryan Jin*

Introduction

With AlphaGo’s triumph over the 9-dan Go professional Lee Sedol in March 2016, Google’s DeepMind team conquered the last remaining milestone in board game artificial intelligence.[1] Just nineteen years after IBM Deep Blue’s victory over the Russian chess grandmaster Garry Kasparov,[2] Google’s success exceeded expert predictions by decades.[3] AlphaGo demonstrated how machine learning algorithms could enable processing of vast amounts of data. Played out on a 19 by 19 grid, the number of possible configurations on a Go board is astronomical.[4] With a near-infinite number of potential moves, conventional brute-force comparison of all possible outcomes is not feasible.[5] To compete with professional-level human Go players, a gaming artificial intelligence requires a more sophisticated approach than the algorithms employed for chess: machine learning.

The underlying science and implementation of machine learning was described in a Nature article two months prior to AlphaGo’s match with Lee. In the article, the Google team described how a method called “deep neural networks” decides between the insurmountable number of possible moves in Go.[6] The AlphaGo model was built by reinforcement learning from a database consisting of over thirty million moves of world-class Go players.[7] This allowed the algorithm to optimize the search space of potential moves, reducing the calculations required to determine the next move.[8] In other words, the algorithm mimics human intuition based on the “experience” it gained from the database “fed” into the algorithm, which drastically increases computational efficiency by eliminating moves not worth subsequent consideration. This allows the algorithm to devote computational resources to the outcomes of “worthwhile” moves.

The advent of such powerful analytical tools, capable of mimicking human intuition alongside massive computation power, opens endless possibilities: early-stage cancer detection,[9] accurate weather forecasting,[10] prediction of corporate bankruptcies,[11] natural event detection,[12] and even prediction of elections.[13] For information technology (“IT”) corporations, investment in such technology is no longer an option, but a necessity. The question that this Note addresses is whether the current state of intellectual property law is adequate to harness the societal benefits that we hope to enjoy through the advances in machine learning. In particular, are patents necessary in the age of big data? And if they are, how should we apply patent protection in the field of big data and machine learning?

Part I of this Note examines the need for intellectual property rights in machine learning and identifies the methods by which such protection may be achieved. The differences between trade secret, copyright, and patent protection in software are discussed, followed by the scope of protection offered by each means. This background provides the basis to discuss the effectiveness of each method in the context of machine learning and big data innovations. Part II discusses the basics of the underlying engineering principles of machine learning and demonstrates how the different types of intellectual property protection may apply.
Innovators may protect their contributions in machine learning by defending three areas: (1) the vast amounts of data required to train the machine learning algorithm, (2) innovations in the algorithms themselves, including advanced mathematical models and faster computational methods, and (3) the resulting machine learning model and the output data sets. Likewise, there are three distinct methods of protecting this intellectual property: patents, copyright, and secrecy.[14] This Note discusses the effectiveness of each method of intellectual property protection with three principles of machine learning innovation in mind: facilitating data sharing, avoiding barriers to entry from data network effects, and providing incentives to address the key technological challenges of machine learning. This Note proposes that patents on computational methods adequately balance the concern of patent monopoly against promoting innovation, and hence should be the primary means of intellectual property protection in machine learning. Part III then visits the legal doctrine of patentable subject matter, starting with the United States Supreme Court’s Alice decision. While Alice imposed a high bar for software patents, post-Alice Federal Circuit decisions such as Enfish, Bascom, and McRO suggest that certain types of software inventions are still patentable. Specifically, this section discusses the modern framework pertinent to subject matter analysis: (1) inventions that are directed to improvements of computer functionality rather than an abstract idea, (2) inventions that contain an inventive concept, and (3) inventions that do not improperly preempt other solutions. The Note applies this framework to innovations in machine learning. The Note proposes that patents for computational methods balance the need for intellectual property protection while permitting data sharing, paving the pathway for promoting innovation in machine learning. The Note further argues that machine learning algorithms are within patentable subject matter under 35 U.S.C. § 101.

I. Need for Intellectual Property Rights in Machine Learning

“He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me.” – Thomas Jefferson

“I’m going to destroy Android, because it’s a stolen product. I’m willing to go thermonuclear war on this. They are scared to death, because they know they are guilty.” – Steve Jobs

The two quotes above demonstrate the conflicting views on protecting intangible ideas with intellectual property law. Thomas Jefferson implied that the free circulation of inventive ideas and thoughts would not dampen the progress of innovation nor disadvantage innovators. On the other hand, Steve Jobs exhibited fury over the similarity between iOS and the Android OS. Why? Was it because his company was worse off due to the similarity between the two products? Would Apple have refrained from inventing the iPhone had it known others would enter the smartphone market? This section discusses the motives behind the grant of intellectual property rights and whether such protection should be extended to machine learning innovations. The basics of patent law, copyright law, and trade secret law are introduced to provide the analytical tools for the subsequent discussion on which type of intellectual property protection best promotes the socially beneficial effects of machine learning.

A. Do We Need Intellectual Property Rights for Machine Learning?

The primary objectives of intellectual property rights are to encourage innovation and to provide the public with the benefits of those innovations.[15] In the context of machine learning, it is not clear whether we need any additional incentives to promote participation in this field. Machine learning is already a “hot field,” with countless actors in industry and academia in active pursuit to keep pace.[16] Hence, incentivizing investment may not be a valid justification for granting intellectual property rights in machine learning. Rather, such protection is crucial to promote competition and enhance public benefits. The quality of inferences that may be drawn from a given data set increases exponentially as the aggregation diversifies, which is why cross-industry data aggregation will greatly enhance the societal impact of machine learning.[17] Companies will need to identify new data access points outside of their own fields to gain access to other data sets and further diversify their data. Yet the incentive structures of behemoth corporations may not be well suited to identify and grow niche markets.[18] It would be up to smaller, specialized entities to find the gaps that the larger corporations overlooked and provide specialized services addressing the needs of those markets. Protective measures that help newcomers compete against resource-rich corporations may provide the essential tools for startups to enter such markets. Sufficient intellectual property protection may serve as leverage that startups can use to gain access to data sets in the hands of the Googles and Apples of the world, thus broadening the range of social benefits from machine learning.

B. The Basics of Patent Law

“To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries” – United States Constitution, Article I, § 8

The United States Constitution explicitly authorizes Congress to promote the useful arts by granting inventors exclusive rights to their discoveries. Such constitutional rights stem from two distinct bases: (1) a quid pro quo in which the government issues a grant of monopoly in exchange for disclosure to society, and (2) the property rights of the inventor. The purpose of such rights is explicitly stated in the Constitution: to promote new inventions. The goal is to prevent second arrivers, who have not invested in the creation of the initial invention, from producing competing products and services at a lower price, undercutting the innovator whose costs are higher for having invested to create the invention. As an incentive for innovators willing to invest in new, useful arts, the patent system provides the innovator rights to exclude others from practicing the invention. Another purpose of such rights is the concept of “mining rights.” Akin to the grant of mining rights to an owner in an effort to suppress aggressive mining, the inventor should have the right to define and develop a given field by excluding other people from the frontiers of that knowledge. Considering the importance of industry standards in modern electronics, such a purpose acknowledges the importance of early-stage decisions that may define the trajectory of new technological advances.

C. The Thin Protection on Software Under Copyright Law

The Copyright Act defines a “computer program” as “a set of statements or instructions to be used directly or indirectly in a computer to bring about a certain result.”[19] Though it may seem counterintuitive to grant copyright protection for “useful arts” covered by patents, Congress has explicitly mandated copyright protection for software.[20] However, as will be discussed below, copyright protection of software has been significantly limited by case law.

Copyright protects against literal infringement of the text of the program. Source code, the lines of code that programmers “author” in computer languages such as C++ and Python, is protected under copyright as a literary work.[21] In Apple v. Franklin Corp., the Third Circuit Court of Appeals held that object code, which is the product of compiling the source code, is also considered a literary work.[22] Given that compiled code is a “translation” of the source code, this ruling seems to be an obvious extension of copyright protection. Removing the copyright distinction between source code and object code better reflects the nature of computer languages such as Perl, where the source code is not translated into object code but rather is directly fed into the computer for execution. However, the scope of protection on either type of code is very narrow. The copyright system protects the author against literal copying of code lines. This leaves open the opportunity for competitors to avoid infringement by implementing the same algorithm using different text.

Fortunately, in addition to protection against literal copying of code, copyright law may provide some protection of the structure and logical flow of a program. In a ruling equivalent to protecting the “plot” of a novel, the Second Circuit Court of Appeals held that certain elements of programming structure are considered expression (copyrightable) rather than idea (not copyrightable), extending copyright protection to non-literal copying.[23] The Computer Associates International v. Altai court applied a three-step test to determine whether a computer program infringes another program: (1) map the levels of abstraction of the program; (2) filter out protectable expression from non-protectable ideas; and (3) compare which parts of the protected expression are also present in the allegedly infringing program.[24] The merger doctrine is applied at step two of the Altai test to limit what may be protected under copyright law. Under the merger doctrine, code implemented for efficiency reasons is considered merged with the underlying idea, and hence not copyrightable.[25] Since most algorithms are developed and implemented out of efficiency concerns, the Altai framework may prevent significant aspects of software algorithms from receiving copyright protection. This means that for algorithms related to computational efficiency, patents may provide significantly more meaningful protection than copyright. The Federal Circuit, in the 2016 case McRO Inc. v. Bandai Namco Games America Inc., ruled that patent claims that “focus on a specific means or method that improves the relevant technology” may still be patentable.[26] Although preemption concerns may impede patentability, the exemption of patent rights by preemption is narrow compared to that of copyright by the merger doctrine.

The scènes à faire doctrine establishes yet another limitation on copyright for computer programs. Aspects of a program that are dictated by external concerns such as memory limits, industry standards, and other requirements are deemed non-protectable elements.[27] For mobile application software, it is difficult to imagine programs that are not restricted by form factors such as mobile AP computation power, battery concerns, screen size, and RAM limitations. As for machine learning software, the algorithms determine the “worthiness” of computation paths based on conserving computational resources. The external factors that define the very nature and purpose of such machine learning algorithms may exempt them from copyright protection.

D. Comparing Trade Secret and Non-disclosures with Patents

The crucial distinction between trade secret and patent law is secrecy. While patent applicants are required to disclose novel ideas to the public in exchange for a government-granted monopoly, trade secret law requires owners to keep the information secret. Though trade secret protection prevents outsiders from acquiring the information by improper means, it does not protect the trade secret against independent development or even reverse engineering of the protected information. In trade secret doctrine, the existence of prior disclosed art is relevant only for discerning whether the know-how is generally known, a different and simpler analysis than the issue of novelty in patent law.[28] The United States Supreme Court specified in Kewanee Oil that all matters may be protected under trade secret law, regardless of whether they may or may not be patented.[29] The Kewanee Oil court predicted that inventors would not resort to trade secret when offered a presumptively stronger protection by patent law:
“The possibility that an inventor who believes his invention meets the standards of patentability will sit back, rely on trade secret law, and after one year of use forfeit any right to patent protection, 35 U.S.C. § 102(b), is remote indeed.”[30]
Trade secret is an adequate form of protection for innovators concerned with the limits of what may be patentable. The secrecy requirement of trade secret law inherently provides protection that may outlive any patent rights, provided a third party does not independently acquire the secret. This coincides with an interesting aspect of machine learning and big data: the need for massive amounts of data. Developers need data to “train” the algorithm and increase the accuracy of the machine learning models. Companies that have already acquired massive amounts of data may opt to keep their data secret, treating the aggregated data as a trade secret. In addition to the sheer amount of amassed data, companies have all the more reason to keep their data secret if they have access to meaningful, normalized data. Even if a company amasses an enormous amount of data, the data sets may not be compatible with each other. Data gathered from one source may have different reference points or methodologies that are not immediately compatible with data from another source. This raises the concern of “cleaning” massive amounts of data.[31] Such concerns of data compatibility mean that parties with access to a single, homogenous source of high-quality data enjoy a significant advantage over parties that need to pull data from multiple sources.

However, data secrecy may not be a suitable strategy for companies that are aiming for cross-industry data aggregation. Institutions such as the Global Alliance for Genomics and Health are promoting data sharing between research participants. The Chinese e-commerce giant Alibaba announced a data sharing alliance with companies such as Louis Vuitton and Samsung to fight off counterfeit goods.[32] To facilitate the development of technology and to mitigate risks, various companies and research institutions across diverse fields are engaging in joint development efforts and alliances. Seeking protection under trade secret law runs against this trend of effective cross-industry collaboration. Yet there are countervailing arguments that trade secret law promotes disclosure by providing legal remedies that can replace the protection of secrecy.[33] Parties can sidestep the limitations of trade secrets by sharing proprietary information under the protection of contract law. While data sharing practices may void trade secret protection, the nature of continued accumulation of data and carefully drafted contractual provisions may provide sufficient protection for data owners.

II. Placing Machine Learning within Intellectual Property Law

“Learning is any process by which a system improves performance from experience.” – Herbert Simon, Nobel Prize in Economics 1978

The concept of machine learning relates to computer programs that have the capability to improve performance based on experience, with limited intervention from the programmer.[34] Machine learning models have the capability to automatically adapt and customize themselves for individual users, discover new patterns and correlations in large databases, and automate tasks that require some intelligence by mimicking human intuition.[35] This section dissects the mechanics of machine learning to identify the aspects of machine learning innovations that are at issue as intellectual property.

A. Machine Learning Basics

Machine learning methods are divided into two different approaches: supervised machine learning and unsupervised machine learning. In supervised machine learning, models are typically established by applying “labeled” sets of data to a learning algorithm. Labeled data refers to data sets that contain both the relevant features and the target results that the programmer is interested in. For example, we may be interested in developing a machine learning model that classifies images with dogs in them. The data sets for supervised machine learning would indicate whether a given image has dogs or not. The learning process begins with the algorithm fitting trends found in the training data set into different types of models. The algorithm compares the prediction errors of the models by inputting the validation set data into each model, measuring their accuracy. This allows the algorithm to decide which of the various models is best suited as the resulting machine learning model. Finally, the machine learning model is evaluated by assessing the accuracy of its predictive power. The developed model is then applied to data without a correct answer to test the validity of the model.

In unsupervised machine learning, the data sets are “unlabeled” data, which may not contain the result that the programmer is interested in. Returning to our dog image classification example, data sets for unsupervised machine learning will have pictures of various animals that are not labeled: the computer does not know which pictures are associated with dogs. The unsupervised machine learning algorithm develops a model that extracts common elements from the pictures, teaching itself the set of features that makes the subject of a picture a dog. In essence, unsupervised machine learning uses data sets that do not have specific labels fed into the algorithm for the purpose of identifying common trends embedded in that data set. The objective of developing such machine learning models varies. Sometimes the goal is to develop a prediction model that can forecast a variable from a data set. Classification, which assigns records to a predefined group, is also a key application of the algorithm. Clustering refers to splitting records into distinct groups based on the similarity within each group. Association learning identifies the relationships between features.

Figure 1. Overview of Machine Learning Model Development

Figure 1 illustrates the overall process of machine learning model development. The learning process of machine learning algorithms begins with the aggregation of data. The data originates from an array of diverse sources ranging from user input and sensor measurement to monitoring of user behavior.[36] The data sets are then preprocessed. The quality of data presents a challenge in improving machine learning models: any data that has been manually entered contains the possibility of error and bias.[37] Even if the data is collected through automatic means, such as health monitoring systems or direct tracking of user actions, the data sets require preprocessing to account for systematic errors associated with the recording device or method.[38] These include data skews due to differences between individual sensors, errors in the recording or transmission of data, and incorrect metadata about the sensor.[39] Simply put, the data sets may have differing reference points, embedded biases, or differing formats. The “cleaning” process accounts for these data skews.
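The supervised/unsupervised distinction described above can be made concrete with a short sketch. The example below is a minimal illustration using the scikit-learn library; the tiny numeric “image features” and labels are hypothetical stand-ins for the dog-image example in the text, not a real data set or any particular company’s method.

```python
# Minimal sketch of supervised vs. unsupervised learning with scikit-learn.
# The numeric "image features" and labels below are hypothetical stand-ins
# for the dog-image example discussed in the text.
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Supervised learning: each record is labeled (1 = contains a dog, 0 = does not).
features = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.8]]
labels = [1, 1, 0, 0]
classifier = LogisticRegression().fit(features, labels)
print(classifier.predict([[0.85, 0.15]]))  # predicts the label of a new image

# Unsupervised learning: the same records without labels; the algorithm groups
# them by similarity and must "teach itself" what distinguishes the groups.
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(features)
print(clusters)
```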
The objective of machine learning models is to identify and quantify “features” from a given data set. The term “feature” refers to an individually measurable property of an observed variable.[40] From the outset, there may be an extensive list of features present in a set of data. It would be computationally expensive to define and quantify each feature, and then to identify the inter-feature relationships, from massive amounts of data. Because processing massive amounts of data places high demands on computational power, dedicating computational resources to features that are outside the scope of the designer’s interest would waste that limited capacity.[41] The machine learning algorithm reduces this waste by applying dimensionality reduction to the preprocessed data sets.[42] The algorithm can identify an optimal subset of features by reducing the dimension and the noise of the data sets.[43] Dimensionality reduction allows the machine learning model to achieve a higher level of predictive accuracy, increased speed of learning, and improved simplicity and comprehensibility of the results.[44] However, the reduction process has limitations: reducing dimensionality inevitably limits the amount of insight and information that may be extracted from the data sets. If the machine learning algorithm discards a certain feature, the model will not be able to draw inferences related to that feature.

Following dimensionality reduction, the machine learning algorithm attempts to fit the data sets into preset models. Typically, three different types of data are fed into the machine learning model: the training set, the validation set, and the test set.[45] The machine learning algorithm “trains” the model by fitting the training set data into various models to evaluate the accuracy of each selection. The validation set is then used to estimate the error rate of each model when applied to data outside the training set that was used to develop it. Through this process, the machine learning algorithm selects the model that best describes the characteristics and trends of the target features from the training and validation sets.[46] The test set is then used to calculate the generalized prediction error, which is reported to the end user for proper assessment of the predictive power of the model.[47] Simply put, the training and validation sets are used to develop and select a model that reflects the trends of the given data set, and the test set is used to generate a report on the accuracy of the selected model.

The crucial elements in developing a machine learning model are (1) the training data, (2) inventions related to the machine learning algorithm, such as the method of preprocessing the training data, the method of dimensionality reduction, feature extraction, and the method of model learning and testing, and (3) the machine learning model and output data.[48] An ancillary element associated with the three elements above is the human talent required to implement such innovation.[49] Innovators in the field of machine learning may protect their investments by protecting one or more of the elements listed above. The difference between training data and output data, as well as the difference between the machine learning algorithm and the machine learning model, is best illustrated with an example.
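As a rough illustration of this pipeline, the sketch below strings together preprocessing, dimensionality reduction, model selection on a validation set, and error reporting on a held-out test set. It assumes the scikit-learn library; the digits data set, the choice of PCA, and the two candidate models are illustrative assumptions rather than anything prescribed by the sources cited in this Note.

```python
# Sketch of the pipeline described above: preprocessing ("cleaning"/scaling),
# dimensionality reduction, model selection on a validation set, and final
# error estimation on a held-out test set.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)

# Split the data into training, validation, and test sets.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Preprocess and reduce dimensionality to discard noisy, low-value features.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=20).fit(scaler.transform(X_train))

def prepare(data):
    return pca.transform(scaler.transform(data))

# Fit candidate models on the training set and compare them on the validation set.
candidates = [LogisticRegression(max_iter=1000), KNeighborsClassifier(n_neighbors=5)]
best = max(
    candidates,
    key=lambda m: accuracy_score(y_val, m.fit(prepare(X_train), y_train).predict(prepare(X_val))),
)

# Report the generalized prediction error on the untouched test set.
print(type(best).__name__, accuracy_score(y_test, best.predict(prepare(X_test))))
```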
Let us assume a credit card company wants to use machine learning to determine whether it should grant a premium credit card to a customer. Let us further assume that the company would prefer to grant this card to customers who would be profitable to the company while filtering out applicants who are likely to file for bankruptcy. Data sets containing prior applicant information would correspond to the training data. The company would apply a mathematical method of extracting insight about the correlation between the features and the criteria that the company wants to evaluate (e.g., profitable for the firm or likely to file for bankruptcy). These mathematical methods are referred to as machine learning algorithms. The resulting mechanism, such as a scoring system, that determines eligibility for card membership is the machine learning model. The credit card applicant’s personal data would be the input data for the machine learning model, and the output data would include information such as the expected profitability of the applicant and the likelihood that the applicant files for bankruptcy.
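A minimal sketch of how the credit card example maps onto code follows, again assuming scikit-learn. The applicant fields (income, delinquencies, debt ratio), the labels, and the use of a decision tree are hypothetical choices made purely for illustration, not the actual method of any credit card company.

```python
# Illustrative mapping of the credit card example onto code. All data and the
# choice of a decision tree are hypothetical assumptions for illustration only.
from sklearn.tree import DecisionTreeClassifier

# Training data: records of prior applicants with the outcomes the company cares about.
prior_applicants = [[45000, 2, 0.30], [120000, 0, 0.10], [30000, 5, 0.65], [90000, 1, 0.20]]
outcomes = ["profitable", "profitable", "bankruptcy", "profitable"]  # labels

# Machine learning algorithm: the mathematical method that extracts the
# correlation between the features and the outcome of interest.
algorithm = DecisionTreeClassifier(max_depth=3)

# Machine learning model: the resulting scoring mechanism produced by training.
model = algorithm.fit(prior_applicants, outcomes)

# Input data: a new applicant's personal data; output data: the model's prediction.
new_applicant = [[60000, 3, 0.45]]
print(model.predict(new_applicant))
```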

B. Industry Trends in Machine Learning

Discussing the incentive structures and trends behind the machine learning industry is essential to identifying adequate methods of intellectual property protection. The current trends in the world of machine learning will predict which intellectual property regime is most useful to companies seeking to protect their work. The United States has chronically struggled to maintain an adequate supply of talent in the high-tech industry, a deficit that continues in the field of machine learning.[50] According to a report by the McKinsey Global Institute, the United States’ demand for talent in deep learning “could be 50 to 60 percent greater than its projected supply by 2018.”[51] Coupled with the dearth of machine learning specialists, the short employment tenure at software companies further complicates the search for talent. Software engineers at companies such as Amazon and Google have reported an average employment tenure of one year.[52] While some part of the high attrition rate may be attributed to cultural aspects of the so-called “Gen Y” employees, the “hot” demand for programming talent has a significant impact on the short employee tenure.[53] Job mobility within the software industry is likely to increase as the “talent war” for data scientists intensifies. Employee mobility and California’s prohibition against “covenants not to compete” have been credited as key factors behind the success of Silicon Valley.[54]

Another trend in the field is the rapid advance of machine learning methods. Due to the fast-paced development of the field, data scientists and practitioners have every reason to work with companies that would allow them to work at the cutting edge of machine learning, using the best data sets. This may influence the attrition rates and recruiting practices of the software industry mentioned above.[55] The eagerness of employees to publish scientific articles and contribute to the general machine learning community may be another factor of concern. To accelerate innovation by repurposing big data for uses different from the original purpose, and to form common standards for machine learning, more industries are joining alliances and collaborations.[56] Cross-industry collaborations may enable endless possibilities. Imagine the inferences that may be drawn by applying machine learning methods to dietary data from home appliances, biometric data, and data on the weather patterns around the user. Putting privacy nightmares aside, machine learning with diverse data sets may unlock applications that were not previously possible. More companies are attempting to capitalize on the commercial possibilities that data sharing may unlock.[57]

C. Machine Learning Innovators – Protect the Data or Inventions?

Though it may seem intuitive that patent protection is the best option, innovations in machine learning may not need patent protection. Trade secret protection of the data sets may be sufficient to protect the interests of practicing entities while avoiding disclosure of their inventions during the patent prosecution process. Furthermore, numerous software patents have been challenged as unpatentable abstract subject matter under 35 U.S.C. § 101 since the Alice decision in 2014.[58] Though subsequent decisions provided guidelines on the types of software patents that would survive Alice, it is not clear how the judiciary will view future machine learning patents. Such issues raise the question of patentability for machine learning: should we, and can we, resort to patents to protect machine learning inventions? Following the discussion of the building blocks of machine learning and recent emerging trends in the field, this section discusses the mode and scope of protection that the current legal system provides for each element pertinent to innovation in machine learning. The possible options for protecting innovations are (1) non-disclosure agreements and trade secret law, (2) patent law, and (3) copyright. These three options may be applied to the three primary areas of innovation: (1) training data, (2) inventions related to computation, data processing, and machine learning algorithms, and (3) machine learning models and output data. This discussion will provide context for the methods of protection for innovations in machine learning by examining the costs and benefits of the various approaches.
1. Protecting the Training Data—Secrecy Works Best
Access to massive amounts of training data is a prime asset for companies in the realm of machine learning. The big data phenomenon, which triggered the surge of interest in machine learning, is predicated on the need for practices to analyze large data resources and the potential advantages of such analysis.[59] Lack of access to a critical mass of training data prevents innovators from making effective use of machine learning algorithms. Previous studies suggest that companies are reluctant to share data with each other.[60] Michael Mattioli discusses the hurdles against sharing data and the considerations involved with reuse of data in his article Disclosing Big Data.[61] Indeed, there may be practical issues that prevent recipients of data from engaging in data sharing. Technical challenges in comparing data from different sources, or inherent biases embedded in data sets, may complicate receiving outside data.[62] Mattioli also questions the adequacy of the current patent and copyright system to promote data sharing and data reuse: information providers may prefer not to disclose any part of their data due to the rather thin legal protection for databases.[63] Perhaps this is why secrecy seems to be the primary method of protecting data.[64] The difficulty of reverse engineering to uncover the underlying data sets promotes reliance on non-disclosure.[65] Compared to the affirmative steps required to maintain trade secret protection if the data is disclosed, complete non-disclosure may be a cost-effective method of protecting data.[66] Companies that must share data with external entities may rely more heavily on contract law than on trade secret law. In the absence of contract provisions, it would be a challenge to prove that a trade secret has been acquired through misappropriation by the recipient party.

The “talent war” for data scientists may also motivate companies to keep their training data sets secret. With a shortage of talent to implement machine learning practices and rapid developments in the field, retaining talent is another motivation for protecting against unrestricted access to massive amounts of data. Companies may prefer exclusivity over the data sets that programmers can work with: top talents in machine learning are lured to companies with promises of exclusive opportunities to work with massive amounts of data.[67] The rapid pace of development in this field encourages practitioners to seek opportunities that provide the best resources to develop their skill sets. This approach is effective because a key limitation on exploring new techniques in this field is the lack of access to high-quality big data. Overall, secrecy over training data fits well with corporate recruiting strategies to retain the best talents in machine learning.

Non-disclosure and trade secret protection seem to be the best mode of protection. First, despite the additional legal requirements necessary to qualify for trade secret protection, trade secrets fit very well with a non-disclosure strategy. Patent law, on the other hand, is at odds with the principle of non-disclosure. While trade secret law provides companies protection without disclosing information, patent law requires disclosure in exchange for monopolistic rights. Furthermore, neither patent nor copyright law provides adequate protection for the underlying data. Patent law rewards creative concepts and inventions, not compiled facts themselves.
Copyright may protect labeling or distinct ways of compiling information, but it does not protect the underlying facts. Also, as a practical matter, the difficulty of reverse engineering machine learning models does not lend itself well to detecting infringement. Analyzing whether two parties used identical training data would not only be time consuming and costly, but may be fundamentally impossible. If companies were to seek protection of training data, it would be best to opt for secrecy by non-disclosure. This would mean companies would opt out of the cross-industry collaborations illustrated above. This may be less of a concern for innovation, as companies may still exchange output data as a means of facilitating cross-industry collaboration.
2. Protecting the Inventions—Patent Rights Prevail
Adequate protection of inventive approaches to processing data is becoming increasingly important as various industries begin to adopt a collaborative, alliance-based approach to machine learning. Cross-industry collaboration requires the implementation of methods such as preprocessing diverse data sets for compatibility. As the sheer amount of data increases, more processing power is required. The machine learning algorithm needs to maintain a high degree of dimensionality to accurately identify the correlations among a high number of relevant features. The need for more innovative ideas to address such technological roadblocks will only intensify as we seek more complex applications for machine learning. The three primary areas where novel ideas would facilitate innovations in machine learning are pre-training data processing, dimensionality reduction, and the machine learning algorithm itself.

Access to massive amounts of data alone is not sufficient to sustain innovation in machine learning. The raw data sets may not be compatible with each other, requiring additional “cleaning” of data prior to machine learning training.[68] The data provided to the machine learning algorithm dictates the result of the machine learning model, so innovations in methods to merge data with diverse formats are essential to enhancing the accuracy of the models. As cross-industry data analysis becomes more prominent, methods of merging data will have a more significant impact on advancing the field of machine learning than the mere collection of large data sets. Cross-industry data sharing would be useless unless such data sets are merged in a comparable manner.[69]

Companies can opt to protect their inventive methods by resorting to trade secret law. The difficulty of reverse engineering machine learning inventions, coupled with the difficulty of patenting software methods, provides incentives for innovators to keep such inventions secret from the public. However, two factors would render reliance on non-disclosure and trade secret law ineffective: frequent turnover of software engineers and the rapid speed of development in the field. Rapid dissemination of information through employment mobility may endanger intellectual property protection based on secrecy. Furthermore, while the law will not protect former employees who reveal trade secrets to their new employers, the aforementioned fluid job market coupled with the general dissemination of information makes it difficult to distinguish trade secrets from former employment from general knowledge learned through practice. The difficulty of reverse engineering machine learning models also works against the trade secret owner in identifying trade secret misappropriation: how do you know others are using your secret invention? The desire of software communities to discuss and share recent developments in the field does not align well with the use of secrecy to protect innovations in machine learning. Secrecy practices disincentivize young data scientists from joining because they limit opportunities to gain recognition.[70] The rapid development of machine learning technology also presents challenges to reliance on trade secret law. Secret methods may be independently developed by other parties.
Neither trade secret law nor non-disclosure agreements protect against independent development of the same underlying invention.[71] Unlike training data, machine learning models, or output data, there are no practical limitations that impede competitors from independently inventing new computational methods for machine learning algorithms. With such a fluid employment market, a high degree of dissemination of expertise, and a rapid pace of development, patent protection may provide the assurance of intellectual property protection for companies developing inventive methods in machine learning. Discussion of overcoming the barriers to patenting software will be presented in later sections.[72]
3. Protecting the Machine Learning Models and Results—Secrecy Again
The two primary products of applying machine learning algorithms to training data are the machine learning model and the accumulation of results produced by inputting data into that model. The “input data” in this context refers to individual data that is analyzed using the insights gained from the machine learning model. In a recent article, Brenda Simon and Ted Sichelman discuss the concerns of granting patent protection for “data-generating patents,” which refers to inventions that generate valuable information in their operation or use.[73] Exclusivity based on patent protection may be extended further by trade secret protection over the data that has been generated by the patented invention.[74] Simon and Sichelman argue that the extended monopoly over data may potentially overcompensate inventors since the “additional protection was not contemplated by the patent system[.]”[75] Such expansive rights would have an excessive negative impact on downstream innovation and impose exorbitant deadweight losses.[76] The added protection over the resulting data derails the policy rationale behind the quid pro quo exchange between the patent holder and the public by excluding the patented information from the public domain beyond the patent expiration date.[77]

The concerns raised by data-generating patents also apply to machine learning models and output data. Corporations may obtain patent protection over the machine learning models. Akin to the preference for secrecy for training data, non-disclosure would be the preferred mode of protection for the output data. The combined effect of the two may lead to data network effects in which users have strong incentives to continue using a given service.[78] The companies that have exclusive rights over the machine learning model and output data gather more training data, increasing the accuracy of their machine learning products. This reinforcement by monopoly over the means of generating data allows a few companies to have disproportionately strong dominance over their competitors.[79] Market dominance by data-generating patents becomes particularly disturbing when the patent on a machine learning model preempts other methods in the application of interest. Trade secret law does not provide protection against independent development. However, if there is only one specific method to obtain the best output data, no other party would be able to create the output data independently. Exclusive rights over the only method of producing the data provide the means for the patent holder to monopolize both the patent and the output data.[80] From a policy perspective, this excessive protection does seem troubling. Yet such draconian combinations are less feasible after the recent rulings on the patentable subject matter of software, which will be discussed below.[81] Mathematical equations or concepts are likely directed to an “abstract concept,” and thus will be deemed directed to patent-ineligible subject matter.[82] Furthermore, though recent Federal Circuit cases have found software patents that passed the patentable subject matter requirement, those cases expressed limitations against granting patents that would improperly preempt all solutions to a particular problem.[83] The rapid pace of innovation in the field of machine learning compared to the rather lengthy period required to obtain patents may also dissuade companies from seeking patents.
Overall, companies have compelling incentives to rely on non-disclosure and trade secrets to protect their machine learning models instead of seeking patents. The secrecy concerns regarding training data apply to machine learning models and the output data as well. Non-disclosure would be the preferred route for obtaining protection over these two categories. However, using non-disclosure or trade secrets to protect machine learning models and output data presents challenges that are not present in the protection of training data. The use of secrecy to protect machine learning models or output data conflicts with recruiting strategies to hire and retain top talent in the machine learning field. Non-disclosure agreements limit the employee’s opportunity to gain recognition in the greater machine learning community. In a rapidly developing field where companies are having difficulty hiring talent, potential employees would not look fondly on corporate practices that limit avenues for building a reputation within the industry.[84]

Companies have additional incentives to employ a rather lenient secrecy policy for machine learning models and the output data. They have incentives to build coalitions with other companies to monetize the results. Such cross-industry collaboration may be an additional source of income for those companies. The data and know-how that Twitter has about fraudulent accounts within its network may aid financial institutions such as Chase with novel means of preventing wire fraud. The reuse of insights harvested from the large amount of raw training data can become a core product that companies would want to commercialize. Data reuse may have an incredible impact even for applications ancillary to the primary business of the company.

Interesting aspects of disclosing machine learning models and output data are the difficulty of reverse engineering and the need for consistent updates. If a company already has sufficient protection over the training data and/or the computational innovations, competitors will not be able to reverse engineer the machine learning model from the output data. Even with the machine learning model, competitors will not be able to provide updates or refinements to the model without the computational techniques and sufficient data for training the machine learning algorithm. In certain cases, the result data becomes training data for different applications, which raises concerns about competitors using the result data to compete with the innovator. Yet the output data would contain fewer features and insights than the raw training data that the innovator possesses, and competitors would therefore inherently be at a disadvantage when competing in fields in which the innovator has already amassed sufficient training data. Granting patents on machine learning models may incentivize companies to build an excessive data network while preempting competitors from entering the market. This may not be feasible in the future, as technological preemption is becoming a factor of consideration in the patentable subject matter doctrine. Companies may use secrecy as an alternative, yet they may have fewer incentives to maintain secrecy compared to the protection of training data.

D. Need of Patent Rights for Machine Learning Inventions in the Era of Big Data

The current system, on its surface, does not provide adequate encouragement for data sharing. If anything, companies have strong incentives to avoid disclosure of their training data, machine learning models, and output data. Despite these concerns, data reuse may enable social impacts and advances that would not otherwise be possible. Previous studies have pointed out that one of the major barriers preventing advances in machine learning is the lack of data sharing between institutions and industries.[85] Data scientists have demonstrated that they were able to predict flu trends with data extracted from Twitter.[86] Foursquare’s location database provides Uber with the requisite data to pinpoint the location of users based on venue names instead of addresses.[87] Information about fraudulent Twitter accounts may enable early detection of financial fraud.[88] The possibilities that cross-industry data sharing may bring are endless.

To encourage free sharing of data, companies should have a reliable method of protecting their investments in machine learning. At the same time, protection based on non-disclosure of data would defeat the purpose of promoting data sharing. Hence, protection of the computational methods involved in machine learning maintains the delicate balance between promoting data sharing and protecting innovation. Protection of inventions in the machine learning algorithm provides one additional merit beyond allowing data sharing and avoiding the sort of excessive protection that leads to a competitor-free road and data network effects: it incentivizes innovators to focus on the core technological blocks to the advancement of the field and encourages disclosure of such know-how to the machine learning community.

Then what are the key obstacles to obtaining patents on machine learning inventions? While there are arguments that the definiteness requirement of patent law is the primary hurdle against patent protection of machine learning models due to reliance on subjective judgment, there is no evidence that the underlying inventions driving big data face the same challenge.[89] Definiteness may be satisfied by defining the scope of the invention with reasonable certainty to those skilled in the art at the time of filing.[90] There is no inherent reason why specific solutions for data cleaning, enhancement of computational efficiency, and similar inventions would be deemed indefinite by nature. Since the United States Supreme Court invalidated a patent on computer-implemented financial transaction methods in the 2014 Alice decision, the validity of numerous software and business method patents has been challenged under 35 U.S.C. § 101.[91] As of June 8, 2016, federal district courts had invalidated 163 of the 247 patents considered under the patentable subject matter doctrine, striking down 66% of challenged patents.[92] The U.S. Court of Appeals for the Federal Circuit invalidated patents in 38 of the 40 cases it heard.[93] Arguably, the public benefits from such high rates of post-issuance invalidity. The public still has access to the disclosures from the patents and patent applications. In reliance on granted patents, companies may have already invested in growing related businesses, catering to the needs of consumers. At the same time, the patent holder’s monopolistic rights have been shortened as the result of litigation. Effectively, the price that the public pays to inventors in exchange for the benefits of disclosure is reduced.
Yet the high degree of invalidity raises several concerns for the software industry. Smaller entities, lacking market influence and capital, have difficulty competing against established corporations without the monopolistic rights granted through the patent system. Investors become hesitant to infuse capital into startups for fear that invalidity decreases the worth of patents. Reliance on trade secret law has its own limitations due to the disclosure dilemma: the inventor needs to disclose the secret to lure investors, but risks losing secrecy in the process. Copyright law does not provide appropriate protection either. The restrictions imposed by the merger doctrine and the scènes à faire doctrine constrain copyright protection of software. Though copyright provides a method of protecting against literal copying of code, it does little to protect the underlying software algorithms and innovation. Ultimately, the increase in alliances and collaboration provides incentives for parties to obtain patent rights. Reliance on trade secret or copyright is not a suitable method of protecting their intellectual property. Furthermore, market power or network effects alone cannot sufficiently mitigate the risks involved with operating a business. Patents become even more important for startups because they provide investors with assurance that, in the worst case, the patents may still serve as potential collateral.

III. Patentability of Machine Learning Innovations in the Era of Big Data

Patentable subject matter continues to be a barrier to patenting innovations in software. Additional doctrines such as enablement, written description, and obviousness are also serious obstacles to obtaining patents, yet those requirements are specific to each claimed invention and the draftsmanship of its claims. Subject matter, by contrast, is a broader, categorical exclusion from patent rights. This section explores the current landscape of the patentable subject matter doctrine in the software context.

A. Alice: The Legal Framework of Patentable Subject Matter in Software

The complexity involved with software, coupled with the relatively broad scope of software patents, has presented challenges in identifying the boundaries of claims.[94] Many members of the software community detest imposing restrictions on open source material and assert that many key innovations in algorithms are rather abstract.[95] Such hostility toward patenting software has raised the question of whether patent rights are the proper method of protecting innovations in software. Alice was a case that embodied this opposition to the grant of software patents. The case involved patents on computerized methods for financial trading systems that reduce “settlement risk” when only one party to a financial exchange agreement satisfies its obligation.[96] The method proposed the use of a computer system as a third-party intermediary to facilitate the financial obligations between the parties.[97] The United States Supreme Court ruled that the two-step test established in Mayo governs all patentable subject matter questions.[98] In particular, for the abstract idea context, the Supreme Court established the following two-step framework for the patentable subject matter of software inventions:
1. Step one: “[D]etermine whether the claims at issue are directed to a patent-ineligible concept. If so, the Court then asks whether the claim’s [additional] elements, considered both individually and ‘as an ordered combination,’ ‘transform the nature of the claim’ into a patent-eligible application.”[99]
 
2. Step two: “[E]xamine the elements of the claim to determine whether it contains an ‘inventive concept’ sufficient to ‘transform’ the claimed abstract idea into a patent-eligible application. A claim that recites an abstract idea must include ‘additional features’ to ensure that the [claim] is more than a drafting effort designed to monopolize the [abstract idea]” which requires “more than simply stat[ing] the [abstract idea] while adding the words ‘apply it.’”[100]
The Alice Court found that the patent on the financial transaction method was “directed to a patent-ineligible concept: the abstract idea of intermediated settlement,” and therefore failed step one.[101] Furthermore, the Court ruled that the claims did “no more than simply instruct the practitioner to implement the abstract idea of intermediated settlement on a generic computer” and did not provide an inventive concept sufficient to pass step two.[102]

B. The post-Alice cases from the Federal Circuit

The Alice framework was considered a huge setback for the application of the patentable subject matter doctrine to software. It was a broad, categorical exclusion of certain inventions deemed “directed to” an abstract idea, natural phenomenon, or law of nature. The biggest misfortune was the lack of guidance in the Alice decision on the threshold for such categorical exclusion: we were left without any suggestion of the types of software patents that would be deemed patentable subject matter. A recent line of cases in the Federal Circuit provides the software industry with much-needed clarification of the standards that govern the patentability of software inventions.

Enfish v. Microsoft, decided in May 2016, involved a “model of data for a computer database explaining how the various elements of information are related to one another.”[103] In June 2016, the Federal Circuit decided another case on the abstract idea category of patentable subject matter. Bascom Global v. AT&T Mobility involved a patent disclosing an internet content filtering system located on a remote internet service provider (ISP) server.[104] Shortly after Bascom, the Federal Circuit decided McRO v. Bandai Namco Games in September 2016.[105] That case held that an automated 3D animation algorithm that renders graphics between two target facial expressions is patentable subject matter.[106]

The rulings from the Federal Circuit in these three cases provide guidelines along the two-step Alice test of patentable subject matter. The software patents in Enfish and McRO were deemed “directed to” patent-eligible subject matter, informing the public of what may pass the first step of the Alice test. Bascom failed the first step.[107] Yet the court ruled that the patent at issue had an inventive concept sufficient to transform a patent-ineligible subject matter into a patent-eligible application. Taken together, the three cases give more certainty about what may pass the 35 U.S.C. § 101 patentable subject matter inquiry. Reiterating the Alice test, whether an invention is patentable subject matter is determined by a two-step process: (1) is the invention directed to, rather than an application of, an abstract idea, natural phenomenon, or law of nature, and even if so, (2) do the elements of the claim, both individually and combined, contain an inventive concept that transforms the invention into a patent-eligible application? The Federal Circuit fills in the gaps left unexplained by the Alice ruling.
1. The Federal Circuit’s Standard for Alice Step One
The Enfish court discussed what constitutes an abstract idea at the first step of the Alice inquiry. Judge Hughes instructed courts to ask whether the claims are directed to a specific improvement rather than to an abstract idea. In Enfish, the patent offered the public a solution to an existing problem through a specific, non-generic improvement to computer functionality, and the court held that such an invention is patent-eligible subject matter.[108] McRO likewise held that the facial graphic rendering for 3D animation was not an abstract concept. There, the Federal Circuit again emphasized that a patent may pass step one of the Alice test if its claims “focus on a specific means or method that improves the relevant technology.”[109] The McRO court also noted that preemption concerns may be an important factor in the 35 U.S.C. §101 subject matter inquiry—improper monopolization of “the basic tools of scientific and technological work” is the reason such categorical carve-outs against granting patents on abstract ideas exist.[110] Bascom, in turn, illustrates what fails step one of the Alice inquiry: if the patent covers a conventional, well-known method in the field of interest, the invention will be considered abstract. This resembles the inventive concept analysis conducted at the second phase of the §101 subject matter inquiry. The main takeaway from Enfish and McRO is that, at the first step of the Alice test, a patent application is not directed to an abstract idea if (1) the invention addresses an existing problem through specific improvements rather than through conventional, well-known methods and (2) the claims do not raise preemption concerns. This encourages practitioners to define the problem as broadly as possible while defining the scope of the improvement in definite terms.
2. The Federal Circuit’s Standard for Alice Step Two, and the Overlap with Step One
The second step of the Alice test asks whether a claim that is directed to a patent-ineligible concept nonetheless contains a patent-worthy inventive concept. Bascom upheld the patent at the second step of the Alice test.[111] While the patent at hand was considered directed to patent-ineligible subject matter, the Bascom court found that the content filtering invention still contained an inventive concept worthy of a patent.[112] Even if the elements of a claim are separately known in the prior art, an inventive concept can be found in the non-conventional and non-generic arrangement of those known, conventional pieces. This inquiry appears to be a lenient standard compared to the 35 U.S.C. §103 obviousness inquiry; hence, it is not clear whether this step has independent utility for invalidating or rejecting a patent. Nonetheless, the court found that merely showing that all elements of a claim were already disclosed in the prior art is not a sufficient reason to render an invention patent ineligible.[113] While Bascom suggests what will not defeat an inventive concept, it remains unclear what an invention must show to pass the second step of the Alice test. Cases such as DDR Holdings v. Hotels.com suggest that the second step of Alice is satisfied where the claims supply a solution to a specific technological problem that “is necessarily rooted in computer technology in order to overcome a problem specifically arising in the realm of computer networks.”[114] This interpretation of the inventive concept becomes perplexing when the two steps of Alice are compared—both look to whether the proposed solution addresses problems specific to a given field of interest. While additional cases are needed to determine whether the two steps have truly distinct functions, the Federal Circuit has at the very least provided essential guidance on what may be deemed patentable software.

C. Applying Patentable Subject Matter to Machine Learning Inventions

As the Bascom court taught, the first step in the Alice inquiry is to ask whether an invention (1) provides a solution to an existing problem through (2) a specific, non-generic improvement that (3) does not preempt other methods of solving that problem. Applying this test to inventions in machine learning, mathematical improvements and computational improvements would be treated differently. As mentioned before, a key challenge of machine learning is the “noise” associated with the data sets.[115] Another is the fitting of a given algorithm to a particular model. Methods that facilitate the computations of the training process may be deemed a specific improvement. By contrast, machine learning algorithms themselves, including the base models into which the algorithm fits the training data, are not limited to a specific improvement; generic mathematical methods applicable to a variety of problems are directed to an abstract idea. For example, an invention that addresses the problem of normalizing data from different sources is computational in nature and hence would pass the Alice test, provided it does not preempt other solutions to the data-normalization problem. On the other hand, a specific mathematical equation that serves as a starting model for the machine learning algorithm is mathematical and hence directed to an abstract idea. Even if the starting model performs well only for a particular application, it is not a specific improvement pertinent to that application; it remains a generic mathematical solution that can, in principle, be applied to other problems as well.
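To make the distinction concrete, the short sketch below (in Python) illustrates the kind of source-normalization step that the preceding paragraph treats as a computational method. It is purely illustrative: the data sources, the values, and the choice of z-score rescaling are hypothetical and are not drawn from any patent or system discussed in this Note.

```python
# Illustrative sketch only: a generic z-score normalization step of the kind
# described above as a "computational" improvement. Source names and values
# are hypothetical.
from statistics import mean, stdev

def normalize(values):
    """Rescale a list of numbers to zero mean and unit variance (z-scores)."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]

# Two hypothetical data sources reporting the same quantity on different scales.
source_a = [101.0, 98.5, 103.2, 99.7]   # e.g., raw sensor counts
source_b = [0.52, 0.47, 0.55, 0.49]     # e.g., fractions of full scale

# After normalization, the two sets share a common scale and can be pooled
# into a single training set for a machine learning model.
combined = normalize(source_a) + normalize(source_b)
print(combined)
```

On the view proposed here, a claim covering one concrete way of carrying out such a step, rather than the mathematical idea of rescaling itself, is the sort of computational method that can survive the Alice inquiry.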

Conclusion

While highly restrictive, the Federal Circuit’s guidelines still allow the grant of patent rights for the computational aspects of machine learning algorithms. They would also prevent patents on highly preemptive mathematical innovations, including data-generating patents such as machine learning models. This narrow range of patentability makes a patent regime appealing for computational methods. The recent emphasis on preemption concerns works in favor of preventing data network effects built on data-generating patents. While not discussed in this Note, other patentability requirements such as obviousness and definiteness would further constrain the grant of overly broad data-generating patents. Such an approach strikes the appropriate balance between promoting innovation and encouraging data reuse for societal benefit. Compared to other approaches to protecting innovations in machine learning, a narrowly tailored grant of patent rights for computational inventions best fits the policy goal of promoting innovation through data reuse. Industry trends in collaboration and recruiting also match the proposed focus on patent law protection.
* J.D. Candidate, New York University School of Law, 2018; Ph.D. in Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, 2012. The author would like to thank Professor Katherine Jo Strandburg for her guidance; fellow JIPEL Notes Program participants Julian Pymento, Gia Wakil, Neil Yap, and Vincent Honrubia for their comments and feedback; and Dr. Sung Jin Park for her support throughout the process.
[1] Sang-Hun Choe & John Markoff, Master of Go Board Game Is Walloped by Google Computer Program, N.Y. Times (March 9, 2016), https://www.nytimes.com/2016/03/10/world/asia/google-alphago-lee-se-dol.html (reporting the shocking defeat of Go Master Lee Se-dol to Google DeepMind’s AlphaGo).
[2] Laurence Zuckerman, Chess Triumph Gives IBM a Shot in the Arm, N.Y. Times (May 12, 1997), http://politics.nytimes.com/library/cyber/week/051297ibm.html (detailing IBM’s highly publicized win through Deep Blue’s victory over world chess champion Garry Kasparov).
[3] See Choe & Markoff, supra note 1.
[4] David Silver et al., Mastering the game of Go with deep neural networks and tree search, 529 Nature 484, 484 (2016).
[5] Id.
[6] Id.
[7] Id. at 485.
[8] Id.
[9] See Andre Esteva et al., Dermatologist-level classification of skin cancer with deep neural networks, 542 Nature 115 (2017).
[10] See Sue Ellen Haupt & Branko Kosovic, Big Data and Machine Learning for Applied Weather Forecasts (2015).
[11] See Wei-Yang Lin et al., Machine Learning in Financial Crisis Prediction: A Survey, 42 IEEE Transactions on Systems, Man, and Cybernetics 421 (2012).
[12] See Farzindar Atefeh & Wael Khreich, A Survey of Techniques for Event Detection in Twitter, 31 Computational Intelligence 132 (February 2015).
[13] See Corey Blumenthal, ECE Illinois Students Accurately Predicted Trump’s Victory, ECE Illinois (Nov. 18, 2016), https://www.ece.illinois.edu/newsroom/article/19754.
[14] For the purpose of this Note, secrecy refers to the use of trade secret and contract based non-disclosure agreements.
[15] Mark A. Lemley, The Surprising Virtues of Treating Trade Secrets As IP Rights, 61 Stan. L. Rev. 311, 332 (2008) (“Patent and copyright law do not exist solely to encourage invention, however. A second purpose — some argue the main one — is to ensure that the public receives the benefit of those inventions.”).
[16] Andrew Ng et al., How Artificial Intelligence Will Change Everything, Wall Street Journal (March 7, 2017), https://www.wsj.com/articles/how-artificial-intelligence-will-change-everything-1488856320.
[17] Limor Peer, Mind the Gap in Data Reuse: Sharing Data Is Necessary But Not Sufficient for Future Reuse, London Sch. Econ. & Poli. Sci. (Mar. 28, 2014) http://blogs.lse.ac.uk/impactofsocialsciences/2014/03/28/mind-the-gap-in-data-reuse (“The idea that the data will be used by unspecified people, in unspecified ways, at unspecified times . . . is thought to have broad benefits”).
[18] See Saeed Ahmadiani & Shekoufeh Nikfar, Challenges of Access to Medicine and The Responsibility of Pharmaceutical Companies: A Legal Perspective, 24 DARU Journal of Pharmaceutical Sciences 13 (2016) (discussing how “pharmaceutical companies find no incentive to invest on research and development of new medicine specified for a limited population . . .”).
[19] 17 U.S.C. §101 (2012).
[20] Id.
[21] 17 U.S.C. §102(a) (Copyright exists “in original works of authorship fixed in any tangible medium of expression . . .”).
[22] Apple Comput., Inc. v. Franklin Comput. Corp., 714 F.2d 1240 (3d Cir. 1983).
[23] Comput. Assocs. Int’l v. Altai, 982 F.2d 693 (2d Cir. 1992).
[24] Id.
[25] See id. at 707-09.
[26] 837 F.3d 1299, 1314 (Fed. Cir. 2016).
[27] Altai, 982 F.2d at 698.
[28] See Dionne v. Se. Foam Converting & Packaging, Inc., 240 Va. 297 (1990).
[29] Kewanee Oil v. Bicron Corp., 416 U.S. 470 (1974).
[30] Id. at 490.
[31] Nikolay Golova & Lars Rönnbäck, Big Data Normalization For Massively Parallel Processing Databases, 54 Computer Standards & Interfaces 86, 87 (2017).
[32] Jon Russell, Alibaba Teams Up with Samsung, Louis Vuitton and Other Brands to Fight Counterfeit Goods, TechCrunch (Jan. 16, 2017) https://techcrunch.com/2017/01/16/alibaba-big-data-anti-counterfeiting-alliance.
[33] Lemley, supra note 15, at 33
[34] See Lior Rokach, Introduction to Machine Learning, Slideshare 3 (July 30, 2012), https://www.slideshare.net/liorrokach/introduction-to-machine-learning-13809045.
[35] Id. at 4.
[36] Id. at 10.
[37] See Lars Marius Garshol, Introduction to Machine Learning, Slideshare 26 (May 15, 2012) https://www.slideshare.net/larsga/introduction-to-big-datamachine-learning.
[38] Id.
[39] Id.
[40] See Lei Yu et al., Dimensionality Reduction for Data Mining – Techniques, Applications and Trends, Binghamton University Computer Science 11, http://www.cs.binghamton.edu/~lyu/SDM07/DR-SDM07.pdf (last visited Feb. 23, 2018).
[41] Id.
[42] See Rokach, supra note 34, at 10.
[43] Yu et al., supra note 40.
[44] Laurens van der Maaten et al., Dimensionality Reduction: A Comparative Review, Tilburg Centre for Creative Computing, TiCC TR 2009-005, Oct. 26, 2009, at 1 (“In order to handle such real-world data adequately, its dimensionality needs to be reduced. Dimensionality reduction is the transformation of high-dimensional data into a meaningful representation of reduced dimensionality. Ideally, the reduced representation should have a dimensionality that corresponds to the intrinsic dimensionality of the data. The intrinsic dimensionality of data is the minimum number of parameters needed to account for the observed properties of the data”).
[45] Andrew Ng, Nuts and Bolts of Applying Deep Learning (Andrew Ng), YouTube (Sept. 27, 2016), https://www.youtube.com/watch?v=F1ka6a13S9I.
[46] Andrew Ng, Model Selection and Train/Validation/Test Sets, Machine Learning, https://www.coursera.org/learn/machine-learning/lecture/QGKbr/model-selection-and-train-validation-test-sets (last visited Feb. 23, 2018).
[47] Id.
[48] See Rokach, supra note 34, at 10.
[49] Alex Rampell & Vijay Pande, a16z Podcast: Data Network Effects, Andreesen Horowitz (Mar. 8, 2016), http://a16z.com/2016/03/08/data-network-effects/.
[50] James Manyika et al., Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Inst., May 2011, at 11, available at https://www.mckinsey.com/~/media/McKinsey/Business%20Functions/McKinsey%20Digital/Our%20Insights/Big%20data%20The%20next%20frontier%20for%20innovation/MGI_big_data_exec_summary.ashx.
[51] Id.
[52] Leonid Bershidsky, Why Are Google Employees So Disloyal?, Bloomberg (July 13, 2013, 11:41 AM), https://www.bloomberg.com/view/articles/2013-07-29/why-are-google-employees-so-disloyal-.
[53] Id.
[54] Rob Valletta, On the Move: California Employment Law and High-Tech Development, Federal Reserve Bank of S.F. (Aug. 16, 2002), http://www.frbsf.org/economic-research/publications/economic-letter/2002/august/on-the-move-california-employment-law-and-high-tech-development/#subhead1.
[55] Id.
[56] See Quentin Hardy, IBM, G.E. and Others Create Big Data Alliance, N.Y. Times (Feb. 15, 2015), https://bits.blogs.nytimes.com/2015/02/17/ibm-g-e-and-others-create-big-data-alliance.
[57] See, e.g., Finicity and Wells Fargo Ink Data Exchange Deal, Wells Fargo (Apr. 4, 2017), https://newsroom.wf.com/press-release/innovation-and-technology/finicity-and-wells-fargo-ink-data-exchange-deal.
[58] Alice Corp. Pty. Ltd. v. CLS Bank Int’l, 134 S. Ct. 2347 (2014).
[59] Karen E.C. Levy, Relational Big Data, 66 Stan. L. Rev. Online 73, 73 n.3 (2013), https://review.law.stanford.edu/wp-content/uploads/sites/3/2013/09/66_StanLRevOnline_73_Levy.pdf (explaining that the big data phenomenon is due to the need of practices to analyze data resources).
[60] Christine L. Borgman, The Conundrum of Sharing Research Data, 63 J. Am. Soc’y for Info. Sci. & Tech. 1059, 1059-60 (2012) (discussing the lack of data sharing across various industries).
[61] See Michael Mattioli, Disclosing Big Data, 99 Minn. L. Rev. 535 (2014).
[62] See id. at 545-46 (discussing the technical challenges in merging data from different sources, and issue of subjective judgments that may be infused in the data sets).
[63] See id. at 552 (discussing how institutions with industrial secrets may rely on secrecy to protect the big data they have accumulated).
[64] See id. at 570 (“[T]he fact that these practices are not self-disclosing (i.e., they cannot be easily reverse-engineered) lends them well to trade secret status, or to mere nondisclosure”).
[65] Id.
[66] Id. at 552.
[67] Patrick Clark, The World’s Top Economists Want to Work for Amazon and Facebook, Bloomberg (June 13, 2016, 10:47 AM), https://www.bloombergquint.com/technology/2016/06/09/the-world-s-top-economists-want-to-work-for-amazon-and-facebook (“If you want to be aware of what interesting questions are out there, you almost have to go and work for one of these companies”).
[68] Bill Franks, Taming the Big Data Tidal Wave 20 (2012) (discussing that the biggest challenge in big data may not be developing tools for data analysis, but rather the processes involved with preparing the data for the analysis).
[69] See Borgman, supra note 60, at 1070 (“Indeed, the greatest advantages of data sharing may be in the combination of data from multiple sources, compared or “mashed up’ in innovative ways.” (citing Declan Butler, Mashups Mix Data Into Global Service, 439 Nature 6 (2006))).
[70] Jack Clark, Apple’s Deep Learning Curve, Bloomberg Businessweek (Oct. 29, 2015), https://www.bloomberg.com/news/articles/2015-10-29/apple-s-secrecy-hurts-its-ai-software-development.
[71] Kewanee Oil v. Bicron Corp., 416 U.S. 470, 490 (1974).
[72] See infra Section III-B.
[73] Brenda Simon & Ted Sichelman, Data-Generating Patents, 111 Nw. U.L. Rev. 377 (2017).
[74] Id. at 379.
[75] Id. at 414.
[76] Id. at 415 (“[B]roader rights have substantial downsides, including hindering potential downstream invention and consumer deadweight losses . . .”).
[77] Id. at 417.
[78] Rampell & Pande, supra note 49.
[79] Lina Khan, Amazon’s Antitrust Paradox, 126 Yale L.J. 710, 785 (2017) (“Amazon’s user reviews, for example, serve as a form of network effect: the more users that have purchased and reviewed items on the platform, the more useful information other users can glean from the site”).
[80] Simon & Sichelman, supra note 73, at 410.
[81] See infra Section III-A.
[82] Id.
[83] See infra Section III-B.
[84] Jack Clark, Apple’s Deep Learning Curve, Bloomberg Businessweek (Oct. 29, 2015), https://www.bloomberg.com/news/articles/2015-10-29/apple-s-secrecy-hurts-its-ai-software-development.
[85] Peer, supra note 17 (“The idea that the data will be used by unspecified people, in unspecified ways, at unspecified times . . . is thought to have broad benefits”).
[86] See Harshavardhan Achrekar et al., Predicting Flu Trends using Twitter data, IEEE Conference on Comput. Commc’ns. Workshops 713 (2011), http://cse.unl.edu/~byrav/INFOCOM2011/workshops/papers/p713-achrekar.pdf.
[87] Jordan Crook, Uber Taps Foursquare’s Places Data So You Never Have to Type an Address Again, TechCrunch (May 25, 2016), https://techcrunch.com/2016/05/25/uber-taps-foursquares-places-data-so-you-never-have-to-type-an-address-again/.
[88] See Rampell & Pande, supra note 49.
[89] See Mattioli, supra note 61, at 554 (“A final limitation on patentability possibly relevant to big data is patent law’s requirement of definiteness”).
[90] See Nautilus, Inc. v. Biosig Instruments, Inc., 134 S. Ct. 2120 (2014).
[91] See Alice Corp. Pty. Ltd. v. CLS Bank Int’l, 134 S. Ct. 2347 (2014).
[92] Robert R. Sachs, Two Years After Alice: A Survey of the Impact of a “Minor Case” (Part 1), Bilski Blog (June 16, 2016), http://www.bilskiblog.com/blog/2016/06/two-years-after-alice-a-survey-of-the-impact-of-a-minor-case.html.
[93] Id.
[94] Stephanie E. Toyos, Alice in Wonderland: Are Patent Trolls Mortally Wounded by Section 101 Uncertainty, 17 Loy. J. Pub. Int. L. 97, 100 (2015).
[95] Id.
[96] Alice Corp. Pty. Ltd. v. CLS Bank Int’l, 134 S. Ct. 2347, 2349 (2014).
[97] Id.
[98] Id.
[99] Id. at 2350 (emphasis added) (citation omitted).
[100] Id. at 2357 (emphasis added) (alteration in original) (citation omitted).
[101] Id. at 2350.
[102] Id. at 2351.
[103] Enfish, LLC v. Microsoft Corp., 822 F.3d 1327, 1330 (Fed. Cir. 2016).
[104] Bascom Glob. Internet Servs. v. AT&T Mobility LLC, 827 F.3d 1341, 1349 (Fed. Cir. 2016).
[105] McRO, Inc. v. Bandai Namco Games Am. Inc., 837 F.3d 1299, 1308 (Fed. Cir. 2016).
[106] Id.
[107] Bascom, 827 F.3d at 1349.
[108] Enfish, 822 F.3d at 1330.
[109] McRO, Inc., 837 F.3d at 1314.
[110] Id.
[111] Bascom, 827 F.3d at 1349.
[112] Id.
[113] Id.
[114] See Toyos, supra note 94, at 121; DDR Holdings, LLC v. Hotels.com, 773 F.3d 1245, 1257 (Fed. Cir. 2014).
[115] See supra Section II-A.

Fitting Marrakesh into a Consequentialist Copyright Framework

Download a pdf version of this article here. The Marrakesh Treaty to Facilitate Access to Published Works by Visually Impaired Persons and Persons with Print Disabilities entered into force on September 30, 2016. The treaty aims to alleviate what has been described as the “book famine,” and has been lauded as a significant achievement in advancing the rights of and promoting equal opportunity for the visually disabled. Contracting states are required to implement copyright limitations and exceptions to facilitate access to copyrighted material for the global print-disabled community. This note will argue that, notwithstanding the treaty’s strong rights-based underpinnings, the treaty aligns comfortably with U.S. consequentialist copyright justifications. This note will also demonstrate the limitations of other copyright justificatory theories while discussing their incompatibility with the treaty’s philosophy.

Through the Looking Glass: Photography and the Idea/Expression Dichotomy

Download a pdf version of this article here. Copyright law has always expressed an idea/expression dichotomy, under which copyright protection extends not to the idea of a work but only to the work’s expression of that idea. Alas, this distinction walks a fine line with regard to non-textual and visual works. In particular, courts are prone to inconsistent outcomes and violations of the fundamental precepts of copyright law because they often fail to grasp the aesthetic theories of originality, realism, and ideas idiosyncratic to visual works. This dilemma may, however, be solved within the existing framework of copyright law. This note argues that the solution should start by focusing less on a visual work’s subject matter and more on elements of the work, such as the originality and realism of the expression, that clarify the author’s creativity. Moreover, the concept of an “idea” should be defined broadly as the residual locus of uncopyrightable elements in a work, rather than as a cohesive concept that attempts to definitively pin down the “idea” behind the individual work. This two-pronged solution would both recognize visual and photographic works’ unique niche within copyright and align these forms of art with copyright law’s ultimate objective of protecting authorship.

Un-Blurring Substantial Similarity: Aesthetic Judgments and Romantic Authorship in Music Copyright Law

Download a PDF version of this article here.

By Nicole Lieberman*

Introduction

By refusing to acknowledge[1] the aesthetic judgments inherent in determining copyright disputes,[2] American courts have plagued our copyright law with subjective bias[3] and doctrinal confusion. To avoid the appearance of impropriety, since at least 1903[4] courts have side-stepped clearly defining foundational concepts such as “originality,”[5] “authorship,” and “infringement.” As such, they have failed to provide a meaningful methodology for determining when a work infringes the copyright of another.[6] By instead relying on the impossibly vague “substantial similarity” test,[7] courts have crafted an impressionistic doctrine that has drifted far from copyright’s original economic purpose of incentivizing creation.

While copyright infringement requires proof of copying, mere copying is not the end of the inquiry, as “[t]rivial copying is a significant part of modern life.”[8] Thus, proof of copying, or “copying-in-fact,” is only a threshold issue for proving infringement.

Copying-in-fact can be shown through direct evidence, such as testimony, but with witnesses and honest thieves often lacking, copying is most often shown by circumstantial evidence. Indirect proof of copying is provided by evidence creating an inference that the defendant copied – typically a combination of evidence of access to the plaintiff’s work and similarities probative of copying. While courts allow expert analysis and dissection to aid them in inferring copying, the largely unguided impression of lay observers determines the more exacting question of misappropriation.[9]

Yet determining misappropriation requires an understanding of the “axiom of copyright law that the protection granted to a copyrightable work extends only to the particular expression of an idea and never to the idea itself.”[10] Application of the “idea/expression” distinction requires delicate line-drawing to decide the appropriate “‘level of abstraction’ at which one defines the ‘idea’ that merges with the subject’s expression.”[11] But fact finders are unlikely to understand on their own which “ideas” are excluded or what elements fall into the category of “ideas.” While jury instructions can theoretically work to inform jurors to exclude such elements, “in practice jurors aren’t going to know what things are, for example, scène à faire[12] in the music industry, without some testimony on standard chord progressions.”[13] Thus, jurors are not likely to understand such an ephemeral distinction between ideas and expression, especially when applied to areas in which they lack expertise, as is often the case with copyright.[14] Because the issue of misappropriation is so dependent on the interpretation of these underlying principles of copyright law, classifying the issue as purely a question of fact for the jury requires reconsideration.[15]

Courts recognize the need for expert analysis and dissection in determining infringement in cases involving computer software. Distinguishing computers as “complex” and having elements dictated by limited options, courts apply a special test to ensure only protected elements are considered for infringement purposes. Yet they proscribe such guidance when the “aesthetic arts” are at issue, failing to recognize traditions unique to genres, that all art is capable of being broken down into constituent elements, and that such elements are dictated by genre and functional constraints. Courts have assumed that art is intuitive, simply reflecting emotions, and capable of being understood by anyone. Their narrow understanding of art comes from our law’s founding.

Copyright arose in an era where the courts viewed creativity as coming from a place of pure autonomous genius,[16] but this romantic view of aesthetics is a relic of the past: a counteraction to the age of enlightenment and rationalization. The reality is that creative borrowing is almost unavoidable and results in widespread use of unprotected elements from preexisting works. Without expert guidance and the ability to dissect protectable and unprotectable elements, judges and jurors are “more likely to find infringement in dubious circumstances, because they aren’t properly educated on the difference between protectable and unprotectable elements.”[17]

Due to the prevalence of music copyright infringement suits, and the fact that music is more perceptibly derivative than other media,[18] music seems disproportionately plagued by the courts’ bias for traditional aesthetics. But music, like all arts, is inherently complex and technical,[19] and few “ordinary observers” know the elements and factors that go into its creation,[20] especially with works of less familiar genres. Thus, fact finders are easily misled into finding substantial similarity based on unprotectable elements.[21]

While music may be uniquely crippled by our current copyright regime, the problems plaguing music copyright stem directly from a lack of guidance where it is arguably most needed: the technical issue of misappropriation.[22] With fact finders less likely to detect similarities attributable to common sources in unfamiliar aesthetic, the current system results in a prejudice against lesser-known aesthetics, and a bias for the traditional. [23] The result is far from encouraging aesthetic progress.

This paper will argue that to create a more encompassing[24] and objective copyright law that fosters progress in all the arts, it is vital to expand the role of analytic dissection and expert testimony to the misappropriation prong of the infringement test.

Part I of this paper provides background on the history of courts’ treatment of music copyright and lays out the two major approaches to copyright infringement. In addition, this part outlines the foundational principle that only the expression of an idea is protectable. Part II illustrates how the tests have veered away from the original purposes and values underlying the inquiry. It argues that by relying on the ordinary observer test for misappropriation, the tests fail to accurately account for the idea-expression distinction. In outlining the problems facing music under our current copyright regime, this section shows how the shortcomings of the audience test are particularly acute for music, a medium in which the line between idea and expression is often not “spontaneous and immediate” to the ordinary observer.[25] The recent “Blurred Lines”[26] lawsuit serves to illustrate how the lack of objectivity in our current law results in inconsistent application, thereby diminishing incentives to create new works. More broadly, this section considers that while the problems for music are often more noticeable than for other media, they merely expose the larger inaccuracies of the audience test. Finally, Part III considers proposals for creating a more guided and objective infringement analysis. Ultimately, this paper concludes that the best solution is adopting the test used for computer software, the abstraction-filtration-comparison (AFC) method,[27] as a uniform test for infringement.

Requiring careful dissection of unprotected elements by the court would ensure educated decisions, and reserving for the trier the intuitive question of whether the defendant copied those elements would preserve the economic rationale of the lay listener test.[28] Effectively reversing the analysis of proof “will likely result in greater attention to the limiting doctrines of copyright law”[29] and the evolution of a reasoned rule of law.[30] By basing aesthetic nondiscrimination on objective and reasoned criteria, as opposed to the “anti-intellectual and book burning” philosophy[31] of visceral impressions, courts can identify actual illicit copying while remaining receptive to unconventional aesthetics.[32]
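For illustration only, the minimal Python sketch below mirrors the filtration and comparison stages of the AFC sequence advocated here, with abstraction assumed to have already broken each work into labeled elements. The element labels, categories, and data structures are hypothetical and do not reproduce any court’s actual methodology.

```python
# Purely illustrative sketch of the filtration and comparison stages of the
# abstraction-filtration-comparison (AFC) approach discussed above. The labels,
# categories, and example works are hypothetical.

UNPROTECTABLE = {"idea", "scene_a_faire", "public_domain"}  # categories filtered out

def filtration(elements):
    """Step 2: remove elements that copyright does not protect."""
    return [e for e in elements if e["category"] not in UNPROTECTABLE]

def comparison(plaintiff_elements, defendant_elements):
    """Step 3: compare only the protectable residue of the plaintiff's work."""
    protected = {e["label"] for e in filtration(plaintiff_elements)}
    present_in_defendant = {e["label"] for e in defendant_elements}
    return protected & present_in_defendant

# Step 1 (abstraction) is assumed: each work has been dissected into labeled
# elements at a chosen level of abstraction.
plaintiff_work = [
    {"label": "standard twelve-bar blues progression", "category": "scene_a_faire"},
    {"label": "distinctive eight-note melodic hook", "category": "expression"},
]
defendant_work = [
    {"label": "standard twelve-bar blues progression", "category": "scene_a_faire"},
    {"label": "distinctive eight-note melodic hook", "category": "expression"},
]

# Only the protected hook counts toward similarity; the shared stock
# progression is filtered out before comparison.
print(comparison(plaintiff_work, defendant_work))
```

The point of the sketch is structural: unprotectable material is removed before, not after, similarity is assessed, which is precisely the ordering this paper argues should govern the misappropriation inquiry.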

I. Historical Background of Music in Copyright

While copyright law struggles to deal with the fine arts as a whole,[33] particular problems arise in the context of musical works. These issues are rooted in the history of copyright law. Many of the problems facing music copyright lie in the fact that creators are seeking protection under a scheme created for the distinct purpose of protecting works of literature.[34] However, these problems are not unique to music. American copyright law is based on a concept of authorship ill-suited to progress in general. This section will outline the evolution of our copyright infringement doctrine. In considering the historical application of the doctrine to musical works, this section analyzes the aesthetic norms embedded within judges’ and jurors’ findings of infringement.

A. Music’s Initial Encounters in Early Legislation and Case Law

Article 1, Section 8 of the Constitution authorizes federal legislation “[t]o promote the Progress of Science and useful Arts,”[35] but gives little guidance in defining the scope of the copyright system. The original Copyright Act of 1790 extended protection only to maps, charts, and books.[36] Though musical compositions were routinely registered under the 1790 Act as “books,”[37] it was not until the Copyright Act of 1831 that Congress expressly extended protection to musical compositions. Congress’s early failure to provide well-crafted protection for musical compositions is hardly surprising given the 1790 Act’s roots in Great Britain’s Statute of Anne, which covered only the distinct category of “books.”[38]

With no other protection available against infringers, composers naturally came to seek protection of their works through copyright.[39] Yet utilizing a scheme created for “written” works meant obtaining copyright protection solely with respect to the underlying composition, “the notated, written score, including the music and any lyrics.”[40] While seemingly analogous, music as a performing art is “often related in some way to performance” and must be understood by reference to its context, that is, elements outside the composition.[41] Though federal law since 1976 has applied copyright protection to musical recordings, including some performance elements such as percussion, recordings are treated as distinct expressions with separate copyright protection.[42] Consequently, musical compositions are protected only within the restrictive framework of the “musical work,” which is defined as a combination of melody and harmony.[43]

More problematically, courts analyzing music copyright cases tend to place undue weight on melody, rather than harmony and rhythm,[44] failing to consider the complexity of music and a realm of possible distinguishing features. Focusing on elements of music that “lend themselves to notation”[45] may seem adequate in analyzing works from European musical traditions, which typically have predominant harmonic and melodic structures,[46] but doing so fails to consider music in its totality. Because music is inherently relational,[47] our perception of musical works, and their meaning, is dependent on the context in which notes and pitches in the melody are played.[48] Elements such as timbre and spatial organization are also relevant to the way we hear music and to the similarities we perceive. Consequently, “originality is better viewed as a function of the interaction and conjunction of these elements than of any element alone.”[49]

Neglecting to consider the totality of elements in musical works, while ill-suited even to classical traditions, most drastically affects works outside of Western traditions. The main aesthetic features of non-Western music often fall outside the confines of the court’s emphasis on melody and notation. Most notably, hip-hop, which finds its roots in certain African musical traditions, features a dominant “oral tradition,” evidenced in the practice of rapping, and complex rhythmic structures with less emphasis on melodic and harmonic structures. Moreover, such traditions predominately feature the element of musical borrowing through the practice of sampling, looping, and interpolation. These features are also found in other African American musical genres, including blues, jazz, rhythm and blues, gospel, Soul, rock, reggae, funk, disco, and rap, and are found mixed with all types of music today.[50] Electronic music producers are now producing hip-hop tracks,[51] and even pop-country artists are making rhythm-centric tracks that reference hip-hop culture. Not surprisingly, entering the arena puts artists at risk of facing a copyright suit. Taylor Swift recently faced a $42 million infringement claim for using the lyric “haters gone hate,” a staple in hip-hop culture and music,[52] in her dance-pop track “Shake it Off.”[53] Despite the prevalence of non-notational elements, copyright’s bias for written work places works that do not fit the mold “at the bottom of the hierarchies of taste,”[54] making findings of original elements in allegedly infringing works more difficult to obtain.

Borrowing similarly conflicts with Western ideals of creativity and originality, with the result that music has historically been undervalued. Records from the time of the Statute of Anne’s enactment are telling of the hostile attitudes facing music. While literature was held in high esteem for its educative role, music was seen as an unnecessary luxury that served merely as entertainment.[55] In Pyle v. Falkener,[56] an early case brought under the Statute of Anne, defendant publishers argued that, in contrast to works of literature, authorship of music required “a high standard of originality to qualify for protection under any legal theory.”[57] Underlying their challenge was the commonly held notion that composers “merely borrowed” from “[o]ld [t]unes which had been [u]sed in [c]ommon by all persons for many years before…” and as such had no proprietary rights.[58]

Such disparaging views of music are less surprising when one considers the rise of the Romantic view of authorship in the nineteenth and twentieth centuries.[59] Unlike the classical conception of authorship, which “conceives of art as imitating universal truths and ideas,” and thus contemplates the evolutionary nature of art, the Romantic view conceptualizes the creation of art “as a process that reflect[s] the emotions and personality of the individual artist.”[60] With the Romantic view informing cultural assumptions, originality often came to be defined as requiring independent creation, “which essentially appears to rule out or significantly limit borrowing.”[61]

With the functional and genre constraints inherent to music,[62] tensions existed early on in applying copyright to musical works. Yet “use of existing works has historically been a core feature of the musical composition process”[63] and the artistic process in general. The courts’ neglect to appreciate the reality of borrowing has often resulted in overbroad copyrights, extending protection to more than just the particular arrangement of the literal elements of a work.[64] Additionally, by failing to recognize distinguishing features of songs that lie outside the melody and notation, courts often find infringement based on unprotected elements. Genres that explicitly sample existing works, such as hip-hop, have been hit hardest. As a result, the courts are perpetuating a bias for traditional aesthetics at the expense of progressive and unfamiliar artistic movements.[65]

B. The Idea-Expression Distinction

Early on, courts using the idea-expression dichotomy to distinguish between unprotectable and protectable aspects of works did so on the basis of tangibility. Though the courts still use these terms in filtering out unprotectable elements, changing views on the nature of the artistic process have distorted the original tangibility basis, leading to ad hoc judicial determinations. With the rise of the Romantic view, artistic works in their entirety came to be regarded as reflecting the artist’s contributions.[66] As a result, perceptions regarding the moral (and thus, intellectual property) rights of an artist expanded to include more than just the particular arrangement of the literal elements of a work.

Originally, American copyright law viewed “ideas” as “intangible, unexpressed concept[s] that existed only in the author’s mind.”[68] Therefore, copyright protected only the tangible[69] “expression,” or the “arrangement of words which the author has selected to express the idea.”[70] The rationale served the purposes of the intellectual property clause well, since free access to ideas is critical to the development of creative works.[71] Moreover, the right granted did not include a right over certain words used, because “they are the common property of the human race.”[72] This early approach was consistent under the classical conception of the creative process, which views the artist as portraying an intangible idea or truth which “cannot and should not be captured or controlled by one artist.”[73]

However, with the rise of the Romantic view in the nineteenth and twentieth centuries,[74] Congress no longer limited “expression” to the arrangement of the literal elements of the copyrighted work, but expanded it to include underlying “original” conceptual elements as well.[75] In 1909, Congress both enlarged the category of works eligible for protection and expanded the rights provided to copyright owners, including use of the work in a different medium.[76] Protection under the Act was no longer limited to the literal form or features of the expressed idea, but extended to elements of a work that are intangible and conceptual.[77] In applying the new act, the Supreme Court in Kalem Co. v. Harper Bros.[78] found the defendant’s film to infringe upon the plaintiff’s copyright in the book Ben Hur because the film expressed the same underlying idea, or plot, albeit in an entirely different medium.[79]

It became clear that balanced against the idea-expression distinction is the countervailing consideration that copyright infringement cannot be limited to exact copying, “else a plagiarist would escape by immaterial variations.”[80] The problem is one of line-drawing: at what point is a variation distinguishable enough to “sufficiently alter a work’s substantial similarity to another so as to negate infringement,” without extending protection to the underlying idea of the plaintiff’s work?[81]

Views on the nature of art and the creative process have only continued to evolve and become more inconsistent with the idea-expression dichotomy.[82] The conceptual art movement advanced the rejection of any distinction between an artist’s idea and the ultimate expression.[83] As conceptual artist Sol LeWitt stated, “the idea or concept is the most important aspect of the work. When an artist uses a conceptual form of art, it means that all of the planning and decisions are made beforehand and the execution is a perfunctory affair. The idea becomes a machine that makes the art.”[84] In rejecting the Formalist tradition, which defined art by its form and structure, conceptual art judges art by what it contributes to the conception and definition of “art.”[85] Even an unchanged item from the grocery store, like a box of Brillo soap pads,[86] can be art if framed in a new way.

With Romantic and neo-romantic views challenging classical aesthetic theory, no universally accepted philosophical or objective basis remains for distinguishing ideas from expression in works of art. Continuing to use the terms leaves courts to make infringement decisions on the basis of their own subjective assessments of a work’s artistic value.[87] Judicial determinations of what constitutes the “idea” versus the “expression” have come to reflect personal assumptions and experiences. Courts tend to find elements of a work to be an “idea” when they are familiar with the work’s aesthetic tradition and can recognize the elements as commonplace.[88] Conversely, courts are more likely to find elements of works in less familiar traditions to be original “expression,” making them more inclined to find later uses infringing.[89] As the Ninth Circuit admitted, “‘At least in close cases, one may suspect, the classification [of idea and expression] the court selects may simply state the result reached rather than the reason for it.’”[90] Thus, with changing views on the creative process, “it is no longer necessary or valuable or even possible to dissect a work of art to uncover the universal truths or ideas which must remain freely available to all future authors.”[91]

Distinguishing between ideas and expression is perhaps most illusory in the context of music, due to the relatively limited number of compositional choices when compared with literary works.[92] Western music, at issue in most copyright suits, is primarily written in the tonal system, an organized and relational system of tones (e.g., the notes of a major or minor scale) in which one tone becomes “the central point to which the remaining tones are related.”[93] Because there are a limited number of possible pitch and harmonic relationships, options within tonal music are somewhat dictated by the system.[94] Moreover, because the tonal system is built on a hierarchy of predominate chords and pitches,[95] certain “patterns and tendencies are . . . common to virtually all musical works composed in the tonal system.”[96] The distinction between these unprotectable ideas and the original expression thereof is difficult to see, and thus the room for bias is most apparent.[97]

C. Evolution of the Copyright Infringement Tests

Courts since the nineteenth century have attempted to separate copyright infringement into two inquiries. First, “Copying-in-Fact”: did the defendant see and copy from the copyrighted work, or did he create his work independently? Second, “Misappropriation”: did the defendant appropriate too much of the protected work?[98] The first question is used as an evidentiary tool to infer copying from access or “striking similarity,”[99] while the second focuses on the liability issue.[100] The degree of similarity between the two works is relevant to both inquiries;[101] the phrase “probative similarity” is often used in reference to the first inquiry, while “illicit similarity” is used for the second.

Courts in the 1900s maintained the distinction between the copying-in-fact and misappropriation inquiries. A “substantial similarity” test was used for the copying-in-fact inquiry to determine whether the degree of similarity between the defendant’s and the plaintiff’s work was substantial to the point of being probative of actual copying.[102] The focus was solely on whether the defendant had copied “the labors of the original author.”[103] As such, before comparing the two works for similarities, the court filtered out unprotected elements from the plaintiff’s work, including those that were “well known and in common use.”[104]

In determining misappropriation—that is, whether the defendant copied enough of the plaintiff’s work to be held liable—courts looked to the economic or aesthetic value of what the defendant copied: if the portions extracted “embody the spirit and the force of the work…they take from it that in which its chief value consists.”[105] In this context, courts often used the adjective “substantial” to refer to the qualitative value of what was copied.[106]

As precedent evolved, courts began to combine the structure of these two prongs. As a result, courts have often confused the economic purpose of the misappropriation prong, finding infringement based solely on quantitative similarity without taking account of the unprotected elements in the original work. The analysis of the two major copyright tests below outlines how this confusion arose and focuses on the problems the misappropriation prong is causing for copyright.

1. Second Circuit Copying/Unlawful Appropriation Test

In Arnstein v. Porter,[107] the litigious Ira B. Arnstein sued the American songwriter and composer Cole Porter, alleging that many of Porter’s songs infringed the copyrights of songs written by Arnstein.[108] The Second Circuit articulated an influential bifurcated test,[109] which requires a plaintiff to prove (1) copying-in-fact and (2) illicit copying (unlawful appropriation) to establish infringement.[110]

i. Copying-in-Fact

The first prong of the Arnstein test is satisfied by a showing of (a) access and (b) sufficient similarity that is “probative” of copying: “The stronger the proof of similarity, the less the proof of access that is required.”[111] Thus, if similarities are so “striking” as to “preclude the possibility that plaintiff and defendant independently arrived at the same result,” evidence of access may not be necessary.[112] Of course, the converse is not true because “access without similarity cannot create an inference of copying.”[113]

To evaluate the likelihood of copying versus independent creation, expert testimony and “analytic dissection”[114] are admissible.[115] However, the two works are to be compared in their entirety, including both protectable and non-protectable material.[116]

ii. Unlawful Appropriation

Only if the threshold issue of copying-in-fact is shown does the court move to the question of misappropriation.[117] Having established copying-in-fact, the issue of unlawful appropriation is a question of fact. Therefore, the fact finder must determine whether the taking went so far as to constitute infringement under the “substantial similarity” test.[118] That is, would the ordinary observer, unless he set out to detect the disparities, be disposed to overlook them and regard the aesthetic appeal of the two works as the same?[119] The second part of the Arnstein test is “related to the nineteenth century concern with the value of what the defendant had copied” as it asks whether the similarity “relates to material of substance and value in plaintiff’s work.”[120] However, the Arnstein test departs in some ways from earlier definitions of infringement by looking to the reaction of the “ordinary observer.”[121]

In determining misappropriation, the Arnstein test looks to “the response of the ordinary lay hearer.” That is, rather than making a purely subjective determination, the trier of fact is meant to determine the issue “in light of the impressions reasonably expected to be made upon the hypothetical ordinary observer.”[122] Because the court reasoned that the value of the work lay solely in the opinion of its intended audience, it held that expert testimony on the “impression made on the refined ears of musical experts” was “utterly immaterial.”[123] While seeming to realize the difficulty in discovering the views of the imaginary “ordinary observer,”[124] the court stated that expert testimony was permitted for the limited purpose of “assist[ing] in determining the reactions of lay auditors.”[125]

Moreover, because the court determined that the value of the works lay in their final form as impressed upon the ordinary observer, it instructed that detailed analysis and careful dissection were inappropriate.[126] Therefore, according to Arnstein, works were to be considered in their entirety, again including both protectable and non-protectable material.[127] The trier was left to depend on “some visceral reaction” as the basis for determining misappropriation.[128]

If the case involves “comprehensive non-literal similarity”[129] – that is, similarity in the overall structure of the works – the trier must make a value judgment of “whether defendant took from plaintiff’s works so much of what is pleasing to the ears of lay listeners, who comprise the audience for whom such popular music is composed, that defendant wrongfully appropriated something which belongs to the plaintiff.”[130] In a case of “fragmented literal similarity,” or verbatim copying of constituent elements, an analogous value judgment must be made, but here only with respect to the protectable portions of plaintiff’s work that have been taken.[131] Dissimilarities between materials alleged to be infringing are “significant because they mitigate any impression of similarity.”[132] Dissimilarities in other aspects of the defendant’s work, except to the extent they create an overall different impression, “typically are not significant.”[133] As Judge Learned Hand said, “no plagiarist can excuse his wrong by showing how much work he did not pirate.”[134] Thus, if the defendant copies from the plaintiff’s work, it does not matter if he adds significant material of his own,[135] resulting in what might be a transformative new work.

Consequently, under Arnstein, “[i]nstead of using some objective standards or criteria based on economic impact or quantity, courts [are] to determine infringement on an unpredictable, impressionistic basis.”[136]

iii. Further Developments and Confusions of the Arnstein Test

Although the Second Circuit in Arnstein conducted two separate inquiries into the level of similarity between the two works,[137] namely to establish copying-in-fact and then to determine misappropriation, confusion ensued from the dual use of the term “substantial similarity.” As a result of this confusion, in Ideal Toy Corp. v. Fab-Lu Ltd,[138] the Second Circuit essentially combined the issues into one subjective test. By misinterpreting the element of misappropriation identified in Arnstein as “merely an alternative way of formulating the issue of substantial similarity” rather than as an independent step, the Second Circuit stated that copyright infringement is shown solely by “substantial similarity” between the two works based on the reaction of the ordinary observer.[139] The court effectively reduced the test for prima facie copyright infringement to (1) access, and (2) misappropriation, thereby failing to consider copying-in-fact.[140] Therefore, the court rejected dissection, analysis, and expert testimony entirely and did not think it necessary to analyze the similarities between the works to determine the likelihood of independent creation.[141] By basing the entire copyright infringement inquiry on the subjective impression of those untrained in the arts, the court neglected to protect against a finding of infringement based on purely unprotectable and unoriginal elements. The Ideal Toy test fails to deal with the fundamental principle of copyright law that seeks to protect merely the expression of ideas rather than ideas themselves.[142] Unfortunately, Ideal Toy’s interpretation of the Arnstein test largely influenced the way modern courts use “substantial similarity” in determining infringement.[143]

Luckily, some courts have maintained the Arnstein two-part inquiry. In Universal Athletic Sales Co. v. Salkeld,[144] the Third Circuit restored Arnstein’s bifurcated approach in holding that a plaintiff must prove copying and “that copying went so far as to constitute improper appropriation.”[145] Moreover, the court recognized that “substantial similarity to show that the original work has been copied is not the same as substantial similarity to prove infringement.”[146]

More inherently problematic is the fact that the Salkeld court maintained Arnstein’s limits on expert analysis and dissection in determining misappropriation.[147] As a result, the Salkeld test likewise fails to provide any objective standards or criteria for determining how much similarity is necessary to constitute misappropriation.[148]

Though courts instruct the fact finder to find misappropriation only if the defendant’s work copies not merely the idea, but ‘the expression of the idea,’ this “vague formula” is a reformulation, not a solution, to the problem of determining “what sort of similarity short of the verbatim will constitute substantial similarity.”[149] Thus, the ordinary observer continues to be left with “the impossible task of comparing only protected expression in determining substantial similarity without engaging in any thoughtful dissection or analysis of the works.”[150]

2. Ninth Circuit: “Total Concept and Feel”

The Ninth Circuit’s framework, laid out in Sid & Marty Krofft Television Prods., Inc. v. McDonald’s Corp.,[151] represents the second main approach to determining copyright infringement. Although not a music infringement case, the “extrinsic/intrinsic” test developed by the Krofft court has been influential in many copyright disputes, including those involving music.[152] In recognizing that the ordinary observer is unlikely to be able to separate idea from expression in comparing two works without dissection or analysis, the Ninth Circuit proposed its own two-step test that attempts to ensure that there is “substantial similarity not only of the general ideas but of the expressions of those ideas as well.”[153] Though the test was later reformulated to include specific expressive elements during the extrinsic inquiry, as discussed in Part I.ii.c., understanding the original formulation is key to examining its foundational flaws.

i. Extrinsic Test

The first step, or the “extrinsic” analysis, as originally cast by the Krofft court, was an objective comparison by the court for similarity in ideas.[154] Only if a substantial similarity of objective criteria under the “extrinsic” test is found do courts consider misappropriation under the “intrinsic” analysis.[155] Thus, the extrinsic test aims to limit protection to protectable elements by first filtering out unprotectable elements, including ideas, facts, and scènes à faire, and then determining whether the allegedly infringing work is “substantially similar to the protectable elements of the artist’s work.”[156]

According to the Ninth Circuit, in filtering out unprotected elements the extrinsic test incorporates the idea-expression dichotomy by limiting the scope of copyright protection to expression. As the court stated:

By creating a discrete set of standards for determining the objective similarity of literary works, the law of this circuit has implicitly recognized the distinction between situations in which idea and expression merge in representational objects and those in which the idea is distinct from the written expression of a concept….[157]

Courts conducting the extrinsic test “must take care to inquire only whether the protectable elements, standing alone, are substantially similar.”[158] Therefore, analytic dissection and expert testimony presented by the plaintiff on the similarities between the plaintiff’s work and defendant’s work are recommended.[159]

In performing the analytic dissection, courts are instructed to list and analyze the “measurable, objective elements” of the works,[160] including “the type of artwork involved, materials used, and the subject matter.”[161] For literary works, courts have listed such elements as “plot, theme, dialogue, mood, setting, pace, characters, and sequence of events.”[162] Because these factors do not readily apply to visual works of art, the court looks to the “objective details of the appearance.”[163] Without attempting to provide an exhaustive list of relevant factors, the Ninth Circuit listed such elements as “the subject matter, shapes, colors, materials, and arrangement of the representations.”[164]

Though described as a factual question, the extrinsic test may often be decided as a matter of law because it rests on objective criteria rather than the response of the trier of fact.[165]

ii. Intrinsic Test

Much like the Second Circuit’s ordinary observer test, the intrinsic test is entirely subjective and based on the “response of the ordinary reasonable person” to the “total concept and feel” of a work,[166] excluding expert testimony and dissection. Similar to the Arnstein court’s language of “lay [persons], who comprise the audience,”[167] the Krofft court suggested that the fact finder’s reaction be geared towards that of the intended or likely audience.[168] In a suit involving the characters in a children’s television show, the court stated, “The present case demands an even more intrinsic determination because both plaintiffs’ and defendants’ works are directed to an audience of children.”[169] Therefore, the court limited the inquiry to the understanding of a child and found substantial similarity despite noted differences. Without expert testimony to aid the trier in determining whether children might detect distinctions, the court relied on the triers’ subjective belief that children would be unlikely to notice minor distinctions.[170]

iii. Further Developments and Confusions of the Krofft Test

In Shaw v. Lindheim,[171] the Ninth Circuit modified the extrinsic/intrinsic test in recognition of the fact that district courts were not limiting the extrinsic test to a comparison of ideas.[172] Recognizing that the similarity of ideas prong is often shown by “focusing on the similarities in the objective details of the works,”[173] the Shaw court explained that the extrinsic/intrinsic test is no longer divided by an analysis of ideas and expression.[174] Rather, the extrinsic test is an objective analysis of specific “manifestations of expression,” while the intrinsic test is a subjective analysis of expression by the fact finder, amounting to no more than the lay observer’s visceral reaction, which is “virtually devoid of analysis.”[175] Though the Shaw court recognized that the test was “more sensibly described as objective and subjective,”[176] courts have confusingly continued to use the extrinsic/intrinsic language. Moreover, subsequent cases have left the analysis of improper appropriation to the jury analyzing the works as a whole.

In Swirsky v. Carey,[177] the Ninth Circuit applied the extrinsic/intrinsic test to a case involving musical works. Swirsky and his co-writer filed a copyright infringement suit claiming that Mariah Carey’s song “Thank God I Found You” plagiarized their song “One of Those Love Songs.”[178] The court rejected the district court’s approach to the extrinsic test, which involved a “measure-by-measure comparison of melodic note sequences.”[179] The Ninth Circuit reasoned that comparing notes alone would fail to account for other relevant elements such as “harmonic chord progression, tempo, and key,” as “it is these elements that determine what notes and pitches are heard in a song and at what point in the song they are found.”[180] The court declined, however, to prescribe precisely which elements should be considered, explaining in dicta that the copyright framework is difficult to apply to aesthetic works such as music, which are “not capable of ready classification into . . . constituent elements” the way a literary work can be classified into “plot, themes, mood, setting, pace, characters, and sequence of events.”[181]

The Ninth Circuit’s opinion in Swirsky is also relevant for its proposition that substantial similarity can be found based on a combination of elements, “even if those elements are individually unprotected.”[182] For example, in Three Boys Music Corp. v. Bolton,[183] the Ninth Circuit upheld the jury’s finding that two songs were substantially similar due to the presence of the same five individually unprotectable elements: “(1) the title hook phrase (including the lyric, rhythm, and pitch); (2) the shifted cadence; (3) the instrumental figures; (4) the use of the title phrase as a coda; and (5) the fade ending.”