How AI Is Occupying the Schedule of GitHub’s Chief Legal Officer, Shelley McKinley

GitHub’s chief legal officer, Shelley McKinley, has a lot on her plate, notably legal issues around Copilot, the company’s AI pair-programmer, as well as the Artificial Intelligence (AI) Act, which the European Parliament passed this week as “the first comprehensive AI law in the world.”

The EU AI Act has been in the works for three years, first surfacing in 2021 through proposals intended to address the growing influence of AI in our everyday lives. The new legal framework regulates AI applications based on their perceived risks, with different rules and requirements depending on the application and use case.

GitHub, which Microsoft acquired for $7.5 billion in 2018, has been one of the most vocal opponents of one specific aspect of the regulations, arguing that ambiguous wording could create legal liability for open source software developers.

McKinley joined Microsoft in 2005 and served in various legal roles, including positions in hardware businesses such as Xbox and HoloLens and general counsel posts in Munich and Amsterdam, before becoming chief legal officer at GitHub nearly three years ago.

“I transitioned to GitHub in 2021 to take on a different kind of chief legal officer role, which is multidisciplinary,” McKinley told TechCrunch. “My responsibilities include the usual legal issues such as commercial contracts, product and HR, but I also oversee accessibility, making sure all developers can use our tools and services.”

Beyond that, McKinley handles environmental sustainability matters in line with Microsoft’s own goals. Her purview also covers trust and safety, which involves content moderation to keep GitHub a welcoming, safe and friendly environment for developers.

There’s no doubt that McKinley’s duties are becoming increasingly connected to AI.

TechCrunch sat down with McKinley in London shortly before the EU AI Act was approved this week.

GitHub chief legal officer Shelley McKinley. Image Credits: GitHub

GitHub is a platform that provides a collaborative environment for software development. It lets users host, manage and share code “repositories” (storage spaces for a project’s files) with anyone, anywhere in the world. Companies can pay to keep their repositories private for internal projects, but GitHub’s growth and success have largely been driven by open source software development carried out collaboratively in public.

The technology world has changed considerably in the six years since Microsoft’s acquisition. AI was not exactly a novelty in 2018, and its growing influence on society was already becoming noticeable, but with ChatGPT, DALL-E and the rest, AI has now firmly entered the mainstream.

“I would say that AI is consuming a substantial chunk of my time — this includes topics like ‘how do we develop and launch AI products,’ ‘how do we participate in AI discussions from a policy perspective?’ and ‘how do we think about AI as it comes onto our platform?’” McKinley said.

The progress of AI owes a lot to open source: collaboration and shared data have been pivotal in developing some of today’s most remarkable AI systems. Perhaps the best illustration is OpenAI, the poster child of generative AI, which started with a strong open source foundation before shifting toward a more proprietary approach. That shift is also among the reasons Elon Musk is currently suing OpenAI.


Despite the good intentions behind Europe’s incoming AI regulations, critics argued they could have substantial unintended consequences for the open source community, which in turn could slow the development of AI. That argument has been central to GitHub’s advocacy efforts.

“Regulators, policymakers, lawyers… are not technologists,” McKinley said. “And one of the key tasks I’ve been part of over the past year is going out and helping educate people on how the products work. People need a comprehensive understanding of what’s happening so they can think about these matters and reach sound decisions on how regulation is implemented.”

The chief concern was that the rules could create legal liability for open source “general purpose AI systems,” which are built on models capable of handling a variety of tasks. If open source AI developers were held liable for problems arising further downstream (i.e., at the application level), they might become less willing to contribute, leaving more power and control in the hands of the big technology companies developing proprietary systems.

Open source software development is by its very nature distributed, and GitHub, with more than 100 million developers worldwide, depends on those developers staying motivated to keep contributing to what many consider the fourth industrial revolution. That is why GitHub has been outspoken about the AI Act, advocating exemptions for developers working on open source general purpose AI technology.

“GitHub is the place for open source, we are the caretakers of the world’s largest open source community,” stated McKinley. “We want to be the go-to destination for all developers, and we want to expedite human progress through developer cooperation. For us, this is of utmost importance — it’s not just a ‘fun to have’ or ‘nice to have’ — it’s central to what we do as a company and a platform.”


The AI Act as passed includes exemptions for AI models and systems released under free and open source licenses, except where “unacceptable” high-risk AI systems are involved. Developers of open source general purpose AI models therefore don’t have to provide the same level of documentation and guarantees to EU regulators. That said, it is still not clear which proprietary and open source models will be classified as “high-risk.”

Despite these lingering uncertainties, McKinley believes the company’s vigorous lobbying efforts have largely paid off, with regulators placing less focus on software “componentry” (the individual elements of a system, which are more likely to come from open source developers) and more on what happens at the compiled application level.

McKinley credits this outcome to GitHub’s success in educating policymakers. The company helped people grasp the componentry issue: open source components are continuously developed, released for free and transparent about how they work, and the same goes for open source AI models. The question then becomes how to distribute liability responsibly. In McKinley’s view, liability should sit less with upstream developers and more with downstream commercial products, an outcome she regards as a substantial victory for innovation and for open source developers.


Since its introduction three years ago, GitHub’s AI-enabled pair-programming tool Copilot has helped trigger a generative AI revolution that could disrupt many industries, software development included. Similar to Gmail’s Smart Compose, Copilot suggests lines of code or entire functions as a developer types, accelerating the development process.
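
To make that concrete, here is a hypothetical illustration, not an actual Copilot transcript, of the kind of completion such a tool might offer once a developer has typed only a function signature and docstring. The function name and suggested body are invented for this example:

```python
# A developer types the signature and docstring...
def is_palindrome(text: str) -> bool:
    """Return True if `text` reads the same forwards and backwards,
    ignoring case and non-alphanumeric characters."""
    # ...and a Copilot-style assistant might suggest a body like this:
    cleaned = "".join(ch.lower() for ch in text if ch.isalnum())
    return cleaned == cleaned[::-1]
```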

But Copilot’s launch as a commercial product in 2022 stirred substantial controversy in the developer community. The Software Freedom Conservancy went as far as urging all open source software developers to abandon GitHub. The concern is that Copilot, a proprietary paid service, profits from the work of the open source community. Moreover, Copilot was developed in partnership with OpenAI and leans on OpenAI Codex, which, before the ChatGPT craze, was trained on vast amounts of public source code and natural language text.

Copilot raises important questions about software authorship: if Copilot is merely reproducing code written by another developer, shouldn’t that developer get the credit? That question is the focus of a detailed piece by the Software Freedom Conservancy’s Bradley M. Kuhn, “If Software is My Copilot, Who Programmed My Software?”

Open source software is often misunderstood as a free-for-all in which anyone can use the code however they wish. In reality, most open source licenses impose at least one crucial condition: proper attribution for borrowed code. It is difficult to provide correct attribution when the origin of the code Copilot serves up is unknown.
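
As a concrete example, the widely used MIT License permits reuse but requires that its copyright and permission notice accompany all copies or substantial portions of the software. A developer borrowing a function would typically preserve a header along these lines (the project and author names here are invented):

```python
# Portions of this file are adapted from the hypothetical project "examplelib",
# Copyright (c) 2021 Jane Doe, used under the MIT License. The license requires
# that this notice be included in all copies or substantial portions of the code.

def parse_semver(version: str) -> tuple[int, int, int]:
    """Parse a 'MAJOR.MINOR.PATCH' version string (adapted from examplelib)."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return major, minor, patch
```

This is exactly the obligation that becomes hard to honor when a tool surfaces code without telling you where it came from.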

The controversy surrounding Copilot also underscores how difficult it is to understand what generative AI really is. Large language models, such as those behind ChatGPT or Copilot, are trained on vast amounts of data. Much as a human programmer learns from existing code, Copilot can produce output that closely resembles, or is identical to, what already exists. And when a suggestion does match public code, that match “frequently” appears across “dozens, if not hundreds” of repositories.

“Generative AI is not a mere copy-and-paste machine,” McKinley said. “Copilot is most likely to reproduce publicly accessible code when it’s a common approach. That said, we’re aware of the concerns people have, and we aim to approach this responsibly, meeting our community’s needs and the excitement developers have for this tool. We value developers’ feedback.”

Microsoft, GitHub and OpenAI have managed to dodge several claims in a lawsuit filed in late 2022 by U.S. software developers, who allege that Copilot amounts to “unprecedented open-source software piracy.” With parts of the case dismissed, the litigation continues, and the plaintiffs recently filed an amended complaint over GitHub’s alleged breach of contract with its developers.

The legal dispute didn’t come as a surprise, according to McKinley. “We certainly had feedback from the community, and we were aware of the concerns that had been expressed,” she said.

GitHub has tried to allay concerns that Copilot “borrows” code from other developers. It introduced a duplication-detection feature, which is off by default but, once enabled, blocks Copilot from suggesting code completions of more than 150 characters that match publicly available code. And in August, GitHub introduced a code-referencing feature (still in beta) that lets developers trace the origin of suggested code snippets, so they can comply with licensing and attribution requirements, or even adopt the entire library a snippet came from.
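
GitHub hasn’t published how duplication detection works internally, but the behavior described above implies a filter along these lines. The sketch below is a minimal illustration under that assumption; `PublicCodeIndex` is a hypothetical stand-in for whatever matching service GitHub actually runs:

```python
WINDOW = 150  # characters, per GitHub's stated threshold


class PublicCodeIndex:
    """Hypothetical lookup over an index of publicly available code."""

    def __init__(self, known_snippets: set[str]):
        self.known_snippets = known_snippets

    def contains(self, span: str) -> bool:
        # Naive substring search; a real system would use something far faster.
        return any(span in known for known in self.known_snippets)


def allow_suggestion(suggestion: str, index: PublicCodeIndex) -> bool:
    """Suppress a completion if any 150-character span matches public code verbatim."""
    for start in range(max(len(suggestion) - WINDOW, 0) + 1):
        span = suggestion[start : start + WINDOW]
        if len(span) == WINDOW and index.contains(span):
            return False  # block: too close to known public code
    return True  # allow the suggestion through
```

In practice the matching is unlikely to be a literal substring test; fingerprints over normalized token windows would be more robust. The point is the behavior: a long verbatim overlap with public code suppresses the suggestion.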

The scale of the problem developers are worried about is hard to quantify. GitHub has previously said its duplication-detection feature would trigger less than 1% of the time, most often when a file is nearly empty and there is little local context to draw on, making Copilot more likely to suggest code that mirrors code written elsewhere.

“There are a multitude of viewpoints out there — our platform hosts over 100 million developers,” McKinley said. “Among those developers there is a spectrum of opinions and concerns. Our aim is to adapt to community feedback while pursuing the initiatives we believe will make Copilot a great tool and experience for developers.”

The passage of the EU AI Act is only the beginning. We now know the law is coming and what form it will take, but it will be another couple of years before companies have to comply with it, much as organizations once had to prepare for GDPR in the data privacy realm.

“I envision that [technical] standards will play a significant role in this,” McKinley said. “We need to think about how to arrive at harmonized standards that companies can comply with. With GDPR, for example, various privacy standards were proposed to achieve uniformity. And we know that as the AI Act comes into effect, different stakeholders will be trying to work out how to implement it. So we want to make sure we’re representing developers, and open source developers, in those conversations.”

More regulation is on the way, too. President Biden recently issued an executive order aimed at establishing standards around AI safety and security, and it offers a glimpse of how Europe and the U.S. might diverge on regulation, even if both take a broadly similar “risk-based” approach.

“I would say the EU AI Act is a ‘fundamental rights base,’ as you would expect in Europe,” McKinley said. “And the U.S. side is very cybersecurity, deep-fakes — that kind of lens. But in many ways, they come together to focus on what are risky scenarios — and I think taking a risk-based approach is something that we are in favour of — it’s the right way to think about it.”
