Towards open and responsible AI licensing frameworks

Open & Responsible AI licenses (“OpenRAIL”) are AI-specific licenses enabling open access, use and distribution of AI artifacts while requiring a responsible use of the latter. OpenRAIL licenses could be for open and responsible ML what current open source licenses are to code and Creative Commons licenses are to general content: a widespread community licensing tool.

Advances in machine learning and other AI-related areas have flourished these past years partly due to the ubiquity of the open source culture in the Information and Communication Technologies (ICT) sector, which has permeated into ML research and development dynamics. Notwithstanding the advantages of openness as a core value for innovation in the field, recent events related to the ethical and socio-economic concerns around the development and use of machine learning models have spread a clear message: openness alone is not enough. Closed systems are not the answer though, as the problem persists under the opacity of firms’ private AI development processes.



Open source licenses do not fit all

Access to, development of, and use of ML models is heavily influenced by open source licensing schemes. For instance, ML developers might colloquially speak of “open sourcing a model” when they make its weights available by attaching an official open source license, or some other open software or content license such as Creative Commons. This raises the question: why do they do it? Are ML artifacts and source code really that similar? Do they share enough from a technical perspective that private governance mechanisms (e.g. open source licenses) designed for source code should also govern the development and use of ML models?

Most current model developers seem to think so, as the vast majority of openly released models carry an open source license (e.g., Apache 2.0). See for instance the Hugging Face Model Hub and Muñoz Ferrandis & Duque Lizarralde (2022).

Nonetheless, empirical evidence is also telling us that a rigid approach to open sourcing and/or Free Software dynamics, and an axiomatic belief in Freedom 0 for the release of ML artifacts, is creating socio-ethical distortions in the use of ML models (see Widder et al. (2022)). In simpler terms, open source licenses do not take into account the technical nature and capabilities of the model as a different artifact to software/source code, and are therefore ill-adapted to enabling a more responsible use of ML models (e.g. criterion 6 of the Open Source Definition); see also Widder et al. (2022); Moran (2021); Contractor et al. (2020).

If specific ad hoc practices dedicated to documentation, transparency and ethical use of ML models are already present and improving every day (e.g., model cards, evaluation benchmarks), why shouldn’t open licensing practices also be adapted to the specific capabilities and challenges stemming from ML models?

The same concerns are rising in commercial and government ML licensing practices. In the words of Bowe & Martin (2022): “Babak Siavoshy, general counsel at Anduril Industries, asked what type of license terms should apply to an AI algorithm privately developed for computer-vision object detection and adapted for military targeting or threat-evaluation? Neither commercial software licenses nor standard DFARS data rights clauses adequately answer this question, as neither appropriately protects the developer’s interest or enables the government to gain the insight into the system it needs to deploy it responsibly“.

If indeed ML models and software/source code are different artifacts, why is the former released under open source licenses? The answer is straightforward: open source licenses have become the de facto standard in software-related markets for the open sharing of code among software communities. This “open source” approach to collaborative software development has permeated and influenced AI development and licensing practices, and has brought huge benefits. Open source and Open & Responsible AI licenses (“OpenRAIL”) may well be complementary initiatives.

Why not design a set of licensing mechanisms inspired by movements such as open source, and led by an evidence-based approach from the ML field? In fact, there is a new set of licensing frameworks which could become the vehicle towards open and responsible ML development, use and access: Open & Responsible AI Licenses (OpenRAIL).



A change of licensing paradigm: OpenRAIL

The OpenRAIL approach taken by the RAIL Initiative and supported by Hugging Face is informed and inspired by initiatives such as BigScience, Open Source, and Creative Commons. The two main features of an OpenRAIL license are:

  • Open: these licenses allow royalty-free access and flexible downstream use and re-distribution of the licensed material, and distribution of any derivatives of it.

  • Responsible: OpenRAIL licenses embed a specific set of restrictions on the use of the licensed AI artifact in identified critical scenarios. Use-based restrictions are informed by an evidence-based approach to ML development and use limitations, which forces drawing a line between promoting wide access and use of ML and the potential social costs stemming from harmful uses of the openly licensed AI artifact. Therefore, while benefiting from open access to the ML model, the user will not be able to use the model for the specified restricted scenarios.

The integration of use-based restriction clauses into open AI licenses gives the licensor of the ML model the ability to better control the use of the AI artifact and the capacity to enforce the license, standing up for a responsible use of the released AI artifact in case a misuse of the model is identified. If behavioral-use restrictions were not present in open AI licenses, how would licensors even begin to think about responsible use-related legal tools when openly releasing their AI artifacts? OpenRAILs and RAILs are the first step towards enabling ethics-informed behavioral restrictions.

And even before thinking about enforcement, use-based restriction clauses may act as a deterrent against misuse of the model (i.e., a dissuasive effect). Nonetheless, the mere presence of use-based restrictions will not be enough to ensure that potential misuses of the released AI artifact do not occur. This is why OpenRAILs require downstream adoption of the use-based restrictions by subsequent re-distributions and derivatives of the AI artifact, as a means to dissuade users of derivatives of the AI artifact from misusing the latter.
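To make the two mechanics discussed here more concrete, below is a minimal, purely illustrative Python sketch (not legal advice and not any real library's API; all names are invented) that models an OpenRAIL-style license as a data structure: open by default with an explicit set of restricted uses, and a copyleft-style rule that derivatives must carry over every upstream restriction.

```python
# Illustrative sketch only: a toy model of an OpenRAIL-style license showing
# (1) use-based restrictions and (2) their mandatory propagation to derivatives.

from dataclasses import dataclass


@dataclass(frozen=True)
class RAILLicense:
    """Toy license: open access by default, plus an explicit set of
    restricted use scenarios (cf. the use-based restrictions attachment
    in BLOOM RAIL v1.0)."""
    name: str
    restricted_uses: frozenset = frozenset()

    def permits(self, intended_use: str) -> bool:
        # Open by default: a use is disallowed only if explicitly restricted.
        return intended_use not in self.restricted_uses

    def derive(self, name: str, extra: frozenset = frozenset()) -> "RAILLicense":
        # Copyleft-style behavioral clause: a derivative's license may add
        # terms, but must carry over every upstream use-based restriction.
        return RAILLicense(name=name,
                           restricted_uses=self.restricted_uses | extra)


base = RAILLicense("openrail-m-sketch",
                   restricted_uses=frozenset({"disinformation", "medical-advice"}))
fine_tuned = base.derive("my-finetune-license",
                         extra=frozenset({"spam-generation"}))

print(base.permits("summarization"))         # True
print(fine_tuned.permits("disinformation"))  # False: inherited restriction
```

The point of the `derive` method is the asymmetry the article describes: downstream licensors are free to choose other terms for the rest of the license, but the restriction set can only grow, never shrink.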

The effect of these copyleft-style behavioral-use clauses is to spread the requirement from the original licensor, based on his/her wish for and trust in the responsible use of the licensed artifact. Furthermore, widespread adoption of behavioral-use clauses gives subsequent distributors of derivative versions of the licensed artifact the ability to better control its use. From a social perspective, OpenRAILs are a vehicle towards consolidating an informed and respectful culture of sharing AI artifacts that acknowledges their limitations and the values held by the licensors of the model.



OpenRAIL could be for good machine learning what open software licensing is to code

Three examples of OpenRAIL licenses are the recently released BigScience OpenRAIL-M, StableDiffusion’s CreativeML OpenRAIL-M, and the genesis of the former two: BigScience BLOOM RAIL v1.0 (see post and FAQ here). The latter was specifically designed to promote open and responsible access and use of BigScience’s 176B-parameter model named BLOOM (and related checkpoints). The license plays on the intersection between openness and responsible AI by proposing a permissive set of licensing terms coupled with a use-based restrictions clause, wherein a limited number of restricted uses is set based on evidence of the potential that Large Language Models (LLMs) have, as well as their inherent risks and scrutinized limitations. The OpenRAIL approach taken by the RAIL Initiative is a consequence of the BigScience BLOOM RAIL v1.0 being the first of its kind, in parallel with the release of other more restrictive models with behavioral-use clauses, such as OPT-175 or SEER, also being made available.

The licenses are BigScience’s response to two partially addressed challenges in the licensing space: (i) the “Model” being a different thing to “code”; (ii) the responsible use of the Model. BigScience made that extra step by really focusing the license on the specific case scenario and BigScience’s community goals. In fact, the solution proposed is quite a new one in the AI space: BigScience designed the license in a way that makes the responsible use of the Model widespread (i.e. promotion of responsible use), because any re-distribution or derivative of the Model will have to comply with the specific use-based restrictions, while being able to propose other licensing terms for the rest of the license.

OpenRAIL also aligns with the ongoing regulatory trend proposing sector-specific regulations for the deployment, use and commercialization of AI systems. With the advent of AI regulations (e.g., the EU AI Act; Canada’s proposal of an AI & Data Act), new open licensing paradigms informed by AI regulatory trends and ethical concerns have the potential of being massively adopted in the coming years. Open sourcing a model without taking due account of its impact, use, and documentation could be a source of concern in light of recent AI regulatory trends. Henceforth, OpenRAILs should be conceived as instruments articulating with ongoing AI regulatory trends and as part of a broader system of AI governance tools, and not as the only solution enabling open and responsible use of AI.

Open licensing is one of the cornerstones of AI innovation. Licenses, as social and legal institutions, should be well taken care of. They should not be conceived as burdensome legal technical mechanisms, but rather as a communication instrument among AI communities, bringing stakeholders together by sharing common messages on how the licensed artifact can be used.

Let’s invest in a healthy open and responsible AI licensing culture: the future of AI innovation and impact depends on it, on all of us, on you.

Author: Carlos Muñoz Ferrandis

Blog acknowledgments: Yacine Jernite, Giada Pistilli, Irene Solaiman, Clementine Fourrier, Clément Délange


