Labour Force Survey (Basic Tabulation), May 2025 (Reiwa 7)
Consumer Price Index (2020-Base), Ku-area of Tokyo, June 2025 (Reiwa 7), mid-month preliminary figures
Information and Communications Council, Information and Communications Technology Subcommittee, Committee on Effective Radio Spectrum Use, Radio Monitoring Working Group (2nd meeting)
Updated information on general-service technical-track positions (information and communications administration)
Two Courts Rule On Generative AI and Fair Use — One Gets It Right
Things are speeding up in generative AI legal cases, with two judicial opinions just out on an issue that will shape the future of generative AI: whether training gen-AI models on copyrighted works is fair use. One gets it spot on; the other, not so much, but fortunately in a way that future courts can and should discount.
The core question in both cases was whether using copyrighted works to train Large Language Models (LLMs) used in AI chatbots is a lawful fair use. Under the US Copyright Act, answering that question requires courts to consider:
- whether the use was transformative;
- the nature of the works (Are they more creative than factual? Long since published?);
- how much of the original was used; and
- the harm to the market for the original work.
In both cases, the judges focused on the first and fourth factors: whether the use was transformative, and the harm to the market.
The right approach

In Bartz v. Anthropic, three authors sued Anthropic for using their books to train its Claude chatbot. In his order deciding parts of the case, Judge William Alsup confirmed what EFF has said for years: fair use protects the use of copyrighted works for training because, among other things, training gen-AI is “transformative—spectacularly so” and any alleged harm to the market for the original is pure speculation. Just as copying books or images to create search engines is fair, the court held, copying books to create a new, “transformative” LLM and related technologies is also protected:
[U]sing copyrighted works to train LLMs to generate new text was quintessentially transformative. Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them—but to turn a hard corner and create something different. If this training process reasonably required making copies within the LLM or otherwise, those copies were engaged in a transformative use.
Importantly, Bartz rejected the copyright holders’ attempts to claim that any model capable of generating new written material that might compete with existing works by emulating their “sweeping themes,” “substantive points,” or “grammar, composition, and style” was an infringement machine. As the court rightly recognized, building gen-AI models that create new works is beyond “anything that any copyright owner rightly could expect to control.”
There’s a lot more to like about the Bartz ruling, but just as we were digesting it Kadrey v. Meta Platforms came out. Sadly, this decision bungles the fair use analysis.
A fumble on fair use

Kadrey is another suit by authors against the developer of an AI model, in this case Meta’s ‘Llama’ chatbot. The authors in Kadrey asked the court to rule that fair use did not apply.
Much of the Kadrey ruling by Judge Vince Chhabria is dicta—meaning, the opinion spends many paragraphs on what it thinks could justify ruling in favor of the author plaintiffs, if only they had managed to present different facts (rather than pure speculation). The court then rules in Meta’s favor because the plaintiffs only offered speculation.
But it makes a number of errors along the way to the right outcome. At the top, the ruling broadly proclaims that training AI without buying a license to use each and every piece of copyrighted training material will be “illegal” in “most cases.” The court asserted that fair use usually won’t apply to AI training uses even though training is a “highly transformative” process, because of hypothetical “market dilution” scenarios where competition from AI-generated works could reduce the value of the books used to train the AI model.
That theory, in turn, depends on three mistaken premises. First, that the most important factor for determining fair use is whether the use might cause market harm. That’s not correct. Since its seminal 1994 opinion in Campbell v. Acuff-Rose, the Supreme Court has been very clear that no single factor controls the fair use analysis.
Second, that an AI developer would typically seek to train a model entirely on a certain type of work, and then use that model to generate new works in the exact same genre, which would then compete with the works on which it was trained, such that the market for the original works is harmed. As the Kadrey ruling notes, there was no evidence that Llama was intended to do, or does, anything like that, nor will most LLMs, for the exact reasons discussed in Bartz.
Third, as a matter of law, copyright doesn't prevent “market dilution” unless the new works are otherwise infringing. In fact, the whole purpose of copyright is to be an engine for new expression. If that new expression competes with existing works, that’s a feature, not a bug.
Gen-AI is spurring the kind of tech panics we’ve seen before; then, as now, thoughtful fair use opinions helped ensure that copyright law served innovation and creativity. Gen-AI does raise a host of other serious concerns about fair labor practices and misinformation, but copyright wasn’t designed to address those problems. Trying to force copyright law to play those roles only hurts important and legal uses of this technology.
In keeping with that tradition, courts deciding fair use in other AI copyright cases should look to Bartz, not Kadrey.
[Recommended Book] Kengo Suganuma, "Tokyo Shimbun wa Naze, Kuki wo Yomanai no ka" (Why Doesn't the Tokyo Shimbun Read the Room?): reporting that cuts straight to the heart of the matter; don't tolerate politics that hides behind words. By Shigetake Maruyama (journalism researcher)
AfriSIG 2025: An exceptional opportunity for African professionals in internet governance
Ahead of Budapest Pride, EFF and 46 Organizations Call on European Commission to Defend Fundamental Rights in Hungary
This week, EFF joined EDRi and nearly 50 civil society organizations urging the European Commission’s President Ursula von der Leyen, Executive Vice President Henna Virkkunen, and Commissioners Michael McGrath and Hadja Lahbib to take immediate action and defend human rights in Hungary.
With Budapest Pride just two days away, Hungary has criminalized Pride marches and is planning to deploy real-time facial recognition technology to identify those participating in the event. This is a flagrant violation of fundamental rights, particularly the rights to free expression and assembly.
On April 15, a new amendment package went into effect in Hungary that authorizes the use of real-time facial recognition to identify protesters at ‘banned protests’ like LGBTQ+ events, and includes harsh penalties like excessive fines and imprisonment. This is prohibited by the EU Artificial Intelligence (AI) Act, which does not permit the use of real-time facial recognition for these purposes.
This came on the back of members of Hungary’s Parliament rushing through three amendments in March to ban and criminalize Pride marches and their organizers, and permit the use of real-time facial recognition technologies for the identification of protestors. These amendments were passed without public consultation and are in express violation of the EU AI Act and Charter of Fundamental Rights. In response, civil society organizations urged the European Commission to put interim measures in place to rectify the violation of fundamental rights and values. The Commission is yet to respond—a real cause of concern.
This is an attack on LGBTQ+ individuals, as well as an attack on the rights of all people in Hungary. The letter urges the European Commission to take the following actions:
- Open an infringement procedure against any new violations of EU law, in particular the violation of Article 5 of the AI Act.
- Adopt interim measures in the ongoing infringement procedure against Hungary’s 2021 anti-LGBT law, which is used as a legal basis for the ban on LGBTQIA+-related public assemblies, including Budapest Pride.
There's no question that, when EU law is at stake, the European Commission has a responsibility to protect EU fundamental rights, including the rights of LGBTQ+ individuals in Hungary and across the Union. This includes ensuring that those organizing and marching at Pride in Budapest are safe and able to peacefully assemble and protest. If the EU Commission does not urgently act to ensure these rights, it risks hollowing out the values that the EU is built from.
Read our full letter to the Commission here.
How Cops Can Get Your Private Online Data
Can the cops get your online data? In short, yes. A variety of US federal and state laws give law enforcement the power to obtain information you provided to online services. But there are steps that you, as a user or a service provider, can take to improve online privacy.
Law enforcement demanding access to your private online data goes back to the beginning of the internet. In fact, one of EFF’s first cases, Steve Jackson Games v. Secret Service, exemplified the now all-too-familiar story where unfounded claims about illegal behavior resulted in overbroad seizures of user messages. But it’s not the ’90s anymore; the internet has become an integral part of everyone’s life. We all now rely on organizations big and small to steward our data, from huge service providers like Google, Meta, or your ISP to hobbyists hosting a blog or Mastodon server.
There is no “cloud,” just someone else's computer—and when the cops come knocking on their door, these hosts need to be willing to stand up for privacy, and to know how to do so to the fullest extent under the law. These legal limits are also important for users to know, not only to mitigate risks in their security plan when choosing where to share data, but to understand whether these hosts will go to bat for them. Taking action together, service hosts and users can keep law enforcement from getting more data than the law allows, protecting not just themselves but targeted populations, present and future.
This is distinct from law enforcement’s methods of collecting public data, such as the information now being collected on student visa applicants. Cops may use social media monitoring tools and sock puppet accounts to collect what you share publicly, or even within “private” communities. Police may also obtain the contents of communication in other ways that do not require court authorization, such as passively monitoring network traffic to catch metadata and possibly using advanced tools to partially reveal encrypted information. They can even outright buy information from online data brokers. Unfortunately, these practices are subject to few restrictions and little oversight—something EFF is fighting to change.
Below, however, is a general breakdown of the legal processes used by US law enforcement for accessing private data, and what categories of private data these processes can disclose. Because this is a generalized summary, it is neither exhaustive nor should it be considered legal advice. Please seek legal help if you have specific data privacy and security needs.
| Type of data | Process used | Challenge prior to disclosure? | Proof needed |
| --- | --- | --- | --- |
| Subscriber information | Subpoena | Yes | Relevant to an investigation |
| Non-content information, metadata | Court order; sometimes subpoena | Yes | Specific and articulable facts that info is relevant to an investigation |
| Stored content | Search warrant | No | Probable cause that info will provide evidence of a crime |
| Content in transit | Super warrant | No | Probable cause plus exhaustion and minimization |
Types of Data that Can be Collected

The laws protecting private data online generally follow a pattern: the more sensitive the personal data is, the greater the factual and legal burden police must meet before they can obtain it. Although this is not exhaustive, here are a few categories of data you may be sharing with services, and why police might want to obtain it.
- Subscriber Data: Information you provide in order to use the service. Think about ID or payment information, IP address location, email, phone number, and other information you provided when signing up.
  - Law enforcement can learn who controls an anonymous account, and find other service providers to gather information from.
- Non-content data, or "metadata": Saved information about your interactions on the service, like when you used the service, for how long, and with whom. It is analogous to what a postal worker can infer from a sealed letter with addressing information.
  - Law enforcement can use this information to infer a social graph, login history, and other information about a suspect’s behavior.
- Stored content: The actual content you are sending and receiving, like your direct message history or saved drafts. This can cover any private information your service provider can access.
  - Law enforcement seeks this most sensitive data to reveal criminal evidence. Overly broad requests can also enable retroactive searches, sweep in information about other users, and take information out of its original context.
- Content in transit: The content of your communications as it is being communicated. This real-time access may also capture information that isn’t typically stored by a provider, like your voice during a phone call.
  - Law enforcement can compel providers to wiretap their own services for a particular user—which may also implicate the privacy of the users they interact with.
When US law enforcement has identified a service that likely has this data, they have a few tools to legally compel that service to hand it over and prevent users from knowing information is being collected.
Subpoena

Subpoenas are demands from a prosecutor, law enforcement, or a grand jury that do not require a judge’s approval before being sent to a service. The only restriction is that the demand be relevant to an investigation. Often the only time a court reviews a subpoena is when a service or user challenges it in court.
Due to the lack of direct court oversight in most cases, subpoenas are prone to abuse and overreach. Providers should scrutinize such requests carefully with a lawyer and push back before disclosure, particularly when law enforcement tries to use subpoenas to obtain more private data, such as the contents of communications.
Court Order

This is a demand similar to a subpoena, but it is usually issued under a specific statute that requires a court to authorize the demand. Under the Stored Communications Act, for example, a court can issue an order for non-content information if police provide specific and articulable facts showing that the information sought is relevant to an investigation.
Like subpoenas, providers can usually challenge court orders before disclosure and inform the user(s) of the request, subject to law enforcement obtaining a gag order (more on this below).
Search Warrant

A warrant is a demand issued by a judge to permit police to search specific places or persons. To obtain a warrant, police must submit an affidavit (a written statement made under oath) establishing that there is a fair probability (or “probable cause”) that evidence of a crime will be found at a particular place or on a particular person.
Typically services cannot challenge a warrant before disclosure, as these requests are already approved by a magistrate. Sometimes police request that judges also enter gag orders against the target of the warrant that prevent hosts from informing the public or the user that the warrant exists.
Super Warrant

Police seeking to intercept communications as they occur generally face the highest legal burden. Usually the affidavit needs to not only establish probable cause, but also make clear that other investigation methods are not viable (exhaustion) and that the collection avoids capturing irrelevant data (minimization).
Some laws also require high-level approval within law enforcement, such as sign-off from agency leadership, before the request can be made. They may limit the types of crimes that can be investigated with wiretaps, and they may require law enforcement to periodically report back to the court about the wiretap, including whether it is minimizing the collection of non-relevant communications.
Generally these demands cannot be challenged while wiretapping is occurring, and providers are prohibited from telling the targets about the wiretap. But some laws require disclosure to targets and those who were communicating with them after the wiretap has ended.
Gag orders

Many of the legal authorities described above also permit law enforcement to simultaneously prohibit the service from telling the target of the legal process, or the general public, that the surveillance is occurring. These non-disclosure orders are prone to abuse, and EFF has repeatedly fought them because they violate the First Amendment and keep the public from understanding the breadth of law enforcement surveillance.
How Services Can (and Should) Protect You

This process isn't always clean-cut, and service providers must ultimately comply with lawful demands for users’ data, even after challenging them, if courts uphold the government’s demands.
Service providers outside the US also aren’t totally in the clear, as they must often comply with US law enforcement demands. This is usually because they either have a legal presence in the US or because they can be compelled through mutual legal assistance treaties and other international legal mechanisms.
However, services can do a lot to defend user privacy by following a few best practices, thus limiting the impact of these requests and, in some cases, making their service a less appealing door for the cops to knock on.
Put Cops through the Process

Paramount is the service provider's willingness to stand up for its users. Carving out exceptions or volunteering information outside of the legal framework erodes everyone's right to privacy. Even in urgent or exigent circumstances, the responsibility for deciding what to share should rest with the legal process, not with you.
Smaller hosts, like those of decentralized services, might be intimidated by these requests, but consulting legal counsel can ensure that requests are challenged when necessary. Organizations like EFF can sometimes provide legal help directly or connect service providers with alternative counsel.
Challenge Bad Requests

It’s not uncommon for law enforcement to overreach or make burdensome requests. Before offering information, services can push back on an improper demand informally, and then continue to do so in court. If the demand is overly broad, violates a user's First or Fourth Amendment rights, or has other legal defects, a court may rule that it is invalid and prevent disclosure of the user’s information.
Even if a court doesn’t invalidate the legal demand entirely, pushing back informally or in court can limit how much personal information is disclosed and mitigate privacy impacts.
Provide Notice

Unless otherwise restricted, service providers should give users notice about requests and disclosures as soon as they can. This notice is vital for users to seek legal support and prepare a defense.
Be Clear With Users

It is important for users to understand whether a host is committed to pushing back on data requests to the full extent permitted by law. Privacy policies with fuzzy thresholds like “when deemed appropriate” or “when requested” make it ambiguous whether a user’s right to privacy will be respected. Best practices for providers require not only clarity and a willingness to push back on law enforcement demands, but also a commitment to be transparent with the public about those demands, for example through regular transparency reports that break down the countries and states making data requests.
Social media services should also consider clear guidelines for finding and removing sock puppet accounts operated by law enforcement on the platform, as these serve as a backdoor to government surveillance.
Minimize Data Collection

You can't be compelled to disclose data you don’t have. If you collect lots of user data, law enforcement will eventually come demanding it. Operating a service typically requires some collection of user data, even if it’s just login information. The problem comes when information is collected beyond what is strictly necessary.
Excess collection may seem convenient or useful for running the service, or even potentially valuable, as with behavioral tracking used for advertising. However, the more that’s collected, the more the service becomes a target for both legal demands and illegal data breaches.
For data that enables desirable features for the user, design choices can make privacy the default and give users additional (preferably opt-in) sharing choices.
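As a rough illustration of what privacy-by-default, opt-in design can look like in practice, here is a minimal sketch in Python; the class and field names are hypothetical, not drawn from any particular service.

```python
from dataclasses import dataclass, field

@dataclass
class PrivacySettings:
    """Hypothetical per-user settings: every data-sharing feature starts off
    and only turns on if the user explicitly opts in."""
    share_read_receipts: bool = False
    include_in_public_directory: bool = False
    allow_usage_analytics: bool = False

@dataclass
class Account:
    """Collect only what the service strictly needs to operate."""
    username: str
    password_hash: str  # store a hash, never the plaintext password
    settings: PrivacySettings = field(default_factory=PrivacySettings)
    # Deliberately absent: phone number, birthday, precise location, contact list.

# A new account begins in its most private configuration by default.
alice = Account(username="alice", password_hash="<hashed>")
assert not alice.settings.allow_usage_analytics
```

The point is structural: data the service never asks for, and sharing that stays off unless the user turns it on, is data that cannot later be demanded.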
Shorter Retention

As another minimization strategy, hosts should regularly and automatically delete information when it is no longer necessary. For example, deleting logs of user activity can limit the scope of law enforcement’s retrospective surveillance—maybe limiting a court order to the last 30 days instead of the lifetime of the account.
Again, design choices, like giving users the ability to send disappearing messages and deleting messages from the server once they’re downloaded, can further limit the impact of future data requests. These features, too, should have privacy-preserving defaults.
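To make the retention idea concrete, here is a minimal sketch of a scheduled cleanup job. It assumes a hypothetical SQLite table named activity_log with a created_at timestamp column; a real service would adapt the schema, retention window, and scheduling to its own stack.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # assumed policy: keep activity logs for at most 30 days

def purge_old_logs(db_path: str = "service.db") -> int:
    """Delete activity-log rows older than the retention window.
    Meant to run automatically, e.g. from a daily scheduled job."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)
    with sqlite3.connect(db_path) as conn:
        cursor = conn.execute(
            "DELETE FROM activity_log WHERE created_at < ?",
            (cutoff.isoformat(),),
        )
        return cursor.rowcount  # number of rows actually removed
```

With a policy like this in place, a demand for "all logs for this account" can only ever reach back as far as the retention window.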
Avoid Data Sharing

Depending on the service being hosted, there may be some need to rely on another service to make everything work for users. Third-party login or ad services are common examples, each with some amount of tracking built in. Information shared with these third parties should be minimized, or the sharing avoided altogether, as those parties may not have a strict commitment to user privacy. Most notoriously, data brokers who sell advertising data give law enforcement another legal workaround: simply buying data collected across many apps. This extends to decisions about what information is made public by default, and thus accessible to many third parties, and whether that is clear to users.
Now that HTTPS is actually everywhere, most traffic between a service and a user can be easily secured—for free. This limits what onlookers can collect on users of the service, since messages between the two travel in a secure “envelope.” However, this doesn’t change the fact that the service opens this envelope before passing the message along to other users, or before returning it to the same user. Each opened message is more information the service has to defend.
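For a sense of how little is involved, here is a minimal sketch of serving content over TLS with Python's standard library, assuming a certificate and key already obtained for free from a certificate authority such as Let's Encrypt; the file paths and port are placeholders, and a production service would normally terminate TLS in its web server or load balancer instead.

```python
import ssl
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Load the site's certificate chain and private key (placeholder paths).
context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
context.load_cert_chain(certfile="fullchain.pem", keyfile="privkey.pem")

# Wrap a plain HTTP server's socket so all traffic is encrypted in transit.
server = HTTPServer(("0.0.0.0", 8443), SimpleHTTPRequestHandler)
server.socket = context.wrap_socket(server.socket, server_side=True)
server.serve_forever()
```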
Better still is end-to-end encryption (e2ee), which simply means giving users secure envelopes that even the service provider cannot open. This is how a featureful messaging app like Signal can respond to requests with only three pieces of information: the account identifier (phone number), the date of creation, and the last date of access. Many services should follow suit and limit access through encryption.
Note that while e2ee has become a popular marketing term, it is simply inaccurate for describing any encryption use designed to be broken or circumvented. Implementing “encryption backdoors” to break encryption when desired, or simply collecting information before or after the envelope is sealed on a user’s device (“client-side scanning”) is antithetical to encryption. Finally, note that e2ee does not protect against law enforcement obtaining the contents of communications should they gain access to any device used in the conversation, or if message history is stored on the server unencrypted.
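To make the sealed-envelope idea concrete, here is a minimal sketch using the PyNaCl library (one possible choice among many; real messaging protocols like Signal's are far more involved). The relay function stands in for a hypothetical service that stores and forwards ciphertext it has no key to open.

```python
from nacl.public import PrivateKey, Box

# Each user generates a keypair on their own device; only public keys are shared.
alice_key = PrivateKey.generate()
bob_key = PrivateKey.generate()

def relay(ciphertext: bytes) -> bytes:
    """Stands in for the service provider: it can store and forward the sealed
    envelope, but it holds no key that could open it."""
    return ciphertext

# Alice encrypts to Bob using her private key and his public key.
sealed = Box(alice_key, bob_key.public_key).encrypt(b"meet at the usual place")

# Bob opens the envelope on his own device; the relay only ever saw ciphertext.
opened = Box(bob_key, alice_key.public_key).decrypt(relay(sealed))
assert opened == b"meet at the usual place"
```

Even if this hypothetical relay were served with a warrant, the most it could hand over is ciphertext and the metadata around it.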
Protecting Yourself and Your Community

As outlined above, the security of your personal data often depends on the service providers you choose to use. But as a user you still have options. EFF’s Surveillance Self-Defense is a maintained resource with many detailed steps you can take. In short: assess your risks, limit the services you use to those you can trust (as much as you can), tighten your settings, and, when all else fails, accessorize with tools that prevent data sharing in the first place—like EFF’s Privacy Badger browser extension.
Remember that privacy is a team sport. It’s not enough to make these changes as an individual; it’s just as important to share what you know and educate others, and to fight for better digital privacy policy at all levels of governance. Learn, get organized, and take action.