The Intercept, Raw Story, and AlterNet Sue OpenAI and Microsoft

The Intercept, Raw Story, and AlterNet have filed separate lawsuits against OpenAI and Microsoft, alleging copyright infringement and the removal of copyright information while training AI models. The Verge reports: The publications said ChatGPT “at least some of the time” reproduces “verbatim or nearly verbatim copyright-protected works of journalism without providing author, title, copyright or terms of use information contained in those works.” According to the plaintiffs, if ChatGPT trained on material that included copyright information, the chatbot “would have learned to communicate that information when providing responses.”

Raw Story and AlterNet’s lawsuit goes further (PDF), saying OpenAI and Microsoft “had reason to know that ChatGPT would be less popular and generate less revenue if users believed that ChatGPT responses violated third-party copyrights.” Both Microsoft and OpenAI offer legal cover to paying customers in case they get sued for violating copyright for using Copilot or ChatGPT Enterprise. The lawsuits say that OpenAI and Microsoft are aware of potential copyright infringement. As evidence, the publications point to how OpenAI offers an opt-out system so website owners can block content from its web crawlers. The New York Times also filed a lawsuit in December against OpenAI, claiming ChatGPT faithfully reproduces journalistic work. OpenAI claims the publication exploited a bug on the chatbot to regurgitate its articles.
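
For readers curious what such an opt-out looks like in practice, here is a minimal sketch using Python's standard-library robots.txt parser. It assumes the opt-out is expressed through robots.txt rules addressed to GPTBot, the user-agent token OpenAI has published for its crawler. The sample rules, the example.com URL, and the `can_crawl` helper are illustrative assumptions and do not come from the lawsuits.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve: block GPTBot, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://example.com/articles/1"  # placeholder URL
    print("GPTBot allowed:", can_crawl(ROBOTS_TXT, "GPTBot", article))
    print("Other bot allowed:", can_crawl(ROBOTS_TXT, "SomeOtherBot", article))
```

Note that robots.txt is advisory: it only has an effect if the crawler chooses to honor it.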

Read more of this story at Slashdot.

Google Admits Gemini Is ‘Missing the Mark’ With Image Generation of Historical People

Google’s Gemini AI chatbot is under fire for generating historically inaccurate images, particularly when depicting people from different eras and nationalities. Google acknowledges the issue and is actively working to refine Gemini’s accuracy, emphasizing that while diversity in image generation is valued, adjustments are necessary to meet historical accuracy standards. 9to5Google reports: The Twitter/X post in particular that brought this issue to light showed prompts to Gemini asking for the AI to generate images of Australian, American, British, and German women. All four prompts resulted in images of women with darker skin tones, which, as Google’s Jack Krawczyk pointed out, is not incorrect, but may not be what is expected.

But a bigger issue that was noticed in the wake of that post was that Gemini also struggles to accurately depict human beings in a historical context, with those being depicted often having darker skin tones or being of particular nationalities that are not historically accurate. Google, in a statement posted to Twitter/X, admits that Gemini AI image generation is “missing the mark” on historical depictions and that the company is working to improve it. Google also does say that the diversity represented in images generated by Gemini is “generally a good thing,” but it’s clear some fine-tuning needs to happen. Further reading: Why Google’s new AI Gemini accused of refusing to acknowledge the existence of white people (The Daily Dot)


Thanks to Machine Learning, Scientists Finally Recover Text From the Charred Scrolls of Vesuvius

The great libraries of the ancient classical world are “legendary… said to have contained stacks of texts,” writes ScienceAlert. But from Rome to Constantinople, Athens to Alexandria, only one collection survived to the present day.

And here in 2024, “we can now start reading its contents.”

A worldwide competition to decipher the charred texts of the Villa of Papyri — an ancient Roman mansion destroyed by the eruption of Mount Vesuvius — has revealed a timeless infatuation with the pleasures of music, the color purple, and, of course, the zingy taste of capers. The so-called Vesuvius challenge was launched a few years ago by computer scientist Brent Seales at the University of Kentucky with support from Silicon Valley investors. The ongoing ‘master plan’ is to build on Seales’ previous work and read all 1,800 or so charred papyri from the ancient Roman library, starting with scrolls labeled 1 to 4.

In 2023, the annual gold prize was awarded to a team of three students, who recovered four passages containing 140 characters — the longest extractions yet. The winners are Youssef Nader, Luke Farritor, and Julian Schilliger. “After 275 years, the ancient puzzle of the Herculaneum Papyri has been solved,” reads the Vesuvius Challenge Scroll Prize website. “But the quest to uncover the secrets of the scrolls is just beginning….” Only now, with the advent of X-ray tomography and machine learning, can their inky words be pulled from the darkness of carbon.

A few months ago, students deciphered a single word, “purple,” according to the article. But “that winning code was then made available for all competitors to build upon.”

Within three months, passages in Latin and Greek were blooming from the blackness, almost as if by magic. The team with the most readable submission at the end of 2023 included both previous finders of the word ‘purple’. Their unfurling of scroll 1 is truly impressive and includes more than 11 columns of text. Experts are now rushing to translate what has been found. So far, about 5 percent of the scroll has been unrolled and read to date. It is not a duplicate of past work, scholars of the Vesuvius Challenge say, but a “never-before-seen text from antiquity.”

One line reads: “In the case of food, we do not right away believe things that are scarce to be absolutely more pleasant than those which are abundant.”

Thanks to davidone (Slashdot reader #12,252) for sharing the article.


Microsoft President: ‘You Can’t Believe Every Video You See or Audio You Hear’

“We’re currently witnessing a rapid expansion in the abuse of these new AI tools by bad actors,” writes Microsoft president Brad Smith, “including through deepfakes based on AI-generated video, audio, and images.

“This trend poses new threats for elections, financial fraud, harassment through nonconsensual pornography, and the next generation of cyber bullying.” Microsoft found its own tools being used in a recently publicized episode, and Smith writes that “We need to act with urgency to combat all these problems.”

Microsoft’s blog post says they’re “committed as a company to a robust and comprehensive approach,” citing six different areas of focus:

– A strong safety architecture. This includes “ongoing red team analysis, preemptive classifiers, the blocking of abusive prompts, automated testing, and rapid bans of users who abuse the system… based on strong and broad-based data analysis.”
– Durable media provenance and watermarking. (“Last year at our Build 2023 conference, we announced media provenance capabilities that use cryptographic methods to mark and sign AI-generated content with metadata about its source and history.”)
– Safeguarding our services from abusive content and conduct. (“We are committed to identifying and removing deceptive and abusive content” hosted on services including LinkedIn and Microsoft’s Gaming network.)
– Robust collaboration across industry and with governments and civil society. This includes “others in the tech sector” and “proactive efforts” with both civil society groups and “appropriate collaboration with governments.”
– Modernized legislation to protect people from the abuse of technology. “We look forward to contributing ideas and supporting new initiatives by governments around the world.”
– Public awareness and education. “We need to help people learn how to spot the differences between legitimate and fake content, including with watermarking. This will require new public education tools and programs, including in close collaboration with civil society and leaders across society.”

Thanks to long-time Slashdot reader theodp for sharing the article.


Will ‘Precision Agriculture’ Be Harmful to Farmers?

Modern U.S. farming is being transformed by precision agriculture, writes Paul Roberts, the founder and editor in chief of Security Ledger.

There are autonomous tractors and “smart spraying” systems that use AI-powered cameras to identify weeds, just for starters. “Among the critical components of precision agriculture: Internet- and GPS-connected agricultural equipment, highly accurate remote sensors, ‘big data’ analytics and cloud computing…”

As with any technological revolution, however, there are both “winners” and “losers” in the emerging age of precision agriculture… Precision agriculture, once broadly adopted, promises to further reduce the need for human labor to run farms. (Autonomous equipment means you no longer even need drivers!) However, the risks it poses go well beyond a reduction in the agricultural work force. First, as the USDA notes on its website: the scale and high capital costs of precision agriculture technology tend to favor large, corporate producers over smaller farms. Then there are the systemic risks to U.S. agriculture of an increasingly connected and consolidated agriculture sector, with a few major OEMs having the ability to remotely control and manage vital equipment on millions of U.S. farms… (Listen to my podcast interview with the hacker Sick Codes, who reverse engineered a John Deere display to run the Doom video game for insights into the company’s internal struggles with cybersecurity.)

Finally, there are the reams of valuable and proprietary environmental and operational data that farmers collect, store and leverage to squeeze the maximum productivity out of their land. For centuries, such information resided in farmers’ heads, or on written or (more recently) digital records that they owned and controlled exclusively, typically passing that knowledge and data down to succeeding generations of farm owners. Precision agriculture technology greatly expands the scope, and granularity, of that data. But in doing so, it also wrests it from the farmer’s control and shares it with equipment manufacturers and service providers, often without the explicit understanding of the farmers themselves, and almost always without monetary compensation to the farmer for the data itself. In fact, the Federal Government is so concerned about farm data they included a section (1619) on “information gathering” into the latest farm bill.

Over time, this massive transfer of knowledge from individual farmers or collectives to multinational corporations risks beggaring farmers by robbing them of one of their most vital assets: data, and turning them into little more than passive caretakers of automated equipment managed, controlled and accountable to distant corporate masters.

Weighing in is Kevin Kenney, a vocal advocate for the “right to repair” agricultural equipment (and also an alternative fuel systems engineer at Grassroots Energy LLC). In the interview, he warns about the dangers of tying repairs to factory-installed firmware, and argues that it’s the long-time farmer’s “trade secrets” that are really being harvested today. The ultimate beneficiary could end up being the current “cabal” of tractor manufacturers.

“While we can all agree that it’s coming…the question is who will own these robots?”

First, we need to acknowledge that there are existing laws on the books which, for whatever reason, are not being enforced. The FTC should immediately start an investigation into John Deere and the rest of the ‘Tractor Cabal’ to see to what extent farmers’ farm data security and privacy are being compromised. This directly affects national food security because if thousands or tens of thousands of tractors are hacked and disabled or their data is lost, crops left to rot in the fields would lead to bare shelves at the grocery store… I think our universities have also been delinquent in grasping and warning farmers about the data-theft being perpetrated on farmers’ operations throughout the United States and other countries by makers of precision agricultural equipment.

Thanks to long-time Slashdot reader chicksdaddy for sharing the article.


Scientists Propose AI Apocalypse Kill Switches

A paper (PDF) from researchers at the University of Cambridge, supported by voices from numerous academic institutions and from OpenAI, proposes remote kill switches and lockouts as methods to mitigate risks associated with advanced AI technologies. It also recommends tracking AI chip sales globally. The Register reports: The paper highlights numerous ways policymakers might approach AI hardware regulation. Many of the suggestions, including those designed to improve visibility and limit the sale of AI accelerators, are already playing out at a national level. Last year US president Joe Biden put forward an executive order aimed at identifying companies developing large dual-use AI models as well as the infrastructure vendors capable of training them. If you’re not familiar, “dual-use” refers to technologies that can serve double duty in civilian and military applications. More recently, the US Commerce Department proposed regulation that would require American cloud providers to implement more stringent “know-your-customer” policies to prevent persons or countries of concern from getting around export restrictions. This kind of visibility is valuable, researchers note, as it could help to avoid another arms race, like the one triggered by the missile gap controversy, where erroneous reports led to a massive buildup of ballistic missiles. While valuable, they warn that executing on these reporting requirements risks invading customer privacy and even leading to sensitive data being leaked.

Meanwhile, on the trade front, the Commerce Department has continued to step up restrictions, limiting the performance of accelerators sold to China. But, as we’ve previously reported, while these efforts have made it harder for countries like China to get their hands on American chips, they are far from perfect. To address these limitations, the researchers have proposed implementing a global registry for AI chip sales that would track them over the course of their lifecycle, even after they’ve left their country of origin. Such a registry, they suggest, could incorporate a unique identifier into each chip, which could help to combat smuggling of components.

At the more extreme end of the spectrum, researchers have suggested that kill switches could be baked into the silicon to prevent their use in malicious applications. […] The academics are clearer elsewhere in their study, proposing that processor functionality could be switched off or dialed down by regulators remotely using digital licensing: “Specialized co-processors that sit on the chip could hold a cryptographically signed digital “certificate,” and updates to the use-case policy could be delivered remotely via firmware updates. The authorization for the on-chip license could be periodically renewed by the regulator, while the chip producer could administer it. An expired or illegitimate license would cause the chip to not work, or reduce its performance.” In theory, this could allow watchdogs to respond faster to abuses of sensitive technologies by cutting off access to chips remotely, but the authors warn that doing so isn’t without risk. The implication being, if implemented incorrectly, that such a kill switch could become a target for cybercriminals to exploit.
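
The licensing idea quoted above can be illustrated with a small sketch. This is not the paper's design: it assumes a simplified setup in which a regulator issues a license containing a chip ID and an expiry time, and the chip refuses to run at full capability once the license is expired or fails verification. For brevity the sketch uses an HMAC with a shared secret from Python's standard library as a stand-in for a real digital signature; an actual scheme would presumably use asymmetric signatures so the chip holds only a public key. All names (`issue_license`, `license_is_valid`, the chip IDs) are hypothetical.

```python
import hashlib
import hmac
import json


def issue_license(key: bytes, chip_id: str, expires_at: float) -> dict:
    """Regulator side (hypothetical): bind a chip ID to an expiry time and tag it."""
    payload = json.dumps(
        {"chip_id": chip_id, "expires_at": expires_at}, sort_keys=True
    ).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def license_is_valid(key: bytes, lic: dict, chip_id: str, now: float) -> bool:
    """Chip side (hypothetical): accept only an untampered, unexpired license for this chip."""
    expected = hmac.new(key, lic["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, lic["tag"]):
        return False  # tag mismatch: forged or modified license
    claims = json.loads(lic["payload"])
    return claims["chip_id"] == chip_id and now < claims["expires_at"]


if __name__ == "__main__":
    key = b"demo-regulator-key"  # placeholder secret for illustration only
    lic = issue_license(key, "chip-001", expires_at=1000.0)
    print(license_is_valid(key, lic, "chip-001", now=999.0))
    print(license_is_valid(key, lic, "chip-001", now=1000.0))
```

Periodic renewal then amounts to issuing a fresh license with a later expiry. It also shows the risk the authors flag: whoever controls the signing key controls the chip.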

Another proposal would require multiple parties to sign off on potentially risky AI training tasks before they can be deployed at scale. “Nuclear weapons use similar mechanisms called permissive action links,” they wrote. For nuclear weapons, these security locks are designed to prevent one person from going rogue and launching a first strike. For AI however, the idea is that if an individual or company wanted to train a model over a certain threshold in the cloud, they’d first need to get authorization to do so. Though a potent tool, the researchers observe that this could backfire by preventing the development of desirable AI. The argument seems to be that while the use of nuclear weapons has a pretty clear-cut outcome, AI isn’t always so black and white. But if this feels a little too dystopian for your tastes, the paper dedicates an entire section to reallocating AI resources for the betterment of society as a whole. The idea being that policymakers could come together to make AI compute more accessible to groups unlikely to use it for evil, a concept described as “allocation.”
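
As a rough illustration of the multi-party sign-off described above, the sketch below gates a training job on approvals from a minimum number of recognized parties once the job exceeds a compute threshold. The threshold value, the party names, and the `TrainingRequest` and `is_authorized` names are all hypothetical; the article does not specify any of them.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingRequest:
    """A hypothetical request to run a training job of a given size."""
    compute_flop: float
    approvals: set = field(default_factory=set)


def is_authorized(
    request: TrainingRequest,
    threshold_flop: float,
    authorized_parties: set,
    required_approvals: int,
) -> bool:
    """Jobs below the threshold run freely; larger ones need enough valid sign-offs."""
    if request.compute_flop < threshold_flop:
        return True
    # Count only approvals from recognized parties, ignoring unknown signers.
    valid = request.approvals & authorized_parties
    return len(valid) >= required_approvals


if __name__ == "__main__":
    parties = {"regulator", "cloud_provider", "auditor"}  # illustrative names
    job = TrainingRequest(compute_flop=1e27, approvals={"regulator"})
    print(is_authorized(job, 1e26, parties, required_approvals=2))
    job.approvals.add("auditor")
    print(is_authorized(job, 1e26, parties, required_approvals=2))
```

Requiring more than one approver is what prevents a single party from acting alone, which is the property borrowed from permissive action links.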


Largest Text-To-Speech AI Model Yet Shows ‘Emergent Abilities’

Devin Coldewey reports via TechCrunch: Researchers at Amazon have trained the largest text-to-speech model yet, which they claim exhibits “emergent” qualities improving its ability to speak even complex sentences naturally. The breakthrough could be what the technology needs to escape the uncanny valley. These models were always going to grow and improve, but the researchers specifically hoped to see the kind of leap in ability that we observed once language models got past a certain size. For reasons unknown to us, once LLMs grow past a certain point, they start being way more robust and versatile, able to perform tasks they weren’t trained to. That is not to say they are gaining sentience or anything, just that past a certain point their performance on certain conversational AI tasks hockey sticks. The team at Amazon AGI (no secret what they’re aiming at) thought the same might happen as text-to-speech models grew as well, and their research suggests this is in fact the case.

The new model is called Big Adaptive Streamable TTS with Emergent abilities, which they have contorted into the abbreviation BASE TTS. The largest version of the model uses 100,000 hours of public domain speech, 90% of which is in English, the remainder in German, Dutch and Spanish. At 980 million parameters, BASE-large appears to be the biggest model in this category. They also trained 400M- and 150M-parameter models based on 10,000 and 1,000 hours of audio respectively, for comparison — the idea being, if one of these models shows emergent behaviors but another doesn’t, you have a range for where those behaviors begin to emerge. As it turns out, the medium-sized model showed the jump in capability the team was looking for, not necessarily in ordinary speech quality (it is reviewed better but only by a couple points) but in the set of emergent abilities they observed and measured. Here are examples of tricky text mentioned in the paper:

– Compound nouns: The Beckhams decided to rent a charming stone-built quaint countryside holiday cottage.
– Emotions: “Oh my gosh! Are we really going to the Maldives? That’s unbelievable!” Jennie squealed, bouncing on her toes with uncontained glee.
– Foreign words: Mr. Henry, renowned for his mise en place, orchestrated a seven-course meal, each dish a piece de resistance.
– Paralinguistics (i.e. readable non-words): “Shh, Lucy, shhh, we mustn’t wake your baby brother,” Tom whispered, as they tiptoed past the nursery.
– Punctuations: She received an odd text from her brother: ‘Emergency @ home; call ASAP! Mom & Dad are worried… #familymatters.’
– Questions: But the Brexit question remains: After all the trials and tribulations, will the ministers find the answers in time?
– Syntactic complexities: The movie that De Moya who was recently awarded the lifetime achievement award starred in 2022 was a box-office hit, despite the mixed reviews.

You can read more examples of these difficult texts being spoken naturally here.


In Big Tech’s Backyard, a California State Lawmaker Unveils a Landmark AI Bill

An anonymous reader shared this report from the Washington Post:

A California state lawmaker introduced a bill on Thursday aiming to force companies to test the most powerful artificial intelligence models before releasing them — a landmark proposal that could inspire regulation around the country as state legislatures increasingly tackle the swiftly evolving technology.

The new bill, sponsored by state Sen. Scott Wiener, a Democrat who represents San Francisco, would require companies training new AI models to test their tools for “unsafe” behavior, institute hacking protections and develop the tech in such a way that it can be shut down completely, according to a copy of the bill. AI companies would have to disclose testing protocols and what guardrails they put in place to the California Department of Technology. If the tech causes “critical harm,” the state’s attorney general can sue the company.

Wiener’s bill comes amid an explosion of state bills addressing artificial intelligence, as policymakers across the country grow wary that years of inaction in Congress have created a regulatory vacuum that benefits the tech industry. But California, home to many of the world’s largest technology companies, plays a singular role in setting precedent for tech industry guardrails. “You can’t work in software development and ignore what California is saying or doing,” said Lawrence Norden, the senior director of the Brennan Center’s Elections and Government Program… Wiener says he thinks the bill can be passed by the fall.

The article notes there are now 407 AI-related bills “active in 44 U.S. states” (according to an analysis by an industry group called BSA the Software Alliance), with several already signed into law. “The proliferation of state-level bills could lead to greater industry pressure on Congress to pass AI legislation, because complying with a federal law may be easier than responding to a patchwork of different state laws.”

Even the proposed California law “largely builds off an October executive order by President Biden,” according to the article, “that uses emergency powers to require companies to perform safety tests on powerful AI systems and share those results with the federal government. The California measure goes further than the executive order, to explicitly require hacking protections, protect AI-related whistleblowers and force companies to conduct testing.”

They also add that as America’s most populous state, “California has unique power to set standards that have impact across the country.” And the group behind last year’s statement on AI risk helped draft the legislation, according to the article, though Wiener says he also consulted tech workers, CEOs, and activists. “We’ve done enormous stakeholder outreach over the past year.”
