QFM028: Irresponsible AI Reading List - July 2024
Photo by Florida-Guidebook.com on Unsplash
The July edition of the Irresponsible AI Reading List starts with a critique of AI's potentially overblown promises, as highlighted in Edward Zitron's Put Up Or Shut Up. Zitron vociferously condemns the tech industry's trend of promoting exaggerated claims about AI capabilities, noting that recent announcements, such as those from Lattice and OpenAI, often lack practical substance and evidence. His scepticism is echoed by discussions on outdated benchmark tests, which, according to Everyone Is Judging AI by These Tests, fail to accurately assess the nuanced capabilities of modern AI.
The misinformation theme extends to the realm of AI summarization. In When ChatGPT summarises, it actually does nothing of the kind, the limitations of ChatGPT in providing accurate summaries are examined, revealing that while it can shorten texts, it often misrepresents or omits key information due to its lack of genuine understanding.
Bias remains a critical concern, especially in high-stakes fields like medical diagnostics. A study on AI models analyzing medical images highlights significant biases against certain demographic groups, pointing to the limitations of current debiasing techniques and the broader issue of fairness in AI. Similarly, Surprising gender biases in GPT show how GPT language models can perpetuate traditional gender stereotypes, underscoring the ongoing need for improved bias mitigation strategies in AI systems.
The legal and ethical dimensions of AI also feature this month. The dismissal of most claims in the lawsuit against GitHub Copilot, as detailed in Coders' Copilot code-copying copyright claims crumble, reveals the complexities of intellectual property issues in AI training practices. Meanwhile, the historical perspective provided by The Troubled Development of Mass Exposure offers insights into the long-standing struggle between technological advancement and privacy rights, drawing parallels to current AI-driven concerns about data use and consent.
Lastly, ChatBug: Tricking AI Models into Harmful Responses presents a critical vulnerability in AI safety, highlighting how specific attacks can exploit weaknesses in AI's instruction tuning to produce harmful outputs.
As always, the Quantum Fax Machine Propellor Hat Key will guide your browsing. Enjoy!

Links
Understanding AI is crucial for policy makers to avoid ineffective legislation. SB 1047, currently being considered in California, aims to regulate AI for safety but lacks the necessary technical definitions, causing potential issues. By focusing on the difference between model 'release' and 'deployment', the article explains how current legislative language could negatively impact the open-source AI community and suggests ways to improve it.
Peter Gostev, Head of AI at Moonpig, critiques Forrester's Foundation Model assessment for its confusing weightings and scores, especially questioning the logic behind the rankings for core capabilities and for specific models, such as IBM's and Anthropic's. He highlights the overlap in categories and the odd results for well-regarded open-source models like Mistral.
Safe Superintelligence Inc. (SSI) has been established as the first lab dedicated solely to developing safe superintelligence. The company focuses on advancing superintelligent capabilities while ensuring safety measures remain a step ahead. Located in Palo Alto and Tel Aviv, SSI is recruiting top engineers and researchers to tackle this monumental challenge.
A NewsGuard audit found that top generative AI models, including ChatGPT-4 and Google's Gemini, repeated Russian disinformation narratives 32% of the time. These bots often cited fake local news sites as authoritative sources. The findings, amid the first AI-influenced election year, reveal how easily AI platforms can spread false information despite efforts to prevent this misuse.
Policy makers must grasp how AI works to effectively regulate it. The article uses SB 1047 as a case study, highlighting the differences between deployment and release of AI models. It emphasizes that regulating deployment, rather than release, would avoid stifling open source innovation and better align with safety goals.
After attempting to infuse garlic into olive oil without heating, a user discovered that tiny carbonation bubbles indicated the growth of a botulism culture, highlighting the potential danger of this method. Prompt with care and verify information, as this process can be hazardous.
The recent enthusiasm for AI has drawn comparisons to the dotcom bubble, with inflated stock prices and hype around AI technologies driving substantial investments. Some argue that while AI's long-term potential is significant, current market behaviors resemble the speculative frenzy of the dotcom era. Notably, Nvidia and other tech giants are at the forefront, but concerns persist about the sustainability of these high valuations and the possibility of market corrections if near-term expectations aren't met. The discussion highlights both the promise and potential pitfalls of today's AI boom.
A recent research paper from the University of Washington and the Allen Institute for AI has highlighted a critical vulnerability in Large Language Models (LLMs), including GPT, Llama, and Claude. The study reveals that chat templates used in instruction tuning can be exploited through attacks like format mismatch and message overflow, leading the models to produce harmful responses. This vulnerability, named ChatBug, was tested on several state-of-the-art LLMs, revealing high susceptibility and a need for improved safety measures.
A Harvard Business School working paper explores the challenges and limitations of expecting junior professionals to guide senior professionals in the use of emerging technologies like generative AI. The study, conducted with Boston Consulting Group, included interviews with junior consultants using GPT-4 and found that they often lack deep understanding and experience, making them ineffective in mitigating AI risks at a senior level. The findings suggest the need for more seasoned strategies and mitigation tactics focusing on system design and ecosystem-level changes.
Citigroup has issued a report revealing that AI could automate over half of the jobs in the banking sector, significantly transforming consumer finance and enhancing productivity. The bank notes that around 54% of banking roles have a high likelihood of automation, with another 12% potentially being augmented by AI technology. The report underscores the growing experimentation with AI by the world's largest banks, driven by the potential to improve staff productivity and reduce costs. This highlights a major shift within the banking industry towards AI-driven operations.
In this article, the author addresses a crucial question regarding the major challenges faced by the tech industry. They highlight the anxiety and mental health issues stemming from layoffs and fears of AI replacing programmers. The piece advises how leaders can mitigate these anxieties through open communication, positivity, and leveraging new technologies effectively.
This article argues for a new legal right to protect individuals from AI profiling based on publicly available data without their explicit informed consent. It develops three primary arguments dealing with social control, stigmatization, and the unique threat posed by AI profiling compared to other data processing methods. The article suggests that existing GDPR regulations are not sufficient and calls for explicit regulation with a sui generis right to protection.
Meta is now using facial recognition to verify the age of some users on Facebook and Instagram. This move comes amid growing political pressure to protect children's mental health, with both major Australian political parties expressing support for stricter age verification laws.
If you're using Facebook or Instagram, Meta is employing your data to enhance its AI models without giving you the choice to opt out. While EU users have the option to opt out due to stricter privacy laws, Australian users don't have this privilege. This has sparked backlash and calls for stronger privacy laws in Australia.
In the early days of a startup, a critical mistake involving ChatGPT-generated code cost over $10,000 in lost sales. The problem stemmed from a single hardcoded ID that caused unique ID collisions, preventing new users from subscribing. This story highlights the importance of robust testing and the perils of copy-pasting code in production environments.
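To make the failure mode concrete, here is a minimal sketch of how a single fixed ID can block every signup after the first. It is an assumption about the general shape of such a bug, not the startup's actual code: the names `FIXED_ID`, `new_subscription_buggy`, `new_subscription_fixed`, and `insert` are hypothetical, and a plain dict stands in for a database table with a unique-ID constraint.

```python
import uuid

# Hypothetical illustration only; not the code from the article.

# Buggy pattern: the ID is computed once, when the module loads,
# and every later call silently reuses the same value.
FIXED_ID = str(uuid.uuid4())


def new_subscription_buggy(email, sub_id=FIXED_ID):
    """Build a subscription row that always carries the same ID."""
    return {"id": sub_id, "email": email}


def new_subscription_fixed(email, sub_id=None):
    """Build a subscription row with a fresh ID generated per call."""
    return {"id": sub_id or str(uuid.uuid4()), "email": email}


def insert(table, row):
    """Mimic a unique-ID constraint using a dict keyed by ID."""
    if row["id"] in table:
        raise ValueError("duplicate id: " + row["id"])
    table[row["id"]] = row
```

With the buggy constructor, the first `insert` succeeds and every later one raises `ValueError`, which matches the summary's description of new users being unable to subscribe. A test that creates two subscriptions in a row would likely have caught it.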
Researchers from the University of Oxford have developed a method to identify when large language models (LLMs) like ChatGPT are confabulating, or providing false answers with confidence. The approach, which analyzes statistical uncertainty in responses, could help mitigate the issue of AI giving confidently incorrect answers by determining when the AI is unsure of the correct answer versus unsure of how to phrase it.
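The distinction between "unsure of the answer" and "unsure of the phrasing" can be sketched as an entropy calculated over meanings rather than over exact strings. The code below is a toy illustration under stated assumptions, not the researchers' implementation: `entropy_over_meanings` and `toy_meaning_key` are hypothetical names, and the crude "last word" key stands in for whatever meaning-equivalence check a real system would use.

```python
import math
from collections import Counter


def toy_meaning_key(answer):
    """Crude stand-in (assumption) for a real meaning-equivalence check:
    lowercase, strip trailing punctuation, keep only the last word."""
    return answer.lower().strip(" .!?").split()[-1]


def entropy_over_meanings(answers, meaning_key=toy_meaning_key):
    """Group sampled answers by meaning, then compute Shannon entropy
    (in nats) over the groups. Low entropy: the samples agree in
    substance even if worded differently. High entropy: they disagree."""
    counts = Counter(meaning_key(a) for a in answers)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Same meaning, different phrasing: one group, so entropy is zero.
consistent = ["Paris", "It is Paris.", "paris"]

# Three different answers: three groups, so entropy is log(3).
inconsistent = ["Paris", "Lyon", "Marseille"]
```

Here `entropy_over_meanings(consistent)` is zero, while `entropy_over_meanings(inconsistent)` equals `math.log(3)`, so only the second set would be flagged as a possible confabulation.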
This article revisits John B. Calhoun's 1968 experiment known as Universe 25, where a perfect society was created for mice. Despite optimal conditions, the mouse society collapsed due to social dysfunctions such as narcissism, aggression, and disengagement. The author draws parallels to modern human society and warns about the implications of tech-driven utopias created by Silicon Valley.
The Prince Charles Cinema in London cancelled the premiere of 'The Last Screenwriter,' a film with a script generated by ChatGPT 4.0, following a backlash from its audience. The filmmakers intended it as a contribution to the conversation surrounding AI in scriptwriting, but received 200 complaints. Despite the cancellation, a private screening for the cast and crew will go ahead.
Regards,
M@
[ED: If you'd like to sign up for this content as an email, click here to join the mailing list.]
Originally published on quantumfaxmachine.com and cross-posted on Medium.
hello@matthewsinclair.com | matthewsinclair.com | bsky.app/@matthewsinclair.com | masto.ai/@matthewsinclair | medium.com/@matthewsinclair | xitter/@matthewsinclair