Blog

  • Gen AI News Summary 04.11.24

    A bracing sail through the seas of generative AI, handily mapped out under the following headings:

    – AI and Content Creation

    – AI Models and Tools

    – AI and Search

    – AI and Avatars

    – AI and Agents

    – AI Marketing and Sales

    – AI and Retail

    – AI Health and Education

    – AI Adoption

    – AI Regulation and Ethics

    – And Finally…


    AI and Content Creation

    Stability AI’s Stable Diffusion 3.5

    Midjourney’s ‘Powerful’ AI image editor now lets you edit any image – including images from the web.

    More on Midjourney’s new edit any image feature

    How to use Midjourney’s new edit any image feature

    Stability AI has updated its image model with several new model variants, including Stable Diffusion 3.5 Large, Stable Diffusion 3.5 Large Turbo, and Stable Diffusion 3.5 Medium. These models are highly customisable, run on consumer hardware, and are free for both commercial and non-commercial use under the permissive Stability AI Community License.

    More on Stable Diffusion 3.5

    Recraft’s newest image model – the mystery AI that beat Midjourney and DALL-E in anonymous evaluations – can generate high-quality images with impressive detail and prompt fidelity. And finally, there’s an image generator that also does text well!

    More on Recraft’s impressive new image model

    Respected AI image start up Ideogram AI launches an ‘infinite canvas’ feature (not unlike the recently introduced GPT-4o with canvas). Ideogram canvas users can spread newly generated images out, compare them to older generations, resize and reorder them at will, and even combine multiple AI generated images into one new composite.

    More on Ideogram’s new canvas feature

    Since the Adobe Firefly Gen AI updates after last month’s Adobe MAX conference, some users have found that Adobe’s Gen AI tools in Adobe Camera Raw, Lightroom, and Photoshop have become less accurate.

    More on Adobe’s recent AI performance

    The “Apple Intelligence” roll out may have underwhelmed so far, but is Apple’s AI photo “clean up” tool better than Adobe’s? Some think so…

    More on Apple Intelligence photo clean up tool

    Canva launches Dream Lab, a powerful AI image generator for creatives, developed on top of the gen AI model Leonardo.Ai that Canva acquired earlier this year.

    More on Canva

    Meta’s Gen AI image model is being lauded for being easy to use. One simple prompt can make AI images “come alive” in Facebook Messenger and Instagram, apparently.

    More on AI images in Facebook Messenger and Instagram

    ElevenLabs now lets you create your very own custom voiceover voices from text prompts.

    More on ElevenLabs’ new custom voiceovers

    Jacob Collier, a Grammy-winning musician, has teamed up with Google DeepMind and Google Labs to create MusicFX DJ, an AI-powered music tool. The interface has been redesigned to encourage creativity and help users easily enter a “flow state” of artistic inspiration. MusicFX DJ is available now, “offering intuitive controls for all skill levels”.

    More on MusicFX DJ


    AI Models and Tools

    OpenAI’s ChatGPT-5. Coming soon?

    Oh yes they will…

    OpenAI plans to release its next big AI model by December. A report revealed that OpenAI would release its new ‘Orion’ frontier model (ie GPT-5, or whatever it will be called) by December, with Microsoft and other huge companies getting access before individuals.

    More on OpenAI’s plans to release the next version of ChatGPT

    And then another report, citing OpenAI CEO Sam Altman, said…

    Oh no they won’t!

    OpenAI CEO, Sam Altman, responded directly to the report on X, posting “fake news out of control”. An OpenAI spokesperson clarified that they have no plans for an “Orion” release this year but plan to release “a lot of other great technology.”

    OpenAI introduces an open source “factuality benchmark” to measure the factual accuracy of language models (the likelihood that they won’t hallucinate). The new benchmark is called SimpleQA.

    More on OpenAI’s factuality benchmark

    The Perplexity app for the Apple Mac desktop was released, making it more convenient to use the “Google search for research killer”, if you use a Mac…

    More on Perplexity’s new Mac desktop app

    Not to be outdone, Anthropic also released a desktop app for Claude, for both Mac and Windows.

    More on Claude’s new desktop apps

    Anthropic has also added PDF support to its Claude 3.5 Sonnet AI model in public beta, allowing it to process both the text and the images within PDF documents.

    More on Claude and PDF support

    After reports that OpenAI is planning to launch the next version of its flagship AI model in December, there is now a possibility that Google may be planning to launch the latest version of Gemini – Gemini 2.0 – in the same month.

    More on the possible December launch of the next version of Gemini

    Google is building controls into Gemini, so that Google smart home devices can be controlled with natural language.

    More on Gemini smart home device controls

    Apple finally launches Apple Intelligence, with the release of iOS 18.1, if you have a new enough iPhone, iPad or Mac, and you’re happy to set it to US English, for the moment. This initial version introduces a more natural-sounding Siri, major upgrades for Apple’s Photos app, including a new “Clean Up” tool, and systemwide Writing Tools to help users rewrite, proofread, and summarise text in apps like Mail, Messages and Notes. But it currently lacks the conversational abilities we’ve come to expect from an AI assistant, as well as the ChatGPT integration and user-created Genmoji features that many were expecting. Underwhelming, seems to be the general verdict. But, don’t bet against Apple, a latter-day tortoise in the race against the hare…

    More on the launch of Apple Intelligence

    First version of Apple Intelligence now available

    Meta has struck a multi-year deal with Reuters to use its news content to provide real-time answers to user queries about news and current events in its AI chatbots.

    More on Meta’s deal with Reuters

    Meta released new versions of its Llama 3.2 AI models that run up to four times faster and achieve a 56% reduction in model size compared to their original counterparts. These breakthroughs make it more feasible to run powerful AI features directly on a mobile phone.

    More on Meta’s new Llama 3.2 models

    Elon Musk-owned xAI has added image-understanding capabilities to its Grok AI model. This means that paid users on his social platform X, who have access to the AI chatbot, can upload an image and ask the AI questions about it.

    https://techcrunch.com/2024/10/28/xai-adds-image-understanding-capabilities-to-grok

    AI and Search

    OpenAI launch ChatGPT search

    OpenAI and search part I: OpenAI launches its web search engine, as a feature in ChatGPT, initially to premium users, and then to enterprise, education and free users in the coming weeks.

    More on the roll out of the ChatGPT web search feature

    OpenAI’s own summary of the ChatGPT web search feature

    OpenAI and search part II: ChatGPT now lets you search your old chats in the web app. OpenAI says “Only you have access to your conversation history, and OpenAI doesn’t use these conversations for training unless you explicitly consent by opting in.”

    More on OpenAI’s new conversation search feature

    Meta is reportedly developing its own search engine, to reduce its dependence on Google search and Microsoft Bing.

    More on Meta developing its own search engine


    AI and Avatars

    D-ID’s new real-time conversation avatars

    HeyGen rolled out new “Interactive Avatars” to allow you to have personalised AI-driven, “immersive, real-time conversations”. Users can either select a template avatar or create their own avatar for a specific use by selecting the option “All Avatars”.

    More on creating HeyGen interactive avatars

    Meanwhile… D-ID launched new high-quality avatars capable of real-time conversations.

    More on D-ID “real time conversation” avatars


    AI and Agents

    KPMG is developing AI agents

    Google is developing a computer-using agent – AI that can take over your web browser to complete tasks such as gathering research, purchasing a product or booking a flight. The product, code-named Project Jarvis, is thought to be similar to one Anthropic has just announced. Google plans to preview the product as early as December alongside the release of its next flagship Gemini large language model.

    More on Google developing a computer-using agent

    Move over Salesforce, Microsoft et al, Big Four accounting and consulting firm KPMG is developing AI agents and is interested in becoming a leader in the emerging AI agent space.

    More on KPMG developing AI agents

    AI agents will be at the centre of our digital worlds, dancing across our devices from smart glasses to cars, providing a consistent experience and adapting the way technology interacts with us.

    More on the approaching agentic AI world

    OpenAI is expected to launch agents in 2025. Salesforce’s CEO announced AI agents are the third wave of AI. Microsoft is adding agent capabilities to Copilot. The message here is clear: AI agents are going to be big, and leaders need to begin strategising how to incorporate this powerful technology into their organisations.

    More on why business leaders need to think about AI agents


    AI Marketing and Sales

    Amazon Ads AI Creative Studio

    A look at how leaders can maximise AI-driven sales strategies.

    More on AI driven sales strategies

    Harvard Business Review asks: “Can startups thrive in an age of AI?”

    More on how AI is transforming the start-up landscape

    Amazon announces new AI-powered image, audio and video tools for marketers making ads, as part of an AI strategy that drove up Amazon’s capital expenditure by 81% year on year in Q3.

    More on Amazon’s AI-powered advertising tools


    AI and Retail

    Perplexity is building an AI-powered shopping experience.

    Perplexity is quietly planning to take on Amazon by building an AI-powered shopping experience. The new ‘Pro Shop’ feature allows users to shop on Perplexity without leaving the platform.

    More on Perplexity’s plans for AI-powered shopping


    AI Health and Education

    NHS England to trial AI tool to predict patients’ risk of heart disease
    NHS England

    NHS England is to trial an AI tool that can predict patients’ risk of developing heart disease, and their risk of early death, using an electrocardiogram (ECG).

    More on an AI tool that can predict the risk of heart disease

    A new deep learning model, developed by the University of Texas Southwestern Medical Center, could lead to more timely and accurate cancer assessments, helping many patients avoid unnecessary surgery and improve outcomes.

    More on a new AI model that could reduce the need for surgery for cancer

    Biotech startup Iambic Therapeutics just revealed Enchant, an AI platform designed to predict how drug candidates perform in human trials before leaving the lab.

    More on AI platform that predicts clinical outcomes from drug discovery

    The parents of a high school senior in Massachusetts argued in court that their son was unfairly punished for using artificial intelligence while researching a history project, harming his prospects for acceptance to an elite college.

    More on parents sue school for unfair AI accusations

    A new research report by Common Sense Media found that about two thirds of the parents of kids who are using AI are oblivious to that fact. And, nearly half said they hadn’t spoken with their teenage kids about AI.

    More on kids, parents and AI use research report

    South Korea’s Ministry of Education plans to integrate AI into the public education system using digital textbooks that leverage AI to personalise learning experiences for each student.

    More on South Korea’s approach to using AI with students


    AI Adoption

    Professor Ethan Mollick, The Wharton School

    The Wharton School professor Ethan Mollick says companies must make organisational changes if they want to benefit from AI.

    More on Ethan Mollick and corporate AI implementation

    The Generative AI landscape shifted dramatically in 2024, according to a new research study. Nearly three in four executives (72%) report using gen AI at least once a week, up from 37% in 2023. The study, by AI at Wharton, a research centre at The Wharton School of the University of Pennsylvania, in collaboration with GBK Collective, reveals a dramatic rise in Gen AI adoption across key business functions, as companies move from cautious exploration to rapid integration.

    More on “AI at Wharton” study on Gen AI adoption

    Microsoft Copilot AI use extends deep into corporate America, but companies are not 100% sold.

    More on US corporate adoption of MS Copilot


    AI Regulation and Ethics

    X.AI will train Grok on your data
    X.AI

    Elon Musk’s xAI uses all your Twitter/X posts to train its AI model Grok…

    More on xAI training Grok on your X/Twitter posts

    Google open-sourced its watermarking tool for AI-generated text.

    More on Google’s open source AI watermarking tool for text

    Google announced it will add a note to photos people edit with AI tools, such as Zoom Enhance, Magic Eraser and Magic Editor, to aid transparency.

    More on Google’s AI photo edit note

    Several researchers raised concerns after finding that OpenAI’s Whisper transcription tool suffers from frequent hallucinations and invents text that never appears in recordings, despite being deployed extensively in healthcare settings. Over 30,000 medical professionals use Whisper-based tools despite OpenAI’s warnings against high-risk applications, according to an Associated Press report.

    More on the inaccuracy of OpenAI’s Whisper transcription tool

    Biden Administration issues first ever national security memorandum on artificial intelligence.

    More on Biden’s Memorandum on AI

    Chinese research institutions with ties to the Chinese People’s Liberation Army used Meta’s open-source Llama artificial intelligence model to develop an AI tool with potential military applications, Reuters reported, raising further concerns over how China’s government uses open-source AI models from U.S. companies to expand its military and intelligence capabilities.

    More on Reuters’ report on China’s army’s use of Meta’s open source models

    Big Tech is paving the way for a nuclear power breakthrough. Small modular reactors, made commercially viable by the processing needs of AI, could eventually make the power source cheaper, safer and faster to build.

    More on AI companies’ use of nuclear power


    And finally…

    Perplexity’s dedicated hub for U.S. general election information

    Perplexity announced a dedicated hub for U.S. general election information. Populated by data from The Associated Press and Democracy Works, the company described it in a blog as “an entry point for understanding key issues.”

    More on Perplexity’s US Election hub

  • Generative AI trends worth keeping an eye on in 2024

    Welcome to the first working Monday of 2024. AI may feel sooo 2023, given the explosion of interest generated by the public launch of ChatGPT. But interest is unlikely to dwindle in 2024, even if our focus moves on from “write a resignation letter to my boss in the style of a Hamilton hip hop song” to “help me define my target market segments, create personas and write my marketing plan…”

    So, which Generative AI trends are worth keeping an eye on in 2024? Here are just a few thoughts. Tell me what I’ve left out:

    Capability

    • Much of 2023 saw OpenAI’s competitors playing catch up with GPT-3.5, and then GPT-4. None of them quite made it, but Google claims its soon-to-be-launched Ultra version of its ChatGPT alternative, #Gemini, will beat #GPT-4 in some areas. That’s if OpenAI doesn’t launch GPT-4.5 first…
    • Why does it matter? At the launch of GPT-4 OpenAI said that it’s “more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5”. In practice, it became “multimodal” (in other words, it could handle visual, written and spoken instructions and responses), users could expand and tailor its use with the introduction of plugins and then GPTs (tailored versions of ChatGPT that users can create without any coding knowledge) and it came with DALL·E 3 – OpenAI’s image generator. These are just some of the ways the improvements to GPT increased ChatGPT’s capabilities in 2023. Expect more in 2024.
    • The drive for more capabilities more quickly will also fuel the debate as to which is better: open source language model development vs closed source. In other words, community collaboration vs proprietary focus. The closed source crowd will continue to be led by OpenAI – ironic given their name. In the open source world, Meta’s soon-to-be-launched #LLaMA-3 appears to have real competition in engaging developers in the form of French company Mistral AI’s catchily named “Mixtral 8x7B”. Curiously, Apple, not a company famous for its open source approach, released #Ferret (yes, you read that right), a generative AI model for vision-language tasks, as open source last October. Expect to see it incorporated into a turbo-charged version of Siri that we all wanted to see launched several years ago.

    Adoption

    • While companies will continue to be concerned about inaccuracy, confidentiality, security and copyright issues, adoption of Generative AI tools at work will accelerate, propelled by personal use, much as the iPhone’s initial enterprise adoption was driven by employees rather than company policy.
    • Adoption of Generative AI across functions and sectors will broaden. Marketing and Sales teams largely led the charge in 2023. “Industries relying most heavily on knowledge work are likely to see more disruption—and potentially reap more value” says McKinsey & Company in its “The state of AI in 2023: Generative AI’s Breakout” report. Expect to see an acceleration in adoption, especially in the banking, pharmaceuticals, medical, legal and accountancy sectors. But also in sectors that have been slower to the table, such as retail and manufacturing, as the value in more efficiently and effectively making use of customer insights to provide “hyper-customised customer experiences” becomes more accessible.

    Integration

    • Key to adoption will be an increased focus on integration. On the device front, we’re seeing innovations like Google’s Nano version of its multimodal model, #Gemini – designed for “on device experiences” aka mobile devices – and the surge in marketing of laptops as “AI PCs”, complete with “made for AI” chips by the likes of NVIDIA. We will also see more new devices, along the lines of Humane’s AI pin and the Rewind Pendant, as different parts of our body are deemed to be fit for purpose as AI containers. We can also expect big things (or maybe small and very intuitive to use things) in 2024 from the collaboration between #LoveFrom, the firm of Apple’s ex design guru Jony Ive, and Sam Altman’s OpenAI. At the end of last year Reuters reported that Tang Tan, “who led the design for the iPhone and Apple Watch”, is set to come on board in February.
    • We will also see more integration of generative AI into the enterprise systems that we already use. Leading the way is Microsoft’s #Copilot integration into Microsoft 365 Enterprise and Windows 11, and its recently announced new, dedicated Copilot key to be added to the Windows keyboard – the keyboard’s first change in 30 years.
    • Integration into the workflow will be propelled by the adoption of AI tools or agents. OpenAI’s launch of GPTs – tailored versions of ChatGPT for a specific use or task – is an example of those tools, and an “App Store” like platform to share and sell them is due from OpenAI this month. Google is rumoured to be launching a similar tool, under the Gemini brand, later this year.
    • We will also see more use made of Generative AI in business processes and decision making. More businesses will start using AI for strategic decision-making processes, such as analysing market trends, forecasting, and providing insights for high-level strategy and operational decisions, as well as recruitment and training, regulatory compliance and helping to adopt and embed better ethical and sustainable practices.

    From Images to Video

    • If the latter half of 2023 saw major advances in image generation – led by OpenAI’s DALL·E, Midjourney, Craiyon and #DreamStudio by Stability AI – then expect to see the same for video in 2024. This will likely be led by #Pika Labs, #Midjourney, Runway, Stable Video Diffusion, Synthesia and others. Image generation will see further adoption as DALL·E 3 has been integrated into ChatGPT Plus, Midjourney becomes accessible via a website rather than just Discord, and more people make greater use of image generators that are integrated into the design platform they already use, such as the Canva AI Image Generator and Adobe #Firefly.

    Loneliness and Companions

    • As Prof Scott Galloway has so forcefully argued, one of the biggest malaises to hit Western societies is loneliness. The Survey Center on American Life reports that in a US-based survey comparing 1990 and 2021 the percentage of people claiming they have fewer than three close friends doubled from 16% to 32%. The same survey reported that the percentage of people who claimed to have no close friends at all rose from 3% to 12%.
    • Perhaps not totally disconnected, the Character.AI website was usually to be found in the top ten AI websites by use in 2023. Character.AI lets you create “characters” and talk to them. In other words, #AIchatbot companions. Expect to see AI friends become more prevalent, as a synthetic antidote to loneliness, providing digital companionship.

    Legal, regulated and honest

    • Copyright, privacy, and regulation are key concerns and will remain so in 2024, as countries grapple with finding a balance between providing a stable, ethically acceptable environment for the development of AI technology and not impeding the competitiveness of their companies. All while trying to keep up with the fastest-developing technology ever. The EU’s proposed AI Act is a case in point. It emphasises a risk-based approach, and has been several years in the making, which means it was conceived before the explosion in Generative AI. President Macron, with an eye on the French company Mistral, is concerned that it will prove to be too restrictive in the face of foreign companies.
    • On the copyright front, the case The New York Times has brought against OpenAI, coupled with the report that Apple is spending approximately $50 million to license content from key international media bodies, including IAC, Condé Nast and NBC News, will help to determine the value of original content used to train AI models.
    • A US Presidential election, as well as national elections in a number of other countries, will bring the dangers of AI-created “deep fakes” sharply into focus in 2024. Developments like the ability of AI tools to pass the CAPTCHA test, highlighted by Professor Ethan Mollick at The Wharton School, show the very broad range of such challenges.
    • I’m not convinced 2024 will provide the answers, but the commercial opportunities for the “verification” market will be substantial. The recent debates on defining and detecting plagiarism in US universities are an obvious example. And, it almost goes without saying, but the value of trusted brands will only increase.

    Medicine and Health

    • I hope we see more AI-fuelled health and medical breakthroughs in 2024, for example to identify the earliest signs of various cancers, develop new vaccines, predict cardiac arrest, mitigate the risk of diseases like HIV, uncover genetic predictors of diseases such as Alzheimer’s and provide basic, affordable health care to areas of the world that still lack it. Bill Gates’s blog post, “The road ahead reaches a turning point in 2024”, provided a reasonably optimistic vision. I hope he’s right.