Author: IBL News

  • Adobe Launches an AI Assistant for PDF Documents

    IBL News | San Diego

    Adobe introduced this week an AI assistant, in beta, in Reader and Acrobat that instantly generates summaries and insights from long PDF documents. It also suggests and answers questions based on a PDF’s content through an intuitive conversational interface.

    The AI assistant generates citations with sources, and formats information for sharing in emails, reports, and presentations. Clickable links help users quickly find information in long documents.

    This feature will be sold through a new add-on subscription plan when AI Assistant is out of beta.

    “Our AI Assistant is bringing generative AI to the masses, unlocking new value from the information inside the approximately 3 trillion PDFs in the world,” stated Adobe.

    This assistant leverages the same AI and machine learning models behind Acrobat Liquid Mode, the technology that supports responsive reading experiences for PDFs on mobile.

    PDF was invented by Adobe thirty years ago, and Adobe today remains the standard for reading, editing, and transforming PDFs.

    Currently, the new AI Assistant features are available in beta for Acrobat Standard and Pro Individual and Teams subscription plans on desktop and web in English, with features coming to Reader desktop customers in English over the next few weeks – all at no additional cost. Other languages will follow. A private beta is available for enterprise customers.

    Adobe: How people are using AI Assistant (YouTube videos)

  • Google Open-Sources a Small Model of Gemini

    IBL News | San Diego

    Google yesterday released Gemma 2B and 7B, two lightweight, pre-trained open-source AI models, mostly suited to small applications such as simple chatbots or summarization.

    The release also gives developers access to the research and technology used to create the closed Gemini models.

    The models are available via Kaggle, Hugging Face, Nvidia’s NeMo, and Google’s Vertex AI, and were designed with Google’s AI Principles at the forefront.

    Gemma supports multi-framework Keras 3.0, native PyTorch, JAX, and Hugging Face Transformers.

    Developers and researchers can work with Gemma using free access in Kaggle, a free tier for Colab notebooks, and $300 in credits for first-time Google Cloud users. Researchers can also apply for Google Cloud credits of up to $500,000 to accelerate their projects.

    Each size of Gemma is available at ai.google.dev/gemma.

    Google is also providing toolchains for inference and supervised fine-tuning (SFT) across all major frameworks: JAX, PyTorch, and TensorFlow through native Keras 3.0.

    Google’s Gemini comes in several sizes, including Gemini Nano, Gemini Pro, and Gemini Ultra.

    Last week, Google announced a faster Gemini 1.5 intended for business users and developers.

  • D2L Brightspace Pilots Generative AI Tools for Teachers

    IBL News | New York

    D2L Brightspace presented to selected customers new generative AI tools for generating practice and quiz questions as part of its LMS platform.

    With this new program, which will be in testing mode through the summer of 2024, teachers can review quiz questions before making them available to students, giving them greater oversight and control.

    The beta program is based on D2L’s “Responsible AI Principles” document.

    “This automated question generation capability can make it easier for instructors to assess learners in the moment. It is the initial step in expanding our product roadmap with cutting-edge generative AI to help change the way the world learns,” said Stephen Laster, D2L president.

  • Khanmigo Struggles with Basic Math, a Report Showed

    IBL News | New York

    Khanmigo, Khan Academy’s ChatGPT-powered tutoring bot, makes frequent calculation errors, The Wall Street Journal reported after testing it. “We tested an AI tutor for kids. It struggled with basic math,” wrote the paper.

    Last year, educator Sal Khan promised to “give every student on the planet an artificially intelligent but amazing personal tutor.”

    “Asking ChatGPT to do math is sort of like asking a goldfish to ride a bicycle—it’s just not what ChatGPT is for,” said Tom McCoy, a professor at Yale University who studies AI.

    According to the paper, “Khanmigo frequently made basic arithmetic errors, miscalculating subtraction problems such as 343 minus 17. It also didn’t consistently know how to round answers or calculate square roots. Khanmigo typically didn’t correct mistakes when asked to double-check solutions.”

    Now being piloted by about 65,000 students in 44 school districts, Khanmigo emphasizes to students and teachers that it is imperfect.

    Sal Khan said he expects “a million or two million” students to be using it by next school year at a price to schools of $35 a student.

    Unlike ChatGPT, Khanmigo is trained not to give students the right answer but to guide them through problems. It offers tutoring in third grade and up in math, language arts, history and science. It can give feedback on student essays, engage in simulated dialogue as famous literary characters and debate contemporary issues.

    In testing the product, the WSJ asked Khanmigo for help finding the length of the third side of a right triangle, a problem that students would likely encounter in eighth-grade math.

    Khanmigo correctly identified the Pythagorean theorem, a² + b² = c², as crucial to finding the answer. When asked for the solution, the bot offered responses such as: “I’m here to help you learn, not just give answers!”

    But Khanmigo struggled with math operations. When trying to solve a right triangle with a hypotenuse of 27 units and a side of 17, a reporter offered the wrong answer (430 rather than 440) to 27² minus 17². “Excellent!” Khanmigo responded. Later, it accepted an incorrect answer for the square root of 440.

    In another instance, Khanmigo constructed its own triangle problem with a hypotenuse of 15 units and a leg of nine. But when a reporter correctly said that 15² minus 9² equals 144, Khanmigo suggested the response was wrong. “I see where you’re coming from, but let’s take another look at the subtraction,” it said.
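    The arithmetic in both exchanges is easy to verify directly; a quick check in plain Python, using the numbers reported in the article:

```python
import math

# The WSJ's test triangle: hypotenuse 27, one leg 17.
leg_squared = 27**2 - 17**2
print(leg_squared)               # 729 - 289 = 440, not the reporter's 430
print(round(math.sqrt(440), 2))  # the missing leg: ~20.98

# Khanmigo's own problem: hypotenuse 15, leg 9.
print(15**2 - 9**2)    # 225 - 81 = 144: the answer Khanmigo rejected
print(math.isqrt(144)) # 12, the remaining leg
```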

  • Amazon Launched ‘Rufus’, an AI-Powered Shopping Assistant

    IBL News | New York

    Amazon launched this month, in beta, an AI-powered shopping assistant called ‘Rufus’, trained on the company’s product catalog, customer reviews, community Q&As, and information from across the web. It’s available to select customers when they next update their Amazon Shopping app.

    Built on an internal LLM specialized for shopping, this conversational assistant answers customer questions on a variety of shopping needs and products, provides comparisons, and makes recommendations based on conversational context.

    “Rufus knows Amazon’s selection inside and out, and can bring it all together with information from across the web to help them make more informed purchase decisions,” said Amazon.

    “With Rufus, customers can:

    • Learn what to look for while shopping product categories: Customers can conduct more general product research on Amazon, asking questions such as “what to consider when buying headphones?”, “what to consider when detailing my car at home?”, or “what are clean beauty products?” and receive helpful information to guide their shopping mission.
    • Shop by occasion or purpose: Customers can search for and discover products based on activity, event, purpose, and other specific use cases by asking a range of questions such as “what do I need for cold weather golf?” or “I want to start an indoor garden.” Rufus suggests shoppable product categories—from golf base layers, jackets, and gloves to seed starters, potting mix, and grow lights—and related questions that customers can click on to conduct more specific searches.
    • Get help comparing product categories: Customers can now ask “what’s the difference between lip gloss and lip oil?” or “compare drip to pour-over coffee makers” so they can find the type of product that best suits their needs and make even more confident purchase decisions.
    • Find the best recommendations: Customers can ask for recommendations for exactly what they need, such as “what are good gifts for Valentine’s Day?” or “best dinosaur toys for a 5-year-old.” Rufus generates results tailored to the specific question and makes it quick and easy for customers to browse more refined results.
    • Ask questions about a specific product while on a product detail page: Customers can use Rufus to quickly get answers to specific questions about individual products when they are viewing the product’s detail page, such as “is this pickleball paddle good for beginners?”, or “is this jacket machine washable?”, or “is this cordless drill easy to hold?”. Rufus will generate answers based on listing details, customer reviews, and community Q&As.”

    TechCrunch noted that Amazon’s AI chatbot for businesses, Q, has struggled, producing hallucinations (false information) and revealing confidential data.

  • OpenAI Triples Its Valuation and Now It’s Valued at $80 Billion

    IBL News | New York

    OpenAI plans to sell existing shares in a tender offer led by venture firm Thrive Capital, in a deal that values the San Francisco-based company at $80 billion, tripling its valuation in less than ten months, The New York Times reported.

    Under this deal, employees will be able to cash out their shares.

    Early last year, the venture capital firms Thrive Capital, Sequoia Capital, Andreessen Horowitz, and K2 Global agreed to buy OpenAI shares in a tender offer, valuing the company at around $29 billion.

    Last January, Microsoft invested $10 billion in OpenAI, bringing its total investment in the San Francisco start-up to $13 billion.

    Since then, Anthropic, an OpenAI rival, has raised $6 billion from Google and Amazon. Cohere, a start-up founded by former Google researchers, raised $270 million, bringing its total funding to more than $440 million. Inflection AI, founded by a former Google executive, also raised a $1.3 billion round, bringing its total to $1.5 billion.

  • OpenAI’s Sora Can Generate Video Games, Too

    IBL News | New York

    OpenAI’s new video-generating AI model ‘Sora’, which is able to generate up to a minute of 1080p video, can render video games, too.

    A research paper published by OpenAI yesterday points to the ability of Sora to simulate digital worlds and pave the way for more realistic games from text descriptions.

    In an experiment, OpenAI fed Sora prompts containing the word Minecraft and had it convincingly render the game.

    Senior Nvidia researcher Jim Fan defined Sora as a “data-driven physics engine.” He added, “It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, intuitive physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths.”

    In a webpage full of examples, OpenAI said, “We believe the capabilities Sora has today demonstrate that continued scaling of video models is a promising path towards the development of capable simulators of the physical and digital world, and the objects, animals, and people that live within them.”

  • OpenAI Shows ‘Sora’, an AI Model that Generates Photorealistic Videos

    IBL News | New York

    OpenAI shared yesterday a new AI technology called ‘Sora’ that instantly generates eye-popping videos and can speed up the work of moviemakers while replacing less experienced digital artists.

    The San Francisco-based start-up shared this tool with a small group of academics and researchers.

    In an interview with The New York Times, the company said that it had not yet released Sora to the public because it was still working to understand the system’s dangers.

    OpenAI calls its new system Sora, after the Japanese word for sky. It chose the name because it “evokes the idea of limitless creative potential.”

    In April 2023, a New York start-up called Runway AI unveiled technology that lets people generate videos simply by typing a prompt. Ten months later, OpenAI has unveiled a similar system that creates videos with a significantly higher quality.

    A demonstration included short videos created in minutes, like the ones shown below.

    OpenAI, which owns the still-image generator DALL-E, is now in the race to improve the AI video generator. Google and Meta are in this business, too.

    OpenAI declined to say how many videos the system learned from or where they came from, except to say the training included both publicly available videos and videos that were licensed from copyright holders. The company says little about the data used to train its technologies, most likely because it wants to maintain an advantage over competitors — and has been sued multiple times for using copyrighted material.

    The company is already tagging videos produced by the system with watermarks that identify them as being generated by AI. But it acknowledges that these can be removed.

    DALL-E, Midjourney, and other still-image generators have improved so quickly that they are now producing images nearly indistinguishable from photographs. Many digital artists are complaining that it has made it harder for them to find work.

    Examples:

    An AI-generated video by OpenAI was created with the following prompt: “Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field.”

     

    This video’s AI prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”

     

    “Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. The art style is 3D and realistic, with a focus on lighting and texture. The mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. Its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. The use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.”

     

    This video’s AI prompt: “A gorgeously rendered papercraft world of a coral reef, rife with colorful fish and sea creatures.”

    Sam Altman, CEO at OpenAI, shared multiple videos generated by Sora.


    • Wired: OpenAI’s Sora Turns AI Prompts Into Photorealistic Videos

  • ChatGPT Tests a Memory Feature that Remembers Users’ Conversations Over Time

    IBL News | New York

    OpenAI announced that it has started testing a memory feature that enables ChatGPT to remember things users discuss across all chats.

    This feature, which saves users from having to repeat information, will be applied to GPTs, too.

    “You’re in control of ChatGPT’s memory. You can explicitly tell it to remember something, ask it what it remembers, and tell it to forget conversationally or through settings. You can also turn it off entirely,” said the company.

    Users can turn off memory at any time (Settings > Personalization > Memory). While memory is off, memories won’t be used or created.

    OpenAI gave these examples:

    • You’ve explained that you prefer meeting notes to have headlines, bullets, and action items summarized at the bottom. ChatGPT remembers this and recaps meetings this way.
    • You’ve told ChatGPT you own a neighborhood coffee shop. When brainstorming messaging for a social post celebrating a new location, ChatGPT knows where to start.
    • You mention that you have a toddler and that she loves jellyfish. When you ask ChatGPT to help create her birthday card, it suggests a jellyfish wearing a party hat.
    • As a kindergarten teacher with 25 students, you prefer 50-minute lessons with follow-up activities. ChatGPT remembers this when helping you create lesson plans.
  • NVIDIA Releases a Demo App that Allows Users to Run an AI Chatbot on Their PC

    IBL News | New York

    NVIDIA introduced yesterday a personalized demo chatbot app called Chat With RTX that runs locally on RTX-powered Windows PCs, providing fast and secure results.

    This early version allows users to personalize an LLM connected to their own content—docs, notes, videos, or other data. It leverages retrieval-augmented generation (RAG), TensorRT-LLM, and RTX acceleration so users can query a custom chatbot and quickly get contextually relevant answers.
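    Chat With RTX’s internal pipeline is proprietary, but the RAG pattern it uses can be sketched in a few lines: retrieve the document chunks most relevant to a query, then assemble them into the prompt handed to a language model. A toy illustration, where the naive keyword-overlap scoring stands in for a real embedding search and the final model call is omitted:

```python
import re

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank document chunks by keyword overlap with the query.
    A real RAG pipeline would use vector embeddings and similarity search."""
    tokenize = lambda text: set(re.findall(r"\w+", text.lower()))
    query_words = tokenize(query)
    ranked = sorted(chunks,
                    key=lambda c: len(query_words & tokenize(c)),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble retrieved context plus the question into one prompt,
    which would then go to the local model (Mistral or Llama 2)."""
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The quarterly report shows revenue grew 12 percent.",
    "Meeting notes: the product launch is scheduled for March.",
    "Recipe: mix flour, eggs, and milk for pancakes.",
]
print(build_prompt("When is the product launch scheduled?", docs))
```

    Because only the retrieved chunks reach the model, answers stay grounded in the user’s own files, which is what makes the local, private setup practical.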

    Available to download with a 35GB installer, NVIDIA’s Chat With RTX requires Windows 11 and an NVIDIA GeForce RTX 30 or 40 Series GPU, or an NVIDIA RTX Ampere or Ada Generation GPU, with at least 8GB of VRAM.

    Tailored for searching local documents and personal files, the app lets users feed it YouTube videos and their own documents, including collections of PDFs, to create summaries and get relevant answers grounded in their own data.

    Chat with RTX essentially installs a web server and Python instance on a PC, which then leverages Mistral or Llama 2 models to query the data. It doesn’t remember context, so follow-up questions can’t be based on the context of a previous question.

    Installation takes about 30 minutes, as The Verge found in its testing.

    Installing the two language models, Mistral 7B and Llama 2, takes about an hour, and they require 70GB of disk space.

    Once it’s installed, a command prompt window launches with an active session, and the user can ask queries via a browser-based interface.