Category: Top News

  • OpenAI’s Sora Can Generate Video Games, Too

    IBL News | New York

    OpenAI’s new video-generating AI model ‘Sora’, which is able to generate up to a minute of 1080p video, can render video games, too.

    A research paper published by OpenAI yesterday points to the ability of Sora to simulate digital worlds and pave the way for more realistic games from text descriptions.

    In an experiment, OpenAI fed Sora prompts containing the word Minecraft and had it render the game convincingly.

    Senior Nvidia researcher Jim Fan defined Sora as a “data-driven physics engine.” He added, “It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, intuitive physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths.”

    On a webpage full of examples, OpenAI said, “We believe the capabilities Sora has today demonstrate that continued scaling of video models is a promising path towards the development of capable simulators of the physical and digital world, and the objects, animals, and people that live within them.”


  • OpenAI Shows ‘Sora’, an AI Model that Generates Photorealistic Videos

    IBL News | New York

    OpenAI shared yesterday a new AI technology called ‘Sora’ that instantly generates eye-popping videos and can speed up the work of moviemakers while replacing less experienced digital artists.

    The San Francisco-based start-up shared this tool with a small group of academics and researchers.

    In an interview with The New York Times, the company said that it had not yet released Sora to the public because it was still working to understand the system’s dangers.

    OpenAI calls its new system Sora, after the Japanese word for sky.  It chose the name because it “evokes the idea of limitless creative potential.”

    In April 2023, a New York start-up called Runway AI unveiled technology that lets people generate videos simply by typing a prompt. Ten months later, OpenAI has unveiled a similar system that creates videos with a significantly higher quality.

    A demonstration included short videos created in minutes, like the ones shown below.

    OpenAI, which also makes the still-image generator DALL-E, is now in the race to improve AI video generation. Google and Meta are in this business, too.

    OpenAI declined to say how many videos the system learned from or where they came from, except to say the training included both publicly available videos and videos that were licensed from copyright holders. The company says little about the data used to train its technologies, most likely because it wants to maintain an advantage over competitors — and has been sued multiple times for using copyrighted material.

    The company is already tagging videos produced by the system with watermarks that identify them as being generated by AI. But it acknowledges that these can be removed.

    DALL-E, Midjourney, and other still-image generators have improved so quickly that they now produce images nearly indistinguishable from photographs. Many digital artists complain that these tools have made it harder for them to find work.

    Examples:

    An AI-generated video by OpenAI was created with the following prompt: “Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field.”

     

    This video’s AI prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”

     

    This video’s AI prompt: “Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. The art style is 3D and realistic, with a focus on lighting and texture. The mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. Its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. The use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.”

     

    This video’s AI prompt: “A gorgeously rendered papercraft world of a coral reef, rife with colorful fish and sea creatures.”

    Sam Altman, CEO at OpenAI, shared multiple videos generated by Sora.


    • Wired: OpenAI’s Sora Turns AI Prompts Into Photorealistic Videos

  • ChatGPT Tests a Memory Feature that Remembers Users’ Conversations Over Time

    IBL News | New York

    OpenAI announced that it has started to test a memory feature that enables ChatGPT to remember things users discuss across all chats.

    This feature, which saves users from having to repeat information, will be applied to GPTs, too.

    “You’re in control of ChatGPT’s memory. You can explicitly tell it to remember something, ask it what it remembers, and tell it to forget conversationally or through settings. You can also turn it off entirely,” said the company.

    Users can turn off memory at any time (Settings > Personalization > Memory). While memory is off, memories won’t be used or created.

    OpenAI gave these examples:

    • You’ve explained that you prefer meeting notes to have headlines, bullets, and action items summarized at the bottom. ChatGPT remembers this and recaps meetings this way.
    • You’ve told ChatGPT you own a neighborhood coffee shop. When brainstorming messaging for a social post celebrating a new location, ChatGPT knows where to start.
    • You mention that you have a toddler and that she loves jellyfish. When you ask ChatGPT to help create her birthday card, it suggests a jellyfish wearing a party hat.
    • As a kindergarten teacher with 25 students, you prefer 50-minute lessons with follow-up activities. ChatGPT remembers this when helping you create lesson plans.
  • NVIDIA Releases a Demo App that Allows Users to Run an AI Chatbot on Their PC

    IBL News | New York

    NVIDIA yesterday introduced a personalized demo chatbot app called Chat With RTX that runs locally on RTX-powered Windows PCs, providing fast and secure results.

    This early version allows users to personalize an LLM connected to their own content—docs, notes, videos, or other data. It leverages retrieval-augmented generation (RAG), TensorRT-LLM, and RTX acceleration so users can query a custom chatbot to quickly get contextually relevant answers.

    Available to download as a 35GB installer, NVIDIA’s Chat With RTX requires Windows 11 and an NVIDIA GeForce RTX 30 or 40 Series GPU, or an NVIDIA RTX Ampere or Ada Generation GPU, with at least 8GB of VRAM.

    Tailored for searching local documents and personal files, the app lets users feed it YouTube videos and their own documents to create summaries and get relevant answers based on their own data, analyzing collections of documents as well as scanning through PDFs.

    Chat with RTX essentially installs a web server and Python instance on a PC, which then leverages Mistral or Llama 2 models to query the data. It doesn’t remember context, so follow-up questions can’t be based on the context of a previous question.
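
    Conceptually, the pipeline follows the standard RAG pattern: index the user’s local files, retrieve the passages most relevant to a query, and pass them as context to the locally running model. The sketch below illustrates that pattern in Python; it is not NVIDIA’s implementation, and the sample documents, the TF-IDF retriever, and the prompt-building step are assumptions for illustration only.

        # Minimal, self-contained sketch of the retrieval-augmented generation
        # (RAG) pattern Chat With RTX applies to local files: index documents,
        # retrieve the most relevant one for a query, and hand it to a local
        # LLM as context. Illustration only, not NVIDIA's code; the documents
        # and the TF-IDF retriever are placeholders.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        documents = [
            "Meeting notes: the RTX demo ships next week after the driver update.",
            "Budget summary: GPU hardware costs rose 12 percent in Q4.",
        ]

        vectorizer = TfidfVectorizer()
        doc_matrix = vectorizer.fit_transform(documents)  # index the local files

        def build_prompt(question: str) -> str:
            # Retrieve the document most similar to the question.
            q_vec = vectorizer.transform([question])
            best = documents[cosine_similarity(q_vec, doc_matrix).argmax()]
            # Chat With RTX would send this prompt to the local Mistral 7B or
            # Llama 2 model; here we only assemble it.
            return f"Answer using only this context:\n{best}\n\nQuestion: {question}"

        print(build_prompt("When does the demo ship?"))

    In Chat With RTX, the index is built over a folder the user selects, and the assembled prompt is answered by the TensorRT-LLM-accelerated Mistral or Llama 2 model rather than printed.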

    The installation takes about 30 minutes, according to The Verge’s analysis.

    It takes an hour to install the two language models — Mistral 7B and LLaMA 2 — and they require 70GB of disk space.

    Once it’s installed, a command prompt window launches with an active session, and the user can ask queries via a browser-based interface.

  • 2U Warns of “Substantial Doubt of Ability to Continue as a Going Concern”

    IBL News | New York

    Online learning platform company 2U / edX warned yesterday of substantial doubt about its ability to continue as a going concern.

    Referring to its liquidity and cash flow, the Lanham, Md.-based company said:

    “The company expects that if it does not amend or refinance its term loan, or raise capital to reduce its debt in the short term, and in the event the obligations under its term loan accelerate or come due within twelve months from the date of its financial statement issuance in accordance with its current terms, there is substantial doubt about its ability to continue as a going concern.”

    2U Inc., now under the leadership of a new CEO, presented its results for the fourth quarter and the full year of 2023. “We are resetting and enhancing our operations with renewed financial discipline,” said Paul Lalljie, Chief Executive Officer of 2U.

    “Looking ahead, we believe this renewed focus, along with our market-proven offerings, robust partner network, and scalable technology and services, will allow us to take advantage of increasing demand for high-quality online education and continue to deliver on our mission.”

    “Our immediate focus in 2024 is to strengthen the fundamentals of our business in order to extend our debt maturities and restore a healthy balance sheet,” added Matthew Norden, Chief Financial Officer of 2U.

    For full-year 2023 compared to 2022, revenue decreased 2% to $946.0 million, and the net loss was $317.6 million. Costs and expenses for the year totaled $1.17 billion, a 4% decrease from $1.22 billion in 2022.

    The results for the fourth quarter of 2023, compared to the fourth quarter of 2022, showed a revenue increase of 8% to $255.7 million; degree program segment revenue increased 19% to $163.5 million, while alternative credential segment revenue decreased 7% to $92.2 million.

    Looking forward, the company expects first-quarter 2024 revenue to range from $195 million to $198 million, with a net loss ranging from $55 million to $60 million and adjusted EBITDA ranging from $10 million to $12 million.

    For the full year of 2024, it expects revenue to range from $805 million to $815 million, net loss to range from $85 million to $90 million, and adjusted EBITDA to range from $120 million to $125 million.

  • Apple Released ‘MGIE’, an Open Source AI Multimodal Model for Image Editing

    IBL News | New York

    Apple released last week MGIE (MLLM-Guided Image Editing), a new open-source AI model that edits images based on natural language instructions. It leverages multimodal large language models (MLLMs) to interpret user commands and perform pixel-level manipulations.

    Experts agreed that MGIE represents a major breakthrough, highlighting that the pace of progress in multimodal AI systems is accelerating.

    The model can handle a wide range of editing scenarios, such as simple color and brightness adjustments, photo optimization, object manipulation, and Photoshop-style modifications such as cropping, resizing, rotating, flipping, and adding filters.

    For example, given the instruction to make the sky more blue, MGIE derives the more explicit instruction to increase the saturation of the sky region by 20%.
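
    As a rough illustration, that derived instruction maps to a simple pixel-level operation. The snippet below applies a 20% saturation boost with Pillow; it is not MGIE’s code, the file name is a placeholder, and MGIE itself would localize the edit to the sky region rather than the whole image.

        # Illustration only: the concrete edit implied by "increase the
        # saturation of the sky region by 20%". Uses Pillow, not MGIE;
        # "photo.jpg" is a placeholder, and the whole image is adjusted here
        # for brevity instead of a segmented sky region.
        from PIL import Image, ImageEnhance

        img = Image.open("photo.jpg")
        edited = ImageEnhance.Color(img).enhance(1.2)  # factor 1.2 = +20% saturation
        edited.save("photo_edited.jpg")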

    MGIE — which was presented in a paper accepted at the International Conference on Learning Representations (ICLR) 2024 — is the result of a collaboration between Apple and researchers from the University of California, Santa Barbara.

    MGIE is available as an open-source project on GitHub. The project also provides a demo notebook that shows how to use MGIE for various editing tasks. Users can also try out MGIE online through a web demo hosted on Hugging Face Spaces.

  • Brave’s AI Assistant Integrates the Open-Source Mixtral 8x7B as the default LLM

    IBL News | New York

    Brave announced that its AI browser assistant ‘Leo’ integrated the open-source LLM Mixtral 8x7B as the default model. The free version is rate-limited, and subscribers to Leo Premium ($15/month) get higher rate limits.

    In addition, the privacy-focused Brave made improvements to the Leo user experience, adding clearer onboarding, context controls, input and response formatting, and a general UI polish.

    Mixtral 8x7B, an open-source LLM released by the French start-up Mistral AI, has gained popularity among the developer community since its December release. It currently outperforms ChatGPT 3.5, Claude Instant, Llama 2, and many others, according to the LMSYS Chatbot Arena Leaderboard. Mixtral also shows improvements in reducing hallucinations and biases, according to the BBQ benchmark.

    Among other benefits, Mixtral generates code, handles larger contexts, and interacts in English, French, German, Italian, and Spanish.

    Brave is already using Mixtral for its newly released Code LLM feature for programming-related queries in Brave Search.

    Brave Leo also offers Claude Instant from Anthropic and the Llama 2 13B model from Meta, both in the free version (with rate limits) and in Premium.

    Feature comparison between Free Leo and Leo Premium:

    • Models: Free Leo offers Mixtral 8x7B (strict rate limits), Claude Instant (strict rate limits), and Llama 2 13B (higher rate limits); Leo Premium offers Mixtral 8x7B, Claude Instant, and Llama 2 13B.
    • Rate limits: various rate limits in Free Leo; higher rate limits in Leo Premium.
    • Quality of conversations: very high in both, dependent on models in Free Leo (upgraded with release 1.62).
    • Privacy: in both versions, inputs are always submitted anonymously through a reverse-proxy and are not retained.
    • Subscription: Free Leo is free; Leo Premium costs $15 monthly.

     

    Leo helps users with tasks in the context of the page they are on by creating real-time summaries of web pages or videos. It can also answer questions about content, generate new content, translate pages, analyze them, and rewrite them. “Whether you’re looking for information, trying to solve a problem, writing code, or creating content, Leo is integrated in the browser for enhanced productivity,” said the company.

    To access Leo, Brave desktop users can simply ask a question in the address bar and click “Ask Leo”, or click the Brave Leo sidebar icon.

     

  • Microsoft Issued a Redesigned Copilot with Image Creation Capabilities

    IBL News | New York

    Microsoft issued this week an update to its Copilot chatbot with further image creation capabilities and a new GPT-4-based model, Deucalion. It also released new apps on iOS and Android.

    The launch coincided with a Super Bowl ad (see below). It also marked one year since Microsoft’s entry into the consumer AI sphere with Bing Chat.

    Powered by OpenAI’s DALL-E 3, the new Copilot comes with a cleaner, sleeker UI, featuring more white space, less text, and a visual carousel of cards.

    In addition, it includes Microsoft Designer, which allows users to customize the generated images right inside Copilot without leaving the chat.

    Images can be regenerated between square and landscape formats, resized, or enhanced with color, a blurred background, and effects like pixel art, all without leaving the chat.

    Microsoft announced that it will soon roll out a Designer GPT inside Copilot.

    Here are images of the old Bing Chat and the new Microsoft Copilot design, one after another.

    https://youtu.be/SaCVSUbYpVc?si=YJKEC9EGEVKJXens

     

  • Google Rebranded ‘Bard’ Chatbot as ‘Gemini’, and Rolled Out a Paid Subscription Model

    IBL News | New York

    Google rebranded its Bard chatbot as Gemini — the name of its family of foundation models — launched Gemini Ultra 1.0 in the U.S., priced at $20 per month, and issued a new Gemini app on iOS and Android, as Sundar Pichai, CEO of Google and Alphabet, announced today.

    The API access to the Ultra model will be available in the coming weeks.

    The paid monthly subscription — the same price as ChatGPT Plus — will be available through a new bundle known as the Google One AI Premium Plan, which includes two terabytes of cloud storage — typically costing $9.99 monthly — and access to the Google Workspace apps like Docs, Slides, Sheets, and Meet. For now, users can get a two-month subscription trial at no cost.

    With that, Google sunset the Duet AI brand, which became Gemini for Workspace, responding to offerings from Microsoft and its partner OpenAI.

    “Gemini Ultra 1.0 is a model that sets the state of the art across a wide range of benchmarks across text, image, audio, and video,” Google’s Sissie Hsiao said in a press conference today.

    “The largest model Ultra 1.0 is the first to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects — including math, physics, history, law, medicine, and ethics — to test knowledge and problem-solving abilities,” Sundar Pichai stated.

    “Gemini Advanced can be a personal tutor, tailored to your learning style, or it can be a creative partner, helping you plan a content strategy or build a business plan, as explained in this post,” he added.

    Many users said Bard provided middling results, making a rebrand almost a necessity, TechCrunch commented today.

    Video explaining two new experiences — Gemini Advanced and a mobile app — to help you easily collaborate with the best of Google AI.

  • Tech and EdTech Companies Continue Firing Employees In 2024

    IBL News | New York

    With 353,000 jobs added in January, the U.S. economy is booming, but tech and edtech companies are laying off tens of thousands of workers despite stabilizing interest rates and strong hiring in other industries. Most of these employees were hired to meet the pandemic boom in consumer tech spending.

    Even workers with years of experience or deep technical expertise are having trouble getting hired again.

    In January, Google, Amazon, Microsoft, Discord, Salesforce, and eBay all made significant cuts. On Tuesday, PayPal said in a letter to workers that it would cut another 2,500 employees, or about 9 percent of its workforce.

    In 2023, tech companies laid off over 260,000 people, according to layoff tracker Layoffs.fyi.

    Last year, the job cuts were mostly due to over-hiring during the pandemic and a high-interest-rate environment — which makes it harder to invest in new business ventures — according to a report in The Washington Post.

    Experts say that companies are under pressure from investors to improve their bottom lines and focus on increasing profits.

    “That is the way the American capitalist system works,” said Mark Zandi, Chief Economist at Moody’s Analytics. “It’s ruthless when it gets down to striving for profitability and creating wealth. It redirects resources very rapidly from one place to another.”

    It seems to be working. In 2022, the Nasdaq Composite, a stock index dominated by tech companies, lost a full third of its value. In 2023, it grew by 43 percent. It rose another 3 percent in January.

    “The tech sector may be able to produce a lot and innovate a lot without as many people going forward,” Zandi, the Moody’s economist, said. “That is a lesson of AI.”