Category: Views

  • An Autonomous AI Agent Called ‘Devin’ Plans and Executes Complex Coding Tasks

    IBL News | New York

    Cognition AI, which builds AI teammates, introduced this week an autonomous AI software engineer called Devin.

    This AI agent can independently write entire software projects from scratch based on simple text prompts.

    Devin can plan and execute complex coding tasks with hundreds of steps.

    The autonomous agent can code while learning, recall relevant context at every step, fix errors, and collaborate with users in real time.

    “With our advances in long-term reasoning and planning, Devin can plan and execute complex engineering tasks requiring thousands of decisions,” said the company.

    Cognition AI has equipped Devin with common developer tools including the shell, code editor, and browser within a sandboxed compute environment.

    This agent can report on its progress in real-time, accept feedback, and work together with the user through design choices as needed.

    In the demos shown below, Devin built complete websites and apps in under 10 minutes. It also successfully completed real gigs posted on Upwork by itself.

    On a coding benchmark, the AI agent solved 13.86% of real-world GitHub issues end-to-end, far exceeding the previous state-of-the-art (SOTA) result of 1.96%.

    Funded with a $21 million Series A led by Founders Fund, Cognition AI is dedicated to building AI teammates with capabilities far beyond today’s existing AI tools by solving reasoning.

  • The Humanoid Robot Startup Figure AI Attracted the Support of OpenAI, NVIDIA, Microsoft, and Jeff Bezos’ VC

    IBL News | New York

    The final form for ChatGPT is not a bot.

    Figure AI, a startup working to build humanoid robots that can perform dangerous and undesirable jobs, got support from OpenAI and other large names in AI, such as NVIDIA, Microsoft, and Jeff Bezos’ venture fund.

    The Sunnyvale, California-based company announced on Thursday that it raised $675 million in Series B funding at a $2.6 billion valuation with investments from Microsoft, OpenAI Startup Fund, NVIDIA, Jeff Bezos (through Bezos Expeditions), Parkway Venture Capital, Intel Capital, Align Ventures, and ARK Invest.

    Focused on deploying humanoid robots to assist people with real-world applications addressing labor shortages, Figure recently announced its first commercial agreement with BMW Manufacturing to bring humanoids into automotive production.

    The Figure team, made up of top AI robotics experts from Boston Dynamics, Tesla, Google DeepMind, and Archer Aviation, has made remarkable progress in the past few months in the key areas of AI, robot development, robot testing, and commercialization. Founded 21 months ago, Figure currently has a team of 80 employees and is led by serial entrepreneur Brett Adcock.

    The new capital will be used to accelerate the timeline for humanoid commercial deployment by scaling up AI training and robot manufacturing and by expanding engineering headcount.

    The collaboration with OpenAI will help to accelerate “Figure’s commercial timeline by enhancing the capabilities of humanoid robots to process and reason from language,” stated the company.

    Peter Welinder, VP of Product and Partnerships at OpenAI, said: “We’ve always planned to come back to robotics and we see a path with Figure to explore what humanoid robots can achieve when powered by highly capable multimodal models. We’re blown away by Figure’s progress to date and we look forward to working together to open up new possibilities for how robots can help in everyday life.”

    Figure will use Microsoft Azure for AI infrastructure, training, and storage.

    To date, Figure AI has developed a general-purpose robot, called Figure 01, that looks and moves like a human. The company sees its robots being put to use in manufacturing, shipping and logistics, warehousing, and retail, where labor shortages are the most severe.

    Earlier this week, the company released a video showing Figure 01 in action (see below). The robot, attached to a tether, walks on two legs, and uses its five-fingered hands to pick up a plastic crate, then walks several more steps before placing the box on a conveyor belt.

    Figure’s ultimate aim is for Figure 01 to be able to perform “everyday tasks autonomously.” The company says getting there will require it to develop more robust AI systems.

    There is a crowded field of companies vying to make humanoid robots a reality, although the market is nascent. Amazon-backed Agility Robotics plans to open a factory that can produce up to 10,000 of its bipedal Digit robots per year.

    Tesla is also trying to build a humanoid robot, called Optimus, while robotics company Boston Dynamics has developed several models. Norwegian humanoid robot startup 1X Technologies recently raised $100 million with backing from OpenAI.

  • OpenAI Shows ‘Sora’, an AI Model that Generates Photorealistic Videos

    IBL News | New York

    OpenAI shared yesterday a new AI technology called ‘Sora’ that instantly generates eye-popping videos and can speed up the work of moviemakers while replacing less experienced digital artists.

    The San Francisco-based start-up shared this tool with a small group of academics and researchers.

    In an interview with The New York Times, the company said that it had not yet released Sora to the public because it was still working to understand the system’s dangers.

    OpenAI calls its new system Sora, after the Japanese word for sky. It chose the name because it “evokes the idea of limitless creative potential.”

    In April 2023, a New York start-up called Runway AI unveiled technology that lets people generate videos simply by typing a prompt. Ten months later, OpenAI has unveiled a similar system that creates videos of significantly higher quality.

    A demonstration included short videos created in minutes, like the ones shown below.

    OpenAI, which owns the still-image generator DALL-E, is now in the race to improve AI video generation. Google and Meta are in this business, too.

    OpenAI declined to say how many videos the system learned from or where they came from, except to say the training included both publicly available videos and videos that were licensed from copyright holders. The company says little about the data used to train its technologies, most likely because it wants to maintain an advantage over competitors — and has been sued multiple times for using copyrighted material.

    The company is already tagging videos produced by the system with watermarks that identify them as being generated by AI. But it acknowledges that these can be removed.

    DALL-E, Midjourney, and other still-image generators have improved so quickly that they are now producing images nearly indistinguishable from photographs. Many digital artists complain that this has made it harder for them to find work.

    Examples:

    An AI-generated video by OpenAI was created with the following prompt: “Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field.”

     

    This video’s AI prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”

     

    “Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. The art style is 3D and realistic, with a focus on lighting and texture. The mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. Its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. The use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.”

     

    This video’s AI prompt: “A gorgeously rendered papercraft world of a coral reef, rife with colorful fish and sea creatures.”

    Sam Altman, CEO at OpenAI, shared multiple videos generated by Sora.


    • Wired: OpenAI’s Sora Turns AI Prompts Into Photorealistic Videos

  • NVIDIA Releases a Demo App that Allows Users to Run an AI Chatbot on Their PC

    IBL News | New York

    NVIDIA introduced yesterday a personalized demo chatbot app called Chat With RTX that runs locally on RTX-powered Windows PCs, providing fast and secure results.

    This early version allows users to personalize an LLM connected to their own content, such as docs, notes, videos, or other data. It leverages retrieval-augmented generation (RAG), TensorRT-LLM, and RTX acceleration so users can query a custom chatbot to quickly get contextually relevant answers.

    Available as a 35GB download, NVIDIA’s Chat With RTX requires Windows 11 and an NVIDIA GeForce RTX 30 or 40 Series GPU, or an NVIDIA RTX Ampere or Ada Generation GPU, with at least 8GB of VRAM.

    The app is tailored for searching local documents and personal files. Users can feed it YouTube videos and their own documents, including PDFs, to create summaries and get relevant answers based on their own data.

    Chat with RTX essentially installs a web server and Python instance on a PC, which then leverages Mistral or Llama 2 models to query the data. It doesn’t remember context, so follow-up questions can’t be based on the context of a previous question.
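
    The retrieval-augmented flow described above can be sketched in a few lines of Python. This is a toy illustration, not NVIDIA’s implementation: the function names (`tokenize`, `retrieve`, `build_prompt`), the word-count similarity scoring, and the sample documents are all assumptions made for the example, and the final call to a local model such as Mistral or Llama 2 is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query (hypothetical helper)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Assemble a prompt from retrieved context plus the current question only.

    No earlier questions are passed in, so each call is stateless.
    """
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up sample documents standing in for a user's local files.
docs = [
    "Meeting notes: the budget review is scheduled for Friday.",
    "Recipe: mix flour, eggs, and milk to make pancakes.",
]
prompt = build_prompt("When is the budget review?", docs)
print(prompt)
```

    Because `build_prompt` starts from scratch on every call, nothing from an earlier question carries over, which mirrors the lack of conversational memory noted above.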

    Installation takes about 30 minutes, according to The Verge.

    Installing the two language models, Mistral 7B and Llama 2, takes an hour, and they require 70GB.

    Once it’s installed, a command prompt window launches with an active session, and the user can ask queries via a browser-based interface.

  • Microsoft Issued a Redesigned Copilot with Image Creation Capabilities

    IBL News | New York

    Microsoft issued this week an update to its Copilot chatbot with further image creation capabilities and a new GPT-4-based model, Deucalion. It also released new apps on iOS and Android.

    The launch coincided with a Super Bowl ad (see below). It also marked one year since Microsoft’s entry into the consumer AI sphere with Bing Chat.

    Powered by OpenAI’s DALL-E 3, the new Copilot comes with a sleeker UI with a cleaner look, more white space, less text, and a visual carousel of cards.

    In addition, it includes Microsoft Designer, which allows users to customize the generated images right inside Copilot without leaving the chat.

    Images can be regenerated in square or landscape format, resized, or enhanced with color, a blurred background, and effects such as pixel art, all without leaving the chat.

    Microsoft announced that it will soon roll out a Designer GPT inside Copilot.

    Here are images of the old Bing Chat and the new Microsoft Copilot design, one after another.

    https://youtu.be/SaCVSUbYpVc?si=YJKEC9EGEVKJXens

  • Google Rebranded ‘Bard’ Chatbot as ‘Gemini’, and Rolled Out a Paid Subscription Model

    IBL News | New York

    Google rebranded its Bard chatbot as Gemini, the name of its family of foundation models; launched Gemini Ultra 1.0 in the U.S., priced at $20 per month; and issued a new Gemini app on iOS and Android, as Sundar Pichai, CEO of Google and Alphabet, announced today.

    The API access to the Ultra model will be available in the coming weeks.

    The paid monthly subscription, priced the same as ChatGPT Plus, will be available through a new bundle known as the Google One AI Premium Plan, which includes two terabytes of cloud storage (typically costing $9.99 monthly) and access to the Google Workspace apps like Docs, Slides, Sheets, and Meet. For now, users can get a two-month subscription trial at no cost.

    With that, Google also retired the Duet AI brand, which became Gemini for Workspace, in response to the offerings from Microsoft and its partner OpenAI.

    “Gemini Ultra 1.0 is a model that sets the state of the art across a wide range of benchmarks across text, image, audio, and video,” Google’s Sissie Hsiao said in a press conference today.

    “The largest model Ultra 1.0 is the first to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects — including math, physics, history, law, medicine, and ethics — to test knowledge and problem-solving abilities,” Sundar Pichai stated.

    “Gemini Advanced can be a personal tutor, tailored to your learning style, or it can be a creative partner, helping you plan a content strategy or build a business plan,” he added in a blog post.

    Many users said Bard provided middling results, making a rebrand almost a necessity, TechCrunch commented today.

    Video explaining two new experiences — Gemini Advanced and a mobile app — to help you easily collaborate with the best of Google AI.

  • Perplexity AI, Valued at $520 Million After Getting Support of Jeff Bezos, Nvidia, and Databricks

    IBL News | New York

    Perplexity, an AI-powered search engine with one million iOS and Android users and $3 million in annual recurring revenue (ARR), announced yesterday it raised $73.6 million in funding, leading to a post-money valuation of $520 million. That is a multiple of around 150 times its ARR on a pre-money basis ($446.4 million), or roughly 173 times post-money.

    Established in August 2022, this San Francisco-based start-up, with a staff of 40 employees located in a co-working space, says that it plans to go after Google’s dominant position in web search.

    The funding round was led by IVP with support from Seed and Series A investors NEA, Elad Gil, Nat Friedman, and Databricks, as well as new investors NVIDIA, Jeff Bezos (through Bezos Expeditions Fund), Tobi Lutke, Bessemer Venture Partners, Naval Ravikant, Balaji Srinivasan, Guillermo Rauch, Austen Allred, Factorial Funds, and Kindred Ventures, among others.

    “The times of sifting through SEO spam, sponsored links, and multiple web pages will be replaced by a much more efficient way to consume and share information,” explained Aravind Srinivas, Co-founder & CEO of Perplexity, in a blog post.

    He presents the company’s new Copilot product this way:

    “It’s an AI research assistant that has changed how we uncover information and learn more about new topics. Copilot tailors search queries with custom follow-up questions, introducing the concept of generative user interfaces. It removes the burden of prompt engineering and does not require users to ask perfectly phrased questions to get the answers they seek. This enables users to gain more relevant and comprehensive answers than other AI chatbots, traditional search engines, or research tools. Copilot has seen strong traction, especially among academics, students, and knowledge workers who rely on frequent research for their day-to-day work and needs.”

    In other words, Perplexity says that its advantage is based on using advances in AI to provide direct answers, instead of website links, in response to search queries, without some of the limitations felt by larger companies.

    “If you can directly answer somebody’s question, nobody needs those ten blue links,” Srinivas said. Google has begun rolling out a feature that provides lengthy summaries in response to some search queries.

    Microsoft has struggled to make a dent in Google’s share of the search market since it introduced a version of its Bing search engine that can act like a chatbot.

    Neeva, a search start-up that used generative AI to provide direct answers, shut down last year after it failed to gain enough traction to compete with Google.

    Perplexity maintains its own index of webpages, which it combines with a mixture of AI technology it has designed itself and purchased from outside providers such as OpenAI.

    The company, still not profitable, charges $20 a month for a more powerful version of the search engine that uses GPT-4, OpenAI’s most advanced technology.

  • Harvard University’s President Resigned After Comments About Antisemitism

    IBL News | New York

    Harvard University’s President, Claudine Gay, resigned Tuesday after facing allegations of plagiarism and criticism over her comments about antisemitism on campus.

    Last month, during a tense congressional hearing, Dr. Gay said calls for the killing of Jews were abhorrent. She added, however, that it would depend on the context whether such comments would constitute a violation of Harvard’s code of conduct regarding bullying and harassment.

    As a result of the comments, she faced mounting pressure in recent weeks, with dozens of politicians and some high-profile alumni calling for her to step down.

    But nearly 700 staff members rallied behind her in a letter, and the university said she would keep her job despite the controversy.

    Claudine Gay, 53 years old, served as President for six months and was the first black person, and only the second woman, to be appointed to lead the Ivy League university. Her tenure was the shortest in its 388-year history.

    In a letter announcing her resignation, Dr. Claudine Gay said it was in the “best interests” of the university for her to step down.

    “This is not a decision I came to easily. Indeed, it has been difficult beyond words,” Dr. Gay said. She added that her resignation would allow Harvard to “focus on the institution rather than any individual”.

    Harvard is one of several universities in the U.S. accused of failing to protect its Jewish students following the outbreak of the Israel-Hamas war in October. Jewish groups have reported an alarming rise in antisemitic incidents in the U.S. since the conflict began.

    Just hours before she resigned on Tuesday, claims that Dr. Gay had failed to properly cite academic sources emerged and were published anonymously in the conservative Washington Free Beacon newspaper.

    Harvard’s board investigated the allegations of plagiarism and found two published papers that required additional citations. The board, however, said that she did not violate standards for research misconduct.

    Provost Alan M. Garber ’76, Harvard’s Chief Academic Officer, will serve as interim president effective immediately, The Harvard Gazette reported.

  • Google Introduced Its Multimodal Technology ‘Gemini’ and Added It to Bard

    IBL News | New York

    Google introduced yesterday its long-awaited answer to ChatGPT: Gemini, an AI technology with reasoning capabilities that was designed and pre-trained to be natively multimodal.

    While other multimodal offerings exist (multimodal meaning the model can analyze text, audio, video, images, and code), Gemini was described by Google’s CEO Sundar Pichai as the company’s “most capable and general model yet.”

    “Our first version, Gemini 1.0, is optimized for different sizes: Ultra, Pro, and Nano.”

    Demis Hassabis, CEO and Co-Founder of Google DeepMind, explained that “Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.”

    “This makes it especially good at explaining reasoning in complex subjects like math and physics.”

    Gemini can understand, explain, and generate high-quality code in Python, Java, C++, and Go. “Its ability to work across languages and reason about complex information makes it one of the leading foundation models for coding in the world,” said Demis Hassabis.

    Google said that Gemini 1.0 was now rolling out across a range of its products and platforms.

    For example, the chatbot Bard was upgraded with Gemini Pro, while Gemini Ultra will be applied early next year in a new experience called Bard Advanced.

    Google was also bringing Gemini to Pixel. Pixel 8 Pro will be engineered to run Gemini Nano, powering new features like Summarize in the Recorder app and rolling out in Smart Reply in Gboard, starting with WhatsApp.

    In the coming months, Gemini will be available in more of Google’s products and services like Search, Ads, Chrome, and Duet AI.

    Starting on December 13, developers and enterprise customers will be able to access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI.

    (Google AI Studio is a free, web-based developer tool to prototype and launch apps quickly with an API key.)

    A chart showing Gemini Ultra’s performance on common text benchmarks, compared to GPT-4 (API numbers calculated where reported numbers were missing).

    A chart showing Gemini Ultra’s performance on multimodal benchmarks compared to GPT-4V, with previous SOTA models listed in places where capabilities are not supported in GPT-4V.

     

    The New York Times: Google Updates Bard Chatbot With ‘Gemini’ A.I. as It Chases ChatGPT

  • A Year of the Huge Hit of OpenAI’s ChatGPT

    IBL News | New York

    A year ago, on November 30, 2022, OpenAI’s ChatGPT was launched, starting a new era in EdTech through Generative AI technology.

    The day represents a turning point in the digital world, as happened with Netscape, Facebook, Netflix, and the iPhone.

    When ChatGPT was launched, nobody took the stage, and no one predicted that this apparently simple chatbot would be the fastest-growing consumer technology in history.

    It had a million users in five days and 100 million after just two months, and it now boasts 100 million weekly users.

    ChatGPT, and the model underneath it, also quickly became a billion-dollar business for OpenAI, with the huge backing of Microsoft, which invested over $12 billion.

    In a year otherwise marked by a huge decline in venture capital investing, companies with Generative AI in their pitch have been able to raise $17.9 billion just in the third quarter of 2023, according to Pitchbook.

    A few companies have successfully emerged: Anthropic as the most well-funded competitor; Midjourney and Stable Diffusion as image generators; Character.ai as a free chatbot creator; GitHub’s and Microsoft’s Bing copilots; and Google’s Duet.

    AI hardware made Nvidia one of the most valuable companies on earth.

    Within a year, we also saw corporate drama at OpenAI. CEO Sam Altman was briefly forced out. It was a power play between board members and executives, apparently over a disagreement about safety.