Google went all-in on AI during its annual I/O 2023 developer conference on Wednesday.
The search giant publicly unveiled its newest large language model (LLM), PaLM 2, which the company says is better at reasoning, writing, math, and logic, and outperforms OpenAI’s GPT-4 at coding and debugging.
The Mountain View, California-based company also introduced a new multimodal LLM called Gemini, which is currently in training.
PaLM 2 comes in four sizes, whimsically named after animals: Gecko, Otter, Bison, and Unicorn.
Google’s new code completion and code generation tool, named Codey, is the company’s answer to GitHub’s Copilot.
Codey is specifically trained to handle coding-related prompts and is also trained to handle queries related to Google Cloud in general.
Google also made its chatbot Bard, its equivalent of ChatGPT, officially available to everyone, removing the waitlist.
Built on PaLM 2, Bard can export to Google Docs, Sheets, Replit, and Gmail, among others.
Bard users will be able to generate images via Adobe Firefly and then modify them using Adobe Express.
In its search business, Google introduced the AI snapshot feature, which synthesizes content from top links and allows for follow-up questions.
Sponsored ads will appear above the snapshot, while traditional links will be placed below it.
Google’s productivity suite Workspace gained an AI sidekick called Duet, designed to provide better prompts.
It has automatic prompt suggestions and can do a ton of things, such as converting text to tables in Sheets, finding information in Gmail threads, and creating images in Slides.
“The Sidekick panel will live in a side panel in Google Docs and is constantly engaged in reading and processing your entire document as you write, providing contextual suggestions that refer specifically to what you’ve written,” said Google.
Another feature of Google Workspace, particularly in Gmail and Docs, is ‘Help me write,’ which lets users draft text of any kind at different lengths.
The new AI features for Slides and Meet let users type in the kind of visualization they’re looking for and have the AI create that image; for Google Meet, that means custom backgrounds.
Google introduced the Magic Editor in Google Photos.
Google also announced new AI models heading to Vertex AI, its fully managed enterprise AI service, including a text-to-image model called Imagen.
Another interesting project introduced by the search giant is Project Tailwind, an AI-powered notebook tool that takes a user’s free-form notes and automatically organizes and summarizes them.
Essentially, users pick files from Google Drive, then Project Tailwind creates a private AI model with expertise in that information, along with a personalized interface designed to help sift through the notes and docs.
The tool is available through Labs, Google’s refreshed hub for experimental products.
Project Tailwind takes your notes and makes it into a AI study tool, an AI first notebook #GoogleIO
This is amazing! pic.twitter.com/dyhVImbtkN
Google Vertex vs OpenAI GPT3.5!
🚀 At the recent #GoogleIO, Google launched Vertex AI models for language generation, stepping into the ring with OpenAI’s API. I decided to pit OpenAI’s GPT3.5 and Vertex AI’s text-bison@001 against each other in a knowledge retrieval task pic.twitter.com/VWdLbuKlDK
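For anyone who wants to reproduce this kind of head-to-head, a minimal sketch of calling both APIs on the same prompt follows. It assumes the OpenAI Python SDK and Google’s Vertex AI SDK (google-cloud-aiplatform); the project ID, region, and API key are placeholders, not details from the tweet.

```python
# Hedged sketch: compare GPT-3.5 and Vertex AI's text-bison@001 on one prompt.
import openai
import vertexai
from vertexai.preview.language_models import TextGenerationModel

PROMPT = "In which year was the transistor invented, and by whom?"

# OpenAI GPT-3.5 via the chat completions endpoint.
openai.api_key = "YOUR_OPENAI_KEY"  # placeholder
gpt = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": PROMPT}],
)
print("GPT-3.5:", gpt["choices"][0]["message"]["content"])

# Vertex AI text-bison@001 via the PaLM text endpoint.
vertexai.init(project="your-gcp-project", location="us-central1")  # placeholders
bison = TextGenerationModel.from_pretrained("text-bison@001")
print("text-bison@001:", bison.predict(PROMPT, max_output_tokens=256).text)
```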
Meta/Facebook announced a new open-source multisensory model that links together six types of data (text, audio, visual data, depth, thermal infrared images, and movement readings), pointing to a future of generative AI that creates immersive experiences by cross-referencing this information.
For now, this is a research project, published as a paper, with no immediate consumer or practical applications, but its open release contrasts favorably with the secretive way OpenAI and Google develop comparable models, according to experts.
For example, AI image generators like DALL-E, Stable Diffusion, and Midjourney all rely on systems that link together text and images during the training stage, as they follow users’ text inputs to generate pictures. Many AI tools generate video or audio in the same way.
Meta’s underlying model, named ImageBind, is the first one to combine six types of data into a single embedding space.
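Because Meta released the code, the joint embedding space can be inspected directly. Here is a minimal sketch along the lines of the example in the ImageBind repository; the file paths are placeholders, and exact function names may vary between versions.

```python
# Hedged sketch: embed text, an image, and an audio clip into ImageBind's
# shared space, then compare audio against text with a similarity score.
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda" if torch.cuda.is_available() else "cpu"
model = imagebind_model.imagebind_huge(pretrained=True).eval().to(device)

inputs = {
    ModalityType.TEXT: data.load_and_transform_text(["a dog barking"], device),
    ModalityType.VISION: data.load_and_transform_vision_data(["dog.jpg"], device),  # placeholder path
    ModalityType.AUDIO: data.load_and_transform_audio_data(["bark.wav"], device),   # placeholder path
}
with torch.no_grad():
    embeddings = model(inputs)

# One embedding per modality, all living in the same vector space.
sim = embeddings[ModalityType.AUDIO] @ embeddings[ModalityType.TEXT].T
print("audio-to-text similarity:", sim.item())
```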
Recently, Meta open-sourced LLaMA, the language model that started an alternative movement to OpenAI and Google. With ImageBind, it’s continuing with this strategy by opening the floodgates for researchers to try to develop new, holistic AI systems.
• “When humans absorb information from the world, we innately use multiple senses, such as seeing a busy street and hearing the sounds of car engines. Today, we’re introducing an approach that brings machines one step closer to humans’ ability to learn simultaneously, holistically, and directly from many different forms of information — without the need for explicit supervision (the process of organizing and labeling raw data),” said Meta in a blog post.
• “For instance, while Make-A-Scene can generate images by using text prompts, ImageBind could upgrade it to generate images using audio sounds, such as laughter or rain.”
• “Imagine that someone could take a video recording of an ocean sunset and instantly add the perfect audio clip to enhance it, or when a model like Make-A-Video produces a video of a carnival, ImageBind can suggest background noise to accompany it, creating an immersive experience.”
• “There’s still a lot to uncover about multimodal learning. We hope the research community will explore ImageBind and our accompanying published paper to find new ways to evaluate vision models and lead to novel applications.”
IBM announced Watsonx at its annual Think conference yesterday: a new platform that gives customers access to the toolset, infrastructure, and consulting services to build their own AI models, or to fine-tune and adapt pre-trained models to their own data, for generating computer code and text.
According to IBM, Watsonx is an “enterprise studio for AI builders,” motivated by the challenges businesses experience in deploying AI within the workplace. In the same category, Amazon provides SageMaker Studio; Google, Vertex AI; and Microsoft, Azure AI. Along with tech giants, startups like Cohere and Anthropic appear as competitors.
IBM is offering seven pre-trained models to businesses using Watsonx.ai, a few of which are open source. It’s also partnering with Hugging Face, the AI startup, to include thousands of Hugging Face–developed models, datasets, and libraries.
(For its part, IBM is pledging to contribute open-source AI dev software to Hugging Face and make several of its in-house models accessible from Hugging Face’s AI development platform.)
The three in-house model families are fm.model.code, which generates code; fm.model.NLP, a collection of large language models; and fm.model.geospatial, a model built on climate and remote-sensing data from NASA.
Similar to code-generating models like GitHub’s Copilot, fm.model.code lets a user give a command in natural language and then builds the corresponding coding workflow. Fm.model.NLP comprises text-generating models for specific and industry-relevant domains, like organic chemistry. And fm.model.geospatial makes predictions to help plan for changes in natural disaster patterns, biodiversity, and land use, in addition to other geophysical processes.
The models are differentiated by a training dataset containing “multiple types of business data, including code, time-series data, tabular data and geospatial data and IT events data,” IBM CEO Arvind Krishna said in a press roundtable.
IBM is using the models itself, it says, across its suite of software products and services. For example, fm.model.code powers Watson Code Assistant, IBM’s answer to Copilot, which allows developers to generate code using plain English prompts across programs including Red Hat’s Ansible.
As for fm.model.NLP, those models have been integrated with AIOps Insights, Watson Assistant, and Watson Orchestrate — IBM’s AIOps toolkit, smart assistant, and workflow automation tech, respectively — to provide greater visibility into performance across IT environments, resolve IT incidents in a more expedient way and improve customer service experiences — or so IBM promises.
Fm.model.geospatial, meanwhile, underpins IBM’s EIS Builder Edition, a product that lets organizations create solutions addressing environmental risks.
Alongside Watsonx.ai, under the same Watsonx brand umbrella, IBM unveiled Watsonx.data, a “fit-for-purpose” data store designed for both governed data and AI workloads. Watsonx.data allows users to access data through a single point of entry while applying query engines, IBM says, plus governance, automation, and integrations with an organization’s existing databases and tools.
Complementing Watsonx.ai and Watsonx.data is Watsonx.governance, a toolkit that provides mechanisms to protect customer privacy, detect model bias and drift, and help organizations meet ethics standards.
In an announcement related to Watsonx, IBM showcased a new GPU offering in IBM Cloud optimized for compute-intensive workloads — specifically training and serving AI models.
The company also showed off the IBM Cloud Carbon Calculator, an “AI-informed” dashboard that enables customers to measure, track, manage, and help report carbon emissions generated through their cloud usage.
Around 30% of business leaders responding to an IBM survey cited trust and transparency issues as barriers holding them back from adopting AI, while 42% cited privacy concerns around generative AI.
IBM expects AI will add $16 trillion to the global economy by 2030 and that 30% of back-office tasks will be automated within the next five years.
Atlassian announced an OpenAI-based ‘virtual teammate’ for its collaboration platforms Confluence and Jira this week at Team ’23, its annual conference in Las Vegas.
The Atlassian Intelligence chatbot will roll out in tiers starting in July 2023; for now, it is available in early access through a waitlist. Some of its features will become paid over time.
In Confluence, Atlassian Intelligence can summarize meetings with action items and decision overviews. It can also draft tweets and blog posts using Confluence documents as reference material.
In addition, this new tool can translate natural language queries into Atlassian’s SQL-like Jira Query Language.
In the collaboration software market, Confluence competes with Notion, which has its own suite of AI tools, as well as with Guru and Zoho.
Adding ChatGPT-enabled features to services is becoming an increasingly common practice in the corporate world.
Hugging Face and ServiceNow released StarCoder, a free AI code-generating system and an alternative to GitHub’s Copilot (powered by OpenAI’s Codex), DeepMind’s AlphaCode, and Amazon’s CodeWhisperer.
StarCoder — which is licensed to allow for royalty-free use by anyone, including corporations — was trained on over 80 programming languages as well as text from GitHub repositories, including documentation and Jupyter programming notebooks.
It also integrates with Microsoft’s Visual Studio Code code editor and, like OpenAI’s ChatGPT, can follow basic instructions (e.g., “create an app UI”) and answer questions about code.
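Because the weights live on the Hugging Face Hub, trying StarCoder takes only a few lines with the transformers library. A minimal sketch (the bigcode/starcoder checkpoint is gated, so accepting the license on the Hub comes first):

```python
# Hedged sketch: code completion with StarCoder via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.float16, device_map="auto"
)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```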
ServiceNow supplied an in-house compute cluster of 512 Nvidia V100 GPUs to train the StarCoder model.
Leandro von Werra of Hugging Face, a co-lead on StarCoder, claimed that StarCoder matches or outperforms the OpenAI model that powered initial versions of Copilot.
Unlike Copilot, the 15-billion-parameter StarCoder was trained over the course of several days on an open-source dataset called The Stack, which has over 19 million curated, permissively licensed repositories and more than six terabytes of code in over 350 programming languages.
Because it’s permissively licensed, code from The Stack can be copied, modified, and redistributed.
StarCoder isn’t open source in the strictest sense. Rather, it’s being released under a licensing scheme, OpenRAIL-M, that includes “legally enforceable” use case restrictions.
The StarCoder code repositories, model training framework, dataset-filtering methods, code evaluation suite, and research analysis notebooks are available on GitHub as of this week.
“At launch, StarCoder will not ship as many features as GitHub Copilot, but with its open-source nature, the community can help improve it along the way as well as integrate custom models,” Leandro von Werra told TechCrunch.
The nonprofit Software Freedom Conservancy, among others, has criticized GitHub and OpenAI for using public source code, not all of it under a permissive license, to train and monetize Codex.
Introducing: 💫StarCoder
StarCoder is a 15B LLM for code with 8k context and trained only on permissive data in 80+ programming languages. It can be prompted to reach 40% pass@1 on HumanEval and act as a Tech Assistant.
StarCoder was also trained on JupyterNotebooks and with Jupyter plugin from @JiaLi52524397 it can make use of previous code and markdown cells as well as outputs to predict the next cell.
Microsoft announced this week that it has opened up access to its GPT-4-based Bing chatbot to anyone with a Microsoft account, removing the waitlist. The chatbot originally launched in a private preview in February, and Microsoft has gradually been opening it up ever since.
The software giant is also massively upgrading Bing Chat and redesigning Edge with lots of new features, including image and video results, persistent chat and history, compose sidebar, Knowledge Cards and visual search, and plug-in support.
Microsoft is working with OpenTable to enable its plug-in for completing restaurant bookings within Bing Chat and WolframAlpha for generating visualizations.
According to Microsoft, in ninety days, Bing has grown to 100 million daily active users and people have created 200 million images with Bing Image Creator.
• “We aimed to tackle a universal problem with traditional search – that nearly half of all web searches go unanswered, resulting in billions of people’s searches falling short of the mark. We launched the new Bing to bring you better search results, answers to your questions, the ability to create and compose, and a new level of ease of use by being able to chat in natural language.”
• “Bing combines powerful large language models like Open AI’s GPT-4 with our immense search index for results that are current, cited, and conversational – something you can’t get anywhere else but on Bing. This is fundamentally changing the way people find information.”
A leaked document from a senior software engineer at Google noted that Google is losing its edge in artificial intelligence to the open-source community, where many independent researchers use AI technology to make rapid and unexpected advances.
“A third faction has been quietly eating our lunch. I’m talking, of course, about open source,” he said.
“The one clear winner in all of this is Meta. Because the leaked model [LLaMA] was theirs, they have effectively garnered an entire planet’s worth of free labor. Since most open-source innovation is happening on top of their architecture, there is nothing stopping them from directly incorporating it into their products.”
The engineer, Luke Sernau, published the document on an internal system at Google in early April.
Over the past few weeks, the document was shared thousands of times among Googlers, according to a person familiar with the matter, who asked not to be named because they were not authorized to discuss internal company matters.
On Thursday, the document below was published by an anonymous individual on a public Discord server. The only modifications are formatting and the removal of links to internal web pages. The document reflects only the opinion of one Google employee, not the company as a whole.
We’ve done a lot of looking over our shoulders at OpenAI. Who will cross the next milestone? What will the next move be?
But the uncomfortable truth is, we aren’t positioned to win this arms race and neither is OpenAI. While we’ve been squabbling, a third faction has been quietly eating our lunch.
I’m talking, of course, about open source. Plainly put, they are lapping us. Things we consider “major open problems” are solved and in people’s hands today. Just to name a few:
• LLMs on a Phone: People are running foundation models on a Pixel 6 at 5 tokens per second.
• Scalable Personal AI: You can finetune a personalized AI on your laptop in an evening.
• Responsible Release: This one isn’t “solved” so much as “obviated”. There are entire websites full of art models with no restrictions whatsoever, and text is not far behind.
• Multimodality: The current multimodal ScienceQA SOTA was trained in an hour.
While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly. Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params that we struggle with at $10M and 540B. And they are doing so in weeks, not months. This has profound implications for us:
We have no secret sauce. Our best hope is to learn from and collaborate with what others are doing outside Google. We should prioritize enabling 3P integrations.
People will not pay for a restricted model when free, unrestricted alternatives are comparable in quality. We should consider where our value add really is.
Giant models are slowing us down. In the long run, the best models are the ones that can be iterated upon quickly. We should make small variants more than an afterthought, now that we know what is possible in the <20B parameter regime.
https://lmsys.org/blog/2023-03-30-vicuna/
What Happened
At the beginning of March, the open source community got their hands on their first really capable foundation model, as Meta’s LLaMA was leaked to the public. It had no instruction or conversation tuning, and no RLHF. Nonetheless, the community immediately understood the significance of what they had been given.
A tremendous outpouring of innovation followed, with just days between major developments (see The Timeline for the full breakdown). Here we are, barely a month later, and there are variants with instruction tuning, quantization, quality improvements, human evals, multimodality, RLHF, etc. etc. many of which build on each other.
Most importantly, they have solved the scaling problem to the extent that anyone can tinker. Many of the new ideas are from ordinary people. The barrier to entry for training and experimentation has dropped from the total output of a major research organization to one person, an evening, and a beefy laptop.
Why We Could Have Seen It Coming
In many ways, this shouldn’t be a surprise to anyone. The current renaissance in open-source LLMs comes hot on the heels of a renaissance in image generation. The similarities are not lost on the community, with many calling this the “Stable Diffusion moment” for LLMs.
In both cases, low-cost public involvement was enabled by a vastly cheaper mechanism for fine-tuning called low-rank adaptation, or LoRA, combined with a significant breakthrough in scale (latent diffusion for image synthesis, Chinchilla for LLMs). In both cases, access to a sufficiently high-quality model kicked off a flurry of ideas and iteration from individuals and institutions around the world. In both cases, this quickly outpaced the large players.
These contributions were pivotal in the image generation space, setting Stable Diffusion on a different path from Dall-E. Having an open model led to product integrations, marketplaces, user interfaces, and innovations that didn’t happen for Dall-E.
The effect was palpable: rapid domination in terms of cultural impact vs the OpenAI solution, which became increasingly irrelevant. Whether the same thing will happen for LLMs remains to be seen, but the broad structural elements are the same.
What We Missed
The innovations that powered open source’s recent successes directly solve problems we’re still struggling with. Paying more attention to their work could help us to avoid reinventing the wheel.
LoRA is an incredibly powerful technique we should probably be paying more attention to
LoRA works by representing model updates as low-rank factorizations, which reduces the size of the update matrices by a factor of up to several thousand. This allows model fine-tuning at a fraction of the cost and time. Being able to personalize a language model in a few hours on consumer hardware is a big deal, particularly for aspirations that involve incorporating new and diverse knowledge in near real-time. The fact that this technology exists is underexploited inside Google, even though it directly impacts some of our most ambitious projects.
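To make the mechanics concrete, here is a minimal PyTorch sketch of the idea (an illustration, not the official LoRA implementation): the pretrained weight stays frozen and only two small factor matrices are trained.

```python
# Hedged sketch of LoRA: learn a low-rank update B @ A on top of a frozen layer.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights are frozen
        # Factors of shape (r, in) and (out, r) replace a full (out, in) update.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# A 4096x4096 projection: the full update has ~16.8M entries, the rank-8
# factors only 65,536 -- a reduction of more than 250x.
layer = LoRALinear(nn.Linear(4096, 4096), rank=8)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 65536
```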
Retraining models from scratch is the hard path
Part of what makes LoRA so effective is that – like other forms of fine-tuning – it’s stackable. Improvements like instruction tuning can be applied and then leveraged as other contributors add on dialogue, or reasoning, or tool use. While the individual fine tunings are low rank, their sum need not be, allowing full-rank updates to the model to accumulate over time.
This means that as new and better datasets and tasks become available, the model can be cheaply kept up to date, without ever having to pay the cost of a full run.
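In symbols (a sketch of the argument, not notation from the memo itself):

```latex
W \;=\; W_0 \;+\; \sum_{i=1}^{k} B_i A_i,
\qquad \operatorname{rank}(B_i A_i) \le r,
\qquad \operatorname{rank}\!\Big(\sum_{i=1}^{k} B_i A_i\Big) \le \min(kr,\, d).
```

Each individual fine-tuning contributes an update of rank at most r, but the accumulated sum can reach rank kr, approaching a full-rank (rank-d) update without any single contributor paying full-rank cost.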
By contrast, training giant models from scratch not only throws away the pretraining, but also any iterative improvements that have been made on top. In the open source world, it doesn’t take long before these improvements dominate, making a full retrain extremely costly.
We should be thoughtful about whether each new application or idea really needs a whole new model. If we really do have major architectural improvements that preclude directly reusing model weights, then we should invest in more aggressive forms of distillation that allow us to retain as much of the previous generation’s capabilities as possible.
Large models aren’t more capable in the long run if we can iterate faster on small models
LoRA updates are very cheap to produce (~$100) for the most popular model sizes. This means that almost anyone with an idea can generate one and distribute it. Training times under a day are the norm. At that pace, it doesn’t take long before the cumulative effect of all of these fine-tunings overcomes starting off at a size disadvantage. Indeed, in terms of engineer-hours, the pace of improvement from these models vastly outstrips what we can do with our largest variants, and the best are already largely indistinguishable from ChatGPT. Focusing on maintaining some of the largest models on the planet actually puts us at a disadvantage.
Data quality scales better than data size
Many of these projects are saving time by training on small, highly curated datasets. This suggests there is some flexibility in data scaling laws. The existence of such datasets follows from the line of thinking in Data Doesn’t Do What You Think, and they are rapidly becoming the standard way to do training outside Google. These datasets are built using synthetic methods (e.g. filtering the best responses from an existing model) and scavenging from other projects, neither of which is dominant at Google. Fortunately, these high quality datasets are open source, so they are free to use.
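The filtering recipe is simple enough to sketch. In this schematic, generate_fn and score_fn are hypothetical stand-ins for a generator model and a quality scorer (a reward model, heuristics, or human votes), not any particular project’s pipeline:

```python
# Hedged sketch of "filter the best responses from an existing model".
def build_curated_dataset(prompts, generate_fn, score_fn, n_samples=8):
    dataset = []
    for prompt in prompts:
        # Sample several candidate responses, keep only the best-scoring one.
        candidates = [generate_fn(prompt) for _ in range(n_samples)]
        best = max(candidates, key=score_fn)
        dataset.append({"prompt": prompt, "response": best})
    return dataset
```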
Directly Competing With Open Source Is a Losing Proposition
This recent progress has direct, immediate implications for our business strategy. Who would pay for a Google product with usage restrictions if there is a free, high quality alternative without them?
And we should not expect to be able to catch up. The modern internet runs on open source for a reason. Open source has some significant advantages that we cannot replicate.
We need them more than they need us
Keeping our technology secret was always a tenuous proposition. Google researchers are leaving for other companies on a regular cadence, so we can assume they know everything we know, and will continue to for as long as that pipeline is open.
But holding on to a competitive advantage in technology becomes even harder now that cutting edge research in LLMs is affordable. Research institutions all over the world are building on each other’s work, exploring the solution space in a breadth-first way that far outstrips our own capacity. We can try to hold tightly to our secrets while outside innovation dilutes their value, or we can try to learn from each other.
Individuals are not constrained by licenses to the same degree as corporations
Much of this innovation is happening on top of the leaked model weights from Meta. While this will inevitably change as truly open models get better, the point is that they don’t have to wait. The legal cover afforded by “personal use” and the impracticality of prosecuting individuals means that individuals are getting access to these technologies while they are hot.
Being your own customer means you understand the use case
Browsing through the models that people are creating in the image generation space, there is a vast outpouring of creativity, from anime generators to HDR landscapes. These models are used and created by people who are deeply immersed in their particular subgenre, lending a depth of knowledge and empathy we cannot hope to match.
Owning the Ecosystem: Letting Open Source Work for Us
Paradoxically, the one clear winner in all of this is Meta. Because the leaked model was theirs, they have effectively garnered an entire planet’s worth of free labor. Since most open-source innovation is happening on top of their architecture, there is nothing stopping them from directly incorporating it into their products.
The value of owning the ecosystem cannot be overstated. Google itself has successfully used this paradigm in its open-source offerings, like Chrome and Android. By owning the platform where innovation happens, Google cements itself as a thought leader and direction-setter, earning the ability to shape the narrative on ideas that are larger than itself.
The more tightly we control our models, the more attractive we make open alternatives. Google and OpenAI have both gravitated defensively toward release patterns that allow them to retain tight control over how their models are used. But this control is a fiction. Anyone seeking to use LLMs for unsanctioned purposes can simply take their pick of the freely available models.
Google should establish itself as a leader in the open source community, taking the lead by cooperating with, rather than ignoring, the broader conversation. This probably means taking some uncomfortable steps, like publishing the model weights for small ULM variants. This necessarily means relinquishing some control over our models. But this compromise is inevitable. We cannot hope to both drive innovation and control it.
Epilogue: What about OpenAI?
All this talk of open source can feel unfair given OpenAI’s current closed policy. Why do we have to share, if they won’t? But the fact of the matter is, we are already sharing everything with them in the form of the steady flow of poached senior researchers. Until we stem that tide, secrecy is a moot point.
And in the end, OpenAI doesn’t matter. They are making the same mistakes we are in their posture relative to open source, and their ability to maintain an edge is necessarily in question. Open source alternatives can and will eventually eclipse them unless they change their stance. In this respect, at least, we can make the first move.
The Timeline
Feb 24, 2023 – LLaMA is Launched
Meta launches LLaMA, open sourcing the code, but not the weights. At this point, LLaMA is not instruction or conversation tuned. Like many current models, it is a relatively small model (available at 7B, 13B, 33B, and 65B parameters) that has been trained for a relatively large amount of time, and is therefore quite capable relative to its size.
March 3, 2023 – The Inevitable Happens
Within a week, LLaMA is leaked to the public. The impact on the community cannot be overstated. Existing licenses prevent it from being used for commercial purposes, but suddenly anyone is able to experiment. From this point forward, innovations come hard and fast.
March 12, 2023 – Language models on a Toaster
A little over a week later, Artem Andreenko gets the model working on a Raspberry Pi. At this point the model runs too slowly to be practical because the weights must be paged in and out of memory. Nonetheless, this sets the stage for an onslaught of minification efforts.
March 13, 2023 – Fine Tuning on a Laptop
The next day, Stanford releases Alpaca, which adds instruction tuning to LLaMA. More important than the actual weights, however, was Eric Wang’s alpaca-lora repo, which used low rank fine-tuning to do this training “within hours on a single RTX 4090”.
Suddenly, anyone could fine-tune the model to do anything, kicking off a race to the bottom on low-budget fine-tuning projects. Papers proudly describe their total spend of a few hundred dollars. What’s more, the low rank updates can be distributed easily and separately from the original weights, making them independent of the original license from Meta. Anyone can share and apply them.
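In practice, the community recipe looks roughly like the following sketch using Hugging Face’s peft library; the base checkpoint named here is the community LLaMA conversion that alpaca-lora built on, shown purely for illustration.

```python
# Hedged sketch of low-rank fine-tuning with peft, in the style of alpaca-lora.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")  # illustrative
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attach adapters to attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of all weights

# Train with any standard loop or Trainer, then save just the adapter:
# the resulting files are megabytes, distributable separately from Meta's weights.
model.save_pretrained("my-lora-adapter")
```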
March 18, 2023 – Now It’s Fast
Georgi Gerganov uses 4 bit quantization to run LLaMA on a MacBook CPU. It is the first “no GPU” solution that is fast enough to be practical.
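The trick trades precision for memory: weights are grouped into small blocks, each stored as 4-bit integers plus one shared scale. A toy sketch in that spirit (real formats such as ggml’s Q4 variants differ in detail):

```python
# Hedged sketch of blockwise 4-bit quantization.
import numpy as np

def quantize_q4(weights: np.ndarray, block: int = 32):
    w = weights.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0 + 1e-12  # one scale per block
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)     # 4-bit codes
    return q, scale

def dequantize_q4(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.randn(1024).astype(np.float32)
q, s = quantize_q4(w)
print("max abs error:", np.abs(w - dequantize_q4(q, s)).max())
# Storage falls from 32 bits per weight to ~5 (4-bit code plus amortized scale).
```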
March 19, 2023 – A 13B model achieves “parity” with Bard
The next day, a cross-university collaboration releases Vicuna, and uses GPT-4-powered eval to provide qualitative comparisons of model outputs. While the evaluation method is suspect, the model is materially better than earlier variants. Training Cost: $300.
Notably, they were able to use data from ChatGPT while circumventing restrictions on its API – They simply sampled examples of “impressive” ChatGPT dialogue posted on sites like ShareGPT.
March 25, 2023 – Choose Your Own Model
Nomic creates GPT4All, which is both a model and, more importantly, an ecosystem. For the first time, we see models (including Vicuna) being gathered together in one place. Training Cost: $100.
March 28, 2023 – Open Source GPT-3
Cerebras (not to be confused with our own Cerebra) trains the GPT-3 architecture using the optimal compute schedule implied by Chinchilla, and the optimal scaling implied by μ-parameterization. This outperforms existing GPT-3 clones by a wide margin, and represents the first confirmed use of μ-parameterization “in the wild”. These models are trained from scratch, meaning the community is no longer dependent on LLaMA.
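The “optimal compute schedule implied by Chinchilla” reduces to a rule of thumb: train on roughly 20 tokens per parameter. A back-of-the-envelope sketch (the sizes are a few of Cerebras-GPT’s published ones; the token and FLOP numbers are illustrative arithmetic, not Cerebras figures):

```python
# Hedged sketch: Chinchilla-style compute-optimal token budgets.
for params in (111e6, 1.3e9, 6.7e9, 13e9):
    tokens = 20 * params          # ~20 tokens per parameter
    flops = 6 * params * tokens   # standard ~6*N*D training-FLOPs estimate
    print(f"{params/1e9:5.2f}B params -> {tokens/1e9:6.0f}B tokens, ~{flops:.1e} FLOPs")
```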
March 28, 2023 – Multimodal Training in One Hour
Using a novel Parameter Efficient Fine Tuning (PEFT) technique, LLaMA-Adapter introduces instruction tuning and multimodality in one hour of training. Impressively, they do so with just 1.2M learnable parameters. The model achieves a new SOTA on multimodal ScienceQA.
April 3, 2023 – Real Humans Can’t Tell the Difference Between a 13B Open Model and ChatGPT
Berkeley launches Koala, a dialogue model trained entirely using freely available data.
They take the crucial step of measuring real human preferences between their model and ChatGPT. While ChatGPT still holds a slight edge, more than 50% of the time users either prefer Koala or have no preference. Training Cost: $100.
April 15, 2023 – Open Source RLHF at ChatGPT Levels
Open Assistant launches a model and, more importantly, a dataset for Alignment via RLHF. Their model is close (48.3% vs. 51.7%) to ChatGPT in terms of human preference. In addition to LLaMA, they show that this dataset can be applied to Pythia-12B, giving people the option to use a fully open stack to run the model. Moreover, because the dataset is publicly available, it takes RLHF from unachievable to cheap and easy for small experimenters.
My thoughts on the now famous Google leak doc: https://t.co/hK2KcMTEDB
1. Open source AI is winning. I agree, and that is great for the world and for a competitive ecosystem. In LLMs we’re not there, but we just got OpenClip to beat openAI Clip and Stable diffusion is better than…
The Biden Administration announced actions to deal with AI-related risks and opportunities and make sure companies deploy safe products.
Yesterday, Vice President Harris and senior Administration officials met with the CEOs of Alphabet, Anthropic, Microsoft, and OpenAI, encouraging them to apply safeguards that mitigate risks and potential harms to individuals and society. [In the picture: Sundar Pichai, Google’s CEO, left, and Sam Altman, OpenAI’s CEO, arriving at the White House.]
More engagements are planned with corporations, researchers, civil rights organizations, not-for-profit organizations, communities, international partners, and others on critical AI issues.
In addition, the National Science Foundation announced $140 million in funding to launch seven new National AI Research Institutes that will advance AI R&D to drive breakthroughs in critical areas, including climate, agriculture, energy, public health, education, and cybersecurity.
This investment will bring the total number of Institutes to 25 across the country, and extend the network of organizations involved to nearly every state.
The Administration also announced an independent commitment from leading AI developers, including Anthropic, Google, Hugging Face, Microsoft, NVIDIA, OpenAI, and Stability AI, to participate in a public evaluation of AI systems, on a platform developed by Scale AI, at the AI Village at DEFCON 31.
“This independent exercise will provide critical information to researchers and the public about the impacts of these models, and will enable AI companies and developers to take steps to fix issues found in those models. Testing of AI models independent of government or the companies that have developed them is an important component in their effective evaluation,” said the Biden Administration.
Discussing the impacts of artificial intelligence, Apple co-founder Steve Wozniak said on CNN that he was not concerned. [See video below.]
“I am confident AI will be used by bad actors, and yes it will cause real damage,” Microsoft Corp. Chief Economist Michael Schwarz said during a World Economic Forum panel in Geneva on Wednesday.
“It can do a lot of damage in the hands of spammers with elections and so on,” he added.
His statements came one day after the “Godfather of AI,” Dr. Geoffrey Hinton [in the picture], quit Google after warning of the dangers of AI, as The New York Times reported.
Dr. Geoffrey Hinton, an artificial intelligence pioneer, announced that he regrets his life’s work and is leaving Google, where he worked for more than a decade, so that he can freely share his concern that artificial intelligence could cause the world serious harm.
On Monday, he joined a growing number of critics who say tech companies are racing toward danger with their aggressive campaign to create products based on generative artificial intelligence, the technology that powers popular chatbots like ChatGPT.
“I console myself with the normal excuse: If I hadn’t done it, somebody else would have,” Dr. Hinton said to The New York Times.
“Dr. Hinton’s journey from A.I. groundbreaker to doomsayer marks a remarkable moment for the technology industry at perhaps its most important inflection point in decades,” wrote the paper.
Many industry insiders say generative A.I. can already be a tool for misinformation; soon, it could be a risk to jobs; and somewhere down the line, it could be a risk to humanity.
“It is hard to see how you can prevent the bad actors from using it for bad things,” Dr. Hinton said.
Google spent $44 million to acquire a company started by Dr. Hinton and two of his students, and their system led to the creation of increasingly powerful technologies, including new chatbots like ChatGPT and Google Bard.
In 2018, Dr. Hinton and two other longtime collaborators received the Turing Award, often called “the Nobel Prize of computing,” for their work on neural networks.
Dr. Hinton believes that the race between Google and Microsoft and others will escalate into a global race that will not stop without some sort of global regulation.
“The best hope is for the world’s leading scientists to collaborate on ways of controlling the technology,” he said.
Palo Alto-based startup Inflection AI, run by ex-DeepMind leaders, is launching a conversational chatbot named Pi (for “personal intelligence”) that plays the role of an active listener and engages in a more personal way over back-and-forth dialogue.
“It’s really a new class of AI — it’s distinct in the sense that a personal AI is one that really works for you as the individual,” said Mustafa Suleyman, CEO of Inflection AI, [in the picture below].
“Pi will help you organize your schedule, prep for meetings, and learn new skills,” he added. “It’s AI that is singularly aligned to your interests, that’s really yours.”
Built on one of Inflection’s in-house large language models, Pi is designed to speak casually, as if conversing with an attentive friend, while giving fact-based answers. Inflection AI has raised $225 million to date.
The market is now being flooded with all kinds of bots, with the industry shipping new models and products at an accelerating pace.
Last week, Suleyman’s former colleague at Google, AI “godfather” Geoffrey Hinton, announced he had quit his job at Google to be able to speak more freely about AI’s dangers. “Look at how it was five years ago and how it is now,” Hinton told The New York Times. “Take the difference and propagate it forward. That’s scary.”
Suleyman’s previous company, DeepMind, was acquired by Google in 2014 and has formed the backbone of much of the company’s AI research — and the industry’s — ever since.
Structured as a public benefit corporation, Inflection employs about 30 people today, with co-founder Reid Hoffman spending about one day per week with the company, according to Suleyman.
Inflection plans to offer Pi for free for now, with no token restrictions. Like OpenAI, Inflection uses Microsoft Azure for its cloud infrastructure.
“Through 10 or 20 such exchanges, Pi can tease out what a user really wants to know, or is hoping to talk through, more like a sounding board than a repackaged Wikipedia answer,” Suleyman said to Forbes.
Unlike other chatbots, Pi remembers a hundred turns of conversation with logged-in users across platforms, including WhatsApp, SMS messages, Facebook messages, and Instagram.
It also detects when users appear agitated or frustrated and tweaks the tone of its responses accordingly, Suleyman said.