I gave a talk at Seoul National University.
I titled the talk “Large Language Models (in 2023)”. This was an ambitious attempt to summarize our exploding field.
Video: https://youtu.be/dbo3kNKPaUA
Slides: https://docs.google.com/presentation/d/1636wKStYdT_yRPbJNrf8MLKpQghuWGDmyHinHhAKeXY/edit?usp=sharing
Trying to summarize the field forced me to think about what really matters in the field. While scaling undeniably stands out, its far-reaching implications are more nuanced. I share my thoughts on scaling from three angles:
1) A change in perspective is necessary because some abilities only emerge at a certain scale. Even if some abilities don’t work with the current generation of LLMs, we should not claim that they don’t work. Rather, we should think that they don’t work yet. Once larger models are available, many conclusions change.
This also means that some conclusions from the past are invalidated and we need to constantly unlearn intuitions built on top of such ideas.
2) From first principles, scaling up the Transformer amounts to efficiently doing matrix multiplications with many, many machines. I see many LLM researchers who are not familiar with how scaling is actually done. This section is aimed at technical audiences who want to understand what it means to train large models.
3) I talk about what we should think about for further scaling (think 10000x GPT-4 scale). To me, scaling isn’t just doing the same thing with more machines. It entails finding the inductive bias that is the bottleneck in further scaling.
I believe that the maximum likelihood objective function is the bottleneck in reaching 10000x GPT-4 scale. Learning the objective function with an expressive neural net is the next paradigm, and it is a lot more scalable. With the compute cost going down exponentially, scalable methods eventually win. Don’t compete with that.
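For readers who want the objective written out: the formula below is the standard next-token maximum likelihood loss for an autoregressive language model, which is presumably what is meant here. The symbols (parameters θ, tokens x_1..x_T) are notation introduced for this note and are not taken from the talk.

```latex
% Negative log-likelihood of a token sequence x_1, ..., x_T under model p_\theta.
% Minimizing this loss is equivalent to maximizing the likelihood of the data.
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}\right)
```

The loss is fixed by hand: every token in the training text counts as the single correct target. "Learning the objective function" would replace this fixed form with one that is itself parameterized and trained.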
In all of these sections, I strive to describe everything from first principles. In an extremely fast-moving field like LLMs, no one can keep up. I believe that understanding the core ideas by deriving them from first principles is the only scalable approach.
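To make point 2 concrete, here is a minimal sketch of the idea that a large matrix multiplication can be split across machines. It is a toy, pure-Python illustration and not the method from the talk: the "devices" are simulated in one process, and the helper names (`split_columns`, `sharded_matmul`) are made up for this example.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix multiplication: (m x k) @ (k x n) -> (m x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w: Matrix, num_shards: int) -> List[Matrix]:
    """Split a weight matrix column-wise into equal shards, one per simulated device."""
    n_cols = len(w[0])
    assert n_cols % num_shards == 0, "columns must divide evenly across shards"
    width = n_cols // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]


def sharded_matmul(x: Matrix, w: Matrix, num_shards: int) -> Matrix:
    """Compute x @ w with w split column-wise across simulated devices."""
    # Each "device" holds one column shard of w and computes its slice of the output.
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    # Concatenating the slices along the column axis stands in for an all-gather.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
# The sharded result matches the single-machine result.
assert sharded_matmul(x, w, num_shards=2) == matmul(x, w)
```

Each shard only needs its own slice of the weights, so no single device has to hold the full matrix. In a real system the work lies in choosing which axes to split and keeping the communication step cheap.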
London-based autone (YC S22) has raised $4.5M (~€4.27M) in seed funding to help businesses maximize their growth through data-driven inventory optimization.
Today, retailers have to make thousands of complex operational decisions every day, all of which affect their bottom line. This problem is currently being tackled with Excel or legacy systems, neither of which is fit for purpose any longer.
Founded by Adil Bouhdadi and Harry Glucksmann Cheslaw, two business scientists with over a decade of experience building successful data-driven supply chains at places like Kering and LVMH, autone is a platform that lets retailers make optimal decisions, easily and quickly. It ingests a retailer's data, generates recommendations, and then allows users to approve a given action.
Autone covers topics including product pricing, inventory replenishment, and re-ordering with the goal of covering all operational processes.
Congrats to the team on the seed!
https://twitter.com/AlexReibman/status/1710160221719654421
For one day only, Lightspeed Venture Partners invited San Francisco’s top AI entrepreneurs to showcase what’s possible with AI
Live product and raw code only. No bullshit.
Here are the top demos from the @cerebral_valley x @lightspeedvp AI coworking day (🧵)
I bought a couple of Chinese microphones. I wear them switched on all day, recording everything I say. At the end of the day, the files are processed with OpenAI’s Whisper and turned into text files, from which the information is extracted.
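A rough sketch of what that end-of-day step could look like, assuming the open-source `openai-whisper` Python package (plus ffmpeg) and recordings saved as `.wav` files. The function names, the folder layout, and the `base` model size are all assumptions for illustration, not details from the post.

```python
from pathlib import Path
from typing import Callable, List, Optional


def whisper_transcriber(model_name: str = "base") -> Callable[[Path], str]:
    """Build a transcriber backed by the open-source `openai-whisper` package."""
    import whisper  # imported lazily; needs `pip install openai-whisper` and ffmpeg

    model = whisper.load_model(model_name)
    return lambda path: model.transcribe(str(path))["text"]


def transcribe_day(
    audio_dir: Path,
    out_dir: Path,
    transcribe: Optional[Callable[[Path], str]] = None,
) -> List[Path]:
    """Write one .txt transcript per .wav recording and return the transcript paths."""
    # Default to Whisper; any callable mapping an audio path to text can be swapped in.
    transcribe = transcribe or whisper_transcriber()
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for audio in sorted(audio_dir.glob("*.wav")):
        text_path = out_dir / (audio.stem + ".txt")
        text_path.write_text(transcribe(audio).strip(), encoding="utf-8")
        written.append(text_path)
    return written
```

Calling `transcribe_day(Path("recordings"), Path("transcripts"))` (hypothetical folder names) would leave one text file per recording. The extraction step mentioned above would then run over those text files.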
Since the early days of the iPhone and Android, the smartphone has reigned supreme. Now, entrepreneurs and tech giants are racing to deliver AI in new devices and gadgets to challenge the dominant device.
Companies race to make AI you can wear
https://www.axios.com/2023/10/04/ai-wearables-meta-humane-tab-rewind?utm_campaign=editorial&utm_source=twitter&utm_medium=social
Meta and others are shipping glasses, pendants and pins to harness the power of generative AI.
https://flair.ai/
An AI tool that makes it easy to create the kinds of designs used in e-commerce. It has reportedly passed one million users.
https://twitter.com/i/status/1708340444546109820
AI Product Photo Generator & Editor | Create E-Commerce Images
Create studio-quality e-commerce photoshoots in seconds with our drag-and-drop AI editor. Try it free today.
Today we're officially opening applications for Llama Impact Grants.
Full details & application ➡️ https://bit.ly/45lqz7z
From now until November 15, organizations across the globe can submit proposals for how they'd like to utilize Llama 2 to address challenges across three different tracks: education, environment & open innovation. The goal of the program is to identify and support the most compelling applications of Llama 2 for societal benefit.
Following a two-phase proposal review process, three $500,000 grants will be awarded to winning teams to implement their solutions.
We can't wait to see what you'll build!
We are launching a challenge to encourage a diverse set of public, non-profit, and for-profit entities to use Llama 2 to address environmental, education and other important challenges.
Last week, Canva hit 150 million monthly active users, according to an internal investor deck viewed by The Information—a 50% jump over the 100 million MAUs it reported last October. While the vast majority of users opt for Canva’s free tools, the number of paying users is growing rapidly as well, up 60% from last year to 16 million. The company now has $1.7 billion in annualized revenue and a cash balance of $800 million; it claims to have been profitable for the last six years. Though Canva’s valuation was slashed to $25.5 billion earlier this year, down from $40 billion in September 2021, Perkins expresses unbridled confidence in its future. “We’re in a very strong position,” she said. “People are turning to Canva, not away, in times of economic uncertainty.”
In keeping with the times, the company this week debuted a suite of artificial intelligence products. Some AI-powered features, like a background remover and text generator, have existed on Canva for several years. But those offerings have now been joined by Magic Studio, a collection of AI-powered design tools that work in concert.
The new features include Magic Media, an AI image and video generator powered by generative AI company Runway’s Gen-2 model; Magic Switch, a tool to automatically convert a document’s design style or translate it into another language; Magic Write, a text generator powered by OpenAI’s ChatGPT; and Brand Voice, a text generator that can produce a specific tone. Other outside tools, like OpenAI’s Dall-E 2 and Google Cloud’s Imagen, can be accessed within Canva’s app directory. The company is also launching a Creator Compensation Program, in which Canva will pay creators who consent to the use of their photos to train generative AI models.
https://www.canva.com/design/DACsLMTGKb8/view#1
It was a 16-slide presentation deck, Hearnden recalled: “A simple story, made of photos woven together with a few words on each. It reinforced for me that these were exactly the kind of people I knew I wanted to work with—talented, driven, visionary, deeply committed but mixed with a healthy dose of cheek and the bizarre.”
Dave's Pitch Deck
Check out this Facebook Post designed by Melanie Perkins.
Here’s How FTX Executives Secretly Spent $8 Billion in Customer Money https://www.wsj.com/finance/regulation/sbf-trial-ftx-customer-money-missing-6ba13914?reflink=integratedwebview_share
Billions went to personal loans, luxury real estate and donations.
LlamaIndex Talk (AGI House + Truera + Pinecone)
https://docs.google.com/presentation/d/1mBhBgO7VFbp3gGPGx45-B6kcQRsLfbIK15s90DL5YXg/edit#slide=id.p
I also got a ton of questions on how to do structured querying with a vector DB. We have a guide here. The current guide uses Chroma, but make sure to swap it out for the Pinecone auto-retriever: https://docs.llamaindex.ai/en/stable/examples/vector_stores/chroma_auto_retriever.html
Pinecone integration: https://docs.llamaindex.ai/en/stable/examples/vector_stores/PineconeIndexDemo.html
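For anyone unsure what "structured querying" means here, the sketch below shows the underlying idea: filter records on exact metadata first, then rank the survivors by vector similarity. It is a library-free illustration and deliberately does not use the LlamaIndex, Chroma, or Pinecone APIs. The `Record` class and `structured_query` function are hypothetical names. As I understand the linked guide, an auto-retriever has an LLM infer the filters from the question, whereas here they are passed in by hand.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Record:
    text: str
    embedding: Sequence[float]
    metadata: Dict[str, str] = field(default_factory=dict)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def structured_query(
    records: List[Record],
    query_embedding: Sequence[float],
    filters: Dict[str, str],
    top_k: int = 2,
) -> List[Record]:
    """Keep records whose metadata matches every filter, then rank by similarity."""
    candidates = [
        r for r in records
        if all(r.metadata.get(key) == value for key, value in filters.items())
    ]
    candidates.sort(key=lambda r: cosine(r.embedding, query_embedding), reverse=True)
    return candidates[:top_k]


# Toy data: the embeddings are made-up 2-d vectors, not real model outputs.
docs = [
    Record("2023 revenue report", [1.0, 0.0], {"year": "2023"}),
    Record("2023 hiring plan", [0.0, 1.0], {"year": "2023"}),
    Record("2022 revenue report", [1.0, 0.0], {"year": "2022"}),
]
hits = structured_query(docs, [1.0, 0.0], {"year": "2023"}, top_k=1)
assert hits[0].text == "2023 revenue report"
```

The 2022 report has the same embedding as the top hit but is excluded by the metadata filter, which similarity search alone cannot guarantee.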
Evaluating your RAG App, by Jerry Liu (LlamaIndex co-founder/CEO)