Multimodality in the context of AI refers to systems that can process and interpret several types of data, such as text and images, at once. It matters because it mirrors human sensory experience and increases the robustness and versatility of AI systems. Across modalities and tasks, systems must not only analyze multimodal inputs but also understand and generate multimodal outputs, as in image captioning or visual question answering.
The fundamentals involve components like encoders for each data modality, alignment of different modalities into a joint embedding space, and for generative models, a language model to generate text responses.
CLIP (Contrastive Language-Image Pre-training) maps text and images into a shared space, enhancing tasks like image classification and retrieval. Flamingo is a large multimodal model that generates open-ended responses and is considered a significant leap in multimodal domains. CLIP is known for zero-shot learning capabilities, while Flamingo excels in generating responses based on visual and textual inputs.
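To make the joint-embedding idea concrete, below is a minimal PyTorch sketch of CLIP-style contrastive alignment. The encoders are stand-in linear projections over pre-extracted features, and the dimensions, batch size, and temperature initialization are illustrative assumptions, not the real CLIP architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyCLIP(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, embed_dim=512):
        super().__init__()
        # Project pre-computed features from each modality into a shared space.
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.txt_proj = nn.Linear(txt_dim, embed_dim)
        # Learnable temperature, initialized near CLIP's ln(1/0.07).
        self.logit_scale = nn.Parameter(torch.tensor(2.66))

    def forward(self, img_features, txt_features):
        # L2-normalize so the dot product is a cosine similarity.
        img_emb = F.normalize(self.img_proj(img_features), dim=-1)
        txt_emb = F.normalize(self.txt_proj(txt_features), dim=-1)
        logits = self.logit_scale.exp() * img_emb @ txt_emb.t()
        # Matched image/text pairs lie on the diagonal of the batch.
        labels = torch.arange(logits.size(0), device=logits.device)
        loss_img = F.cross_entropy(logits, labels)      # image -> text direction
        loss_txt = F.cross_entropy(logits.t(), labels)  # text -> image direction
        return (loss_img + loss_txt) / 2

# Fake pre-extracted features for a batch of 8 image/caption pairs.
model = ToyCLIP()
loss = model(torch.randn(8, 2048), torch.randn(8, 768))
print(loss.item())
```

Training on matched pairs pulls an image and its caption together in the shared space and pushes mismatched pairs apart, which is what enables CLIP's zero-shot classification and retrieval.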
Recent advances in multimodal research focus on systems like BLIP-2, LLaVA, LLaMA-Adapter V2, and LAVIN, which push forward multimodal output generation and parameter-efficient training via adapters, refining the interaction between AI and various data modalities.
https://huyenchip.com/2023/10/10/multimodal.html
Chip Huyen
Multimodality and Large Multimodal Models (LMMs)
For a long time, each ML model operated in one data mode – text (translation, language modeling), image (object detection, image classification), or audio (speech recognition).
Continuous Learning_Startup & Investment
Could AMD be an alternative to NVIDIA? https://www.databricks.com/blog/training-llms-scale-amd-mi250-gpus https://www.mosaicml.com/blog/amd-mi250 https://docs.google.com/presentation/d/1O-zbXIj103FxOHm9bWs9pjR8N8B47YAM/edit#slide=id.p1 https://www.lam…
6. Develop GPU alternatives
GPU has been the dominating hardware for deep learning ever since AlexNet in 2012. In fact, one commonly acknowledged reason for AlexNet’s popularity is that it was the first paper to successfully use GPUs to train neural networks. Before GPUs, if you wanted to train a model at AlexNet’s scale, you’d have to use thousands of CPUs, like the one Google released just a few months before AlexNet. Compared to thousands of CPUs, a couple of GPUs were a lot more accessible to Ph.D. students and researchers, setting off the deep learning research boom.
In the last decade, many, many companies, both big corporations and startups, have attempted to create new hardware for AI. The most notable attempts are Google’s TPUs, Graphcore’s IPUs (what’s happening with IPUs?), and Cerebras. SambaNova raised over a billion dollars to develop new AI chips but seems to have pivoted to being a generative AI platform.
For a while, there has been a lot of anticipation around quantum computing, with key players being:
IBM’s QPU
Google’s quantum computer reported a major milestone in quantum error reduction earlier this year in Nature. Its quantum virtual machine is publicly accessible via Google Colab.
Research labs such as MIT Center for Quantum Engineering, Max Planck Institute of Quantum Optics, Chicago Quantum Exchange, Oak Ridge National Laboratory, etc.
Another direction that is also super exciting is photonic chips. This is the direction I know the least about – so please correct me if I’m wrong. Existing chips today use electricity to move data, which consumes a lot of power and also incurs latency. Photonic chips use photons to move data, harnessing the speed of light for faster and more efficient compute. Various startups in this space have raised hundreds of millions of dollars, including Lightmatter ($270M), Ayar Labs ($220M), Lightelligence ($200M+), and Luminous Computing ($115M).
Below is the timeline of advances of the three major methods in photonic matrix computation, from the paper Photonic matrix multiplication lights up photonic accelerator and beyond (Zhou et al., Nature 2022). The three different methods are plane light conversion (PLC), Mach–Zehnder interferometer (MZI), and wavelength division multiplexing (WDM).
As a founder, writing investor updates was a helpful forcing function to step back from the day-to-day and reflect on the trajectory of my business. Now, as an investor, I deeply appreciate working with founders who are disciplined about regular written communication.
I know that it can be daunting to write an effective investor update if you don’t have a template to work off of. So if this idea resonates, but you don’t currently have a great structure for your written investor updates, I recommend this email template from Mathilde Collin—it’s the same one she sends to Front’s investors every month. Here’s why I love this format:
🍫 / It’s short but context rich. For investors, it’s disconcerting when a founder disappears for six months then comes back with tanking KPIs. We want to help, but if we have no context on what’s been going on, it’s difficult to provide meaningful guidance. Written, consistent updates like these, even if they’re only a few bullet points, help us maintain a mental snapshot of your company. These emails are also a great way to refresh ourselves on the latest before jumping on a call with you.
🙈 / There’s no way to hide. With this format, everything—from your monthly burn to the number of customers—is front and center. Keep the same KPIs every month to avoid cherry-picking. There’s even a section dedicated entirely to what’s NOT going well (as all founders know, there’s always something!). This type of structured update keeps you honest and shows investors that they’re getting the full picture vs. a carefully curated highlight reel.
👯♀️ / Opens up a two-way dialogue. Many people view these types of updates as transactional, but I actually think they can create space for more meaningful engagement between founders and investors. For example, as a founder, you can use this monthly touch point to ask us for what you need—whether that’s a referral or a sounding board. As an investor, I love having the opportunity to ask follow-up questions or to just reply with a word of encouragement. These micro-moments can lead to stronger relationships.
Curious if anyone structures their investor updates in a different way? Also, if you’re interested, Mathilde offers several more templates in this Review article: https://bit.ly/40zGoqn
First Round Review
The Founder’s Guide to Discipline: Lessons from Front’s Mathilde Collin
Front's CEO and co-founder Mathilde Collin shares why a founder’s discipline matters more than vision, unveiling her own best practices and templates for communication, time management, fundraising and team building.
After the recent OpenAI Dev Day there have been some self-deprecating takes about what startups are supposed to do now, and today Nat Friedman (builder of GitHub Copilot) talked about exactly this, so here are my notes.
Q: If companies like OpenAI and Google eat everything, what market problems should startups solve?
Whether something is a “GPT wrapper” or not is not the important question; the question is whether it acquires users well and becomes a good business. Salesforce was originally a web wrapper over an Oracle DB plus HTML, but the value it delivered to users was so large that it is now enormous.
In Google’s early days there was constant talk that MS, with more data and more customers, could wipe Google out. Look at them now, haha.
Of course it is generally right for startups to start where competition is less fierce, but Perplexity competes head-on with Google and is doing extremely well right now. Looking at that, fierce competition does not automatically mean everyone dies.
Q: Why did you build a GPU cluster?
Even while at MS I was investing in and helping startups, and since most of them said they had no GPUs, I ended up playing the biggest GPU broker inside MS?! Haha. Then a data-center investment in Coreweave came in; I turned it down at first, saying I only invest in software, but they asked me to take a look anyway, and the margins and growth were insane.
I plan to keep investing in AI companies, and if most of the companies I fund end up spending the money on using or buying GPUs, I figured I might as well buy them in advance, so I built a cluster. A full 2,512 H100s, roughly the fifth largest in the world.
Named Andromeda Cluster, this system comprises 2,512 H100 GPUs and can train a 65 billion parameter AI model in approximately 10 days, as stated by the venture capitalists. While it may not be the largest model available, it is undoubtedly a significant achievement.
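As a rough sanity check on that training-time figure, here is a hedged back-of-the-envelope calculation using the common ~6 x params x tokens estimate of training FLOPs; the token count, per-GPU throughput, and utilization below are assumptions, not numbers from Nat Friedman or the cluster announcement.

```python
# Rough sanity check on "2,512 H100s train a 65B model in ~10 days",
# using the common ~6 * params * tokens training-FLOPs rule of thumb.
# Token count, per-GPU throughput, and utilization are assumptions.
params = 65e9              # 65B parameters (from the quote)
tokens = 1.4e12            # assumed training tokens, a LLaMA-65B-scale run
train_flops = 6 * params * tokens

num_gpus = 2512
peak_flops = 1e15          # ~1 PFLOP/s BF16 per H100, rough figure
mfu = 0.35                 # assumed model FLOPs utilization

seconds = train_flops / (num_gpus * peak_flops * mfu)
print(f"~{seconds / 86_400:.1f} days")  # ~7 days, same ballpark as the quoted ~10
```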
Q: How did you build GitHub Copilot?
When we first used GPT it was a pretty fun toy, but when we had it write code, it turned out to be good at that. Turning it into a product was a different story, though. Finding product-market fit meant satisfying customers while building products and features we didn’t yet know they would like. It helped that we were our own customers.
Q: Why are you doing the scroll project?
A villa that Julius Caesar’s grandfather had built long ago was buried by a volcano, and the ruins were later excavated, but nobody was doing research on them. I happened to find the one person working on it and got so excited that I brought him along when I went camping with some billionaires, but nobody was interested. So I invested first, pitched hard to people around me to join in expanding humanity’s history, and promoted it on Twitter, and it apparently ended up becoming a $2m project (the best-funded work in the field).
---
There is still so much that will happen in this industry and so many problems left to solve. The next 5-10 years should be a genuinely fun period for founders and investors. An ecosystem where people who have already been through multiple industries, with plenty of successes and failures, casually mingle with and help founders at every stage is a huge advantage of the Bay.
As we enter the multimodal era, I expect change in many fields beyond coding. Today coding with Copilot feels like a given; what will we take for granted in 1-3 years?
It’s a world with so much to do. Good luck, everyone.
https://techcrunch.com/2023/11/08/meta-hugging-face-open-source-ai-station-f/
Open foundation
From today through December 1 (2023), startups can apply to join the new “AI Startup Program” at Station F, with five winners proceeding to the accelerator program that will run from January to June. The chosen startups, selected by a panel of judges from Meta, Hugging Face and French cloud company Scaleway, will have at least one thing in common — they will be working on projects substantively built on open foundation models, or at the very least can demonstrate a “willingness to integrate these models into their products and services,” according to the announcement issued by Meta today.
TechCrunch
Meta taps Hugging Face for startup accelerator to spur adoption of open source AI models
Facebook parent Meta is teaming up with Hugging Face and European cloud infrastructure company Scaleway to launch a new AI-focused startup program at the Station F startup megacampus in Paris.
“There isn’t time, so brief is life, for bickerings, apologies, heartburnings, callings to account. There is only time for loving, and but an instant, so to speak, for that.”
- Mark Twain
Forwarded from 전종현의 인사이트
"결론은 인사가 가장 중요하고, 인사 안에서도 채용과 문화가 중요하다. 그 다음에 지속 가능성에 대해서 결국 우리의 본질은 팬과 콘텐츠라고 정리했다. 이런 대전제 하에 지속 성장이 가능한 구조들을 짰다. 매뉴얼을 짜고 문화적인 원칙들을 세우고 지금 우리가 DNA라고 부르는 것들을 만들어내고 우리의 인재상을 정립했다."
https://www.mk.co.kr/news/culture/10868702
매일경제
[Exclusive] Bang Si-hyuk: “K-pop has to drop the ‘K’ to survive… At this rate the growth ceiling is clear”
“I’m constitutionally unable to settle for the status quo.” “Innovation means resolving the small inconveniences and absurdities of everyday life.” “Tenacity matters in K-pop too; it is proof of ambition, and fans can tell.” An interview with HYBE chairman Bang Si-hyuk.