Seek to the ~1 hr mark.
With the newly announced GPTs, I think we’re seeing a new (still a bit primordial) layer of abstraction in computing. There will be a lot more developers, and a lot more GPTs. GPTs that can read, write, hear, speak, see, paint, think, use existing computing as tools, become experts in focus areas, reference custom data, take actions in the digital world, speak or act in custom ways, and collaborate together. Strap in.
Forwarded from BZCF | 비즈까페
This is a recording of a talk that Lee Seung-gun, founder and CEO of Toss, recently gave at the Chung Ju-yung Startup Competition. The content is excellent. I think it's a huge asset when Korean founders like him share their stories with the next generation of founders and with society at large. It gave me a great deal to think about. The view count isn't high yet, but regardless of that, it's a genuinely valuable video. I recommend watching it when you have time.
https://www.youtube.com/watch?v=jARKSXogEE0
YouTube
Toss CEO Lee Seung-gun's hard-hitting advice (to founders)
“Once you step onto the founder's path, the company becomes more important than your family. You won't be able to attend to family affairs or give your children a goodnight kiss. You'll need the cash, so you'll have to give up a nice house and a nice car. Your friends and acquaintances won't understand you; they'll think you've lost your way in life and are wandering, and they'll just hope the wandering doesn't last too long.”
Read the full article: https://zdnet.co.kr/view/?no=20231031161456
You're going to keep doing it until it works, aren't you?
Chung Ju-yung
Whatever we don't know because we haven't tried it yet, we can learn as we go; if there's no road, we can build one and solve the problem. They say the desert is hot, but it's cool at night, so let the workers sleep in air-conditioned quarters during the day and work under lights at night. They say water is scarce, but we can haul it in by truck, and since the construction equipment is rented anyway, that's no problem. Funding we can solve by borrowing against Hyundai's credit.
U.S. Ambassador to Korea: Chairman Chung, please give up on independently developing automobiles.
This is how I see it. If you compare a country to the human body, the roads spread across its land are like the body's blood vessels, and automobiles are like the blood circulating through those vessels. When roads are well developed and cars move smoothly over them, a nation's economy gains vitality and develops, just as the body grows and gains energy when blood flows freely. Making good cars and supplying them cheaply is like supplying good blood to the body. Our economy can be compared to a boy who has only just begun to grow, so the development of the automobile industry matters all the more. The automobile industry is the flower of modern industry: its ripple effects across machinery, electronics, steel, chemicals, and every other sector, and its impact on technological progress and job creation, are enormous. It is an absolutely necessary part of Korea joining the ranks of the advanced industrial nations.
That is why, even if, as you fear, Ambassador, I pour in all the money I have made from the construction business and fail, I will never regret it. If that becomes the fertilizer that lays the stepping stones needed for Korea's automobile industry to succeed, even in a later generation, I will count it as having been worthwhile.
https://blog.naver.com/kkjj1948/220188452892
Could AMD be an alternative to NVIDIA?
https://www.databricks.com/blog/training-llms-scale-amd-mi250-gpus
https://www.mosaicml.com/blog/amd-mi250
https://docs.google.com/presentation/d/1O-zbXIj103FxOHm9bWs9pjR8N8B47YAM/edit#slide=id.p1
https://www.lamini.ai/blog/lamini-amd-paving-the-road-to-gpu-rich-enterprise-llms
Databricks
Training LLMs at Scale with AMD MI250 GPUs | Databricks Blog
We benchmarked LLM training on a multi-node AMD MI250 cluster and found near-linear scaling on up to 128 GPUs, demonstrating a compelling option for multi-node LLM training.
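For a sense of why switching is plausible, here is a minimal sketch, assuming a ROCm build of PyTorch (which exposes AMD GPUs through the same torch.cuda API); the tiny transformer step below is purely illustrative and not taken from the linked posts:

```python
import torch
import torch.nn as nn

# On a ROCm build of PyTorch, AMD GPUs show up through the familiar torch.cuda
# API, so CUDA-targeted training code largely runs unchanged on MI250-class parts.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if torch.cuda.is_available():
    print("Training on:", torch.cuda.get_device_name(0))

# Tiny illustrative transformer step; the same code path runs on NVIDIA or AMD.
layer = nn.TransformerEncoderLayer(d_model=256, nhead=8).to(device)
x = torch.randn(32, 16, 256, device=device)  # (seq_len, batch, d_model)
loss = layer(x).pow(2).mean()
loss.backward()
print("loss:", loss.item())
```

The point of the benchmarks above is that this kind of code, and full LLM training stacks built on it, can move across vendors with little rewriting.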
AI now has a brain, eyes, and ears.
https://x.com/rcweston/status/1706893312588746943?s=46&t=h5Byg6Wosg8MJb4pbPSDow
Self-reflection: https://github.com/EGCap/awesome-gpt4-vision/blob/main/data/2309.17421-gpt4vision/self-reflection-image-gen.png
GitHub
awesome-gpt4-vision/data/2309.17421-gpt4vision/self-reflection-image-gen.png at main · EGCap/awesome-gpt4-vision
A collection of awesome GPT4 vision use cases.
https://huyenchip.com/2023/10/10/multimodal.html
Multimodality in the context of AI refers to systems that can process and interpret various types of data, such as text and images, simultaneously. It is crucial as it mirrors human sensory experience and increases the robustness and versatility of AI systems. Different data modalities and tasks require systems to not only analyze but also generate or understand multimodal outputs, such as image captioning or visual question answering.
The fundamentals involve components like encoders for each data modality, alignment of different modalities into a joint embedding space, and for generative models, a language model to generate text responses.
CLIP (Contrastive Language-Image Pre-training) maps text and images into a shared space, enhancing tasks like image classification and retrieval. Flamingo is a large multimodal model that generates open-ended responses and is considered a significant leap in multimodal domains. CLIP is known for zero-shot learning capabilities, while Flamingo excels in generating responses based on visual and textual inputs.
Recent advances in multimodal research focus on systems like BLIP-2, LLaVA, LLaMA-Adapter V2, and LAVIN, which push forward the capabilities of multimodal output generation and efficient training adapters, thus refining the interaction between AI and various data modalities.
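To make the alignment idea concrete, here is a minimal sketch of a CLIP-style contrastive objective: per-modality encoders project into a joint embedding space, and matching image-text pairs are pulled together while mismatched pairs are pushed apart. The linear encoders and dimensions are placeholders, not CLIP's actual architecture:

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matching image-text pairs."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature    # (batch, batch) similarities
    targets = torch.arange(len(image_emb))             # i-th image matches i-th text
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Placeholder encoders standing in for a vision encoder and a text encoder.
image_encoder = torch.nn.Linear(2048, 512)   # pooled image features -> joint space
text_encoder = torch.nn.Linear(768, 512)     # pooled text features  -> joint space

images = torch.randn(8, 2048)
texts = torch.randn(8, 768)
loss = clip_style_loss(image_encoder(images), text_encoder(texts))
```

The same shared space is what enables CLIP's zero-shot classification: embed the candidate class names as text and pick the one closest to the image embedding.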
Chip Huyen
Multimodality and Large Multimodal Models (LMMs)
For a long time, each ML model operated in one data mode – text (translation, language modeling), image (object detection, image classification), or audio (speech recognition).
6. Develop GPU alternatives
GPU has been the dominating hardware for deep learning ever since AlexNet in 2012. In fact, one commonly acknowledged reason for AlexNet’s popularity is that it was the first paper to successfully use GPUs to train neural networks. Before GPUs, if you wanted to train a model at AlexNet’s scale, you’d have to use thousands of CPUs, like the one Google released just a few months before AlexNet. Compared to thousands of CPUs, a couple of GPUs were a lot more accessible to Ph.D. students and researchers, setting off the deep learning research boom.
In the last decade, many, many companies, both big corporations and startups, have attempted to create new hardware for AI. The most notable attempts are Google’s TPUs, Graphcore’s IPUs (what’s happening with IPUs?), and Cerebras. SambaNova raised over a billion dollars to develop new AI chips but seems to have pivoted to being a generative AI platform.
For a while, there has been a lot of anticipation around quantum computing, with key players being:
IBM’s QPU
Google’s quantum computer reported a major milestone in quantum error reduction earlier this year in Nature. Its quantum virtual machine is publicly accessible via Google Colab (a minimal usage sketch follows this list)
Research labs such as MIT Center for Quantum Engineering, Max Planck Institute of Quantum Optics, Chicago Quantum Exchange, Oak Ridge National Laboratory, etc.
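As an aside, here is a minimal sketch of what running a circuit looks like in Cirq, the library Google's Colab notebooks are built on; this uses the ideal simulator rather than the noise-modeled Quantum Virtual Machine itself, which is my assumption about the typical entry point:

```python
import cirq

# Build a 2-qubit Bell-state circuit and sample it on Cirq's ideal simulator.
q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit(
    cirq.H(q0),                      # put q0 into superposition
    cirq.CNOT(q0, q1),               # entangle q0 and q1
    cirq.measure(q0, q1, key="m"),   # measure both qubits
)
result = cirq.Simulator().run(circuit, repetitions=1000)
print(result.histogram(key="m"))     # expect roughly equal counts of 00 and 11
```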
Another direction that is also super exciting is photonic chips. This is the direction I know the least about – so please correct me if I’m wrong. Existing chips today use electricity to move data, which consumes a lot of power and also incurs latency. Photonic chips use photons to move data, harnessing the speed of light for faster and more efficient compute. Various startups in this space have raised hundreds of millions of dollars, including Lightmatter ($270M), Ayar Labs ($220M), Lightelligence ($200M+), and Luminous Computing ($115M).
Below is the timeline of advances of the three major methods in photonic matrix computation, from the paper Photonic matrix multiplication lights up photonic accelerator and beyond (Zhou et al., Nature 2022). The three different methods are plane light conversion (PLC), Mach–Zehnder interferometer (MZI), and wavelength division multiplexing (WDM).