[llm][model training]
https://blog.replit.com/llm-training
Replit Blog
Replit — How to train your own Large Language Models
Learn how Replit trains Large Language Models (LLMs) using Databricks, Hugging Face, and MosaicML
Introduction
Large Language Models, like OpenAI's GPT-4 or Google's PaLM, have taken the world of artificial intelligence by storm. Yet most companies don't…
[ml][book]
“Self-supervised learning, dubbed “the dark matter of intelligence”, is a promising path to advance machine learning. As opposed to supervised learning, which is limited by the availability of labeled data, self-supervised approaches can learn from vast unlabeled data [Chen et al., 2020b, Misra and Maaten, 2020]. Self-supervised learning (SSL) underpins deep learning’s success in natural language processing leading to advances from automated machine translation to large language models trained on web-scale corpora of unlabeled text [Brown et al., 2020, Popel et al., 2020]. In computer vision, SSL pushed new bounds on data size with models such as SEER trained on 1 billion images [Goyal et al., 2021]. SSL methods for computer vision have been able to match or in some cases surpass models trained on labeled data, even on highly competitive benchmarks like ImageNet [Tomasev et al., 2022, He et al., 2020a, Deng et al., 2009]. SSL has also been successfully applied across other modalities such as video, audio, and time series [Wickstrøm et al., 2022, Liu et al., 2022a, Schiappa et al., 2022a].”
https://arxiv.org/abs/2304.12210
arXiv.org
A Cookbook of Self-Supervised Learning
Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning. Yet, much like cooking, training SSL methods is a delicate art with a high...
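The quoted abstract's core idea is that the labels come from the data itself rather than from human annotation. A minimal PyTorch sketch of one such pretext task (masked-token prediction, the NLP case mentioned above) is below; model sizes, the masking rate, and all names are illustrative, not from the paper.

```python
# Toy self-supervised pretext task: masked-token prediction over unlabeled
# token streams. The "labels" are the original tokens at masked positions.
import torch
import torch.nn as nn

vocab_size, d_model, mask_id = 1000, 64, 0

class TinyMaskedLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        return self.head(self.encoder(self.embed(tokens)))

model = TinyMaskedLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

tokens = torch.randint(1, vocab_size, (8, 32))   # "unlabeled" text batch
mask = torch.rand(tokens.shape) < 0.15           # corrupt ~15% of positions
inputs = tokens.masked_fill(mask, mask_id)
logits = model(inputs)

# Loss only on masked positions: the data supervises itself.
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
opt.step()
print(float(loss))
```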
[twitter][algorithm]
https://tweethunter.io/blog/twitter-algorithm-full-analysis
[platform engineering]
https://medium.com/hashicorp-engineering/platform-engineering-on-the-hashicorp-ecosystem-part-1-84fb314e833e
Medium
Platform Engineering on the HashiCorp Ecosystem— Part 1
The goal of this series is to provide a practical guide on how to facilitate a multi-tenant developer PaaS using the HashiCorp ecosystem
[llm][document index demo]
https://gpt-index.readthedocs.io/en/latest/examples/index_structs/doc_summary/DocSummary.html
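The linked notebook demonstrates a document summary index: the LLM summarizes each document at build time, and queries are routed via those summaries instead of raw chunks. A rough sketch of that pattern follows; class names and module paths shift between llama_index (gpt-index) releases, so treat the imports and the data path as assumptions and check the docs for your installed version.

```python
# Sketch of the document-summary-index pattern from the linked demo.
# Imports are version-dependent assumptions; "data/" is an illustrative path.
from llama_index import SimpleDirectoryReader
from llama_index.indices.document_summary import DocumentSummaryIndex

# Load a folder of documents.
documents = SimpleDirectoryReader("data/").load_data()

# Build the index: each document gets an LLM-generated summary, and retrieval
# matches the query against those summaries.
index = DocumentSummaryIndex.from_documents(documents)

query_engine = index.as_query_engine()
print(query_engine.query("What are the main topics covered?"))
```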
[llm]
TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks
“While LLMs have shown great success in understanding and generating text in traditional conversational settings, their potential for performing ill-defined complex tasks is largely under-studied. Indeed, we are yet to conduct comprehensive benchmarking studies with multiple LLMs that are exclusively focused on a complex task. However, conducting such benchmarking studies is challenging because of the large variations in LLMs' performance when different prompt types/styles are used and different degrees of detail are provided in the prompts. To address this issue, the paper proposes a general taxonomy that can be used to design prompts with specific properties in order to perform a wide range of complex tasks.”
https://arxiv.org/abs/2305.11430
❤1
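The variable the paper's taxonomy organizes is how much detail and structure a prompt carries for the same task. The sketch below is a made-up illustration of that spectrum only; the labels are not the paper's actual TELeR level definitions.

```python
# Toy illustration: one task phrased with increasing prompt detail.
# These "levels" are invented for illustration, not TELeR's actual levels.
prompts = {
    "minimal": "Summarize this pull request.",
    "directive": (
        "Summarize this pull request in 3 bullet points, "
        "covering what changed and why."
    ),
    "detailed": (
        "You are a senior reviewer. Summarize this pull request in 3 bullet "
        "points covering (1) the motivation, (2) the key code changes, and "
        "(3) risks or follow-up work. Use neutral, factual language and do "
        "not exceed 80 words."
    ),
}

for level, prompt in prompts.items():
    print(f"[{level}] {prompt}\n")
```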
[architecture][uber clone]
Juraj Majerik spent about 7 months (roughly 300 hours) building a simulated ride-sharing app (akin to Uber) as a side project, and documented each step on his blog:
https://rides.jurajmajerik.com/system-design
🔥3