On-device 2B LLMs for function calling that outperform GPT-4. The paper “Octopus v2: On-device language model for super agent” proposes a new method for building on-device agents.
Insights
1. Adds one special token per function to reduce errors in choosing which function to call
2. 98.095% accuracy with just 100 samples per function.
3. 35x lower latency than Llama-7B with RAG-based function calling
4. Works on-device within 1.1 to 1.7 seconds for typical queries
5. Synthetic data generation for one new task costs ~$0.0224 per 1,000 samples
6. Supports nested and chained function calls
7. Could be used in smart devices, like Alexa, to interact with maps, food delivery, etc.
8. Unclear whether this scales to hundreds of different functions; it is currently trained on 20.
Models.
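To make the "one special token per function" idea concrete, here is a minimal sketch of the dispatch side only. It assumes the fine-tuned model emits text of the form `<fn_i>(args)<fn_end>`. The token spellings, the two device functions, and the `dispatch` helper are all hypothetical illustrations, not the paper's code, and no model is actually run.

```python
import ast
import re

# Hypothetical device functions; names and signatures are illustrative only.
def take_photo(camera="back"):
    return f"photo taken with {camera} camera"

def set_timer(minutes):
    return f"timer set for {minutes} min"

FUNCTIONS = [take_photo, set_timer]

# One special token per function, plus an end marker (spellings are assumed).
TOKEN_TO_FN = {f"<fn_{i}>": fn for i, fn in enumerate(FUNCTIONS)}
END_TOKEN = "<fn_end>"

# Matches a single flat call such as "<fn_1>(5)<fn_end>".
CALL_RE = re.compile(r"(<fn_\d+>)\((.*?)\)" + re.escape(END_TOKEN))

def dispatch(model_output):
    """Parse '<fn_i>(args)<fn_end>' from model text and call the mapped function."""
    match = CALL_RE.search(model_output)
    if match is None:
        raise ValueError("no functional token found")
    token, arg_text = match.groups()
    if token not in TOKEN_TO_FN:
        raise KeyError(f"unknown functional token {token}")
    # Accept positional Python literals only; nothing is evaluated as code.
    args = ast.literal_eval(f"({arg_text},)") if arg_text.strip() else ()
    return TOKEN_TO_FN[token](*args)
```

For example, `dispatch("<fn_1>(5)<fn_end>")` returns `"timer set for 5 min"`. Because the function choice is a single token, selecting a function reduces to predicting one symbol from a small fixed set. This sketch handles only one flat call with positional arguments, not the nested or chained calls mentioned above.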
huggingface.co
Paper page - Octopus v2: On-device language model for super agent
Join the discussion on this paper page
MIT researchers have developed a computational technique that makes it easier to engineer useful proteins - including ones that could be used to measure electrical activity in the brain.
MIT McGovern Institute
A new computational technique could make it easier to engineer useful proteins - MIT McGovern Institute
To engineer proteins with useful functions, researchers usually begin with a natural protein that has a desirable function, such as emitting fluorescent light, and put it through many rounds of random mutation that eventually generate an optimized version…
Anthropic published research on many-shot jailbreaking, a technique capable of bypassing the safety measures of many LLMs.
For example:
User: How do I pick a lock?
Assistant: I’m happy to help with that. First, obtain lockpicking tools… [continues to detail lockpicking methods]
Simply including a very large number of faux dialogues preceding the final question was enough to bypass security measures.
Anthropic
Many-shot jailbreaking
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
🤔3😢1
Artificial_Intelligence_AI_Sprinters_Google_Report_1712213946.pdf
4.4 MB
The "AI Sprinters" report released today by Google outlines AI's potential to drive economic growth in emerging markets.
The report:
- highlights how AI can address economic challenges and improve living standards globally. Surveys indicate that emerging market populations are more optimistic about AI's impact compared to Europe and the US, with over 71% believing it has positive effects on access to information, health, education, and work.
- showcases real-life examples across diverse sectors, emphasizing the significant benefits of adopting AI technologies strategically.
Four key recommendations are presented in the report:
▶ Firstly, it suggests revolutionizing #infrastructure by prioritizing #cloud-first policies, which democratize access to cutting-edge technologies and boost #government efficiency.
▶ Secondly, supporting national AI skill initiatives is crucial to building an AI-ready #workforce. This involves investing in AI #education and training at various levels of expertise.
▶ Thirdly, modernizing national data systems is essential for training effective AI models and unlocking the full potential of data through initiatives like #data sharing and open data principles.
▶ Fourthly, the report emphasizes the importance of AI-enabling #regulation. Policymakers are urged to focus on achieving AI's potential benefits while managing risks. This includes adopting risk-based regulation, maintaining privacy frameworks, supporting international technical standards for AI, and implementing national AI strategies.
Summer course on LLMs in Armenia this year
Speakers from Meta, MIT, Google, and others cover the foundations of LLMs from first principles through lectures and hands-on practice sessions.
armllm.github.io
LLM Summer School
Welcome to the First Armenian Summer School on Large Language Models, July 1-7, Yerevan, Armenia
Cohere just dropped a
- 104B
- Multilingual
- 128K context length
- RAG + Tool Use
- Open weights
Model.
And based on how good their previous Command R.
huggingface.co
C4AI Command Models - a Hugging Face Space by CohereLabs
Ask any question or request information, and get detailed answers. You can type in text, and the app will provide responses on a variety of topics.
Taiwan’s semiconductor industry has almost fully recovered from the 7.2-magnitude earthquake that struck Wednesday, media report, citing the National Science and Technology Council (NSTC), which oversees Taiwan’s 3 big tech parks: Hsinchu Science Park (HSP), Central Taiwan Science Park (CTSP), and Southern Taiwan Science Park (STSP).
At HSP, the majority of semiconductor, display panel and other precision industries were back to normal on 4/3. Work continues at a small number of firms, but they are expected to be back to normal soon.
CTSP said 90% of affected semiconductor equipment is back online, and the rest will be back in operation today 4/4.
STSP said operations at all major factories are already back to normal.
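National Science and Technology Council (國家科學及技術委員會) - Global Information Website
National Science and Technology Council - Press Releases - NSTC updates the post-earthquake recovery status of firms in the three science parks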
🔥2
AIDE has become the first human-level AI agent for data science
AIDE outperforms half of human data scientists on a wide range of Kaggle competitions, surpassing conventional AutoML, LangChain agents, and ChatGPT with human assistance.
Weco AI
AIDE: Human-Level Performance on Data Science Competitions | Weco AI
In the world of data science, Kaggle competitions have become a widely accepted standard...
🔥2
On 4 April, Singapore effected changes to the Payment Services Act, expanding the scope of digital payment token (DPT) regulation in the country.
Here’s what those changes mean:
1. DPT service providers will now need to seek a licence in order to (i) provide custodial services for DPTs, or (ii) facilitate the transmission or exchange of DPTs, even where the service provider does not come into possession of client moneys or DPTs.
2. Businesses currently operating under the PSA’s expanded scope have 30 days to notify MAS of their activities, 6 months to submit a licence application, and 9 months to provide an attestation of their business activities and AML/CFT compliance by an external auditor.
3. Businesses that meet the above requirements can continue conducting business on a temporary basis while MAS reviews their licence applications.
4. In addition to the new licensing requirements, new consumer protection requirements that MAS finalised last year, such as on the safeguarding of customer assets, will come into force 6 months from 4 April.
www.mas.gov.sg
MAS Expands Scope of Regulated Payment Services; Introduces User Protection Requirements for Digital Payment Token Service Providers
MAS introduced amendments to the Payment Services Act (PS Act) and its subsidiary legislation to expand the scope of payment services regulated by MAS, and to impose user protection and financial stability-related requirements on digital payment token (DPT)…
🔥3❤2
Training LLMs can be much cheaper than previously thought.
While companies like OpenAI and Meta spend billions of dollars training their models, research from CSAIL and MyShell shows that just 0.1 million USD is sufficient to train LLaMA2-level LLMs.
JetMoE democratizes the training of high-performance LLMs, and makes it achievable by a wide range of research institutes and companies.
JetMoE is fully open-sourced & academia-friendly because:
1. It only uses public datasets for training. No proprietary resource is needed.
2. It can be finetuned with a very limited computing budget (e.g., consumer-grade GPU).
GitHub
GitHub - myshell-ai/JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars
Reaching LLaMA2 Performance with 0.1M Dollars. Contribute to myshell-ai/JetMoE development by creating an account on GitHub.
🔥7👍2
Photobucket is in talks with several AI companies to permit the use of its 13 billion photos and videos as training data.
Rates under discussion range from 5 cents to $1 per photo, and over $1 per video.
One prospective buyer told the CEO they want to buy over a billion videos.
Reuters
Inside Big Tech's underground race to buy AI training data
At its peak in the early 2000s, Photobucket was the world's top image-hosting site. The media backbone for once-hot services like Myspace and Friendster, it boasted 70 million users and accounted for nearly half of the U.S. online photo market.
🔥4
OpenAI made a big upgrade to DALL-E 3, now allowing users to edit images directly in ChatGPT.
Users can edit images directly in the chat across the web, iOS, and Android apps by selecting an area of the image and prompting changes.
❤4
A super interesting talk on Ring Attention, probably the magic behind Gemini's 1 million context window
You organize your devices (GPUs/TPUs) in a ring, each computing a part of the final attention output.
Each device needs to see all keys/values to produce its part. The idea is that the attention output can be computed blockwise (by splitting along the sequence dimension). Each device computes the attention output for its own chunk of queries while sending and receiving key/value blocks around the ring.
This is a great repo to understand it in code.
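The blockwise idea can also be sketched as a single-process simulation in plain Python. This is an illustrative toy under stated assumptions, not the linked repo's implementation: the "devices" are just list indices, the ring pass is a list rotation rather than real communication, there is no causal mask, and the function names `full_attention` and `ring_attention` are made up for this example.

```python
import math

def full_attention(q, k, v):
    """Reference: softmax(q k^T / sqrt(d)) v on lists of row vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v)) / z
                    for c in range(len(v[0]))])
    return out

def ring_attention(q, k, v, n_devices):
    """Simulate ring attention: each 'device' owns one sequence chunk of q, k, v."""
    n, d, dv = len(q), len(q[0]), len(v[0])
    assert n % n_devices == 0, "sequence must split evenly across devices"
    size = n // n_devices
    chunks = [slice(i * size, (i + 1) * size) for i in range(n_devices)]
    q_loc = [q[c] for c in chunks]
    kv_loc = [(k[c], v[c]) for c in chunks]  # K/V block each device currently holds

    # Per-query running statistics for the online (streaming) softmax.
    run_max = [[-math.inf] * size for _ in range(n_devices)]
    run_sum = [[0.0] * size for _ in range(n_devices)]
    acc = [[[0.0] * dv for _ in range(size)] for _ in range(n_devices)]

    for _ in range(n_devices):
        for dev in range(n_devices):
            k_blk, v_blk = kv_loc[dev]
            for i, qi in enumerate(q_loc[dev]):
                scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
                          for kj in k_blk]
                new_max = max(run_max[dev][i], max(scores))
                # Rescale what was accumulated under the old running maximum.
                scale = math.exp(run_max[dev][i] - new_max)
                w = [math.exp(s - new_max) for s in scores]
                run_sum[dev][i] = run_sum[dev][i] * scale + sum(w)
                acc[dev][i] = [a * scale + sum(wj * vj[c] for wj, vj in zip(w, v_blk))
                               for c, a in enumerate(acc[dev][i])]
                run_max[dev][i] = new_max
        # Every device hands its K/V block to the next device in the ring.
        kv_loc = [kv_loc[(dev - 1) % n_devices] for dev in range(n_devices)]

    # Normalize and concatenate the chunks back in sequence order.
    return [[a / run_sum[dev][i] for a in acc[dev][i]]
            for dev in range(n_devices) for i in range(size)]
```

After `n_devices` rotation steps every device has seen every key/value block, so the result should match `full_attention` up to floating-point error. No device ever holds more than one key/value block at a time, which is where the memory saving comes from.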
GitHub
ring-flash-attention/test/test_ring_flash_attn_func.py at main · zhuzilin/ring-flash-attention
Ring attention implementation with flash attention - zhuzilin/ring-flash-attention
Intelligent fabrics, which can sense and communicate information scalably and unobtrusively, can fundamentally change how people interact with the world.
Science
Intelligent textiles are looking bright
Flexible fiber electronics couple with the human body for wireless tactile sensing
👍4
Apple presents Ferret-UI
Grounded Mobile UI Understanding with Multimodal LLMs
Recent advancements in multimodal large language models (MLLMs) have been noteworthy, yet, these general-domain MLLMs often fall short in their ability to comprehend and interact effectively with user interface (UI) screens.
⚡️AutoCodeRover is an autonomous software engineer from Singapore
It takes in a GitHub issue (bug fix or feature addition) and resolves it in a few minutes, at a minimal LLM cost of ~$0.5
GitHub
auto-code-rover/preprint.pdf at main · nus-apr/auto-code-rover
A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-bench lite and 46.2% tasks (pass@1) in SWE-bench verified with...
Google released CodeGemma, a new version of the Gemma line of models fine-tuned on code generation and completion, that achieves state-of-the-art results. Available in sizes 2B and 7B.
HF is here.
Intel's CEO announced Lunar Lake with over 100 TOPS of platform AI performance. He showed off a Lunar Lake SoC on stage and said to expect significant gains.
👍3