DeepMind AI Expert
14.9K subscribers
Applied AI articles in Python, medical science, the humanities, neuroscience, and more.
Training courses from major universities and online institutes.
@ffarzaddh
AI Researchers of Iran

For channel cross-promotion, send a message.
Ten #interesting_idea papers published in the past month. Part 3 of 3

1) QLoRA - an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning performance (see the 4-bit + LoRA sketch after this list).

2) LIMA - a new 65B parameter LLaMA model fine-tuned on 1000 carefully curated prompts and responses; it doesn't use RLHF, generalizes well to unseen tasks not available in the training data, and generates responses equivalent to or preferred over GPT-4 in 43% of cases, and even more often compared to Bard.

3) Voyager - an LLM-powered embodied lifelong learning agent in Minecraft that can continuously explore worlds, acquire skills, and make novel discoveries without human intervention.

4) Gorilla - a finetuned LLaMA-based model that surpasses GPT-4 on writing API calls. This capability can help identify the right API, boosting the ability of LLMs to interact with external tools to complete specific tasks.

5) The False Promise of Imitating Proprietary LLMs - provides a critical analysis of models that are finetuned on the outputs of a stronger model; argues that model imitation is a false promise and that the higher-leverage action for improving open-source models is to develop better base models.

6) Sophia - presents a simple scalable second-order optimizer that has negligible average per-step time and memory overhead; on language modeling, Sophia achieves 2x speed-up compared to Adam in the number of steps, total compute, and wall-clock time.

7) The Larger They Are, the Harder They Fail - shows that LLMs fail to generate correct Python code when default function names are swapped; larger models also more strongly prefer the incorrect continuation.

8) Model Evaluation for Extreme Risks - discusses the importance of model evaluation for addressing extreme risks and making responsible decisions about model training, deployment, and security.

9) LLM Research Directions - discusses a list of research directions for students looking to do research with LLMs.

10) Reinventing RNNs for the Transformer Era - proposes an approach that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs; results show that the method performs on par with similarly sized Transformers.
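
For item 1, a minimal sketch of the QLoRA recipe: quantize the frozen base model to 4-bit and train only small LoRA adapters on top. It assumes the Hugging Face transformers, peft, and bitsandbytes stack; the model name and hyperparameters are illustrative, not the paper's exact setup.

```python
# Minimal QLoRA-style sketch: 4-bit quantized base model + trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store frozen base weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4, introduced by QLoRA
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in 16-bit
)

# Model name is illustrative; the paper finetunes up to 65B on one 48GB GPU.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b", quantization_config=bnb_config, device_map="auto"
)

# Only these small low-rank adapter matrices receive gradients.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of parameters train
```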

#paper

🔸 More content 👇👇

@AI_DeepMind
Ten #interesting_idea papers published in the past week.

1) Let’s Verify Step by Step - achieves state-of-the-art mathematical problem solving by rewarding each correct step of reasoning in a chain-of-thought instead of rewarding the final answer; the model solves 78% of problems from a representative subset of the MATH test set.

2) No Positional Encodings - shows that explicit position embeddings are not essential for decoder-only Transformers, and that popular positional encoding methods like ALiBi and Rotary are not well suited for length generalization.

3) BiomedGPT - a unified biomedical generative pretrained transformer model for vision, language, and multimodal tasks. Achieves state-of-the-art performance across 5 distinct tasks with 20 public datasets spanning over 15 unique biomedical modalities.

4) Thought Cloning - introduces an imitation learning framework to learn to think while acting; the idea is to clone not only the behaviors of human demonstrators but also the thoughts humans have while performing those behaviors.

5) Fine-Tuning Language Models with Just Forward Passes - proposes a memory-efficient zeroth-order optimizer and a corresponding SGD algorithm to finetune large LMs with the same memory footprint as inference.

6) MERT - an acoustic music understanding model with large-scale self-supervised training; it incorporates a superior combination of teacher models to outperform conventional speech and audio approaches.

7) Bytes Are All You Need - investigates performing classification directly on file bytes, without needing to decode files at inference time; achieves ImageNet Top-1 accuracy of 77.33% using a transformer backbone; achieves 95.42% accuracy when operating on WAV files from the Speech Commands v2 dataset.

8) Direct Preference Optimization - while helpful for training safe and useful LLMs, the RLHF process can be complex and often unstable; this work proposes finetuning LMs by solving a classification problem on human preference data, with no RL required (see the loss sketch after this list).

9) SQL-PaLM - an LLM-based Text-to-SQL adopted from PaLM-2; achieves SoTA in both in-context learning and fine-tuning settings; the few-shot model outperforms the previous fine-tuned SoTA by 3.8% on the Spider benchmark; few-shot SQL-PaLM also outperforms few-shot GPT-4 by 9.9%, using a simple prompting approach.

10) CodeTF - an open-source Transformer library for state-of-the-art code LLMs; supports pretrained code LLMs and popular code benchmarks, including standard methods to train and serve code LLMs efficiently.
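
For item 8, a minimal sketch of the DPO objective: a logistic loss over (chosen, rejected) response pairs computed from policy and frozen-reference log-probabilities, with no reward model and no RL loop. Variable names are mine.

```python
# Minimal sketch of the DPO loss from the paper.
import torch.nn.functional as F

def dpo_loss(pi_chosen_logps, pi_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Each input: summed token log-probs of a response, shape (batch,)."""
    # Implicit rewards are beta-scaled log-ratios against the frozen reference.
    chosen_reward = beta * (pi_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (pi_rejected_logps - ref_rejected_logps)
    # Logistic loss pushes the chosen response's reward above the rejected one's.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```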

#paper

🔸 More content 👇👇

@AI_DeepMind
A learning roadmap and 9 free courses

Generative AI Learning Path

cloudskillsboost.google/paths/118

#AI #resources #recommended_resources

🔸 More content 👇👇

@AI_DeepMind
If you've been looking for good resources and datasets on graphs in language models, I recommend this.

Graph-Related Large Language Models (LLMs).
https://github.com/XiaoxinHe/Awesome-Graph-LLM

#AI #resources #recommended_resources

🔸 More content 👇👇

@AI_DeepMind
In this paper, they use the Segment Anything Model (SAM) plus a lightweight Mask-to-Matte (M2M) module for image matting and related tasks; in my opinion, it's revolutionary!
Matting Anything (MAM)

https://huggingface.co/papers/2306.05399

P.S.: In my view, this paper backs up Dr. Asgari's remark that image processing is "game over"!

#paper #interesting_idea

🔸 More content 👇👇

@AI_DeepMind
Dr. Ali Sharifi-Zarchi, a computer science professor at Sharif University, shares his views on the stages of learning #AI
https://twitter.com/SharifiZarchi/status/1667131051104149505


🔸 More content 👇👇

@AI_DeepMind
Forwarded from Meysam
Tracking everything everywhere all at once!
This paper on tracking is excellent; definitely read it:
https://arxiv.org/abs/2306.05422

(Remember when I said image processing was "game over"? After the Segment Anything model, many of these tasks have become much simpler.)
Transformers as Statisticians

Unveiling a new mechanism "In-Context Algorithm Selection" for In-Context Learning (ICL) in LLMs/transformers.

arxiv.org/abs/2306.04637

#paper #interesting_idea

🔸 More content 👇👇

@AI_DeepMind
Ten #interesting_idea papers published in the past week.

1) Tracking Everything Everywhere All at Once - proposes a test-time optimization method for estimating dense and long-range motion; enables accurate, full-length motion estimation of every pixel in a video.

2) AlphaDev - a deep reinforcement learning agent which discovers faster sorting algorithms from scratch; the algorithms outperform previously known human benchmarks and have been integrated into the LLVM C++ library.

3) Sparse-Quantized Representation - a new compressed format and quantization technique that enables near-lossless compression of LLMs across model scales; “allows LLM inference at 4.75 bits with a 15% speedup”.

4) MusicGen - a simple and controllable model for music generation built on top of a single-stage transformer LM together with efficient token interleaving patterns; it can be conditioned on textual descriptions or melodic features and shows high performance on a standard text-to-music benchmark.

5) Augmenting LLMs with Databases - combines an LLM with a set of SQL databases, enabling a symbolic memory framework; completes tasks via LLM generating SQL instructions that manipulate the DB autonomously.

6) Concept Scrubbing in LLM - presents a method called LEAst-squares Concept Erasure (LEACE) to erase target-concept information from every layer in a neural network; it's used for reducing gender bias in BERT embeddings (see the toy erasure sketch after this list).

7) Fine-Grained RLHF - trains LMs with fine-grained human feedback; instead of using overall preference, more explicit feedback is provided at the segment level, which helps to improve efficacy on long-form question answering, reduce toxicity, and enable LM customization.

8) Hierarchical Vision Transformer - pretrains vision transformers with a visual pretext task (MAE), while removing unnecessary components from a state-of-the-art multi-stage vision transformer; this enables a simple hierarchical vision transformer that’s more accurate and faster at inference and during training.

9) Humor in ChatGPT - explores ChatGPT’s capabilities to grasp and reproduce humor; finds that over 90% of 1008 generated jokes were the same 25 jokes and that ChatGPT is also overfitted to a particular joke structure.

10) Imitating Reasoning Process of Larger LLMs - develops a 13B parameter model that learns to imitate the reasoning process of large foundational models like GPT-4; it leverages large-scale and diverse imitation data and surpasses instruction-tuned models such as Vicuna-13B in zero-shot reasoning.
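
For item 6, a toy sketch of linear concept erasure in the spirit of LEACE: fit a least-squares direction along which the concept is linearly decodable and project it out. This is the simple projection variant, not the paper's exact closed-form LEACE estimator.

```python
# Toy linear concept erasure: remove the least-squares probe direction.
import numpy as np

def erase_concept(X, z):
    """X: (n, d) activations; z: (n,) binary concept labels (e.g. gender)."""
    Xc = X - X.mean(axis=0)
    zc = z - z.mean()
    w, *_ = np.linalg.lstsq(Xc, zc, rcond=None)  # least-squares probe direction
    w /= np.linalg.norm(w)
    P = np.eye(X.shape[1]) - np.outer(w, w)      # projector onto w's complement
    return X @ P  # the concept is no longer linearly decodable along w
```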

#paper

🔸 More content 👇👇

@AI_DeepMind
Applications of Transformers

New survey paper highlighting major applications of Transformers for deep learning tasks. Includes a comprehensive list of Transformer models.

arxiv.org/abs/2306.07303

#paper

🔸 More content 👇👇

@AI_DeepMind
Exploring the MIT Mathematics and EECS Curriculum Using LLMs

"GPT-3.5 successfully solves a third of the entire MIT curriculum, while GPT-4, with prompt engineering, achieves a perfect solve rate on a test set excluding questions based on images."

arxiv.org/abs/2306.08997

#paper #interesting_idea

🔸 More content 👇👇

@AI_DeepMind
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale

https://ai.facebook.com/blog/voicebox-generative-ai-model-speech/

Large-scale generative models such as GPT and DALL-E have revolutionized natural language processing and computer vision research. These models not only generate high-fidelity text or image outputs but are also generalists that can solve tasks not explicitly taught. In contrast, speech generative models are still primitive in terms of scale and task generalization. In this paper, we present Voicebox, the most versatile text-guided generative model for speech at scale. Voicebox is a non-autoregressive flow-matching model trained to infill speech given audio context and text, using over 50K hours of speech that are neither filtered nor enhanced. Similar to GPT, Voicebox can perform many different tasks through in-context learning, but is more flexible as it can also condition on future context.
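
As a rough illustration of the flow-matching objective mentioned in the abstract, here is a simplified training loss with straight noise-to-data paths; this is not Voicebox's exact parameterization, and the function names are mine.

```python
# Simplified flow-matching loss: regress a velocity field that transports
# noise x0 to data x1 along straight interpolation paths.
import torch

def flow_matching_loss(velocity_model, x1):
    """x1: batch of target data (e.g. speech features), shape (B, ...)."""
    x0 = torch.randn_like(x1)                            # noise endpoint
    t = torch.rand(x1.size(0), *([1] * (x1.dim() - 1)), device=x1.device)
    xt = (1 - t) * x0 + t * x1                           # point on the straight path
    target_velocity = x1 - x0                            # the path's constant velocity
    return ((velocity_model(xt, t) - target_velocity) ** 2).mean()
```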

#paper #interesting_idea

🔸 More content 👇👇

@AI_DeepMind
Can LLMs Teach Weaker Agents?

Aligned teachers can intervene with free-text explanations, using Theory of Mind (expected utility + personalization), to improve students on future unexplained data 🙂

Misaligned teachers hurt students 😢

arxiv.org/abs/2306.09299

#paper #interesting_idea

🔸 More content 👇👇

@AI_DeepMind
If you want news, articles, and more about startups and companies, sign up here:
https://www.joinsuperhuman.ai/subscribe


#news

🔸 More content 👇👇

@AI_DeepMind
Take the exams for free and get the training for free:

https://lightning.ai/pages/ai-education/deep-learning-fundamentals/

#deep_learning #resources #recommended_resources

🔸 More content 👇👇

@AI_DeepMind
Unifying Large Language Models and Knowledge Graphs: A Roadmap

arxiv.org/abs/2306.08302

#paper #interesting_idea

🔸 More content 👇👇

@AI_DeepMind
Sentiment Analysis Of Twitter Data Towards COVID-19 Vaccines Using A Deep Learning Approach

https://ieeexplore.ieee.org/abstract/document/10139297

#paper #interesting_idea

🔸 More content 👇👇

@AI_DeepMind
Around 1.8 million papers are published every year.
AI researchers have introduced this tool for explaining and summarizing papers:
https://www.explainpaper.com/

AI explaining AI!

#news #AI

🔸 More content 👇👇

@AI_DeepMind
How can we generate our own artistic QR codes with AI?!

https://huggingface.co/spaces/huggingface-projects/QR-code-AI-art-generator
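
A hypothetical sketch of the idea behind this Space: condition Stable Diffusion on a QR code via ControlNet so the artwork remains scannable. The ControlNet checkpoint name is an assumption, and the prompt and parameters are illustrative.

```python
# Hypothetical sketch: QR-conditioned image generation with ControlNet.
import qrcode
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Render the payload as a plain QR image to use as the control signal.
qr_img = (qrcode.make("https://t.me/AI_DeepMind")
          .get_image().convert("RGB").resize((768, 768)))

controlnet = ControlNetModel.from_pretrained(
    "DionTimmer/controlnet_qrcode-control_v1p_sd15",  # assumed community QR ControlNet
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="a watercolor cityscape at sunset",
    image=qr_img,                       # the QR pattern steers the composition
    controlnet_conditioning_scale=1.3,  # higher values keep the code scannable
    num_inference_steps=30,
).images[0]
image.save("qr_art.png")
```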

#news #AI

🔸 More content 👇👇

@AI_DeepMind