https://youtu.be/K3W61Ae0M9g this is so damn beautiful
Dear Nia
This is not a good summation of the many-worlds interpretation of quantum mechanics. This is what happens when you stay up reading about it while imbibing moderate-to-inadvisable amounts of beer. (I’m sure Nia and Soren were fine in the end. (More stuff below.))…
Pepemedia 🇺🇦🏳️🌈
Photo
New original meme for y'all, in response to:
(https://news.1rj.ru/str/atellschannel/4736)
киберпсалмы & mathюки
"With GPT-3 costing around $4.6 million in compute, than would put a price of $8.6 billion for the compute to train "GPT-4".
Even if parameter partitioning made bigger models easy from a memory point of view, compute seems like the hardest challenge; you still need to solve the memory issue to get the model to load at all.
However, if you're lucky you can get a 3-6x compute increase from Nvidia A100s over V100s: https://developer.nvidia.com/blog/nvidia-ampere-architecture-in-depth/
But even a 6x compute gain would still put the cost at $1.4 billion.
Nvidia only reported $1.15 billion in revenue from "Data Center" in 2020 Q1, so just to train "GPT-4" you would pretty much need the entire world's supply of graphics cards for 1 quarter (3 months), at least on that order of magnitude."
Looks like there will be no significantly larger model in GPT-4 for us :C
source - https://www.reddit.com/r/MachineLearning/comments/i49jf8/d_biggest_roadblock_in_making_gpt4_a_20_trillion/
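For anyone who wants to sanity-check the arithmetic, here is a quick back-of-the-envelope sketch in Python. All figures are the thread's own estimates (GPT-3 cost, the $8.6B projection, the optimistic A100 speedup, Nvidia's data-center revenue), not measured numbers.

```python
# Re-running the thread's back-of-the-envelope numbers (estimates, not measurements).
gpt3_compute_cost = 4.6e6           # ~$4.6M quoted for training GPT-3
gpt4_compute_cost = 8.6e9           # ~$8.6B projected for a 20T-parameter "GPT-4"

scale_up = gpt4_compute_cost / gpt3_compute_cost
print(f"implied compute scale-up over GPT-3: ~{scale_up:,.0f}x")     # ~1,870x

a100_speedup = 6                    # optimistic A100-over-V100 compute gain
cost_on_a100 = gpt4_compute_cost / a100_speedup
print(f"cost with a 6x A100 gain: ~${cost_on_a100 / 1e9:.1f}B")      # ~$1.4B

nvidia_dc_revenue_q1_2020 = 1.15e9  # Nvidia "Data Center" revenue, 2020 Q1
quarters_of_supply = cost_on_a100 / nvidia_dc_revenue_q1_2020
print(f"≈ {quarters_of_supply:.1f} quarters of Nvidia data-center revenue")
```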
Just found an amazing site - https://code.likeagirl.io/
really awesome concept that may inspire someone
Code Like A Girl
Welcome to Code Like A Girl, a space that celebrates redefining society's perceptions of women in technology. Share your story with us!