
Continuous Learning_Startup & Investment

Paul Graham has written a fantastic essay. It is one I should keep rereading for the rest of my life.

"How to Do Great Work"

<Korean translation>
https://frontierbydoyeob.substack.com/p/frontier-13-how-to-do-great-work?utm_source=post-email-noscript&publication_id=944480&post_id=132707382&isFreemail=true&utm_medium=email

<Original>
http://www.paulgraham.com/greatwork.html?utm_source=substack&utm_medium=email


57 views, 06:32

Continuous Learning_Startup & Investment

https://youtu.be/GQNYD9yzIjU

wow…

YouTube

"As overwhelming as if 30 people were singing": 'Demain n'existe pas' by Austin Kim X Kim Sung-hyun X Seo Young-taek X Lee Dong-kyu | Phantom Singer 4 | JTBC, aired 230526


63 views, 06:57

Continuous Learning_Startup & Investment

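Cases that empirically validate the technique behind the LIMA model seem to be emerging.

The original Evol Instruct datasets used to fine-tune the WizardLM models contain about 70K examples for V1 and about 190K for V2.

The newly released WizardLM V1.1 is reportedly fine-tuned on only 1K examples. Whether that set was curated from the original Evol Instruct data or obtained from newly written prompts is unclear, because nothing has been published yet (they say the dataset and method will be released soon).

The weights of the 13B model are already public, though, and its benchmark performance looks comparable to WizardLM 30B V1.0. I was somewhat skeptical when the LIMA paper came out, but now that real cases show it can work, I think the approach is viable.

As I have long thought, this does require "very high-quality" data, but a company could probably build that much fairly easily. Base models that share LLaMA's architecture and permit commercial use, such as OpenLLaMA and XGen, are already available. At around 1K examples, human labor alone is enough to produce the data, so I expect a wave of fine-tuned models that are genuinely usable commercially, built without GPT-4's involvement, to appear soon.

The community could start pooling effort to build data right away. For now the data would still be generated in bulk with GPT-4, but people need to curate it. First, narrow it down to about 1K "high-quality" examples, then machine-translate them. Up to that point the data is exactly what GPT-4 generated, so the licensing is murky. So people then review each of the 1K translated examples one by one. The idea is a very thorough review of the 1K dataset: restructuring sentences, correcting terms so they read more naturally, and so on. Then, even if the originals were produced by GPT-4, the result would differ completely from them, so I think it would be usable.
: Is anyone interested?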
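
A rough sketch of how the curation step described above could be tracked, assuming each generated example already carries a human-assigned quality rating. The `Example` fields, the `curate` function, and the rating scale are hypothetical and only illustrate the "narrow down to about 1K, translate, then review by hand" flow; nothing here reflects how WizardLM or LIMA actually built their data.

```python
from dataclasses import dataclass


@dataclass
class Example:
    instruction: str
    response: str
    quality: float                # hypothetical human rating in [0, 1]
    translated: bool = False      # set once machine translation is done
    human_reviewed: bool = False  # set once a person has rewritten/checked it


def curate(examples, k=1000):
    """Drop duplicate instructions, then keep the k highest-rated examples."""
    seen = set()
    unique = []
    for ex in examples:
        key = ex.instruction.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(ex)
    return sorted(unique, key=lambda ex: ex.quality, reverse=True)[:k]


def ready_for_release(ex):
    """An example is usable only after translation and human review."""
    return ex.translated and ex.human_reviewed
```

Keeping the review flags on each record makes it easy to see how much of the 1K set still needs a human pass before release.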

https://twitter.com/WizardLM_AI/status/1677282955490918401

https://huggingface.co/WizardLM

X (formerly Twitter)

WizardLM on X

Introduce the newest WizardLM models trained with only 1k high-quality evolved data!

WizardLM-13B-V1.1 achieves:

(1). 6.74 on MT-Bench
(2). 86.32% on Alpaca Eval (ChatGPT is 86.09%)
(3). 99.3% on WizardLM Eval

Github: https://t.co/4OuXolwS1P

Weights:…

59 views, 00:34

Continuous Learning_Startup & Investment

https://m.moneys.co.kr/article.html?no=2023070717032471434&code=w0405&fbclid=IwAR35qilbjwWAoz0TIbEfcdOXIEvYJs1mYUpz2FCXvNRMIdpp0RnRL64hueE_aem_AX-6ueGsTtvEudxSzQFJHSw9CsrZvytSjzyjU0PHWVPP8TD94mbyuJT_d6-bDI80JnM&mibextid=Zxz2cZ#_enliple

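MoneyS

[People] Baemin founder Kim Bong-jin on leaving: "Thank you, and thank you again" - MoneyS

Kim Bong-jin, founder and chairman of Woowa Brothers, the company that operates Baedal Minjok (Baemin), is leaving the company. The decision comes 13 years after Woowa Brothers was founded. In a company-wide email to employees on the 7th, Kim said, "The passionate times I shared with our colleagues made me truly happy. However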

51 views, 03:40

Continuous Learning_Startup & Investment

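전종현의 인사이트 (Jeon Jong-hyun's Insights)

Paul Graham has written a fantastic essay. It is one I should keep rereading for the rest of my life. "How to Do Great Work" <Korean translation> https://frontierbydoyeob.substack.com/p/frontier-13-how-to-do-great-work?utm_source=post-email-noscript&publication_id=944480&post_id=132707382&isFreemail=true&utm_medium=email <Original> h…

Oh… it reads like a veteran who has been through it all telling you "so, great work is like this…", the way Bob Ross finishes a painting and says "easy, right?" haha

Also, Patrick (CEO of Stripe) has shared on his blog a collection of records about great things humanity has built, which seems worth reading alongside this.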

Hardy's *A Mathematician's Apology*

Some materials about successful industrial/applied research labs. I recommend all of them. Further recommendations very [welcome](mailto:patrick@collison.ie). Also, does it just *seem* that their heyday is past, or has something structurally changed?


44 views, edited 08:18

Continuous Learning_Startup & Investment

Amazon

Dealers of Lightning: Xerox PARC and the Dawn of the Computer Age


46 views, 08:18

Continuous Learning_Startup & Investment

- [Alvarez: Adventures Of A Physicist](https://www.amazon.com/Alvarez-Adventures-Physicist-Alfred-Foundation/dp/0465001165). Luis Alvarez's first-hand account of participating in the development of [GCA](https://en.wikipedia.org/wiki/Ground-controlled_approach), radar, and the Manhattan Project.
- [Doing the Impossible](https://www.amazon.com/Doing-Impossible-Management-Spaceflight-Exploration/dp/1461437008). How George Mueller managed the Apollo Program.

41 views, 08:18

Continuous Learning_Startup & Investment

https://www.theinformation.com/articles/nvidia-acquired-ai-startup-that-shrinks-machine-learning-models?utm_campaign=article_email&utm_content=article-10790&utm_source=sg&utm_medium=email&rc=ocojsj

The Information

Nvidia Acquired AI Startup That Shrinks Machine-Learning Models

Nvidia in February quietly acquired OmniML, a two-year-old artificial intelligence startup whose software helped shrink machine-learning models so they could run on devices rather than in the cloud, according to a spokesperson and LinkedIn profiles of former…

40 views, 08:27

Continuous Learning_Startup & Investment

https://softwarestackinvesting.com/snowflake-snow-q1-fy2024-earnings-review/

Software Stack Investing

Snowflake (SNOW) Q1 FY2024 Earnings Review - Software Stack Investing

Snowflake's consumption business continues to feel pressure as large enterprise customers look for ways to optimize usage. While it seemed that management had

39 views, 23:15

Continuous Learning_Startup & Investment

https://m.blog.naver.com/PostView.naver?blogId=humbleinvest&logNo=223127883654&proxyReferer=

NAVER

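[Intekplus] From Advanced Packaging to Secondary Batteries (Full Summary)

When I talk about Intekplus, I often hear that it is hard to understand. Understandably so: for a company with a market cap of only KRW 300 billion ...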

41 views, 23:16

Continuous Learning_Startup & Investment

https://www.sedaily.com/NewsView/29S02FWDDI

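Seoul Economic Daily

President Choi Si-young: "World's first 'GAA 3D packaging' in 2025"

Samsung Electronics (005930) has unveiled a technology roadmap for taking the lead in semiconductors at 3 nm (nanometer, one billionth of a meter) and below. In foundry ...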

33 views, 23:16

Continuous Learning_Startup & Investment

https://n.news.naver.com/article/018/0005521278?sid=105&fbclid=IwAR0JfY18MMYLHJey1sqOpq91q-OpCmQMfMruPKNbVgr4rVf2B8XVZmnpxjI&utm_source=substack&utm_medium=email

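Naver

"We will become the next Samsung Electronics with AI semiconductors," says this company, whose weapon is its 'global team'

Park Sung-hyun (39), CEO of Rebellions, speaking with a reporter on the 25th of last month at Rebellions headquarters in Jeongja-dong, Bundang-gu, Seongnam, explains the company's AI semiconductor technology, the only one in Korea that uses Samsung's 5 nm process and supports floating-point operations. Last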

35 views, 23:17

Continuous Learning_Startup & Investment

https://m.blog.naver.com/PostView.naver?blogId=tmdejr1267&logNo=223143932128&proxyReferer=

NAVER

First-Half 2023 Review

1. As I always feel, time is a truly curious thing. So much happened in the first six months of this year that time has seemed especially...

33 views, 23:19

Continuous Learning_Startup & Investment

Data Freshness in Machine Learning Systems.
When it comes to Machine Learning Systems, we usually plug in on top of Data Engineering Systems where data is already collected, transformed and curated for efficient usage in downstream systems - ML System is just one of them. This does not mean however that no additional data transformations need to happen after data is handed over. We refer to Data Freshness in Machine Learning Systems as Feature Freshness.
When thinking about composition of how data is served to the end user in ML Systems there are two mostly independent pieces, hence also two perspectives on

33 views, 23:36

Continuous Learning_Startup & Investment

Feature Freshness:
Feature Freshness at Model Training time: how much time does it take for a generated data point to be included when training a Machine Learning Model which is then deployed to serve the end user. Remember that Machine Learning models are nothing more than Statistical models trained to predict certain outcomes on a given feature distribution. We can’t avoid ML Models becoming stale if not retrained. This phenomenon of ML models becoming stale is called Feature and Concept Drift (you can read more about them here).
Feature Freshness at inference time: how much time does it take for a generated data point to be available when performing Inference with the previously trained and deployed model. Features used for inference are usually decoupled in terms of freshness from the ones that are used while training the model and are less stale.
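
A minimal sketch of the two measurements described above, assuming freshness is simply the elapsed time between a data point being generated and it becoming usable. The function names, the timestamps, and the freshness budget are all hypothetical.

```python
from datetime import datetime, timedelta


def freshness_lag(generated_at, available_at):
    """Elapsed time between a data point being generated and becoming usable."""
    if available_at < generated_at:
        raise ValueError("a data point cannot be available before it is generated")
    return available_at - generated_at


def is_fresh(lag, budget):
    """True if the lag is within the freshness budget agreed for this feature."""
    return lag <= budget


# Hypothetical timestamps for one data point.
event = datetime(2023, 7, 1, 12, 0)     # data point generated
deployed = datetime(2023, 7, 8, 12, 0)  # model trained on it is deployed
served = datetime(2023, 7, 1, 12, 5)    # feature readable at inference time

training_lag = freshness_lag(event, deployed)   # training-time freshness
inference_lag = freshness_lag(event, served)    # inference-time freshness
```

In this made-up case the inference-time feature is minutes old while the deployed model reflects data a week old, which matches the point that inference features are usually less stale than training ones.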

32 views, 23:36

Continuous Learning_Startup & Investment

🥇Top ML Papers of the Week

How Language Models Use Long Contexts

- finds that LM performance is often highest when relevant information occurs at the beginning or end of the input context; performance degrades when relevant information is provided in the middle of a long context. ([paper](https://substack.com/redirect/4e6b797d-9aed-4940-88c7-3af5b63e4f20?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI)|[tweet](https://substack.com/redirect/3a9b6a9f-fc9e-40a0-b172-f779c899bacf?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI))

LLMs as Effective Text Rankers

- proposes a prompting technique that enables open-source LLMs to perform state-of-the-art text ranking on standard benchmarks. ([paper](https://substack.com/redirect/7782bdfe-6f9c-4c37-87da-f353da8a7a7f?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI)|[tweet](https://substack.com/redirect/dfbde4a7-b3ae-4df2-ac6d-8283361c3ad3?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI))

Multimodal Generation with Frozen LLMs

- introduces an approach that effectively maps images to the token space of LLMs; enables models like PaLM and GPT-4 to tackle visual tasks without parameter updates; enables multimodal tasks and uses in-context learning to tackle various visual tasks. ([paper](https://substack.com/redirect/8377a115-b5c0-4a05-80c3-821099b7ccbf?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI)|[tweet](https://substack.com/redirect/56122f84-fd95-4fb3-bc68-2239ef4ba411?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI))

Elastic Decision Transformer

- introduces an advancement over Decision Transformers and variants by facilitating trajectory stitching during action inference at test time, achieved by adjusting to shorter history that allows transitions to diverse and better future states. ([paper](https://substack.com/redirect/2b1b7dcb-4143-465f-9cd4-aba578c73279?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI)|[tweet](https://substack.com/redirect/adc07e2e-3c86-423c-aa6b-6e4ab3ed2a0a?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI))

Physics-based Motion Retargeting in Real-Time

- proposes a method that uses reinforcement learning to train a policy to control characters in a physics simulator; it retargets motions in real-time from sparse human sensor data to characters of various morphologies. ([paper](https://substack.com/redirect/d7cf6278-7ebf-42f2-9d21-6f598e29cd1e?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI)|[tweet](https://substack.com/redirect/88b736f9-b052-4aa4-be5a-697533fa2d94?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI))

InterCode

- introduces a framework of interactive coding as a reinforcement learning environment; this is different from the typical coding benchmarks that consider a static sequence-to-sequence process. ([paper](https://substack.com/redirect/48889a92-d287-4fd2-87e0-3b72f395c3ed?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI)|[tweet](https://substack.com/redirect/e4f3aeb7-b2d3-49a9-9b32-9d0a842dd7f4?j=eyJ1IjoiMWRheDAifQ.YVDAzsk3G87vydkjsTF3WruaemlL7xgZ83byJs8O8dI))

34 views, edited 23:39

Continuous Learning_Startup & Investment

https://github.com/0hq/tinyvector
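- Built with SQLite + Python (Flask) + NumPy
- Under 500 lines of code, so it is easy to customize
- Performance comparable to advanced vector databases on small and medium datasets
- Keeps every index in memory for fast queries
- Coming soon:
  - Powerful queries (full SQL support)
  - Model integrations (SBert, Hugging Face models, OpenAI, Cohere, ...)
  - Python/JS clients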
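
To illustrate the general idea (rows stored in SQLite, the whole index held in memory, brute-force nearest-neighbor search), here is a small sketch. It is not tinyvector's code or API: the `TinyStore` class is hypothetical, and it uses pure-Python arithmetic in place of NumPy so that it runs with the standard library alone.

```python
import json
import math
import sqlite3


def _unit(vec):
    """Scale a vector to length 1 (assumes the vector is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


class TinyStore:
    """Hypothetical store: text in SQLite, normalized vectors in a Python list."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE items (id INTEGER PRIMARY KEY, text TEXT, embedding TEXT)"
        )
        self.index = []  # list of (row id, unit vector), kept fully in memory

    def add(self, text, embedding):
        cur = self.db.execute(
            "INSERT INTO items (text, embedding) VALUES (?, ?)",
            (text, json.dumps(embedding)),
        )
        self.index.append((cur.lastrowid, _unit(embedding)))

    def query(self, embedding, k=3):
        """Return the k most similar texts with their cosine similarity."""
        q = _unit(embedding)
        scored = sorted(
            ((sum(a * b for a, b in zip(q, v)), row_id) for row_id, v in self.index),
            reverse=True,
        )[:k]
        return [
            (self.db.execute("SELECT text FROM items WHERE id = ?", (row_id,)).fetchone()[0], score)
            for score, row_id in scored
        ]
```

A linear scan like this is usually fine for small and medium collections; the appeal is that there is almost nothing to configure.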

GitHub

GitHub - 0hq/tinyvector: A tiny nearest-neighbor embedding database built with SQLite and Pytorch. (In development!)

A tiny nearest-neighbor embedding database built with SQLite and Pytorch. (In development!) - 0hq/tinyvector

36 views, 23:41

Continuous Learning_Startup & Investment

Forwarded from 요즘AI

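📌 Where are the business opportunities in the AI industry?

About seven months have passed since the AI industry reached its first turning point with the launch of ChatGPT on November 30, 2022.

On the front lines, a second wave seems to be slowly beginning.

Cohere, an AI unicorn (valued at about $2B), described this wave by dividing AI-driven productivity gains into three phases:

Phase 1: the current stage.
Large language models (LLMs) are being trained and deployed broadly, and users rely on front-end tools such as ChatGPT for help ideating, drafting, and refining text. This is an early stage in which companies' own data is rarely used.

Phase 2: the stage that uses retrieval-augmented generation (RAG).
The LLM gains access to company data, and users interact with the language model through an interface such as a chatbot.
It acts as a Knowledge Assistant (KA) that can efficiently search, synthesize, and report on almost any task a person could do.
Examples: 👩🏻 "Summarize the five most recent reports from the Scranton branch and find the top salesperson.", "Among the top five products by revenue, which has the highest gross margin?"

Phase 3: the stage in which the Knowledge Assistant (KA) can take action on behalf of the worker.
The Knowledge Assistant can interact intelligently and reliably with the company's systems and can even take on the actual 'execution' of work. Given the variety of enterprise systems and interfaces involved, this will take time, but language models are expected to generate and understand the required formats relatively quickly.
Example: 👨🏻 "Send 500 bundles of 80-pound stock from the Syracuse branch, with an invoice and no discount."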
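
To make Phase 2 concrete, here is a toy sketch of the retrieval-augmented pattern: pick the company documents most relevant to a question, then place them in the prompt that would be sent to a language model. The word-overlap scoring, the function names, and the sample documents are all hypothetical stand-ins (a real system would typically use embeddings and an actual LLM call, which is left out here).

```python
def tokenize(text):
    """Lowercase and split on whitespace; a crude stand-in for real tokenization."""
    return set(text.lower().split())


def retrieve(query, documents, k=2):
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Assemble the prompt an LLM would receive: retrieved context plus the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical internal documents.
docs = [
    "Scranton branch sales report for June",
    "Cafeteria menu for Friday",
    "Scranton branch top salesperson ranking",
]
prompt = build_prompt("Scranton branch sales report", docs)
```

The model itself is unchanged in this pattern; only the prompt is enriched with company data, which is why Phase 2 can arrive without retraining.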

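Advances in technology always bring new innovation. Just as the advance of the mobile interface (Phase 1) transformed countless lifestyles (Phases 2 and 3), the AI industry seems likely to follow a similar pattern.

Sam Altman has stressed that GPT models are reasoning engines, not vast knowledge databases. Phase 1 was arguably the stage in which such a powerful reasoning model was built.

Now seems to be the time when that powerful reasoning model needs refined data to produce the best answers. A company's proprietary data will likely play that role, because improving productivity with company data is an ideal market for AI technology.

So it is time to think about which AI services will let companies make good use of their own data, and what form those services will take.

At this stage, the user interface may well shift to a form entirely different from existing ones. It feels like a time that calls for creativity.

✔️ This post draws on Cohere's article 'How Generative AI and LLMs Unlock Greater Workforce Productivity'. A paraphrased translation of the original is available here. It contains plenty of good material beyond what 요즘AI has covered, so it is worth a read.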

NAVER

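How to Boost Productivity with Generative AI and LLMs - Cohere

About seven months have passed since the AI industry reached its first turning point with the launch of ChatGPT on November 30, 2022.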

35 views, 00:10

Continuous Learning_Startup & Investment

Everyone will be able to code using English. The methods we use to learn programming and develop software are set to undergo a radical transformation in the upcoming months.

https://chat.openai.com/share/5689e899-673b-469f-af85-977b03c9e825

ChatGPT

Starting Terminal, Code Interpreter.

A conversational AI system that listens, learns, and challenges

45 views, edited 00:19

Continuous Learning_Startup & Investment

https://twitter.com/swyx/status/1677478189080395777?s=46&t=h5Byg6Wosg8MJb4pbPSDow

Twitter

The most advanced AI agent the world has ever seen is rolling out to 20m people this weekend and not enough people are talking about it.

Absurd. Read @emollick and @simonw's blogs of what it can do.

If you've used @OpenAI's new Code Interpreter model, or…

54 views, 01:12

Continuous Learning_Startup & Investment

https://www.oneusefulthing.org/p/what-ai-can-do-with-a-toolbox-getting?utm_medium=reader2

www.oneusefulthing.org