
Engineer Readings

[databases]
https://www.uber.com/en-NL/blog/auto-categorizing-data-through-ai-ml/
Data categorization, the process of classifying data based on its characteristics and essence, is a foundational pillar of any privacy or security program. Fine-grained data categorization is pivotal for implementing privacy and security controls, such as access policies and encryption, and for managing the lifecycle of data assets, including retention and deletion. This blog delves into Uber’s approach to achieving data categorization at scale by leveraging various AI/ML techniques.

581 views, edited 13:49

Engineer Readings

[highload]

https://openai.com/index/scaling-kubernetes-to-7500-nodes/

OpenAI

Scaling Kubernetes to 7,500 nodes

We’ve scaled Kubernetes clusters to 7,500 nodes, producing a scalable infrastructure for large models like GPT-3, CLIP, and DALL·E, but also for rapid small-scale iterative research such as Scaling Laws for Neural Language Models.

👍2

647 views, 20:23

Engineer Readings

[llm][usecase][text-to-sql]

https://medium.com/pinterest-engineering/how-we-built-text-to-sql-at-pinterest-30bad30dabff

Medium

How we built Text-to-SQL at Pinterest

Adam Obeng | Data Scientist, Data Platform Science; J.C. Zhong | Tech Lead, Analytics Platform; Charlie Gu | Sr. Manager, Engineering

👍1

716 views, 19:04

Engineer Readings

[llm][usecase][files organizer]

https://devpost.com/software/llamafs

Devpost

LlamaFS

A self-organizing file system - automatically rename and organize your computer with Llama. `Untitled (1).ipynb` no more!

696 views, 19:13

Engineer Readings

[llm][usecase][coding-agent]

https://github.com/princeton-nlp/SWE-agent

GitHub

GitHub - SWE-agent/SWE-agent: SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can…

SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024] - Gi...

819 views, 16:34

Engineer Readings

[news][ai][hackathon]
Great projects out of the Mistral AI hackathon, which took place in Paris.

https://x.com/alexreibman/status/1796349663710511114?s=46&t=eNN3Y-GKeBSlFyyj1ozvgg

944 views, 07:13

Engineer Readings

[testing][uber]

https://www.uber.com/en-NL/blog/flaky-tests-overhaul/

👍3

826 views, edited 09:07

Engineer Readings

[distributed systems][kafka]

Kora: A Cloud-Native Event Streaming Platform For Kafka

https://www.vldb.org/pvldb/vol16/p3822-povzner.pdf

744 views, 06:14

Engineer Readings

[course]

https://ppc.cs.aalto.fi/

ppc.cs.aalto.fi

Programming Parallel Computers

Aalto University · This is a practical hands-on course on algorithm engineering for modern parallel computers. You will learn how to design programs that make the best possible use of the computing power of multicore CPUs and GPUs.

❤1

957 views, 21:06

Engineer Readings

[fun][video]
Coding Doom in C

https://m.youtube.com/watch?v=p7f9p9nDsmc&feature=youtu.be

YouTube

Code a DOOM-like game engine from scratch in C [PART I]

This is part I of the DOOM-like game engine from scratch tutorial.

It introduces the basics of 2.5D (Pseudo-3D) graphics rendering and movement mechanics. Also concepts such as portals, rendering, world transformations, and others.

You can download the…

🔥2👍1

729 views, 09:50

Engineer Readings

[memory]

What Every Programmer Should Know About Memory

This paper explains the structure of memory subsystems in use on modern commodity hardware, illustrating why CPU caches were developed, how they work, and what programs should do to achieve optimal performance by utilizing them.

https://people.freebsd.org/~lstewart/articles/cpumemory.pdf

🔥2

858 views, 16:19

Engineer Readings

[data streaming]

https://www.notion.so/blog/building-and-scaling-notions-data-lake

Notion

How Notion built and grew our data lake to keep up with rapid growth

🔥2

693 views, 15:50

Engineer Readings

[learning][distributed systems]
A colleague shared a great way to study distributed systems: by building them.

https://fly.io/dist-sys/1/

Fly

Challenge #1: Echo

Documentation and guides from the team at Fly.io.

🔥3

737 views, 11:06

Engineer Readings

[java][virtual threads][netflix]

https://netflixtechblog.com/java-21-virtual-threads-dude-wheres-my-lock-3052540e231d?gi=40fe9bdcedac

Medium

Java 21 Virtual Threads - Dude, Where’s My Lock?

Getting real with virtual threads

👍2

608 views, 21:08

Engineer Readings

[video]
How computers work

https://www.youtube.com/watch?v=HaBMAD-Dr8M&list=PLnAxReCloSeTJc8ZGogzjtCtXl_eE6yzA&index=2

YouTube

Logic gates - From transistors to logic gates NAND, AND, NOR, OR, NOT, XOR how computers work PART 1

Logic Gates - This video describes how the main logic gates are built starting from transistors in CMOS technology, which is used mostly in CPUs and RAM. We see the NAND, AND, OR, NOR, NOT, XOR gates. At the end we see how to build a three-input AND gate and…
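
As a rough companion to the gate list above, here is a minimal Python sketch that composes the same gates from a single NAND primitive. It is not taken from the video, which starts from transistors; it models only the logic, not CMOS circuitry, and all function names are hypothetical.

```python
# Illustrative only: every gate below is derived from NAND alone.
# Inputs and outputs are the integers 0 and 1.

def nand(a: int, b: int) -> int:
    """NAND: the output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1

def not_(a: int) -> int:
    # Tying both NAND inputs together inverts the signal.
    return nand(a, a)

def and_(a: int, b: int) -> int:
    # AND is NAND followed by an inverter.
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    # De Morgan: a OR b == NOT(NOT a AND NOT b).
    return nand(not_(a), not_(b))

def nor(a: int, b: int) -> int:
    return not_(or_(a, b))

def xor(a: int, b: int) -> int:
    # Classic four-NAND XOR construction.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def and3(a: int, b: int, c: int) -> int:
    # A three-input AND modeled as two chained two-input ANDs.
    return and_(and_(a, b), c)

if __name__ == "__main__":
    # Print the truth table for the two-input gates.
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, nand(a, b), and_(a, b), or_(a, b), nor(a, b), xor(a, b))
```

Chaining two-input gates is only one way to model a three-input AND; a hardware design may instead build it as a single gate with more transistors.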

👍1🔥1

676 views, 10:37

Engineer Readings

[distributed systems][paper]

Event-Based Programming without Inversion of Control

https://lampwww.epfl.ch/~odersky/papers/jmlc06.pdf

684 views, 20:34

Engineer Readings

[netflix][reliability]

https://t.co/RzjbUKNXjo

Medium

Enhancing Netflix Reliability with Service-Level Prioritized Load Shedding

Applying Quality of Service techniques at the application level

608 views, 20:46

Engineer Readings

[video][coding][visual language model]

https://youtu.be/vAmKB7iPkWw?si=7hCLqseUtJ-Pmi1C

YouTube

Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation

Full coding of a Multimodal (Vision) Language Model from scratch using only Python and PyTorch.

We will be coding the PaliGemma Vision Language Model from scratch while explaining all the concepts behind it:
- Transformer model (Embeddings, Positional Encoding…

788 views, 16:14

Engineer Readings

[asyncio][python]
https://www.roguelynn.com/words/asyncio-we-did-it-wrong/

roguelynn

asyncio: We Did It Wrong

"The concurrent Python programmer’s dream", the answer to everyone's asynchronous prayers. The `asyncio` module has various layers of abstraction allowing developers as much control as they need and are comfortable with. But it's easy to get lulled into a…

635 views, 15:43

Engineer Readings

[paper][GC][state machine]
https://arxiv.org/html/2405.11182v1

In this paper, the authors quantify the overhead of running a state machine replication system for cloud systems written in a language with garbage collection (GC). To this end, they (1) design a canonical cloud system—a distributed, consensus-based, linearizable key-value store—from scratch, (2) implement it in C++, Java, Rust, and Go, and (3) evaluate the implementations under update-heavy and read-heavy workloads on AWS with different resource constraints, aiming to maximize throughput while maintaining low tail latency. The results show that GC incurs a non-trivial cost, even with ample memory. With limited memory, languages with manual memory management can achieve an order of magnitude higher throughput than those with GC on the same hardware. A key observation is that if a cloud system is expected to scale significantly, building it in a language with manual memory management, despite the higher development cost, may lead to substantial cloud cost savings in the long run.

🔥2

804 views, edited 18:53

Engineer Readings

[ethz][computer architecture][lectures]

https://safari.ethz.ch/architecture/fall2022/doku.php?id=schedule

🔥1

686 views, 08:26
