Dev Miscellaneous – Telegram
342 subscribers
884 photos
6 videos
5 files
917 links
A channel where you can find developer tips, tools, APIs, resources, memes and interesting content.

Join our comments chat for more.

Comments chat (friendly :D)
https://news.1rj.ru/str/+r_fUfa1bx1g0MGRk
⚠️ RegreSSHion: RCE in OpenSSH's server, on glibc-based Linux systems

- The vulnerability (CVE-2024-6387) is a regression of an earlier issue, CVE-2006-5051; the regression was introduced in October 2020 (OpenSSH 8.5p1).
- The vulnerability allows for remote code execution as root on glibc-based Linux systems due to the SIGALRM handler calling non-async-signal-safe functions like syslog().
- Older vulnerable OpenSSH versions like 3.4p1 and 4.2p1 can be exploited by interrupting free() calls and leveraging heap corruption techniques like unlink() and House of Mind.
- Newer vulnerable versions like 9.2p1 can be exploited by interrupting malloc() calls and corrupting FILE structures to gain arbitrary code execution.
- Precise timing and network delay mitigation techniques are critical to winning the signal handler race condition.
- The exploit requires carefully crafting the heap layout and leveraging leftover data from previous allocations.
- OpenBSD is not vulnerable because it uses a safer syslog_r() function in its SIGALRM handler.
- The vulnerability is present in the default configuration of OpenSSH and affects the privileged sshd process.
- Significant effort and multiple iterations were required to develop reliable exploits for the different OpenSSH versions.
- The research demonstrates the continued need for vigilance in secure software development, as even a well-designed system like OpenSSH can have subtle regressions that introduce critical vulnerabilities.
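As a rough illustration of the version ranges involved, here is a minimal sketch that classifies an OpenSSH version string. The function names are hypothetical, and the ranges (before 4.4p1, and 8.5p1 up to but not including 9.8p1) are taken from the linked advisory as I understand it. It is a heuristic only: distributions often backport fixes without changing the upstream version number, and the advisory concerns glibc-based Linux systems.

```python
import re

def openssh_major_minor(banner):
    """Pull (major, minor) out of a string such as 'SSH-2.0-OpenSSH_9.2p1'."""
    match = re.search(r"OpenSSH_(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"no OpenSSH version found in {banner!r}")
    return int(match.group(1)), int(match.group(2))

def in_regresshion_range(banner):
    """Heuristic: True if the upstream version falls in a range listed as affected."""
    version = openssh_major_minor(banner)
    # Versions before 4.4p1 predate the original CVE-2006-5051 fix.
    if version < (4, 4):
        return True
    # The regression shipped in 8.5p1 and was fixed upstream in 9.8p1.
    return (8, 5) <= version < (9, 8)
```

A `True` result means "check your distribution's security tracker", not "this host is exploitable".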


https://www.qualys.com/2024/07/01/cve-2024-6387/regresshion.txt

@DevMisc (🟠 comments)
#security #ssh #linux
🔥2
Reverse Engineering Ticketmaster's Rotating Barcodes

- Ticketmaster has moved away from traditional printable PDF tickets in favor of a proprietary "SafeTix" system that uses rotating barcodes displayed on a mobile device.
- The rotating barcodes are meant to prevent ticket fraud, but the author argues they create significant usability issues, especially when cell service is poor at venues.
- Ticketmaster markets SafeTix as a solution to ticket scalping and fraud, but the author believes the real motivations are to lock users into Ticketmaster's ecosystem and make it harder to resell tickets outside their platform.
- The author was able to reverse engineer the SafeTix system and discovered it uses Time-based One-Time Passwords (TOTPs) along with a static bearer token to authenticate tickets.
- With the TOTP secrets and bearer token, the author could theoretically generate valid barcodes and bypass Ticketmaster's security measures.
- Ticketmaster makes it easy to extract the necessary token information by printing it to the browser console when the barcode is loaded.
- The author is uncertain about the lifetime of the TOTP tokens, but believes they may only be valid for up to 20 hours before the event based on Ticketmaster's documentation.
- The author developed a tool called "TicketGimp" that can render valid SafeTix barcodes using the extracted token information.
- The author is highly critical of Ticketmaster's practices, accusing them of using technology to exclude and disadvantage customers for their own financial gain.
- The author calls on Ticketmaster's developers to have more integrity and use their technical skills responsibly, rather than enabling Ticketmaster's "cruel business practices".
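For readers who have not met TOTP before, this is a minimal sketch of the standard algorithm (RFC 6238 on top of RFC 4226), using only the Python standard library. The 30-second step, 6 digits and SHA-1 are the RFC defaults and are assumptions here; the summary above does not say which parameters SafeTix uses, and this is not the author's TicketGimp code.

```python
import hashlib
import hmac
import struct

def totp(secret, unix_time, step=30, digits=6):
    """Compute a time-based one-time password for the given Unix time."""
    # The moving factor is the number of whole time steps since the epoch.
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

Anyone holding the shared secret and a reasonably accurate clock can compute the same code offline, which is why leaking the secret is enough to reproduce the rotating part of a barcode.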


https://conduition.io/coding/ticketmaster/

@DevMisc
#web #rev #misc
2👍2
Python has too many package managers

- Python has a fragmented package and environment management ecosystem, with over a dozen different tools, each with their own strengths and weaknesses.
- The lack of a standardized, efficient, and user-friendly package manager in Python has been an "inexcusable pain-in-the-ass" for many years.
- Other programming languages like Rust, C#, and JavaScript have developed more cohesive and widely-loved package management solutions like Cargo, NuGet, and npm.
- Python's legacy package manager pip has historically had poor dependency resolution, only recently adding backtracking capabilities. It also lacks environment management features.
- The proliferation of various configuration files like requirements.txt, setup.py, Pipfile, environment.yml etc. has led to a lot of redundancy and lack of standardization in Python package management.
- The introduction of PEP 621 in 2020 aimed to consolidate dependencies and configuration into a single pyproject.toml file, leading to the emergence of new tools like Poetry, PDM, Flit, and Hatch.
- Poetry is currently the closest Python tool to the Cargo experience, but it suffers from slow dependency resolution, especially for large projects.
- Conda is a popular choice for data scientists and experimentalists as it can manage non-Python dependencies, but it lacks some features like lock files and can be cumbersome for production use.
- The Rust community's influence is evident in promising new Python package management tools like uv, which aims to be a fast, Cargo-like drop-in replacement for pip.
- The Python community still lacks a cohesive, standardized, and widely-adopted package management solution, but tools like uv hold promise for the future.


https://dublog.net/blog/so-many-python-package-managers/

@DevMisc
#python #pip #extra
💯3👍1👏1
Why German Strings are Everywhere

- Developed by Umbra (CedarDB's predecessor)
- Adopted by DuckDB, Apache Arrow, Polars, and Facebook Velox

German Strings are a custom string type highly optimized for data processing. They offer significant improvements over traditional C and C++ string implementations.

Key Features:
- 128-bit struct representation (vs. 192 bits in C++)
- Short string optimization for strings ≤12 characters
- Long string format with 4-char prefix for quick comparisons
- Immutable design for better performance and concurrency
- Storage classes: persistent, transient, temporary

Advantages:
- Space-efficient, fitting in two CPU registers
- Reduced allocations and data movement
- Easier parallelization due to immutability
- Flexible lifetime management with storage classes
- Optimized for common database operations (comparisons, sorting)

Trade-offs:
- Requires careful consideration of string usage and lifetime
- Updates are more expensive (but rare in database systems)
- Maximum string length limited to 4 GiB
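To make the 128-bit layout concrete, here is a small sketch that packs a string into 16 bytes in the two formats listed above. Little-endian byte order and the `offset` argument (a stand-in for the real pointer of a long string) are assumptions for illustration, and the storage-class bits are left out.

```python
import struct

SHORT_MAX = 12  # strings up to 12 bytes are stored inline

def encode(data, offset=0):
    """Pack a byte string into a 16-byte header.

    'offset' is a hypothetical stand-in for the pointer to the full string.
    """
    length = len(data)
    if length >= 2 ** 32:
        raise ValueError("length must fit in 32 bits (4 GiB limit)")
    if length <= SHORT_MAX:
        # 4-byte length + 12 bytes of inline content (zero padded).
        return struct.pack("<I12s", length, data)
    # 4-byte length + 4-byte prefix + 8-byte pointer stand-in.
    return struct.pack("<I4sQ", length, data[:4], offset)

def definitely_differ(a, b):
    """True when length and prefix alone prove two strings are unequal."""
    # In both formats the first 8 bytes hold the length and the first 4 characters.
    return a[:8] != b[:8]
```

When `definitely_differ` returns `False` for two long strings, a real implementation still has to follow the pointers and compare the full contents; the prefix only lets most unequal pairs be rejected without that extra memory access.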


https://cedardb.com/blog/german_strings/

@DevMisc
#cpp #data #misc
👍5
Counting Bytes Faster Than You'd Think Possible

- The author was able to significantly optimize a byte-counting program, achieving a ~550x speedup over a naive implementation.
- The key optimization was using an interleaved memory access pattern, reading from different 4KB pages in a round-robin fashion, instead of sequential access.
- This interleaved access pattern takes advantage of the "Streamer" hardware prefetcher in modern CPUs, which can maintain separate forward and backward access streams for each 4KB page.
- Interleaving 8 different 4KB pages was found to be the optimal approach, providing up to a 30% performance boost over sequential access.
- The author also unrolled the inner loop to process 2 cache lines (64 bytes each) at a time, and added a prefetch instruction to fetch the next set of data.
- The final solution uses AVX2 SIMD instructions to perform the byte counting in a highly efficient manner.
- The author was able to achieve a ranking of #13 on the HighLoad leaderboard with this optimized solution.
- The interleaved memory access pattern seems to be an under-discussed optimization technique; the author does not recall seeing it used in other code.
- The author invites readers to share any other memory-based optimizations they know of.
- The document provides the full source code for the optimized byte-counting program, allowing readers to study and potentially apply the techniques in their own work.
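The access pattern itself is easy to show, even though Python cannot show the speedup. The sketch below visits 64-byte cache lines round-robin across 8 pages of 4 KB, as described above, and counts bytes equal to a target value. The function names are hypothetical, and this is an illustration of the ordering only, not the author's AVX2 code.

```python
PAGE = 4096            # page size assumed by the summary above
LINE = 64              # cache line size on typical x86 CPUs
PAGES_INTERLEAVED = 8  # number of pages visited round-robin

def interleaved_offsets(size):
    """Yield cache-line start offsets, cycling across 8 consecutive pages."""
    block = PAGE * PAGES_INTERLEAVED
    for block_start in range(0, size, block):
        for line_in_page in range(0, PAGE, LINE):
            for page in range(PAGES_INTERLEAVED):
                offset = block_start + page * PAGE + line_in_page
                if offset < size:
                    yield offset

def count_byte(data, target):
    """Count occurrences of 'target', reading the data in interleaved order."""
    return sum(data[off:off + LINE].count(target)
               for off in interleaved_offsets(len(data)))
```

Every cache line is still read exactly once, so the result matches a plain sequential count; only the order of the reads changes.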


https://blog.mattstuchlik.com/2024/07/21/fastest-memory-read.html

@DevMisc
#asm #cpp #optimization
1🤯1
Scaling One Million Checkboxes to 650M checks

- The website "One Million Checkboxes" (OMCB) launched on June 26th, 2024 and unexpectedly went viral, attracting millions of users and checkbox checks within the first few days.
- The initial architecture used a single Flask server, nginx reverse proxy, and Redis for state management, but this was quickly overwhelmed by the surge in traffic.
- Key principles for scaling the site included bounding costs, embracing short-term solutions, using simple self-hosted tech, and keeping the experience global.
- Scaling efforts involved adding more Flask servers, implementing batching and connection pooling, and capping bandwidth usage with Linux tc to control costs.
- Bugs like allowing checkbox checks beyond the 1 million limit caused issues that required quick fixes, like truncating the bitset.
- Adding a Redis replica helped spread the load, though finding the private IP address was a challenge.
- Ensuring clients received consistent, up-to-date checkbox state required adding timestamps and logic to handle stale updates.
- Rewriting the backend in Go provided a significant performance boost, allowing the implementation of a "sunsetting" feature to freeze checked boxes over time.
- Using Redis and Lua scripts made the sunsetting logic simple and race condition-free.
- The author learned valuable lessons about building for the unpredictable nature of the internet, validating their belief in demand for constrained anonymous interactions, and the benefits of launching quickly versus extensive planning.
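The state behind the site is small: one bit per checkbox. Here is a minimal in-memory sketch of such a bitset with the bounds check whose absence caused the "checks beyond the 1 million limit" bug mentioned above. The class and method names are hypothetical; the real site kept this state in Redis.

```python
NUM_BOXES = 1_000_000

class CheckboxGrid:
    """One bit per checkbox, stored in a bytearray."""

    def __init__(self, size=NUM_BOXES):
        self.size = size
        self.bits = bytearray((size + 7) // 8)

    def toggle(self, index):
        """Flip one checkbox and return its new state."""
        # Reject out-of-range indexes instead of silently growing the bitset.
        if not 0 <= index < self.size:
            raise IndexError(f"checkbox {index} is out of range")
        byte, bit = divmod(index, 8)
        self.bits[byte] ^= 1 << bit
        return bool(self.bits[byte] & (1 << bit))

    def count_checked(self):
        return sum(bin(b).count("1") for b in self.bits)
```

A million checkboxes fit in 125,000 bytes, so the whole state is cheap to snapshot and send to clients.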


https://eieio.games/essays/scaling-one-million-checkboxes/

@DevMisc (🟠 comments)
#learn #fun #scaling #misc
🔥2
TOTP tokens on my wrist with the smartest dumb watch

- The author has replaced the logic board of a classic Casio F-91W watch with a new ARM Cortex M0+ powered board from Sensor Watch.
- The new board allows the watch to be programmed and customized, including adding features like TOTP (time-based one-time password) support for two-factor authentication.
- The author was able to set up TOTP support for their Google and GitHub accounts, allowing them to access the OTP codes directly on their wrist.
- The author also created a new "ratemeter" watchface that can be used to measure rates, such as rowing strokes or cadence.
- The document provides detailed instructions on how to add TOTP secrets to the watchface code and how the ratemeter watchface was implemented.
- The Sensor Watch project provides a clean and easy to modify set of watchfaces and complications that can be customized.
- The F-91W watch case, combined with the new programmable board, offers a powerful and hackable platform with long battery life.
- The author highlights the availability of a WASM-based emulator that makes it easy to test and play with the custom builds.
- The document mentions other interesting watchfaces available in the Sensor Watch project, including a pulsometer and orrery.
- The author recommends getting a Sensor Watch from Oddly Specific Objects, though they have no affiliation with the company.


https://blog.singleton.io/posts/2022-10-17-otp-on-wrist/

@DevMisc (🟠 comments)
#c #fun #misc
"We ran out of columns" - The best, worst codebase

- The database was the central component of the system, with a table called "Merchants2" that had over 500 columns, created because the original "Merchants" table had run out of columns.
- The "SequenceKey" table was a single-column, single-row table used to generate IDs, demonstrating a creative but unconventional solution.
- The system had a manually maintained "calendar" table to track login access, which was a fragile and outdated approach.
- The employee data was reloaded from a CSV file every morning, with an email-based process to replicate the data to headquarters.
- There was a normalized copy of the database, but it required 7 joins to go from the "Merchants" table to a phone number, showing the complexity.
- The codebase was a mix of VB and C#, with a proliferation of JavaScript frameworks and custom modifications.
- The "shipping manager" application was built in a weekend by a single developer named Gilfoyle, who was known for not checking in his code.
- The author discovered a bug related to the shipping queue that was caused by a SOAP service client doing all the side effects instead of the service itself.
- The "Merchants Search" page was optimized by a senior developer named Justin, who was able to make significant improvements by decoupling the page into separate endpoints.
- The codebase, despite its flaws, allowed for a sense of freedom and creativity, with developers carving out their own "little worlds of sanity" within the larger monolithic application.


https://jimmyhmiller.github.io/ugliest-beautiful-codebase

@DevMisc
#sql #fun #badcode
2🔥1
How to Get or Create in PostgreSQL

- Implementing "get or create" functionality correctly in PostgreSQL can be tricky, with potential issues around race conditions, concurrency, and bloat.
- A simple INSERT statement is not idempotent, as executing it with the same input twice will trigger a unique constraint violation error.
- To provide idempotency, the process needs to handle two situations: 1) if the tag already exists, return the existing tag, and 2) if the tag does not exist, create it and return the new tag.
- Using a unique constraint violation to handle "get or create" can lead to bloat, as new rows are first inserted and then marked as dead if a duplicate is found.
- Checking if a tag exists before inserting it (the "look before you leap" approach) can suffer from time-of-check to time-of-use issues when used concurrently.
- The "ask for forgiveness" approach using INSERT ON CONFLICT DO NOTHING is a better way to handle "get or create" without generating bloat.
- Concatenating results from the target table and a WITH clause using UNION ALL can handle visibility issues with data-modifying statements in the WITH clause.
- Even the "ask for forgiveness" approach can suffer from concurrency issues, as a race condition can still occur between checking if a tag exists and inserting it.
- Using INSERT ON CONFLICT with a DO NOTHING clause is the most robust solution, providing idempotency, concurrency safety, and preventing bloat.
- Starting in PostgreSQL 17, the MERGE statement with RETURNING can provide an alternative approach that doesn't require unique or exclusion constraints on the target table.
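A minimal sketch of the `ON CONFLICT DO NOTHING` idea, using Python's built-in `sqlite3` instead of PostgreSQL so that it runs without a server (SQLite accepts the same clause). The `tags` table and function names are assumptions, and this two-statement version is simpler than the single-statement `WITH ... UNION ALL` query the summary refers to.

```python
import sqlite3

def open_tag_db():
    """Create an in-memory database with a unique constraint on the tag name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
    )
    return conn

def get_or_create_tag(conn, name):
    """Return the id of the tag called 'name', creating it if needed."""
    # A duplicate name is skipped instead of raising a constraint error.
    conn.execute(
        "INSERT INTO tags (name) VALUES (?) ON CONFLICT (name) DO NOTHING",
        (name,),
    )
    # Whether or not the insert did anything, the row exists now.
    row = conn.execute("SELECT id FROM tags WHERE name = ?", (name,)).fetchone()
    conn.commit()
    return row[0]
```

Calling the function twice with the same name returns the same id and leaves one row in the table, which is the idempotency property described above.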


https://hakibenita.com/postgresql-get-or-create

@DevMisc (🟠 comments)
#postgres #database #learn #misc
Handling Concurrency Without Locks

- Concurrency issues can be difficult to recognize and often get overlooked, leading to hard-to-debug bugs.
- It's tempting to dismiss concurrency issues due to perceived low likelihood, but they can still crop up unexpectedly under high load.
- Locking is a common approach to handling concurrency, but locks can be overused and lead to performance issues.
- The database is the lowest common denominator for coordinating locks across multiple processes and servers.
- The "ask for forgiveness" (EAFP) approach, where you try an operation and handle exceptions, is often more Pythonic than checking conditions in advance.
- In PostgreSQL, when an exception occurs within a transaction, it can block further commands until the transaction ends, requiring special handling.
- Using SELECT FOR UPDATE can lock rows to prevent race conditions, but this can also cause performance issues with high concurrency.
- Incrementing counters directly in the database, using an F expression, can avoid race conditions without the need for explicit locking.
- Combining database-level updates with RETURNING to immediately fetch the updated object can optimize the process further.
- The key is to keep concurrency issues in mind, avoid dismissing them due to perceived low likelihood, and use the most appropriate concurrency control mechanisms for the specific situation.
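The counter point can be shown without Django. The sketch below uses Python's built-in `sqlite3` and a hypothetical `links` table to contrast a read-modify-write update with an increment done inside the database. As far as I know, `update(hits=F("hits") + 1)` in Django produces SQL of the second kind. The lost update is simulated deterministically by interleaving two "workers" by hand.

```python
import sqlite3

def open_counter_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links (id INTEGER PRIMARY KEY, hits INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO links (id) VALUES (1)")
    return conn

def read_hits(conn, link_id):
    return conn.execute(
        "SELECT hits FROM links WHERE id = ?", (link_id,)
    ).fetchone()[0]

def write_hits(conn, link_id, hits):
    conn.execute("UPDATE links SET hits = ? WHERE id = ?", (hits, link_id))

def increment_in_database(conn, link_id):
    # The database computes the new value, so no stale read is involved.
    conn.execute("UPDATE links SET hits = hits + 1 WHERE id = ?", (link_id,))

def simulate_lost_update():
    """Two workers read the same value, then both write value + 1."""
    conn = open_counter_db()
    seen_by_a = read_hits(conn, 1)
    seen_by_b = read_hits(conn, 1)
    write_hits(conn, 1, seen_by_a + 1)
    write_hits(conn, 1, seen_by_b + 1)  # overwrites worker A's increment
    return read_hits(conn, 1)

def simulate_atomic_increments():
    conn = open_counter_db()
    increment_in_database(conn, 1)
    increment_in_database(conn, 1)
    return read_hits(conn, 1)
```

The first simulation ends with one hit recorded for two requests; the second records both.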

https://hakibenita.com/django-concurrency

@DevMisc
#sql #reliability #learn #misc
Can you convert a video to pure CSS?

- It is possible to convert a video into a pure CSS animation, where each pixel of the video is represented by an animated CSS element.
- This technique involves downscaling the video, extracting the pixel data, and generating a massive CSS animation with keyframes for each pixel.
- The process can be optimized by skipping keyframes where the pixel color hasn't changed, but this introduces some visual artifacts.
- An alternative approach uses CSS box-shadows to represent the video pixels, which is simpler and more performant, especially on Chrome browsers.
- Browser support and performance varies greatly, with Safari handling much larger CSS animations than Chrome before crashing.
- The final CSS-based video animation can be further converted into an animated GIF using a library like GIF.js.
- The motivation behind this project seems to be more about the technical challenge and novelty rather than practical application.
- The author emphasizes the importance of style over substance, suggesting this approach could be used for startup landing pages to send a message, even if it crashes most browsers.
- The author hints at the possibility of creating a new file format called ".vibcss" to standardize this CSS-based video approach.
- Overall, the document showcases an innovative, if impractical, technique to push the boundaries of what is possible with CSS animations.
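The box-shadow variant can be sketched in a few lines: each pixel becomes one shadow of a 1px by 1px element, and each frame becomes one keyframe. The function names and the `video` animation name are hypothetical, and the frames are assumed to be already downscaled and decoded into rows of `(r, g, b)` tuples.

```python
def frame_to_box_shadow(pixels):
    """Turn one frame (rows of (r, g, b) tuples) into a box-shadow value."""
    shadows = []
    for y, row in enumerate(pixels):
        for x, (r, g, b) in enumerate(row):
            # offset-x offset-y blur color. The x offset is shifted by one
            # because outer shadows are not painted underneath the element itself.
            shadows.append(f"{x + 1}px {y}px 0 rgb({r},{g},{b})")
    return ",".join(shadows)

def frames_to_keyframes(frames, name="video"):
    """Emit an @keyframes rule with one step per frame."""
    steps = []
    for i, frame in enumerate(frames):
        percent = 100 * i / len(frames)
        steps.append(
            f"  {percent:.2f}% {{ box-shadow: {frame_to_box_shadow(frame)}; }}"
        )
    return f"@keyframes {name} {{\n" + "\n".join(steps) + "\n}"
```

The element that plays the animation would need `width: 1px; height: 1px` and a stepped timing function such as `step-end`, so the browser jumps between frames instead of interpolating shadow colors.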


https://dgerrells.com/blog/can-you-convert-a-video-to-pure-css

@DevMisc (🟠 comments)
#css #web #fun
👍2
Floating points between zero and one

TL;DR: the number of float representations between 0.0 and 1.0 is roughly the same as between 1.0 and +inf
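The claim can be checked by counting bit patterns. For non-negative IEEE 754 floats, the integer value of the bit pattern grows with the float value, so the number of floats in a range is a difference of two integers. This sketch does it for 32-bit floats; the variable names are just for illustration.

```python
import struct

def float32_bits(x):
    """Return the 32-bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

zero_bits = float32_bits(0.0)
one_bits = float32_bits(1.0)
inf_bits = float32_bits(float("inf"))

# Floats in [0.0, 1.0), including zero and the subnormals.
count_below_one = one_bits - zero_bits
# Finite floats in [1.0, +inf).
count_from_one = inf_bits - one_bits

ratio = count_below_one / count_from_one
```

The two counts are 1,065,353,216 and 1,073,741,824: a ratio of 127 to 128, so close to equal but not exactly the same.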


https://chadnauseam.com/coding/random/floating-points-between-zero-and-one

@DevMisc
#lowlevel #fun #misc
😁1
100M Token Context Windows

- The document discusses research on "ultra-long context models" that can reason on up to 100 million tokens of context during inference, rather than relying on short-term memory.
- Current long-context evaluation methods like "Needle in a Haystack" have flaws that allow traditional models to perform well, so the document introduces a new "HashHop" evaluation that requires models to store and retrieve maximum information content.
- The company has trained a 100M token context model called LTM-2-mini, which is significantly more memory-efficient than large language models like Llama 405B when using a 100M token context.
- The company is building new supercomputing infrastructure on Google Cloud, including the Magic-G4 and Magic-G5 systems powered by NVIDIA GPUs, to support training and serving their ultra-long context models.
- The company has raised $465 million in funding from investors including Eric Schmidt, Jane Street, Sequoia, and Atlassian, with the goal of making AI models that can reliably produce high-quality code and software features.
- The company believes inference-time compute is the next frontier in AI, beyond just pre-training models, and is building custom training and inference stacks from scratch to enable this.
- The company is committed to responsible AI development, including a focus on cybersecurity and higher regulatory standards, and is hiring a Head of Security to lead these efforts.
- The company is rapidly growing, now at 23 people plus 8000 GPUs, and is hiring for roles in engineering, research, and supercomputing/systems to accelerate their work.
- The document showcases examples of the company's 100M token context model producing reasonable outputs for tasks like implementing a custom GUI calculator and a password strength meter, despite being several orders of magnitude smaller than state-of-the-art models.
- The overall focus is on developing AI models that can leverage ultra-long context to enable more reliable and capable AI systems, especially in domains like software development.


https://magic.dev/blog/100m-token-context-windows

@DevMisc (🟠 comments)
#ai #llm #misc
🤯1
Password protect a static HTML page, decrypted in-browser in JavaScript

Safely encrypt and password protect the content of your public static HTML file, to be decrypted in-browser without any back-end, so it can be served over static hosting like Netlify, GitHub Pages, etc.


https://github.com/robinmoisson/staticrypt

@DevMisc
#fun #web #cryptography #tools #extra
👍2
Programming Zero Knowledge Proofs: From Zero to Hero

- Zero Knowledge Proofs (ZKPs) allow one party (the prover) to prove to another party (the verifier) that they know some secret information without revealing that information.
- ZKPs have the key properties of privacy (proving something without revealing anything else) and succinctness (the proof stays roughly the same size regardless of the complexity of the computation).
- ZKPs can be programmed by writing special programs called circuits, which specify constraints that must be satisfied.
- The process of generating a ZKP involves a trusted setup to create a proving key and verification key, which can then be used to generate and verify proofs.
- Implementing basic ZKP programs involves defining constraints, compiling the circuit, performing a trusted setup, generating a witness, and generating/verifying proofs.
- Improving ZKP programs often involves adding more complex constraints, such as checking that inputs are not equal to 1.
- ZKPs can be used to implement cryptographic primitives like digital signatures, using hash functions and commitments instead of public-key cryptography.
- Implementing a digital signature scheme with ZKPs involves generating a private/public key pair (identity secret/identity commitment) and using them to sign and verify messages.
- ZKPs make it possible to implement complex cryptographic protocols, like group signatures, in a more straightforward way compared to traditional cryptographic techniques.
- Writing effective ZKP programs requires developing an intuition for how to express constraints in a way that satisfies the requirements of the underlying arithmetic circuits.
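To give a feel for what "constraints" means, here is a toy constraint checker for the statement "I know a and b, neither equal to 1, such that a * b = c". It only checks a witness against the constraints; it produces no proof and hides nothing. The small prime field and all names are assumptions for illustration, and real circuits are written in a dedicated circuit language rather than Python.

```python
P = 2 ** 61 - 1  # toy prime field; real systems use a much larger prime

def make_witness(a, b):
    """Compute every value the constraints mention, given the secret inputs."""
    return {
        "a": a,
        "b": b,
        "c": (a * b) % P,
        # An inverse of (x - 1) exists only when x != 1, which is how
        # "not equal to 1" can be expressed with multiplication alone.
        "inv_a": pow(a - 1, -1, P),
        "inv_b": pow(b - 1, -1, P),
    }

def satisfies(a, b, c, inv_a, inv_b):
    """Check the three constraints, each a single multiplication mod P."""
    return (
        (a * b - c) % P == 0
        and ((a - 1) * inv_a - 1) % P == 0
        and ((b - 1) * inv_b - 1) % P == 0
    )
```

`make_witness(1, 33)` raises `ValueError` because 0 has no inverse, and no made-up `inv_a` can satisfy the second constraint when `a` is 1, so the trivial factorization is ruled out.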


https://zkintro.com/articles/programming-zkps-from-zero-to-hero

@DevMisc
#cryptography #zkp #misc
Securing a Linux Server

A guide to secure and harden a Linux server install.


https://kenhv.com/blog/securing-a-linux-server

@DevMisc
#security #devops #learn
How does cosine similarity work?

When working with LLM embeddings, it is often important to be able to compare them. Cosine similarity is the recommended way to do this.
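For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal standard-library sketch (the function name is an assumption; real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The result ranges from -1 (opposite directions) through 0 (orthogonal) to 1 (same direction), and it ignores vector length, so only the direction of an embedding matters.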


https://tomhazledine.com/cosine-similarity/

@DevMisc
#embeddings #llm #learn
👍1
const fn: Pure Functions in Rust

I was taught formal methods at university, but these ultra-safe development techniques are expensive, require using unusual external verification languages, and, most damning for web and application developers, they slow down iteration.
After graduating and getting a webdev job, I despaired that the safety and guarantees of the formal systems that I had been introduced to weren't available to me as a web developer.
I was going to have to act if I wanted to live in a different world.


https://www.youtube.com/watch?v=voRBS0r4EyI

@DevMisc
#rust #fp #purity #learn
The sorry state of OpenSSL usability

OpenSSL is widely used but poorly documented, which makes it hard for users to work out even basic functionality such as generating an RSA key. The documentation is scattered across multiple websites, often contradictory, and assumes a level of cryptographic knowledge that many users lack. Even simple tasks like determining the key format can be challenging for lack of clear guidance. The author highlights several usability issues, such as the default use of a weak 512-bit RSA key and the absence of warnings or guidance when deprecated interfaces are used. The author argues that better documentation, user testing, and avoiding unnecessary forks would go a long way toward making this critical piece of software accessible to a wider audience.


https://jameshfisher.com/2017/12/02/the-sorry-state-of-openssl-usability/

@DevMisc
#cryptography #openssl #learn
👍1
Replacing FastAPI with Rust — A series

The author is starting a blog series exploring the possibility of replacing the FastAPI Python web framework with the Rust programming language for their use case. FastAPI is the author's go-to backend framework, but they are interested in moving beyond some of the limitations of Python. The key requirements the author is hoping to achieve with a Rust-based solution are high performance, type safety, and ease of deployment. The author is unsure where this exploration will lead, but is excited to document the process and see if they can create a compelling alternative to FastAPI. Readers are encouraged to follow along, provide feedback, and suggest ideas for the blog series.


1. Intro (defining requirements)
2. Research (rweb, rocket, paperclip)
3. Trying Actix (actix-web)
4. A Solution (rweb, dropshot, rocket with okapi)
5. Rocket 0.5 (rocket, async tests, memory management)
6. AWS Lambda (AWS CDK)

@DevMisc
#rust #fastapi #rocket #learn
2👍1