Parallel C++ for Scientific Applications: Threads & Synchronization
https://www.youtube.com/watch?v=Db2DPWaZy7M
https://redd.it/1pr0exa
@r_cpp
YouTube
CSC4700 -Threads & Synchronization
This lecture covers parallel programming models in C++, focusing on threads and synchronization. The numerical approximation of pi is used as a practical example to illustrate concepts, introducing numerical integration techniques (Riemann sums and Simpson's…
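The kind of example the description mentions can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function name `parallel_pi`, the midpoint rule, and the integrand 4/(1+x²) on [0, 1] are assumptions chosen for the sketch. Each thread accumulates a private partial sum, and a mutex protects only the final update of the shared total.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Midpoint Riemann sum of 4 / (1 + x^2) over [0, 1], whose exact value is pi.
// parallel_pi is a hypothetical name used only for this sketch.
double parallel_pi(std::size_t intervals, unsigned num_threads) {
    const double h = 1.0 / static_cast<double>(intervals);
    double total = 0.0;
    std::mutex total_mutex;  // guards 'total'
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            // Each thread handles intervals t, t + num_threads, ...
            double local = 0.0;
            for (std::size_t i = t; i < intervals; i += num_threads) {
                const double x = (static_cast<double>(i) + 0.5) * h;
                local += 4.0 / (1.0 + x * x);
            }
            // Synchronize once per thread instead of once per interval.
            std::lock_guard<std::mutex> lock(total_mutex);
            total += local * h;
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return total;
}
```

Calling `parallel_pi(1000000, 4)` should give a value very close to pi. Keeping the sum thread-local and locking only once per thread avoids contention on the mutex.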
Exercise in Removing All Traces Of C and C++ at Microsoft
https://www.linkedin.com/posts/galenh_principal-software-engineer-coreai-microsoft-activity-7407863239289729024-WTzf/
https://redd.it/1pr3s7u
@r_cpp
Linkedin
Principal Software Engineer (CoreAI) | Microsoft Careers | Galen Hunt | 31 comments
Update:
It appears my post generated far more attention than I intended... with a lot of speculative reading between the lines.
Just to clarify... Windows is *NOT* being rewritten in Rust with AI.
My team’s project is a research project. We are building…
POC of custom conditional warnings exploiting C++26's expansion statements and deprecated attribute for compile-time debugging
I came up with this hacky trick for custom compiler warnings (not errors) that are conditional on a compile-time known bool. I know it is not the prettiest error message but it at least has all the relevant information to be useful for compile-time (print) debugging. Thought it would be cool to share here and please let me know if there is a better way to achieve this or if it can be achieved in C++23 or prior. Check it out here: https://godbolt.org/z/br6vGdvex
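On the question of whether something similar is possible before C++26: one simpler variant, which is not the code behind the godbolt link, is to specialize a helper on the bool and mark only the `true` specialization's function `[[deprecated]]` (available since C++14). The names `WarnIf`, `fire`, and `check_size` are hypothetical. Compilers typically report a call to a deprecated function as a warning, so the diagnostic appears only when the condition holds.

```cpp
#include <cassert>
#include <cstddef>

// Primary template: the condition is false, so calling fire() is silent.
template <bool Flag>
struct WarnIf {
    static constexpr bool fire() { return false; }
};

// Specialization for true: calling fire() uses a deprecated function,
// which compilers normally diagnose as a warning (not an error).
template <>
struct WarnIf<true> {
    [[deprecated("compile-time condition was true")]]
    static constexpr bool fire() { return true; }
};

// Hypothetical use: flag types larger than 64 bytes at compile time.
template <class T>
constexpr bool check_size() {
    return WarnIf<(sizeof(T) > 64)>::fire();
}

struct Small { char bytes[8]; };
struct Large { char bytes[128]; };

static_assert(!check_size<Small>(), "Small must not trigger the warning path");
// check_size<Large>() is expected to compile too, with a deprecation warning.
```

The warning text is fixed at the point of declaration, so this variant cannot print computed values the way the expansion-statement version aims to.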
https://redd.it/1pr7dpm
@r_cpp
godbolt.org
Compiler Explorer - C++ (x86-64 gcc (C++26 reflection))
template <bool flag, class Callable>
struct CTWarning;
template <class Callable>
struct CTWarning<false, Callable> {
static constexpr std::array<Callable, 0> arr{};
};
template <class Callable>
struct CTWarning<true, Callable> {
static constexpr…
5 hrs spent debugging just to find out I forgot to initialize to 0 in class.
Yup, it finally happened.
I am making a voxel terrain generation project to learn OpenGL. The bug was in my abstraction of vertex arrays. Initially, when I created this class, it generated an ID in the constructor, but when I introduced multithreading, I needed to stop doing that in the constructor (bad design, I know; I need to study design patterns). So I introduced a boolean to trigger initialization when the first call to Bind() is made. But I didn't set it to false at that time. I saw chunks rendering, but there were gaps between them. So I started debugging, and honestly, the VertexArray class wasn't even on my mind. I just printed the VAO values in the output along with some other data. Although the values were somewhat random, I ignored it because OpenGL only guarantees unique unused values, not how they're generated. But then in the middle, I saw some were small and consecutive, like 1, 2, ..., 10. Then I put a print statement in the Generate() function of VertexArray and realized it wasn't even being called.
Yup, that's my rant. And here's the ugly code I wrote:
class VertexArray {
public:
    explicit VertexArray(bool lazy = false);
    ~VertexArray();
    // Returns the vertex array ID
    GLuint id() const { return arrayid; }
    void Generate();
    // Binds this vertex array
    void Bind();
    void UnBind();
    // Adds and enables the attribute
    void AddAttribute(Attribute attribute);
private:
    GLuint arrayid{};
};
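The class above does not show the boolean flag itself, so here is a minimal sketch of how the lazy path might look with the flag given a default member initializer. Everything in it is an assumption for illustration: `LazyVertexArray`, `generated_`, and `fake_gen_vertex_array` (a stand-in for a real `glGenVertexArrays` call, so the sketch runs without an OpenGL context) are hypothetical names.

```cpp
#include <cassert>

using GLuint = unsigned int;  // stand-in for the OpenGL typedef

// Stand-in for glGenVertexArrays: hands out 1, 2, 3, ...
inline GLuint fake_gen_vertex_array() {
    static GLuint next = 0;
    return ++next;
}

class LazyVertexArray {
public:
    explicit LazyVertexArray(bool lazy = false) {
        if (!lazy) {
            Generate();
        }
    }

    // Creates the ID at most once.
    void Generate() {
        if (!generated_) {
            id_ = fake_gen_vertex_array();
            generated_ = true;
        }
    }

    // The first Bind() triggers generation; real code would bind id_ here.
    void Bind() { Generate(); }

    GLuint id() const { return id_; }
    bool generated() const { return generated_; }

private:
    GLuint id_{};             // zero-initialized
    bool generated_ = false;  // the initializer that was missing in the bug
};
```

With the default member initializer, every constructor starts from `generated_ == false`, so a lazily constructed object generates its ID on the first `Bind()` instead of relying on an indeterminate value.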
https://redd.it/1prafjr
@r_cpp
Reddit
From the cpp community on Reddit
Explore this post and more from the cpp community
CUDA C++ GPU Accelerated Data Structures on Google Colab using CuCollections
https://leetarxiv.substack.com/p/gpu-accelerated-data-structures-on
https://redd.it/1prdiaz
@r_cpp
Substack
[CUDA] GPU Accelerated Hash Maps on Google Colab
CUDA on a Budget on Google Colab's Free GPUs
Implicit contract assertions: systematizing eliminating all undefined behavior for C++
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3100r5.pdf
https://redd.it/1prhmt8
@r_cpp
Blog: Stripping the Noise: 6 Heuristics for Readable C++ STL Errors
https://ozacod.github.io/posts/how-to-filter-cpp-errors/
I've ported stlfilt to Go and added some modern C++ features. You can check out the project at https://github.com/ozacod/stlfilt-go
https://redd.it/1prmpvb
@r_cpp
ozacod.github.io
Stripping the Noise: 6 Heuristics for Readable C++ STL Errors - ozacod
C++ is incredibly powerful but notoriously difficult to debug. While we have valgrind for memory leaks and gdb for complex logic, a simple syntax error can p...
Boost.MultiIndex refactored
https://bannalia.blogspot.com/2025/12/boostmultiindex-refactored.html
https://redd.it/1proaik
@r_cpp
Blogspot
Boost.MultiIndex refactored
Boost.MultiIndex was launched as part of Boost 1.32 in November 2004. The library is still actively maintained and in use by some notable p...
Decent tooling for concept autocompletion?
The title pretty much explains itself. Before concepts I could at least give VS an instance from the codebase, and IntelliSense worked fine, but with concepts now, it sometimes feels like I am coding in Notepad. I tried CLion, and it is not any better. I understand the technical complexities that come with code completion for concepts, but I want to hear your views on this anyway.
https://redd.it/1prnkb2
@r_cpp
Jubi - Lightweight 2D Physics Engine
Jubi is a passion project I've been creating for around the past month, which is meant to be a lightweight physics engine, targeted for 2D. As of this post, it's on v0.2.1, with world creation, per-body integration, built-in error detection, force-based physics, and other basic needs for a physics engine.
Jubi is intended for C/C++ projects, with C99 and C++98 as the target standards. I've been working on it by myself since around late November or early December. It started as a basic single-header library that could just create worlds/bodies and do raw collision checks manually; the current version can handle hundreds of bodies with little to no slowdown, even without a narrowphase/broadphase implemented yet. Because Jubi currently uses an O(n²) check over all objects, computation time can stack up fast in larger projects, which limits the maximum number of bodies to 1028 for now.
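To make the O(n²) cost concrete, here is a rough sketch of a brute-force pairwise overlap check. It is written in C++ for illustration and is not Jubi's code: the `Box` layout, `overlaps`, and `count_overlapping_pairs` are hypothetical. With n bodies the inner test runs n(n-1)/2 times per step, which is the cost a broadphase is meant to cut down.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical axis-aligned box: minimum corner (x, y) plus size (w, h).
struct Box {
    float x, y, w, h;
};

// True when the two boxes overlap on both axes.
inline bool overlaps(const Box& a, const Box& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

// Brute force: tests every unordered pair once, n(n-1)/2 tests in total.
inline std::size_t count_overlapping_pairs(const std::vector<Box>& boxes) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        for (std::size_t j = i + 1; j < boxes.size(); ++j) {
            if (overlaps(boxes[i], boxes[j])) {
                ++hits;
            }
        }
    }
    return hits;
}
```

At roughly 1,000 bodies this is about half a million box tests per step, so doubling the body count roughly quadruples the work.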
Its main goal is to be extremely easy and lightweight to use. In tests translated as close as I could to 1:1 replicas in Box2D & Chipmunk2D, Jubi performed the fastest, with the least LOC and boilerplate required for the same tests. We hope, by Jubi-1.0.0, to be near the level of usage/fame of Box2D and/or Chipmunk2D.
Jubi Samples:
#define JUBIIMPLEMENTATION
#include "../Jubi.h"
#include <stdio.h>
int main() {
    JubiWorld2D WORLD = JubiCreateWorld2D();
    // JBody2DCreateBox(JubiWorld2D *WORLD, Vector2 Position, Vector2 Size, BodyType2D Type, float Mass)
    Body2D *Box = JBody2DCreateBox(&WORLD, (Vector2){0, 0}, (Vector2){1, 1}, BODYDYNAMIC, 1.0f);
    // 60 steps of 0.033 s each (about 2 seconds of simulated time)
    for (int i = 0; i < 60; i++) {
        JubiStepWorld2D(&WORLD, 0.033f);
        printf("Frame: %02d | Position: (%.3f, %.3f) | Velocity: (%.3f, %.3f) | Index: %d\n", i, Box->Position.x, Box->Position.y, Box->Velocity.x, Box->Velocity.y, Box->Index);
    }
    return 0;
}
Jubi runtime compared to other physics engines:
|Physics Engine|Runtime|
|:-|:-|
|Jubi|0.0036ms|
|Box2D|0.0237ms|
|Chipmunk2D|0.0146ms|
Jubi Github: https://github.com/Avery-Personal/Jubi
https://redd.it/1prwf8o
@r_cpp
GitHub
GitHub - Avery-Personal/Jubi: Leightweight physics engine written in pure C.
Leightweight physics engine written in pure C. Contribute to Avery-Personal/Jubi development by creating an account on GitHub.
Best conference talks of 2025
As we are heading towards the end of the year, it would be great if you could share your favourite C++ conference talk from this year, and kindly mention why you picked it as your #1 talk.
https://redd.it/1przzbc
@r_cpp
[OC] Tired of "blind" C++ debugging in VS Code for Computer Vision? I built CV DebugMate C++ to view cv::Mat and 3D Point Clouds directly.
Hey everyone,
As a developer working on **SLAM and Computer Vision projects in C++**, I was constantly frustrated by the lack of proper debugging tools in VS Code after moving away from Visual Studio's Image Watch. Staring at memory addresses for cv::Mat and std::vector<cv::Point3f> felt like debugging blind!
So, I decided to build what I needed and open-source it: [CV DebugMate C++](https://marketplace.visualstudio.com/items?itemName=zwdai.cv-debugmate-cpp).
It's a **VS Code extension** that brings back essential visual debugging capabilities for C++ projects, with a special focus on 3D/CV applications.
**🌟 Key Features**
**1.** 🖼️ **Powerful cv::Mat Visualization**
* Diverse Types: Supports various depths (uint8, float, double) and channels (Grayscale, BGR, RGBA).
* Pixel-Level Inspection: Hover your mouse to see real-time pixel values, with zoom and grid support.
* Pro Export: Exports to common formats like PNG and, crucially, TIFF for preserving floating-point data integrity (a must for deep CV analysis).
**2.** 📊 **Exclusive: Real-Time 3D Point Cloud Viewing**
* Direct Rendering: Directly renders your **std::vector<cv::Point3f>** or **cv::Point3d** variables as an interactive 3D point cloud.
* Interactive 3D: Built on Three.js, allowing you to drag, rotate, and zoom the point cloud right within your debugger session. Say goodbye to blindly debugging complex 3D algorithms.
**3. 🔍 CV DebugMate Panel**
* Automatic Variable Collection: Automatically detects all visualizable OpenCV variables in the current stack frame.
* Dedicated Sidebar View: A new view in the Debug sidebar for quick access to all Mat and Point Cloud variables.
* Type Identification: Distinct icons for images (Mat) and 3D data (Point Cloud).
* One-Click Viewing: Quick-action buttons to open visualization tabs without using context menus.
**4. Wide Debugger Support**
Confirmed compatibility with common setups: Windows (MSVC/MinGW), Linux (GDB), and macOS (LLDB). (Check the documentation for the full list).
**🛠 How to Use**
It's designed to be plug-and-play. During a debug session, simply Right-Click on your cv::Mat or std::vector<cv::Point3f> variable in the Locals/Watch panel and select "View by CV DebugMate".
**🔗 Get It & Support**
The plugin is completely free and open-source. It's still early in development, so feedback and bug reports are highly welcome!
**VS Code Marketplace**: Search for CV DebugMate or zwdai
**GitHub Repository**: [https://github.com/dull-bird/cv\_debug\_mate\_cpp](https://github.com/dull-bird/cv_debug_mate_cpp)
If you find it useful, please consider giving it a Star on GitHub or a rating on the Marketplace—it's the fuel for continued bug fixes and feature development! 🙏
https://redd.it/1ps4a8n
@r_cpp
Visualstudio
CV DebugMate C++ - Visual Studio Marketplace
Extension for Visual Studio Code - Visualize OpenCV Mat and Point Cloud during C++ debugging. Inspired by Image Watch.
link me a bgfx guide
Has anyone found a good guide on how to use the bgfx library? I have been searching for days and have only found bad ones.
https://redd.it/1ps6lqv
@r_cpp
My 70+ video playlist exploring Unreal Engine's unique flavor of C++ (e.g., language additions, data structures, networking APIs, etc.)
https://www.youtube.com/playlist?list=PL22CMuqloY0oZBwPaqRbu_WGA1btwhfC2
https://redd.it/1ps8937
@r_cpp
YouTube
Unreal C++ Tutorials
My personal series on Unreal C++. Here I show you what I have learned about using C++ effectively with Unreal. Watching this should let you bypass hours of i...
Constvector: Log-structured std::vector alternative – 30-40% faster push/pop
Usually std::vector starts with capacity N and grows to 2 * N once its size exceeds that capacity; at that point, it also copies the data from the old array to the new one. That has a few problems:
1. Copy cost.
2. The OS needs to manage the small-capacity array (size N) that is freed by the application.
3. The L1 and L2 caches must invalidate the array items because the array has moved to a new location, and the CPU has to fetch them into L1/L2 again as if they were new data, even though they are not.
std::vector's reallocations and recopies are amortised O(1), but at a low level they have a lot of negative impact. Here's a log-structured alternative (constvector) with power-of-2 blocks:
* Push: 3.5 ns/op (vs 5 ns for std::vector)
* Pop: 3.4 ns/op (vs 5.3 ns)
* Index: minor slowdown (3.8 vs 3.4 ns)
* Strict worst-case O(1), Θ(N) space trade-off, only log(N) extra compared to std::vector.
It reduces internal memory fragmentation. It won't invalidate the L1 and L2 caches unless the data is modified, which improves performance. In the GitHub repo I benchmarked vectors from 1K to 1B elements, and it consistently showed better performance for push and pop operations.
Github: https://github.com/tendulkar/constvector
Youtube: https://youtu.be/ledS08GkD40
In practice, a meta array of size 64 (for the log(N) block pointers) is enough as extra space. I implemented only the bare vector operations for the comparison, since the actual std::vector implementations have a lot of iterator validation code, which causes extra overhead.
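The block layout described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the constvector source: `BlockVector` and its members are hypothetical names, block k is assumed to hold 2^k elements, and T is assumed to be default-constructible and copy-assignable. Growing allocates a new block and never moves existing elements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

template <class T>
class BlockVector {
public:
    void push_back(const T& value) {
        const Slot s = locate(size_);
        if (!blocks_[s.block]) {
            // Block k holds 2^k elements; older blocks stay where they are.
            blocks_[s.block].reset(new T[std::size_t(1) << s.block]());
        }
        blocks_[s.block][s.offset] = value;
        ++size_;
    }

    void pop_back() {
        assert(size_ > 0);
        --size_;  // blocks are kept allocated for reuse
    }

    T& operator[](std::size_t i) {
        const Slot s = locate(i);
        return blocks_[s.block][s.offset];
    }

    std::size_t size() const { return size_; }

private:
    struct Slot {
        std::size_t block;
        std::size_t offset;
    };

    // Index i lives in block floor(log2(i + 1)) at offset (i + 1) - 2^block.
    static Slot locate(std::size_t i) {
        const std::size_t n = i + 1;
        std::size_t block = 0;
        for (std::size_t m = n; m >>= 1;) {
            ++block;
        }
        return Slot{block, n - (std::size_t(1) << block)};
    }

    std::array<std::unique_ptr<T[]>, 64> blocks_{};  // the "meta array"
    std::size_t size_ = 0;
};
```

Because no block is ever reallocated, the address of an element stays valid across later pushes, which is the property the cache argument above relies on. Indexing pays for the extra block lookup, which fits the small index slowdown reported in the post.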
https://redd.it/1ps8k53
@r_cpp
GitHub
GitHub - tendulkar/constvector
Contribute to tendulkar/constvector development by creating an account on GitHub.
Crunch: A Message Definition and Serialization Tool Written in Modern C++
https://github.com/sam-w-yellin/crunch
https://redd.it/1psa0pg
@r_cpp
GitHub
GitHub - sam-w-yellin/crunch: A small, fast message definition and serdes protocol with built-in field and message validation.
A small, fast message definition and serdes protocol with built-in field and message validation. - sam-w-yellin/crunch
[Project] Parallax - Universal GPU Acceleration for C++ Parallel Algorithms
Hey r/cpp!
I'm excited to share **Parallax**, an open-source project that brings automatic GPU acceleration to C++ standard parallel algorithms.
# The Idea
Use `std::execution::par` in your code, link with Parallax, and your parallel algorithms run on the GPU. No code changes, no vendor lock-in, works on any GPU with Vulkan support (AMD, NVIDIA, Intel, mobile).
# Example
    #include <algorithm>
    #include <execution>
    #include <vector>
    std::vector<float> data(1'000'000);
    std::for_each(std::execution::par, data.begin(), data.end(),
                  [](float& x) { x *= 2.0f; });
With Parallax, this runs on the GPU automatically. 30-40x speedup on typical workloads.
# Why Vulkan?
* **Universal**: Works on all major GPU vendors
* **Modern**: Actively developed, unlike OpenCL, which is deprecated on Apple platforms
* **Fast**: Direct compute access, no translation overhead
* **Open**: No vendor lock-in like CUDA/HIP
# Current Status
This is an early MVP (v0.1.0-dev):
* ✅ Vulkan backend (all platforms)
* ✅ Unified memory management
* ✅ macOS (MoltenVK), Linux, Windows
* 🔨 Compiler integration (in progress)
* 🔨 Full algorithm coverage (coming soon)
# Architecture
Built on:
* Vulkan 1.2+ for compute
* C ABI for stability
* LLVM/Clang for future compiler integration
* Lessons learned from vkStdpar
# Looking for Contributors
We need help with:
* LLVM/Clang plugin development
* Algorithm implementations
* Testing on different GPUs
* Documentation
# Links
* GitHub: [https://github.com/parallax-compiler/parallax-runtime](https://github.com/parallax-compiler/parallax-runtime)
* Docs: [https://github.com/parallax-compiler/parallax-docs](https://github.com/parallax-compiler/parallax-docs)
* License: Apache 2.0
Would love to hear your thoughts and feedback!
https://redd.it/1psfb7p
@r_cpp
GitHub
GitHub - parallax-compiler/parallax-runtime: Universal GPU runtime for C++ parallel algorithms
Universal GPU runtime for C++ parallel algorithms. Contribute to parallax-compiler/parallax-runtime development by creating an account on GitHub.
CRTP-based Singleton with private construction token — looking for feedback
I experimented with a CRTP-based Singleton that enforces construction via a private token. Curious to hear thoughts.
So, I wanted to implement a singleton in my CRTP-based ECS engine for design and architectural reasons, and I sat down to think about an efficient, CRTP-friendly way to do this kind of pattern without having to alter the original singleton class's contract. The solution is a CRTP-based Singleton in which Derived (the original singleton) inherits from the base Singleton, which exposes the methods required for instantiation and the single point of access to the object. Simply put, instead of filling the class with the classic singleton boilerplate (copy operations marked = delete), we move this logic into the base, which acts as a proxy that returns a static instance of the derived class without the derived class even being aware of it.
In this way, we manage private instantiation with a token struct, which serves as a dedicated constructor parameter and makes construction exclusive to code that holds this token.
This keeps the singleton type-safe, zero-cost, CRTP-friendly, and easy to integrate with proxy-based or ECS-style architectures.
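Since the repository link did not survive here, the idea can be sketched as below. This is one possible reading of the description, not the author's code: `Singleton`, `Token`, `instance`, and the example class `Registry` are hypothetical names. The token has a private constructor that only the base can call, so a derived class can name the token type but cannot create one itself.

```cpp
#include <cassert>

template <class Derived>
class Singleton {
public:
    // Single access point; the function-local static is created once
    // (initialization is thread-safe since C++11).
    static Derived& instance() {
        static Derived obj{Token{}};
        return obj;
    }

    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

protected:
    // Construction token: only Singleton<Derived> can create one.
    class Token {
        Token() {}
        friend Singleton<Derived>;
    };

    Singleton() = default;
};

// Example derived type: its constructor is public but needs a Token.
class Registry : public Singleton<Registry> {
public:
    explicit Registry(Token) {}
    int next_id() { return ++counter_; }

private:
    int counter_ = 0;
};
```

`Registry::instance()` always returns the same object, while `Registry r{...}` outside the base fails to compile because no other code can produce a `Token`. The copy operations are deleted once in the base, so derived classes do not repeat them.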
Link to the GitHub repo
https://redd.it/1psmcaf
@r_cpp
GitHub
GitHub - unrays/crtp-singleton: A type-safe, CRTP-based Singleton pattern in C++ with secure construction and proxy support.
A type-safe, CRTP-based Singleton pattern in C++ with secure construction and proxy support. - unrays/crtp-singleton
Any Libraries for Asynchronous requests with HTTP2
I've recently picked up C++ and am looking to port a program that I had previously written in Python using aiohttp, but I'm having trouble finding a library that makes it easy to handle asynchronous HTTP requests. I initially tried using liburing in conjunction with nghttp2, but quickly found that it was way over my level of knowledge. Does anyone have any suggestions on what I should do? I can't use libraries like Boost because I need HTTP/2 for its multiplexing capabilities.
https://redd.it/1psmrx6
@r_cpp
How can I share my projects on Reddit if they run in the console?
Should I send all the code in a message? Or a file with my code? And if a file, how?
https://redd.it/1pt0fdx
@r_cpp
Maintaining the Legacy: Total-Random takes over pcg-cpp maintenance (Support for Win ARM64, MSVC fixes, and Modern C++)
Hi everyone,
Like many of you, we consider the PCG (Permuted Congruential Generator) family of PRNGs by Prof. Melissa O'Neill to be the gold standard for performance and statistical quality. However, the original pcg-cpp repository has been inactive for over 3 years, leaving many critical community-submitted patches unmerged.
To ensure this vital library remains usable in modern development environments, we have formed Total-Random, a community-led organization dedicated to maintaining and modernizing legacy RNG libraries.
We have just released our first stable version of the Total-Random/pcg-cpp fork, which includes:
Windows ARM64 Support: Integrated fixes for ARM64 architecture (thanks to Demonese/LuaSTG).
MSVC Compatibility: Resolved C2678 ambiguous operator errors and other MSVC-specific build failures.
Empty Base Class Optimization (EBCO): Enabled __declspec(empty_bases) for MSVC to ensure optimal memory layout, matching GCC/Clang behavior.
Robust 128-bit Fallback: Improved handling for platforms lacking native __uint128_t support.
Improved unxorshift: Replaced the recursive implementation with a more efficient iterative doubling loop to prevent stack issues and improve clarity.
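To illustrate the EBCO item in the list above, here is a small sketch of how an MSVC-only attribute can be wrapped in a macro so that several empty bases take no space. The macro name DEMO_EMPTY_BASES and the types StreamPolicy, MultiplierPolicy, and Engine are made up for this example and are not taken from the fork.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical macro: expands to the MSVC attribute, and to nothing elsewhere.
#if defined(_MSC_VER)
#  define DEMO_EMPTY_BASES __declspec(empty_bases)
#else
#  define DEMO_EMPTY_BASES
#endif

// Two stateless policy classes (illustrative stand-ins for empty mixins).
struct StreamPolicy {};
struct MultiplierPolicy {};

// With more than one empty base, MSVC pads the layout unless the class is
// marked empty_bases; GCC and Clang already collapse both bases.
struct DEMO_EMPTY_BASES Engine : StreamPolicy, MultiplierPolicy {
    std::uint64_t state = 0;
};

// The engine should be exactly as large as its single data member.
static_assert(sizeof(Engine) == sizeof(std::uint64_t),
              "empty bases should not add padding");
```

The static_assert documents the intended layout and would flag a compiler or setting where the bases still occupy space.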
Our goal is to keep the library header-only, bit-for-bit compatible with the original algorithm, and ready for C++11/17/20/23.
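For readers curious about the unxorshift item, the following sketch shows one way an iterative doubling loop can invert x ^ (x >> shift) without recursion. It is an assumption about the general technique, not the fork's code: the signatures are simplified to 64-bit values, and xorshift is a helper added only for the demonstration.

```cpp
#include <cassert>
#include <cstdint>

// Forward step, for illustration only: fold the high bits into the low bits.
inline std::uint64_t xorshift(std::uint64_t x, unsigned shift) {
    return x ^ (x >> shift);
}

// Inverse of xorshift() for 0 < shift < 64 (simplified, hypothetical signature).
// Each pass XORs in a copy shifted by s and then doubles s, so after the loop
// y has accumulated x ^ (x >> shift) ^ (x >> 2*shift) ^ ... applied to the
// input, which is exactly the inverse of the forward step.
inline std::uint64_t unxorshift(std::uint64_t y, unsigned shift) {
    for (unsigned s = shift; s < 64; s *= 2) {
        y ^= y >> s;
    }
    return y;
}
```

The loop runs at most six times for 64-bit values and uses constant stack space. Because it computes the same mathematical inverse as a recursive formulation, the generated output stays unchanged.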
Community Recognition: We are honored to have received early attention and feedback from researchers in the field, including Ben Haller (@bhaller) from Cornell University. You can see the community discussion regarding our transition here: https://github.com/imneme/pcg-cpp/issues/106
Check us out on GitHub: Total-Random/pcg-cpp
We welcome PRs, issues, and feedback from the community. Let's keep the best PRNG alive and kicking!
Best regards, The Total-Random Team
https://redd.it/1pt25rg
@r_cpp
GitHub
Continuing the Journey: A Community-Maintained Fork for Modern C++ and Platform Support · Issue #106 · imneme/pcg-cpp
Dear Professor O'Neill (@imneme) and the PCG community, First of all, thank you for the incredible work on the PCG family of generators. It has been a cornerstone for high-quality random number...