How did you learn Linux kernel development?
Embedded engineer here aiming to move into Linux kernel/driver development.
If you currently work in this area, how did you learn it?
1. What skills mattered most?
2. How much C knowledge is required?
3. How is the work different from MCU firmware?
4. Any resources or projects you recommend?
It would be really helpful to hear from people with real-world experience.
https://redd.it/1rh88r5
@r_embedded
Reddit
From the embedded community on Reddit
Explore this post and more from the embedded community
How to stage U-Boot right? Context: OTA, A/B, eMMC boot partitions
Hi all,
I’d appreciate some architectural feedback on using U-Boot the right way in an A/B update strategy.
**Context:**
* Storage: eMMC with two boot partitions
* U-Boot booting Linux
* Implementing an A/B scheme and a brick-safe system
I’m primarily a hardware engineer / PCB designer. I’ve written embedded software for MCUs (ARM-M + FreeRTOS) in previous roles, so I know some embedded software things, but 64-bit ARM-A multicore systems with external DRAM, Linux and U-Boot are outside my core area of software experience. That said, given my background, I’m not entirely comfortable with the bootloader strategy we’re currently establishing and would really appreciate feedback to either validate or challenge my concerns.
**Proposed design (I have my concerns with):**
* U-Boot instance in eMMC boot0 only boots eMMC UDA Image A
* U-Boot instance in eMMC boot1 only boots eMMC UDA Image B
* Switching between A/B is done by changing eMMC EXT_CSD[179] (PARTITION_CONFIG) and rebooting.
* For an OTA update:
* Write the inactive slot (including its bootloader in the respective boot partition).
* Flip EXT_CSD[179]
* reboot
* If (new) image (meaning kernel and above) fails, use U-Boot bootcount / altbootcmd to switch back to the other boot partition slot:
* Flip EXT_CSD[179]
* reboot
The presented rationale is that U-Boot will be updated frequently alongside Linux and the handover may change, so booting the new image from the old bootloader might be incompatible in the future. Thus neither a fallback (e.g., the boot1 U-Boot booting Image A) nor an immutable U-Boot in the boot partition should be implemented.
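The bootcount / altbootcmd fallback mentioned below is usually wired up through the U-Boot environment. A rough sketch, assuming CONFIG_BOOTCOUNT_LIMIT is enabled (values are illustrative, not from the post; `mmc partconf` is the U-Boot command that rewrites PARTITION_CONFIG, i.e. EXT_CSD[179]):

```
# Illustrative U-Boot environment fragment (values made up)
bootlimit=3
upgrade_available=1
# altbootcmd runs instead of bootcmd once bootcount exceeds bootlimit:
# re-select the other eMMC boot partition and reset.
altbootcmd=echo Boot limit hit, reverting slot; mmc partconf 0 1 1 0; reset
```

Whether `upgrade_available` gating applies depends on which bootcount backend the platform uses, so treat this as a shape, not a drop-in config.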
**My concerns**
1. Frequent rewriting of the bootloader in eMMC boot partitions. Even if we skip rewriting when hashes match, realistically both boot partitions will often contain identical U-Boot builds. Doesn't that defeat the A/B strategy?
2. Symmetry problem
* In certain failure cases, neither instance may be able to boot “the other” image, depending on assumptions baked into each build.
* Thus, in case the newly written bootloader is bad, there is no fallback to the second bootloader. So we may have A/B images, but the A/B bootloader is a lie.
3. Bootloader compatibility argument. I’m not fully convinced that an immutable U-Boot in the boot partition will become incompatible. But I might be underestimating real-world drift between U-Boot and kernel expectations.
**My alternative idea (more staged / partially immutable)**:
If we don’t keep a single immutable U-Boot in the boot partition that simply boots the mutable kernel and above in eMMC UDA for years to come, I would prefer something like:
* eMMC boot partition is held immutable and contains:
* A) just U-Boot SPL with custom implementation for A/B logic
* AFAIK, the U-Boot SPL to U-Boot proper handover will not become incompatible: no "Linux boot contract", no big handover.
* B) If possible, U-Boot SPL plus a minimal “proper” U-Boot stripped down to just the A/B logic.
* The mini U-Boot exists just to have more software support for the A/B logic, and less need to implement an SPL-only solution ourselves.
* In this case, U-Boot SPL hands over to my mini U-Boot proper in the boot partition, which then hands over to the full U-Boot proper in UDA.
* A/B images in UDA each bundle a full-featured U-Boot (updated frequently), rootfs, DTBs, kernel, and app
* Boot flow:
* ROM → SPL (boot partition)
* SPL → minimal U-Boot (boot partition)
* That stage selects A/B and then loads a second-stage U-Boot from UDA, if possible. Otherwise SPL needs to do the A/B selection from UDA directly.
* Second-stage U-Boot boots Linux
In that model, there is a small, stable, trusted part, while the higher-level U-Boot and Linux can evolve more freely.
**My questions:**
1. Can U-Boot proper chainload another U-Boot proper?
1. This decides whether the A/B logic gets implemented in SPL, or in my proposed immutable mini U-Boot pre-stage (after SPL, before the full U-Boot living in the UDA image bundle).
2. Is this a supported/common approach?
3. Can I trust this mechanism long-term?
2. Is using an immutable SPL + updatable U-Boot proper in UDA a common/recognized pattern? I don't want to reinvent the wheel.
3. Is my staged approach overengineering or misguided? Devil’s advocate welcome.
4. What am I missing?
* eMMC wear implications?
* Corner cases around bootcount + power loss?
* Secure boot implications?
* Further reading you can link me to?
This topic has been keeping me up at night for a week now, so I’d really appreciate critical feedback.
Thanks in advance!
EDIT: formatting lists / indents
https://redd.it/1rh7c20
@r_embedded
Is automotive radar DSP too niche? 3 YOE and starting to overthink my path
Hello all,
I’ve been working for about 3 years in automotive radar (low-level, not in perception): mostly signal processing, antenna/RF stuff, and a bit of C++ implementation for embedded targets. I have a master’s in robotics.
Lately I’ve been wondering: is radar too niche? Am I dedicating too much time to one particular area?
On one hand, I feel like I’ve built solid fundamentals:
* Working with noisy real-world data
* Some antenna / array processing exposure
* Performance-aware C++ in constrained environments
On the other hand, radar feels like a fairly specialized corner of DSP compared to, say, audio, ML, or general data science. I’m starting to overthink whether I should proactively upskill into something broader (e.g., more advanced ML, CPU optimizations, etc.) to avoid being “boxed in.”
A few specific questions for those with more experience:
1. Are radar DSP skills generally transferable to other domains? I like robotics, sensor fusion, and work in drone flight control.
2. Is staying in radar long-term a career risk, or is it actually a strong niche to have?
3. Am I just overthinking this?
I enjoy the technical depth, but I don’t want to wake up in 5–10 years and realize I’ve limited my options.
Would really appreciate perspective from people who’ve moved across domains or stayed in radar long-term.
Thank you.
https://redd.it/1rhbjf7
@r_embedded
Should i go deeper into OS?
Hello guys! I am an automation engineer who has been studying and working with C for the past 6-7 months, and I want to transition to an embedded engineer role. Currently I am studying OSTEP to get a beginner's introduction to OS fundamentals. My question is: do I need to go deeper into OS? Will that be helpful long term? The best thing I’ve got out of it so far is scheduling algorithms for single-core CPUs. Any advice will be appreciated.
https://redd.it/1rhevty
@r_embedded
Allwinner A33 Handheld Retro Game Emulator Advice
Hello all, I'm planning to build a handheld retro game emulator (e.g., PS1, N64) based around the Allwinner A33 chip as an educational side project. My plan is to design the PCB in KiCad from scratch and I am wondering if there are any major design elements to consider? I have a bit of experience with PCB layout design from an ESP32-C3 development board using the chip but I didn't get around to implementing the battery circuitry. The A33 chip feels like a big step up considering it has to run a full operating system.
Also, on the software side of things, my experience with Linux is limited to the mainstream distros. What are some good low-level distro options for this application? I have read about tools such as buildroot for specialized embedded systems. Would this be a big headache to get working for someone with basic Linux experience? My primary goal is gaining hands-on embedded Linux and hardware development experience.
Thanks a lot for reading!
https://redd.it/1rhgfjl
@r_embedded
I built my first ever DIY presence, temperature, humidity sensor that i use in my home.
https://youtu.be/6H5WjfhZxaM?si=MbUwPyEgN2zPQDhW
https://redd.it/1rgsp6p
@r_embedded
YouTube
This $10 ESP32 Sensor can see through walls - here's what i built with it!
In this video i automated smart lights with home assistant and esp32, mmwave sensor and also added temperature, humidity sensor in it. All that for $10.
Very cheap logic analyzer
Hi!
I'd like to buy a first logic analyzer for basic stuff (signals monitoring, UART, maybe I2C).
Initially, I was going for the LHT00SU1, but then I found the LA1010, which seems better and doesn't cost much more on AliExpress.
Then, I found a random LA4016 which... doesn't exist? I can find it on AliExpress, Amazon and eBay (only from certain countries), but I can't find any official website for it. It seems to be a clone of an in-between version of the LA2016 and LA5016.
So, what do you suggest? Based on the Aliexpress prices, I would like to stay around 50 euros.
Thanks!
https://redd.it/1rhjd9o
@r_embedded
Need advice 🥲
Hey everyone,
I've been on this subreddit for a while and finally decided to make a post because I'm genuinely stuck and don't really know what I'm doing wrong.
I'm in my final year of B.Tech EEE and for the past 2-3 months I've been applying to embedded firmware and hardware roles — internships, entry level, anything I can find. Most of the time I either get a rejection or just no response at all. The silence is honestly worse than a rejection.
Here's where I stand:
Languages: C, Embedded C, Python
MCUs: STM32 (register-level, without HAL), some ESP32
Peripherals: UART, SPI, I2C, ADC, PWM, Timers — used most of these in real projects
Electronics: decent foundation in power electronics, analog and digital — comes with the EEE degree
Projects:
1. 3S Li-ion BMS on bare-metal STM32
2. DC motor speed controller (20 kHz PWM, H-bridge)
3. Sensor interfacing project with a custom PCB made in KiCad
Currently going through CAN protocol and just starting to look at FreeRTOS.
So my problem is — this doesn't look bad to me on paper, but I'm clearly missing something because I'm not even getting interview calls. Are my projects not detailed enough? Do companies actually expect RTOS at entry level or is it just a bonus? Should I focus more on hardware debugging skills like oscilloscopes and logic analyzers, or is firmware side more important?
Also does GitHub actually matter in embedded? I keep seeing different opinions on this.
I'm not looking for someone to tell me it'll be fine. If my projects are too basic, I want to know. If I'm applying to the wrong places or framing my resume badly, that's helpful too. Just want honest feedback so I can stop wasting time and actually fix what's broken.
Thanks for reading
https://redd.it/1rgut0j
@r_embedded
Learning about vector tables, (Mistake in example?)
Hello!! I am learning from this example https://github.com/kristianklein/stm32-without-cubeide/blob/part_1/startup.c on how to use arm-gcc to program my STM32. I am going through the doc, specifically pages 200/201, in section 9,
Interrupts and events. I believe there is a mistake in the example code when setting up the interrupt vector table. There should be only 4 zeros following the usage fault entry, and the bus and usage faults should be preceded by a MemManage fault. Am I correct that the example code's ISR table is incorrect?
https://redd.it/1rgwrsh
@r_embedded
GitHub
stm32-without-cubeide/startup.c at part_1 · kristianklein/stm32-without-cubeide
Code for the "STM32 without CubeIDE" blog post series at https://kleinembedded.com - kristianklein/stm32-without-cubeide
New STM32C5 family and new top-end parts in STM32H5 family
STM32C5 - new STM32 family:
Summary = 144MHz ARM Cortex-M33F, up to 1MB Flash (some as data flash), up to 256KB SRAM, 2x CAN-FD, optional Ethernet, optional Octo-SPI, optional crypto engine, up to 3 ADCs, 20 to 144 pins.
New Dev Boards = NUCLEO-C542RC, NUCLEO-C562RE, NUCLEO-C5A3ZG.
https://www.mouser.com/new/stmicroelectronics/st-stm32c5-arm-core-mainstream-mcus/
Note: as of the time of this post, the official STM32C5 webpage is not online yet, maybe March 2 or 3 ??
STM32H5 - adding new parts with more memory:
Summary = 250MHz ARM Cortex-M33F, up to 4MB Flash, up to 1MB SRAM, ...
New Dev Boards = NUCLEO-H5E5ZJ, STM32H5F5J-DK.
Start reading at page 27 in the following recently released PDF.
https://www.st.com/resource/en/productpresentation/microcontrollers-stm32h5-series-overview.pdf
**STM32U3** & **STM32F7** - adding new parts:
* New Dev Boards = NUCLEO-U3C5ZI-Q.
* https://www.st.com/resource/en/releasenote/rn0114-stm32cubeide-release-v210-stmicroelectronics.pdf
https://redd.it/1rgfhsk
@r_embedded
Mouser
STM32C5 Arm® Cortex®-M33 Core Mainstream MCUs - STMicro | Mouser
STMicroelectronics STM32C5 Arm® Cortex®-M33 Core Mainstream MCUs are designed to support the development of cost-effective applications.
got tired of "just vibes" testing for edge ML models, so I built automated quality gates
so about 6 months ago I was messing around with a vision model on a Snapdragon device as a side project. worked great on my laptop. deployed to actual hardware and latency had randomly jumped 40% after a tiny preprocessing change.
the kicker? I only caught it because I was obsessively re-running benchmarks between changes. if I hadn't been that paranoid, it would've just shipped broken.
and that's basically the state of ML deployment to edge devices right now. we've got CI/CD for code — linting, unit tests, staging, the whole nine yards. for models going to phones/robots/cameras? you quantize, squint at some outputs, maybe run a notebook, and pray lol.
so I started building automated gates that test on real Snapdragon hardware through Qualcomm AI Hub. not simulators, actual device runs.
ran our FP32 model on Snapdragon 8 Gen 3 (Galaxy S24) — 0.176ms inference, 121MB memory. INT8 version came in at 0.187ms and 124MB. both passed gates no problem. then threw ResNet50 at it — 1.403ms inference, 236MB memory. both gates failed instantly. that's the kind of stuff that would've slipped through with manual testing.
also added signed evidence bundles (Ed25519 + SHA-256) because "the ML team said it looked good" shouldn't be how we ship models in 2026 lmao.
still super early but the core loop works. anyone else shipping to mobile/embedded dealing with this? what does your testing setup look like? genuinely curious because most teams I've talked to are basically winging it.
https://redd.it/1rhq3jn
@r_embedded
I built a prototype fixed-point overflow analyzer for C — looking for feedback from embedded/DSP engineers
Hi everyone,
I’ve been experimenting with static analysis for fixed-point arithmetic in C, specifically around overflow detection in DSP-style code.
The motivation is simple: In embedded firmware, fixed-point math is everywhere — but overflow assumptions often live only in comments or in the developer’s head. Bugs from truncation, scaling, or multiply-accumulate operations can be subtle and hard to catch with testing alone.
So I built a small prototype tool that:
Compiles C to LLVM IR
Performs interval analysis on arithmetic operations
Tracks ranges through `mul`, `add`, `shl`, `shr`
Reports potential overflow
Here’s an example function it can analyze:
#include <stdint.h>
int32_t apply_comp(int32_t x, int32_t T, int32_t a1, int32_t a2) {
    // y = x * (1 + T * (a1 + a2 * T))
    int32_t tmp = a2 * T;                          // range = (-1073709056, 1073741824)
    tmp = tmp + (a1 << 10);                        // range = (-1107263488, 1107295232)
    tmp = ((int64_t)tmp * (int64_t)T) >> 17;       // range = (-276823808, 276815872)
    tmp = tmp + (1 << 25);                         // range = (-243269376, 310370304)
    int32_t y = ((int64_t)x * (int64_t)tmp) >> 25; // range = (-303096, 303086)
    return y;
}
Running the analyzer produces per-line interval results:
No overflow case (PASS):
Per-line result ranges:
apply_comp.c:11: [-1073709056, 1073741824] | int32_t tmp = a2 * T;
apply_comp.c:14: [-1107263488, 1107295232] | tmp = tmp + (a1 << 10);
apply_comp.c:17: [-276823808, 276815872] | tmp = ((int64_t)tmp * (int64_t)T) >> 17;
apply_comp.c:20: [-243269376, 310370304] | tmp = tmp + (1 << 25);
apply_comp.c:23: [-303096, 303086] | int32_t y = ((int64_t)x * (int64_t)tmp) >> 25;
FINAL RESULT: PASS
Overflow guarantee: YES (no potential overflows detected for analyzed operations).
Overflow case (FAIL):
Per-line result ranges:
apply_comp.c:11: [-1073709056, 1073741824] | int32_t tmp = a2 * T;
apply_comp.c:14: [-1107263488, 1107295232] | tmp = tmp + (a1 << 10);
apply_comp.c:17: [-4429180928, 4429053952] | tmp = ((int64_t)tmp * (int64_t)T) >> 13;
apply_comp.c:20: [-2113929216, 2181038079] | tmp = tmp + (1 << 25);
apply_comp.c:23: [-2097152, 2097087] | int32_t y = ((int64_t)x * (int64_t)tmp) >> 25;
FINAL RESULT: FAIL (2 potential overflow(s) detected)
Overflow happens on source line(s):
apply_comp.c:17 | tmp = ((int64_t)tmp * (int64_t)T) >> 13;
apply_comp.c:20 | tmp = tmp + (1 << 25);
Overflow guarantee: NO (cannot guarantee absence of overflow under current assumptions).
Use --show-all-intervals for full per-operation interval traces and detailed findings.
Right now this is very much a prototype. I’m curious to learn from people who work with embedded or DSP systems:
How do you detect fixed-point overflow in your projects today?
How do you ensure rounding errors stay within specification when doing fixed-point math?
Do you rely purely on testing, or static analysis?
Would something like this be useful in CI/CD or code review pipelines?
Thanks for any insights 🙏
https://redd.it/1rhs596
@r_embedded
Hi everyone,
I’ve been experimenting with static analysis for fixed-point arithmetic in C, specifically around overflow detection in DSP-style code.
The motivation is simple: In embedded firmware, fixed-point math is everywhere — but overflow assumptions often live only in comments or in the developer’s head. Bugs from truncation, scaling, or multiply-accumulate operations can be subtle and hard to catch with testing alone.
So I built a small prototype tool that:
Compiles C to LLVM IR
Performs interval analysis on arithmetic operations
Tracks ranges through `mul`, `add`, `shl`, `shr`
Reports potential overflow
Here’s an example function it can analyze:
#include <stdint.h>
int32_t apply_comp(int32_t x, int32_t T, int32_t a1, int32_t a2) {
    // y = x * (1 + T * (a1 + a2 * T)), in fixed point
    int32_t tmp = a2 * T;                              // range = [-1073709056, 1073741824]
    tmp = tmp + (a1 << 10);                            // range = [-1107263488, 1107295232]
    tmp = ((int64_t)tmp * (int64_t)T) >> 17;           // range = [-276823808, 276815872]
    tmp = tmp + (1 << 25);                             // range = [-243269376, 310370304]
    int32_t y = ((int64_t)x * (int64_t)tmp) >> 25;     // range = [-303096, 303086]
    return y;
}
Running the analyzer produces per-line interval results:
No overflow case (PASS):
Per-line result ranges:
apply_comp.c:11: [-1073709056, 1073741824] | int32_t tmp = a2 * T;
apply_comp.c:14: [-1107263488, 1107295232] | tmp = tmp + (a1 << 10);
apply_comp.c:17: [-276823808, 276815872] | tmp = ((int64_t)tmp * (int64_t)T) >> 17;
apply_comp.c:20: [-243269376, 310370304] | tmp = tmp + (1 << 25);
apply_comp.c:23: [-303096, 303086] | int32_t y = ((int64_t)x * (int64_t)tmp) >> 25;
FINAL RESULT: PASS
Overflow guarantee: YES (no potential overflows detected for analyzed operations).
Overflow case (FAIL):
Per-line result ranges:
apply_comp.c:11: [-1073709056, 1073741824] | int32_t tmp = a2 * T;
apply_comp.c:14: [-1107263488, 1107295232] | tmp = tmp + (a1 << 10);
apply_comp.c:17: [-4429180928, 4429053952] | tmp = ((int64_t)tmp * (int64_t)T) >> 13;
apply_comp.c:20: [-2113929216, 2181038079] | tmp = tmp + (1 << 25);
apply_comp.c:23: [-2097152, 2097087] | int32_t y = ((int64_t)x * (int64_t)tmp) >> 25;
FINAL RESULT: FAIL (2 potential overflow(s) detected)
Overflow happens on source line(s):
apply_comp.c:17 | tmp = ((int64_t)tmp * (int64_t)T) >> 13;
apply_comp.c:20 | tmp = tmp + (1 << 25);
Overflow guarantee: NO (cannot guarantee absence of overflow under current assumptions).
Use --show-all-intervals for full per-operation interval traces and detailed findings.
Right now this is very much a prototype. I’m curious to learn from people who work with embedded or DSP systems:
How do you detect fixed-point overflow in your projects today?
How do you ensure rounding errors stay within specification when doing fixed-point math?
Do you rely purely on testing, or do you use static analysis as well?
Would something like this be useful in CI/CD or code review pipelines?
Thanks for any insights 🙏
https://redd.it/1rhs596
@r_embedded
Do IoT developers use code collaboration platforms like GitHub?
Most software developers use GitHub or something similar for code collaboration and to keep versioned backups of their code.
Do IoT developers do that?
https://redd.it/1rht3gw
@r_embedded
Can someone please recommend a playlist or site (free) to learn ESP32 from the ground up
Can someone please recommend a playlist or site (free) to learn ESP32.
Same as the title.
P.S. I did search the sub for playlists but didn't find anything free that teaches from total basics up to a decent level.
I am a total noob and want to get started, please help a little guys 😭. I'm doing ESP32 because my friend and I have to make a college project; they already know Arduino, Raspberry Pi, etc., so I thought I should learn ESP32 since they don't know how to use it and I'd still be able to help with the project. If I should do something else instead, please tell me that too.
P.S. I didn't know which tag to use, so sorry if it's wrong.
https://redd.it/1rhw1tg
@r_embedded
Multiplexed IR beam grid for fast 6mm projectile detection – viable approach?
Hi all,
I’m looking to detect the impact position of a 6mm projectile traveling at up to \~100 m/s and register the hit as accurately as possible in X/Y coordinates. The sensing area would be roughly the size of a 24” monitor in vertical orientation.
My idea is to build a rectangular IR detection frame with IR LEDs on one side and receivers on the opposite side, forming a 2D beam grid. An ESP32 would handle time-multiplexed scanning and determine X/Y based on which beams are interrupted.
Planned components:
• 940nm IR LEDs
• 38kHz modulation
• TSOP-style IR receivers (for ambient light rejection)
• \~40–50 cm beam span
The goal is reliable detection of very brief beam interruptions caused by a fast-moving 6mm object, ideally with as much positional resolution as the beam spacing allows.
Before I commit to printing the frame and soldering a large number of emitters/receivers:
• Does this architecture sound reasonable for this speed and distance?
• Are there known pitfalls when multiplexing many 38kHz receivers?
• Would a different sensing method be more appropriate for sub-millisecond interruptions?
• Is TSOP-style detection suitable for this kind of “beam break” application?
Appreciate any advice from people who’ve built similar optical detection systems.
https://redd.it/1rhxxql
@r_embedded
Mimicking a generator automatic mains failure unit (Datakom-105)
Hello peers,
As the title says, I'm trying to build a control panel / printed circuit board that mimics this unit, using a microcontroller to run a generator.
The city I live in is under siege; these units are in short supply, and the few that are available sell at a really high price.
That's why I'm considering building something similar, and perhaps much cheaper, myself.
Is this a reliable approach? I would appreciate any advice.
https://redd.it/1ri2je5
@r_embedded
Sharing Rovari Platform to help beginners get into RISC-V Embedded
https://redd.it/1riawnb
@r_embedded