Opensource by Reddit – Telegram
Reddit's ♨️ take on Open Source Technology.

Join the discussion ➡️ @opensource_chats

Channel Inquiries ➡️ @group_contacts_bot

👄 TIPS ➡️➡️➡️ https://news.1rj.ru/str/addlist/mB9fRZOHTUk5ZjZk

🌈 made possible by
@reddit2telegram
@r_channels
I needed an efficient way to convert 5tb of unstructured html into dictionaries using just my laptop, so I wrote doc2dict.

I'm the developer of an open source package for working with SEC data. It turns out the SEC has 5 TB of HTML. This data looks visually standardized to humans, but under the hood it is a mess of different tags and CSS.

There are a couple of existing solutions for parsing HTML, but they usually involve a combination of LLMs and OCR, which is slow and expensive. So I decided to write a flexible, algorithmic solution: doc2dict.

Installation

pip install doc2dict

User interface

dct = html2dict(content, mapping_dict=None) # converts content to dictionary
visualize_dict(dct) # visualizes the dictionary using your browser.

Note: I don't use this UI much, as I mostly use it via my SEC package. Docs

# Architecture

1. Iterate through the DOM and, via inheritance, collect characteristics such as bold, italics, and visual height for text on the same line (e.g. within a block) to create instructions, e.g. [{'text': 'BOARD MEETINGS', 'all_caps': True, 'bold': True, 'font-size': 15.995999999999999}]
2. Use a rule set to determine how to convert the instructions into a nested dictionary. This is customizable. For example, the mapping dict below tells the parser that 'items' should be nested under 'parts', in addition to the default rules.

tenk_mapping_dict = {
    ('part',r'^part\s([ivx]+)$') : 0,
    ('signatures',r'^signatures?\.$') : 0,
    ('item',r'^item\s(\d+)') : 1,
}
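
To make the second step more concrete, here is a minimal sketch of how a flat list of instructions could be folded into a nested dictionary with level rules like the ones above. This is not doc2dict's actual implementation: `RULES`, `classify`, `nest`, and the output shape (`title`, `class`, `text`, `children`) are all assumptions made up for illustration, and it only uses the standard library.

```python
import re

# Hypothetical rule set in the same shape as the post's mapping dict:
# (label, regex) -> nesting level, where a lower level is closer to the root.
RULES = {
    ("part", r"^part\s([ivx]+)$"): 0,
    ("item", r"^item\s(\d+)"): 1,
}


def classify(text):
    """Return (label, level) for a heading line, or None for body text."""
    normalized = text.strip().lower()
    for (label, pattern), level in RULES.items():
        if re.match(pattern, normalized):
            return label, level
    return None


def nest(instructions):
    """Fold a flat list of instruction dicts into a nested dictionary."""
    root = {"title": "document", "text": [], "children": []}
    stack = [(-1, root)]  # (level, node); the root sits above every rule level
    for instruction in instructions:
        text = instruction["text"]
        match = classify(text)
        if match is None:
            # Body text attaches to the most recently opened section.
            stack[-1][1]["text"].append(text)
            continue
        label, level = match
        # Close sections at the same or a deeper level before opening a new one.
        while stack[-1][0] >= level:
            stack.pop()
        node = {"title": text, "class": label, "text": [], "children": []}
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root


# Made-up instructions, not real filing data.
instructions = [
    {"text": "PART I", "bold": True},
    {"text": "Item 1. Business", "bold": True},
    {"text": "We make software."},
    {"text": "PART II", "bold": True},
]
tree = nest(instructions)
```

In this sketch the stack keeps track of which section is currently open, so an 'item' heading ends up under the most recent 'part', and a new 'part' closes everything below it.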

Note: This approach also kind of works for modern PDFs, because the text stream is often in the order a human would consider correct. I've added the functionality to doc2dict, but it's at an early stage (AKA, it sucks).

# Benchmarks

Benchmarks vary as I update the package with new features (tables are slow!). On my laptop:

500 pages per second single threaded
5,000 pages per second multi threaded

# Links

doc2dict GitHub
[raw html](https://html-preview.github.io/?url=https://raw.githubusercontent.com/john-friedman/doc2dict/refs/heads/main/example_output/html/msft_10k_2024.html#:~:text=embracing)
dictionary visualization (old)
[instructions visualization](https://html-preview.github.io/?url=https://github.com/john-friedman/doc2dict/blob/main/example_output/html/instructions_visualization.html) (old)
dictionary (old)

https://redd.it/1mrbkno
@r_opensource
Best practice for including third-party licenses in an OSS library?

I built a public library that’s MIT-licensed (the license is in a LICENSE file).
The package uses some third-party code, each with its own license.

I’m trying to figure out the standard way to include those third-party licenses in my repo:

Add them directly to my LICENSE file?

Create a separate file like THIRD_PARTY_LICENSES or NOTICE?


Also, when someone uses my package, do they need to include all these third-party licenses in their app?

One concern: I’ve noticed that some app license generators only pull the main LICENSE file of each dependency, so if third-party licenses are in a separate file, they might be missed. How do you handle this?

My library has 300k downloads a month, and I think it’s time to fix this in the best way.

Currently I only have a section in the README that links to the third-party code I use, along with each license type.

Thanks

https://redd.it/1mrep4m
@r_opensource
Seeking code review for open source Canadian shopping extension before launch

Built a browser extension for Canadian e-commerce (keeping details light until launch). Looking for a code review from experienced developers.

The stack is TypeScript + Vue. Given the Canadian angle, this might interest Canadian devs, but I would welcome feedback from anyone.

Send a DM for the repo link.

Thanks

https://redd.it/1mrimzr
@r_opensource
Need Contributors for PairPay

Need a contributor to add a feature for PairPay

PairPay uses:

1. React Native
2. react-native-reanimated
3. expo
4. supabase

The feature adds a chart that lets customers see their data: how much they owe and how much they are owed, broken down by currency.

If you would like to be part of this project, DM me.
https://play.google.com/store/apps/details?id=com.alisinayousofi.greenred

https://redd.it/1mrnl2l
@r_opensource
Rust Utility for Managing PATH

I've built a Rust utility that permanently adds directories to your PATH environment variable across different shell environments.



What it does:

Makes persistent PATH changes that apply to all new terminal sessions, unlike temporary solutions.



Current status (Pre-Alpha):

- Works with Bash shell
- ⚠️ Fish shell support semi-implemented (files created but not fully functional)
- ⚠️ Only works with absolute paths
- ⚠️ Not thoroughly tested - use at your own risk!



Usage:



global_path_add /absolute/path/to/directory
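
For readers wondering what a "persistent PATH change" could look like for Bash, here is a rough sketch in Python (not the project's Rust code, which I haven't reproduced here). It assumes the approach is to append an export line to a shell startup file; the function name `add_to_path` and the `rc_file` parameter are hypothetical, and the demo writes to a temporary file instead of a real ~/.bashrc.

```python
import os
import tempfile
from pathlib import Path


def add_to_path(directory, rc_file):
    """Append an export line for `directory` to `rc_file` unless it is already there.

    Returns True if the file was modified, False otherwise.
    """
    if not os.path.isabs(directory):
        # Mirrors the post's stated limitation: absolute paths only.
        raise ValueError(f"not an absolute path: {directory}")
    line = f'export PATH="$PATH:{directory}"'
    rc_file = Path(rc_file)
    existing = rc_file.read_text() if rc_file.exists() else ""
    if line in existing.splitlines():
        return False  # already present, so repeated runs stay idempotent
    with rc_file.open("a") as handle:
        # Start on a fresh line in case the file does not end with a newline.
        if existing and not existing.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True


# Demo against a throwaway file rather than a real ~/.bashrc.
with tempfile.TemporaryDirectory() as tmp:
    rc = Path(tmp) / "bashrc"
    first = add_to_path("/opt/tools/bin", rc)
    second = add_to_path("/opt/tools/bin", rc)
    contents = rc.read_text()
```

Checking for the line before appending keeps repeated runs from piling up duplicate entries, which is one of the easier things to get wrong in a tool like this.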



Why I'm sharing:

This is my first Rust project and I'm looking for feedback and contributors to help improve it. I need help with:

- Completing Fish shell support
- Support for other shells
- Better error handling
- Unit tests
- Code refactoring



Licensed under MIT. Any feedback or contributions would be greatly appreciated!



GitHub: https://github.com/streamtechteam/global_path_add

What do you think? Would you find this useful?

https://redd.it/1mrplcl
@r_opensource
🛡️ Find security pitfalls fast: heuristics + local AI (StarCoder2‑3B) — NeuralScan

- 💻 Lightweight desktop code scanner with a minimal GUI. Fast heuristics + optional on-device AI explanations.
- 🧭 What it flags: command exec, unsafe deserialization, weak crypto (MD5/SHA1/DES), destructive FS, secrets, network IOCs. Works on common source/configs (e.g., .py/.sh/Dockerfile).
- 🤖 AI: bigcode/starcoder2‑3b via HF Transformers; local-only, with deterministic fallback when AI isn’t available.
- 🐳 Optional Trivy integration (Docker) for dependency scanning. Safe degradation if Docker is off.
- 📊 Outputs a security score, risk categories (with severity weighting), and keeps recent scan history locally.
- 🧰 Cross‑platform (Linux/Win/macOS), Python 3.9+, MIT.
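
As a rough idea of what "fast heuristics" plus a severity-weighted score might mean in practice, here is a small standalone sketch. The rule names, regexes, severity weights, and scoring formula are all assumptions for illustration, not NeuralScan's actual rules or scoring.

```python
import re

# Illustrative patterns only: (category, severity weight, compiled regex).
RULES = [
    ("command_exec", 3, re.compile(r"\b(os\.system|subprocess\.(run|Popen|call))\s*\(")),
    ("unsafe_deserialization", 3, re.compile(r"\bpickle\.loads?\s*\(")),
    ("weak_crypto", 2, re.compile(r"\bhashlib\.(md5|sha1)\s*\(")),
    ("destructive_fs", 3, re.compile(r"\brm\s+-rf\b")),
]


def scan(source):
    """Return one finding per (line, rule) match, in line order."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for category, severity, pattern in RULES:
            if pattern.search(line):
                findings.append({"line": lineno, "category": category, "severity": severity})
    return findings


def score(findings, max_score=100, penalty=10):
    """Subtract a severity-weighted penalty per finding, floored at zero."""
    deducted = sum(f["severity"] * penalty for f in findings)
    return max(0, max_score - deducted)


sample = """import hashlib, os
digest = hashlib.md5(data).hexdigest()
os.system("rm -rf " + target)
"""
findings = scan(sample)
```

Line-by-line regex matching like this is quick but produces false positives and misses anything obfuscated, which is presumably where the optional AI explanations and Trivy scanning come in.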

GitHub

https://redd.it/1mrteh0
@r_opensource
What are some cool open source projects where I can contribute ?

I am a full stack developer with 1.5 YOE but no projects on my resume, so it gets rejected every time.

My skillset -
- JavaScript
- TypeScript
- Node.js
- NestJS
- ReactJS
- Postgres & MongoDB
- Sequelize & Mongoose
- Docker

I am more interested in backend.
Any help would be appreciated

Thanks in adv.


https://redd.it/1mrteef
@r_opensource
looking for FOSS alarm clock, for windows.

I used to use Alarm Clock Pro (paid). For some reason it glitched on my old device and I was able to use the free trial indefinitely (LOL).

Since switching to a new device, I have been looking for an alternative.
I found Free Alarm Clock, a cut-down version of the paid Hot Alarm Clock. It works fine, but it could not play a FLAC audio file.

I was wondering if there is an open source or free alternative to Alarm Clock Pro. I mainly need it to play audio files (including .flac) in a loop (single or many), autostart, wake the PC from sleep, and run in the background (stay alive in the hidden icons at the bottom right). If possible, it should also open files and run timers with the same output mechanism.

Features of Alarm Clock Pro:
can autorun at startup and probably wake from sleep (has never let me down)
set multiple alarms and turn them on and off (basic function)

select the alarm time, the snooze timing for each snooze, and the frequency of alarms

play any audio and many A/V files (even FLAC!), customize the alarm volume, and more

change the loop and playback speed 😱

can shut down or sleep at alarm time, open any(?) file or folder, create a log, run a shell command


freealarmclock

choose frequency, time, song (mp3), custom volume at time of alarm +(https://files.catbox.moe/4tzf3j.png)

https://redd.it/1mrx1p6
@r_opensource
I built a Markdown note-taking app for students and creators — and I’d love your feedback

**Hey everyone!**
I’d love to share a project I’ve been building over the past few years: **Alexandrie** 📚

It’s a web-based note-taking app designed primarily for **students**, but also great for **developers, content creators, and anyone** who writes a lot. The goal is to offer a **beautiful, intuitive interface** and produce clean, well-formatted documents—without the frustration of traditional tools like Word.

You can easily manage **hundreds of notes**, organize them into folders, export them, and boost your productivity with **custom snippets, markdown shortcuts**, and more.

# 🛠 Tech stack:

* Frontend: **Vue.js + Nuxt**
* Backend: **Go**
* File storage: **MinIO**

I’m currently the only developer working on it, but I’d love to have contributors! Whether you’re into coding, UI/UX, documentation, or just want to share feedback and suggestions, **you're very welcome to join** 🫶

👉 GitHub repo: [https://github.com/Smaug6739/Alexandrie](https://github.com/Smaug6739/Alexandrie)

If you like the idea, a star on GitHub would mean a lot, and feel free to reach out if you want to get involved!

https://redd.it/1ms28bn
@r_opensource
Are there any chrome like apps for Android/iPhone?

I mean, take a messaging app: instead of having a "share live location" feature built in, the app could support extensions that are community-built additions. Are there apps like this?

https://redd.it/1ms8bde
@r_opensource
Thinking about making Nextips open source, would you contribute?

I’ve been running Nextips, a social football predictions platform. I’m considering making it open source so anyone can contribute, improve it, and help grow the community.

Would you use it or contribute if I did? I’d love to hear your thoughts!

https://redd.it/1msk4x4
@r_opensource