A teammate asks you a question. You answer. They move on.

Repeat that enough, and you’ve accidentally trained your team not to think.

https://luminousmen.com/post/what-do-you-think/

Blog | iamluminousmen

What Do You Think?

Encourage critical thinking in your team with a simple phrase. Learn how asking "What do you think?" can transform you from a problem-solver to a mentor and team multiplier.

👍5

268 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Not sure what to make of this, but Googling HDFS now routes me directly to Harley-Davidson financing. Either Google's confused... or this is how the internet tells you you've reached the 'motorcycle loan' demographic.

🦄5❤1👀1

247 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Come on, this is fucking ridiculous

"hey claude, create a datasheet where our model is leading on every benchmark (btw create a benchmark)"

🔗Link: https://www.anthropic.com/news/claude-opus-4-5

🔥4💯1

227 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Most people treat BigQuery like a magic SQL endpoint.

You write a query, hit Run, wait a few seconds... and a petabyte-sized answer pops out.

If it's slow or expensive, the default reaction is: "I need more compute".

That's backwards.

BigQuery is designed to skip work, not to muscle through it:

https://luminousmen.com/post/bigquery-explained-what-really-happens-when-you-hit-run

Blog | iamluminousmen

BigQuery Explained: What Really Happens When You Hit “Run”

Discover the magic behind BigQuery's "infinite cluster" in this insightful breakdown of its internals. Learn how SQL queries get executed in seconds, unraveling the mystery behind Google's serverless system.

🔥1

169 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Security researchers at PromptArmor have discovered a critical vulnerability in Google Antigravity - Google's new AI-powered IDE that uses Gemini-based agents. Through an indirect prompt-injection attack, an outside actor can:

- Trick Gemini into reading sensitive local files (like .env files or API keys)
- Use the built-in agent browser to quietly exfiltrate that data through crafted URLs
- Bypass safeguards such as "secret filtering" or .gitignore protections by triggering shell commands like cat

Because Antigravity's agents are granted broad capabilities (access to code, a shell, and a browser), a single injected prompt hidden in a README or a code comment can silently leak data without any user action 😦

If you're experimenting with Antigravity or any similar agent-driven development tools, keep the following in mind:

- Lock down access to secrets
- Audit what capabilities your agents actually have
- Treat AI agents like remote developers - don't give them any more power than you'd hand to a junior engineer with near-root access

🔗 Link: https://promptarmor.com/resources/google-antigravity-exfiltrates-data

Promptarmor

Google Antigravity Exfiltrates Data

An indirect prompt injection in an implementation blog can manipulate Antigravity to invoke a malicious browser subagent in order to steal credentials and sensitive code from a user’s IDE.

👍2

235 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

ONLYFANS could be the most revenue-efficient company on the planet, beating Nvidia, Meta, Tesla, and Amazon - powered by ass, not AI.

😎9

247 views09:12

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Lowering the gates to the CUDA moat.
A NotebookLM-generated infographic following Google's new TPU announcement.

🔗Link: https://www.linkedin.com/posts/semianalysis_notebooklm-recently-introduced-a-new-function-activity-7400973159853780992-PsXz

👍1

208 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Throughout my career, I keep coming back to the same optimization in data pipelines:

Filter as early as possible.

Recently I cut a 3-hour job down to 30 minutes and dropped compute cost from $600 to $9 just by doing that.

If your analytics team needs sales from just three stores, don't build the full sales mart and filter later. That's waste.

Push the store filter upstream - before joins, before aggregations, as close to storage as you can. Join only on those store IDs from the start.

On most engines this means less data scanned, less shuffling, and better use of partition pruning / predicate pushdown. In practice you get:

- Less I/O
- Less memory pressure
- Faster, cheaper queries

But here's the nuance: don't hardcode business logic upstream. Maintainability still matters.

Instead of sprinkling storeid IN (...) across jobs, drive those filters from config, parameters, or dimension tables (like an activestores view). Same optimization, less brittleness.
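A minimal sketch of the idea in plain Python, not taken from any real pipeline. The sample data, the `ACTIVE_STORES` set (standing in for a config value or a dimension table) and all function names are hypothetical. It only illustrates that filtering before the join gives the same totals while the join touches far fewer rows.

```python
from collections import defaultdict

# Hypothetical data: 100 stores with 50 sales rows each, as (store_id, amount).
SALES = [(store_id, 10.0 * store_id) for store_id in range(1, 101) for _ in range(50)]
STORES = {store_id: f"store-{store_id}" for store_id in range(1, 101)}

# The filter comes from one place (config / dimension table stand-in),
# not from IN (...) lists copied across jobs.
ACTIVE_STORES = {3, 17, 42}


def join_and_sum(sales, stores):
    """Join sales to the store dimension and sum amounts per store name.

    Returns (totals, rows_joined) so the amount of work is visible.
    """
    totals = defaultdict(float)
    rows_joined = 0
    for store_id, amount in sales:
        rows_joined += 1
        totals[stores[store_id]] += amount
    return dict(totals), rows_joined


def filter_late(sales, stores, active):
    """Build the full aggregate first, then keep only the wanted stores."""
    totals, rows_joined = join_and_sum(sales, stores)
    wanted = {stores[store_id] for store_id in active}
    return {name: total for name, total in totals.items() if name in wanted}, rows_joined


def filter_early(sales, stores, active):
    """Drop unwanted rows before the join and aggregation."""
    narrowed = [(store_id, amount) for store_id, amount in sales if store_id in active]
    return join_and_sum(narrowed, stores)


late_totals, late_rows = filter_late(SALES, STORES, ACTIVE_STORES)
early_totals, early_rows = filter_early(SALES, STORES, ACTIVE_STORES)
print(late_totals == early_totals, late_rows, early_rows)
```

Both variants return identical totals; in this toy data the late filter joins 5000 rows and the early one joins 150. On a real engine the saving would show up as less data scanned and shuffled rather than a row counter.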

Before you run your next pipeline, ask:

Can I reduce data volume earlier without introducing fragile business logic?

💯5👍1

197 viewsedited 16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

AWS Lambda Managed Instances allows you to run Lambda functions on EC2 instances, preserving the familiar serverless model while gaining control over the hardware and EC2-based pricing. Wow, serverful computing

This is an attempt to cover use cases where Lambda is great from a development perspective but not cost- or hardware-efficient, without fully switching to ECS/EC2. In architectures with steady-state load or specific hardware requirements, this could be a game-changer, but you'll need to carefully profile multi-concurrency and realistically calculate the cost for your workload.
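A rough break-even sketch for that cost calculation, under loud assumptions. It compares classic per-invocation Lambda billing (GB-seconds plus a per-request charge) with a flat instance-hour bill. Every price below is a made-up placeholder, not a real AWS rate, and `management_fee_rate` is a hypothetical knob for any surcharge on top of the instance price. The model also ignores any per-request charges on the instance side. Plug in numbers from the current pricing pages before trusting the result.

```python
def per_request_cost(avg_duration_s, memory_gb, price_per_gb_s, price_per_request):
    """Cost of one invocation under per-invocation billing."""
    return avg_duration_s * memory_gb * price_per_gb_s + price_per_request


def on_demand_cost(requests, avg_duration_s, memory_gb, price_per_gb_s, price_per_request):
    """Total per-invocation bill for a number of requests."""
    return requests * per_request_cost(
        avg_duration_s, memory_gb, price_per_gb_s, price_per_request
    )


def instance_cost(instance_count, hours, hourly_price, management_fee_rate=0.0):
    """Flat bill for running instances, regardless of traffic."""
    return instance_count * hours * hourly_price * (1.0 + management_fee_rate)


def break_even_requests(fixed_cost, avg_duration_s, memory_gb, price_per_gb_s, price_per_request):
    """Request volume at which the per-invocation bill equals the flat bill."""
    return fixed_cost / per_request_cost(
        avg_duration_s, memory_gb, price_per_gb_s, price_per_request
    )


# Placeholder workload and prices (NOT real AWS rates).
workload = dict(
    avg_duration_s=0.5,
    memory_gb=1.0,
    price_per_gb_s=0.00002,
    price_per_request=0.0000002,
)
fixed = instance_cost(instance_count=2, hours=720, hourly_price=0.10)
threshold = break_even_requests(fixed, **workload)
print(f"flat bill: {fixed:.2f}, break-even at about {threshold:,.0f} requests")
```

Below the threshold, per-invocation billing is cheaper in this model; above it, the flat bill wins, provided the instances can actually absorb that load, which is where the concurrency profiling comes in.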

🔗 Link: https://aws.amazon.com/blogs/aws/introducing-aws-lambda-managed-instances-serverless-simplicity-with-ec2-flexibility/

Amazon

Introducing AWS Lambda Managed Instances: Serverless simplicity with EC2 flexibility | Amazon Web Services

Run Lambda functions on EC2 compute while maintaining serverless simplicity—enabling access to specialized hardware and cost optimizations through EC2 pricing models, with AWS handling all infrastructure management.

👍3

249 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

🔥5😁1

289 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Now we have a solution

🔥4😢1👀1

237 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

The real reason your Spark cluster is burning money:

https://luminousmen.com/post/dive-into-spark-memory

Blog | iamluminousmen

Deep Dive into Spark Memory Management

Discover why your Spark cluster is losing money with a deep dive into Spark memory management. Uncover the complexities of memory allocation, off-heap memory, and task management for optimal performance.

👍1

141 views16:19

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

It was a long year and you still hold on to my writing?

Thank you - genuinely.

Now, since you've made it this far, I want to give you a gift.

You know, I'm a simple man - my favorite holiday is New Year, and if you check the calendar you can guess I'm a bit happier right now.

I've been writing for a long time without giving much back to you, fellow reader - I assume a data engineer, maybe a future colleague.

What I write is usually deeply technical stuff, occasional rants, sometimes practical tips, and sentimental career advice for fellow data engineers. If you like how that sounds and want access to the paid posts too, there's a 30% off yearly discount running right now: https://luminousmen.substack.com/129bfd67

I keep some work on the paid side to make it sustainable and to go deeper instead of chasing clicks. As I said before, gated knowledge is where we're heading - I'm just trying to keep the gate cheap and honest.

ho-ho-ho-ho 🎄

Substack

Subscribe to Blog | luminousmen

helping robots conquer the earth and trying not to increase entropy using Python, Data Engineering, Machine Learning. Click to read Blog | luminousmen, a Substack publication with thousands of subscribers.

❤4🔥4👀1

127 views16:19
