Data Engineers – Telegram
Free Data Engineering Ebooks & Courses
Life of a Data Engineer.....


Business user: Can we add a filter to this dashboard? It will help us track a critical metric.
Me: Sure, this should be a quick one.

Next day:

I quickly opened the dashboard to find the column in its existing data sources: column not found.

Spent a couple of hours identifying where the column lives and how to bring it into the existing data pipeline that feeds the dashboard (table granularity, join conditions, etc.).

Then come the pipeline changes, data model changes, dashboard changes, and validation/testing.

Finally, deploying to production and sending a simple email to the user saying the filter has been added.

A small change in the front end, but a lot of work in the back end to bring that column to life.

Never underestimate data engineers and data pipelines 💪
These are the Top 5 Most Common SQL Questions for Data Engineering:


1. Number of records returned when joining two tables with each type of join
2. Rolling sum and Nth-highest salary questions
3. Lag/Lead questions, e.g., consecutive months of increasing sales or YoY growth
4. Query to find employees who earn more than their managers
5. Removing duplicates from a table
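
A minimal sketch of questions 1, 4 and 5, run through Python's built-in sqlite3 module so it works without a database server. The tables (a, b, employees, events) and their rows are made up for illustration. The duplicate-removal query relies on SQLite's implicit rowid column, so other databases would need a different approach (for example ROW_NUMBER() over a key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables with duplicate join keys on both sides.
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (1), (2), (3);
    INSERT INTO b VALUES (1), (2), (2), (4);

    CREATE TABLE employees (id INTEGER, name TEXT, salary INTEGER, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Asha', 100, NULL), (2, 'Ben', 120, 1),
        (3, 'Chen', 90, 1),     (4, 'Dara', 95, 3);

    CREATE TABLE events (user_id TEXT, event TEXT);
    INSERT INTO events VALUES
        ('u1', 'click'), ('u1', 'click'), ('u2', 'view'),
        ('u1', 'view'),  ('u2', 'view');
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Q1: row counts per join type. Duplicate keys multiply matching rows.
inner_rows = scalar("SELECT COUNT(*) FROM a JOIN b ON a.id = b.id")
left_rows = scalar("SELECT COUNT(*) FROM a LEFT JOIN b ON a.id = b.id")
# A RIGHT JOIN is written as a LEFT JOIN with the tables swapped, which also
# works on SQLite versions that lack RIGHT/FULL JOIN support.
right_rows = scalar("SELECT COUNT(*) FROM b LEFT JOIN a ON a.id = b.id")
# FULL OUTER = matched rows + unmatched left rows + unmatched right rows.
full_rows = left_rows + right_rows - inner_rows
cross_rows = scalar("SELECT COUNT(*) FROM a CROSS JOIN b")

# Q4: employees who earn more than their managers (self-join).
higher_paid = [row[0] for row in conn.execute("""
    SELECT e.name
    FROM employees e
    JOIN employees m ON e.manager_id = m.id
    WHERE e.salary > m.salary
    ORDER BY e.name
""")]

# Q5: remove duplicates, keeping the first physical row of each group.
conn.execute("""
    DELETE FROM events
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM events GROUP BY user_id, event
    )
""")
remaining_events = scalar("SELECT COUNT(*) FROM events")

print(inner_rows, left_rows, right_rows, full_rows, cross_rows)
print(higher_paid, remaining_events)
```

The join counts are worth working out by hand before running the code: each key contributes (rows on the left) x (rows on the right) to the inner join, and the outer joins add the unmatched rows on top.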


Key Takeaways:
- Master window functions and joins
- Practice medium to hard SQL questions regularly
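
As a rough illustration of the window-function takeaway, here is a sketch of a rolling sum, a LAG comparison and an Nth-highest salary query, again using Python's sqlite3 module. The sales and salaries tables are invented sample data, and the code assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data.
    CREATE TABLE sales (month TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('2024-01', 100), ('2024-02', 120), ('2024-03', 110),
        ('2024-04', 130), ('2024-05', 150);

    CREATE TABLE salaries (name TEXT, salary INTEGER);
    INSERT INTO salaries VALUES
        ('Asha', 100), ('Ben', 120), ('Chen', 120), ('Dara', 90);
""")

# Rolling 3-month sum: the current row plus the two rows before it.
rolling_sums = [row[0] for row in conn.execute("""
    SELECT SUM(amount) OVER (
        ORDER BY month
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    )
    FROM sales
    ORDER BY month
""")]

# LAG: months where sales grew compared with the previous month.
# The first month has no previous row (NULL), so it is filtered out.
growth_months = [row[0] for row in conn.execute("""
    SELECT month
    FROM (
        SELECT month, amount,
               LAG(amount) OVER (ORDER BY month) AS prev_amount
        FROM sales
    )
    WHERE amount > prev_amount
    ORDER BY month
""")]

# Nth-highest salary (here N = 2). DENSE_RANK gives tied salaries the
# same rank, so two people on 120 still count as one "highest" value.
second_highest = conn.execute("""
    SELECT DISTINCT salary
    FROM (
        SELECT salary,
               DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
        FROM salaries
    )
    WHERE rnk = 2
""").fetchone()[0]

print(rolling_sums, growth_months, second_highest)
```

Swapping DENSE_RANK for ROW_NUMBER or RANK changes how ties are treated, which is a common follow-up question.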

Getting good at SQL will pay off in the long run! 💪

Join our WhatsApp channel of Data Engineers: https://whatsapp.com/channel/0029Vaovs0ZKbYMKXvKRYi3C
FREE RESOURCES TO LEARN DATA ENGINEERING
👇👇

Big Data and Hadoop Essentials free course

https://bit.ly/3rLxbul

Data Engineer: Prepare Financial Data for ML and Backtesting FREE UDEMY COURSE
[4.6 stars out of 5]

https://bit.ly/3fGRjLu

Understanding Data Engineering from Datacamp

https://clnk.in/soLY

Data Engineering Free Books

https://ia600201.us.archive.org/4/items/springer_10.1007-978-1-4419-0176-7/10.1007-978-1-4419-0176-7.pdf

https://www.darwinpricing.com/training/Data_Engineering_Cookbook.pdf

Big Book of Data Engineering free book

https://databricks.com/wp-content/uploads/2021/10/Big-Book-of-Data-Engineering-Final.pdf

https://aimlcommunity.com/wp-content/uploads/2019/09/Data-Engineering.pdf

The Data Engineer’s Guide to Apache Spark

https://news.1rj.ru/str/datasciencefun/783

Data Engineering with Python

https://news.1rj.ru/str/pythondevelopersindia/343

Data Engineering Projects -

1. End-To-End From Web Scraping to Tableau - https://lnkd.in/ePMw63ge

2. Building Data Model and Writing ETL Job - https://lnkd.in/eq-e3_3J

3. Data Modeling and Analysis using Semantic Web Technologies - https://lnkd.in/e4A86Ypq

4. ETL Project in Azure Data Factory - https://lnkd.in/eP8huQW3

5. ETL Pipeline on AWS Cloud - https://lnkd.in/ebgNtNRR

6. Covid Data Analysis Project - https://lnkd.in/eWZ3JfKD

7. YouTube Data Analysis (End-To-End Data Engineering Project) - https://lnkd.in/eYJTEKwF

8. Twitter Data Pipeline using Airflow - https://lnkd.in/eNxHHZbY

9. Sentiment Analysis Twitter: Kafka and Spark Structured Streaming - https://lnkd.in/esVAaqtU

ENJOY LEARNING 👍👍
𝗚𝗼𝗼𝗴𝗹𝗲 𝗙𝗥𝗘𝗘 𝗖𝗲𝗿𝘁𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 𝗖𝗼𝘂𝗿𝘀𝗲𝘀😍 

Data analytics is a must-have skill in today’s digital era, and Google offers free courses to help you excel.

- Google Analytics Certification
- Google Analytics for Power Users
- Advanced Google Analytics

𝐋𝐢𝐧𝐤 👇:- 

https://pdlink.in/423LMom

Enroll For FREE & Get Certified🎓
Tools for Data Engineers 👆
Languages used by data engineers:

📍SQL
📍Python
📍Scala
📍PySpark
📍Spark SQL
Here are some incredible platforms where you can download datasets for your project:


Our World in Data (https://ourworldindata.org/)

World Health Organization (https://www.who.int/data/gho)

Statcounter (https://gs.statcounter.com/)

Food and Agriculture Organization of the UN (FAO) (https://www.fao.org/home/en)

World Bank (https://data.worldbank.org/)
𝗚𝗲𝘁 𝗬𝗼𝘂𝗿 𝗗𝗿𝗲𝗮𝗺 𝗝𝗼𝗯 𝗜𝗻 𝗔𝗺𝗮𝘇𝗼𝗻, 𝗚𝗼𝗼𝗴𝗹𝗲, 𝗠𝗶𝗰𝗿𝗼𝘀𝗼𝗳𝘁, 𝗡𝗩𝗜𝗗𝗜𝗔, 𝗮𝗻𝗱 𝗠𝗲𝘁𝗮 (𝗙𝗮𝗰𝗲𝗯𝗼𝗼𝗸) 𝘄𝗶𝘁𝗵 𝘁𝗵𝗲𝘀𝗲 𝗰𝗼𝗺𝗽𝗿𝗲𝗵𝗲𝗻𝘀𝗶𝘃𝗲 𝗿𝗲𝘀𝗼𝘂𝗿𝗰𝗲𝘀😍

1️⃣ Amazon Interviewing Guide
2️⃣ Google Interview Tips
3️⃣ Microsoft Hiring Tips
4️⃣ NVIDIA Hiring Process
5️⃣ Meta Onsite SWE Prep Guide

𝐋𝐢𝐧𝐤👇:-

https://pdlink.in/40OSJJ6

Crack Interview & Get Your Dream Job In Top MNCs
Flow chart of commonly used statistical tests
𝐅𝐑𝐄𝐄 𝐂𝐞𝐫𝐭𝐢𝐟𝐢𝐜𝐚𝐭𝐢𝐨𝐧 𝐂𝐨𝐮𝐫𝐬𝐞𝐬 😍

1) Generative AI

2) Big data artificial intelligence

3) Microsoft AI for Beginners

4) Prompt Engineering for ChatGPT

𝐋𝐢𝐧𝐤👇 :- 

https://pdlink.in/40Fbg9d

Enroll For FREE & Get Certified🎓
Struggling with Machine Learning algorithms? 🤖

Then you better stay with me! 🤓

We are going back to the basics to simplify ML algorithms.
... today's turn is Logistic Regression! 👇🏻

1️⃣ 𝗟𝗢𝗚𝗜𝗦𝗧𝗜𝗖 𝗥𝗘𝗚𝗥𝗘𝗦𝗦𝗜𝗢𝗡
It is a binary classification model: it assigns each input to one of two categories.

It can be extended to multi-class classification... but today we'll focus on the binary case.

The binary case is also known as Simple Logistic Regression.

2️⃣ 𝗛𝗢𝗪 𝗧𝗢 𝗖𝗢𝗠𝗣𝗨𝗧𝗘 𝗜𝗧?
The Sigmoid Function is our mathematical wand, turning numbers into neat probabilities between 0 and 1.

It's what makes Logistic Regression tick, giving us a clear 'probabilistic' picture.
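
As a small illustration, the sigmoid and a probability prediction could be sketched as below. The function names, weights and bias are placeholders chosen for this example, not values from any trained model.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a value between 0 and 1."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_proba(features, weights, bias):
    """Probability of the positive class for one example."""
    # Linear score first, then squash it with the sigmoid.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


# Hypothetical weights and input, for illustration only.
probability = predict_proba([2.0, -1.0], weights=[0.8, 0.5], bias=-0.3)
label = 1 if probability >= 0.5 else 0
print(round(probability, 3), label)
```

The 0.5 cut-off used for the label is only a common default; the threshold can be moved when one kind of error costs more than the other.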

3️⃣ 𝗛𝗢𝗪 𝗧𝗢 𝗗𝗘𝗙𝗜𝗡𝗘 𝗧𝗛𝗘 𝗕𝗘𝗦𝗧 𝗙𝗜𝗧?
For every parametric ML algorithm, we need a LOSS FUNCTION.

It is our map to find our optimal solution or global minimum.

(hoping there is one! 😉)
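
The post does not name a specific loss. A common choice for logistic regression is log loss (binary cross-entropy), sketched here under that assumption. It is convex in the model weights, so gradient descent is not trapped by local minima. The eps clipping value is an arbitrary safeguard against log(0).

```python
import math


def log_loss(y_true, y_prob, eps=1e-12):
    """Average binary cross-entropy between labels (0/1) and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip so that log(0) never happens for p == 0 or p == 1.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Made-up labels and predictions, for illustration only.
labels = [1, 0, 1, 0]
good_predictions = [0.9, 0.1, 0.8, 0.2]
bad_predictions = [0.1, 0.9, 0.2, 0.8]

good_loss = log_loss(labels, good_predictions)
bad_loss = log_loss(labels, bad_predictions)
print(round(good_loss, 3), round(bad_loss, 3))
```

Confident wrong predictions are penalised much more heavily than mildly wrong ones, which is why the second set of predictions scores far worse.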

✚ 𝗕𝗢𝗡𝗨𝗦 - FROM LINEAR TO LOGISTIC REGRESSION
The sigmoid function can be derived from the Linear Regression equation.
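
One way to write that derivation, assuming the usual notation (w for the weights, x for the features, b for the bias, p for the probability of the positive class, none of which appear in the original post): treat the linear regression output as the log-odds of p and solve for p.

```latex
\begin{aligned}
z &= w^{\top} x + b
  && \text{(linear regression equation)} \\
\ln\frac{p}{1-p} &= z
  && \text{(model the log-odds as linear)} \\
\frac{p}{1-p} &= e^{z} \\
p &= \frac{e^{z}}{1+e^{z}} = \frac{1}{1+e^{-z}} = \sigma(z)
  && \text{(the sigmoid function)}
\end{aligned}
```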
Understand the power of Data Lakehouse Architecture for 𝗙𝗥𝗘𝗘 here...


🚨𝗢𝗹𝗱 𝘄𝗮𝘆
• Complicated ETL processes for data integration.
• Silos of data storage, separating structured and unstructured data.
• High data storage and management costs in traditional warehouses.
• Limited scalability and delayed access to real-time insights.

𝗡𝗲𝘄 𝗪𝗮𝘆
• Streamlined data ingestion and processing with integrated SQL capabilities.
• Unified storage layer accommodating both structured and unstructured data.
• Cost-effective storage by combining benefits of data lakes and warehouses.
• Real-time analytics and high-performance queries with SQL integration.

The shift?

Unified Analytics and Real-Time Insights > Siloed and Delayed Data Processing

Leveraging SQL to manage data in a data lakehouse architecture transforms how businesses handle data.

Data Engineering Interview Preparation Resources: https://whatsapp.com/channel/0029Vaovs0ZKbYMKXvKRYi3C

All the best 👍👍
𝗧𝗼𝗽 𝗙𝗿𝗲𝗲 𝗣𝘆𝘁𝗵𝗼𝗻 𝗖𝗼𝘂𝗿𝘀𝗲𝘀 𝗳𝗼𝗿 𝗕𝗲𝗴𝗶𝗻𝗻𝗲𝗿𝘀😍

Python is one of the most versatile and in-demand programming languages today.

Whether you’re a beginner or looking to refresh your coding skills, these beginner-friendly courses will guide you step by step.

𝗟𝗲𝗮𝗿𝗻 𝗙𝗼𝗿 𝗙𝗥𝗘𝗘👇:-

https://pdlink.in/4gG4k2q

All The Best 🎉
djangobookwzy482.pdf
1.2 MB
Python Django pdf 🚀
𝗦𝗤𝗟 𝗙𝗥𝗘𝗘 𝗖𝗲𝗿𝘁𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 𝗖𝗼𝘂𝗿𝘀𝗲𝘀 😍

Best Free SQL Courses to Get Started

1) Introduction to Databases and SQL
2) Advanced Database and SQL
3) Learn SQL 
4) SQL Tutorial

𝐋𝐢𝐧𝐤 👇:- 

https://pdlink.in/3EyjUPt

Enroll For FREE & Get Certified 🎓
https://drive.google.com/drive/folders/1SkCOcAS0Kqvuz-MJkkjbFr1GSue6Ms6m

Placement material for all companies 🔥🔥🔥

Share with your friends ❣️
https://news.1rj.ru/str/sqlspecialist
Python Programming and SQL 7 in 1 book: https://drive.google.com/file/d/1nBfEzab3VgUJ59lZmP6iJzpdd7qPSrUr/view?usp=drivesdk

Join telegram channels for more free resources: https://news.1rj.ru/str/addlist/JbC2D8X2g700ZGMx