Great work on the projects!
Here is the feedback: https://docs.google.com/spreadsheets/d/e/2PACX-1vQuMt9m1XlPrCACqnsFTXTV_KGiSnsl9UjL7kdTMsLJ8DLu3jNJlPzoUKG6baxc8APeEQ8RaSP1U2VX/pubhtml?gid=27207346&single=true
The updated leaderboard: https://docs.google.com/spreadsheets/d/e/2PACX-1vTbL00GcdQp0bJt9wf1ROltMq7s3qyxl-NYF7Pvk79Jfxgwfn9dNWmPD_yJHTDq_Wzvps8EIr6cOKWm/pubhtml
❤36👍2
Dave recorded the solution to the PipeRider homework. You can check it here: https://www.youtube.com/watch?v=inNrUys7W8U&list=PL3MmuxUbc_hJjEePXIdE-LVUx_1ZZjYGW
YouTube
PipeRider DE Zoomcamp Workshop Homework Answers
The answers for the homework questions as part of the PipeRider Workshop from week 4 of the DataTalks.Club Data Engineering Zoomcamp.
Homework:
https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cohorts/2023/workshops/piperider.md
Workshop…
👍10❤5
We have two announcements
First, the deadline for the project is extended by one week. The new due date is the 4th of May
Also, we have generated the certificates for those who completed the first project. Here's an example: https://certificate.datatalks.club/dezoomcamp/2023/fe629854d45c559e9c10b3b8458ea392fdeb68a9.pdf
Instructions on how to get a certificate: https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cerficates.md
Later (hopefully not too late) you will receive an email with your personal link, but for now you can generate this link yourself. Pay attention to the hash - it's not the same as in the leaderboard
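To illustrate why the certificate hash can differ from the leaderboard hash, here is a minimal sketch. It assumes both are SHA-1 digests of your email, with an extra salt added for the certificate. The function names and the "_" salt are hypothetical; the linked instructions describe the actual rule.

```python
from hashlib import sha1

# URL pattern taken from the example certificate link in this message.
CERTIFICATE_URL = "https://certificate.datatalks.club/dezoomcamp/2023/{cert_id}.pdf"


def email_hash(email, salt=""):
    """Return the SHA-1 hex digest of the normalised email plus an optional salt."""
    cleaned = email.lower().strip()
    return sha1((cleaned + salt).encode("utf-8")).hexdigest()


def certificate_url(email, salt="_"):
    """Build a certificate link; the "_" salt is an assumption, not a confirmed rule."""
    return CERTIFICATE_URL.format(cert_id=email_hash(email, salt))


# Same email, different salt: the two ids never match.
leaderboard_id = email_hash("Student@Example.com")
certificate_id = email_hash("Student@Example.com", salt="_")
print(leaderboard_id != certificate_id)
print(certificate_url("student@example.com"))
```

If the generated link does not open, the assumed salt or normalisation is probably wrong, so fall back to the instructions.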
🔥39👍10👏4❤2🐳1
Great work on the projects!
Here are the peer review assignments: https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cohorts/2023/project.md#project-attempt-2
Find your hash in the table and submit 3 reviews in the form for submissions.
Have fun learning from your peers!
GitHub
data-engineering-zoomcamp/cohorts/2023/project.md at main · DataTalksClub/data-engineering-zoomcamp
Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering. - DataTalksClub/data-engineering-zoomcamp
🔥27👍5
Great work on the projects!
Now the scores are available in the updated leaderboard: https://docs.google.com/spreadsheets/d/e/2PACX-1vTbL00GcdQp0bJt9wf1ROltMq7s3qyxl-NYF7Pvk79Jfxgwfn9dNWmPD_yJHTDq_Wzvps8EIr6cOKWm/pubhtml
You can see the feedback here: https://docs.google.com/spreadsheets/d/e/2PACX-1vQuMt9m1XlPrCACqnsFTXTV_KGiSnsl9UjL7kdTMsLJ8DLu3jNJlPzoUKG6baxc8APeEQ8RaSP1U2VX/pubhtml?gid=246029638&single=true
You can access your certificates using these instructions: https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cerficates.md
🔥11🙏7😢3👍2
Hi everyone!
If you have finished among the top-100 participants (according to the leaderboard), please check your email. You will find a link to a form there, which you need to fill out if you want to be on a public leaderboard of the course. Please do it within one week.
If you don't want to be on the public leaderboard, just ignore the email.
If you think you should have received an email but you didn't, please let Alexey know.
🔥15🙏4❤3👍3
We will have two interesting workshops pretty soon:
- Identity resolution - we will see how to recognize that two different accounts belong to the same user. This is often a very important task when merging multiple datasets. Sign up here: https://www.eventbrite.com/e/identity-resolution-essentials-from-a-data-scientist-tickets-654866582577
- Mage - a workflow orchestration tool, a nice alternative to Airflow and Prefect. We will see how to set up a simple pipeline with Mage. Sign up here: https://eventbrite.com/e/647017044397
❤31👍7🔥6
Here's the public top-100 leaderboard!
https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cohorts/2023/leaderboard.md
Thanks everyone for taking part in the course
GitHub
data-engineering-zoomcamp/cohorts/2023/leaderboard.md at main · DataTalksClub/data-engineering-zoomcamp
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼 - DataTalksClub/data-engineering-zoomcamp
❤27👏11👍8🙏3
We're starting a workshop about Mage. You can join it here or watch the recording later:
https://www.youtube.com/watch?v=nUfAqM2Sguc
YouTube
Data Plumbing without the 💩 - Tommy Dang
Links:
- Repo: https://github.com/mage-ai/mage_demo_project
- Demo: https://demo.mage.ai/pipelines
- Setup: https://docs.mage.ai/getting-started/setup
- Getting started docs: https://docs.mage.ai/getting-started/setup
- Git integration: https://docs.mag…
❤25👍17🤣5
Hey everyone!
We're about to start another workshop about Mage. It might be relevant to some of you.
Here's a link to the stream: https://www.youtube.com/watch?v=JKALtxziBG0
(As always you can watch it later)
YouTube
Make Data Magical with Mage - Matt Palmer
Links:
- https://go.mage.ai/dtc-data-magic
Free ML Engineering course: http://mlzoomcamp.com
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
👍45❤19🔥5
Hi everyone!
The next iteration of the course is starting soon (in 1.5 months). In the meantime, you probably have a ton of questions
That's why we're organizing a Q&A stream on December 18 (Monday) at 17:00 CET, where we will answer all your questions
Sign up here: https://lu.ma/1u1jlz4x
Also, on Monday (tomorrow) we will have a workshop that many of you will find relevant. We will talk about using Terraform for setting up a data warehouse (ClickHouse)
Sign up here: https://lu.ma/5fil21de
lu.ma
Introduction to Data Engineering Zoomcamp · Luma
Live session about the upcoming Data Engineering Zoomcamp course - Alexey Grigorev
About the event
Join us for a Q&A session with Alexey Grigorev, the…
❤96👍54🔥24👏8
We're starting the workshop about Terraform and ClickHouse
Join now or watch later in replay
https://www.youtube.com/watch?v=YFr_5NTjv0Q
YouTube
Terraform: Reshaping the Data Engineering Experience - Andrei Tserakhau
In this workshop, Andrei Tserakhau, Tech Lead at DoubleCloud, gave a hands-on tutorial about using Terraform for data engineering projects.
He explained how to leverage Terraform to manage and automate data infrastructure, focusing on practical applications…
🔥42👍23❤8
We're starting the Q&A stream in 30 minutes
In the meantime, you can already ask your questions here: https://app.sli.do/event/su9wCLiM9nHnCwtGBfgicX
app.sli.do
Join Slido: Enter #code to vote and ask questions
Participate in a live poll, quiz or Q&A. No login required.
❤15👍2
Live stream: https://www.youtube.com/watch?v=91b8u9GmqB4
(Available for replay later)
Ask your questions here: https://app.sli.do/event/su9wCLiM9nHnCwtGBfgicX
YouTube
Data Engineering Zoomcamp 2024 - Pre-Launch Q&A
Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks-club.slack.com/join/shared_invite/zt-2hu0sjeic-ESN7uHt~aVWc8tD3PefSlA#/shared-invite/email
Subscribe…
❤59👏9👍6
If you're wondering how to prepare your environment for the course, check these two videos:
- Using GitHub Codespaces: https://www.youtube.com/watch?v=XOSUt8Ih3zA&list=PL3MmuxUbc_hJed7dXYoJw8DoCuVHhGEQb (Thanks Luis for recording it!)
- Using a GCP VM: https://www.youtube.com/watch?v=ae-CV2KfoN0&list=PL3MmuxUbc_hJed7dXYoJw8DoCuVHhGEQb
YouTube
DE Zoomcamp 1.4.2 - Using Github Codespaces for the Course (by Luis Oliveira)
Timecodes:
00:00 Intro to GitHub Codespaces
1:05 Create a Repo
1:47 Create New Codespace
2:54 Run Codespace Locally/Desktop
3:22 GitHub Codespaces Extension
3:57 Codespaces Overview and Features
5:02 Install Terraform
6:05 Jupyter Notebook
8:54 Running Docker…
👍95❤34👏20🔥11
Many of you ask how much time you should devote to the course. The answer we have previously given was "it depends".
But actually we collected some data in the past editions of the course that can give a more accurate answer
Here's the dataset: https://github.com/DataTalksClub/zoomcamp-analytics/tree/main/data/de-zoomcamp-2023 (it also contains data from other courses)
You can do some analytics and then share the results with us
In this repo you can also find a notebook from Timur, our past student and teaching assistant, who did the analysis for the first edition of ML Zoomcamp. Half of his notebook is devoted to data cleaning, but actually DE Zoomcamp 2023 data is much cleaner, so most of it is not needed anymore
Have fun!
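As a starting point for that analysis, here is a minimal sketch that computes the median hours spent per module. The sample rows and the column names (`module`, `time_lectures`, `time_homework`) are made up for illustration; check the actual files in the repo and adjust the names accordingly.

```python
import csv
import io
import statistics

# Made-up sample rows; the real files and column names may differ.
SAMPLE_CSV = """module,time_lectures,time_homework
1,4,3
1,6,5
1,,2
2,5,4
2,7,6
"""


def median_hours_by_module(csv_file, columns=("time_lectures", "time_homework")):
    """Return {module: median total hours}, skipping rows with missing answers."""
    totals = {}
    for row in csv.DictReader(csv_file):
        try:
            hours = sum(float(row[column]) for column in columns)
        except (TypeError, ValueError):
            continue  # blank or non-numeric answer, so skip the row
        totals.setdefault(row["module"], []).append(hours)
    return {module: statistics.median(values) for module, values in totals.items()}


result = median_hours_by_module(io.StringIO(SAMPLE_CSV))
print(result)
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` sample. The median is used here rather than the mean so that a few very large answers do not skew the estimate.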
GitHub
zoomcamp-analytics/data/de-zoomcamp-2023 at main · DataTalksClub/zoomcamp-analytics
Public data and analytics for our open course . Contribute to DataTalksClub/zoomcamp-analytics development by creating an account on GitHub.
🔥78👍51❤36👏3😁3🤔1
We're starting today at 17:00 CET! (In approximately 7 hours from now)
You can ask your questions in advance using this link:
https://app.sli.do/event/su9wCLiM9nHnCwtGBfgicX
See you soon!
app.sli.do
Join Slido: Enter #code to vote and ask questions
Participate in a live poll, quiz or Q&A. No login required.
❤90🔥43👍29👏6🥰1
We're starting!
Watch here: https://www.youtube.com/watch?v=AtRhA-NfS24 (or later in replay)
Ask questions here: https://app.sli.do/event/su9wCLiM9nHnCwtGBfgicX
YouTube
Data Engineering Zoomcamp 2024
Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks-club.slack.com/join/shared_invite/zt-2hu0sjeic-ESN7uHt~aVWc8tD3PefSlA#/shared-invite/email
Subscribe…
❤61🔥16👍10👏1
We're starting work on module 1 today
Content: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/01-docker-terraform
Homework: https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cohorts/2024/01-docker-terraform/homework.md (due in 2 weeks)
We will share the link to the homework form soon
Happy learning!
GitHub
data-engineering-zoomcamp/01-docker-terraform at main · DataTalksClub/data-engineering-zoomcamp
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼 - DataTalksClub/data-engineering-zoomcamp
❤116👍61🔥17👌7🫡7🥰1
The form for submitting homework 1: https://courses.datatalks.club/de-zoomcamp-2024/homework/hw01
(Please ignore the "This homework is already scored. You didn't submit your answers." part - it's a bug which we'll fix later)
This platform is under active development, but hopefully you won't have any problems.
If you come across a problem, you can report it in the #course-management-platform channel
If you're interested in how it works, the code is here: https://github.com/DataTalksClub/course-management-platform
GitHub
GitHub - DataTalksClub/course-management-platform: Django-based course management platform for Zoomcamps
Django-based course management platform for Zoomcamps - GitHub - DataTalksClub/course-management-platform: Django-based course management platform for Zoomcamps
🔥35👍28❤21👏5
How is it going with module 1?
Anonymous Poll
57% - Not started yet
25% - Halfway through
8% - Almost done
7% - Finished everything
4% - Not taking the course
❤28👍6