Data Engineering Zoomcamp – Telegram
We have two announcements:

First, the deadline for the project has been extended by one week. The new due date is the 4th of May.

Also, we have generated the certificates for those who completed the first project. Here's an example: https://certificate.datatalks.club/dezoomcamp/2023/fe629854d45c559e9c10b3b8458ea392fdeb68a9.pdf

Instructions on how to get a certificate: https://github.com/DataTalksClub/data-engineering-zoomcamp/blob/main/cerficates.md

Later (hopefully not too late) you will receive an email with your personal link, but for now you can generate this link yourself. Pay attention to the hash - it's not the same as the one on the leaderboard.
Hi everyone!

If you finished in the top 100 participants (according to the leaderboard), please check your email. You will find a link to a form there, which you need to fill out if you want to appear on the public leaderboard of the course. Please do it within one week.

If you don't want to be on the public leaderboard, just ignore the email.

If you think you should have received an email but you didn't, please let Alexey know.
We will have two interesting workshops pretty soon:

- Identity resolution - we will see how to recognize that two different accounts belong to the same user. This is often a very important task when merging multiple datasets. Sign up here: https://www.eventbrite.com/e/identity-resolution-essentials-from-a-data-scientist-tickets-654866582577

- Mage - a workflow orchestration tool, a nice alternative to Airflow and Prefect. We will see how to set up a simple pipeline with Mage. Sign up here: https://eventbrite.com/e/647017044397
Hi everyone!

The next iteration of the course is starting soon (in 1.5 months). In the meantime, you probably have a ton of questions.

That's why we're organizing a Q&A stream on December 18 (Monday) at 17:00 CET, where we will answer all your questions.

Sign up here: https://lu.ma/1u1jlz4x

Also, on Monday (tomorrow) we will have a workshop that many of you will find relevant. We will talk about using Terraform for setting up a data warehouse (ClickHouse).

Sign up here: https://lu.ma/5fil21de
Many of you ask how much time you should devote to the course. The answer we have given so far was "it depends".

But we actually collected some data in past editions of the course that can give a more accurate answer.

Here's the dataset: https://github.com/DataTalksClub/zoomcamp-analytics/tree/main/data/de-zoomcamp-2023 (it also contains data from other courses)

You can do some analytics and then share the results with us.

In this repo you can also find a notebook from Timur, our past student and teaching assistant, who did the analysis for the first edition of ML Zoomcamp. Half of his notebook is devoted to data cleaning, but the DE Zoomcamp 2023 data is much cleaner, so most of that cleaning is no longer needed.
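
If you're not sure where to start, here is a rough sketch of the kind of analysis you could do: the median number of hours reported per homework. Note that the column names (`module`, `time_homework`) and the sample rows are made up for illustration and are not taken from the actual files, so check the real CSV headers in the repo and adjust.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def median_hours_by_module(csv_text, module_col="module", hours_col="time_homework"):
    """Return {module: median reported hours}, skipping blank or non-numeric entries.

    The default column names are hypothetical; pass the real ones from the dataset.
    """
    hours = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get(hours_col) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            # Not everyone fills in the time field, so skip unusable entries.
            continue
        hours[row[module_col]].append(value)
    return {module: median(values) for module, values in hours.items()}


# Made-up sample data, only to show the expected shape of the input.
sample = """module,time_homework
hw1,2
hw1,4
hw1,
hw2,3
hw2,5
hw2,10
"""

result = median_hours_by_module(sample)
print(result)
```

The median is used here instead of the mean because self-reported hours tend to have a few very large values that would skew an average.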

Have fun!
We're starting today at 17:00 CET! (That's in approximately 7 hours.)

You can ask your questions in advance using this link:

https://app.sli.do/event/su9wCLiM9nHnCwtGBfgicX

See you soon!
The form for submitting homework 1: https://courses.datatalks.club/de-zoomcamp-2024/homework/hw01

(Please ignore the "This homework is already scored. You didn't submit your answers." part - it's a bug which we'll fix later)

This platform is under active development, but hopefully you won't have any problems.

If you come across a problem, you can report it in the #course-management-platform channel.

If you're interested in how it works, the code is here: https://github.com/DataTalksClub/course-management-platform