Data Analytics – Telegram
Data Analytics
108K subscribers
126 photos
2 files
791 links
Perfect channel to learn Data Analytics

Learn SQL, Python, Alteryx, Tableau, Power BI and many more

For Promotions: @coderfun @love_data
Download Telegram
Tableau Learning Series Part-3

Complete Tableau Topics for Data Analysis: https://news.1rj.ru/str/sqlspecialist/667

Today, let's learn about Data Transformation and Preparation

Effective data transformation and preparation are crucial steps in ensuring that your data is clean, well-structured, and ready for analysis. Tableau provides several tools and features to help with this process.

#### Data Cleaning and Shaping

1. Renaming Fields: Double-click on a field name in the Data pane and give it a meaningful name.
2. Changing Data Types: Right-click on a field, select "Change Data Type," and choose the appropriate data type (e.g., string, number, date).
3. Splitting Fields: Split a field into multiple fields based on a delimiter. Right-click on a field and select "Split" or "Custom Split."
4. Pivoting Data: Convert columns into rows to reshape your data. This is useful for transforming wide data into a long format.
- Select the columns you want to pivot, right-click, and choose "Pivot."

#### Calculated Fields

Calculated fields allow you to create new data from existing data using formulas. Here’s how to create one:

1. Create a Calculated Field:
- Right-click in the Data pane and select "Create Calculated Field."
- Name your field and enter a formula. For example, to calculate a profit margin, you might use:

     [Profit] / [Sales] 

2. Common Functions:
- String Functions: E.g., LEFT(), RIGHT(), MID(), CONCAT().
- Date Functions: E.g., DATEPART(), DATETRUNC(), DATEDIFF().
- Logical Functions: E.g., IF, THEN, ELSEIF, ELSE, END.
- Aggregate Functions: E.g., SUM(), AVG(), MIN(), MAX().

#### Level of Detail (LOD) Expressions

LOD expressions allow you to control the granularity of your calculations. They are useful for performing complex aggregations and analyses.

1. Types of LOD Expressions:
- Fixed: Calculates the value using the specified dimensions, ignoring other dimensions in the view.

     { FIXED [Region] : SUM([Sales]) }

- Include: Adds dimensions to the view’s level of detail.

     { INCLUDE [Category] : SUM([Sales]) }

- Exclude: Removes dimensions from the view’s level of detail.

     { EXCLUDE [Segment] : SUM([Sales]) }

#### Using Tableau Prep for Data Preparation

Tableau Prep is a tool specifically designed for data preparation, offering an intuitive interface to clean and shape your data.

1. Connecting to Data: Similar to Tableau Desktop, connect to your data sources.
2. Flows: Tableau Prep uses flows, which are sequences of steps (clean, shape, combine, etc.) that you apply to your data.
3. Cleaning Steps:
- Cleaning and Shaping: Perform tasks like renaming fields, changing data types, splitting fields, and pivoting data.
- Union and Join: Combine multiple tables using unions and joins.
- Aggregate and Group: Aggregate data to create summary statistics and group similar values.
4. Output: Once the data is prepared, you can output it to a file or publish it to Tableau Server/Tableau Online for use in Tableau Desktop.

#### Example of Data Preparation in Tableau Prep

1. Start Tableau Prep and connect to your data source (e.g., an Excel file).
2. Add Steps:
- Drag a "Clean Step" to rename fields, split columns, and fix data types.
- Drag a "Join Step" to combine multiple tables.
- Add a "Pivot Step" to reshape data if needed.
3. Output Data:
- Add an "Output Step" and choose the output location and format.
- Run the flow to generate the cleaned data.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like for more such content 👍❤️

Hope it helps :)
👍3310
Tableau Learning Series Part-4

Complete Tableau Topics for Data Analysis: https://news.1rj.ru/str/sqlspecialist/667

Today, let's learn about Building Basic Visualizations

Bar Charts

Bar charts are useful for comparing data across categories.

1. Creating a Simple Bar Chart:
- Drag a dimension (e.g., Category) to the Columns shelf.
- Drag a measure (e.g., Sales) to the Rows shelf.
- Tableau automatically creates a bar chart.

2. Customizing the Bar Chart:
- Use the Color shelf to color bars by another dimension (e.g., Sub-Category).
- Adjust the Size shelf to change bar thickness.
- Add labels by dragging a measure to the Label shelf.

Line Charts

Line charts are ideal for showing trends over time.

1. Creating a Simple Line Chart:
- Drag a date field (e.g., Order Date) to the Columns shelf.
- Drag a measure (e.g., Sales) to the Rows shelf.
- Tableau creates a line chart automatically if the date field is continuous.

2. Customizing the Line Chart:
- Use the Color shelf to distinguish lines by category (e.g., Region).
- Add markers by checking the "Show Markers" option in the Marks card.
- Adjust the date granularity (e.g., year, quarter, month) by clicking on the date field in the Columns shelf and selecting the desired granularity.

Pie Charts

Pie charts show proportions and percentages of a whole.

1. Creating a Simple Pie Chart:
- Drag a dimension (e.g., Category) to the Columns shelf.
- Drag a measure (e.g., Sales) to the Rows shelf.
- Click on the Show Me panel and select the pie chart icon.
- Move Category to the Color shelf and Sales to the Angle shelf.

2. Customizing the Pie Chart:
- Add labels by dragging the dimension or measure to the Label shelf.
- Adjust the Size shelf to change the size of the pie chart.
- Use the Color shelf to adjust colors for better distinction.

Scatter Plots

Scatter plots show relationships between two measures.

1. Creating a Simple Scatter Plot:
- Drag one measure (e.g., Sales) to the Columns shelf.
- Drag another measure (e.g., Profit) to the Rows shelf.
- Tableau creates a scatter plot automatically.

2. Customizing the Scatter Plot:
- Add a dimension (e.g., Region) to the Color shelf to color code the points.
- Add another dimension to the Detail shelf to distinguish between data points.
- Adjust the Size shelf to change the size of the points.

Histograms

Histograms display the distribution of a single measure.

1. Creating a Histogram:
- Drag a measure (e.g., Sales) to the Columns shelf.
- Right-click the measure in the Columns shelf, select "Create Bins," and set the bin size.
- Drag the newly created bin field to the Columns shelf.
- Drag another measure (e.g., Number of Records) to the Rows shelf.
- Tableau creates a histogram.

2. Customizing the Histogram:
- Adjust bin size by editing the bin field.
- Use the Color shelf to color bins by another dimension.
- Add labels by dragging a measure to the Label shelf.

Geographic Maps

Geographic maps are used to visualize data geographically.

1. Creating a Simple Map:
- Drag a geographic dimension (e.g., State) to the Columns shelf.
- Drag a measure (e.g., Sales) to the Rows shelf.
- Tableau creates a map with filled areas.

2. Customizing the Map:
- Use the Color shelf to color the regions by the measure.
- Add labels by dragging the dimension or measure to the Label shelf.
- Adjust the map style and layers through the Map menu.

## Building Dashboards

Once you have individual visualizations, you can combine them into a dashboard.

1. Creating a Dashboard:
- Click the New Dashboard icon at the bottom of the Tableau workspace.
- Drag sheets from the Sheets pane to the dashboard workspace.
- Arrange and resize the visualizations as needed.

2. Adding Interactivity:
- Use filters, actions, and parameters to make your dashboard interactive.
- Add text boxes, images, and web content for additional context.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like for more such content 👍❤️

Hope it helps :)
👍2710🔥1
Requirements for data analyst role based on some jobs from @jobs_sql

👉 Must be proficient in writing complex SQL Queries.

👉 Understand business requirements in BI context and design data models to transform raw data into meaningful insights.

👉 Connecting data sources, importing data, and transforming data for Business intelligence.

👉 Strong working knowledge in Excel and visualization tools like PowerBI, Tableau or QlikView

👉 Developing visual reports, KPI scorecards, and dashboards using Power BI desktop.

Nowadays, recruiters primary focus on SQL & BI skills for data analyst roles. So try practicing SQL & create some BI projects using Tableau or Power BI.

You can refer our Power BI & SQL Series to understand the essential concepts.

Here are some essential telegram channels with important resources:

❯ SQL ➟ t.me/sqlanalyst
❯ Power BI ➟ t.me/PowerBI_analyst
❯ Resources ➟ @learndataanalysis

I am planning to come up with interview series as well to share some essential questions based on my experience in data analytics field.

Like this post if you want me to start the interview series 👍❤️

Hope it helps :)
👍5916👏2
SQL Interview Preparation Part-2

How to use window functions and CTEs to solve SQL interview questions?

1. Common Table Expressions (CTEs):

CTEs are temporary result sets that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. They help break down complex queries and improve readability.

Syntax:
WITH cte_name AS (
SELECT column1, column2
FROM table_name
WHERE condition
)
SELECT column1, column2
FROM cte_name
WHERE another_condition;

Example Problem:
Find the top 3 highest-paid employees in each department.

Solution Using CTE:
WITH RankedSalaries AS (
SELECT
employee_id,
department_id,
salary,
ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
FROM employees
)
SELECT
employee_id,
department_id,
salary
FROM RankedSalaries
WHERE rank <= 3;

2. Window Functions:

Window functions perform calculations across a set of table rows related to the current row. They do not reduce the number of rows returned.

Common Window Functions:
- ROW_NUMBER(): Assigns a unique number to each row within the partition.
- RANK(): Assigns a rank to each row within the partition, with gaps in ranking for ties.
- DENSE_RANK(): Similar to RANK(), but without gaps.
- SUM(), AVG(), COUNT(), etc., over a partition.

Syntax:
SELECT column1, 
column2,
window_function() OVER (PARTITION BY column1 ORDER BY column2) AS window_column
FROM table_name;

Example Problem:
Calculate the running total of sales for each salesperson.

Solution Using Window Function:
SELECT 
salesperson_id,
sale_date,
amount,
SUM(amount) OVER (PARTITION BY salesperson_id ORDER BY sale_date) AS running_total
FROM sales;

Combining CTEs and Window Functions:

Example Problem:
Find the cumulative sales per department and the rank of each employee within their department based on their sales.

Solution:
WITH DepartmentSales AS (
SELECT
department_id,
employee_id,
SUM(sales_amount) AS total_sales
FROM sales
GROUP BY department_id, employee_id
),
RankedSales AS (
SELECT
department_id,
employee_id,
total_sales,
RANK() OVER (PARTITION BY department_id ORDER BY total_sales DESC) AS sales_rank
FROM DepartmentSales
)
SELECT
department_id,
employee_id,
total_sales,
sales_rank,
SUM(total_sales) OVER (PARTITION BY department_id ORDER BY sales_rank) AS cumulative_sales
FROM RankedSales;


For those of you who are new to this channel read SQL Basics before going through advanced concepts 😄

Part-1: https://news.1rj.ru/str/sqlspecialist/558

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍7621🔥4👎1
SQL INTERVIEW PREPARATION PART-3

What are the different types of SQL commands?

SQL commands can be categorized into several types based on their functionality:

- DDL (Data Definition Language): These commands are used to define and modify database structures, such as tables and indexes.
- Examples: CREATE, ALTER, DROP
- Example:

         CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100),
position VARCHAR(50)
);

- DML (Data Manipulation Language): These commands are used to manipulate the data within the database.
- Examples: SELECT, INSERT, UPDATE, DELETE
- Example:

         INSERT INTO employees (id, name, position) VALUES (1, 'John Doe', 'Manager');

- DCL (Data Control Language): These commands are used to control access to data within the database.
- Examples: GRANT, REVOKE
- Example:

         GRANT SELECT ON employees TO user_name;

- TCL (Transaction Control Language): These commands are used to manage transactions in the database.
- Examples: COMMIT, ROLLBACK, SAVEPOINT
- Example:

         BEGIN;
UPDATE employees SET position = 'Senior Manager' WHERE id = 1;
COMMIT;


Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍5217🔥1👏1
SQL INTERVIEW PREPARATION PART-4

What is the difference between INNER JOIN and OUTER JOIN?

- INNER JOIN: Returns only the rows where there is a match in both tables.
- OUTER JOIN: Returns the matched rows as well as unmatched rows from one or both tables. There are three types of OUTER JOIN:
- LEFT OUTER JOIN (or LEFT JOIN): Returns all rows from the left table, and the matched rows from the right table. If no match is found, the result is NULL on the right side.
- RIGHT OUTER JOIN (or RIGHT JOIN): Returns all rows from the right table, and the matched rows from the left table. If no match is found, the result is NULL on the left side.
- FULL OUTER JOIN: Returns rows when there is a match in one of the tables. This means it returns all rows from the left table and the right table, filling in NULLs when there is no match.

Examples:
- INNER JOIN:

     SELECT employees.name, departments.department_name
FROM employees
INNER JOIN departments ON employees.department_id = departments.id;

- LEFT JOIN:

     SELECT employees.name, departments.department_name
FROM employees
LEFT JOIN departments ON employees.department_id = departments.id;

- RIGHT JOIN:

     SELECT employees.name, departments.department_name
FROM employees
RIGHT JOIN departments ON employees.department_id = departments.id;

- FULL OUTER JOIN:

     SELECT employees.name, departments.department_name
FROM employees
FULL OUTER JOIN departments ON employees.department_id = departments.id;

Go though SQL Learning Series to refresh your basics

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍4119🥰4
SQL INTERVIEW PREPARATION PART-5

Let's discuss about normalization today

   - Normalization is the process of organizing the data in a database to reduce redundancy and improve data integrity. The goal is to divide a database into two or more tables and define relationships between them to reduce redundancy and dependency. There are several normal forms, each with specific rules to help achieve this goal.

Normalization involves multiple steps, usually referred to as "normal forms" (NFs):

   - First Normal Form (1NF): Ensures that the table has a primary key and that each column contains atomic (indivisible) values.
     - Example:
      
       CREATE TABLE customers (
           customer_id INT PRIMARY KEY,
           customer_name VARCHAR(100),
           contact_number VARCHAR(15)
       );
      

   - Second Normal Form (2NF): Achieves 1NF and ensures that all non-key attributes are fully functionally dependent on the primary key. This means removing partial dependencies of any column on the primary key.
     - Example: If a table has a composite key (e.g., order_id, product_id) and some columns depend only on part of that key, those columns should be moved to another table.

   - Third Normal Form (3NF): Achieves 2NF and ensures that all the attributes are functionally dependent only on the primary key. This eliminates transitive dependencies.
     - Example:
      
       CREATE TABLE orders (
           order_id INT PRIMARY KEY,
           customer_id INT,
           order_date DATE,
           FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
       );

       CREATE TABLE order_details (
           order_id INT,
           product_id INT,
           quantity INT,
           PRIMARY KEY (order_id, product_id),
           FOREIGN KEY (order_id) REFERENCES orders(order_id)
       );
      

   - Boyce-Codd Normal Form (BCNF): A stricter version of 3NF where every determinant is a candidate key. This addresses situations where 3NF is not sufficient to eliminate all redundancies.

Tricky Question:
   - How would you approach normalizing a table that contains repeating groups of data?
     - This question tests the understanding of the concept of atomicity and the process of transforming a table into 1NF.

Example Answer:
   - "If a table contains repeating groups, such as multiple phone numbers in one column separated by commas, I would first ensure that each piece of data is atomic. I would create a separate table for the repeating group and link it with a foreign key to the original table, thereby normalizing the data into 1NF."

Go though SQL Learning Series to refresh your basics

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍5011🔥2
SQL INTERVIEW PREPARATION PART-6

Let's discuss about subquery today

- A subquery, also known as an inner query or nested query, is a query within another SQL query. It is used to provide data to the main query (outer query). Subqueries can be used in various clauses such as SELECT, FROM, WHERE, and HAVING.

Types of Subqueries:
- Single-row subquery: Returns a single row and is used with operators like =, <, >.
- Multi-row subquery: Returns multiple rows and is used with operators like IN, ANY, ALL.
- Correlated subquery: A subquery that references columns from the outer query. It is evaluated once for each row processed by the outer query.

Examples:
- Single-row subquery:

     SELECT name
FROM employees
WHERE department_id = (SELECT id FROM departments WHERE department_name = 'Sales');

- Multi-row subquery:

     SELECT name
FROM employees
WHERE department_id IN (SELECT id FROM departments WHERE region = 'North');

- Correlated subquery:

     SELECT e.name
FROM employees e
WHERE e.salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id);


Go though SQL Learning Series to refresh your basics

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍435
SQL INTERVIEW PREPARATION PART-7

Explain the difference between GROUP BY and ORDER BY in SQL.

- GROUP BY: Groups rows that have the same values into summary rows.
- ORDER BY: Sorts the result set in ascending or descending order based on one or more columns.

Tips:
- Mention that GROUP BY is typically used with aggregate functions like COUNT, SUM, AVG, etc., while ORDER BY is used for sorting the result set.
- Provide an example to illustrate the distinction between the two clauses.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍387
SQL INTERVIEW PREPARATION PART-8

How do you find the nth highest salary from a table in SQL?

Answer:
You can use the LIMIT clause in combination with the ORDER BY clause to find the nth highest salary.

Example:
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT n-1, 1;

Replace 'n' with the desired rank of the salary.

Tip: Emphasize the importance of using DISTINCT to handle cases where there are duplicate salaries, and ensure the ORDER BY clause is sorting the salaries in descending order to find the nth highest salary.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍3212
SQL INTERVIEW PREPARATION PART-8

How can you find the second highest salary in a table without using the LIMIT clause?

You can use a subquery to find the maximum salary that is less than the overall maximum salary.

Example:
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

Tip: Explain that this approach can be useful when the LIMIT clause is not supported or if you want to demonstrate proficiency in using subqueries.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍413🔥1
I am planning to parallely start another interview series related to data analytics. What should be the topic?
Anonymous Poll
16%
Excel
28%
Power BI
2%
Alteryx
5%
Tableau
17%
Python
3%
R
22%
Data Analyst Interview (mix of all tools)
7%
Data Science/ ML / AI
👍7111👏3👎1🔥1
SQL INTERVIEW PREPARATION PART-9

What are window functions in SQL and can you provide an example?

Answer:
Window functions perform a calculation across a set of table rows that are somehow related to the current row. Unlike aggregate functions, window functions do not cause rows to become grouped into a single output row.

Example using ROW_NUMBER():
SELECT name, salary, department_id,
ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) as row_num
FROM employees;

In this example, ROW_NUMBER() assigns a unique rank to each row within each department, ordered by salary in descending order.

Tip: Highlight the usefulness of window functions for complex analytics and reporting tasks, where you need to perform calculations across rows while still returning individual rows. Explain other common window functions like RANK(), DENSE_RANK(), and NTILE().

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍518🔥2
Data Analytics
I am planning to parallely start another interview series related to data analytics. What should be the topic?
Glad to see the amazing response. I will start other series parallely with SQL Interview Series very soon :)
👍263🎉1
SQL INTERVIEW PREPARATION PART-10

Explain what a CTE (Common Table Expression) is and provide an example.

Answer:
A Common Table Expression (CTE) is a temporary result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement. CTEs are defined using the WITH keyword and can improve the readability and organization of complex queries.

Example:
WITH EmployeeCTE AS (
SELECT department_id, AVG(salary) as avg_salary
FROM employees
GROUP BY department_id
)
SELECT e.name, e.salary, e.department_id, c.avg_salary
FROM employees e
JOIN EmployeeCTE c ON e.department_id = c.department_id
WHERE e.salary > c.avg_salary;

In this example, the CTE EmployeeCTE calculates the average salary per department, which is then used in the main query to find employees earning above the average salary in their department.

Tip: Explain that CTEs can be particularly useful for breaking down complex queries into more manageable parts, improving both readability and maintainability. They also allow for recursive queries, which can be useful in hierarchical data structures.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍345
SQL INTERVIEW PREPARATION PART-11

What is a self join in SQL? Provide an example.

A self join is a regular join but the table is joined with itself. This is useful when the data is hierarchical or when you need to compare rows within the same table.

Example:
SELECT e1.name AS Employee, e2.name AS Manager
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.id;

In this example, the employees table is joined with itself to find the manager for each employee.

Tip: Explain that self joins are particularly useful for hierarchical data, such as organizational charts, and emphasize the importance of using table aliases (e.g., e1 and e2) to differentiate between the different instances of the same table.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍344👏2
Data Analytics
I am planning to parallely start another interview series related to data analytics. What should be the topic?
Thank you so much for the amazing response again. Since we got maximum votes for Power BI Interview Series, let's start with it.

Power BI Interview Preparation Part-1

1. How can you optimize a Power BI report for performance?

Answer:
Optimizing a Power BI report for performance involves several strategies:
1. Data Modeling:
- Use star schema design to simplify relationships.
- Reduce the number of columns in tables by removing unnecessary data.
- Use numeric keys for relationships instead of text keys.

2. Query Optimization:
- Use Power Query to filter and aggregate data before loading it into Power BI.
- Disable auto-detect relationships to manually manage them.
- Reduce the number of calculated columns and use measures instead.

3. DAX Optimization:
- Use variables to store repeated calculations within DAX expressions.
- Avoid using complex or nested DAX functions that can slow down performance.
- Optimize DAX queries by reducing the number of operations and focusing on simpler calculations.

4. Data Refresh:
- Schedule incremental data refresh instead of full data refresh.
- Ensure efficient data source connectivity, such as using DirectQuery or Import mode appropriately.

5. Visualization Optimization:
- Limit the number of visuals on a single page to reduce the rendering time.
- Use simple visuals and avoid using too many custom visuals.
- Enable the "Optimize for Power BI" option in the settings.

Tips:
- Use the Performance Analyzer tool in Power BI Desktop to identify and diagnose performance bottlenecks.
- Regularly review and refine the data model and DAX formulas as the dataset grows and changes.
- Monitor the report's performance after deployment and make adjustments as necessary based on user feedback and performance metrics.

You can refer these Power BI Resources to learn more

Like this post if you want me to continue this Power BI series 👍♥️

Share with credits: https://news.1rj.ru/str/sqlspecialist

Hope it helps :)
👍5010
SQL INTERVIEW PREPARATION PART-12

How can you detect and delete duplicate rows in a SQL table?

Answer:
To detect and delete duplicate rows, you can use a Common Table Expression (CTE) along with the ROW_NUMBER() window function to identify duplicates and then delete them.

Example to detect duplicates:
WITH DuplicateCTE AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY column1, column2, ... ORDER BY (SELECT NULL)) as row_num
FROM
table_name
)
SELECT * FROM DuplicateCTE WHERE row_num > 1;

Example to delete duplicates:
WITH DuplicateCTE AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY column1, column2, ... ORDER BY (SELECT NULL)) as row_num
FROM
table_name
)
DELETE FROM DuplicateCTE WHERE row_num > 1;

In these examples, replace column1, column2, ... with the columns that define a duplicate.

Tip: Emphasize the importance of carefully choosing the columns in the PARTITION BY clause to accurately identify duplicates. Also, mention that the ORDER BY (SELECT NULL) is used to avoid influencing the row numbering, but a specific column can be used if a specific order is needed.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍275
Power BI Interview Preparation Part-2

2. What are the differences between DirectQuery and Import mode in Power BI, and when should you use each?

Answer:

Import Mode:
- Data Storage: Data is imported and stored in the Power BI Desktop file.
- Performance: Generally faster performance because data is preloaded into memory.
- Refresh Frequency: Requires scheduled data refreshes to keep data up-to-date.
- Data Volume: Can handle larger datasets up to the Power BI dataset size limit.
- Latency: Low latency since data is already loaded into Power BI.
- When to Use: Ideal for static or infrequently updated data, when performance is critical, and when handling large datasets.

DirectQuery Mode:
- Data Storage: Data remains in the source system; only metadata is imported into Power BI.
- Performance: Dependent on the performance of the underlying data source.
- Refresh Frequency: Real-time or near-real-time data access, no need for scheduled refreshes.
- Data Volume: Limited by the performance of the underlying data source, but can handle very large datasets.
- Latency: Higher latency since each query is sent to the data source in real-time.
- When to Use: Ideal for real-time data analysis, when data changes frequently, and when dealing with data sources that cannot be fully imported due to size or security constraints.

Tips:
- Consider using a composite model to leverage the benefits of both Import and DirectQuery modes within the same report.
- Evaluate the performance of the underlying data source when using DirectQuery to ensure it can handle the expected query load.
- Optimize queries and data models in DirectQuery mode to minimize performance issues.

Like this post if you want me to continue this Power BI Interview series 👍♥️

Share with credits: https://news.1rj.ru/str/sqlspecialist

Hope it helps :)
👍368
SQL INTERVIEW PREPARATION PART-13

Explain the difference between DELETE and TRUNCATE commands in SQL.

Answer:
- DELETE: This command is used to delete rows from a table based on a condition. It is a DML (Data Manipulation Language) operation. Each row deletion is logged individually, and you can use a WHERE clause to specify which rows to delete. It can be rolled back if used within a transaction.

  DELETE FROM employees WHERE department_id = 10;

- TRUNCATE: This command is used to delete all rows from a table. It is a DDL (Data Definition Language) operation. It is faster than DELETE because it deallocates the data pages rather than logging individual row deletions. It cannot be rolled back in some databases because it does not log individual row deletions.

  TRUNCATE TABLE employees;

Tip: Mention that DELETE can be more flexible when you need to remove specific rows, while TRUNCATE is more efficient for removing all rows. Also, point out that TRUNCATE resets any auto-increment counters on the table and may require higher privileges compared to DELETE.

Share with credits: https://news.1rj.ru/str/sqlspecialist

Like this post if you want me to continue SQL Interview Preparation Series 👍❤️

Hope it helps :)
👍2910