Data Analyst Interview Resources – Telegram
Data Analyst Interview Resources
52K subscribers
257 photos
1 video
53 files
321 links
Join our telegram channel to learn how data analysis can reveal fascinating patterns, trends, and stories hidden within the numbers! 📊

For ads & suggestions: @love_data
Download Telegram
1. What are the different subsets of SQL?

Data Definition Language (DDL) – It allows you to perform various operations on the database such as CREATE, ALTER, and DELETE objects.
Data Manipulation Language(DML) – It allows you to access and manipulate data. It helps you to insert, update, delete and retrieve data from the database.
Data Control Language(DCL) – It allows you to control access to the database. Example – Grant, Revoke access permissions.

2. List the different types of relationships in SQL.

There are different types of relations in the database:
One-to-One – This is a connection between two tables in which each record in one table corresponds to the maximum of one record in the other.
One-to-Many and Many-to-One – This is the most frequent connection, in which a record in one table is linked to several records in another.
Many-to-Many – This is used when defining a relationship that requires several instances on each sides.
Self-Referencing Relationships – When a table has to declare a connection with itself, this is the method to employ.

3. How to create empty tables with the same structure as another table?

To create empty tables:
Using the INTO operator to fetch the records of one table into a new table while setting a WHERE clause to false for all entries, it is possible to create empty tables with the same structure. As a result, SQL creates a new table with a duplicate structure to accept the fetched entries, but nothing is stored into the new table since the WHERE clause is active.

4. What is Normalization and what are the advantages of it?

Normalization in SQL is the process of organizing data to avoid duplication and redundancy. Some of the advantages are:
Better Database organization
More Tables with smaller rows
Efficient data access
Greater Flexibility for Queries
Quickly find the information
Easier to implement Security
2👍1
1. What is the difference between the RANK() and DENSE_RANK() functions?

The RANK() function in the result set defines the rank of each row within your ordered partition. If both rows have the same rank, the next number in the ranking will be the previous rank plus a number of duplicates. If we have three records at rank 4, for example, the next level indicated is 7. The DENSE_RANK() function assigns a distinct rank to each row within a partition based on the provided column value, with no gaps. If we have three records at rank 4, for example, the next level indicated is 5.

2. Explain One-hot encoding and Label Encoding. How do they affect the dimensionality of the given dataset?

One-hot encoding is the representation of categorical variables as binary vectors. Label Encoding is converting labels/words into numeric form. Using one-hot encoding increases the dimensionality of the data set. Label encoding doesn’t affect the dimensionality of the data set. One-hot encoding creates a new variable for each level in the variable whereas, in Label encoding, the levels of a variable get encoded as 1 and 0.

3. Explain the Difference Between Tableau Worksheet, Dashboard, Story, and Workbook in Tableau?

Tableau uses a workbook and sheet file structure, much like Microsoft Excel.
A workbook contains sheets, which can be a worksheet, dashboard, or a story.
A worksheet contains a single view along with shelves, legends, and the Data pane.
A dashboard is a collection of views from multiple worksheets.
A story contains a sequence of worksheets or dashboards that work together to convey information.

4. How can you split a column into 2 or more columns?

You can split a column into 2 or more columns by following the below steps:
1. Select the cell that you want to split. Then, navigate to the Data tab, after that, select Text to Columns. 2. Select the delimiter. 3. Choose the column data format and select the destination you want to display the split. 4. The final output will look like below where the text is split into multiple columns.

5. Do you wanna make your career in Data Science & Analytics but don't know how to start ?

https://news.1rj.ru/str/sqlspecialist/851

Here are free resources that will make you technically strong enough to crack any Data Analyst and also learn Pro Career Growth Hacks to land on your Dream Job.
👍2
Data Analyst Scenario based Question and Answers 👇👇

1. Scenario: Creating a Dynamic Sales Growth Report in Power BI
Approach:
Load Data: Import sales data and calendar tables.
Data Model: Establish a relationship between the sales and calendar tables.
Create Measures:
Current Sales: Current Sales = SUM(Sales[Amount]).
Previous Year Sales: Previous Year Sales = CALCULATE(SUM(Sales[Amount]), DATEADD(Calendar[Date], -1, YEAR)).
Sales Growth: Sales Growth = [Current Sales] - [Previous Year Sales].
Visualization:
Use Line Chart for trends.
Use Card Visual for displaying numeric growth values.
Slicers and Filters: Add slicers for selecting specific time periods.

2. Scenario: Identifying Top 5 Customers by Revenue in SQL
Approach:
Understand the Schema: Know the relevant tables and columns, e.g., Orders table with CustomerID and Revenue.
SQL Query:
SELECT TOP 5 CustomerID, SUM(Revenue) AS TotalRevenue
FROM Orders
GROUP BY CustomerID
ORDER BY TotalRevenue DESC;

3. Scenario: Creating a Monthly Sales Forecast in Power BI
Approach:
Load Historical Data: Import historical sales data.
Data Model: Ensure proper relationships.
Time Series Analysis:
Use built-in Power BI forecasting features.
Create measures for historical and forecasted sales.
Visualization:
Use a Line Chart to display historical and forecasted sales.
Adjust Forecast Parameters: Customize the forecast length and confidence intervals.

4. Scenario: Updating a SQL Table with New Data
Approach:
Understand the Schema: Identify the table and columns to be updated.
SQL Query:
UPDATE Employees
SET JobTitle = 'Senior Developer'
WHERE EmployeeID = 1234;

5. Scenario: Creating a Custom KPI in Power BI
Approach:
Define KPI: Identify the key performance indicators.
Create Measures:
Define the KPI measure using DAX.
Visualization:
Use KPI Visual or Card Visual.
Configure the target and actual values.
Conditional Formatting: Apply conditional formatting based on the KPI thresholds.

Data Analytics Resources
👇👇
https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02

Hope it helps :)
1👍1
1. What is Data Integrity?

Data Integrity is the assurance of accuracy and consistency of data over its entire life-cycle and is a critical aspect of the design, implementation, and usage of any system which stores, processes, or retrieves data. It also defines integrity constraints to enforce business rules on the data when it is entered into an application or a database.

2. What is the Difference Between Joining and Blending in Tableau?

Combining the data from two or more different sources is data blending, such as Oracle, Excel, and SQL Server. In data blending, each data source contains its own set of dimensions and measures. Combining the data between two or more tables or sheets within the same data source is data joining. All the combined tables or sheets contain a common set of dimensions and measures.

3. What is slicing in Python?
As the name suggests, ‘slicing’ is taking parts of.
Syntax for slicing is [start : stop : step]
start is the starting index from where to slice a list or tuple
stop is the ending index or where to stop.
step is the number of steps to jump.
Default value for start is 0, stop is number of items, step is 1.
Slicing can be done on strings, arrays, lists, and tuples.

4. What is the difference between NOW() and CURRENT_DATE() in SQL?

NOW() returns a constant time that indicates the time at which the statement began to execute. (Within a stored function or trigger, NOW() returns the time at which the function or triggering statement began to execute.

The simple difference between NOW() and CURRENT_DATE() is that NOW() will fetch the current date and time both in format ‘YYYY-MM_DD HH:MM:SS’ while CURRENT_DATE() will fetch the date of the current day ‘YYYY-MM_DD’.
3
The Shift in Data Analyst Roles: What You Should Apply for in 2025

The traditional “Data Analyst” noscript is gradually declining in demand in 2025 not because data is any less important, but because companies are getting more specific in what they’re looking for.

Today, many roles that were once grouped under “Data Analyst” are now split into more domain-focused noscripts, depending on the team or function they support.

Here are some roles gaining traction:
* Business Analyst
* Product Analyst
* Growth Analyst
* Marketing Analyst
* Financial Analyst
* Operations Analyst
* Risk Analyst
* Fraud Analyst
* Healthcare Analyst
* Technical Analyst
* Business Intelligence Analyst
* Decision Support Analyst
* Power BI Developer
* Tableau Developer

Focus on the skillsets and business context these roles demand.

Whether you're starting out or transitioning, look beyond "Data Analyst" and align your profile with industry-specific roles. It’s not about the noscript—it’s about the value you bring to a team.
👍41
1. What is Data Integrity?

Data Integrity is the assurance of accuracy and consistency of data over its entire life-cycle and is a critical aspect of the design, implementation, and usage of any system which stores, processes, or retrieves data. It also defines integrity constraints to enforce business rules on the data when it is entered into an application or a database.

2. What is the Difference Between Joining and Blending in Tableau?

Combining the data from two or more different sources is data blending, such as Oracle, Excel, and SQL Server. In data blending, each data source contains its own set of dimensions and measures. Combining the data between two or more tables or sheets within the same data source is data joining. All the combined tables or sheets contain a common set of dimensions and measures.

3. What is slicing in Python?
As the name suggests, ‘slicing’ is taking parts of.
Syntax for slicing is [start : stop : step]
start is the starting index from where to slice a list or tuple
stop is the ending index or where to stop.
step is the number of steps to jump.
Default value for start is 0, stop is number of items, step is 1.
Slicing can be done on strings, arrays, lists, and tuples.

4. What is the difference between NOW() and CURRENT_DATE() in SQL?

NOW() returns a constant time that indicates the time at which the statement began to execute. (Within a stored function or trigger, NOW() returns the time at which the function or triggering statement began to execute.

The simple difference between NOW() and CURRENT_DATE() is that NOW() will fetch the current date and time both in format ‘YYYY-MM_DD HH:MM:SS’ while CURRENT_DATE() will fetch the date of the current day ‘YYYY-MM_DD’.
4
Data analysis can be categorized into four types: denoscriptive, diagnostic, predictive, and prenoscriptive analysis. Denoscriptive analysis summarizes raw data, diagnostic analysis determines why something happened, predictive analysis uses past data to predict the future, and prenoscriptive analysis suggests actions based on predictions.

Data analysis is a comprehensive method that involves inspecting, cleansing, transforming, and modeling data to discover useful information, make conclusions, and support decision-making. It's a process that empowers organizations to make informed decisions, predict trends, and improve operational efficiency.

The data analysis process involves several steps, including defining objectives and questions, data collection, data cleaning, data analysis, data interpretation and visualization, and data storytelling. Each step is crucial to ensuring the accuracy and usefulness of the results.

There are various data analysis techniques, including exploratory analysis, regression analysis, Monte Carlo simulation, factor analysis, cohort analysis, cluster analysis, time series analysis, and sentiment analysis. Each has its unique purpose and application in interpreting data.

Data analysis typically utilizes tools such as Python, R, SQL for programming, and Power BI, Tableau, and Excel for visualization and data management

You can start learning data analysis by understanding the basics of statistical concepts, data types, and structures. Then learn a programming language like Python or R, master data manipulation and visualization, and delve into specific data analysis techniques.
2
Complete SQL Topics for Data Analysts 😄👇

1. Introduction to SQL:
- Basic syntax and structure
- Understanding databases and tables

2. Querying Data:
- SELECT statement
- Filtering data using WHERE clause
- Sorting data with ORDER BY

3. Joins:
- INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN
- Combining data from multiple tables

4. Aggregation Functions:
- GROUP BY
- Aggregate functions like COUNT, SUM, AVG, MAX, MIN

5. Subqueries:
- Using subqueries in SELECT, WHERE, and HAVING clauses

6. Data Modification:
- INSERT, UPDATE, DELETE statements
- Transactions and Rollback

7. Data Types and Constraints:
- Understanding various data types (e.g., INT, VARCHAR)
- Using constraints (e.g., PRIMARY KEY, FOREIGN KEY)

8. Indexes:
- Creating and managing indexes for performance optimization

9. Views:
- Creating and using views for simplified querying

10. Stored Procedures and Functions:
- Writing and executing stored procedures
- Creating and using functions

11. Normalization:
- Understanding database normalization concepts

12. Data Import and Export:
- Importing and exporting data using SQL

13. Window Functions:
- ROW_NUMBER(), RANK(), DENSE_RANK(), and others

14. Advanced Filtering:
- Using CASE statements for conditional logic

15. Advanced Join Techniques:
- Self-joins and other advanced join scenarios

16. Analytical Functions:
- LAG(), LEAD(), OVER() for advanced analytics

17. Working with Dates and Times:
- Date and time functions and formatting

18. Performance Tuning:
- Query optimization strategies

19. Security:
- Understanding SQL injection and best practices for security

20. Handling NULL Values:
- Dealing with NULL values in queries

Ensure hands-on practice on these topics to strengthen your SQL skills.

Since SQL is one of the most essential skill for data analysts, I have decided to teach each topic daily in this channel for free. Like this post if you want me to continue this SQL series 👍♥️

Share with credits: https://news.1rj.ru/str/sqlspecialist

Hope it helps :)
2
Data Analyst Interview!

𝐑𝐨𝐮𝐧𝐝 1: Technical Round - 15 mins
1. Tell me about yourself
2. Tell me about your experience
3. What is VLookup, when we are using VLookup what do we have to check before applying?
4. Are you familiar with dashboards and generating reports
5. How do you generate reports generally
6. How to delete duplicates in Power BI
7. In Power BI do you know how to draw all charts
8. Do you have any questions?

𝐑𝐨𝐮𝐧𝐝 2: Manager Round - 30 mins
1. Tell me about yourself
2. Tell me about our Organization
3. Tell me about your work experience
4. To whom do you report usually
5. Why do you choose this role
6. Why this organization only
7. Why do you think you will be suitable for this role
8. Do you have any questions

I have curated best 80+ top-notch Data Analytics Resources 👇👇
https://whatsapp.com/channel/0029VaGgzAk72WTmQFERKh02

Hope this helps you 😊
4👍1
Quick Recap of Power BI Concepts

1️⃣ Power Query: The data transformation engine that lets you clean, reshape, and combine data before loading it into Power BI.

2️⃣ Data Model: A structure of tables, relationships, and calculated fields that supports report creation.

3️⃣ Relationships: Connections between tables that allow you to create reports using data from multiple tables.

4️⃣ DAX (Data Analysis Expressions): A formula language used for creating calculated columns, measures, and custom tables.

5️⃣ Visualizations: Graphical representations of data, such as bar charts, line charts, maps, and tables.

6️⃣ Slicers: Interactive filters added to reports to help users refine data views.

7️⃣ Measures: Calculations created using DAX that perform dynamic aggregations based on the context in your report.

8️⃣ Calculated Columns: Static columns created using DAX expressions that perform row-by-row calculations.

9️⃣ Reports: A collection of visualizations, text, and slicers that tell a story using your data.

🔟 Power BI Service: The online platform where you publish, share, and collaborate on Power BI reports and dashboards.

I have curated the best interview resources to crack Power BI Interviews 👇👇
https://news.1rj.ru/str/DataSimplifier

Hope you'll like it

Like this post if you need more content like this 👍❤️

Share with credits: https://news.1rj.ru/str/sqlspecialist

Hope it helps :)
1
Hey guys,

Today, let’s talk about SQL conceptual questions that are often asked in data analyst interviews. These questions test not only your technical skills but also your conceptual understanding of SQL and its real-world applications.

1. What is the difference between SQL and NoSQL?

- SQL (Structured Query Language) is a relational database management system, meaning it uses tables (rows and columns) to store data.
- NoSQL databases, on the other hand, handle unstructured data and don’t rely on a schema, making them more flexible in terms of data storage and retrieval.
- Interview Tip: Don't just memorize definitions. Be prepared to explain scenarios where you’d use SQL over NoSQL, and vice versa.

2. What is the difference between INNER JOIN and OUTER JOIN?

- An INNER JOIN returns records that have matching values in both tables.
- An OUTER JOIN returns all records from one table and the matched records from the second table. If there's no match, NULL values are returned.

3. How do you optimize a SQL query for better performance?

- Indexing: Create indexes on columns used frequently in WHERE, JOIN, or GROUP BY clauses.
- Query optimization: Use appropriate WHERE clauses to reduce the data set and avoid unnecessary calculations.
- Avoid SELECT *: Always specify the columns you need to reduce the amount of data retrieved.
- Limit results: If you only need a subset of the data, use the LIMIT clause.

4. What are the different types of SQL constraints?

Constraints are used to enforce rules on data in a table. They ensure the accuracy and reliability of the data. The most common types are:

- PRIMARY KEY: Ensures each record is unique and not null.
- FOREIGN KEY: Enforces a relationship between two tables.
- UNIQUE: Ensures all values in a column are unique.
- NOT NULL: Prevents NULL values from being entered into a column.
- CHECK: Ensures a column's values meet a specific condition.

5. What is normalization? What are the different normal forms?

Normalization is the process of organizing data to reduce redundancy and improve data integrity. Here’s a quick overview of normal forms:

- 1NF (First Normal Form): Ensures that all values in a table are atomic (indivisible).
- 2NF (Second Normal Form): Ensures that the table is in 1NF and that all non-key columns are fully dependent on the primary key.
- 3NF (Third Normal Form): Ensures that the table is in 2NF and all columns are independent of each other except for the primary key.

6. What is a subquery?

A subquery is a query within another query. It's used to perform operations that need intermediate results before generating the final query.

Example:
SELECT employee_id, name
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

In this case, the subquery calculates the average salary, and the outer query selects employees whose salary is greater than the average.

7. What is the difference between a UNION and a UNION ALL?

- UNION combines the result sets of two SELECT statements and removes duplicates.
- UNION ALL combines the result sets and includes duplicates.

8. What is the difference between WHERE and HAVING clause?

- WHERE filters rows before any groupings are made. It’s used with SELECT, INSERT, UPDATE, or DELETE statements.
- HAVING filters groups after the GROUP BY clause.

9. How would you handle NULL values in SQL?

NULL values can represent missing or unknown data. Here’s how to manage them:

- Use IS NULL or IS NOT NULL in WHERE clauses to filter null values.
- Use COALESCE() or IFNULL() to replace NULL values with default ones.

Example:
SELECT name, COALESCE(age, 0) AS age
FROM employees;


10. What is the purpose of the GROUP BY clause?

The GROUP BY clause groups rows with the same values into summary rows. It’s often used with aggregate functions like COUNT, SUM, AVG, etc.

Example:
SELECT department, COUNT(*)
FROM employees
GROUP BY department;


Here you can find SQL Interview Resources👇
https://news.1rj.ru/str/DataSimplifier

Share with credits: https://news.1rj.ru/str/sqlspecialist

Hope it helps :)
2
Breaking into Data Analytics doesn’t need to be complicated.

If you’re just starting out,

Here’s how to simplify your approach:

Avoid:
🚫 Jumping into advanced tools like Hadoop or Spark before mastering the basics.
🚫 Focusing only on tools, not on business problem-solving.
🚫 Collecting certificates instead of solving real problems.
🚫 Thinking you need to know everything from SQL to machine learning right away.

Instead:
Start with Excel, SQL, and one visualization tool (like Power BI or Tableau).
Learn how to clean, explore, and interpret data to solve business questions.
Understand core concepts like KPIs, dashboards, and business metrics.
Pick real datasets and analyze them with clear goals and insights.
Build a portfolio that shows you can translate data into decisions.

React ❤️ for more
2
How Data Analysts use Python 👆
2
Data Analyst Interview Questions with Answers

1. What is the difference between the RANK() and DENSE_RANK() functions?

The RANK() function in the result set defines the rank of each row within your ordered partition. If both rows have the same rank, the next number in the ranking will be the previous rank plus a number of duplicates. If we have three records at rank 4, for example, the next level indicated is 7. The DENSE_RANK() function assigns a distinct rank to each row within a partition based on the provided column value, with no gaps. If we have three records at rank 4, for example, the next level indicated is 5.

2. Explain One-hot encoding and Label Encoding. How do they affect the dimensionality of the given dataset?

One-hot encoding is the representation of categorical variables as binary vectors. Label Encoding is converting labels/words into numeric form. Using one-hot encoding increases the dimensionality of the data set. Label encoding doesn’t affect the dimensionality of the data set. One-hot encoding creates a new variable for each level in the variable whereas, in Label encoding, the levels of a variable get encoded as 1 and 0.

3. What is the shortcut to add a filter to a table in EXCEL?

The filter mechanism is used when you want to display only specific data from the entire dataset. By doing so, there is no change being made to the data. The shortcut to add a filter to a table is Ctrl+Shift+L.

4. What is DAX in Power BI?

DAX stands for Data Analysis Expressions. It's a collection of functions, operators, and constants used in formulas to calculate and return values. In other words, it helps you create new info from data you already have.

5. Define shelves and sets in Tableau?

Shelves: Every worksheet in Tableau will have shelves such as columns, rows, marks, filters, pages, and more. By placing filters on shelves we can build our own visualization structure. We can control the marks by including or excluding data.
Sets: The sets are used to compute a condition on which the dataset will be prepared. Data will be grouped together based on a condition. Fields which is responsible for grouping are known assets. For example – students having grades of more than 70%.

React ❤️ for more
2
Data Analytics Interview Questions
1