Data Analytics – Telegram
Perfect channel to learn Data Analytics

Learn SQL, Python, Alteryx, Tableau, Power BI and many more

For Promotions: @coderfun @love_data
Data Analyst Interview QnA

1. Find the average salary for each department in the employees table.

Answer -
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id;
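
To try the query, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the emp_id column are made up for illustration; only the table name, department_id, and salary come from the answer above.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the original post).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id INTEGER, department_id INTEGER, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, 10, 50000), (2, 10, 70000), (3, 20, 40000), (4, 20, 60000), (5, 20, 80000)],
)

# Same query as in the answer, with ORDER BY added so the output order is stable.
avg_by_department = conn.execute(
    """
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
    ORDER BY department_id
    """
).fetchall()

for department_id, avg_salary in avg_by_department:
    print(department_id, avg_salary)
```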


2. What does Filter context in DAX mean?

Answer - Filter context in DAX is the set of filters applied to the data model before a measure or expression is evaluated, so the calculation only sees the matching subset of rows. It comes from report elements such as slicers, visuals (for example, the row and column headers of a matrix), and the Filters pane, and it can also be modified inside DAX with functions such as CALCULATE.

3. Explain how to implement Row-Level Security (RLS) in Power BI.

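Answer - Row-Level Security (RLS) in Power BI can be implemented by:

- Creating roles in Power BI Desktop (Modeling > Manage roles).
- Defining a DAX filter expression on each role's tables that specifies the rows the role can access.
- Publishing the report and assigning users or security groups to these roles in the Power BI service. For dynamic RLS, the DAX expression filters on the signed-in user with USERPRINCIPALNAME().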

4. Create a dictionary, add elements to it, modify an element, and then print the dictionary in alphabetical order of keys.

Answer -
d = {'apple': 2, 'banana': 5}
d['orange'] = 3 # Add element
d['apple'] = 4 # Modify element
sorted_d = dict(sorted(d.items())) # Sort dictionary
print(sorted_d)


5. Find and print duplicate values in a list of assorted numbers, along with the number of times each value is repeated.

Answer -
from collections import Counter

numbers = [1, 2, 2, 3, 4, 5, 1, 6, 7, 3, 8, 1]
count = Counter(numbers)
duplicates = {k: v for k, v in count.items() if v > 1}
print(duplicates)
Few ways to optimise SQL Queries 👇👇

Use Indexing: Properly indexing your database tables can significantly speed up query performance by allowing the database to quickly locate the rows needed for a query.

Optimize Joins: Minimize the number of joins and use appropriate join types (e.g., INNER JOIN, LEFT JOIN) to ensure efficient data retrieval.

Avoid SELECT * : Instead of selecting all columns using SELECT *, explicitly specify only the columns needed for the query to reduce unnecessary data transfer and processing overhead.

Use WHERE Clause Wisely: Filter rows early in the query using WHERE clause to reduce the dataset size before joining or aggregating data.

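Avoid Unnecessary Subqueries: Correlated subqueries that run once per row can often be rewritten as JOINs. Common Table Expressions (CTEs) mainly improve readability, so check the execution plan to confirm any performance gain.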

Limit the Use of DISTINCT: Minimize the use of DISTINCT, as duplicate removal requires sorting or hashing the result set, which can be resource-intensive for large datasets.

Optimize GROUP BY and ORDER BY: Use GROUP BY and ORDER BY clauses judiciously, and ensure that they are using indexed columns whenever possible to avoid unnecessary sorting.

Consider Partitioning: Partition large tables (for example, by date range) so that queries filtering on the partition key scan only the relevant partitions, which reduces I/O operations.

Monitor Query Performance: Regularly monitor query performance using tools like query execution plans, database profiler, and performance monitoring tools to identify and address bottlenecks.
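
As a rough illustration of the indexing and monitoring tips, the sketch below uses Python's built-in sqlite3 module to compare the query plan before and after adding an index. The orders table, its columns, and the index name are hypothetical, and the exact plan wording varies between SQLite versions and other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# Select only the needed columns and filter early with WHERE.
query = "SELECT order_id, amount FROM orders WHERE customer_id = ?"

def plan(sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_before = plan(query, (42,))  # no index yet, so a full table scan is expected
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query, (42,))   # the planner should now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```

Other databases expose the same idea through their own tools (for example EXPLAIN in PostgreSQL and MySQL), so the habit of checking the plan carries over even though the output format does not.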

React ❤️ for more
Which keyword sorts the result set?
Anonymous Quiz
- SORT BY (39%)
- ORDER (11%)
- ALIGN BY (1%)
- ORDER BY (49%)
Which function counts the number of rows?
Anonymous Quiz
- SUM() (6%)
- COUNT() (87%)
- TOTAL() (4%)
- NUMBER() (3%)
Which SQL function returns the value from a subsequent row in the table?
Anonymous Quiz
- LEAD() (36%)
- LAG() (21%)
- NEXT() (28%)
- FOLLOW() (15%)
Which window function assigns a unique sequential integer to each row within a partition in SQL?
Anonymous Quiz
- RANK() (27%)
- DENSE_RANK() (27%)
- NTILE() (5%)
- ROW_NUMBER() (41%)
Data Analyst Interview Questions

Q1: How would you analyze data to understand user connection patterns on a professional network?

Ans: I'd use graph databases like Neo4j for social network analysis. By analyzing connection patterns, I can identify influencers or isolated communities.

Q2: Describe a challenging data visualization you created to represent user engagement metrics.

Ans: I visualized multi-dimensional data showing user engagement across features, regions, and time using tools like D3.js, creating an interactive dashboard with drill-down capabilities.

Q3: How would you identify and target passive job seekers on LinkedIn?

Ans: I'd analyze user behavior patterns, like increased profile updates, frequent visits to job postings, or engagement with career-related content, to identify potential passive job seekers.

Q4: How do you measure the effectiveness of a new feature launched on LinkedIn?


Ans: I'd set up A/B tests, comparing user engagement metrics between those who have access to the new feature and a control group. I'd then analyze metrics like time spent, feature usage frequency, and overall platform engagement to measure effectiveness.
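
One way to make the A/B comparison in Q4 concrete is a two-proportion z-test on an engagement rate. The sketch below uses only the standard library; the user counts are invented for illustration, and "engaged at least once" is an assumed metric, not one named in the answer.

```python
import math

def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for the difference between two proportions."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Hypothetical counts: users who engaged at least once during the test window.
control_rate, treatment_rate, z_score, p_value = two_proportion_z_test(
    success_a=200, total_a=2000,  # control group (no access to the feature)
    success_b=260, total_b=2000,  # treatment group (new feature enabled)
)
print(f"control={control_rate:.1%} treatment={treatment_rate:.1%} "
      f"z={z_score:.2f} p={p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be noise, but it says nothing about whether the lift is large enough to matter, so the effect size should be reported alongside it.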

Hope it helps :)
You’re not a failure as a data analyst if:

• It takes you more than two months to land a job (remove the time expectation!)

• Complex concepts don’t immediately sink in

• You use Google/YouTube daily on the job (this is a sign you’re successful, actually)

• You don’t make as much money as others in the field

• You don’t code in 12 different languages (SQL is all you need. Add Python later if you want.)
Which of the following is NOT a primary component of Power BI?
Anonymous Quiz
- Power BI Desktop (8%)
- Power BI Service (6%)
- Power BI Mobile (31%)
- Power BI Code Editor (54%)
Which of the following is NOT a valid data source that Power BI can connect to directly?
Anonymous Quiz
- Excel (4%)
- SQL Server (5%)
- Web page (12%)
- Adobe Photoshop (79%)
Scenario-Based Interview Questions & Answers for Data Analysts

1. Scenario: You are working on a SQL database that stores customer information. The database has a table called "Orders" that contains order details. Your task is to write a SQL query to retrieve the total number of orders placed by each customer.
  Question:
  - Write a SQL query to find the total number of orders placed by each customer.
Expected Answer:
    SELECT CustomerID, COUNT(*) AS TotalOrders
    FROM Orders
    GROUP BY CustomerID;

2. Scenario: You are working on a SQL database that stores employee information. The database has a table called "Employees" that contains employee details. Your task is to write a SQL query to retrieve the names of all employees who have been with the company for more than 5 years.
  Question:
  - Write a SQL query to find the names of employees who have been with the company for more than 5 years.
Expected Answer:
    SELECT Name
    FROM Employees
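    -- SQL Server syntax. DATEDIFF(year, ...) counts year boundaries crossed, not full years, so compare against a cutoff date instead.
    WHERE HireDate < DATEADD(year, -5, GETDATE());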
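
The two SQL scenarios above can be tried locally with Python's built-in sqlite3 module. This is a sketch under assumptions: the sample rows and the OrderID column are invented, and because SQLite has no GETDATE(), the tenure filter is rewritten with SQLite's date() function as a substitute for the SQL Server answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data for the Orders scenario.
conn.execute("CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER)")
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(1, 101), (2, 101), (3, 102), (4, 101)])

# Hypothetical sample data for the Employees scenario (ISO date strings).
conn.execute("CREATE TABLE Employees (Name TEXT, HireDate TEXT)")
conn.execute("INSERT INTO Employees VALUES ('Asha', '2010-03-15')")
# Hire date computed relative to today so the example stays valid over time.
conn.execute("INSERT INTO Employees VALUES ('Ben', date('now', '-1 years'))")

orders_per_customer = conn.execute(
    "SELECT CustomerID, COUNT(*) AS TotalOrders "
    "FROM Orders GROUP BY CustomerID ORDER BY CustomerID"
).fetchall()

# SQLite substitute for the tenure filter: hired before the date five years ago.
long_tenure = conn.execute(
    "SELECT Name FROM Employees WHERE HireDate < date('now', '-5 years')"
).fetchall()

print(orders_per_customer)
print(long_tenure)
```

Comparing HireDate against a cutoff date, rather than wrapping it in a function, also leaves the database free to use an index on that column.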

Power BI Scenario-Based Questions

1. Scenario: You have been given a dataset in Power BI that contains sales data for a company. Your task is to create a report that shows the total sales by product category and region.
    Expected Answer:
    - Load the dataset into Power BI.
    - Create relationships if necessary.
    - Use the "Fields" pane to select the necessary fields (Product Category, Region, Sales).
    - Add a new visualization (e.g., a matrix or clustered bar chart): place Product Category and Region on the rows or axis, and drag Sales into the "Values" area so it is summed.
    - Use the "Filters" pane to filter data as needed.
    - Format the visualization to enhance clarity and readability.

2. Scenario: You have been asked to create a Power BI dashboard that displays real-time stock prices for a set of companies. The stock prices are available through an API.
  Expected Answer:
    - Use Power BI Desktop to connect to the API.
    - Go to "Get Data" > "Web" and enter the API URL.
    - Configure data refresh to match how fresh the prices need to be. The Web connector imports data, so a scheduled refresh only updates it periodically; for near-real-time prices, consider pushing the API data into a streaming dataset instead.
    - Create visualizations using the imported data.
    - Publish the report to the Power BI service and set up a data gateway only if the source requires one (for example, an API hosted on-premises).

3. Scenario: You have been given a Power BI report that contains multiple visualizations. The report is taking a long time to load and is impacting the performance of the application.
    Expected Answer:
    - Analyze the current performance using Performance Analyzer.
    - Optimize data model by reducing the number of columns and rows, and removing unnecessary calculations.
    - Use aggregated tables to pre-compute results.
    - Simplify DAX calculations.
    - Optimize visualizations by reducing the number of visuals per page and avoiding complex custom visuals.
    - If the report uses DirectQuery, ensure proper indexing on the data source, since each visual sends queries to it.

Free SQL Resources: https://whatsapp.com/channel/0029VanC5rODzgT6TiTGoa1v

Like if you need more similar content

Hope it helps :)
What is the correct syntax to print "Hello, World!" in Python?
Anonymous Quiz
- print("Hello, World!") (87%)
- echo "Hello, World!" (3%)
- printf("Hello, World!") (9%)
Which data type is used to store a sequence of characters in Python?
Anonymous Quiz
- Integer (13%)
- Float (4%)
- String (79%)
- Boolean (4%)
Which of the following is a valid way to define a list in Python?
Anonymous Quiz
- my_list = (1, 2, 3) (13%)
- my_list = {1, 2, 3} (18%)
- my_list = [1, 2, 3] (66%)
- my_list = "1, 2, 3" (3%)
Which loop is used to iterate over a sequence (like a list or a string) in Python?
Anonymous Quiz
- while loop (31%)
- for loop (61%)
- if loop (8%)
What will be the output of the following code?

my_string = "Python"
print(len(my_string))
Anonymous Quiz
- 5 (14%)
- 6 (61%)
- 7 (6%)
- Error (18%)
SQL interviews LOVE to test you on window functions. Here’s a list of the most popular ones.

👇 Most Tested Window Functions

* RANK() - gives a rank to each row in a partition based on a specified ordering; tied rows share a rank, and the ranks that follow are skipped

* DENSE_RANK() - gives a rank to each row like RANK(), but DOESN'T skip rank values after ties

* ROW_NUMBER() - gives a unique integer to each row in a partition based on the order of the rows

* LEAD() - retrieves a value from a subsequent row in a partition based on a specified column or expression

* LAG() - retrieves a value from a previous row in a partition based on a specified column or expression

* NTH_VALUE() - retrieves the value from the nth row of the window frame within a partition
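
To see how these functions differ on tied values, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The scores table and its rows are hypothetical. No PARTITION BY is used, so the whole table acts as a single partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
# Hypothetical data: 'a' and 'b' tie on purpose to show RANK vs DENSE_RANK.
conn.executemany("INSERT INTO scores VALUES (?, ?)", [("a", 90), ("b", 90), ("c", 80), ("d", 70)])

rows = conn.execute(
    """
    SELECT
        name,
        score,
        RANK()       OVER (ORDER BY score DESC)       AS rnk,
        DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk,
        ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
        LEAD(score)  OVER (ORDER BY score DESC, name) AS next_score,
        LAG(score)   OVER (ORDER BY score DESC, name) AS prev_score,
        NTH_VALUE(score, 2) OVER (
            ORDER BY score DESC, name
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ) AS second_score
    FROM scores
    ORDER BY row_num
    """
).fetchall()

for row in rows:
    print(row)
```

The name column is added to the ordering for ROW_NUMBER(), LEAD(), and LAG() so that tied rows come out in a predictable order, and the explicit frame on NTH_VALUE() makes it look at the whole partition rather than only the rows up to the current one.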

React ❤️ for the detailed explanation