Data Analytics Interview Questions with Answers Part-1: 📱
1. What is the difference between data analysis and data analytics?
⦁ Data analysis involves inspecting, cleaning, and modeling data to discover useful information and patterns for decision-making.
⦁ Data analytics is a broader process that includes data collection, transformation, analysis, and interpretation, often involving predictive and prescriptive techniques to drive business strategies.
2. Explain the data cleaning process you follow.
⦁ Identify missing, inconsistent, or corrupt data.
⦁ Handle missing data by imputation (mean, median, mode) or removal if appropriate.
⦁ Standardize formats (dates, strings).
⦁ Remove duplicates.
⦁ Detect and treat outliers.
⦁ Validate cleaned data against known business rules.
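The steps above can be sketched with the Python standard library alone. This is a minimal illustration, not a production pipeline: the records, the field names, the two accepted date formats, and the age rule are all assumptions made up for the example.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw records; field names and values are invented for illustration.
raw_rows = [
    {"id": 1, "name": " Alice ", "signup": "2024-01-05", "age": 34},
    {"id": 2, "name": "BOB", "signup": "05/02/2024", "age": None},
    {"id": 2, "name": "BOB", "signup": "05/02/2024", "age": None},  # duplicate key
    {"id": 3, "name": "carol", "signup": "2024-03-10", "age": 29},
]


def standardize_date(text):
    """Return an ISO date string, trying each assumed input format in turn."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def clean(rows):
    # Standardize string and date formats (new dicts, so the input is not mutated).
    cleaned = [
        {**r, "name": r["name"].strip().title(), "signup": standardize_date(r["signup"])}
        for r in rows
    ]
    # Remove duplicates by key, keeping the first row seen for each id.
    seen, unique = set(), []
    for r in cleaned:
        if r["id"] not in seen:
            seen.add(r["id"])
            unique.append(r)
    # Impute missing ages with the median of the observed ages.
    fill = median(r["age"] for r in unique if r["age"] is not None)
    for r in unique:
        if r["age"] is None:
            r["age"] = fill
    # Validate against a simple business rule (assumed here: 0 < age < 120).
    assert all(0 < r["age"] < 120 for r in unique)
    return unique


clean_rows = clean(raw_rows)
```

Median imputation is used here because it is less sensitive to extreme values than the mean; whether imputing or dropping is appropriate depends on why the values are missing.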
3. How do you handle missing or duplicate data?
⦁ Missing data: Identify patterns; if random, impute using statistical methods or predictive modeling; else consider domain knowledge before removal.
⦁ Duplicate data: Detect with key fields; remove exact duplicates or merge fuzzy duplicates based on context.
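For fuzzy duplicates, one possible approach is string similarity. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the company names and the 0.85 threshold are assumptions chosen for illustration, and a real project would tune the threshold on labeled examples.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical customer names with one near-duplicate pair.
names = ["Acme Corp", "ACME Corp.", "Globex Inc", "Acme Corporation"]
THRESHOLD = 0.85  # assumed cut-off, not a universal value

# Compare every pair once and keep those above the threshold for review or merging.
fuzzy_pairs = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if similarity(a, b) >= THRESHOLD
]
```

Pairs flagged this way are candidates only; whether to merge them is a decision that depends on context, as noted above.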
4. What is a primary key in a database?
A primary key uniquely identifies each record in a table, ensuring entity integrity and enabling relationships between tables via foreign keys.
5. Write a SQL query to find the second highest salary in a table.
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
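To try the query, it can be run against an in-memory SQLite database from Python. The table contents below are made up for illustration. The sketch also shows two edge cases worth mentioning in an interview: ties at the top are handled correctly, and the query returns NULL when there is no second distinct salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 90000), ("Ben", 120000), ("Cy", 120000), ("Di", 75000)],
)

SECOND_HIGHEST = """
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees)
"""

# Two employees tie for the top salary; the strict "<" skips both of them.
second_highest = conn.execute(SECOND_HIGHEST).fetchone()[0]

# With only one distinct salary left, MAX over an empty set yields NULL (None).
conn.execute("DELETE FROM employees WHERE salary < 120000")
no_second = conn.execute(SECOND_HIGHEST).fetchone()[0]
```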
6. Explain INNER JOIN vs LEFT JOIN with examples.
⦁ INNER JOIN: Returns only matching rows between two tables.
⦁ LEFT JOIN: Returns all rows from the left table, plus matching rows from the right; if no match, right columns are NULL.
Example:
SELECT * FROM A INNER JOIN B ON A.id = B.id;
SELECT * FROM A LEFT JOIN B ON A.id = B.id;
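A small runnable sketch of the two joins, using SQLite from Python. The tables A and B come from the example above; the `a_val` and `b_val` columns and the row values are assumptions added so the difference is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER, a_val TEXT);
CREATE TABLE B (id INTEGER, b_val TEXT);
INSERT INTO A VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
INSERT INTO B VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")

# INNER JOIN keeps only ids present in both tables (2 and 3).
inner_rows = conn.execute(
    "SELECT A.id, a_val, b_val FROM A INNER JOIN B ON A.id = B.id ORDER BY A.id"
).fetchall()

# LEFT JOIN keeps every row of A; id 1 has no match, so b_val is NULL (None).
left_rows = conn.execute(
    "SELECT A.id, a_val, b_val FROM A LEFT JOIN B ON A.id = B.id ORDER BY A.id"
).fetchall()
```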
7. What are outliers? How do you detect and treat them?
⦁ Outliers are data points significantly different from others that can skew analysis.
⦁ Detect with boxplots, z-scores (|z| > 3), or the IQR method (values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR).
⦁ Treat by investigating causes, correcting errors, transforming data, or removing if they’re noise.
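The IQR rule can be sketched in a few lines with the standard library. The data values are invented, and `statistics.quantiles` is used with its default settings; other tools may compute quartiles slightly differently, so the exact fences can vary.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one obviously extreme value.
data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
outliers = iqr_outliers(data)
```

On very small samples the z-score rule may flag nothing, because a single extreme point also inflates the standard deviation, which is one reason the IQR method is often preferred there.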
8. Describe what a pivot table is and how you use it.
A pivot table is a data summarization tool that groups, aggregates (sum, average), and displays data cross-categorically. Used in Excel and BI tools for quick insights and reporting.
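What a pivot table computes can be sketched in plain Python: group by two categories and aggregate a measure. The sales rows below are invented, and the sum is just one possible aggregation.

```python
from collections import defaultdict

# Hypothetical sales records: (region, quarter, amount).
sales = [
    ("North", "Q1", 100), ("North", "Q2", 150),
    ("South", "Q1", 80), ("South", "Q2", 120),
    ("North", "Q1", 50),
]

# Rows are regions, columns are quarters, cells hold the summed amount.
pivot = defaultdict(lambda: defaultdict(int))
for region, quarter, amount in sales:
    pivot[region][quarter] += amount

# Print the result as a small cross-tab.
quarters = sorted({q for _, q, _ in sales})
print("region", *quarters)
for region in sorted(pivot):
    print(region, *(pivot[region][q] for q in quarters))
```

In pandas the same idea is usually expressed with `DataFrame.pivot_table`, and in Excel with the PivotTable feature.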
9. How do you validate a data model’s performance?
⦁ Use relevant metrics (accuracy, precision, recall for classification; RMSE, MAE for regression).
⦁ Perform cross-validation to check generalizability.
⦁ Test on holdout or unseen data sets.
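The metrics named above are simple to compute by hand, which helps when explaining them. A minimal sketch with made-up labels and predictions (libraries such as scikit-learn provide tested versions of these):

```python
from math import sqrt


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def rmse_mae(y_true, y_pred):
    """Root mean squared error and mean absolute error for numeric targets."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae


# Hypothetical classification and regression results.
precision, recall = precision_recall([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1])
rmse, mae = rmse_mae([3.0, 5.0, 2.0, 7.0], [2.0, 5.0, 4.0, 8.0])
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a hint about whether a few big misses dominate.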
10. What is hypothesis testing? Explain t-test and z-test.
⦁ Hypothesis testing assesses if sample data supports a claim about a population.
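⦁ t-test: Used when the population variance is unknown and estimated from the sample, typically with small samples; often used to compare means.
⦁ z-test: Used when the population variance is known, or when the sample is large enough for the normal approximation to hold, to test population parameters.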
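As a rough sketch of the arithmetic behind both tests, using only the standard library. The numbers are invented. The z-test p-value comes from the normal distribution; for the t-test only the statistic is computed here, since its p-value needs the t distribution (available in libraries such as SciPy).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Two-sided z-test for a mean when the population SD is known."""
    z = (sample_mean - pop_mean) / (pop_sd / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def one_sample_t_statistic(sample, pop_mean):
    """t statistic using the sample SD as the estimate of the unknown SD."""
    n = len(sample)
    return (mean(sample) - pop_mean) / (stdev(sample) / sqrt(n))


# Hypothetical inputs: 100 observations with mean 52, tested against 50 (SD 10).
z_stat, z_p = one_sample_z_test(sample_mean=52, pop_mean=50, pop_sd=10, n=100)

# Hypothetical small sample tested against a claimed mean of 5.0.
t_stat = one_sample_t_statistic([5.1, 4.9, 5.6, 5.8, 6.0], pop_mean=5.0)
```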
React ♥️ for Part-2
Which SQL function is used to calculate the total of a numeric column?
A) COUNT()
B) AVG()
C) SUM()
D) MIN()
What does the GROUP BY clause do in SQL?
A) Filters rows before aggregation
B) Groups rows with the same values to apply aggregation
C) Sorts the output alphabetically
D) Joins two tables together
Which function finds the maximum value in a column?
A) MIN()
B) MAX()
C) AVG()
D) COUNT()
What does this query return?
SELECT job_title, COUNT(*) FROM employees GROUP BY job_title;
A) Total salaries per job title
B) Average salary per job title
C) Number of employees per job title
D) Highest salary per job
Data Analytics Interview Questions with Answers Part-2: ✅
11. How do you explain complex data insights to non-technical stakeholders?
Use simple, clear language; avoid jargon. Focus on key takeaways and business impact. Use visuals and storytelling to make insights relatable.
12. What tools do you use for data visualization?
Common tools include Tableau, Power BI, Excel, Python libraries like Matplotlib and Seaborn, and R’s ggplot2.
13. How do you optimize a slow SQL query?
Add indexes, avoid SELECT *, limit joins and subqueries, review execution plans, and rewrite queries for efficiency.
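Reviewing an execution plan before and after adding an index can be tried quickly in SQLite. This is only a sketch: the table, data, and index name are invented, and other databases expose plans differently (for example `EXPLAIN` in PostgreSQL and MySQL), with different output text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [(f"emp{i}", f"dept{i % 20}", 50000 + i) for i in range(1000)],
)

# Select only the needed columns instead of SELECT *.
QUERY = "SELECT name, salary FROM employees WHERE department = 'dept7'"


def plan(sql):
    """Return SQLite's query plan as one string (last column holds the detail)."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(QUERY)  # expected to be a full table scan
conn.execute("CREATE INDEX idx_employees_department ON employees (department)")
plan_after = plan(QUERY)   # expected to mention the new index

print(plan_before)
print(plan_after)
```

Indexes speed up reads on the indexed column but add work to every insert and update, so they are worth adding where the plan shows repeated scans on a filter or join column.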
14. Describe a time when your analysis impacted a business decision.
Use the STAR approach: e.g., identified sales drop pattern, recommended marketing focus shift, which increased revenue by 10%.
15. What is the difference between clustered and non-clustered indexes?
Clustered indexes sort data physically in storage (one per table). Non-clustered indexes are separate pointers to data rows (multiple allowed).
16. Explain the bias-variance tradeoff.
Bias is error from oversimplified models (underfitting). Variance is error from models too sensitive to training data (overfitting). The tradeoff balances them to minimize total prediction error.
17. What is collaborative filtering?
A recommendation technique predicting user preferences based on similarities between users or items.
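A toy sketch of user-based collaborative filtering: predict a missing rating as a similarity-weighted average of other users' ratings. The users, items, and ratings are invented, and cosine similarity over shared items is only one of several common choices.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ana": {"A": 5, "B": 4, "C": 1},
    "ben": {"A": 4, "B": 5, "D": 5},
    "cal": {"A": 1, "C": 5, "D": 2},
}


def cosine(u, v):
    """Cosine similarity: dot product over shared items, norms over all rated items."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        weight = cosine(ratings[user], their)
        num += weight * their[item]
        den += weight
    return num / den if den else None


# ana has not rated item D; her tastes are closer to ben's than to cal's.
predicted = predict("ana", "D")
```

The prediction lands nearer to ben's rating of D than to cal's, because ana's existing ratings agree more with ben's.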
18. How do you handle large datasets?
Use distributed computing frameworks (Spark, Hadoop), sampling, optimized queries, efficient storage formats, and cloud resources.
19. What Python libraries do you use for data analysis?
Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Statsmodels are popular.
20. Describe data profiling and its importance.
Data profiling involves examining data for quality, consistency, and structure, helping detect issues early and ensuring reliability for analysis.
React ♥️ for Part-3
Data Analytics Interview Questions with Answers Part-3: ✅
21. How do you detect and handle multicollinearity?
Detect multicollinearity by calculating Variance Inflation Factor (VIF) or checking correlation matrices. Handle it by removing or combining highly correlated variables, or using regularization techniques.
22. Can you explain the concept of data partitioning?
Data partitioning involves splitting datasets into subsets such as training, validation, and test sets to build and evaluate models reliably without overfitting.
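A minimal sketch of a train/validation/test split. The 70/15/15 proportions and the fixed seed are assumptions for the example; common libraries (for instance scikit-learn's `train_test_split`) offer the same idea with more options such as stratification.

```python
import random


def split_dataset(rows, train=0.7, val=0.15, seed=42):
    """Shuffle a copy of the rows and cut it into train, validation, and test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_train = round(len(rows) * train)
    n_val = round(len(rows) * val)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


# Hypothetical dataset of 100 row ids.
train_set, val_set, test_set = split_dataset(range(100))
```

Shuffling before cutting avoids accidental ordering effects, although time-ordered data usually calls for a chronological split instead.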
23. What is data normalization? Why is it important?
Normalization scales features to a common range, improving convergence and accuracy in algorithms sensitive to scale like KNN or gradient descent.
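Min-max scaling is one common form of normalization; a short sketch with invented income values:

```python
def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature on a very different scale from, say, age in years.
incomes = [30000, 45000, 60000, 120000]
scaled = min_max_scale(incomes)
```

When modeling, the minimum and maximum should be taken from the training set only and then reused for validation and test data, so no information leaks from the held-out sets.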
24. Describe your experience with A/B testing.
Implemented controlled experiments by splitting users into groups, measuring metrics like conversion rate, and using statistical tests to infer causal impact of changes.
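One statistical test often used for conversion-rate experiments is the two-proportion z-test. The sketch below uses only the standard library; the visitor and conversion counts are invented, and the test assumes independent users and reasonably large samples.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between groups A and B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 4.0% vs 5.0% conversion with 5,000 users per group.
ab_z, ab_p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
```

A small p-value suggests the difference is unlikely to be chance alone, but the sample size and the significance threshold should be fixed before the experiment starts.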
25. What’s the difference between supervised and unsupervised learning?
Supervised learning uses labeled data to predict outcomes; unsupervised learning finds patterns or groupings in unlabeled data.
26. How do you keep yourself updated with new tools and techniques?
Follow industry blogs, attend webinars, take online courses, engage in forums like Kaggle, and participate in data science communities.
27. What’s a use case for a LEFT JOIN over an INNER JOIN?
Use LEFT JOIN when you need all records from the primary table regardless of matches, e.g., showing all customers including those with no orders.
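The customers-and-orders case can be sketched in SQLite. Table and column names are assumptions for illustration. Counting `o.id` rather than `*` matters here: it yields 0 for customers whose joined order columns are NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cal');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 3, 15.0);
""")

# Every customer appears once, including Ben, who has no orders.
order_counts = conn.execute("""
SELECT c.name, COUNT(o.id) AS order_count
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY c.id
""").fetchall()
```

With an INNER JOIN, Ben would disappear from the result entirely.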
28. Explain the curse of dimensionality.
As feature numbers grow, data becomes sparse in high-dimensional space, making models harder to train and increasing risk of overfitting.
29. What are the key metrics you track in your analyses?
Depends on goals: could be accuracy, precision, recall, churn rate, revenue growth, engagement metrics, or RMSE, among others.
30. Describe a situation when you had conflicting priorities in a project.
Prioritized tasks based on impact and deadlines, communicated clearly with stakeholders, and adjusted timelines to deliver critical components on time.
React ♥️ for Part-4
Data Analytics Interview Questions with Answers Part-4: ✅
31. What is ETL? Have you worked with any ETL tools?
ETL stands for Extract, Transform, Load — it’s the process of extracting data from sources, cleaning and transforming it, then loading it into a database or warehouse. Tools include Talend, Informatica, Apache NiFi, and Apache Airflow.
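The three stages can be sketched end to end with the standard library. This is a toy example, not how the tools listed above work internally: the CSV content, the column names, and the rule of dropping rows without an amount are all assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would be a file, API, or database.
RAW_CSV = """order_id,customer,amount
1, ana ,19.99
2,BEN,
3,cal,5.50
"""


def extract(text):
    """Extract: read CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: fix types and casing, and drop rows with no amount (assumed rule)."""
    out = []
    for r in rows:
        if not r["amount"].strip():
            continue
        out.append((int(r["order_id"]), r["customer"].strip().title(), float(r["amount"])))
    return out


def load(rows, conn):
    """Load: write the cleaned rows into a target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall()
```

Keeping the stages as separate functions makes each one easy to test and to schedule independently in an orchestration tool.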
32. How do you ensure data quality?
Implement validation rules, data profiling, automate quality checks, monitor data pipelines, and collaborate with data owners to maintain accuracy and consistency.
33. What’s your approach to storytelling with data?
Focus on the key message, structure insights logically, use compelling visuals, and link findings to business objectives to engage the audience.
34. How would you improve an existing dashboard?
Make it user-friendly, remove clutter, add relevant filters, ensure real-time or frequent updates, and align KPIs to stakeholders’ needs.
35. What’s the role of machine learning in data analytics?
Machine learning automates discovering patterns and predictions, enhancing analytics by enabling forecasting, segmentation, and decision automation.
36. Explain a time when you automated a repetitive data task.
For example, scripted data extraction and cleaning using Python to replace manual Excel work, saving hours weekly and reducing errors.
37. What’s your experience with cloud platforms for data analytics?
Used AWS (S3, Redshift), Azure Synapse, Google BigQuery for scalable data storage and processing.
38. How do you approach exploratory data analysis (EDA)?
Start with data summaries, visualize distributions and relationships, check for missing data and outliers to understand dataset structure.
39. What’s the difference between outlier detection and anomaly detection?
Outlier detection finds extreme values; anomaly detection looks for unusual patterns that may not be extreme but indicate different behavior.
40. Describe a challenging data problem you solved.
Tackled inconsistent customer records by merging multiple data sources using fuzzy matching, improving customer segmentation accuracy.
React ♥️ for Part-5
Data Analytics Interview Questions with Answers Part-5: ✅
41. Explain the concept of data aggregation.
Data aggregation is the process of condensing detailed data into summary statistics, such as totals, averages, or counts over groups or time periods, to make analysis manageable and insightful.
42. What’s your favorite data visualization technique and why?
Depends on the use case, but bar charts are great for comparisons, scatter plots for relationships, and dashboards for monitoring multiple KPIs in one view. I prefer clear, simple visuals that communicate the story effectively.
43. How do you handle unstructured data?
Use techniques like natural language processing (NLP) for text, image recognition for pictures, or convert unstructured data into structured formats via parsing and feature extraction.
44. What’s the difference between R and Python for data analytics?
R excels at statistical analysis and has a vast array of domain-specific packages. Python is more versatile with general programming capabilities, easier for deploying models, and integrates well with data engineering pipelines.
45. Describe your process for preparing a dataset for analysis.
Acquire data, clean it (handle missing values, outliers, duplicates), transform (normalize, encode categories), perform feature engineering, and split it into training and test sets if modeling.
46. What is a data lake vs a data warehouse?
A data lake stores raw, unstructured or structured data in its native format, ideal for big data and flexible querying. A data warehouse stores cleaned, structured data optimized for fast analytics and reporting.
47. How do you manage version control of your analysis scripts?
Use Git or similar systems to track changes, collaborate with teammates, and maintain a history of script modifications and improvements.
48. What are your strategies for effective teamwork in analytics projects?
Clear communication, defined roles and responsibilities, regular updates, collaborative tools (Slack, Jira), and openness to feedback foster smooth teamwork.
49. How do you handle feedback on your analysis?
Listen actively, clarify doubts, be open-minded, incorporate valid suggestions, and update analysis or reports as needed while communicating changes clearly.
50. Can you share an example where you turned data into actionable insights?
Analyzed customer churn by modeling behavioral patterns, identified at-risk segments, and recommended targeted retention offers that reduced churn by 12%.
Data Analytics Interview Questions: https://news.1rj.ru/str/sqlspecialist/2205
React ♥️ if this helped you
Top 50 SQL Interview Questions (2025)
1. What is SQL?
2. Differentiate between SQL and NoSQL databases.
3. What are the different types of SQL commands?
4. Explain the difference between WHERE and HAVING clauses.
5. Write a SQL query to find the second highest salary in a table.
6. What is a JOIN? Explain different types of JOINs.
7. How do you optimize slow-performing SQL queries?
8. What is a primary key? What is a foreign key?
9. What are indexes? Explain clustered and non-clustered indexes.
10. Write a SQL query to fetch the top 5 records from a table.
11. What is a subquery? Give an example.
12. Explain the concept of normalization.
13. What is denormalization? When is it used?
14. Describe transactions and their properties (ACID).
15. What is a stored procedure?
16. How do you handle NULL values in SQL?
17. Explain the difference between UNION and UNION ALL.
18. What are views? How are they useful?
19. What is a trigger? Give use cases.
20. How do you perform aggregate functions in SQL?
21. What is data partitioning?
22. How do you find duplicates in a table?
23. What is the difference between DELETE and TRUNCATE?
24. Explain window functions with examples.
25. What is the difference between correlated and non-correlated subqueries?
26. How do you enforce data integrity?
27. What are CTEs (Common Table Expressions)?
28. Explain EXISTS and NOT EXISTS operators.
29. How do SQL constraints work?
30. What is an execution plan? How do you use it?
31. Describe how to handle errors in SQL.
32. What are temporary tables?
33. Explain the difference between CHAR and VARCHAR.
34. How do you perform pagination in SQL?
35. What is a composite key?
36. How do you convert data types in SQL?
37. Explain locking and isolation levels in SQL.
38. How do you write recursive queries?
39. What are the advantages of using prepared statements?
40. How to debug SQL queries?
41. Differentiate between OLTP and OLAP databases.
42. What is schema in SQL?
43. How do you implement many-to-many relationships in SQL?
44. What is query optimization?
45. How do you handle large datasets in SQL?
46. Explain the difference between CROSS JOIN and INNER JOIN.
47. What is a materialized view?
48. How do you backup and restore a database?
49. Explain how indexing can degrade performance.
50. Can you write a query to find employees with no managers?
Double tap ❤️ for detailed answers!
SQL Interview Questions with Answers Part-1: ☑️
1. What is SQL?
SQL (Structured Query Language) is a standardized programming language designed to manage and manipulate relational databases. It allows you to query, insert, update, and delete data, as well as create and modify schema objects like tables and views.
2. Differentiate between SQL and NoSQL databases.
SQL databases are relational, table-based, and use structured query language with fixed schemas, ideal for complex queries and transactions. NoSQL databases are non-relational, can be document, key-value, graph, or column-oriented, and are schema-flexible, designed for scalability and handling unstructured data.
3. What are the different types of SQL commands?
⦁ DDL (Data Definition Language): CREATE, ALTER, DROP (define and modify structure)
⦁ DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE (data operations)
⦁ DCL (Data Control Language): GRANT, REVOKE (permission control)
⦁ TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT (transaction management)
4. Explain the difference between WHERE and HAVING clauses.
⦁
⦁
5. Write a SQL query to find the second highest salary in a table.
Using a subquery:
Or using DENSE_RANK():
6. What is a JOIN? Explain different types of JOINs.
A JOIN combines rows from two or more tables based on a related column:
⦁ INNER JOIN: returns matching rows from both tables.
⦁ LEFT JOIN (LEFT OUTER JOIN): all rows from the left table, matched rows from right.
⦁ RIGHT JOIN (RIGHT OUTER JOIN): all rows from right table, matched rows from left.
⦁ FULL JOIN (FULL OUTER JOIN): all rows when there’s a match in either table.
⦁ CROSS JOIN: Cartesian product of both tables.
7. How do you optimize slow-performing SQL queries?
⦁ Use indexes appropriately to speed up lookups.
⦁ Avoid SELECT *; only select necessary columns.
⦁ Use joins carefully; filter early with WHERE clauses.
⦁ Analyze execution plans to identify bottlenecks.
⦁ Avoid unnecessary subqueries; use EXISTS or JOINs.
⦁ Limit result sets with pagination if dealing with large datasets.
8. What is a primary key? What is a foreign key?
⦁ Primary Key: A unique identifier for records in a table; it cannot be NULL.
⦁ Foreign Key: A field that creates a link between two tables by referring to the primary key in another table, enforcing referential integrity.
9. What are indexes? Explain clustered and non-clustered indexes.
⦁ Indexes speed up data retrieval by providing quick lookups.
⦁ Clustered Index: Sorts and stores the actual data rows in the table based on the key; a table can have only one clustered index.
⦁ Non-Clustered Index: Creates a separate structure that points to the data rows; tables can have multiple non-clustered indexes.
10. Write a SQL query to fetch the top 5 records from a table.
In SQL Server and PostgreSQL:
In SQL Server (older syntax):
React ♥️ for Part 2
1. What is SQL?
SQL (Structured Query Language) is a standardized programming language designed to manage and manipulate relational databases. It allows you to query, insert, update, and delete data, as well as create and modify schema objects like tables and views.
2. Differentiate between SQL and NoSQL databases.
SQL databases are relational, table-based, and use structured query language with fixed schemas, ideal for complex queries and transactions. NoSQL databases are non-relational, can be document, key-value, graph, or column-oriented, and are schema-flexible, designed for scalability and handling unstructured data.
3. What are the different types of SQL commands?
⦁ DDL (Data Definition Language): CREATE, ALTER, DROP (define and modify structure)
⦁ DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE (data operations)
⦁ DCL (Data Control Language): GRANT, REVOKE (permission control)
⦁ TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT (transaction management)
4. Explain the difference between WHERE and HAVING clauses.
⦁ WHERE filters rows before grouping (used with SELECT, UPDATE).
⦁ HAVING filters groups after aggregation (used with GROUP BY), e.g., filtering aggregated results like sums or counts.
5. Write a SQL query to find the second highest salary in a table.
Using a subquery:
SELECT MAX(salary) FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
Or using DENSE_RANK():
SELECT salary FROM (
SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) as rnk
FROM employees) t
WHERE rnk = 2;
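To check that the two queries agree, here is a minimal sketch using Python's standard-library sqlite3 module with an in-memory database. The sample rows are made up for illustration, and DENSE_RANK() assumes a SQLite build of 3.25 or newer. DISTINCT is added to the ranked query as an extra guard in case several employees share the second-highest salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
# Hypothetical sample data; note the tie at the top salary.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 9000), ("Bob", 7000), ("Cid", 9000), ("Dee", 5000)],
)

# Approach 1: highest salary that is strictly below the overall maximum.
via_subquery = conn.execute(
    "SELECT MAX(salary) FROM employees "
    "WHERE salary < (SELECT MAX(salary) FROM employees)"
).fetchone()[0]

# Approach 2: DENSE_RANK() gives tied salaries the same rank with no gaps.
via_rank = conn.execute(
    "SELECT DISTINCT salary FROM ("
    "  SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk"
    "  FROM employees) t "
    "WHERE rnk = 2"
).fetchone()[0]

print(via_subquery, via_rank)
```

With this data both approaches return 7000, even though two employees share the top salary.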
6. What is a JOIN? Explain different types of JOINs.
A JOIN combines rows from two or more tables based on a related column:
⦁ INNER JOIN: returns matching rows from both tables.
⦁ LEFT JOIN (LEFT OUTER JOIN): all rows from the left table, matched rows from right.
⦁ RIGHT JOIN (RIGHT OUTER JOIN): all rows from right table, matched rows from left.
⦁ FULL JOIN (FULL OUTER JOIN): all rows when there’s a match in either table.
⦁ CROSS JOIN: Cartesian product of both tables.
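The join types above can be compared side by side in a small sketch using Python's sqlite3 module. The two tables and their rows are hypothetical. RIGHT and FULL joins are left out because older SQLite builds do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'HR');
    INSERT INTO employees VALUES (1, 'Ann', 1), (2, 'Bob', NULL);
""")

# INNER JOIN: only employees whose department_id matches a department.
inner_rows = conn.execute(
    "SELECT e.name, d.name FROM employees e "
    "INNER JOIN departments d ON d.id = e.department_id ORDER BY e.id"
).fetchall()

# LEFT JOIN: every employee; department columns are NULL when unmatched.
left_rows = conn.execute(
    "SELECT e.name, d.name FROM employees e "
    "LEFT JOIN departments d ON d.id = e.department_id ORDER BY e.id"
).fetchall()

# CROSS JOIN: every employee paired with every department (2 x 2 rows).
cross_rows = conn.execute(
    "SELECT e.name, d.name FROM employees e CROSS JOIN departments d"
).fetchall()

print(inner_rows, left_rows, len(cross_rows))
```

Bob has no department, so he is missing from the INNER JOIN result and appears with None in the LEFT JOIN result.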
7. How do you optimize slow-performing SQL queries?
⦁ Use indexes appropriately to speed up lookups.
⦁ Avoid SELECT *; only select necessary columns.
⦁ Use joins carefully; filter early with WHERE clauses.
⦁ Analyze execution plans to identify bottlenecks.
⦁ Avoid unnecessary subqueries; use EXISTS or JOINs.
⦁ Limit result sets with pagination if dealing with large datasets.
8. What is a primary key? What is a foreign key?
⦁ Primary Key: A unique identifier for records in a table; it cannot be NULL.
⦁ Foreign Key: A field that creates a link between two tables by referring to the primary key in another table, enforcing referential integrity.
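A short sketch of how the database enforces both keys, using Python's sqlite3 module. Table names and rows are made up, and try_insert is a hypothetical helper. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, which is an SQLite-specific detail and not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department_id INTEGER REFERENCES departments(id)
    );
    INSERT INTO departments VALUES (1, 'Sales');
    INSERT INTO employees VALUES (1, 'Ann', 1);
""")

def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

duplicate_pk = try_insert("INSERT INTO employees VALUES (1, 'Bob', 1)")     # id 1 already exists
missing_parent = try_insert("INSERT INTO employees VALUES (2, 'Cid', 99)")  # no department 99
valid_row = try_insert("INSERT INTO employees VALUES (3, 'Dee', 1)")

print(duplicate_pk, missing_parent, valid_row, sep="\n")
```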
9. What are indexes? Explain clustered and non-clustered indexes.
⦁ Indexes speed up data retrieval by providing quick lookups.
⦁ Clustered Index: Sorts and stores the actual data rows in the table based on the key; a table can have only one clustered index.
⦁ Non-Clustered Index: Creates a separate structure that points to the data rows; tables can have multiple non-clustered indexes.
10. Write a SQL query to fetch the top 5 records from a table.
In MySQL and PostgreSQL:
SELECT * FROM table_name
ORDER BY some_column DESC
LIMIT 5;
In SQL Server:
SELECT TOP 5 * FROM table_name
ORDER BY some_column DESC;
SQL Interview Questions with Answers Part-2: ☑️
11. What is a subquery? Give an example.
A subquery is a query nested inside another query (SELECT, INSERT, UPDATE, DELETE). It helps filter or calculate values dynamically.
Example:
SELECT name FROM employees
WHERE department_id = (SELECT id FROM departments WHERE name = 'Sales');
12. Explain the concept of normalization.
Normalization is organizing data to minimize redundancy by dividing tables and defining relationships using keys. It improves data integrity and reduces update anomalies. Common normal forms: 1NF, 2NF, 3NF.
13. What is denormalization? When is it used?
Denormalization is combining tables to reduce joins, improving read performance at the cost of redundancy. Used in data warehousing or OLAP scenarios requiring fast query responses.
14. Describe transactions and their properties (ACID).
A transaction is a set of SQL operations treated as a single unit. ACID properties:
⦁ Atomicity: all or nothing execution
⦁ Consistency: database moves from one valid state to another
⦁ Isolation: concurrent transactions don’t interfere
⦁ Durability: changes persist after commit
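Atomicity is the easiest property to demonstrate. The sketch below uses Python's sqlite3 module; the accounts table, the CHECK constraint, and the transfer helper are all hypothetical. A transfer that would overdraw an account fails halfway through, and the credit that was already applied is rolled back with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO accounts VALUES (1, 100), (2, 50);
""")

def transfer(src, dst, amount):
    """Move money between accounts as one all-or-nothing unit."""
    try:
        with conn:  # COMMIT if the block succeeds, ROLLBACK if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(1, 2, 30)       # both updates succeed and are committed
failed = transfer(1, 2, 500)  # the debit violates the CHECK, so the credit is undone too
balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()

print(ok, failed, balances)
```

After the failed transfer the balances are still 70 and 80, the state left by the first, successful transfer.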
15. What is a stored procedure?
A stored procedure is a precompiled SQL program stored in the database, which can accept parameters and perform complex operations efficiently, improving performance and reusability.
16. How do you handle NULL values in SQL?
Use IS NULL or IS NOT NULL to check for NULLs. Functions like COALESCE() or IFNULL() replace NULLs with specified values in queries.
17. Explain the difference between UNION and UNION ALL.
⦁ UNION combines results of two queries and removes duplicates.
⦁ UNION ALL combines results including duplicates, faster than UNION.
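A quick way to see the difference, sketched with Python's sqlite3 module. The two customer tables are invented for the example; one email address appears in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE online_customers (email TEXT);
    CREATE TABLE store_customers (email TEXT);
    INSERT INTO online_customers VALUES ('a@example.com'), ('b@example.com');
    INSERT INTO store_customers VALUES ('b@example.com'), ('c@example.com');
""")

# UNION removes the duplicate 'b@example.com'.
union_rows = conn.execute(
    "SELECT email FROM online_customers "
    "UNION SELECT email FROM store_customers ORDER BY email"
).fetchall()

# UNION ALL keeps every row from both queries.
union_all_rows = conn.execute(
    "SELECT email FROM online_customers "
    "UNION ALL SELECT email FROM store_customers ORDER BY email"
).fetchall()

print(len(union_rows), len(union_all_rows))
```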
18. What are views? How are they useful?
A view is a virtual table based on a SELECT query. It simplifies complex queries, provides security by restricting access, and allows data abstraction.
19. What is a trigger? Give use cases.
Triggers are special procedures that automatically execute in response to certain events on a table (e.g., INSERT, UPDATE). Use cases: auditing changes, enforcing business rules, cascading changes.
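As an illustration of the auditing use case, here is a minimal sketch with Python's sqlite3 module. The trigger syntax shown is SQLite's; other databases differ in the details. The table and trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    CREATE TABLE salary_audit (employee_id INTEGER, old_salary INTEGER, new_salary INTEGER);

    -- Fires once per affected row whenever the salary column is updated.
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF salary ON employees
    FOR EACH ROW
    BEGIN
        INSERT INTO salary_audit VALUES (OLD.id, OLD.salary, NEW.salary);
    END;

    INSERT INTO employees VALUES (1, 'Ann', 5000);
    UPDATE employees SET salary = 5500 WHERE id = 1;
""")

# The application never wrote to salary_audit directly; the trigger did.
audit_rows = conn.execute("SELECT * FROM salary_audit").fetchall()
print(audit_rows)
```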
20. How do you perform aggregate functions in SQL?
Aggregate functions process multiple rows to return a single value, e.g., COUNT(), SUM(), AVG(), MIN(), and MAX(). Often used with GROUP BY to group results.
SQL interview questions Part-3 ✅
21. What is data partitioning?
Splitting large tables into smaller, manageable pieces (partitions) based on a key like date or region, improving query performance and maintenance.
22. How do you find duplicates in a table?
Use GROUP BY with HAVING:
SELECT column_name, COUNT(*)
FROM table_name
GROUP BY column_name
HAVING COUNT(*) > 1;
23. What is the difference between DELETE and TRUNCATE?
⦁ DELETE removes rows one by one, can have WHERE clause, logs each row, slower.
⦁ TRUNCATE removes all rows instantly, no WHERE, resets identity, faster but less flexible.
24. Explain window functions with examples.
Window functions perform calculations across sets of rows related to the current row without collapsing results. Example:
SELECT name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;
25. What is the difference between correlated and non-correlated subqueries?
⦁ Correlated subqueries depend on the outer query and execute for each row.
⦁ Non-correlated subqueries run independently once.
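The sketch below contrasts the two kinds of subquery using Python's sqlite3 module and made-up employee rows. "Executes for each row" describes the logical behavior; a real optimizer may rewrite the query internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ann', 'Sales', 6000), ('Bob', 'Sales', 4000),
        ('Cid', 'HR', 1000), ('Dee', 'HR', 3000);
""")

# Non-correlated: the inner query never refers to the outer row,
# so the company-wide average (3500) can be computed once.
above_company_avg = conn.execute(
    "SELECT name FROM employees "
    "WHERE salary > (SELECT AVG(salary) FROM employees) ORDER BY name"
).fetchall()

# Correlated: the inner query refers to e.department from the outer row,
# so each employee is compared with the average of their own department.
above_dept_avg = conn.execute(
    "SELECT e.name FROM employees e "
    "WHERE e.salary > (SELECT AVG(d.salary) FROM employees d "
    "                  WHERE d.department = e.department) "
    "ORDER BY e.name"
).fetchall()

print(above_company_avg, above_dept_avg)
```

Bob earns more than the company average but less than the Sales average, while Dee is the opposite case, so the two results differ.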
26. How do you enforce data integrity?
Using constraints (PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, NOT NULL), triggers, and transactions.
27. What are CTEs (Common Table Expressions)?
Temporary named result sets within a SQL statement that improve query readability and support recursion:
WITH cte AS (SELECT * FROM employees WHERE salary > 5000)
SELECT * FROM cte;
28. Explain EXISTS and NOT EXISTS operators.
⦁ EXISTS returns TRUE if a subquery returns any rows.
⦁ NOT EXISTS returns TRUE if subquery returns no rows.
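A small sketch of both operators, run through Python's sqlite3 module. The customers and orders tables are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# EXISTS: customers for whom the subquery finds at least one order.
with_orders = [row[0] for row in conn.execute(
    "SELECT c.name FROM customers c WHERE EXISTS ("
    "  SELECT 1 FROM orders o WHERE o.customer_id = c.id) ORDER BY c.id")]

# NOT EXISTS: customers for whom the subquery finds no order at all.
without_orders = [row[0] for row in conn.execute(
    "SELECT c.name FROM customers c WHERE NOT EXISTS ("
    "  SELECT 1 FROM orders o WHERE o.customer_id = c.id) ORDER BY c.id")]

print(with_orders, without_orders)
```

Ann has two orders but is listed once, because EXISTS only asks whether any matching row is present.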
29. How do SQL constraints work?
Constraints enforce rules at the database level to ensure data validity and integrity during insert/update/delete operations.
30. What is an execution plan? How do you use it?
A detailed roadmap of how the database engine executes a query. Used to analyze and optimize query performance by revealing bottlenecks.
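Each database exposes plans differently. As a rough sketch, SQLite's EXPLAIN QUERY PLAN can be inspected from Python's sqlite3 module before and after adding an index. The table, index name, and plan helper are made up, and the exact plan wording varies between SQLite versions, so no output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER)"
)

query = "SELECT name FROM employees WHERE department_id = 3"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(query)  # without an index, a scan of the whole table is expected
conn.execute("CREATE INDEX idx_emp_dept ON employees (department_id)")
plan_after = plan(query)   # with the index, the plan is expected to mention idx_emp_dept

print(plan_before)
print(plan_after)
```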
SQL interview questions Part-4 ✅
31. Describe how to handle errors in SQL.
Use TRY...CATCH blocks (in SQL Server) or exception handling constructs provided by the database to catch and manage runtime errors, ensuring graceful failure or rollback.
32. What are temporary tables?
Temporary tables store intermediate results temporarily during a session or procedure, usually with names prefixed by # (local) or ## (global) in SQL Server.
33. Explain the difference between CHAR and VARCHAR.
⦁ CHAR is fixed-length and pads unused spaces, faster for fixed-size data.
⦁ VARCHAR is variable-length, saves space for variable data but may be slightly slower.
34. How do you perform pagination in SQL?
Use LIMIT and OFFSET (MySQL/PostgreSQL):
SELECT * FROM table_name ORDER BY id LIMIT 10 OFFSET 20;
Or in SQL Server:
SELECT * FROM table_name ORDER BY id OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;
35. What is a composite key?
A primary key made up of two or more columns that uniquely identify a record.
36. How do you convert data types in SQL?
Using CAST() or CONVERT() functions, e.g.,
SELECT CAST(column_name AS INT) FROM table_name;
37. Explain locking and isolation levels in SQL.
Locks control concurrent access to data. Isolation levels (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE) define visibility of changes between concurrent transactions, balancing consistency and performance.
38. How do you write recursive queries?
Using recursive CTEs with the WITH clause:
WITH RECURSIVE cte AS (
SELECT id, parent_id FROM table_name WHERE parent_id IS NULL
UNION ALL
SELECT t.id, t.parent_id FROM table_name t INNER JOIN cte ON t.parent_id = cte.id
)
SELECT * FROM cte;
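Here is a runnable version of the same pattern, sketched with Python's sqlite3 module. The categories table is hypothetical, and a depth column is added to show how far each row sits from the root.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO categories VALUES
        (1, NULL, 'Electronics'),
        (2, 1, 'Phones'),
        (3, 2, 'Smartphones'),
        (4, 1, 'Laptops');
""")

rows = conn.execute("""
    WITH RECURSIVE tree AS (
        -- Anchor member: root rows that have no parent.
        SELECT id, name, 0 AS depth FROM categories WHERE parent_id IS NULL
        UNION ALL
        -- Recursive member: children of rows already in the result.
        SELECT c.id, c.name, tree.depth + 1
        FROM categories c INNER JOIN tree ON c.parent_id = tree.id
    )
    SELECT name, depth FROM tree ORDER BY depth, id
""").fetchall()

print(rows)
```

The recursion stops on its own once a pass over the recursive member finds no further children.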
39. What are the advantages of using prepared statements?
Improved performance (query plan reuse), security (prevents SQL injection), and ease of use with parameterized inputs.
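The security point can be shown with Python's sqlite3 module, whose ? placeholders are one form of parameterized query. The users table and the malicious input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
""")

user_input = "nobody' OR '1'='1"  # a classic injection attempt

# Unsafe: the input is pasted into the SQL text and changes the query logic.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the ? placeholder passes the input as a value, never as SQL.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every user, while the parameterized one returns nothing because no user has that literal name.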
40. How to debug SQL queries?
Analyze execution plans, check syntax errors, use descriptive aliases, test subqueries separately, and monitor performance metrics.
What is the purpose of a subquery in SQL?
Anonymous Quiz
A) To join two tables: 9%
B) To use the result of one query inside another: 87%
C) To rename columns: 2%
D) To insert data into a table: 3%
Which SQL clause commonly uses subqueries to filter data?
Anonymous Quiz
A) SELECT: 21%
B) FROM: 8%
C) WHERE: 54%
D) GROUP BY: 17%
Why are aliases useful in SQL?
Anonymous Quiz
A) They speed up query execution: 10%
B) They make queries easier to read and manage: 82%
C) They prevent data loss: 5%
D) They encrypt the data: 3%
What will this query return?
SELECT employee_name, salary FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);
Anonymous Quiz
A) Employees with salary less than average: 12%
B) All employees and their salaries: 4%
C) Employees with salary greater than average: 75%
D) Average salary of all employees: 9%
SQL Interview Questions with Answers Part-5: ☑️
41. Differentiate between OLTP and OLAP databases.
⦁ OLTP (Online Transaction Processing) is optimized for transactional tasks—fast inserts, updates, and deletes with many users.
⦁ OLAP (Online Analytical Processing) is optimized for complex queries and data analysis, often dealing with large historical datasets.
42. What is schema in SQL?
A schema is a logical container that holds database objects like tables, views, and procedures, helping organize and manage database permissions.
43. How do you implement many-to-many relationships in SQL?
By creating a junction (or associative) table with foreign keys referencing the two related tables.
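A minimal sketch of a junction table, using Python's sqlite3 module. The students, courses, and enrollments tables are hypothetical. The junction table's primary key is made of both foreign key columns, so the same pair cannot be stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: one row per (student, course) pair.
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(id),
        course_id INTEGER REFERENCES courses(id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO courses VALUES (10, 'SQL'), (20, 'Python');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

# Resolve the relationship by joining through the junction table.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM enrollments e
    JOIN students s ON s.id = e.student_id
    JOIN courses c ON c.id = e.course_id
    ORDER BY s.name, c.title
""").fetchall()

print(rows)
```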
44. What is query optimization?
The process of improving query execution efficiency by rewriting queries, indexing, and analyzing execution plans to reduce resource consumption.
45. How do you handle large datasets in SQL?
Use partitioning, indexing, batch processing, query optimization, and sometimes materialized views or data archiving to manage performance.
46. Explain the difference between CROSS JOIN and INNER JOIN.
⦁ CROSS JOIN returns the Cartesian product (all combinations) of two tables.
⦁ INNER JOIN returns only matching rows based on join conditions.
47. What is a materialized view?
A stored physical copy of the result set of a query, which improves performance for complex queries by avoiding recomputation every time.
48. How do you backup and restore a database?
Use built-in commands/tools like BACKUP DATABASE and RESTORE DATABASE in SQL Server, or mysqldump in MySQL, often automating with scripts for regular backups.
49. Explain how indexing can degrade performance.
Too many indexes slow down write operations (INSERT, UPDATE, DELETE) because indexes must also be updated; large indexes can consume extra storage and memory.
50. Can you write a query to find employees with no managers?
Example:
SELECT * FROM employees e
WHERE NOT EXISTS (SELECT 1 FROM employees m WHERE m.id = e.manager_id);
SQL Interview Questions: https://news.1rj.ru/str/sqlspecialist/2220
Top 50 Python Interview Questions for Data Analysts (2025) ✅
1. What is Python and why is it popular for data analysis?
2. Differentiate between lists, tuples, and sets in Python.
3. How do you handle missing data in a dataset?
4. What are list comprehensions and how are they useful?
5. Explain Pandas DataFrame and Series.
6. How do you read data from different file formats (CSV, Excel, JSON) in Python?
7. What is the difference between Python’s append() and extend() methods?
8. How do you filter rows in a Pandas DataFrame?
9. Explain the use of groupby() in Pandas with an example.
10. What are lambda functions and how are they used?
11. How do you merge or join two DataFrames?
12. What is the difference between .loc[] and .iloc[] in Pandas?
13. How do you handle duplicates in a DataFrame?
14. Explain how to deal with outliers in data.
15. What is data normalization and how can it be done in Python?
16. Describe different data types in Python.
17. How do you convert data types in Pandas?
18. What are Python dictionaries and how are they useful?
19. How do you write efficient loops in Python?
20. Explain error handling in Python with try-except.
21. How do you perform basic statistical operations in Python?
22. What libraries do you use for data visualization?
23. How do you create plots using Matplotlib or Seaborn?
24. What is the difference between .apply() and .map() in Pandas?
25. How do you export Pandas DataFrames to CSV or Excel files?
26. What is the difference between Python’s range() and xrange()?
27. How can you profile and optimize Python code?
28. What are Python decorators and give a simple example?
29. How do you handle dates and times in Python?
30. Explain list slicing in Python.
31. What are the differences between Python 2 and Python 3?
32. How do you use regular expressions in Python?
33. What is the purpose of the with statement?
34. Explain how to use virtual environments.
35. How do you connect Python with SQL databases?
36. What is the role of the __init__.py file?
37. How do you handle JSON data in Python?
38. What are generator functions and why use them?
39. How do you perform feature engineering with Python?
40. What is the purpose of the Pandas .pivot_table() method?
41. How do you handle categorical data?
42. Explain the difference between deep copy and shallow copy.
43. What is the use of the enumerate() function?
44. How do you detect and handle multicollinearity?
45. How can you improve Python script performance?
46. What are Python’s built-in data structures?
47. How do you automate repetitive data tasks with Python?
48. Explain the use of Assertions in Python.
49. How do you write unit tests in Python?
50. How do you handle large datasets in Python?
Python Interview Questions with Answers Part-1: ☑️
1. What is Python and why is it popular for data analysis?
Python is a high-level, interpreted programming language known for simplicity and readability. It’s popular in data analysis due to its rich ecosystem of libraries like Pandas, NumPy, and Matplotlib that simplify data manipulation, analysis, and visualization.
2. Differentiate between lists, tuples, and sets in Python.
⦁ List: Mutable, ordered, allows duplicates.
⦁ Tuple: Immutable, ordered, allows duplicates.
⦁ Set: Mutable, unordered, no duplicates.
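The three behaviors can be checked in a few lines of plain Python. The variable names are arbitrary.

```python
items = [3, 1, 3]          # list: ordered, mutable, keeps duplicates
items.append(2)            # in-place change is allowed

point = (3, 1, 3)          # tuple: ordered, keeps duplicates, immutable
try:
    point[0] = 9
    tuple_changed = True
except TypeError:
    tuple_changed = False  # item assignment on a tuple raises TypeError

unique = set(items)        # set: duplicates collapse, no positional order
unique.add(1)              # adding a value that is already present changes nothing

print(items, tuple_changed, unique)
```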
3. How do you handle missing data in a dataset?
Common methods: removing rows/columns with missing values, filling with mean/median/mode, or using interpolation. Libraries like Pandas provide .dropna() and .fillna() functions to do this easily.
4. What are list comprehensions and how are they useful?
Concise syntax to create lists from iterables using a single readable line, often replacing loops for cleaner and faster code.
Example:
[x**2 for x in range(5)] → [0, 1, 4, 9, 16]
5. Explain Pandas DataFrame and Series.
⦁ Series: 1D labeled array, like a column.
⦁ DataFrame: 2D labeled data structure with rows and columns, like a spreadsheet.
6. How do you read data from different file formats (CSV, Excel, JSON) in Python?
Using Pandas:
⦁ CSV: pd.read_csv('file.csv')
⦁ Excel: pd.read_excel('file.xlsx')
⦁ JSON: pd.read_json('file.json')
7. What is the difference between Python’s append() and extend() methods?
⦁ append() adds its argument as a single element to the end of a list.
⦁ extend() iterates over its argument adding each element to the list.
8. How do you filter rows in a Pandas DataFrame?
Using boolean indexing:
df[df['column'] > value] filters rows where 'column' is greater than value.
9. Explain the use of groupby() in Pandas with an example.
groupby() splits data into groups based on column(s), then you can apply aggregation. Example:
df.groupby('category')['sales'].sum() gives total sales per category.
10. What are lambda functions and how are they used?
Anonymous, inline functions defined with the lambda keyword. Used for quick, throwaway functions without formally defining with def. Example:
df['new'] = df['col'].apply(lambda x: x*2)
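Lambdas work the same way outside Pandas. The sketch below uses only built-ins, with made-up sales figures, and shows a lambda next to the equivalent def so the two can be compared.

```python
sales = [("north", 120), ("south", 300), ("east", 45)]

# key= expects a function; the lambda picks the amount out of each pair.
by_amount = sorted(sales, key=lambda pair: pair[1], reverse=True)

# The equivalent named function, for comparison.
def amount(pair):
    return pair[1]

same_result = sorted(sales, key=amount, reverse=True)

# map() applies the lambda to every element, like .apply() on a column.
doubled = list(map(lambda x: x * 2, [1, 2, 3]))

print(by_amount, doubled)
```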