Selecting with Transformations and Conditional Logic
#### Data Setup
#### pandas
import pandas as pd
data = {
    'product_id': [101, 102, 103, 104, 105, 106, 107, 108],
    'product_name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Microphone', 'Speakers', 'Charger'],
    'category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Peripherals', 'Peripherals', 'Audio', 'Accessories'],
    'price': [1200.00, 25.00, 75.00, 300.00, 50.00, 80.00, 150.00, 15.00],
    'stock_quantity': [50, 200, 150, 70, 100, 60, 40, 0]
}
df_pd = pd.DataFrame(data)
#### polars
import polars as pl
data = {
    'product_id': [101, 102, 103, 104, 105, 106, 107, 108],
    'product_name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Microphone', 'Speakers', 'Charger'],
    'category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Peripherals', 'Peripherals', 'Audio', 'Accessories'],
    'price': [1200.00, 25.00, 75.00, 300.00, 50.00, 80.00, 150.00, 15.00],
    'stock_quantity': [50, 200, 150, 70, 100, 60, 40, 0]
}
df_pl = pl.DataFrame(data)
#### SQL (Conceptual Table Structure and Data)
-- CREATE TABLE products (
-- product_id INT PRIMARY KEY,
-- product_name VARCHAR(255),
-- category VARCHAR(255),
-- price DECIMAL(10, 2),
-- stock_quantity INT
-- );
-- INSERT INTO products VALUES
-- (101, 'Laptop', 'Electronics', 1200.00, 50),
-- (102, 'Mouse', 'Electronics', 25.00, 200),
-- (103, 'Keyboard', 'Electronics', 75.00, 150),
-- (104, 'Monitor', 'Electronics', 300.00, 70),
-- (105, 'Webcam', 'Peripherals', 50.00, 100),
-- (106, 'Microphone', 'Peripherals', 80.00, 60),
-- (107, 'Speakers', 'Audio', 150.00, 40),
-- (108, 'Charger', 'Accessories', 15.00, 0);
---
Creating New Columns with Expressions (SELECT col1, col2 + col3 AS new_col)
#### pandas
# Select 'product_name', 'price', and calculate 'total_inventory_value'
result_pd = df_pd.assign(
    total_inventory_value=df_pd['price'] * df_pd['stock_quantity'],
    discounted_price=df_pd['price'] * 0.9
)[['product_name', 'price', 'total_inventory_value', 'discounted_price']]
print(result_pd)
#### polars
# Select 'product_name', 'price', and calculate 'total_inventory_value'
result_pl = df_pl.select(
    'product_name',
    'price',
    (pl.col('price') * pl.col('stock_quantity')).alias('total_inventory_value'),
    (pl.col('price') * 0.9).alias('discounted_price')
)
print(result_pl)
#### SQL
-- Select product_name, price, and calculate total_inventory_value and discounted_price
SELECT
    product_name,
    price,
    price * stock_quantity AS total_inventory_value,
    price * 0.9 AS discounted_price
FROM products;
---
Conditional Column Creation (CASE WHEN equivalent)
#### pandas
# Create 'price_level' based on price and 'stock_status' based on stock
def get_price_level(price):
    # Thresholds are checked from highest to lowest
    if price > 200:
        return 'High'
    elif price > 50:
        return 'Medium'
    else:
        return 'Low'

def get_stock_status(stock):
    if stock == 0:
        return 'Out of Stock'
    elif stock < 50:
        return 'Low Stock'
    else:
        return 'In Stock'

result_pd = df_pd.assign(
    price_level=df_pd['price'].apply(get_price_level),
    stock_status=df_pd['stock_quantity'].apply(get_stock_status)
)[['product_name', 'price', 'price_level', 'stock_quantity', 'stock_status']]
print(result_pd)
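As an alternative to the row-wise apply calls above, the same labels can be computed in a vectorized way. The sketch below uses numpy.select, which takes the first matching condition per row, much like CASE WHEN. The small prices_df frame and its values are made up for illustration and are not the products data from this post.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame, not the products table from the post
prices_df = pd.DataFrame({
    'price': [1200.0, 25.0, 75.0, 300.0],
    'stock_quantity': [50, 200, 40, 0],
})

# Conditions are checked in order; the first True one wins (like CASE WHEN)
price_level = np.select(
    [prices_df['price'] > 200, prices_df['price'] > 50],
    ['High', 'Medium'],
    default='Low',
)
stock_status = np.select(
    [prices_df['stock_quantity'] == 0, prices_df['stock_quantity'] < 50],
    ['Out of Stock', 'Low Stock'],
    default='In Stock',
)

labeled = prices_df.assign(price_level=price_level, stock_status=stock_status)
print(labeled)
```

On large frames this usually avoids the per-row Python function call that apply makes, at the cost of spelling out the conditions as boolean Series.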
#### polars
# Create 'price_level' based on price and 'stock_status'
result_pl = df_pl.select(
    'product_name',
    'price',
    pl.when(pl.col('price') > 200).then(pl.lit('High'))
      .when(pl.col('price') > 50).then(pl.lit('Medium'))
      .otherwise(pl.lit('Low'))
      .alias('price_level'),
    'stock_quantity',
    pl.when(pl.col('stock_quantity') == 0).then(pl.lit('Out of Stock'))
      .when(pl.col('stock_quantity') < 50).then(pl.lit('Low Stock'))
      .otherwise(pl.lit('In Stock'))
      .alias('stock_status')
)
print(result_pl)
#### SQL
-- Create price_level and stock_status based on conditions
SELECT
    product_name,
    price,
    CASE
        WHEN price > 200 THEN 'High'
        WHEN price > 50 THEN 'Medium'
        ELSE 'Low'
    END AS price_level,
    stock_quantity,
    CASE
        WHEN stock_quantity = 0 THEN 'Out of Stock'
        WHEN stock_quantity < 50 THEN 'Low Stock'
        ELSE 'In Stock'
    END AS stock_status
FROM products;
---
String Transformations in Select
#### pandas
# Select product_name in uppercase and first 3 characters of category
result_pd = df_pd.assign(
    product_name_upper=df_pd['product_name'].str.upper(),
    category_prefix=df_pd['category'].str.slice(0, 3)
)[['product_name', 'product_name_upper', 'category', 'category_prefix']]
print(result_pd)
#### polars
# Select product_name in uppercase and first 3 characters of category
result_pl = df_pl.select(
    'product_name',
    pl.col('product_name').str.to_uppercase().alias('product_name_upper'),
    'category',
    pl.col('category').str.slice(0, 3).alias('category_prefix')
)
print(result_pl)
#### SQL
-- Select product_name in uppercase and first 3 characters of category
SELECT
    product_name,
    UPPER(product_name) AS product_name_upper,
    category,
    SUBSTRING(category, 1, 3) AS category_prefix -- Or LEFT(category, 3) in some SQL dialects
FROM products;
---
Selecting with Advanced Filtering (IN, BETWEEN equivalents)
#### pandas
# Select products in 'Electronics' or 'Audio' categories
print("Products in Electronics or Audio:")
print(df_pd[df_pd['category'].isin(['Electronics', 'Audio'])])
# Select products with price between 50 and 200 (inclusive)
print("\nProducts with price between 50 and 200:")
print(df_pd[df_pd['price'].between(50, 200)])
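A small sketch of how the endpoints behave in between, since SQL BETWEEN is inclusive on both sides and the pandas default matches it. The prices Series is illustrative, and the string values for inclusive assume pandas 1.3 or newer.

```python
import pandas as pd

prices = pd.Series([25.0, 50.0, 80.0, 200.0, 300.0])  # illustrative values

# Default: both endpoints included, like SQL BETWEEN
both = prices[prices.between(50, 200)]

# Exclude the endpoints (string options assume pandas 1.3 or newer)
neither = prices[prices.between(50, 200, inclusive='neither')]

print(both.tolist())
print(neither.tolist())
```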
#### polars
# Select products in 'Electronics' or 'Audio' categories
print("Products in Electronics or Audio:")
print(df_pl.filter(pl.col('category').is_in(['Electronics', 'Audio'])))
# Select products with price between 50 and 200 (inclusive)
print("\nProducts with price between 50 and 200:")
print(df_pl.filter(pl.col('price').is_between(50, 200)))
#### SQL
-- Select products in 'Electronics' or 'Audio' categories
SELECT *
FROM products
WHERE category IN ('Electronics', 'Audio');
-- Select products with price between 50 and 200 (inclusive)
SELECT *
FROM products
WHERE price BETWEEN 50 AND 200;
https://news.1rj.ru/str/DataAnalyticsX
The Python library PandasAI has been released for simplified data analysis using AI.
You can ask questions about the dataset in plain language directly in the AI dialogue, compare different datasets, and create graphs. It saves a lot of time, especially in the initial stage of getting acquainted with the data. It supports CSV, SQL, and Parquet.
And here's the link😍
👉 https://news.1rj.ru/str/DataAnalyticsX
Forwarded from Machine Learning with Python
This channel is for programmers, coders, and software engineers.
0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages
✅ https://news.1rj.ru/str/addlist/8_rRW2scgfRhOTc0
✅ https://news.1rj.ru/str/Codeprogrammer
1. What is the primary purpose of the pandas library?
A. Working with unstructured multimedia data
B. Creating and manipulating structured tabular data
C. Building machine learning models
D. Visualizing neural networks
Correct answer: B.
2. Which pandas object is one-dimensional and enforces a homogeneous data type?
A. DataFrame
B. Index
C. Series
D. Panel
Correct answer: C.
3. How can a pd.Series be best compared to an Excel structure?
A. Entire worksheet
B. Row
C. Column
D. Pivot table
Correct answer: C.
4. Which object in pandas represents labels for rows or columns?
A. Series
B. DataFrame
C. Index
D. ndarray
Correct answer: C.
5. What happens if no index is provided when creating a pd.Series?
A. An error is raised
B. A random index is created
C. A RangeIndex starting at 0 is created
D. Index values must be inferred manually
Correct answer: C.
6. Which argument is used to explicitly set the data type of a pd.Series?
A. type=
B. data_type=
C. dtype=
D. astype=
Correct answer: C.
7. What is the default value of the name attribute of a pd.Series if not provided?
A. Empty string
B. Undefined
C. None
D. "Series"
Correct answer: C.
8. Which structure allows heterogeneous column data types?
A. Series
B. Index
C. ndarray
D. DataFrame
Correct answer: D.
9. When constructing a DataFrame from a dictionary, what do the dictionary keys represent?
A. Row labels
B. Index levels
C. Column labels
D. Data types
Correct answer: C.
10. Which attribute returns the number of rows (elements) in a pd.Series as an integer?
A. size
B. shape
C. len()
D. index
Correct answer: A.
11. What does the pd.Series.shape attribute return?
A. An integer
B. A list
C. A one-element tuple
D. A two-element tuple
Correct answer: C.
12. Which attribute of a DataFrame returns a Series of column data types?
A. dtype
B. dtypes
C. types
D. schema
Correct answer: B.
13. What does len(df) return for a DataFrame?
A. Number of columns
B. Total number of elements
C. Number of rows
D. Size of memory used
Correct answer: C.
14. In basic DataFrame selection using df["a"], what is returned?
A. A DataFrame
B. A scalar
C. A NumPy array
D. A Series
Correct answer: D.
15. What does df[["a"]] return?
A. A Series
B. A DataFrame
C. A scalar
D. A NumPy array
Correct answer: B.
16. When using [] with a Series that has a non-default integer index, selection is done by:
A. Position
B. Order of insertion
C. Label
D. Data type
Correct answer: C.
17. Which method should be used for explicit position-based selection in a Series?
A. loc
B. at
C. iloc
D. ix
Correct answer: C.
18. What does ser.iloc[1] return?
A. All rows with label 1
B. The value at position 1
C. A slice of the Series
D. A DataFrame
Correct answer: B.
19. How many indexers are required when using DataFrame.iloc?
A. One
B. Two
C. Three
D. Unlimited
Correct answer: B.
20. What does df.iloc[:, 0] return?
A. First row
B. First column as a Series
C. First column as a DataFrame
D. Entire DataFrame
Correct answer: B.
21. Which method performs label-based selection in a Series?
A. iloc
B. at
C. loc
D. take
Correct answer: C.
22. What is a key difference between slicing with loc and iloc?
A. loc excludes the stop value
B. iloc includes labels
C. loc includes the stop label
D. iloc works only with strings
Correct answer: C.
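A minimal sketch of the difference asked about in question 22, using a made-up Series with string labels: a loc slice keeps the stop label, while an iloc slice drops the stop position.

```python
import pandas as pd

ser = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])  # illustrative data

by_label = ser.loc['a':'c']   # label slice: stop label 'c' is included
by_position = ser.iloc[0:2]   # positional slice: stop position 2 is excluded

print(by_label.tolist())
print(by_position.tolist())
```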
23. Which operation may raise a KeyError when using loc?
A. Slicing with ordered unique labels
B. Selecting existing labels
C. Slicing with non-unique unordered labels
D. Selecting with lists
Correct answer: C.
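The case in question 23 can be reproduced with a tiny, made-up Series whose index repeats a label and is not sorted. This is only a sketch of the behavior; sorting the index first is one possible way to make the slice well defined.

```python
import pandas as pd

# 'b' repeats and the index is unsorted
ser = pd.Series([1, 2, 3], index=['b', 'a', 'b'])

try:
    ser.loc['b':'a']
    raised = False
except KeyError as exc:
    raised = True
    print('KeyError:', exc)

# After sorting, the label slice has unambiguous bounds
sorted_slice = ser.sort_index().loc['a':'b']
print(sorted_slice)
```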
24. In a DataFrame, df.loc["Jack", :] selects:
A. All rows named Jack
B. All columns named Jack
C. All columns for the row labeled Jack
D. Only numeric columns
Correct answer: C.
25. What is the main advantage of using pd.Index.get_indexer when mixing selection styles?
A. Improved readability
B. Lazy evaluation
C. Better performance by avoiding intermediate objects
D. Automatic type conversion
Correct answer: C.
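A hedged sketch of what question 25 refers to: selecting rows by position and columns by label in one step. The frame below is invented for illustration; pd.Index.get_indexer translates the column labels to positions so that a single iloc call can do the selection.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]},
    index=['x', 'y', 'z'],
)  # illustrative data

# Goal: first two rows by position, columns 'a' and 'c' by label
col_positions = df.columns.get_indexer(['a', 'c'])
mixed = df.iloc[[0, 1], col_positions]

# Same result via chaining, which builds an intermediate DataFrame first
chained = df[['a', 'c']].iloc[[0, 1]]

print(mixed)
```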
1. What is the result of the following code?
import pandas as pd
s = pd.Series([10, 20, 30], index=[1, 2, 3])
print(s[1])
A. 10
B. 20
C. 30
D. KeyError
Correct answer: A.
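A short sketch of why the answer is A: with an integer index that does not start at 0, the value 1 is a label, not a position. Using loc and iloc makes the two meanings explicit.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=[1, 2, 3])

by_label = s.loc[1]      # label 1 -> first element
by_position = s.iloc[1]  # position 1 -> second element

print(by_label, by_position)
```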
2. What will this code output?
import pandas as pd
s = pd.Series([10, 20, 30])
print(s.iloc[1])
A. 10
B. 20
C. 30
D. IndexError
Correct answer: B.
3. What does this print?
import pandas as pd
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(df.shape)
A. (4,)
B. (2, 2)
C. (1, 4)
D. (2,)
Correct answer: B.
4. What is returned by this expression?
df["a"]
A. DataFrame
B. Series
C. list
D. ndarray
Correct answer: B.
5. What does this code output?
import pandas as pd
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(df[["a"]].shape)
A. (2,)
B. (1, 2)
C. (2, 1)
D. (4, 1)
Correct answer: C.
6. What is the result?
import pandas as pd
s = pd.Series([1, 2, 3])
print(s > 1)
A. [False, True, True]
B. Series of booleans
C. ndarray of booleans
D. True
Correct answer: B.
7. What does this code produce?
import pandas as pd
s = pd.Series([1, 2, 3])
print(s[s > 1])
A. Series [2, 3]
B. Series [False, True, True]
C. [2, 3]
D. IndexError
Correct answer: A.
8. What is the output?
import pandas as pd
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
print(df.iloc[0, 1])
A. 1
B. 2
C. 3
D. 4
Correct answer: C.
9. What does this select?
df.loc[:, "a"]
A. First row
B. First column as Series
C. First column as DataFrame
D. Entire DataFrame
Correct answer: B.
10. What will this code output?
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(len(df))
A. 1
B. 2
C. 3
D. Error
Correct answer: C.
11. What is returned?
df.values
A. Series
B. DataFrame
C. NumPy ndarray
D. list
Correct answer: C.
12. What does this code output?
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(df.index)
A. [0, 1, 2]
B. list
C. RangeIndex
D. ndarray
Correct answer: C.
13. What is the result?
df.columns
A. list
B. Series
C. Index
D. dict
Correct answer: C.
14. What does this return?
df.dtypes
A. dict
B. Series
C. DataFrame
D. ndarray
Correct answer: B.
15. What is printed?
import pandas as pd
s = pd.Series([1, None, 3])
print(s.isna().sum())
A. 0
B. 1
C. 2
D. 3
Correct answer: B.
16. What does this code output?
import pandas as pd
s = pd.Series([1, None, 3])
print(s.dropna().values)
A. [1, None, 3]
B. [None]
C. [1, 3]
D. Error
Correct answer: C.
17. What does this expression return?
df.head(1)
A. First column
B. First row as Series
C. First row as DataFrame
D. Entire DataFrame
Correct answer: C.
18. What is the output?
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(df.tail(1)["a"].iloc[0])
A. 1
B. 2
C. 3
D. Error
Correct answer: C.
19. What happens here?
df["c"] = df["a"] * 2
A. Raises KeyError
B. Modifies column a
C. Adds new column c
D. No effect
Correct answer: C.
20. What does this code output?
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(df.sum().iloc[0])
A. 1
B. 3
C. 6
D. Error
Correct answer: C.
21. What does df.mean() return?
A. scalar
B. Series
C. DataFrame
D. ndarray
Correct answer: B.
22. What is the result?
df["a"].dtype
A. int
B. numpy.int64
C. object
D. float
Correct answer: B.
23. What does this code do?
df = df.rename(columns={"a": "x"})
A. Renames index
B. Renames column a to x
C. Deletes column a
D. Copies DataFrame only
Correct answer: B.
24. What does this expression return?
df.loc[df["a"] > 1, :]
A. Boolean Series
B. Filtered DataFrame
C. Filtered Series
D. Error
Correct answer: B.
25. What is printed?
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]})
print(df.empty)
A. True
B. False
C. None
D. Error
Correct answer: B.
1. What is the output of this code?
import pandas as pd
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
print(s.reindex(['c', 'a', 'd']))
A. Series with values [3, 1, NaN]
B. Series with values [3, 1]
C. KeyError
D. Series with values [1, 3, NaN]
Correct answer: A.
2. What does this code produce?
import pandas as pd
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
print(df.assign(c=lambda x: x['a'] + x['b'])['c'].iloc[1])
A. 3
B. 4
C. 5
D. 6
Correct answer: D.
3. What is the result?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
df.loc[df['a'] > 1, 'a'] = 0
print(df['a'].tolist())
A. [1, 2, 3]
B. [1, 0, 0]
C. [0, 0, 0]
D. [1, 2, 0]
Correct answer: B.
4. What does this output?
import pandas as pd
s = pd.Series([10, 20, 30], index=[2, 0, 1])
print(s.sort_index().iloc[0])
A. 10
B. 20
C. 30
D. IndexError
Correct answer: B.
5. What is returned?
import pandas as pd
df = pd.DataFrame({'a': [1, 1, 2]})
print(df['a'].value_counts().loc[1])
A. 1
B. 2
C. 3
D. KeyError
Correct answer: B.
6. What does this code output?
import pandas as pd
s = pd.Series([1, 2, 3])
print(s.map({1: 'a', 2: 'b'}).isna().sum())
A. 0
B. 1
C. 2
D. 3
Correct answer: B.
7. What is the result?
import pandas as pd
df = pd.DataFrame({'a': [1, None, 3]})
print(df['a'].astype('Int64').isna().sum())
A. 0
B. 1
C. 2
D. Raises error
Correct answer: B.
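A brief sketch of what happens in question 7, assuming default pandas settings: the None first turns the column into float64 with NaN, and the cast to the nullable Int64 dtype keeps the missing value as pd.NA instead of raising.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, None, 3]})

as_float = df['a']                      # None forces float64; missing value is NaN
as_nullable = df['a'].astype('Int64')   # nullable integer; missing value is pd.NA

print(as_float.dtype, as_nullable.dtype)
print(as_nullable.isna().sum())
```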
8. What does this produce?
import pandas as pd
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
print(df.filter(regex='a').shape)
A. (1, 2)
B. (2, 1)
C. (2, 2)
D. (1, 1)
Correct answer: B.
9. What is printed?
import pandas as pd
s = pd.Series(['1', '2', '3'])
print(s.str.cat(sep='-'))
A. 1-2-3
B. ['1-2-3']
C. Series
D. Error
Correct answer: A.
10. What does this code return?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.sample(n=1).shape)
A. (3, 1)
B. (1, 3)
C. (1, 1)
D. Depends on random seed
Correct answer: C.
11. What is the result?
import pandas as pd
s = pd.Series([1, 2, 3, 4])
print(s.rolling(2).sum().iloc[-1])
A. 4
B. 5
C. 7
D. NaN
Correct answer: C.
12. What does this output?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.eval('b = a * 2').shape)
A. (3, 1)
B. (3, 2)
C. (1, 3)
D. Error
Correct answer: B.
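A small sketch behind question 12: with the default inplace=False, an assignment expression in eval returns a new DataFrame with the extra column and leaves the original untouched. The variable name with_b is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

with_b = df.eval('b = a * 2')  # returns a new frame; df itself is not modified

print(with_b.shape)
print(df.shape)
```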
13. What is returned?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.query('a % 2 == 0')['a'].iloc[0])
A. 1
B. 2
C. 3
D. KeyError
Correct answer: B.
14. What does this code output?
import pandas as pd
s = pd.Series([1, 2, 3])
print(s.to_frame().shape)
A. (1, 3)
B. (3, 1)
C. (3, 3)
D. (1, 1)
Correct answer: B.
15. What is the result?
import pandas as pd
df = pd.DataFrame({'a': [1, 2]})
print(df.T.shape)
A. (2, 1)
B. (1, 2)
C. (2, 2)
D. (1, 1)
Correct answer: B.
16. What does this print?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.shift(1)['a'].isna().sum())
A. 0
B. 1
C. 2
D. 3
Correct answer: B.
17. What is the output?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.duplicated().any())
A. True
B. False
C. None
D. Error
Correct answer: B.
18. What does this code return?
import pandas as pd
s = pd.Series([3, 1, 2])
print(s.rank().tolist())
A. [3, 1, 2]
B. [1, 2, 3]
C. [3.0, 1.0, 2.0]
D. [3.0, 1.0, 2.0] sorted
Correct answer: C.
19. What is printed?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.memory_usage(deep=True).iloc[1] > 0)
A. True
B. False
C. None
D. Error
Correct answer: A.
20. What does this produce?
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3]})
print(df.select_dtypes(include='int').shape)
A. (3, 0)
B. (0, 1)
C. (3, 1)
D. (1, 3)
Correct answer: C.
I have sent you some real and important questions based on my reading of the book "Pandas Cookbook 2025".