Which is true about Bag of Words?
Anonymous Quiz
69%
Treats a document as an unordered collection of words
23%
Preserves word order
8%
Only for scripts
0%
Only used in databases
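
To make the "without order" idea concrete, here is a minimal sketch, assuming simple whitespace tokenization and lowercasing. The helper name `bag_of_words` and the two sample sentences are illustrative and not part of the quiz.

```python
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Count whitespace-separated, lowercased tokens; word order is discarded."""
    return Counter(text.lower().split())

a = bag_of_words("the dog bit the man")
b = bag_of_words("the man bit the dog")

# Different sentences, identical bags: the representation cannot tell them apart.
same_bag = (a == b)   # True
the_count = a["the"]  # 2
```

Strictly speaking, a bag keeps word counts (a multiset), whereas the Boolean model only needs to know whether a word is present.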
In the Boolean model, which operator expands the result set?
Anonymous Quiz
8%
AND
15%
OR
0%
NOT
77%
All of the above
Query: AI AND ML returns:
Anonymous Quiz
18%
Documents containing either term
64%
Documents containing both terms
9%
Documents without AI
9%
Documents without ML
Query: cloud NOT storage returns:
Anonymous Quiz
17%
All documents
50%
Documents containing cloud but not storage
8%
Documents containing storage only
25%
Nothing
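
The three Boolean operators map directly onto set operations. The sketch below assumes a tiny hypothetical collection (`docs`) and a hypothetical helper `matching`; it is an illustration, not how a production system stores its data.

```python
# Hypothetical four-document collection, keyed by document ID.
docs = {
    1: "ai and ml in the cloud",
    2: "cloud storage pricing",
    3: "ml for cloud workloads",
    4: "ai ethics",
}

def matching(term: str) -> set[int]:
    """Return the IDs of documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}

ai_and_ml = matching("ai") & matching("ml")                  # {1}: both terms
ai_or_ml = matching("ai") | matching("ml")                   # {1, 3, 4}: either term
cloud_not_storage = matching("cloud") - matching("storage")  # {1, 3}
no_match = matching("quantum")                               # set(): empty result
```

OR can only grow the result set, while AND and NOT can only shrink it. A query with no matching documents yields an empty set.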
How does the Boolean model partition the document collection?
Anonymous Quiz
67%
Matching / non-matching documents
22%
Important / unimportant documents
0%
Recent / old documents
11%
Short / long documents
What does Exact-Match Retrieval mean?
Anonymous Quiz
67%
Retrieve documents that fully satisfy the query
11%
Retrieve similar documents
11%
Retrieve related documents
11%
No meaning
What is the purpose of the term-document incidence matrix?
Anonymous Quiz
78%
Shows presence or absence of a term in a document
11%
Stores full text
0%
Counts word frequency
11%
Sorts documents
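
A term-document incidence matrix can be sketched in a few lines. The three sample documents below are made up for illustration; rows are terms, columns are documents, and each cell is 1 if the term occurs in the document and 0 otherwise.

```python
# Hypothetical three-document collection.
docs = ["cloud storage", "cloud ai", "ai ml"]

# Vocabulary: every unique term, sorted for a stable row order.
vocab = sorted({term for doc in docs for term in doc.split()})

# One row per term, one column per document; cells record presence only.
matrix = [[1 if term in doc.split() else 0 for doc in docs] for term in vocab]

# vocab  -> ['ai', 'cloud', 'ml', 'storage']
# matrix -> [[0, 1, 1],
#            [1, 1, 0],
#            [0, 0, 1],
#            [1, 0, 0]]
```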
What is the problem with the incidence matrix for large collections?
Anonymous Quiz
22%
Very dense
0%
Very small
78%
Sparse, full of zeros
0%
Cannot be created
With 1 million documents and 500,000 terms, how many cells does the incidence matrix have?
Anonymous Quiz
30%
500 million
0%
1 billion
70%
500 billion
0%
1 trillion
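
The cell count is just documents times terms. The sketch below also estimates how sparse such a matrix would be, under the assumption (not stated in the quiz) that each document contains about 1,000 distinct terms.

```python
num_docs = 1_000_000
num_terms = 500_000

# Every (term, document) pair gets one cell.
total_cells = num_docs * num_terms  # 500_000_000_000, i.e. 500 billion

# Assumption for illustration only: roughly 1,000 distinct terms per document.
assumed_terms_per_doc = 1_000
nonzero_cells = num_docs * assumed_terms_per_doc  # 1 billion ones at most
density = nonzero_cells / total_cells             # 0.002, so 99.8% zeros
```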
What is the biggest drawback of the Boolean model?
Anonymous Quiz
75%
Does not rank results by relevance
25%
Always slow
0%
Never used
0%
Relies on AI
Which systems still use the Boolean model?
Anonymous Quiz
13%
Modern search engines only
88%
Email systems, library catalogs, Spotlight
0%
Neural networks
0%
None
What happens if no document matches the query?
Anonymous Quiz
22%
Returns all documents
44%
Returns empty results
22%
Returns half of the documents
11%
Returns random results
What is the purpose of using an inverted index?
Anonymous Quiz
40%
Reduce storage
40%
Speed up search
20%
Translate documents
0%
Remove duplicates
What are the components of an inverted index?
Anonymous Quiz
70%
Dictionary + Postings Lists
20%
Spreadsheet
10%
SQL database
0%
Text files
What does the dictionary contain?
Anonymous Quiz
60%
All unique terms
30%
Full text of documents
10%
Usernames
0%
Titles only
What do postings lists contain?
Anonymous Quiz
89%
List of documents containing the term
0%
Synonyms
11%
Letter frequency
0%
Image locations
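
The dictionary and postings lists can be sketched together as a mapping from each term to a sorted list of document IDs. The function name `build_inverted_index` and the sample documents are hypothetical. Tokenization here is plain whitespace splitting with lowercasing.

```python
from collections import defaultdict

def build_inverted_index(docs: dict[int, str]) -> dict[str, list[int]]:
    """Map each unique term (the dictionary) to a sorted list of doc IDs (its postings)."""
    postings: defaultdict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            postings[token].add(doc_id)
    # Sort the terms and each postings list so lookups and merges are predictable.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}

index = build_inverted_index({1: "Cloud storage", 2: "cloud AI", 3: "AI ML"})
# index -> {'ai': [2, 3], 'cloud': [1, 2], 'ml': [3], 'storage': [1]}
```

Only the (term, document) pairs that actually occur are stored, which avoids the zeros of the incidence matrix. A query term is looked up directly in the dictionary instead of scanning every document.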
What is the first step in building an inverted index?
Anonymous Quiz
89%
Tokenization
0%
Normalization
11%
Stemming
0%
Sorting
Tokenization means:
Anonymous Quiz
73%
Splitting text into words/tokens
27%
Removing short words
0%
Compressing text
0%
Translating text
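
A minimal tokenization sketch, assuming that a token is any run of ASCII letters or digits; real tokenizers handle hyphens, apostrophes, and non-English text with more care. Lowercasing is shown as a separate normalization step that follows tokenization. The sample sentence is made up.

```python
import re

def tokenize(text: str) -> list[str]:
    """Split text into tokens: maximal runs of ASCII letters or digits."""
    return re.findall(r"[A-Za-z0-9]+", text)

tokens = tokenize("Cloud-based AI, ML & storage!")
# tokens -> ['Cloud', 'based', 'AI', 'ML', 'storage']

# Normalization is a later step; here it is only case folding.
normalized = [token.lower() for token in tokens]
# normalized -> ['cloud', 'based', 'ai', 'ml', 'storage']
```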