What happens if no document matches the query?
Anonymous Quiz
22%
Returns all documents
44%
Returns empty results
22%
Returns half of the documents
11%
Returns random results
What is the purpose of using an inverted index?
Anonymous Quiz
40%
Reduce storage
40%
Speed up search
20%
Translate documents
0%
Remove duplicates
What are the components of an inverted index?
Anonymous Quiz
70%
Dictionary + Postings Lists
20%
Spreadsheet
10%
SQL database
0%
Text files
What does the dictionary contain?
Anonymous Quiz
60%
All unique terms
30%
Full text of documents
10%
Usernames
0%
Titles only
What do postings lists contain?
Anonymous Quiz
89%
List of documents containing the term
0%
Synonyms
11%
Letter frequency
0%
Image locations
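The dictionary and postings lists from the last three questions can be sketched in a few lines. This is a minimal illustration, not a production design: the function name `build_inverted_index`, the sample documents, and the whitespace-plus-lowercase tokenization are all assumptions made for the example.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a sorted list of IDs of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: naive tokenization by lowercasing and splitting on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    # Keys form the dictionary (all unique terms); values are the postings lists.
    return {term: sorted(ids) for term, ids in postings.items()}

# Hypothetical sample collection.
docs = {
    1: "new home sales",
    2: "home prices rise",
    3: "new car sales",
}
index = build_inverted_index(docs)

print(sorted(index))   # ['car', 'home', 'new', 'prices', 'rise', 'sales']
print(index["home"])   # [1, 2]
```

A lookup for a term that is not in the dictionary, for example `index.get("boat", [])`, gives an empty postings list, which is why a query with no matching document returns empty results.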
What is the first step in building an inverted index?
Anonymous Quiz
89%
Tokenization
0%
Normalization
11%
Stemming
0%
Sorting
Tokenization means:
Anonymous Quiz
73%
Splitting text into words/tokens
27%
Removing short words
0%
Compressing text
0%
Translating text
Challenge posed by "Hewlett-Packard" during tokenization:
Anonymous Quiz
44%
One word or two?
33%
Ambiguous meaning
22%
Long length
0%
Not used
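The "one word or two?" question can be made concrete with a small tokenizer. The sketch below is only an illustration: the function name `tokenize`, the `split_hyphens` flag, and the regular expressions are assumptions chosen for the example, and real tokenizers handle many more cases.

```python
import re

def tokenize(text, split_hyphens=False):
    """Split text into alphanumeric tokens.

    With split_hyphens=False a hyphenated name stays one token;
    with split_hyphens=True it is broken at each hyphen.
    """
    if split_hyphens:
        pattern = r"[A-Za-z0-9]+"
    else:
        pattern = r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*"
    return re.findall(pattern, text)

sentence = "Hewlett-Packard makes printers."
print(tokenize(sentence))                      # ['Hewlett-Packard', 'makes', 'printers']
print(tokenize(sentence, split_hyphens=True))  # ['Hewlett', 'Packard', 'makes', 'printers']
```

Whichever choice is made at indexing time has to be applied to queries as well, otherwise a search for "Hewlett-Packard" may not find documents indexed under "Hewlett" and "Packard".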
The query "wink AND drink" means:
Anonymous Quiz
33%
Union of lists
56%
Intersection of lists
11%
Subtract lists
0%
Nothing
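An AND query is answered by intersecting the postings lists of the two terms. The following sketch assumes both lists are sorted by document ID and walks them in a single linear pass; the name `intersect` and the document IDs are made up for the example.

```python
def intersect(p1, p2):
    """Intersect two sorted postings lists with a linear merge."""
    result = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            # Document contains both terms.
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result

# Hypothetical postings lists for the two query terms.
wink = [1, 4, 7, 9]
drink = [2, 4, 9, 12]

print(intersect(wink, drink))  # [4, 9]
print(intersect(wink, []))     # []
```

The second call shows the case where one term matches no document: the intersection, and therefore the result of the query, is empty.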
What distinguishes a positional index from a standard inverted index?
Anonymous Quiz
63%
Stores word positions
38%
Smaller size
0%
Always faster
0%
No difference
Purpose of a positional index?
Anonymous Quiz
100%
Support phrase and proximity queries
0%
Reduce storage
0%
Compress text
0%
No use
Drawback of a positional index?
Anonymous Quiz
30%
Large size
10%
Always slow
30%
Inaccurate
30%
Complicated
The query operator NEAR/3 means:
Anonymous Quiz
70%
Two words within 3 words of each other
10%
Two words within 3 documents
20%
Two words within 3 seconds
0%
Two identical words
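A positional index stores, for every term and document, the positions at which the term occurs, which is what makes phrase and proximity queries possible. The sketch below is a simplified illustration: the names `build_positional_index`, `near`, and `phrase` are hypothetical, tokenization is a plain whitespace split, and "within k words" is assumed to mean a position difference of at most k.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map term -> doc ID -> list of word positions (0-based)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index

def near(index, t1, t2, k):
    """Doc IDs where t1 and t2 occur at most k positions apart."""
    docs1, docs2 = index.get(t1, {}), index.get(t2, {})
    return [
        doc_id
        for doc_id in sorted(docs1.keys() & docs2.keys())
        if any(abs(a - b) <= k for a in docs1[doc_id] for b in docs2[doc_id])
    ]

def phrase(index, t1, t2):
    """Doc IDs where t2 immediately follows t1."""
    docs1, docs2 = index.get(t1, {}), index.get(t2, {})
    return [
        doc_id
        for doc_id in sorted(docs1.keys() & docs2.keys())
        if any(a + 1 in docs2[doc_id] for a in docs1[doc_id])
    ]

# Hypothetical sample collection.
docs = {
    1: "the quick brown fox jumps",
    2: "quick thinking saved the very old red fox",
}
pos_index = build_positional_index(docs)

print(near(pos_index, "quick", "fox", 3))   # [1]
print(near(pos_index, "quick", "fox", 7))   # [1, 2]
print(phrase(pos_index, "quick", "brown"))  # [1]
```

The extra position lists are also the reason for the size drawback: every occurrence of every term is recorded, not just the fact that a document contains the term.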
Purpose of stop word removal?
Anonymous Quiz
100%
Reduce size and focus on important terms
0%
Increase frequency
0%
Improve translation
0%
Compress text
Which words are often stop words?
Anonymous Quiz
89%
the, a, is
0%
computer
0%
algorithm
11%
database
When should stop words not be removed?
Anonymous Quiz
89%
In phrases like "to be or not to be"
0%
Scientific texts
11%
Data compression
0%
Using images
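Stop word removal and its risk for phrase queries can both be seen in a short example. The stop list below is a tiny, made-up set for illustration only, and `remove_stop_words` is a hypothetical helper name; real systems use longer, language-specific lists or keep stop words altogether.

```python
# Assumption: a deliberately small stop list for the example.
STOP_WORDS = {"the", "a", "is", "to", "be", "or", "not"}

def remove_stop_words(tokens):
    """Drop every token that appears in the stop list."""
    return [t for t in tokens if t not in STOP_WORDS]

print(remove_stop_words("the index is a data structure".split()))
# ['index', 'data', 'structure']

print(remove_stop_words("to be or not to be".split()))
# []
```

The first call keeps only the content-bearing terms. The second shows why removal can be harmful: the whole phrase consists of stop words, so nothing is left to search for.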
What is normalization?
Anonymous Quiz
90%
Mapping different forms of a word to a single common form
0%
Removing short words
0%
Alphabetical ordering
10%
Text compression
What is case folding?
Anonymous Quiz
100%
Convert all letters to lowercase
0%
Capitalize letters
0%
Remove letters
0%
Keep original
Drawback of case folding?
Anonymous Quiz
90%
Loses the distinction between "Black" and "black"
0%
Increase size
10%
Hard to read
0%
System slowdown
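Normalization and case folding can be illustrated together. The sketch assumes just two rules, removing periods and lowercasing; the function name `normalize` and the `fold_case` flag are hypothetical, and real normalizers apply many more rules.

```python
def normalize(token, fold_case=True):
    """Map variant spellings of a token to one common form."""
    token = token.replace(".", "")  # "U.S.A." and "USA" become the same string
    return token.lower() if fold_case else token

print(normalize("U.S.A."), normalize("USA"))  # usa usa

# Case folding merges the surname "Black" with the colour "black".
print(normalize("Black") == normalize("black"))                                    # True
print(normalize("Black", fold_case=False) == normalize("black", fold_case=False))  # False
```

The last two lines show the trade-off from the quiz: folding case lets "USA" and "usa" match, but it also makes a proper name such as "Black" indistinguishable from the common word.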
What is stemming?
Anonymous Quiz
80%
Reducing a word to its stem (root form)
20%
Deleting it
0%
Splitting text
0%
Translating
Most famous stemming algorithm for English:
Anonymous Quiz
73%
Porter Stemmer
27%
RSA
0%
Huffman
0%
Soundex
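The idea behind stemming can be shown with a toy suffix stripper. This is explicitly not the Porter algorithm, which applies several ordered phases of rewrite rules; the suffix list, the minimum stem length of 3, and the name `toy_stem` are assumptions made for the example.

```python
# Assumption: a handful of suffixes, checked longest first.
SUFFIXES = ("ing", "ed", "es", "s")

def toy_stem(word):
    """Strip the first matching suffix if at least 3 characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print([toy_stem(w) for w in ["jumps", "jumped", "jumping"]])
# ['jump', 'jump', 'jump']

print(toy_stem("is"))       # is   (too short to strip)
print(toy_stem("running"))  # runn (a limitation of this toy rule set)
```

All three forms of "jump" collapse to the same index term, which is the point of stemming. The "running" case shows why practical stemmers need more careful rules; for real use, an existing implementation such as `nltk.stem.PorterStemmer` is the more sensible choice.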