علم البيانات | DS2 Quizes

The Bag-of-Words (BOW) model assumes that:

Anonymous Quiz

All words have equal importance [12%]

Word order is critically important for meaning [12%]

Documents should be treated as strings

The topic of a document is determined by the words it contains, not their order [76%]

17 voters, 220 views, Eng. Hazem F. Al-Huthaifi, 22:55
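
The bag-of-words assumption in the quiz above can be illustrated with a minimal Python sketch: two sentences that use the same words in a different order end up with the same representation. The whitespace tokenizer and the `bag_of_words` helper are assumptions made for illustration, not something the quiz specifies.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase, whitespace-separated tokens; word order is discarded."""
    return Counter(text.lower().split())


doc_a = "the dog bit the man"
doc_b = "the man bit the dog"

# The two sentences mean different things, but their bags are identical.
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)
print(same_bag)
```

This is why a pure bag-of-words model cannot tell these two sentences apart.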

The Boolean Retrieval Model is called exact-match because it:

Anonymous Quiz

Is 100% accurate

Always finds all relevant documents [11%]

Uses a precise ranking function

Returns documents that exactly satisfy the Boolean logical condition [83%]

18 voters, 216 views, Eng. Hazem F. Al-Huthaifi, 22:55
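
As a rough sketch of exact-match behaviour, Boolean operators can be mapped onto Python set operations: a document is either in the result set or not, with no ranking. The three toy documents and the `matching` helper are hypothetical and exist only for this example.

```python
# Toy collection (hypothetical): doc ID -> text
docs = {
    1: "information retrieval with boolean queries",
    2: "boolean algebra for circuits",
    3: "ranking in information retrieval",
}


def matching(term: str) -> set:
    """Return the IDs of documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}


# Query: information AND retrieval AND NOT boolean
# AND maps to set intersection, AND NOT maps to set difference.
result = (matching("information") & matching("retrieval")) - matching("boolean")
print(sorted(result))
```

Document 1 is excluded even though it is clearly about information retrieval, because it fails the `NOT boolean` condition; exact-match means the condition is satisfied or it is not.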

An example of a Boolean query is:

Anonymous Quiz

19 voters, 219 views, Eng. Hazem F. Al-Huthaifi, 22:55

The main component of an IR system that matches queries to documents is the:

Anonymous Quiz

19 voters, 222 views, Eng. Hazem F. Al-Huthaifi, 23:06

Which of these is NOT a typical application of IR technology?

Anonymous Quiz

Recommender system

Email search

Relational database management system (RDBMS) [86%]

Web search engine

21 voters, 208 views, Eng. Hazem F. Al-Huthaifi, 23:07

The vocabulary problem in IR refers to:

Anonymous Quiz

Not having a large enough vocabulary

Translating queries into another language [24%]

Having too many words in the dictionary

The mismatch between the terms users employ in queries and the terms authors use in documents [71%]

17 voters, 188 views, Eng. Hazem F. Al-Huthaifi, 23:22
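
A small sketch of the vocabulary mismatch described above: a user searches for "car" while an author wrote "automobile", so plain term matching misses the document. Expanding the query with synonyms is one common mitigation; the `SYNONYMS` table and the `search` helper below are hand-written assumptions for illustration only.

```python
# Toy collection (hypothetical): doc ID -> text
documents = {
    1: "used automobile for sale",
    2: "car rental prices",
}

# Hand-written synonym table, purely illustrative.
SYNONYMS = {"car": {"car", "automobile"}}


def search(query_term: str, expand: bool = False) -> list:
    """Return sorted IDs of documents sharing at least one term with the query."""
    terms = SYNONYMS.get(query_term, {query_term}) if expand else {query_term}
    return sorted(
        doc_id for doc_id, text in documents.items() if terms & set(text.split())
    )


plain = search("car")                    # exact term only
expanded = search("car", expand=True)    # query expanded with synonyms
print(plain, expanded)
```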

The Cranfield Paradigm for evaluation requires:

Anonymous Quiz

A live user study

Only a document collection

Only a set of queries

A document collection, queries, and relevance judgments [93%]

15 voters, 174 views, Eng. Hazem F. Al-Huthaifi, 23:22
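
To make the three Cranfield ingredients concrete, here is a minimal sketch that scores one system's output for one query against relevance judgments, using set-based precision and recall. The names `qrels` (judgments) and `run` (system output) and all document IDs are hypothetical.

```python
def precision_recall(retrieved: list, relevant: set) -> tuple:
    """Compute set-based precision and recall for a single query."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Relevance judgments: query ID -> IDs of documents judged relevant.
qrels = {"q1": {"d1", "d3", "d4"}}

# Output of some system for the same query.
run = {"q1": ["d1", "d2", "d3"]}

p, r = precision_recall(run["q1"], qrels["q1"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Because the judgments are fixed, a second system can be scored with the same `qrels` and compared directly.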

A test collection in IR is reusable because:

Anonymous Quiz

It only uses synthetic data

It never changes

It is very small

It allows different systems to be compared on the same ground truth [86%]

14 voters, 181 views, Eng. Hazem F. Al-Huthaifi, 23:22

System-centered evaluation focuses on:

Anonymous Quiz

The aesthetics of the interface

How quickly users learn the system

User satisfaction surveys

Measuring performance against a fixed test collection [92%]

13 voters, 198 views, Eng. Hazem F. Al-Huthaifi, 23:22

User-centered evaluation focuses on:

Anonymous Quiz

The number of documents in the collection

The compression ratio of the index

The speed of the indexing algorithm

Involving real users to complete tasks and measuring their success/satisfaction [85%]

13 voters, 201 views, Eng. Hazem F. Al-Huthaifi, 23:22

The unit document problem in indexing refers to:

Anonymous Quiz

Removing stop words

Tokenizing the text

Choosing a character encoding

Deciding what constitutes a single document to be indexed (e.g., a book, a chapter, a page) [100%]

13 voters, 185 views, Eng. Hazem F. Al-Huthaifi, 23:22

IR systems are important because they:

Anonymous Quiz

Are easier to build than other systems

Can understand the semantic meaning of all text

Are faster than database systems

Help users overcome information overload by finding needed information [93%]

14 voters, 166 views, Eng. Hazem F. Al-Huthaifi, 23:22

The initial step in any IR process is:

Anonymous Quiz

Building an index

Displaying the results

Ranking the results [18%]

Understanding the user's information need [76%]

17 voters, 169 views, Eng. Hazem F. Al-Huthaifi, 23:22

An inverted index consists of:

Anonymous Quiz

A matrix of term frequencies

A list of documents

A graph of document relationships

A dictionary of terms and their corresponding postings lists [92%]

12 voters, 163 views, Eng. Hazem F. Al-Huthaifi, 00:47
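
The structure named in the quiz above, a dictionary of terms that each point to a postings list, can be sketched in a few lines of Python. This is a minimal in-memory version under simple assumptions (lowercasing, whitespace tokenization); the toy documents and the `build_inverted_index` helper are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs: dict) -> dict:
    """Map each term to a sorted list of IDs of the documents containing it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):  # visiting IDs in order keeps postings sorted
        for term in set(docs[doc_id].lower().split()):
            index[term].append(doc_id)
    return dict(index)


docs = {
    1: "new home sales",
    2: "home prices rise",
    3: "new car sales",
}

index = build_inverted_index(docs)
print(index["home"], index["sales"])
```

Answering a one-term query is then a dictionary lookup rather than a scan over every document.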

A posting in an inverted index typically contains:

Anonymous Quiz

The user who authored the document

The entire text of a document

The relevance score for that term-document pair

A document ID and often the term frequency or positions [91%]

11 voters, 160 views, Eng. Hazem F. Al-Huthaifi, 00:47
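
One possible shape for a single posting, following the quiz answer above: a document ID plus the positions where the term occurs, with the term frequency derived from the positions. The `Posting` class and `build_positional_index` are illustrative assumptions, not a reference implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Posting:
    doc_id: int
    positions: list = field(default_factory=list)  # token offsets in the document

    @property
    def term_frequency(self) -> int:
        # How often the term occurs in this document.
        return len(self.positions)


def build_positional_index(docs: dict) -> dict:
    """Map each term to a list of Posting objects ordered by document ID."""
    by_term = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            posting = by_term[term].setdefault(doc_id, Posting(doc_id))
            posting.positions.append(position)
    return {
        term: [postings[doc_id] for doc_id in sorted(postings)]
        for term, postings in by_term.items()
    }


positional = build_positional_index({1: "to be or not to be", 2: "to do"})
first = positional["to"][0]
print(first.doc_id, first.term_frequency, first.positions)
```

Storing positions costs more space than storing the document ID alone, but it is what makes phrase and proximity queries possible.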

The indexing process is typically performed:

Anonymous Quiz

Only when a document is updated

At query time (online)

Both online and offline

Offline, before any queries are received [82%]

11 voters, 149 views, Eng. Hazem F. Al-Huthaifi, 00:47

The first step in the text preprocessing pipeline is:

Anonymous Quiz

9 voters, 152 views, Eng. Hazem F. Al-Huthaifi, 00:47

Tokenization is the process of:

Anonymous Quiz

Removing common words

Translating words into another language

Splitting a text stream into individual tokens (words) [55%]

Reducing words to their root form [36%]

11 voters, 156 views, Eng. Hazem F. Al-Huthaifi, 00:47
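
A minimal regex-based tokenizer, as a sketch of what splitting a text stream into tokens can look like. The pattern is an assumption chosen for simple English text; it only splits and lowercases. Reducing words to their root form, the second most popular answer above, is stemming, a separate later step.

```python
import re

# Runs of letters or digits, optionally followed by an apostrophe suffix
# so that contractions such as "doesn't" stay in one piece.
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")


def tokenize(text: str) -> list:
    """Lowercase the text and return its tokens in order of appearance."""
    return TOKEN_PATTERN.findall(text.lower())


tokens = tokenize("Tokenization splits text, doesn't it?")
print(tokens)
```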

A significant challenge in tokenizing text from social media (like Twitter) is:

Anonymous Quiz

The text is too long

The text is in multiple languages

There are no spaces between words

Handling hashtags (#), mentions (@), and emoticons [91%]

11 voters, 173 views, Eng. Hazem F. Al-Huthaifi, 00:47
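
A word-only tokenizer would strip the `#` and `@` markers and break emoticons apart. The sketch below tries the social media patterns first and falls back to plain words; the pattern covers only a handful of simple emoticons and is an assumption for illustration, not a complete tweet tokenizer.

```python
import re

TWEET_TOKEN = re.compile(
    r"""
      [#@]\w+              # hashtags and mentions, marker kept
    | [:;=][-']?[)(DPp]    # a few simple emoticons
    | \w+(?:'\w+)?         # ordinary words, including contractions
    """,
    re.VERBOSE,
)


def tokenize_tweet(text: str) -> list:
    """Return tokens, keeping hashtags, mentions and emoticons intact."""
    return TWEET_TOKEN.findall(text)


tweet_tokens = tokenize_tweet("Loving #NLP with @alice :-) it's fun")
print(tweet_tokens)
```

The order of the alternatives matters: the hashtag and emoticon patterns are tried before the generic word pattern.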

Stop words are:

Anonymous Quiz

Words that are misspelled

The most important words in a document [10%]

Words that signal the end of a sentence

Very common words that often carry little semantic weight (e.g., the, a, is) [90%]

10 voters, 184 views, Eng. Hazem F. Al-Huthaifi, 00:47
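
A short sketch of stop word filtering using the three examples from the quiz answer. Real stop lists are longer and language-specific; the `STOP_WORDS` set and `remove_stop_words` helper here are illustrative assumptions.

```python
# Tiny stop list taken from the quiz examples; real lists are much longer.
STOP_WORDS = {"the", "a", "is"}


def remove_stop_words(tokens: list) -> list:
    """Keep only tokens that are not in the stop list (case-insensitive)."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]


content_tokens = remove_stop_words("The cat is on a mat".split())
print(content_tokens)
```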

The main reason for removing stop words is to:

Anonymous Quiz

Make the text easier to read

Improve grammatical correctness

Increase the number of relevant documents

Reduce the size of the index and improve efficiency [82%]

11 voters, 164 views, Eng. Hazem F. Al-Huthaifi, 00:47
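
To give a feel for the size argument, the sketch below counts how many (term, document) postings a toy collection would produce with and without a stop list. The documents, the stop list and the `count_postings` helper are hypothetical, and the numbers only illustrate the direction of the effect.

```python
# Illustrative stop list; not a standard one.
STOP_LIST = frozenset({"the", "a", "is", "of", "in", "and"})


def count_postings(texts: list, stop_words: frozenset = frozenset()) -> int:
    """Count distinct (term, document) pairs that would be stored in an index."""
    return sum(len(set(text.lower().split()) - stop_words) for text in texts)


collection = [
    "the cat is in the hat",
    "a dog and a cat",
    "the price of the car",
]

full_size = count_postings(collection)
reduced_size = count_postings(collection, STOP_LIST)
print(full_size, reduced_size)
```

In this toy collection more than half of the postings belong to stop words, so dropping them shrinks the index without removing any content-bearing term.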
