In this task, you’ll be presented with (1) a query that likely asks a question, (2) a response that attempts to answer the question in the query, and (3) a response source (if present), which is the webpage that the response comes from. Your job will be to review the provided query, the response, and the response source, and then answer a few questions about them.

Note that if you’re unfamiliar with the query, or with any of the terms/concepts in the query, you should conduct research by clicking on one of the provided search engine links. Only after you’ve familiarized yourself with the query should you proceed to the questions in this task.

For the purposes of this task, we’ll define four types of questions that might pertain to the query: (1) Apple product help questions, (2) information-seeking questions, (3) personalized questions, and (4) miscellaneous questions.

  • An Apple product help question refers to a question explicitly seeking help with an Apple product. Examples include: [how can i screenshot on the iphone x], [how do i access my icloud drive], [how do i change the name of my ipad], [how do i hard reset an iphone], [how do you do a screenshot on a mac], [why is my ipad glitching], and [how to set up itunes].
  • An information-seeking question refers to a question that seeks knowledge about entities or concepts in the world, and is not asking about help with an Apple product. One defining attribute of information-seeking questions is that they could be answered by someone who simply researches the topic online using publicly available information. Examples of queries that contain information-seeking questions include: [are dinosaurs reptiles], [can a mule reproduce], [how many cards are in an uno deck], [san francisco weather], and [is falafel vegetarian]. Another example is the query [causes of insomnia], which seeks general information about possible causes of a medical condition.

Note that if the query asks a question whose answer is country-dependent, but no location is mentioned in the query, you can assume that the country is the USA. These types of queries are eligible to be marked as information-seeking. For example, the answer to the query [what countries did we fight in world war 2] depends on the country of the question-asker, but remember that we can assume the country is the USA. In this case, the question should be classified as information-seeking. However, if a question requires location information that is more precise than just the country, such as the state, county, city, or neighborhood, then the query should generally be marked as personalized (defined below).

Note also that if there’s an obvious misspelling in the query that makes it unclear what’s being asked, you should mark the query as miscellaneous (defined below).

  • A personalized question refers to a question that could be answered only if additional knowledge were known about the person asking the question. Additional knowledge includes (but is not limited to): who the person is, the person’s health information, what kind of electronic devices the person owns, and what city the person is in. For example, the query [weather today] contains a personalized question because answering it would require knowledge about what city the question-asker is in. The query [how can i sync my phone to my car] contains a personalized question because it requires information about what phone and what car the question-asker has. The query [why is my internet connection so slow] contains a personalized question because it requires information about the question-asker’s internet connection. Lastly, the query [why do i have insomnia] contains a personalized question because answering it would require health information about the person asking the question. (Note that this is different from the more general query, [causes of insomnia], which is not specific to the person asking the question.)
  • A miscellaneous question refers to ANY question that does not fall into one of the three categories mentioned above. This includes (but is not limited to): questions that are incomplete, questions that are too broad, questions that are nonsense (possibly due to misspellings), questions that are impossible to answer, and questions that seek an opinion. Incomplete examples include [what is the salary of a] and [what temperature should i bake]. Too-broad examples include [who got married] and [who is elisa]. Nonsense examples include [what is how mika], [what are trampolines the flags], [what is the best anna histamine], and [population of sweedun]. Impossible-to-answer examples include [what is the meaning of life] and [why does god allow suffering]. Lastly, opinion-seeking examples include [is android or iphone better] and [what is the best cheese].
  • The query does not contain any questions: This is a simple statement with no implication that information is being sought. Examples include [my dog is big] and [it is hot].
  • The query contains more than one question: The query asks about more than one thing, or asks more than one thing about the same person or entity. Examples include [How old are Kevin, Joe, and Nick Jonas] and [How old is Nick Jonas and when did he marry Priyanka Chopra].

Depending on your response to the first question, you may be asked a total of five questions. The final question (Question #5) asks about how satisfying the response would be to users who issued the query. The answer options for this question are defined below:

  • Not Satisfying: There are severe problems with the response. The response either does not actually answer the question, or it provides an answer that is inaccurate, misleading, or too confusing to actually be useful. In the case of time-sensitive questions, the answer may be outdated and therefore invalid. (Note that misleading responses include those that might seem good at first glance, but upon examining the response source, turn out to have been taken out of context and are answering a different question entirely.) Alternatively, the response may actually provide an accurate answer to the question, but it contains inappropriate/offensive language. Overall, very few to no users who issued the query would be satisfied with the response.
  • Slightly Satisfying: The response is accurate (and in the case of time-sensitive questions, is timely and valid), but it has noticeable problems. For example, the response may be only partially answering the question, or answering a narrow interpretation of the question. The response may be confusing because it’s missing important context; for example, it may make a reference to a subject that hasn’t been defined. The response may use informal language that might be opinionated or sound like an advertisement. The response may have noticeable spelling, grammatical, or formatting issues. Alternatively, the response provides a direct and complete answer to the question, but the response is unnecessarily long, with the answer being hidden behind a substantial amount of extraneous information. Overall, only some users who issued the query would be satisfied with the response.
  • Moderately Satisfying: The response is accurate (and in the case of time-sensitive questions, is timely and valid), but it has minor problems. The response answers the question completely, but the answer may be indirect or implied. The response may be longer than necessary, but the answer should be easily found toward the beginning of the response, potentially followed by extraneous information. The response may use informal language, but it should not be strongly opinionated, sound like an advertisement, or use inappropriate/offensive language. The response may have minor spelling, grammatical, or formatting issues. The language may be advanced (such as content from medical or academic literature), but it should still be understandable by most people. Overall, most users who issued the query would be satisfied with the response.
  • Highly Satisfying: The response is accurate (and in the case of time-sensitive questions, is timely and valid). The response answers the question completely, directly, and concisely, with no extraneous information. The response sounds professional and formal, with no inappropriate/offensive language. The language is easily understood, and has no spelling, grammatical, or formatting issues. Virtually all users who issued the query would be satisfied with the response.
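
As a rough illustration only, the four rating options above can be read as a top-down check: severe problems first, then noticeable problems, then minor ones. The Python sketch below encodes that ordering. The names `Rating`, `ResponseCheck`, and `suggest_rating` are hypothetical and not part of this task or any tool; deciding whether a given problem is "noticeable" or "minor" remains the rater's judgment.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Rating(Enum):
    """The four answer options for Question #5."""
    NOT_SATISFYING = "Not Satisfying"
    SLIGHTLY_SATISFYING = "Slightly Satisfying"
    MODERATELY_SATISFYING = "Moderately Satisfying"
    HIGHLY_SATISFYING = "Highly Satisfying"


@dataclass
class ResponseCheck:
    """Hypothetical notes a rater might take while reviewing one response."""
    answers_question: bool = True      # does it address the query at all?
    accurate: bool = True              # not misleading or taken out of context
    timely: bool = True                # not outdated (time-sensitive queries)
    offensive_language: bool = False   # inappropriate/offensive wording
    # Free-text notes; which list a problem belongs in is a judgment call.
    noticeable_problems: List[str] = field(default_factory=list)
    minor_problems: List[str] = field(default_factory=list)


def suggest_rating(check: ResponseCheck) -> Rating:
    """Apply the rubric from most severe to least severe."""
    severe = (
        not check.answers_question
        or not check.accurate
        or not check.timely
        or check.offensive_language
    )
    if severe:
        return Rating.NOT_SATISFYING
    if check.noticeable_problems:
        return Rating.SLIGHTLY_SATISFYING
    if check.minor_problems:
        return Rating.MODERATELY_SATISFYING
    return Rating.HIGHLY_SATISFYING


# Example: accurate answer, but buried at the end and ad-like in tone.
example = ResponseCheck(
    noticeable_problems=["answer appears only at the end",
                         "sounds like an advertisement"]
)
print(suggest_rating(example).value)  # Slightly Satisfying
```

The ordering matters: a single severe problem, such as inappropriate language, outweighs everything else, which matches the note that such a response is Not Satisfying even when it is accurate.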

Satisfaction Rating Examples

Query | Response | Response Source | Satisfaction Rating | Rating Explanation
[how tall is the redwood tree] | The tallest non-redwood tree is a 100.3 m (329 foot) tall Douglas fir. | Link Here | Not Satisfying | The response doesn’t answer the question.
[is a viola smaller than a violin] | The viola is generally strung with heavier strings than the violin. | Link Here | Not Satisfying | The response doesn’t answer the question.
[is pectin vegan] | Pectin is generally well tolerated when ingested. | Link Here | Not Satisfying | The response doesn’t answer the question.
[how long is pregnancy] | It is estimated that a human pregnancy should be about 18 months. | Link Here | Not Satisfying | The response is inaccurate/misleading; it was taken out of context, as can be seen from the response source.
[what is the world population] | The earth has 50 billion people. | Link Here | Not Satisfying | The response is inaccurate/misleading; it was taken out of context, as can be seen from the response source.
[what happens when you sneeze with your eyes open] | It can damage your hearing, lead to an ear infection, and rupture blood vessels in the eyes and brain. It is possible to keep your eyes open while sneezing. | Link Here | Not Satisfying | The response is inaccurate/misleading; it was taken out of context, as can be seen from the response source.
[how old is evan from evan tube] | Evan is now seven years old. | Link Here | Not Satisfying | The response is outdated by several years.
[who won the par three contest] | Ben Crane won the annual Par 3 contest, which took place on April 5, with a four-under 23. Arnold Palmer and Jack Nicklaus, made a curtain call at the event; Nicklaus was one-under and was in contention throughout the day. | Link Here | Not Satisfying | The response is outdated, corresponding to the year 2006.
[who is the new mexican president] | Mexico’s President Enrique Peña Nieto on Monday. | Link Here | Not Satisfying | The response is outdated; as of 2018, Mexico’s president is Andrés Manuel López Obrador.
[where do they sell elf on the shelf] | As a toy, the Elf on the Shelf is benign enough. It’s a skinny-ass doll, about a foot long, with a big-eyed pixie face, a plastic head, and a felt body, on sale at your local big box store for $29.95. | Link Here | Not Satisfying | The response uses inappropriate language (“skinny-ass doll”), which automatically makes it Not Satisfying.
[what is polyurethane used for] | Polyurethane foam is used primarily for bedding and furniture stuffing. | Link Here | Slightly Satisfying | The response is too narrow; the question is about polyurethane in general, but the response is specifically focused on polyurethane foam. Because of this narrow interpretation, this answer is Slightly Satisfying.
[what noise does a parrot make] | They will scream or make a sound like boiling water. | Link Here | Slightly Satisfying | The response is too narrow; according to the response source, parrots will scream specifically when they feel threatened, which the question is not asking about.
[how much does a honda accord weigh] | The first generation Honda Accord was launched on 7 May 1976 as a three-door hatchback with 68 hp (51 kW), a 93.7-inch (2,380.0 mm) wheelbase, and a weight of about 2,000 pounds. | Link Here | Slightly Satisfying | The response is too narrow; the question is most likely asking about the weight of modern Honda Accords, rather than the weight of the original Honda Accord from 1976.
[when is eggplant in season] | The largest producers of eggplants in the U.S. include Florida, New Jersey and California. They’re also grown in Mexico, China, Italy, Turkey, Egypt and Japan. Popular varieties include Black Beauty, Rosa Bianca, Classic, Orient Express, Black Italian, Japanese, Lavender and Cloud 9. Eggplants are in peak season from July through ber. | Link Here | Slightly Satisfying | The response is accurate and it completely answers the question. However, the response is much longer than necessary, with lots of extraneous content, and the answer can only be found in the second half of the response. Because of this, the response is Slightly Satisfying.
[what are the steps of mitosis] | Prophase – The first stage of mitosis is known as prophase, where the nuclear chromatin starts to become organized and condenses into thick strands that eventually become chromosomes observable in the optical microscope (Figure 1(b)). The nucleoli, primarily responsible for the production of ribosomal RNA, begin to disappear as the chromosomes condense. During prophase, major changes also occur in the cytoplasm, including disassembly of the cytoskeleton components based on tubulin (cytoplasmic microtubules). | Link Here | Slightly Satisfying | The response provides only a partial answer to the question. A better answer would include all four phases: prophase, metaphase, anaphase, and telophase. As well, the answer is much longer than necessary, providing unnecessary details. The incompleteness and excessive length both contribute to this response being Slightly Satisfying.
[how much alcohol does modelo have] | Enjoy Modelo Especial at your next barbecue or while watching the game with friends. This balanced, easy-drinking beer contains 145 calories, 0 grams of fat, and 4.4% alcohol by volume per 12oz serving. | Link Here | Slightly Satisfying | The response provides an accurate answer to the question. However, because (1) it contains extraneous information with the answer only appearing at the very end, and (2) it sounds like an advertisement, it should be classified as Slightly Satisfying.
[how many licks does it take to get to center of a tootsie roll tootsie pop] | Twenty of the group’s volunteers assumed the licking challenge - unassisted by machinery - and averaged 252 licks each to the center. | Link Here | Slightly Satisfying | The response provides an accurate answer to the question. However, it misses critical context, referring to “twenty of the group’s volunteers” without explaining what that means. Missing this context causes the response to be confusing, which makes this Slightly Satisfying.
[why does pain hurts] | This theory states that pain is a function of the balance between the information traveling into the spinal cord through large nerve fibers and information traveling into the spinal cord through small nerve fibers. Remember, large nerve fibers carry non-nociceptive information and small nerve fibers carry nociceptive information. If the relative amount of activity is greater in large nerve fibers, there should be little or no pain. | Link Here | Slightly Satisfying | The response provides a very scientific answer to the question. However, for many users who might issue this query, the language used is likely to be too advanced and technical to actually be helpful. As well, the response begins with “This theory,” but the theory hasn’t actually been named or defined yet. These two issues cause this response to be Slightly Satisfying.
[who is anne coulter] | Ann Coulter is a conservative political commentator, writer, and lawyer well-known for her polarizing opinions. Read her most shocking statements. | Link Here | Slightly Satisfying | The first sentence of this response is great: it answers the question completely and directly. If the first sentence were by itself, it would qualify as a Highly Satisfying response. However, the second sentence, “Read her most shocking statements,” sounds opinionated and like an advertisement. Because of that, this response is Slightly Satisfying.
[is plavix a blood thinner] | Plavix (clopidogrel) is a blood thinner (anti-platelet) drug used for the prevention of strokes, heart attacks, and peripheral artery disease. Ibuprofen also thins the blood. If ibuprofen and Plavix are taken together it creates an interaction that thins the blood even more, which may cause gastrointestinal bleeding from ulcers in the digestive tract. | Link Here | Moderately Satisfying | The response completely answers the question, and is accurate, but it’s longer than necessary, with several extraneous sentences. However, the answer to the question appears in the first half, so this response is Moderately Satisfying.
[is there gluten in rice] | Rice does not contain gluten, so it can be eaten by people with celiac disease. However, it is not necessarily safe. There may be cross-contamination with gluten-containing products, and some rice has been found to contain a toxin, arsenic. Alternatives to rice include chia and maize. Always check the label first. | Link Here | Moderately Satisfying | The response completely answers the question, and is accurate, but it’s longer than necessary, with several extraneous sentences. However, the answer to the question appears in the first half, so this response is Moderately Satisfying.
[are dinosaurs reptiles] | Dinosaurs are a diverse group of reptiles of the clade Dinosauria. They first appeared during the Triassic period, between 243 and 233.23 million years ago, although the exact origin and timing of the evolution of dinosaurs is the subject of active research. They became the dominant terrestrial vertebrates after the Triassic–Jurassic extinction event 201 million years ago; their dominance continued through the Jurassic and Cretaceous periods. | Link Here | Moderately Satisfying | The response completely answers the question, and is accurate, but it’s longer than necessary, with several extraneous sentences. However, the answer to the question appears in the first half of the response, so this response is Moderately Satisfying.
[which states capital is salem] | Salem Salem , capital of Oregon, U.S., and the seat (1849) of Marion county. | Link Here | Moderately Satisfying | The response provides an accurate answer to the question. However, two minor problems contribute to this being Moderately Satisfying rather than Highly Satisfying. First, this is not a complete sentence, and second, the word “Salem” appears twice in the beginning of the response.
[is falafel vegetarian] | Falafel are vegan, nutritious and super delicious. | Link Here | Moderately Satisfying | The response accurately describes falafel as vegan. However, because it was written in an informal, opinionated way, it should be considered Moderately Satisfying rather than Highly Satisfying.
[what hemisphere is the united states in] | For example, the United States is in both the Northern and Western Hemisphere. | Link Here | Moderately Satisfying | The response accurately answers the question. However, because it starts with “For example,” some useful context is missing, which makes this response Moderately Satisfying rather than Highly Satisfying.
[what are minerals] | Minerals definition, any of a class of substances occurring in nature, usually comprising inorganic substances, as quartz or feldspar, of definite chemical composition and usually of definite crystal structure, but sometimes also including rocks formed by these substances as well as certain natural products of organic origin, as asphalt or coal. | Link Here | Moderately Satisfying | This response accurately answers the question. Because the language is somewhat advanced, and because of the grammatical issue caused by the first comma in the response, this response is Moderately Satisfying rather than Highly Satisfying.
[can a mule reproduce] | Mules can be either male or female, but, because of the odd number of chromosomes, they can’t reproduce. | Link Here | Highly Satisfying | This response provides a direct, complete, accurate, and concise answer.
[how many cards are in an uno deck] | There are 108 cards in a Uno deck. | Link Here | Highly Satisfying | This response provides a direct, complete, accurate, and concise answer.
[how much do human body is made of water] | Up to 60% of the human adult body is water. | Link Here | Highly Satisfying | This response provides a direct, complete, accurate, and concise answer.
[what does gps stand for] | GPS stands for Global Positioning Service. | Link Here | Highly Satisfying | This response provides a direct, complete, accurate, and concise answer.
[when is the next haleys comet] | The next predicted perihelion of Halley’s Comet is 28 July 2061. | Link Here | Highly Satisfying | This response provides a direct, complete, accurate, and concise answer.
[when did hawaii become part of the united states] | Hawaii was admitted as a U.S. state on August 21, 1959. | Link Here | Highly Satisfying | This response provides a direct, complete, accurate, and concise answer.

Question Answer Instructions

In this task, you’ll be presented with (1) a query that likely asks a question, (2) a response that attempts to answer the question in the query, and (3) a response source (if present), which is the webpage that the response comes from. Your job will be to review the provided query, the response, and the response source, and then answer a few questions about them.

Note that if you’re unfamiliar with the query, or with any of the terms/concepts in the query, you should conduct research by clicking on one of the provided search engine links. Only after you’ve familiarized yourself with the query should you proceed to the questions in this task.

For the purposes of this task, we’ll define four types of questions that might pertain to the query: (1) Apple product help questions, (2) information-seeking questions, (3) personalized questions, and (4) miscellaneous questions.

  • An Apple product help question refers to a question explicitly seeking help with an Apple product. Examples include: [how can i screenshot on the iphone x], [how do i access my icloud drive], [how do i change the name of my ipad], [how do i hard reset an iphone], [how do you do a screenshot on a mac], [why is my ipad glitching], and [how to set up itunes].
  • An information-seeking question refers to a question that seeks knowledge about entities or concepts in the world, and is not asking about help with an Apple product. One defining attribute of information-seeking questions is that they could be answered by someone who simply researches the topic online using publicly available information. Examples of queries that contain information-seeking questions include: [are dinosaurs reptiles], [can a mule reproduce], [how many cards are in an uno deck], [san francisco weather], and [is falafel vegetarian]. Another example is the query [causes of insomnia], which seeks general information about possible causes of a medical condition.

Note that if the query asks a question whose answer is country-dependent, but no location is mentioned in the query, you can assume that the country is the USA. These types of queries are eligible to be marked as information-seeking. For example, the answer to the query [what countries did we fight in world war 2] depends on the country of the question-asker, but remember that we can assume the country is the USA. In this case, the question should be classified as information-seeking. However, if a question requires location information that is more precise than just the country, such as the state, county, city, or neighborhood, then the query should generally be marked as personalized (defined below).

Note also that if there’s an obvious misspelling in the query that makes it unclear what’s being asked, you should mark the query as miscellaneous (defined below).

  • A personalized question refers to a question that could be answered only if additional knowledge were known about the person asking the question. Additional knowledge includes (but is not limited to): who the person is, the person’s health information, what kind of electronic devices the person owns, and what city the person is in. For example, the query [weather today] contains a personalized question because answering it would require knowledge about what city the question asker is in. The query [how can i sync my phone to my car] contains a personalized question because it requires information about what phone and what car the question-asker has. The query [why is my internet connection so slow] contains a personalized question because it requires information about the question-asker’s internet connection. Lastly, the query [why do i have insomnia] contains a personalized question because answering it would require health information about the person asking the question. (Note that this is different from the more general query, [causes of insomnia], which is not specific to the person asking the question.)
  • A miscellaneous question refers to ANY question that does not fall into one of the three categories mentioned above. This includes (but is not limited to): questions that are incomplete, questions that are too broad, questions that are nonsense (possibly due to misspellings), questions that are impossible to answer, and questions that seek an opinion. Incomplete examples include [what is the salary of a] and [what temperature should i bake]. Too-broad examples include [who got married] and [who is elisa]. Nonsense examples include [what is how mika], [what are trampolines the flags], [what is the best anna histamine], and [population of sweedun]. Impossible-to-answer examples include [what is the meaning of life] and [why does god allow suffering]. Lastly, opinion-seeking examples include [is android or iphone better] and [what is the best cheese].
  • The query does not contain any questions: This is a simple statement  with no implication that information is being sought: [my dog is big] [it is hot].
  • The query contains more than one question: The question asks about more than one thing, or asks more than one thing about the same person or entity: [How old are Kevin, Joe, and Nick Jonas] [How old is Nick Jonas and when did he marry Priyanka Chopra].

Depending on your response to the first question, you may be asked a total of five questions. The final question (Question #5) asks about how satisfying the response would be to users who issued the query. The answer options for this question are defined below:

  • Not Satisfying — There are severe problems with the response. The response either does not actually answer the question, or it provides an answer that is inaccurate, misleading, or too confusing to actually be useful. In the case of timesensitive questions, the answer may be outdated and therefore invalid. (Note that misleading responses include those that might seem good at first glance, but upon examining the response source, turn out to have been taken out of context and are answering a different question entirely.) Alternatively, the response may actually provide an accurate answer to the question, but it contains inappropriate/ offensive language. Overall, very few to no users who issued the query would be satisfied with the response.
    • Slightly Satisfying — The response is accurate (and in the case of timesensitive questions, is timely and valid), but it has noticeable problems. For example, the response may be only partially answering the question, or answering a narrow interpretation of the question. The response may be confusing because it’s missing important context—for example, it may make a reference to a subject that hasn’t been defined. The response may use informal language that might be opinionated or sound like an advertisement. The response may have noticeable spelling, grammatical, or formatting issues. Alternatively, the response provides a direct and complete answer to the question, but the response is unnecessarily long, with the answer being hidden behind a substantial amount of extraneous information. Overall, only some users who issued the query would be satisfied with the response.
    • Moderately Satisfying — The response is accurate (and in the case of timesensitive questions, is timely and valid), but it has minor problems. The response answers the question completely, but the answer may be indirect or implied. The response may be longer than necessary, but the answer should be easily found toward the beginning of the response, potentially follo by extraneous information. The response may use informal language, but it should not be strongly opinionated, sound like an advertisement, or use inappropriate/offensive language. The response may have minor spelling, grammatical, or formatting issues. The language may be advanced (such as content from medical or academic literature), but it should still be understandable by most people. Overall, most users who issued the query would be satisfied with the response.
    • Highly Satisfying — The response is accurate (and in the case of time-sensitive questions, is timely and valid). The response answers the question completely, directly, and concisely, with no extraneous information. The response sounds professional and formal, with no inappropriate/offensive language. The language is easily understood, and has no spelling, grammatical, or formatting issues. Virtually all users who issued the query would be satisfied with the response.

Satisfaction Rating Examples

QueryResponseResponse SourceSatisfaction RatingRating Explanation
[how tall is the redwood tree]The tallest nonredwood tree is a 100.3 m (329 foot) tall Douglas fir.Link HereNot SatisfyingThe response doesn’t answer the question.
[is a viola smaller than a violin]The viola is generally strung with heavier strings than the violin.Link Here Not SatisfyingThe response doesn’t answer the question.
[is pectin vegan]Pectin is generally well tolerated when ingested.Link HereNot SatisfyingThe response doesn’t answer the question.
[how long is pregnancy]It is estimated that a human pregnancy should be about 18 months.Link HereNot SatisfyingThe response is inaccurate/ misleading; it was taken out of context, as can be seen from the response source.
[what is the world population]The earth has 50 billion people.Link HereNot SatisfyingThe response is inaccurate/ misleading; it was taken out of context, as can be seen from the response source.
[what happens when you sneeze with your eyes open]It can damage your hearing, lead to an ear infection, and rupture blood vessels in the eyes and brain. It is possible to keep your eyes open while sneezing.Link HereNot SatisfyingThe response is inaccurate/ misleading; it was taken out of context, as can be seen from the response source.
[how old is evan from evan tube]Evan is now seven years old.Link HereNot SatisfyingThe response is outdated by several years.
[who won the par three contest]Ben Crane won the annual Par 3 contest, which took place on April 5, with a four- under 23. Arnold Palmer and Jack Nicklaus, made a curtain call at the event; Nicklaus was one-under and was in contention throughout the day.Link HereNot SatisfyingThe response is outdated, corresponding to the year 2006.
[who is the new mexican president]Mexico’s President Enrique Peña Nieto on Monday.Link HereNot SatisfyingThe response is outdated; as of 2018, Mexico’s president is Andrés Manuel López Obrador.
[where do they sell elf on the shelf]As a toy, the Elf on the Shelf is benign enough. It’s a skinny-ass doll, about a foot long, with a bigeyed pixie face, a plastic head, and a felt body, on sale at your local big box store for $29.95.Link HereNot SatisfyingThe response uses inappropriate language (“skinny-ass doll”), which automatically makes it Not Satisfying.
[what is polyurethane used for]Polyurethane foam is used primarily for bedding and furniture stuffing.Link HereSlightly SatisfyingThe response is too narrow; the question is about polyurethane in general, but the response is specifically focused on polyurethane foam. Because of this narrow interpretation, this answer is Slightly Satisfying.
[what noise does a parrot make]They will scream or make a sound like boiling water.Link HereSlightly SatisfyingThe response is too narrow; according to the response source, parrots will scream specifically when they feel threatened, which the question is not asking about.
[how much does a honda accord weigh]The first generation Honda Accord was launched on 7 May 1976 as a three-door hatchback with 68 hp (51 kW), a 93.7-inch (2,380.0 mm) wheelbase, and a weight of about 2,000 pounds.Link HereSlightly SatisfyingThe response is too narrow; the question is most likely asking about the weight of modern Honda Accords, rather than the weight of the original Honda Accord from 1976.
[when is eggplant in season]The largest producers of eggplants in the U.S. include Florida, New Jersey and California. They’re also grown in Mexico, China, Italy, Turkey, Egypt and Japan. Popular varieties include Black Beauty, Rosa Bianca, Classic, Orient Express, Black Italian, Japanese, Lavender and Cloud 9. Eggplants are in peak season from July through ber.Link HereSlightly SatisfyingThe response is accurate and it completely answers the question. However, the response is much longer than necessary, with lots of extraneous content, and the answer can only be found in the second half of the response. Because of this, the response is Slightly Satisfying.
[what are the steps of mitosis]Prophase – The first stage of mitosis is known as prophase, where the nuclear chromatin starts to become organized and condenses into thick strands that eventually become chromosomes observable in the optical microscope (Figure 1(b)). The nucleoli, primarily responsible for the production of ribosomal RNA, begin to disappear as the chromosomes condense. During prophase, major changes also occur in the cytoplasm, including disassembly of the cytoskeleton components based on tubulin (cytoplasmic microtubules).Link HereSlightly SatisfyingThe response provides only a partial answer to the question. A better answer would include all four phases: prophase, metaphase, anaphase, and telophase. As well, the answer is much longer than necessary, providing unnecessary details. The incompleteness and excessive length both contribute to this response being Slightly Satisfying.
[how much alcohol does modelo have]Enjoy Modelo Especial at your next barbecue or while watching the game with friends. This balanced, easy-drinking beer contains 145 calories, 0 grams of fat, and 4.4% alcohol by volume per 12oz serving.Link HereSlightly SatisfyingThe response provides an accurate answer to the question. However, because (1) it contains extraneous information with the answer only appearing at the very end, and (2) it sounds like an advertisement, it should be classified as Slightly Satisfying.
[how many licks does it take to get to center of a tootsie roll tootsie pop]Twenty of the group’s volunteers assumed the licking challenge - unassisted by machinery - and averaged 252 licks each to the center.Link HereSlightly SatisfyingThe response provides an accurate answer to the question. However, it misses critical context, referring to “twenty of the group’s volunteers” without explaining what that means. Missing this context causes the response to be confusing, which makes this Slightly Satisfying.
[why does pain hurts]This theory states that pain is a function of the balance between the information traveling into the spinal cord through large nerve fibers and information traveling into the spinal cord through small nerve fibers. Remember, large nerve fibers carry non-nociceptive information and small nerve fibers carry nociceptive information. If the relative amount of activity is greater in large nerve fibers, there should be little or no pain.Link HereSlightly SatisfyingThe response provides a very scientific answer to the question. However, for many users who might issue this query, the language used is likely to be too advanced and technical to actually be helpful. As well, the response begins with “This theory,” but the theory hasn’t actually been named or defined yet. These two issues cause this response to be Slightly Satisfying.
[who is anne coulter]Ann Coulter is a conservative political commentator, writer, and lawyer well-known for her polarizing opinions. Read her most shocking statements.Link HereSlightly SatisfyingThe first sentence of this response is great–it answers the question completely and directly. If the first sentence stood by itself, it would qualify as a Highly Satisfying response. However, the second sentence, “Read her most shocking statements,” sounds opinionated and like an advertisement. Because of that, this response is Slightly Satisfying.
[is plavix a blood thinner]Plavix (clopidogrel) is a blood thinner (anti-platelet) drug used for the prevention of strokes, heart attacks, and peripheral artery disease. Ibuprofen also thins the blood. If ibuprofen and Plavix are taken together it creates an interaction that thins the blood even more, which may cause gastrointestinal bleeding from ulcers in the digestive tract.Link HereModerately SatisfyingThe response completely answers the question, and is accurate, but it’s longer than necessary, with several extraneous sentences. However, the answer to the question appears in the first half, so this response is Moderately Satisfying.
[is there gluten in rice]Rice does not contain gluten, so it can be eaten by people with celiac disease. However, it is not necessarily safe. There may be cross-contamination with gluten-containing products, and some rice has been found to contain a toxin, arsenic. Alternatives to rice include chia and maize. Always check the label first.Link HereModerately SatisfyingThe response completely answers the question, and is accurate, but it’s longer than necessary, with several extraneous sentences. However, the answer to the question appears in the first half, so this response is Moderately Satisfying.
[are dinosaurs reptiles]Dinosaurs are a diverse group of reptiles of the clade Dinosauria. They first appeared during the Triassic period, between 243 and 233.23 million years ago, although the exact origin and timing of the evolution of dinosaurs is the subject of active research. They became the dominant terrestrial vertebrates after the Triassic–Jurassic extinction event 201 million years ago; their dominance continued through the Jurassic and Cretaceous periods.Link HereModerately SatisfyingThe response completely answers the question, and is accurate, but it’s longer than necessary, with several extraneous sentences. However, the answer to the question appears in the first half of the response, so this response is Moderately Satisfying.
[which states capital is salem]Salem Salem , capital of Oregon, U.S., and the seat (1849) of Marion county.Link HereModerately SatisfyingThe response provides an accurate answer to the question. However, two minor problems contribute to this being Moderately Satisfying rather than Highly Satisfying. First, this is not a complete sentence, and second, the word “Salem” appears twice in the beginning of the response.
[is falafel vegetarian]Falafel are vegan, nutritious and super delicious.Link HereModerately SatisfyingThe response accurately describes falafel as vegan. However, because it was written in an informal, opinionated way, it should be considered Moderately Satisfying rather than Highly Satisfying.
[what hemisphere is the united states in] For example, the United States is in both the Northern and Western Hemisphere.Link HereModerately SatisfyingThe response accurately answers the question. However, because it starts with “For example,” some useful context is missing, which makes this response Moderately Satisfying rather than Highly Satisfying.
[what are minerals]Minerals definition, any of a class of substances occurring in nature, usually comprising inorganic substances, as quartz or feldspar, of definite chemical composition and usually of definite crystal structure, but sometimes also including rocks formed by these substances as well as certain natural products of organic origin, as asphalt or coal.Link HereModerately SatisfyingThis response accurately answers the question. Because the language is somewhat advanced, and because of the grammatical issue caused by the first comma in the response, this response is Moderately Satisfying rather than Highly Satisfying.
[can a mule reproduce]Mules can be either male or female, but, because of the odd number of chromosomes, they can’t reproduce.Link HereHighly SatisfyingThis response provides a direct, complete, accurate, and concise answer.
[how many cards are in an uno deck]There are 108 cards in a Uno deck.Link HereHighly SatisfyingThis response provides a direct, complete, accurate, and concise answer.
[how much do human body is made of water]Up to 60% of the human adult body is water.Link HereHighly SatisfyingThis response provides a direct, complete, accurate, and concise answer.
[what does gps stand for]GPS stands for Global Positioning System.Link HereHighly SatisfyingThis response provides a direct, complete, accurate, and concise answer.
[when is the next haleys comet]The next predicted perihelion of Halley’s Comet is 28 July 2061.Link HereHighly SatisfyingThis response provides a direct, complete, accurate, and concise answer.
[when did hawaii become part of the united states]Hawaii was admitted as a U.S. state on August 21, 1959.Link HereHighly SatisfyingThis response provides a direct, complete, accurate, and concise answer.
