Modeling Exemplification in Long-form Question Answering via Retrieval

Exemplification is a process by which writers explain or clarify a concept by providing an example. While common in all forms of writing, exemplification is particularly useful in the task of long-form question answering (LFQA), where a complicated answer can be made more understandable through simple examples. In this paper, we provide the first computational study of exemplification in QA, performing a fine-grained annotation of different types of examples (e.g., hypotheticals, anecdotes) in three corpora. We show that not only do state-of-the-art LFQA models struggle to generate relevant examples, but also that standard evaluation metrics such as ROUGE are insufficient to judge exemplification quality. We propose to treat exemplification as a retrieval problem in which a partially-written answer is used to query a large set of human-written examples extracted from a corpus. Our approach enables a reliable ranking-based automatic metric that correlates well with human judgments, and a human evaluation shows that our model's retrieved examples are more relevant than examples generated by a state-of-the-art LFQA model.

where it is hard to automatically detect the example boundary. Hence, we take the other two most frequent exemplification markers, namely "for example" and "e.g.", and extract the parentheses-

Exemplification is usually expressed through three discourse units (Meyer, 1992; Triki, 2021): the anchor (also known as the "exemplified unit"), the exemplification marker, and the example text itself (the "exemplifying unit"). We annotate the anchor (marked in bold) and the example (marked in italics).

Footnote 1: The annotation included texts from physics, biology, mechanical & electrical engineering, philosophy, sociology, applied linguistics, and marketing.
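A marker-based extraction of this kind can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline; the function name and the exact handling of the marker punctuation are our own assumptions:

```python
import re

# Hypothetical sketch of marker-based example extraction: split a sentence
# at the first "for example" / "e.g." marker into the anchor context
# (the exemplified unit) and the example text (the exemplifying unit).
MARKER = re.compile(r"\b(?:for example|e\.g\.)\s*,?\s*", re.IGNORECASE)

def extract_example(sentence):
    m = MARKER.search(sentence)
    if m is None:
        return None  # no exemplification marker found
    anchor = sentence[:m.start()].strip()
    example = sentence[m.end():].strip()
    return anchor, example
```

For instance, `extract_example("Fish with this ability are known as euryhaline species, e.g., flounder.")` returns the anchor clause and the example text `"flounder."`.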

In the study of Triki (2021), we find that these units mainly come in two forms: (1) nominal groups that refer to entities, or (2) clauses that represent statements.

(1) However, some fish show a tremendous ability to effectively osmoregulate across a broad range of salinities; fish with this ability are known as euryhaline species, e.g., Flounder.

(2) Players earn you points, depending on their

The first quote block, showing the masked answer, will be used as a query to retrieve the exemplifying unit in the second quote block.
Prior to the 1800's, when people dug up fossils (and more frequently, subfossil bones from ice age animals, which are more common and easier to find) they tended to interpret them in light of their existing myths and legends. [MASK] For example, when a wooly rhino skull was dug up near Klagenfurt, it was thought to be the skull of a dragon.
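Building such a masked query amounts to replacing the example span in the answer with a mask token. A sketch, where the character-offset interface and the mask string are illustrative assumptions:

```python
MASK = "[MASK]"

def mask_example(answer: str, span: tuple[int, int]) -> str:
    # Replace the character span holding the example with a single mask
    # token, yielding the context query used for retrieval.
    start, end = span
    return answer[:start] + MASK + answer[end:]
```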
First, we compute an embedding of the context (c) surrounding the example by passing it into a RoBERTa encoder, where the example text is replaced by a mask token.
Next, we compute candidate example embeddings (e_i) by feeding each of the 66K examples extracted from ELI5 into a separate RoBERTa encoder.
Finally, we use contrastive learning to push the context embedding c close to the correct example embedding and far from the incorrect examples.
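At inference time this reduces to nearest-neighbor search over the candidate embeddings, and training pushes the gold example's score above the rest. A minimal numpy sketch under our own simplifying assumptions (precomputed embeddings, in-batch negatives; not the paper's training code):

```python
import numpy as np

def rank_examples(context_emb, example_embs):
    # Score each candidate example by inner product with the masked-context
    # embedding and return candidate indices from best to worst.
    scores = example_embs @ context_emb
    return np.argsort(-scores)

def contrastive_loss(context_emb, example_embs, gold_idx, temperature=0.05):
    # InfoNCE-style objective: treat the gold example as the positive and
    # all other candidates in the batch as negatives.
    logits = (example_embs @ context_emb) / temperature
    log_z = np.log(np.exp(logits).sum())
    return -(logits[gold_idx] - log_z)
```

Minimizing this loss raises the gold example's logit relative to the partition term, which is exactly what pulls the context embedding toward the correct example and away from the incorrect ones.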
[Figure: EGRET dual-encoder architecture with a context encoder and example encoders; sample example: "For example, Zhi dao is how you would correctly say 'to know'."]
For example, the lipopolysaccharide in your heart is not a bunch of living bacteria

(2) the top-ranked retrieval from EGRET, restricted to only cases where this retrieval is not the ground truth.

Task setup: In the ranking task, we ask workers to produce a ranking of the three choices (e.g., 1>2>3). We allow equality (e.g., 1=2>3) since multiple candidates can be equally valid for a given context. In the rating task, we ask workers to evaluate [...] the closely-related instantiation discourse [...]

Footnote 6: We restrict workers to those in English-speaking countries who have completed at least 1000 HITs with an acceptance rate of 97%.

Ground-truth | EGRET-retrieved vs. others | Analysis
Evolution is not a force towards the optimum, it's a force towards the minimum necessary.
For example, if grass was poisonous, it would be better for its survival, as less animals would come eat it.
[EGRET-retrieved]: For example, we move incredible slowly when compared to the maximum speed allowed in the universe.
ROUGE is not a viable evaluation of example quality: EGRET's retrieved example was rated 5/5 by all three crowdworkers but achieves a lower ROUGE score than an irrelevant example.
... You're brain is asleep and not paying any attention to your body so it ignores all of these stimuli unless they become too hard to ignore.
For example if the touching turns to slapping, the talking turns to yelling, or the light in the eyes turns to really bright light in the eyes.
[EGRET-retrieved]: For example, if you're in a room with a clock ticking you don't notice the ticking after a while. (H:4.0)
[c-REALM-RT Generated]: For example, its not just that your brain is dead. (H:2.0)
The EGRET-retrieved example effectively illustrates the phenomenon in the context and receives a higher average rating from crowdworkers than the generated example and even the ground truth (3.3).
... Multiple births mean less time per offspring. Each individual offspring therefore has a lower chance of survival, ... Seems like the larger mammals tend to have single births.
For example, polar bears and elephants usually have single births.
[EGRET-retrieved]: For example, in mammals, a typical litter will be one offspring per pair of nipples as this is as many individuals a female can reasonably sustain.
EGRET retrieves an example based on a key entity from the context ("mammals") but fails to address the concept to be exemplified ("single births").
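The ROUGE failure noted above is easy to reproduce: ROUGE only counts n-gram overlap with the reference, so a fluent but irrelevant example that shares surface words can outscore a relevant one. A minimal unigram-F1 stand-in (our own simplification for illustration, not the official ROUGE scorer):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    # Unigram-overlap F1: precision and recall of shared word counts.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped word-count intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Because the score depends only on the reference string, an example that merely echoes the ground truth's wording scores high regardless of whether it fits the surrounding answer.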

NQ Real
Group Areas Act was the title of three acts of the Parliament of South Africa enacted under the apartheid government of South Africa. The acts assigned racial groups to different residential and business sections in urban areas in a system of urban apartheid. An effect of the law was to exclude non-Whites from living in the most developed areas, which were restricted to Whites (e.g., Sea Point, Lansdowne, Cape Town, Claremont, Cape Town).

NQ Hypothetical
Although the safest way to recognize a chord's root is, after having reduced the chord to close spacing, to rearrange it as a stack of thirds, there are shortcuts to this: [...] With chord types, such as chords with added sixths or chords over pedal points, more than one chordal analysis may be possible.
For example, in a tonal piece of music, the notes C, E, G, A, sounded as a chord, could be analyzed as a C major sixth chord in root position (a major triad, C, E, G, with an added sixth, A, above the root) or as a first inversion A minor seventh chord (the A minor seventh chord contains the notes A, C, E and G, but in this example the C note, the third of the A minor chord, is in the bass).

ELI5 Real ✓
My uncle owns a pretty large recycling business. They export the majority of their newly created raw materials to the places that produce with the materials (China

Books3 Real
People in a second group were given a verbal description, with which they were to construct an image of walking along the two segments. For example, people were told to imagine they would "Go forward 3 m, turn clockwise 90°, then go forward 3 m."

Books3 Hypothetical
When we cook together, I have to stay alert because she is always throwing a lemon at me - sometimes double down on acid and mix lemon juice with a little bit of vinegar to get the sunny sweet-sour note of the citrus along with earthy, apple, or wine notes of a vinegar for greater complexity. For example, if you toss roasted beets (a notoriously earthy and sweet vegetable that some might say tastes like soil) with just lemon juice, olive oil, and salt, it would no doubt be good, but if you supplement the sunny lemon juice with a tiny splash of sherry vinegar for its woodsy earthiness, you get a roasted beet dish that is far more complex and delicious than if you had used only one or the other.

For example, if you have two dogs, and you give one of them a treat when you say Fido, and the other when you say Clifford, they learn that the respective words only apply to them.
[EGRET] For example people can identify their own dog
EGRET retrieved a relevant but semantically incorrect example (people identifying their own dog, instead of how dogs identify themselves).
An economist would say healthcare has a positive externality. [...]There are some things you can buy that make everyone better off.
For example: going to the doctor every time you are sick will make you less likely to make other people sick.
[EGRET] For example, a butterfly house, a free cinema, games consoles etc.
EGRET retrieved examples related to the immediately preceding context ("some things you can buy...") but failed to retrieve examples based on the earlier context (about healthcare).

billion - Economic and Military aid for Pakistan, Egypt, and Jordan. The goal is to have a few people in the mideast who call us allies. Essentially, we buy their cooperation. That cooperation is sometimes useful.
For example, when we killed Osama Bin Laden, we sent troops into Pakistan. Normally, countries don't tolerate troops from other countries. The Pakistanis did complain a little, but they didn't do anything about it.
[c-REALM-RT] For example, For everyone here talking about how a lot of aid works: If we put money towards helping foreign countries rebuild, we are imposing restrictions on domestic activity. [...]
c-REALM-RT generated an on-topic hypothetical example, which contradicts the context.
You don't usually work on the same files because everything is split up between the departments. I haven't used USD yet but I have encountered the following workflow in different studios (using Maya).
For example: a character that has been rigged by one (or more, but not at the same time) rigger goes to the animators. Every animator works with the same character rig BUT each animator works on his/her own shot.
c-REALM-RT generated a personal example that is irrelevant to the context.