Acquisition of Noncontiguous Class Attributes from Web Search Queries

Previous methods for extracting attributes (e.g., capital, population) of classes (Empires) from Web documents or search queries assume that relevant attributes occur verbatim in the source text. The extracted attributes are short phrases that correspond to quantifiable properties of various instances (ottoman empire, roman empire, mughal empire) of the class. This paper explores the extraction of noncontiguous class attributes (manner (it) claimed legitimacy of rule) from fact-seeking and explanation-seeking queries. The attributes cover properties that are not always likely to be extracted as short phrases from inherently noisy queries.


Introduction
Motivation:
Resources such as Wikipedia (Remy, 2002) and Freebase (Bollacker et al., 2008) aim at organizing knowledge around classes (Food ingredients, Astronomical objects, Religions) and their instances (wheat flower, uranus, hinduism). Due to inherent limitations associated with maintaining and expanding human-curated resources, their content may be incomplete. For example, attributes representing the energy (or energy per 100g) or solubility in water are available in both Wikipedia and Freebase for many instances of Food ingredients (e.g., for olive oil, honey, fennel). But the attributes are missing for some instances (e.g., cornmeal). Moreover, structured information about how long (it) lasts unopened or manner (it) helps in weight loss is generally missing for Food ingredients, from both resources. Such information is also often absent from the attributes acquired from either documents or queries by previous extraction methods (Van Durme et al., 2008). Previously extracted attributes tend to be short, often nominal, phrases like nutritional value and taste. Even when extracted attributes are not nominal (Paşca, 2012), they remain relatively short phrases such as good for skin. As such, previous attributes have limited ability to capture the finer-grained properties being asked about in queries such as "how long does olive oil last unopened" and "how does honey help in weight loss". The presence of such queries suggests that such information is relevant to Web users. Identifying noncontiguous properties, or attributes of interest to Web users, helps fill some of the gaps in existing knowledge resources, which otherwise could not be filled by attributes extracted with previous methods.
Because Web users already inquire about the value of one of their arguments, the extracted relations are more likely to be relevant for the respective instances and classes than relations extracted from arbitrary document sentences (Fader et al., 2011).

Noncontiguous Attributes
Intuitions:
Users tend to formulate their Web search queries based on knowledge that they already possess at the time of the search (Paşca, 2007). Therefore, search queries play two roles simultaneously: in addition to requesting new information, they indirectly convey knowledge in the process. In particular, attributes correspond to quantifiable properties of instances and their classes. The extraction of attributes from queries starts from the intuition that, if an attribute A is relevant for a class C, then users are likely to ask for the value of the attribute A, for various instances I of the class C. If nutritional value and diameter are relevant attributes of the classes Food ingredients and Astronomical objects respectively, it is likely that users submit queries to inquire about the values of the attributes for instances of the two classes. Such queries could take the form "what is the (nutritional value) A of (olive oil) I" and "what is the (diameter) A of (jupiter) I"; or the more compact "(nutritional value) A of (olive oil) I" and "(diameter) A of (jupiter) I". In this case, the attributes are relatively short phrases (nutritional value, diameter), and are expected to appear as contiguous phrases within queries. Previous methods on attribute extraction from queries specifically target this type of attributes. In fact, some methods apply dedicated extraction patterns (e.g., A of I) over either queries or documents (Tokunaga et al., 2005). Other methods expand manually-provided seed sets of attributes, with other phrases that co-occur with instances within queries, in similar contexts as the seed attributes do (Paşca, 2007).
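As a minimal sketch of the contiguous case, the following Python function applies the A of I pattern (optionally prefixed by what is the) to a query, and keeps the match only when the trailing fragment is a known instance. The function name, the whitespace and case normalization, and the toy instance set are illustrative assumptions; they are not the exact procedure of the cited methods.

```python
PREFIX = "what is the "

def extract_contiguous_attribute(query, instances):
    """Return (attribute, instance) for queries shaped like '[what is the] A of I', else None."""
    text = " ".join(query.lower().split())  # normalize case and whitespace
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    parts = text.split(" of ")
    # Try every " of " as the boundary between the attribute A and the instance I.
    for i in range(1, len(parts)):
        attribute = " of ".join(parts[:i])
        instance = " of ".join(parts[i:])
        if attribute and instance in instances:
            return attribute, instance
    return None

# Toy instance set, for illustration only.
instances = {"olive oil", "jupiter"}
print(extract_contiguous_attribute("what is the nutritional value of olive oil", instances))
print(extract_contiguous_attribute("diameter of jupiter", instances))
print(extract_contiguous_attribute("why does honey solidify", instances))
```

The last call returns None: the query names no attribute as a contiguous phrase, which is the kind of query that a pattern such as A of I cannot exploit.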
While simpler properties are often mentioned in queries as short, contiguous phrases, finer-grained properties often are not. Queries seeking the reason for solidification for some Food ingredients could, but rarely do, contain the attribute verbatim ("what is the reason for the solidification of honey"). Instead, queries are more likely to inquire about the expected value, while specifying the instance and the properties encoded by the attribute ("(why) A does (honey) I (solidify) A").
Readable descriptions (names) of the attributes can be recovered from the queries by assembling the type of the expected value and the properties together (reason (it) solidifies). Thus, fact-seeking and explanation-seeking queries are an intriguing source of noncontiguous attributes that are not restricted to short phrases, and are not required to occur as contiguous phrases in queries.

Acquisition from Queries:
The extraction method proposed in this paper takes as input a set of target classes, each of which is available as a set of instances that belong to the class; and a set of anonymized queries independent from one another. As illustrated in Figure 1, the method selects queries that contain an instance of a class together with what is deemed to be likely a noncontiguous attribute, and outputs ranked lists of attributes for each class. The extraction consists of several stages:
• selection of a subset of queries that contain an instance in a form that suggests the queries ask for the value of a noncontiguous attribute of the instance;
• extraction of noncontiguous attributes, from query fragments that describe the property of interest and the type of its expected value;
• aggregation and ranking of attributes of individual instances of a class, into attributes of a class.

Extraction Patterns:
In order to determine whether a query contains an attribute of a class, the query is matched against the extraction patterns from Table 1. The use of patterns in attribute extraction has been previously suggested in (Tokunaga et al., 2005), where the pattern what is the A of I extracts noun-phrase A attributes of instances I from queries and documents. In our case, the patterns are constructed such that they match fact-seeking and explanation-seeking questions that likely inquire about the value of a relevant property of an instance I of the class C.
For example, the first pattern from Table 1 matches queries such as "when did everquest become free to play" and "when was radon discovered as an element", which inquire about the date or time when certain events affected certain properties of the instances everquest and radon respectively. Instances I of the class C may be available as non-disambiguated items, that is, as strings (java) whose meaning is otherwise unknown; or as disambiguated items, that is, as strings associated with pointers to knowledge base entries with a disambiguated meaning (Java (programming language)).

[Figure 1: Overview of extraction of noncontiguous attributes from Web search queries. Panels: Query logs (e.g., "when was radon discovered as an element", "how does java compile", "when did minecraft come out for xbox 360"); Extracted class attributes (e.g., for Chemical elements: date/time (it) was discovered as an element, manner (it) returns to the atmosphere, reason (it) reacts with water, elements (it) combines with).]
In the first case, the matching of a query fragment, on one hand, to the portion of an extraction pattern corresponding to an instance I, on the other hand, consists in simple string matching. In the second case, the matching requires that the disambiguation of the query fragment, in the context of the query, matches the desired disambiguated meaning of I from the pattern. The subset of queries matching any of the extraction patterns, for any instances I of a class C, are the queries that contribute to extracting noncontiguous attributes of the class C.
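The query selection step can be sketched as follows for the non-disambiguated case, where instances are matched as plain strings. Since Table 1 is not reproduced here, the wh-prefix list, the auxiliary list and the single pattern shape (wh-prefix, auxiliary, instance, remainder) are hypothetical stand-ins for the actual extraction patterns; preferring longer instance strings is likewise an assumption of the sketch.

```python
import re

# Hypothetical stand-ins for the extraction patterns of Table 1.
WH_PREFIXES = ["how long", "how much", "how many", "when", "where", "why", "how", "who", "what"]
AUXILIARIES = ["does", "did", "do", "was", "were", "is", "are"]

def compile_pattern(instances):
    """Build one regex of the form '<wh-prefix> <auxiliary> <instance I> <remainder>'."""
    if not instances:
        raise ValueError("at least one instance is required")
    wh = "|".join(WH_PREFIXES)
    aux = "|".join(AUXILIARIES)
    # Longer instance strings come first, so that longer mentions are preferred.
    inst = "|".join(re.escape(i) for i in sorted(instances, key=len, reverse=True))
    return re.compile(rf"^(?P<wh>{wh})\s+(?P<aux>{aux})\s+(?P<inst>{inst})\s+(?P<rest>.+)$")

def match_query(query, pattern):
    """Return (wh-prefix, auxiliary, instance, remainder), or None if the query does not match."""
    m = pattern.match(" ".join(query.lower().split()))
    return (m["wh"], m["aux"], m["inst"], m["rest"]) if m else None

# Toy instance set, for illustration only.
pattern = compile_pattern({"minecraft", "java", "javascript"})
print(match_query("when did minecraft come out for xbox 360", pattern))
print(match_query("minecraft download", pattern))
```

Only queries for which match_query returns a tuple would contribute to the attributes of the class; the disambiguated case would additionally check that the meaning of the matched fragment agrees with the intended instance.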

Collecting Attributes of Individual Instances:
A small set of rules optionally converts wh-prefixes into coarse-grained types of the expected values (e.g., how long into length/duration; or when into date/time). In the case of what-prefixed queries, the adjacent noun phrase, if any, is considered to be the expected type ("what nutritional value .." into nutritional value). Similar rules have been employed for shallow analysis of open-domain questions (Dumais et al., 2002). The predicate verbs in the remainder of the query are updated, to match the tense specified by the auxiliary verb (e.g., "when did .."), if any, following the wh-prefix. Thus, the verb come is converted to the past tense came, in the case of the query "when did minecraft come out for xbox 360". An attribute is constructed from the concatenation of the wh-prefix or expected type (date/time); the slot pronoun it, in lieu of the instance (date/time (it)); and the query remainder after tense conversion (date/time (it) came out for xbox 360). If the linking verb following the wh-prefix is a form of be (e.g., was), then the linking verb is also retained after the slot pronoun, to form a more coherent attribute (date/time (it) was first released).
Since constructed attributes are noun phrases, they are more consistent with, and can be more easily inserted among, existing attributes in structured data repositories (infobox entries of articles in Wikipedia, or property names or topics in Freebase).
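The construction of an attribute from a matched query can be sketched as below. The type table, the tiny irregular-verb lookup and the naive inflection rules are assumptions made for illustration; the sketch also assumes that the first word of the query remainder is the predicate verb, which the actual rules need not assume.

```python
# Coarse-grained expected types for some wh-prefixes (illustrative subset).
EXPECTED_TYPE = {"when": "date/time", "how long": "length/duration",
                 "where": "location", "why": "reason", "how": "manner"}
IRREGULAR_PAST = {"come": "came", "become": "became", "take": "took", "go": "went"}
BE_FORMS = {"is", "are", "was", "were"}

def past_tense(verb):
    """Naive past tense: small irregular lookup, else append -d or -ed."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    return verb + "d" if verb.endswith("e") else verb + "ed"

def third_person(verb):
    """Naive third-person singular present."""
    if len(verb) > 1 and verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ies"
    if verb.endswith(("s", "sh", "ch", "x", "z", "o")):
        return verb + "es"
    return verb + "s"

def build_attribute(wh, aux, remainder):
    """Concatenate expected type, slot pronoun (it), and the tense-adjusted remainder."""
    expected = EXPECTED_TYPE.get(wh, wh)
    words = remainder.split()
    if aux in BE_FORMS:
        words = [aux] + words              # retain the linking verb after the slot pronoun
    elif aux == "did" and words:
        words[0] = past_tense(words[0])    # "come" becomes "came"
    elif aux == "does" and words:
        words[0] = third_person(words[0])  # "solidify" becomes "solidifies"
    return f"{expected} (it) {' '.join(words)}"

print(build_attribute("when", "did", "come out for xbox 360"))
print(build_attribute("when", "was", "first released"))
print(build_attribute("why", "does", "solidify"))
```

On these inputs the sketch reproduces the attributes discussed in the text: date/time (it) came out for xbox 360, date/time (it) was first released, and reason (it) solidifies.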

Aggregation into Class Attributes:
Attributes of a class C are aggregated from attributes of individual instances I of the class. An attribute A is deemed more relevant for C if the attribute is extracted for more of the instances I of the class C, and for fewer instances I that do not belong to the class C. Concretely, the score of an attribute for a class is the lower bound of the Wilson score interval (Brown et al., 2001), where the number of positive observations is the number of queries for which the attribute A is extracted for some instance I in the class C, |{Query(I, A)}_{I∈C}|; and the number of negative observations is the number of queries for which the attribute A is extracted for some instances I outside of the class C, |{Query(I, A)}_{I∉C}|. The scores are internally computed at 95% confidence. Attributes of each class are ranked in decreasing order of their scores.

Reduction of Near-Duplicate Attributes:
Due to lexical variations across queries from which attributes are extracted, some of the attributes are equivalent or nearly equivalent to one another. For example, gained independence, won its independence and gained its freedom of the class Countries are roughly equivalent, although they employ distinct tokens. The diversity and potential usefulness of a ranked list of attributes can be increased, if groups of near-duplicate attributes are identified in the list, and merged together.
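Returning to the class-level score used for aggregation, the lower bound of the Wilson score interval can be sketched as follows, using the standard formula for a binomial proportion. The function names and the input format (a mapping from each attribute to its positive and negative query counts) are assumptions of the sketch; the counts themselves would come from the earlier extraction stages.

```python
from math import sqrt
from statistics import NormalDist

def wilson_lower_bound(positive, negative, confidence=0.95):
    """Lower bound of the Wilson score interval for positive / (positive + negative)."""
    n = positive + negative
    if n == 0:
        return 0.0
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 at 95% confidence
    p = positive / n
    centre = p + z * z / (2 * n)
    margin = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)

def rank_attributes(counts):
    """counts maps attribute -> (queries with I in C, queries with I outside C)."""
    scored = {a: wilson_lower_bound(pos, neg) for a, (pos, neg) in counts.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Hypothetical counts, for illustration only.
counts = {"attribute a": (50, 2), "attribute b": (3, 0), "attribute c": (5, 40)}
print(rank_attributes(counts))
```

With these hypothetical counts, an attribute seen in 50 queries inside the class and 2 outside ranks above one seen in only 3 queries, all inside the class: the lower bound penalizes small samples as well as frequent extraction outside the class.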
A lower-ranked attribute is marked as a near-duplicate of a higher-ranked (i.e., earlier) attribute from the list, if all tokens from the lower-ranked attribute match either tokens from the higher-ranked attribute (gained independence vs. won its independence), or tokens from synonyms of phrases from the earlier attribute (gained independence vs. won its independence; or takes to show symptoms vs. takes to come out). Stop words, which include linking verbs, pronouns, determiners, conjunctions, wh-prefixes and prepositions, are not required to match. Synonyms may be either derived from existing lexical resources (e.g., WordNet (Fellbaum, 1998)), or mined from large document collections (Madnani and Dorr, 2010). Lower-ranked near-duplicate attributes are merged with the higher-ranked ones from the ranked list, thus improving the diversity of the list.
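The near-duplicate test described above can be sketched as a token-coverage check over a ranked list. The stop-word list and the synonym table below are toy assumptions, and synonyms are looked up per token rather than per phrase, which simplifies the matching described in the text.

```python
import re

# Small illustrative stop-word list (linking verbs, pronouns, determiners, prepositions, ...).
STOP_WORDS = {"it", "its", "their", "the", "a", "an", "of", "in", "into", "to",
              "and", "or", "is", "are", "was", "were", "when", "how", "why", "where"}

def content_tokens(attribute):
    """Lowercased tokens of an attribute, without stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", attribute.lower()) if t not in STOP_WORDS]

def is_near_duplicate(lower, higher, synonyms):
    """True if every content token of `lower` is a token, or a synonym of a token, of `higher`."""
    allowed = set(content_tokens(higher))
    for token in list(allowed):
        allowed |= synonyms.get(token, set())
    tokens = content_tokens(lower)
    return bool(tokens) and all(t in allowed for t in tokens)

def merge_near_duplicates(ranked, synonyms):
    """Map each retained attribute to the lower-ranked attributes merged into it."""
    groups = {}
    for attribute in ranked:
        for head in groups:
            if is_near_duplicate(attribute, head, synonyms):
                groups[head].append(attribute)
                break
        else:
            groups[attribute] = []
    return groups

# Toy synonym table and ranked list, for illustration only.
synonyms = {"gained": {"won", "got"}, "independence": {"freedom"}}
ranked = ["date (it) gained independence", "date (it) won its independence",
          "date (it) adopted the euro", "date (it) gained its freedom"]
print(merge_near_duplicates(ranked, synonyms))
```

Because the check only requires the tokens of the lower-ranked attribute to be covered, a short attribute is absorbed by any longer, higher-ranked attribute that contains all of its tokens, even when the longer attribute is more specific.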

Experimental Setting
Textual Data Sources:
The experiments rely on a random sample of around 1 billion fully anonymized queries in English, submitted to a general-purpose Web search engine. Each query is available independently from other queries, and is accompanied by its frequency of occurrence in the query logs.

Target Classes:
Table 2 shows the set of 40 target classes for evaluating the attributes extracted from queries. In an effort to reuse the experimental setup proposed in previous work, each of the 40 manually-compiled classes introduced in (Paşca, 2007) is mapped into the Wikipedia category that best matches it. For example, the evaluation classes Aircraft Model, Movie, Religion and Ter- (Paşca, 2007) are mapped into the Wikipedia categories Aircraft, Films, Religious faiths traditions and movements and Organizations designated as terrorist respectively. The name of the Wikipedia category only serves as a convenience label for its target class, and is not otherwise exploited in any way during the evaluation. Instead, a target class consists of a set of titles of Wikipedia articles, of which sample titles (e.g., the Wikipedia article titled nissan gt-r) are shown in lowercase for each class (e.g., Automobiles) in Table 2.

Class (Examples of Instances)
Actors (keanu reeves, milla jovovich, ben affleck)
Aircraft (boeing 737, bombardier crj200, embraer 170)
Animated characters (bugs bunny, pink panther (character), yosemite sam)
Association football clubs (a.s. roma, fluminense football club, real madrid)
Astronomical objects (alpha centauri, jupiter, delta corvi)
Automobiles (nissan gt-r, tesla model s, toyota prius)
Awards (grammy award, justin winsor prize (library), palme d'or)
Battles and operations of world war ii (battle of midway, operation postmaster, battle of milne bay)
Chemical elements (plutonium, radon, hydrogen)
Cities (rio de janeiro, osaka, chiang mai)
Companies (best buy, aveeno, pepsico)
Countries (costa rica, rwanda, south korea)
Currencies by country (japanese yen, swiss franc, korean won)
Digital cameras (canon eos 400d, nikon d3000, pentax k10d)
Diseases and disorders (anorexia nervosa, hyperlysinemia, repetitive strain injury)
Drugs (fluticasone propionate, phentermine, tramadol)
Empires (ottoman empire, roman empire, mughal empire)
Films (the fifth element, mockingbird don't sing, ten thousand years older)
Flowers (trachelospermum jasminoides, lavandula stoechas, evergreen rose)
Food ingredients (carrot, olive oil, fennel)
Holidays (good friday, easter, halloween)
Hurricanes in North America (hurricane katrina, hurricane wilma, hurricane dennis)
Internet search engines (google, baidu, lycos)
Mobile phones (nokia n900, htc desire, samsung s5560)
Mountains (mount rainier, cerro san luis obispo, steel peak)
National Basketball Association teams (los angeles lakers, cleveland cavaliers, indiana pacers)
National parks (yosemite national park, orang national park, tortuguero national park)
Newspapers (the economist, corriere del trentino, seattle medium)
Organizations designated as terrorist (taliban, shining path, eta)
Painters (claude monet, domingo antonio velasco, tarcisio merati)
Programming languages (javascript, prolog, obliq)
Religious faiths traditions and movements (confucianism, fudoki, omnism)
Rivers (danube, pingo river, viehmoorgraben)
Skyscrapers (taipei 101, 15 penn plaza, eqt plaza)
Sports events (tour de france, 1984 scottish cup final, rotlewi versus rubinstein)
Stadiums (fenway park, chengdu longquanyi, stade geoffroy-guichard)
Treaties (treaty of versailles, franco-indian alliance, treaty of cordoba)
Universities and colleges (cornell university, nugaal university, gale college)
Video games (minecraft, league of legends, everquest)
Wine (madeira wine, yellow tail (wine), port wine)

The set of instances of a class is selected from all articles listed under the respective cate- (Petrov et al., 2010).
As a prerequisite, the portion I of the patterns from the table must match a disambiguated instance from a query.
A variation of the tagger introduced in (Cucerzan, 2007) maps query fragments to their disambiguated, corresponding Wikipedia instances (i.e., to Wikipedia articles). The tagger is simplified to select the longest instance mentions, and does not use gazetteers or queries for training. Depending on the sources of textual data available for training, any taggers (Cucerzan, 2007; Ratinov et al., 2011; Pantel et al., 2012) that disambiguate text fragments relative to Wikipedia entries can be employed.

Evaluation Results
Attribute Accuracy:
The top 50 attributes, from the ranked lists extracted for each target class, are manually assigned correctness labels. As shown in Table 3, an attribute is marked as vital, if it must be present among representative attributes of the class; okay, if it provides useful but non-essential information; and wrong, if it is incorrect (Paşca, 2007). For example, the attributes manner (it) generates its energy, manner (it) became a constellation and reason (it) has arms are annotated as vital, okay and wrong respectively for the class Astronomical objects. To compute the precision score over a set of attributes, the correctness labels are converted to numeric values: vital to 1.0, okay to 0.5, and wrong to 0.0. Precision is the sum of the correctness values of the attributes, divided by the number of attributes.

Table 4 summarizes the precision scores over the evaluation set of target classes. The scores vary from one class to another, for example 0.71 for Food ingredients but 0.94 for Chemical elements. The average score is 0.76, indicating that attributes extracted from fact-seeking and explanation-seeking queries have encouraging levels of accuracy.

Table 4: Accuracy of top 50 class attributes extracted from fact-seeking and explanation-seeking queries, over the evaluation set of 40 target classes

The results already take into account the detection of near-duplicate attributes. More precisely, the highest-ranked attribute in each group of near-duplicate attributes, examples of which are shown in Table 5, is retained and evaluated; the lower-ranked attributes from each group are not considered in the evaluation. Attributes like number of passengers (it) can hold, number of passengers it fits and number of passengers it seats are nearly equivalent, but are still not marked as near-duplicates for the class Aircraft, when they should be.
Conversely, the attribute location (it) lives is marked as a near-duplicate of location (it) lives in new york, when it should not. Nevertheless, a significant number of near-duplicates, which would otherwise crowd the ranked lists of attributes with redundant information, are identified and discarded.
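The precision score defined in this section (vital as 1.0, okay as 0.5, wrong as 0.0, averaged over the evaluated attributes) amounts to the short computation below. The function name and the choice to return 0.0 for an empty list are assumptions of the sketch.

```python
LABEL_VALUES = {"vital": 1.0, "okay": 0.5, "wrong": 0.0}

def precision(labels):
    """Sum of the numeric correctness values, divided by the number of attributes."""
    if not labels:
        return 0.0  # assumption: an empty list scores 0.0
    return sum(LABEL_VALUES[label] for label in labels) / len(labels)

# Labels of the three Astronomical objects examples given in the text.
print(precision(["vital", "okay", "wrong"]))
```

For the three example attributes of Astronomical objects, labeled vital, okay and wrong, the score is (1.0 + 0.5 + 0.0) / 3 = 0.5.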
Target Class: Group of Near-Duplicate Attributes
Actors: movies (it) plays in, played in, acts in, acted in, played, played on
Automobiles: date (it) was first manufactured, first produced, first made
Battles and operations of World War II: reason (it) happened, took place, occurred
Chemical elements: manner (it) returns to the atmosphere, gets back into the atmosphere, got into the atmosphere, gets into the atmosphere, enters the environment, enters the atmosphere
Companies: location (it) makes its products, manufactures its products, produces its products, gets its products, makes its products, manufactures their products
Companies: date/time (it) began outsourcing, started outsourcing, outsourced
Countries: date (it) got its independence, gained independence, gained its independence, got independence, got their independence, won its independence, achieved independence, received its independence, gained its freedom
Diseases and disorders: length/duration (it) takes to show symptoms, takes to show up, takes to show, takes to appear, takes to manifest, takes to come out

Table 5: Groups of near-duplicate attributes identified for various classes. Attributes within a group are ranked according to their individual scores. Removing all but the first attribute of each group, from the ranked list of attributes of the respective class, improves the diversity of the list

Discussion:
The set of patterns shown in Table 1 is extensible. Moreover, the patterns are subject to errors. They may cause false matches, resulting in erroneous extractions. The extent to which this occurs is indirectly measured in the overall precision results. The modification of some of the patterns, or the addition of new ones, would likely affect the expected coverage and precision of the extracted attributes. If a pattern is particularly noisy, it is likely to cause systematic errors, and therefore produce attributes of lower quality.
Since attributes in Wikipedia and Freebase are initially entered manually by human editors, their correctness is virtually guaranteed. As for attributes extracted automatically, previous comparisons indicate that attributes tend to have higher quality when extracted from queries instead of documents (Paşca, 2007). Indeed, a set of extraction patterns applied to text produces attributes whose average precision at rank 50 is 0.44 when extracted from documents, vs. 0.63 from queries. More importantly, previously available or extracted attributes are virtually always simple, short noun phrases like nutritional value, taste or solubility in water. Even if not confined to noun phrases, they are still short (Van Durme et al., 2008; Paşca, 2012). In comparison, attributes extracted in this paper accommodate properties that are sometimes awkward or even impossible to express through short phrases.

Run: [Ranked Attributes for a Sample of Classes]
Class: Automobiles:
D: [(it) goes on sale, (it) will go on sale, (it) is an engineering playground, (it) will be available in japan, (it) shows up in japan, (it) is a technical tour de force, (it) unveiled at tas 2008, (it) runs a 7:38, (it) is a unique car, (it) uses a premium midship package, (it) features an all-new 3.8-litre, (it) is one of the fastest cars, (it) made a quick drive-by, ..]
Q: [price/quantity/degree (it) weights, year (it) was banned from bathurst, manner (it) launch control works, engine (it) has, kind of engine (it) has, price/quantity/degree (it) costs in japan, number of horsepower (it) has, price/quantity/degree horsepower (it) has, number of seats (it) has, speed (it)

Noncontiguous Attributes as Relations:
Noncontiguous attributes extracted from fact-seeking queries are embodiments of relations linking the instances mentioned in the queries, on one hand, and the values being requested by the queries, on the other hand.
Therefore, the method proposed in this paper can also be regarded as a method for the acquisition of relevant relations of various classes. The extracted relations specify the left argument (i.e., the instance) and the linking relation name (i.e., the attribute). For the right argument (i.e., the value being requested), they specify only its type, not its actual value.
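The view of an attribute as a relation with a typed but unfilled right argument can be made concrete with a small record type. The class name, the field names and the convention of reading the expected type from the text before the slot pronoun (it) are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Relation:
    left: str                          # left argument: the instance
    name: str                          # relation name: the extracted attribute
    right_type: Optional[str]          # type of the requested value, if any
    right_value: Optional[str] = None  # the actual value is not extracted

def attribute_to_relation(instance, attribute):
    """Wrap an instance and one of its attributes as a relation."""
    head, separator, _ = attribute.partition(" (it) ")
    return Relation(left=instance, name=attribute,
                    right_type=head if separator else None)

print(attribute_to_relation("minecraft", "date/time (it) came out for xbox 360"))
```

The right_value field always stays empty here, mirroring the fact that queries reveal which relations are relevant without supplying their values.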
An additional experiment compares the accuracy of relations extracted as noncontiguous attributes from queries, vs. relations extracted by a previous open-domain method (Fader et al., 2011) from 500 million Web documents. The previous method, including its extraction patterns and its ranking scheme, is designed with instances rather than classes in mind. For fairness to the method in (Fader et al., 2011), the evaluation procedure is slightly adjusted. The set of instances associated with each target class, over which the two methods are evaluated, is reduced to a single representative instance selected a priori. The instances are shown as the first instances in parentheses for each class in the earlier Table 2. Thus, the class attributes are extracted using only the instances keanu reeves, boeing 737 and bugs bunny in the case of the classes Actors, Aircraft and Animated characters respectively. Table 6 suggests that noncontiguous attributes extracted from queries tend to capture higher-quality relations than arbitrary relations extracted from documents. Because fact-seeking queries inquire about the value of some relations (attributes) of an instance, the relations themselves tend to be more relevant than relations extracted from arbitrary document sentences. Nevertheless, relations derived from queries likely serve as a useful complement to, rather than a replacement of, relations from documents. The former only discover what relations may be relevant; the latter also identify their occurrences within text.

Related Work
Sources of text from which relations (Zhu et al., 2009; Carlson et al., 2010; Lao et al., 2011) and, more specifically, attributes can be extracted include Web documents and data in human-compiled encyclopedias. In Web documents, attributes are available within unstructured (Tokunaga et al., 2005), structured (Raju et al., 2008) and semi-structured text (Yoshinaga and Torisawa, 2007), layout formatting tags (Wong et al., 2008), itemized lists or tables (Cafarella et al., 2008). In human-compiled encyclopedias (Wu and Weld, 2010), data relevant to attribute extraction includes infoboxes and category labels (Nastase and Strube, 2008; Hoffart et al., 2013) associated with Wikipedia articles. In order to acquire class attributes, a common strategy is to first acquire attributes of instances, then aggregate or propagate (Talukdar and Pereira, 2010) attributes, from instances to the classes to which the instances belong. The role of Web search queries, as an alternative textual data source to Web documents in open-domain information extraction, has been investigated in the tasks of attribute extraction (Paşca, 2007; Paşca, 2012), as well as in collecting sets of related instances (Jain and Pennacchiotti, 2010).
To increase diversity within a ranked list of attributes, the extraction method in this paper employs a synonym vocabulary to approximately identify groups of near-duplicate attributes. As reported for previous methods, the resulting lists may still contain lexically different but semantically equivalent attributes. Scenarios where detecting all equivalent attributes is important may benefit from other techniques for paraphrase acquisition (Madnani and Dorr, 2010).
Sophisticated techniques are sometimes employed to identify the type of the expected answers of open-domain questions (Pinchak et al., 2009). In comparison, the loose typing of the values of our noncontiguous attributes is mostly coarse-grained. It relies on wh-prefixes (when, how long, where, how) and possibly subsequent words (what nutritional value) from the queries, to determine whether the values are expected to be a date/time, length/duration, location, manner, nutritional value etc.
Relations extracted from document sentences (e.g., "Claude Monet was born in Paris") are tuples of an instance (claude monet), a text fragment acting as the lexicalized relation (was born in), and another instance (paris) (cf. (Fader et al., 2011; Mausam et al., 2012)). For convenience, the relation and second instance may be concatenated, as in was born in paris for claude monet. But document sentences mentioning an instance do not necessarily refer to properties of the instance that people other than the author of the document are likely to inquire about. Consequently, even top-ranked extracted relations occasionally include less informative ones, such as comes into view for mount rainier, is on the table for madeira wine, or allows for features for javascript (Fader et al., 2011). Comparatively, relations extracted via noncontiguous attributes from queries tend to refer to properties that have values that Web users inquire about in their search queries. Therefore, the relations extracted from queries are more likely to refer to salient properties, such as date/time (it) had its last eruption for mount rainier; length/duration (it) lasts for madeira wine; and manner (it) stores date information for javascript.

Conclusion
By requesting values for attributes of individual instances, fact-seeking and explanation-seeking queries implicitly assert the relevance of the properties encoded by the attributes, for the respective instances and their classes. The extracted attributes are not required to take the form of contiguous short phrases in the source queries, thus allowing for the acquisition of a broader range of attributes than those extracted by previous methods. Furthermore, since Web users are interested in their values, the relations to which the extracted attributes refer tend to be more relevant than relations extracted from arbitrary documents using previous methods. Current work explores the role of distributional similarities in expanding extracted attributes for narrow classes; and the extraction of noncontiguous attributes and relations from natural-language queries without a wh-prefix (e.g., cars driven by james bond).