Representing "Improvement": A Short Study on an RTE Example
===========================================================

WORKING NOTE 27
Peter Clark (peter.e.clark@boeing.com)
Jerry Hobbs (hobbs@isi.edu)
December 2007

Introduction
------------

This note is a short study of a Recognizing Textual Entailment (RTE) example, number T9 from our test suite:

  T: Apple says the new Intel dual-core chips improve performance by up to 39 percent over the previous single-core variety.
  H: Dual-core chips make the performance better.

The entailment problem is to show that H plausibly follows from T, but our goal here is a little deeper: to discuss what a meaningful representation of T and H might look like. For this note, we'll ignore the "Apple says" qualification (or assume that if "Apple says X" then X is true; this is a whole other dimension concerning contextual reasoning, outside the scope of this note).

Representing "improve"
----------------------

  T: Apple says the new Intel dual-core chips improve performance by up to 39 percent over the previous single-core variety.

The notion of "improve" is surprisingly difficult to capture, particularly in this context. Consider:

  "My car's gas mileage improved from 34 mpg to 38 mpg."

To capture this requires distinguishing parameters/attributes (e.g., gas mileage) from parameter/attribute *values* (e.g., 34 mpg). It also requires a notion of time, and a notion of which direction on a measuring scale is "better" (here, 38 is better than 34). "Better" might be subjective (e.g., price, depending on whether you're the buyer or seller). The parameter/value distinction also comes up in qualitative reasoning.
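The ingredients just listed (a parameter, its values, time points, and a scale with a "better" direction) can be made concrete in a small sketch. This is only an illustration under our own assumptions, not a proposed formalism: the names Scale, Reading, and improved are hypothetical, and time points are reduced to integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """A measuring scale plus the direction that counts as "better" (hypothetical)."""
    unit: str
    higher_is_better: bool

    def better(self, new: float, old: float) -> bool:
        # "Better" depends on the scale's direction, not just on magnitude.
        return new > old if self.higher_is_better else new < old


@dataclass(frozen=True)
class Reading:
    """The value of one parameter of one object at one time point (hypothetical)."""
    obj: str
    parameter: str   # the attribute itself, e.g. "gas mileage"
    value: float     # the attribute's value, e.g. 34
    time: int        # simplification: time points are just integers


def improved(before: Reading, after: Reading, scale: Scale) -> bool:
    """True if the same parameter of the same object has a better value at a later time."""
    same_subject = (before.obj, before.parameter) == (after.obj, after.parameter)
    return (same_subject
            and before.time < after.time
            and scale.better(after.value, before.value))


mpg = Scale(unit="mpg", higher_is_better=True)
t0 = Reading("mycar", "gas mileage", 34, time=0)
t1 = Reading("mycar", "gas mileage", 38, time=1)
print(improved(t0, t1, mpg))

# Subjectivity of "better": the same price change, judged on two scales.
buyer_price = Scale(unit="USD", higher_is_better=False)
seller_price = Scale(unit="USD", higher_is_better=True)
print(buyer_price.better(90, 100), seller_price.better(90, 100))
```

Keeping the direction of "better" on the scale rather than on the parameter is one way to let the same price change count as an improvement for the buyer but not for the seller.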
To confuse things, in language we often use the same word to refer to both a parameter and its value (in a given context):

  "My gas mileage (parameter value) is 34 mpg"
  "Gas mileage (parameter) improved"

"Improve" implicitly requires a notion of time (before and after, such that some parameter has a "better" value after than before).

Finally, what is the improvement between? What parameter values are being compared? If I say "My car's mileage improved", I am comparing the values of an object's parameter at two different times:

  mileage(mycar)@t0 vs. mileage(mycar)@t1.

But this RTE example is more complex:

  T: Apple says the new Intel dual-core chips improve performance by up to 39 percent over the previous single-core variety.

First, there is no specific computer whose performance is being measured; rather, the sentence contains generics, talking about some general level of performance.

Second, the comparison is essentially between different objects (computers using the previous variety of chips vs. those using the current variety) rather than of the *same* object at different time points.

Third, the comparison is somewhat hypothetical, between two computers identical in all ways except the chip (even though such a pair may not actually have been manufactured).

Fourth, the time dimension suggested by "improve" is somewhat murky, as the comparison can be seen as between two computers at the *same* time point, but using different chips. (It's not that any specific computer has gotten faster; rather, there's a new variety of computers faster than the old.) Or perhaps one could interpret this sentence as meaning "the typical performance of typical computers at time t0 (when single-core chips were used) < the typical performance of typical computers at time t1 (now that dual-core chips are used)", though that requires considering statistics over, and criteria for membership in, a population.
All in all, there is a plethora of representational issues to consider if one really wants to formalize the deep meaning. Nevertheless, we can skirt some of these issues and at least say something along the lines of Hobbs' axioms:

  ;;; improve -> x's value increases on some scale of "goodness".
  improve(artifact, x, from y, to z, on scale s) ->
      at(x,y) @ t0,
      at(x,z) @ t1,
      before(t0,t1),
      less-than(y, z, on scale s),
      goodness-scale(s, x).

Note that the text says nothing about *how* dual-core chips make the performance better, just that (somehow) they are responsible.

Representing "better"
---------------------

  9.H3 Dual-core chips make the performance better.

Again this is tricky because of the ellipsis (i.e., better than what?), which may again appeal to some hypothetical computer. If we assume some machinery for filling in the gaps:

  "make the performance [of computers using them] better [than computers using the previous variety]"

then we might axiomatize "better" as:

  better(y, than z, for x, on scale s) ->
      greater-than(y, z, on scale s),    ; (i.e., "more")
      goodness-scale(s, x).

The use of "make" (approximately "cause") here is also complex. If an agent makes/causes a state, then we would say the agent does something whose result is that state:

  make(x,state) -> do(x,e) & result(e,state)

where result() can be seen as a second-order STRIPS-like "effects" predicate. However, in this case the subject of "make" is not an agent but an artifact (dual-core chips). If the state were an attribute (e.g., "dual-core chips make performance fast"), then the implication is that the world "with" the artifact results in the state, but an alternative world "without" the artifact (everything else being equal) would not:

  make(x,y,state) ->
      ( with(x,y) -> state
        without(x,y) -> not state )

This is essentially the concept of causality. This example, though, is more complex because of the comparison operator ("better"), which is essentially comparing between these two alternative worlds.
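The "with"/"without" reading of "make", and its comparative variant for "better", can be sketched as follows. This is a minimal illustration under our own assumptions: a world is reduced to a dictionary of parameter values, the function names makes and makes_better are hypothetical, and the performance figures are invented for the example (T gives only "up to 39 percent", not absolute scores).

```python
from typing import Callable, Dict

# Hypothetical simplification: a "world" is just parameter name -> value.
World = Dict[str, float]


def makes(with_artifact: World, without_artifact: World,
          state: Callable[[World], bool]) -> bool:
    """make(x, y, state): the state holds with the artifact and fails without it."""
    return state(with_artifact) and not state(without_artifact)


def makes_better(with_artifact: World, without_artifact: World,
                 parameter: str, higher_is_better: bool = True) -> bool:
    """Comparative case: the parameter is better in the 'with' world than in the 'without' world."""
    new, old = with_artifact[parameter], without_artifact[parameter]
    return new > old if higher_is_better else new < old


# Invented numbers, for illustration only.
single_core = {"performance": 100.0}
dual_core = {"performance": 139.0}

# Attribute case ("dual-core chips make performance fast"), with an arbitrary threshold.
print(makes(dual_core, single_core, lambda w: w["performance"] > 120))

# Comparative case ("dual-core chips make the performance better").
print(makes_better(dual_core, single_core, "performance"))
```

In the attribute case the state is tested separately in each world, whereas in the comparative case the state is itself a relation between the two worlds, which is what makes "make ... better" harder to factor into independent axioms.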
If we simplify and assume the comparison is over time, one might formalize "make" as:

  make(artifact, state) @ t0 -> state @ t1

e.g.,

  ;;; x = performance of computers
  make(artifact, better(x)) @ t0 -> better(x) @ t1

The notion of "better" can similarly be thought of as "more good". Again assuming that the comparison is of a parameter value of an artifact at different time points, we have:

  better(x) @ t1 ->
      at(x,y) @ t0,
      at(x,z) @ t1,
      before(t0,t1),
      less-than(y, z, on scale s),
      goodness-scale(s, x).

This is similar to the formalization of "improve", and would enable the entailment T -> H to be inferred. Again note that the text says nothing about *how* the artifact (dual-core chips) makes the performance better.

Deep vs. Shallow Representations
--------------------------------

The interesting thing is that while a shallow DIRT-like rule might hypothetically capture the shallow equivalence:

  improve(x,y) -> make(x,better(y))

this fails to capture the *meaning* of these similar words and thus fails to support many inferences which follow, e.g.,

  The older chips resulted in worse performance.

It also conflates the meanings of "make" and "better" rather than factoring them into separate axioms. WordNet also provides the shallow knowledge of the equivalence ("improve" and "better"(v) are synonyms).

For the specific T/H pair above, the deep meaning really is somewhat overkill for the task: all that is required is to show the equivalence of "improve" and "better"(v). However, for additional tasks deeper understanding is needed.

Wider Representational Issues
-----------------------------

  T: Apple says the new Intel dual-core chips improve performance by up to 39 percent over the previous single-core variety.

In addition to the meanings of "improve" and "better", there are many other things going on here.
In particular, the text is talking about the performance of computers (not mentioned); the chips are (key functional) parts of those computers; the performance metric is speed; and Apple manufactures the computers. It would be interesting to add other H sentences which test realization of this knowledge, e.g.,

  H: Apple's new computers are faster than their old ones.
  H: Apple's new computers use dual-core chips.

Even more broadly, there is a kind of "business script" at play in this example: companies make things, they continuously try to improve their products, they advertise their improvements to increase sales, and there is a constant cycle of innovations and refinements as a result of this process. If one can recognize that this script is at play, then broader conclusions can be drawn, e.g.,

  Apple manufactures a product
  Apple have improved their product
  Sales of Apple's product may increase in the near future
  Apple have a new version of their product out

That is, a strong top-down set of expectations like this can significantly help in understanding what is going on in the example, as opposed to trying to derive an understanding from a largely bottom-up process. In the end, both bottom-up and top-down reasoning are necessary for a full understanding.

Summary
-------

Where does this all leave us?

1. On the one hand, it seems clear that "shallow" representations, where the representational structures mimic the surface syntactic structures, really are not adequate for capturing many aspects of meaning. While DIRT-like paraphrase rules can provide some surface implications, there clearly seems to be a limit.

2. The "deep" knowledge, e.g., the formalization of "improve", is time-consuming to write, requires making some ontological decisions, and also seems unlikely to be acquired automatically.
A dictionary, for example, is unlikely to provide anything even close to the formalization above, e.g.,

  "improve: to make better" (WordNet)
  "improve: to enhance in value or quality" (Merriam-Webster)

The hope for creating a broad-coverage formalization is that such axioms are only needed for very general phenomena, i.e., the task is intricate but limited: maybe only 1000 concepts need to be formalized, which is likely a feasible task.

3. Orthogonally to the meaning of "improve", this example illustrates many other complex linguistic phenomena. The bottom line is that there can be huge variation, imprecision, approximation, and gaps in how a phenomenon is expressed in language. Without strong expectations to correct these, any syntax-derived representational structures will preserve all these undesirable phenomena, making subsequent reasoning very difficult. This again argues for an important role for top-down expectations in language understanding.

-- end --