Interpretability for Morphological Inflection: from Character-level Predictions to Subword-level Rules

Neural models for morphological inflection have recently attained very high results. However, their interpretation remains challenging. Towards this goal, we propose a simple linguistically-motivated variant of the encoder-decoder model with attention. In our model, the character-level cross-attention mechanism is complemented with a self-attention module over substrings of the input. We design a novel approach for extracting patterns from attention weights to interpret what the model learns. We apply our methodology to analyze the model's decisions on three typologically different languages and find that a) our pattern extraction method applied to cross-attention weights uncovers variation in the form of inflection morphemes, b) pattern extraction from self-attention shows triggers for such variation, and c) both types of patterns are closely aligned with grammar inflection classes and class assignment criteria, for all three languages. Additionally, we find that the proposed encoder attention component leads to consistent performance improvements over a strong baseline.


Introduction
With the rise of deep learning, neural networks are nowadays used for decision making in domains as different as trading, medicine, and government. Ethical considerations of such decisions have led to an increasing need for interpreting neural models, which is a vibrant research topic in the machine learning community (Lipton, 2018; Gilpin et al., 2018). Although interpretability research in NLP is partly driven by ethics (Jacovi and Goldberg, 2020), there is a growing body of work exploring what linguistic properties emerge in neural models (Belinkov and Glass, 2019; Manning et al., 2020). The latter line of work aims to aid and scale up linguistic research, which is also the topic of this paper. Linguistic research focuses on uncovering patterns and regularities in language. Retrieving and analyzing the structures of languages learned by neural agents can systematize our knowledge and, ideally, help us come up with new regularities. Recent advances (Schrimpf et al., 2020) in testing hypotheses about human language processing using the growing suite of modern interpretable NLP are, indeed, inspiring, but still relatively limited to a few languages. To scale up linguistic research, we require truly language-independent models developed for languages other than English (Bender, 2011). In return, understanding the model's decisions can lead to new ideas on how to improve the performance of the model on hard cases, i.e. on a particular linguistic phenomenon or language.
In our work, we concentrate on interpretability methods suitable for examining what knowledge of inflectional morphology is captured by neural networks. Specifically, we consider a neural model that learns a mapping from a lemma and an abstract morpho-syntactic definition (MSD) to its inflected form. An MSD comprises a part-of-speech (POS) tag as well as language-specific inflection tags. For example, given the Italian lemma scolorire "discolor" and the MSD V;IND;PRS;3;PL (verb, 3rd person plural present indicative form), the output is the word form scoloriscono. Datasets for this task of morphological reinflection are available for many languages and provide an opportunity to study a broad set of inflection phenomena.
Character-level encoder-decoder neural models with attention achieved very high performance on this task across many languages (see Cotterell et al., 2017, 2018; Vylomova et al., 2020 for the results of the recent shared tasks). Nevertheless, this class of models is typically not interpreted, and when it is, the interpretation is limited to visualizing attention heatmaps on selected examples (see e.g. Aharoni and Goldberg 2017; Peters and Martins 2019). We argue that per-example heatmaps provide very limited insight into what a neural agent learns about a specific inflection phenomenon and how the neural learning process can be related, in a systematic way, to linguistic theory. Indeed, consider the previous Italian example: how would humans reason to convert the lemma to its inflected form?[1] In this work, we assume that humans apply the rules of grammar (implicit for native speakers, and explicit for language learners) when they perform this task. Specifically for our example, human reasoning could look as follows: the verb scolorire ends with the suffix -ire, which determines its inflection class[2], according to which we construct the inflected form by copying the stem scolor- and adding an inflection suffix -iscono. What does a typical character-level model of the encoder-decoder class do? We visualize the learned attention weights of such a model for an example in Fig. 1. By analyzing the most prominent alignments, we conclude that the model's character-level decisions can be combined into a) copying a substring of characters corresponding to the stem and b) generating characters for a substring corresponding to the inflection suffix. However, the reason why the model chose a particular inflection class is not visible.
To reach interpretability of models for inflection, we require methods that satisfy three conditions. First, the inflection model's decisions have to be aligned more closely with human reasoning by separating two kinds of operations: determining the inflection class versus generating a string (given the assignment to a class). Second, a systematic analysis of the model's decisions requires the extraction of inflection rules that are interpretable to humans. Finally, both of these factors require working with subword units rather than individual characters, the latter being the prevailing practice for inflection models.

[1] This is something that speakers implicitly perform every time they use a word (though we do not know this for a fact). It is also often an explicit task that language learners perform in the process of acquiring a new language.

[2] There are three inflection classes in Italian (Table 1). Verbs ending in -ire select between two classes: one (in the example) is more frequent in terms of the number of verb types that belong to it, whereas the other (the -ere class) is selected by some very frequent verbs ending in -ire.
In this paper, we propose a methodology, consisting of a subword modification for a typical inflection model and an interpretation method, which meets all three requirements. Our experiments on three typologically different inflection phenomena show that the linguistic rules elicited with our framework are highly consistent with linguistic knowledge (approximated by grammars). We evaluate the effectiveness of the proposed subword modification and find that, apart from its direct impact on interpretability, it leads to consistent performance improvements. To facilitate the use of our methods for linguistic research, we share our code.[3]

Methodology: Interpretability for Inflection
How can we make a character-level neural model for inflection more interpretable? We take the stance that, to make current models more interpretable, we should analyze their decisions in terms of subwords, i.e. clusters of characters rather than individual characters. Interpreting character decisions is outside of human intuition because of the double articulation principle (Martinet, 1967), which postulates that single phones are uninterpretable to humans, whereas clusters of them form a mental linguistic representation of meaning in the speaker's mind. In writing, this distinction maps to the one between single characters and morphemes.

We propose to extract human-interpretable rules from an encoder-decoder model with attention. Specifically, to make it more interpretable, we modify such a model by complementing the cross-attention mechanism with a novel component (§3) for self-attention over the subwords of the lemma. The task of this component is to help identify the morphological class. To extract the rules, we design a pattern extraction method (§4) that aggregates learned attention weights a) over a span of characters in a word and b) over a range of words in the same inflection category. Pattern extraction applied to character-level cross-attention weights retrieves linguistic rules satisfying requirements P1 and P2, whereas pattern extraction applied to subword-level self-attention weights targets requirements P3 and P4.

Case studies
To demonstrate the use of our approach for linguistic research, we will analyze how well patterns extracted with our proposed methodology align with human knowledge of typologically different phenomena. In morphological typology, cross-linguistic strategies for defining the form and meaning of morphemes are described by typological parameters (Shopen, 1985; Dryer and Haspelmath, 2013; Bickel and Nichols, 2007) that separate different dimensions of the strategies. To select typologically different languages for our study, we focus on the fusion and flexivity dimensions. Fusion classifies how easy it is to find a boundary between a morpheme and its phonological host and can take the following values: isolating (a separate phonological word), concatenative (segmentable dependent morphemes), and nonlinear (non-segmentable morphemes). Flexivity indicates whether variation in morpheme form can be explained by phonological processes (nonflexive) or not (flexive).
In our case studies, we consider verb conjugation rules and select three languages covering different degrees of fusion and flexivity: Finnish, Italian, and Tagalog. In Finnish and Italian, morphemes are separable (concatenative fusion), whereas inflection in Tagalog is formed with affixes, including infixes, and reduplication (nonlinear fusion). Morphemes in Finnish can change their shape because of vowel harmony (the nonflexive case). Forms of morphemes in Italian and Tagalog are selected by lexical context (the flexive case), but in distinct ways: Italian verbs are conjugated with respect to three inflection classes defined by the lemma's ending (-are, -ire, and -ere, see Table 1), whereas assignment to an inflection class in Tagalog has no explicit rule.

Neural Model Cross-Att ch Self-Att sub
In this section, we introduce our novel component for self-attention over subwords of the lemma (Self-Att sub). This module can be integrated into any variant of an encoder-decoder system for inflection with character-level cross-attention (Cross-Att ch). In this work, we show such an integration into the sparse two-headed model of Peters and Martins (2019).[5] To explain the two-headed attention mechanism of the baseline model as well as our novel attention component, we first introduce the terminology of an abstract attention head module.

Attention Head Given an input sequence of vectors H = [h_1, . . . , h_J], h_j ∈ R^{D_1}, and a query vector q ∈ R^{D_2}, the attention head module computes two components: attention weights a ∈ R^J and an attention head vector c ∈ R^{D_1}:

α_j = score(h_j, q),   a = π(α)   (1)
c = Σ_{j=1}^{J} a_j h_j   (2)

where the attention weights a are obtained by a mapping function π from real values to probabilities, applied to the alignment scores α. Following Peters and Martins (2019), we use sparsemax activations[6] as the mapping function. Hereafter we refer to the construction of an attention head vector c as scoring a sequence of vectors H with a query vector q.
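The attention head abstraction can be sketched in a few lines of NumPy. This is a minimal sketch: the bilinear scoring matrix W is a hypothetical choice of score function (the paper does not specify one), and only the forward computation is shown.

```python
import numpy as np

def sparsemax(z):
    """Sparsemax: Euclidean projection of the score vector onto the
    probability simplex; unlike softmax, it yields sparse probabilities."""
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, len(z) + 1)
    cumsum = np.cumsum(z_sorted)
    # support: largest k such that 1 + k * z_(k) > sum of the top-k scores
    support = k * z_sorted > cumsum - 1
    k_max = k[support][-1]
    tau = (cumsum[support][-1] - 1) / k_max
    return np.maximum(z - tau, 0.0)

def attention_head(H, q, W):
    """Score a sequence H (J x D1) with a query q (D2,) through a bilinear
    map W (D1 x D2, a hypothetical scoring choice); return the sparse
    attention weights a and the head vector c = sum_j a_j * h_j."""
    alpha = H @ W @ q        # alignment scores, shape (J,)
    a = sparsemax(alpha)     # sparse attention weights summing to 1
    c = a @ H                # attention head vector, shape (D1,)
    return a, c
```

Note how sparsemax assigns exactly zero weight to weakly-scored positions, which is what later makes "nonzero weight" a meaningful filtering criterion for pattern extraction.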
Baseline Cross-Att ch (Peters and Martins, 2019) The input lemma and MSD sequences are represented by separate bi-LSTM encoder states: H^u encodes the characters in the lemma, and H^v encodes the tags in the MSD. The decoder is a unidirectional LSTM with input feeding (Luong et al., 2015). At each prediction time t, it computes a hidden state s_t, which is followed by the construction of two attention heads u_t and v_t: one for the lemma and one for the MSD. They are calculated by scoring the respective representations, H^u and H^v, with a query, the decoder state s_t. The two attention heads are used to compute separate candidate attentional decoder states:

s̃^u_t = tanh(W^u [u_t; s_t]),   s̃^v_t = tanh(W^v [v_t; s_t])   (3)

They are combined in a weighted sum to obtain an attentional decoder state s̃_t, where the weights are calculated by a sparse gate vector p_t = [p_0, p_1] ∈ R^2:

p_t = sparsemax(W^p s_t),   s̃_t = p_0 s̃^u_t + p_1 s̃^v_t   (4)

The attentional decoder state is fed into a sparse prediction layer. For input feeding, the input to the decoder comprises the embedding of the predicted symbol and the gated attention vector c_t:

c_t = p_0 u_t + p_1 v_t   (5)

The two-headed gate mechanism provides extra interpretability in the form of a three-way answer about what is relevant at a time step: the lemma, the inflection tags, or both.
Integrating Self-Att sub We depart from the existing character-level solution in that we assume that the input to the model, a (lemma, MSD) pair, is complemented with a segmentation of the lemma into subwords.[7] We obtain an extra subword representation of the lemma, H^subw, by averaging the lemma representation vectors in H^u spanning the characters within each subword. Besides the attention heads u_t and v_t computed at each generation step, we construct an additional attention head vector m, which is computed once before the decoding stage. It is constructed by scoring the sequence of lemma subword representations in H^subw with a query vector q_pos corresponding to the encoding of the lemma's POS tag. This encoding is obtained by selecting the vector in the MSD representation H^v corresponding to the position of the POS tag (e.g., the POS tag V (verb) is in the first position in the MSD V;IND;PRS;3;PL).
To integrate the subword-level attention head m into the baseline system, we modify the gate layer in Eq. 4 so that the gate is also conditioned on m:

p_t = sparsemax(W^p [s_t; m])   (6)

In this way, the gate mechanism (and decoding) is expected to be informed by a signal for inflection class selection when such a signal can be attributed to specific character spans (subwords) in the lemma. The attention over subwords is static and shared across target positions, aiming to separate the class assignment signal it conveys from the local character transformations performed given this assignment.
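The construction of H^subw from the character-level encoder states can be sketched as follows. This assumes the segmentation is given as half-open character-offset spans; the function name is ours.

```python
import numpy as np

def subword_representations(H_u, spans):
    """Average the character-level encoder states H_u (n x d) over each
    subword span to obtain H_subw (p x d). `spans` lists half-open
    (start, end) character offsets, e.g. for the segmentation s|col|or|i|re:
    [(0, 1), (1, 4), (4, 6), (6, 7), (7, 9)]."""
    return np.stack([H_u[start:end].mean(axis=0) for start, end in spans])
```

The resulting H^subw is then scored once with the POS-tag query q_pos (using the same attention head module as above) to produce the static head vector m.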

Pattern extraction
To extract linguistic rules from the trained model Cross-Att ch Self-Att sub, we represent its knowledge of inflection as a database. To populate it, we analyze the predictions of the model on a dataset. The latter can be the original task data or a dataset collected to study a specific inflection phenomenon. Then, for each example in the dataset, we populate the knowledge database with the example itself and two patterns extracted from the learned attention weights. The first is a transformation pattern (§4.1) obtained by applying our pattern extraction method to the learned cross-attention weights (the Cross-Att ch component). This method can be applied to any inflection model embedded in the encoder-decoder paradigm with attention. The second is a pattern over the lemma subwords (§4.2), obtained from the attention weights of the novel Self-Att sub component. Finally, we explain how the knowledge database populated in this way can be queried to study inflection phenomena (§4.3).

Cross-Att ch Transformation Patterns
This method maps each example (lemma, MSD) → inflected form to a transformation pattern of the form P_tr(lemma) → P_tr(inflected form). Formally, the input to the algorithm is a lemma X = x_1 . . . x_n, an MSD F = f_1 . . . f_l, a predicted target form Y = y_1 . . . y_m, and cross-attention weights over the lemma characters A^X = a^X_1 . . . a^X_m, a^X_j ∈ R^n, and over the MSD tags A^F = a^F_1 . . . a^F_m, a^F_j ∈ R^l.[8] The output is a string of the form P_tr(X) → P_tr(Y), where the pattern representations P_tr for the lemma and the target are built through the following steps (shown in Table 2 for our example in Fig. 1).

[8] We assume that the sum of the weights in the combined vector [A^X; A^F] is 1. In the Cross-Att ch Self-Att sub model this is achieved by scaling the cross-attention weights for the lemma and the MSD with the corresponding gate values. Another way, typical for neural inflection models of the encoder-decoder class, is to run cross-attention over a concatenation of the lemma's characters and the MSD's tags.

[Table 2: pattern extraction steps for the running example, with A^X and A^F as in Figure 1. Step 1: transform attention weights into 'salient' alignments A = [X1 · · · X7 (copy), F4, F3, F2, F4, F4]. Step 2: invert A and group prediction steps by generation type. Step 3: replace characters in X and Y with indexed generation type symbols, giving P_tr(X) = c_1 · · · c_1 re and P_tr(Y) = c_1 · · · c_1 f4_1 f3_1 f2_1 f4_2. Step 4: collapse adjacent symbols, giving P_tr(X) = c_1 re.]

Step 1. Transform input attention weights A^X and A^F into "salient" alignments A: Each component a_j of the salient alignments A = a_1 . . . a_m is a set of input positions (in lemma X and/or MSD F) that provide the most significant contributions to predicting the character in Y at position j. We denote positions by capitalized symbols, i.e. F1 for position 1 in F, to reflect the difference between a position's index and its value. Salient alignments are built by applying a filtering function φ to the attention weights at each predicted position: φ : [a^X_j; a^F_j] → a_j. In the following, we illustrate how our algorithm works for the simplest choice of filtering function, max-pooling, which selects the single input position with the highest attention weight.[9] In our running example, this strategy results in exactly one element for each component a_j, e.g. a_7 = X7.
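Step 1 with the max-pooling filtering function reduces to a few lines of plain Python. This is a sketch under the assumption (footnote 8) that the lemma and MSD weight vectors at each prediction step jointly sum to 1.

```python
def salient_alignments(A_X, A_F):
    """Max-pooling filtering: for each prediction step j, pick the single
    input position (lemma character X_i or MSD tag F_k) with the highest
    combined attention weight. Positions are returned 1-based, as in the
    paper's notation (e.g. ('X', 7) for X7)."""
    salient = []
    for a_x, a_f in zip(A_X, A_F):              # weights at prediction step j
        combined = list(a_x) + list(a_f)        # [a^X_j ; a^F_j], sums to 1
        i = max(range(len(combined)), key=combined.__getitem__)
        if i < len(a_x):
            salient.append(('X', i + 1))        # position in the lemma
        else:
            salient.append(('F', i - len(a_x) + 1))  # position in the MSD
    return salient
```

Other filtering functions (e.g. keeping all nonzero weights, which is natural for sparsemax activations) would return a set of positions per step instead of a single one.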
Step 2. Invert the mapping A and group prediction steps by generation type: By inverting the salient alignments, we construct a mapping from input positions to prediction steps grouped by a symbol corresponding to the generation type. The latter is identified for each alignment a_j by the type of input: we denote generation from the lemma's characters (a_j = X_i) by the symbol g, whereas generation from a tag (a_j = F_k) is denoted by the indexed symbol f_k. The special case of copying a character from the lemma, i.e. a_j = X_i and x_i = y_j, is denoted by the symbol c. Thus, a position in F can be mapped to only one group of prediction steps (the type of generation is unique and defined by the tag's position), whereas a position in X can be mapped to up to two groups, g and c. Some input positions might be absent from the constructed mapping, if they are not present in the salient alignments, e.g. X9 in Table 2.
Step 3. Replace characters in X and Y with indexed generation type symbols: We index (in the order of input positions) the triples of salient alignments (input position, generation step, generation type) identified in the previous step. Then, we construct the patterns of the lemma and the inflected form by replacing characters at aligned positions with an indexed value of the generation type symbol. In X, this can result in an aggregated symbol: e.g., replacing X_i with c_{1;2}; g_1 means that position X_i is aligned to three target positions, two of which are generated by copying x_i. As illustrated in our running example, we use the same index value in two special cases: a) a whole target substring was copied, and b) a whole target substring was generated by the same tag. We keep track of the symbolic mappings from characters to the indexed generation symbols that replace them.
Step 4. Collapse adjacent symbols: We scan the representations P_tr(X) and P_tr(Y) built at the previous step and iteratively collapse adjacent symbols of the same value. At the same time, we update the symbolic mapping: if two adjacent symbols are collapsed, we replace their string mappings with a single mapping from the concatenation of the strings to the generation symbol.
The idea behind the inverse mapping and indexing in Steps 2 and 3 is to ensure a unique way of indexing generation symbols across all data pairs. The indices themselves are essential to keep a one-to-one mapping from substrings to the generation symbols they are replaced with. Both factors come into play when we query the knowledge database for an inflection phenomenon (§4.3).
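Steps 2-4 can be sketched as follows. This is a simplified version of the full algorithm: indexing here counts runs per symbol rather than implementing the paper's exact triple indexing, and the symbolic-mapping bookkeeping is omitted.

```python
def transformation_pattern(lemma, target, salient):
    """Build a simplified P_tr(lemma) -> P_tr(target) string from salient
    alignments [('X', i) | ('F', k)], one per target position (max-pooling)."""
    # Step 2: a generation type symbol for every target position
    pos_sym, symbols = {}, []
    for j, (src, pos) in enumerate(salient):
        if src == 'X':                       # aligned to lemma character x_pos
            sym = 'c' if lemma[pos - 1] == target[j] else 'g'
            pos_sym[pos] = sym
        else:                                # aligned to MSD tag f_pos
            sym = 'f%d' % pos
        symbols.append(sym)
    # Step 4 (target side): collapse runs of equal symbols, index repetitions
    runs = [s for i, s in enumerate(symbols) if i == 0 or s != symbols[i - 1]]
    counts, rhs = {}, []
    for s in runs:
        counts[s] = counts.get(s, 0) + 1
        rhs.append(s if s == 'c' else '%s_%d' % (s, counts[s]))
    # Lemma side: replace aligned characters by their symbol, keep the rest
    lhs = []
    for i, ch in enumerate(lemma):
        s = pos_sym.get(i + 1, ch)
        if s in ('c', 'g') and lhs and lhs[-1] == s:
            continue                         # collapse adjacent copy symbols
        if s not in ('c', 'g') and lhs and lhs[-1] not in ('c', 'g'):
            lhs[-1] += s                     # merge adjacent literal characters
        else:
            lhs.append(s)
    return ' '.join(lhs) + ' -> ' + ' '.join(rhs)
```

On the running example (scolorire → scoloriscono, with the Fig. 1 alignments X1-X7 copied and the suffix generated by tags F4, F3, F2, F4, F4) this yields "c re -> c f4_1 f3_1 f2_1 f4_2", matching the pattern in Table 2 up to the simplified indexing.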

Self-Att sub Lemma Patterns
This algorithm takes as input a data example (X, Y, F), along with a segmented lemma representation S(X) = s_1 . . . s_p and the learned self-attention weights over the lemma's subwords: a^{S(X)} ∈ R^p. The output is a pattern for the salient subwords in the lemma, P_l(X), which is built with a procedure similar to the one described above, where the indexing Steps 2 and 3 are skipped.
First, we transform the self-attention weights a^{S(X)} into salient alignments a by applying a filtering function φ : a^{S(X)} → a, thereby identifying the set of subword positions with the most significant contribution to the overall generation process (any type of filtering function described in the previous subsection can be applied). Afterward, we replace all subwords in the input lemma at nonsalient positions, S_j ∉ a, with a dedicated symbol, e.g. an asterisk *. Finally, we iteratively merge adjacent asterisk symbols to obtain a more general pattern. To illustrate with our running example: given the segmented lemma representation S(X) = s|col|or|i|re and the salient alignments a = {S4, S5}, obtained by filtering input positions with nonzero self-attention weights, the resulting pattern for the salient subwords in the lemma is P_l(X) = *ire.
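The lemma pattern construction reduces to a few lines of plain Python (a sketch using the nonzero-weight filtering function, as in the example above):

```python
def lemma_pattern(subwords, weights):
    """Replace nonsalient subwords (zero self-attention weight) with '*'
    and merge runs of adjacent asterisks into a single one."""
    masked = [sw if w > 0 else '*' for sw, w in zip(subwords, weights)]
    pattern = []
    for s in masked:
        if s == '*' and pattern and pattern[-1] == '*':
            continue                  # merge adjacent asterisks
        pattern.append(s)
    return ''.join(pattern)
```

For the running example, `lemma_pattern(['s', 'col', 'or', 'i', 're'], [0, 0, 0, 0.6, 0.4])` produces the pattern `*ire`.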

Querying Patterns
As a result of applying the previous two methods, each data example (X, Y, F), along with its segmented lemma representation S(X) and learned attention weights (A^X, A^F, a^{S(X)}), can be mapped to two items: a Cross-Att ch transformation pattern P_tr(X) → P_tr(Y) and a Self-Att sub pattern for the salient subwords in the lemma, P_l(X). The data examples, along with the extracted patterns, are stored in a knowledge database. To systematically study how the neural model handles a specific linguistic phenomenon of interest, the database can be queried, for patterns and examples, with the phenomenon's formalization in the form of regular expressions applied to the lemma, the inflected form, or the MSD. The examples selected with a query are then grouped by their patterns (either transformation or lemma patterns), with each resulting group representing an induced linguistic rule for the phenomenon.
At this stage, to make the patterns more readable, we perform an unmasking operation within each group: if a particular symbol substitutes one substring that is the same for all examples within a group, we replace the symbol back with this substring. For instance, if the pattern from our example, c_1 re → c_1 f4_1 f3_1 f2_1 f4_2, represents one such group, and the symbol f4_2 substitutes only one string, no, which is the same across all data points in the group, we can unmask the string to obtain the pattern c_1 re → c_1 f4_1 f3_1 f2_1 no.
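Querying and grouping can be sketched as follows. The database record keys `lemma`, `msd`, and `pattern` are hypothetical names for the stored fields; unmasking bookkeeping is omitted.

```python
import re
from collections import defaultdict

def query_and_group(db, field_regexes, pattern_key='pattern'):
    """Select examples whose fields match all given regular expressions,
    then group them by their extracted pattern; each resulting group
    corresponds to one induced linguistic rule."""
    groups = defaultdict(list)
    for example in db:
        if all(re.search(rx, example[field])
               for field, rx in field_regexes.items()):
            groups[example[pattern_key]].append(example)
    return dict(groups)
```

A query for the Italian 3rd person plural present indicative would then be `query_and_group(db, {'msd': r'V;IND;PRS;3;PL'})`, with the group sizes and per-group accuracies giving the coverage statistics reported in the experiments.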

Experiments and Results
We perform three case studies, introduced in §2.1, to demonstrate how our framework allows querying patterns learned by a neural inflection model. The goal of our experiments is to assess how well the extracted patterns correspond to known inflection rules. To see whether our modifications to the inflection model affect its performance, we check the inflection accuracy on the analyzed languages and compare it with that of the original character-level model.
We use data from the SIGMORPHON shared tasks: the 2018 edition for Italian and Finnish (10K/1K/1K examples in train/development/test data), and the 2020 edition for Tagalog (1,870/236/478). For each language, we train the Cross-Att ch Self-Att sub model with batch size 4, beam size 1, and other hyperparameters as reported in Peters and Martins (2019). To produce the segmented lemma input, we use the BPE method (Gage, 1994; Sennrich et al., 2016b) with 1K merges on a token list (100K examples) extracted from Wikipedia dump articles.[10]
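A minimal sketch of the BPE procedure used for segmentation (pure Python; the experiments would in practice use an existing implementation such as subword-nmt, so this is illustrative only):

```python
from collections import Counter

def learn_bpe(tokens, num_merges):
    """Repeatedly merge the most frequent adjacent symbol pair over a
    frequency-weighted vocabulary of character sequences; return the
    ordered list of learned merges."""
    vocab = Counter(tuple(t) for t in tokens)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for syms, freq in vocab.items():
            for pair in zip(syms, syms[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for syms, freq in vocab.items():
            new_vocab[tuple(_merge(list(syms), best))] += freq
        vocab = new_vocab
    return merges

def _merge(syms, pair):
    """Apply a single merge operation to a symbol sequence."""
    out, i = [], 0
    while i < len(syms):
        if i < len(syms) - 1 and (syms[i], syms[i + 1]) == pair:
            out.append(syms[i] + syms[i + 1])
            i += 2
        else:
            out.append(syms[i])
            i += 1
    return out

def segment(word, merges):
    """Segment a new word (e.g. a lemma) by replaying the learned merges."""
    syms = list(word)
    for pair in merges:
        syms = _merge(syms, pair)
    return syms
```

With 1K merges, frequent endings such as Italian -re or -i surface as single subwords, which is exactly what the Self-Att sub patterns later pick up on.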
Using the model's predictions on the concatenation of the train, development, and test sets, we query Cross-Att ch and Self-Att sub patterns. As a filtering function, we keep only nonzero weights for Self-Att sub patterns, whereas we choose max-pooling for Cross-Att ch patterns, as on average the sparse activations assign nonzero weight to one input feature. To systematically examine whether the classes of patterns extracted are correct and cover the data adequately, we report two metrics: a) the number of examples selected with a query and how many of them are grouped under each pattern, and b) model accuracy (correct predictions) with respect to the number of examples per query selection and per group pattern.
Cross-Att ch : Transformation Patterns For each language, we define specific queries: 3rd person plural present tense for Italian (MSD=V;IND;PRS;3;PL), 3rd person plural present positive imperative for Finnish (MSD=V;ACT;PRS;POS;IMP;3;PL), and imperfective aspect with agent semantic role for Tagalog (MSD=V;IPFV;AGFOC). The choice of MSDs is rather arbitrary: for illustrative purposes, we select grammatical categories that contain enough examples to represent the form variation of the corresponding morpheme. Table 3 presents the extracted Cross-Att ch patterns. For each query, we show patterns that group at least 5% of the examples selected with the query. The patterns are sorted by their number of examples in decreasing order. For each presented pattern, we show an example mapped to this pattern and symbol mapping information. The latter lists, for each symbol in the pattern, all substrings mapped to this symbol along with their frequencies (within the group), if the number of distinct substrings is less than five. Otherwise, we show the average length (≈) of the substrings mapped to this symbol, or the exact length (=), if it is the same for all of them. These symbol mappings also include bijection cases (↔) that were unmasked after grouping examples (as described in §4.3).
Self-Att sub : Lemma Patterns We analyze Self-Att sub lemma patterns to determine whether our model indeed uses subword segments when choosing the specific variant of a morpheme. Concretely, we use regular expressions on the target form to select examples corresponding to a specific morpheme form, identified above with the transformation patterns. Then, we map the selected examples to their Self-Att sub patterns. Table 4 presents the queries and extracted patterns.[11] For each query, we list the most frequent patterns (sorted by frequency in decreasing order) along with one segmented lemma example mapped to each pattern.[12] The segments of the lemma examples identified as salient (and presented in the patterns) are highlighted in bold.
We conclude that the subword regions identified by the Self-Att sub patterns conform to a great extent to the triggers of morpheme form variation listed in grammars. We note that although the regions for finding such clues (when they are phonological or lexical, and frequent) look plausible, their form is influenced by the results of BPE segmentation and may not be perfectly aligned with grammars. For example, the Italian patterns show that the model's focus is on the endings of lemmas for all three classes. In the case of reflexive verbs, where the reflexive ending -si tends to be split off into a separate subword by BPE, the model correctly places its focus on the more informative penultimate segment. The patterns extracted for Finnish display the grammar rules too: the focus on the lemma endings -aa and -ua for the first group, and -ää/-ä for the second group, points directly to the harmony of back and front vowels, respectively. The model does not search for clues in the vowel patterns of the stem but chooses a smart strategy of focusing directly on the inflection endings of lemmas: they are frequent and already agree with the vowels found elsewhere in the stem to the left.

[11] For Tagalog, we note that we do not find any frequent patterns for e.g. the query "gold target=nag*", which is in line with the absence of explicit criteria for inflection class assignment in this language.

[12] We refer to the Appendix, Tables 6-7, for the full list of extracted patterns.

Self-Att sub : Performance Impact We evaluate the impact of the novel Self-Att sub component by comparing the performance of Cross-Att ch Self-Att sub with that of the baseline model, Cross-Att ch. For reference, we include the results of a) the hard monotonic attention (HMA) system of Wu and Cotterell (2019), which currently holds the state of the art on the reinflection task, obtained by rerunning their code; and b) a variant of our system, Cross-Att ch Self-Att ch, where the encoder attention module is run over the characters of the lemma instead of subwords. The latter corresponds to the limiting case of lemma segmentation where each character is a segment. We report accuracy and edit distance on the test set in Table 5. Additionally, we provide the number of trained parameters for each model. The number of parameters of the Cross-Att ch Self-Att sub model is the same as for its character variant Cross-Att ch Self-Att ch. The difference in the number of parameters across languages is due to the variation in their character vocabulary sizes.

We observe that the Cross-Att ch Self-Att sub model shows systematic improvements across all three languages over the baseline and reference models. Regarding the level of segmentation, the Cross-Att ch Self-Att ch system achieves higher results on Italian, where, indeed, class variation can be associated with a certain character in a certain position. In terms of the number of trained parameters, the improvements due to the Self-Att sub component are achieved by adding only a relatively small number of extra parameters compared with the baseline model, Cross-Att ch. We also note that the performance of our systems is higher than or on par with the state-of-the-art model HMA, whereas the latter has an on-average sevenfold increase in the number of parameters in comparison with Cross-Att ch Self-Att sub and Cross-Att ch Self-Att ch.

Discussion and Future Work
In the following, we discuss our proposed methodology in terms of two aspects, namely, interpretability for inflection (in terms of typological parameters) and ideas for performance improvement.
Interpretability for Inflection In terms of the typological parameter of fusion, the results of our experiments illustrate that our Cross-Att ch pattern approach can effectively extract rules for concatenative morpheme forms as well as reduplication processes. What remains beyond reach, at the moment, are nonlinear processes that are not always visible in orthography, e.g. tonal changes and internal stem changes. The latter, for example, is exemplified by root-and-pattern morphology in Arabic and Hebrew, whose standard orthographies do not indicate most vowels.
Regarding flexivity, our Self-Att sub pattern method can identify phonological (visible in orthography) as well as lexical triggers of the variation in an inflection morpheme's form. However, suppletive forms (English go → went) would not be identifiable in patterns. Although suppletive cases are likely to be fairly rare in terms of word types, they tend to be maintained only in high-frequency words (Bybee, 1985). Therefore, although it affects only a small number of words, suppletion might be visible in patterns when studied together with word frequency (which is, at the moment, not possible because of current practices for building inflection generation datasets).
The parameter of exponence encodes the extent to which single morphemes express multiple morphosyntactic features. For the class of neural models currently used for inflection generation, it is not possible to see a clear correspondence between the meaning assigned by humans and by the model: as we see from Fig. 1, which illustrates polyexponence in Italian inflection, the model assigns separate characters of the inflection morpheme -scono to different tags, whereas for humans it is hard to break this morpheme down into smaller meaningful parts.[13]

Performance Future work can evaluate the impact of Self-Att sub in combination with frequently used inductive biases[14], as well as with the transformer paradigm, which recently proved to be effective on the task (Vylomova et al., 2020).

Conclusion
We propose a novel approach for interpreting neural inflection models by extracting patterns from attention weights. To enhance the interpretability of this class of models, we design a linguistically motivated attention component over subwords that leads to a systematic performance improvement. Our experiments with linguistic rule induction illustrate the great potential of our methodology for linguistic research scaled to diverse typology.

A Cross-Att ch Transformation Patterns Algorithm
In this section, we formalize Steps 2 and 3 of the algorithm for pattern extraction from character-level cross-attention weights presented in §4.
Algorithm 1: (Step 2) Invert the salient alignments mapping A and group prediction steps by generation type.

[Tables 6-7 caption: The number of examples and accuracy (Acc) are shown per query selection and per group pattern. For each query, we list all extracted lemma patterns (sorted by frequency in decreasing order) along with one segmented lemma example (in parentheses) mapped to the pattern.]