Decouple knowledge from parameters for plug-and-play language modeling

Pre-trained language models (PLMs) have achieved impressive results in various NLP tasks. It has been revealed that one of the key factors to their success is that the parameters of these models implicitly learn all kinds of knowledge during pre-training. However, encoding knowledge implicitly in the model parameters has two fundamental drawbacks. First, the knowledge is neither editable nor scalable once the model is trained, which is especially problematic given that knowledge is constantly evolving. Second, it lacks interpretability and prevents humans from understanding which knowledge a PLM requires for a certain problem. In this paper, we introduce PlugLM, a pre-training model with a differentiable plug-in memory (DPM). The key intuition is to decouple the knowledge storage from model parameters with an editable and scalable key-value memory, and to leverage knowledge in an explainable manner through knowledge retrieval in the DPM. To justify this design choice, we conduct evaluations in three settings: (1) domain adaptation, where PlugLM obtains a 3.95 F1 improvement on average across four domains without any in-domain pre-training; (2) knowledge update, where PlugLM can absorb new knowledge in a training-free way after pre-training is done; and (3) in-task knowledge learning, where PlugLM can be further improved by incorporating training samples into the DPM with knowledge prompting.


Introduction
Large pre-trained language models (PLMs) (Peters et al., 2018; Devlin et al., 2019; Radford et al., 2018) have become a revolutionary breakthrough in the NLP area. Optimized by carefully designed self-supervised objectives on unlabeled corpora and fine-tuned on downstream tasks, PLMs perform remarkably well on a wide range of NLP benchmarks. Recent studies (Warstadt et al., 2019; Petroni et al., 2019) have revealed that one of the key factors to the success of PLMs is that their parameters implicitly learn various types of knowledge from the pre-training corpus. Owing to this learned syntactic, semantic, factual and commonsense knowledge, PLMs show great understanding, generalization and reasoning abilities on multiple downstream tasks (Rogers et al., 2020; Izacard et al., 2022). As Geva et al. (2021) pointed out, the feed-forward layers (FFN), constituting two-thirds of a transformer model's parameters, are essentially key-value memories that store all kinds of knowledge in a PLM. The first linear layer of the FFN acts as a set of sparsely activated keys detecting input patterns, while the second holds the corresponding values. To aggressively capture more knowledge, ever-larger PLMs are continuously proposed, from the 110M-parameter BERT (Devlin et al., 2019) to the 530B-parameter MT-NLG (Smith et al., 2022), and PLMs have still not reached their upper bound (Ouyang et al., 2022).
However, a fundamental question still remains: is implicitly encoding knowledge in its parameters the optimal choice for a PLM? We argue that this implicit knowledge encoding approach has two fundamental drawbacks. First, the learned knowledge is neither editable nor scalable once the model is trained (e.g., BERT doesn't know what a BERT is). World knowledge, however, is infinite and evolving. We thus can never expect an ever-larger model to capture all the knowledge in its parameters and to be continuously re-trained for newly arriving knowledge. Second, current PLMs lack interpretability at the knowledge level. Implicit knowledge encoding fails to provide provenance for the model's predictions and makes the PLM a black box, preventing humans from understanding which knowledge a PLM requires for a certain problem.
In this work, we propose PlugLM, a novel PLM architecture which decouples knowledge storage from model parameters and explicitly leverages knowledge in an explainable manner. As shown in Figure 1, we replace the functionality of the FFN layer with a differentiable plug-in key-value memory (DPM), which is highly scalable as well as editable. Each slot of the DPM encodes a piece of knowledge as a key-value pair, so we can explicitly retrieve the required knowledge in natural language from the DPM rather than from unnamed vectors in the FFN.
To justify the design choice of decoupling knowledge from parameters, we conduct extensive evaluations under different settings. In the domain adaptation setting, PlugLM can be easily adapted to different domains with a pluggable in-domain memory, obtaining a 3.95 F1 improvement across four domains on average and up to an 11.55 F1 improvement on the ACL-ARC citation intent classification dataset, without any in-domain pre-training. In the knowledge update setting, PlugLM can absorb new knowledge after pre-training is done in a training-free way through knowledge updating operations on the DPM, with an improvement of up to 4 F1 points on the LINNAEUS NER dataset. PlugLM can further be improved by incorporating training samples into the DPM with knowledge prompting, as a kind of in-task knowledge.

Related Work
Investigating FFN Feed-forward layers constitute two-thirds of a transformer model's parameters and are essential to unveiling modern PLMs (Geva et al., 2021, 2022). A surge of works has investigated the knowledge captured by the FFN (Dai et al., 2022a; Meng et al., 2022; Geva et al., 2021, 2022; Jiang et al., 2020; Yao et al., 2022; Wallat et al., 2021). Based on the view that the FFN is essentially an unnormalized key-value memory network, Dai et al. (2022a) detect knowledge neurons in the FFN and edit specific factual knowledge without fine-tuning. Meng et al. (2022) modify FFN weights to update specific factual associations using Rank-One Model Editing. Yao et al. (2022) inject knowledge into the FFN via BM25. Dai et al. (2022b) and Lample et al. (2019) enhance the model by expanding the size of the FFN with extra trainable keys and values.

Knowledge-Augmented Language Model There are two lines of work to equip PLMs with knowledge. The first is to introduce an additional Knowledge Graph (KG) and knowledge-based training signals (e.g., entity linking) into language model pre-training, as in ERNIE, KnowBERT (Peters et al., 2019) and KEPLER (Wang et al., 2021). Another line of work adopts a retrieval mechanism to incorporate knowledge, either symbolic (Verga et al., 2020; Agarwal et al., 2021; Févry et al., 2020) or textual (Guu et al., 2020; Lewis et al., 2020c; Borgeaud et al., 2022; Lewis et al., 2020a; de Jong et al., 2022). These works formulate the task as a retrieve-then-predict process, using an extra dense neural retriever or sparse retriever to find the most relevant supporting knowledge and combining it with the input by concatenation (Guu et al., 2020; Lewis et al., 2020c), attention (de Jong et al., 2022) or interpolation (Khandelwal et al., 2020; Zhong et al., 2022). PlugLM differs from previous works in that we do not try to equip the model with additional knowledge to perform knowledge-intensive tasks. The key insight is to transform the FFN architecture into deep retrieval in order to decouple the knowledge that would otherwise be stored in the parameters, which is orthogonal to all retrieval-augmented PLMs.

Preliminary
Feed-forward Layers Transformer (Vaswani et al., 2017), the backbone of all PLMs, is made of stacked self-attention (Self-Attn) and feed-forward (FFN) layers. The former captures the contextual interaction among inputs and the latter processes each input independently. Let x ∈ R^{d_1} be an input vector; the FFN can be formulated as:

FFN(x) = σ(x · W_1^⊤) · W_2    (1)

where W_1, W_2 ∈ R^{d_2×d_1} and σ is the activation function. The bias term is omitted for brevity.

Key-Value Memory Network
The Key-Value Memory Network (Weston et al., 2014; Sukhbaatar et al., 2015) corresponds to d_2 key-value pairs, where each key/value is a vector in R^{d_1}. It is a generalization of the way knowledge is stored (Eric et al., 2017; Miller et al., 2016).

Figure 1: Overview of our PlugLM. We replace the FFN in the PLM with a Differentiable Plug-in key-value Memory (DPM), by which the PLM can store and leverage knowledge in an explainable manner.

For an input x ∈ R^{d_1}, a key-value memory network operates in two stages. First, the lookup (addressing) stage computes the matching degree between x and each key. In the second stage, x is transformed into the weighted sum of values according to the distribution of matching degrees from the first stage. We can formally define it as:

MemNet(x) = softmax(x · K^⊤) · V    (2)

where K, V ∈ R^{d_2×d_1}. Comparing Equations (1) and (2), we can see that the FFN is an unnormalized version of a memory network. The keys in the FFN are pattern detectors that are activated only when certain patterns occur in the input. This explains how the FFN stores knowledge in a key-value manner (Geva et al., 2021; Sukhbaatar et al., 2019).
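The relation between Equations (1) and (2) can be checked with a toy numpy sketch (dimensions, random weights and the ReLU activation are illustrative, not from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

d1, d2 = 4, 8          # hidden size and number of key-value slots (toy values)
rng = np.random.default_rng(0)
K = rng.standard_normal((d2, d1))  # keys: pattern detectors (rows of W_1)
V = rng.standard_normal((d2, d1))  # values: stored content (rows of W_2)
x = rng.standard_normal(d1)

# Equation (1): FFN as an *unnormalized* key-value memory
ffn_out = np.maximum(x @ K.T, 0.0) @ V   # ReLU stands in for the activation sigma

# Equation (2): a standard key-value memory normalizes the matching scores
mem_out = softmax(x @ K.T) @ V

# both map R^{d1} -> R^{d1}; only the normalization of the addressing differs
assert ffn_out.shape == mem_out.shape == (d1,)
```

The only structural difference is the softmax over matching scores, which is exactly why the FFN can be read as an unnormalized memory network.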

PlugLM
The overall architecture of PlugLM is illustrated in Figure 1. Because the FFN is essentially a key-value memory network (Geva et al., 2021; Dai et al., 2022a; Meng et al., 2022), we replace it with the Differentiable Plug-in Memory. PlugLM is trained in both the pre-training and fine-tuning stages.

Differentiable Plug-in Memory
In this paper, we view the n-th knowledge entry d_n = {t_n^1, t_n^2, ..., t_n^{|d_n|}} as consecutive tokens from unlabeled corpora, as in Guu et al. (2020). For each d_n, we get its dense representation h_n from a knowledge encoder KnowEncoder(·):

h_n = AttentivePooling(Transformer(E_Token(d_n) + E_Pos(d_n)))

where the AttentivePooling function (Xu et al., 2021; Cheng et al., 2023a) corresponds to a trainable pattern detector aggregating information from a sequence of inputs, and E_Token and E_Pos denote the token and positional embeddings. Then we use two independent mapping functions to project h_n into the key space and value space:

k_n = W_k h_n + b_k,    v_n = W_v h_n + b_v

where W_k, W_v, b_k and b_v are trainable parameters. The DPM is then the triplet (D, K, V).
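A minimal sketch of DPM construction under these definitions; the Transformer encoder is replaced by a random stand-in, and all shapes, weights and snippet strings are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # hidden size (toy)

def attentive_pooling(H, w):
    # trainable pattern detector w aggregates a sequence of hidden states H
    scores = H @ w
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ H

def encode(tokens, w_pool):
    # stand-in for Transformer(E_Token(d_n) + E_Pos(d_n)); returns h_n
    H = rng.standard_normal((len(tokens), d))
    return attentive_pooling(H, w_pool)

W_k, b_k = rng.standard_normal((d, d)), np.zeros(d)
W_v, b_v = rng.standard_normal((d, d)), np.zeros(d)
w_pool = rng.standard_normal(d)

D = ["snippet one ...", "snippet two ...", "snippet three ..."]  # knowledge in natural language
H = np.stack([encode(s.split(), w_pool) for s in D])  # h_n for each snippet
K = H @ W_k.T + b_k   # project into key space:   k_n = W_k h_n + b_k
V = H @ W_v.T + b_v   # project into value space: v_n = W_v h_n + b_v

# the DPM is the triplet (D, K, V): one row of K and V per knowledge entry
assert K.shape == V.shape == (len(D), d)
```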

Memory Fusion
For hidden states h ∈ R^{l×d} from Self-Attn, the FFN would transform h with an unnormalized key-value memory as in Equation (1). Our key insight is that, instead of interacting with unnamed vectors in the FFN, we conduct Maximum Inner Product Search (MIPS) to retrieve knowledge in natural language from (D, K, V), where each triplet corresponds to one knowledge entry along with its key and value representations. For h, we first get its sentence-level representation z by an attentive pooling function, z = AttentivePooling(h), and then use z as the query vector into (D, K, V). Since the PLM is internally sparse, we only consider the Top-N knowledge entries D_z with corresponding keys K_z and values V_z:

D_z, K_z, V_z = Top-N(z · K^⊤)

where Top-N also corresponds to the indexing operation. With K_z and V_z, we use knowledge attention to fuse the retrieved knowledge into our model:

KnowAttn(h) = softmax(h · K_z^⊤ / √d) · V_z

where d is the head dimension. Through knowledge retrieval and fusion, we explore an interpretable way to incorporate knowledge into the model, where D_z is the actual knowledge the PLM leverages. Moreover, direct modification of D without changing model parameters empowers PlugLM with much flexibility and scalability in the domain adaptation (§5.1) and knowledge update (§5.2) scenarios.
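A toy sketch of the retrieval-and-fusion step, with exhaustive search standing in for a real MIPS index (e.g., FAISS); all shapes, names and the mean-pooling shortcut are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
l, d, n_mem, N = 3, 4, 10, 5     # seq length, head dim, memory slots, Top-N (toy)
h = rng.standard_normal((l, d))  # hidden states from Self-Attn
K = rng.standard_normal((n_mem, d))
V = rng.standard_normal((n_mem, d))
D = [f"knowledge_{i}" for i in range(n_mem)]

# sentence-level query z; attentive pooling reduced to mean pooling for brevity
z = h.mean(axis=0)

# MIPS: Top-N entries by inner product z . k_i (exhaustive here, indexed in practice)
idx = np.argsort(-(K @ z))[:N]
D_z, K_z, V_z = [D[i] for i in idx], K[idx], V[idx]

# knowledge attention replaces the FFN transform; D_z is inspectable text
out = softmax(h @ K_z.T / np.sqrt(d)) @ V_z
assert out.shape == (l, d) and len(D_z) == N
```

Because `D_z` is a list of natural-language snippets rather than anonymous weight rows, the retrieved knowledge can be inspected, appended to, or replaced directly.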

Training
The backbone of our model is a multi-layer bidirectional Transformer encoder (Devlin et al., 2019). There are two phases in our framework: pre-training and fine-tuning. In the pre-training phase, to make the whole training process end-to-end trainable, we optimize our model with asynchronous index refreshing, as done in Guu et al. (2020).
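The asynchronous refresh schedule can be sketched as follows; the step counts are toy values and `reencode_memory` is a hypothetical stand-in for re-running KnowEncoder over the corpus with the current weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def reencode_memory(knowledge):
    # stand-in: re-encode every snippet into fresh key vectors
    return rng.standard_normal((len(knowledge), 4))

knowledge = ["a", "b", "c"]
refresh_every, total_steps = 5, 12   # assumed toy hyperparameters
refreshes, K = 0, None

for step in range(total_steps):
    if step % refresh_every == 0:
        # rebuild the MIPS index periodically; between refreshes, retrieval
        # runs against this slightly stale snapshot while the encoder trains
        K = reencode_memory(knowledge)
        refreshes += 1
    # ... a gradient step on the model and KnowEncoder would happen here ...

assert refreshes == 3  # refreshed at steps 0, 5 and 10
```

Retrieving against a stale snapshot keeps training end-to-end differentiable without re-encoding the whole memory at every step.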

Domain Adaptation
Learning robust and transferable representations has been at the core of language model pre-training (Peters et al., 2019). For a general-purpose PLM to generalize well on domain-specific tasks, endowing the model with domain knowledge via in-domain training remains the go-to approach (Gururangan et al., 2020; Whang et al., 2020; Zhang et al., 2020; Li et al., 2023). In this section, we show that without any in-domain pre-training, PlugLM can flexibly adapt to multiple domains with a domain-specific DPM. For existing PLMs that encode knowledge in their parameters, this is a challenging task: generalization across multiple domains cannot be guaranteed due to catastrophic forgetting (Kirkpatrick et al., 2016), and it is sometimes even computationally unaffordable to keep training super-large models (Smith et al., 2022; Brown et al., 2020). We consider two adaptation scenarios: domain adaptive post-training (§5.1.1) and in-domain pre-training (§5.1.2). The former is conducted after the PLM has been trained on the general domain, and the latter trains a domain-specific PLM from scratch. REALM (Guu et al., 2020) and PlugLM are models that have an external knowledge base and can be simply adapted to other domains with a different base. We have two adaptation strategies: DAA, short for Domain Adaptive Addition, appends domain knowledge to the knowledge base, while DAR, Domain Adaptive Replacement, replaces general knowledge with domain-specific knowledge in the knowledge base.
We also include the results of ¬DAPT, ¬DAA and DACT. The former two use irrelevant domain corpora for post-training and knowledge base construction; they test the robustness of the adaptation method and rule out the possibility that improvements are attributable simply to exposure to more data. For DACT, Domain Adaptive Continual Training, we sequentially apply DAPT to WikiBERT in multiple domains in the hope that it can capture and store knowledge from various domains in a lifelong learning way (Rostami, 2021).
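DAA and DAR reduce to simple operations on the (D, K, V) triplet; a toy sketch with random stand-in keys and values (all names and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy key/value dimension

# general-domain memory built at pre-training time
gen_D = ["general_1", "general_2"]
gen_K, gen_V = rng.standard_normal((2, d)), rng.standard_normal((2, d))

# domain-specific memory encoded from an in-domain corpus (no training needed)
dom_D = ["domain_1", "domain_2", "domain_3"]
dom_K, dom_V = rng.standard_normal((3, d)), rng.standard_normal((3, d))

# DAA (Domain Adaptive Addition): append domain knowledge to the base
daa = (gen_D + dom_D, np.vstack([gen_K, dom_K]), np.vstack([gen_V, dom_V]))

# DAR (Domain Adaptive Replacement): swap the general base for the domain one
dar = (dom_D, dom_K, dom_V)

assert len(daa[0]) == 5 and daa[1].shape == (5, d)
assert len(dar[0]) == 3
```

No model parameter is touched in either case, which is what makes the adaptation plug-and-play.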

Experimental Results
The results are shown in Table 1. Avg.Cost is the cost of adaptation measured in hours. For WikiBERT, it is the time to post-train the model on the domain-specific corpus. For REALM and PlugLM, it is the time to encode domain knowledge into the knowledge base. We can observe: (1) In-domain training helps the model generalize better to tasks requiring domain knowledge, while irrelevant knowledge misleads the model and causes performance degradation. Comparing ¬DAPT and ¬DAA shows that models with an external knowledge base (PlugLM and REALM) are more robust when faced with noisy out-of-domain knowledge.
(2) A model that implicitly encodes knowledge in its parameters fails to generalize across domains, as the result of DACT indicates. For example, we keep training WikiBERT in the NEWS domain after DAPT in the CS domain and fine-tune it on the CS downstream tasks. It performs on par with a model that was never exposed to the CS domain (¬DAPT). PlugLM alleviates this catastrophic forgetting problem by storing all kinds of knowledge in the DPM and using it in a plug-and-play manner. (3) Direct modification of the external memory helps PlugLM efficiently and effectively adapt to different domains without in-domain training. In 254× less time than DAPT and 40× less time than REALM, PlugLM significantly outperforms DAPT and REALM-based methods. To further understand PlugLM, Figure 2 presents a visualization of the distribution of actually retrieved knowledge for DAA, DAR and the original PlugLM. A clear pattern is that the model performs better as more domain knowledge is involved (63.77, 72.51 and 75.32). Remarkably, although pre-trained on the general domain, PlugLM has learned what to retrieve when both general and domain-specific knowledge are present in the DPM, as shown in the DAA visualization.
Experimental Setting In this section, we choose the biomedical domain and compare PlugLM with models in the BERT-base architecture pre-trained on the general domain, Wikipedia (i.e., WikiBERT), and on the biomedical domain, PubMed (i.e., PubmedBERT). The statistics of the datasets and pre-training details are listed in Appendix F. We test two kinds of abilities of these PLMs. First, we test how they perform on biomed-relevant downstream tasks. Specifically, we conduct experiments on eight representative biomedical NER datasets which aim at recognizing domain-specific proper nouns in the biomedical corpus. Then we test their general language understanding ability on GLUE and SQUAD (Rajpurkar et al., 2016, 2018). For SQUAD and GLUE, the DPM is constructed from Wikipedia, and for biomedical NER, the DPM is constructed from PubMed (Canese and Weis, 2013).

Experimental Results
The results are shown in Table 3. With both models pre-trained on Wikipedia, PlugLM outperforms WikiBERT on 8/8 NER tasks by an average of 1.75 F1, simply by switching the knowledge domain of the DPM. PlugLM also gives results comparable to PubmedBERT on the BC4CHEMD, JNLPBA and LINNAEUS datasets. Although PubmedBERT works well for biomedical tasks, it shows less general language understanding ability and underperforms WikiBERT and PlugLM on GLUE (Table 4) and SQUAD (Table 2), especially in low-resource scenarios (i.e., the RTE, COLA and MRPC datasets). With the DPM, PlugLM shows great flexibility and performs well in both the general and biomedical domains. In Appendix D, we give concrete cases of PlugLM with respect to the retrieved knowledge.

Knowledge Update
Since the world is not fixed as a snapshot once the pre-training corpus is collected, current PLMs, no matter how large, fail to adapt to this changing world.

Experimental Results
The results are shown in Figures 3a and 3b. For the first setting, we test on QA (SQUAD) and sentiment classification (SST-2). Both WikiBERT and PlugLM are pre-trained on only 1/4 of the Wikipedia corpus.
We have the following observations: (1) PlugLM trained with limited data already outperforms WikiBERT on both tasks (by 0.39 EM in QA and 0.59 accuracy in classification), which verifies the effectiveness of PlugLM in the low-resource setting; (2) a consistent pattern across the two tasks verifies that PlugLM can absorb new knowledge simply by adding more slots to (D, K, V) without heavy re-training. For the second setting, Figure 3c shows that our model can absorb new cross-domain knowledge under the adaptation setting: it achieves a higher F1 score on the LINNAEUS NER dataset as increasingly more biomed-specific knowledge is injected.

In-task Knowledge
Inspired by in-context learning (Brown et al., 2020) and example-augmented generation (Cheng et al., 2022, 2023b), training samples can also be viewed as a kind of in-task knowledge. In this section, we broaden the scope of DPM knowledge by including the training samples. Experimental Setting Since the knowledge from Wikipedia is a textual description from domain experts, while a training sample from a question-answering NLI dataset is in the form [Q, A, Label], this surface-form distribution shift may affect knowledge retrieval. We consider the following injection methods. (1) Concate. We directly concatenate each training sample into a long string of the form "Q [SEP] A [SEP] Label" and append it to the DPM. (2) Tagged. To build the connection between model inputs and the DPM, we tag each training sample by prepending a special token ([Tagged]), and use these tagged samples both in the DPM and as model input.
(3) Knowledge Prompting. Inspired by prompting methods (Schick and Schütze, 2021), we transform the in-task knowledge into Wikipedia-style knowledge with a natural language prompt. For example, in the QNLI dataset, we transform [Q, A, Label] with the following prompt: "The first sentence (doesn't) entail(s) with the second. The first sentence is [Q] and the second is [A]". We choose the moderately sized QNLI and QQP tasks because in-task knowledge injection did not apply to the low-resource setting in our preliminary experiments.
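A minimal sketch of such a knowledge-prompting verbalizer for QNLI; the function name and label strings are illustrative assumptions, while the template wording follows the example above:

```python
def qnli_prompt(question, answer, label):
    # verbalize a [Q, A, Label] training sample into Wikipedia-style text
    # so that it matches the surface form of the other DPM entries
    relation = "entails" if label == "entailment" else "doesn't entail"
    return (f"The first sentence {relation} with the second. "
            f"The first sentence is {question} and the second is {answer}")

p = qnli_prompt("Where is the Eiffel Tower?",
                "The Eiffel Tower is in Paris.",
                "entailment")
assert "entails with the second" in p
```

Each verbalized sample is then encoded and appended to the DPM like any other knowledge entry.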

Experimental Results
The results are shown in Table 5. We observe that PlugLM has managed to learn from in-task knowledge, and that the surface form of the knowledge affects model performance.
Concatenation of training samples fails to inform PlugLM of the actual in-task knowledge (zero retrieval in QNLI), and building the connection between data and knowledge with a special tagged token gives only minor improvements. Instead, well-designed knowledge prompting can help PlugLM learn task-specific knowledge.

Tuning PlugLM
We investigate how each key design affects the performance of PlugLM.
(1) Number of Retrieved Knowledge Entries. Figure 4 shows the effect of different N on the STS-B dataset, and the sparsely activated Top-5 knowledge proves to be optimal.

Conclusion
For the first time, we challenge the current implicit knowledge encoding mechanism of PLMs, with its two fundamental drawbacks, and propose to decouple knowledge storage from model parameters with an editable and scalable key-value memory. Inspired by the findings that the FFN stores all kinds of knowledge and is essentially a key-value memory network, we transform the FFN architecture into deep retrieval with a differentiable plug-in memory (DPM), which makes the knowledge encoding of PLMs more flexible and interpretable. Extensive experimental results in different scenarios including domain adaptation, knowledge update and in-task knowledge learning verify the design choice of PlugLM. We believe this architectural design will pave a new direction for future research on PLMs, especially super-large PLMs.

Limitations
We discuss the limitations of PlugLM as follows: (1) Despite the strong performance achieved by our approach with the DPM, it reduces inference efficiency due to the MIPS search. For example, PlugLM is about two times slower than pure transformer-based models on GLUE. This would become more severe as the external memory grows. Potential solutions include (1) constructing the memory at a coarser granularity (Borgeaud et al., 2022); (2) compressing the DPM by semantic clustering as in Tay et al. (2022) or by knowledge summarization.
(2) In this paper, we choose Wikipedia for DPM construction and PlugLM pre-training. While Wikipedia is the most commonly used data source for language model pre-training (Devlin et al., 2019), there are many other types of knowledge not covered by Wikipedia, and how to integrate different types of knowledge (e.g., factual, commonsense, syntactic and semantic knowledge) into our framework remains under-explored.
(3) Although this paper proposes a general architecture applicable to PLMs of all kinds and sizes, including bidirectional ones (Devlin et al., 2019), we only experiment with bidirectional models of moderate size. In particular, we believe this architectural design would be greatly beneficial for LLMs (Smith et al., 2022; Chowdhery et al., 2022; Ouyang et al., 2022) for the following reasons: (1) the parameters of an LLM cannot be easily updated once pre-training is done due to the unaffordable training cost; (2) the additional latency cost of MIPS retrieval is negligible compared with that of the whole LLM.

A PlugLM Pretraining Details
The details of PlugLM pre-training are shown in Table 6.

B Data for Domain Adaptive Post-Training
The detailed statistics of the domain corpora for post-training are listed in Table 7, and those of the downstream tasks in Table 8.

C Latency
In Table 9, we show the detailed latency of WikiBERT and PlugLM.

D Case Study
We show three concrete examples from QNLI and ACL-ARC in Tables 13, 14 and 15.

E More Experiments for Tuning PlugLM

F Details for Wikipedia and Pubmed
The source and size of Wikipedia and PubMed are shown in Table 11. The hyper-parameters for WikiBERT and PubmedBERT pre-training are also listed.

Answer Prediction Label
How much of Jacksonville is made up of water?
According to the United States Census Bureau, the city has a total area of 874.3 square miles (2,264 km²), making Jacksonville the largest city in land area in the contiguous United States; of this, 86.66% (757.7 sq mi or 1,962 km²) is land and 13.34% (116.7 sq mi or 302 km²) is water.

Knowledge
(1) this article lists the 3, 143 states of america. the 50 states of the united states are divided into 3, 007 " counties ", political and geographic subdivisions of a state ; 236 other local governments and geographic places are also first -order administrative divisions of their respective state / district / territory, but are called by different names. the latter are referred to collectively as " county equivalents " by the united states census bureau. the 236 county equivalents include 100 equivalents in the territories ( such as those in puerto rico ) outside the 50 states and the district of columbia. the large majority of counties and equivalents were organized by 1970. since that time, most creations, boundary changes and dissolutions have occurred in alaska and virginia. among the 50 states, 44 are partitioned entirely into counties, with no county equivalents. louisiana is instead divided into 64 equivalent parishes.
(2) the united states census bureau ( uscb ), officially the bureau of the census, is a principal agency of the u.s. federal statistical system, responsible for producing data about the american people and economy. the census bureau is part of the u.s. department of commerce and its director is appointed by the president of the united states. the census bureau's primary mission is conducting the u.s. census every ten years, which allocates the seats of the u.s. house of representatives to the states based on their population. [1] the bureau's various censuses and surveys help allocate over $675 billion in federal funds every year and it assists states, local communities, and businesses make informed decisions. [2][3][4] the information provided by the census informs decisions on where to build and maintain schools, hospitals, transportation infrastructure, and police and fire departments
(3) the crestview - fort walton beach - destin, florida, metropolitan statistical area, as defined by the united states census bureau, is a metropolitan area consisting of two counties in northwest florida, anchored by the cities of crestview, florida, and fort walton beach, florida. as of the 2010 census, the msa had a population of 235, 865, and a 2012 population estimate of 247, 665. the metropolitan area is a part of the " northwest corridor " which includes the pensacola metropolitan area and the panama city metropolitan area. demographics. as of the census of 2010, there were 235, 865 people, 95, 892 households, and 63, 964 families residing within the msa. the racial makeup of the msa was 81. 1 % white, 9. 3 % african american, 0. 3 % native american, 2. 9 % asian, 0. 1 % pacific islander, 0. 2 % from other races, and 3. 9 % from two or more races. hispanic or latino of any race were 6. 8 % of the population. according to the 2010 american community survey 1 - year
(4) analog to digital conversions were achieved through steinberg, and in some cases mytek, converters. the album was recorded and mixed exclusively with steinberg cubase digital audio workstations on microsoft windows operating systems with waves ssl and abbey road tg12413 plugins. it was revealed that neither brahm nor marc know how to operate autotune, so it was not used. the songs were often performed to a click track, but there was no " snapping the drums to a grid ", which is a popular computerized technique to ensure that drums are in perfect time while simultaneously sucking the life out of an otherwise real performance. production. " tears of the enchanted mainframe " was produced and engineered by taylor and kaducak. backmasking is used on the track " superusurper " during an interlude that features a reversed reading of a passage from the george orwell novel nineteen eighty four. the album was mastered by geoff pesche and alex wharton at abbey road studios in london. title and artwork. " tears of the enchanted mainframe "
(5) the zafarnama (, lit. " book of victory " ) is a biography of timur written by the historian nizam ad -din shami. it served as the basis for a later and better -known " zafarnama " by sharaf ad -din ali yazdi. one translation by felix tauer was published in prague in 1937.

Knowledge
(1) instrumentation and control engineering ( ice ) is a branch of engineering that studies the measurement and control of process variables, and the design and implementation of systems that incorporate them. process variables include pressure, temperature, humidity, flow, ph, force and speed. ice combines two branches of engineering. instrumentation engineering is the science of the measurement and control of process variables within a production or manufacturing area. meanwhile, control engineering, also called control systems engineering, is the engineering discipline that applies control theory to design systems with desired behaviors. control engineers are responsible for the research, design, and development of control devices and systems, typically in manufacturing facilities and process plants. control methods employ sensors to measure the output variable of the device and provide feedback to the controller so that it can make corrections toward desired performance. automatic control manages a device without the need of human inputs for correction, such as cruise control for regulating a car's speed. control systems engineering activities are multi -disciplinary in nature. they focus on the implementation of control systems, mainly derived by mathematical modeling. because instrumentation and control play a significant role in gathering information from a system and changing its parameters, they are a key part of control loops. as profession. high demand for engineering professionals is found in fields associated with process automation. specializations include industrial instrumentation, system dynamics, process control, and control systems. additionally, technological knowledge, particularly in computer systems, is essential to the job of
(2) instrumentation is the art and science of measurement and control. instrumentation may also refer to:
(3) the scientific and technological innovation ability of colleges and universities, and strengthening the evaluation research of the scientific and technological innovation ability and efficiency of colleges and universities, can we better promote the scientific and technological innovation ability of colleges and universities. universities the evaluation of scientific and technological innovation ability in colleges and universities is a complex system engineering, and the understanding of its connotation is the most important problem to be considered in the comprehensive evaluation. by consulting the data, it is found that the previous researches are mainly focused on the following three aspects : 1. from the perspective of innovative resource demand and innovative achievements, the scientific and technological innovation in colleges and universities is regarded as an organic whole composed of various elements. in the whole innovation system, colleges and universities undertake the functions and tasks of knowledge production and dissemination, technological innovation and transformation as well as personnel training. according to the relationship between innovation elements, the scientific and technological innovation ability of colleges and universities is divided into basic strength of scientific and technological innovation, scientific and technological innovation input ability, knowledge innovation ability, technological innovation ability, scientific and technological innovation output ability. science and technology innovation achievement transformation ability, talent innovation ability. 2. from the perspective of innovation process, the ability of scientific and technological innovation in colleges and universities is embodied in the process of knowledge creation, knowledge dissemination, transformation and diffusion of technological inventions. it also includes the technological, economic and managerial abilities that the university relies on
(4) automation engineering has two different meanings : automation engineer. automation engineers are experts who have the knowledge and ability to design, create, develop and manage machines and systems, for example, factory automation, process automation and
(5) this learning methodology is called blended learning. blended learning can also incorporate machine learning and other such technologies to implement adaptive learning.

Input / Prediction / Label
Although there are other discussions of the paragraph as a central element of discourse (e.g., Chafe 1979, Halliday and Hasan 1976, Longacre 1979, Haberlandt et al. 1980), all of them share a certain limitation in their formal techniques for analyzing paragraph structure.

Knowledge
(1) automation engineering has two different meanings: automation engineer. automation engineers are experts who have the knowledge and ability to design, create, develop and manage machines and systems, for example, factory automation, process automation and warehouse automation. scope. automation engineering is the integration of standard engineering fields. automatic control of various control systems for operating various systems or machines to reduce human effort & time to increase accuracy. automation engineers design and service electromechanical devices and systems to high-speed robotics and programmable logic controllers (plcs). work and career after graduation. graduates can work for both government and private sector entities such as industrial production, companies that create and use automation systems, for example paper industry, automotive industry, food and agricultural industry, water treatment, and oil & gas sector such as refineries, power plants. job description. automation engineers can design, program, simulate and test automated machinery and processes, and usually are employed in industries such as the energy sector in plants, car manufacturing facilities or food processing plants and robots. automation engineers are responsible for creating detailed design specifications and other documents, developing automation based on specific requirements for the process involved, and conforming to international standards like iec-61508, local standards, and other process specific guidelines and specifications, and simulating, testing and commissioning electronic equipment for automation.
(2) abstract. manipulator is a powerful tool which can help people to carry out safe operation, production automation and improve the productivity of labor. based on the summary of the situation of research and development of manipulators, this article analyzes the functions of parts-moving manipulators and carries out mechatronic design of a parts-moving manipulator according to the practical project items of parts-moving manipulators of enterprises. on the basis of the analysis of the performance requirement and the operating characteristics of the parts-moving manipulator, this article analyses and designs the whole schemes for the mechanical structure, driving system, driving mode and the software and hardware control system of the manipulator, in which the form of mechanical structure of cylindrical coordinate system is determined to be adopted in the design of the manipulator, the driving scheme of pneumatic transmission is adopted, and the system control is carried out by plc. on this basis, this article analyses the kinematics and dynamics of the parts-moving manipulator and summarizes the relationship between displacement, speed, acceleration and joint angle. with the progress of science and technology and the development of social economy, the application area of manipulators has been becoming wider and wider. the manipulator can be found everywhere in human society. the application of manipulators has been extended to civilian application fields such
(3) in working environments with large manipulators, accidental collisions can cause severe personal injuries and can seriously damage manipulators, necessitating the development of an emergency stop algorithm to prevent such occurrences. in this paper, we propose an emergency stop system for the efficient and safe operation of a manipulator by applying an intelligent emergency stop algorithm. our proposed intelligent algorithm considers the direction of motion of the manipulator. in addition, using a new regression method, the algorithm includes a decision step that determines whether a detected object is a collision-causing obstacle or a part of the manipulator. we apply our emergency stop system to a two-link manipulator and assess the performance of our intelligent emergency stop algorithm as compared with other models. increasing the safety of robots, especially industrial manipulators, is just as important as improving their performance. a collision between a manipulator and a person, for example, may cause severe personal injury as well as damage to the machinery. thus, it is necessary to develop an algorithm that can detect collisions before they occur and make the manipulator stop before damage is done. various emergency stop or obstacle avoidance algorithms for robots, particularly those utilizing distance [8] and those algorithms using each
(4) the reliability of kinematic trajectory of manipulators describes the ability of manipulators to keep kinematically accurate. it is an important parameter to evaluate the performance of manipulators. the kinematic accuracy of manipulators can be improved when piezoelectric materials are used as a transducer to suppress the vibration of flexible manipulators. first, a 3-degree-of-freedom parallel manipulator system and its dynamic equations are introduced. the theory and experiment of a vibration suppression system are then presented. the calculation method of both error and reliability of kinematic trajectory of the manipulator is further implemented. finally, the reliability of kinematic accuracy is calculated and analyzed for the 3-degree-of-freedom parallel manipulator with or without vibration suppressing control. the results show that the reliability of kinematic accuracy is improved using vibration suppressing control. the reliability of kinematic accuracy of manipulators is an important indicator to evaluate the accuracy of manipulator motion [1]. in manipulators, lightweight linkages are employed to achieve high speed and acceleration motions for better performance. however, the lightweight linkage will result in inherent structural vibration, and the structural vibration leads to inaccurate kinematic trajectory of manipulators. different methods have been proposed to reduce the vibration of the flexible link
(5) abstract - economic dispatch and frequency regulation are typically viewed as fundamentally different problems in power systems and, hence, are typically studied separately. in this paper, we frame and study a joint problem that co-optimizes both slow timescale economic dispatch resources and fast timescale frequency regulation resources. we show how the joint problem can be decomposed without loss of optimality into slow and fast timescale subproblems that have appealing interpretations as the economic dispatch and frequency regulation problems, respectively. we solve the fast timescale subproblem using a distributed frequency control algorithm that preserves network stability during transients. we solve the slow timescale subproblem using an efficient market mechanism that coordinates with the fast timescale subproblem. we investigate the performance of our approach on the ieee 24-bus reliability test system.
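The example above shows five passages retrieved from the plug-in memory for one input. The retrieval itself is a standard key-value lookup: score the query against every memory key, take the top-k values. The sketch below illustrates this with random data; all names, dimensions, and the softmax aggregation are illustrative assumptions, not PlugLM's actual implementation.

```python
# Minimal sketch of key-value knowledge retrieval in a differentiable
# plug-in memory (DPM). Dimensions and aggregation are assumptions.
import numpy as np

rng = np.random.default_rng(0)

d = 8          # hidden size (assumed)
n_docs = 100   # number of knowledge entries in the memory
top_k = 5      # number of retrieved entries, matching the five passages above

# The memory stores each document as a (key, value) pair of dense vectors.
keys = rng.normal(size=(n_docs, d))
values = rng.normal(size=(n_docs, d))

def retrieve(query, keys, values, k):
    """Dot-product retrieval: score every key, aggregate the top-k values."""
    scores = keys @ query                     # similarity of query to each key
    top = np.argsort(scores)[-k:][::-1]       # indices of the k best keys
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                  # softmax over the top-k scores
    knowledge = weights @ values[top]         # weighted sum of retrieved values
    return top, knowledge

query = rng.normal(size=d)                    # e.g. a hidden state of the input
idx, knowledge = retrieve(query, keys, values, top_k)
print(idx.shape, knowledge.shape)             # (5,) (8,)
```

Because the memory is just an array of (key, value) pairs outside the model parameters, entries can be added or edited after pre-training without any gradient update, which is what makes the knowledge editable and scalable.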