Reasoning – Spress

Spelling, common sense, grammar, and reasoning errors can be corrected. Yuncong proposes a BART-based semantic error correction method

editor — Sun, 18 Apr 2021 06:08:07 +0000

Released by Heart of the Machine

Heart of the Machine Editorial Department

The Yuncong Technology Speech Group has proposed a semantic error correction scheme based on the BART pre-trained model. It corrects not only the common spelling errors in ASR output but also common-sense errors, grammatical errors, and even some errors whose correction requires reasoning.

In recent years, advances in automatic speech recognition (ASR) have greatly improved recognition accuracy. Even so, ASR transcripts still contain errors that are obvious to a human reader: they can be spotted from the transcribed text alone, without listening to the audio. Correcting such errors often requires common sense, grammatical knowledge, and sometimes reasoning ability. Thanks to recent progress in unsupervised language model pre-training, error correction models based on plain-text features can address such problems effectively.

The semantic error correction system proposed in the paper has two modules, an encoder and a decoder. The encoder focuses on understanding the semantics of the ASR system's output text, while the decoder is designed to re-express that meaning using standard vocabulary.

Link to the paper: https://arxiv.org/abs/2104.05507

Introduction

Text error correction is an important way to improve ASR accuracy. Common text error correction tasks include grammatical error correction and spelling error correction.
Because the error distribution of ASR transcripts differs considerably from the error distributions those tasks target, such models are often unsuitable for direct use in an ASR system. Here, the Yuncong Technology Speech Group proposes a semantic error correction (SC) scheme based on the BART pre-trained model [1]. It can correct not only the common spelling errors in ASR data but also some common-sense errors, grammatical errors, and even some errors that require reasoning. In our experiments on 10,000 hours of data, the error correction model reduces the character error rate (CER) of the 3-gram decoding results by a relative 21.7%, an effect similar to RNN rescoring. Applying error correction on top of RNN rescoring yields a further 6.1% relative reduction in CER. Error analysis shows that the actual correction effect is better than the CER metric suggests.

Model

1) ASR semantic error correction system design

The ASR semantic error correction process is shown in Figure 1. The semantic error correction module can be applied directly to the first-pass decoding result as an alternative to the rescoring module. It can also be placed after the rescoring model to further improve recognition accuracy.

Figure 1: ASR system with integrated semantic error correction model

2) Baseline ASR system

The baseline acoustic model is a pyramidal FSMN [2], trained on 10,000 hours of Mandarin audio data. The WFST used in first-pass decoding is composed of a 3-gram language model, a pronunciation dictionary, a biphone structure, and an HMM structure. Rescoring uses 4-gram and RNN language models, trained on the reference transcripts of the same audio. The acoustic and language models are trained with the Kaldi toolkit [3].
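The 21.7% and 6.1% figures quoted in the introduction are relative reductions, not absolute differences in CER. The minimal sketch below shows the arithmetic. The function name `relative_reduction` and all absolute CER values are hypothetical, chosen only so that the ratios match the reported percentages; the article does not give absolute figures here.

```python
def relative_reduction(baseline_cer: float, new_cer: float) -> float:
    """Fraction of the baseline error rate that was removed."""
    if baseline_cer <= 0:
        raise ValueError("baseline CER must be positive")
    return (baseline_cer - new_cer) / baseline_cer


# Hypothetical absolute CERs in percent; NOT taken from the paper.
cer_3gram = 10.00      # first-pass 3-gram decoding
cer_3gram_sc = 7.83    # 3-gram decoding followed by error correction
cer_rnn = 8.00         # 3-gram decoding followed by RNN rescoring
cer_rnn_sc = 7.512     # RNN rescoring followed by error correction

print(f"{relative_reduction(cer_3gram, cer_3gram_sc):.1%}")  # 21.7%
print(f"{relative_reduction(cer_rnn, cer_rnn_sc):.1%}")      # 6.1%
```

With these placeholder values, a 21.7% relative reduction corresponds to an absolute drop of about 2.2 points, which is why the two kinds of figure should not be confused.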
3) Semantic error correction model structure

The proposed semantic error correction model is based on the Transformer [4] architecture. It contains 6 encoder layers and 6 decoder layers, and the modeling unit is the token. Training uses teacher forcing: the text output by the ASR system is fed to the input side of the model, and the corresponding reference text is fed to the output side. The input embedding matrix and the output embedding matrix encode the two sides respectively, and cross entropy is used as the loss function. During inference, decoding uses beam search with a beam width of 3.

Figure 2: Semantic error correction model based on Transformer

Experiment

1. Error correction training data preparation

The training set of our baseline ASR model is 10,000 hours of Mandarin speech data, which contains about 800 transcribed texts. The test set consists of 5 hours of mixed speech data, including the Aishell and Thchs30 test sets, among others.

To sample the error distribution of the ASR system as fully as possible, we used the following techniques when constructing the training set for the error correction model: use a weak acoustic model to generate the error correction training data, with a small acoustic model trained separately on 10% of the speech data for this purpose; perturb the MFCC features by randomly multiplying them by a coefficient between 0.8 and 1.2; feed the perturbed features into the weak acoustic model, take the top 20 beam search results, and filter the samples by an error rate threshold.

Finally, we pair the filtered decoding results with their corresponding reference texts to form the training data for the error correction model. By decoding the full audio data with the threshold set to 0.3, we obtained about 30 million error correction sample pairs.
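The data generation steps in the training data preparation section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline. The names `perturb_mfcc`, `error_rate`, and `build_pairs` are hypothetical. One random coefficient per utterance is assumed, since the article does not say whether the scaling is per utterance or per element. The filter is assumed to discard hypotheses whose character error rate exceeds the threshold. Decoding with the weak acoustic model is not shown; a toy n-best list is passed in instead.

```python
import random


def perturb_mfcc(frames, low=0.8, high=1.2, rng=random):
    """Scale every MFCC value by one random coefficient in [low, high].

    Assumption: a single coefficient is drawn per utterance.
    """
    coeff = rng.uniform(low, high)
    return [[value * coeff for value in frame] for frame in frames]


def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Character error rate of hyp against a non-empty reference."""
    return edit_distance(ref, hyp) / len(ref)


def build_pairs(reference, nbest, threshold=0.3, top_n=20):
    """Pair the top-n hypotheses with the reference, filtered by error rate.

    Assumption: hypotheses above `threshold` are discarded, the rest kept.
    """
    return [(hyp, reference)
            for hyp in nbest[:top_n]
            if error_rate(reference, hyp) <= threshold]


# Toy n-best list standing in for the weak model's beam search output.
reference = "speech recognition"
nbest = ["speech recognition", "speech recognitiom", "speech wreck"]
pairs = build_pairs(reference, nbest)
# "speech wreck" is dropped: it is 6 characters shorter than the
# 18-character reference, so its error rate is at least 6/18 > 0.3.
```

Each resulting pair is (ASR hypothesis, reference text), matching the input side and output side of the correction model described above.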
2. Input and output representation layer

In the semantic error correction model, the input and output text share the same dictionary. However, a wrongly written character in the input text carries more meanings than it does in standard usage, whereas the output text uses only standard words. Representing the tokens on the input and output sides independently therefore better suits the error correction task. The results in Table 1 support this inference: when the input and output embedding matrices share weights, the error correction model has a negative effect, and when the input and output tokens are represented independently, the system's CER is reduced by 5.01%.

3. BART vs BERT initialization

Here the researchers use language model pre-training to transfer the semantic knowledge learned from large-scale corpora to the error correction scenario, so that the error correction model gains better robustness and generalization from a relatively small training set. We compare random initialization, BERT [5] initialization, and BART [1] initialization. Because BART's pre-training task and model structure are the same as the Transformer's, its parameters can be reused directly. For BERT initialization, both the Transformer encoder and decoder use the parameters of the first 6 layers of BERT [6]. The results in Table 2 show that BART initialization reduces the character error rate of the baseline ASR by a relative 21.7%, whereas the BERT-initialized model improves very little over the randomly initialized one. We infer that this may be because BERT differs too much from the semantic error correction model in structure and training objective, so its knowledge is not transferred effectively. In addition, applying the error correction model to the output of language model rescoring further improves the recognition rate.
Compared with the 4-gram and RNN rescoring results, CER is reduced by a relative 21.1% and 6.1%, respectively.

4. Error correction model vs large language model

Generally speaking, an ASR system can use a larger language model to obtain better recognition results, but this consumes more memory and reduces decoding efficiency. Here we add a large amount of crawled and open-source plain-text corpora to the reference transcripts of the speech data and train new 3-gram, 4-gram, and RNN language models, which we call the large language models. Those used in the baseline ASR system are called the small models. The comparison shows that adding error correction on top of the small models surpasses using the large models alone. Moreover, applying semantic error correction on top of the large models further improves the recognition rate.

Some examples of error correction are as follows:

5. Error analysis

In an analysis of 300 examples where correction failed, we found that the actual effect of semantic error correction is significantly better than the CER metric indicates. About 40% of the errors hardly affect the meaning: some transliterated foreign personal or place names have several accepted written forms, some personal pronouns lack context, which leads to confusion among the third-person pronouns ("he", "she", and "it"), and some are substitutions of modal particles that do not change the meaning. Another 30% of the errors are not suited to correction from plain-text features alone because the contextual information is insufficient. The remaining 30% stem from the insufficient semantic understanding or expressive ability of the semantic error correction model.

Summary

This paper proposes a BART-based semantic error correction model that generalizes well and consistently improves the recognition results of multiple ASR systems.
In addition, the researchers verified experimentally the importance of representing input and output independently in text error correction. To sample the ASR system's recognition error distribution more fully, the paper also proposes a simple and effective strategy for generating error correction data.

Finally, although the proposed semantic error correction method has achieved certain gains, there is still room for optimization, such as:

1. Introducing acoustic features to help the model identify whether the text contains errors and to reduce the false correction rate.
2. Introducing more contextual information to resolve some semantic ambiguities or missing information in the text.
3. Adapting to vertical business scenarios to improve the recognition accuracy of specialized terms.

References

[1] M. Lewis et al., “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension,” 2019, [Online]. Available: http://arxiv.org/abs/1910.13461.
[2] X. Yang, J. Li, and X. Zhou, “A novel pyramidal-FSMN architecture with lattice-free MMI for speech recognition,” vol. 1, Oct. 2018, [Online]. Available: http://arxiv.org/abs/1810.11352.
[3] D. Povey et al., “The Kaldi Speech Recognition Toolkit,” IEEE Signal Process. Soc., vol. 35, no. 4, p. 140, 2011.
[5] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” 2018, [Online]. Available: http://arxiv.org/abs/1810.04805.
[6] O. Hrinchuk, M. Popova, and B. Ginsburg, “Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model,” ICASSP 2020-2020 IEEE Int. Conf. Acoust. Speech Signal Process., pp. 7074-7078, Oct. 2019, [Online]. Available: http://arxiv.org/abs/1910.10697.

Proposal to give the Ministry of National Defense and the Ministry of Public Security the authority to provide Advanced political theory training

Xuân Trường (compiled) — Thu, 15 Apr 2021 05:48:10 +0000

It has been proposed that the Ministry of National Defense and the Ministry of Public Security be officially given the authority to train and certify the Advanced level of political theory.

The Ministry of Public Security and the Ho Chi Minh National Academy of Politics held a meeting on training and certification at the Advanced level of political theory in the People’s Public Security Forces (CAND).

At the meeting, all speakers acknowledged and praised the three CAND academies for being proactive and creative in providing Advanced political theory training within the CAND … At the same time, most speakers asked the Secretariat to officially assign the Ministry of Public Security the authority to train and certify the Advanced level of political theory, calling this very necessary and in keeping with the current context and situation.

General To Lam, Minister of Public Security, speaks at the meeting.

Minister of Public Security To Lam emphasized that effectively implementing the Secretariat’s Conclusion No. 25-KL/TW of December 28, 2017, which assigns the Ministry of Public Security and the Ministry of National Defense the authority to train and certify the Advanced level of political theory, is a major and important undertaking. It helps strengthen the development of key personnel for the CAND, especially commanding officers and leaders, so that the goal of building a regular, elite, and modern CAND force can soon be achieved.

After three years of pilot implementation, the Ministry of Public Security has basically achieved the objectives set. This result owes much to the close attention and support that the Central Organizing Committee, the Ho Chi Minh National Academy of Politics, and the Central Propaganda Department have given the CAND.

Minister To Lam also asked the Central Organizing Committee, the Central Propaganda Department, and the Ho Chi Minh National Academy of Politics, on the basis of the results of the Advanced political theory training that the Ministry of Public Security has carried out under Conclusion 25, to report to the Secretariat so that the ministry can be officially assigned long-term authority to train and certify the Advanced level of political theory.

Prof. Dr. Nguyen Xuan Thang, Director of the Ho Chi Minh National Academy of Politics, suggested that in the coming period the CAND academies pay more attention to the quality of textbooks and lesson plans, especially electronic lesson plans. These must be kept highly up to date, not only in the textbooks but in every lecture and every subject.

The CAND academies need to focus on training lecturers in methods of teaching political theory, provide regular training to standardize the teaching staff, and prepare carefully for the next generation of lecturers. They should also integrate political theory into existing training programs to save time and money for officers.

Prof. Dr. Nguyen Xuan Thang, Director of the Ho Chi Minh National Academy of Politics, and General To Lam, Minister of Public Security, chaired the meeting on the afternoon of April 14.

Earlier, on April 12, Prof. Dr. Nguyen Xuan Thang also met with the Ministry of National Defense to discuss Advanced political theory training.

At the meeting, Senior Lieutenant General Tran Quang Phuong, Deputy Director of the General Department of Politics of the Vietnam People’s Army, affirmed that in recent years the Central Military Commission and the Ministry of National Defense have thoroughly grasped and strictly and seriously implemented the Secretariat’s Conclusion No. 25-KL/TW and the guidance of the Central Organizing Committee.

The Central Military Commission and the Ministry of National Defense have proactively drawn up plans and guided the implementation of training, program completion, and certification at the Advanced level of political theory. They have ensured good facilities, teaching staff, documents, content, and training programs, and have been strict and serious in selecting students and in managing, testing, and evaluating study results.

Prof. Dr. Nguyen Xuan Thang, Director of the Ho Chi Minh National Academy of Politics, Senior Lieutenant General Tran Quang Phuong, Deputy Director of the General Department of Politics of the Vietnam People’s Army, and members of the delegation inspect the training facilities of the Army Academy.

Prof. Dr. Nguyen Xuan Thang praised the results the Ministry of National Defense has achieved over the years in training, completing, and certifying the Advanced level of political theory. He affirmed that the Ho Chi Minh National Academy of Politics will take on board the recommendations of the academies and agencies under the ministry and will continue to work with the Central Organizing Committee to ask the Secretariat to officially assign Advanced political theory training to the Ministry of National Defense, specifically to three academies (the National Defense Academy, the Political Academy, and the Army Academy), for eligible personnel in the coming period.

Prof. Dr. Nguyen Xuan Thang asked the Ministry of National Defense to urgently develop and complete a new set of textbooks, incorporate the documents of the 13th National Party Congress into the program content, standardize the staff of Advanced political theory lecturers, and integrate the Advanced political theory program into the training programs for commanders and staff officers at the campaign and strategic levels to save time and money.