Research at DeepPavlov
At DeepPavlov, we conduct research that bridges the linguistic divide between people and machines to make communicating with computers as natural as speaking with family and friends.
OUR RESEARCH AREAS
Multiskill AI Assistants

Common NLU Components
Chit-Chat Systems
Goal-Oriented Systems
Other Skill Frameworks
Dialog Orchestration
Language Models
Neural Architecture Search

Knowledge Graphs
Models with Memory
CURRENT RESEARCH PROJECTS
Distilled BERT
Analysis of compression methods for Transformer-based neural networks. Large-scale pre-trained language models such as BERT have become a common approach to solving many NLP tasks. Training and deploying such models requires substantial computational resources, which complicates research experiments as well as deployment on mobile devices and embedded systems. The goal of the project is to apply, analyze, and improve state-of-the-art neural network compression methods on Transformer-based models.
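Much of the compression work on BERT-style models builds on knowledge distillation, where a small student model mimics a large teacher's softened output distribution. A minimal sketch of that objective (the function names and the temperature value are illustrative, not DeepPavlov's implementation):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature scaling; a higher T softens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the softened teacher and student distributions,
    scaled by T^2 (the classic knowledge-distillation objective)."""
    p = softmax(teacher_logits, temperature)  # teacher "soft targets"
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In training, this term is usually mixed with the ordinary cross-entropy on the hard labels; the loss is zero exactly when the student matches the teacher.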
Knowledge Graph-Based Dialogue Generation
Responses can be more meaningful than purely chit-chat-generated ones because they draw on facts from a knowledge base such as Wikidata. The system consists of three components:
1. extraction of triplets from Wikidata for entities in the user's utterance;
2. triplet ranking (choosing the triplet most appropriate for generating the response utterance);
3. generation of the response utterance.
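The three-step pipeline can be sketched end to end. Here triplet ranking is reduced to lexical overlap and generation to a template; the actual system uses trained models for both, and all names below are illustrative:

```python
import re

def tokens(text):
    """Lowercased word tokens, used for crude lexical matching."""
    return set(re.findall(r"\w+", text.lower()))

def rank_triplets(utterance, triplets):
    """Step 2: order candidate (subject, relation, object) triplets by
    lexical overlap with the user's utterance (a stand-in for the
    trained ranking model)."""
    query = tokens(utterance)
    return sorted(triplets, key=lambda t: len(query & tokens(" ".join(t))),
                  reverse=True)

def generate_response(triplet):
    """Step 3, reduced to a template; the project uses neural generation."""
    subject, relation, obj = triplet
    return f"The {relation} of {subject} is {obj}."

# Triplets as step 1 might extract them from Wikidata for the entity "Paris".
triplets = [("Paris", "population", "2.1 million"),
            ("Paris", "country", "France")]
best = rank_triplets("What country is Paris in?", triplets)[0]
response = generate_response(best)
```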
Data Augmentation and Filtration (DAF)
The data augmentation and filtration project was initiated during participation in Amazon's DialoGLUE competition. By applying DAF techniques, the DeepPavlov lab won 1st place in the few-shot phase of the competition. The plan is to research and explore DAF techniques for NLP tasks where data collection is cumbersome and expensive (dialogues, QA, ...) and to develop a DAF component for DeepPavlov.
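As one concrete, hypothetical instance of a DAF pipeline: token-dropout augmentation followed by a filtration step that discards degenerate candidates. The thresholds and function names are illustrative only:

```python
import random

def augment(utterance, n=5, drop_prob=0.2, seed=0):
    """Token-dropout augmentation: produce n noisy copies of an utterance.
    This is just one simple augmentation; the project studies many."""
    rng = random.Random(seed)
    words = utterance.split()
    copies = []
    for _ in range(n):
        kept = [w for w in words if rng.random() > drop_prob]
        if kept:
            copies.append(" ".join(kept))
    return copies

def filter_degenerate(candidates, original, min_ratio=0.5):
    """Filtration: discard candidates that lost too much of the original."""
    n = len(original.split())
    return [c for c in candidates if len(c.split()) / n >= min_ratio]

original = "book a table for two tonight"
kept = filter_degenerate(augment(original), original)
```

The filtration step matters for few-shot settings: noisy augmented examples can hurt more than they help when only a handful of real examples exist.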
New Transformer Architectures
We are working on memory-augmented Transformer-based language models, integration of knowledge graphs into language models, and knowledge distillation, and we are experimenting with modifications to the Transformer architecture: multiple streams, bottlenecks, and sentence-level representations.
Zero-Shot Cross-Lingual Transfer
The focus of the project is to explore cross-lingual transfer of multilingual Transformer-based models and to develop multilingual components for DeepPavlov. We have already made progress in exploring cross-lingual transfer with multilingual NER and multilingual SQuAD models. We believe that developing universal multilingual models is one of the hottest topics in NLP.
Accelerating Learning for Machine Translation
This research looks for methods to accelerate machine translation training using data augmentation and modified loss functions. The baseline experiment has been carried out, and the study is now collecting statistics to draw reliable conclusions.
Discourse-Driven Dialog Strategy Management
The goal is to use Discourse Management theory derived from Functional Linguistics (developed by M.A.K. Halliday) as a means to strategically control dialog in open-domain dialogue systems.
Improving the Performance of Large Models on Subdomains
This research looks for methods to improve the performance of large models on specific subdomains while ensuring that their quality in the open domain does not degrade.
The Speech Functions Project
This project focuses on the application of Speech Functions and Discourse Management.
PAST RESEARCH PROJECTS
Evolutionary Neural Architecture Search
We propose an application of an evolutionary algorithm to Neural Architecture Search for image and text classification tasks in PyTorch. One of the main issues with evolutionary search is its high resource consumption, so one of our main priorities is optimization of the model evaluation.
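The core loop of such a search can be sketched in a few lines; here the fitness function is a toy stand-in for the expensive train-and-evaluate step that makes resource consumption the bottleneck, and the search space and names are illustrative:

```python
import random

def evolve(fitness, search_space, population=8, generations=20, seed=0):
    """Toy evolutionary loop: sample a population of architectures, then
    repeatedly mutate the current best and keep non-worsening children.
    `fitness` stands in for the expensive train-and-evaluate step."""
    rng = random.Random(seed)
    def sample():
        return {k: rng.choice(v) for k, v in search_space.items()}
    def mutate(arch):
        child = dict(arch)
        key = rng.choice(list(search_space))
        child[key] = rng.choice(search_space[key])
        return child
    best = max((sample() for _ in range(population)), key=fitness)
    for _ in range(generations):
        child = mutate(best)
        if fitness(child) >= fitness(best):
            best = child
    return best

space = {"layers": [2, 4, 6, 8], "hidden": [64, 128, 256]}
best = evolve(lambda a: a["layers"] + a["hidden"] / 64, space)  # toy fitness
```

Since every `fitness` call in a real search means training a model, speeding up that evaluation (e.g., with proxies or early stopping) dominates the total cost.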
Entity Linking & Disambiguation for Russian & English Languages
Disambiguation of candidate entities for extracted mentions is performed using:
1. prior probabilities of mention-entity correspondence;
2. the context (the sentence in the text that contains the mentioned entity) and the description of the entity in Wikidata;
3. global disambiguation (joint disambiguation of all entities in the text using connections between the entities in the Wikidata knowledge graph).
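Steps 1 and 2 can be illustrated by a toy scorer that combines a mention-entity prior with context/description overlap (the weight and candidate data are made up; the real system learns these scores):

```python
import re

def words(text):
    """Lowercased word tokens for crude lexical matching."""
    return set(re.findall(r"\w+", text.lower()))

def disambiguate(mention, context, candidates):
    """Combine the mention->entity prior (step 1) with lexical overlap
    between the context sentence and the entity's Wikidata description
    (step 2). The 0.25 weight is arbitrary; a real system learns it."""
    ctx = words(context)
    def score(cand):
        overlap = len(ctx & words(cand["description"]))
        return cand["prior"] + 0.25 * overlap
    return max(candidates, key=score)

candidates = [
    {"id": "Q90",     "prior": 0.8, "description": "capital of France"},
    {"id": "Q830149", "prior": 0.2, "description": "city in Texas, United States"},
]
# A Texan context overrides the strong prior for the French capital.
best = disambiguate("Paris", "the city in Texas near Dallas", candidates)
```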
Detection of Factual Errors in Historical Essays in Russian
We detect errors in dates and causal relationships, using collected databases of dates and causal relationships between historical events. For date checking, we extract tokens in the text that refer to historical events using a syntactic parser, link them to events in the database using TF-IDF, and match the date in the text against the date in the database.
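The date-checking step can be reduced to its essentials: TF-IDF linking of an event mention to the closest database event, then a date comparison (a syntactic parser supplies the mention in the real system; the event data here is illustrative):

```python
import math
import re

def tfidf_vectors(texts):
    """Plain TF-IDF vectors (as dicts) for a list of short texts."""
    docs = [re.findall(r"\w+", t.lower()) for t in texts]
    df = {}
    for doc in docs:
        for w in set(doc):
            df[w] = df.get(w, 0) + 1
    n = len(docs)
    vectors = []
    for doc in docs:
        tf = {}
        for w in doc:
            tf[w] = tf.get(w, 0) + 1
        vectors.append({w: c * (math.log((1 + n) / (1 + df[w])) + 1)
                        for w, c in tf.items()})
    return vectors

def cosine(a, b):
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def check_date(event_mention, claimed_year, database):
    """Link the mention to the closest database event by TF-IDF cosine
    similarity, then compare the claimed year with the stored one."""
    names = list(database)
    vectors = tfidf_vectors(names + [event_mention])
    query = vectors[-1]
    best = max(range(len(names)), key=lambda i: cosine(query, vectors[i]))
    return database[names[best]] == claimed_year

events = {"Battle of Borodino": 1812, "Decembrist revolt": 1825}
```

With this setup, `check_date("the battle at Borodino", 1812, events)` links the mention to "Battle of Borodino" and confirms the year, while a claim of 1905 for the Decembrist revolt is flagged as an error.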
Opportunities for Interns
Opportunities for Full-Time Researchers
OUR NEWS