Research at DeepPavlov
At DeepPavlov, we conduct research that bridges the linguistic divide between people and machines to make communicating with computers as natural as speaking with family and friends.
OUR RESEARCH AREAS
Multiskill AI Assistants

Common NLU Components
Chit-Chat Systems
Goal-Oriented Systems
Other Skill Frameworks
Dialog Orchestration
Language Models
Neural Architecture Search

Knowledge Graphs
Models with Memory
CURRENT RESEARCH PROJECTS
Distilled BERT
Analysis of compression methods for Transformer-based neural networks. Large-scale pre-trained language models such as BERT have become a common approach to solving many NLP tasks. Training and deploying such models requires substantial computational resources, which complicates research experiments as well as deployment on mobile devices and embedded systems. The goal of the project is to apply, analyze, and improve state-of-the-art neural network compression methods on Transformer-based models.
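Much of the compression work on BERT-style models builds on knowledge distillation, where a small student model mimics a large teacher's softened output distribution. A minimal sketch of that objective (the function names and the temperature value are illustrative, not DeepPavlov's implementation):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature scaling; a higher T softens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the softened teacher and student distributions,
    scaled by T^2 (the classic knowledge-distillation objective)."""
    p = softmax(teacher_logits, temperature)  # teacher "soft targets"
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

In training, this term is usually mixed with the ordinary cross-entropy on the hard labels; the loss is zero exactly when the student matches the teacher.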
Knowledge Graph-Based Dialogue Generation
Responses can be more meaningful than purely chit-chat-generated ones because they draw on facts from a knowledge base such as Wikidata. The system consists of three components:
1. extraction of triplets from Wikidata for entities in the user's utterance;
2. triplet ranking (choosing the triplet most appropriate for generating the response utterance);
3. generation of the response utterance.
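The three-step pipeline can be sketched end to end. Here triplet ranking is reduced to lexical overlap and generation to a template; the actual system uses trained models for both, and all names below are illustrative:

```python
import re

def tokens(text):
    """Lowercased word tokens, used for crude lexical matching."""
    return set(re.findall(r"\w+", text.lower()))

def rank_triplets(utterance, triplets):
    """Step 2: order candidate (subject, relation, object) triplets by
    lexical overlap with the user's utterance (a stand-in for the
    trained ranking model)."""
    query = tokens(utterance)
    return sorted(triplets, key=lambda t: len(query & tokens(" ".join(t))),
                  reverse=True)

def generate_response(triplet):
    """Step 3, reduced to a template; the project uses neural generation."""
    subject, relation, obj = triplet
    return f"The {relation} of {subject} is {obj}."

# Triplets as step 1 might extract them from Wikidata for the entity "Paris".
triplets = [("Paris", "population", "2.1 million"),
            ("Paris", "country", "France")]
best = rank_triplets("What country is Paris in?", triplets)[0]
response = generate_response(best)
```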
Data Augmentation and Filtration (DAF)
The data augmentation and filtration project was initiated during participation in Amazon's DialoGLUE competition. By applying DAF techniques, the DeepPavlov lab won 1st place in the few-shot phase of the competition. The plan is to research and explore DAF techniques for NLP tasks where data collection is cumbersome and expensive (dialogues, QA, ...) and to develop a DAF component for DeepPavlov.
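As one concrete, hypothetical instance of a DAF pipeline: token-dropout augmentation followed by a filtration step that discards degenerate candidates. The thresholds and function names are illustrative only:

```python
import random

def augment(utterance, n=5, drop_prob=0.2, seed=0):
    """Token-dropout augmentation: produce n noisy copies of an utterance.
    This is just one simple augmentation; the project studies many."""
    rng = random.Random(seed)
    words = utterance.split()
    copies = []
    for _ in range(n):
        kept = [w for w in words if rng.random() > drop_prob]
        if kept:
            copies.append(" ".join(kept))
    return copies

def filter_degenerate(candidates, original, min_ratio=0.5):
    """Filtration: discard candidates that lost too much of the original."""
    n = len(original.split())
    return [c for c in candidates if len(c.split()) / n >= min_ratio]

original = "book a table for two tonight"
kept = filter_degenerate(augment(original), original)
```

The filtration step matters for few-shot settings: noisy augmented examples can hurt more than they help when only a handful of real examples exist.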
New Transformer Architectures
We are working on memory-augmented Transformer-based language models, integration of knowledge graphs into language models, and knowledge distillation, and we are experimenting with modifications to the Transformer architecture: multiple streams, bottlenecks, and sentence-level representations.
Zero-Shot Cross-Lingual Transfer
The focus of the project is to explore cross-lingual transfer of multilingual Transformer-based models and to develop multilingual components for DeepPavlov. We have already made progress in exploring cross-lingual transfer with multilingual NER and multilingual SQuAD models. We believe that developing universal multilingual models is one of the hottest topics in NLP.
Accelerating Learning for Machine Translation
This research looks for methods to accelerate machine translation training using data augmentation and modified loss functions. The baseline experiment has been carried out, and the study is now collecting statistics to draw reliable conclusions.
Discourse-Driven Dialog Strategy Management
The goal is to use Discourse Management theory derived from Functional Linguistics (developed by M.A.K. Halliday) as a means to strategically control dialog in open-domain dialogue systems.
Improving the Performance of Large Models on Subdomains
This research looks for methods to improve the performance of large models on specific subdomains while ensuring that their quality in the open domain does not degrade.
The Speech Functions Project
This project focuses on the application of Speech Functions and Discourse Management.
PAST RESEARCH PROJECTS
Evolutionary Neural Architecture Search
We propose an application of an evolutionary algorithm to Neural Architecture Search for image and text classification tasks in PyTorch. One of the main issues with evolutionary search is its high resource consumption, so one of our main priorities is optimization of the model evaluation.
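The core loop of such a search can be sketched in a few lines; here the fitness function is a toy stand-in for the expensive train-and-evaluate step that makes resource consumption the bottleneck, and the search space and names are illustrative:

```python
import random

def evolve(fitness, search_space, population=8, generations=20, seed=0):
    """Toy evolutionary loop: sample a population of architectures, then
    repeatedly mutate the current best and keep non-worsening children.
    `fitness` stands in for the expensive train-and-evaluate step."""
    rng = random.Random(seed)
    def sample():
        return {k: rng.choice(v) for k, v in search_space.items()}
    def mutate(arch):
        child = dict(arch)
        key = rng.choice(list(search_space))
        child[key] = rng.choice(search_space[key])
        return child
    best = max((sample() for _ in range(population)), key=fitness)
    for _ in range(generations):
        child = mutate(best)
        if fitness(child) >= fitness(best):
            best = child
    return best

space = {"layers": [2, 4, 6, 8], "hidden": [64, 128, 256]}
best = evolve(lambda a: a["layers"] + a["hidden"] / 64, space)  # toy fitness
```

Since every `fitness` call in a real search means training a model, speeding up that evaluation (e.g., with proxies or early stopping) dominates the total cost.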
Entity Linking & Disambiguation for Russian & English Languages
Disambiguation of candidate entities for extracted mentions is performed using:
1. prior probabilities of mention-entity correspondence;
2. the context (the sentence in the text that contains the mentioned entity) and the description of the entity in Wikidata;
3. global disambiguation (joint disambiguation of all entities in the text using connections between the entities in the Wikidata knowledge graph).
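Steps 1 and 2 can be illustrated by a toy scorer that combines a mention-entity prior with context/description overlap (the weight and candidate data are made up; the real system learns these scores):

```python
import re

def words(text):
    """Lowercased word tokens for crude lexical matching."""
    return set(re.findall(r"\w+", text.lower()))

def disambiguate(mention, context, candidates):
    """Combine the mention->entity prior (step 1) with lexical overlap
    between the context sentence and the entity's Wikidata description
    (step 2). The 0.25 weight is arbitrary; a real system learns it."""
    ctx = words(context)
    def score(cand):
        overlap = len(ctx & words(cand["description"]))
        return cand["prior"] + 0.25 * overlap
    return max(candidates, key=score)

candidates = [
    {"id": "Q90",     "prior": 0.8, "description": "capital of France"},
    {"id": "Q830149", "prior": 0.2, "description": "city in Texas, United States"},
]
# A Texan context overrides the strong prior for the French capital.
best = disambiguate("Paris", "the city in Texas near Dallas", candidates)
```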
Detection of Factual Errors in Historical Essays in Russian
We detect errors in dates and causal relationships, using collected databases of dates and causal relationships between historical events. For date checking, we extract tokens in the text that refer to historical events using a syntactic parser, link them to events in the database using TF-IDF, and match the date in the text against the date in the database.
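The date-checking step can be reduced to its essentials: TF-IDF linking of an event mention to the closest database event, then a date comparison (a syntactic parser supplies the mention in the real system; the event data here is illustrative):

```python
import math
import re

def tfidf_vectors(texts):
    """Plain TF-IDF vectors (as dicts) for a list of short texts."""
    docs = [re.findall(r"\w+", t.lower()) for t in texts]
    df = {}
    for doc in docs:
        for w in set(doc):
            df[w] = df.get(w, 0) + 1
    n = len(docs)
    vectors = []
    for doc in docs:
        tf = {}
        for w in doc:
            tf[w] = tf.get(w, 0) + 1
        vectors.append({w: c * (math.log((1 + n) / (1 + df[w])) + 1)
                        for w, c in tf.items()})
    return vectors

def cosine(a, b):
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def check_date(event_mention, claimed_year, database):
    """Link the mention to the closest database event by TF-IDF cosine
    similarity, then compare the claimed year with the stored one."""
    names = list(database)
    vectors = tfidf_vectors(names + [event_mention])
    query = vectors[-1]
    best = max(range(len(names)), key=lambda i: cosine(query, vectors[i]))
    return database[names[best]] == claimed_year

events = {"Battle of Borodino": 1812, "Decembrist revolt": 1825}
```

With this setup, `check_date("the battle at Borodino", 1812, events)` links the mention to "Battle of Borodino" and confirms the year, while a claim of 1905 for the Decembrist revolt is flagged as an error.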
Opportunities for Interns
Opportunities for Full-Time Researchers
OUR NEWS