Blog DeepPavlov

CV-Multimodality Engineer

We are looking for an experienced engineer to work with computer vision (CV) models, primarily as part of multimodal text-audio-vision models. One of our ambitious goals is to introduce the vision modality into dialog agents, both virtual and embodied.

About us: DeepPavlov aims to help developers all over the world create chatbots, dialog systems, and multiskill assistants faster and more easily. Our technology stack can be used by developers with different levels of experience and training, from beginners to experts in NLP.

In our research, we already build multimodal open-source dialog systems that can not only speak and write, but also see and be aware of their position in space. We have open-sourced the whole stack for chatbots and tested it extensively during the Amazon Alexa Prize competition. Now we're moving towards full-fledged, embodied dialog systems in reality and VR.

Most of us are researchers, and now we need your expertise in production-quality pipelines. We need you to help us bring novel architectures into our stack quickly, and to master them.


  • Full-time, office/remote;
  • 2+ years of experience with CV models;
  • Familiarity with new methods and research papers in the field of CV;
  • Experience in applying new methods and implementing papers in CV;
  • Experience with Docker, Bash, Git, and PyTorch;
  • The ability to write efficient, clean, and readable code that follows a given code style;
  • The ability to stick to deadlines.

Advantages could be:

  • experience with natural language processing models, in particular NLU, Transformers, and Visual Transformers;
  • knowledge of frameworks for distributed computing in deep learning, in particular PyTorch Lightning, Ray, TensorFlow X;
  • experience in explainable machine learning (XAI): the ability to interpret models' predictions and to analyze the errors.

Salary is determined based on the results of the interview.

To apply for this job, send your CV with the title: CV-Multimodality Engineer