License Apache 2.0 Python 3.6

DeepPavlov

We are in a really early Alpha release. You should be ready for hard adventures.

If you have updated to version 0.0.2 or later, please re-download all pre-trained models.

DeepPavlov is an open-source conversational AI library built on TensorFlow and Keras. It is designed for

Our goal is to enable AI-application developers and researchers with:

Demo

A demo of selected features is available at demo.ipavlov.ai

Features

Component | Description
Slot filling and NER components | Based on a neural Named Entity Recognition network and fuzzy Levenshtein search to extract normalized slot values from text. The NER component reproduces the architecture from the paper "Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition", which is inspired by the Bi-LSTM+CRF architecture from https://arxiv.org/pdf/1603.01360.pdf.
Intent classification component | Based on the shallow-and-wide Convolutional Neural Network architecture from Kim Y., "Convolutional neural networks for sentence classification", 2014. The model allows multilabel classification of sentences.
Automatic spelling correction component | Based on "An Improved Error Model for Noisy Channel Spelling Correction" by Eric Brill and Robert C. Moore. It uses a statistics-based error model, a static dictionary and an ARPA language model to correct spelling errors.
Ranking component | Based on "LSTM-based deep learning models for non-factoid answer selection". The model ranks responses or contexts from a database by their relevance to the given context.
Question Answering component | Based on "R-NET: Machine Reading Comprehension with Self-matching Networks". The model finds the answer to a question in a given context (SQuAD task format).
Skills |
Goal-oriented bot | Based on the Hybrid Code Networks (HCNs) architecture from Jason D. Williams, Kavosh Asadi, Geoffrey Zweig, "Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning", 2017. It predicts responses in a goal-oriented dialog. The model is customizable: embeddings, slot filler and intent classifier can be switched on and off on demand.
Seq2seq goal-oriented bot | The dialogue agent predicts responses in a goal-oriented dialog and can handle multiple domains (the pretrained bot supports calendar scheduling, weather information retrieval, and point-of-interest navigation). The model is end-to-end differentiable and does not need to explicitly model dialogue state or belief trackers.
Embeddings |
Pre-trained embeddings for the Russian language | Word vectors for the Russian language trained on joint Russian Wikipedia and Lenta.ru corpora.

Basic examples

View a video demo of deploying a goal-oriented bot and a slot-filling model with a Telegram UI


Principles

The library is designed according to the following principles:

Target Architecture

Target architecture of our library:

DeepPavlov is built on top of machine learning frameworks TensorFlow and Keras. Other external libraries can be used to build basic components.

Key Concepts

Contents

Installation

  1. Currently only Linux and Python 3.6 are supported (Python 3.5 is not supported!)

  2. Create a virtual environment with Python 3.6
     virtualenv env
    
  3. Activate the environment.
     source ./env/bin/activate
    
  4. Clone the repo and cd to the project root:
     git clone https://github.com/deepmipt/DeepPavlov.git
     cd DeepPavlov
    
  5. Install the requirements:
     python setup.py develop
    
  6. Install spacy dependencies:
     python -m spacy download en
    

Quick start

To use our pre-trained models, you should first download them:

python -m deeppavlov.download [-all] 

Then you can interact with the models or train them with the following command:

python -m deeppavlov.deep <mode> <path_to_config>

For ‘interactbot’ mode you should specify the Telegram bot token in the -t parameter or in the TELEGRAM_TOKEN environment variable.

For ‘riseapi’ mode you should specify the API settings (host, port, etc.) in the utils/server_utils/server_config.json configuration file. If provided, values from the model_defaults section override the values of the same parameters from the common_defaults section. Model names in the model_defaults section should correspond to the class name of each model's main component.
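
The override rule can be sketched as a simple dictionary merge. This is only an illustration: the keys host and port come from the text above, while the values, the model name SomeModelClass and the helper resolve_server_params are hypothetical and not part of the library.

```python
import json
from typing import Any, Dict

# Hypothetical contents of server_config.json; only the section names and
# the "host"/"port" keys are taken from the text, everything else is made up.
EXAMPLE_CONFIG = json.loads("""
{
  "common_defaults": {"host": "0.0.0.0", "port": 5000},
  "model_defaults": {
    "SomeModelClass": {"port": 5001}
  }
}
""")


def resolve_server_params(config: Dict[str, Any], model_name: str) -> Dict[str, Any]:
    """Start from common_defaults, then let model_defaults override them."""
    params = dict(config.get("common_defaults", {}))
    params.update(config.get("model_defaults", {}).get(model_name, {}))
    return params


print(resolve_server_params(EXAMPLE_CONFIG, "SomeModelClass"))
# prints {'host': '0.0.0.0', 'port': 5001}
print(resolve_server_params(EXAMPLE_CONFIG, "UnknownModel"))
# prints {'host': '0.0.0.0', 'port': 5000}
```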

Available model configs are:


Technical overview

Project modules

deeppavlov.core.commands | basic training and inference functions
deeppavlov.core.common | registration and class initialization functionality, class method decorators
deeppavlov.core.data | basic DatasetIterator, DatasetReader and Vocab classes
deeppavlov.core.layers | collection of commonly used Layers for TF models
deeppavlov.core.models | abstract model classes and interfaces
deeppavlov.dataset_readers | concrete DatasetReader classes
deeppavlov.dataset_iterators | concrete DatasetIterator classes
deeppavlov.metrics | different Metric functions
deeppavlov.models | concrete Model classes
deeppavlov.skills | Skill classes. Skills are dialog models.
deeppavlov.vocabs | concrete Vocab classes

Config

An NLP pipeline config is a JSON file that contains one required element, chainer:

{
  "chainer": {
    "in": ["x"],
    "in_y": ["y"],
    "pipe": [
      ...
    ],
    "out": ["y_predicted"]
  }
}

Chainer is a core concept of the DeepPavlov library: it builds a pipeline from heterogeneous components (rule-based/ml/dl) and lets you train the pipeline or run inference on it as a whole. Each component in the pipeline specifies its inputs and outputs as arrays of names, for example "in": ["tokens", "features"] and "out": ["token_embeddings", "features_embeddings"], and you can chain the outputs of one component to the inputs of other components:

{
  "name": "str_lower",
  "in": ["x"],
  "out": ["x_lower"]
},
{
  "name": "nltk_tokenizer",
  "in": ["x_lower"],
  "out": ["x_tokens"]
},
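
How values flow between components by name can be sketched in a few lines. This is a simplified illustration, not the library's Chainer: the COMPONENTS table and the run_pipe function are hypothetical, and a whitespace split stands in for a real NLTK tokenizer.

```python
from typing import Any, Callable, Dict, List

# Stand-in components keyed by codename. Each receives its inputs as
# positional arguments and returns its single output.
COMPONENTS: Dict[str, Callable[..., Any]] = {
    "str_lower": lambda x: [s.lower() for s in x],
    # Whitespace split used as a placeholder for a real NLTK tokenizer.
    "nltk_tokenizer": lambda x: [s.split() for s in x],
}

PIPE = [
    {"name": "str_lower", "in": ["x"], "out": ["x_lower"]},
    {"name": "nltk_tokenizer", "in": ["x_lower"], "out": ["x_tokens"]},
]


def run_pipe(pipe: List[dict], inputs: Dict[str, Any], out: List[str]) -> List[Any]:
    """Run components in order, passing values between them by name."""
    memory = dict(inputs)
    for step in pipe:
        args = [memory[name] for name in step["in"]]
        result = COMPONENTS[step["name"]](*args)
        # A component with one output returns a bare value, not a tuple.
        if len(step["out"]) == 1:
            result = (result,)
        memory.update(zip(step["out"], result))
    return [memory[name] for name in out]


print(run_pipe(PIPE, {"x": ["Hello World"]}, out=["x_tokens"]))
# prints [[['hello', 'world']]]
```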

Each Component in the pipeline must implement the __call__ method and must have a name parameter, which is its registered codename. It can also have any other parameters that mirror the arguments of its __init__() method. Default values of the __init__() arguments are overridden by the config values when the class instance is initialized.
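
A minimal sketch of that initialization rule, assuming a registry keyed by codename. The register decorator, the build_component helper and the simple_splitter component below are hypothetical stand-ins written for this example; they are not the library's own implementations.

```python
from typing import Dict, List, Type

REGISTRY: Dict[str, Type] = {}


def register(name: str):
    """Hypothetical stand-in for a registry decorator keyed by codename."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


@register("simple_splitter")
class SimpleSplitter:
    def __init__(self, separator: str = " ", lowercase: bool = False):
        self.separator = separator
        self.lowercase = lowercase

    def __call__(self, batch: List[str]) -> List[List[str]]:
        if self.lowercase:
            batch = [s.lower() for s in batch]
        return [s.split(self.separator) for s in batch]


def build_component(params: dict):
    """Build a component: "name" selects the class, other keys go to __init__."""
    # "in", "out" and "id" describe pipeline wiring, so they are not passed on.
    wiring = {"name", "in", "out", "id"}
    kwargs = {k: v for k, v in params.items() if k not in wiring}
    return REGISTRY[params["name"]](**kwargs)


splitter = build_component({"name": "simple_splitter", "in": ["x"],
                            "out": ["x_tokens"], "separator": ","})
print(splitter(["a,b", "C,d"]))  # separator is overridden, lowercase keeps its default
```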

You can reuse components in the pipeline to process different parts of data with the help of id and ref parameters:

{
  "name": "nltk_tokenizer",
  "id": "tokenizer",
  "in": ["x_lower"],
  "out": ["x_tokens"]
},
{
  "ref": "tokenizer",
  "in": ["y"],
  "out": ["y_tokens"]
},
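
One way to picture id and ref is a cache of already built instances. The sketch below is an assumption about how such reuse could work, not the library's code; build_pipe and WhitespaceTokenizer are hypothetical names.

```python
from typing import Callable, Dict, List


def build_pipe(pipe_config: List[dict],
               factories: Dict[str, Callable[[], object]]) -> List[object]:
    """Instantiate components, reusing an earlier instance for each "ref"."""
    by_id: Dict[str, object] = {}
    components = []
    for params in pipe_config:
        if "ref" in params:
            component = by_id[params["ref"]]  # reuse, do not re-create
        else:
            component = factories[params["name"]]()
            if "id" in params:
                by_id[params["id"]] = component
        components.append(component)
    return components


class WhitespaceTokenizer:
    """Placeholder for the tokenizer registered as "nltk_tokenizer"."""
    def __call__(self, batch):
        return [s.split() for s in batch]


pipe = build_pipe(
    [
        {"name": "nltk_tokenizer", "id": "tokenizer", "in": ["x_lower"], "out": ["x_tokens"]},
        {"ref": "tokenizer", "in": ["y"], "out": ["y_tokens"]},
    ],
    factories={"nltk_tokenizer": WhitespaceTokenizer},
)
print(pipe[0] is pipe[1])  # prints True: both steps share one instance
```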

Training

There are two abstract classes for trainable components: Estimator and NNModel.

An Estimator is fit once on any data with no batching or early stopping, so fitting can safely be done at pipeline initialization. Each Estimator must implement the fit method. An example of an Estimator is Vocab.

An NNModel requires more complex training. It can only be trained in supervised mode (as opposed to an Estimator, which can be trained in both supervised and unsupervised settings). Training takes multiple epochs with periodic validation and logging. Each NNModel must implement the train_on_batch method.
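
The difference between the two kinds of trainable component can be illustrated with toy classes. The Estimator and NNModel classes below are simplified stand-ins defined for this example only; the library's abstract classes may have different signatures. ToyVocab and ToyLinearModel are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence


class Estimator(ABC):
    """Simplified stand-in: fitted once on the whole data set."""
    @abstractmethod
    def fit(self, *args) -> None: ...


class NNModel(ABC):
    """Simplified stand-in: trained batch by batch over several epochs."""
    @abstractmethod
    def train_on_batch(self, x: Sequence, y: Sequence) -> float: ...


class ToyVocab(Estimator):
    def __init__(self) -> None:
        self.index: Dict[str, int] = {}

    def fit(self, tokens: List[str]) -> None:
        for token in tokens:
            self.index.setdefault(token, len(self.index))


class ToyLinearModel(NNModel):
    """Fits y = w * x with plain gradient descent on squared error."""
    def __init__(self, lr: float = 0.05) -> None:
        self.w, self.lr = 0.0, lr

    def train_on_batch(self, x: Sequence[float], y: Sequence[float]) -> float:
        errors = [self.w * xi - yi for xi, yi in zip(x, y)]
        grad = 2 * sum(e * xi for e, xi in zip(errors, x)) / len(x)
        self.w -= self.lr * grad
        return sum(e * e for e in errors) / len(x)  # loss before the update


vocab = ToyVocab()
vocab.fit(["a", "b", "a"])  # one pass, no batching

model = ToyLinearModel()
for epoch in range(50):  # several epochs over the same batch
    loss = model.train_on_batch([1.0, 2.0], [2.0, 4.0])
```

Here the vocabulary is complete after a single fit call, while the model only approaches w = 2 after repeated train_on_batch calls.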

Training is triggered by the deeppavlov.core.commands.train.train_model_from_config() function.

Train config

Estimators that are trained should also have a fit_on parameter, which contains a list of input parameter names. An NNModel should have the in_y parameter, which contains a list of ground truth answer names. For example:

[
  {
    "id": "classes_vocab",
    "name": "default_vocab",
    "fit_on": ["y"],
    "level": "token",
    "save_path": "vocabs/classes.dict",
    "load_path": "vocabs/classes.dict"
  },
  {
    "in": ["x"],
    "in_y": ["y"],
    "out": ["y_predicted"],
    "name": "intent_model",
    "save_path": "intents/intent_cnn",
    "load_path": "intents/intent_cnn",
    "classes_vocab": {
      "ref": "classes_vocab"
    }
  }
]
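
As a rough illustration of how these parameters could be read, the sketch below groups pipeline steps by whether they carry fit_on or in_y. It assumes that the presence of these keys is what marks a component as trainable; trainable_steps is a hypothetical helper and the config is a shortened copy of the example above.

```python
from typing import Dict, List

PIPE = [
    {"id": "classes_vocab", "name": "default_vocab", "fit_on": ["y"]},
    {"name": "intent_model", "in": ["x"], "in_y": ["y"], "out": ["y_predicted"]},
]


def trainable_steps(pipe: List[dict]) -> Dict[str, List[str]]:
    """Group component names by the kind of training their config implies."""
    groups: Dict[str, List[str]] = {"fit_once": [], "train_on_batches": []}
    for step in pipe:
        label = step.get("name", step.get("ref", "<unnamed>"))
        if "fit_on" in step:
            groups["fit_once"].append(label)
        elif "in_y" in step:
            groups["train_on_batches"].append(label)
    return groups


print(trainable_steps(PIPE))
# prints {'fit_once': ['default_vocab'], 'train_on_batches': ['intent_model']}
```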

The config for training the pipeline should have three additional elements: dataset_reader, dataset_iterator and train:

{
  "dataset_reader": {
    "name": ...,
    ...
  },
  "dataset_iterator": {
    "name": ...,
    ...
  },
  "chainer": {
    ...
  },
  "train": {
    ...
  }
}
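
Before starting a training run it can help to check that a config parses as JSON and contains these elements. The helper below is a hypothetical sketch that covers only the full form of a training config with the four top-level elements named above.

```python
import json
from typing import List

REQUIRED_FOR_TRAINING = ("dataset_reader", "dataset_iterator", "chainer", "train")


def missing_sections(config_text: str) -> List[str]:
    """Parse a config and list the top-level training elements it lacks."""
    config = json.loads(config_text)  # raises ValueError on malformed JSON
    return [key for key in REQUIRED_FOR_TRAINING if key not in config]


inference_only = '{"chainer": {"in": ["x"], "pipe": [], "out": ["y_predicted"]}}'
print(missing_sections(inference_only))
# prints ['dataset_reader', 'dataset_iterator', 'train']
```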

A simplified version of the training pipeline contains two elements: dataset and train. The dataset element can currently be used for training on classification data in csv and json formats. You can find complete examples of how to use the simplified training pipeline in the intents_sample_csv.json and intents_sample_json.json config files.

Train Parameters

DatasetReader

The DatasetReader class reads data and returns it in a specified format. A concrete DatasetReader class should inherit from the base deeppavlov.core.data.dataset_reader.DatasetReader class and be registered with a codename:

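from deeppavlov.core.common.registry import register
from deeppavlov.core.data.dataset_reader import DatasetReader


# Register the reader under the codename used in the "name" field of configs.
@register('dstc2_datasetreader')
class DSTC2DatasetReader(DatasetReader):
    ...  # reader implementation omitted here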
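
To show the idea without depending on the library, here is a self-contained toy reader. Everything in it is an assumption made for illustration: the local register decorator only mimics registration by codename, and the read method name, its argument and the returned dictionary layout are hypothetical rather than the library's actual interface.

```python
import csv
import io
from typing import Dict, List, Tuple

READERS: Dict[str, type] = {}


def register(name: str):
    """Local stand-in for a register decorator keyed by codename."""
    def decorator(cls):
        READERS[name] = cls
        return cls
    return decorator


@register("toy_csv_reader")
class ToyCsvReader:
    """Reads "text,label" rows into (x, y) pairs under a 'train' key."""

    def read(self, source: io.TextIOBase) -> Dict[str, List[Tuple[str, str]]]:
        rows = csv.reader(source)
        return {"train": [(text, label) for text, label in rows]}


# Look the class up by its codename, as a config's "name" field would.
reader = READERS["toy_csv_reader"]()
data = reader.read(io.StringIO("hello,greeting\nbye,farewell\n"))
print(data)
# prints {'train': [('hello', 'greeting'), ('bye', 'farewell')]}
```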

DatasetIterator

DatasetIterator forms the sets of data (‘train’, ‘valid’, ‘test’) needed for training/inference and divides them into batches. A concrete DatasetIterator class should be registered and can inherit from the deeppavlov.core.data.dataset_iterator.BasicDatasetIterator class. BasicDatasetIterator is not an abstract class and can itself be used as a DatasetIterator.
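
The split-and-batch behaviour can be sketched with a toy class. ToyDatasetIterator and its batches method are hypothetical and written only for this example; the method names and arguments of BasicDatasetIterator may differ.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple

Sample = Tuple[str, str]


class ToyDatasetIterator:
    """Holds 'train'/'valid'/'test' splits and yields them in batches."""

    def __init__(self, data: Dict[str, Sequence[Sample]], seed: int = 0) -> None:
        self.data = {split: list(data.get(split, []))
                     for split in ("train", "valid", "test")}
        self.random = random.Random(seed)

    def batches(self, batch_size: int, split: str = "train",
                shuffle: bool = False) -> Iterator[Tuple[List[str], List[str]]]:
        samples = list(self.data[split])
        if shuffle:
            self.random.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            # Turn a list of (x, y) pairs into a batch of xs and a batch of ys.
            x, y = zip(*samples[start:start + batch_size])
            yield list(x), list(y)


iterator = ToyDatasetIterator({"train": [("a", "1"), ("b", "2"), ("c", "3")]})
for x_batch, y_batch in iterator.batches(batch_size=2):
    print(x_batch, y_batch)
# prints ['a', 'b'] ['1', '2'] and then ['c'] ['3']
```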

Inference

All components that inherit from the deeppavlov.core.models.component.Component abstract class can be used for inference. The __call__() method should return the standard output of a component. For example, a tokenizer should return tokens, an NER component should return recognized entities, and a bot should return an utterance. The particular format of the returned data should be defined in __call__().

Inference is triggered by the deeppavlov.core.commands.infer.interact_model() function. There is no need for a separate JSON config for inference.

License

DeepPavlov is licensed under Apache 2.0.

Support and collaboration

If you have any questions, bug reports or feature requests, please feel free to post on our GitHub Issues page. Please tag your issue with ‘bug’, ‘feature request’, or ‘question’. We’ll also be glad to see your pull requests adding new datasets, models, embeddings, etc.

The Team

DeepPavlov is built and maintained by the Neural Networks and Deep Learning Lab at MIPT within the iPavlov project (part of the National Technology Initiative) and in partnership with Sberbank.