Warning: This document is for an old version of Rasa Core. The latest version is 0.14.5.

Training and Policies


Rasa Core works by creating training data from your stories and training a model on that data.

You can run training from the command line, as in the Quickstart:

python -m rasa_core.train -d domain.yml -s data/stories.md -o models/current/dialogue --epochs 200

Or by creating an agent and running the train method yourself:

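from rasa_core.agent import Agent
from rasa_core.policies.keras_policy import KerasPolicy
from rasa_core.policies.memoization import MemoizationPolicy

agent = Agent("domain.yml",
              policies=[MemoizationPolicy(), KerasPolicy()])
data = agent.load_data("stories.md")
agent.train(data)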

Data Augmentation

By default, Rasa Core creates longer stories by randomly gluing together the ones in your stories file. This helps because, if you have stories like:

# thanks
* thankyou
   - utter_youarewelcome

# bye
* goodbye
   - utter_goodbye

You actually want to teach your policy to ignore the dialogue history when it isn’t relevant and just respond with the same action no matter what happened before.

You can alter this behaviour with the --augmentation flag. Setting --augmentation 0 disables augmentation entirely.

In Python, you can pass the augmentation_factor argument to the Agent.load_data method.
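
The idea behind augmentation can be illustrated with a small, self-contained sketch. It is not Rasa Core's implementation: the augment helper, its parameters and the event tuples are hypothetical and only show how short stories can be glued into longer ones, so that a policy sees the same intent followed by the same action after different histories.

```python
import random

# Two short stories, each a list of (kind, name) events.
STORIES = {
    "thanks": [("intent", "thankyou"), ("action", "utter_youarewelcome")],
    "bye": [("intent", "goodbye"), ("action", "utter_goodbye")],
}


def augment(stories, num_augmented, max_parts=3, seed=0):
    """Build longer stories by concatenating randomly chosen originals.

    Hypothetical helper for illustration; Rasa Core's own augmentation
    differs in detail.
    """
    rng = random.Random(seed)
    names = sorted(stories)
    augmented = []
    for _ in range(num_augmented):
        # Pick between 2 and max_parts original stories to glue together
        parts = [rng.choice(names) for _ in range(rng.randint(2, max_parts))]
        events = [event for name in parts for event in stories[name]]
        augmented.append((" + ".join(parts), events))
    return augmented


augmented_stories = augment(STORIES, num_augmented=5)
for title, events in augmented_stories:
    print(title, "->", [name for _, name in events])
```

Calling augment with num_augmented=0 returns an empty list, which loosely mirrors what --augmentation 0 does: only the original stories are used.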

Max History

One important hyperparameter for Rasa Core policies is the max_history. This controls how much dialogue history the model looks at to decide which action to take next.

You can set the max_history using the training script’s --history flag or by passing it to your policy’s Featurizer.


Note: Only the MaxHistoryTrackerFeaturizer uses a max history, whereas the FullDialogueTrackerFeaturizer always looks at the full conversation history.

As an example, let’s say you have an out_of_scope intent which describes off-topic user messages. If your bot sees this intent multiple times in a row, you might want to tell the user what you can help them with. So your story might look like this:

* out_of_scope
   - utter_default
* out_of_scope
   - utter_default
* out_of_scope
   - utter_help_message

For Rasa Core to learn this pattern, the max_history has to be at least 3.
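
To see why, here is a simplified sketch of windowing. It assumes the model only sees the last max_history user intents, which is a deliberate simplification: the real featurizers work on tracker states that also include previous actions and slots. The helper names are hypothetical.

```python
def training_examples(story, max_history):
    """Turn a story into (window of recent intents, action) pairs."""
    intents = [intent for intent, _ in story]
    examples = []
    for i, (_, action) in enumerate(story):
        # Keep at most max_history intents, ending with the current one
        start = max(0, i + 1 - max_history)
        window = tuple(intents[start:i + 1])
        examples.append((window, action))
    return examples


def conflicting_windows(story, max_history):
    """Return windows that map to more than one action."""
    seen = {}
    for window, action in training_examples(story, max_history):
        seen.setdefault(window, set()).add(action)
    return {w: sorted(a) for w, a in seen.items() if len(a) > 1}


STORY = [
    ("out_of_scope", "utter_default"),
    ("out_of_scope", "utter_default"),
    ("out_of_scope", "utter_help_message"),
]

print(conflicting_windows(STORY, max_history=2))
# {('out_of_scope', 'out_of_scope'): ['utter_default', 'utter_help_message']}
print(conflicting_windows(STORY, max_history=3))
# {}
```

With a window of 2, the second and third turns look identical to the model but require different actions, so the pattern cannot be learned. With a window of 3, every turn has a distinct input.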

If you increase your max_history, your model will become bigger and training will take longer. If you have some information that should affect the dialogue very far into the future, you should store it as a slot. Slot information is always available for every featurizer.

Training Script Options

usage: train.py [-h] {default,compare,interactive} ...

Train a dialogue model for Rasa Core. The training will use your conversations
in the story training data format and your domain definition to train a
dialogue model to predict a bots actions.

positional arguments:
  {default,compare,interactive}
                        Training mode of core.
    default             train a dialogue model
    compare             train multiple dialogue models to compare policies
    interactive         teach the bot with interactive learning

optional arguments:
  -h, --help            show this help message and exit


Policies

The rasa_core.policies.Policy class decides which action to take at every step in the conversation.

There are different policies to choose from, and you can include multiple policies in a single Agent. At every turn, the policy which predicts the next action with the highest confidence will be used. You can pass a list of policies when you create an agent:

from rasa_core.policies.memoization import MemoizationPolicy
from rasa_core.policies.keras_policy import KerasPolicy
from rasa_core.agent import Agent

agent = Agent("domain.yml",
              policies=[MemoizationPolicy(), KerasPolicy()])


By default, Rasa Core uses the KerasPolicy in combination with the MemoizationPolicy.
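
The selection rule (the most confident policy wins) can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: pick_action and the two stand-in policies are hypothetical and do not use the Rasa Core API.

```python
def pick_action(policies, history):
    """Ask each policy for (action, confidence) and keep the most confident."""
    best = None
    for name, predict in policies:
        action, confidence = predict(history)
        if best is None or confidence > best[2]:
            best = (name, action, confidence)
    return best


def exact_match_policy(history):
    # Hypothetical stand-in: fully confident, but only on a known history
    known = {("greet",): "utter_greet"}
    action = known.get(tuple(history))
    return (action, 1.0) if action is not None else (None, 0.0)


def fallback_policy(history):
    # Hypothetical stand-in: always has a guess, with moderate confidence
    return ("utter_default", 0.6)


POLICIES = [("exact_match", exact_match_policy), ("fallback", fallback_policy)]

print(pick_action(POLICIES, ["greet"]))
# ('exact_match', 'utter_greet', 1.0)
print(pick_action(POLICIES, ["out_of_scope"]))
# ('fallback', 'utter_default', 0.6)
```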

Memoization Policy

The MemoizationPolicy just memorizes the conversations in your training data. It predicts the next action with confidence 1.0 if this exact conversation exists in the training data, otherwise it predicts None with confidence 0.0.
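
As a rough illustration of this behaviour, the following toy class memorises which action followed each exact conversation prefix. ToyMemoizationPolicy is a hypothetical name and a simplified stand-in, not the actual MemoizationPolicy, which works on featurized tracker states.

```python
class ToyMemoizationPolicy:
    """Memorises which action followed each exact conversation prefix."""

    def __init__(self):
        self.lookup = {}

    def train(self, stories):
        for story in stories:
            prefix = []
            for intent, action in story:
                # The key is everything seen so far, ending with the intent
                prefix.append(intent)
                self.lookup[tuple(prefix)] = action
                prefix.append(action)

    def predict(self, events):
        action = self.lookup.get(tuple(events))
        confidence = 1.0 if action is not None else 0.0
        return action, confidence


policy = ToyMemoizationPolicy()
policy.train([[("greet", "utter_greet"), ("goodbye", "utter_goodbye")]])

print(policy.predict(["greet"]))
# ('utter_greet', 1.0)
print(policy.predict(["greet", "utter_greet", "goodbye"]))
# ('utter_goodbye', 1.0)
print(policy.predict(["goodbye"]))
# (None, 0.0)
```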

Keras Policy

The KerasPolicy uses a neural network implemented in Keras to select the next action. The default architecture is based on an LSTM, but you can override the KerasPolicy.model_architecture method to implement your own architecture.

    def model_architecture(
            self,
            input_shape,  # type: Tuple[int, int]
            output_shape  # type: Tuple[int, Optional[int]]
    ):
        # type: (...) -> keras.models.Sequential
        """Build a keras model and return a compiled model."""

        from keras.models import Sequential
        from keras.layers import \
            Masking, LSTM, Dense, TimeDistributed, Activation

        # Build Model
        model = Sequential()

        # the shape of the y vector of the labels,
        # determines which output from rnn will be used
        # to calculate the loss
        if len(output_shape) == 1:
            # y is (num examples, num features) so
            # only the last output from the rnn is used to
            # calculate the loss
            model.add(Masking(mask_value=-1, input_shape=input_shape))
            model.add(LSTM(self.rnn_size, dropout=0.2))
            model.add(Dense(input_dim=self.rnn_size, units=output_shape[-1]))
        elif len(output_shape) == 2:
            # y is (num examples, max_dialogue_len, num features) so
            # all the outputs from the rnn are used to
            # calculate the loss, therefore a sequence is returned and
            # time distributed layer is used

            # the first value in input_shape is max dialogue_len,
            # it is set to None, to allow dynamic_rnn creation
            # during prediction
            model.add(Masking(mask_value=-1,
                              input_shape=(None, input_shape[1])))
            model.add(LSTM(self.rnn_size, return_sequences=True, dropout=0.2))
            model.add(TimeDistributed(Dense(units=output_shape[-1])))
        else:
            raise ValueError("Cannot construct the model because "
                             "length of output_shape = {} "
                             "should be 1 or 2."
                             "".format(len(output_shape)))

        # turn the scores into a probability distribution over actions
        model.add(Activation('softmax'))

        model.compile(loss='categorical_crossentropy',
                      optimizer='rmsprop',
                      metrics=['accuracy'])

        return model