Server Configuration


Before you can use the server, you need to train a model (see training_your_model).


In older versions of Rasa NLU, the server and models were configured with a single file. Now, the server only takes command line arguments (see Server Parameters). The configuration file only refers to the model that you want to train, i.e. the pipeline and components.

Running the server

You can run a simple HTTP server that handles requests using your projects with:

$ python -m rasa_nlu.server --path projects

The server will look for existing projects under the folder defined by the path parameter. By default a project will load the latest trained model.

Serving Multiple Apps

Depending on your choice of backend, Rasa NLU can use quite a lot of memory. If you are serving multiple models in production, you should serve them from the same process to avoid duplicating the memory load.


Although this saves the backend from loading the same set of word vectors twice, if you have projects in multiple languages your memory usage will still be high.

Rasa NLU handles serving multiple apps out of the box. By default, the server loads all projects found under the directory specified with the --path option, unless you provide the --pre_load option to load specific projects.

$ # This will load all projects under projects/ directory
$ python -m rasa_nlu.server -c config.yaml --path projects/
$ # This will load only hotels project under projects/ directory
$ python -m rasa_nlu.server -c config.yaml --pre_load hotels --path projects/

The file structure under the path directory is as follows:

- <path>
 - <project_A>
  - <model_XXXXXX>
  - <model_XXXXXX>
 - <project_B>
  - <model_XXXXXX>
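
As a rough illustration of this layout, the sketch below walks a projects directory and picks one model per project. The helper name latest_models is hypothetical, and it assumes that model directory names sort chronologically (for example because they end in a timestamp), so the lexicographically largest name is treated as the latest. This is not necessarily how the server itself selects the latest model.

```python
import tempfile
from pathlib import Path


def latest_models(path):
    """Map each project directory under `path` to its newest model directory name.

    Assumption: model directory names sort chronologically, so the
    lexicographically largest name is treated as the latest model.
    Projects without any model directory map to None.
    """
    result = {}
    for project_dir in sorted(Path(path).iterdir()):
        if not project_dir.is_dir():
            continue
        models = sorted(m.name for m in project_dir.iterdir() if m.is_dir())
        result[project_dir.name] = models[-1] if models else None
    return result


if __name__ == "__main__":
    # Build a throwaway tree that mirrors the layout above, then inspect it.
    # The model names below are made up for the demonstration.
    with tempfile.TemporaryDirectory() as root:
        for project, model in [
            ("project_A", "model_20180101-000000"),
            ("project_A", "model_20180202-000000"),
            ("project_B", "model_20180303-000000"),
        ]:
            (Path(root) / project / model).mkdir(parents=True)
        print(latest_models(root))
```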

You can specify which project to use in your /parse requests:

$ curl 'localhost:5000/parse?q=hello&project=my_restaurant_search_bot'


$ curl -XPOST localhost:5000/parse -d '{"q":"I am looking for Chinese food", "project":"my_restaurant_search_bot"}'

You can also specify the model you want to use for a given project; by default, the latest trained model is used:

$ curl -XPOST localhost:5000/parse -d '{"q":"I am looking for Chinese food", "project":"my_restaurant_search_bot", "model":"<model_XXXXXX>"}'

If no project is found by the server under the path directory, a "default" one will be used, using a simple fallback model.
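
The /parse requests above can also be issued from Python. The following is a minimal sketch using only the standard library. The function names build_parse_request and parse are hypothetical, and the base URL assumes a server listening on localhost:5000, as in the curl examples. Building the request is kept separate from sending it, so the payload can be inspected without a running server.

```python
import json
import urllib.request


def build_parse_request(text, project=None, model=None,
                        base_url="http://localhost:5000"):
    """Build (but do not send) a POST request for the /parse endpoint."""
    payload = {"q": text}
    # "project" and "model" are optional, mirroring the curl examples.
    if project is not None:
        payload["project"] = project
    if model is not None:
        payload["model"] = model
    return urllib.request.Request(
        base_url + "/parse",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )


def parse(text, **kwargs):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(build_parse_request(text, **kwargs)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_parse_request("I am looking for Chinese food",
                              project="my_restaurant_search_bot")
    # Show what would be sent, without contacting any server.
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

With a server running, parse("I am looking for Chinese food", project="my_restaurant_search_bot") would send the same body as the second curl command above.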

Server Parameters

There are a number of parameters you can pass when running the server.

$ python -m rasa_nlu.server

Here is a quick overview:

usage: [-h] [-e {wit,luis,dialogflow}] [-P PORT]
                 [--pre_load PRE_LOAD [PRE_LOAD ...]] [-t TOKEN] [-w WRITE]
                 --path PATH [--cors [CORS [CORS ...]]]
                 [--max_training_processes MAX_TRAINING_PROCESSES]
                 [--num_threads NUM_THREADS] [--endpoints ENDPOINTS]
                 [--wait_time_between_pulls WAIT_TIME_BETWEEN_PULLS]
                 [--response_log RESPONSE_LOG] [--storage STORAGE] [-c CONFIG]
                 [--debug] [-v]

parse incoming text

optional arguments:
  -h, --help            show this help message and exit
  -e {wit,luis,dialogflow}, --emulate {wit,luis,dialogflow}
                        which service to emulate (default: None i.e. use
                        simple built in format)
  -P PORT, --port PORT  port on which to run server
  --pre_load PRE_LOAD [PRE_LOAD ...]
                        Preload models into memory before starting the server.
                        If given `all` as input all the models will be loaded.
                        Else you can specify a list of specific project names.
                        Eg: python -m rasa_nlu.server --pre_load project1
                        --path projects -c config.yaml
  -t TOKEN, --token TOKEN
                        auth token. If set, reject requests which don't
                        provide this token as a query parameter
  -w WRITE, --write WRITE
                        file where logs will be saved
  --path PATH           working directory of the server. Models are loaded from
                        this directory and trained models will be saved here.
  --cors [CORS [CORS ...]]
                        List of domain patterns from where CORS (cross-origin
                        resource sharing) calls are allowed. The default value
                        is `[]` which forbids all CORS requests.
  --max_training_processes MAX_TRAINING_PROCESSES
                        Number of processes used to handle training requests.
                        Increasing this value will have a great impact on
                        memory usage. It is recommended to keep the default
                        value.
  --num_threads NUM_THREADS
                        Number of parallel threads to use for handling parse
                        requests.
  --endpoints ENDPOINTS
                        Configuration file for the model server as a yaml file
  --wait_time_between_pulls WAIT_TIME_BETWEEN_PULLS
                        Wait time in seconds between NLU model server queries.
  --response_log RESPONSE_LOG
                        Directory where logs will be saved (containing queries
                        and responses). If set to ``null`` logging will be
                        disabled.
  --storage STORAGE     Set the remote location where models are stored. E.g.
                        on AWS. If nothing is configured, the server will only
                        serve the models that are on disk in the configured
                        path.
  -c CONFIG, --config CONFIG
                        Default model configuration file used for training.
  --debug               Print lots of debugging statements. Sets logging level
                        to DEBUG
  -v, --verbose         Be verbose. Sets logging level to INFO
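
If you start the server from a script rather than a shell, the command line can be assembled programmatically. The sketch below is one possible way to do that. The helper name server_command is hypothetical, and it only uses flags that appear in the overview above (--path, -c, --pre_load, -P, -t).

```python
import sys


def server_command(path, pre_load=None, port=None, token=None, config=None):
    """Return the argv list for `python -m rasa_nlu.server` with the given options."""
    # --path is required by the server, so it is always included.
    cmd = [sys.executable, "-m", "rasa_nlu.server", "--path", str(path)]
    if config is not None:
        cmd += ["-c", str(config)]
    if pre_load:
        # --pre_load accepts one or more project names.
        cmd += ["--pre_load", *pre_load]
    if port is not None:
        cmd += ["-P", str(port)]
    if token is not None:
        cmd += ["-t", token]
    return cmd


if __name__ == "__main__":
    # Only print the command; nothing is launched here.
    print(" ".join(server_command("projects/", pre_load=["hotels"],
                                  config="config.yaml")))
```

The resulting list can be handed to subprocess.run(cmd, check=True) to launch the server, assuming rasa_nlu is installed in the same Python environment.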


To protect your server, you can specify a token, either by passing the --token argument when starting the server or by setting the RASA_NLU_TOKEN environment variable. If set, this token must be passed as a query parameter in all requests, e.g.:

$ curl 'localhost:5000/status?token=12345'
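
On the client side, the token can be appended to any request URL as a query parameter. Here is a minimal sketch. The helper name with_token is hypothetical, and falling back to the RASA_NLU_TOKEN environment variable on the client is an assumption made for convenience, not something the server requires.

```python
import os
from urllib.parse import urlencode


def with_token(url, token=None, extra_params=None):
    """Append the auth token (and any other parameters) to `url` as a query string."""
    params = dict(extra_params or {})
    # Assumption: reuse RASA_NLU_TOKEN when no token is passed explicitly.
    if token is None:
        token = os.environ.get("RASA_NLU_TOKEN")
    if token:
        params["token"] = token
    # urlencode takes care of escaping special characters in the values.
    return url + "?" + urlencode(params) if params else url


if __name__ == "__main__":
    print(with_token("http://localhost:5000/status", token="12345"))
    print(with_token("http://localhost:5000/parse", token="12345",
                     extra_params={"q": "hello"}))
```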


By default, CORS (cross-origin resource sharing) calls are not allowed. If you want to call your Rasa NLU server from another domain (for example from a training web UI), you can whitelist that domain by passing it to the --cors argument when starting the server.

Have questions or feedback?

We have a very active support community on the Rasa Community Forum that is happy to help you with your questions. If you have any feedback for us or a specific suggestion for improving the docs, feel free to share it by creating an issue on the Rasa NLU GitHub repository.