Server Configuration¶
Note
Before you can use the server, you should train a model! See training_your_model
Note
In older versions of Rasa NLU, the server and models were configured with a single file. Now, the server only takes command line arguments (see Server Parameters). The configuration file only refers to the model that you want to train, i.e. the pipeline and components.
Running the server¶
You can run a simple HTTP server that handles requests using your projects with:
$ python -m rasa_nlu.server --path projects
The server will look for existing projects under the folder defined by the path parameter. By default a project will load the latest trained model.
Serving Multiple Apps¶
Depending on your choice of backend, Rasa NLU can use quite a lot of memory. So if you are serving multiple models in production, you want to serve them from the same process and avoid duplicating the memory load.
Note
Although this saves the backend from loading the same set of word vectors twice, if you have projects in multiple languages your memory usage will still be high.
Rasa NLU naturally handles serving multiple apps. By default, the server loads all projects found under the directory specified with the --path option, unless you provide the --pre_load option to load specific projects.
$ # This will load all projects under projects/ directory
$ python -m rasa_nlu.server -c config.yaml --path projects/
$ # This will load only hotels project under projects/ directory
$ python -m rasa_nlu.server -c config.yaml --pre_load hotels --path projects/
The file structure under the path directory is as follows:
- <path>
    - <project_A>
        - <model_XXXXXX>
        - <model_XXXXXX>
        ...
    - <project_B>
        - <model_XXXXXX>
        ...
    ...
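The layout above can be inspected with a few lines of Python. This is only a minimal sketch, not part of Rasa NLU: the helper names list_projects and latest_model are hypothetical, and the sketch assumes that model directory names sort chronologically, so that the last name in sorted order is the latest trained model.

```python
import os
import tempfile


def list_projects(path):
    """Map each project directory under `path` to its sorted model directories."""
    projects = {}
    for project in sorted(os.listdir(path)):
        project_dir = os.path.join(path, project)
        if not os.path.isdir(project_dir):
            continue  # skip stray files next to the project folders
        projects[project] = sorted(
            name for name in os.listdir(project_dir)
            if os.path.isdir(os.path.join(project_dir, name))
        )
    return projects


def latest_model(models):
    """Assumption: names sort chronologically, so the last one is the latest."""
    return models[-1] if models else None


if __name__ == "__main__":
    # Build a throwaway layout that mirrors the structure shown above.
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "project_A", "model_20180101-000000"))
    os.makedirs(os.path.join(root, "project_A", "model_20180202-000000"))
    os.makedirs(os.path.join(root, "project_B", "model_20180303-000000"))
    for name, models in list_projects(root).items():
        print(name, "->", latest_model(models))
```

The model directory names in the demo are made up for illustration; real names depend on how and when the models were trained.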
You can specify which project to use in your /parse requests:
$ curl 'localhost:5000/parse?q=hello&project=my_restaurant_search_bot'
or
$ curl -XPOST localhost:5000/parse -d '{"q":"I am looking for Chinese food", "project":"my_restaurant_search_bot"}'
You can also specify the model you want to use for a given project, the default used being the latest trained:
$ curl -XPOST localhost:5000/parse -d '{"q":"I am looking for Chinese food", "project":"my_restaurant_search_bot", "model":"<model_XXXXXX>"}'
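The same requests can be assembled programmatically. The sketch below only builds the GET URL and the POST body shown in the curl examples; it does not send anything. The function names are hypothetical, and the field names q, project and model are taken from the examples above.

```python
import json
from urllib.parse import urlencode


def build_parse_get_url(base_url, text, project=None):
    """Build the URL for a GET /parse request, optionally naming a project."""
    params = {"q": text}
    if project is not None:
        params["project"] = project
    return base_url.rstrip("/") + "/parse?" + urlencode(params)


def build_parse_post_body(text, project=None, model=None):
    """Build the JSON body for a POST /parse request as UTF-8 bytes."""
    payload = {"q": text}
    if project is not None:
        payload["project"] = project
    if model is not None:
        # When omitted, the docs above say the latest trained model is used.
        payload["model"] = model
    return json.dumps(payload).encode("utf-8")


if __name__ == "__main__":
    print(build_parse_get_url("http://localhost:5000", "hello",
                              project="my_restaurant_search_bot"))
    print(build_parse_post_body("I am looking for Chinese food",
                                project="my_restaurant_search_bot"))
```

Using urlencode rather than string concatenation keeps queries with spaces or special characters valid, and json.dumps guarantees that the model name is sent as a quoted JSON string.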
If no project is found by the server under the path directory, a "default" one will be used, using a simple fallback model.
Server Parameters¶
There are a number of parameters you can pass when running the server.
$ python -m rasa_nlu.server
Here is a quick overview:
usage: server.py [-h] [-e {wit,luis,dialogflow}] [-P PORT]
                 [--pre_load PRE_LOAD [PRE_LOAD ...]] [-t TOKEN] [-w WRITE]
                 --path PATH [--cors [CORS [CORS ...]]]
                 [--max_training_processes MAX_TRAINING_PROCESSES]
                 [--num_threads NUM_THREADS] [--endpoints ENDPOINTS]
                 [--wait_time_between_pulls WAIT_TIME_BETWEEN_PULLS]
                 [--response_log RESPONSE_LOG] [--storage STORAGE] [-c CONFIG]
                 [--debug] [-v]

parse incoming text

optional arguments:
  -h, --help            show this help message and exit
  -e {wit,luis,dialogflow}, --emulate {wit,luis,dialogflow}
                        which service to emulate (default: None i.e. use
                        simple built in format)
  -P PORT, --port PORT  port on which to run server
  --pre_load PRE_LOAD [PRE_LOAD ...]
                        Preload models into memory before starting the server.
                        If given `all` as input all the models will be loaded.
                        Else you can specify a list of specific project names.
                        Eg: python -m rasa_nlu.server --pre_load project1
                        --path projects -c config.yaml
  -t TOKEN, --token TOKEN
                        auth token. If set, reject requests which don't
                        provide this token as a query parameter
  -w WRITE, --write WRITE
                        file where logs will be saved
  --path PATH           working directory of the server. Models are loaded from
                        this directory and trained models will be saved here.
  --cors [CORS [CORS ...]]
                        List of domain patterns from where CORS (cross-origin
                        resource sharing) calls are allowed. The default value
                        is `[]` which forbids all CORS requests.
  --max_training_processes MAX_TRAINING_PROCESSES
                        Number of processes used to handle training requests.
                        Increasing this value will have a great impact on
                        memory usage. It is recommended to keep the default
                        value.
  --num_threads NUM_THREADS
                        Number of parallel threads to use for handling parse
                        requests.
  --endpoints ENDPOINTS
                        Configuration file for the model server as a yaml file
  --wait_time_between_pulls WAIT_TIME_BETWEEN_PULLS
                        Wait time in seconds between NLU model server queries.
  --response_log RESPONSE_LOG
                        Directory where logs will be saved (containing queries
                        and responses). If set to ``null`` logging will be
                        disabled.
  --storage STORAGE     Set the remote location where models are stored. E.g.
                        on AWS. If nothing is configured, the server will only
                        serve the models that are on disk in the configured
                        `path`.
  -c CONFIG, --config CONFIG
                        Default model configuration file used for training.
  --debug               Print lots of debugging statements. Sets logging level
                        to DEBUG
  -v, --verbose         Be verbose. Sets logging level to INFO
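If the server is launched from a script, the argument list can be assembled in Python before handing it to subprocess. This is a rough sketch under stated assumptions: build_server_command is a hypothetical helper, it covers only a handful of the flags listed above, and it only returns the command without starting anything.

```python
import sys


def build_server_command(path, port=None, pre_load=None, token=None, config=None):
    """Return an argv list for `python -m rasa_nlu.server` using flags from the help text."""
    cmd = [sys.executable, "-m", "rasa_nlu.server", "--path", path]
    if port is not None:
        cmd += ["--port", str(port)]
    if pre_load:
        # --pre_load accepts one or more project names (or `all`).
        cmd += ["--pre_load"] + list(pre_load)
    if token is not None:
        cmd += ["--token", token]
    if config is not None:
        cmd += ["--config", config]
    return cmd


if __name__ == "__main__":
    # Mirrors the earlier example that loads only the hotels project.
    print(build_server_command("projects/", pre_load=["hotels"], config="config.yaml"))
```

The resulting list can be passed to subprocess.Popen; keeping it as a list avoids shell quoting problems with tokens or paths that contain spaces.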
Authentication¶
To protect your server, you can specify a token, either by passing the --token argument when starting the server or by setting the RASA_NLU_TOKEN environment variable.
If set, this token must be passed as a query parameter in all requests, e.g.:
$ curl 'localhost:5000/status?token=12345'
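For clients that build URLs in code, the token can be appended as a query parameter without disturbing parameters that are already present. A minimal sketch follows; with_token is a hypothetical helper name, and the parameter name token comes from the curl example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(url, token):
    """Return `url` with a `token` query parameter appended."""
    parts = urlsplit(url)
    # Keep any existing parameters (e.g. q, project) and add the token last.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_token("http://localhost:5000/status", "12345"))
    print(with_token("http://localhost:5000/parse?q=hello", "12345"))
```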
CORS¶
By default, CORS (cross-origin resource sharing) calls are not allowed. If you want to call your Rasa NLU server from another domain (for example from a training web UI), you can whitelist that domain by passing it to the --cors argument when starting the server (see Server Parameters).