Change Log¶
All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning starting with version 0.7.0.
[Unreleased] - master¶
Note
This version is not yet released and is under active development.
Added¶
Changed¶
Removed¶
Fixed¶
[0.8.9] - 2017-05-26¶
Fixed¶
- properly handle response_log configuration variable being set to null
[0.8.8] - 2017-05-26¶
Fixed¶
- /status endpoint now shows all available models instead of only those whose name starts with "model"
[0.8.0] - 2017-05-08¶
Added¶
- ngram character featurizer (allows better handling of out-of-vocab words)
- replaced pre-wired backends with more flexible pipeline definitions
- return top 10 intents with sklearn classifier #199
- python type annotations for nearly all public functions
- support for arbitrary spacy language model names
- duckling components to provide normalized output for structured entities
- Conditional random field entity extraction (Markov model for entity tagging; better named entity recognition with small and medium amounts of training data, and comparable results with large amounts)
- allow naming of trained models instead of generated model names
- dynamic check of requirements for the different components & error messages on missing dependencies
- support for using multiple entity extractors and combining results downstream
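To illustrate why a character n-gram featurizer can help with out-of-vocabulary words, here is a minimal sketch. It is not the featurizer shipped in this release; the function names, the `<`/`>` boundary markers, and the n-gram range are assumptions chosen for illustration. The idea is that an unseen word still shares many character n-grams with related words seen during training.

```python
def char_ngrams(word, n_min=2, n_max=4):
    """Return all character n-grams of `word` with n_min <= n <= n_max.

    The word is lowercased and wrapped in '<' and '>' so that prefixes
    and suffixes yield distinct n-grams (an illustrative convention).
    """
    padded = "<" + word.lower() + ">"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def ngram_overlap(a, b):
    """Jaccard overlap between the character n-gram sets of two words."""
    grams_a, grams_b = set(char_ngrams(a)), set(char_ngrams(b))
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    # A word missing from the vocabulary ("bookings") still overlaps
    # strongly with a known word ("booking"), unlike an unrelated word.
    print(ngram_overlap("booking", "bookings"))
    print(ngram_overlap("booking", "weather"))
```

A word-level vocabulary would treat "bookings" as entirely unknown, whereas the n-gram view gives it a representation close to "booking".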
Changed¶
- unified tokenizers, classifiers and feature extractors to implement common component interface
- src directory renamed to rasa_nlu
- when loading data in a foreign format (api.ai, luis, wit) the data gets properly split into intent & entity examples
- Configuration:
  - added max_number_of_ngrams
  - removed backend and added pipeline as a replacement
  - added luis_data_tokenizer
  - added duckling_dimensions
- parser output format changed
  from {"intent": "greeting", "confidence": 0.9, "entities": []}
  to {"intent": {"name": "greeting", "confidence": 0.9}, "entities": []}
- entities output format changed
  from {"start": 15, "end": 28, "value": "New York City", "entity": "GPE"}
  to {"extractor": "ner_mitie", "processors": ["ner_synonyms"], "start": 15, "end": 28, "value": "New York City", "entity": "GPE"}
  where extractor denotes the entity extractor that originally found an entity, and processors denotes components that alter entities, such as the synonym component.
- camel cased MITIE classes (e.g. MITIETokenizer → MitieTokenizer)
- model metadata changed, see migration guide
- updated to spacy 1.7 and dropped training and loading capabilities for the spacy component (breaks existing spacy models!)
- introduced compatibility with both Python 2 and 3
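Client code that consumed the old parser output needs adjusting for the nested intent format described above. The following is a minimal sketch of a compatibility shim, assuming results arrive as plain dictionaries shaped like the examples in this entry; `upgrade_parse_result` and `intent_name` are hypothetical helper names, not part of this project's API.

```python
def upgrade_parse_result(result):
    """Convert a pre-0.8.0 parse result to the 0.8.0 layout.

    Results that already use the nested intent format are returned
    unchanged. Entities are copied as-is: the new "extractor" and
    "processors" keys cannot be reconstructed from old output, so
    they are not invented here.
    """
    if isinstance(result.get("intent"), dict):
        return result
    return {
        "intent": {
            "name": result["intent"],
            "confidence": result["confidence"],
        },
        "entities": [dict(entity) for entity in result.get("entities", [])],
    }


def intent_name(result):
    """Read the intent name from either the old or the new format."""
    intent = result["intent"]
    return intent["name"] if isinstance(intent, dict) else intent


if __name__ == "__main__":
    old = {"intent": "greeting", "confidence": 0.9, "entities": []}
    print(upgrade_parse_result(old))
    print(intent_name(old))
```

When reading entities, treating "extractor" and "processors" as optional (for example with `entity.get("extractor")`) keeps the same code working against output from both versions.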
Removed¶
[0.7.4] - 2017-03-27¶
Fixed¶
- fixed failed loading of example data after renaming attributes, i.e. “KeyError: ‘entities’”
[0.7.3] - 2017-03-15¶
Fixed¶
- fixed regression in mitie entity extraction on special characters
- fixed spacy fine tuning and entity recognition on passed language instance
[0.7.1] - 2017-03-10¶
[0.7.0] - 2017-03-10¶
This is a major version update. Please also have a look at the Migration Guide.
Added¶
- Changelog ;)
- option to use multi-threading during classifier training
- entity synonym support
- proper temporary file creation during tests
- mitie_sklearn backend using mitie tokenization and sklearn classification
- option to fine-tune spacy NER models
- multithreading support for the built-in REST server (e.g. using gunicorn)
- multitenancy implementation to allow loading multiple models which share the same backend
Fixed¶
- error propagation on failed vector model loading (spacy)
- escaping of special characters during mitie tokenization