
Comments 3

P.S. Another way to detect the language, from IBM support:

The Machine Learning models supporting the Watson services are statistical in nature and for the sample data you provided ("Do you have store near Dubai"), the model in Language Translator is not really confident which language it is (it returns nn with 0.327 confidence and en with 0.166 confidence). Providing more text would help in improving the confidence of the language identification.

Can you help me understand the use case better for language identification?
If all you're interested in is identifying the language of the text, I would recommend trying NLU (Natural Language Understanding) service.
www.ibm.com/watson/developercloud/natural-language-understanding/api/v1/#get-analyze

If you don't have an NLU service, you'd create one just like you create any of the other Watson services. This returns credentials (username and password) which you can then use for language identification as well as several other features. Note that although language identification is not a specific feature of NLU, it is executed and returned with every NLU call. So a simple keyword extraction call would work as follows:

curl -G -u "username":"password" "https://gateway.watsonplatform.net/natural-language-understanding/api/v1/analyze?version=2017-02-27&features=keywords" --data-urlencode "text=Do you have store near Dubai"

Response would include the language as shown below:

{
  "usage": {
    "text_units": 1,
    "text_characters": 28,
    "features": 1
  },
  "language": "en",
  "keywords": [
    {
      "text": "Dubai",
      "relevance": 0.9652
    },
    {
      "text": "store",
      "relevance": 0.422647
    }
  ]
}
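
For anyone calling this from code rather than the command line, here is a rough Python sketch of the same request using only the standard library. It assumes the endpoint, version date and basic-auth credentials from the curl call above, plus an https scheme; the function names (build_request, extract_language, detect_language) are made up for illustration and are not part of any IBM SDK.

```python
import base64
import json
import urllib.parse
import urllib.request

# Endpoint path and version date are copied from the curl call above;
# the https:// scheme is an assumption.
NLU_URL = "https://gateway.watsonplatform.net/natural-language-understanding/api/v1/analyze"


def build_request(text, username, password, version="2017-02-27"):
    """Build the GET request that `curl -G --data-urlencode` would send."""
    query = urllib.parse.urlencode(
        {"version": version, "features": "keywords", "text": text}
    )
    # Basic auth header, equivalent to curl's -u username:password
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{NLU_URL}?{query}", headers={"Authorization": f"Basic {token}"}
    )


def extract_language(response_body):
    """Return the top-level "language" field of an NLU JSON response, or None."""
    return json.loads(response_body).get("language")


def detect_language(text, username, password):
    """Send the request and return the detected language code (needs network access)."""
    with urllib.request.urlopen(build_request(text, username, password)) as resp:
        return extract_language(resp.read().decode("utf-8"))
```

With real service credentials, detect_language("Do you have store near Dubai", username, password) should return the same "language" value as the response shown above. The keywords feature is requested only because the call needs at least one feature; the language comes back regardless.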
