TransportError 400 from Elasticsearch with a huge list of terms - python, django, elasticsearch, django-haystack

Searching with a small list of terms returns what I want:

In [29]: small_list
Out[29]: [8096, 8105, 8114, 8116, 8128, 8130]

In [30]: sqs.filter(id__in=small_list)
Out[30]: [<SearchResult: web.listing (pk=u"8128")>, <SearchResult: web.listing (pk=u"8130")>, <SearchResult: web.listing (pk=u"8116")>, <SearchResult: web.listing (pk=u"8105")>, <SearchResult: web.listing (pk=u"8114")>, <SearchResult: web.listing (pk=u"8096")>]

but with thousands of terms it fails with the error below:

In [32]: len(big_list)
Out[32]: 6305
In [33]: sqs.filter(id__in=big_list)
Traceback (most recent call last):
  File "/home/ravi/bit/wonder/env/local/lib/python2.7/site-packages/haystack/backends/elasticsearch_backend.py", line 516, in search
    _source=True)
  File "/home/ravi/bit/wonder/env/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/ravi/bit/wonder/env/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 530, in search
    doc_type, "_search"), params=params, body=body)
  File "/home/ravi/bit/wonder/env/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 307, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/home/ravi/bit/wonder/env/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 93, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/ravi/bit/wonder/env/local/lib/python2.7/site-packages/elasticsearch/connection/base.py", line 105, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
RequestError: TransportError(400, u"search_phase_execution_exception")

Django == 1.8

django-haystack == 2.4.1

elasticsearch == 2.1.0

Answers:

Answer № 1 (score: 1)

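By default, Elasticsearch limits a boolean query to 1024 clauses (the indices.query.bool.max_clause_count setting). An id__in filter with 6305 terms most likely exceeds that limit, which is why the search is rejected with a 400.

The query below should work, but note that it only searches the first 1024 IDs: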

sqs.filter(id__in=big_list[:1024])

More information: https://groups.google.com/forum/#!topic/elasticsearch/LqywKHKWbeI
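
Slicing to the first 1024 IDs leaves the rest of big_list unsearched. One possible way to cover every ID is to query in batches and merge the results. The following is only a minimal sketch: chunked and search_in_batches are hypothetical helper names, not part of django-haystack, and it assumes that sqs is the SearchQuerySet from the question and that a batch size of 1024 stays within the clause limit.

```python
def chunked(items, size=1024):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def search_in_batches(sqs, ids, size=1024):
    """Run one id__in filter per batch and collect the results in a list.

    Assumption: `sqs` behaves like a haystack SearchQuerySet, i.e.
    filter() returns a new iterable queryset. Relevance ordering across
    batches is not preserved, and each batch is a separate request.
    """
    results = []
    for batch in chunked(ids, size):
        results.extend(sqs.filter(id__in=batch))
    return results
```

With the data from the question this would be called as search_in_batches(sqs, big_list), which issues seven requests for 6305 IDs. If a single request is required, the alternative is to raise the clause limit on the Elasticsearch side, at the cost of heavier queries.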