PHPFixing

Monday, August 1, 2022

[FIXED] How to get records 23000 to 23004 in Elasticsearch

 August 01, 2022     elasticsearch, pagination, python

Issue

I have an Elasticsearch index containing about 100k rows, and I want to paginate through about 30k of them.

The error I get is about max-result-window.

In this case I cannot get records 23000 to 23004 because the request goes past the 10k-record limit. Is there a workaround?
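
For context, the error comes from the `index.max_result_window` setting, which defaults to 10000 and caps `from + size` for an ordinary search. A small sketch of that check for the range in question (the function name `exceeds_result_window` is made up for illustration and is not part of any Elasticsearch client):

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for index.max_result_window


def exceeds_result_window(start, size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return True if a from/size request would go past the result window."""
    return start + size > max_result_window


# Records 23000 to 23004 correspond to from=23000, size=5.
print(exceeds_result_window(23000, 5))  # True: 23005 is past the window
print(exceeds_result_window(9995, 5))   # False: 9995 + 5 == 10000 is still allowed
```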


Solution

A possible workaround that I found is to use the scroll API. In practice I scroll with a size of 20 (one page) until I reach page 51711. It takes about 10 minutes because it has to scroll through all the preceding data before reaching records 1070100 to 1070120.

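from elasticsearch import Elasticsearch

url = "http://localhost:9200"
index = "civile"
pageLimit = 20

# Match-all bool query; only the listed fields are returned for each hit.
bodyPageAllDocBil = {"query": {"bool": {"must": [], "should": []}}, "_source": ["annoruolo", "annosentenza", "cf_giudice", "codiceoggetto", "controparte", "gradogiudizio", "nomegiudice", "parte", "distretto"]}

# One client is enough; it can be reused for every request.
es = Elasticsearch(url)

# Count the matching documents. Copy the body first, otherwise popping
# "_source" would also remove it from the search body below.
bodyCountAllDoc = dict(bodyPageAllDocBil)
bodyCountAllDoc.pop('_source', None)
res = es.count(index=index, body=bodyCountAllDoc)
sizeCount = res["count"]

bodyPageAllDocs = dict(bodyPageAllDocBil)
bodyPageAllDocs["size"] = pageLimit

page = 51711
paginationStart = (page - 1) * pageLimit

# The first search opens the scroll context and returns the first page.
docs = es.search(index=index, body=bodyPageAllDocs, scroll='10m')
scrollId = docs["_scroll_id"]
hits = docs["hits"]["hits"]
currentSize = 0  # number of records that come before the current batch

# Scroll forward one page at a time until the batch that starts at
# paginationStart, or until the results run out.
while hits and currentSize < paginationStart:
    currentSize = currentSize + len(hits)
    docs = es.scroll(scroll_id=scrollId, scroll='10m')
    scrollId = docs["_scroll_id"]
    hits = docs["hits"]["hits"]

# hits now holds the requested page (empty if the page is past the last record).
for hit in hits:
    print(hit)

# Release the scroll context on the server.
es.clear_scroll(scroll_id=scrollId)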
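
Most of the 10 minutes goes into the number of round trips: with 20 records per request, reaching page 51711 takes more than 50,000 scroll calls. One way to cut that down is to scroll in larger batches and slice the requested page out of the batch that contains it. The following is only a sketch under a few assumptions: `fetch_page`, its parameters, and the default `batch_size` of 1000 are illustrative names and values, not part of the original answer; `client` is assumed to be an `elasticsearch.Elasticsearch` instance (or any object exposing `search`, `scroll` and `clear_scroll` with the same keyword arguments); and `batch_size` must be a multiple of `page_size` so that a page never straddles two batches.

```python
def fetch_page(client, index, body, page, page_size, batch_size=1000, keep_alive="10m"):
    """Return the hits of a 1-based `page` by walking a scroll in large batches."""
    if batch_size % page_size != 0:
        raise ValueError("batch_size must be a multiple of page_size")

    start = (page - 1) * page_size         # offset of the first wanted record
    batches_to_skip = start // batch_size  # full batches that come before it
    offset_in_batch = start % batch_size   # position of the page inside its batch

    search_body = dict(body)               # shallow copy: the caller's dict is left alone
    search_body["size"] = batch_size

    # The initial search opens the scroll context and returns the first batch.
    docs = client.search(index=index, body=search_body, scroll=keep_alive)
    scroll_id = docs["_scroll_id"]
    try:
        for _ in range(batches_to_skip):
            if not docs["hits"]["hits"]:
                break                      # ran out of results before reaching the page
            docs = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = docs["_scroll_id"]
        hits = docs["hits"]["hits"]
        return hits[offset_in_batch:offset_in_batch + page_size]
    finally:
        # Always release the scroll context on the server.
        client.clear_scroll(scroll_id=scroll_id)
```

With the variables from the answer above, a call would look like `fetch_page(es, index, bodyPageAllDocBil, page=51711, page_size=pageLimit)`. With a batch size of 1000 that is roughly a thousand requests instead of more than fifty thousand. The trade-off is that each response is larger, so the batch size is worth tuning against the document size.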


Answered By - user18480960
Answer Checked By - Timothy Miller (PHPFixing Admin)