java - Elasticsearch improve query performance

- August 15, 2015

I'm trying to improve query performance. It takes an average of 3 seconds for simple queries that don't touch a nested document, and sometimes it's longer.

curl "http://searchbox:9200/global/user/_search?n=0&sort=influence:asc&q=user.name:bill%20smith"

Even without the sort it takes seconds. Here are the details of the cluster:

1.4 TB index size.
210m documents that aren't nested (about 10 KB each).
500m documents in total (nested documents are small: 2-5 fields).
128 segments per node.
3 nodes, m2.4xlarge (-Xmx set to 40g, machine memory is 60g).
3 shards.
Index is on Amazon EBS volumes.
Replication 0 (have tried replication 2 with little improvement).

I don't see any noticeable spikes in CPU, memory, etc. Any ideas how this could be improved?

Garry's points about heap space are true, but it's not heap space that's the issue here.

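With your current configuration, you'll have less than 60 GB of page cache available (3 nodes, each with 60 GB of memory minus a 40 GB heap) for a 1.5 TB index. With less than 4.2% of your index in the page cache, there's a high probability you'll need to hit disk for most of your searches.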
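The page cache estimate can be checked with a few lines of arithmetic. This is a rough sketch using the numbers from the question; `page_cache_fraction` is a hypothetical helper, and it assumes everything outside the JVM heap is available as page cache, which is an upper bound because the OS and other processes need memory too.

```python
def page_cache_fraction(nodes, ram_gb, heap_gb, index_gb):
    """Return (max page cache in GB, fraction of the index it can hold)."""
    # Whatever the JVM heap does not claim is, at best, left for the OS page cache.
    cache_gb = nodes * (ram_gb - heap_gb)
    return cache_gb, cache_gb / index_gb


# Numbers from the question: 3 nodes, 60 GB RAM, 40 GB heap, ~1.5 TB index.
cache_gb, fraction = page_cache_fraction(nodes=3, ram_gb=60, heap_gb=40, index_gb=1500)
print(f"{cache_gb} GB of page cache at most, {fraction:.1%} of the index")
```

Plugging in a smaller heap or more memory per node shows how quickly the cached share of the index changes.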

You probably want to add more memory to your cluster, and you'll want to think about the number of shards as well. Just sticking to the default can cause a skewed distribution. If you had 5 shards in this case, you'd have two machines with 40% of the data each, and a third with 20%. In either case, you'll be waiting for the slowest machine or disk when doing distributed searches. The article on Elasticsearch in production goes a bit more in depth on determining the right amount of memory.
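To see where the 40% / 40% / 20% split comes from, here is a small sketch. `shard_shares` is a hypothetical helper; it assumes all shards are the same size and that they are spread over the nodes as evenly as the counts allow.

```python
def shard_shares(num_shards, num_nodes):
    """Fraction of the data held by each node, assuming equal-size shards
    spread as evenly as possible over the nodes."""
    base, extra = divmod(num_shards, num_nodes)
    # The first `extra` nodes each carry one shard more than the rest.
    counts = [base + 1 if i < extra else base for i in range(num_nodes)]
    return [count / num_shards for count in counts]


print(shard_shares(5, 3))  # [0.4, 0.4, 0.2]
print(shard_shares(3, 3))  # one shard per node, an even split
```

A shard count that is a multiple of the node count keeps the shares equal, so no single machine ends up as the slowest by design.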

For this exact search example, you can probably use filters, though. You're sorting, and thus ignoring the score calculated by the query. With a filter, it'll be cached after the first run, and subsequent searches will be quick.
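As a rough illustration, the search from the question could be expressed as a request body with the condition in filter context. This sketch only builds the JSON and does not contact a cluster. It assumes an Elasticsearch version where the `bool` query accepts a `filter` clause (2.0 and later); on 1.x the equivalent was the `filtered` query. The field names `user.name` and `influence` come from the question, `build_filtered_search` is a hypothetical helper, and whether a `match` or a `term` clause is appropriate depends on how the field is mapped.

```python
import json


def build_filtered_search(name):
    """Build a search body that matches on user.name in filter context
    (no scoring) and sorts by influence, as in the question's query."""
    return {
        "query": {
            "bool": {
                # Filter clauses do not compute a score and can be cached.
                "filter": [
                    {"match": {"user.name": {"query": name, "operator": "and"}}}
                ]
            }
        },
        "sort": [{"influence": {"order": "asc"}}],
    }


body = build_filtered_search("bill smith")
print(json.dumps(body, indent=2))
```

The printed body can be sent to the `_search` endpoint with curl's `-d` option instead of passing `q=` and `sort=` in the URL.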
