Elasticsearch Fundamentals

Index and search data efficiently with Elasticsearch.

By EME
Published: February 20, 2025
Tags: elasticsearch, search, indexing, full-text search

A Simple Analogy

Elasticsearch works like an advanced library index. Instead of reading every book to find a topic, you look the topic up in an index that was built ahead of time and lists every book that mentions it.
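
To make the analogy concrete, here is a minimal sketch of an inverted index in plain C#. It is an illustration only and does not reflect how Elasticsearch or Lucene implement theirs; the `BuildIndex` helper and the sample documents are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample documents: document id -> text.
var docs = new Dictionary<int, string>
{
    [1] = "High-performance laptop",
    [2] = "Laptop sleeve with zipper",
    [3] = "Wireless mouse"
};

// Build a term -> sorted set of document ids map (the "library index").
static Dictionary<string, SortedSet<int>> BuildIndex(IReadOnlyDictionary<int, string> documents)
{
    var index = new Dictionary<string, SortedSet<int>>();
    foreach (var (id, text) in documents)
    {
        // Naive analysis: lowercase, then split on spaces and hyphens.
        var terms = text.ToLowerInvariant()
            .Split(new[] { ' ', '-' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var term in terms)
        {
            if (!index.TryGetValue(term, out var ids))
            {
                ids = new SortedSet<int>();
                index[term] = ids;
            }
            ids.Add(id);
        }
    }
    return index;
}

var invertedIndex = BuildIndex(docs);

// A lookup is a single dictionary access instead of a scan over every document.
var laptopDocs = invertedIndex.TryGetValue("laptop", out var hits)
    ? hits.ToList()
    : new List<int>();

Console.WriteLine(string.Join(", ", laptopDocs)); // 1, 2
```

The cost of building the index is paid once at indexing time, which is why lookups stay fast as the number of documents grows.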


Why Elasticsearch?

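  • Full-text search: Find matching words and phrases across all indexed documents
  • Near real-time indexing: New documents become searchable after the next index refresh (every second by default)
  • Scalability: Distribute data and queries across shards on multiple nodes
  • Relevance: Score and rank matching documents
  • Aggregations: Group and analyze data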
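
To give a feel for relevance scoring, the sketch below computes a textbook BM25 score for a single term. BM25 is Elasticsearch's default similarity, and `k1 = 1.2` and `b = 0.75` are its default parameters, but the `Bm25` helper and the sample numbers are hypothetical and Lucene's actual implementation differs in details.

```csharp
using System;

// Textbook BM25 for a single query term; k1 and b use the Elasticsearch defaults.
static double Bm25(int termFreq, int docLength, double avgDocLength,
                   int totalDocs, int docsWithTerm,
                   double k1 = 1.2, double b = 0.75)
{
    // Terms that appear in few documents get a higher inverse document frequency.
    double idf = Math.Log(1 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
    // Term frequency saturates, and longer-than-average documents are penalized.
    double norm = termFreq + k1 * (1 - b + b * docLength / avgDocLength);
    return idf * termFreq * (k1 + 1) / norm;
}

// Same term frequency, different document lengths and term rarity.
double shortDoc = Bm25(termFreq: 1, docLength: 3, avgDocLength: 10, totalDocs: 100, docsWithTerm: 5);
double longDoc = Bm25(termFreq: 1, docLength: 30, avgDocLength: 10, totalDocs: 100, docsWithTerm: 5);
double commonTerm = Bm25(termFreq: 1, docLength: 3, avgDocLength: 10, totalDocs: 100, docsWithTerm: 80);

Console.WriteLine($"rare term, short document: {shortDoc:F3}");
Console.WriteLine($"rare term, long document:  {longDoc:F3}");
Console.WriteLine($"common term, short document: {commonTerm:F3}");
```

A rare term in a short document scores highest, which matches the intuition that such a match is more likely to be what the searcher wants.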

Setup with Docker

# Start a single-node Elasticsearch instance for local development
docker run -d \
  -p 9200:9200 \
  -e discovery.type=single-node \
  -e ELASTIC_PASSWORD=password \
  -e xpack.security.enabled=true \
  docker.elastic.co/elasticsearch/elasticsearch:8.0.0

# Test connection
curl -u elastic:password http://localhost:9200

Indexing Documents

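using System;
using System.Collections.Generic;
using Elasticsearch.Net;
using Nest;

// Credentials match the Docker setup above.
// A NEST 7.x client (7.16 or later) needs the compatibility header to talk to Elasticsearch 8.x.
var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
    .BasicAuthentication("elastic", "password")
    .EnableApiVersioningHeader()
    .DefaultIndex("products");

var client = new ElasticClient(settings);

// Index a single product under an explicit document ID
var product = new Product
{
    Id = "1",
    Name = "Laptop",
    Description = "High-performance laptop",
    Price = 999.99m,
    Category = "Electronics"
};

var response = await client.IndexAsync(product, i => i.Id(product.Id));
if (!response.IsValid)
    Console.WriteLine($"Indexing failed: {response.DebugInformation}");

// Bulk index many products in a single request
var products = new List<Product> { product };
var bulkResponse = await client.BulkAsync(b => b
    .Index<Product>()
    .IndexMany(products));

// Document type used in the examples
public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public decimal Price { get; set; }
    public string Category { get; set; }
}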
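
For context, a bulk request travels over the wire as newline-delimited JSON: an action line followed by the document source, repeated per document, with a trailing newline. The sketch below builds such a body by hand with System.Text.Json to show the shape. The sample data is hypothetical, and the client library normally produces this body for you.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical sample data; anonymous objects stand in for the Product class.
var products = new[]
{
    new { id = "1", name = "Laptop", price = 999.99m },
    new { id = "2", name = "Mouse", price = 19.99m }
};

var body = new StringBuilder();
foreach (var p in products)
{
    // Action line: the operation, target index, and document id for the next line.
    var action = new Dictionary<string, object>
    {
        ["index"] = new Dictionary<string, object> { ["_index"] = "products", ["_id"] = p.id }
    };
    body.Append(JsonSerializer.Serialize(action)).Append('\n');

    // Source line: the document itself.
    body.Append(JsonSerializer.Serialize(p)).Append('\n');
}

// Every line, including the last, ends with a newline.
string ndjson = body.ToString();
Console.Write(ndjson);
```

Sending many documents in one request avoids a network round trip per document, which is the main reason to prefer bulk indexing for larger loads.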

Searching

// Simple search: match "laptop" in the Name field
var simpleResponse = await client.SearchAsync<Product>(s => s
    .Query(q => q
        .Match(m => m
            .Field(f => f.Name)
            .Query("laptop")
        )
    )
);

// Multiple criteria: a full-text match plus a price range filter.
// Filter clauses narrow the results without affecting relevance scores.
var filteredResponse = await client.SearchAsync<Product>(s => s
    .Query(q => q
        .Bool(b => b
            .Must(must => must
                .Match(m => m
                    .Field(f => f.Description)
                    .Query("laptop")
                )
            )
            .Filter(filter => filter
                .Range(r => r
                    .Field(p => p.Price)
                    .GreaterThanOrEquals(500)
                    .LessThanOrEquals(1500)
                )
            )
        )
    )
);

foreach (var hit in filteredResponse.Documents)
{
    Console.WriteLine($"{hit.Name}: ${hit.Price}");
}
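
It can help to see the request body that a fluent query like the bool query above roughly corresponds to. The sketch below rebuilds that JSON with anonymous objects and System.Text.Json. It assumes the client's default camel-casing of property names (`Description` becomes `description`), and the exact body the client sends may differ in minor details.

```csharp
using System;
using System.Text.Json;

// Anonymous objects mirror the JSON structure of the search request body.
var request = new
{
    query = new
    {
        @bool = new
        {
            // Scored clause: full-text match on the description field.
            must = new object[]
            {
                new { match = new { description = new { query = "laptop" } } }
            },
            // Unscored clause: keep only prices between 500 and 1500 inclusive.
            filter = new object[]
            {
                new { range = new { price = new { gte = 500, lte = 1500 } } }
            }
        }
    }
};

string json = JsonSerializer.Serialize(request, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
```

Reading the JSON form is useful when comparing a query against the Elasticsearch documentation or replaying it with curl.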

Aggregations

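// Group by category.
// Assumes the default dynamic mapping, where Category is a text field with a
// "keyword" subfield; terms aggregations need the keyword subfield.
var categoryResponse = await client.SearchAsync<Product>(s => s
    .Size(0) // return only aggregation results, no hits
    .Aggregations(a => a
        .Terms("categories", t => t
            .Field(f => f.Category.Suffix("keyword"))
            .Size(10)
        )
    )
);

var categoryBuckets = categoryResponse.Aggregations.Terms("categories");
foreach (var bucket in categoryBuckets.Buckets)
{
    Console.WriteLine($"{bucket.Key}: {bucket.DocCount} products");
}

// Average price by category, using a sub-aggregation inside each bucket
var avgResponse = await client.SearchAsync<Product>(s => s
    .Size(0)
    .Aggregations(a => a
        .Terms("categories", t => t
            .Field(f => f.Category.Suffix("keyword"))
            .Aggregations(aa => aa
                .Average("avgPrice", av => av
                    .Field(f => f.Price)
                )
            )
        )
    )
);

foreach (var bucket in avgResponse.Aggregations.Terms("categories").Buckets)
{
    var avgPrice = bucket.Average("avgPrice");
    Console.WriteLine($"{bucket.Key}: average price {avgPrice?.Value:F2}");
}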
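
As a mental model, a terms aggregation with an average sub-aggregation computes roughly what the in-memory LINQ query below does: group by a field, count each group, and average a value per group. This is a sketch over hypothetical sample data. Elasticsearch performs the work across shards, and its counts can be approximate on large distributed indices.

```csharp
using System;
using System.Linq;

// Hypothetical sample data as (Name, Category, Price) tuples.
var products = new[]
{
    (Name: "Laptop", Category: "Electronics", Price: 1000m),
    (Name: "Monitor", Category: "Electronics", Price: 200m),
    (Name: "Desk", Category: "Furniture", Price: 300m)
};

var buckets = products
    .GroupBy(p => p.Category)
    .Select(g => (Key: g.Key, DocCount: g.Count(), AvgPrice: g.Average(p => p.Price)))
    // Terms aggregations order buckets by document count, largest first, by default.
    .OrderByDescending(b => b.DocCount)
    // Equivalent of Size(10): keep at most ten buckets.
    .Take(10)
    .ToList();

foreach (var bucket in buckets)
{
    Console.WriteLine($"{bucket.Key}: {bucket.DocCount} products, average price {bucket.AvgPrice}");
}
```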

Analyzers

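// Create an index with custom analyzers.
// Credentials match the Docker setup above.
var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
    .BasicAuthentication("elastic", "password");
var client = new ElasticClient(settings);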

var createResponse = await client.Indices.CreateAsync("articles", c => c
    .Settings(s => s
        .Analysis(a => a
            .Analyzers(aa => aa
                .Custom("email_analyzer", ca => ca
                    .Tokenizer("keyword")
                    .Filters("lowercase")
                )
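                // Named differently from the built-in "standard" analyzer so it does not shadow it
                .Standard("standard_english", sa => sa
                    .StopWords("_english_")
                )
            )
        )
    )
    .Map<Article>(m => m
        .Properties(p => p
            .Text(t => t
                .Name(n => n.Title)
                .Analyzer("standard_english")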
            )
            .Text(t => t
                .Name(n => n.Email)
                .Analyzer("email_analyzer")
            )
        )
    )
);
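
To illustrate what the two analyzers do to incoming text, here is a rough in-memory approximation. It is a sketch only: the real standard tokenizer follows Unicode text segmentation rules rather than a simple regular expression, the stop word set below is an abbreviated stand-in for the `_english_` list, and both helper functions are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Abbreviated stand-in for the _english_ stop word list.
var stopWords = new HashSet<string> { "a", "an", "and", "in", "is", "of", "the", "to" };

// keyword tokenizer + lowercase filter: the whole input becomes one lowercased token.
static string[] EmailAnalyzer(string input) => new[] { input.ToLowerInvariant() };

// Rough approximation of standard tokenizer + lowercase filter + stop filter.
string[] StandardWithStopWords(string input) =>
    Regex.Split(input.ToLowerInvariant(), @"[^\p{L}\p{N}]+")
        .Where(t => t.Length > 0 && !stopWords.Contains(t))
        .ToArray();

var emailTokens = EmailAnalyzer("Jane.Doe@Example.com");
var titleTokens = StandardWithStopWords("The Basics of Elasticsearch");

Console.WriteLine(string.Join(" | ", emailTokens)); // jane.doe@example.com
Console.WriteLine(string.Join(" | ", titleTokens)); // basics | elasticsearch
```

Keeping an email address as a single token means it only matches as a whole, while the title is broken into individual searchable words.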

Best Practices

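  1. Shard sizing: Keep shards evenly sized, and avoid both many tiny shards and a few oversized ones
  2. Mapping: Define field types explicitly instead of relying on dynamic mapping
  3. Refresh interval: Balance indexing throughput against search freshness; a longer index.refresh_interval speeds up heavy indexing
  4. Backups: Take regular snapshots for disaster recovery
  5. Monitoring: Track cluster health (green, yellow, red)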
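
As a small worked example for the shard sizing point, the sketch below estimates how many primary shards an index needs to stay near a target shard size. The `EstimatePrimaryShards` helper is hypothetical and the 30 GB default target is an assumption chosen for illustration; pick a target that suits your own workload and hardware.

```csharp
using System;

// Hypothetical helper: primary shards needed to keep each shard near a target size.
static int EstimatePrimaryShards(double indexSizeGb, double targetShardSizeGb = 30)
{
    // An index always has at least one primary shard.
    if (indexSizeGb <= 0) return 1;
    return Math.Max(1, (int)Math.Ceiling(indexSizeGb / targetShardSizeGb));
}

Console.WriteLine(EstimatePrimaryShards(5));   // 1
Console.WriteLine(EstimatePrimaryShards(200)); // 7
```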

Related Concepts

  • Lucene scoring
  • Inverted indexes
  • Sharding and replication
  • Kibana visualization

Summary

Elasticsearch enables powerful full-text search across large datasets. Use proper indexing, analyzers, and aggregations to extract insights from text data.