Indexing and Updating Documents with the Elasticsearch REST API

Master the core Create, Read, Update, Delete (CRUD) operations in Elasticsearch using the REST API. This guide details the precise HTTP requests, endpoints, and JSON payloads required for indexing new documents (with or without specified IDs) and performing granular, partial updates on existing records. Learn practical curl examples for atomic updates, scripted modifications, and efficient bulk data ingestion.

Indexing and updating documents with the Elasticsearch REST API looks easy at first: send JSON to an endpoint and get JSON back. The parts that matter in production are more specific. Are you creating a new document or replacing an old one? Do you want Elasticsearch to generate the ID, or do you need an ID from your source system? Are you doing a partial update, or are you accidentally overwriting fields you meant to keep?

The examples below use localhost:9200, an index named products, and plain curl. A real cluster may also require HTTPS, authentication, and a CA certificate, for example:

curl -u elastic:password --cacert http_ca.crt https://es.example.com:9200/

Create a document with an automatic ID

Use POST /{index}/_doc when Elasticsearch can assign the document ID.

curl -X POST "localhost:9200/products/_doc" \
  -H "Content-Type: application/json" \
  -d '{
    "sku": "mouse-x1",
    "name": "Wireless Mouse X1",
    "price": 25.99,
    "in_stock": true,
    "updated_at": "2026-05-24T09:30:00Z"
  }'

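A successful response includes a generated _id and a result of created. The response below is abbreviated; fields such as _shards, _seq_no, and _primary_term are omitted.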

{
  "_index": "products",
  "_id": "wI4aQZkB8xExample",
  "_version": 1,
  "result": "created"
}

Automatic IDs are fine for event data, logs, click records, and append-only content. They are not ideal when the same real-world item may arrive again later. If the same product update is indexed twice with automatic IDs, you get two documents.
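
When the source system does have a stable key, one way to avoid such duplicates is to derive the document ID from that key, so a repeated record targets the same document. The sketch below is only an illustration: doc_id_for and doc_path are hypothetical helper names, and the "sku-" prefix is an assumed naming convention, not something Elasticsearch requires.

```python
from urllib.parse import quote


def doc_id_for(sku: str) -> str:
    """Derive a stable document ID from a source SKU (assumed 'sku-' prefix)."""
    return f"sku-{sku}"


def doc_path(index: str, doc_id: str) -> str:
    """Build the path for PUT /{index}/_doc/{id}, percent-encoding the ID."""
    return f"/{index}/_doc/{quote(doc_id, safe='')}"


# The same SKU always maps to the same path, so a re-sent record
# overwrites one document instead of creating a second one.
print(doc_path("products", doc_id_for("mouse-x1")))
```

Percent-encoding matters when source keys can contain characters such as slashes or spaces, which would otherwise change the meaning of the URL path.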

Index with your own ID

Use PUT /{index}/_doc/{id} when your source system already has a stable identifier.

curl -X PUT "localhost:9200/products/_doc/sku-mouse-x1" \
  -H "Content-Type: application/json" \
  -d '{
    "sku": "mouse-x1",
    "name": "Wireless Mouse X1",
    "price": 25.99,
    "in_stock": true,
    "updated_at": "2026-05-24T09:30:00Z"
  }'

If the ID does not exist, Elasticsearch creates the document. If it already exists, Elasticsearch replaces the whole document and reports a result of updated. That replacement behavior is useful for simple synchronization jobs, but it also removes any field you forgot to send.

For example, if the existing document has category, brand, and description, and your next PUT only sends price, the stored _source becomes only the fields in that new request. Use PUT when you are sending the complete desired document.
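
A small pre-flight check can make this risk visible before a PUT is sent. The following is a sketch under assumptions: fields_lost_on_replace is a hypothetical helper, it only compares top-level field names, and the stored document is assumed to have been fetched beforehand (for example with GET by ID).

```python
def fields_lost_on_replace(existing: dict, replacement: dict) -> list:
    """Return top-level fields in the stored document that the new body omits."""
    return sorted(set(existing) - set(replacement))


# Illustrative documents; the values are made up for this example.
stored = {
    "category": "accessories",
    "brand": "Acme",
    "description": "A wireless mouse",
    "price": 25.99,
}
incoming = {"price": 22.49}

# A full replacement with `incoming` would drop these fields.
print(fields_lost_on_replace(stored, incoming))
# prints ['brand', 'category', 'description']
```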

Create only if missing

If duplicate creation would be a bug, use the create operation:

curl -X PUT "localhost:9200/products/_create/sku-mouse-x1" \
  -H "Content-Type: application/json" \
  -d '{
    "sku": "mouse-x1",
    "name": "Wireless Mouse X1",
    "price": 25.99
  }'

If the ID already exists, Elasticsearch returns HTTP 409 with a version conflict error instead of overwriting the document. This is useful for ingestion jobs where retrying a message should not change an existing record.
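
In a retrying ingestion job, the conflict is often an expected outcome rather than a failure. The sketch below shows one way a client might interpret the HTTP status of a create request. classify_create is a hypothetical helper, and treating a conflict as "already_exists" is an assumption that only holds when a repeated message carries the same content.

```python
def classify_create(status_code: int) -> str:
    """Map the HTTP status of PUT /{index}/_create/{id} to an ingestion outcome."""
    if status_code == 201:
        return "created"
    if status_code == 409:
        # Version conflict: a document with this ID is already stored.
        return "already_exists"
    return "failed"


for status in (201, 409, 503):
    print(status, classify_create(status))
```

With this shape, a retry loop can acknowledge the message on either "created" or "already_exists" and retry only on "failed".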

Partially update fields

Use POST /{index}/_update/{id} with a doc object when you only want to change some fields.

curl -X POST "localhost:9200/products/_update/sku-mouse-x1" \
  -H "Content-Type: application/json" \
  -d '{
    "doc": {
      "price": 22.49,
      "in_stock": false,
      "updated_at": "2026-05-24T10:15:00Z"
    }
  }'

Elasticsearch fetches the existing document internally, merges the provided fields, and indexes the changed source again. It is a partial update from the API user's point of view, not an in-place mutation on disk.
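
The merge step can be modeled in a few lines. This is an in-memory illustration of the documented behavior (objects are merged recursively, while scalar values and arrays are replaced), not Elasticsearch code. merge_partial is a hypothetical name, and the dimensions and tags fields are made up for the example.

```python
import copy


def merge_partial(source: dict, doc: dict) -> dict:
    """Return a copy of source with doc merged in, the way a doc update behaves."""
    merged = copy.deepcopy(source)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars and arrays replace the stored value entirely.
            merged[key] = copy.deepcopy(value)
    return merged


stored = {
    "name": "Wireless Mouse X1",
    "price": 25.99,
    "dimensions": {"width_cm": 6, "height_cm": 3},
    "tags": ["wireless", "mouse"],
}
update = {"price": 22.49, "dimensions": {"height_cm": 4}, "tags": ["sale"]}

print(merge_partial(stored, update))
```

Note that the array in tags is replaced, not appended to. Appending to an array requires a scripted update.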

If the document might not exist yet, use doc_as_upsert:

curl -X POST "localhost:9200/products/_update/sku-keyboard-k90" \
  -H "Content-Type: application/json" \
  -d '{
    "doc": {
      "sku": "keyboard-k90",
      "name": "Mechanical Keyboard K90",
      "price": 129.99,
      "in_stock": true
    },
    "doc_as_upsert": true
  }'

This creates the document with the doc content if it is missing. Use it when your upstream feed sends "current state" records and you do not care whether this is the first time Elasticsearch has seen the ID.

Scripted updates

Scripts are useful when the new value depends on the old value. A common example is incrementing a counter:

curl -X POST "localhost:9200/products/_update/sku-mouse-x1" \
  -H "Content-Type: application/json" \
  -d '{
    "script": {
      "source": "ctx._source.views = (ctx._source.views ?: 0) + params.count",
      "params": {
        "count": 1
      }
    }
  }'

Keep scripts small and predictable. They are powerful, but they are also easier to misuse than a plain doc update. For high-volume counters, think carefully about write rate, refresh needs, and whether Elasticsearch should be the system of record.
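
One habit that keeps scripts predictable is to hold the script source constant and pass changing values through params, which also lets Elasticsearch reuse the compiled script. A minimal sketch of building such a request body follows; increment_views_body is a hypothetical helper, and the script text is the one from the curl example above.

```python
import json

# Constant script source: only the params change between requests.
INCREMENT_SCRIPT = "ctx._source.views = (ctx._source.views ?: 0) + params.count"


def increment_views_body(count: int) -> str:
    """Build the JSON body for POST /{index}/_update/{id} that adds count to views."""
    body = {
        "script": {
            "source": INCREMENT_SCRIPT,
            "params": {"count": count},
        }
    }
    return json.dumps(body)


print(increment_views_body(1))
```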

Bulk indexing and updates

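For more than a handful of documents, use _bulk. The request body is newline-delimited JSON (NDJSON): each action line is followed by its source line where the action needs one, and the last line must end with a newline character.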

cat bulk-products.ndjson
{"index":{"_index":"products","_id":"sku-mouse-x1"}}
{"sku":"mouse-x1","name":"Wireless Mouse X1","price":25.99,"in_stock":true}
{"update":{"_index":"products","_id":"sku-keyboard-k90"}}
{"doc":{"price":119.99,"in_stock":true},"doc_as_upsert":true}
{"delete":{"_index":"products","_id":"sku-old-cable"}}
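
A file like this is usually generated rather than written by hand. The sketch below shows one way to build the body so that every line is compact JSON and the final newline is always present. build_bulk_body is a hypothetical helper, and its input shape (a list of action and payload pairs) is an assumption made for this example.

```python
import json


def build_bulk_body(operations) -> str:
    """Serialize (action, payload) pairs to NDJSON; payload is None for delete."""
    lines = []
    for action, payload in operations:
        lines.append(json.dumps(action, separators=(",", ":")))
        if payload is not None:
            lines.append(json.dumps(payload, separators=(",", ":")))
    if not lines:
        return ""
    # The trailing newline after the last line is required by _bulk.
    return "\n".join(lines) + "\n"


operations = [
    ({"index": {"_index": "products", "_id": "sku-mouse-x1"}},
     {"sku": "mouse-x1", "name": "Wireless Mouse X1", "price": 25.99, "in_stock": True}),
    ({"update": {"_index": "products", "_id": "sku-keyboard-k90"}},
     {"doc": {"price": 119.99, "in_stock": True}, "doc_as_upsert": True}),
    ({"delete": {"_index": "products", "_id": "sku-old-cable"}}, None),
]

with open("bulk-products.ndjson", "w", encoding="utf-8") as handle:
    handle.write(build_bulk_body(operations))
```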

Send it like this:

curl -X POST "localhost:9200/_bulk" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @bulk-products.ndjson

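Use --data-binary, not plain -d, so curl preserves newlines. Then inspect the response: a bulk request can return HTTP 200 even when individual items failed. The top-level errors flag tells you whether any item failed, and each entry in items carries its own status and, on failure, an error object. With jq, print the flag and then only the failed items. The parentheses matter, because without them select would also be applied to the errors value and jq would stop with an error:

curl -s -X POST "localhost:9200/_bulk" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @bulk-products.ndjson | jq '.errors, (.items[] | select(.index.error or .update.error or .delete.error))'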
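
The same per-item check can be done in application code. The following is a minimal sketch, assuming the bulk response has already been parsed into a dict. failed_bulk_items is a hypothetical helper, and the sample response is hand-written and abbreviated for illustration, not captured from a real cluster.

```python
def failed_bulk_items(response: dict) -> list:
    """Return one summary dict per bulk item that carries an error object."""
    failures = []
    if not response.get("errors"):
        return failures
    for item in response.get("items", []):
        # Each item has a single key naming the action: index, create, update, or delete.
        for action, result in item.items():
            if "error" in result:
                failures.append({
                    "action": action,
                    "_id": result.get("_id"),
                    "status": result.get("status"),
                    "type": result["error"].get("type"),
                })
    return failures


sample_response = {
    "errors": True,
    "items": [
        {"index": {"_index": "products", "_id": "sku-mouse-x1", "status": 201}},
        {"update": {"_index": "products", "_id": "sku-keyboard-k90", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
        {"delete": {"_index": "products", "_id": "sku-old-cable", "status": 200}},
    ],
}

for failure in failed_bulk_items(sample_response):
    print(failure)
```

Iterating over the action key instead of naming index, update, and delete explicitly means create actions are covered as well.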

Refresh and visibility

An indexed document is not always searchable immediately. New writes become visible to search only after a refresh, which Elasticsearch performs periodically by default (about once per second on indices that are being searched). Reading by ID is different: GET /products/_doc/sku-mouse-x1 is real-time and can find the document right away. If you need the document to appear in search before continuing a test, use refresh=wait_for:

curl -X PUT "localhost:9200/products/_doc/sku-mouse-x1?refresh=wait_for" \
  -H "Content-Type: application/json" \
  -d '{"sku":"mouse-x1","name":"Wireless Mouse X1","price":25.99}'

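Do not add refresh=true or refresh=wait_for to every production write without measuring the cost. Forced refreshes create many small segments and can reduce indexing throughput, and wait_for holds each request open until the next refresh.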

A practical rule of thumb

Use POST /{index}/_doc for append-only data where duplicate events are acceptable or handled elsewhere. Use PUT /{index}/_doc/{id} when the request contains the complete current document. Use POST /{index}/_update/{id} when you only want to change certain fields. Use PUT /{index}/_create/{id} when overwriting would hide a data problem. Use _bulk when the work is more than a few documents.

Most Elasticsearch indexing bugs come from choosing the wrong write shape. The endpoint should match your source data, your retry behavior, and whether Elasticsearch is storing a complete record or a searchable copy of another system's record.

Check mappings before blaming the write request

Sometimes the write succeeds and the search result still looks wrong. That is often a mapping issue, not an indexing endpoint issue.

Check the mapping:

curl -s "localhost:9200/products/_mapping?pretty"

If price was first indexed as text because an early test document sent "25.99" as a string, range queries may behave badly. If sku needs exact matching and aggregations, it should usually be a keyword field or have a keyword subfield. If timestamps arrive in different formats, some documents may be rejected during indexing while others succeed.

For predictable systems, create the index mapping before the first write:

curl -X PUT "localhost:9200/products" \
  -H "Content-Type: application/json" \
  -d '{
    "mappings": {
      "properties": {
        "sku": { "type": "keyword" },
        "name": { "type": "text" },
        "price": { "type": "double" },
        "in_stock": { "type": "boolean" },
        "updated_at": { "type": "date" }
      }
    }
  }'

Dynamic mapping is convenient for exploration. For production ingestion, explicit mappings prevent a bad first document from shaping the index.
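
A deployment or test step can also compare the live mapping with the expected field types. The sketch below assumes the response of GET /products/_mapping has been parsed into a dict. mapping_mismatches is a hypothetical helper that only checks top-level fields, and the sample mapping is hand-written to show a price field that dynamic mapping turned into text.

```python
def mapping_mismatches(mapping_response: dict, index: str, expected: dict) -> dict:
    """Return {field: (actual_type, expected_type)} for fields whose type differs."""
    properties = mapping_response[index]["mappings"].get("properties", {})
    mismatches = {}
    for field, wanted in expected.items():
        actual = properties.get(field, {}).get("type")  # None if the field is unmapped
        if actual != wanted:
            mismatches[field] = (actual, wanted)
    return mismatches


expected_types = {"sku": "keyword", "price": "double", "updated_at": "date"}

sample_mapping = {
    "products": {
        "mappings": {
            "properties": {
                "sku": {"type": "keyword"},
                "price": {"type": "text",
                          "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
                "updated_at": {"type": "date"},
            }
        }
    }
}

print(mapping_mismatches(sample_mapping, "products", expected_types))
# prints {'price': ('text', 'double')}
```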

Handle conflicts and retries deliberately

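Concurrent updates can conflict. If two workers update the same document at nearly the same time, the later write may be based on a version that has already been superseded, and Elasticsearch rejects it with a version conflict. For simple update retries, Elasticsearch supports retry_on_conflict, which repeats the fetch, merge, and index cycle up to the given number of times: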

curl -X POST "localhost:9200/products/_update/sku-mouse-x1?retry_on_conflict=3" \
  -H "Content-Type: application/json" \
  -d '{
    "doc": {
      "price": 21.99
    }
  }'

That is useful for low-risk updates, but it is not a replacement for clear ownership. If one system owns product names and another owns inventory, consider updating different fields through well-defined pipelines. If event order matters, include source timestamps and reject older updates in your ingestion layer or with a careful script.
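
An ingestion-layer check for stale updates might look like the sketch below. It assumes both timestamps are ISO 8601 strings like the updated_at values used earlier, and that the stored value was read from the existing document. should_apply and parse_timestamp are hypothetical names. The check is only safe when a single writer handles each document ID, since two writers could both pass it at the same moment.

```python
from datetime import datetime
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    # datetime.fromisoformat rejects 'Z' before Python 3.11, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def should_apply(stored_updated_at: Optional[str], incoming_updated_at: str) -> bool:
    """Apply an update only if it is strictly newer than what is stored."""
    if stored_updated_at is None:
        return True  # nothing stored yet, so any update is the newest
    return parse_timestamp(incoming_updated_at) > parse_timestamp(stored_updated_at)


print(should_apply("2026-05-24T10:15:00Z", "2026-05-24T09:30:00Z"))  # prints False
print(should_apply("2026-05-24T09:30:00Z", "2026-05-24T10:15:00Z"))  # prints True
```

Updates with an equal timestamp are skipped here, which makes redelivered messages harmless. Whether that is the right tie-breaking rule depends on the source system.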

Read after write during debugging

When testing an indexing flow, fetch the document by ID immediately:

curl -s "localhost:9200/products/_doc/sku-mouse-x1?pretty"

Then search for it:

curl -s "localhost:9200/products/_search?pretty" \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "term": {
        "sku": "mouse-x1"
      }
    }
  }'

If GET by ID works but search does not, think refresh, mapping, analyzer, or query type. If both fail, think indexing error, wrong index, wrong ID, authentication, or routing.

For applications, log the target index, document ID, response status, and item-level bulk errors. You do not need to log every full document forever, but during rollout those fields save hours.