== Indexing Documents

When you add documents to Elasticsearch, you index JSON documents. This maps naturally to PHP associative arrays, since
they can easily be encoded in JSON. Therefore, in Elasticsearch-PHP you create and pass associative arrays to the client
for indexing. There are several methods of ingesting data into Elasticsearch, which we will cover here.
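
To make the mapping concrete, here is a minimal sketch that encodes an associative array with PHP's built-in
`json_encode()`. The document and its field names are hypothetical; when you pass an array as `body`, the client
performs a comparable serialization for you.

[source,php]
----
// Hypothetical document; the field names are for illustration only.
$document = [
    'testField' => 'abc',
    'tags'      => ['php', 'search'],
    'author'    => ['name' => 'Jane', 'age' => 42],
];

// Associative arrays become JSON objects, list arrays become JSON arrays.
$json = json_encode($document);

echo $json, PHP_EOL;
// {"testField":"abc","tags":["php","search"],"author":{"name":"Jane","age":42}}
----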

=== Single document indexing

When indexing a document, you can either provide an ID or let Elasticsearch generate one for you.

{zwsp} +

.Providing an ID value
[source,php]
----
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'body' => ['testField' => 'abc']
];

// Document will be indexed to my_index/my_type/my_id
$response = $client->index($params);
----
{zwsp} +

.Omitting an ID value
[source,php]
----
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'body' => ['testField' => 'abc']
];

// Document will be indexed to my_index/my_type/<autogenerated ID>
$response = $client->index($params);
----
{zwsp} +

If you need to set other parameters, such as a `routing` value, specify them in the array alongside `index`,
`type`, and so on. For example, the following sets the routing and timestamp of the new document:

.Additional parameters
[source,php]
----
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'routing' => 'company_xyz',
    // One day ago, as a Unix timestamp
    'timestamp' => strtotime('-1 day'),
    'body' => ['testField' => 'abc']
];

$response = $client->index($params);
----
{zwsp} +

=== Bulk Indexing

Elasticsearch also supports bulk indexing of documents. The bulk API expects JSON action/metadata pairs, separated by
newlines. When constructing your documents in PHP, the process is similar: you first create an action array
(e.g. an `index` action), then you create a document body array. This process repeats for all your documents.

A simple example might look like this:

.Bulk indexing with PHP arrays
[source,php]
----
$params = ['body' => []];

for ($i = 0; $i < 100; $i++) {
    // Action/metadata entry for the document that follows
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
            '_type' => 'my_type',
        ]
    ];

    // Document body
    $params['body'][] = [
        'my_field' => 'my_value',
        'second_field' => 'some more values'
    ];
}

$responses = $client->bulk($params);
----
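
For reference, the newline-delimited payload that the bulk API expects can be sketched by hand. The helper below,
`toBulkPayload()`, is hypothetical and only illustrates the wire format; in normal use the client serializes
`$params['body']` for you, so you would not call anything like this yourself.

[source,php]
----
// Hypothetical helper, not part of Elasticsearch-PHP: encodes each action or
// document array as one JSON line, which is the shape the bulk API expects.
function toBulkPayload(array $body): string
{
    $lines = array_map('json_encode', $body);

    // The bulk payload must end with a trailing newline.
    return implode("\n", $lines) . "\n";
}

$body = [
    ['index' => ['_index' => 'my_index', '_type' => 'my_type', '_id' => 1]],
    ['my_field' => 'my_value'],
];

echo toBulkPayload($body);
// {"index":{"_index":"my_index","_type":"my_type","_id":1}}
// {"my_field":"my_value"}
----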

In practice, you'll likely have more documents than you want to send in a single bulk request. In that case, you need
to batch up the requests and periodically send them:

.Bulk indexing with batches
[source,php]
----
$params = ['body' => []];

for ($i = 1; $i <= 1234567; $i++) {
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
            '_type' => 'my_type',
            '_id' => $i
        ]
    ];

    $params['body'][] = [
        'my_field' => 'my_value',
        'second_field' => 'some more values'
    ];

    // Every 1000 documents stop and send the bulk request
    if ($i % 1000 == 0) {
        $responses = $client->bulk($params);

        // erase the old bulk request
        $params = ['body' => []];

        // unset the bulk response when you are done to save memory
        unset($responses);
    }
}

// Send the last batch if it exists
if (!empty($params['body'])) {
    $responses = $client->bulk($params);
}
----
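
Before unsetting `$responses`, you may want to check it for per-document failures. The sketch below assumes the bulk
response array carries a top-level `errors` flag and an `items` list whose entries are keyed by action name.
`failedBulkItems()` and the sample response are hypothetical, written by hand for illustration rather than captured
from a cluster.

[source,php]
----
// Hypothetical helper: returns the items of a bulk response that carry an error.
function failedBulkItems(array $response): array
{
    // Assumed shape: 'errors' is false when every action succeeded.
    if (empty($response['errors'])) {
        return [];
    }

    $failed = [];
    foreach ($response['items'] as $item) {
        // Each item is keyed by its action name, e.g. 'index'.
        $result = reset($item);
        if (isset($result['error'])) {
            $failed[] = $result;
        }
    }

    return $failed;
}

// Hand-written sample response, for illustration only.
$sample = [
    'errors' => true,
    'items'  => [
        ['index' => ['_id' => '1', 'status' => 201]],
        ['index' => ['_id' => '2', 'status' => 400, 'error' => 'hypothetical failure']],
    ],
];

$failed = failedBulkItems($sample);
echo count($failed), PHP_EOL; // 1
----

A helper like this could be called inside the batching loop right after `$client->bulk($params)`, so failed items are
logged or retried before the response is discarded.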

== Getting Documents

Elasticsearch provides realtime GETs of documents. This means that as soon as the document has been indexed and your
client receives an acknowledgement, you can immediately retrieve it from any shard. Get operations are
performed by requesting a document by its full `index/type/id` path:

[source,php]
----
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id'
];

// Get doc at /my_index/my_type/my_id
$response = $client->get($params);
----
{zwsp} +

== Updating Documents

Updating a document allows you to either completely replace the contents of the existing document, or perform a partial
update to just some fields (either changing an existing field, or adding new fields).

=== Partial document update

To partially update a document (e.g. change an existing field, or add a new one), specify
the `doc` in the `body` parameter. The fields in `doc` are merged with the existing document:

[source,php]
----
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'body' => [
        'doc' => [
            'new_field' => 'abc'
        ]
    ]
];

// Update doc at /my_index/my_type/my_id
$response = $client->update($params);
----
{zwsp} +

=== Scripted document update

Sometimes you need to perform a scripted update, such as incrementing a counter or appending a new value to an array.
To perform a scripted update, you need to provide a script and (usually) a set of parameters:

[source,php]
----
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'body' => [
        'script' => 'ctx._source.counter += count',
        'params' => [
            'count' => 4
        ]
    ]
];

$response = $client->update($params);
----
{zwsp} +

=== Upserts

Upserts are "Update or Insert" operations. An upsert attempts to run your update script against the existing
document, but if the document does not exist, the contents of `upsert` are indexed as a new document instead.

[source,php]
----
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id',
    'body' => [
        'script' => 'ctx._source.counter += count',
        'params' => [
            'count' => 4
        ],
        'upsert' => [
            'counter' => 1
        ]
    ]
];

$response = $client->update($params);
----
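
The outcome of the request above depends on whether the document already exists. The following is a plain-PHP model
of that decision, not client code: `applyUpsert()` is a hypothetical function, and it assumes that a missing document
receives the `upsert` body unchanged while an existing document has the script applied to it.

[source,php]
----
// Plain-PHP model, not part of Elasticsearch-PHP. $existing is the stored
// document source, or null when the document does not exist.
function applyUpsert(?array $existing, array $upsert, int $count): array
{
    if ($existing === null) {
        // No document yet: the upsert body is stored as-is, the script is skipped.
        return $upsert;
    }

    // Document exists: model the script 'ctx._source.counter += count'.
    $existing['counter'] += $count;

    return $existing;
}

// Missing document: the counter starts at the upsert value.
var_dump(applyUpsert(null, ['counter' => 1], 4) === ['counter' => 1]); // bool(true)

// Existing document: the counter is incremented by count.
var_dump(applyUpsert(['counter' => 10], ['counter' => 1], 4) === ['counter' => 14]); // bool(true)
----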
{zwsp} +

== Deleting Documents

Finally, you can delete documents by specifying their full `/index/type/id` path:

[source,php]
----
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'id' => 'my_id'
];

// Delete doc at /my_index/my_type/my_id
$response = $client->delete($params);
----
{zwsp} +