Elasticsearch Study Notes (5): Quick-Start Case Study, Product Management for an E-commerce Site: Cluster Health Checks and Document CRUD
Elasticsearch and Kibana are now installed and running, so let's get hands-on.
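1. Document data format
First, let's look at why ES is document-oriented and what that buys us.
(1) The data structures in a typical application are object-oriented and often complex. Storing such an object in a relational database means taking it apart and flattening it into several tables, and every query then has to reassemble the rows into an object again, which is tedious.
(2) ES is document-oriented. The data structure stored in a document is the same as the object's data structure. On top of this document structure, ES provides sophisticated indexing, full-text search, aggregation and analysis.
(3) Under the hood, an ES document is expressed as JSON. The advantages of JSON need no introduction; here is an article on the topic: https://blog.csdn.net/it_drea...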
The object's data structure:

    import java.util.Date;

    // Getters and setters are omitted for brevity.
    public class Employee {
        private String email;
        private String firstName;
        private String lastName;
        private EmployeeInfo info;
        private Date joinDate;
    }

    public class EmployeeInfo {
        private String bio;
        private Integer age;
        private String[] interests;
    }

    // Building an Employee object (inside some method):
    EmployeeInfo info = new EmployeeInfo();
    info.setBio("curious and modest");
    info.setAge(30);
    info.setInterests(new String[]{"bike", "climb"});

    Employee employee = new Employee();
    employee.setEmail("[email protected]");
    employee.setFirstName("san");
    employee.setLastName("zhang");
    employee.setInfo(info);
    employee.setJoinDate(new Date());
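In a relational database this needs two tables, employee and employee_info. The employee object has to be split into Employee data and EmployeeInfo data:
employee table: email, first_name, last_name, join_date (4 columns)
employee_info table: bio, age, interests (3 columns)
In addition, employee_info needs a foreign key column, such as employee_id, that references the employee table.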
The document-oriented JSON data structure in ES:

    {
      "email": "[email protected]",
      "first_name": "san",
      "last_name": "zhang",
      "info": {
        "bio": "curious and modest",
        "age": 30,
        "interests": ["bike", "climb"]
      },
      "join_date": "2017/01/01"
    }
This makes clear how the ES document format differs from the table-based format of a relational database.
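The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `to_rows` and `from_rows` are made up for this sketch, the email value is a hypothetical placeholder, and storing `interests` as a comma-separated string is just one of several ways a relational schema might handle the array.

```python
import json

# The employee from the text as a nested dict. The email value is a
# hypothetical placeholder, not the address used in the original post.
employee = {
    "email": "zhangsan@example.com",
    "first_name": "san",
    "last_name": "zhang",
    "info": {"bio": "curious and modest", "age": 30, "interests": ["bike", "climb"]},
    "join_date": "2017/01/01",
}


def to_rows(doc, employee_id):
    """Flatten the nested object into one row per relational table."""
    employee_row = {
        "id": employee_id,
        "email": doc["email"],
        "first_name": doc["first_name"],
        "last_name": doc["last_name"],
        "join_date": doc["join_date"],
    }
    info_row = {
        "employee_id": employee_id,  # foreign key back to the employee table
        "bio": doc["info"]["bio"],
        "age": doc["info"]["age"],
        "interests": ",".join(doc["info"]["interests"]),
    }
    return employee_row, info_row


def from_rows(employee_row, info_row):
    """Rebuild the nested object from the two rows (the 'restore' step)."""
    return {
        "email": employee_row["email"],
        "first_name": employee_row["first_name"],
        "last_name": employee_row["last_name"],
        "info": {
            "bio": info_row["bio"],
            "age": info_row["age"],
            "interests": info_row["interests"].split(","),
        },
        "join_date": employee_row["join_date"],
    }


# Relational route: flatten on write, restore on every read.
rows = to_rows(employee, employee_id=1)
restored = from_rows(*rows)

# Document route: the object is serialized as-is, nesting included.
as_document = json.dumps(employee)
```

The document route needs no mapping code at all, which is the convenience the section describes.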
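2. Background of the e-commerce product management case
Suppose an e-commerce site needs a back-office system built on ES that offers the following features:
(1) CRUD (create, read, update, delete) operations on product information
(2) Simple structured queries
(3) Simple full-text search as well as more complex phrase search
(4) Highlighting of full-text search results
(5) Simple aggregation and analysis of the data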
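3. Simple cluster management
(1) Quickly checking cluster health
ES provides a set of APIs called the cat API, which can be used to inspect all kinds of ES configuration and status data.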
    GET /_cat/health?v

    epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
    1555412142 10:55:42  elasticsearch green           1         1      2   2    0    0        0             0                  -                100.0%
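To get a quick picture of cluster health, look at the value of the status column:
- green: every index has all of its primary shards and replica shards active
- yellow: every index has all of its primary shards active, but some replica shards are not active and are unavailable
- red: not every index has all of its primary shards active, so some indices have lost data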
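The three rules above can be written down as a small function. This is a minimal sketch, not how ES computes health internally: the `(kind, active)` tuple representation and the function names are assumptions made for illustration.

```python
def index_status(shards):
    """Return the health colour of one index.

    `shards` is a list of (kind, active) tuples, where kind is
    "primary" or "replica" and active is a bool.
    """
    primaries_ok = all(active for kind, active in shards if kind == "primary")
    replicas_ok = all(active for kind, active in shards if kind == "replica")
    if not primaries_ok:
        return "red"
    if not replicas_ok:
        return "yellow"
    return "green"


def cluster_status(indices):
    """The cluster is only as healthy as its least healthy index."""
    order = {"green": 0, "yellow": 1, "red": 2}
    statuses = [index_status(shards) for shards in indices.values()]
    return max(statuses, key=order.__getitem__, default="green")


# Hypothetical shard layout: one index has an inactive replica.
indices = {
    "product": [("primary", True), ("replica", False)],
    "test_index": [("primary", True), ("replica", True)],
}
print(cluster_status(indices))  # yellow
```

For instance, on a one-node cluster an index configured with a replica typically reports yellow, because a replica is never allocated to the same node as its primary.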
(2) Quickly listing the indices in the cluster

    GET /_cat/indices?v

    health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana_task_manager q25yU7fCQlKw5PnMwe-IPA   1   0          2            0     45.5kb         45.5kb
    green  open   .kibana_1            u3ZsZEtUQCiIFpng4Z-Mww   1   0          3            0     14.2kb         14.2kb
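When the `?v` output has to be consumed by a script, the header row can be zipped with each data row. The sketch below is a simplified parser (the name `parse_cat_table` is hypothetical) and assumes every column holds a value without spaces, which is true of the listing above but not guaranteed for every cat response.

```python
def parse_cat_table(text):
    """Parse the whitespace-separated table of a cat API called with ?v."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Pair each header name with the value in the same column position.
    return [dict(zip(header, line.split())) for line in lines[1:]]


sample = """
health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_task_manager q25yU7fCQlKw5PnMwe-IPA   1   0          2            0     45.5kb         45.5kb
green  open   .kibana_1            u3ZsZEtUQCiIFpng4Z-Mww   1   0          3            0     14.2kb         14.2kb
"""

rows = parse_cat_table(sample)
for row in rows:
    print(row["index"], row["health"], row["docs.count"])
```

For anything beyond a quick script, requesting `format=json` from the cat API avoids text parsing altogether.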
(3) Simple index operations
Create an index:

    PUT /test_index?pretty

    Response:
    {
      "acknowledged" : true,
      "shards_acknowledged" : true,
      "index" : "test_index"
    }

Delete an index:

    DELETE /test_index?pretty

    Response:
    {
      "acknowledged" : true
    }
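Outside of Kibana, the same calls are plain HTTP requests. The sketch below builds them with Python's standard library; the address `http://localhost:9200` and the helper names `es_request` and `send` are assumptions for illustration, and `send` only works against a running node.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of a local node


def es_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the given REST call."""
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(ES_URL + path, data=data, headers=headers, method=method)


def send(request):
    """Send the request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


create_index = es_request("PUT", "/test_index?pretty")
delete_index = es_request("DELETE", "/test_index?pretty")
print(create_index.get_method(), create_index.full_url)
print(delete_index.get_method(), delete_index.full_url)
```

Calling `send(create_index)` against a live cluster should return the acknowledgement shown above as a Python dict.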
(4) CRUD operations on products
1. Add a product: create a document and build the index
Format:

    PUT /{index}/{type}/{id}
    {
      "JSON data"
    }
    PUT /product/_doc/1
    {
      "name": "gaolujie yagao",
      "desc": "gaoxiao meibai",
      "price": 30,
      "producer": "gaolujie producer",
      "tags": ["meibai", "fangzhu"]
    }

    Response:
    {
      "_index" : "product",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }

    PUT /product/_doc/2
    {
      "name": "jiajieshi yagao",
      "desc": "youxiao fangzhu",
      "price": 25,
      "producer": "jiajieshi producer",
      "tags": ["fangzhu"]
    }

    PUT /product/_doc/3
    {
      "name": "zhonghua yagao",
      "desc": "caoben zhiwu",
      "price": 40,
      "producer": "zhonghua producer",
      "tags": ["qingxin"]
    }
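There is no need to create the index or the type beforehand; ES creates them automatically. By default, ES also builds an inverted index for every field of the document so that each field can be searched.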
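To give a feel for what "an inverted index for every field" means, here is a toy version over the three products. It is a deliberately simplified sketch: it lowercases and splits on whitespace, whereas real Elasticsearch runs text through configurable analyzers and stores numeric fields differently. The function name `build_inverted_index` is made up for this illustration.

```python
from collections import defaultdict

products = {
    1: {"name": "gaolujie yagao", "desc": "gaoxiao meibai", "price": 30,
        "producer": "gaolujie producer", "tags": ["meibai", "fangzhu"]},
    2: {"name": "jiajieshi yagao", "desc": "youxiao fangzhu", "price": 25,
        "producer": "jiajieshi producer", "tags": ["fangzhu"]},
    3: {"name": "zhonghua yagao", "desc": "caoben zhiwu", "price": 40,
        "producer": "zhonghua producer", "tags": ["qingxin"]},
}


def build_inverted_index(docs):
    """Map field -> term -> set of ids of the documents containing that term."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, doc in docs.items():
        for field, value in doc.items():
            values = value if isinstance(value, list) else [value]
            for item in values:
                # Naive "analysis": lowercase and split on whitespace.
                for term in str(item).lower().split():
                    index[field][term].add(doc_id)
    return index


index = build_inverted_index(products)
print(sorted(index["name"]["yagao"]))    # [1, 2, 3]
print(sorted(index["tags"]["fangzhu"]))  # [1, 2]
```

Looking up a term is then a dictionary access rather than a scan over every document, which is what makes the fields searchable.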
2. Query a product: retrieve a document
Format:

    GET /{index}/{type}/{id}
    GET /product/_doc/1

    Response:
    {
      "_index" : "product",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 0,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "gaolujie yagao",
        "desc" : "gaoxiao meibai",
        "price" : 30,
        "producer" : "gaolujie producer",
        "tags" : [
          "meibai",
          "fangzhu"
        ]
      }
    }
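In application code, the interesting parts of this response are `found` and `_source`. The following sketch shows one way to unwrap them; the helper name `extract_source` and the id `99` in the "miss" sample are hypothetical, and the miss body is an abbreviated assumption of what a lookup for a nonexistent id returns.

```python
import json


def extract_source(response_text):
    """Return the stored document, or None when the id does not exist."""
    response = json.loads(response_text)
    if not response.get("found", False):
        return None
    return response["_source"]


# The response from the example above, as raw JSON text.
hit = """
{
  "_index": "product", "_type": "_doc", "_id": "1",
  "_version": 1, "_seq_no": 0, "_primary_term": 1,
  "found": true,
  "_source": {
    "name": "gaolujie yagao", "desc": "gaoxiao meibai", "price": 30,
    "producer": "gaolujie producer", "tags": ["meibai", "fangzhu"]
  }
}
"""

# Assumed shape of the response for an id that does not exist.
miss = '{"_index": "product", "_type": "_doc", "_id": "99", "found": false}'

product = extract_source(hit)
print(product["name"], product["price"])
print(extract_source(miss))
```

Everything outside `_source` is metadata about the document, not part of the product data itself.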
3. Modify a product: replace the document
Format:

    PUT /{index}/{type}/{id}
    {
      "JSON data"
    }
    PUT /product/_doc/1
    {
      "name": "jiaqiangban gaolujie yagao",
      "desc": "gaoxiao meibai",
      "price": 30,
      "producer": "gaolujie producer",
      "tags": ["meibai", "fangzhu"]
    }

    Response:
    {
      "_index" : "product",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 2,
      "result" : "updated",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 3,
      "_primary_term" : 1
    }
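Replacement has one drawback: the request must carry all of the fields, otherwise the result is not the modification we intended.
For example, suppose we run: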
    PUT /product/_doc/1
    {
      "name": "jiaqiangban gaolujie yagao"
    }

    GET /product/_doc/1

    Response:
    {
      "_index" : "product",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 3,
      "_seq_no" : 4,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "jiaqiangban gaolujie yagao"
      }
    }
That is not what we wanted: every field other than name is gone.
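The behaviour is easy to reproduce with a tiny in-memory model. `TinyStore` below is a hypothetical stand-in for an index, written only to illustrate full-replace semantics and the client-side read-modify-write workaround; it is not how ES stores documents.

```python
class TinyStore:
    """In-memory stand-in for one index: id -> {"_version", "_source"}."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, source):
        """Full replace: the new body becomes the entire _source."""
        previous = self._docs.get(doc_id)
        version = previous["_version"] + 1 if previous else 1
        self._docs[doc_id] = {"_version": version, "_source": dict(source)}
        return "updated" if previous else "created"

    def get(self, doc_id):
        return self._docs[doc_id]


full = {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": ["meibai", "fangzhu"],
}

store = TinyStore()
first = store.put(1, full)                                      # "created"
second = store.put(1, {"name": "jiaqiangban gaolujie yagao"})   # "updated"
after_replace = store.get(1)["_source"]
print(after_replace)  # only the name field is left

# Client-side workaround: read the whole document, change one field,
# then write the whole document back.
store.put(1, full)
current = dict(store.get(1)["_source"])
current["name"] = "jiaqiangban gaolujie yagao"
store.put(1, current)
print(store.get(1)["_version"], store.get(1)["_source"]["price"])
```

The workaround keeps all fields, but it costs a read plus a full-size write for every change.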
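4. Modify a product: update the document
Format:

    POST /{index}/_update/{id}

Under the hood the effect is still the same (the document is replaced as a whole), but the entire replace step now happens inside ES. The request only needs to carry the fields being changed, which greatly reduces the network bandwidth used in batch processing and improves performance.
Here is an example (document 1 appears to have been written back with all of its fields first, which is why it shows version 4):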
    GET /product/_doc/1

    Response:
    {
      "_index" : "product",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 4,
      "_seq_no" : 5,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "jiaqiangban gaolujie yagao",
        "desc" : "gaoxiao meibai",
        "price" : 30,
        "producer" : "gaolujie producer",
        "tags" : [
          "meibai",
          "fangzhu"
        ]
      }
    }

    POST /product/_update/1
    {
      "doc": {
        "name": "jiajieshi yagao"
      }
    }

    GET /product/_doc/1

    Response:
    {
      "_index" : "product",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 5,
      "_seq_no" : 6,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "jiajieshi yagao",
        "desc" : "gaoxiao meibai",
        "price" : 30,
        "producer" : "gaolujie producer",
        "tags" : [
          "meibai",
          "fangzhu"
        ]
      }
    }
The example shows that the update succeeded: only name changed, and all other fields were kept.
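Conceptually, a partial update merges the `doc` body into the stored `_source`. The sketch below imitates that with a recursive merge in which nested objects are merged and every other value (including arrays) is replaced. It is an approximation for illustration only; `merge_partial` is a hypothetical name, and the byte counts merely compare request body sizes.

```python
import copy
import json


def merge_partial(source, partial):
    """Return a new document with `partial` merged into `source`."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars and arrays simply replace the old value.
            merged[key] = copy.deepcopy(value)
    return merged


stored = {
    "name": "jiaqiangban gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": ["meibai", "fangzhu"],
}
partial = {"name": "jiajieshi yagao"}

updated = merge_partial(stored, partial)
print(updated["name"], updated["price"])

# Compare what travels over the network in each approach.
full_bytes = len(json.dumps(updated).encode("utf-8"))
partial_bytes = len(json.dumps({"doc": partial}).encode("utf-8"))
print(partial_bytes, "bytes sent instead of", full_bytes)
```

The saving grows with document size, which is why the difference matters most in batch processing.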
5. Delete a product: delete the document
Format:

    DELETE /{index}/{type}/{id}
    DELETE /product/_doc/1

    Response:
    {
      "_index" : "product",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 6,
      "result" : "deleted",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 7,
      "_primary_term" : 1
    }