ELK-数据分析工具学习
ELK数据分析工具学习
学习参考资料
ElasticSearch参考手册,学习
http://elasticsearch.cn/book/elasticsearch_definitive_guide_2.x/index.html
DSL查询语法,包括查找(query)、过滤(filter)和聚合(aggs)等。
Logstash参考手册,学习数据导入,包括输入(input)、过滤(filter)和输出(
output)等,主要是filter中如何对复杂文本 进行拆分和类型 转化。
http://udn.yyuap.com/doc/logstash-best-practice-cn/index.html
Kibana参考手册,使用Kibana提供的前端界面对数据进行快速展示,主要是对Visulize 模块的使
https://kibana.logstash.es/content/
安装
下载
git clone https://github.com/elastic/stack-docker
安装 docker-compose
pip install docker-compose
修改配置
修改docker-compose.yml 中ip 127.0.0.1 为 0.0.0.0
启动
docker-compose up
访问
http://IP:5601
登陆
user=elastic password=changeme
参看文献
https://github.com/elastic/stack-docker
http://hao.jobbole.com/kibana/
基本操作
1{
2 "count": 155300,
3 "_shards": {
4 "total": 79,
5 "successful": 44,
6 "skipped": 0,
7 "failed": 0
8 }
9}
使用put, 指定id
使用post, 自动生成id
1PUT /website/blog/123
2{
3 "title": "My first blog entry",
4 "text": "Just trying this out...",
5 "date": "2014/01/01"
6}
1{
2 "_index": "website",
3 "_type": "blog",
4 "_id": "R0DpVWEB9_9q-Fol0opS",
5 "_version": 1,
6 "result": "created",
7 "_shards": {
8 "total": 2,
9 "successful": 1,
10 "failed": 0
11 },
12 "_seq_no": 0,
13 "_primary_term": 1
14}
自动生成的 ID 是 URL-safe、 基于 Base64 编码且长度为20个字符的 GUID 字符串。 这些 GUID 字符串由可修改的 FlakeID 模式生成,这种模式允许多个节点并行生成唯一 ID ,且互相之间的冲突概率几乎为零。
取回文档 根据_index _type _id
1{
2 "_index": "website",
3 "_type": "blog",
4 "_id": "123",
5 "_version": 1,
6 "found": true,
7 "_source": {
8 "title": "My first blog entry",
9 "text": "Just trying this out...",
10 "date": "2014/01/01"
11 }
12}
如果没有 会返回
1{
2 "_index": "website",
3 "_type": "blog",
4 "_id": "123",
5 "_version": 1,
6 "found": true,
7 "_source": {
8 "title": "My first blog entry",
9 "text": "Just trying this out...",
10 "date": "2014/01/01"
11 }
12}
检查文档是否存在
1HEAD /website/blog/123?_source
200 - ok
创建新文档
1POST /website/blog/
2{ ... }# 自动生成id
3PUT /website/blog/123?op_type=create
4{ ... }
5PUT /website/blog/123/_create
6{ ... }
1{
2 #如果可以创建
3 201 Created
4 #如果id重复
5 "error": {
6 "root_cause": [
7 {
8 "type": "document_already_exists_exception",
9 "reason": "[blog][123]: document already exists",
10 "shard": "0",
11 "index": "website"
12 }
13 ],
14 "type": "document_already_exists_exception",
15 "reason": "[blog][123]: document already exists",
16 "shard": "0",
17 "index": "website"
18 },
19 "status": 409
20}
更新文档
1PUT /website/blog/123
2{
3 "title": "My first blog entry",
4 "text": "I am starting to get the hang of this...",
5 "date": "2014/01/02"
6}
1{
2 "_index": "website",
3 "_type": "blog",
4 "_id": "123",
5 "_version": 2,#版本会变成version2
6 "result": "updated",#更改为Update
7 "_shards": {
8 "total": 2,
9 "successful": 1,
10 "failed": 0
11 },
12 "_seq_no": 1,
13 "_primary_term": 1
14}
删除文档
1DELETE /website/blog/123
#存在
{
{
"_index": "website",
"_type": "blog",
"_id": "123",
"_version": 3,
"result": "deleted",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 2,
"_primary_term": 1
}
}
#不存在
{
"found" : false,
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 4
}
并发中遇到的问题
悲观并发控制
这种方法被关系型数据库广泛使用,它假定有变更冲突可能发生,因此阻塞访问资源以防止冲突。 一个典型的例子是读取一行数据之前先将其锁住,确保只有放置锁的线程能够对这行数据进行修改。
乐观并发控制
Elasticsearch 中使用的这种方法假定冲突是不可能发生的,并且不会阻塞正在尝试的操作。 然而,如果源数据在读写当中被修改,更新将会失败。应用程序接下来将决定该如何解决冲突。 例如,可以重试更新、使用新的数据、或者将相关情况报告给用户。
乐观并发控制
利用 _version 号来确保 应用中相互冲突的变更不会导致数据丢失。我们通过指定想要修改文档的 version 号来达到这个目的。
实例流程
1.创建一个文档
1PUT /website/blog/1/_create
2{
3 "title": "My first blog entry",
4 "text": "Just trying this out..."
5}
version 版本号是 1
2.重建索引保存修改 指定version
1PUT /website/blog/1?version=1
2{
3 "title": "My first blog entry",
4 "text": "Starting to get the hang of this..."
5}
version=2
如果version 已经是2了,再对version1进行修改,
1{
2 "error": {
3 "root_cause": [
4 {
5 "type": "version_conflict_engine_exception",
6 "reason": "[blog][1]: version conflict, current version [2] is different than the one provided [1]",
7 "index_uuid": "_1drhEbXTB-rHBomeP7cJw",
8 "shard": "3",
9 "index": "website"
10 }
11 ],
12 "type": "version_conflict_engine_exception",
13 "reason": "[blog][1]: version conflict, current version [2] is different than the one provided [1]",
14 "index_uuid": "_1drhEbXTB-rHBomeP7cJw",
15 "shard": "3",
16 "index": "website"
17 },
18 "status": 409
19}
使用外部版本号 更新版本号
只有当前版本号小于更新版本号的时候,才会更新成功
1PUT /website/blog/2?version=10&version_type=external
2{
3 "title": "My first external blog entry",
4 "text": "This is a piece of cake..."
5}
1{
2 "_index": "website",
3 "_type": "blog",
4 "_id": "2",
5 "_version": 10,
6 "result": "created",
7 "_shards": {
8 "total": 2,
9 "successful": 1,
10 "failed": 0
11 },
12 "_seq_no": 0,
13 "_primary_term": 1
14}
15# 如果不小于
16{
17 "error": {
18 "root_cause": [
19 {
20 "type": "version_conflict_engine_exception",
21 "reason": "[blog][2]: version conflict, current version [10] is higher or equal to the one provided [10]",
22 "index_uuid": "_1drhEbXTB-rHBomeP7cJw",
23 "shard": "2",
24 "index": "website"
25 }
26 ],
27 "type": "version_conflict_engine_exception",
28 "reason": "[blog][2]: version conflict, current version [10] is higher or equal to the one provided [10]",
29 "index_uuid": "_1drhEbXTB-rHBomeP7cJw",
30 "shard": "2",
31 "index": "website"
32 },
33 "status": 409
34}