Mapping
ElasticSearch    2018-09-01 00:39:56
cqc   ElasticSearch
Every document has a type, and every type has its own mapping, or schema definition. A mapping defines the data type of each field and how that field is analyzed.

View the mapping of the tweet type in the gb index:
GET /gb/_mapping/tweet

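Core field types
  1. Text: string
  2. Whole number: byte, short, integer, long
  3. Floating point: float, double
  4. Boolean: boolean
  5. Date: date
Fields of type string have two important attributes: index and analyzer. The index attribute accepts these values: analyzed (the default; the field is analyzed and then indexed), not_analyzed (the value is indexed exactly as given, without analysis), and no (the field is not added to the inverted index, so it cannot be searched).
Fields of type long, double, and date also accept the index attribute, but only with the values not_analyzed and no.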

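Complex field types
Multivalue fields: { "tag": [ "search", "nosql" ]}. An array may contain zero, one, or many elements, and all of them must be of the same data type.
Multilevel objects: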
{
    "tweet":            "Elasticsearch is very flexible",
    "user": {
        "id":           "@johnsmith",
        "gender":       "male",
        "age":          26,
        "name": {
            "full":     "John Smith",
            "first":    "John",
            "last":     "Smith"
        }
    }
}
Mapping for multilevel objects:
{
  "gb": {
    "tweet": { 
      "properties": {
        "tweet":            { "type": "string" },
        "user": { 
          "type":             "object",
          "properties": {
            "id":           { "type": "string" },
            "gender":       { "type": "string" },
            "age":          { "type": "long"   },
            "name":   { 
              "type":         "object",
              "properties": {
                "full":     { "type": "string" },
                "first":    { "type": "string" },
                "last":     { "type": "string" }
              }
            }
          }
        }
      }
    }
  }
}
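Inner fields of a multilevel object are usually referred to by a dotted path such as user.name.full. As a rough illustration only (a Python sketch, not Elasticsearch code; flatten is a hypothetical helper introduced here), the sample document above can be reduced to those paths like this:

```python
def flatten(obj, prefix=""):
    """Return a flat dict that maps dotted field paths to leaf values."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Inner object: recurse and keep extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {
    "tweet": "Elasticsearch is very flexible",
    "user": {
        "id": "@johnsmith",
        "gender": "male",
        "age": 26,
        "name": {"full": "John Smith", "first": "John", "last": "Smith"},
    },
}

for path, value in flatten(doc).items():
    print(path, "=", value)
```

Each printed path corresponds to one leaf entry under "properties" in the mapping.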

Creating an index together with its mapping

curl -XPUT '10.206.19.199:9200/gb' -d '{
   "mappings": {
     "tweet" : {
       "properties" : {
         "tweet" : {
           "type" :    "string",
           "analyzer": "english"
         },
         "date" : {
           "type" :   "date"
         },
         "name" : {
           "type" :   "string"
         },
         "user_id" : {
           "type" :   "long"
         }
       }
     }
   }
 }'
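The same request can be assembled from a script. The sketch below uses only the Python standard library and builds the PUT request without sending it; build_create_index_request is a hypothetical helper name, the host is the one from the curl command, and the Content-Type header is an addition that is not in the original command (some Elasticsearch versions appear to require it).

```python
import json
import urllib.request


def build_create_index_request(host="10.206.19.199:9200", index="gb"):
    """Build (but do not send) the PUT request that creates the index."""
    body = {
        "mappings": {
            "tweet": {
                "properties": {
                    "tweet": {"type": "string", "analyzer": "english"},
                    "date": {"type": "date"},
                    "name": {"type": "string"},
                    "user_id": {"type": "long"},
                }
            }
        }
    }
    return urllib.request.Request(
        url=f"http://{host}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption, see above
        method="PUT",
    )


req = build_create_index_request()
print(req.get_method(), req.full_url)
# To send it to a reachable cluster: urllib.request.urlopen(req)
```

The mapping body uses the string type and a named mapping type exactly as in the curl example, so it targets the same Elasticsearch version as the rest of this note.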

  
Adding a new field named tag
PUT /gb/_mapping/tweet
{
  "properties" : {
    "tag" : {
      "type" :    "string",
      "index":    "not_analyzed"
    }
  }
}

Controlling whether new fields are mapped dynamically
PUT /my_index
{
    "mappings": {
        "my_type": {
            "dynamic":      "strict", 
            "properties": {
                "title":  { "type": "string"},
                "stash":  {
                    "type":     "object",
                    "dynamic":  true 
                }
            }
        }
    }
}

The my_type object will throw an exception if an unknown field is encountered.

The stash object will create new fields dynamically.

Values for dynamic:
true: Add new fields dynamically (the default)
false: Ignore new fields
strict: Throw an exception if an unknown field is encountered
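The three settings can be compared side by side with a small simulation. This is a Python sketch of the behaviour described above, not how Elasticsearch implements it; apply_dynamic and StrictMappingError are hypothetical names.

```python
class StrictMappingError(Exception):
    """Raised when a strict mapping meets an unknown field (hypothetical)."""


def apply_dynamic(known_fields, doc, dynamic=True):
    """Return (indexed_fields, known_fields_after) for a single document."""
    known = set(known_fields)
    indexed = {}
    for field, value in doc.items():
        if field in known:
            indexed[field] = value
        elif dynamic is True:
            # true: the new field is added to the mapping and indexed.
            known.add(field)
            indexed[field] = value
        elif dynamic is False:
            # false: the new field is ignored; the mapping does not change.
            continue
        elif dynamic == "strict":
            raise StrictMappingError(f"unknown field: {field}")
        else:
            raise ValueError(f"unsupported dynamic setting: {dynamic!r}")
    return indexed, known


doc = {"title": "hello", "views": 3}
print(apply_dynamic({"title"}, doc, True))
print(apply_dynamic({"title"}, doc, False))
```

With "strict", the same call raises StrictMappingError instead of returning.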

Dynamic templates
PUT /my_index
{
    "mappings": {
        "my_type": {
            "dynamic_templates": [
                { "es": {
                      "match":              "*_es", 
                      "match_mapping_type": "string",
                      "mapping": {
                          "type":           "string",
                          "analyzer":       "spanish"
                      }
                }},
                { "en": {
                      "match":              "*", 
                      "match_mapping_type": "string",
                      "mapping": {
                          "type":           "string",
                          "analyzer":       "english"
                      }
                }}
            ]
        }
    }
}

Match string fields whose name ends in _es.

Match all other string fields.

Templates are checked in order; the first template that matches is applied. For instance, we could specify two templates for string fields:

  • es: Field names ending in _es should use the spanish analyzer.
  • en: All others should use the english analyzer.
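The first-match rule can be sketched in a few lines of Python. This only mimics the ordering behaviour described above: resolve_template is a hypothetical helper, and fnmatchcase from the standard library stands in for the wildcard matching that Elasticsearch applies to the match pattern.

```python
from fnmatch import fnmatchcase

# Same two templates as in the request above, kept in order.
DYNAMIC_TEMPLATES = [
    ("es", {"match": "*_es", "match_mapping_type": "string",
            "mapping": {"type": "string", "analyzer": "spanish"}}),
    ("en", {"match": "*", "match_mapping_type": "string",
            "mapping": {"type": "string", "analyzer": "english"}}),
]


def resolve_template(field_name, detected_type, templates=DYNAMIC_TEMPLATES):
    """Return (template_name, mapping) of the first matching template."""
    for name, rule in templates:
        if rule["match_mapping_type"] != detected_type:
            continue
        if fnmatchcase(field_name, rule["match"]):
            return name, rule["mapping"]
    return None, None  # no template applies


print(resolve_template("title_es", "string"))
print(resolve_template("title", "string"))
print(resolve_template("count", "long"))
```

Because "*" also matches names ending in _es, swapping the two entries would make the en template win for every string field, which is why the more specific template is listed first.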
Defining the default mapping
PUT /my_index
{
    "mappings": {
        "_default_": {
            "_all": { "enabled":  false }
        },
        "blog": {
            "_all": { "enabled":  true  }
        }
    }
}
The _default_ mapping can also be a good place to specify index-wide dynamic templates.
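One way to picture the effect of _default_ is as a base that each type's own settings override. The Python sketch below makes that assumption explicit with a shallow merge; effective_mapping is a hypothetical helper, "user" is a hypothetical type with no mapping of its own, and the real merge performed by Elasticsearch may differ in detail.

```python
def effective_mapping(mappings, type_name):
    """Overlay a type's own settings on top of the _default_ settings."""
    merged = dict(mappings.get("_default_", {}))
    merged.update(mappings.get(type_name, {}))  # type-level settings win
    return merged


mappings = {
    "_default_": {"_all": {"enabled": False}},
    "blog": {"_all": {"enabled": True}},
}

print(effective_mapping(mappings, "blog"))  # blog overrides the default
print(effective_mapping(mappings, "user"))  # falls back to _default_
```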



Testing the mapping
GET /gb/_analyze
{
  "field": "tweet",
  "text": "Black-cats"
}

GET /gb/_analyze
{
  "field": "tag",
  "text": "Black-cats"
}
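The two requests differ because tweet is an analyzed field using the english analyzer, while tag is not_analyzed. The Python sketch below imitates that difference with a deliberately crude stand-in for the english analyzer (lowercase, split on non-alphanumerics, strip a trailing "s"); crude_english_analyze and index_terms are hypothetical names, and the real analyzer is considerably more thorough.

```python
import re


def crude_english_analyze(text):
    """Very rough stand-in for the english analyzer (assumption)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Naive plural stripping; a real stemmer handles far more cases.
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


def index_terms(text, index="analyzed"):
    """Return the terms that would be put in the inverted index."""
    if index == "analyzed":
        return crude_english_analyze(text)
    if index == "not_analyzed":
        return [text]  # the exact value becomes a single term
    if index == "no":
        return []  # nothing is indexed, so the field is not searchable
    raise ValueError(f"unsupported index setting: {index!r}")


print(index_terms("Black-cats", "analyzed"))      # ['black', 'cat']
print(index_terms("Black-cats", "not_analyzed"))  # ['Black-cats']
```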

 