Mapping

Every document has a type, and every type has its own mapping, or schema definition. A mapping defines the type of each field and how that field is analyzed.

To view the mapping: GET /gb/_mapping/tweet

Core field types

Text: string
Whole number: byte, short, integer, long
Floating point: float, double
Boolean: boolean
Date: date

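Fields of type string have two important attributes: index and analyzer. The index attribute accepts the following values: analyzed (the default; the field is analyzed before it is indexed), not_analyzed (the value is indexed exactly as given, without analysis), and no (the field is left out of the inverted index, so it cannot be searched).

Fields of type long, double, and date also accept the index attribute, but only not_analyzed and no are relevant, because their values are never analyzed.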
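
The rules above can be summarized in a small Python sketch. The helper field_mapping and the INDEX_OPTIONS table are hypothetical names introduced only for illustration; they are not part of Elasticsearch. The sketch only builds the JSON fragment that would go under "properties" and rejects the combinations ruled out above.

```python
import json

# Allowed values of the "index" attribute per field type, as described above.
# This table and the helper below are illustrative, not an Elasticsearch API.
INDEX_OPTIONS = {
    "string": {"analyzed", "not_analyzed", "no"},
    "long": {"not_analyzed", "no"},
    "double": {"not_analyzed", "no"},
    "date": {"not_analyzed", "no"},
}


def field_mapping(field_type, index=None, analyzer=None):
    """Build the mapping fragment for one field, rejecting invalid combinations."""
    if field_type not in INDEX_OPTIONS:
        raise ValueError("type not covered by this sketch: %s" % field_type)
    mapping = {"type": field_type}
    if index is not None:
        if index not in INDEX_OPTIONS[field_type]:
            raise ValueError("%r is not a valid index option for %s" % (index, field_type))
        mapping["index"] = index
    if analyzer is not None:
        # In this sketch, an analyzer is only accepted on string fields.
        if field_type != "string":
            raise ValueError("analyzer only applies to string fields")
        mapping["analyzer"] = analyzer
    return mapping


if __name__ == "__main__":
    properties = {
        "tweet": field_mapping("string", analyzer="english"),
        "tag": field_mapping("string", index="not_analyzed"),
        "user_id": field_mapping("long"),
    }
    print(json.dumps({"properties": properties}, indent=2))
```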

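Complex field types

Multivalue fields: { "tag": [ "search", "nosql" ]}. An array may hold zero, one, or many elements, and all elements must be of the same data type.

Multilevel objects: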

{
    "tweet":            "Elasticsearch is very flexible",
    "user": {
        "id":           "@johnsmith",
        "gender":       "male",
        "age":          26,
        "name": {
            "full":     "John Smith",
            "first":    "John",
            "last":     "Smith"
        }
    }
}

Mapping for multilevel objects

{
  "gb": {
    "tweet": { 
      "properties": {
        "tweet":            { "type": "string" },
        "user": { 
          "type":             "object",
          "properties": {
            "id":           { "type": "string" },
            "gender":       { "type": "string" },
            "age":          { "type": "long"   },
            "name":   { 
              "type":         "object",
              "properties": {
                "full":     { "type": "string" },
                "first":    { "type": "string" },
                "last":     { "type": "string" }
              }
            }
          }
        }
      }
    }
  }
}
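
Lucene has no notion of inner objects, so Elasticsearch indexes a multilevel document as a flat list of dot-separated field names such as user.name.full. The following Python sketch illustrates the resulting field names. The helper flatten is hypothetical, and the model is simplified: it ignores analysis, whereas the real index stores analyzed terms rather than raw values.

```python
def flatten(obj, prefix=""):
    """Return a dict mapping dotted field paths to leaf values."""
    flat = {}
    for key, value in obj.items():
        path = prefix + key
        if isinstance(value, dict):
            # Descend into the inner object, extending the dotted path.
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


if __name__ == "__main__":
    doc = {
        "tweet": "Elasticsearch is very flexible",
        "user": {
            "id": "@johnsmith",
            "gender": "male",
            "age": 26,
            "name": {"full": "John Smith", "first": "John", "last": "Smith"},
        },
    }
    for path, value in sorted(flatten(doc).items()):
        print(path, "=", value)
```

This is why a query can address an inner field by its full path, for example user.name.first.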

Creating an index with a mapping

curl -XPUT '10.206.19.199:9200/gb' -d '{
   "mappings": {
     "tweet" : {
       "properties" : {
         "tweet" : {
           "type" :    "string",
           "analyzer": "english"
         },
         "date" : {
           "type" :   "date"
         },
         "name" : {
           "type" :   "string"
         },
         "user_id" : {
           "type" :   "long"
         }
       }
     }
   }
 }'

Adding a new field named tag

PUT /gb/_mapping/tweet
{
  "properties" : {
    "tag" : {
      "type" :    "string",
      "index":    "not_analyzed"
    }
  }
}

Controlling whether new fields are mapped dynamically

PUT /my_index
{
    "mappings": {
        "my_type": {
            "dynamic":      "strict", 
            "properties": {
                "title":  { "type": "string"},
                "stash":  {
                    "type":     "object",
                    "dynamic":  true 
                }
            }
        }
    }
}

	The `my_type` object will throw an exception if an unknown field is encountered.
	The `stash` object will create new fields dynamically.

Possible values for dynamic:

true: add new fields dynamically (the default)

false: ignore new fields

strict: throw an exception if an unknown field is encountered
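
To make the three values concrete, here is a toy Python model of how a document would be checked against a mapping. The function new_fields and the exception StrictMappingError are hypothetical names, not Elasticsearch APIs. The sketch assumes that an inner object inherits the dynamic setting of its parent unless it sets its own, and it does not descend into objects that are themselves new.

```python
class StrictMappingError(Exception):
    """Raised by this sketch when an unknown field reaches a strict object."""


# The mapping from the example above, without the enclosing type name.
MAPPING = {
    "dynamic": "strict",
    "properties": {
        "title": {"type": "string"},
        "stash": {"type": "object", "dynamic": True},
    },
}


def new_fields(mapping, doc, inherited=True, prefix=""):
    """Return the dotted paths of fields that would be added dynamically."""
    dynamic = mapping.get("dynamic", inherited)
    properties = mapping.get("properties", {})
    added = []
    for name, value in doc.items():
        path = prefix + name
        known = properties.get(name)
        if known is None:
            if dynamic == "strict":
                raise StrictMappingError("unknown field: " + path)
            if dynamic in (True, "true"):
                added.append(path)
            # With dynamic false, the unknown field is simply ignored.
        elif isinstance(value, dict):
            # Known object field: check its inner fields with its own setting.
            added.extend(new_fields(known, value, dynamic, path + "."))
    return added


if __name__ == "__main__":
    print(new_fields(MAPPING, {"title": "t", "stash": {"new_field": "x"}}))
    try:
        new_fields(MAPPING, {"title": "t", "new_field": "x"})
    except StrictMappingError as err:
        print("rejected:", err)
```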

Dynamic mapping templates

PUT /my_index
{
    "mappings": {
        "my_type": {
            "dynamic_templates": [
                { "es": {
                      "match":              "*_es", 
                      "match_mapping_type": "string",
                      "mapping": {
                          "type":           "string",
                          "analyzer":       "spanish"
                      }
                }},
                { "en": {
                      "match":              "*", 
                      "match_mapping_type": "string",
                      "mapping": {
                          "type":           "string",
                          "analyzer":       "english"
                      }
                }}
            ]
        }
    }
}

	Match string fields whose name ends in `_es`.
	Match all other string fields.

Templates are checked in order; the first template that matches is applied. For instance, we could specify two templates for string fields:

es: Field names ending in _es should use the spanish analyzer.
en: All others should use the english analyzer.
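
The first-match rule can be sketched in Python as follows. The function pick_template is a hypothetical helper, and fnmatchcase from the standard library stands in for Elasticsearch's own pattern matching. That is an approximation: fnmatch also treats ? and [ ] as special characters, which the templates here do not use.

```python
from fnmatch import fnmatchcase

# The two templates from the example above, in the same order.
DYNAMIC_TEMPLATES = [
    {"es": {"match": "*_es", "match_mapping_type": "string",
            "mapping": {"type": "string", "analyzer": "spanish"}}},
    {"en": {"match": "*", "match_mapping_type": "string",
            "mapping": {"type": "string", "analyzer": "english"}}},
]


def pick_template(field_name, detected_type, templates=DYNAMIC_TEMPLATES):
    """Return (template_name, mapping) for the first matching template, or None."""
    for entry in templates:
        for name, spec in entry.items():
            type_ok = spec.get("match_mapping_type", detected_type) == detected_type
            name_ok = fnmatchcase(field_name, spec.get("match", "*"))
            if type_ok and name_ok:
                return name, spec["mapping"]
    return None


if __name__ == "__main__":
    for field in ("title_es", "title"):
        print(field, "->", pick_template(field, "string"))
```

Because the list is scanned in order, swapping the two entries would make the catch-all en template match every string field, and the es template would never be applied.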

Defining default settings

PUT /my_index
{
    "mappings": {
        "_default_": {
            "_all": { "enabled":  false }
        },
        "blog": {
            "_all": { "enabled":  true  }
        }
    }
}

The _default_ mapping can also be a good place to specify index-wide dynamic templates.
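
A rough way to picture this: the _default_ mapping acts as the base for each type, and settings in a type's own mapping override it. The Python sketch below models this as a recursive dictionary merge. The helpers deep_merge and effective_mapping are hypothetical, and the exact merge semantics inside Elasticsearch may differ in detail.

```python
import copy

# The "mappings" body from the example above.
MAPPINGS = {
    "_default_": {"_all": {"enabled": False}},
    "blog": {"_all": {"enabled": True}},
}


def deep_merge(base, override):
    """Merge override into base in place; values in override win."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)
        else:
            base[key] = copy.deepcopy(value)
    return base


def effective_mapping(mappings, type_name):
    """Start from a copy of _default_, then apply the type's own settings."""
    base = copy.deepcopy(mappings.get("_default_", {}))
    return deep_merge(base, mappings.get(type_name, {}))


if __name__ == "__main__":
    print("blog:", effective_mapping(MAPPINGS, "blog"))
    print("user:", effective_mapping(MAPPINGS, "user"))
```

In this model the blog type keeps the _all field enabled, while any other type, such as a hypothetical user type, falls back to the default and has it disabled.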

Testing the mapping

GET /gb/_analyze
{
  "field": "tweet",
  "text": "Black-cats"
}

GET /gb/_analyze
{
  "field": "tag",
  "text": "Black-cats"
}
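
The two requests differ because tweet is mapped with the english analyzer while tag is not_analyzed. The Python sketch below is a toy model of that contrast, not the real english analyzer: the function toy_analyze is hypothetical, and it only lowercases, splits on non-alphanumeric characters, and strips a trailing "s", whereas the real analyzer applies proper stemming and stopword removal.

```python
import re


def toy_analyze(text, index="analyzed"):
    """Return the terms that would be indexed in this simplified model."""
    if index == "not_analyzed":
        # The whole value is indexed as a single, unmodified term.
        return [text]
    tokens = [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]
    # Crude stand-in for stemming: drop one trailing "s" from longer tokens.
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


if __name__ == "__main__":
    print("tweet:", toy_analyze("Black-cats"))
    print("tag:  ", toy_analyze("Black-cats", index="not_analyzed"))
```

In this model the tweet field yields the terms black and cat, while the tag field keeps the single exact term Black-cats.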

阿川CH