The difference between json.Unmarshal and json.Decoder.Decode in Go

Go's standard encoding/json package offers two ways to decode JSON into a struct: calling the json.Unmarshal function, or creating a decoder with json.NewDecoder and calling its Decode method.

The signatures are:

func Unmarshal(data []byte, v any) error
func NewDecoder(r io.Reader) *Decoder
func (dec *Decoder) Decode(v any) error

One takes a byte slice and the other reads from an io.Reader. At first I did not understand how that difference affects behavior.

Summary of the differences

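The byte slice passed to Unmarshal must be exactly one well-formed JSON value.
A Decoder created from an io.Reader reads the next JSON value from the stream on each Decode call, so it can also process input that contains multiple JSON values.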

Processing { "example": "1" }

When you simply want to process a single { "example": "1" }, both approaches behave the same.

Using json.Unmarshal

package main

import (
    "encoding/json"
    "fmt"
)

type Example struct {
    Example string `json:"example"`
}

func main() {
    jsonSrc := []byte(`{ "example": "1" }`)

    var myJson Example
    if err := json.Unmarshal(jsonSrc, &myJson); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", myJson)
}

$ go run main.go 
{Example:1}

Using json.Decoder.Decode

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
)

type Example struct {
    Example string `json:"example"`
}

func main() {
    jsonSrc := []byte(`{ "example": "1" }`)

    var myJson Example
    decoder := json.NewDecoder(bytes.NewReader(jsonSrc))
    if err := decoder.Decode(&myJson); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", myJson)
}

$ go run main.go
{Example:1}

Processing multi-line JSON

Next, let's process two lines of JSON.

{ "example": "1" }
{ "example": "2" }

In this case json.Unmarshal fails to parse the input, while json.Decoder.Decode can process it.

Using json.Unmarshal - multiple lines

package main

import (
    "encoding/json"
    "fmt"
)

type Example struct {
    Example string `json:"example"`
}

func main() {
    jsonSrc := []byte(`{ "example": "1" }
  { "example": "2" }`)

    var myJson Example
    if err := json.Unmarshal(jsonSrc, &myJson); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", myJson)
}

$ go run main.go
panic: invalid character '{' after top-level value

Using json.Decoder.Decode - multiple lines

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
)

type Example struct {
    Example string `json:"example"`
}

func main() {
    jsonSrc := []byte(`{ "example": "1" }
  { "example": "2" }`)

    var myJson Example
    decoder := json.NewDecoder(bytes.NewReader(jsonSrc))
    if err := decoder.Decode(&myJson); err != nil {
        panic(err)
    }
    fmt.Printf("%+v\n", myJson)
}

$ go run main.go
{Example:1}

As the function's documentation says, Decode takes the next JSON value from the input reader and processes it.

Decode reads the next JSON-encoded value from its input and stores it in the value pointed to by v.

So by calling it repeatedly, you can retrieve the second line as well.

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
)

type Example struct {
    Example string `json:"example"`
}

func main() {
    jsonSrc := []byte(`{ "example": "1" }
  { "example": "2" }`)

    var myJson Example
    decoder := json.NewDecoder(bytes.NewReader(jsonSrc))
    for {
        if err := decoder.Decode(&myJson); err != nil {
            fmt.Println("error: ", err)
            break
        }
        fmt.Printf("%+v\n", myJson)
    }
}

$ go run main.go
{Example:1}
{Example:2}
error:  EOF
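
The loop above reports the end of input as an error. A minimal sketch of a variant that treats io.EOF as normal termination and returns any other error is shown here. The helper names decodeStream and decodeString are made up for this illustration and are not part of encoding/json.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

type Example struct {
	Example string `json:"example"`
}

// decodeStream (hypothetical helper) reads JSON values from r until the
// input is exhausted. io.EOF means a clean end of input; any other error
// is returned together with the values decoded so far.
func decodeStream(r io.Reader) ([]Example, error) {
	decoder := json.NewDecoder(r)
	var out []Example
	for {
		var e Example // fresh value per iteration so fields do not carry over
		err := decoder.Decode(&e)
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, e)
	}
}

// decodeString (hypothetical helper) is a convenience wrapper for trying
// decodeStream on a string.
func decodeString(s string) ([]Example, error) {
	return decodeStream(strings.NewReader(s))
}

func main() {
	examples, err := decodeString("{ \"example\": \"1\" }\n{ \"example\": \"2\" }")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", examples)
}
```

Declaring a new Example inside the loop avoids reusing values left over from the previous iteration when a later JSON object omits a field.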

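The Decoder keeps track of how far into the input it has read, and the next call to Decode picks up the next JSON value from that position. You can confirm this in the implementation of Decode.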
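
One way to observe this from the outside, without reading the source, is Decoder.InputOffset (available since Go 1.15), which reports the decoder's current byte offset in the input. The following is only a rough sketch; the offsets helper is a name invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type Example struct {
	Example string `json:"example"`
}

// offsets (hypothetical helper) decodes every value in src and records the
// decoder's input offset after each successful Decode call.
func offsets(src string) []int64 {
	decoder := json.NewDecoder(strings.NewReader(src))
	var out []int64
	for {
		var e Example
		if err := decoder.Decode(&e); err != nil {
			// Stop at io.EOF or at the first malformed value.
			return out
		}
		out = append(out, decoder.InputOffset())
	}
}

func main() {
	// Each printed offset should be larger than the previous one,
	// showing that Decode resumes where the last call stopped.
	fmt.Println(offsets("{\"example\":\"1\"}\n{\"example\":\"2\"}"))
}
```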


Thoughts

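Since a Decoder is given a stream, this behavior is obvious once you know it. Still, I was not using it because I wanted to handle consecutive JSON values; I was using it because it is a convenient way to work with os.File. So I was a little surprised when a file containing two JSON values did not cause a decode error.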
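
If the goal is the convenience of an io.Reader but with the "exactly one JSON value" strictness of Unmarshal, one possible approach is to call Decode a second time and require io.EOF. This is a sketch of that idea, not something from the original code; decodeSingle and decodeSingleString are hypothetical names.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

type Example struct {
	Example string `json:"example"`
}

// decodeSingle (hypothetical helper) decodes one JSON value from r into v
// and returns an error if anything other than whitespace follows it.
func decodeSingle(r io.Reader, v any) error {
	decoder := json.NewDecoder(r)
	if err := decoder.Decode(v); err != nil {
		return err
	}
	// A second Decode should report io.EOF when the input held one value.
	var extra json.RawMessage
	err := decoder.Decode(&extra)
	switch {
	case errors.Is(err, io.EOF):
		return nil
	case err == nil:
		return errors.New("unexpected second JSON value in input")
	default:
		return err // trailing garbage or a read error
	}
}

// decodeSingleString (hypothetical helper) wraps decodeSingle for strings.
func decodeSingleString(s string, v any) error {
	return decodeSingle(strings.NewReader(s), v)
}

func main() {
	var e Example
	err := decodeSingleString("{ \"example\": \"1\" }\n{ \"example\": \"2\" }", &e)
	fmt.Println("error:", err)
}
```

With an os.File, the file handle can be passed directly as the io.Reader argument.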
