Working with JSON in Python relies on Python’s built-in json module. After you import json, here are the module’s main functions:
- json.loads – deserialize a json string into the appropriate Python type(s)
- json.dumps – serialize a Python “object” into a json string
- json.load – deserialize json from a file
- json.dump – serialize a Python “object” as json and write it to a file
Notice that the methods ending in “s” deal directly with strings, whereas the others deal with files. To disambiguate them, I call them “load string” and “dump string” (in my own head, at least).
json.loads (load string)
Use this when you need to deserialize a json string, like when handling a json API response:
import json
person_str = '{"name": "justin", "age": 100}'
person_dict = json.loads(person_str)
print(person_dict)
{'name': 'justin', 'age': 100}
Default value parsing
json.load and json.loads also accept some nice keyword argument hooks for loading json data: parse_float, parse_int, and parse_constant. Each hook is called with the string form of the matching json value during loading (parse_constant handles -Infinity, Infinity, and NaN), and whatever the hook returns is what comes out at the end. Let’s do a quick example.
Say you know that all the float values in a json payload need to be converted to Decimal type. We can do that easily with parse_float:
import json, decimal
person_str = '{"name": "justin", "dollars": 50.25}'
json.loads(person_str, parse_float=decimal.Decimal)
# {'name': 'justin', 'dollars': Decimal('50.25')}
The other parse_ hooks work just like this. There’s even an object_hook, which is called with every decoded json object (each dict, nested ones included); whatever it returns is used in place of that dict. For more, see the official docs.
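To make the other hooks a bit more concrete, here is a minimal sketch. The tag_person function and the "is_person" key are made up for illustration; only parse_int and object_hook come from the json module itself.

```python
import json

# parse_int receives the string form of each json integer, e.g. "100"
print(json.loads('{"age": 100}', parse_int=float))
# {'age': 100.0}

def tag_person(obj):
    # Hypothetical hook: json calls this once for every decoded object (dict),
    # inner objects before the outer ones that contain them.
    if "name" in obj:
        obj["is_person"] = True
    return obj

payload = '{"team": "blue", "lead": {"name": "justin", "age": 100}}'
decoded = json.loads(payload, object_hook=tag_person)
print(decoded)
# {'team': 'blue', 'lead': {'name': 'justin', 'age': 100, 'is_person': True}}
```

Only the nested dict gets tagged here, because the outer dict has no "name" key.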
json.dumps (dump string)
Using the same person from above:
import json
person = {"name": "justin", "age": 100}
json.dumps(person)
'{"name": "justin", "age": 100}'
With pretty printing
For nicer formatting, you might want to pretty-print your json:
import json
person = {"name": "justin", "age": 100}
# use the indent kwarg for nicer formatting
print(json.dumps(person, indent=4))
{
    "name": "justin",
    "age": 100
}
You can also sort the keys of your json, which makes life easier when inspecting large json objects:
import json
person = {"name": "justin", "age": 100}
# use the sort_keys kwarg to sort object keys
print(json.dumps(person, indent=4, sort_keys=True))
{
    "age": 100,
    "name": "justin"
}
These pretty-printing and formatting kwargs work exactly the same in json.dump.
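A quick way to convince yourself of that, sketched with an in-memory io.StringIO buffer standing in for a real file (the buffer is my assumption here, so the example doesn't touch the disk):

```python
import io
import json

person = {"name": "justin", "age": 100}

# StringIO behaves like an open text file, which is all json.dump needs
buffer = io.StringIO()
json.dump(person, buffer, indent=4, sort_keys=True)

as_string = json.dumps(person, indent=4, sort_keys=True)

# The text written by json.dump matches what json.dumps returns
print(buffer.getvalue() == as_string)
```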
json.dump (to file)
This works just like json.dumps, but instead of returning a string it writes to a file:
import json
person = {"name": "justin", "age": 100}
with open("person_file.json", "w") as outfile:
    # json.dump returns None, so don't assign its result back to person
    json.dump(person, outfile)
json.load (from file)
This works just like json.loads above, but instead of acting on a string it reads from a file:
import json
# read the file we just created above
with open("person_file.json", "r") as infile:
    person = json.load(infile)
print(person)
# {'name': 'justin', 'age': 100}
Type conversions
Converting between Python and json is not 100% apples to apples. Tuples become arrays [1], None isn’t valid json (it becomes null), True and False aren’t capitalized, etc. Here’s the decoding table (json to Python), lifted directly from the Python json docs:
JSON | Python |
---|---|
object | dict |
array | list |
string | str |
number (int) | int |
number (real) | float |
true | True |
false | False |
null | None |
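Here is a small round trip that shows those conversions in action. The sample dict is made up for illustration:

```python
import json

original = {"point": (1, 2), "nothing": None, "ok": True}

encoded = json.dumps(original)
print(encoded)
# {"point": [1, 2], "nothing": null, "ok": true}

decoded = json.loads(encoded)
print(decoded)
# {'point': [1, 2], 'nothing': None, 'ok': True}

# The tuple came back as a list, so the round trip is not lossless
print(decoded == original)
# False
```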
Errors when reading or writing json
If you try to load something that isn’t json, or isn’t properly encoded, you’ll see either a TypeError or a JSONDecodeError:
import json
# This is a dict, not JSON
json.loads({"name": "justin"})
# TypeError: the JSON object must be str, bytes or bytearray, not dict
# This string has an extra " in it, so it's not valid JSON
json.loads('{"name": "justin""}')
# JSONDecodeError: Expecting ',' delimiter: line 1 column 18 (char 17)
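In real code you will usually want to catch the decode error rather than let it crash the program. A minimal sketch, where parse_or_none is a hypothetical helper name of my own and not part of the json module:

```python
import json

def parse_or_none(text):
    """Hypothetical helper: return the decoded value, or None if decoding fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # msg, lineno and colno describe what went wrong and where
        print(f"bad json: {err.msg} at line {err.lineno}, column {err.colno}")
        return None

print(parse_or_none('{"name": "justin"}'))
print(parse_or_none('{"name": "justin""}'))
```

JSONDecodeError is a subclass of ValueError, so an existing except ValueError block will catch it too. One caveat with this sketch: the valid json text null also decodes to None, so use a different sentinel if you need to tell the two cases apart.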
Gotcha: Decimal type
There is a small note in the official docs about “exotic” numerical types, like decimal.Decimal. They are not JSON serializable:
import json
import decimal
not_serializable = {"number": decimal.Decimal(100)}
json.dumps(not_serializable)
# TypeError: Object of type Decimal is not JSON serializable
You can use the default keyword argument of json.dumps to get around this issue. It takes a function that is called for any object that could not otherwise be serialized, and that function should return something serializable. In the case of the Decimal above, we could pass float:
import json, decimal
not_serializable = {"number": decimal.Decimal(100)}
json.dumps(not_serializable, default=float)
'{"number": 100.0}'
Careful with default though: it is called for every non-serializable Python type. If you passed float and there was a datetime object in your data, you’d get a new error [2] (a TypeError raised by float this time).
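One way around that is to write your own default function that checks the type before converting. This is a sketch under my own assumptions, not the approach from the linked post: encode_extra is a hypothetical name, and encoding Decimal as a string and dates as ISO 8601 text are choices I made for the example.

```python
import datetime
import decimal
import json

def encode_extra(obj):
    # Hypothetical default function: json only calls it for objects
    # it cannot serialize on its own.
    if isinstance(obj, decimal.Decimal):
        return str(obj)  # keeps the exact digits; float(obj) is an alternative
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    # Anything else is still an error, same as json's own behavior
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

data = {"number": decimal.Decimal("50.25"), "when": datetime.date(2020, 1, 2)}
print(json.dumps(data, default=encode_extra))
# {"number": "50.25", "when": "2020-01-02"}
```

Raising TypeError for unknown types matters: if the function silently returned something for everything, you would hide real bugs in your data.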
1. json doesn’t have a tuple type, so Python tuples become json arrays
2. This blog post has a clever default encoding solution using Python f-strings