As much as I like this (YAML goes way too far, but trailing commas and comments would make JSON much nicer; I actually think this spec goes too far with single quotes), I hate that it is named JSON5. I think it's unethical to imply you are the next version of something if you don't have the blessing of the original author.
I’m a fan of JSON5. A common criticism is “we’ve already got YAML for human readable config with comments,” but IMO YAML’s readability sucks, it’s too hard to tell what’s an object and what’s an array at a glance (at least, with the way it’s often written).
When dealing with large YAML files, I find myself frequently popping them into online “YAML to JSON” tools to actually figure out WTF is going on. JSON5 is much easier to read, at least for me.
It's too bad EDN [1] hasn't seen much adoption outside of the biblical paradise that is the Clojure ecosystem.
[1]: https://en.m.wikipedia.org/wiki/Clojure#Extensible_Data_Nota...
In fact, there doesn't seem to be a spec or standard for it, outside of the de facto standard used by Clojure and the programs in its orbit. I guess nobody's bothered to write a standard, because the people who are already using EDN are doing fine without one, and the people who aren't either don't know what it is or don't see its value.
The whole reason JSON rules the world is because it's brutally simple.
We already have 5+ replacements that are far more robust (XML, YAML), and IMO they are not great replacements for JSON.
Why? Because you can't trust most people with anything more complicated than JSON.
I shudder at some of the SOAP / XML I have seen, and whenever you enable something more complicated, inevitably someone comes up with a "clever" idea that ruins your day.
Hijacking for a random concern:
I love JSON, but one of the technical problems we've run into with JSON is that the spec overlooked some of the control characters.
I actually noticed it when reading Douglas Crockford's 2018 book, "How JavaScript Works". The mistake is on page 22.9 where it states that there are 32 control characters. There are not 32 control characters. There are 33 7-bit ASCII control characters and 65 Unicode control characters. When thinking in terms of ASCII, everyone always remembers the first 32 and forgets the 33rd, `del`. I then went back and noticed that it was also wrong in the RFC and subsequent revisions. (JSON is defined to be UTF-8 and is thus Unicode.)
Below is an RFC errata report just to point out the error for others.
Errata ID: 7673 Date Reported: 2023-10-11
Section 7 says:
The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
It should say:
The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F, U+007F, and U+0080 through U+009F).
Notes:
There are 33 7-bit control characters, but the JSON RFC only listed 32 by omitting the last control character in the 7-bit ASCII range, 'del.' However, JSON is not limited to 7-bit ASCII; it is Unicode. Unicode has 65 control characters in total; the range U+0080 through U+009F contributes an additional 32. The section that currently reads "U+0000 through U+001F" should include these additional control characters, reading as "U+0000 through U+001F, U+007F, and U+0080 through U+009F"
---
I've chosen `del` to be my favorite control character since so many engineers forget it. Someone needs to remember that poor little guy.
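The counts above can be checked with a short sketch. It assumes Python's `unicodedata` module (general category `Cc`) and the standard `json` module as one example of a parser that follows the RFC's U+0000 through U+001F range; `accepts_raw` is a name made up for this illustration.

```python
import json
import unicodedata

# Every code point whose Unicode general category is "Cc" (control).
controls = [cp for cp in range(0x110000) if unicodedata.category(chr(cp)) == "Cc"]
ascii_controls = [cp for cp in controls if cp < 0x80]

def accepts_raw(cp):
    """Return True if json.loads accepts this code point unescaped inside a string."""
    try:
        json.loads('"' + chr(cp) + '"')
        return True
    except json.JSONDecodeError:
        return False

print(len(ascii_controls), len(controls))
print(accepts_raw(0x1F), accepts_raw(0x7F), accepts_raw(0x85))
```

With the default strict mode, Python's parser rejects only the raw characters in the 0 to 31 range, so `del` and the U+0080 through U+009F block pass through unescaped.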
May I suggest using TOML, which in my experience has been the perfect blend of human readability and good tooling.
This should have been named NOTjson-somethingv5. Or similar. Now it is far from obvious for the uninitiated that this might not be the 'latest' version of JSON. And then they end up using this incompatible format by accident, when in all likelihood standard JSON would serve equally well or better in 95% of the use cases.
When I manage a project and have the freedom to choose my configuration structure, I always use TypeScript. I never understood the desire to have configuration be in ini/json/jsonnet/yaml. A strongly typed configuration with code completion seems so much more robust. Except, of course, when your use case is to load or change the config via an API.
I like what Apple is doing with https://pkl-lang.org/ though.
I feel like the comments are the only important part. I’d rather not have single quoted strings or unquoted identifiers to be honest. Trailing commas are nice to have though.
All I miss in JSON are comments and a native datetime type. Everything else, I’m fine with.
I think the main problem people are trying to solve comes from treating JSON as a computer-human interface. It was not designed for that, and I don't think we need to expand its use case. You can perfectly well use a subset of YAML, with much better readability, for human interactions. I wrote custom parsers for the subset I need in about 100 lines of Python code. JSON should stay a loggable system-to-system format that you can render in a more readable way.
A common thing in JSON/YAML alternatives is to support more types through syntax. I don't think this is a good idea. YAML already did this badly with the Norway problem, but JSON also has issues with eg "is it float or int", what about nulls, what about precision... and so on.
There are many, many more types to support and all this does is complicate syntax; the types can be relegated to a schema. For example, where are dates, with or without timezones, what about durations, what about SI units for mass, current, what about currency, what about the positive integers only, numbers as hex, as octal, as base64...
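As a rough illustration of the "is it float or int" and precision issues, and of deferring scalar types to a later schema step, here is a sketch using only Python's standard `json` module. The document text and key names are invented for the example.

```python
import json
from decimal import Decimal

text = '{"count": 1, "ratio": 1.0, "big": 9007199254740993, "bigf": 9007199254740993.0}'

# Default decoding: the syntax alone decides int vs float, and floats lose precision.
default = json.loads(text)

# Keep every number as a string and let a schema decide what it means later.
as_strings = json.loads(text, parse_int=str, parse_float=str)

# Or pick an exact type for anything written with a decimal point.
exact = json.loads(text, parse_float=Decimal)

print(type(default["count"]).__name__, type(default["ratio"]).__name__)
print(default["bigf"], exact["bigf"])
```

The value 9007199254740993 is 2**53 + 1, which a 64-bit float cannot represent, so the float form silently rounds while the string and `Decimal` forms keep what was written.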
One format that _nearly_ gets it is NestedText https://nestedtext.org/en/latest/basic_syntax.html ... which means everything gets ingested as strings, dicts or lists, which vastly simplifies things; my quibbles with it would be that it still went for multiple syntaxes (for dictionaries, multiline strings, inline vs multiline dicts and lists). And yet, it still didn't make comments part of the data model (which is so useful when processing or refactoring files). While it's not perfect, it does separate out the validation of scalars, not stuffing someone's priority list of validations into incomprehensible syntax.
YAML's been a decades long mistake and making JSON more like YAML is not the way to fix that.
Would it be correct to say that this is basically any valid JS code that describes an object, excluding the use of references and function definitions?
If not, what is the difference, and why was it made to be different?
Good API design dictates that you should be flexible as to what you accept and strict about what you serve. Being flexible doesn't really break anything.
Elasticsearch and Opensearch both actually have partial support for JSON5 (comments), which is a nice feature if you want to document e.g. a complex query or mapping choice. It won't return any comments in the response. So it won't break other parsers. Implementing JSON 5 support like this is a valid thing to do for any server. More broad support for this in parsers would be nice.
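A minimal sketch of the "accept comments, never emit them" idea, written against Python's standard `json` module. This is not how Elasticsearch or OpenSearch implement it; `strip_comments`, `loads_lenient` and the request body are hypothetical names and data for illustration only.

```python
import json

def strip_comments(text):
    """Remove // and /* */ comments that appear outside of string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character too, so \" does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip up to (not including) the newline.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            # Block comment: skip past the closing */, or to the end if unterminated.
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

def loads_lenient(text):
    """Accept JSON with comments on input; the result is plain data."""
    return json.loads(strip_comments(text))

request_body = """{
  // match every document
  "query": {"match_all": {}},
  /* page size */ "size": 10,
  "note": "http://example.com/*kept*/"
}"""
parsed = loads_lenient(request_body)
response = json.dumps(parsed)  # strict JSON out, no comments
```

Because the comments are dropped before parsing, anything serialized from `parsed` is ordinary JSON that other parsers can read.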
I'd probably enable this on my own servers if this was possible. I'd need that to be supported in kotlinx.serialization. See discussion on this here: https://github.com/Kotlin/kotlinx.serialization/issues/797
I find JSON5 much better than JSON, but it still has many of the same annoyances.
- instead of trailing commas, how about making them completely optional? It's not like they are needed in the first place.
- curly braces for top-level objects could be optional too.
- For a data exchange format, there should really be a standard size for numbers, like i32 for integers and f64 for floats.
- parsing an object with duplicate keys is still undefined behavior.
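To make the duplicate-key point concrete, here is a sketch of how one parser, Python's standard `json` module, behaves, and how a caller can opt into rejecting duplicates. `reject_duplicates` is a hypothetical helper, and other parsers may behave differently.

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of silently dropping repeated keys."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen

doc = '{"a": 1, "a": 2}'

# Default behavior in this parser: the last value for a repeated name wins.
last_wins = json.loads(doc)

try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    rejected = False
except ValueError:
    rejected = True
```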
I don’t know how to feel about this. Personally I want to write configs like code, and I want to avoid using yet another specific DSL for it. So I'm currently working on a tool that allows you to write configs in TypeScript - https://github.com/typeconf/typeconf
I've always been a big fan of KDL in principle, haven't used it in anger. After that HCL, then YAML, with JSON and others being my least favourite to use.
Of course the hard part is gaining enough critical mass to make a significant switch. JSON had AJAX. YAML had Rails. What could make JSON5 or KDL break out?
I'm looking at the JSON5 spec and it appears it does not introduce a capital \U escape sequence for Unicode characters outside the Basic Multilingual Plane (BMP). It's not brought up often, but in JSON you do need UTF-16 surrogates to write an escape sequence for Unicode characters outside the BMP. Consider the Hamburger Emoji (U+1F354). Instead of escaping it as "\U0001F354", you need to escape it with UTF-16 surrogates "\uD83C\uDF54". This is both cumbersome for humans and not in accordance with the Unicode Standard [1]. It's ironic, but many (most?) of the "JSON for Humans" flavors of JSON tend to overlook this.
[1] See Chapter 3.8 "Surrogates" of the Unicode Standard.
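The surrogate arithmetic for the hamburger example can be sketched as follows, using Python's standard `json` module as the JSON parser. `to_surrogates` is a name chosen for this illustration.

```python
import json

def to_surrogates(code_point):
    """Split a code point above U+FFFF into its UTF-16 high and low surrogates."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low

high, low = to_surrogates(0x1F354)
escaped = "\\u{:04X}\\u{:04X}".format(high, low)

# JSON accepts the surrogate pair and decodes it to one character.
decoded = json.loads('"' + escaped + '"')

# The 8-digit \U form is not a JSON escape at all.
try:
    json.loads('"\\U0001F354"')
    long_form_ok = True
except json.JSONDecodeError:
    long_form_ok = False

print(escaped, decoded == "\U0001F354", long_form_ok)
```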
If browsers, Node, etc. start to support JSON5, I am sure it won't take that much time to get adopted.
If you're looking for a human-friendly json superset (comments, non-quoted keys) that can also abstract away repetitive configuration with variables and list comprehensions, check out https://rcl-lang.org/.
I find that these efforts to make something that is almost but not quite JSON to be counterproductive.
It means that you can't tell whether something is JSON or another format. You'll have some tools that can work with it, while other tools will choke because they expect valid JSON. Oh, someone just switched the quoting style, so now your jq-based automation is all broken.
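The kind of breakage described above is easy to reproduce with any strict parser. This sketch uses Python's standard `json` module as a stand-in for such a tool; the sample documents are invented for the illustration.

```python
import json

samples = {
    "strict JSON": '{"name": "demo", "ports": [80, 443]}',
    "single quotes": "{'name': 'demo'}",
    "unquoted key": '{name: "demo"}',
    "trailing comma": '{"ports": [80, 443,]}',
    "comment": '{"name": "demo" // why\n}',
    "leading decimal point": '{"ratio": .5}',
}

def is_strict_json(text):
    """Return True only if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

results = {label: is_strict_json(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'ok' if ok else 'rejected'}")
```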
And now you have to figure out which of these not-quite-JSON formats this is. Is it HuJSON/JWCC? Is it JSON5? Does my editor have a mode that supports this particular variant, or am I always going to be fighting with it?
And finally, having used HuJSON for Tailscale config: the issue isn't just things like comments and trailing commas, or quoting styles. JSON is just a kind of heavyweight and cumbersome syntax for writing config files. I find that I prefer writing a script to auto-generate my Tailscale config, because editing it by hand is cumbersome.
There are a number of other possible config file formats, with varying levels of JSON data model compatibility. YAML has its issues, but we've all learned to live with them by now. TOML isn't bad, though good luck remembering the array of tables syntax. KDL is pretty nice; it has a slightly different data model than JSON, but it's actually one that is somewhat better suited for config files.
I'd rather use any of these for config files than something that is almost, but not quite, JSON.
I always thought that “JSON5” is a deceptive name. It is not the fifth version of JSON; it is an alternative/extension of JSON, of which there are many alternatives, and this one is no more official than any other.
I like JSON5 and have used it some. When GPT was younger and I was parsing its JSON output directly, JSON5 was forgiving in useful ways.
The one thing I really wish it had was some form of multi-line string, mostly so I could use it with line diffs. Also sometimes for really simple property editors it's nice to put the serialization in a textarea, and that works fine for everything but multiline strings.
(I suppose you can escape newline characters to put a string on multiple lines, but I find that rather inhumane and fragile)
This is very close to what the Ruby REPL will accept.
I tried to paste in the kitchen sink - it didn't like dangling decimals and the comment format, everything else worked as expected.
Hjson looks friendlier for direct manipulation, no string quotes
What would be the advantages/disadvantages?
If it’s designed for hand authoring it should support an ISO8601 date format; mere mortals cannot author numeric timestamps without tools.
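In the absence of a native date type, one workaround is to write ISO 8601 strings by hand and convert them at load time. A sketch with Python's standard library follows; the `expires` key and `DATE_KEYS` are assumptions for the example, not part of JSON or JSON5.

```python
import json
from datetime import datetime, timezone

# Hypothetical convention: these keys always hold ISO 8601 timestamps.
DATE_KEYS = ("expires",)

def decode_dates(obj):
    """object_hook: called for every decoded object; converts known date keys."""
    for key in DATE_KEYS:
        if key in obj:
            obj[key] = datetime.fromisoformat(obj[key])
    return obj

config_text = '{"name": "token", "expires": "2024-03-01T12:00:00+00:00"}'
config = json.loads(config_text, object_hook=decode_dates)

# The numeric form nobody wants to type by hand.
epoch_seconds = int(config["expires"].timestamp())
print(config["expires"].isoformat(), epoch_seconds)
```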
There are a few more tolerant versions of JSON. In OjG I called the format SEN https://github.com/ohler55/ojg/blob/develop/sen.md
The only thing I worry about is how do you parse this, then modify some fields and write back the file with all the comments still in place?
Why not just not use JSON for config? In a sane world YAML wouldn't even exist and everyone would use something like TOML.
What is the benefit of this over something like Pkl[0]? Pkl compiles down to JSON, YAML, etc., but the language itself is user-friendly. This way you get best of both worlds: readable and editable for humans, and parsable for computers.
[0]: https://pkl-lang.org
This may be heretical but surely the problem isn't lack of comments et al in JSON, rather that people try to use JSON for everything, when it was designed to be a text representation of JavaScript objects?
Shameless plug for my JSON/5 parser written in Zig: https://github.com/berdon/zig-json
There is a std json library as well but the aesthetics weren’t great imo.
The specs are quite pleasant to implement.
I find HOCON[0] to be great for this need in JVM-based languages.
It feels like AI has made this redundant.
I honestly cannot imagine hand typing out some JSON now, or most code for that matter.
I just write in natural language what I want and the AI will perfectly output valid JSON.
> JSON for Humans
The emoji in the first paragraph seems to convey the understanding that humans like expressiveness, but the format itself doesn't allow Unicode values in keys, which seriously limits said expressiveness...
The "official" JSON should be enhanced to cover a few of the pain points.
I wish json had a simple version/convention like elixir sigils so I could pass datetimes around as first class entities instead of always having to [de]stringify them.
Looks very nice, but I feel it's one of those "yeah, it's better, but only useful for personal projects, or until it gets critical mass which won't be until after I'm dead, if at all" projects.
YAML is for people, JSON is for machines.
I think the killer feature of JSON is that there’s one version and that won’t ever change. You don’t have to think about compatibility.
All JSON is valid YAML. So you clearly can make yet another one of these and make it support JSON. But JSON doesn’t support the stuff you’re adding, so calling it JSON5 just makes things confusing as if it’s a version and not a whole new thing altogether.
The ugliest thing the authors could accomplish is making this sufficiently popular that there’s a ton of .json files out there that aren’t actually valid JSON. I hope they’re being careful about strongly discouraging ever writing these outputs to files with a .json filetype.
Just use Jsonnet if you want this IMO. No need to change JSON into YAML.
Comments are nice. I wonder if they can also be inserted programmatically.
I'm a huge fan. We use it for all our configs.
Unfortunately this is basically that XKCD cartoon about proliferating standards. I think I’d avoid this additional standard and just use JSON or a JavaScript object if I really need this level of flexibility.
Eh, if you drink, then drink...
1. Add `;` as a separator of elements, so you may have:
{ a: "foo"; b: "bar"; }
2. Add array tags and space separated value lists so you may have { a: 12 13 14; }
to be treated as [12, 13, 14] with the tag " ". Normal arrays are parsed with the tag ","
3. Add "functors" as, again, tagged arrays rgb(128,128,14);
will be parsed to an array with the tag "rgb". Also you may have calc(128 + 14);
4. Add tagged numbers so 90deg
will be parsed as a number with the tag "deg"
And you will get pretty much CSS that is proven to define quite complex constructs with minimal syntax.
No I don’t need this thing.
` leadingDecimalPoint: .8675309, andTrailing: 8675309.,`
Sorry, but what is the benefit of this? Lazy shorthand? This is too much. Is this a string in other languages? In PHP, `.` is the string concatenation operator.
I think it allows for too much. I was glad that JSON only supports double-quoted strings. It is a feature that removes discussions about which quotes to use. Or even whether to use quotes at all (we still need them for keys with colons or minus signs in them, so what gives?).
The only things that JSON is really missing are comments and trailing commas. I use JSONC for that. It's what VS Code uses for its config format and it works.