A comprehensive guide to storing boolean…

Jun 8, 2021

Booleans are the simplest structural type. Let's investigate how to model boolean data.

6 Comments

Jun 16, 2021

Nitpick: your JSON optimization should be avoided like the plague. The optimization you propose is NOT equivalent. The if condition will be evaluated true and executed for ANY non-empty value on the property hash_gym. This, of course, includes "false" (as string), "f", "F", "NO", whitespace or any invalid data.

Keep in mind, you're not transforming the attribute hash_gym into an optional attribute; you are adding both implicit and explicit behaviour to the property: if it doesn't exist, treat it as optional and assume false; if it exists, you couldn't care less about the contents and assume true. This makes API testing a nightmare, especially if you're exposing it to people outside your team.
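A minimal Python sketch of the difference described above, assuming a response in which the flag was mistakenly serialized as a string. The function names are hypothetical and only illustrate the two readings of the property; they are not taken from the post.

```python
import json

KEY = "hash_gym"  # property name as quoted in this comment


def flag_by_truthiness(raw):
    """The shortcut: any non-empty value on the property counts as true."""
    data = json.loads(raw)
    return bool(data.get(KEY))


def flag_strict(raw):
    """A missing key means false; a present key must be a real JSON boolean."""
    data = json.loads(raw)
    value = data.get(KEY, False)
    if not isinstance(value, bool):
        raise TypeError("{} must be a JSON boolean, got {}".format(KEY, type(value).__name__))
    return value


if __name__ == "__main__":
    for raw in ['{"hash_gym": true}', '{"hash_gym": false}', '{}', '{"hash_gym": "false"}']:
        try:
            strict = flag_strict(raw)
        except TypeError as exc:
            strict = "rejected ({})".format(exc)
        print(raw, "| truthiness:", flag_by_truthiness(raw), "| strict:", strict)
```

With the truthiness check the string "false" is silently treated as true, while the strict variant rejects it.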


Alexey Makhotkin

Jun 17, 2021

I'm not sure I follow. The server can define the API as "having true boolean value if true, having false boolean value or missing key if false".

How can you begin getting "NO" from the server under this contract?


Jun 18, 2021

The first wrong assumption is that JSON somehow is a contract. It's not. It's a notation.

You may or may not validate a specific response against a contract, but that is either 3rd party and half-assed (json-schema) or code-specific (your code). Some languages (especially dynamic ones) will eat any kind of garbage in the fields; some languages will apply boundary constraints and try their best to match the types. Do not hold your breath, though, as mileage may vary.

Problem 1:

Back to your example: you have an attribute that you decided to omit from the response when it's false. Back to the notion of "contract": you just changed a parameter you presented as required to optional; existing test code, wherever the response is generated, needs to check that the following scenarios hold:

a) Mock value true will generate a boolean attribute with true as value;

b) Mock value false will generate a boolean attribute with false as value;

c) Mock value false will omit the attribute;

While in some practical implementations this will probably make no difference, you're both changing the definition of your notion of contract by changing parameter scope and adding an implicit behavior. This is shoddy practice and will bite you in the rear sooner or later.
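The three scenarios above could be checked roughly as follows. This is only a sketch: `render_profile` and its `omit_false` switch are hypothetical stand-ins for whatever code generates the response, and the payload is round-tripped through JSON so the checks run against what would actually go over the wire.

```python
import json


def render_profile(has_gym, omit_false):
    """Hypothetical response builder; omit_false enables the proposed optimization."""
    body = {"name": "Example"}
    if has_gym or not omit_false:
        body["hash_gym"] = has_gym
    # Round-trip through JSON to inspect the serialized form, not the Python dict.
    return json.loads(json.dumps(body))


def check_scenarios():
    # a) mock value true generates a boolean attribute with true as value
    assert render_profile(True, omit_false=True)["hash_gym"] is True
    # b) mock value false generates a boolean attribute with false as value
    assert render_profile(False, omit_false=False)["hash_gym"] is False
    # c) mock value false omits the attribute
    assert "hash_gym" not in render_profile(False, omit_false=True)


if __name__ == "__main__":
    check_scenarios()
    print("all three scenarios hold")
```

Scenarios b) and c) both start from the same mock value, so every consumer has to accept two different shapes for a single logical state.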

Problem 2:

Are bools really bools? Can other invalid types be interpreted as boolean in some languages? Oh yes they can! One quick example (python):

------------

for datum in ['', 0, None, False, "false", [], {}, 'a', 1, True, "true", [0]]:
    # evaluated this form to keep the implicit bool evaluation spirit
    if datum:
        print("type {} with value '{}' evaluates to True".format(type(datum), datum))
    else:
        print("type {} with value '{}' evaluates to False".format(type(datum), datum))

-------------

Outputs:

type <class 'str'> with value '' evaluates to False

type <class 'int'> with value '0' evaluates to False

type <class 'NoneType'> with value 'None' evaluates to False

*type <class 'bool'> with value 'False' evaluates to False

*type <class 'str'> with value 'false' evaluates to True

type <class 'list'> with value '[]' evaluates to False

type <class 'dict'> with value '{}' evaluates to False

type <class 'str'> with value 'a' evaluates to True

type <class 'int'> with value '1' evaluates to True

type <class 'bool'> with value 'True' evaluates to True

type <class 'str'> with value 'true' evaluates to True

type <class 'list'> with value '[0]' evaluates to True

Check the lines marked with *. See the problem? String values are evaluated as bool differently depending on if they match the actual language notion of bool. If you try this in PHP, you'll actually get even weirder results, as string contents may be implicitly evaluated as bool on a given context.

You may say "ok, but I was referring to true json bools, not mistyped values". Fair enough. Problem 1 still stands, and a reality check will show you that problem 2 may cause quite some concern in many scenarios, and create subtle bugs that are not easy to detect. Can you quickly spot the difference between false and "false" on a 10kb response? Probably not.

Worse, some languages will happily eat whatever garbage you feed them, evaluate it as a string and then evaluate the string! Your response may be generating completely wrong data (such as placing the name in that field because someone mistyped a field, or because your data source has an incorrect mapping), and some of your systems will happily digest it as "evaluate to true". Except if the name is false. Or probably "False" if using Python. Or "0name" if it touches PHP.

Rules of thumb for good programming:

- Be explicit on the behavior;

- Never trust input (either user or computer generated);

- When using dynamic languages, always validate your types;
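As one way to apply the last rule in Python, here is a small sketch of an exact-type check. The field list is a hypothetical schema made up for illustration. It compares with `type(...) is` instead of `isinstance` because `bool` is a subclass of `int` in Python, so `isinstance(True, int)` is true and a boolean would otherwise pass as an integer.

```python
# Hypothetical schema: field name -> exact Python type expected after json.loads
EXPECTED_TYPES = {"name": str, "visits": int, "hash_gym": bool}


def validate(record):
    """Return a list of human-readable type errors; an empty list means valid."""
    errors = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in record:
            errors.append("{}: missing".format(key))
        elif type(record[key]) is not expected:
            # Exact type comparison, so True is not accepted where an int is expected.
            errors.append("{}: expected {}, got {}".format(
                key, expected.__name__, type(record[key]).__name__))
    return errors


if __name__ == "__main__":
    print(validate({"name": "a", "visits": 3, "hash_gym": False}))
    print(validate({"name": "a", "visits": True, "hash_gym": "false"}))
```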


Alexey Makhotkin

Jun 20, 2021

Thank you. Maybe it's my Perl background, but I'm not particularly frightened by the examples you provide. I'd say that for me one of the things that continue to bite me is the fact that in Ruby numeric zero is true.

Also, some of your arguments I think apply even if we decide to explicitly include each boolean attribute key, like "Can you quickly spot the difference between false and "false" on a 10kb response? Probably not." I'm not sure why this is part of the argument.

Also, I'm not sure why you call JSON-schema half-assed! Of course you very often need an extra validator that could only be implemented in the native code, but JSON-schema handles the proverbial 80%.

thanks,


Jun 20, 2021

It's probably your Perl background :) Keep in mind, this is not some random code review where specific issues don't apply because of some other existing layers of a specific application. You're giving advice, and even stating it as "comprehensive" advice, on how to do this specific thing. Yet you present shortcut-based assumptions that don't work well in several languages, and may introduce subtle bugs in the code. It's your blog, of course; I'm just stating why these things aren't usually handled in a lighthearted way. Because they are not simple.

Regarding false vs "false": if you're just testing for field existence in a dynamic language (using the example code if (var) {}), one will be treated as false, while the other may be treated as false or true, depending on the language. If you don't think this is a problem, I don't know what else to say :)

Have you ever used anything besides JSON? I'd suggest you have a look at e.g. Amazon's Ion, and why it exists (or a ton of other formats that try to solve the same problem). Or have a look at older stuff like XML schemas and SOAP and see how you can build software interfaces that last decades, without having to "hack" extra validators for stuff like properly formatted RFC 3339 dates, 64-bit integers, money values that are actually numeric and not float, single-document-multiple-version validations and whatnot. I'm not saying these formats don't have issues, but let's face it: they are (or still somewhat are) the bread and butter for interconnecting heterogeneous systems in the enterprise world for some reason.
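To make the point about money values and 64-bit integers concrete, here is a small sketch using only Python's standard `json` and `decimal` modules. The payload and its field names are invented for illustration. It shows what a consumer that reads JSON numbers as binary floats sees, compared with one that parses them as decimals.

```python
import json
from decimal import Decimal

# Hypothetical payload: two money amounts and an id just above 2**53
RAW = '{"price": 0.1, "tax": 0.2, "id": 9007199254740993}'

as_float = json.loads(RAW)                          # default: JSON reals become float
as_decimal = json.loads(RAW, parse_float=Decimal)   # JSON reals become Decimal

float_total = as_float["price"] + as_float["tax"]
decimal_total = as_decimal["price"] + as_decimal["tax"]

# What a consumer that stores every number as a double would keep of the id
id_as_double = int(float(as_decimal["id"]))

if __name__ == "__main__":
    print("float total:  ", float_total)
    print("decimal total:", decimal_total)
    print("id as parsed: ", as_decimal["id"])
    print("id as double: ", id_as_double)
```

Python itself keeps the integer exact, so the last line only simulates a consumer limited to doubles; nothing in the JSON notation tells the receiver which interpretation was intended.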


Jun 18, 2021

Edit:

when you read:

"String values are evaluated as bool differently depending on if they match the actual language notion of bool."

you should read:

"String values are evaluated as bool differently depending on whether they have content or match the actual language notion of bool (the latter is not applicable to Python)"


Minimal Modeling
