Why TRUE + TRUE = 2: Data Types

Den Den 04 May 2020

In the early days of computing, programmers needed to be very sure about the data they were operating on. If an operation was fed a number when it was expecting a letter, a character: at best you might get a garbled response, and at worst you could break the system. Maybe even physically. At the low level of coding, yeah, that’s still true. But these days we have programming languages where we don’t always need to be so rigorous in defining the data, and we can let the computer figure it out. But for something that seems so technical, this can be controversial.

When you write a computer program, you use variables, which are basically just labelled buckets of memory. Inside that bucket is some data, and you can change it, you can vary it. Hence, variable. I know it’s a massive simplification, but computer memory is a bit like an enormous rack of switches storing ones and zeros that represent other things like letters and numbers. But if we look into a region of memory, there is nothing in there to indicate what those ones and zeros are actually representing. So in the code, we declare that the variable is a particular type. What's contained in that variable, in that bucket? It's an integer. What's in that one? It's a string of characters. That tells the computer how to interpret those ones and zeros in memory.
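To make that a bit more concrete, here is a minimal sketch in JavaScript of the same two bytes of memory being read three different ways. The variable names are just for illustration, and the byte order for the 16-bit reading is an arbitrary choice made explicit in the code.

```javascript
// Two bytes of raw memory: just the bit patterns 01001000 and 01101001.
const bytes = new Uint8Array([0b01001000, 0b01101001]);

// Read as two unsigned 8-bit integers.
const asNumbers = Array.from(bytes); // [72, 105]

// Read as text, treating each byte as a character code.
const asText = String.fromCharCode(...bytes); // "Hi"

// Read as one 16-bit integer (little-endian byte order, chosen explicitly).
const asInt16 = new DataView(bytes.buffer).getUint16(0, true); // 26952

console.log(asNumbers, asText, asInt16);
```

Nothing in the bytes themselves changed between those three lines. Only the declared interpretation did.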

The types that you get to use can differ a bit between languages. But in general, you’ll at least have: Integer or INT. That’s a whole number that can’t have anything after the decimal point. And those are extremely useful for storing things like the number of times you’ve looped through some code, or how many points your player’s clocked up, or how many pennies there are in someone’s account.

Then you've got character or CHAR. That's a single letter, digit, punctuation mark, or piece of whitespace, like the space between words or the instruction to start a new line.

And in most high-level languages, you’ll probably be using a STRING instead, which is just a string of characters.

Then you've got Boolean, or BOOL, named after George Boole, an English mathematician. That's very simple: it's true or false. A boolean only contains either a zero or a one. A yes or a no. A no or a yes. Nothing more.

Then there’s floating-point numbers, or FLOATs. Floats are complicated and messy, but in short, they let you store numbers with decimals, although you might lose a very small bit of precision as you do it.
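Here is a small sketch of that lost precision, in JavaScript. One caveat: JavaScript doesn't have separate integer and float types for ordinary numbers, every plain number is a double-precision float, so this shows the float behaviour rather than the distinction between the two. The tolerance check at the end is one common workaround, not the only one.

```javascript
// 0.1 and 0.2 cannot be stored exactly in binary floating point,
// so their sum is very slightly off.
const sum = 0.1 + 0.2;

console.log(sum);         // 0.30000000000000004
console.log(sum === 0.3); // false

// A common workaround: compare within a small tolerance
// instead of testing for exact equality.
const closeEnough = Math.abs(sum - 0.3) < Number.EPSILON;
console.log(closeEnough); // true
```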

There are other types in a lot of languages, and I know it’s more complicated than this, but these are the basics. Most languages use “explicit type declaration”: when you declare a variable, when you set up that bucket, you also have to declare its type. So, x is an integer, it can only hold integers, and right now, that integer is 2. But in some languages, including some popular ones that people tend to get started with, and that I like, you don’t need to actually declare that. It just gets figured out from your code. That’s called “implicit declaration”. So in JavaScript, you can just type x = 1.5 and it’ll know, that’s a number. Put the 1.5 in quotes, and it’ll go, ah, it’s a string. So, okay, it’s storing 1.5 and "1.5" as different ones and zeros. Why does that matter?
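You can watch JavaScript figure the type out with the built-in typeof operator. This is a minimal sketch: the `let` keyword and the two result variables are additions for the sake of a runnable example.

```javascript
// No type is declared anywhere: JavaScript infers it from the value.
let x = 1.5;
const typeAsNumber = typeof x; // "number"

// The same variable can later hold a value of a different type.
x = "1.5";
const typeAsString = typeof x; // "string"

console.log(typeAsNumber, typeAsString);
```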

Well, in JavaScript, the plus sign means two different things. It’s the addition operator, for adding two numbers together. But it’s also the concatenation operator, for combining two strings together. So if x is 1.5, a number, and you ask for x + x, it returns 3. But if either of those xs is "1.5", a string, it'll return "1.51.5". And that's called “type casting”: converting from one data type to another.
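The two meanings of the plus sign can be sketched like this. The variable names are made up for the example.

```javascript
const num = 1.5;   // a number
const str = "1.5"; // a string that happens to look like a number

// Two numbers: + means addition.
const added = num + num; // 3

// Two strings: + means concatenation.
const joined = str + str; // "1.51.5"

// One of each: the number is converted to a string, then concatenated.
const mixed = num + str; // "1.51.5"

console.log(added, joined, mixed);
```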

Some languages require the programmer to explicitly request the conversion in code. Other languages, like JavaScript there, do it automatically. JavaScript is referred to as having “weak typing” as opposed to “strong typing”. And it’s weak because, even if that 1.5 is a string, and you ask for it multiplied by 2… it’ll return 3. Unlike the plus sign, that asterisk can only mean ‘multiply’, so it can only handle an integer or a floating-point number. Give it a string, though, and it won’t throw an error like a strongly-typed language would. It’ll just convert it for you on the fly. Really convenient. Really easy to program with. Really easy to accidentally screw things up and create a bug that’ll take you hours to track down. Or worse, create a bug that you don’t even notice until much, much later.
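Here is a short sketch of that on-the-fly conversion, next to the explicit version using the built-in Number function. The last line is one example of how the convenience can quietly turn into a bug: no error is thrown, you just get NaN ("not a number") flowing through the rest of your program.

```javascript
const text = "1.5";

// Implicit: the string is converted to a number because * only means multiply.
const product = text * 2; // 3

// Explicit: the programmer asks for the conversion.
const explicitProduct = Number(text) * 2; // 3

// A string that is not a number does not throw an error either.
const broken = "abc" * 2; // NaN

console.log(product, explicitProduct, broken);
```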

In a lot of languages, you can also cast to and from boolean values. Which is called "truthiness", and experienced programmers who are watching this may already be grimacing. Truthiness is a great shorthand. If you convert an empty string to a boolean, it generally comes out as false. Anything else, true. So you can just test for an empty string with if(x). But that also means that in JavaScript, you can ask for true + true and it’ll tell you that the answer to that is 2, because when you cast ‘true’ to a number you get 1.
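A minimal sketch of truthiness in JavaScript follows. The `describe` function is hypothetical, only there to show the if(x) shorthand in use.

```javascript
// Casting strings to booleans.
const emptyAsBool = Boolean("");     // false
const textAsBool = Boolean("hello"); // true

// The shorthand: test for an empty string without comparing to anything.
function describe(x) {
  if (x) {
    return "has text";
  }
  return "empty";
}

// Casting booleans to numbers.
const one = Number(true); // 1
const two = true + true;  // 2

console.log(describe(""), describe("hello"), one, two);
```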

In PHP, a language notable for many questionable design decisions, even a string with just a single zero in it will get converted to a boolean false, there’s a special case just for that string. Which can cause a lot of unexpected bugs.

There is a workaround for that in loosely-typed languages. Normally, if you want to compare two values, you use two equals signs, as in 1.5 == "1.5". You can’t use a single one, because that’s used for assigning variables. I’ve been coding for about thirty years and I still absent-mindedly screw that up sometimes. If you ask if 1.5 is equal to "1.5" with two equals signs in JavaScript or PHP, you’ll get true. But if you add a third equals sign, as in 1.5 === "1.5", then you’re asking for strict equality. If the data types don’t match, the comparison automatically fails and you get false.
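In JavaScript, the difference looks like this. Again, the variable names are only for illustration.

```javascript
// Loose equality: the string is converted to a number before comparing.
const loose = 1.5 == "1.5"; // true

// Strict equality: a number and a string never match, whatever they contain.
const strict = 1.5 === "1.5"; // false

// Strict equality still succeeds when both type and value are the same.
const sameType = 1.5 === 1.5; // true

console.log(loose, strict, sameType);
```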

So why is all this controversial? Well, languages like JavaScript and PHP can get a bad reputation because they use weak typing. If you see yourself as a Real Programmer -- and I’m using that sarcastically, but if you see yourself as the kind of programmer where you are in control of everything, then… yeah, you can see that weak typing is like training wheels, something that introduces sloppy coding practices and bugs and shorthand. And that’s not unfair. But weak typing also makes programming easier to learn and easier to do, it can reduce frustration and just make programmers’ lives easier. It is a trade-off: even if it is a controversial one.
