
Why The Web Is Such A Mess

Den W. 23 November 2020

When Tim Berners-Lee proposed the idea  for the World Wide Web back in 1989,   he wrote about a “universal linked  information system”. His exact words:  

“a place to be found for any  information or reference which   one felt was important, and a  way of finding it afterwards”

Three decades later, and you’ve possibly  noticed that the web is a bit of a mess. What happened to that  universal information system?   What even is a cookie, and  why do we need to accept them?

There are two reasons that  the web is terrible right now: first, advertisers. And second, people.

Let’s talk  about the advertisers. The first versions of the web  were simple. Text and images   and links to other text and images.  You could use forms to send information,   but the whole thing was ‘stateless’. This means neither the web browser nor the web server kept track of you from one page to the next: you just sent a request for a document, you got the document back, and that was it. If you submitted a form, that one response could be  tailored, but after that, nope, the server’s forgotten about you because it can’t tell you apart from anyone else who’s using the same network.
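To make "stateless" concrete, here is a minimal sketch in Python. It is not how any real web server is written: the DOCUMENTS table and the handle_request function are names made up for illustration. The point is only that the response depends on the one request in front of the server, and nothing is remembered afterwards.

```python
# Hypothetical in-memory "web server": every name here is illustrative.
DOCUMENTS = {"/index.html": "Welcome to the site."}


def handle_request(path, form=None):
    """Build a response from this one request only; nothing is stored."""
    body = DOCUMENTS.get(path, "404 Not Found")
    if form and "name" in form:
        # A submitted form can tailor this single response...
        body = f"Hello, {form['name']}! {body}"
    # ...but no record of the visitor is kept for the next request.
    return body
```

Calling handle_request("/index.html", {"name": "Ada"}) gives a personalized page, and the very next call without the form gets the plain page again: the server has already forgotten.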

This wasn't, long-term, a great plan. In theory, you could hack together  something more complicated,   send forms back and forth, and people  did build things like that, but they’d break very, very easily. It was impossible to have something  both reliable and personalized.

Then in 1994, a programmer called  Lou Montulli, who was building a   web browser at a company called  Netscape, came up with “cookies”,   adapting an old computer science  technique called “magic cookies”.   They were so useful that every other  browser started to use them as well,   and they got built  into the web standards.

A cookie is “client-side storage”: your web browser sends the request to the server, as usual, but the server sends back two things: the page you asked for, and a little piece of text called a cookie that your browser then saves. After it’s saved, every time that browser makes a request to that same server, it sends the cookie along with it. So that cookie could contain your preferences: if you choose “dark mode”, then the browser would now send a “dark mode” preference, in that cookie, with every single request. If the server asked for it, then that cookie would last even if you closed the browser and restarted your computer. For years and years and years.
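That round trip can be sketched with Python's standard http.cookies module. The helper names build_set_cookie and parse_cookie_header, the cookie name "theme" and the one-year lifetime are all assumptions for illustration; the article does not specify any of them.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, max_age_seconds=None):
    """Server side: the text that follows 'Set-Cookie:' in the response."""
    cookie = SimpleCookie()
    cookie[name] = value
    if max_age_seconds is not None:
        # Asking the browser to keep the cookie after it is closed.
        cookie[name]["max-age"] = max_age_seconds
    return cookie[name].OutputString()


def parse_cookie_header(header):
    """Server side: read back what the browser sent in its 'Cookie:' header."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {key: morsel.value for key, morsel in cookie.items()}
```

For example, build_set_cookie("theme", "dark", 31536000) produces the header value a server might send once, and parse_cookie_header("theme=dark") is what it would do on every later request to recover the preference.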

Or a cookie could be used for authentication: so after you’ve logged into a site, that site gives your browser a cookie that’s just a long, random string of characters. But because no-one except the server and your browser has that long string of characters, it can act like a password: the browser sends it back along with every page request, and the server looks at it and goes “oh, yeah, I remember you”.
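A rough sketch of that idea, assuming an in-memory table of sessions: the names sessions, log_in and who_is are hypothetical, and a real site would also check a password, expire old sessions and store them somewhere more durable.

```python
import secrets

# Hypothetical session table: random token -> username.
sessions = {}


def log_in(username):
    """After a successful login, mint a long random string for the cookie."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    sessions[token] = username
    return token


def who_is(token):
    """On later requests, look the cookie value up: 'oh, yeah, I remember you'."""
    return sessions.get(token)
```

Because the token is random and long, guessing someone else's is impractical, which is why it can stand in for a password on every request.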

The designers of cookies looked at that, and they knew that they could be used   to track people, so there was a  very, very important rule for privacy:   from the very start, only the site that set the  cookie can read the cookie back. Except...

Because of the way the web  is built, sites can include material   from other servers. A web page can  pull in an image from somewhere else,   or some scripting code, or even an  entire other website, included in a frame. And those other servers could  all set and read their own cookies.   “Third-party cookies”, they were called.

So when web pages included advertising, well, those adverts came from the advertising company’s servers, so those advertisers could set a cookie that tracked people across the entire web. If the same ad company partnered with loads of different websites, they could start building up some very detailed profiles. No, they might not know who you are at first. But they knew you visited that website about your hometown, then you visited that website about your college, and then you visited that website about a very specific medical condition. Oh, and then you bought something from an online shop, so they can work behind the scenes to combine the details from that shop and their cookies, and add your name and address to the profile.
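Here is a toy model of how one third-party cookie turns into a browsing profile. The AdServer class and the example site names are invented for illustration; real ad networks are far more elaborate, but the core trick is this small.

```python
import uuid
from collections import defaultdict


class AdServer:
    """Hypothetical ad company whose adverts are embedded on many sites."""

    def __init__(self):
        # visitor id -> list of sites where that visitor saw one of our ads
        self.profiles = defaultdict(list)

    def serve_ad(self, embedding_site, cookie=None):
        """Handle one ad request; returns the cookie value to store."""
        # First visit: no cookie yet, so hand out a fresh random id.
        visitor_id = cookie or uuid.uuid4().hex
        # Every request tells the ad server which page included the ad.
        self.profiles[visitor_id].append(embedding_site)
        return visitor_id
```

Each website only ever sees its own visitors, but the ad server sees the same cookie come back from all of them, so its list for that one id keeps growing.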

All those fancy “share with Facebook” buttons? Well, yeah, loads of them   could report back to Facebook where you were going. "Hey, this person's looking at all these sites! So now you can tailor adverts for them!" Which is why you could be on a site that seems to be completely disconnected from Facebook, and still see adverts that were uncannily linked to  things you’ve been thinking about.

It took a long while before  regulators decided that,   maybe, advertising companies having  enormous databases on basically   everything that people were thinking  might be a bit of a privacy problem.

The European Union acted  first. The “Cookie Law”,   which was the euphemistic name for, um, this mess... meant that all the EU member  countries had to put a law in place   requiring consent to place cookies.

That wasn’t particularly well-thought-out, because “consent” requires clicking OK, and if there’s a confusing box in between the average person and the article they want to read, look, nearly everyone is just going to click OK. And by this point, browser makers were starting to give users options to block third-party cookies anyway, so advertisers were adjusting: they were tracking people not through cookies, but through “fingerprinting”, looking at all the apparently-innocent individual quirks of how that exact computer was set up and how it rendered a web page. The fonts you had installed, the size of your monitor, the websites you've visited in the past, all things that individually appear innocent and safe but which together uniquely identify each person.
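The principle behind fingerprinting fits in a few lines: combine many harmless-looking attributes and hash them into one identifier. This is only a sketch of the idea; the fingerprint function and the attribute names in the example are assumptions, and real fingerprinting scripts collect far more signals.

```python
import hashlib
import json


def fingerprint(attributes):
    """Collapse a dict of browser quirks into one stable identifier."""
    # Sorting the keys means the same quirks always give the same id,
    # whatever order they were collected in.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Something like fingerprint({"fonts": ["Arial", "Comic Sans MS"], "screen": "1920x1080"}) comes out the same on every visit from that machine, with no cookie stored anywhere, and changes as soon as any one attribute differs.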

Browser makers and advertisers ended up in an odd war with each other about privacy: occasionally a civil war, given that one of the  most popular browsers and also one of the major advertising companies were under the same corporate umbrella.

So later, the General Data Protection Regulation, GDPR, put even stronger requirements in place, at least for EU citizens. Not just “no cookies without consent”: no tracking without consent. At all times, the user must have full knowledge of what they are signing up for, and must consent to it. Even if the company’s not in the EU. Even if the data on EU citizens is stored outside the EU, it still applies to them. The potential maximum fine is 4% of the company’s global annual revenue. Not profit, revenue. Or €20 million, whichever is higher. And yeah, sure, maybe the EU couldn’t easily fine an all-American company, but they can definitely stop them doing business in Europe if they don't pay.
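The "whichever is higher" rule is easy to get backwards, so here it is as arithmetic, using only the two figures quoted above. The function name is made up, and this is the ceiling on a fine, not a prediction of what any regulator would actually impose.

```python
def max_gdpr_fine_eur(global_annual_revenue_eur):
    """Upper limit per the figures above: 4% of revenue or EUR 20M, whichever is higher."""
    four_percent = global_annual_revenue_eur * 4 // 100
    return max(four_percent, 20_000_000)
```

So for a company with €100 million in revenue the €20 million figure is the larger one, while at €1 billion in revenue the 4% figure takes over at €40 million.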

So all the advertisers had two options. Number one: stop tracking people! Or number two, add a big box on each page saying “hey, is it OK if we track you?” And legally, you should  be able to easily say no and still access the site. Opting out should  be a simple, one-click process.   And anyone who’s been on the web  lately knows that’s often not true.

So, yes, at least some of those popups and frustrations are because of advertisers. Certainly, they normalized  the idea of intrusive popup boxes. But why can they put those  boxes there in the first place?

The web is built around freedom  of design. It’s one of the most   fundamental principles: within the  box that the web browser gives you,   a designer can put almost anything. 

For decade after decade after decade, programmers have pushed the limits of what’s possible: someone has even managed to emulate an entire Windows 95 computer in the web browser. I don’t mean just making a web page look like Windows 95, I mean an entire virtual machine that thinks it’s running on regular computer hardware but is actually just simulated in code, in a web browser.

And that's incredible! Give people power to create and some will make amazing things. But some will look at the ability to run almost any code and put almost anything on screen, at least within that box, and go “maybe I can get some more newsletter sign-ups if I interrupt the reader”. Or “maybe I can make people’s browsers mine cryptocurrency in the background while they’re on my site!” So the war starts again.

Most web browsers now have a separate view that you can switch into, that shows just  the text you want. It might be called “article view” or “reader mode”. The browser tries to figure out what the main content is, what the user actually wants to see, and puts just that text on screen. Stripping out the adverts,  removing all the popup boxes, and, yes, losing all the creativity  and design that the web allows.   But that ‘reader view’ is a lot closer  to Tim Berners-Lee’s original design:   a place for information, and  a way to find it afterwards.
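As a very rough illustration of what a reader view does, the sketch below keeps only the text found inside paragraph tags and throws everything else away. This is a deliberately naive heuristic, not the algorithm any actual browser uses; the ReaderView class and reader_view function are hypothetical names, built on Python's standard html.parser module.

```python
from html.parser import HTMLParser


class ReaderView(HTMLParser):
    """Naive extractor: collect the text inside <p> elements, ignore the rest."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paragraphs = []
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self._current = []

    def handle_endtag(self, tag):
        if tag == "p" and self.in_p:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._current).split())
            if text:
                self.paragraphs.append(text)
            self.in_p = False

    def handle_data(self, data):
        # Text outside a paragraph (popups, scripts, menus) is dropped.
        if self.in_p:
            self._current.append(data)


def reader_view(html):
    parser = ReaderView()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.paragraphs)
```

Feed it a page with a popup div, a tracking script and two paragraphs, and only the two paragraphs come back. All the styling goes too, which is exactly the trade-off described above.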

Of course, there are already techniques  to detect and defeat reader view,   and make sure that the user has to see the  adverts, or the newsletter popups, or   whatever else. Every bit of freedom that a  designer has for creativity and good   is also freedom for abuse.  That’s true not just on the web,   but in every single aspect  of the online world.
