Mar 24, 2010

The approximate HTML

At the time of writing, the homepages of Google, Yahoo, Facebook and Twitter had 40, 154, 41 and 88 validation errors, respectively. No matter whether they declare their document type as HTML 4.01, HTML5 or XHTML 1.0, they all fail to validate and break basic rules. It's worth mentioning that Google is represented in the HTML working group at W3C and also handles the editing of the HTML5 draft. That's one example.

But why would they violate the web standards so flagrantly? They, in fact, produce incorrect code. Client code. And then they inevitably make it public, available for others to see how they've done it. These are some of the most visited and busiest sites out there, and yet they treat the rules of the web as something optional. Web browsers (some more than others) are often criticized for not rendering pages correctly and for making it difficult for developers to build consistent applications. Very true. But do developers respect the rules they expect browsers to respect? It seems not. And this is just a (representative) sample, because only a few sites have valid markup. Even though an end user may not notice a simple HTML error, or may not care about it, the web standards have a clear goal of making things better overall for end users, while staying transparent to them. Markup is the support framework of any content. And in the long run, healthy markup will sustain better content. Better = correct, accessible and meaningful.

Of course, it's not all that easy to produce valid code, especially for dynamic web pages. It's even harder when outputting a lot of markup from the server side. But this alone is a bad practice as well. Saving bytes per page and relying heavily on JavaScript may be offered as reasons for invalid markup, but neither holds up as a solid argument. I think the best way to have your HTML clean and correct is to be willing to do it. After all, it's not enough to have a doctype at the top of your page; you'll also need to write the rest of the page with it in mind.

Mar 15, 2010

Hard to learn from the Web

Some time back, Adam Bosworth wrote an interesting article about what the Web can and does teach us, and about why it's important to extend its capabilities to other areas. However, the Web is itself a poor learner. It is made of diverse and heterogeneous content. Content serving as information. Information coming from data, by means of... HTML.

The problem is not with the content, but with the structure. HTML has been an excellent markup language at that. What some may not know is that its latest (current) stable version, HTML 4.01, is 11 (eleven, as in a soccer team) years old. That beats even the age of the C++ standard, or does it? In an environment that changes so often, that has seen tremendous growth over the past decade, and that can be taken as an example of evolution, what we get to work with is an aged and almost deprecated language acting as the main tool. After so “many” revisions and after such a “continuous” development, HTML has clearly failed to keep pace with today's Web. It is difficult to express modern and original ideas with a technology that was unable to stay in sync with the very medium it acts upon.

That, in fact, is the main reason why there are so many flavors of HTML, so many flagrant quirks modes, and so many browser-specific markup extensions. HTML5 is coming out a “bit” late. But better late than never. One of the best parts of HTML 4.01 was its simplicity; I believe XHTML has failed (did it?) precisely because it broke this rule. HTML5 seems to have observed this, and moreover seems to have learned some key lessons from the semantic Web.

Mar 4, 2010

What you C++ is what you get

This year, more than ever, C++ is coming closer and closer to a major upgrade as a “standard programming language”. It's the next big take on this language: the upcoming, so-called C++0x, a new ISO standard. The first C++ standard was released in 1998, as far as I know, followed by a tiny revision in 2003, five years later. But that one was so small it can barely be called a “revision”. C++0x, on the other hand, will be a real revision and will add many polished features to the language (threads, regexes, hash tables, tuples, the auto type, a new for-loop, lambdas, delegation, variadic templates, just to name a few). Great things and great work on this side of the page.
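
To make a few of those features concrete, here is a minimal sketch that touches hash tables, tuples, auto, the new for-loop, lambdas and variadic templates. It assumes a compiler that accepts the draft syntax (for example GCC with -std=c++0x); the function names count_words, most_frequent and sum are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts how often each word occurs.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;  // the new standard hash table
    for (const auto& w : words) {                 // new for-loop plus auto
        ++counts[w];
    }
    return counts;
}

// Hypothetical helper: returns (word, count) for the most frequent word.
// Precondition: words is not empty.
std::tuple<std::string, int> most_frequent(const std::vector<std::string>& words) {
    assert(!words.empty());
    auto counts = count_words(words);
    // A lambda replaces the hand-written comparison functor of C++98.
    auto best = std::max_element(
        counts.begin(), counts.end(),
        [](const std::pair<const std::string, int>& a,
           const std::pair<const std::string, int>& b) {
            return a.second < b.second;
        });
    return std::make_tuple(best->first, best->second);
}

// Variadic template: sums any number of ints.
int sum() { return 0; }  // base case, no arguments left

template <typename... Rest>
int sum(int first, Rest... rest) {
    return first + sum(rest...);  // peel off one argument per step
}
```

A caller would unpack the result with std::get<0> and std::get<1>. If several words tie for the highest count, which one is returned depends on the hash table's iteration order, so the sketch is only deterministic when there is a single winner.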

But (because there is a big but) C++ deserves more. More is less, someone once told me. And in this case, it may need less. C++ is a big and complex language; it is a “cathedral” prisoner struggling to escape into the world and experience everything the “bazaar” has to offer. A first-class prisoner at that. Way too many features are being discussed, debated and standardized, and yet the removal of obsolete or simply bad features is largely left aside. IMO these are the bugs that C++ as a solid language has, and that won't be fixed anytime soon, if ever:

  • The whole idea of C as a strict subset of C++, along with the famous idea of maintaining backward compatibility with millions and millions of lines of C code, is counter-evolutionary and will keep software static. Existing code must change if it needs or wants to evolve. Languages should encourage software (existing and future) to change for the better. The post-increment operator in its name should mean more than “superset” and “object-oriented”.
  • C++ needs a general cleanup of its features. It could generalize many useful ideas from the STL to other parts of the language, and the STL itself could be simplified a lot.
  • Do we still need char *? Moreover, do we still need C pointers at all? Do we need all the iterators in order to pass through a mere collection? Do we need all the maps and multimaps and sets and multisets and hashes and so on, when we could have one good hash? Do we need templates, when some auto type could do it? Do we need all the stacks, queues and deques, when a proper vector would suffice?
  • Maybe the best way to advance is not making simple things complicated; and just maybe C++ can learn some new and good things from interpreted languages such as PHP, Python and Ruby.
  • Maybe the ISO standardization and voting process is not that well suited for such a language. C++ can learn a lot from open source and from the evolution of languages like those mentioned above. Release early, release often, community support, simplicity and continuous development are practices that can be borrowed successfully from software development and applied to actual language construction.
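
As a small illustration of the iterator and container questions in the list above, the sketch below walks the same collection twice, once with a spelled-out C++98 iterator and once with the C++0x for-loop, and then uses a plain std::vector as a stack. It is only a sketch under assumptions: the function names are hypothetical, and it shows that a vector can play the stack role here, not that the other containers are redundant in general.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// C++98 style: the iterator type is written out in full.
std::size_t total_length_iter(const std::vector<std::string>& items) {
    std::size_t total = 0;
    for (std::vector<std::string>::const_iterator it = items.begin();
         it != items.end(); ++it) {
        total += it->size();
    }
    return total;
}

// C++0x style: the same traversal with auto and the new for-loop.
std::size_t total_length_range(const std::vector<std::string>& items) {
    std::size_t total = 0;
    for (const auto& s : items) {
        total += s.size();
    }
    return total;
}

// Top of a vector-backed stack. Precondition: the stack is not empty.
int top_of(const std::vector<int>& stack) {
    assert(!stack.empty());
    return stack.back();
}

// Pops every element off a vector-backed stack, collecting them in pop order.
// The argument is taken by value, so the caller's vector is left untouched.
std::vector<int> reversed_via_stack(std::vector<int> stack) {
    std::vector<int> out;
    while (!stack.empty()) {
        out.push_back(top_of(stack));  // read the top element
        stack.pop_back();              // then remove it
    }
    return out;
}
```

Both traversal functions return the same total; the difference is purely in how much of the iterator machinery the programmer has to spell out.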

Who hasn't heard of C or C++? It has inspired and influenced many other languages, and has helped in building a lot of them. At this moment, C++ too can learn and pull good things and best practices from others. In breaking the excessive backward compatibility with C and with itself, it can look at the way PHP handled the move from version 4 to version 5. Python stepping up to a revamped version 3 is another living example.

The Web has shown us more than anything else that software is ever-changing. C++ should keep pace, and is able to do so in the most intelligent manner.