better_html

UPDATE: I’ve corrected some <html:media> self closing errors (damn XHTML habits) and I have added a statement about backwards compatibility in the questions section.

Below is a long post about what HTML could look like with only 7 reserved elements that use XML namespace syntax, plus a Form API and the reserved attributes for those two APIs, which I believe would make HTML much stronger. Let’s face it, HTML5 is already becoming bloated, with too much effort spent on backwards compatibility. If you just want to skim, the sections are: Example, HTML Tags, HTML Attributes, Forms, Form Attributes, the questions I’ve received, and the Conclusion.

Example

What is the point of having nearly all HTML elements? What if we could simplify HTML to 7 reserved elements? Everything else is up to you plus CSS. Look at this example:

<!DOCTYPE html>
<html:html>
    <html:head>
        <html:meta type="title" value="Page Title">
        <html:meta type="description" value="This is an example of HTML with namespaces">
        <html:link src="css/main.css" title="Main Styles" type="text/css">
        <html:link src="js/main.js" title="Main Script" type="text/javascript">
    </html:head>
    <html:body>
        <header>
            <logo>
                <html:media type="image" src="images/logo.png">
            </logo>
            <nav>
                <html:a href="/cats">Cats</html:a>
                <html:a href="/dogs">Dogs</html:a>
                <html:a href="/rain">Rain</html:a>
            </nav>
        </header>
        <content>
            <article>
                <h1>This is my main article head</h1>
                <h2>This is my sub head</h2>
                <p>[...]</p>
                <p>[...]</p>
            </article>
            <article>
                <h1>A cool video!</h1>
                <h2>Pay attention to the media elements</h2>
                <p>[...]</p>
                <html:media type="video" src="vids/funny-cat.mp4" autostart controls>
                <p>Man, that was a stupid cat.</p>
            </article>
        </content>
        <footer>
            <copyright>This site is &copy; to Oscar Godson 2009</copyright>
        </footer>
    </html:body>
</html:html>

HTML Tags

The entire HTML definition could be summed up in a few reserved tags:
<html:html>: Begins the HTML document
<html:head>: Contains information about the HTML document
<html:meta>: Contains meta data. There can be ANY type, but there are a few reserved types. See Below
<html:link>: Contains text based information that is needed to run the page. Things such as javascript, css, actionscript, etc. The type can be anything, but it needs to be supported by the browser.
<html:body>: Defined the same as the current HTML definition.
<html:media>: Anything that is not text based. Things like audio, video, images, swf, applets etc. If it isn’t a video, audio, or image the browser will look for a plugin to run the embedded media.
<html:a>: Contains a link just like current HTML

HTML Attributes

Each element has its own attributes, but you can also specify any that are not already taken by the API. E.g. selectable is NOT used by the HTML API, so you can use it in any way you want.

All tags have the name attribute reserved for built in browser tool tips as well as the id and class attributes for CSS, JS, etc. Otherwise anything goes for your own tags whilst HTML API tags have their own reserved attributes (all but the type are the same as the current HTML spec):

<html:html>: lang
<html:meta>: type, value
<html:link>: src, title, type, rel
<html:media>: src, width, height, rel, autostart, controls, type (must be either: video, audio, image, or embed)
<html:a>: href, rel
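
To make the reserved-attribute rule a bit more concrete, here is a rough sketch in Python (picked only because it is short to read; it is not part of the proposal). The tables are copied from the lists above, while the names GLOBAL_RESERVED, API_RESERVED, is_reserved, and valid_media_type are made up for this illustration.

```python
# Reserved on every tag, per the paragraph above.
GLOBAL_RESERVED = {"name", "id", "class"}

# Reserved per HTML API tag, copied from the list above.
API_RESERVED = {
    "html:html": {"lang"},
    "html:meta": {"type", "value"},
    "html:link": {"src", "title", "type", "rel"},
    "html:media": {"src", "width", "height", "rel", "autostart", "controls", "type"},
    "html:a": {"href", "rel"},
}

# Allowed values for the type attribute of <html:media>.
MEDIA_TYPES = {"video", "audio", "image", "embed"}


def is_reserved(tag: str, attribute: str) -> bool:
    """Return True if the HTML API already claims this attribute on this tag."""
    return attribute in GLOBAL_RESERVED or attribute in API_RESERVED.get(tag, set())


def valid_media_type(value: str) -> bool:
    """Return True if value is one of the four media types listed above."""
    return value in MEDIA_TYPES


print(is_reserved("html:media", "controls"))  # True
print(is_reserved("nav", "selectable"))       # False
print(valid_media_type("image"))              # True
```

Anything that comes back False from is_reserved is yours to use however you like on your own tags.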

Forms

“Ah!” you say, “What about forms?” The forms would be part of the Form API. This allows us to scale it at a different pace than HTML. The HTML API uses the Form API. Here are the elements (most are just like current HTML forms):

<form:form>: Begins the form
<form:input>: Creates a new input field. For possible input types see below.
<form:select>: Creates a select input
<form:optgroup>: Groups options together. Child of <form:select>
<form:option>: Creates an option for the select field. Child of either <form:select> or <form:optgroup>
<form:label>: A normal form label
<form:fieldset>: Groups parts of the form together

Form Attributes

Like the HTML API, id and class are reserved for all Form API elements. Unlike the HTML API, the name attribute will not create a browser tool tip. The name attribute should act just like HTML Forms currently work.

<form:form>: method, action
<form:input>: type (can be: text, radio, checkbox, textarea, calendar, email, password, match, submit, button), name, disabled, value
<form:select>: name, multiple, disabled
<form:optgroup>: disabled, selected, label
<form:option>: value, selected, disabled
<form:label>: for
<form:fieldset>: legend

Some questions I’ve received when talking about this concept:

1. “This isn’t semantic!”
Right now we have numerous elements that should not be there at all, such as <img>, <b>, <i>, etc. Even with the “new” HTML5 definitions, they stand for image, bold, italics and so forth. This isn’t semantic at all. This is completely presentational. This method makes HTML pure semantics and pure information.

2. “With people naming things random names, how are search engines going to work?!”
What do we do now? Most of us use divs with ids and classes. How do search engines index this with random names? Well, they look at the ids and classes for common names. Most of us use things such as wrapper, nav, navigation, main-content, main_content, footer, header, etc. If they can read ids and classes, I’m sure they can index tags. Plus, this cleans up the markup making it less information for them to cover.

We have to remember that if we are naming them things that search engines would have problems parsing, then there is most likely no current HTML tag that suits it now. Take this example:

<calendar>
    <month>
        <week>
            <day />
            <day />
            <day />
            <day />
            <day />
            <day />
            <day />
        </week>
    </month>
</calendar>

To me, if I were to make a calendar with HTML, that makes more sense than:

<ul id="calendar">
    <li class="month">
        <ul class="weeks">
            <li>
                <ul class="days">
                    <li></li>
                    <li></li>
                    <li></li>
                    <li></li>
                    <li></li>
                    <li></li>
                    <li></li>
                </ul>
            </li>
        </ul>
    </li>
</ul>

And personally, I bet more people would like to mark up and style it the first way. Plus, are you going to tell me that the UL method would be easier for a spider? If you decide to name your nav <blahteeteeoop>, then sure, the search engine won’t catch on that this is a navigational element. But most of us are going to use <nav> (my choice), <navigation>, or <menu>.

3. “What if we end up adding a HTML API that is named the same as one of my tags in the future?”
That’s why we use the HTML name space. When you do <html:x> (replace x with anything) you are calling the HTML > X. The browser then would run the API for that tag. E.g. <html:a> would tell the browser “Hey! Make me a clickable link!” while <nav> doesn’t say anything to the browser except that there is an element called <nav> in this location in the DOM.
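
A rough sketch of that lookup, in Python, might help. The tag lists come from the HTML Tags and Forms sections above; the function name classify and the labels it returns are invented here. How a browser should treat a prefix it does not know, or an <html:x> it does not support yet, is my assumption and not something this post settles.

```python
# Reserved tags, copied from the "HTML Tags" and "Forms" lists in this post.
API_TAGS = {
    "html": {"html", "head", "meta", "link", "body", "media", "a"},
    "form": {"form", "input", "select", "optgroup", "option", "label", "fieldset"},
}


def classify(tag: str) -> str:
    """Decide whether a tag name calls a browser API or is just a custom element."""
    prefix, colon, local = tag.partition(":")
    if not colon:
        # No namespace: the browser only records that the element exists in the DOM.
        return "custom"
    if prefix in API_TAGS:
        # Namespaced: run the API if this browser supports the tag.
        return "api" if local in API_TAGS[prefix] else "unsupported-api"
    # Assumption: an unknown prefix is treated like any other custom tag.
    return "custom"


print(classify("html:a"))  # api
print(classify("nav"))     # custom
print(classify("html:x"))  # unsupported-api
```

Because <nav> never reaches the API table, a future <html:nav> could be added without touching anybody's existing <nav>.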

4. “WTF, this is just XML with XML Namespaces, but this isn’t even valid XML! You don’t even properly close your self-closed elements!”
Correct! This would be invalid along with 90% of the “XHTML” web sites out there. They simply call themselves “XHTML” when they are not in any way except the look of the code. If you tell the browser it’s HTML then, oh my god, guess what? It parses as HTML! Quite genius isn’t it?

I’ve decided to keep the HTML syntax, call it HTML and keep it as HTML with XML namespaces because XML namespaces work perfectly to avoid conflict among elements. We need this so we can scale with this new method and avoid people naming elements the same as something that might later become included in the HTML API.

I don’t self close tags because this isn’t XML. However, you can use whatever elements you would like just like XML with a few minor exceptions to attributes.

5. “Over half of all websites are outdated by html 4.01 standards and switching to something new would completely break things on those old sites”
Correct, moving to a completely new system would break every form of HTML/XHTML currently out. This is why browsers would just need to retain the current standards as they are built in as I write this. Once HTML5 is completely implemented we could simply stop there and switch to the new system. Browsers would keep what they have, so old sites keep working, while using the same technologies to implement this form of HTML.

For example, to display an image, the browser could say: if the page is HTML5 or less, <img> == an image; otherwise <html:media type="image"> == an image. Both ways would do the same thing. If you are using the new system, you could take advantage of the technologies in the new form of HTML.
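
Here is one way that "both ways do the same thing" idea could look, as a Python sketch. Only the <img> row comes from this post; the video and audio rows, and the names LEGACY_TO_API and normalize, are assumptions added for illustration.

```python
# Legacy tag -> (HTML API tag, attributes implied by the legacy tag).
LEGACY_TO_API = {
    "img": ("html:media", {"type": "image"}),    # the example from this post
    "video": ("html:media", {"type": "video"}),  # assumed by analogy
    "audio": ("html:media", {"type": "audio"}),  # assumed by analogy
}


def normalize(tag: str, attrs: dict) -> tuple:
    """Map a legacy tag onto its HTML API equivalent; leave other tags alone."""
    if tag in LEGACY_TO_API:
        api_tag, implied = LEGACY_TO_API[tag]
        # The implied type is added to whatever attributes the author wrote.
        return api_tag, {**attrs, **implied}
    return tag, dict(attrs)


print(normalize("img", {"src": "images/logo.png"}))
# ('html:media', {'src': 'images/logo.png', 'type': 'image'})
```

With a table like this, the browser would only need one code path for media, whichever syntax the page was written in.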

This change would not be hard at all because browsers already accept XML, most of my HTML API method is built on current HTML standards, and you could very simply run the browser in a standard HTML mode without changing any current code.

Conclusion

Personally, I think this is much more scalable and easier to maintain, because we can add modules and not have to wait years upon years for an entire spec to come out for the entire language itself. We can write a spec on, for example, the media element, and spend maybe an entire year working on it. Then the browsers can pick up that single module and add support for it. People could even write plugins to run those specific tags if we truly built this as an HTML API.

This also clears all presentational elements. All elements included have a reason they are there and the elements wouldn’t need any default styling because they are there for functionality.

This is not XML, but a form of more liberal HTML, and I personally believe, a more powerful HTML at that.

Chris O’Rourke 08.08.09

While I’m in total agreement I think the idea of basically switching to a whole new html would require that browser engines be capable of backwards compatibility. Over half of all websites are outdated by html 4.01 standards and switching to something new would completely break things on those old sites (at which point removing possibly useful information from the total pool of knowledge).

I do think that as long as the browsers hold the backwards compatibility then HTML has no need for it.

Oscar Godson 09.08.09

I’m not saying to remove HTML <=4.01 from browsers. I'm saying we need to stop adding onto the prior standards. Most of the functions used in my proposed HTML API are currently built into HTML now. They would only need to port those to new elements. After the port, adding new features would be simpler and faster. Browsers would simply run either system A or B:

A: This site does NOT use the HTML API. Run as standard HTML<=5.
B: The site DOES use the HTML API. Run as so.
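
As a very rough sketch of how a browser might pick between A and B, here is a Python version that scans the markup for namespaced tags using the standard html.parser module. The rule "any html: or form: tag means mode B" is an assumption for illustration, and ApiTagDetector and pick_mode are made-up names.

```python
from html.parser import HTMLParser


class ApiTagDetector(HTMLParser):
    """Remembers whether any start tag used the html: or form: namespace."""

    def __init__(self):
        super().__init__()
        self.uses_api = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names and keeps the colon in them.
        if tag.startswith(("html:", "form:")):
            self.uses_api = True


def pick_mode(markup: str) -> str:
    """Return "B" if the page uses the HTML API, otherwise "A"."""
    detector = ApiTagDetector()
    detector.feed(markup)
    detector.close()
    return "B" if detector.uses_api else "A"


print(pick_mode("<html:html><html:body><nav></nav></html:body></html:html>"))  # B
print(pick_mode("<html><body><img src='logo.png'></body></html>"))             # A
```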

Backwards compatibility will always be required for information's sake. I believe that if we keep adding and making new definitions for elements it's going to really start bloating browsers. Think about this: the menu element has come, gone, and come again. Now imagine it gets removed again in HTML6, making things even more confusing for everyone. Is it even necessary?