Showing posts with label html. Show all posts
Showing posts with label html. Show all posts

Sunday, July 19, 2009

My HTML in Scala Wishlist

One of my favorite features in Scala is its native support for XML(link warning, shameless plug.) I love the "projections DSL" (for lack of a better term) that is the scala.xml package. It is great for slicing up XML, and is almost like having native XPath and XQuery built into the language. Using XML in Scala for HTML generation is not as universally appealing to people. Some people think it encourages mixing presentation and business logic and is not "clean." I have even heard folks say that this is a critical flaw in Lift. I completely disagree. Of course I also think MVC is an anti-pattern, but anyways... HTML support in Scala is great in my opinion, but could be even better. Here is my wishlist.
  • Better understanding HTML semantics. For example, it is great that the compiler won't let me do something like this <div>Hello world</di>. However, I would not want it to even let me do <di>Hello world</di>. Similarly I should not be able to stick a thead outside of a table. If I have an anchor tag with an href attribute, then I'd like the compiler to do some validation on the value. Am I asking for too much? Well, that's I'm supposed to do, right? Is some/all of this possible with the help of XML schema support? Maybe so.
  • CSS. HTML without CSS ceased to exist a long time ago. With all of the DSL possibilities in Scala, is there some way to allow CSS literals and create some type safety for me. For example, I if I fat finger the name of a style on a class attribute, the compiler should catch this. Also I can say from experience that introducing an object-oriented model is really useful in CSS. Finally, the XPath/XQuery-like support in Scala XML makes me think that something similar could be done for CSS selectors.
  • JavaScript. Honestly I don't know what I would want here. I like JavaScript, but I know the idea of being able to write your web application in a single language is very appealing to a lot of people. If nothing else, it would once again be nice if the compiler would prevent me from doing something like <button onclick="foo()" ...> if there as no function called "foo" in scope.
  • Tooling. Most modern IDEs support things like syntax highlighting, code completion, error checking, code formatting for HTML and XML+schema. I expect at least this much for XML/HTML literals in Scala. I'd also like to be able to step through and debug literals. Refactoring on subtrees of XML (extract method for example) would be sweet. Oh, and large XML literals should not overwhelm my IDE :-)

Wednesday, January 28, 2009

How Google Makes The Net Suck

Some people like to compare developers to artists. When it comes to web development, some people say there's always a man behind the curtain. Whether you agree or not, there are definitely certain freedoms that web developers enjoy. As a web developer, what are the greatest limitations and obstacles in your way? Once it may have been browser quirks. Now maybe it's all those annoying users who still use IE6. However, I think the greatest obstacle to progress is Google.

Now Google would have you believe just the opposite. I do not think they are disingenuous. In a large organization, it's all too easy for different groups to have different motivations. But ask yourself this, how much money does Google from Chrome? What does Google make money from? That's easy: advertising on search. And that is what is hold us all back.

If you have endured my purple prose to this point, I will finally cut to the chase. One of the most important aspects of any web page is how its PageRank. If your web page is all about deep sea diving, where does it surface when somebody searches Google for deep sea diving? The black art of making your page get a higher PageRank has given birth to an entire cottage industry known as Search Engine Optimization (SEO.)

As a developer I have never given much thought to SEO. I always thought that SEO was about the content of the page, and web developers are not responsible for the content. We are responsible for retrieving/generating that content from all kinds of sources, as well as creating applications that are easy and intuitive for the user to interact with a meaningful way. But, if we go back to the deep sea diving example, we're not responsible for providing information about deep sea diving. Heck you are lucky if most developers even known how to swim, but I digress.

But I was wrong. SEO is not just about content. It is about structure. If you want a good PageRank, then quality content about deep sea diving will lead to other people linking to your page and that will increase your PageRank. But there are much more instantly gratifying things you can do. For example, your page should a title and it better contain the term deep sea diving. No big deal, right? The title is really just part of the template outside of the main contents of the page. Its value has little effect on anything, besides PageRank that is. However, it gets worse.

To maximize your PageRank, then immediately after your page's body tag you should have an H1 tag whose contents should contain the term deep sea diving. Oh maybe you put the phrase on the page, but you put it in a div that styled quite nicely? Not good enough. It needs to be in an H1 tag. Maybe you used some JavaScript to create the H1 tag? That is no good at all. Why? Because The All-Mighty Googlebot does not understand how the page looks to a user. It only understand basic HTML constructs. That's right, it's time to party like it's 1999.

Oh, maybe your organization hired an artist who created a killer deep sea diving logo and you load it on to the page as an image? Not good enough. If you put deep sea diving as the alt text, that will win you some bonus points from the Googlebot, but it is still dwarfed by the rewards you could receive by busting out the H1. Nothing compares to the mighty H1 tag. And don't just put that H1 tag anywhere on the page. Heck you might even get penalized for having more than one! Nope only one, and it better appear (in the HTML source code) as close to the body tag as possible.

Ok, so maybe you give in and put the catch phrase in an H1 tag. That wasn't too bad, right? Now back to your regularly scheduled hacking? Not so fast. Do you have some hierarchical information on the page? Sections, headings, menus, etc? How are you going to do those? Again you better not even think about using things like JavaScript to create them dynamically. Nope, they have got to be static on the page. Back to divs, spans (maybe a table or two), along with some oh-so-clever CSS? Forget about it. Let me introduce you to H1's other friends: H2, H3, H4, H5, H6. That's right, if you want that damn Googlebot to "understand" the hierarchy of concepts on your page, then you better put away your divs and spans.

Maybe you think that's going overboard, but it's not. Do you have a section on your deep sea diving page called "Gear" ? Then if you want to show up on a search for deep sea diving gear, you better have the term Gear wrapped in an H2 or an H3 tag.

What about RIA technologies? Again, if you dynamically create things with JavaScript, it will get picked up, but it is non-optimal. You have to do things the way that the Googlebot wants it done to get best results. What about Flash or Silverlight or JavaFX? Flash will get you screwed on about the same level as JavaScript. Silverlight or Java might as well be black holes. Whatever is in there, is never getting out.

There are tricks you can employ like progressive enhancement. There you do things the way that the Googlebot wants them, then dynamically obliterate that garbage and replace it with rich content that your users actually want. This can backfire. If the Googlebot figures out that you are tricking it, then it will banish you to purgatory.

What if you just make a great web application that users will love and don't bother to worry about the Googlebot? That's fine of course, it just means that people will not find your application by searching for it. Is your business model and marketing efforts robust enough to not need SEO? Yeah, I didn't think so.

So now do you understand? The Googlebot ties your hands, or at the very least makes you jump through all kinds of hoops. There are all these great technologies you could use to make your site as interactive as any desktop application, but The Googlebot does not like this. You've got to play his game whether you like it or not.