Blog
HTML Layout
Date: 9/9/2005
I now have a far more intimate understanding of the absolute horror of HTML 4 layout. It's gruesome and evil and it haunts you, always promising more broken markup just around the corner to break your layout engine.

Well, that aside, I did make a breakthrough in the layout of 'p' elements in the HTML control. By keeping track of how many of the preceding scanlines in the flow construct are 'margin' pixels, I can get paragraphs to lay out the same as Firefox under all the conditions I know about: nested, self-closed and so on. And gee, it's only taken me about 2 years to get this far.
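To make the margin tracking idea concrete, here is a minimal sketch of it. This is not the control's actual code; the `Flow` class, its field names and the 16 pixel default margin are all made up for illustration. The idea is that a new margin only moves the cursor down by whatever part of it is not already covered by margin scanlines directly above.

```python
class Flow:
    """Hypothetical vertical flow cursor; not the real HTML control's class."""

    def __init__(self):
        self.y = 0                # next free scanline
        self.trailing_margin = 0  # scanlines directly above y that are margin

    def add_margin(self, px):
        # Only the part of the margin not already covered by the
        # preceding margin scanlines moves the cursor down.
        extra = max(0, px - self.trailing_margin)
        self.y += extra
        self.trailing_margin += extra

    def add_content(self, height):
        # Real content ends the run of margin scanlines.
        if height > 0:
            self.y += height
            self.trailing_margin = 0


def layout_paragraphs(heights, margin=16):
    """Return (top, bottom) scanlines for a run of sibling paragraphs."""
    flow = Flow()
    boxes = []
    for h in heights:
        flow.add_margin(margin)   # top margin of the paragraph
        top = flow.y
        flow.add_content(h)
        boxes.append((top, flow.y))
        flow.add_margin(margin)   # bottom margin of the paragraph
    return boxes


print(layout_paragraphs([20, 0, 30]))
```

With an empty paragraph in the middle, the three margins around it collapse into one 16 pixel gap, so the third paragraph starts at scanline 52 rather than 84.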

The tables are also looking much neater now, although not totally perfect. At least all the work I put into validating the table structure has paid off: there is a bunch of code that re-arranges all the elements under a table tag to be 'correct' HTML, i.e. tables can only have 'tr's, which can only have 'td's. You wouldn't believe how many people get that wrong. I have some classic example HTML emails from PayPal and eBay that are ridiculous. Missing or wrong tags everywhere, badly formed CSS, nested table upon nested table. They really are a torture test. My personal favorite was the PayPal email where, nested 3 tables in, they miss the end table tag. Oh man, that just fries everything.
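A rough sketch of what that kind of re-arranging can look like, using the simplified rule above (table > tr > td). The `Node` class and the function names are invented for this example, 'th' is treated as a cell as an extra assumption, and it does nothing about the missing end tag problem, which has to be dealt with at parse time.

```python
class Node:
    """Hypothetical minimal element node, for illustration only."""

    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = list(children or [])


CELL_TAGS = ("td", "th")  # assumption: 'th' counts as a cell too


def normalize_row(row):
    # Anything directly under a 'tr' that is not a cell gets wrapped in a 'td'.
    row.children = [
        c if c.tag in CELL_TAGS else Node("td", [c]) for c in row.children
    ]
    return row


def normalize_table(table):
    rows = []
    pending = None  # implied row that collects consecutive stray children
    for child in table.children:
        if child.tag == "tr":
            pending = None
            rows.append(normalize_row(child))
            continue
        if pending is None:
            pending = Node("tr")
            rows.append(pending)
        pending.children.append(
            child if child.tag in CELL_TAGS else Node("td", [child])
        )
    table.children = rows
    return table


def outline(node):
    """Compact text form of the tree, handy for eyeballing the result."""
    if not node.children:
        return node.tag
    return f"{node.tag}({' '.join(outline(c) for c in node.children)})"


broken = Node("table", [Node("td", [Node("b")]), Node("tr", [Node("p")]), Node("img")])
print(outline(normalize_table(broken)))
```

Consecutive stray children share one implied row, which keeps cells that were meant to sit side by side in the same 'tr'.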

Anyway, all these chunks of hideous HTML are stored away in source control with an HTML test suite program that runs through them all and renders each one using the current HTML control. This way, if I make any changes to the control, I can quickly scan through a bunch of hard-to-render files and check I haven't broken anything. So far it's passing more of them than it ever has. There are a few insane cases that it doesn't handle yet, but for reading Outlook HTML it's enough. Speaking of which, the paragraphs generated by Outlook now appear correctly. Hurrah.
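The test suite above is checked by eye, but the same loop could in principle flag changes automatically. Here is a minimal sketch of that variation, assuming a hypothetical `render` callable that takes the raw HTML bytes and returns the rendered output as bytes, and a plain dict of stored digests as the baseline. None of these names come from the real test program.

```python
import hashlib
from pathlib import Path


def run_suite(html_dir, render, baselines):
    """Render every .html file in html_dir and compare against baselines.

    render:    hypothetical callable, bytes in -> rendered bytes out
    baselines: dict of file name -> SHA-256 hex digest, updated for new files
    """
    results = {}
    for path in sorted(Path(html_dir).glob("*.html")):
        try:
            digest = hashlib.sha256(render(path.read_bytes())).hexdigest()
        except Exception as exc:
            # A crash on broken markup is a result too, not a reason to stop.
            results[path.name] = f"crash: {exc}"
            continue
        expected = baselines.get(path.name)
        if expected is None:
            baselines[path.name] = digest
            results[path.name] = "new"
        elif expected == digest:
            results[path.name] = "ok"
        else:
            results[path.name] = "changed"
    return results
```

A "changed" result only says the output differs from last time, not that it got worse, so someone still has to look at the file to decide whether it is a fix or a regression.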
Comments:
SnappyCrunch
11/09/2005 6:06pm
I don't understand why you've chosen to continue making your own rendering engine. I know you initially wanted something lightweight that could render basic HTML, but it looks like you've turned it into your own full-featured rendering engine. What I'm saying is that building a good rendering engine could be a full-time job in its own right. Why not use someone else's rendering engine (e.g. Gecko) to do the dirty work for you?
fret
11/09/2005 10:48pm
Gecko in its smallest form is 4.5MB. Scribe is 800KB. The HTML control adds maybe 40KB to the download. IE is insecure and not standards compliant.

I seriously considered Gecko, but it's so big. As far as I know, if you have Mozilla or Firefox installed you can't use that version of Gecko embedded in another application on either Windows or Linux. You have to have a separate "embedding" version of Gecko installed as well.

There is nothing out there on the market that compares to my own HTML control; the combination of size, function and license is unbeatable.
 