Blog
Video Encoding
Date: 3/5/2005
We seem to have run out of VHS tapes at home and the machine is dying anyway, so last night when I needed to record some TV (not for me) I installed the TV card drivers again and played around with the encoding settings. I have a Leadtek Tv2000, which is a basic PCI TV tuner card, and it works OK, I guess. It says it supports MPEG4 in hardware, but don't believe it.

The bundled PVR/record/view software 'Winfast PVR' is a mix of good and bad. On the good side, it gives you a functional UI to set up channels, set up encoding options and capture video. It does 'skin' everything, but it looks hideous. The bad: 15 minutes into my hour-long capture it crashed, so stability is not good enough for everyday use. Oh, and you do get a hardware remote and IR receiver with it, but mine only worked for the first few days I had it and then died. No amount of fiddling with drivers and new batteries would resurrect it.

After recording the show I had a go at optimizing the encoding settings, because the default use of MPEG2 was a) eating all the CPU and dropping frames/audio and b) creating 2 GB files for less than an hour of video. So I started by dropping the MPEG2 bitrate down to 4000, and it looked terrible.

So I started exploring the other codecs. The most obvious one to try was DivX 5. I made some test encodings, and at full res (PAL: 720 x 576) it would drop frames badly, as the CPU (1.4 GHz Athlon) would be pegged at 100%. So I tried half res (which is still watchable for TV) and it would sit around the 90% CPU mark. However, when the scene pans around it has blocky artifacts which are very noticeable even with the bitrate ramped up to 4000. So I kept looking, and the next thing to try was Xvid. With half res video and Xvid's quality rating set to about 3.5, the output video was very watchable with only very slight video artifacts, and the CPU would average about 85% load. Very nice!

I checked the output avi with GSpot and it told me the file was 90% audio?!?! What the? OK, so the audio compression was set to "ogg" (my favorite format), but according to the file size it seems like it was actually PCM (uncompressed). So I switched to mp3 encoding for the audio and wow... the output capture files are now tiny! I'm encoding at about 250-300 MB an hour, with very watchable picture quality and decent audio. Most people expect to get about 1 GB/hr, so I'm doing way better than that.
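As a rough sanity check on those file sizes, here is the bitrate arithmetic as a small sketch. It assumes the bitrates are in kilobits per second, and the PCM format (16-bit stereo at 44.1 kHz) and the 128 kbit/s mp3 rate are my guesses for illustration; the capture settings actually used aren't recorded here.

```python
def mb_per_hour(kbps):
    """Megabytes written per hour at a constant bitrate in kilobits/s."""
    return kbps * 1000 / 8 * 3600 / 1_000_000

# Assumed audio format: 16-bit stereo PCM at 44.1 kHz (a guess, not a measured value).
PCM_KBPS = 44_100 * 16 * 2 / 1000    # 1411.2 kbit/s

video_4000 = mb_per_hour(4000)       # video stream at a 4000 kbit/s setting
pcm_audio = mb_per_hour(PCM_KBPS)    # uncompressed audio
mp3_audio = mb_per_hour(128)         # assumed 128 kbit/s mp3

print(f"4000 kbit/s video: {video_4000:.0f} MB/hr")
print(f"PCM audio:         {pcm_audio:.0f} MB/hr")
print(f"128 kbit/s mp3:    {mp3_audio:.0f} MB/hr")
```

Under those assumptions, uncompressed audio on its own would be more than double the 250-300 MB/hr total, which fits with the files shrinking so much once mp3 was switched on.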

Playing these Xvid/mp3 avi files back uses about 20% of the CPU, which is nice in and of itself, but it does mean that I can't encode and play back at the same time reliably on my current setup (85 + 20 = 105%). So no Xvid PVR for me. At least until I upgrade.

Now what I need to find is a reliable capture and PVR application that I can hook into a TV guide to auto-record shows of interest. That search is still ongoing. The WinFast PVR software is a) too unreliable and b) can't be scripted to record shows, because its schedule file is a binary .DAT thing (grrr!). I'm hoping that some freeware app can do good video capture off a TV card... any suggestions?

Tiger
Date: 2/5/2005
I've been reading up on 'Tiger', Apple's new version of OS X, and it's pretty much heading directly where I wanted Linux to go: a full OpenGL implementation of the GUI for maximum hardware acceleration, powerful OS-level APIs for graphics, sound and data, a visually beautiful UI (Aqua), support for lots of multimedia add-ons via FireWire and USB 2, and top-level authoring capabilities.

And Linux has X Windows. Wow, I'm so overwhelmed. ;)

To me it doesn't seem like it's a matter of 'if' I'll jump on the Mac bandwagon but rather 'when'. I doubt that a Mac Mini will really give Tiger room to stretch its legs. According to some it wants an ATI Radeon 9600 or NVIDIA GeForce FX to run Quartz 2D Extreme, a feature which I think I'd appreciate. So from my limited research into which Mac would suit me (a free one hehehe) it seems the iMac G5 17" @ 1.8 GHz + 512 MB RAM would be a minimum hardware platform for decent performance in Tiger (right?). A mere A$ 2,520 at my local Apple retailer, which is quite a bit more than a dual Opteron upgrade for the existing machine. *sigh* But I think the Mac has a really rosy future, with a disproportionately high level of mindshare in the geek community.

Rise Of The Machines
Date: 22/4/2005
After yesterday's spike of website hits I thought I'd look into where it was coming from, and no referer spiked up to explain it. But the number of "bots" hitting the site had gone up to some 30% of the page loads on www.memecode.com, and I don't know about you, but that's, er, kinda high, isn't it?

I've started tracking the bots by counting their hits per useragent string. Obviously Googlebot and Msnbot are leading the race early on, but I suspect a rogue bot 'telnet0.1 noone@example.org' is responsible for yesterday's spike. I have in the past banned IPs due to the sheer number of incoming hits for no apparent reason. Then of course Googlebot itself decided to hit my site over 61000 times in a 24 hr period some time ago now.
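Counting hits per useragent string can be sketched roughly like this. It's only an illustration, not the site's actual code: the marker list, the function names and the sample hits are all made up, and a real version would read the useragent field out of the web server log.

```python
from collections import Counter

# Hypothetical substrings that mark a useragent as a bot; a real list would be longer.
BOT_MARKERS = ("googlebot", "msnbot", "telnet0.1")

def is_bot(user_agent):
    """True if the useragent contains any known bot marker (case-insensitive)."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def bot_stats(user_agents):
    """Return (hits per bot useragent, fraction of all hits made by bots)."""
    user_agents = list(user_agents)
    per_bot = Counter(ua for ua in user_agents if is_bot(ua))
    total = len(user_agents)
    share = sum(per_bot.values()) / total if total else 0.0
    return per_bot, share

# Made-up sample data standing in for a day's log.
hits = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
    "telnet0.1 noone@example.org",
    "telnet0.1 noone@example.org",
    "msnbot/1.0 (+http://search.msn.com/msnbot.htm)",
    "Mozilla/5.0 (X11; U; Linux i686) Gecko/20050317 Firefox/1.0.2",
]

per_bot, share = bot_stats(hits)
for ua, count in per_bot.most_common():
    print(count, ua)
print(f"bots: {share:.0%} of hits")
```

Sorting by count is what makes a sudden spike from a single useragent stand out.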

Is it just me, or does 30% of your site traffic being eaten away by bots just annoy you?

On a somewhat related note, I've also instituted a kill file for porn / scam sites that spam my referer log to help boost their Google ranking. I might add that I've also set my robots.txt file to stop scanning of the stats anyway, so even if you evade the kill file you won't receive any benefit from getting listed as a referer. I suspect that these sites are inserting themselves as the referer by infecting machines with spyware that frigs with IE's outgoing referer field, thus littering the web's stats pages with their URL, which in turn makes their Google ranking grow. I have no conclusive evidence that this is what happens, but it's my current theory.
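For the curious, a referer kill file can be as simple as a list of banned domains checked against each referer before it goes into the stats. This is a sketch of the idea only: the one-domain-per-line format, the domain names and the function names are invented for the example.

```python
from urllib.parse import urlparse

# Hypothetical kill file contents: one banned domain per line.
KILL_FILE = """\
spam-casino.example
cheap-pills.example
"""

def load_kill_list(text):
    """Parse the kill file into a set of lowercase domains, skipping blank lines."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}

def is_killed(referer, kill_list):
    """True if the referer's host is a banned domain or a subdomain of one."""
    host = (urlparse(referer).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in kill_list)

kill_list = load_kill_list(KILL_FILE)

# Made-up referers standing in for log entries.
referers = [
    "http://www.google.com/search?q=scribe+email",
    "http://www.spam-casino.example/win-big",
    "http://cheap-pills.example/",
    "http://notspam-casino.example.org/",
]
clean = [r for r in referers if not is_killed(r, kill_list)]
print(clean)
```

Matching on the host name rather than the whole URL means a spammer can't dodge the list just by changing the page path.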

Site Update
Date: 16/4/2005
I've added a new feature to the forums that allows you to receive email notification of replies to a thread. A simple thing but it should save people some time checking back on a thread.

Also I've fixed old posts not appearing in the forums. When I redid the threading code I broke the old posts, and they were showing up as error messages in the tables.

I'm wondering: if I set up an RSS feed for software releases, would people be interested? Currently I have email notifications set up, but some people might prefer an RSS feed.

Built In Obsolescence In Consumer Goods
Date: 11/4/2005
Recently in our household we have had a rash of devices whose interfaces, mostly buttons, have failed on us. Firstly, one of our Doro cordless landline phones stopped responding to some of the number keys, making it useless for calling anyone. When we bought it 2 odd years ago I was expecting good things from it, partly because it wasn't the cheapest off the rack and partly because it seemed European and maybe better quality. So far we're pretty disappointed with it: firstly, the answering machine doesn't have a mode whereby you can screen calls, and secondly, you can't force it to not ring, which would be a useful feature if you're trying to sleep and you want the answering machine to take the call without it ringing. And now the other handset is showing signs of dying. Don't get a Doro phone if you're in the market, they suck. Their only saving grace is that, being digital, the sound quality is very good.

Then there is the pair of Nokia 3105 phones we got from Orange when we moved to a cheaper plan. After just a bit less than a year, one of the handsets is not responding to some number keys and is hanging every now and then. It hasn't been abused at all, but it's had fairly regular use. Pretty disappointing that it didn't even last a year. A black mark for Nokia.

Now the remote for the VHS recorder is failing as well: the play button is non-functional. And the machine itself spits out tapes that it doesn't like, which is highly annoying. The player is now at least 5 years old, so it has lasted a little longer than the others I've mentioned.

It seems I'm in the market for a new cordless phone and I'm a bit hesitant to buy any old device off the shelf. I want something that will last, not some flimsy throwaway appliance. But how do I know something will still be working in 5, or even 10 years? Is it unfair to expect a phone to still work after that long? I would have kept our cordless phones in operation for many years yet if they hadn't up and died on us. So I really only got half or less value out of the A$300 we spent on them. It's no surprise that most companies warrant their product for only a year. It seems consumer goods manufacturers are taking us for a ride.

Anyone had some good experiences with a cordless phone?

Optimizing Memory Usage
Date: 7/4/2005
It occurred to me the other day that Scribe has definitely lost some of that "lightweight" character that it used to have. In fact it was downright scary when I looked at the memory usage after Scribe had been running for a few hours and it was 110 MB. What the? Huh?

After I calmed down I decided to get to the bottom of it. For starters I did a leak test, and fixed every damn leak. But still the memory usage would rocket up to around 100 MB, and it'd do that after the first receive. Alright, what gets loaded during a mail receive? The Bayesian spam word tables... bugger. Well, they are only a few MB on disk, so why are they adding 60 MB to my memory image?

Good question.

So firstly I looked at the hash table sizes, and lo and behold they were much larger than needed, because some of the word counts were way out, which threw off the preallocation of hash table space. Fixed that. But still quite a lot of memory was unaccounted for.

One thing that bothered me about the hash table implementation I'm using is that it does an allocation for each value stored to hold the key name (a string). On a small table it's no big deal, but on a hash table of half a million entries it really hurts, both in allocation / free time overhead and in the extra memory being used to track all those blocks. As a side effect of the extra time spent freeing 500k blocks of memory, you risk slowing the program to a halt for minutes on end if that memory has been swapped out to disk. I'm assuming the memory manager wants to "touch" each piece of memory it frees, which means swapping all those 500k bits of memory into physical RAM just to free them. Nasty nasty nasty.

So I've given the hash table the option of using a string pool, which works by doing one big allocation and putting lots of strings end to end inside it. This has 2 very important features: firstly, it's very fast to allocate and free, and secondly, it doesn't require swapping vast amounts of memory into physical RAM to free.

The downside of course is that if you delete a key in the hash table it leaves a hole in the string pool's memory, which is wasted space. But for a large static hash table it's perfect.
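To make the idea concrete, here is a toy string pool. It is a sketch in Python rather than Scribe's actual code: the class name, the NUL-terminated layout and the use of offsets as handles are my choices for the illustration, and Python's `bytearray` stands in for the one big allocation.

```python
class StringPool:
    """Stores many strings end to end in a single growable buffer."""

    def __init__(self):
        self._buf = bytearray()

    def add(self, s):
        """Append s (NUL-terminated) to the pool and return its offset."""
        data = s.encode("utf-8")
        if b"\0" in data:
            raise ValueError("strings containing NUL can't be pooled")
        offset = len(self._buf)
        self._buf += data + b"\0"
        return offset

    def get(self, offset):
        """Read back the string that starts at the given offset."""
        end = self._buf.index(b"\0", offset)
        return self._buf[offset:end].decode("utf-8")

    def __len__(self):
        return len(self._buf)


pool = StringPool()
# A hash table entry would keep the small integer offset instead of its own string block.
handles = [pool.add(word) for word in ("free", "offer", "meeting")]
print(handles, len(pool))
print(pool.get(handles[1]))
```

There is deliberately no `remove`: dropping a key would just leave its bytes behind as a hole, which is the wasted space mentioned above. Freeing the whole pool, on the other hand, is a single operation however many strings are in it.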

Now, in doing all this work I decided to keep track of the numbers involved to find the optimal values for hash table size, and the effect it has on the overall memory usage. Check it out:

Table Size   Allocs (MB)   Load Time (s)   VM Size (KB)
Before:
x2           16.08         35              79672
x2.5         18.60         10              82252
x3           21.13         13              84844
x4           26.19         4               90024
After adding string pooling:
x2           16.08         20              38124
x2.5         18.60         5               40720
x3           21.13         7               43304
x4           26.19         1.3             48484


The hash table is preallocated to a multiple of the number of words it has to store; that's the first column. After the string pooling optimization the memory image is drastically smaller, because the overhead of maintaining 500k separate blocks is gone.
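Working the table's VM sizes through gives a feel for how much each separately allocated key was costing. This is back-of-envelope arithmetic only: it assumes 1 KB = 1024 bytes and takes "half a million entries" as exactly 500,000, which is approximate.

```python
# VM size in KB from the table: multiplier -> (before, after string pooling)
vm_kb = {
    2.0: (79672, 38124),
    2.5: (82252, 40720),
    3.0: (84844, 43304),
    4.0: (90024, 48484),
}
ENTRIES = 500_000  # approximate; the exact word count isn't given

savings = {mult: before - after for mult, (before, after) in vm_kb.items()}
for mult, saved_kb in savings.items():
    per_entry = saved_kb * 1024 / ENTRIES  # assumes 1 KB = 1024 bytes
    print(f"x{mult:g}: saved {saved_kb} KB, about {per_entry:.0f} bytes per key")
```

The saving comes out at roughly 41.5 MB whatever the multiplier, which is what you'd expect if it is purely per-block bookkeeping: on the order of 85 bytes for every key string, if the entry count is about right.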

I've settled on a multiplier of 2.5 because it seems to have the best memory/speed trade-off. For reference, Scribe is using about 20000 KB without the word lists loaded, in debug mode. So it's getting close to just the data for the strings and hash tables, with little or no overhead.

I think there are better solutions yet, but they take a lot of coding and testing, so I'll leave them for another day. I like the idea of using a tree structure for the word lists to avoid duplicate storage of letters... e.g. if there are 1000 words starting with 'a' then it should be possible to store all of them in a container marked 'a' and store just the rest of each string minus the 'a', saving more memory. But that's an idea for later.
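The tree idea is usually called a trie. A minimal sketch of one for word counts might look like the following; the class and method names are made up for the example, and it only shows the structure. With one Python object per letter it would actually use more memory than a plain dict, so a real implementation would need a much more compact node layout to come out ahead.

```python
class TrieNode:
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}   # next letter -> TrieNode
        self.count = None    # None means no word ends at this node


class WordTrie:
    """Word -> count map where words sharing a prefix share the nodes for it."""

    def __init__(self):
        self.root = TrieNode()
        self.nodes = 1  # counts the root

    def set(self, word, count):
        node = self.root
        for ch in word:
            nxt = node.children.get(ch)
            if nxt is None:
                nxt = node.children[ch] = TrieNode()
                self.nodes += 1
            node = nxt
        node.count = count

    def get(self, word, default=0):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return default
        return default if node.count is None else node.count


trie = WordTrie()
for word, count in (("apple", 3), ("apply", 5), ("ape", 1)):
    trie.set(word, count)

print(trie.get("apply"), trie.get("app"), trie.nodes)
```

The three sample words have 13 letters between them but need only 7 letter nodes plus the root, because the shared "ap" and "appl" prefixes are stored once.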