Tuesday, April 15, 2003
Back in the saddle
Since my onslaught of work has at least temporarily abated, I've gotten some more time to work on Evil Toaster lately, which is a good thing.
I had been meaning to update some test code to handle non-IMAP servers with JavaMail (specifically offline Stores to store mail locally). I finally wrote a Java app that will copy mail from one Store to another, letting me copy everything from an IMAP store to a local store, for instance. Along the way I found that the ICEMail [icemh] provider had a lot of problems, at least on MacOS X with JDK 1.4.1. Yesterday I addressed the most immediate of those shortcomings and got it going with my code, which pleasantly enough didn't need major changes to work with non-IMAP stores (I had to change maybe two lines of code, and those were really part of a bug).
Anyway, doing work with JavaMail is a lot faster now that I'm not connecting to mail.mac.com via SSL, and instead accessing local files in pretty much the same way. I learned my lesson about not using SSL at DefCon when I was sh33p'd on my mac.com account over the open wireless AP. D'oh! I still haven't gotten the photo of my sheepage developed yet, oh well. I have like 10 disposable cameras from DefCon and Edwards waiting to be developed to PhotoCD.
With that done, I got the [Lucene] mail indexing stuff updated to handle links in emails pretty much as I want to. They're being extracted from the email body using a regular expression, which is fast and, with some tweaking, pretty reliable. I had planned on having Java grab the HTML a link pointed to from an email and index that as well - how many times have you searched through mails to find a link someone sent you, right? - but yesterday I realized that wouldn't be good: it would probably end up indexing spam links, which is bad, bad, bad. The HTML viewing component for EvilToaster (using Mozilla's Gecko rendering engine as the [CHBrowserView]) has a set of controls on it that allow you to only view HTML from certain people, not view HTML emails at all, or view them but the CHBrowserView will *not* get any images, etc. over the network.
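For the curious, here is a minimal sketch of the regex approach to pulling links out of a message body. It is not the actual Evil Toaster code: the class name `LinkExtractor`, the pattern itself, and the trailing-punctuation cleanup are all assumptions made for illustration, and it uses only `java.util.regex` from the standard library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the real indexer code.
class LinkExtractor {
    // Deliberately loose: an http/https scheme followed by any run of
    // characters that are not whitespace, quotes, or angle brackets.
    private static final Pattern URL =
            Pattern.compile("https?://[^\\s<>\"']+", Pattern.CASE_INSENSITIVE);

    // Returns each distinct link in the body, in order of first appearance.
    static List<String> extractLinks(String body) {
        List<String> links = new ArrayList<>();
        Matcher m = URL.matcher(body);
        while (m.find()) {
            String url = stripTrailingPunctuation(m.group());
            if (!links.contains(url)) {
                links.add(url);
            }
        }
        return links;
    }

    // Sentence punctuation right after a link usually is not part of it.
    // This is a heuristic: it will also clip a URL that really ends in ')'.
    static String stripTrailingPunctuation(String url) {
        int end = url.length();
        while (end > 0 && ".,;:!?)".indexOf(url.charAt(end - 1)) >= 0) {
            end--;
        }
        return url.substring(0, end);
    }

    public static void main(String[] args) {
        String body = "See http://www.something.com/page.html, or "
                + "<a href=\"http://example.org/a?b=1\">this</a>.";
        for (String link : extractLinks(body)) {
            System.out.println(link);
        }
    }
}
```

The "tweaking" mostly lives in the character class and the punctuation list; a stricter pattern misses odd-looking links, a looser one drags in junk.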
Why? Here's Apple's [simple explanation] of it. Basically, viewing spam HTML email messages can send information to the spammer without your knowledge, usually a confirmation that there is a live person at your email address. That results in more spam. I don't know about Hotmail, but Yahoo mail does have a setting that will prevent HTML emails from loading graphics, etc. over the network - and it's very effective at stopping spam.
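The three viewing modes mentioned earlier (no HTML at all, HTML only from certain people, HTML with no network access) boil down to two yes/no questions per message. Here is a rough sketch of that decision logic in plain Java. Everything in it is hypothetical: the `HtmlViewPolicy` class, the mode names, and the choice to let `cid:` and `data:` references through because they never touch the network. It says nothing about how the Gecko view itself is wired up.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical policy object; names and behavior are assumptions.
class HtmlViewPolicy {
    enum Mode { NO_HTML, TRUSTED_SENDERS_ONLY, HTML_WITHOUT_NETWORK }

    private final Mode mode;
    private final Set<String> trustedSenders = new HashSet<>();

    HtmlViewPolicy(Mode mode, String... trusted) {
        this.mode = mode;
        for (String address : trusted) {
            trustedSenders.add(normalize(address));
        }
    }

    private static String normalize(String address) {
        return address.trim().toLowerCase(Locale.ROOT);
    }

    // Should the HTML part be rendered, or should we fall back to plain text?
    boolean shouldRenderHtml(String sender) {
        switch (mode) {
            case NO_HTML:
                return false;
            case TRUSTED_SENDERS_ONLY:
                return trustedSenders.contains(normalize(sender));
            default:
                return true; // HTML_WITHOUT_NETWORK renders everyone's HTML
        }
    }

    // May the renderer load a resource (image, stylesheet, ...) the message refers to?
    boolean mayFetch(String sender, String resourceUrl) {
        if (!shouldRenderHtml(sender)) {
            return false;
        }
        String url = resourceUrl.trim().toLowerCase(Locale.ROOT);
        if (url.startsWith("cid:") || url.startsWith("data:")) {
            return true; // carried inside the message, no network request
        }
        // Anything else needs the network, which is what tips off the spammer.
        return mode != Mode.HTML_WITHOUT_NETWORK;
    }
}
```

With this shape, a spam message in `HTML_WITHOUT_NETWORK` mode still displays, but its tracking image never gets requested.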
So I have to rethink the link indexing. While it's nice to have just links like "http://www.something.com" in the index, from the user's perspective it's a lot more useful to have it index the actual content at that address, not the URL. Even if it's just the actual HTML title of that page, that's a lot better than a cryptic URL. I'm probably going to have to come up with some kind of ruleset to determine what links to follow, which oh ya will be lots of fun.
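To make the ruleset idea concrete, here is one possible starting point, sketched in Java. None of these rules are decided; the `LinkFollowRules` class, the trusted-sender list, the blocked-host list, and the "skip anything with a query string" rule are all assumptions picked for illustration. The point is just that every rule is a cheap check that happens before any network request is made.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical ruleset; every rule here is an assumption, not a final design.
class LinkFollowRules {
    private final Set<String> trustedSenders = new HashSet<>();
    private final Set<String> blockedHosts = new HashSet<>();

    void trustSender(String address) {
        trustedSenders.add(address.trim().toLowerCase(Locale.ROOT));
    }

    void blockHost(String host) {
        blockedHosts.add(host.trim().toLowerCase(Locale.ROOT));
    }

    // Decides whether the indexer should fetch the page a link points to.
    boolean shouldFollow(String sender, String link) {
        // Rule 1: only follow links in mail from people we already trust.
        if (!trustedSenders.contains(sender.trim().toLowerCase(Locale.ROOT))) {
            return false;
        }
        URI uri;
        try {
            uri = new URI(link);
        } catch (URISyntaxException e) {
            return false; // not parseable, so not worth fetching
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        // Rule 2: plain web links only.
        if (scheme == null || host == null) {
            return false;
        }
        if (!scheme.equalsIgnoreCase("http") && !scheme.equalsIgnoreCase("https")) {
            return false;
        }
        // Rule 3: never touch hosts on the block list.
        if (blockedHosts.contains(host.toLowerCase(Locale.ROOT))) {
            return false;
        }
        // Rule 4 (aggressive): skip links with a query string, since that is
        // where per-recipient tracking tokens tend to hide.
        return uri.getRawQuery() == null;
    }
}
```

Links that fail the rules can still go into the index as bare URLs; they just don't get fetched.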
Lucene 1.3RC1 is out, which among many [improvements] I've been waiting for has an API for "find similar" kind of functionality - which kicks ass.
On the Cocoa side, I've made a lot of progress implementing some things, like a view very much like iChat's [contact list]. Apple has this annoying thing where they implement UI elements and don't let developers access them, like everything in iChat, the search field in Mail.app, etc. It does annoy the hell out of me, but screw 'em. My app is going to be a lot easier to use than their stuff, even though I'm writing it with a target audience of one.
My brother has been pointing me to UI ideas from the Newton yet again; some would work well on MacOS X, some wouldn't. We'll see what I can pull off there :P
[ 4/15/2003 05:50:00 PM ]