Bill Lovett's Blog
2012-05-17T14:25:18-04:00
http://ilovett.com/blog
Bill Lovett
bill@ilovett.com
2010-11-17T00:00:00-05:00
2010-11-17T00:00:00-05:00
http://ilovett.com/blog/appengine/testing-inbound-mail
Bill Lovett
bill@ilovett.com
The App Engine Development Console lets you send messages to your
application while you're in development. But the message composition
is relatively simplistic, and won't expose some of the edge cases and
general weirdness you'll see in production once your application
starts receiving messages from the riffraff of the Internet at
large.
Here are two of my favorite mail-related errors that have shown up
in my application's logs, but never cropped up during development:
InvalidEmailError: Empty email address for cc.
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 59: ordinal not in range(128)
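The second error is easy to reproduce in isolation. Byte 0x92 is the right single quotation mark in Windows-1252, so it tends to appear when a mail client sends "smart quotes" and the body is then treated as ASCII. Here's a minimal Python 3 sketch; the sample bytes and the cp1252 guess are assumptions for illustration, not taken from the message that actually failed:

```python
# Hypothetical body: 59 ASCII bytes, then a Windows-1252 "smart" apostrophe.
raw = b"x" * 59 + b"\x92s the byte that breaks things"

try:
    raw.decode("ascii")
except UnicodeDecodeError as err:
    # Same failure as in the log: byte 0x92 at position 59.
    error_message = str(err)
    print(error_message)

# Decoding with the charset the sender probably used succeeds.
text = raw.decode("cp1252")
print(ascii(text[59]))  # the right single quotation mark, shown as an escape
```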
The tedious way of debugging errors like these is to send a test
email to a secondary deployed instance of your application, monitor
the log for errors, make a code change, deploy a fix, and repeat. That
quickly ceases to be fun. An easier approach is to test against your
development server via curl. Here's what that workflow looks like:
1. Find an email that triggers an error and save it to a file
   (headers and all).
2. Edit the address in the To: header to something your application
   will accept.
3. Use curl's --data-binary option to post that file to
   your application.
Here's an example:
curl -i -v \
     --data-binary @bad-email.txt \
     http://localhost:8000/_ah/mail/foo@bar.appspotmail.com
The -i and -v options aren't strictly
necessary, but provide more insight into what's being sent. The
--data-binary option side-steps urlencoding issues that
would otherwise crop up if you just used --data. Although
the messages emitted from the Development Console include some other
headers like Content-Type: message/rfc822; charset=UTF-8,
leaving them out doesn't appear to cause any problems.
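If you'd rather drive this from a script than from curl, the same request can be sketched with Python 3's standard library. The function names here are made up for illustration, and the host, port, and recipient simply mirror the curl example above; adjust them to whatever your dev server uses.

```python
import urllib.request


def build_mail_request(raw_message, recipient, host="http://localhost:8000"):
    """Build a POST to the dev server's inbound mail handler."""
    url = "%s/_ah/mail/%s" % (host, recipient)
    return urllib.request.Request(
        url,
        data=raw_message,  # bytes are sent untouched, like curl --data-binary
        # Mirrors the header the Development Console sends; optional per the text.
        headers={"Content-Type": "message/rfc822"},
        method="POST",
    )


def post_saved_email(path, recipient):
    """Read a saved message (headers and all) and post it as-is."""
    with open(path, "rb") as handle:
        request = build_mail_request(handle.read(), recipient)
    # urlopen raises urllib.error.HTTPError if the handler returns an error status.
    with urllib.request.urlopen(request) as response:
        return response.status


# Example (needs a running dev server):
# post_saved_email("bad-email.txt", "foo@bar.appspotmail.com")
```

Reading the file in binary mode matters for the same reason --data-binary does: the bytes that triggered the error need to arrive unchanged.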
The only problem I've run into with this approach is that I
sometimes need to restart the dev server after a request returns an
error. Not quite sure what that's all about, but it beats going
through a full deploy cycle every time.
2010-11-03T00:00:00-04:00
2010-11-03T00:00:00-04:00
http://ilovett.com/blog/programming/mod-pagespeed
Bill Lovett
bill@ilovett.com
Google released mod_pagespeed
for Apache today, making it much easier to implement a lot of
improvements to website performance with next to no effort. The
biggest benefits will probably come from the worst offenders--sites
that haven't already implemented best practices. For sites that have,
some of what mod_pagespeed offers will be redundant. But the trickier
parts are what make it worthwhile.
Installation was easy because Google
provides .debs and .rpms. The creation of
/etc/apt/sources.list.d/mod-pagespeed.list was a nice
touch as well. On a Debian system, you'll start out with a default
configuration in
/etc/apache2/mods-available/pagespeed.conf that is
semi-daunting but really not so bad. You can set up a much smaller
config in your site's virtual host using the default minus the
comments, and end up with something that's really only a dozen lines
or so.
The filters that pertain to CSS and JS minification are the
sort of baseline improvements that you may have already made if you've
read High Performance Web Sites or Even Faster Web Sites.
Similarly, the filters
that pull inline scripts and styles into external resources
(outline_css, outline_javascript) or vice versa (inline_css,
inline_javascript) might only be of interest if you're optimizing
something you otherwise don't want to or cannot modify.
To borrow a book title cliché, the "good parts" of mod_pagespeed
start with collapse_whitespace, which will minify your
HTML source, and remove_quotes, which culls omittable
quotation marks from attributes. That level of optimization is easy
to dismiss as not-worth-the-trouble, but if mod_pagespeed can do all
the dirty work, why not? Every little bit helps.
It's the same story with rewrite_images, which can
replace small images with data URIs, and extend_cache,
which can hash filenames for static assets. It's not so bad making
optimizations on a case-by-case basis, but it takes more effort to
work up a solution that covers an entire site reliably while being
cross-browser.
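To give a sense of what those two rewrites involve when done by hand, here's a rough Python sketch. It is not how mod_pagespeed implements either filter; the function names, the choice of MD5, and the naming scheme are all assumptions made for illustration.

```python
import base64
import hashlib
import os


def to_data_uri(content, mime_type):
    """Inline a small asset as a data URI instead of a separate request."""
    encoded = base64.b64encode(content).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


def hashed_name(filename, content):
    """Put a content hash in the filename so the file can be cached for a long time."""
    stem, ext = os.path.splitext(filename)
    digest = hashlib.md5(content).hexdigest()[:8]  # hypothetical scheme: 8 hex chars
    return "%s.%s%s" % (stem, digest, ext)


print(to_data_uri(b"abc", "image/png"))
print(hashed_name("logo.png", b"abc"))
```

The functions themselves are trivial. The effort is in applying them to every reference across a site and keeping the markup in sync, which is the part the module takes over.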
Lastly, there's mod_pagespeed_beacon. Capturing client
load times in the access log is much simpler than setting up a
dedicated endpoint. You can collect the data passively and process it
whenever you feel like it, or not at all. So again, why not?
2010-10-28T00:00:00-04:00
2010-10-28T00:00:00-04:00
http://ilovett.com/blog/programming/rebuilding-lucene-indexes-via-zend-tool
Bill Lovett
bill@ilovett.com
I recently worked out a better process for updating the index that
powers the search feature on this site. The index is built with Zend_Search_Lucene,
but now I interact with it from a Zend_Tool
provider.
The original search implementation relied on a hidden controller
action which would kick off an index rebuild when accessed. It was
convenient but awkward because I had to remember to hit the URL after
I published a new entry (my blog entries are stored as text files,
like Blosxom). I never bothered
with details like securing access to the hidden URL or dealing with
browser timeouts, so it was easy to miss when things went wrong, or to
forget the necessary steps entirely.
The new implementation still begins with a Search model that
oversees the interaction with Zend_Search_Lucene. But now I kick off
the rebuild from the command line, which makes it easy to incorporate
into a Phing deployment target.
I had hoped that creating a Zend_Tool provider would give me the
equivalent of a management command in Django or a Rake task in Rails,
but Zend_Tool as of 1.10 isn't quite the same. As pointed out in Creating
Zend_Tool Providers by Matthew Weier O'Phinney, your provider
doesn't give you all the project-specific luxuries you might expect
when you're developing the application from the MVC side. So no fancy
autoloading, and no automatic configuration from
application.ini. Which makes sense--providers that generate skeleton
files from templates don't need to be acquainted with your
application.
For occasions where you do want that level of connection, Zend_Tool
can still be viable with not much effort. Here are the relevant parts of
my application's directory structure:
$ tree .
.zf.ini
bin
|-- zf.bat
|-- zf.php
`-- zf.sh
library/Lovett/
`-- Tool
    `-- LuceneProvider.php
The bin folder is copied out of the framework download to avoid
environment dependency--the same rationale behind putting
library/Zend under your application. The
.zf.ini is outlined in Matthew's blog post. Since
Zend_Tool doesn't currently support project-specific providers, a
workaround is to specify a configuration file prior to calling zf.sh
which can point them out to Zend_Tool. Mine looks like this:
php.includepath = "/var/www/ilovett/website/library:.:/usr/share/php"
basicloader.classes.1 = "Lovett_Tool_LuceneProvider"
This corresponds to a zf.sh invocation of:
export ZF_CONFIG_FILE=./.zf.ini; ./bin/zf.sh reindex lucene
My first attempt at creating a Zend_Tool provider was fraught with
disappointment because I was expecting the component to do all the
legwork for me, as it does elsewhere in the application. So when I got
errors about my models not being found, I fell back to
require_once. When I realized that I didn't have a way of
getting at some values in application.ini, I made them
into parameters to be specified at run time.
Although it worked, it was lame. Lame in terms of repeating
configuration values, lame in terms of not taking advantage of
autoloading, and lame in terms of the end result being more of a
standalone script than an extension of my application.
Things got a lot easier when I started to think of my provider like
an alternate version of index.php. That file is where the constant
APPLICATION_PATH gets created and where the
Zend_Application application instance is created. By doing the same in
my provider's constructor, I regained access to the application and
could interact with my search model just like I would from a
controller.
I didn't see any need for a manifest, so I skipped that piece
entirely.
2010-10-27T00:00:00-04:00
2010-10-27T00:00:00-04:00
http://ilovett.com/blog/projects/using-org-mode-spreadsheets-for-personal-finance
Bill Lovett
bill@ilovett.com
I recently started using Org-mode
to manage my personal finances. It's pretty much the greatest thing
ever.
Previously I used a Google Spreadsheet with separate worksheets per
account. Before that I used a web-based application of my own
creation, and before that I used Microsoft Money. Org-mode wins out
because it gives me just the basics I need, and wraps them in the
simplicity and convenience of a text file.
The main thing I learned from using Microsoft Money is that my
personal finance needs aren't that complicated. All I really need to
do is make sure the amount of money I think I have syncs up
with the amount of money the bank thinks I have.
Google Spreadsheets was nice at first because I could access it
from any machine and any location. But the page load time and slight
sluggishness of the interface started to annoy me. Since I already use
Org-mode for other things, it was really just a question of getting
past the learning curve of the table editor's spreadsheet
capabilities.
Getting started was straightforward. I exported a CSV
from Google Docs and converted it to an Org-mode table via
C-c |. Which yields a fairly ugly table, because Org-mode
expands each column to accommodate the length of its contents. If you
have a note field, that's no good. The solution is to specify
column widths, allowing Org-mode to truncate long cells.
My system is based on entries in reverse chronological order with
the current balance kept as a running total. Column
formulas reduce that balance calculation to a single
keystroke. Org-mode's table editor pretty much takes care of
everything else.
One slight downside I didn't expect is that currency formatting had
to go. Dollar signs, commas, and the like interfered with Org-mode's
calculations. In practice, I don't miss them too much.
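If the exported CSV has a lot of formatted amounts, stripping them by hand gets old. Here's a small Python sketch of the cleanup; the function name is hypothetical, and the parenthesized-negative case is an assumption about what a spreadsheet export might contain rather than something from my own data.

```python
from decimal import Decimal


def plain_amount(text):
    """Turn '$1,250.00' into '1250.00' so the table formulas can use it."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    # Assumed: some exports write negatives accounting-style, as (5.00).
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]
    return format(Decimal(cleaned), ".2f")


for sample in ["$1,250.00", "-$5.00", "($5.00)", "50"]:
    print(sample, "->", plain_amount(sample))
```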
Here's the template I'm using:
#+STARTUP: align
| Date       | Clear | Source               |     Amount |   Balance | Note                      |
|------------+-------+----------------------+------------+-----------+---------------------------|
| <10>       | <5>   | <20>                 |       <10> |       <9> | <25>                      |
| 2010-10-27 |       | Unexpected dividend  |      50.00 |     55.00 | Another silly example     |
| 2010-10-15 | X     | ATM Withdrawal       |      -5.00 |      5.00 | A ridiculous example      |
| 2010-10-01 | X     | Opening Balance      |          0 |     10.00 |                           |
#+TBLFM: $5=@+1 + $4;%.2f
The first line ensures everything lines up by default when you
first visit the file. The last line captures the formula for the
Balance column. Unlike Excel or Google Spreadsheets, the value of
each cell in the Balance column is not recalculated on the
fly. Pressing = while in the cell invokes the formula,
but it's more like a one-time shortcut as opposed to a dynamic value. The
downside is that you need to recalculate the balance if you reorder
the rows. For my purposes, not a big deal.
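For anyone who wants to sanity-check the numbers outside Emacs, the formula amounts to "the balance in the row below, plus this row's amount", worked from the bottom of the table up. Here's a Python sketch of that arithmetic using the template's values; the function name and argument layout are my own invention for this example.

```python
from decimal import Decimal


def running_balances(amounts, opening_balance):
    """Recompute balances for rows listed newest first.

    amounts: the Amount cells for every row above the opening row.
    opening_balance: the Balance cell of the bottom (oldest) row.
    """
    balances = [Decimal(opening_balance)]
    # Walk from the oldest transaction to the newest, like applying
    # $5=@+1 + $4 from the bottom row upward.
    for amount in reversed(amounts):
        balances.append(balances[-1] + Decimal(amount))
    # Put the newest row first again and format like ;%.2f does.
    return [format(b, ".2f") for b in reversed(balances)]


print(running_balances(["50.00", "-5.00"], "10.00"))
```

Decimal is used rather than float so that repeated additions of cents don't drift.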
Adding a new transaction is just a matter of adding a new line above the previous transaction, filling out each field and tabbing to the next, then calculating the balance. It's very quick.
2010-09-01T00:00:00-04:00
2010-09-01T00:00:00-04:00
http://ilovett.com/blog/frontend-util/command-line-javascript-validation
Bill Lovett
bill@ilovett.com
Let's make JSLint easier to run
by calling it from the command line via wrapper script. Let's also
incorporate Google's Closure
Linter while we're at it for double the fun.
Here's
the finished product. Read on for further details.
Why?
Linters make your code look better to machines as well as humans.
They point out stupid typos and petty errors. They enforce best
practices and consistent style, and prevent you from lapsing just
because you were in a hurry. Best of all, they let you know whether
you're really as good as you think you are.
Scenarios for Running JSLint
There are lots of ways to incorporate JSLint into your workflow, but
none were as low-effort as I wanted. Flymake for JavaScript
seemed promising at first, but I decided that on-the-fly checking
would be annoying.
Maybe something service-esque? There's Lintnode, but its components
no longer work together. How about running JSLint's web interface
locally, scripting a POST request and scraping the result? That's
all wrong. JSLint needs a JavaScript runtime--the checking happens
client-side. There are no obvious browser automation possibilities
here, and copy-paste is not desirable.
Maybe Spidermonkey? There's a workaround for its
inability to read files, but I'm not keen on using a modified
version of JSLint.
That leaves Rhino. It's covered by one of the "official"
editions of JSLint, but it's not necessarily a speed demon. I can
live with that.
The Setup
If you use JSLint straight from the command line, the results won't be pretty.
rhino jslint.js functions.js
...
Lint at line 8 character 13: 'jQuery' is not defined.
References to third-party libraries will throw spurious errors because
JSLint hasn't been told about them. We need to define some global
variables to prevent that confusion. We also need a flexible way to
specify JSLint's configuration options.
These things could go directly into the files you will lint, but
that's extraneous clutter. The approach I've taken instead is:
1. Define a standard set of options in the wrapper script.
2. Pass in any additional globals as an argument.
3. Prepend both to a copy of the file being linted.
Stop after the first error
I find that going through errors one at a time is easier than seeing everything at once.
The Settings
I started with JSLint's "Good Parts" options. I disabled "Strict white
space" (white) because of a difference in opinion with Closure Linter
I'll describe in a bit. As mentioned above, "Stop on first error"
(passfail) is enabled. So is "Assume a browser" (browser).
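The prepend-to-a-copy idea can be sketched in a few lines of Python. This is an illustration of the approach, not the wrapper script itself: the function names are hypothetical, and only the three options named above are included, with the rest of the "Good Parts" set left out.

```python
import os
import tempfile

# Only the options named in the text; the other "Good Parts" settings are omitted.
OPTIONS = {"white": False, "passfail": True, "browser": True}


def build_header(options, extra_globals):
    """Render the /*jslint ... */ and /*global ... */ directive comments."""
    pairs = ", ".join(
        "%s: %s" % (name, str(value).lower())
        for name, value in sorted(options.items())
    )
    lines = ["/*jslint %s */" % pairs]
    if extra_globals:
        lines.append("/*global %s */" % ", ".join(extra_globals))
    return "\n".join(lines) + "\n"


def make_lintable_copy(path, extra_globals=()):
    """Write header + original source to a temp file and return its path."""
    with open(path) as original:
        source = original.read()
    handle, copy_path = tempfile.mkstemp(suffix=".js")
    with os.fdopen(handle, "w") as copy:
        copy.write(build_header(OPTIONS, extra_globals))
        copy.write(source)
    return copy_path


print(build_header(OPTIONS, ["jQuery"]))
```

One side effect to keep in mind: the prepended lines shift every line number JSLint reports, so a wrapper built this way needs to subtract the header length before showing results.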
Adding Closure Linter
Closure Linter is already suited to running from the command line, so
not much effort is involved here. I'm running it after JSLint, but the
order could certainly be reversed.
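Running the two tools back to back is mostly a matter of printing a banner and shelling out. Here's a Python sketch under a few assumptions: rhino and gjslint (Closure Linter's command) are on the PATH, jslint.js sits in the current directory, and the banner width is a guess at the output format.

```python
import subprocess


def build_commands(path, jslint_script="jslint.js"):
    """Return (title, argv) pairs in the order the linters should run."""
    return [
        ("JSLint", ["rhino", jslint_script, path]),
        ("Closure Linter", ["gjslint", path]),
    ]


def run_all(path):
    """Run each linter in turn under its own banner."""
    for title, argv in build_commands(path):
        print("# " + title)
        print("#" * 60)  # assumed banner width
        # Exit status is ignored so the second linter runs even if the first complains.
        subprocess.call(argv)


# Example (needs both tools installed):
# run_all("functions.js")
```

Swapping the order is just a matter of reordering the list in build_commands.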
JSLint and Closure Linter disagree about whitespace between "function"
and the subsequent left parenthesis. JSLint wants it, Closure Linter
doesn't. I decided to go with the latter (hence the "Strict white
space" config change mentioned above). I'm sure there are epic
arguments on both sides, but this is minutiae.
How It Looks
Example 1:
$ validate-js functions.js
# JSLint
############################################################
No problems found.
# Closure Linter
############################################################
Line 5, E:0220: No docs found for member 'BB.video'
Line 25, E:0220: No docs found for member 'BB.rsvp'
Line 79, E:0110: Line too long (103 characters).
Line 85, E:0220: No docs found for member 'BB.slideshow'
Example 2:
$ validate-js functions.js
# JSLint
############################################################
Line 1 character 1: Missing "use strict" statement.
var GK = function() {
Stopping (0% scanned).
# Closure Linter
############################################################
Line 7, E:0131: Single-quoted string preferred over double-quoted string.
Line 7, E:0131: Single-quoted string preferred over double-quoted string.
Line 8, E:0131: Single-quoted string preferred over double-quoted string.
Line 8, E:0131: Single-quoted string preferred over double-quoted string.
Line 9, E:0131: Single-quoted string preferred over double-quoted string.
Line 9, E:0131: Single-quoted string preferred over double-quoted string.
Line 9, E:0110: Line too long (82 characters).
Line 36, E:0131: Single-quoted string preferred over double-quoted string.
Line 36, E:0131: Single-quoted string preferred over double-quoted string.
Line 37, E:0131: Single-quoted string preferred over double-quoted string.
Line 38, E:0131: Single-quoted string preferred over double-quoted string.
Line 39, E:0131: Single-quoted string preferred over double-quoted string.
Line 42, E:0002: Missing space before "="
Line 42, E:0002: Missing space after "="