Bill Lovett's Blog
http://ilovett.com/blog
Bill Lovett (bill@ilovett.com)
Feed updated 2012-05-17T14:25:18-04:00, generated by Zend_Feed_Writer

Testing App Engine Inbound Mail From Dev
2010-11-17T00:00:00-05:00
http://ilovett.com/blog/appengine/testing-inbound-mail

The App Engine Development Console lets you send messages to your application while you're in development. But the message composition is relatively simplistic, and won't expose some of the edge cases and general weirdness you'll see in production once your application starts receiving messages from the riffraff of the Internet at large.

Here are two of my favorite mail-related errors that have shown up in my application's logs, but never cropped up during development:

    InvalidEmailError: Empty email address for cc.
    UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 59: ordinal not in range(128)

The tedious way of debugging errors like these is to send a test email to a secondary deployed instance of your application, monitor the log for errors, make a code change, deploy a fix, and repeat. That quickly ceases to be fun.

An easier approach is to test against your development server via curl. Here's what that workflow looks like:

1. Find an email that triggers an error and save it to a file (headers and all).
2. Edit the address in the To: header to something your application will accept.
3. Use curl's --data-binary option to post that file to your application.

Here's an example:

    curl -i -v \
         --data-binary @bad-email.txt \
         http://localhost:8000/_ah/mail/foo@bar.appspotmail.com

The -i and -v options aren't strictly necessary, but provide more insight into what's being sent. The --data-binary option side-steps urlencoding issues that would otherwise crop up if you just used --data.
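For anyone who would rather script the request than type the curl command, a rough Python equivalent might look like the sketch below. It assumes only what the curl example shows: a dev server on localhost:8000 and the /_ah/mail/ path followed by the recipient address. The helper name build_mail_request and the sample message are made up for illustration. The saved message goes out as raw bytes, which is what --data-binary does, so an odd byte like 0x92 reaches the handler untouched.

```python
import urllib.request


def build_mail_request(raw_message, recipient, host="localhost:8000"):
    """Build a POST request that carries a saved email as raw bytes.

    build_mail_request is a hypothetical helper; the URL layout is
    copied from the curl example.
    """
    url = "http://%s/_ah/mail/%s" % (host, recipient)
    # Passing bytes as data leaves the payload unencoded, like --data-binary.
    return urllib.request.Request(url, data=raw_message, method="POST")


# Made-up message containing the 0x92 byte named in the UnicodeDecodeError.
sample = b"To: foo@bar.appspotmail.com\r\nSubject: test\r\n\r\nIt\x92s broken\r\n"
request = build_mail_request(sample, "foo@bar.appspotmail.com")

# To actually send it to a running dev server:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

In practice the bytes would come from the saved file, for example open("bad-email.txt", "rb").read().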
Although the messages emitted from the Development Console include some other headers like Content-Type: message/rfc822; charset=UTF-8, leaving them out doesn't appear to cause any problems.

The only problem I've run into with this approach is that I sometimes need to restart the dev server after a request returns an error. Not quite sure what that's all about, but it beats going through a full deploy cycle every time.

Getting Started with mod_pagespeed
2010-11-03T00:00:00-04:00
http://ilovett.com/blog/programming/mod-pagespeed

Google released mod_pagespeed for Apache today, making it much easier to implement a lot of improvements to website performance with next to no effort. The biggest benefits will probably come from the worst offenders--sites that haven't already implemented best practices. For sites that have, some of what mod_pagespeed offers will be redundant. But the trickier parts are what make it worthwhile.

Installation was easy because Google provides .debs and .rpms. The creation of /etc/apt/sources.list.d/mod-pagespeed.list was a nice touch as well. On a Debian system, you'll start out with a default configuration in /etc/apache2/mods-available/pagespeed.conf that is semi-daunting but really not so bad. You can set up a much smaller config in your site's virtual host using the default minus the comments, and end up with something that's really only a dozen lines or so.

The filters that pertain to CSS and JS minification are the sort of baseline improvements that you may have already made if you've read High Performance Web Sites or Even Faster Web Sites. Similarly, the filters that pull inline scripts and styles into external resources (outline_css, outline_javascript) or vice versa (inline_css, inline_javascript) might only be of interest if you're optimizing something you otherwise don't want to or cannot modify.
To borrow a book title cliché, the "good parts" of mod_pagespeed start with collapse_whitespace, which will minify your HTML source, and remove_quotes, which culls omittable quotation marks from attributes. That level of optimization is easy to dismiss as not-worth-the-trouble, but if mod_pagespeed can do all the dirty work, why not? Every little bit helps.

It's the same story with rewrite_images, which can replace small images with data URIs, and extend_cache, which can hash filenames for static assets. It's not so bad making optimizations on a case-by-case basis, but it takes more effort to work up a solution that covers an entire site reliably while being cross-browser.

Lastly, there's mod_pagespeed_beacon. Capturing client load times in the access log is much simpler than setting up a dedicated endpoint. You can collect the data passively and process it whenever you feel like it, or not at all. So again, why not?

Rebuilding Lucene Indexes Via Zend_Tool
2010-10-28T00:00:00-04:00
http://ilovett.com/blog/programming/rebuilding-lucene-indexes-via-zend-tool

I recently worked out a better process for updating the index that powers the search feature on this site. The index is built with Zend_Search_Lucene, but now I interact with it from a Zend_Tool provider.

The original search implementation relied on a hidden controller action which would kick off an index rebuild when accessed. It was convenient but awkward because I had to remember to hit the URL after I published a new entry (my blog entries are stored as text files, like Blosxom). I never bothered with details like securing access to the hidden URL or dealing with browser timeouts, so it was easy to miss when things went wrong, or to forget the necessary steps entirely.

The new implementation still begins with a Search model that oversees the interaction with Zend_Search_Lucene.
But now I kick off the rebuild from the command line, which makes it easy to incorporate into a Phing deployment target.

I had hoped that creating a Zend_Tool provider would give me the equivalent of a management command in Django or a Rake task in Rails, but Zend_Tool as of 1.10 isn't quite the same. As pointed out in Creating Zend_Tool Providers by Matthew Weier O'Phinney, your provider doesn't give you all the project-specific luxuries you might expect when you're developing the application from the MVC side. So no fancy autoloading, and no automatic configuration from application.ini. Which makes sense--providers that generate skeleton files from templates don't need to be acquainted with your application.

For occasions where you do want that level of connection, Zend_Tool can still be viable with not much effort. Here are the relevant parts of my application's directory structure:

    $ tree
    .
    .zf.ini
    bin
    |-- zf.bat
    |-- zf.php
    `-- zf.sh
    library/Lovett/
    `-- Tool
        `-- LuceneProvider.php

The bin folder is copied out of the framework download to avoid environment dependency--the same rationale behind putting library/Zend under your application.

The .zf.ini is outlined in Matthew's blog post. Since Zend_Tool doesn't currently support project-specific providers, a workaround is to specify a configuration file prior to calling zf.sh which can point them out to Zend_Tool. Mine looks like this:

    php.includepath = "/var/www/ilovett/website/library:.:/usr/share/php"
    basicloader.classes.1 = "Lovett_Tool_LuceneProvider"

This corresponds to a zf.sh invocation of:

    export ZF_CONFIG_FILE=./.zf.ini; ./bin/zf.sh reindex lucene

My first attempt at creating a Zend_Tool provider was fraught with disappointment because I was expecting the component to do all the legwork for me, as it does elsewhere in the application. So when I got errors about my models not being found, I fell back to require_once.
When I realized that I didn't have a way of getting at some values in application.ini, I made them into parameters to be specified at run time. Although it worked, it was lame. Lame in terms of repeating configuration values, lame in terms of not taking advantage of autoloading, and lame in terms of the end result being more of a standalone script than an extension of my application.

Things got a lot easier when I started to think of my provider like an alternate version of index.php. That file is where the constant APPLICATION_PATH gets defined and where the Zend_Application instance is created. By doing the same in my provider's constructor, I regained access to the application and could interact with my search model just like I would from a controller.

I didn't see any need for a manifest, so I skipped that piece entirely.

Using Org-mode Spreadsheets For Personal Finance
2010-10-27T00:00:00-04:00
http://ilovett.com/blog/projects/using-org-mode-spreadsheets-for-personal-finance

I recently started using Org-mode to manage my personal finances. It's pretty much the greatest thing ever.

Previously I used a Google Spreadsheet with separate worksheets per account. Before that I used a web-based application of my own creation, and before that I used Microsoft Money. Org-mode wins out because it gives me just the basics I need, and wraps them in the simplicity and convenience of a text file.

The main thing I learned from using Microsoft Money is that my personal finance needs aren't that complicated. All I really need to do is make sure the amount of money I think I have syncs up with the amount of money the bank thinks I have.

Google Spreadsheets was nice at first because I could access it from any machine and any location. But the page load time and slight sluggishness of the interface started to annoy me.
Since I already use Org-mode for other things, it was really just a question of getting past the learning curve of the table editor's spreadsheet capabilities.

Getting started was straightforward. I exported a CSV from Google Docs and converted it to an Org-mode table via C-c |. That yields a fairly ugly table, because Org-mode expands each column to accommodate the length of its contents. If you have a note field, that's no good. The solution is to specify column widths, allowing Org-mode to truncate long cells.

My system is based on entries in reverse chronological order with the current balance kept as a running total. Column formulas reduce that balance calculation to a single keystroke. Org-mode's table editor pretty much takes care of everything else.

One slight downside I didn't expect is that currency formatting had to go. Dollar signs, commas, and the like interfered with Org-mode's calculations. In practice, I don't miss them too much.

Here's the template I'm using:

    #+STARTUP: align
    | Date       | Clear | Source               | Amount     | Balance   | Note                      |
    |------------+-------+----------------------+------------+-----------+---------------------------|
    | <10>       | <5>   | <20>                 | <10>       | <9>       | <25>                      |
    | 2010-10-27 |       | Unexpected dividend  | 50.00      | 55.00     | Another silly example     |
    | 2010-10-15 | X     | ATM Withdrawal       | -5.00      | 5.00      | A ridiculous example      |
    | 2010-10-01 | X     | Opening Balance      | 0          | 10.00     |                           |
    #+TBLFM: $5=@+1 + $4;%.2f

The first line ensures everything lines up by default when you first visit the file. The last line captures the formula for the Balance column: each row's balance is the balance of the row below it plus the row's own amount, formatted to two decimal places.

Unlike Excel or Google Spreadsheets, the value of each cell in the Balance column is not recalculated on the fly. Pressing = while in the cell invokes the formula, but it's more like a one-time shortcut as opposed to a dynamic value. The downside is that you need to recalculate the balance if you reorder the rows. For my purposes, not a big deal.
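To make the column formula concrete, here is the same arithmetic as a small Python sketch using the rows from the template. The function and variable names are invented for illustration, and the opening balance is treated as a hand-entered value because the bottom row has no row beneath it to read from.

```python
from decimal import Decimal

# Rows in reverse chronological order, as in the template: (date, amount).
rows = [
    ("2010-10-27", Decimal("50.00")),
    ("2010-10-15", Decimal("-5.00")),
    ("2010-10-01", Decimal("0")),
]
opening_balance = Decimal("10.00")  # entered by hand on the bottom row


def running_balances(rows, opening_balance):
    """Return balances top to bottom: balance of the row below + own amount."""
    balances = [opening_balance]
    # Walk upward, starting from the row just above the opening balance.
    for _date, amount in reversed(rows[:-1]):
        balances.append(balances[-1] + amount)
    balances.reverse()
    return balances


for (date, amount), balance in zip(rows, running_balances(rows, opening_balance)):
    print(date, f"{amount:.2f}", f"{balance:.2f}")
```

Reordering the rows list changes every balance above the moved row, which is why the column has to be recalculated after a reorder.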
Adding a new transaction is just a matter of adding a new line above the previous transaction, filling out each field and tabbing to the next, then calculating the balance. It's very quick.

Command Line JavaScript Validation
2010-09-01T00:00:00-04:00
http://ilovett.com/blog/frontend-util/command-line-javascript-validation

Let's make JSLint easier to run by calling it from the command line via a wrapper script. Let's also incorporate Google's Closure Linter while we're at it for double the fun.

Here's the finished product. Read on for further details.

Why?

Linters make your code look better to machines as well as humans. They point out stupid typos and petty errors. They enforce best practices and consistent style, and prevent you from lapsing just because you were in a hurry. Best of all, they let you know whether you're really as good as you think you are.

Scenarios for Running JSLint

There are lots of ways to incorporate JSLint into your workflow, but none were as low-effort as I wanted.

Flymake for JavaScript seemed promising at first, but I decided that on-the-fly checking would be annoying.

Maybe something service-esque? There's Lintnode, but its components no longer work together.

How about running JSLint's web interface locally, scripting a POST request and scraping the result? That's all wrong. JSLint needs a JavaScript runtime--the checking happens client-side. There are no obvious browser automation possibilities here, and copy-paste is not desirable.

Maybe SpiderMonkey? There's a workaround for its inability to read files, but I'm not keen on using a modified version of JSLint.

That leaves Rhino. It's covered by one of the "official" editions of JSLint, but it's not necessarily a speed demon. I can live with that.

The Setup

If you use JSLint straight from the command line, the results won't be pretty.

    rhino jslint.js functions.js
    ...
    Lint at line 8 character 13: 'jQuery' is not defined.
References to third-party libraries will throw spurious errors because JSLint hasn't been told about them. We need to define some global variables to prevent that confusion. We also need a flexible way to specify JSLint's configuration options.

These things could go directly into the files you will lint, but that's extraneous clutter. The approach I've taken instead is:

1. Define a standard set of options in the wrapper script.
2. Pass in any additional globals as an argument.
3. Prepend both to a copy of the file being linted.
4. Stop after the first error.

I find that going through errors one at a time is easier than seeing everything at once.

The Settings

I started with JSLint's "Good Parts" options. I disabled "Strict white space" (white) because of a difference in opinion with Closure Linter I'll describe in a bit. As mentioned above, "Stop on first error" (passfail) is enabled. So is "Assume a browser" (browser).

Adding Closure Linter

Closure Linter is already suited to running from the command line, so not much effort is involved here. I'm running it after JSLint, but the order could certainly be reversed.

JSLint and Closure Linter disagree about whitespace between "function" and the left parenthesis that follows it. JSLint wants it, Closure Linter doesn't. I decided to go with the latter (hence the "Strict white space" config change mentioned above). I'm sure there are epic arguments on both sides, but this is minutiae.

How It Looks

Example 1:

    $ validate-js functions.js
    # JSLint ############################################################
    No problems found.
    # Closure Linter ############################################################
    Line 5, E:0220: No docs found for member 'BB.video'
    Line 25, E:0220: No docs found for member 'BB.rsvp'
    Line 79, E:0110: Line too long (103 characters).
    Line 85, E:0220: No docs found for member 'BB.slideshow'

Example 2:

    $ validate-js functions.js
    # JSLint ############################################################
    Line 1 character 1: Missing "use strict" statement.
    var GK = function() {
    Stopping (0% scanned).
    # Closure Linter ############################################################
    Line 7, E:0131: Single-quoted string preferred over double-quoted string.
    Line 7, E:0131: Single-quoted string preferred over double-quoted string.
    Line 8, E:0131: Single-quoted string preferred over double-quoted string.
    Line 8, E:0131: Single-quoted string preferred over double-quoted string.
    Line 9, E:0131: Single-quoted string preferred over double-quoted string.
    Line 9, E:0131: Single-quoted string preferred over double-quoted string.
    Line 9, E:0110: Line too long (82 characters).
    Line 36, E:0131: Single-quoted string preferred over double-quoted string.
    Line 36, E:0131: Single-quoted string preferred over double-quoted string.
    Line 37, E:0131: Single-quoted string preferred over double-quoted string.
    Line 38, E:0131: Single-quoted string preferred over double-quoted string.
    Line 39, E:0131: Single-quoted string preferred over double-quoted string.
    Line 42, E:0002: Missing space before "="
    Line 42, E:0002: Missing space after "="
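For readers who want to try the prepend step from The Setup without the finished script, it could be sketched roughly like this in Python. This is only an illustration and not the wrapper itself: the function names are hypothetical, the three options are the ones named in The Settings (the rest of the "Good Parts" set is left out), and the directives use JSLint's /*jslint ... */ and /*global ... */ comment forms.

```python
import os
import tempfile

# Options named in The Settings; the rest of the "Good Parts" set is omitted.
DEFAULT_OPTIONS = {"white": False, "passfail": True, "browser": True}


def build_header(extra_globals, options=DEFAULT_OPTIONS):
    """Return JSLint directive lines for the given options and globals."""
    opts = ", ".join(
        "%s: %s" % (name, "true" if value else "false")
        for name, value in sorted(options.items())
    )
    header = "/*jslint %s */\n" % opts
    if extra_globals:
        header += "/*global %s */\n" % ", ".join(extra_globals)
    return header


def make_lint_copy(source_path, extra_globals):
    """Write a temporary copy of source_path with the directives prepended."""
    with open(source_path, encoding="utf-8") as handle:
        source = handle.read()
    fd, copy_path = tempfile.mkstemp(suffix=".js")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(build_header(extra_globals) + source)
    return copy_path
```

The path returned by make_lint_copy can then be handed to rhino jslint.js in place of the original file. One side effect to keep in mind is that line numbers in the report shift by the number of prepended lines.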