Posts Tagged ‘web applications’

Testing Email in Django The Easy Way

Wednesday, January 20th, 2010

Today a coworker showed me a very easy way to test Django code that sends emails. It’s straight from the documentation:

Another approach is to use a “dumb” SMTP server that receives the e-mails locally and displays them to the terminal, but does not actually send anything. Python has a built-in way to accomplish this with a single command:

python -m smtpd -n -c DebuggingServer localhost:1025

This command will start a simple SMTP server listening on port 1025 of localhost. This server simply prints to standard output all e-mail headers and the e-mail body. You then only need to set the EMAIL_HOST and EMAIL_PORT accordingly, and you are set.
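As a minimal sketch of that last step (my own illustration, not part of the quoted documentation), the two settings would look something like this in a development settings.py, assuming the debugging server was started with the command above:

```python
# settings.py (development only)
# Point Django's SMTP connection at the local debugging server
# started with: python -m smtpd -n -c DebuggingServer localhost:1025
EMAIL_HOST = "localhost"
EMAIL_PORT = 1025
```

With these in place, anything the application mails out should show up in the terminal running the debugging server instead of being delivered.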

Character Sets are Important™

Tuesday, March 3rd, 2009

(Note: since this article is about a character that shouldn’t have been able to appear on my screen, I’ve used that character several times to demonstrate.  If you can’t see it, it’s the trademark character, an elevated TM.)

A few days ago I implemented an “email this product to your friend” feature for my new employer Reusable Bags. It all went smoothly until I tested it with products like “ACME Bags™ Workhorse Style 1500”. The ™ in that name caused me endless problems, all related to one of the least known aspects of computing (at least for English speakers): character encoding.

I’ve read Joel Spolsky’s article on character encoding, so I knew just enough to identify that my problem had to do with that, but not enough to know how to fix it. I found out that on our website, where the ™ displays fine, the charset is “ISO-8859-1”, a.k.a. Latin1. This is used without problems all over the place. The curiosity here is that ™ is not in that charset. Somehow Firefox translated a sequence of bits from the web page into a character that shouldn’t even exist. I couldn’t wrap my head around that, so I kind of assumed that it was expressing it some other way I didn’t know about and kept going. In the emails I was sending, the character was displaying as a sequence of 3 unusual characters, meaning it was being interpreted wrong. The charset in the email was Latin1 so that was what I would expect from the browser. Since it was 3 chars, that reinforced my idea that it was being encoded in some other unusual way (with multiple bytes) and I kept looking.

I tried everything I could figure to try and make some headway on this bug. I used every English charset I could find everywhere to see if I was inputting the character in one set and interpreting it with another, but nothing worked. I would recount everything I tried, but there was so much I don’t remember it all. I spent probably half a day just switching charsets and retrying things.

Eventually we gave up on representing the character properly and just wanted to strip it out, so I threw in a “str_replace("™", "", $string)”. This didn’t work either! I could replace anything else in the string, but not that blasted ™! This problem was preposterous. There’s no way PHP isn’t recognizing this character. I wrote a testing script to verify the problem in absence of the rest of the page, and there it was recognized and replaced just fine. So what was the difference between the two scripts?

The difference was the source of the text being searched. In my testing script, I typed both the needle and the haystack. In the real page, the haystack came out of the database. I don’t think the database pays much attention to the character encoding; it just stores whatever sequence of bytes you enter. So the encoding used on that string depends on who entered it. Who did enter it? A Windows user. Therefore, the encoding was undoubtedly Windows-1252, which is one of the only encodings I found that includes the ™ character. If I had been smart about it earlier I would have realized that must be the case, because someone obviously entered the character and Windows-1252 is the only encoding that contains it in a way that’s easy to enter.

So how do I type that character in our code files that aren’t Windows-1252? Well I know that in that encoding, ™ is represented by the number 153 (0x99). That means I can get PHP to give it to me with the call “chr(153)”. I put that into my str_replace call from earlier and it worked perfectly; it detected the ™ and stripped it out no problem. Originally I was going to berate the PHP developers for assuming the Windows-1252 charset in the chr() function, but I subsequently realized that it doesn’t matter what little picture is associated with character #153 in any encoding; the binary is still the same.
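The same idea can be sketched in Python rather than PHP. This is only an illustration under assumptions: the product name is made up, and the “database value” is simulated by encoding a string as Windows-1252. The byte for ™ is computed instead of hard-coded, so the sketch doesn’t depend on remembering its number.

```python
TM = "\u2122"  # the trademark sign

# ISO-8859-1 (Latin1) has no slot for the trademark sign at all.
try:
    TM.encode("iso-8859-1")
    latin1_has_tm = True
except UnicodeEncodeError:
    latin1_has_tm = False

cp1252_bytes = TM.encode("cp1252")  # one byte in Windows-1252
utf8_bytes = TM.encode("utf-8")     # three bytes in UTF-8

# Hypothetical product name as raw bytes, the way a database might hand it back.
raw_name = ("ACME Bags" + TM + " Workhorse").encode("cp1252")

# Strip the character by matching its raw byte, not the glyph typed in source.
stripped = raw_name.replace(cp1252_bytes, b"").decode("cp1252")
print(latin1_has_tm, len(cp1252_bytes), len(utf8_bytes), stripped)
```

Matching on bytes sidesteps the question of what encoding the source file itself is saved in, which is what made the original str_replace call miss.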

So the lesson here is to not assume something quasi-magical is happening when two facts seem to conflict, like when I assumed the ™ was encoded in some multi-byte extension to Latin1. It can’t be, that’s not possible. The only common encoding in the English world that includes it is Windows-1252, so that had to be what I was seeing, despite Firefox reporting otherwise. If I had realized and accepted that earlier I would have saved myself a lot of shotgun debugging. Why Firefox did that is a separate question that I don’t really care enough to answer, but IE does some auto-detecting of character encodings and displays whatever it thinks will work the best. Maybe Firefox did the same thing, ignoring the encoding specified in the document, and forgot to update the page info? That’s all I can figure.

PHP best practices

Wednesday, July 9th, 2008

I’m currently working for a company that uses the LAMP application stack. They have only had one full time programmer since they started, and he’s a cowboy. They don’t use much of a database abstraction layer, they mix their display code with their business logic, they don’t do any testing, and even worse than not using source control at all, they sometimes use source control.

I’m starting a new project that will be fairly large and independent from the rest of the site, so I’d like to introduce some better development practices. I don’t know much about PHP frameworks and stuff, so if you have any suggestions for what I should use please post them in the comments. The only major requirement is that it can be used alongside the existing code. So what do you suggest?

JS equivalence operators: "Good enough for government work"

Monday, June 30th, 2008

I was having some strange behavior with a JavaScript app I wrote. It’s an image thumbnailing interface that allows the user to zoom and drag an image around. When it loads, the image is scaled to be either as tall or as wide as the thumbnail size, and the other dimension is larger. The user can zoom in and out, but they can’t zoom it smaller than it starts so no whitespace can appear. When a user zoomed in and then all the way out, the image would pop out of the frame a little bit and whitespace would appear at the bottom (this was an image that was as tall as the thumbnail size; I imagine the whitespace would be on the right if the image were as wide as the thumbnail size and taller). After tracing through the JavaScript for a while I realized the problem: JavaScript considers ('' == 0) to be true.

I have a function that repositions the image so when you zoom in/out it stays centered on the same point. I wanted to be able to call it to reposition for a move that only had a horizontal vector, so I made it check to make sure there was a value for each of the x and y coordinates before it tried moving the image on that vector. I passed in an empty string when I didn’t want to make a move on that vector. The problem came into play when I zoomed out to the max and the image’s position on the short dimension became 0. I wanted to move the image to 0 on that vector, but my test for no value was catching the 0 and calling it “nothing”, just like ''.

Once I tracked this down, the solution was simple. Just use the “really equal, I mean it for reals” operator; a.k.a. “===”.

if (left != '' || left === 0) { /* do stuff */ }

A more appropriate way to do this might be to have a real value like “nochange” mark when I don’t want to do anything with that vector, but I did this because I didn’t want to find all the places where I used '' and change them.
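For what it’s worth, here is a rough sketch of that “nochange” idea, written in Python rather than JavaScript and not taken from the actual app. The names NO_CHANGE and reposition are made up for illustration. The point is that a dedicated marker can never be confused with a legitimate position of 0.

```python
# Hypothetical sentinel meaning "leave this axis alone".
NO_CHANGE = object()

def reposition(position, left=NO_CHANGE, top=NO_CHANGE):
    """Return a new (left, top) pair, updating only the axes that were passed."""
    cur_left, cur_top = position
    # Identity checks mean 0 is treated as a real coordinate, never as "nothing".
    new_left = cur_left if left is NO_CHANGE else left
    new_top = cur_top if top is NO_CHANGE else top
    return (new_left, new_top)

print(reposition((5, 7), left=0))  # horizontal move only; top is untouched
```

Comparing against one explicit marker avoids relying on how the language coerces empty strings and zeros.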

"bug" with onclick handlers in IE

Wednesday, June 25th, 2008

I had an issue today with Internet Explorer. An object with an onClick handler worked fine in Firefox and Safari, but in IE the handler only fired every other click. In the course of debugging I discovered that if I clicked slowly, it worked on every click. I realized that this was because IE must be registering an onDblClick event instead of two onClick events. A little testing confirmed this. I searched to see if someone else had the same problem, and found this page. User jamescover had the same issue and found a solution: use the onMouseUp event to handle clicks instead of onClick. He also directed the focus in the onMouseDown event, but I found that part to be unnecessary in my application. A demo of his solution can be found here. I’ll reproduce the code in this post in case that page ever gets taken down:

<script type="text/javascript">
<!--

var x = 0;
function addX(){
  document['oFrm']['num'].value = x;
  x++;
}

var y = 0;
function addY(){
  document['oFrm2']['num2'].value = y;
  y++;
}

//-->
</script>
This one invokes the function <b>onclick</b>
<form name="oFrm">
<input type="text" name="num" size="5" />
<input type="button" value="add" onclick="addX();" />
</form>
This one focuses the text field <b>onmousedown</b>, then invokes the function <b>onmouseup</b>
<form name="oFrm2">
<input type="text" name="num2" size="5" />
<input type="button" value="add" onmousedown="this.focus();" onmouseup="addY();" />
</form>

A Better CAPTCHA

Wednesday, May 14th, 2008

I don’t yet have any need to implement CAPTCHA myself, but if I did, it wouldn’t be your standard distorted and scribbled on text. It would be one of these:

Microsoft Asirra

With Asirra, to identify yourself as a human you have to identify a series of pictures as cats (excluding the dogs). It seems like a sound approach, but on its face it looks so nonsensical that I feel compelled to use it. Picture this internet argument: “Well you plainly have no idea what you’re talking about on this issue, so I won’t keep wasting my valuable time trying to fight with your stupidity! As soon as I click on these cats, you’ll never hear from me again!”

reCAPTCHA

This one is more serious, and has a purpose too. Instead of displaying random obscured characters, it displays a real scanned image of two words from an old book. One of these words has been identified and the other has not. The user types both words, the computer verifies the user’s humanity with the known word, and records what the human said the unknown word was. Through this process the un-digitized book becomes completely digitized. They’re turning CAPTCHA tests, a “lesser evil” annoyance, into something that’s actually good.

Rot13 Utility

Friday, January 11th, 2008

Rot13 is a common method for obfuscating text, often used to randomize passwords or to hide “spoilers” from online discussions. The tool I most commonly use to translate rot13’d text is http://www.rot13.com/, and that works well for translating long sections of ciphertext back into plaintext. However, often there is just one or a few words to be translated from plaintext to ciphertext, and I find the site to be too much overhead for the task.

That’s why I made a simple PHP script on my website to do my rot13 translations from now on. The key difference between mine and rot13.com is that the form on mine uses the GET method rather than POST. This allows me to make a Firefox bookmark to translate text directly from the URL bar. To do this, bookmark this URL: http://timsaylor.com/tools/rot13.php?plaintext=%s. Then in the bookmark’s properties add a value to the keyword field. My keyword is “rot”, so now whenever I type “rot [text]” into my URL bar, it sends that to my script and opens a page with the ciphertext.

It’s just a simple utility, and writing this blog post about it took longer than actually making the script itself. I just had to rot13 something today, though, and I remembered wishing that I could do it more simply. A quick search turned up this rot13 PHP function, which meant all the hard work was done. I just wrapped that up in an HTML form and put it online. The source is here.
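For anyone who wants the same thing without my script, here is a small sketch in Python (not the PHP source linked above). It uses the standard library’s rot_13 codec, and the rot13_url helper is a made-up name that just builds the kind of GET URL the bookmark keyword produces, assuming the script reads a “plaintext” parameter as in the bookmark URL.

```python
import codecs
from urllib.parse import urlencode

def rot13(text):
    """Rotate each ASCII letter by 13 places; everything else passes through."""
    return codecs.encode(text, "rot_13")

def rot13_url(plaintext, base="http://timsaylor.com/tools/rot13.php"):
    """Build a GET URL like the one the bookmark keyword expands to."""
    return base + "?" + urlencode({"plaintext": plaintext})

print(rot13("Hello"))
print(rot13_url("hello world"))
```

Because rot13 is its own inverse, the same function translates in both directions.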

Google Gadgets vs Facebook Applications

Friday, August 17th, 2007

Just the other day at work I was tasked with exploring various widget APIs to see if there’s potential in integrating our telephony product with these various web 2.0 sites. I’ve spent the last few days looking over the basics of the iGoogle and Facebook APIs, and there was a stark difference.

I started with iGoogle. I googled for “iGoogle api” and the first hit was this link to the Google Gadgets api documentation. Right there on the front page it says “Write a gadget”. I click Developer Guide, then “Hello, World”, and there’s some gadget code right in front of me. One search and two clicks is all it took to get a basic idea of how Google Gadgets work. Further reading of that page, and the next step in the documentation, “Writing Your Own Gadgets”, gave me all the information I needed to know. Total time: about an hour.

Next was Facebook. Googling leads me to the Facebook developers portal. “Getting started” looks like a good place to start. All it tells me is to install the Facebook developer application. I did that, and tried to start a new application, but it asked me for things that I don’t know anything about. So, back to the documentation. Hmm, nothing else on this page. I’ll click “Documentation” up top. I read around here for a while, and there was a lot of specific information but not general type stuff like I’m looking for. Down at the bottom I eventually see a link called Anatomy of a Facebook Application. This has got to be what I’m looking for. Nope! It describes all the ways a user interacts with your application, but no anatomy at all. I’m getting frustrated at this point; I’ve read all kinds of details about the Facebook api and what the application looks like to the user, but still nothing about how I go about writing one. (Notice, I don’t think they’ve even told me it’s written in php yet!) Finally, I see Guide to Creating an Application. After hours of looking, I finally find what I want. The guide is pretty good, but the example application doesn’t work. I can’t look at the finished product while I read what’s going on under the hood. Total time: A day or so, and still counting.

It’s pretty clear to me that iGoogle has people behind it who know what they’re doing and Facebook needs to get their act together. Their API has been out for several months; they need to get some good documentation written and better organized so developers can more easily figure out how Facebook applications work. There’s no excuse for it really: the applications that confused developers aren’t writing could be adding considerable value to Facebook. For now though, I’m ignoring the silly walled garden and sticking to iGoogle.

Proxying web.py through apache

Sunday, April 29th, 2007

I’ve been wanting to try web.py for a while. It seems like it’s easier to learn than Django for making Python web applications. I made the hello world app quickly and simply, and decided I’d use it for a project I’m working on.

That’s when it went south. I have to use PHP for another project I’m doing, and I use cgi-irc (in Perl through CGI) on that server, so I didn’t want to set all of that up again on lighttpd (the recommended method for running web.py). I tried running it through Apache with CGI. After a few hours of that not working I switched to trying FastCGI on Apache, also to no avail. I got FastCGI to the point where I was having an error that was common enough to be in an FAQ: I was getting 500 responses from Apache because FastCGI/web.py wasn’t starting fast enough for the flup WSGI library to realize it, so it would start them over and over until it eventually gave up. I decided I just wanted to see the thing work at this point, so I installed lighttpd and went through the setup for that. When that method didn’t work either, I was fed up with their install instructions.

I decided to try a method that John Quigley recommended. He said I should just run it with the internal HTTP server that I’ve already used successfully and proxy the requests through Apache to web.py. This might not achieve the goal of making the server large-scale production ready, but I don’t really have to worry about that so much at this point. It does achieve my more important goal of making my PHP, CGI, and web.py applications all available through a single URL and port, so I gave it a shot.

I looked up Apache proxying and found it was done through mod_proxy, and each protocol you want to proxy has its own module as well. So I installed mod_proxy and mod_proxy_http and added the following line to my apache2.conf file:


ProxyPass /lifelog http://192.168.1.20:8888

192.168.1.20 being the IP address of the server in question. It would be better to get it resolving localhost so this line can be used on other servers, but I didn’t want to bother with it yet. I also had to change the proxy permission rules in proxy.conf to this:


ProxyRequests Off

<Proxy *>
AddDefaultCharset off
Order allow,deny
Allow from all
</Proxy>

This allows any user online to use my ProxyPass rule. Note that I left “ProxyRequests” set to “Off”. If I turned that on, I would be an open proxy that spammers and hackers could use to hide their identity during their nefarious behavior.

Now all that’s left is to start up my web.py server on port 8888 as specified in apache2.conf, and whenever I go to /lifelog/* through Apache, it’ll send the request to web.py.
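To make the mapping concrete, here is a simplified Python model of what that ProxyPass line does to a request path. It is only a sketch: backend_url is a made-up helper, and it only models well-formed paths under the prefix, not every detail of Apache’s matching. The thing to notice is that the /lifelog prefix is replaced by the backend address, so web.py sees the path without it.

```python
PROXY_PREFIX = "/lifelog"
BACKEND = "http://192.168.1.20:8888"

def backend_url(request_path):
    """Return the backend URL for a path under the prefix, or None otherwise."""
    if request_path == PROXY_PREFIX or request_path.startswith(PROXY_PREFIX + "/"):
        # The matched prefix is swapped for the backend address.
        return BACKEND + request_path[len(PROXY_PREFIX):]
    return None  # not covered by the ProxyPass rule

print(backend_url("/lifelog/entries"))
```

So the URL patterns in the web.py app should be written relative to the root, not to /lifelog.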