Automating ING account balance retrieval
I like to keep track of my money and I have an ING bank account. The problem is that I have a lot of online accounts that concern my money: one for online banking, one for my regular bank, one for investments, one for an IRA, one for my credit card, and one for loans. If I want to see what’s going on in my financial world on any given day, I have some login and authentication work ahead of me.
The dream of automating logins and retrieving balances is not new. Mechanize is a Perl library (also available for other languages) that, as I understand it, lets you authenticate to a site and scrape some info. But I had a hell of a time with it.
The websites I am concerned with have undergone several changes, even in the past few months, in the way that users log in and authenticate. Handling cookies and JavaScript problems became too complex for me. After a brief trial with firewatir, I turned to Selenium.
Selenium remote control is a program that can drive a Firefox web browser as if a real person were using it. All cookies and JavaScript intricacies are handled just as your regular web browser would handle them. For now I have settled on using this tool for automating logins.
INGDirect’s login process is one of the more complicated ones that I use. A user enters their account number and clicks the login button, then pushes buttons on a uniquely generated pad of numbers coupled with letters to convey their PIN, and may then have to answer a security challenge question. I will tackle this login with Selenium Remote Control for Ruby, Firefox, and a tool called Selenium IDE to help generate code.
The first thing we need to do is to start the selenium server that will talk to Firefox:
@server_thread = Thread.new do
system "java -jar ./selenium-remote-control-0.9.2/selenium-server-0.9.2/selenium-server.jar -multiWindow"
end
sleep 0.1 until @server_thread.status == "sleep"
Obviously, in this case my selenium-remote-control folder is in the same directory as my script. Here we’re starting the server in a new thread so it can chug away while the rest of the program runs. The loop polls with a short interval because a bare sleep would wait indefinitely if the server thread has not blocked yet. This code starts the server in multiWindow mode because ING will not let you view their site from within the frame layout that Selenium usually uses.
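The thread's status only says that the thread is blocked, not that the server is ready to take commands. As a rough sketch of a sturdier check, the script could poll the server's port (4444, the one used when creating the Selenium object) until something accepts a connection. The helper name wait_for_port and its timeout and interval parameters are my own invention for illustration, not part of Selenium.

```ruby
require "socket"

# Hypothetical helper (not part of Selenium): returns true once something
# accepts TCP connections on host:port, or false if `timeout` seconds
# pass without a successful connection.
def wait_for_port(host, port, timeout: 30, interval: 0.25)
  deadline = Time.now + timeout
  loop do
    begin
      # A successful connect means a listener is up; close it right away.
      TCPSocket.new(host, port).close
      return true
    rescue SystemCallError
      # Connection refused (or a similar OS-level error): nothing is
      # listening yet, so retry until the deadline passes.
      return false if Time.now >= deadline
      sleep interval
    end
  end
end
```

In the script this could replace the status check, for example: wait_for_port("localhost", 4444) or abort "Selenium server did not start".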
The next step is to start up a Firefox window.
@selenium = Selenium::SeleneseInterpreter.new("localhost", 4444, "*chrome", "https://secure.ingdirect.com/", 1000)
Here we tell Selenium to use the chrome protocol (the "*chrome" argument) to drive Firefox. This allows some browser security restrictions (imposed to help prevent JavaScript cross-site scripting attacks) to be bypassed.
Now we begin the dance. Here is a great opportunity to use Selenium IDE. The learning curve is pretty easy. Basically it will allow you to record the actions you make on a website. The extension can then spit out some code for you. I like Ruby so I’m sticking with that.
@selenium.open "/myaccount/InitialINGDirect.html?command=displayLogin&device=web&locale=en_US&userType=Client"
@selenium.type "ACNID", "1234567"
@selenium.click "//input[@name='YES, I WANT TO CONTINUE.']"
@selenium.wait_for_page_to_load "30000"
@selenium.mouse_up "//img[@alt='one']"
@selenium.mouse_up "//img[@alt='two']"
@selenium.mouse_up "//img[@alt='three']"
@selenium.mouse_up "//img[@alt='four']"
@selenium.click "SUBMITID"
@selenium.wait_for_page_to_load "30000"
Unfortunately, as of this post Selenium IDE is not able to record the button pushes (@selenium.mouse_up "//img[@alt='four']"). These buttons used to be activated by a click, but recently changed to be activated by the mouse_up command. The preceding code is always going to look ugly and always be subject to change. If anyone has a way to clean up this kind of monotonous code, I would like to hear it in a comment.
Most of the challenge in producing this code is trying to find the best way to identify the pieces of the web page that you want to click or enter information into. Google “xpath” and learn to use the “DOM Inspector” for Firefox if you’re having trouble.
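One possible way to tidy up the repeated mouse_up lines is to build the locators from the PIN itself. This is only a sketch: it assumes every keypad image's alt text is the English word for its digit (the recorded script only shows 'one' through 'four'), and the names DIGIT_WORDS and pin_locators are made up for this example.

```ruby
# Assumption: each keypad image's alt text is the English word for its
# digit. Only 'one' through 'four' appear in the recorded script.
DIGIT_WORDS = %w[zero one two three four five six seven eight nine].freeze

# Hypothetical helper: turn a PIN string into the XPath locators to press,
# in order.
def pin_locators(pin)
  pin.each_char.map do |digit|
    raise ArgumentError, "PIN must contain only digits" unless digit =~ /\A\d\z/
    "//img[@alt='#{DIGIT_WORDS[digit.to_i]}']"
  end
end

# Usage with the @selenium object from the script:
#   pin_locators("1234").each { |locator| @selenium.mouse_up locator }
```

If the site changes how the buttons are labeled, only the mapping needs to change rather than every line.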
This next little bit is necessary because even after the page loads, not all of the html is available for Selenium to see. There must be a better way around this, but this works.
sleep(1)
Finally, we are going to grab the balance off of the page that we’ve loaded. In this case it’s the first number with a format that looks like this: $12,345.67.
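bal = @selenium.get_html_source[/\$\d+([,\.]\d+)+/]
That crazy bit of gibberish at the end ([/\$\d+([,\.]\d+)+/]) is a regular expression, which I’m still new to. It matches a dollar sign, some digits, and then one or more groups made of a comma or period followed by more digits. These are fun, but not a topic for this post.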
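As a possible alternative to the fixed one-second sleep, the script could keep re-reading the page source until the balance pattern actually shows up, and then turn the matched text into a number. This is a sketch under my own assumptions: the helper names extract_balance and wait_for_balance, and the timeout and interval values, are hypothetical and not something Selenium provides.

```ruby
require "bigdecimal"

# Same pattern as in the script: "$", digits, then one or more groups of
# a comma or period followed by more digits.
BALANCE_PATTERN = /\$\d+([,\.]\d+)+/

# Hypothetical helper: returns the first dollar amount found in `html`
# as a BigDecimal, or nil if the pattern does not match.
def extract_balance(html)
  match = html[BALANCE_PATTERN]
  match && BigDecimal(match.delete("$,"))
end

# Hypothetical helper: calls the block repeatedly (the block should return
# the current page source) until a balance is found or `timeout` seconds
# pass. Returns the balance, or nil on timeout.
def wait_for_balance(timeout: 10, interval: 0.5)
  deadline = Time.now + timeout
  loop do
    balance = extract_balance(yield)
    return balance if balance
    return nil if Time.now >= deadline
    sleep interval
  end
end
```

In the script this might look like bal = wait_for_balance { @selenium.get_html_source }. Having the balance as a number rather than a string also makes it easier to plot later.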
We finally have the balance. We should clean some things up…
@selenium.close
@server_thread.kill
When trying this on your own banking site, it’s bound to take some trial and error, during which time I hope no one sends the FBI to your house. Of course, it should be noted that keeping all of your bank info in an unencrypted file (as in this example script) is very bad. The script could be modified to grab data from an encrypted source, but in my case I just encrypted the whole script. Also, I never addressed the challenge question that can appear on INGDirect and many other sites; that’s a good post for another day.
This one script is of little use on its own, but with a little imagination and work you could have some pretty graphs and automated checking of all of your accounts. I’ve combined many of these smaller scripts with a larger one to plot my financial info over time (with some additional info on debt and equity) so I can see how poor I’ve gotten while wasting time writing scripts.