TvE 2100 » At 2100 feet above Santa Barbara
Bosque del Apache National Wildlife Refuge, Feb 2004, ©2004 Thorsten von Eicken 

Ruby-prof does not play nice with threads

Posted: January 16th, by tve

… or so it seems! I was trying to profile a worker running in BackgrounDRb and I kept getting errors almost as soon as I’d call RubyProf.start looking like this:


./script/../vendor/rails/activerecord/lib/../../activesupport/lib/active_support/inflector.rb:108: warning: ruby-prof: An error occured when leaving the method Inflector#singularize.
   Perhaps an exception occured in the code being profiled?

Each time the error would occur in a slightly different place. I finally disabled all but one worker (thread) and that seems to have fixed the problem.

Update: it seems that it is possible to use ruby-prof with threads, but it looks like one has to start the profiling before forking off the threads.

Verifying SSL certs when using Net::HTTP

Posted: December 25th, by tve

What good is an HTTPS web service if you’re not verifying the certificate presented by the web service? That’s what I was wondering when I started to use the Amazon Elastic Compute Cloud web service and used a sample that used simple HTTP. So I looked into the docs and quickly found that the trick is to set http.verify_mode = OpenSSL::SSL::VERIFY_PEER when creating the https connection object. Unfortunately that turned out not to be so simple, because the only effect I got is an error on every connection attempt telling me that the peer’s certificate cannot be validated. Very useful!

Black-necked Stilts, Devereux Slough, Santa Barbara, CA ©2005 Thorsten von Eicken

Of course my next step was a Google search, but after a long time all I found was that everyone turns the verification off! I then proceeded to look at the source code to figure out what was going on, and gave up after a couple of hours. Finally, the Pragmatic Studio alumni mailing list came to the rescue: Devin Mullins gave me the critical tip that made it work: one has to give the certificate file a special name, duuuh. Here is, in detail, what I did to get it to work:

I have a file ‘cacert.pem’ that has all the root certs I care about. I ran the following command:

# openssl x509 -hash < cacert.pem
f73e89fd
-----BEGIN CERTIFICATE-----
MIICNDCCAaECEAKtZn5ORf5eV288mBle3cAwDQYJKoZIhvcNAQECBQAwXzELMAkG
A1UEBhMCVVMxIDAeBgNVBAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMS4wLAYD
VQQLEyVTZWN1cmUgU2VydmVyIENlcnRpZmljYXRpb24gQXV0aG9yaXR5MB4XDTk0
MTEwOTAwMDAwMFoXDTEwMDEwNzIzNTk1OVowXzELMAkGA1UEBhMCVVMxIDAeBgNV
BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMS4wLAYDVQQLEyVTZWN1cmUgU2Vy
dmVyIENlcnRpZmljYXRpb24gQXV0aG9yaXR5MIGbMA0GCSqGSIb3DQEBAQUAA4GJ
ADCBhQJ+AJLOesGugz5aqomDV6wlAXYMra6OLDfO6zV4ZFQD5YRAUcm/jwjiioII
0haGN1XpsSECrXZogZoFokvJSyVmIlZsiAeP94FZbYQHZXATcXY+m3dM41CJVphI
uR2nKRoTLkoRWZweFdVJVCxzOmmCsZc5nG1wZ0jl3S3WyB57AgMBAAEwDQYJKoZI
hvcNAQECBQADfgBl3X7hsuyw4jrg7HFGmhkRuNPHoLQDQCYCPgmc4RKz0Vr2N6W3
YQO2WxZpO8ZECAyIUwxrl0nHPjXcbLm7qt9cuzovk2C2qUtN8iD3zV9/ZHuO3ABc
1/p3yjkWWW8O6tO1g39NTUJWdrTJXwT4OPjr0l91X817/OWOgHz8UA==
-----END CERTIFICATE-----

The trick is that the file basename must be the hash printed out by the above command plus an appended .0! So my next move was:

# mv cacert.pem f73e89fd.0

I then adjusted my code as follows:

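  require 'net/https'   # adds SSL support to Net::HTTP (and loads OpenSSL)

  # link is a URI object for the service endpoint
  http = Net::HTTP.new(link.host, link.port)
  if link.scheme == 'https'
    http.use_ssl = true
    # fail the connection unless the server's cert chains to a trusted root
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER
    http.ca_file = "#{RAILS_ROOT}/lib/ec2/f73e89fd.0"
  end
  http.start
  response = http.get(link.request_uri)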

This now works like a charm! Phew!

Update:

Upon further inspection, it looks like the above only works for the first certificate in my original cacert.pem file and I was lucky that that’s the one I need for EC2. What I need to do to use all the certs in the file is break it up, one cert per file, use openssl to figure out the filenames, and then use http.ca_path instead of http.ca_file to point to the directory with all the cert files.
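The break-up step described in the update could be scripted in Ruby rather than done by hand. What follows is only a minimal sketch under stated assumptions: the helper names split_pem_bundle and write_hashed_certs are made up for illustration, and it assumes that OpenSSL::X509::Name#hash returns the same value that openssl x509 -hash prints for the OpenSSL version Ruby is linked against (the hash algorithm has changed between OpenSSL releases, so check one file by hand).

```ruby
require 'openssl'
require 'fileutils'

# Matches one PEM-encoded certificate, including its BEGIN/END lines.
PEM_BLOCK = /-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----/m

# Split the text of a bundle such as cacert.pem into individual PEM strings.
def split_pem_bundle(bundle_text)
  bundle_text.scan(PEM_BLOCK)
end

# Write each certificate to dir as <subject-hash>.<n> (hypothetical helper).
# n starts at 0 and is bumped when two certificates share the same hash.
def write_hashed_certs(bundle_path, dir)
  FileUtils.mkdir_p(dir)
  split_pem_bundle(File.read(bundle_path)).map do |pem|
    cert = OpenSSL::X509::Certificate.new(pem)
    hash = format('%08x', cert.subject.hash)
    n = 0
    n += 1 while File.exist?(File.join(dir, "#{hash}.#{n}"))
    path = File.join(dir, "#{hash}.#{n}")
    File.write(path, pem + "\n")
    path
  end
end
```

With the files in place, the connection code would set http.ca_path to that directory instead of setting http.ca_file.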

Course on Scalable Internet Services (in Ruby on Rails)

Posted: December 20th, by tve

Phew, I just finished teaching a graduate course on building scalable internet services at UCSB (Univ. of California At Santa Barbara). This was a very hands-on, learn-by-doing course with a significant project in Ruby on Rails! The goal was for the students to learn about all the technologies that go into a scalable internet service, specifically into dynamic web sites. The lectures provided background information to support the project and they explained technologies beyond the scope of the project.

Anna’s Hummingbirds, Joshua Tree, CA, Mar 2005, ©2005 Thorsten von Eicken

The project consisted of building a transactional dynamic web site in Ruby on Rails and running on Amazon’s Elastic Compute Cloud (EC2). Each site had to hold >100’000 database records that could be searched and explored, have user accounts, and include some form of transaction, such as a shopping cart check-out.

Each project then had to be deployed on multiple servers on EC2 and the groups had to use httperf to demonstrate that they could scale the performance of their site by running a front-end load balancer server, a database server, a memcached server, and up to 10 application servers. All this had to fit into a 10-week quarter, with none of the students knowing either Ruby or Rails at the outset!

Note that the emphasis of the course was on the scalability aspect of the sites and not on the web design or feature-set. Thus it was more important to understand the performance characteristics and optimize the core of the site than to have the most eye-candy. (Although eye-candy is always appreciated, of course…)

Please check out the course wiki for more information, including all course materials!

Accessing inner Java classes via Rjb

Posted: December 3rd, by tve

The entry below is copied straight from another blog. I ran into the problem described therein a few minutes ago and found this solution via Google’s cache. The origin server doesn’t respond, so who knows, this may be lost as soon as it pops out of the cache, so I am duplicating it here. I also can’t tell who wrote it, just that it came from http://doodle.barelylegible.com/blog/?cat=5. So here we go:

== RubyJavaBridge is nice == Sunday, February 12th, 2006

I’ve been playing with the RubyJavaBridge that was talked up at the last Ruby get-together and so far it is indeed very swank.

I couldn’t resist investigating the issue Les had with accessing static inners and I think I found the syntax. Accessing inner classes (static or not) can look a little wonky but it is doable. Statics are loaded like any other class, but their pathname is ‘OuterClass$StaticInnerClass’. The nonstatic inner classes are a tiny bit trickier. Import like the static, with ‘OuterClass$Inner’; now you have the inner class, but the trick is in instantiating an instance: you must provide an OuterClass instance as the first argument to the constructor (thus revealing a little behind the curtain of java the implicit access an inner has to its outer’s methods and data):

Outer = Rjb::import('Outer')
Inner = Rjb::import('Outer$Inner')
StaticInner = Rjb::import('Outer$StaticInner')

outer = Outer.new
inner = Inner.new(outer)
staticInner = StaticInner.new

I have full sources and the example output below.

I have so many ways I want to use this bridge. This is the key I need to access the AS400 from Ruby. Tasty tasty.

[Sorry, the “full sources and examples” are not in the Google cache…]

Optimizing a Rails application, part 1

Posted: December 2nd, by tve

I’m working on a really cool Rails application called AWS-Console which manages servers on Amazon’s Elastic Compute Cloud. With the click of a button, we can instantiate new servers which fire up within a couple of minutes. We can even take a svn repository of a Rails app and launch it fully automatically; it takes about 10 minutes to come up. Our test uses mephisto: want 10 instances of mephisto running? Click here…

So I was measuring the performance of the AWS-Console site and it turned out to be abysmal! Like 1 request per second max, running apache+mongrel_cluster on a 2GHz/2GB box. That was rather disappointing. So I rolled up my sleeves to try and figure out what is going on, and this is the tale of my first Rails performance adventure!

Sandhill Cranes, Bosque del Apache, NM, Feb 2004, ©2004 Thorsten von Eicken

The goal

Up to now we’ve been exclusively focused on functionality and have not cared a bit about performance. But at some point one does have to make sure that performance is in the ballpark. I’ve read Stefan Kaes’ excellent blog with his many suggestions on how to improve performance. So I have about 20 things in mind that I could “fix” and hope that they improve performance. But I don’t like working blindly, so I decided to follow the conventional procedure:

  1. benchmark the application to get baseline measurement(s)
  2. profile the application to figure out where the time is going
  3. optimize the bottlenecks using Stefan’s tips or things I figure out on my own
  4. re-benchmark the application to see whether I actually improved things
  5. go back to step 1 until satisfied
  6. learn from what I did so I don’t have to do it all over for the next app/feature

I’m planning to write up the whole story here, this is just the first part. I hope this will be useful to others and I hope it will jog my memory the next time I’m going through this :-).

Part 1: benchmarking

For the benchmarking I am using httperf which is an excellent program to apply realistic load to a web site. The reason I like httperf is that it decouples request generation from server responses, which means that it can continue opening new concurrent connections to the server no matter how fast the server responds. This is as it happens in real life: users come to the site no matter how slow the server is and only as they browse around they get slowed down by the server’s response time. The bottom line is that this allows httperf to overload the server and really drive it against the wall. Httperf can also take little scripts of URLs which describe a flow through the site and it will open a connection and then “walk through” the URLs one at a time, and if the site has images, it can request the images for a page in a burst. So all-in-all it’s a good and realistic benchmarking tool.

The downside of httperf is that the URL scripts are very primitive, and in particular if the site requires authentication, then it’s tedious to have httperf log in as a different user every time it opens a connection.

Creating a workload

To get started I downloaded httperf from HP’s site and installed it. I don’t remember the details, but it seemed pretty straightforward.

The key to using httperf as described above where it walks through a sequence of URLs is the --wsesslog workload generator. If you want to play with it, you will need to check out the man page for all the details, but here is what I did. The first thing was to create a file with the workload. My first test of the AWS-Console looked as follows:

/
/sessions/new
  /javascripts/prototype.js?1160194386
  /javascripts/effects.js?1160194386
  /javascripts/dragdrop.js?1160194386
  /javascripts/controls.js?1160194386
  /javascripts/application.js?1154539549
  /javascripts/niftycube.js?1160194386
  /stylesheets/yui/reset.css?1157476643
  /stylesheets/yui/fonts.css?1157693397
  /stylesheets/yui/grids.css?1160194386
  /stylesheets/syslitics.css?1162714447
  /stylesheets/niftyCorners.css?1160194386
  /images/loading.gif?1154732591
/favicon.ico
/sessions method=POST contents='email=test@test.com&password=test'
/
/ec2_instances
/ec2_images
/ec2_instances
/ec2_images
/ec2_instances
/ec2_images
/ec2_instances
/ec2_images

This file fetches “/”, which redirects to “/sessions/new” and which is followed by fetching all the javascripts and stylesheets referenced by the response. Then comes the favicon and that is followed by a POST of the login information. Then it fetches “/” because the login redirects there, and then a bunch of fetches of the instances and images pages.

The way I came up with this list is to start a fresh browser and navigate to the site, and then look at the server log in /var/log/httpd/access_log (or similar) and grab all the URIs listed there. I had to make up the contents of the post myself, but it’s simply a URL-encoded string of the various form fields.
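Grabbing the URIs from the access log can be automated. Here is a rough sketch, assuming the log uses the usual Apache common/combined format; the helper name workload_lines is made up, and the output is a flat list, so the indentation that marks a burst of scripts and stylesheets, as well as the POST contents, still has to be added by hand.

```ruby
# Matches the request part of an Apache common/combined log line,
# e.g. "GET /sessions/new HTTP/1.1".
REQUEST = %r{"([A-Z]+) (\S+) HTTP/[\d.]+"}

# Turn access-log lines into workload-file lines (hypothetical helper).
# Non-GET requests get a method= annotation; lines that do not look like
# requests are dropped.
def workload_lines(log_lines)
  log_lines.map do |line|
    match = REQUEST.match(line)
    next unless match
    method, path = match[1], match[2]
    method == 'GET' ? path : "#{path} method=#{method}"
  end.compact
end
```

Something like puts workload_lines(File.readlines('/var/log/httpd/access_log')) would then print a first draft of the workload file.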

Note that httperf does handle cookies in a simple but effective manner. At the first request, the site will return a session cookie to httperf which the latter dutifully presents with all subsequent requests. So a site where the user has to login before proceeding actually does work correctly. Nice!

Then I tested this using httperf with the following command line:

httperf --hog --server test.aws-console.com --wsesslog=1,0,aws-test1 \
--session-cookie --ssl --print-reply

The --hog option has to do with sockets and is important but not interesting, --session-cookie enables cookie handling as described above, --server is the address of the server, --print-reply prints out all the responses from the server so I can check that the proper pages get returned and not some errors, --ssl uses https (aws-console is an SSL site), and --wsesslog references the workload file. The --wsesslog arguments signify that httperf should run through the workload file once and delay 0 seconds between URLs (the delay simulates user think time, which I’m not interested in), and aws-test1 is the filename of my workload. Printing out the replies means lots of stuff to scroll through, but it allowed me to check that the login and the other page fetches worked properly.

Applying some load

Then a quick test running 30 sessions and starting a new session every two seconds:

httperf --hog --server www.aws-console.com --wsesslog=30,0,aws-test1 --session-cookie --rate 0.5

To produce a graph, I would vary --rate from about 0.1 to 5 (ramping from one new session every 10 seconds up to 5 per second). Good numbers for your app will depend on the performance that you see.
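A sweep like that is easy to script. The sketch below only builds the command lines, using the same options as the command above; the individual rate steps are an assumption (the text only gives the range), and the helper name httperf_command is made up.

```ruby
# Session arrival rates (sessions per second) to sweep. The exact steps are
# an assumption; only the range 0.1 to 5 comes from the text.
RATES = [0.1, 0.2, 0.5, 1, 2, 5].freeze

# Build one httperf command line for a given rate (hypothetical helper).
def httperf_command(rate, server: 'www.aws-console.com', sessions: 30, workload: 'aws-test1')
  "httperf --hog --server #{server} --wsesslog=#{sessions},0,#{workload} " \
    "--session-cookie --rate #{rate}"
end

RATES.each { |rate| puts httperf_command(rate) }
```

Each printed line can be run by hand or from a shell loop, collecting the reply rate from every run for the graph.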

Well, the result I got was a whopping 1.2 replies per second on average!!! Given that almost half the requests are for static files (the style sheets and JavaScript files) which are served blindingly fast by apache, I can only say “abysmal!”

With this poor performance there really isn’t much point in generating a graph that shows reply rate as load is increased or load vs. response time.

Something I did is to add more sessions to my description file so that I can exercise different use cases (flows through the site). I thought of automatically generating variants of the workload file where I change the login id and perhaps some other post parameters, but I haven’t implemented that yet.
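One way such a generator might look is sketched below. It is not an implementation that was ever run against the site: the helper names and the user list are made up, the paths are taken from the workload file above, and it assumes that blank lines separate sessions in a --wsesslog file, as the httperf man page describes.

```ruby
require 'uri'

# One session: log in as the given user, then browse a few pages.
def session_for(email, password)
  # URL-encode the form fields, e.g. "@" becomes "%40".
  form = URI.encode_www_form('email' => email, 'password' => password)
  [
    '/sessions/new',
    "/sessions method=POST contents='#{form}'",
    '/',
    '/ec2_instances',
    '/ec2_images'
  ].join("\n")
end

# Join the sessions with a blank line between them.
def workload(users)
  users.map { |email, password| session_for(email, password) }.join("\n\n") + "\n"
end
```

Calling File.write('aws-test2', workload(users)) with a list of [email, password] pairs would produce a file with one login per session.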

Note that httperf has a “-v” option which is helpful in seeing progress. One can specify a very large number of sessions in --wsesslog and then watch httperf print out its requests/sec measurements every 5 seconds; after one to two dozen measurements, hit ctrl-C to get the overall stats.

Figuring out where the time is going

So, why doesn’t it do more requests per second than it does? That’s an excellent question for part 2, to appear soon!

Installing a new version of the EC2 AMI Tools

Posted: November 13th, by tve

The Amazon Web Services EC2 AMI Tools are the set of tools needed to create a new Xen image for Amazon’s Elastic Compute Cloud.

To install a new version, the following commands are recommended:

rpm -q ec2-ami-tools
wget http://s3.amazonaws.com/ec2-downloads/ec2-ami-tools.noarch.rpm
rpm -Uvh ec2-ami-tools.noarch.rpm
rpm -q ec2-ami-tools

That’s it!

Why running mysql on Amazon EC2 and S3 is not that simple

Posted: September 22nd, by tve

Jeff posted a blog entry that implies that it’s not that difficult to run mysql on Amazon EC2. Ha! Possible: yes. Easy: noooo! The devil is in the details, many of them!

For example, MyISAM tables do not replicate correctly because the replication (binary) log does not obey transactions. You can have a transaction that rolls back on the master but it’s in the log and happily executed on the slave. Ouch!

Even if you use InnoDB tables, you are not safe. For example, you can write non-deterministic SQL statements that may produce different results on the slave than on the master. An example is creating a table with an auto-increment key using a select from another table. The keys assigned depend on the order in which the select produces results. This may be different on the slave than the master and you end up with different keys, which is not going to match subsequent operations! (A friend ran into this one, one slave matched the master while two others became inconsistent quickly, he spent hours and hours figuring out what went wrong!)
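The effect is easy to see in a toy simulation. The Ruby below is not MySQL and involves no replication; it only mimics how auto-increment keys get handed out in whatever order the select happens to return rows, with made-up row names.

```ruby
# Simulate filling a table that has an auto-increment key from a SELECT:
# keys are assigned in the order the rows come back.
def assign_keys(rows_in_select_order)
  rows_in_select_order.each_with_index.map { |row, i| [row, i + 1] }.to_h
end

master = assign_keys(%w[alice bob carol])  # order seen on the master
slave  = assign_keys(%w[bob alice carol])  # same rows, different order

# A later statement that refers to id 1 now touches different rows.
puts master.key(1)  # alice
puts slave.key(1)   # bob
```

Every statement replicated after that point which refers to a key value silently operates on different data on the two machines.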

You also assume multiple EC2 instances in your description, but there is no control over their location. They may well be on the same power circuit that goes out. What about the S3 node? I forget the semantic details: when S3 ACKs the store request, does it guarantee that the data is replicated already or does it only guarantee that it is on persistent storage? In other words, could it be on a machine in the same datacenter on the same UPS that goes down with the EC2 mysql box and doesn’t come back up until the UPS is repaired a few hours later? In that case, yes, the data would be safe on S3, but unavailable for a few hours. That wouldn’t help me in quickly restoring from S3 onto a new mysql instance, would it?

Someone else also mentioned restore time on the EC2 forum. How long would it take to restore a 70GB database? (70GB is probably the max you can put on an EC2 instance if you ever want to be able to make a backup copy without jumping through hoops.) My guess is at least 10-15 minutes, and that doesn’t count the time to apply the logs.
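A back-of-the-envelope check of the raw transfer time, as a sketch only: it assumes the whole 70GB moves at the 250Mb/s that Amazon quotes as an instance's network bandwidth, uses decimal units, and ignores S3 request overhead and disk write speed. The helper name is made up.

```ruby
# Minutes needed to move size_gb gigabytes at mbit_per_s megabits per second
# (decimal units: 1 GB = 8000 megabits).
def transfer_minutes(size_gb, mbit_per_s)
  size_gb * 8000.0 / mbit_per_s / 60
end

puts transfer_minutes(70, 250).round(1)  # 37.3
```

Under those assumptions the copy alone takes well over half an hour, which is consistent with 10-15 minutes being a lower bound.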

Oh, talking about logs, when do you start your logs? You need a clean full backup and then you can start the (incremental) replication logs. After a while, you really should start afresh so you don’t collect endless logs that would take forever to apply. If your app is not 24/7 it’s easy, but if it’s 24×7 you need to do what’s called a hot-backup. Ahhh, no such support in mysql/innodb unless you pay for it.

And performance? Did you know that a mysql slave can easily be slower than the master? The reason is that the slave reads and applies the replication log using a single thread. So while your app server bangs on your master with high concurrency, your slave performs one operation at a time. Usually this is not an issue because the master has the additional load of reads, but the problem does exist.

The bottom line is that in theory everything is available to set-up a nice mysql installation on EC2/S3, but in practice it’s far from easy to actually pull it off in a reliable manner.

Creating a Debian Xen virtual machine

Posted: September 17th, by tve

This is a brief note on how to create a Debian Sarge Xen VM on a Gentoo box. I’m using file-based disk images because it’s so flexible. I don’t believe the partition based ones are actually significantly faster. Inspiration for some of this came from this wiki page.

First I created a 30G LVM volume to hold disk images, created an XFS file system, and mounted it. Then I created a 4GB root image for Debian using XFS as well. I like XFS because it allows me to expand the filesystem (XFS can be grown but not shrunk) and it also allows me to take snapshots.

hydra xen # lvcreate -L 30G vg1 -n images
 Logical volume "images" created
hydra xen # mkdir /etc/xen/images
hydra xen # cat >>/etc/fstab
/dev/vg1/images  /etc/xen/images xfs  noauto,noatime  0 2
<ctrl-D>
hydra xen # mkfs.xfs /dev/vg1/images 
meta-data=/dev/vg1/images        isize=256    agcount=16, agsize=491520 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=7864320, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096  
log      =internal log           bsize=4096   blocks=3840, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
hydra xen # mount /dev/vg1/images
hydra xen # df -h /dev/vg1/images
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg1-images
                       30G   13G   18G  41% /etc/xen/images
hydra ~ # dd if=/dev/zero of=/etc/xen/images/debian.root bs=1024k count=4096
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB) copied, 129.508 seconds, 33.2 MB/s
hydra xen # mkdir /mnt/xen/debian.root
hydra xen # mkfs.xfs /etc/xen/images/debian.root 
meta-data=/etc/xen/images/debian.root isize=256    agcount=8, agsize=131072 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=1048576, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096  
log      =internal log           bsize=4096   blocks=2560, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
hydra xen # mount -o loop /etc/xen/images/debian.root /mnt/xen/debian.root
hydra xen #

Now we’re ready to install Debian Sarge:

hydra xen # emerge debootstrap
Calculating dependencies ...done!
>>> emerge (1 of 2) app-arch/dpkg-1.10.28 to /
...
>>> emerge (2 of 2) dev-util/debootstrap-0.2.45-r1 to /
...
hydra xen # debootstrap --arch i386 sarge /mnt/xen/debian.root http://ftp.us.debian.org/debian
I: Retrieving debootstrap.invalid_dists_sarge_Release
I: Validating debootstrap.invalid_dists_sarge_Release
I: Retrieving debootstrap.invalid_dists_sarge_main_binary-i386_Packages
I: Validating debootstrap.invalid_dists_sarge_main_binary-i386_Packages
I: Checking adduser...
...
I: Base system installed successfully.
umount: /mnt/xen/debian.root/dev/pts: not mounted
umount: /mnt/xen/debian.root/dev/shm: not mounted
umount: /mnt/xen/debian.root/proc/bus/usb: not mounted
hydra xen # 

Now we configure the new system:

hydra xen # chroot /mnt/xen/debian.root/
hydra:/# cd /etc
hydra:/etc# cat >fstab
/dev/sda1     /     xfs      defaults     0     1
/dev/sda2     none  swap     sw                    0     0
proc          /proc proc     defaults              0     0
hydra:/etc# mv /lib/tls /lib/tls.disabled
hydra:/etc# exit

Here’s my Xen config file. Note that I have a DHCP server with static leases based on the MAC address, which is why I specify it explicitly:

hydra xen # cat debian 
kernel = "/boot/vmlinuz-2.6.16-r1-xenU" 
memory = 128
name = "debian" 
vif = [ 'mac=00:16:3E:00:00:24' ]
disk = ['file:/etc/xen/images/debian.root,sda1,w', 'file:/etc/xen/images/debian.swap,sda2,w']
root = "/dev/sda1 ro" 
hydra xen # 

Let’s start the virtual machine!

hydra xen # umount /mnt/xen/debian.root/

Top missing features for Amazon EC2

Posted: August 25th, by tve

We need Static IPs. Dynamic IPs don’t work for a reliable web server. Hopefully a solution will come combined with a simple load balancing option.

SQL database: I think that what’s really missing is a SQL database service. Yes, I can run mysql or similar on EC2, but it’s really not a good fit. The lack of persistence is not a show-stopper but it is a pain right now. But more importantly, the machine specs for CPU speed, memory size, number of spindles, and disk space just don’t cut it. There isn’t enough oomph there to run a real database-backed web service. Amazon needs to step up and offer a D2B service: Distributed DataBase. I’ll pay for storage at 2x the cost of S3 (since DB storage tends to be more expensive) and $.20 per MegaTransactions (equivalent to $0.20 for 1 hour at 277 tps). Well, 2006 isn’t over yet, is it :-)
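The equivalence between a MegaTransaction and an hour at roughly 277 tps is simple arithmetic; a tiny check, with a made-up helper name:

```ruby
# Average transactions per second needed to reach a total in a given time.
def tps_for(transactions, seconds)
  transactions / seconds.to_f
end

puts tps_for(1_000_000, 3600).round(1)  # 277.8
```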

The lack of persistence is not a real showstopper. The only set-up I could imagine that would really improve things is if they had SAN type of storage attached to their machines. This way the machine can be fried and your data is still intact on the SAN and can be mounted on a fresh machine. Use RAID on the SAN to safeguard against failures there or against datacenter-wide outages. At that point, failures should be down sufficiently in probability that a “disaster recovery” type of backup is sufficient. For example incremental backups to S3 every 10-30 minutes. Alternatively all enterprise class SANs have mirroring options. You could pay Amazon a few more bucks to be allocated on a wide-area mirrored SAN partition. But all this gets into big bucks quickly, even on an Amazon scale, so I don’t think we’ll see that anytime soon.

I believe the persistence issue will be solved through log replication. Replicate database logs to another instance or to S3 in a real-time manner and have EC2 give you some control over the placement of instances so that a cluster pair has some degree of failure isolation. For files use a log-based filesystem and also replicate the logs. Ok, all that will take some time for people to sort out, but when they/we do, everyone will benefit, not just EC2 users because all this is not just an EC2 problem. Your machine at joe-hosting.com is just as affected. Just that usually when it crashes your disk doesn’t get wiped, or if the motherboard dies joe moves the HD to the new box for you. Just wait until something more interesting happens, or the datacenter goes out for a few hours and the data is as good as gone because you have to bring up your disaster recovery datacenter after 30 minutes of outage and once traffic hits it you can’t merge any data recovered from the old box back in anymore.


RadRails no-go

Posted: August 25th, by tve

Up to now I’ve been using JEdit as my editor when programming RoR apps. Nice, but not quite there. Also, it would be nice to have an IDE, methinks. So I finally gave RadRails a spin. This is on a windows box, so no TextMate… I’ll switch to a Mac when Vista forces me to upgrade one way or the other. Tangent. Back to what I found out after two days of trying RadRails.

The first thing that drove me nuts is that RadRails doesn’t support drag&drop cut&paste of text. I’m so used to selecting text and then dragging the selection in order to move the text that I just can’t do without. I can’t believe that it’s not supported, it just can’t be true, yet I haven’t found a way to “enable” such a feature. Please let me know if there is a solution!

Then I found out that I don’t seem to be able to split the editor window, e.g. to see two files at the same time. It’s easy to switch between the most recent few files using the tabs at the top, but I often really want to see two or three files tiled vertically on top of one another. Ouch!

I love being able to stop & restart the WEBrick server at the bottom, and the tailing of the log file is also often handy. But I don’t see any instructions on how to set things up so I can run rails in the debugger and set breakpoints. Maybe that’s not possible, dunno. So I haven’t seen any way to turn all these nifty debugging buttons from being teasers to becoming actually useful.

I’m coming to the conclusion that at this point RadRails is a no-go and am switching back to JEdit. Maybe in a few months RadRails will have improved sufficiently to give it another spin…

Upgrading typo 2.6 to typo 4.0.3

Posted: August 25th, by tve

Well typo-4 is out. I had installed the typo-2.6-with-rails archive which had given me everything in one big easy-to-install bundle. But can you believe it, the typo folks did not post any upgrade instructions! Yeah, there’s a great new installer, but that’s only great if you want sqlite and stuff like that.

Anyway, I’m running Fedora Core 5 (FC5), mysql, and lighttpd for typo. So here’s the upgrade story:

First upgrade your stock ruby & rails installs. I used the typo gem to get the dependencies:

gem install typo

And when it asked to install sqlite I said “no”. But by then I got everything else I needed. A hack, but it worked for me.

Download the new typo tgz archive and extract it to a new directory:

wget http://rubyforge.org/frs/download.php/12504/typo-4.0.3.tgz
tar zxf typo-4.0.3.tgz
cd typo-4.0.3

Now copy over your database.yml and edit /etc/lighttpd/lighttpd.conf to point to the new typo dir. For me that ended up being these two lines:

server.document-root        = "/home/typo-4.0.3/public/"

"bin-path" => "/home/typo-4.0.3/public/dispatch.fcgi",

Now back up your database:

mysqldump -u typo_user_name -p --opt -A >dump-2006-08-25

And migrate the DB content:

rake environment RAILS_ENV=production migrate

Now copy any themes over from the old themes dir and finally restart lighttpd:

/etc/init.d/lighttpd restart

Done! (Well, if you’re lucky!)

More thoughts on EC2 pricing

Posted: August 25th, by tve

[This is from a comment I left on Jonathan’s blog.]

I have to admit I am disappointed at the pricing of EC2. At this rate it costs $72/mo for a not so hot machine, without persistent disk storage, and without bandwidth. Once you add bandwidth and storage you easily exceed the costs of a dedicated server at a decent hosting company.

Compare this to S3, where you have to pay at least 2x elsewhere for anything remotely similar. With EC2 there is no such 2x factor.

So the only advantage over another hosting company is that if the thing fails you have a higher chance of being able to get a new box up and running within minutes, and if your usage skyrockets it’s quick to get more machines online. But we really don’t know about the reliability and scalability of EC2 since Amazon doesn’t reveal any details whatsoever. At this point it could well be that they have all of EC2 in a couple of racks at a single location…

Amazon Elastic Compute Cloud just launched

Posted: August 24th, by tve

See the EC2 announcement on the Amazon Web Services site. Pretty cool to have “infinite” compute resources available (credit card limit permitting)! Basically you get a Xen virtual linux image that you store on Amazon S3 and can “instantiate” on a server. The server is actually dedicated, so the Xen virtual image is just used in order to conveniently load the image and control the set-up.

The specs of a server are “the equivalent of a system with a 1.7Ghz Xeon CPU, 1.75GB of RAM, 160GB of local disk, and 250Mb/s of network bandwidth.” The virtual server image that is on S3 and that gets instantiated is limited to 10GB. Once running, there is a swap partition (2GB, I believe) and a 160GB “ephemeral” partition that is mounted as well. The ephemeral partition is lost if the server crashes and the machine instance is restarted on a different server. This means that for persistent storage one had better use S3 or multiple instances.

Overall this sounds a bit more expensive than I had hoped. For $120/mo you get a dedicated box with 2TB transfer at various hosting facilities (without trying to get rock-bottom). At Amazon it costs you $72/mo if you run an instance 24/7. If you then did 2TB of transfer that would slap on another $200, ouch! Looks like a good idea would be to run the base servers at some traditional hosting facility and to leverage Amazon for peaks only.
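The comparison can be written out as a small calculation. This is a sketch under stated assumptions: the $72/mo figure is taken to come from a price of $0.10 per instance-hour over a 30-day month, the $200 transfer charge is the number given above, and the helper name is made up. Amounts are kept in whole cents to avoid floating-point rounding.

```ruby
HOURLY_CENTS    = 10       # assumed price: $0.10 per instance-hour
HOURS_PER_MONTH = 24 * 30  # one instance running 24/7 for a 30-day month

# Monthly EC2 cost in dollars for one instance plus a given transfer charge.
def ec2_monthly_dollars(transfer_dollars)
  HOURLY_CENTS * HOURS_PER_MONTH / 100 + transfer_dollars
end

puts ec2_monthly_dollars(0)    # 72
puts ec2_monthly_dollars(200)  # 272, versus $120 for the dedicated box
```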

One of the things that is missing from EC2 is static IPs. Every instance gets a dynamic IP allocated. That makes it difficult to run a reliable external-facing web server. One can use dynamic DNS services, but with the amount of caching going on in the DNS system that’s really not a satisfactory solution. Hopefully AWS will add some form of load balancing solution with static IPs soon!

Rails list / show / edit variations

Posted: August 17th, by tve

Having tried the standard Rails scaffold and now the new Streamlined scaffold, I feel like I’m back to square one in terms of designing the interaction model for my app. It seems it all boils down to how one handles list / show / edit, in particular where AJAX edit in place is possible and where it isn’t. There are only a few combinations possible, but unfortunately the standard Rails scaffold and Streamlined each implement only one of them, which makes it difficult to move stuff around or to build an app that uses more than one model of interaction.

The standard Rails scaffold doesn’t use AJAX at all. Its model is as follows:

  • List - Table with 1 row per object, buttons for show / edit / delete per row, new (create) button at end of table
  • Show - Table with 1 row per attribute, buttons for back and edit at the bottom
  • Edit - Form table similar to show, but with editable forms, buttons for save and cancel at the bottom

Streamlined takes a different approach and basically operates entirely from within the list view. Show and edit are handled within pop-up layers:

  • List - Table with 1 row per object, buttons for show / edit / delete per row, new button at end of table
  • Show - Pop-up table with 1 row per attribute, edit button at end, and “window” close/minimize/maximize buttons in the pop-up window title bar
  • Edit - Uses same pop-up window as show and transition between show and edit is smooth (but does require server request), buttons for save and show at the bottom.

A nice aspect of the Streamlined show/edit pop-up windows is that it is possible to show multiple elements at the same time.

The ajax scaffold seems to be similar to Streamlined in that it operates entirely out of the list view. I haven’t tried it, so I’m not 100% sure. From what I can gather it operates as follows:

  • List - Table with 1 row per object, buttons to edit and delete, button for new (create) at the top.
  • Show - no such view, it’s all shown in the list table
  • Edit - Creates an in-place form by increasing the vertical size of the row and placing a form into the row with buttons for save and cancel.

The mode I’d like to have at the moment is different from the above. The primary reason is that I have a lot of information for each object, including images, thus making it impractical to show everything in a list view. What I’d like is:

  • List - Table with 1 row per object, showing only selected attributes, click on row to show, button at top for new
  • Show - Page with 1 row per attribute, click on any attribute to switch to edit, button at top for delete
  • Edit - Seamless in-place transition between show and edit, buttons for save and cancel at the bottom

The second reason I’d like to have a full page for show, instead of the pop-up that Streamlined creates, is that I have a bunch of many-to-many relationships for which I’m still trying to find a good editing model. I am not using the Rails HABTM at all and am instead creating explicit models for the relationships because each of these relationships has attributes of its own. What I’d like to happen with these relationships in the views of the entities that are being related is the following:

  • List - There is no list view for a relationship on its own, but there is a list partial that can be included in a cell in the list view of either of the entities being related
  • Show - There is no show view for a relationship on its own, but there is a show partial that is included in the show view of either of the entities being related. The show partial is really more like a list view in that it shows a table with one row per relationship.
  • Edit - Again, there is no edit view for a relationship on its own, but there is an edit partial very similar to the show partial that is intended to be included in the edit view of either of the entities being related. The edit partial allows inline editing of the relationship attributes, and it has a delete button for each row. At the bottom it has a drop-down select box to add new relationships.
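
To illustrate what I mean by an explicit relationship model, here is a minimal sketch in plain Ruby (deliberately not ActiveRecord, so it runs on its own). The Person, Project and Membership names and the role attribute are hypothetical; they just stand in for two related entities and a relationship that carries its own attributes. The relationship_rows helper is also made up, and mimics what the show partial would render: one row per relationship, seen from either side.

```ruby
# Hypothetical entities; stand-ins for any two models being related.
Person  = Struct.new(:name)
Project = Struct.new(:name)

# The relationship is a model of its own because it carries an attribute (role).
Membership = Struct.new(:person, :project, :role)

# Rows for a "show partial": one row per relationship involving the entity,
# listing the entity on the other side plus the relationship's own attribute.
def relationship_rows(entity, memberships)
  involved = memberships.select do |m|
    m.person.equal?(entity) || m.project.equal?(entity)
  end
  involved.map do |m|
    other = m.person.equal?(entity) ? m.project : m.person
    [other.name, m.role]
  end
end

alice   = Person.new("Alice")
website = Project.new("Website")
blog    = Project.new("Blog")
memberships = [
  Membership.new(alice, website, "lead"),
  Membership.new(alice, blog, "reviewer")
]

# The same helper serves the show view of either side of the relationship.
relationship_rows(alice, memberships).each { |row| puts row.join(" | ") }
relationship_rows(website, memberships).each { |row| puts row.join(" | ") }
```

The point of the sketch is only that the same partial-like helper can be dropped into the view of either entity; an edit partial would add inline editing of role and a delete button per row.
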

Ok, I really need to add some pictures to explain this. But actually the real point I want to make is not that the way I’d like to structure the list / show / edit interaction is better, but that I need the flexibility. So what I’d like is the scaffold to offer more flexibility and generate more possibilities for me: don’t try to lock me into one interaction model. Either give me a bunch of options to pass to the generator or generate a whole bunch of partials that I can combine in different ways.

Installing Ruby on Rails on a Rimuhosting Fedora Core machine

Posted: August 7th, by tve
Tags: #

Here I go for my second Ruby on Rails install. This time it’s on a “virtual private server” at Rimuhosting.

Installing a whole bunch of ruby stuff:

yum install ruby ruby-libs ruby-mode ruby-rdoc ruby-irb ruby-devel ruby-docs

Installing gems (check for latest version on RubyForge):

wget http://rubyforge.org/frs/download.php/11289/rubygems-0.9.0.tgz
tar zxvf rubygems-0.9.0.tgz
cd rubygems-0.9.0
ruby setup.rb

I installed rails:

gem install rails --include-dependencies

Now MySQL, which was installed but needed a bit of config:

chkconfig --add mysqld
chkconfig --level 3 mysqld on
/etc/init.d/mysqld start
/usr/bin/mysqladmin -u root password 'somethingsmart'
/usr/bin/mysqladmin -u root -h graphs.voneicken.com password 'somethingsmart'

Now we’re ready for the ruby mysql interface. First install the mysql-devel package (Rimu uses apt on FC5, not yum):

apt-get install mysql-devel

and then the ruby interface itself:

gem install mysql -- --with-mysql-dir=/usr/lib/mysql --with-mysql-config=/usr/bin/mysql_config

Now we’re ready for mongrel!

gem install mongrel

did its job without complaining. What a relief!
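
A quick sanity check after all of this is to see whether the freshly installed libraries actually load. The following is just a sketch: the library_status helper is made up for illustration, and the names passed to require are my assumptions about how these gems load ('mysql' and 'mongrel' for the two gems, 'active_record' standing in for the Rails install).

```ruby
require 'rubygems' # needed on older Rubies; harmless on newer ones

# Try to load each library and report whether it is available.
def library_status(names)
  names.map do |name|
    begin
      require name
      [name, :ok]
    rescue LoadError
      [name, :missing]
    end
  end
end

# Only print the report when this file is run directly.
if __FILE__ == $0
  library_status(%w[active_record mysql mongrel]).each do |name, status|
    puts "#{name}: #{status}"
  end
end
```

Anything reported as missing points at the install step to revisit.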