Friday, November 20, 2009

R Editor

Apology - This turned into a rather long post but I am happy to say that it is NOT a rant. In contrast, it is actually rather constructive. While I have written this for K/Ubuntu, it is relevant to other Linux distros.

A couple of days ago, there was an interesting discussion on the REvolution Computing blog, where David discusses installing and using ESS as a front-end for R. His discussion of ESS got me thinking about R's default user interfaces on Linux (and other operating systems) and how accessible they make R to new users. While the default R interfaces on Windows and OS X are light years from offering the kind of accessibility and ease-of-use offered by SPSS or other graphical tools, they do make it easier for new users to know where to start on their journey into the land of single-letter Googling.

On Windows and OS X, the default interface for R is a Graphical User Interface (GUI). The screen-shots I have seen of the R interface on OS X look very nice. (Please note that this screen shot is very dated, v 2.6.2, and may not represent the current tool very well.)

I have used the Windows interface and while it is not as sophisticated as the interface provided by OS X, it is adequate. (This screen shot is of more current vintage, which really just highlights the gap between the two tools.)

In contrast to both, the default Linux interface is straight-up command-line heaven (or hell, depending on your point of view). I rather like the spartan command-line interface provided by R, but many others do not. The most important reason I think the current default is a problem is that many new users have a hard time locating R in the menu, because it's not there. Imagine a new user's surprise: after downloading and installing R, which is a fairly hefty download, the user is unable to find an icon or link to start the program.

This isn't exactly newbie-friendly. In fact, I will admit that I spent several minutes looking for a menu entry before figuring out that there wasn't one. I figured this out by looking at the contents of the core package and realizing that there was no R.desktop file installed, but there was a program called R in /usr/bin.
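
For anyone who wants to repeat that check, here is a rough sketch in shell. It assumes a Debian-style system where dpkg is available and R comes from the r-base-core package; has_desktop_file is just a helper name I made up for this example.

```shell
#!/bin/sh
# Succeeds if the file list on stdin (one path per line)
# contains at least one menu entry, i.e. a *.desktop file.
has_desktop_file() {
    grep -q '\.desktop$'
}

# Assumption: R was installed from the r-base-core package.
# If the package is not installed, the list is empty and
# the "no .desktop file" branch is taken.
if command -v dpkg >/dev/null 2>&1; then
    if dpkg -L r-base-core 2>/dev/null | has_desktop_file; then
        echo "r-base-core ships a menu entry"
    else
        echo "no .desktop file found in r-base-core"
    fi
fi

# Show where the R program itself lives, if it is on the PATH.
command -v R || echo "R is not on the PATH"
```

On my machine the interesting part was the combination: no menu entry, but an R program sitting in /usr/bin.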

David's post on the REvolution Computing blog discusses ESS as a front-end for R. I have used ESS in the past and it is a very good way to interact with R. Unfortunately, I had some problems with ESS and EMACS in general which eventually led me to abandon it, but for many experienced users this is an excellent choice. In fact, it is obvious that many of the users on the R-users mailing list use ESS exclusively or nearly exclusively.

Lately, I have been using the new Vim-R-plugin2 for . . . Vim. It is similar to ESS in many ways but uses screen to connect Vim to R. I have been using screen at work for other reasons and the integration between Vim, R, and screen makes my life much easier. As a nice added bonus for Ubuntu users, the plugin works just as well with byobu.

While editor extensions such as Vim-R-plugin2 and ESS are great options for experienced users, these tools do not provide an adequate interface for new users. There are many reasons for this.
  • Learning curve -- Don't pretend there isn't a learning curve.
  • Not installed by default when you install R.
  • Neither tool provides a menu entry to start R.
My point is not to critique either tool. They both rock just the way they are. But, they are not good tools for new users and any attempt to shoe-horn them into meeting the needs of new users will fail, miserably. Asking a new user to learn EMACS and R at the same time is enough to send many intrepid analysts running for the safe hills of SPSS syntax.

Inevitably, someone will point out that while Python is installed by default on Ubuntu, IDLE is not, and therefore Python does not have a menu entry by default either. Hopefully everyone will read this far before writing a response in the comments. :-) To me the difference between Python and R is simple. The people looking for Python are going to have at least some technical skills, otherwise they will never hear about Python and don't need to interact with it. In contrast, non-programmers do learn about R and want to try it out. In the past year, I have seen articles about R in the NY Times and other mainstream publications. While R is unlikely to ever achieve an installation base similar to Firefox, it is an increasingly popular tool. More importantly, many of the people expressing interest in it are not programmers or Linux users with lots of Command-Line-Fu.

What is needed is a sane front-end for new users that will help ease them into R. I am not saying that R needs an interface like SPSS or PSPP. I would be more than satisfied by something similar to what is provided for Windows or OS X users (preferably more like the OS X interface). At the very least a new user should be able to install R and find it/run it from their menu! Fortunately, there are several front-ends to R.
Many of these options present problems as the "default" GUI for R. For starters, Cantor hasn't actually been released yet and, like RKWard, is designed for the KDE desktop. While both apps will work wonderfully under Gnome, XFCE, or any other desktop environment, both have a long list of dependencies. Installing either of these tools plus R on a non-KDE environment will require a large download that many new users may not like. Besides, R is not part of the KDE project, and requiring a new user to install KDE when installing R is absolutely silly.

Tools such as JGR or Rattle are not currently in the Ubuntu repositories, which makes using them as the default interface difficult to say the least. They can be installed via CRAN, but that's not the first thing that a new user is going to want to do.

Kate and Gedit include syntax highlighting for R and can embed a terminal in the editor, but new users will not find this easily and it does not address the need for a menu entry.

That narrows the list to tkStartGUI and Rcmdr. tkStartGUI is actually installed by default, although it's rarely used. If you are already running R, it's easy to get it running. Simply enter these two commands into an existing R session:

library(tcltk)
tkStartGUI()

This should start the simplest graphical interface you have ever seen. While it's not pretty (tcl/tk), it is functional AND odds are good that if you are reading this long-ass post, you already have it installed! You will quickly notice that this interface is actually less useful than the basic interface provided on Windows.

While Rcmdr is not installed by default, it is easy to install:

sudo apt-get install r-cran-rcmdr

On my machine, installation of Rcmdr required me to install 22 megabytes of additional R extensions. Most users can spare 22 megabytes or more, and this is a much smaller download/installation than RKWard on an Ubuntu system. But, much like tkStartGUI, Rcmdr does not come with a .desktop file and cannot be located from within the K/Ubuntu menu. Starting Rcmdr from within R is easy:

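library(Rcmdr)   # loading the package opens the R Commander window
# Commander()    # reopens the window if you close it during the session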

This should present you with a nice, easy-to-use GUI for R. It is not as nice as the OS X interface, and it is ugly (tcl/tk), but it is a capable interface for R.

It is possible to set up a default Ubuntu installation to include a menu entry for either tkStartGUI or Rcmdr. Given that tkStartGUI is already installed by default, this seems like a good place to start. And, logically, Rcmdr should come with a .desktop file when installed. I don't think Rcmdr should be installed by default, but setting it to recommended might be a good idea (it may be already, I have to look).

These two proposals would help new users find R and get started, and would bring a certain degree of parity to the different platforms served by R. As an added bonus, setting things up this way would have practically no effect on users who are already satisfied with the command-line approach. Most would never even notice the additional entry in their menus and it wouldn't hurt them if they did.
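
To make the tkStartGUI proposal a bit more concrete, here is a minimal sketch in shell of what a per-user menu entry could look like. The file name R-tk.desktop, the function name write_r_desktop_entry, and the Name, Comment, and Categories values are my own placeholders, not anything shipped by the R packages. It relies on R's --gui=Tk startup option, and Terminal=true is a cautious guess on my part so the session still has a terminal behind it.

```shell
#!/bin/sh
# Write a menu entry that starts R with its basic Tcl/Tk GUI.
# $1 is the directory that should receive the .desktop file.
write_r_desktop_entry() {
    app_dir="$1"
    mkdir -p "$app_dir" || return 1
    # Quoted 'EOF' keeps the shell from expanding anything in the entry.
    cat > "$app_dir/R-tk.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=R (Tk GUI)
Comment=Start R with its basic Tcl/Tk interface
Exec=R --gui=Tk
Terminal=true
Categories=Education;Science;Math;
EOF
}

# Per-user location for menu entries; no root access needed.
write_r_desktop_entry "${XDG_DATA_HOME:-$HOME/.local/share}/applications" \
    || echo "could not write the menu entry" >&2
```

A packaged version would presumably install the file system-wide under /usr/share/applications instead. Writing to the per-user directory just keeps the experiment easy to undo: delete the file and the menu entry is gone.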

Comments?

Wednesday, November 11, 2009

REvolution Computing, Ubuntu 9.10 - An example of the strength of the community.

Earlier today, I wrote a post titled "REvolution Computing, Ubuntu 9.10 - A mishandled opportunity". This evening I received an email from David Smith of REvolution Computing. For some reason, Uncle Google wouldn't let him log in to make a comment, so he emailed me instead. He pointed out that my earlier statement included some outdated information. After reading his email and confirming everything in it, I wrote this because I felt that David should be given the opportunity to respond to what I said, in his own words.
----------------------------------------------------------------------------------------------
from: David Smith
subject: Ubuntu 9.10 and REvolution R

I think you're looking at an old version of r-base-core. In a fresh 9.10 install, installing r-base-core and then running R shows only two lines added by REvolution:

david@ubuntu:~$ R
R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

REvolution R enhancements not installed. For improved
performance and other extensions: apt-get install revolution-r

>

We changed the text during the beta in response to comments (from you and others). That's why we do betas, after all. There are only two added lines, which were cleared by the R Core Group:

REvolution R enhancements not installed. For improved
performance and other extensions: apt-get install revolution-r

We announced the changes (and the reasons why it was done the way it was) before 9.10 was released on our blog:


The new Application Center option is an interesting one, thanks for bringing it up. It just wasn't an option that was on our radar at the time.

Hope this clears things up a bit,
# David Smith
-------------------------------------------------------------------------

My first reaction - It does.

I had not noticed this change, in part because I have had the REvolution Computing extension installed on one of my primary development machines so I could assess the company's product. I'll write about that in a separate post in a week or two, but first I removed all of the REvolution Computing packages, so I could see the updated greeting for myself. A screen-shot seems appropriate.


I agree with David: the new language is an improvement, and I was thrilled to learn that these changes were made because of the feedback received from the Ubuntu community. While I know there are some in the community who are uncomfortable with anything that smells like ad-ware, I think it is very important to highlight the fact that input from the community did affect the final product. This is one of the things that makes Free Software so incredible: users can have an important role in shaping the product.

This discussion further highlights the importance of developing some way for independent vendors, such as REvolution Computing, to distribute software via Ubuntu in a manner that is acceptable to everyone. Everyone includes the independent vendor, Canonical AND the broader Ubuntu community. Clearly, this is going to be a delicate balancing act. Vendors like REvolution Computing want their products to be seen and used, but many end-users of open-source software have a very low tolerance for anything that even resembles the ad-ware found on Windows. I think an intelligent App Center is the best way to balance the interests of both parties and I will watch to see how things develop during the Lucid Lynx cycle. Unfortunately, an intelligent App Center is easier said than done and will require substantial effort from Canonical to make it a reality (hopefully it can be a reality on Ubuntu AND Kubuntu).

I hope this discussion adds to the development of this collaboration in a meaningful way.

REvolution Computing, Ubuntu 9.10 - A mishandled opportunity

On Linux, R is a command line program. Whether this is a good thing is a separate issue. When R is started, a "greeting" is displayed before the R command prompt. Because it's only a few lines, it certainly doesn't delay start up and the message has traditionally been innocuous. However, users of Ubuntu 9.10 will notice an addition to the traditional R "greeting".

R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.


This is REvolution R version 3.0.0:
the optimized distribution of R from REvolution Computing.
REvolution R enhancements Copyright (C) REvolution Computing, Inc.

Checking for REvolution MKL:
- REvolution R enhancements not installed.
For improved performance and other extensions: apt-get install revolution-r

For readers unfamiliar with the greeting message in R, I assure you that everything below the line starting with 'citation()' is new. This has sparked a rather interesting discussion on the Ubuntu Forums - Link To Discussion.

As you can see from what I have written on the forum, I was immediately concerned but did not want to over-react. However, after hearing the various points of view that I have had access to, I have concluded that Canonical handled this poorly but I want to be very clear regarding where I feel there is a problem.

The section of the greeting regarding REvolution Computing's product is ad-ware. Not only is it ad-ware, the message is misleading. An inexperienced user could mistakenly read this and believe that their R installation is incomplete. Furthermore there is nothing in this message warning a user that part of the REvolution Computing system is proprietary. Although REvolution Computing has contributed several interesting extensions to the R community under an OSS license, the mkl extension is proprietary and would be installed by default if a user follows the recommendation of the new "greeting".

To sum up - I don't like the new greeting. I don't want to recreate the Ubuntu Forums discussion. I recommend you read the full discussion there.

Before closing this post, I do want to emphasize that I think REvolution Computing and Canonical are well within their rights to work together to offer this product to Ubuntu users. In fact, it makes good sense for these companies to work together to offer a modern, comprehensive statistics/BI solution to end-users. And I think it is perfectly OK for Canonical to help new users find the REvolution Computing extensions, but I don't think this should be done via the greeting in R.

The Karmic Koala introduced a new Application Center to Ubuntu. This replaced the Add/Remove Applications tool of past Ubuntu releases and will eventually also replace Synaptic on default installations. This application is intended to showcase open source AND proprietary applications to Ubuntu users and I think that is 100% terrific. I LIKE options, provided the options come with full and reasonable disclosure. I think Canonical should develop a system so users installing R, PSPP, Gretl, etc. are able to easily find other potential solutions to their needs. This list should include REvolution Computing's product. Users should be able to easily see how popular these various programs are with other Ubuntu users. The license should also be obvious and not confusing. I think Open-Source and Proprietary are adequate labels. I don't think new users need to know the difference between the GPL, LGPL, Apache license, etc.

As you can see in the forum discussion, I did look on Launchpad and I found a bug report that is relevant to this discussion. You can find that bug report here. I hope that Canonical listens to the concerns of the community regarding this. Given the heated debate in the OSS community regarding technologies that fit into gray areas like Mono, this seems like a fight that Canonical and the community should work to resolve, rather than allow it to fester and turn into something like the Mono debate.

--andy

Take this down

I need to take this down later today.

W20705731R1400D0511-UB2BYRY3

Yeah.

Tuesday, November 10, 2009

Habits - They are hard to change.

It's been a while since I wrote anything here. I got busy with work, riding, Karen, life, etc. It happens. But, I do want to write here too. I tend to spam my friends with long explanations of how I see the world and I think it would be better to put those things here rather than bombard my friends with my various mental musings. So, to get things kicked off to a good start I back-posted an email I sent to all of my friends and family about a recent accomplishment of mine - I have ridden OVER 1,000 miles this year on a bicycle.

Unintended side effect - I have nice looking legs.

:-)

Tuesday, November 3, 2009

"A journey of a thousand miles, begins with a single step."
-- ??????????????? (See Note #1)
Or in my case, a single push of the pedal. . . . .
--Andy

In the summer of 2008, the price of gasoline hit record highs. For a while, a single gallon of gasoline in upstate New York cost nearly $5.00. Commuting to work was suddenly taking a significant bite out of my take-home pay. In a moment of serendipity, or rather stubbornness, I went out and bought a road bike. Without really doing much research on the matter, I decided that I would start commuting to and from work via bicycle. I figured I could pay for the thing by using the one-time George W. Bush kick-back and my gas savings. Thanks to the outrageous price at the pump, I easily broke even last fall. When it got too cold for me to ride, I discovered that I missed it greatly. I had grown rather fond of my commute along the Hudson River between Albany and Troy.
When the snow thawed and the sun peeked out for more than 3 hours a day (spring 2009), I pulled the bike out of the shed and dusted it off. I also bought a bike computer, so I could measure how far I had ridden. This was especially nice when Karen and I entered the 2009 Tour de Cure to raise money for diabetes. We rode 50 miles on that spring day, which was Karen's second long-ish ride on her bike. By this point, I was once again commuting to and from work 2-3 times a week on the back of my aluminum horsey. I went on a few long-ish rides, and I bought a mountain bike so I could bounce off of trees and rocks, but most of my miles were ridden somewhere between Albany and Troy. A month or so ago I realized I had ridden more miles than I had thought.
Yesterday while riding to work, my odometer tripped over one-thousand miles. To me, that's a big number. I'm really careful (read: OCD) about the tire pressure on the road bike, so this number is scary accurate. Among the more experienced riders who will receive this, I'm aware that 1000 miles is only 10 centuries - something that some of you hit by May. But for me this was pretty incredible. On my way to work today I paused for a moment when the odometer first read 1000 to look around. I was still in Albany, down along the Hudson River. The city recently repaved this section of the trail, an improvement that is hard for me to fully explain in a way that keeps this e-mail G-rated. But I will tell you that my padded tights seem to work much better now. It was a nice brisk fall morning and there were many walkers/runners sharing the trail with those of us on bicycles. Although I was running a little behind schedule and needed to get to work, I couldn't help but enjoy my surroundings. I also paused to enjoy the screams of the children I had run over earlier, but that's a discussion for a separate e-mail.
What does this all mean? For starters, I can happily report that I spent more money on padded tights this year and less money on gasoline. Priorities??? Parking is free at work, and since I've had to spend a fair amount of money on cycling gear to be comfortable/safe, I doubt I've saved much money. On the other hand, I am in phenomenal shape although I rarely "work-out" like I used to. When I get home, I usually eat dinner, hang out with friends, take a nap, etc. Thanks to my commute, I know I'll get plenty of exercise. There are also the obvious environmental benefits. I am a walking/talking/pedaling carbon-sink.
But, the real benefits of all this riding go far beyond any financial or environmental advantages. Today, I am more familiar with the back streets of Albany, where riding a bike is a little less suicidal than State Street or Madison Avenue. I can also honestly say that I feel less stressed. If a day at work sucks, I can take it out on the ride home. It gives me a chance to process things and generally work out any frustrations I have regarding my day at work. And finally, time moves a little more slowly on a bicycle. I get to really notice the weather and the seasons in a way that just isn't possible in a car. Since my top sustainable speed is somewhere around 20 miles per hour, I can honestly say that I have to move slower too. In a world that is often dead-set on making us all multi-task a little more or move a little faster to accomplish more in a single day, it's REALLY nice to just slow down.

--andy

---- Notes -----
Note #1: I really like this quote. It fits into so many situations ranging from moments of careful self-reflection to opportunities of cliche comedic bliss. Being a rather scholarly individual, I checked on Google to make sure I cited it appropriately. Most appear to cite Confucius while others think this was said by Lao-tzu. They're both dead, so I don't think they'll mind me appropriating the quotation.