Clusterf*k

1/06/2009

This email follows up earlier communication regarding the major ICT systems failure on Friday night [actually it was the first news I got], where the University’s primary data storage system had multiple simultaneous disk failures.

Note that while no staff email has been lost, staff files on the P: or K: drives have only been recovered from backups made early Friday morning. Files that were created new, or modified during Friday have been lost.

Just in case, if you emailed me on Thursday/Friday you had better send me the email again.

Filed in web No Comments

The value of professional associations

29/05/2009

I have followed with interest the discussion on what should be the role of the New Zealand Institute of Forestry (NZIF§). It seems that a frequent position espoused by members is that the NZIF has two options to provide value to its members: making registration a legal requirement and ensuring high professional standards. I would contend that the first one is an oxymoron: ‘let us create value by making membership compulsory, so then members derive value from membership’. This generates as much value to members as a protection racket does to its victims. The second approach relies on the existence of an authority with the capacity to evaluate high professional standards. But who are our peers in our narrow fields of specialization? Who can judge us as being ‘good enough’ to sell services as a growth modeler, forest economist, tree breeder, etc? At the end of the day, the market is king (or queen), and we are judged every time that we complete a professional assignment. The same goes for other activities: I would hire an accountant or a lawyer based on recommendations and experience—which are often translated in the market place through availability and fees charged—rather than by membership of a professional association.

That leaves us with the question of how we really derive value from voluntary association. We interact with other members, we exchange information, we learn. Do we strictly need the NZIF for this learning? Probably not, although it facilitates the process. Maybe the right function for the NZIF is to create opportunities for professional development, conferences and coordinated submissions, and to make clear the role of forestry to New Zealand society. I think that the NZIF provides value by making communication easier for its members, while any artificial barriers will only be detrimental to the interests of people working in the forestry sector and to their customers.

P.S. This quote from Free to Choose§ by Milton and Rose Friedman makes the point very clearly:

Licensure is widely used to restrict entry, particularly for occupations like medicine that have many individual practitioners dealing with a large number of individual customers. As in medicine, the boards that administer the licensure provisions are composed primarily of members of the occupation licensed—whether they be dentists, lawyers, cosmetologists, airline pilots, plumbers, or morticians. There is no occupation so remote that an attempt has not been made to restrict its practice by licensure. According to the chairman of the Federal Trade Commission: “At a recent session of one state legislature, occupational groups advanced bills to license themselves as auctioneers, well-diggers, home improvement contractors, pet groomers, electrologists, sex therapists, data processors, appraisers, and TV repairers. Hawaii licenses tattoo artists. New Hampshire licenses lightning-rod salesmen.”

The justification offered is always the same: to protect the consumer. However, the reason is demonstrated by observing who lobbies at the state legislature for the imposition or strengthening of licensure. The lobbyists are invariably representatives of the occupation in question rather than of the customers. True enough, plumbers presumably know better than anyone else what their customers need to be protected against. However, it is hard to regard altruistic concern for their customers as the primary motive behind their determined efforts to get legal power to decide who may be a plumber.

Filed in forestry, new zealand No Comments

The pain of moving (computers)

13/05/2009

It was time to retire ‘Mastropiero’§ (my old mac laptop). While software-wise it was running well (using Leopard), the build quality of the first series of macbook pros was not stellar. The new laptop—‘Abraxas’—is a macbook pro with a 320 GB hard drive, 4 GB RAM and a 2.66 GHz Core 2 Duo processor.

Despite all the propaganda, migration assistant is a fairly useless beast (at least in my personal situation). The university buys the computer and sets up a user account that, incidentally, always has the same name for a given user. This means that I cannot just migrate my old account because there is already an account with the same name (and a bunch of settings) in place. In addition, migration assistant is pretty much an all or nothing affair, and I wanted to start with a fairly clean installation.

In the end I connected both laptops to the network and moved my data across. I imported all my songs into iTunes and copied the photo library, which was automatically upgraded from version 6 to version 8.

In the transfer process I dropped a number of programs that I was not using much. My current list of programs is in ‘I Use This’§. There is still a small amount of duplication; for example, both Eaglefiler and Devonthink are on the list, although eventually I will only keep the former. Another case in point is MS Office. I can’t really stand MS Word and PowerPoint, particularly in their mac incarnations. If Office 2004 was slow, the 2008 version is a turd. I am trying to get by using OpenOffice, which I still do not consider completely satisfactory. I also have Pages, which is not quite compatible with Word. I think that OpenOffice still does a better job; it is uglier but more functional.

From a teaching point of view Keynote (presentations) and TeXShop (lecture notes) do the heavy lifting. My calendar is managed in iCal, which is synchronized to Google Calendar and also to my resuscitated Palm T3; the latter using Mark/Space’s Missing Sync. I dutifully ignore Palm’s own software.

Statistics are managed through R, although I am still waiting for a mac version of asreml-R, a commercial package for genetic analyses. All publication quality plots are done there as well.

The university IT guys set up dual booting for me (20 GB Windows XP partition), but I haven’t yet had time to boot into Windows. They also installed the developer tools, which I hope to use to do some programming with Python and C++ or Fortran 95 (depending on time availability).

And that is it! A simple setup with oodles of space and memory; at least it feels like that now. Let’s wait for a year and see how it feels.

Filed in mac No Comments

Drylands

5/04/2009

There is a substantial amount of land with low rainfall–say between 500 and 900 mm of rain per year. Usually, this land is allocated to low productivity uses, for example sheep farming. Could we grow drought-tolerant eucalypts that produce durable wood? We could then have diversification of land use, alternative products and even additional carbon sequestration.

When we say dry it looks like this:

[Photo: Drylands in Marlborough.]

I would say that it is certainly worth a try; more precisely, a proper try. There is a history of half-hearted attempts in the matter, so if we are going to have a go, better we do it well or not at all.

Filed in forestry, photos No Comments

Eight years is nothing

31/03/2009

The screenshot shows the ‘State of the Surname’ as of 2001. Eighty-five hits for Apiolaza, most of them referring to my papers and emails to groups.

[Screenshot: Google search for 2001.]

Repeating the search today, we get a dramatically different answer§.

Filed in photos, web 1 Comment

This time is Calvino

30/03/2009

This happens relatively frequently: I am talking with someone who doesn’t know me well and, at some point in the conversation, I mention that I am a forester. Then we move on to books and I mention someone like Borges or Calvino, and they look at me with a puzzled face, as in ‘I didn’t know that foresters could read’. I know, it happens to other professions as well; just for the record, not all of us are semi-literate apes working with a chainsaw.

I was sorting out my bookshelves at work when I found a copy of ‘The literature machine’, a collection of essays by Italo Calvino§. It had my name and signature, together with 2002, Melbourne, Australia. (Digression: besides my name and signature I always put the city where I bought a book). I had vague memories of walking around in Melbourne’s CBD and finding an underground bookshop. At the time I was not looking for anything in particular, just browsing titles.

Why did I buy the book and never read it? I do remember browsing it and getting distracted by something more urgent, albeit clearly unimportant, because I cannot remember what it was. Probably I was not ready either; it has happened to me before. From ‘Uncle Tom’s cabin’§ when I was nine, to ‘The Fountainhead’§ when I was a teenager, to ‘The literature machine’ seven years ago. Most likely there is an issue of maturity, of being ready to read a particular story, philosophy or approach to the world.

Many years ago I read some of Calvino’s books, like Cosmicomics§ (brilliantly funny) and ‘The cloven viscount’§ (very enjoyable reading). But I particularly struggle with two literary forms: essays and plays. I sometimes can get into the former, but the latter has proven–until today–insurmountable.

However, today is the time for Calvino and essays. There is something deeply stimulating in these essays, together with a quaintness created by forty years gone since they were written. The feeling of freshness, possibility and hope from 1968 reads strange in 2009. At the same time, there is a bit of breaking with the system, since the implosion of the international economy. Maybe it is an excellent time to resonate with Calvino, as in the old days.

Filed in books, influences, writing No Comments

Teaching stats and software

12/03/2009

Forestry deals with variability and variability is the province of statistics. The use of statistics permeates forestry: we use sampling for inventory purposes, we use all sorts of complex linear and non-linear regression models to predict growth, linear mixed models are the bread and butter of the analysis of experiments, etc.

I think it is fair to expect foresters to be at least acquainted with basic statistical tools, and we have two courses covering ANOVA and regression. In addition, we are supposed to introduce/reinforce statistical concepts in several other courses. So far so good, until we reach the issue of software.

During the first year of study, it is common to use MS Excel. I am not a big fan of Excel, but I can tolerate its use: people do not require much training to (ab)use it and it has a role in introducing students to some of the ‘serious/useful’ functions of a computer; that is, beyond gaming. However, one can hit Excel’s limits fairly quickly which–together with the lack of an audit trail for the analyses and the need to repeat all the pointing and clicking every time we need an analysis–makes looking for more robust tools very important.

Our current robust tool is SAS (mostly BASE and STAT, with some sprinkles of GRAPH), which is introduced in second year during the ANOVA and regression courses. SAS is a fine product, however:

  • We spend a very long time explaining how to write simple SAS scripts. Students forget the syntax very quickly.
  • SAS’s graphical capabilities are fairly ordinary and not at all conducive to exploratory data analysis.
  • SAS is extremely expensive, and it is doubtful that we could afford to add the point and click module.
  • SAS tends to define the subject; I mean, it adopts new techniques very slowly, so there is the tendency to do only what SAS can do. This is unimportant for undergrads, but it is relevant for postgrads.
  • Users tend to store data in SAS’s own format, which introduces another source of lock-in.

In my research work I use mostly ASReml§ (for specialized genetic analyses) and R§ (for general work), although I am moving towards using ASReml-R (an R library that interfaces with ASReml) to have a consistent work environment. For teaching I use SAS to be consistent with second year material.

Considering the previously mentioned barriers for students I have started playing with R-commander§, a cross-platform GUI for R created by John Fox (the writer of some very nice statistics books§, by the way). As I see it:

  • Its use in command mode is not more difficult than SAS.
  • We can get R-commander to start working right away with simple(r) methods, while maintaining the possibility of moving to more complex methods later by typing commands or programming.
  • It is free, so our students can load it into their laptops and keep on using it when they are gone. This is particularly true with international students: many of them will never see SAS again in their home countries.
  • It allows an easy path to data exploration (pre-requisite for building decent models) and high quality graphs.
  • R is open source and easily extensible.

I think that R would be an excellent fit for teaching; nevertheless, there would be a few drawbacks, mostly when dealing with postgrads:

  • There are restrictions to the size of datasets (they have to fit in memory), although there are ways to deal with some of the restrictions. On the other hand, I have hit the limits of PROC GLM and PROC MIXED before and that is where ASReml shines.
  • Some people have an investment in SAS and may not like the idea of using different software.

We will see how it goes because–as someone put it many years ago–there is always resistance to change:

It must be remembered that there is nothing more difficult to plan, more doubtful of success, nor more dangerous to manage, than the creation of a new system. For the initiator has the enmity of all who would profit by the preservation of the old institutions and merely lukewarm defenders in those who would gain by the new ones.—Niccolò Machiavelli, The Prince, Chapter 6.§

Filed in software, statistics 6 Comments

Skynet in Python

5/02/2009

After a long hiatus I have come back to doing some (extremely basic, I have to admit) Python coding. This xkcd§ comic is a timely reminder:

Well, that and minimization of the objective function.

Filed in miscellanea, programming No Comments

Influences: Cronopios and Famas

2/02/2009

Books have accompanied me all my life, or at least for as long as I can remember. However, my reading habits have changed many times, from reading simple books, to reading very complex books, to reading anything, to reading if I squeeze a few minutes here and there, to… you get the idea. ‘Habits’ is a funny word, an oxymoron, to refer to constant change.

Today I was thinking of influential books. Not ‘good’ books or books that have received many awards or that have guided generations or catalyzed social change. I mean only books that have been important for me at a given point in time. If I had read them before or after that time they may have passed unnoticed. But I read them then, at the right time… for me.

As an adult I have moved house several times, and every time I have lost books. There are also books that have been with me all this time. One of them is ‘Cronopios and Famas’, a collection of very short stories by Julio Cortázar§, one of the big voices of Argentinian literature. My first encounter with ‘Historias de Cronopios y Famas’–the original Spanish title–was in my maternal grandparents’ apartment. I was living with them and I was looking for something to read. Anything. I opened a drawer and found some interesting books, including Cortázar’s. It was one of the first editions, which I think belonged to one of my uncles, the one in exile.

Why was this an important book? Language, raw language. I am completely at a loss when trying to explain Cortázar to someone who has not read his books. As Borges said:

No one can retell the plot of a Cortázar story; each one consists of determined words in a determined order. If we try to summarize them, we realize that something precious has been lost—Jorge Luis Borges

In ‘Progreso y retroceso’ (progress and regress) the whole story fits in only two paragraphs. The story is about a crystal that lets flies through but that does not let them come back because ‘no one knows what stuff in the flexibility of the fibers of this crystal, which was too fibrous’ or something like that:

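Inventaron un cristal que dejaba pasar las moscas. La mosca venía, empujaba un poco con la cabeza y, pop, ya estaba del otro lado. Alegría enormísima de la mosca.

Todo lo arruinó un sabio húngaro al descubrir que la mosca podía entrar pero no salir, o viceversa, a causa de no se sabe qué macana en la flexibilidad de las fibras de este cristal, que era muy fibroso. En seguida inventaron el cazamoscas con un terrón de azúcar dentro, y muchas moscas morían desesperadas. Así acabó toda posible confraternidad con estos animales dignos de mejor suerte.

A rough English rendering: They invented a crystal that let flies through. The fly would come, push a little with its head and, pop, it was already on the other side. Enormous joy for the fly.

It was all ruined by a Hungarian scholar who discovered that the fly could get in but not out, or vice versa, because of who knows what hitch in the flexibility of the fibers of this crystal, which was very fibrous. Right away they invented the flytrap with a lump of sugar inside, and many flies died in despair. So ended any possible fellowship with these animals, worthy of better luck.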

The story is straightforward, with simple, almost pedestrian words. But those words have been extremely carefully selected, crafted in a particular order. I imagine Cortázar spending countless hours, agonizing over myriad small decisions until reaching a point of perfect simplicity.

There was a clear before and after reading this book in 1981: language was not the same ever again. I learned to find the fantastic side of the quotidian. I grew to appreciate risk when building sentences, when pushing meanings and readings. My whole way of looking at the world was influenced by a small book of ridiculous short stories.

Filed in books, influences No Comments

Generating dynamic Google maps with Python

1/02/2009

As I have mentioned before, I have been putting together some dynamically generated maps for environmental information. A barebones version of my Python code to generate the KML file is:

#!/usr/bin/env python
# encoding: utf-8
 
import urllib, random
 
# Charting function
def lineChart(data, size = '250x100'):
    baseURL = 'http://chart.apis.google.com/chart?cht=lc&chs='
    baseData = '&chd=t:'
    newData = ','.join(data)
    baseData = baseData + newData
    URL = baseURL + size + baseData    
    return URL
 
# Reading test data: connecting to server and extracting lines
f = urllib.urlopen('http://gis.someserver.com/TestData.csv')
stations = f.readlines()
kmlBody = ('')
 
for s in stations:
    # Strip the trailing newline so it does not end up inside the value field
    data = s.strip().split(',')
    # Generate random data
    a = []
    for r in range(60):
        a.append(str(round(random.gauss(50,10), 1)))
 
    chart = lineChart(a)
 
    # data is csv as station name (0), long (1), lat (2), y (3)
    kml = (
        '<Placemark>\n'
        '<name>%s</name>\n'
        '<description>\n'
        '<![CDATA[\n'
        '<p>Value: %s</p>\n'
        '<p><img src="%s" width="250" height="100" /></p>\n'
        ']]>\n'
        '</description>\n'
        '<Point>\n'
        '<coordinates>%f,%f</coordinates>\n'
        '</Point>\n'
        '</Placemark>\n'
        ) %(data[0], data[3], chart, float(data[1]), float(data[2]))
 
    kmlBody = kmlBody + kml
 
# Bits and pieces of the KML file
contentType = ('Content-Type: application/vnd.google-earth.kml+xml\n')
 
kmlHeader = ('<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n'
             '<kml xmlns=\"http://earth.google.com/kml/2.1\">\n'
             '<Document>\n')
 
kmlFooter = ('</Document>\n'
             '</kml>\n')
 
 
print contentType
print kmlHeader
print kmlBody
print kmlFooter

Well, this is not exactly barebones, because we also wanted to generate dynamic graphs for each placemark, in the easiest possible way. My first idea was to use one of the multiple javascript libraries available on the net. However, a quick search revealed that KML files do not support javascript in the description tag. That was the time when I remembered playing with Google Charts a while ago. The lineChart function above is simply a call to create a line chart using the charts API. Because this is a test, I used 60 randomly generated data points, which explains the presence of random as an imported library.
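
For anyone reading this on Python 3, where urllib.urlopen and the print statement no longer exist, the core of the script could be sketched roughly as below. This is only an illustration under some assumptions: the names line_chart_url and placemark and the sample station row are made up for the example, the data come from a string instead of the remote CSV file, and the chart URL is simply the one used above (whether that service still answers is not checked here).

```python
import random

# Base URL copied from the script above; treated here as a plain string
CHART_BASE = 'http://chart.apis.google.com/chart?cht=lc&chs='


def line_chart_url(values, size='250x100'):
    """Build a line-chart URL from a sequence of numbers."""
    data = ','.join(str(v) for v in values)
    return CHART_BASE + size + '&chd=t:' + data


def placemark(row, chart_url):
    """Turn one CSV row (name, long, lat, value) into a KML Placemark."""
    name, lon, lat, value = row.strip().split(',')
    return (
        '<Placemark>\n'
        '<name>%s</name>\n'
        '<description>\n'
        '<![CDATA[\n'
        '<p>Value: %s</p>\n'
        '<p><img src="%s" width="250" height="100" /></p>\n'
        ']]>\n'
        '</description>\n'
        '<Point>\n'
        '<coordinates>%f,%f</coordinates>\n'
        '</Point>\n'
        '</Placemark>\n'
    ) % (name, value, chart_url, float(lon), float(lat))


# Hypothetical station row, standing in for one line of the CSV file
row = 'Station A,-70.65,-33.45,42\n'

# 60 random data points, as in the test above
values = [round(random.gauss(50, 10), 1) for _ in range(60)]

kml = placemark(row, line_chart_url(values))
print(kml)
```

Stripping the row before splitting keeps the trailing newline out of the value field, and passing numbers (rather than strings) to the chart function leaves the formatting in one place.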

Originally, I did not want to use javascript at all, so I inserted the script as a search in maps. This generates a link like http://maps.google.co.nz/maps?q=http://gis.someserver.com/dynamicmap.py that can be copied and sent to someone and, presto, they have access to my map. However, I wanted to embed it in a blog post§ and I was struggling to do it. The solution was to click on the ‘Link’ link in the generated map and copy the ‘Paste HTML to embed in website’ code. This gives an iframe block that can be pasted into any page or blog post.

While helping a friend to create another map, we faced the problem that the data set was being updated every five minutes. What is the problem? The map was not being refreshed often enough. I am not sure if the problem was the browser cache or Google Maps, but it could be solved by calling the KML file with a random extra argument (the script does not take any arguments, so anything after the question mark is ignored). In my case I needed an argument that changes frequently, so I used the current time (using the date would work for once-a-day updates). This meant inserting the map using javascript (and using a Google Maps key). The code for a simple page–from the header onwards–would look like:

<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
<title>A simple dynamic python generated map</title>
<script src="http://maps.google.com/maps?file=api&amp;v=2&amp;key=my_key"
  type="text/javascript"></script>
<script type="text/javascript">
    //<![CDATA[
 
    function load() {
      if (GBrowserIsCompatible()) {
        var map = new GMap2(document.getElementById("map"));
        map.setCenter(new GLatLng(-33.458943, -70.658569), 11);
        var pollution = new GGeoXml("http://gis.uncronopio.org/testmapscsv.py?"+
                        (new Date()).getTime());
        map.addOverlay(pollution);
      }
    }
    //]]>
</script>
</head>
<body onload="load()" onunload="GUnload()">
<div id="map" style="width:750px;height:600px"></div>
</body>
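
The cache-busting trick itself does not depend on javascript. As a rough sketch in Python, assuming a hypothetical helper called cache_busting_url (it is not part of any library), the same kind of address can be built by appending the current time in milliseconds, which is what (new Date()).getTime() provides in the page above:

```python
import time


def cache_busting_url(base_url, now=None):
    """Append the time in milliseconds as a throwaway query string.

    The script at base_url is assumed to ignore its query string, so the
    extra argument only serves to make every request look new to a cache.
    """
    if now is None:
        now = time.time()
    return '%s?%d' % (base_url, int(now * 1000))


print(cache_busting_url('http://gis.someserver.com/dynamicmap.py'))
```

For data that change once a day, the date could be used instead of the time, so that the cached copy is reused until the next day.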

It was not too bad for mucking around on Friday in between doing house chores.

Filed in geocoded, programming, web No Comments