SonicLib: an experiment in an inclusive, gender-balanced open source project


Jan 21, 2015


Mauri Favaron

Italy


The team formed in 2011 to solve a very specific scientific problem. (I mention it only for completeness: it is about the free processing of ultrasonic anemometer data for various practical applications such as air quality and ecology, while knowing what you are really doing. If you are curious about the technicalities, you may find plenty of them at http://www.boundarylayer.eu/projects_soniclib.html.)

Here I'm not about to delve into geeky technicalities, however. Rather, I will deal with open-source projects.

Open-source software: a few mini-highlights

I guess I am not the most qualified person to present this point. But the point itself is more important than I am, so allow me an attempt.

"Open source" is software released along with all the computer instructions used to generate it (known as "source code").

The core idea behind open source is that if you can access the source code and are knowledgeable enough, you can study for yourself how the program has been written, what exactly it does, and how it may be improved (or corrected).

The importance of this idea stems from a crude fact: most computer programs are not open-source. Think of Office, to name just one. This "closed-source" software works, of course, but your only way to understand what it really does is through the manuals, books, classes and so on made available by the program manufacturer or someone else. In many cases this means you "have" to pay both for having and using the program and for learning about it.

In theory, the idea of open-source is very nice and enabling.

But as we'll see shortly, it also contains some dangers.

Behind the scenes: the open-source mindset

"If you have the source you can understand it."

Right. But to do so you have to be a programmer proficient in the language used to write the source code (many exist, like Ada, Fortran, Java, Go, and dozens of others).

Not all people are, however.

And much software would demand, in my modest opinion, full awareness by all of us, or at least by the majority. Think, for example, of the program running on your car's ABS computer, or in a pacemaker, or in a digital power meter, or in any of the countless other embedded devices surrounding us (which have the potential to become very harmful if they fail, or if an error in their governing programs suddenly manifests).

Open-source activists claim the biggest advantage of open source software is that it gives you the "freedom" to study and modify it, without any commercial party banning this fundamental right.

This is true, sure. But in my feeling there is much more, which in most cases remains completely understated. As I see it, what the most vocal (and often ideologized) advocates of open source mean by "freedom" is, really, autonomy.

You may understand this from the lingo of open source. Most open-source software forums to date are populated by people who will help you with a limpid heart. But some still address newbies with an "RTFM" (a horrible acronym of "Read The ******* Manual": that is, think twice and study a lot before attempting to post a silly question). There is a lot of pecking and ranking, with the Real Hackers at the top. Religious wars about my-computer-language-is-better-than-yours still abound.

In such a context, "freedom" to study, change or test a program written by someone else means being able to interact with the author as a peer, something which demands that you first "prove yourself".

Is this always a piece of cake? Not necessarily!

In part, this is because computer programs are mostly very complex, so developing an understanding of them by reading source code and technical documentation is almost never an easy task.

But in part it is also because source code is often written cryptically, with the author attempting to prove to his peers his prowess in using a specific computer language or piece of hardware. This attitude goes way beyond the attempt to write efficient and idiomatic (well-styled) code, and the inevitable resulting obfuscation conspires to make access to code possible-but-really-painful.

Most visible open-source activists and coders are men, and the very idea of "open source software" developed within a distinctly patriarchal frame. So the "I'm the number one hacker in the world" mindset behind some open source projects comes as no surprise.

I wonder whether open source code would be easier to understand if more women participated in its development. But I'm still waiting for an answer, and I guess I'll wait a long while more, at least if we consider the low and declining number of women programmers...

When open source means not-really-open-but-I-would-like-you-to-think-so

Male chauvinism is not the only factor adversely affecting some open source projects. A shameless profit mentality is another.

To give an example I know well, there is the case of a famous instrument manufacturer (always name the sin, never the sinner) who hired (yes!) a programmer to develop an "open source" program to perform computations on ultrasonic anemometer data.

The resulting open-source project ended in a set of source files (beautifully written, I have to admit) which the manufacturer then posted on their web site for anyone to download (provided they gave plenty of personal data). This happened some years ago. Since then, no changes and no improvements have been made to the code. Evolution = 0. This made some people claim, with involuntary humor, that "the processing of ultrasonic anemometer data has reached its final maturity".

What the manufacturer was really interested in was imposing a standard way of processing: a way in which the only instrument really supported is the one they sell, of course. From an ethical standpoint I find this position really questionable, and far from the spirit of open source, even in its original patriarchal form.

Staying practical

Here comes our "experiment".

Our problem was that we had quite an amount of ultrasonic anemometer data. A real lot, collected in various campaigns that a few generations of students at the Physics department of Milan University carried out as part of their lab work or theses.

Initially, every student had to process these data individually, adapting some old programs to their specific needs. Modification after modification, we almost arrived at the point of losing track of what the program did.

Even worse, variants of these programs were used by various research institutions in Italy and abroad. As needs were never exactly the same, a lot of divergence accumulated. Really, it was like witnessing the spread and quick evolution of a new viral quasi-species. Appalling (especially to me, who was supposed to know and maintain all these variants).

A shared problem we all (students, researchers, myself) suffered was that no one seemed to really know what the code did. Sure, the source was "openly available". But the patch-and-spaghetti-like changes had made it incomprehensible. To make things worse, if possible, there was an original sin: the programmer who wrote the first embryo of the code was (and still is) a real genius in atmospheric physics who did his best to pass as a genius in programming, while completely lacking the patience and love for detail necessary to make a computer program really work. So a flow of corrections, in addition to modifications, had to be made.

Now, imagine the scene. You are a researcher writing a paper, or a student assembling your thesis. The most important item in your toolbox is a program whose workings you have no real idea of. Your scientific reputation depends in large part on the correctness and soundness of the computations you have done. Your paper's referees will have little chance to verify them (maybe just because the gigabytes of data you worked on are not transportable), so it may well be that any error you made will be discovered in the future, and maybe even mislabeled as intentional fraud.

Quite an appalling prospect, isn't it?

And suppose you use the processing programs as a teaching tool. I still remember when our professor of operations research stumbled when entering the lecture room, so that the five sheets of his proof of the convergence of the conjugate gradient method got mixed up: heroically, he collected them (they were not numbered) and tried to explain their contents to us, in the wrong order. Use a badly written program for a similar purpose, and the result is likely to be the same (unless the students are all first-class geniuses, something to hope for but not so likely).

To overcome all these and other troubles, we decided to rewrite the code from scratch in a collective effort. Before doing so we agreed on a few points:

1) Code should be as short and as easy to understand as possible. Fewer frills = fewer opportunities for something to go awry.

2) No attempt will be made to write code which is "optimal but hard to understand". After all, computers are fast, and will become even faster in the future.

3) Understanding and constructive criticism have to be possible not only for physicists and mathematicians, but also for biologists, chemists, agronomists and the many others who routinely process ultrasonic anemometer data.

4) The computer language in which the library is written must itself be open-source and easy to access even for users who are not computer scientists or technicians.

5) Of no less importance, the code shall be designed with teaching in mind. Comments will have not only to explain the technicalities to an expert audience, but also to tell the truth about the "whys" and the assumptions behind them.

I have to say: hardly any ideology, and much pragmatism.

The computer language we chose was R. The reasons:
- It is easily accessible.
- OK, in the beginning it may seem a bit unlike Fortran (the beloved language of the pre-Jurassic part of us, me included), but it is quite easy to learn, at least in its fundamentals.
- It is useful not only for ultrasonic anemometry, but also for an immense number of other applications (just thinking of the academic and research work of my department).
- Code written in it tends to be very concise, and visually conveys the physics: all the algorithmic nitty-gritty is hidden behind the language.

It was a huge success. And it still is. Since its inception, the SonicLib library has been used in labs, theses and scientific papers. More importantly, it has evolved, with students making a huge, visible contribution.

A mentality of our own?

Right from the beginning we labeled the project as open source and published it on a web site, for better accessibility.

But our demography and intentions made it quite an anomalous open source project.

Demography, indeed, counted. While most open-source projects are almost exclusively male domains, in the SonicLib project the composition is more balanced, with a strong female representation.

This raised various specific concerns which a male-dominated enterprise would have left invisible. For example, rules of involvement. Or practices to govern multitasking and make it work. Or a specific, strong emphasis on connection and inclusiveness.

This was not something anticipated in the beginning. It evolved this way thanks to the mindsets prevailing in the team. Being mostly teachers, the core people focus more on making students and users understand than on proving their mental prowess.

Once a year the SonicLib team meets, as in any serious open source project. The next meeting will be held in Bari in January 2014. The last meeting, like all other formal and less-than-formal encounters, took place in a climate unlike the usual open source meeting atmosphere, which is often boring or flaming. It is quite normal for chocolate candies to be shared among participants, or for flowers to sit on desks alongside the computers. The whole atmosphere is friendly, one of "attentive listening". Until now, no conflict has evolved to the point of threatening the group's harmony.

This is both nice and highly functional: it allows people to formulate ideas, even strange ones, with little concern for judgment (something quite rare in the environment of physicists and mathematicians).

Student involvement = growth

Physicists are a variety of large predator. As such, they go through a long training before they can catch prey on their own. Before that, they are not fundamentally different from any other kittens: they have to build their own self-confidence (and be assisted in this passage with a motherly attitude, sort of).

We realized that involving them in the evolution of an important open-source project, with their names visible and remembered, is an important self-confidence builder. For many students, participating in a lab is the first time they "do" science instead of just reading about it, and for them knowing their contribution will actually be used and perused is a stimulus to do well.

In the process, developing a "real" scientific work makes many of them enthusiastic about physics, something not evident at the beginning of a typical student career, when learning is mostly passive.

An ongoing experiment

For me, involvement in the SonicLib project has been a welcome, unanticipated honor. But it soon proved a place where I too can make a contribution (as a programmer I'm quite shy and soft-spoken, and I find the average environment of open-source projects quite intimidating).

With time, I realized it is one of the few projects in the current world which is not informed by a patriarchal style. It might even be an example of a "feminist" project. What's sure is that its underlying mindset is as different from the mainstream as day from night.

It is evolving, indeed quite fast. Since its beginning, it has attempted to go beyond the traditional boundaries of "eddy covariance" and other difficult names, to address in specific and immediate terms the needs of real-world users in fields including air quality, wind power, civil engineering, farming, soil decontamination, ecology, and more. This is thanks to the practical, pragmatic mentality shared by the team.

As it evolves, it is also gaining momentum rather fast. It is not impossible that it will become famous, sooner or later.

It certainly suggests that different, more humane ways to develop useful and solid open source code exist, beyond the mainstream.

