LWN.net Weekly Edition for March 28, 2013

StatusNet, Identi.ca, and transitioning to pump.io

By Nathan Willis
March 27, 2013

Evan Prodromou surprised a number of free software microbloggers in December 2012 when he announced that he would be closing down Status.Net, the "Twitter-like" software service he launched in 2008, in favor of his new project, pump.io. But Status.Net's flagship site, Identi.ca, has grown into a popular social-networking hub for the free and open source software community, and a number of Identi.ca users took the announcement to mean that Identi.ca would disappear, much to the community's detriment. Prodromou has reassured users that Identi.ca will live on, though it will move from StatusNet (the software package, as distinguished from Status.Net, the company) over to pump.io. Since then, pump.io has rolled out to some test sites, but it is still in heavy development and remains something of an unknown quantity to users.

Prodromou has some markedly different goals in mind for pump.io. The underlying protocol is different, but more importantly, StatusNet never quite reached its original goal of becoming a decentralized, multi-site platform—instead, the debut site Identi.ca was quickly branded as an open source "Twitter replacement." That misconception hampered StatusNet's adoption as a federated solution, putting the bulk of the emphasis on Identi.ca as the sole destination, with relatively few independent StatusNet sites. The pump.io rollout is progressing more slowly than StatusNet's, but that strategy is designed to avoid some of the problems encountered by StatusNet and Identi.ca.

The December announcement started off by saying that Status.Net would stop registering new hosted sites (e.g., foo.status.net) and was discontinuing its "premium" commercial services. The software itself would remain available, and site maintainers would be able to download the full contents of their databases. Evidently, the announcement concerned a number of Identi.ca users, though, because Prodromou posted a follow-up in January, reassuring users that the Identi.ca site would remain operational.

But there were changes afoot. The January post indicated that Identi.ca would be migrated over to run on pump.io (which necessarily would involve some changes in the feature set, given that it was not the same platform), and that all accounts which had been active in the past year would be moved, but that at some point no new registrations would be accepted.

Indeed, Identi.ca stopped accepting new user registrations on March 26. The shutdown of new registrations was timed so that new users could be redirected to one of several free, public pump.io sites instead. Visiting http://pump.io/tryit.html redirects the browser to a randomly-selected pump.io site, currently chosen from a pool of ten. Users can set up an account on one of the public servers, but getting used to pump.io may take some time, since it presents a distinctly different experience from the Twitter-like StatusNet.

What is pump.io anyway?

At its core, StatusNet was designed as an implementation of the OStatus microblogging standard. An OStatus server produces an Atom feed of status-update messages, which are pushed to subscribers using PubSubHubbub. Replies to status updates are sent using the Salmon protocol, while the other features of Twitter-like microblogging, such as follower/following relationships and "favoriting" posts, are implemented as Activity Streams.

The system is straightforward enough, but with a little contemplation it becomes obvious that the 140-character limit inherited from Twitter is a completely artificial constraint. StatusNet did evolve to support longer messages, but ultimately there is no reason why the same software could not deliver pictures à la Pinterest or Instagram, too, or handle other types of Activity Stream.

And that is essentially what pump.io is: a general-purpose Activity Streams engine. It diverges from OStatus in a few other respects, of course, such as sending activity messages as JSON rather than as Atom, and by defining a simple REST inbox API instead of using PubSubHubbub and Salmon to push messages to other servers. Pump.io also uses a new database abstraction layer called Databank, which has drivers for a variety of NoSQL databases, but supports real relational databases, too. StatusNet, in contrast, was bound closely to MySQL. But, in the end, the important thing is the feature set: a pump.io instance can generate a microblogging feed, an image stream, or essentially any other type of feed. Activity Streams defines actions (which are called "verbs") that handle common social networking interactions; pump.io merely sends and receives them.

The code is available at GitHub; the wiki explains that the server currently understands a subset of Activity Streams verbs that describe common social networking actions: follow, stop-following, like, unlike, post, update, and so on. However, pump.io will process any properly-formatted Activity Streams message, which means that application authors can write interoperable software simply by sending compliant JSON objects. There is an example of this as well: a Facebook-like farming game called Open Farm Game. The game produces messages with its own set of verbs (for planting, watering, and harvesting crops); the pump.io test sites will consume and display these messages in the user's feed with no additional configuration.
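To make the idea of "compliant JSON objects" concrete, here is a minimal sketch of how an application might build such messages in Python. The field names (actor, verb, object, objectType, content, published) follow the Activity Streams 1.0 JSON format; the helper name make_activity, the account identifier, and the "plant" verb and "crop" object type are hypothetical stand-ins, not what Open Farm Game or pump.io actually use. Nothing is sent over the network.

```python
import json
from datetime import datetime, timezone


def make_activity(actor_id, verb, object_type, content):
    """Build a dict shaped like an Activity Streams 1.0 activity.

    make_activity is a hypothetical helper, not part of pump.io.
    """
    return {
        "actor": {"objectType": "person", "id": actor_id},
        "verb": verb,
        "object": {"objectType": object_type, "content": content},
        "published": datetime.now(timezone.utc).isoformat(),
    }


# A standard verb of the kind the server is said to understand already.
note = make_activity("acct:alice@example.net", "post", "note", "Hello, pump.io")

# A made-up application-specific verb, in the spirit of the farming game.
planting = make_activity("acct:alice@example.net", "plant", "crop", "corn")

print(json.dumps(note, indent=2))
print(json.dumps(planting, indent=2))
```

Both messages share the same envelope; only the verb and the object differ, which is why a server that merely relays activities needs no special knowledge of the game.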

The pump.io documentation outlines the other primitives understood by the server—such as the predefined objects (messages, images, users, collections, etc.) on which the verbs can act, and the API endpoints (such as the per-user inbox and outbox). Currently, the demo servers allow users to send status updates, post images, like or favorite posts, and reply to updates. Users on the demo servers can follow one another, although at the moment the UI to do so is decidedly unintuitive (one must visit the other user's page and click on the "Log in" link; only then does a "Follow" button become visible). But Prodromou said in an email that more is still to come.

For those users and developers who genuinely prefer StatusNet, the good news is that the software will indeed live on. There are currently two actively-developed forks, GNU social and Free & Social. Prodromou said there was a strong possibility the two would merge, although there will be a public announcement with all of the details when and if that happens.

Where to now?

Pump.io itself (and its web interface) are the focus of development, but they are not the whole story. Prodromou is keen to avoid the situation encountered at the StatusNet launch, where the vast majority of new users joined the first demo site (Identi.ca), and it became its own social network, which ended up consuming a significant portion of StatusNet's company resources. Directing new registrations to a randomly-selected pump.io service is one tactic to mitigate the risk; another is intentionally limiting what pump.io itself will do.

For instance, while StatusNet could be linked to Twitter or other services via server-side plugins, pump.io will rely on third-party applications for bridging to other services. Prodromou cited TwitterFeed and IFTTT as examples. "My hope is that hackers find pump.io fun to develop for," he said, "and that they can 'scratch an itch' with cool bridges and other apps." The narrow scope of pump.io also means that a pump.io service only serves up per-user content; that is to say, each user has an activity stream outbox and an inbox consisting of the activities the user follows, but there is no site-wide "public" stream—no tag feeds, no "popular notices."

That may frustrate Identi.ca users at the beginning, Prodromou says, but he reiterates that the goal is to make such second-tier services easy for others to develop and deploy, by focusing on the core pump.io API. For example, the pump.io sites forward all messages marked as "public" to the ofirehose.com site; any developer could subscribe to this "fire hose" feed and do something interesting with it. Ultimately, Prodromou said, he hopes to de-emphasize the importance of "sites" as entities, in favor of users. Users do not care much about SMTP servers, he said; they care about the emails sent and received, not about enumerating all of the accounts on the server.

That is true in the SMTP world (one might argue that the only people who care to enumerate the user accounts on a server probably have nefarious goals in mind), but it does present some practical problems in social networking. Finding other users and searching (both on message content and on metadata) have yet to be solved in pump.io. Prodromou said he is working on "find your friend" sites for popular services (like Facebook and Twitter) where users already have accounts, but that search will be trickier.

Identi.ca and other things in the future

Eventually, the plan is for Identi.ca to become just one more pump.io service among many; the decentralization will mean it is no harder to follow users on another pump.io server or to carry on a conversation across several servers than it is to interact with others on a monolithic site like Twitter. But getting to that future will place a heavier burden on the client applications, be they mobile, web-based, or desktop.

Prodromou has not set out a firm timeline for the process; he is working on the pump.io web application (which itself should be mobile-friendly HTML5) and simple apps for iOS and Android. In the medium term, the number of public pump.io sites is slated to ramp up from ten to 15 or 20. But at some point Prodromou will start directing new registrations to a free Platform-as-a-Service (PaaS) provider that offers pump.io as a one-click-install instead (AppFog and OpenShift were both mentioned, but only as hypothetical examples).

Where pump.io goes from there is hard to predict. Prodromou is focused on building a product developers will like; he deliberately chose the permissive Apache 2.0 license over the AGPL because the Node.js and JavaScript development communities prefer it, he said. Applications, aggregation, and PaaS delivery are in other people's hands, but that is evidently what he wants. As he explained it, running Status.Net took considerable resources (both human and server) to manage hosted instances and public services like Identi.ca, which slowed down development of the software itself. "I want to get out of the business of operating social networking sites and into the business of writing social networking software."

At some point in the next few months, Identi.ca will switch over from delivering OStatus with StatusNet to running pump.io. That will be a real watershed moment; as any social-networking theorist will tell you, the value of a particular site is measured by the community that uses it, not the software underneath. Identi.ca has grown into a valued social-networking hub for the free software community; hopefully that user community survives the changeover, even if it takes a while to find its bearings again on the new software platform.

Protecting communities

By Jonathan Corbet
March 27, 2013

The Wayland project, which seeks to design and implement next-generation display management for Linux and beyond, does not lack for challenges. The project is competing with a well-established system (the X Window System) that was written by many of the same developers. It is short of developers, and often seems to have a hard time communicating its reasons for existence and goals to a somewhat skeptical community. Canonical decided to create its own display manager for Ubuntu rather than work to help improve Wayland, and Android has yet another solution of its own. About the only thing the project lacked was a fork and internal fighting, until now. The story behind this episode merits a look at an example of the challenges involved in keeping a development community healthy.

Scott Moreau is an established contributor to both Wayland (the protocol definition and implementation) and Weston (the reference compositor implementation for Wayland). A quick search of the project's repositories shows that he contributed 84 changes to the project since the beginning of 2012 — about 2% of the total. Until recently, he was an active and often helpful presence on the project's mailing lists. So it might come as a surprise to learn that Scott was recently banned from the Wayland IRC channel and, subsequently, the project's mailing list. A simple reading of the story might suggest that the project kicked him out for creating his own fork of the code; when one looks closer, though, the story appears to be even simpler than that.

Last October, Wayland project leader Kristian Høgsberg suggested that it might be time to add a "next" branch to the Weston repository for new feature development. He listed a few patches that could go there, including "Scott's minimize etc work." Scott responded favorably at the time, but suggested that Wayland, too, could use a "next" branch. It does not appear that any such branch was created in the official repositories, though. So, for some months, the idea of a playground repository for new features remained unimplemented.

In mid-March 2013, Scott announced the creation of staging repositories for both Wayland and Weston, and started responding to patch postings with statements that they had been merged into "gh next". Two days later, he complained that "Kristian has expressed no interest in the gh next series or the benefits that it might provide" and that Kristian had not merged his latest patches. He also let it be known that he thought that Weston could be developed into a full desktop environment — a goal the Wayland developers, who are busy enough just getting the display manager implemented properly, do not share.

The series of messages continued with this lengthy posting comparing the "gh next" work with the Compiz window manager and its Beryl fork, claiming that, after the two projects merged back together, most of the interesting development had come from the Beryl side. Similarly, Scott intends "gh next" to be a place where developers can experiment with shiny new features, the best of which can eventually be merged back into the Wayland and Weston repositories. Scott's desire to "run ahead" is seen as a distraction by many Wayland developers who would rather focus on delivering a solid platform first, but that is not where the real discord lies.

There was, for example, a certain amount of disagreement with Scott's interpretation of the Compiz story. More importantly, he was asked to, if possible, avoid forking Wayland and making incompatible protocol changes that would be hard to integrate later. When Scott was shown how his changes could be made in a more cooperative manner, he responded "This sounds great but this is not the solution I have come up with." Meanwhile, the lengthy missives to the mailing list continued. And, evidently, he continued a pattern of behavior on the project's IRC channel that fell somewhere between "unpleasant" and "abusive." Things reached a point where other Wayland developers were quite vocal about their unwillingness to deal with Scott.

What developers in the project are saying now is that the fork had nothing to do with Scott's banishment from the Wayland project. Even his plans to make incompatible changes could have been overlooked, and his eventual results judged on their merits when the time came. But behavior that made it hard for everybody else to get their work done was not something that the project could accept.

There is no point in trying to second-guess the project's leadership here with regard to whether Scott is the sort of "poisonous person" that needs to be excluded from a development community. But there can be no doubt that such people can, indeed, have a detrimental effect on how a community works. When a community's communication channels turn unpleasant or abusive, most people who do not have a strong desire to be there will find somewhere else to be — and a different project to work on. Functioning communities are fragile things; they cannot take that kind of stress indefinitely.

Did this community truly need to expel one of its members as an act of self-preservation? Expulsion is not an act without cost; Wayland has, in this case, lost an enthusiastic contributor. So such actions are not to be taken lightly; the good news is that our community cannot be accused of doing that. But, as long as our communities are made up of humans, we will have difficult interactions to deal with. So stories like those outlined above will be heard again in the future.

PyCon: Evangelizing Python

By Jake Edge
March 27, 2013

Python core developer Raymond Hettinger's PyCon 2013 keynote had elements of a revival meeting sermon, but it was also meant to spread the "religion" well beyond those inside the meeting tent. Hettinger specifically tasked attendees with using his "What makes Python awesome?" talk as a sales tool with management and other Python skeptics. While he may have used the word "awesome" a few too many times in the talk, Hettinger is clearly an excellent advocate of the language from a technical, not just cheerleading, perspective.

He started the talk by noting that he teaches "Python 140 characters at a time" on Twitter (@raymondh). He has been a core developer for twelve years, working on builtins, the standard library, and a few core language features. For the last year and a half, Hettinger has had a chance to "teach a lot of people Python". Teaching has given him a perspective on what is good and bad in Python.

Context for success

Python has a "context for success", he said, starting with its license. He and many others would never have heard of Python if it were not available under an open source license. It is also important for a "serious language" to have commercial distributions and the support that comes with those.

Python also has a "Zen", he said, which is also true of some other languages, like Ruby, but "C++ does not have Zen". Community is another area where Python excels. "C is a wonderful language", but it doesn't have a community, Hettinger said.

The PyPI repository for Python modules and packages is another important piece of the puzzle. Python also has a "killer app", in fact it has more than one. Zope, Django, and pandas are all killer apps, he said.

Windows support is another important attribute of Python. While many in the audience may be "Linux weenies" and look down on Windows users, most of the computers in the world are running Windows, so it is important for Python to run there too, he said. There are lots of Python books available, unlike some other languages. Hettinger is interested in Go, but there aren't many books on that language.

All of these attributes make up a context for success, and any language that has them is poised to succeed. But, he asked, why is he talking about the good points of Python at PyCon, where everyone there is likely to already know much of what he is saying? It is because attendees will often be in a position to recommend or defend Python. Hettinger's goal is for attendees to be able to articulate what is special about the language.

High-level qualities

The Python language itself has certain qualities that make it special, he said, starting with "ease of learning". He noted that David Beazley runs classes where students are able to write "amazing code" by the end of the second day. One of the exercises in those classes is to write a web log summarizing tool, which shows how quickly non-programmers can learn Python.

Python allows for a rapid development cycle as well. Hettinger used to work at a high-frequency trading company that could come up with a trading strategy in the morning and be using it by the afternoon because of Python. Though he was a good Java programmer, he could never get that kind of rapid turnaround using Java.

Readability and beauty in a language are important, he said, because they mean that programmers will want to program in the language. Python programmers will write code on evenings and weekends, but "I never code C++ on the weekend" because it is "not fun, not beautiful". Python is both, he said.

The "batteries included" philosophy of Python, where the standard library is part of the language, is another important quality. Finally, one of Hettinger's favorite Python qualities is the protocols that it defines, such as the database and WSGI protocols. The database protocol means that you can swap out the underlying database system, switching to or from MySQL, Oracle, or PostgreSQL without changing the code to access the database. Once you know how to access one of them through Python, you know how to access them all.
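The database protocol referred to here is the Python DB-API (PEP 249). As a small illustration, the sketch below uses the standard library's sqlite3 module, but the function total_value relies only on the cursor, execute, and fetch calls that DB-API drivers share. The table name port and the sample rows are invented for the example. One caveat: drivers may differ in their parameter placeholder style (sqlite3 uses ?), so parameterized queries sometimes need adjusting when switching databases.

```python
import sqlite3


def total_value(conn):
    """Sum shares * price using only generic DB-API calls."""
    cur = conn.cursor()
    cur.execute("SELECT SUM(shares * price) FROM port")
    (total,) = cur.fetchone()
    cur.close()
    return total


# Another DB-API driver's connect() could supply the connection instead;
# sqlite3 is used because it ships with Python and needs no server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE port (symbol TEXT, shares INTEGER, price REAL)")
cur.executemany(
    "INSERT INTO port VALUES (?, ?, ?)",  # '?' is sqlite3's placeholder style
    [("AAA", 10, 2.5), ("BBB", 4, 10.0)],
)
conn.commit()

print(total_value(conn))
```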

As an example of the expressiveness and development speed of the language, Hettinger put up a slide with a short program. In a class he was teaching, someone asked how he would deduplicate a disk full of photos, and in five minutes he was able to come up with a fifteen-line program to do so. It is a real testament to the language that he could write that program live in class, but even more importantly, he can teach others to do the same. That one slide shows "a killer feature of the language: its productivity, and its beauty and brevity", he said.
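The slide itself is not reproduced in this report, so the following is only a sketch of how such a deduplication script might look, not Hettinger's program. It assumes that "duplicate" means byte-for-byte identical files and groups paths by a SHA-256 digest of their contents. The function name find_duplicates and the sample files are hypothetical.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def find_duplicates(root):
    """Map content digest -> paths, keeping only digests seen more than once."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                # Reads the whole file at once; fine for a sketch.
                digest = hashlib.sha256(f.read()).hexdigest()
            by_digest[digest].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}


# Demonstrate on a throwaway directory with two identical "photos".
with tempfile.TemporaryDirectory() as root:
    samples = [("a.jpg", b"sunset"), ("b.jpg", b"sunset"), ("c.jpg", b"beach")]
    for name, data in samples:
        with open(os.path.join(root, name), "wb") as f:
            f.write(data)
    dupes = find_duplicates(root)
    groups = [sorted(os.path.basename(p) for p in paths)
              for paths in dupes.values()]

print(groups)
```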

But there is a problem with that example. A similar slide could be created for Ruby or Perl, with roughly the same brevity. That would be evidence for the "all scripting languages are basically the same, just with different syntax" argument that he hears frequently from software executives. But all scripting languages are not the same, he said. That may have been true in 2000, but "we've grown since then"; there are lots of features that separate Python from the pack.

Winning language features

First up on Hettinger's list of "winning language features" is the required indentation of the language. It was an "audacious move" to make that choice for the language, but it contributes to the "clean, uncluttered" appearance of the code. He claimed that Python was the first to use indentation that way, though he later received a "Miranda warning" from an audience member as the Miranda language uses indentation and predates Python. People new to the language sometimes react negatively to the forced indentation, but it is a net positive. He showed some standard examples of where C programs can go wrong because the indentation doesn't actually match the control flow, which is impossible with Python. Python "never lies with its visual appearance", which is a winning feature, he said.

The iterator protocol is one of his favorite parts of the language. It is a "design pattern" that can be replicated in languages like Java and C++, but it is "effortless to use" in Python. The yield statement can create iterators everywhere. Because iterators are so deeply wired into the language, they can be used somewhat like Unix pipes. So the shell construct:

    cat filename | sort | uniq

can be expressed similarly in Python as:

    sorted(set(open(filename)))

This shows how iterators can be used as composable filters. In addition, Python has a level of expressiveness that is similar to SQL, so:

    sum(shares*price for symbol, shares, price in port)

will sum the number of shares times the price for all of the entries in port, which is much like the SQL equivalent:

    SELECT SUM(shares*price) FROM port;

Languages that don't have for loops that are as powerful as Python's cannot really compete, he said.
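A runnable version of the two comparisons above, using in-memory data so that it does not depend on any particular file. The sample text and the contents of port are made up; io.StringIO stands in for open(filename) because iterating over either one yields lines.

```python
import io

# Stand-in for open(filename): iterating over either one yields lines.
text = io.StringIO("pear\napple\npear\nfig\napple\n")

# Equivalent in spirit to: cat filename | sort | uniq
unique_sorted = sorted(set(text))
print(unique_sorted)

# Each row is (symbol, shares, price), matching the unpacking in the
# generator expression.
port = [("AAA", 10, 2.5), ("BBB", 4, 10.0)]
total = sum(shares * price for symbol, shares, price in port)
print(total)
```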

One of his favorite things to teach about Python is list comprehensions. The idea came from mathematical set-builder notation. They "profoundly improve the expressiveness of Python", Hettinger said. While list comprehensions might at first appear to violate the "don't put too much on one line" advice given to new programmers, they are actually a way to build up a higher-level view. The examples he gave can fairly easily be expressed as natural language sentences:

    [line.lower() for line in open(filename) if 'INFO' in line]

which creates a list of lower-cased lines that contain "INFO". The second seems directly derived from math notation:

    sum([x**3 for x in range(10000)])

which sums a list of the cubes of the first 10,000 integers (starting at zero). Since list comprehensions can generally be expressed as single sentences, it is reasonable to write them that way in Python.
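Both comprehensions can be tried without a log file. In the sketch below, the list log is invented sample data standing in for open(filename), and the sum of cubes is checked against the closed form (n(n-1)/2)² for the cubes of 0 through n-1.

```python
# Sample log lines standing in for open(filename).
log = [
    "INFO Server Started\n",
    "DEBUG cache miss\n",
    "INFO Request Served\n",
]

# "The lower-cased form of each line in the log that contains INFO."
info_lines = [line.lower() for line in log if 'INFO' in line]
print(info_lines)

# "The sum of the cubes of x for x from 0 to 9999."
n = 10000
cubes_total = sum([x**3 for x in range(n)])

# Cross-check against the closed form for the sum of cubes of 0..n-1.
assert cubes_total == (n * (n - 1) // 2) ** 2
print(cubes_total)
```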

The generators feature is a "masterpiece" that was stolen from the Icon language. Now that Python has generators, other languages are adding them as well. Generators allow Python functions to "freeze their execution" at a particular point and to resume execution later. Using generators makes both iterators and coroutines easier to implement in a "clean, readable, beautiful" form. Doing things that way is something that Python has "that others don't". His simple example showed some of the power of the feature:

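    FORMFEED = '\f'

    def pager(lines, pagelen=60):
        # Count lines from 1 so that a form feed follows every pagelen-th
        # line rather than the very first one.
        for lineno, line in enumerate(lines, 1):
            yield line
            if lineno % pagelen == 0:
                yield FORMFEED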
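To see the "freeze and resume" behavior, the sketch below drives a pager by hand with next(). It is a self-contained variant of the example: FORMFEED is assumed to be the ASCII form-feed character, the page length is shrunk to 2 to keep the output short, and lines are counted from 1 so that the form feed follows every pagelen-th line.

```python
FORMFEED = '\f'  # assumed value; the snippet in the article does not define it


def pager(lines, pagelen=60):
    # Count from 1 so a form feed follows every pagelen-th line.
    for lineno, line in enumerate(lines, 1):
        yield line
        if lineno % pagelen == 0:
            yield FORMFEED


pages = pager(['a', 'b', 'c'], pagelen=2)
first = next(pages)    # runs until the first yield, then freezes
second = next(pages)   # resumes exactly where it left off
rest = list(pages)     # drains whatever is left

print(repr(first), repr(second), rest)
```

Nothing inside pager runs until the first next() call, and the loop variables survive between calls, which is what makes the function usable as a lazy filter over arbitrarily long input.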

Generator expressions come from Hettinger's idea of combining generators and list comprehensions. Rather than requiring the creation of a list, generators can be used in expressions directly:

    sum(x**3 for x in range(10000))

From that idea, dictionary and set comprehensions are obvious extensions, he said. Generator expressions are one way to combat performance problems in Python code because they have a small memory footprint and are thus cache friendlier, he said.
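The memory claim can be observed directly, at least roughly. The sketch below compares the size of a list object with that of a generator object using sys.getsizeof, which measures only the container itself and whose exact numbers depend on the interpreter, so no specific figures are quoted. It also shows the dictionary and set comprehension forms; the variable names are arbitrary.

```python
import sys

n = 10000
as_list = [x**3 for x in range(n)]   # builds all 10,000 values up front
as_gen = (x**3 for x in range(n))    # produces values one at a time

list_size = sys.getsizeof(as_list)   # size of the list object itself
gen_size = sys.getsizeof(as_gen)     # size of the generator object

print(list_size, gen_size)

# Both forms feed sum() the same values.
assert sum(as_list) == sum(as_gen)

# The same syntax extends to dictionaries and sets.
squares_by_n = {x: x**2 for x in range(5)}
remainders = {x % 3 for x in range(10)}
print(squares_by_n, remainders)
```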

But generators have a problem: they are a "bad date". Like a date that can only talk about themselves, generators can only talk, not listen. That led to the idea of two-way generators. Now generators can accept inputs in the form of send(), throw(), and close() methods. It is a feature that is unique to Python, he said, and is useful for implementing coroutines. It also helps "tame" some of the constructs in Twisted.
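A common illustration of a generator that "listens" is a running average. This is a generic sketch, not an example from the talk: the value passed to send() becomes the result of the yield expression inside the generator, and close() shuts it down. The name averager is arbitrary.

```python
def averager():
    """Coroutine-style generator: receives numbers, yields the running mean."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # hand out the average, then wait for input
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)               # advance to the first yield before sending anything
a1 = avg.send(10)       # the generator resumes with value == 10
a2 = avg.send(20)       # state (total, count) persisted between calls
avg.close()             # raises GeneratorExit inside the generator

print(a1, a2)
```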

Decorators have an interesting history in Python. They don't really add new functionality that can't be done other ways, so the first few times they were proposed, they were turned down. But they kept being proposed, so Guido van Rossum (Python's benevolent dictator for life) used a tried and true strategy to make the problem go away: he said that if everyone could agree on a syntax for decorators, he would consider adding them. For the first time ever, the entire community came together and agreed on a syntax. It presented that agreement to Van Rossum, who agreed: "you shall have decorators, but not the syntax you asked for".

In retrospect, the resistance to decorators (from Van Rossum and other core developers) was wrong, Hettinger said, as they have turned out to be a "profound improvement to the language". He pointed to the lightweight web frameworks (naming itty, Flask, and CherryPy) as examples of how decorators can be used to create simple web applications. His one slide example of an itty-based web service uses decorators for routing. Each new service is usually a matter of adding three lines or so:

    @get('/freespace')
    def compute_free_disk_space(request):
        return subprocess.check_output('df')

The code above creates a page at /freespace that runs df and returns its output as a web page.
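How a decorator can do routing is easy to show without any framework. The sketch below is not itty's implementation; it is a minimal stand-in in which a hypothetical get decorator records each function in a dictionary keyed by path, and a hypothetical dispatch function looks the handler up.

```python
ROUTES = {}  # path -> handler function


def get(path):
    """Return a decorator that registers the function under the given path."""
    def register(func):
        ROUTES[path] = func
        return func          # the function itself is left unchanged
    return register


@get('/hello')
def hello(request):
    return 'Hello, world'


def dispatch(path, request=None):
    """Call the handler registered for path, or report that none exists."""
    handler = ROUTES.get(path)
    if handler is None:
        return '404 Not Found'
    return handler(request)


print(dispatch('/hello'))
print(dispatch('/missing'))
```

The decorator line is just shorthand for hello = get('/hello')(hello), which is why decorators add no new capability yet make this registration pattern much easier to read.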

"Who's digging Python now?", he asked with a big grin, as he did in spots throughout the talk—to much applause. The features he had mentioned are reasons to pick Python over languages like Ruby, he said. While back in 2000, Python may have been the equivalent of other scripting languages, that has clearly changed.

There are even more features that make Python compelling, such as the with statement. Hettinger thinks that "context managers" using with may turn out to be as important to programming as was the invention of the subroutine. The with statement is a tool for making code "clean and beautiful" by setting up a temporary context where the entry and exit conditions can be ensured (e.g. files closed or locks unlocked) without sprinkling try/finally blocks all over. Other languages have a with, but they are not at all the same as Python's. The best uses for it have not yet been discovered, he said, and suggested that audience members "prove to the world that they are awesome", so that other languages get them.
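As a small illustration of guaranteed entry and exit actions, the sketch below combines a lock with a hand-written context manager built using contextlib.contextmanager. The names logged and events are made up for the example; the point is that both exit actions run even though the body raises.

```python
import threading
from contextlib import contextmanager

events = []


@contextmanager
def logged(name):
    """Record entry and exit, even if the body raises."""
    events.append("enter " + name)
    try:
        yield
    finally:
        events.append("exit " + name)


lock = threading.Lock()

# The lock is released and the exit is logged despite the exception.
try:
    with lock, logged("update"):
        raise ValueError("something went wrong")
except ValueError:
    pass

print(events, lock.locked())
```

The try/finally is written once, inside the context manager, instead of at every place the resource is used.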

The last winning feature that he mentioned was one that he initially didn't want to be added: abstract base classes. Van Rossum had done six months of programming in Java and "came back" with abstract base classes. Hettinger has come to embrace them. Abstract base classes help clarify what a sequence or a mapping actually is by defining the interfaces used by those types. They are also useful for mixing in different classes to better organize programs and modules.

There is something odd that comes with abstract base classes, though. Python uses "duck typing", which means that using isinstance() is frowned upon. In fact, novice Python programmers spend their first six months adding isinstance() calls, he said, and then spend the next six months taking them back out.

With abstract base classes, there is an addition to the usual "looks like a duck, walks like a duck, quacks like a duck" test because isinstance() can lie. That leads to code that uses: "well, it said it was a duck, and that's good enough for me", he said with a laugh. He thought this was "incredibly weird", but it turns out there are some good use cases for the feature. He showed an example of using the collections.Set abstract base class to create a complete list-based set just by implementing a few basic operations. All of the normal set operations (subset and superset tests, set equality, etc.) are simply inherited from the base class.
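A list-based set along these lines can be sketched as follows. This version follows the pattern shown in the Python documentation rather than the slide from the talk, and it imports Set from collections.abc, which is where current Python versions keep the class that the article calls collections.Set. Only three methods are written by hand; the comparison and intersection operators come from the base class.

```python
from collections.abc import Set


class ListBasedSet(Set):
    """A set that stores its elements in a list, preserving first-seen order."""

    def __init__(self, iterable):
        self.elements = []
        for value in iterable:
            if value not in self.elements:
                self.elements.append(value)

    def __iter__(self):
        return iter(self.elements)

    def __contains__(self, value):
        return value in self.elements

    def __len__(self):
        return len(self.elements)


s1 = ListBasedSet('abcdef')
s2 = ListBasedSet('defghi')

overlap = s1 & s2                       # __and__ is inherited from Set
is_subset = ListBasedSet('ab') <= s1    # so are the ordering comparisons
same = ListBasedSet('ab') == ListBasedSet('ba')

print(list(overlap), is_subset, same, isinstance(s1, Set))
```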

Hettinger wrapped up his keynote with a request: "Please take this presentation and go be me". He suggested that attendees present it to explain what Python has that other languages are missing, and thus why Python should be chosen over a language like Ruby. He also had "one more thing" to note: the Python community has a lot of "established superstars" as well as "rising young superstars". Other languages have "one or two stars", he said, but Python has many; just one more thing that Python has that other languages don't.

Page editor: Jonathan Corbet

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds