IM Integration With XMPP4r

Some of you might be aware that I worked on a project called TimmyOnTime. It is a product that allows you to track your time using only your Instant Messaging application. How did we do it exactly? Where’s the code? Nice try! But I won’t tell you… however, I will help you get started with XMPP4r (XMPP for Ruby).

This will be a multipart article. You know, it wouldn’t be my style to talk about XMPP4r without talking about XMPP first. One thing at a time! So, this article is just about XMPP… no XMPP4r.

What is XMPP?

The eXtensible Messaging and Presence Protocol (XMPP, aka Jabber) is a protocol for exchanging messages between two entities. Those messages are transmitted over the wire as XML.


Unlike MSN and AIM, which are proprietary protocols/technologies, no one owns Jabber. You don’t log on to THE Jabber network like you log on to MSN. When you use a Jabber client like Pidgin, Google Talk (technically, GTalk is not a Jabber client… but we won’t go into this) or Gajim, you log on to SOME Jabber server. You could even set up your own Jabber server. It doesn’t matter, because these servers can talk to each other (but are not forced to… private Jabber servers are pretty common). Server-to-server communication is in fact one of the primary assets of the Jabber architecture.

Say I’m using Pidgin and my Jabber account is located on one of the many public Jabber servers. This will give me a JID (Jabber identifier) of the form node@server/resource.


So suppose the node part of my JID is “frank” and the resource part is “home”. The first part of the JID, “frank”, identifies the NODE or, in other words, a person. The second part, after the “@”, identifies the SERVER to contact in order to send messages to “frank”. The last part, “/home”, identifies the RESOURCE. Think of resources as “multiple identities”: Frank could in fact run two instances of Pidgin at the same time (one at home and the other at work), which would result in two different JIDs that differ only in their resource part.
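To make the three parts concrete, here is a minimal Ruby sketch that splits a JID into node, server and resource. It is plain string handling, not XMPP4r; the `parse_jid` helper, the `JidParts` struct, the `example.org` domain and the `work` resource are all made up for illustration.

```ruby
# Minimal JID splitter for the node@server/resource form.
# "example.org" and the "work" resource are placeholders, not real accounts.
JidParts = Struct.new(:node, :server, :resource)

def parse_jid(jid)
  # Everything after the first "/" is the resource (stored without the slash).
  bare, resource = jid.split("/", 2)
  # Everything before the first "@" is the node; a JID may have no node at all.
  if bare.include?("@")
    node, server = bare.split("@", 2)
  else
    node = nil
    server = bare
  end
  JidParts.new(node, server, resource)
end

home = parse_jid("frank@example.org/home")
work = parse_jid("frank@example.org/work")

puts home.node
puts home.server
puts home.resource
# Same person, same server, two different resources:
puts home.node == work.node && home.server == work.server
```

Real JIDs have more rules than this (allowed characters, length limits), so treat it as a reading aid rather than a validator.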

The XML messages: Streams and Stanzas

Like I said, Jabber messages are encoded in XML. A stream is a container for exchanging information between a client and a server (e.g. between Pidgin and the server hosting your account). It is simply an XML element called “stream” that starts with <stream> and ends with </stream>. Within the stream element, both the client and the server can exchange other kinds of XML messages (called stanzas). The communication is terminated when one of the two sides sends the closing </stream> element to the other. For example, if I close Pidgin, Pidgin will send the </stream> to the server and the session will be over.

One important thing to understand is that the stream DOES NOT exist between USER A and USER B… it exists between YOUR CLIENT (e.g. Pidgin) and the SERVER where YOUR Jabber account is located.

Stanzas are a special kind of XML message sent within a stream element. The three kinds of stanzas are:

  1. Message
  2. Presence
  3. IQ

A message stanza is used when USER A sends a message to USER B… you know, like “Frank… are you there?”. It may look like:

<stream>
<message to=''>
<body>Frank… are you there?</body>
</message>
</stream> <!-- Well… the closing stream element is not there if the session is still active! Everything happens in real time after all -->
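If you want to produce such a stanza yourself, a rough Ruby sketch could look like the following. It only builds a string (no network, no XMPP4r); `xml_escape`, `message_stanza` and the `frank@example.org` address are hypothetical names chosen for this example.

```ruby
# The five characters that must be escaped inside XML text and attributes.
XML_ESCAPES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  "'" => "&apos;",
  '"' => "&quot;"
}.freeze

# Escape user-supplied text so it cannot break the surrounding XML.
def xml_escape(text)
  text.gsub(/[&<>'"]/, XML_ESCAPES)
end

# Build a bare-bones message stanza as a string.
def message_stanza(to:, body:)
  "<message to='#{xml_escape(to)}'><body>#{xml_escape(body)}</body></message>"
end

# "frank@example.org" is a placeholder address.
puts message_stanza(to: "frank@example.org", body: "Frank... are you there?")
```

The escaping step matters: a body like "1 < 2" would otherwise produce broken XML and the server would most likely drop the stream.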

Let’s take a second to think about what happens exactly when a message is sent between two entities on two different servers. USER A sends the message through his IM client. The message stanza is sent to USER A’s server, which sees that the message is addressed to USER B on another server, so it forwards the message there. USER B’s server finally delivers the message stanza to USER B (provided that USER B and that server have established a session through a stream element). Yay!
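The hop-by-hop delivery described above can be mimicked with a toy model in Ruby. This is only a sketch of the routing decision, assuming an in-memory registry instead of real server-to-server connections; `ToyServer`, `a.example`, `b.example` and the JIDs are all invented for the example.

```ruby
# Toy model of routing a message between two servers.
# ToyServer and every domain/JID used here are hypothetical.
class ToyServer
  attr_reader :domain, :inboxes

  def initialize(domain, network)
    @domain = domain
    @network = network                        # shared registry: domain => ToyServer
    @inboxes = Hash.new { |h, k| h[k] = [] }  # recipient JID => delivered bodies
    network[domain] = self
  end

  # Entry point for stanzas coming from a local client or from another server.
  def route(to:, body:)
    # Drop the resource, then keep what follows the "@": that is the server part.
    target_domain = to.split("/", 2).first.split("@", 2).last
    if target_domain == @domain
      @inboxes[to] << body                    # local user: deliver directly
    else
      @network.fetch(target_domain).route(to: to, body: body)  # hand over to the other server
    end
  end
end

network = {}
server_a = ToyServer.new("a.example", network)
server_b = ToyServer.new("b.example", network)

# USER A's client hands the message to USER A's server, which forwards it.
server_a.route(to: "frank@b.example/home", body: "Frank... are you there?")
puts server_b.inboxes["frank@b.example/home"].inspect
```

Notice that USER A's server never touches USER B directly: it only decides whether the recipient is local or belongs to someone else's server.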

A presence stanza is sent when the status of a user changes (Idle, Offline, Available, Do not disturb, etc.).

IQ (Info / Query) stanzas are more general-purpose messages. They can be used when USER A wants to know something about USER B that cannot be obtained by sending a message or a presence stanza. For example, if USER A chooses the “get info” option for USER B, an IQ stanza will be sent to USER B, and USER B will answer USER A with another IQ stanza containing the information that USER A asked for.
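The request/answer pattern can be sketched in a few lines of Ruby. Real IQ stanzas are XML elements carrying a type (such as “get” or “result”) and an id that ties the answer to the question; here I model them as plain hashes. `build_iq_get`, `answer_iq`, the payload and the JIDs are assumptions made for this example only.

```ruby
# Toy IQ exchange: a "get" request answered by a "result" with the same id.
# Hashes stand in for the real XML stanzas; all names and JIDs are hypothetical.
def build_iq_get(from:, to:, id:)
  { type: "get", from: from, to: to, id: id }
end

def answer_iq(request, info)
  # The answer swaps from/to and echoes the id so the asker can match it up.
  {
    type: "result",
    from: request[:to],
    to: request[:from],
    id: request[:id],
    payload: info
  }
end

request = build_iq_get(from: "usera@a.example/home", to: "userb@b.example/work", id: "info1")
reply = answer_iq(request, { "nickname" => "Frank" })

puts reply[:type]
puts reply[:id] == request[:id]
```

The id is the important bit: several IQ requests can be in flight at the same time, and it is the only way to know which answer belongs to which question.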

That’s it for part 1… in the next part I think we’ll be ready to dive into XMPP4r. There is also one point that I didn’t address about XMPP… and that is how a session is secured and authenticated (TLS and SASL). So, the next part should talk about this as well.