07|13 Implementing HTTP Live Streaming

Apple's recently released HTTP Live Streaming provides a clear, concise method for streaming audio and video content to browsers and mobile devices without the need for proprietary formats or browser plugins (I'm looking at you, Adobe). Instead, Apple's spec relies upon already common and relatively open formats.

The reference for implementing this functionality is Apple's HTTP Live Streaming Overview. Please read this document before continuing. I'll wait.

Right, so the gist is that you need to serve an m3u8 playlist containing metadata and a pointer to a URL that serves a valid MPEG2 transport stream. As my preferred language is Python, and the general convention for creating web services in Python is to use WSGI (PEP 333), we'll start with a basic WSGI application.

from flup.server.fcgi import WSGIServer
from threading import Thread
from socket import socket
from select import select
from Queue import Queue
import re

class LiveHTTPServer(object):
	def __init__(self):
		urls = [
			(r'^/stream\.m3u8$', self.playlist),
			(r'^/stream\.ts$', self.stream),
		]
		# Compile each pattern once and pair it with its handler method.
		self.urls = [(re.compile(pattern), func) for pattern, func in urls]
		self.queues = []

	def __call__(self, environ, start_response):
		for pattern, func in self.urls:
			match = pattern.match(environ['PATH_INFO'])
			if match:
				return func(start_response, match)
		start_response('404 Not Found', [('Content-type', 'text/plain')])
		return ['404 Not Found']

So, there's not too much magic here. When the server object is instantiated, it compiles a list of regular expressions and maps them to instance methods. The WSGI server will call __call__, which attempts to match each of the regexes against the path of the request, calling the associated method if one matches or sending a 404 response if not. Note that Ian Bicking's WebOb library is a much simpler way to perform a lot of these tasks, but it doesn't provide an easy way to send chunked responses, which are necessary for the MPEG2 transport stream. Just ignore all the extra imports for now; we'll get to them in a minute.
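If you want to poke at the dispatch logic without a FastCGI server in front of it, a WSGI app can be called like any other function. Here's a minimal sketch, assuming Python 3; call_wsgi and demo_app are hypothetical names made up for this example, and demo_app is just a stand-in for the real application.

```python
def call_wsgi(app, path):
    """Call a WSGI app with a bare-bones environ; return (status, body)."""
    captured = {}

    def start_response(status, headers):
        # Remember the status line so the caller can inspect it.
        captured['status'] = status

    # Only PATH_INFO is set because that is all the dispatcher reads.
    chunks = app({'PATH_INFO': path}, start_response)
    return captured['status'], b''.join(chunks)


def demo_app(environ, start_response):
    """Hypothetical stand-in app: 200 for /stream.m3u8, 404 otherwise."""
    if environ['PATH_INFO'] == '/stream.m3u8':
        start_response('200 OK', [('Content-type', 'text/plain')])
        return [b'playlist']
    start_response('404 Not Found', [('Content-type', 'text/plain')])
    return [b'404 Not Found']
```

The same helper should work against a LiveHTTPServer instance for /stream.m3u8 and for unknown paths. Don't point it at /stream.ts, though: that response never ends, so joining the chunks would block forever.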

	def playlist(self, start_response, match):
		start_response('200 OK', [('Content-type', 'application/x-mpegURL')])
		return ['''#EXTM3U
#EXTINF:10,
http://video.example.org/stream.ts
#EXT-X-ENDLIST''']

This method implements the /stream.m3u8 response as required by Apple's spec. The M3U standard says that the EXTINF attribute should have a value of -1 for ongoing streams or those of unknown length, but the iPhone rejected the playlist when given anything other than a positive integer in this field.
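To keep the positive-integer restriction from being forgotten later, the playlist text can be built by a small helper that refuses anything else. This is only a sketch: build_playlist is a hypothetical name, and it produces the same four lines as the hard-coded string above.

```python
def build_playlist(stream_url, duration=10):
    """Return a one-entry M3U8 playlist pointing at stream_url."""
    duration = int(duration)
    if duration <= 0:
        # The iPhone rejected -1 here, so insist on a positive integer.
        raise ValueError('EXTINF duration must be a positive integer')
    lines = [
        '#EXTM3U',
        '#EXTINF:%d,' % duration,
        stream_url,
        '#EXT-X-ENDLIST',
    ]
    return '\n'.join(lines)
```

The playlist method could then return [build_playlist('http://video.example.org/stream.ts')] instead of the literal string.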

	def stream(self, start_response, match):
		start_response('200 OK', [('Content-type', 'video/MP2T')])
		q = Queue()
		self.queues.append(q)
		while True:
			try:
				yield q.get()
			except:
				if q in self.queues:
					self.queues.remove(q)
				return

This is where the tricky part actually happens. We create a Queue that will be filled with the MPEG2 data from another thread and start blocking on it, passing the data along as a chunked response as soon as it's available. If anything goes wrong (e.g. a client disconnect), we remove this stream's queue from the list and return. If this server were to handle a large number of clients, we might want to set a max queue size to avoid filling up memory with data destined for a slow or unresponsive client. It might also be useful to perform some locking on the queues list, to avoid contention between threads. I'll leave that as an exercise for the reader.
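For anyone attempting that exercise, here's one possible shape for it: a small registry that hands out bounded queues and guards the list with a lock. ClientQueues and its method names are hypothetical, the maxsize of 64 is an arbitrary assumption, and dropping data for a slow client is only one policy (it will cause glitches in that client's stream; disconnecting the client is another option).

```python
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2, as used in the post


class ClientQueues(object):
    """Hypothetical thread-safe registry of bounded per-client queues."""

    def __init__(self, maxsize=64):
        self.maxsize = maxsize
        self.lock = threading.Lock()
        self.queues = []

    def add(self):
        """Create, register, and return a queue for a new client."""
        q = queue.Queue(self.maxsize)
        with self.lock:
            self.queues.append(q)
        return q

    def remove(self, q):
        """Unregister a client's queue; harmless if already removed."""
        with self.lock:
            if q in self.queues:
                self.queues.remove(q)

    def broadcast(self, data):
        """Offer data to every client; return how many queues were full."""
        with self.lock:
            targets = list(self.queues)  # copy so we don't hold the lock
        dropped = 0
        for q in targets:
            try:
                q.put_nowait(data)
            except queue.Full:
                dropped += 1  # slow client: skip this chunk for them
        return dropped
```

With something like this in place, the stream method would call add() and remove() instead of touching the list directly, and the feeder thread would call broadcast(data).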

def input_loop(app):
	sock = socket()
	sock.bind(('', 9999))
	sock.listen(1)
	while True:
		print 'Waiting for input stream'
		sd, addr = sock.accept()
		print 'Accepted input stream from', addr
		data = True
		while data:
			readable = select([sd], [], [], 0.1)[0]
			for s in readable:
				data = s.recv(1024)
				if not data:
					break
				for q in app.queues:
					q.put(data)
		sd.close()  # release the dead connection before accepting a new one
		print 'Lost input stream from', addr

This function serves as the feeder for all of the client queues. It listens for a single connection on port 9999 and puts any received data into all available client queues. If the feeder stream is lost, it goes back to waiting for a new connection.
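A quick way to exercise the feeder without a real encoder is to connect to the input port and push some bytes at it. The sketch below assumes Python 3 and that the server is listening locally on port 9999 as above; feed_chunks is a hypothetical helper name, and the data it sends can be anything, since the input loop forwards bytes without inspecting them.

```python
import socket


def feed_chunks(chunks, host='127.0.0.1', port=9999):
    """Connect to the input socket, send each chunk, return bytes sent."""
    sent = 0
    sock = socket.create_connection((host, port))
    try:
        for chunk in chunks:
            sock.sendall(chunk)
            sent += len(chunk)
    finally:
        # Closing the socket is what makes input_loop see an empty recv()
        # and go back to waiting for the next input stream.
        sock.close()
    return sent
```

For example, feed_chunks(iter(lambda: f.read(1024), b'')) with f being a transport stream file opened in binary mode would replay that file to every connected client.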

if __name__ == '__main__':
	app = LiveHTTPServer()
	server = WSGIServer(app, bindAddress=('', 9998))

	t1 = Thread(target=input_loop, args=[app])
	t1.setDaemon(True)
	t1.start()

	server.run()

Finally, we tie it all together by instantiating the WSGI application and server and starting a separate thread for the input loop. The flup.server.fcgi.WSGIServer class will act as a FastCGI server that can sit behind any number of web servers. If you'd rather not use FastCGI, you should be able to drop in any other WSGI server as long as it supports multiple concurrent requests. Otherwise, any client after the first will just block waiting for the transport stream.
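As one example of a drop-in alternative, the standard library's wsgiref server can be made to handle each request in its own thread by mixing in ThreadingMixIn. This is a sketch rather than a recommendation: wsgiref is a reference implementation and not a production server, make_threaded_server is a hypothetical helper name, and on Python 3 the application would have to return bytes rather than str.

```python
from wsgiref import simple_server
try:
    from socketserver import ThreadingMixIn   # Python 3
except ImportError:
    from SocketServer import ThreadingMixIn   # Python 2


class ThreadingWSGIServer(ThreadingMixIn, simple_server.WSGIServer):
    """wsgiref's server, with one thread per request."""
    # Don't let long-lived streaming threads keep the process alive on exit.
    daemon_threads = True


def make_threaded_server(app, host='', port=9998):
    """Build (but don't start) a threaded wsgiref server for app."""
    return simple_server.make_server(
        host, port, app, server_class=ThreadingWSGIServer)
```

In the main block above, server = make_threaded_server(app) followed by server.serve_forever() would take the place of the flup WSGIServer and its run() call.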

For my application, I used gstreamer to connect to the input socket and provide an MPEG2 transport stream. This is trivial to do using gst-launch assuming you've got the proper plugins installed.

gst-launch alsasrc device=hw:0,4 ! ffenc_libmp3lame ! ffmux_mpegts ! tcpclientsink host=video.example.org port=9999

07|13 HTTP live streaming a radio scanner

As my obsession with everything radio related continues, I found myself wanting a way to listen to arbitrary radio frequencies from my iPhone. I looked at several solutions along these lines, but settled on using an off-the-shelf radio scanner as a frontend due to the low cost, relative ease of use, and wideband reception. My portable scanner, an ICOM IC-RX7, certainly has all of the features necessary to accomplish this task, but is limited in that its small package makes it difficult to set up a discriminator tap. A discriminator tap is a radio modification that allows you to capture the baseband radio signal before any real processing is done by the audio frontend. This is useful if you want to decode FSK-encoded signals, for instance.

So, rather than putting holes in my pretty portable radio, I opted to buy a new one. I figured this would be a good time to fill in for a few of the features the RX7 is lacking, like trunk tracking. After a bit of research and deciding where my price points were, I ended up with the Uniden BCT8 scanner. It supports a few common analog voice trunking systems (EDACS, LTR, Motorola) and has a serial port for controlling and programming.

As my platform of choice is Linux, I started searching for software to interface with the radio over the serial port and came up short. However, I did manage to find Uniden's serial protocol reference for the BCT8 and proceeded to implement a few Python modules to serve my purposes. As a side note: Uniden apparently cannot design a sane protocol and their engineers should be shot on sight.

With the eventual goal of getting scanner audio and control on my iPhone, I began to tackle the task of taking audio input from ALSA and pushing it over the network in some format that the iPhone can comprehend. My initial solution was to use the quite convenient (if somewhat undocumented) pyAudio library to capture raw PCM data from the sound card and multicast it out over my network. This worked fairly well for listening on my laptop, but would require some serious development work to get working on the iPhone (a custom Audio Unit Generator using multicast sockets to receive the stream and somehow transforming it into something Core Audio can comprehend). Even if I did get that working, there would be limitations: it wouldn't work on a network without multicast, audio would be transmitted uncompressed, and the possibility of high latency or dropped packets was high.

As I was working on the multicast server, Apple announced a new HTTP Live Streaming feature in the iPhone 3.0 firmware. A quick look at the documentation made it look fairly simple, as long as I could get my PCM audio stream into AAC or MP3 in an MPEG2 Transport Stream. The encoding part proved more difficult than I thought and wouldn't work with straight UNIX pipes. I was also unable to find a good library for muxing MPEG2 Transport Streams. I figured that in the worst case I could use a Python module I wrote a few months ago for demuxing MPEG2-TS, but that would require quite a bit of work and might be filled with bugs.

Eventually I found gstreamer, which solves a lot of the above problems. I built a gstreamer pipeline that took input from alsasrc, compressed it using ffenc_libmp3lame, muxed using ffmux_mpegts, and output using tcpclientsink. The tcpclientsink is then connected to a Python app that reads input from a source socket and outputs the data as a chunked HTTP response to any open connections. I used WSGI to implement a simple server that handles the protocol portion of Apple's HTTP live streaming spec. In order to handle multiple clients without too much hassle, the streaming server uses flup to implement a FastCGI server that's frontended by nginx.

After bringing everything up and connecting the iPhone, I get a nice audio stream on the iPhone. There are still a few drawbacks to this approach, the most obvious being that I can't control the scanner from the phone. I'm hoping that Apple provides a way to integrate a live streaming connection directly into an app without popping up the QuickTime view, but I may be out of luck there. Worst case, I can provide some status information about the scanner (currently tuned frequency, talkgroup, etc.) as a video stream in the transport stream and a separate web app for performing control actions.

I'll go into more detail about the implementation of the various pieces here in another post, but in the meantime the scanner's stream is available here: http://neohippie.net/scanner/ and is accessible using an iPhone with 3.0 firmware or VLC. Theoretically it should also work with QuickTime X, but that hasn't been released yet so I haven't been able to test it.