Monday, July 19, 2010

It's been a real long time since an update to this blog, but I thought I'd bring it back with a summary of the progress on the client side of the audio system.

So, I've been working on this for a long time and have put in a bunch of research. Without doing a lot of low-level C hacking (which we don't have time for), here's what each individual computer will and won't be capable of. Remember that multiple computers can be connected to act together in any way we dream up, so long as the lower-level components allow it.

The audio component of the client system has three tiers.

Tier 1 - GStreamer & PyGst
This is the core of the audio playback technology. It provides the tools for decoding the audio, letting us play anything from MP3 to Ogg to M4A to FLAC without having to worry about what format it is. At this level we also have play/pause/seek support, and seeking works at nanosecond resolution (hell yeah). Here we can also control volume, but only for both channels at once.
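To make that concrete, here's a minimal playback sketch against the old pygst 0.10 bindings we're targeting; the file path is a placeholder, and a real program would also run a GLib main loop to service the pipeline's bus.

import pygst
pygst.require("0.10")
import gst

# playbin2 picks the right decoder for whatever format it's handed.
player = gst.element_factory_make("playbin2", "player")
player.set_property("uri", "file:///path/to/cue.flac")  # placeholder path
player.set_property("volume", 0.8)  # one knob for both channels, 0.0-1.0
player.set_state(gst.STATE_PLAYING)

# Seek to 2.5 seconds in; GStreamer positions are plain nanoseconds.
player.seek_simple(gst.FORMAT_TIME, gst.SEEK_FLAG_FLUSH, int(2.5 * gst.SECOND))

player.set_state(gst.STATE_PAUSED)  # pause/resume is just a state change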

Tier 2 - Jack Audio Connection Kit (JACK) & PyJack
This provides our dynamic mixer and patchbay, two very important components. It also has crazy low latency; with the proper kernel and memory configuration it will have the lowest latency the hardware physically allows (just a fun fact, really). The first important feature JACK provides is the dynamic mixer. Without it, each channel on the sound card would be locked to a single audio stream: GStreamer could play a single stereo stream, and the sound card would block any other incoming data. Without JACK we can have no concurrent audio streams, and that's bad. The second feature JACK provides is the patchbay. Say we have two GStreamer outputs, each of them in stereo, and a sound card with two stereo outputs. JACK allows us to patch GStreamer channels 1&2 into hardware output 1, and channels 3&4 into hardware output 2. We can map any JACK-capable output to any Linux-compatible sound card.
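Here's roughly what wiring that patchbay looks like from Python, assuming the old PyJack module-level API; the GStreamer port names are hypothetical, and the real ones come from get_ports().

import jack

jack.attach("patchbay")   # register ourselves as a JACK client
jack.activate()
print(jack.get_ports())   # every input and output port JACK knows about

# Route the first GStreamer stereo pair to hardware output 1 (a stereo
# pair), and the second pair to hardware output 2.
jack.connect("gst_1:out_l", "system:playback_1")
jack.connect("gst_1:out_r", "system:playback_2")
jack.connect("gst_2:out_l", "system:playback_3")
jack.connect("gst_2:out_r", "system:playback_4")

jack.deactivate()
jack.detach()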

Tier 3 - Advanced Linux Sound Architecture & PyALSAAudio
The lowest level that we give a damn about. This is where JACK outputs to. Each hardware channel that Linux supports is abstracted by an ALSA kernel driver. What this means for us is volume control for each and every supported hardware channel.
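A sketch of what that buys us, via pyalsaaudio; mixer control names vary from card to card, so 'Master' and the 0/1 channel indices here are assumptions.

import alsaaudio

print(alsaaudio.mixers())   # enumerate this card's mixer controls

mixer = alsaaudio.Mixer("Master")
print(mixer.getvolume())    # one 0-100 value per channel

mixer.setvolume(70, 0)      # set channel 0 (left) alone
mixer.setvolume(40, 1)      # set channel 1 (right) alone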

If you read this at more than a skim, you'll realize that we're without volume control for the individual channels of an audio stream. That would have to be handled at the GStreamer or JACK level, and so far as I can tell there's no way to do it that doesn't really, really suck. For example, I could deinterleave the channels in GStreamer, modify the volume on each, and reinterleave them, but that's very, very dirty, and more likely than not to damage the stream's integrity. Not. Cool. Every other option I've come up with has been just as awful (or worse). Still, progress is good!
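For the morbidly curious, the rejected deinterleave approach would look roughly like this as a gst-launch-style pipeline under pygst 0.10; the pad names are from memory, so treat it strictly as a sketch of why I don't like it.

import pygst
pygst.require("0.10")
import gst

# Split the stream, hang a volume element on each channel, stitch it back.
pipeline = gst.parse_launch(
    "filesrc location=/path/to/cue.wav ! decodebin ! audioconvert ! "
    "deinterleave name=d  interleave name=i ! audioconvert ! autoaudiosink  "
    "d.src0 ! queue ! volume volume=0.8 ! i.sink0  "
    "d.src1 ! queue ! volume volume=0.5 ! i.sink1"
)
pipeline.set_state(gst.STATE_PLAYING)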

Monday, June 28, 2010

(lack of) Subtlety in Lighting

Subtlety is perhaps the most forgotten element of lighting in the modern theater. Today, among professional theater companies, one style of lighting dominates the visual scene: aggressive lighting designed to force a scene with strong, vibrant colors and hard directional angles. This style is undoubtedly a hybrid of the entertainment lighting that evolved with rock and roll and the Jesus Christ Superstar-to-Wicked transition that took over Broadway three and a half decades ago. With the advent of computerized control decks, moving lights, digital projectors, dichroic color mixing, and later LED fixtures, lighting designers embraced an over-the-top, intensity-driven style designed to overpower the audience. This style of design can only be described as replacement lighting, where the natural emotions created by a production are forced out by a combination of sound and lighting.

Don't get me wrong, there is most certainly a place for elaborate, intense, overpowering lighting. The last scene of Next to Normal, where the entire stage became engulfed in blinding light (created by over 300 250-watt incandescent bulbs), was exactly what that production needed at that moment. But not every scene, not every song, needs to remind us of Oprah or CSI. What I am talking about is the tendency to homogenize the emotional response of an audience by locking down focus and creating sensory overload. This is not theater.

Friday, June 25, 2010

Audio: Streams and Triggers

Same project, same problems, different sense. On the surface, audio distribution seems like a simple issue, or at least simpler than the lights. The technology for distributing light cues over ethernet is clearly a bit arcane and, sadly, not terribly well documented. Sound data, however, is thrown across ethernet every day. Computers have access to not one but many well-maintained, well-documented, high-quality audio streaming protocols. In fact, with enough work, many pre-existing data-transfer libraries can even be shoehorned into doing duty as an audio stream. With a central server and a (cheap, outdated, obsolete) computer in every room that needs sound cues, a streaming network could provide all the audio needed, directly over ethernet in tidy TCP packets. So far, everything looks pretty good.

Until we look into how those protocols work. Try opening an internet radio station, or sharing music with an Apple TV or AirPort Express. In fact, try viewing a video on YouTube. There are palpable seconds between the moment the data is requested and the moment it reaches your ears. Of course, this is perfectly fine when your only goal is simply to hear that music or watch that video; it will get to you, no problem, because that's what those systems are meant to do: move data from a central location out to speakers. But these protocols all buffer, and therefore they all introduce significant latency. Getting your YouTube video three seconds after you click is pretty good. Playing a sound cue three seconds late isn't.

So where do we go from here? Well, it's pretty clear those streaming protocols aren't very helpful in this situation, but only because they were designed to solve a different problem. They solve the issue of getting data to the user; they have nothing to do with time and triggers and everything to do with pushing data around. Most of the time they're used when the device playing back the media either cannot or should not have its own copy of it, e.g. an AirPort Express, which has no storage of its own, or YouTube, where copyrighted content can stay safely on the server. Our issue is getting the data to play on time, not getting the data to a location. So what if the client machines, the low-end boxes distributed throughout the building, already had the audio files loaded onto them, waiting to be triggered at any moment? No waiting for data to buffer, no delay, near-zero latency. And not just triggers for play and stop, but for gain, speed, and duration modification too. This way the bandwidth isn't taken up by audio pouring through it, choking off the light cues and increasing delay. So far as I can tell this doesn't exist yet. Oh well, time to make it.
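To make the trigger idea concrete, here's a rough sketch of the client-side listener I have in mind; the JSON message format and the port number are entirely made up at this point.

import json
import socket

def on_play(msg):
    # In the real client this hands off to the GStreamer layer.
    print("play cue %s at gain %.2f" % (msg["cue"], msg.get("gain", 1.0)))

def on_stop(msg):
    print("stop cue %s" % msg["cue"])

COMMANDS = {"play": on_play, "stop": on_stop}

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("", 9000))  # placeholder port

while True:
    data, addr = sock.recvfrom(4096)
    msg = json.loads(data)  # e.g. {"verb": "play", "cue": "17", "gain": 0.8}
    COMMANDS[msg["verb"]](msg)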

Edit: Austin just made an interesting point: this is pretty much what Apple did with their iTunes remote protocol. I have to say I'm a little worried by how damn slow that is, and I'm hoping this will be better. Right now I'm still laying down some groundwork code and systems.

Console Worlds: Tracking and Preset

One of the largest changes in moving from an entry-level board, such as a SmartFade, Express, or Expression, to a full control deck such as an Ion, Eos, or Congo, is a dramatic shift in the operation and philosophy of cue construction. It is absolutely essential to understand the differences, not only in initial programming but also in on-the-fly editing and scene tweaking.

Earlier and entry-level consoles operate on what is known as a preset system. A preset system works as the name implies: each cue is preset by the board operator and recorded on the console as the complete values of every channel at that moment. Even if only one light changes intensity in a given cue, all data is recorded to the board's memory banks. This style of construction means a board operator can manually build cues on the fly by raising or lowering individual sliders, submasters, or groups, and recording functions as a snapshot of the exact state of the console at the moment record is pressed. A common problem for board operators and lighting designers centers on the fact that both we and directors are picky. On a preset-style board, changing the level of one light through a series of cues requires quickstepping through each cue with the selected fixture captured at the desired level. This is tedious to say the least, and often impractical in the middle of a rehearsal or cue-to-cue.

Enter the tracking console. Tracking consoles use a very different style of programming, designed to work intuitively with a director and lighting designer. They attempt to mimic how a designer thinks about a show: in terms of what changes. When you hit record on a tracking console, the board does not save a comprehensive snapshot of the current scene. Instead, it records only what has changed! It then carries that change (like every change before it) forward into any new cue you make, and changes you make in earlier cues track forward into preexisting cues, as long as the level isn't overridden later.
What does that all mean? Let's say in cue 1 you set a cyc blue to 45%, which continues until cue 50, when it drops to 20%. Now say you want to change the initial 45% to 60%. On a conventional preset board you would go to cue 1, capture the channels at 60%, and quickstep through cues 1-50. On a tracking console, the instruction of where to set that level is saved in one place and one place only: cue 1. Because the intensity of the light does not change until cue 50, it has no recorded value for cues 2-49. To change the intensity of the cyc from 45% to 60%, the only cue you need to modify is cue 1.
But this isn't always what you want; maybe you only want cues 1-20 to reflect the 60% change. For that, tracking consoles use something called block cues. A block cue effectively breaks the tracking, letting you start fresh. Some consoles even allow you to block specific channels (e.g. block just the cyc).
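If it helps to see the two recording styles side by side, here's a toy model in Python (not any console's actual internals, obviously):

# A preset console stores a full snapshot per cue; a tracking console
# stores only the deltas and replays them in order.
preset_cues = {
    1:  {"cyc_blue": 45, "warm_wash": 80},  # every channel, every cue
    2:  {"cyc_blue": 45, "warm_wash": 60},  # cyc repeated though unchanged
    50: {"cyc_blue": 20, "warm_wash": 60},
}

tracking_cues = {
    1:  {"cyc_blue": 45, "warm_wash": 80},  # only what changed
    2:  {"warm_wash": 60},                  # no cyc entry: it tracks from cue 1
    50: {"cyc_blue": 20},
}

def tracked_state(cues, target):
    # Resolve the live state at a cue by replaying deltas in order.
    state = {}
    for number in sorted(cues):
        if number > target:
            break
        state.update(cues[number])
    return state

# Edit cue 1 once and the change ripples through cues 2-49 automatically:
tracking_cues[1]["cyc_blue"] = 60
print(tracked_state(tracking_cues, 49))  # {'cyc_blue': 60, 'warm_wash': 60}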

Tracking consoles offer tremendous advantages over traditional preset consoles in versatility and adaptability, yet they come with a steep learning curve. Because of the nature of their programming, they often lack the kind of direct user control Express owners are accustomed to, and they see less use in live-creation settings.

Thursday, June 24, 2010

DMX over CAT5 vs DMX over CAT5 via IP

These days there are many ways of distributing DMX instruction packets between devices apart from the traditional 5-pin XLR cable. 5-pin has been the industry standard for many years; however, with the changing demands of the entertainment industry and the growing complexity of technical theater, other solutions have emerged.
Nearly every new commercial or public construction project of the last twenty years comes prewired with three kinds of cable: power, phone, and CAT5. The last is of most interest to us. DMX instruction packet distribution over CAT5 offers several significant advantages, most notably cost and ease of use. Performing in an unorthodox space? Chances are CAT5 is already run through your venue!

Preexisting CAT5 cable can be utilized in two fashions.

Traditionally, early adopters of CAT5 simply used the cable as an inexpensive substitute for long runs of 5-pin, doing a straight electrical conversion. There is one quite glaring problem with this method: the RS-485 signaling that DMX runs on is not compatible with any TCP/IP hardware. In other words, DMX and standard ethernet hardware are electrically incompatible, and connecting them will destroy one or both pieces of equipment. The advantage of direct CAT5-to-DMX cabling is that no switching or routing hardware is needed, and the cable functions identically to a DMX run.

Since the late 1990s, many companies have built their own TCP/IP-based DMX conversion protocols. Because these protocols use the TCP/IP stack, they are 100% compatible with preexisting ethernet systems. One of the most widely accepted standards for TCP/IP DMX transmission is the ArtNet protocol. ArtNet can run through existing routing hardware (removing the cost of opto splitters) and distribute up to 256 individual DMX universes (256 × 512 = 131,072 control channels) over one CAT5 cable! Better still, ArtNet is bidirectional, so any ArtNet terminal can be reconfigured to work as a control module, letting you place your console literally anywhere with an ethernet jack.
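For a feel of what's on the wire, here's an ArtDmx framing sketch as I read the published spec (ArtNet rides on UDP port 6454); universe numbering differs between ArtNet revisions, so check this against your own gear before trusting it.

import socket
import struct

def artdmx_packet(universe, levels, sequence=0):
    # Wrap up to 512 DMX levels (0-255) for one universe.
    data = bytes(bytearray(levels))
    return (
        struct.pack("<8sH", b"Art-Net\x00", 0x5000)  # ID + OpDmx opcode (LE)
        + struct.pack("BB", 0, 14)                   # protocol version hi/lo
        + struct.pack("BB", sequence, 0)             # sequence, physical port
        + struct.pack("BB", universe & 0xFF, (universe >> 8) & 0x7F)
        + struct.pack(">H", len(data))               # payload length (BE)
        + data
    )

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
# Blast universe 0, all channels full, to the usual ArtNet broadcast address.
sock.sendto(artdmx_packet(0, [255] * 512), ("2.255.255.255", 6454))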

Additionally, each lighting manufacturer maintains its own proprietary protocols for interfacing with its hardware; the ETC, Strand, Pathway, MA, and Hog ethernet protocols are all proprietary to their companies. These protocols often offer slight advantages over ArtNet, yet their lack of uniformity can make adding gear difficult.

Most hardware today (dimmer racks, intelligent fixtures, etc.) comes with or has options for ArtNet ethernet inputs, allowing these systems to be built inexpensively and efficiently. ArtNet nodes are available that convert the ArtNet protocol back to traditional 5-pin DMX. When running an ethernet-based system, it is vital to stay on top of network traffic; it is highly recommended that you not run 'internet' and lighting control protocol over the same wires, as it can increase network latency.

Now for the project. 30+ rooms over ethernet, cue stack for each room, one centralized time stream. Stay tuned!


Sleep No More

This winter, a British theater group changed American theater forever. If you were on the East Coast, you probably heard of Sleep No More, the collaborative production between the American Repertory Theater and Punchdrunk. The performance was the first modern attempt to introduce an American audience to immersive theater, and the show hit its mark. The power of Punchdrunk comes not from lofty avant-garde theory designed to challenge us with vague abstractions, but from its remarkable ability to translate the underlying emotion or charge of a work into a visceral gut feeling.
The worlds created by Punchdrunk rely heavily on atmospheric elements to create that childhood sense of excitement, apprehension, and wonder. They resemble a dreamscape, where objects, actors, and installations both are and are not at the same time. The recreation is supported by an elaborate and elegant technical backbone. What makes many of Punchdrunk's technical challenges unique is the location: Sleep No More was performed in a converted four-story schoolhouse in Brookline, MA, lacking many of the amenities lighting and sound designers are used to, such as rigging, cabling, or large-load power distribution. It is for these exact reasons that Punchdrunk represents the cutting edge of technical theater and a revolution within the performance industry.