Processing as a proof of concept tool

I've given myself a few days to do some of my favourite sort of work: quickly throwing together a proof of concept of an idea. I had forgotten just how good Processing is at doing this. So this is a bit of a fan post...

In a day I had remembered enough about Java and how Processing works to throw together a rough GUI that can zoom and pan, with draggable sources. The libraries available for Processing are great, especially oscP5; it made adding an OSC implementation to the project really quick.

As long as you're not trying to make a polished and shiny bit of software, and what you're doing isn't too massive, Processing really is the best tool that I've come across for the job. Let me know if you think otherwise - I'm always up for trying new solutions.


Snake on Q-SYS...

Yep. How else do you test a webserver plugin?


There is a reason for this. Honest. More to come...

Experimentation, innovation, and many plugins.

I had a call this summer that went something along the lines of: "I've heard you're good at putting stuff together and making it all talk to each other". So for the last few months I've been lucky enough to be working on a massive project that really let people experiment; it welcomed and encouraged innovation, which is a very rare thing at the moment!

My job was to work out how to get control messages from QLab, ProTools, BlackTrax, and a custom GUI into, and out of, an SD7 and then into the correct one of three surround processors. Actually there were more signals than that - everything was redundant and automatically switched from master to backup.

Unfortunately the production went bust last week. So I've now fallen off the edge of what was a very busy cliff, and I'm back to looking for work...

However, it wasn't a total waste of time. I now have a load more knowledge of writing Q-SYS plugins. I've even written myself a .qplug compiler that allows me to write plugins in a modular fashion and also allows me to include my own libraries.

So over the next few weeks I plan to formalise my plugin collection, re-write them, and document them here. Plugins that are basically ready for real-world testing include:
  • UDP matrix
  • OSC I/O
  • OSC translator
  • A suite of plugins for time-based cue triggering
  • Object based surround processor
  • Perhaps an update to my Art-Net plugin
Stay tuned...


Ambisonics - not the format of the future

I know, radical, right? But let's talk for a moment about our ear-brain system - how do we perceive position? I'm not going to go all academic here and cite sources because you can easily write a thesis on how all of this works and nobody wants to wade through a thesis...




Start with the obvious: we have two ears. Our head causes an acoustic shadow for frequencies that have a wavelength smaller than the size of our head. This means that a high frequency sound coming from the right will be louder in the right ear than the left ear. Frequencies with a wavelength larger than our head will diffract around the head and reach the other ear with no real change in level (unless the sound is very close to the head - inverse square law etc etc).

However, my head is only about 1.7kHz wide (20cm ish), yet I can discern the direction of sounds that have frequency content below 1.7kHz. So there's something else at play here.
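As a rough sanity check of that 1.7kHz figure, assuming a speed of sound of 344m/s and a 20cm head:

   speed <- 344          # speed of sound in m/s (roughly)
   headWidth <- 0.2      # assumed head width in metres
   speed / headWidth     # = 1720, i.e. about 1.7kHz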


Time. Our brain uses the arrival time of sounds at each ear to position sounds. You can prove this by playing a sound through two loudspeakers separated by some distance: if you add delay to the left loudspeaker, the "image" pans to the right. The level coming out of the loudspeakers stays the same and therefore the level at your ears stays the same, but the source still appears to move. Our ear-brain system uses time information as well as level information to decode location.

This is where ambisonics has issues. Ambisonics is level-based panning (at low orders and with low-density loudspeaker positions); there is no time information. In order to recreate time information for what I would call real-world loudspeaker setups (i.e. anything that's not an array of loudspeakers forming a circle around the listener) you need to know your loudspeaker layout and the position of every source in your environment, and then calculate the arrival time of each source at each loudspeaker in real time (not actually as complicated as it sounds - I've written a Q-SYS plugin that does it using a delay/level matrix).
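To make that concrete, here's a minimal sketch of the arithmetic behind such a delay/level matrix (not my plugin code - the source and loudspeaker positions are made up, and it assumes a speed of sound of 344m/s and simple inverse-square level loss):

   speed <- 344                                    # speed of sound, m/s
   source <- c(2, 1)                               # made-up source position (x, y) in metres
   speakers <- rbind(c(-4, 3), c(0, 4), c(4, 3))   # made-up loudspeaker positions
   dist <- sqrt((speakers[, 1] - source[1])^2 + (speakers[, 2] - source[2])^2)
   delay_ms <- 1000 * dist / speed                 # arrival-time delay to each loudspeaker
   level_db <- -20 * log10(dist)                   # level drop relative to 1m
   round(cbind(dist, delay_ms, level_db), 2)       # one row per loudspeaker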

Low order ambisonics only excites half of the ear-brain localisation system.

However, be warned - as soon as you start moving things around on a system that does calculate time arrivals you get the side-effect of pitch shifting! My preference is to use a hybrid - time and level based positioning for stationary objects, and only level based positioning for moving objects.

I know that's not the end of the story - our pinnae are also a vital part of discerning position. This is why those who can only hear with one ear can still localise sounds. But let's tackle one part of this process at a time!

Thinking about all of this makes you realise why it's so hard to generate decent immersive audio content for playback on headphones. There are so many ways to excite our ear-brain system, and all of them need to be activated in exactly the right way for our brain to be fooled (I think "exactly right" is different for each person too!).

Audio and Ma

No, not Ma as in Mother. Ma as in the Japanese term for negative space. In ancient Japanese culture what is not there is just as important as what is there. There was a really great programme on the BBC about it. For example, in the practice of Ikebana, the space between the flowers is viewed with as much importance as the flowers themselves. This idea crops up all over the place. One example would be that when learning music you are often taught that the rests are just as important as the notes.

Ikebana Ohara School 13 by Don Urban. The space without foliage is just as important as the foliage.

You can utilise the concept of Ma when designing loudspeaker systems. The areas that aren't audience areas are just as important as the areas that you are trying to fill with sound. When 90% of your audience are hearing more reverberant field than direct sound, it's important to make that reverberant field sound nice. Therefore what hits the walls needs to sound nice, what hits the ceiling needs to sound nice, and you never know which gap between seats the show director will stand in!

Just a thing to think about: Audio and Ma.

How do cardioid loudspeakers work?

Here's my attempt to explain how cardioid (passive and active) loudspeakers work. I hope it makes sense!

An omni microphone

I find that the best place to start when it comes to understanding cardioid loudspeakers is with the omni mic. Sounds daft, but bear with me. Imagine a membrane stretched across a bucket - like a drum. Changes in air pressure (sound) can only act on the front (top) of this membrane, no matter what direction the sound is coming from. For the purpose of this example our bucket is tiny, which means that sound hits the front of the diaphragm with equal force no matter what direction it comes from. Its pickup pattern is omni-directional.



Figure of eight microphone

Let's cut the end off the bucket. Now sound can get to the back of the diaphragm as well as the front. So sound coming from the front hits the front of the diaphragm, and sound coming from behind hits the back.

Positive pressure from the top pushes the diaphragm down, and positive pressure from the back pushes the diaphragm up. This explains why sounds behind a figure of eight microphone are of the opposite polarity to sounds in front of the microphone.

Now imagine sound coming towards the edge of this diaphragm. An equal amount of pressure reaches either side of the diaphragm and so it cancels itself out, resulting in the distinctive figure of eight pickup pattern: sound from the front doesn't cancel itself out so can be heard, whereas sound from the side gets cancelled out.


Cardioid microphone

So we can take the principles used above to create a cardioid microphone: we just need the situation that we had at the side of the figure of eight example to occur behind the diaphragm instead of to the side. To do this we need the sound from behind to arrive at the front of the diaphragm at the same time as it reaches the back of the diaphragm. This means that we need to slow down the sound that's going to hit the back of the diaphragm. This can be done in all sorts of fancy ways, but we can just imagine an omnidirectional capsule with a little port in the back. Sound that enters this port has to travel the same distance to get to the diaphragm as sound that doesn't travel down the port.



From a microphone to a loudspeaker

You apply this to a loudspeaker by just turning it on its head. Rather than air pressure moving the diaphragm, you use the diaphragm to create the air pressure.

As a loudspeaker pushes air to create an area of high pressure an area of low pressure of equal amplitude is created behind it. If you were to add this area of low pressure to the area of high pressure they would cancel each other out (this is why drivers sound odd when they're not in a box).

So if you design the port and the hole in the back of the cabinet so that the pressure coming from the back of the loudspeaker meets the pressure coming from the front of the loudspeaker at the correct time, they will cancel each other out.

Obviously if you want to do this in the real world it gets a whole lot more complicated! 

Examples of cardioid loudspeakers have been around for ages. Here's a patent for one from 1971!

Rather than building a cabinet and trying to do this with a single driver and amplifier channel, it's much easier to fake the delay port with an extra loudspeaker. Unfortunately doing the passive version of this is not quite as easy as I have explained. As soon as you have to design a loudspeaker in a box it all starts to get very complicated. To design a passive cardioid loudspeaker not only do you have to do the electromechanical design work for a decent loudspeaker, you suddenly add a whole load more variables into your equation by needing to use the rear energy for fancy tricks.
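To see why the extra loudspeaker works, here's a rough far-field sketch in R with made-up numbers (ideal point sources, an assumed 0.3m spacing, and 80Hz): the rear source is delayed by the spacing divided by the speed of sound and polarity-flipped, which cancels the radiation straight behind the pair.

   speed <- 344                      # speed of sound, m/s
   s <- 0.3                          # assumed spacing between the two sources, metres
   f <- 80                           # assumed frequency of interest, Hz
   k <- 2 * pi * f / speed           # wavenumber
   theta <- seq(0, 2 * pi, length.out = 361)    # 0 degrees = straight ahead
   # rear source: extra path of s*cos(theta), plus an electronic delay of s/c, plus a polarity flip
   p <- abs(1 - exp(-1i * k * (s * cos(theta) + s)))
   lvl <- 20 * log10(pmax(p / max(p), 1e-3))    # in dB, floored so the rear null plots nicely
   plot(theta * 180 / pi, lvl, type = "l", xlab = "angle (degrees)", ylab = "level re max (dB)")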

A real world example

I did a quick experiment to demonstrate these principles in the real world. I used some small 4" loudspeakers. I delayed a rear-facing loudspeaker to be in time with the forward-facing ones and flipped its polarity. Some measurements and a little EQ later (to switch off the rear-facing tweeter!) I got the results shown below.

Figure 1 shows the response of three loudspeakers - two used for forward pressure and the third for rear cancellation. The measurements show the SPL of the forward addition (in green) and the rearward cancellation (in orange). Note the LF boost caused by the coupling of three drivers. Also note the HF cancellation caused by interference between the two HF tweeters (it's not the floor, because the microphone was in contact with the floor). Figure 2 shows the same principle, but this time only using two loudspeakers.

Figure 1: Three loudspeakers. Orange = rearwards SPL. Green = forwards SPL.

Figure 2: Two loudspeakers. Orange = rearwards SPL. Green = forwards SPL.

Interoperability Standards: is there a better solution?

This is going to be a long one. Sorry, there are no pretty pictures either! (scroll to the bottom for a tl;dr)

Looking into this topic was a big part of my MSc thesis, so I thought it might be interesting to share some of the discussion here rather than watch all that work sit on my computer doing nothing! Let's start from the beginning:

Shows are getting larger and technically more demanding. Installations are becoming more intelligent, and health and safety regulations are getting stricter. This means that there is an ever increasing need for equipment to operate as a cohesive whole; not only within and across individual departments (Audio, Video, Lighting, Communications), but crossing over into other industries such as building automation, IT, and communications.

Achieving a unified method of control is problematic because most devices do not use the same control protocol (they do not speak the same language). The most popular solution to this is to use equipment originating from a single manufacturer because their products will be designed to work together. However, this leads to compromises in design (as the manufacturer may not make exactly the product you require) and often increased costs (due to the captive market). With a compromised design no parties involved in the system specification can reach their full potential.

Every event or installation consists of a unique collection of equipment, thoughtfully designed to fulfil a brief. Integration problems create an undesirable barrier, often preventing the ideal solution from being realised.

This is not a new problem in either the wider world or the entertainment industry. The same solution to the problem can be found in many examples - Ethernet, IP, and USB HID, to name a few. All of these solutions use a standardised protocol.

Standardised Protocols

Within the entertainment industry there are many examples where protocol standardisation has been used to achieve interoperability; the most successful example can be found within the lighting industry.

From the onset of modern live entertainment lighting it was clear that different manufacturers would build control consoles, dimmers, and lighting fixtures, yet they must all work together seamlessly. This meant that all manufacturers had to agree on a method of control; a control standard was required.

Released in 1986, DMX512 was designed to replace the large number of proprietary control methods in existence and it has been very successful. It is “the most widely used entertainment lighting control standard in the world”[1]. You would struggle to find a controllable light designed for the entertainment industry not featuring DMX.


Standardisation has drawbacks

However, the lighting industry is quickly outgrowing the capabilities of DMX512. With 8-bit resolution, DMX was never designed for the complex lighting fixtures in use today. This means that it is not uncommon to see devices with high resolution modes where two channels of DMX are used to specify attributes requiring finer control (giving 16-bit resolution). It was also not designed to handle the sheer number of fixtures that are used on modern events. It is clear that a replacement is required, or DMX must evolve to fit the new requirements of modern lighting fixtures.

Unfortunately, this is causing the lighting industry some problems because all manufacturers must agree to use the same protocol. This means that the final protocol must suit the current (and future) needs of all manufacturers.

These control requirements are hugely diverse. For example, the new protocol must still cope with a simple single channel dimmer, but it must also be able to cope with situations where life-safety is a concern such as pyrotechnics, and equipment for flying scenery. It also must be updated to work on standard IT infrastructure that exists everywhere. It may also need to be secure so that a “hacker” cannot turn out the lights at the inaugural speech of the President. With the increased use of video content perhaps the new protocol needs to carry multiple channels of video as well?

For these reasons and more, there is still no official, stable, and widely supported protocol for controlling lights in the entertainment industry other than DMX512-A.

Standardised protocols try to satisfy the needs of as many different users as possible which usually renders them huge and difficult to implement, thereby increasing development time and costs. A simple fixture that only has dimmer control will not have the development budget to implement these huge industry-wide protocols. To keep the costs low they’re unlikely to even have the Ethernet capabilities required to implement the majority of these protocols.



Control standardisation in the audio industry

In the audio industry there is more call than ever before to ensure that products are remotely configurable and monitorable.

Unfortunately the audio industry has never been able to produce a control protocol that has been adopted industry wide. The AES have published many different standards in an attempt to find one that suits the needs of every user. Other industry bodies have also tried; some of the most ambitious are attempting to produce a single control protocol for the entire entertainment industry (AVB).

Advantages of using proprietary protocols

Rather than embracing control protocols published by industry bodies, manufacturers quite reasonably often choose control protocols based on their product requirements and development expertise, often ending up with unique proprietary protocols.

These protocols are designed to suit their product requirements exactly; they do not have to compromise to fit the rest of the industry. This is useful because it allows the developer to design the smallest, most efficient control protocol required, minimising processing overheads and leading to an efficient product, which is often more reliable. These protocols are also very quick to implement and develop, reducing costs.

Manufacturers do not have time to implement a protocol that they will only utilise a small fraction of. If a protocol offers features that are not needed in their product it is effectively wasted development time.

Importantly, allowing the developer to specify the control protocol encourages innovation. The developer is not locked to a set of rules defined by an external body. No standardised protocol can leave enough room for future requirements because it is not known what these requirements may be. By the time a new version of a protocol can be written and approved, the once innovative idea is old technology.

Translating between protocols

It is unrealistic to expect an entire industry to be able to agree on the description of one single control protocol. This is especially apparent at the boundaries of multiple industries. The entertainment industry cannot expect the building automation industry to implement a protocol designed for and by the entertainment industry.

Eventually the concept of a single interoperable control protocol becomes untenable and protocols must be translated to achieve device integration. Currently this is often achieved using an entertainment industry designed controller with a GPIO port. The GPIO port is then connected to a building automation industry designed controller via its GPIO. An engineer must then dictate what each GPIO port represents. Not an elegant solution.

An alternative approach to industry wide integration is to translate between the pre-existing small, specialised control protocols that manufacturers already use.

Protocol translation is entirely achievable; it's what I studied and implemented for my thesis. I made a working proof-of-concept control protocol translator. If you want to know how it works, drop me an email and I'll happily send you some stuff across.

It should be noted that protocol translation can only reach its full potential if the application-specific proprietary control protocols used by manufacturers are released openly. Having to purchase a licence from a manufacturer in order to implement a protocol would make a universal translator prohibitively expensive.

That is not to say that control of a product cannot be monetised. The ability for a product to respond to control commands could be licensed, but the actual control protocol must be freely available.

Conclusion

Encouraging the use of small, open-source, application-specific control protocols would mean that it would be quicker to develop new, innovative products. The final product would be more efficient, reliable and less prone to bugs. Writing translations to move between protocols would be less arduous because each protocol is quick to implement.

Protocol translation would be a faster route to industry wide integration than designing one protocol to suit everybody.


References

[1] J. Huntington, Show Networks and Control Systems, 1st ed. Brooklyn: Zircon Designs Press, 2012.


TL;DR

Expecting the entire industry to use a single protocol is not practical or even desirable. Encouraging the publication of the many small, specialised, proprietary protocols that are already in existence would allow for a protocol translation device to be used. Translating between small protocols would be a faster route to industry wide integration than designing one protocol that attempts to suit every user.

Interference simulation from first principles

I've got some experiments with subs that I want to do but current simulation software that I have doesn't allow me to try them (more on this at a later date). This means that I need to build my own simulation tool. I thought it'd be interesting to document the process.

My First Simulation! (blue = quiet, red = loud)


Maths is not my forte, and I have no idea how acoustics simulation works "under the hood". But I do know my fundamentals (I have my time working at d&b audiotechnik to thank for that). So I'm going to go from first principles and build a really simple interference visualisation tool. I have no idea if what I've done is correct, but it feels about right.

Let's keep it simple and say that there can only be two sources, that both sources are perfect point sources (they are infinitely small, have a flat frequency response from 0 to infinity, and they can go infinitely loud), and that we are only going to work in 2D.

So what am I trying to do? Well I want to see how loud a specific frequency is at every point on a plane. That's too complicated, let's simplify. How about: I want to find out how loud a single frequency is at one point in space, given two sound sources.

That's probably simple enough. That's addition and subtraction of sound waves. That's electroacoustics 101 (if you haven't done this course you really should. It's free! Also read this book. It's not free!).

Time for some maths - yep, we need some, because we actually want a loudness number in the end!


Addition and Subtraction of Soundwaves

A sine wave: 
y = sin(x) 
But because computers like to work in radians and humanbeans like to talk about degrees, we just need to add a little bit to this equation so that x can be in degrees:
y=sin((x*pi)/180)

Great, there is one sine wave. But we want to add two together:
y=sin((x*pi)/180) + sin((x*pi)/180)

But our two sine waves are not going to arrive at our measurement point at the same time. There will be an offset in the arrival time of the two sine waves. We can see this as a phase difference between the two waves. So we actually need to add a phase difference to the formula above. I'm going to call the phase difference p:
y=sin((x*pi)/180) + sin(((x-p)*pi)/180)
Or if we take out our radian conversions to make it look nicer:
y=sin(x) + sin(x-p)

If you plot the above with p=90 for x=0:540 you get this image (the two signals are red and green, the sum is black):


I made that plot in R using this code:
   p <- 90                               # phase difference in degrees
   x <- seq(0, 540, 1)                   # x axis, in degrees
   w1 <- sin((x*pi)/180)                 # first sine wave (degrees converted to radians)
   w2 <- sin(((x-p)*pi)/180)             # second sine wave, shifted by p degrees
   out <- w1 + w2                        # the sum of the two

   plot(x, out, type="l", ylim=c(-2,2))  # the sum, in black
   lines(x, w1, col="green")
   lines(x, w2, col="red")

But you could use LibreOffice Calc or Excel too.

Calculating the phase difference

In the calculation above we just decided that p should be 90. But how do we work out what it actually is?

Well first of all we need to work out the difference in distance between our measurement point and each of the sources. The easiest way to do this is to put everything on a grid and give everything coordinates.

So let's put source one at (0, -0.86) and source two at (0, 0.86). Why? Well those numbers give us a distance of 1.72m between the two sources. 1.72 is half of 3.44. The speed of sound is roughly 344m/s (it was a good value to choose when working to two d.p.!) so we will be able to get predictable patterns for 50Hz, 100Hz, 200Hz, etc. It keeps the maths easy!

For now, let's put the mic at (10, 0). We've got a long, thin isosceles triangle with the two sources at one end and the measurement point at the tip. I've set it up like that so that we know that we should have no phase difference between the two. It means that we can check our maths really easily.


Now we need to find the distance between points on a graph (thanks Google! Or just remember Pythagoras from high school...). So then we've got two distances (d1 and d2). Now we just need the difference between these (d2 - d1); I'm going to call this difference d (for the positions described above, d should be 0).

We've got the difference in distance, but how do we turn that into a difference in phase angle? With this cute little formula, where d is the difference between the two source-to-measurement-point distances:
p = 360d/λ
Let's say we're calculating for 100Hz. Using the wave equation:
λ = c/f
344/100 = 3.44m. So p = (360*0)/3.44 = 0! Great, we can now calculate the phase angle! Check it with the measurement position of (0, 10); you should get a nice predictable number!
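Here's the same calculation as a little R snippet (same made-up geometry as above):

   src1 <- c(0, -0.86); src2 <- c(0, 0.86); mic <- c(10, 0)
   speed <- 344; f <- 100
   d1 <- sqrt(sum((mic - src1)^2))   # distance from source one to the mic
   d2 <- sqrt(sum((mic - src2)^2))   # distance from source two to the mic
   d <- d2 - d1                      # difference in distance
   lambda <- speed / f               # wavelength, c/f
   p <- 360 * d / lambda             # phase difference in degrees (0 for this layout)
   # move the mic to (0, 10) and p comes out at -180 degrees, i.e. half a cycle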

Calculating the level

This is all very well but we've just been talking about distance and we've not taken into account that sources get quieter as you move away from them. So we need to work out what the level should be at the measurement point. This means updating the formula for adding sound waves together. All we need to do is add a level multiplier to each sine wave:
y = L1*sin((x*pi)/180) + L2*sin(((x-p)*pi)/180)
Now we can plot with level differences:
But we need to calculate what the level should actually be. Because we're using point sources we can use the inverse square law, where sourceInitialLevel is the level of the source at 1m:
Level drop in dB = 20*(log10(abs(distance)));
Level at point = sourceInitialLevel - level drop;
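In R, with a made-up source level of 100dBSPL at 1m, that looks something like:

   sourceInitialLevel <- 100              # made-up level of the source at 1m, dBSPL
   distance <- sqrt(10^2 + 0.86^2)        # source one to the mic at (10, 0)
   levelDrop <- 20 * log10(distance)      # dB lost over that distance
   L1 <- sourceInitialLevel - levelDrop   # level of source one at the measurement point
   L1                                     # roughly 80 dBSPL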
We've nearly got everything we need. Now we just need to find the actual summation level.

Finding the final level at the measurement point

This is the bit that I got stuck on for ages. I was trying to differentiate the sum formula to find the maxima and it was getting really quite messy. Then I took a step back and decided - it's only 360 points (I think you only actually need to do it on 180), why don't I just rattle through them and manually find the maximum point?! So that's what I did.

Now we come to a point that I need to think about a bit more before I can give a nice justification for why it's needed: I chose to convert my dBSPL into pressure when it came to adding together the two sine waves. I calculated the level at the measurement point in dBSPL (using all the stuff above) but then converted that level into raw pascals. Once I had found the maximum level of the summed sine wave I converted the raw pascals back into dBSPL. I think that's the right way to get correct dBSPL results.
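Here's a sketch of that with made-up numbers (94dBSPL is handy because it's very nearly 1 pascal, which makes it easy to sanity check):

   p0 <- 20e-6                           # reference pressure, 20 micropascals
   L1 <- 94; L2 <- 91; phase <- 120      # made-up levels (dBSPL) and phase difference (degrees)
   a1 <- p0 * 10^(L1 / 20)               # dBSPL -> pascals
   a2 <- p0 * 10^(L2 / 20)
   deg <- 0:359                          # brute force over one cycle in 1 degree steps
   y <- a1 * sin(deg * pi / 180) + a2 * sin((deg - phase) * pi / 180)
   20 * log10(max(abs(y)) / p0)          # peak of the sum, back in dBSPL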

All done!

Putting it all together.

We have now got all of the tools needed to calculate the level at any point in space from two given omnidirectional sources. Now we need to display this data.

First build a grid of measurement points, then iterate through the grid calculating the level at each point using the maths from above each time.
  1. Find the distance to each source from the measurement point.
  2. Find the difference between the two distances.
  3. Calculate the level of each source at the measurement point.
  4. Sum the two sources.
  5. Find the peak of the summed sine wave.
Now that you have a level for each measurement point it's just a case of assigning a colour to that level (I have just mapped the level range to 0-50% of hue on an HSB colour picker, but you can also find the max and work down in 1, 3, or 6dB increments to show you exactly where the -3/6/9/12dB points are).

So you've got a grid of colours, all you have to do now is display them!
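For completeness, here's a minimal R sketch of that whole loop - not my Processing code, just the same maths end to end, with two made-up sources at 100dBSPL (at 1m), a single frequency, and a 0.25m grid:

   speed <- 344; f <- 100; lambda <- speed / f
   p0 <- 20e-6; srcLevel <- 100                          # made-up source level at 1m, dBSPL
   src1 <- c(0, -0.86); src2 <- c(0, 0.86)
   xs <- seq(-10, 10, 0.25); ys <- seq(-10, 10, 0.25)    # 0.25m measurement grid
   grid <- matrix(NA, length(xs), length(ys))
   for (i in seq_along(xs)) {
     for (j in seq_along(ys)) {
       m <- c(xs[i], ys[j])
       d1 <- sqrt(sum((m - src1)^2)); d2 <- sqrt(sum((m - src2)^2))           # step 1: distances
       phase <- 360 * (d2 - d1) / lambda                                      # step 2: difference -> phase
       a1 <- p0 * 10^((srcLevel - 20 * log10(max(d1, 0.01))) / 20)            # step 3: level, in pascals
       a2 <- p0 * 10^((srcLevel - 20 * log10(max(d2, 0.01))) / 20)
       deg <- 0:359
       y <- a1 * sin(deg * pi / 180) + a2 * sin((deg - phase) * pi / 180)     # step 4: sum
       grid[i, j] <- 20 * log10(max(abs(y)) / p0)                             # step 5: peak, back to dBSPL
     }
   }
   image(xs, ys, grid, col = rev(heat.colors(50)), xlab = "x (m)", ylab = "y (m)")   # loudest areas in red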

I did my version in Processing because it's such a fast way to chuck a proof of concept together, but you can probably use any language you like. If anyone would like to see my code, or fancies fixing my mistakes, ask me and I'll make the BitBucket public (warning: the project has moved on quite a bit from here).

Below you can see two screenshots from my Processing project. I created a grid with 0.25m spacing for the measurement points. You can see the sources as white cubes. Each measurement point is coloured depending on how loud it is: red is loud, then it goes through yellow and green, and ends with blue for the quiet areas.


The mark in the middle of this image just shows the centre of the plane.
I wrote this post a good few days ago. The project can now do delay, polarity, and most importantly many sources. Here are some screenshots:

End-fire cardioid simulation with 3dB per colour band.
Development mode showing actual level values at measurement points.

However, I think I have made a mistake, because here is a screenshot with identical settings in ArrayCalc (from d&b audiotechnik) and in my simulator. The sources are the same, the resolution is the same, the physics should be the same! But the pattern is totally different. I have a lime.

My Simulator vs ArrayCalc - I think I've made a mistake somewhere!
All feedback gratefully received etc etc.....

A very British hotel - A very poor advert for the entertainment industry

A few weeks ago I was watching a Channel 4 documentary about an exclusive London hotel (the Mandarin Oriental). It's a very posh hotel and they were putting on a wedding which seemed to be a very high budget event.

In my opinion (as everything is on this blog) the event was let down by the tech. Look at the care and attention that has gone into dressing the room, and then look at the plastic speakers on tatty black stands that have been supplied. In one picture you can see that the power cable to the loudspeaker wasn't even taped to the stand!







Perhaps this is an education thing? Perhaps people aren't aware that you can have really high quality audio, colour matched to your event, hiding in the background with cables colour matched to your carpet or walls, or even hidden under the carpet. You've just got to pay for it and find someone willing to go that extra mile.

The event manager says that he won't allow an event to go wrong, yet he is happy to rely on loudspeakers that are worth about the same as a glass of wine to his clients. I don't understand.

I would love to hear from the rental company that did this job and hear their side of the story. Did they educate the clients in what is possible? Why didn't they make extra effort, knowing that there was a camera crew in the room?

Also, I would love to hear from the Hotel to know if they are aware that it is possible to provide lighting and video equipment that is up to the same standard as their world-class dining and service.

Perhaps I should start a company specialising in top quality, bespoke audio and lighting systems....

I-Simpa: open-source acoustic modelling

In my search for an open-source, lower cost, and hopefully more user-friendly version of EASE I discovered I-Simpa. On the face of it I-Simpa looks great. Unfortunately they haven't released an installer since 2014.

UPDATE: The developer of I-Simpa commented below and pointed me towards a windows installer. It can be found here. I've not had the opportunity to delve into it yet. So the rest of this post is now obsolete...

If I-Simpa ran on Ubuntu, and I could still justify running Ubuntu, building from source wouldn't be an issue. Unfortunately, because most audio software doesn't support Ubuntu, I am now bound to Windows, and building from source on Windows is a totally different story for me! I've actually given up trying to get I-Simpa to build (I didn't want to have to download and learn how to use Microsoft Visual Studio, and I couldn't get it to compile using the terminal).

It is a shame. It looks like a great bit of software, but unless it's easy to build it won't gain any traction.

In the meantime I'll continue looking for a low-cost acoustic simulation tool.

As an aside - has everyone found the "Bash on Ubuntu on Windows"? It's epic. Find out how to enable it here.

Entertainment industry press - Yawn

Ok, enough about me and what I've been doing, it's time to ruffle some feathers. I'm sure that anyone reading this blog has read at least one of the industry magazines either online or in paper form (PSN, LS&I, AMI, Installation, etc). Let me summarise the content of the next issue:
  1. Here is a review of a show, wasn't it epic?!
  2. Look at this brand new product, it's going to revolutionise the industry.
  3. Here is an interview with a person saying the same stuff as the person before them.
  4. etc.....

What's the pattern?
They can't criticise. The industry press is sponsored by the industry; their income comes from advertising. That means that they can't give negative reviews for fear of losing revenue. I don't know about you, but I'm bored! Not every product to be released is the next amazing thing. Some of them are, to put it mildly, rather disappointing. Wouldn't it be amazing to have a product review that was honest? Yes, it could damage a company's reputation (and so it should if they keep releasing bad products), but it could also give them great feedback and help drive innovation in the industry.

This is true for show reviews too: imagine a review of a show saying "this felt just like the last 4 shows I've been to, the only difference was a new person stood on stage" or, "it sounded awful, I couldn't understand a word the lead singer was saying". If I were working on either of those shows I would welcome the feedback, learn from it, and invite the reviewer to my next show to see if I got better.

Press isn't supposed to be nice and keep everyone happy, but whilst it's funded by the very people it's reviewing, I don't think anything is likely to change.

Immersive Audio

I've been experimenting with ways of demonstrating immersive soundscapes to people without having to have a large number of loudspeakers.

Introducing VR audio. Oh wow I wish everyone could agree on a nice way of representing a 3D sound field. Most people seem to be settling on ambisonics, but should it be 1st order or 2nd order? What order should the tracks be in (alphabetical order obviously - looking at you YouTube..)? What format should the content be in?

So here is what I tried. First I mixed an example in 3D and then uploaded it to YouTube only to discover that the soundfield doesn't move when you look around! Perhaps I did something wrong in the convoluted process of channel ordering and metadata. I'm not sure. Frankly the process is far too difficult to be of any use in the real world at the moment. You can listen to that failed experiment here if you should wish.

Then, using Bruce's convolutions and correction filter, I did an example mixed in Ambisonics and uploaded it to SoundCloud as binaural. You can find that example here. I have never been impressed with binaural audio, even if it's a recording made on a dummy head with pinnae. I think perhaps it's linked to the art of foley. A realistic gun shot does not sound realistic when it's shown on screen. If you put yourself in an unrealistic situation everything must be exaggerated in order to be believed. I think it's the same with immersive audio - things that try to be too realistic lose their realism.

But that's a topic for a different day - "The path of academic research into immersive audio".

Art-Net Q-SYS plugin

UPDATE:
 - This post continues to get quite a lot of traffic. The plugin still exists; I know the download link is broken. I've left the freelance market since I wrote this, so keeping side projects up to date has had to take a back seat for a while. I hope that I can pick it up again in the future!


I've written a plugin for QSC Q-SYS that enables it to output Art-Net. You can download it here.

Q-SYS is an amazingly powerful integration tool that I believe can be used for much more than the boardroom A/V or commercial audio processing (airports, shopping centres etc) that it's known for.

With a little programming knowledge it is pretty simple to write plugins for Q-SYS, and it's easy to get TCP or UDP communications up and running. The real power of Q-SYS lies in its tried and tested architecture - it is amazing at redundancy, yet so simple to use and set up.

So with a little imagination it's easy to make things like custom control interfaces. A small microcontroller with a network stack (I use Texas C series, but even an Arduino would do fine) and a custom plugin for Q-SYS and you're away. Using the same method it's easy to make adapters - perhaps a Q-SYS to DMX512-A adapter, or Q-SYS to CAN-BUS? How about a Q-SYS hosted object-based surround processor (I've made one; perhaps I'll share it on here...).

The wonder of technology that's so open is that it's just a toolkit, it's up to you how you use it.

Who am I and what am I doing?

Who am I?

My name is Tom and my LinkedIn page informs me that I “provide technical services to the entertainment industry on a freelance basis”. Which just about sums it up. My world ranges from technical literature, through to system design, integration, bits of research and design, and even some RFID race timing. You can find out more about me on my LinkedIn should you wish.

What am I doing?

Well as I find myself with a bit of time on my hands I thought I would begin a blog to share my thoughts on various audio related topics. I want to share my reasonably non-technical take on some, often controversial, subjects and add my voice to the online debate. Hopefully I can learn some things along the way and if I’m lucky get the chance to impart some of my knowledge to someone else.

What will I cover?

I’ve got some plans for topics. Yes I do want to cover the cliché topics such as my thoughts on how to tune a system, hi-fi snake-oil, and no doubt I’ll mention sub arrays at some point. But I also want to address the boundary between audio industry, audio academia, and computer science. Occasionally I may even scratch psychology (unconscious bias and double blind tests). I won't make it too technical, I like to talk about things in the simplest way I can because that's how I understand them best. Warning - there will be lots of analogies!

I hope you find this blog interesting and I hope I have some fun writing it too. Please join in the conversation. Comment, share, and discuss.
