The ADSR Envelope with Audiolib.js

Filed in development | html5 | javascript | music | ui | web 6 Comments

So, let’s talk envelopes. First what they are, and what they can do to the quality of a sound.

Demo here: http://www.benfarrell.com/labs/examples/envelopes-12-17/, but read on for how it all works!

An envelope is a real simple concept actually. Take any signal, like a sound. Maybe that sound plays at a constant volume. When you apply an “envelope” to that sound, you are changing the volume of that sound while it’s played. It might go up and down, back up again, whatever.

How it goes up and down, and the speed at which the volume is changed is up to the details of the envelope used.

You could create an envelope that takes 8 hours to complete. Maybe you want to go to sleep with some music, and then wake up in the morning with music. If you know it takes you 30 minutes to fall asleep, you’ll start the music playing at a loud volume. Over the next 30 minutes, you envelope your sound from loud to quiet, to off. In the morning, 30 minutes before you wake up, the envelope makes the music go from off, to quiet, to loud again. It wakes you up!

While this 8 hours illustrates how an envelope works, it’s a bit different when talking musical tones.

It’s not that different, however. We’re still talking about volume over time, but we’re talking milliseconds instead of hours or even minutes or seconds.

The character or personality of a musical tone can be changed greatly by altering the envelope. While we talk about this in terms of 1/1000th of a second, you don’t really notice the volume changing as you listen to the tone. Your ear doesn’t necessarily detect that the volume is going up and down.

Instead, the sound just has a different tonal quality! A piano for example wouldn’t sound as sharp if it didn’t go from no sound to loud that quickly. An accordion though, has a longer time as it goes from quiet to loud. And that quality, the “attack” (amongst other factors), creates the personality of the tone.

Let’s talk specifics now. ADSR.

That’s Attack, Decay, Sustain, Release. The ADSR envelope is just one type of envelope, but it’s a popular one that’s been used in electronic music for decades.

  • Attack is the period of time right after the initial trigger (the key press), as the sound ramps up to its peak; it’s typically the loudest part of the sound
  • Decay is the phase while you’re going from the attack to the sustain: you’re “decaying” the volume from this sharp initial phase to the normal volume phase
  • Sustain is the normal phase of the sound.  It is typically quieter than the attack, and can go on for an indefinite period of time or for a specific amount of time
  • Release is the draw down from the sustain period to no sound.  A fade out

(Figure: the attack, decay, sustain, and release phases)
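To make the four phases concrete, here’s a minimal sketch of an ADSR curve as a plain function.  This is not Audiolib’s implementation: `adsrLevel`, `heldLevel`, and the parameter names are made up for illustration, and I’m assuming simple straight-line ramps between phases.

```javascript
// Hypothetical helper (not part of Audiolib.js): returns the envelope level
// (0..1) at time t, for a key that is held down for holdMs milliseconds.
// env = { attack: ms, decay: ms, sustain: level 0..1, release: ms }
function adsrLevel(t, holdMs, env) {
  // level while the key is still down, ms milliseconds after the key press
  function heldLevel(ms) {
    if (ms < env.attack) {
      return ms / env.attack; // attack: ramp from 0 up to 1
    }
    if (ms < env.attack + env.decay) {
      // decay: ramp from 1 down to the sustain level
      return 1 - (1 - env.sustain) * ((ms - env.attack) / env.decay);
    }
    return env.sustain; // sustain: hold steady
  }

  if (t < 0) {
    return 0;
  }
  if (t < holdMs) {
    return heldLevel(t);
  }
  // release: fade from whatever level we had at key-up down to 0
  var sinceRelease = t - holdMs;
  if (sinceRelease >= env.release) {
    return 0;
  }
  return heldLevel(holdMs) * (1 - sinceRelease / env.release);
}
```

A short attack with a long release would lean toward something plucked or struck, while a long attack leans toward the accordion end of things.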

I’ve talked about Audiolib.js in previous posts.  Audiolib is the Javascript library that enables you to make these dynamic sounds in Chrome and Firefox.

Audio programming isn’t easy though!  So while Audiolib helps out in awesome ways, it doesn’t have concepts of notes and music theory.  You have to tell it what frequencies to play, and if playing a chord, which individual frequencies make up the chord – which I explored and created my own helpers for.

The ADSR envelope is another example of something that Audiolib.js provides, however, it doesn’t provide any obvious usage for it.

There’s actually a VERY good reason for this.  While we’re talking about envelopes on musical tones, the 8 hour sleepy-time envelope is another good usage example.  And that’s still restricting envelopes to the volume of a sound.  There are tons more places in audio synthesis where envelopes can be applied.  Take effects: you can apply a distortion effect to a sound (like a rock guitar), and you can also envelope the amount of distortion which is applied to the guitar.  This has nothing to do with volume, and everything to do with just how much of something is applied to something else.

So, I’d like to create a usage of our ADSR envelope that is limited to producing a musical tone – especially in the example of producing live sound by using a trigger (here it will be your computer/laptop keyboard).

Let’s start with our previous example where we extend the Audiolib.js Oscillator via a plugin.  We extended it to simply take a musical notation, like an “A” and set the correct frequency:

audioLib.generators('Note', function (sampleRate, notation, octave){
    // extend Oscillator
    for ( var prop in audioLib.generators.Oscillator.prototype) {
        this[prop] = audioLib.generators.Oscillator.prototype[prop];
    }
    // do constructor routine for Note and Oscillator
    var that = this;

    // are we defining the octave separately? If so add it
    if (octave) {
        notation += octave;
    }
    that.frequency = Note.getFrequencyForNotation(notation);
    that.waveTable = new Float32Array(1);
    that.sampleRate = sampleRate;
    that.waveShapes = that.waveShapes.slice(0);
}, {});

So let’s figure out how to work in an ADSR envelope. The usage for the Audiolib.js version is like so:

myEnvelope = audioLib.ADSREnvelope(sampleRate, attack, decay, sustain, release, sustainTime, releaseTime);

The parameters work like so:

  1. Sample Rate:  The sample rate of the audio – I won’t go into it here, as it’s a basic setting for audiolib
  2. attack – the amount of time (in milliseconds) that the attack phase takes to complete
  3. decay – the amount of time (in milliseconds) that the decay phase takes to complete
  4. sustain – the level of volume during the sustain phase (from 0 to 1)  - the default is 1
  5. release – the amount of time it takes for the release phase to complete (in milliseconds)
  6. sustainTime – the amount of time it takes for the sustain phase to complete (in milliseconds).  This param is pretty important though, because if you pass in null, the sustain period is indefinite.  Unless you call into the envelope with a trigger, it will continue being in the sustain phase forever
  7. releaseTime – the amount of time between the release phase and the envelope looping around to the attack phase again
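Since the times are in milliseconds but audio gets produced one sample at a time, it can help to see how many samples each phase actually lasts.  Here’s a tiny sketch of that arithmetic.  `msToSamples` is a hypothetical helper of mine, not something Audiolib exposes, and I’m only assuming the envelope does a similar conversion internally.

```javascript
// Hypothetical helper: how many audio samples fit in a span of milliseconds
function msToSamples(ms, sampleRate) {
  return Math.round(ms / 1000 * sampleRate);
}

// Illustrative values only: a quick 10 ms attack at a 44100 Hz sample rate
var attackSamples = msToSamples(10, 44100);   // 441 samples
var releaseSamples = msToSamples(250, 44100); // 11025 samples
```

So even a “sharp” 10 millisecond attack is spread over a few hundred samples, which is why the envelope gets pulled sample by sample.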

The envelope has 6 states of being (0-indexed inside the code):

  1. Attack Phase
  2. Decay Phase
  3. Sustain Phase
  4. Release Phase
  5. Timed Sustain Phase
  6. Timed Release Phase

To kick off our envelope, you trigger it:

myEnvelope.triggerGate(true);

Now we can start using our envelope.  The usage is a little weird to me, as the Audiolib.js library treats it like a “generator” which seems a bit complicated for what it does.  I just want a stream of numbers, but OK I’ll bite.  I’ll use it with the byte arrays and whatnot, as if it’s an Oscillator.

var buffer = new Float32Array(1);
myEnvelope.append(buffer, 1);

So, I’m just pulling one value at a time from the envelope, and putting it into my “buffer”.  But my buffer only has one value in it at any time.  Like I said, I feel like I’m being forced into using it in a more complicated way than I need!  Maybe there’s something I’m missing.

Now, every time I create an audio data point in my sound, I can multiply the data point by the current value of my envelope.  Thus my envelope is applied!

this[this.waveShape]() * buffer[0];

To do this, I overrode the “getMix” function in the Audiolib Oscillator.  But I needed to do other things too.
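Here’s the same pull-one-value-and-multiply idea as a sketch you can run without Audiolib.  The envelope below is a stand-in I made up (a plain linear ramp), NOT Audiolib’s ADSREnvelope; it only copies the append(buffer, channelCount) calling shape described above.  `makeRampEnvelope` and `envelopedSample` are hypothetical names.

```javascript
// Stand-in envelope: ramps from 0 to 1 over rampSamples calls (rampSamples > 0).
// It overwrites the buffer with the current level, then advances one step.
function makeRampEnvelope(rampSamples) {
  var position = 0;
  return {
    append: function (buffer, channelCount) {
      var level = Math.min(position / rampSamples, 1);
      for (var i = 0; i < buffer.length; i++) {
        buffer[i] = level;
      }
      position++;
    }
  };
}

// one-slot buffer, reused for every sample
var envelopeBuffer = new Float32Array(1);

// pull one envelope value and scale the raw oscillator sample by it
function envelopedSample(rawSample, envelope) {
  envelope.append(envelopeBuffer, 1);
  return rawSample * envelopeBuffer[0];
}
```

In my real code, the raw sample is whatever the current wave shape produces inside getMix; here it’s just a number you pass in.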

Since I trigger the envelope with triggerGate, it will automatically cycle through the attack and decay phases to the sustain phase as the envelope is used.

It gets complicated at the release phase though.  After your computer/laptop keyboard is released, we need to enter the release phase.  But we’re still grabbing sound, because the release phase still produces sound as it fades.  So our Oscillator needs to track that it’s in the release phase (it knows by keeping track of the envelope.state, which is 3 or 5 here for release or timed release).

Then finally, when the envelope gets back to state 0 (that is, when the cycle begins again), we mark this note as “released”, so our buffer knows that it doesn’t need to pull from it anymore.  We have to be very careful not to pull from more notes than we need, because all this music stuff is hard work, and too many notes slow down your CPU and break the audio processing.

The above is if the sustain phase goes on as long as you hold the key!  What if it’s a timed sustain – then we need the logic in there to release the key when the envelope is done, rather than when our user releases the key.
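The bookkeeping described above can be sketched as a small function.  This isn’t my actual Note.js code: `EnvelopeState`, `trackRelease`, and the `releasing`/`released` flags are names invented for this example, and the state numbers are the 0-indexed ones listed earlier.

```javascript
// The six envelope states, 0-indexed as in the list above (names are mine)
var EnvelopeState = {
  ATTACK: 0,
  DECAY: 1,
  SUSTAIN: 2,
  RELEASE: 3,
  TIMED_SUSTAIN: 4,
  TIMED_RELEASE: 5
};

// Call once per sample with the envelope's current state.
// Returns true once the note has fully faded out and can be dropped.
function trackRelease(note, envelopeState) {
  if (envelopeState === EnvelopeState.RELEASE ||
      envelopeState === EnvelopeState.TIMED_RELEASE) {
    // still fading out, so keep pulling sound from this note
    note.releasing = true;
  } else if (note.releasing && envelopeState === EnvelopeState.ATTACK) {
    // the envelope wrapped around to the start: the fade is done
    note.releasing = false;
    note.released = true;
  }
  return note.released === true;
}
```

Because the timed release (state 5) is handled the same way as the normal release (state 3), the same check works whether the user lets go of the key or the envelope finishes on its own.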

Here’s our final Note.js code.  And here’s a controller for keeping track of keys being pressed.

It all comes together in my (Chrome only) demo:

http://www.benfarrell.com/labs/examples/envelopes-12-17/

The demo starts out by not using an envelope at all.  You’ll hear some clicky-ness when you press and release a key.  That’s because you’re hearing the transition between no sound and the abrupt start in the phase of the waveform.  It’s EXACTLY one of the reasons why envelopes are useful – to ease these transitions in and out.

When you turn on the envelope, you can start adjusting parameters to see how the different properties of attack, decay, sustain, and release affect the overall personality of the tone.

Making an Audiolib.js Plugin

Filed in development | html5 | javascript | music | ui | web Leave a comment

I just started checking out Audiolib.js this weekend.  Audiolib is a Javascript project to help you synthesize sounds and tones in JS.

Just recently, Google Chrome added the Web Audio API, while Firefox added the Audio Data API.  Both let you get low level with sound – you can add bytes to an audio buffer and manufacture sound in realtime.

The library is bundled with “sink.js” which handles the inconsistencies in the audio API between Firefox and Chrome to create a common API.  Once this is abstracted away, we can make some audio.

I’ve done the nitty gritty of writing bytes to the buffer in Flash, and from my limited playing Audiolib does a great job with this, and I’m ecstatic I don’t have to rewrite my Flash stuff!

Audiolib provides “generators” for you to work with.  A “generator” is something that makes a sound, and defines a common API for anything that makes a noise to write that sound to the audio buffer.

In the generators namespace/package, we have an Oscillator, White/Pink/Brown Noise, and a Sampler.  White noise is cool and all, but I jumped right to the Oscillator to make a real tone.  The Oscillator takes a sample rate and a frequency.

You can define it thusly:

dev = audioLib.AudioDevice(audioCallback /* callback for the buffer fills */, 2 /* channelCount */);
osc = audioLib.Oscillator(dev.sampleRate /* sampleRate */, 440 /* frequency */);

So, first we defined the device to use – we told it what to use for the audio callback, and how many channels there are (we have 2 channels…left and right).

The audio callback is a pretty standard thing in the world of audio. When the sound card is starting to run out of audio to play, it signals Javascript and says “Hey I need more audio! Fill me up with some bytes”. With our “audioCallback” function, we say “No problem sound card! I got your back! When you run low, please call our audio callback method – this will fill your audio buffer up”

Here’s an example of the audio callback:

function audioCallback(buffer, channelCount){
  osc.append(buffer, channelCount);
}

All that happens here is that the buffer and channel count get passed into the method, and we hand them to our Oscillator, which generates the appropriate bytes and sends them to the buffer.

But what is that generator…the Oscillator? Well, it creates a sound at a certain frequency. We pass in the frequency as we instantiate this “osc” object. When the audioCallback fires, it pulls this frequency from the osc object and plays a specific tone!
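If you’re curious what filling the buffer might look like without the Oscillator doing it for you, here’s a rough sketch.  It is not Audiolib’s code.  `makeSineCallback` is a name I made up, and I’m assuming the buffer is interleaved (left, right, left, right…) when there are 2 channels.

```javascript
// Hypothetical: build an audio callback that writes a sine tone by hand
function makeSineCallback(sampleRate, frequency) {
  var phase = 0;
  // how far (in radians) the wave advances for every sample
  var step = 2 * Math.PI * frequency / sampleRate;

  return function (buffer, channelCount) {
    for (var i = 0; i < buffer.length; i += channelCount) {
      var sample = Math.sin(phase);
      // write the same sample to every channel of this frame
      for (var ch = 0; ch < channelCount; ch++) {
        buffer[i + ch] = sample;
      }
      phase += step;
      if (phase >= 2 * Math.PI) {
        phase -= 2 * Math.PI; // keep the phase small so precision doesn't drift
      }
    }
  };
}
```

Keeping the phase around between calls matters: the callback fires over and over, and the wave has to pick up exactly where the last buffer left off or you’ll hear clicks.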

This is awesome! All the hard work is done for us – but one of the first things I wanted to tackle here was to make it easier for musicians to understand.

See, casual musicians don’t know that a middle “A” on a piano oscillates at 440hz – they just know that they hit a middle “A” to play the tone. I want something that makes sense for casual musicians, so my first small task with audiolib is to override Oscillator to take a notation vs a frequency.

Audiolib provides a plugin spec, but it tripped me up a little to understand how a “Generator” worked. See, they have a js/generation folder which contains the Oscillator. It’s all quite readable and self-documented.

However, the “append” method was not found in the Oscillator class at all! I was looking EVERYWHERE for it!

Eventually I found it in the “wrapper-end.js” script that’s outside of all the folders.

Basically, you create a generator like Oscillator, and you write it to the audioLib.generators namespace. By virtue of being in this namespace, the “wrapper-end” script comes along to wrap things up. It takes all the things in the audioLib.generators namespace and adds the generator base functions. Basically it makes all the generators extend the generator base class after the fact.
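To show the general pattern (and only the pattern, this is not the actual wrapper-end.js source), here’s a sketch where base methods get copied onto everything in a registry after the fact.  `registry`, `Blip`, and `generatorBase` are all invented for this example, and the body of `append` is my guess at the kind of thing a shared append would do.

```javascript
// Hypothetical stand-in for the audioLib.generators namespace
var registry = {};

// A generator that only knows how to produce values, with no append of its own
registry.Blip = function () {
  this.phase = 0;
};
registry.Blip.prototype.generate = function () {
  this.phase = +!this.phase; // flip between 0 and 1
};
registry.Blip.prototype.getMix = function () {
  return this.phase;
};

// Shared base functions that every generator should end up with
var generatorBase = {
  append: function (buffer, channelCount) {
    for (var i = 0; i < buffer.length; i += channelCount) {
      this.generate();
      for (var ch = 0; ch < channelCount; ch++) {
        buffer[i + ch] = this.getMix();
      }
    }
    return buffer;
  }
};

// The "wrap up" step: copy base functions onto every registered generator
Object.keys(registry).forEach(function (name) {
  var proto = registry[name].prototype;
  Object.keys(generatorBase).forEach(function (method) {
    if (!(method in proto)) {
      proto[method] = generatorBase[method];
    }
  });
});
```

After the last loop runs, `new registry.Blip().append(...)` works even though Blip never defined append itself, which is exactly why I couldn’t find it in the Oscillator file.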

Kinda confusing and sneaky if you ask me! Oh well.

Regardless, the plugins are well documented, so I copied the sample generator plugin:

audioLib.generators('SemiClock', function (sampleRate){
	this.sampleRate = sampleRate;
}, {
	prevSample: 0.0,
	generate: function(sample){
		this.phase = + !this.phase;
	},
	getMix: function(){
		return this.phase;
	},
	phase: 0
});

Cool, so we’re naming a generator in the generator namespace, and defining some required methods, like “generate” and “getMix”.

Well, all I want to do is make a copy of all the objects in Oscillator and add a function and initialization routine to accept a musical notation and instantiate an oscillator with the correct frequency.

So here’s my take:

audioLib.generators('Note', function (sampleRate, notation, octave){
    // extend Oscillator
    for ( var prop in audioLib.generators.Oscillator.prototype) {
        this[prop] = audioLib.generators.Oscillator.prototype[prop];
    }

    // do constructor routine for Note and Oscillator
    var self = this;
    self.octave = isNaN(octave) ? 4 : octave;
    self.setNotation(notation);
    self.waveTable = new Float32Array(1);
    self.sampleRate = sampleRate;
    self.waveShapes = self.waveShapes.slice(0);
}, {
    /* incremental tones as sharp notation */
    sharpNotations: ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"],

    /* incremental tones as flat notation */
    flatNotations: ["A", "Bb", "B", "C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab"],

    /**
     * notation setter
     * @param notation (octave is optional)
     */
    setNotation: function(nt) {
        this.notation = nt;

        // does notation include the octave (a trailing digit, as in "A4")?
        var lastChar = nt.charAt(nt.length - 1);
        if ( !isNaN( parseInt(lastChar, 10) )) {
            this.octave = parseInt(lastChar, 10);
            // strip only the trailing octave digit, so "A#4" becomes "A#"
            this.notation = nt.substr(0, nt.length - 1);
        }
        this.frequency = this._getFrequencyForNotation(this.notation);
    },

    /**
     * turn a notation into a frequency
     * @param notation
     * @return frequency (undefined if the notation isn't recognized)
     */
    _getFrequencyForNotation: function(nt) {
        var freq;
        var indx = this.sharpNotations.indexOf(nt);

        if (indx == -1) {
            indx = this.flatNotations.indexOf(nt);
        }

        if (indx != -1) {
            // shift by 12 half steps for every octave away from octave 4
            indx += (this.octave-4) * this.sharpNotations.length;
            freq = 440 * (Math.pow(2, indx/12));
        }
        return freq;
    }
});

Basically what I did here is extend the Oscillator class. I looped through all the properties of the Oscillator.prototype object and copied them onto the new class.

I then went into the Oscillator generator and copied the small initialization routine.

Then I injected my own methods! I replaced the frequency setting with a method to lookup the index of the musical notation in my preset arrays of notations. I also parsed out the octave (if it existed).

In the end, I have a “Note” class which is exactly the same as the Oscillator class. However, instead of passing in the frequency (i.e. 440hz), you pass in “A”, or “A4” for the 4th octave.

To change the notation of the oscillator/note, simply call the setNotation again!

I’m definitely looking forward to exploring Audiolib.js more. It has a few concepts that I hadn’t gotten around to implementing in my own Flash framework yet, so I’m excited to see how it’s done.
 

Hello from NCDevCon! My 2011 Presentations

Filed in development | flash | flash/flex | flex | html5 | ios | music | ui | video | web Leave a comment

Hello from NCDevCon.  If you were able to attend my presentations, thanks!

I’ve gone ahead and written them up as blog posts for you if you’d like to revisit what I talked about.

 HTML5 vs Flash Video: Choose Wisely (slides)

Live Instrumentation in Flash (slides)

Live Instrumentation in Flash Part 6 – Some Demos

Filed in development | flash/flex | flex | music | web Leave a comment

Demos

Live Instrumentation in Flash Part 7 – Beyond Flash

Filed in development | flash | flash/flex | javascript | music | web Leave a comment

At the heart of all our audio generation lies one simple Flash feature – the sample data event.

Can we find things outside of Flash that will let us do similar things?

The answer is….of course!  People have been writing desktop audio and music synthesizer software for years.  Flash and AIR offer some decent functionality on the desktop, but writing something in C++ would blow Flash away of course.

The reason is that Flash has only been in the generative audio game for a few years, while people writing C++ have been at this for decades.  As such, they have access to ready made filters, libraries, etc.

Take, for example, Node Beat – written with OpenFrameworks

http://forum.openframeworks.cc/index.php?topic=6070.0

 

Objective-C for doing iPhone development has the nice benefit of access to core libraries that Apple OSX Cocoa developers have had access to for years.  The basic AudioUnit in Cocoa or Objective-C gives you the low level access you need to make things like Mobile Synth:

http://code.google.com/p/mobilesynth/

Or even a simple tone generator:

http://cocoawithlove.com/2010/10/ios-tone-generator-introduction-to.html

Basically though, you’ll find that anything that gives you low-level audio access will give you what you need.  Speed is the key here, number crunching needs to be done in an instant.  Flash can keep up OK, and gives you the bare minimum of what you’d need to generate sound.

What about Javascript?  Well, sort of!  The Web Audio API has been defined and implemented on the developer version of Chrome and there’s also an Audio Data API draft in Firefox.  The basics are the same as Flash, but many more goodies are offered to help you along.

 

So, in the end we need speed and some basic API hooks.  On the web, Flash seems the most reliable, but it sounds like Chrome and Firefox are nipping at Flash’s heels with some very cool stuff coming up!

And in the mobile world, just like the desktop world, it seems this sort of base level audio access is standard as well.

Live Instrumentation in Flash Part 5 – Voice Synthesis

Filed in development | flash/flex | flex | music | ui Leave a comment

So far, we’ve discussed how to create tones, notes, and chords.  We’ve also gone into several different algorithms to change how a digital instrument sounds.  Changing how our digital instrument sounds is a lot more than simply changing the algorithm that defines your sound wave!  You may have noticed that given the several algorithms we covered in part 2 of this series, the sounds didn’t change all that much.

What can we do to create some more variety?

Attack, Decay, Sustain, Release

In electronic synthesizers, there are the concepts of “attack”, “decay”, “sustain”, and “release” (commonly abbreviated ADSR).  The easiest way to explain this is to imagine yourself hitting a key on a piano.

The moment you hit the key, the string is struck by the piano’s hammer.  This initially creates a loud tone.  The initial tone is much louder than anything that happens moments after.  This initial loudness is called the “attack”.  As the initial loudness wears off, the sound draws down to the normal volume as you hold the piano key.  This drawdown is known as the “decay”, and the rest of the duration that you’re holding the key is known as the “sustain”.

Once you release that piano key (on a real piano), there would be a small amount of sound coming out as the string is still vibrating.  This will eventually fade away – but this period is known as the “release” period.

Envelopes

Attack, Decay, Sustain, Release is a type of envelope.  At a basic level, an envelope controls the volume of a waveform as the cycles continue.  There are more types of envelopes, but the ADSR envelope is the most commonly known.

Some synthesizers break up the phases even more.  Consider the DAHDSR envelope.  This starts with a delay phase.  This is the phase before the attack – a phase where you don’t hear any audio, or much at all, in the initial lead up to the loudness that is the attack phase.

Then comes the “hold” phase.  This is meant to prolong the time between the attack and the decay.  It keeps the volume high!

And then the rest continues with the decay, sustain, and release.

Different types of synthesizers take different envelope approaches.  Some even let you create your own – however it seems that the simple ADSR is the most common.

Shortening the Sustain

One thing I’ve found is that a sustain that keeps going and going and going while you press a key, sounds kind of unnatural.  It might be appropriate for something like a pipe organ, but if you’re trying to replicate the percussiveness of a piano or a bass guitar pluck, you don’t want something that goes on forever.

Keeping it short and sweet is the perfect key to making something sound natural!  I’m still experimenting with envelopes to try to accomplish this effect without just cutting off the waveform.  Overall though, I’ve found that limiting a sustain to a few hundred milliseconds is the perfect solution to making a more natural sounding instrument – especially if you want something percussive like a bass pluck or piano key.

Harmonic Overtones and other Imperfections

Imperfections in the note quality are a good way to make a computer generated note sound more natural.  Perhaps the tone could be off a couple steps in the frequency, however there are other ways!

One such way is to give a note a harmonic overtone.  Remember back in part 4 when we went into note relations?  We talked about octaves, where an octave higher will produce the same key/tone, but is at a higher pitch.  An octave higher or lower is in perfect harmony with the root note.

With harmonic overtones, we layer several higher tones on top of the note, each at a whole-number multiple of the note’s frequency (the first one is exactly an octave up).  This gives a more natural sound.  Many instruments in the real world will have the first 2 overtones be in perfect harmony.  In this regard, starting at middle A of 440hz, we’d have overtones of 880hz, 1320hz, and 1760hz.  All these notes play at the same time to produce something more natural!

Another factor in real world instruments is that these overtones (especially the third and fourth) may not land exactly where you’d expect, so they are not in absolute harmony.  These higher overtones might be a little sharp or a little flat.  These nuances contribute to the instrument’s unique sound.  In fact, if you’ve ever seen the flare on the end of the trumpet, it’s not to get more volume, but to correct the harmonic overtones to be closer to absolute harmony.

We can absolutely replicate this in our digital instruments!  We can mix harmonic overtones together when playing a single note, and even make variations to avoid absolute harmony and give our instruments some character.
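As a rough sketch of what mixing overtones could look like, here are two small functions.  These are hypothetical helpers, not Flashamaphone code: `harmonicFrequencies`, `overtoneSample`, and the `gains`/`detune` parameters are names I’m inventing here, and the detune factors are just one way you might nudge an overtone slightly sharp or flat.

```javascript
// Frequencies of a note plus its overtones: whole-number multiples of the root
function harmonicFrequencies(fundamental, count) {
  var freqs = [];
  for (var n = 1; n <= count; n++) {
    freqs.push(fundamental * n);
  }
  return freqs;
}

// One sample of the layered tone at a given time (in seconds).
// gains[k] is the volume of the k-th layer (gains[0] is the root note).
// detune[k] is optional: e.g. 1.003 makes that layer a touch sharp.
function overtoneSample(timeSeconds, fundamental, gains, detune) {
  var sum = 0;
  for (var k = 0; k < gains.length; k++) {
    var ratio = (k + 1) * (detune && detune[k] ? detune[k] : 1);
    sum += gains[k] * Math.sin(2 * Math.PI * fundamental * ratio * timeSeconds);
  }
  return sum;
}
```

One thing to watch for: the layers add up, so the gains should be small enough that the total stays within the -1 to 1 range the audio buffer expects.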

Filters, Modulations, and more

We’ve only scratched the surface of creating new and exciting voices for our instruments.  We can pass our tone through a filter, do frequency modulation, and more.  I’m still exploring, so this is best left for another time!

Live Instrumentation in Flash Part 4 – A Little Music Theory

Filed in development | flash | flash/flex | flex | music Leave a comment

In the last post, I threw a number at you: 440.  This was the frequency I put in my code to make the resulting audio sample listenable.

Well, it was no accident.  440 cycles is actually a middle A.

A Brief History

It wasn’t always 440hz (hertz actually denotes cycles per second), but not for lack of trying!  Remember that we haven’t gone digital until the past 30 years or so, and industrial manufacturing is only 70-80 years old.  People have been making instruments and playing music on them for thousands of years.  Beethoven was playing the piano in the late 18th century.

When you hear Beethoven’s music recreated, you’re not hearing it exactly as he played it.  Yes, all the music is transcribed and is probably fairly true to the original.  However, tuning a piano in the 18th century wasn’t an exact science.  Beethoven’s middle A might not have exactly been 440hz.

While Beethoven was a German composer, Chopin was Polish.  Being in different countries, so far apart, Chopin probably had a different tuning than Beethoven.  If you were in the German audience listening to Beethoven, you might be accustomed to a middle A tuned around 438hz.  Whereas, if you traveled to Poland, and heard a 445hz middle A, you might think something was a bit off.  You weren’t used to the different tuning.

Recorded music and touring artists changed all this.  As records were being sold and artists toured, someone in Japan could listen to a record produced in Spain.  As folks started getting together from all over the world, a sort of standard tuning was made out of all of this.  When digital music and exact measurements of frequencies came along, we could finally settle on a specific standard that everyone could agree on.

 

The Math of Notes and Octaves

So far, we’ve only talked about middle A, or “A4”.  Since A4 is basically the center of it all, we can go from there.  First off, we can talk octaves.  Take our middle A, for example.  If you go down an octave, you land on another A….A3.  Basically, it’s the same tone but a lower frequency.

How do we figure out the frequency?  Well this time it’s easy, no fancy trigonometry!  Just halve the 440.  So A3 is 220.  A2 is 110.  A5, on the other hand, is double 440 (880hz).

Now, what’s between A4 and A5?  What’s between an octave?  Well, there are 12 notes between the two.  In music terminology going up 1 note is called a “half step”.  Going up 2 notes is called a “whole step”.

Starting at A4 (or any A):

  • A
  • A sharp/B flat
  • B
  • C
  • C sharp/D flat
  • D
  • D sharp/E flat
  • E
  • F
  • F sharp/G flat
  • G
  • G sharp/A flat

If we assume that we’re starting at A4, and going up 1/12 of an octave for each half step, there’s actually a mathematical formula to calculate the frequency!

That formula is: frequency = 440 * (Math.pow(2, indx/12));

You should ALSO note the relation between half-steps.  If note X is a half step below note Y, note X is said to be Y “flat”, and note Y is said to be X “sharp”.  This applies to any key: sharps and flats aren’t just the black keys on the piano, though it makes total sense to name them that way given their positions.

So in the Flashamaphone framework, I’ve gone ahead and started at middle A.  I’ve put each of the twelve notes into an array.  Whichever index we’re trying to access is represented by “indx” in the above formula. We subtract 12 from the index for each octave below 4, and add 12 for each octave above 4.

What we have here is a simple way to calculate the frequency for a given note!  Pretty cool!
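Here’s that calculation as a small standalone sketch, written in Javascript rather than pulled from Flashamaphone.  `NOTE_NAMES` and `frequencyFor` are names I’m making up for the example; the formula and the start-at-A array are the ones described above.

```javascript
// The twelve notes, starting at A (sharp spellings only, to keep it short)
var NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"];

// Frequency for a note name and octave; returns undefined for unknown names
function frequencyFor(note, octave) {
  var indx = NOTE_NAMES.indexOf(note);
  if (indx === -1) {
    return undefined;
  }
  // subtract 12 per octave below 4, add 12 per octave above 4
  indx += (octave - 4) * NOTE_NAMES.length;
  return 440 * Math.pow(2, indx / 12);
}

var a4 = frequencyFor("A", 4); // 440
var a5 = frequencyFor("A", 5); // 880, one octave up doubles it
var a3 = frequencyFor("A", 3); // 220, one octave down halves it
```

Note that because the array starts at A, each “octave” in this scheme runs from A up to G#.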

How Western music is different from others

Just like how Beethoven’s piano was tuned differently from Chopin’s, different cultures have different connotations for chords.  The most basic example is major vs minor.  In our western culture, we associate major keys with a happy sound, but minor keys are sad.  In other cultures, this doesn’t necessarily hold true.  In fact, Indian music has no concept of major vs minor….and so, no similar connotations.

What we’ll cover next with chord structures will be accurate for any culture.  However, what each chord construct evokes for a person in western civilization is completely different for other cultures.  We’ve already discussed that major vs minor doesn’t mean anything important for Indian music, but that doesn’t mean the construct doesn’t exist!  It’s all just different ways of describing the ways notes can make up a chord.

Chord Structures

Starting from the root, you can get all the notes in a “key signature” the same way from any root note.  A key signature groups notes together – and typically if you play only the notes in the key, your music won’t sound off-putting.

A major key is always comprised of the first, third, fifth, sixth, eighth, tenth, and twelfth half tones up from your root, counting the root itself as the first.

A minor key is used in western culture to make something sound sad or similar.  Likewise, it’s always comprised of the first, third, fourth, sixth, eighth, ninth, and eleventh half tones up from the root, counting the root itself as the first.

I take care of this in Flashamaphone by making the following arrays for you:

// major key
notesInKey.push( notesToIndex[0] );
notesInKey.push( notesToIndex[2] );
notesInKey.push( notesToIndex[4] );
notesInKey.push( notesToIndex[5] );
notesInKey.push( notesToIndex[7] );
notesInKey.push( notesToIndex[9] );
notesInKey.push( notesToIndex[11] );

// minor key
notesInKey.push( notesToIndex[0] );
notesInKey.push( notesToIndex[2] );
notesInKey.push( notesToIndex[3] );
notesInKey.push( notesToIndex[5] );
notesInKey.push( notesToIndex[7] );
notesInKey.push( notesToIndex[8] );
notesInKey.push( notesToIndex[10] );

The most common chord structure is a triad, which means that there are 3 notes in a chord.  The first note in a chord is the “root”.  A “C” chord will always have the “C” note as the root, in other words.

A typical triad would consist of the 1st, 3rd, and 5th notes in the key signature. There’s lots more chord types, which you can use Flashamaphone to generate for you, but that’s the basics!
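To tie keys and triads together, here’s a sketch in Javascript (not the actual Flashamaphone source).  The half-step offsets are the same index numbers used in the arrays above; `CHROMATIC`, `notesInKey`, and `triad` are names made up for this example.

```javascript
var CHROMATIC = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"];

// half-step offsets from the root, same numbers as the arrays above
var MAJOR_OFFSETS = [0, 2, 4, 5, 7, 9, 11];
var MINOR_OFFSETS = [0, 2, 3, 5, 7, 8, 10];

// all seven notes in a key, starting from the root
function notesInKey(root, offsets) {
  var start = CHROMATIC.indexOf(root);
  return offsets.map(function (offset) {
    // wrap around the end of the array so any root works
    return CHROMATIC[(start + offset) % CHROMATIC.length];
  });
}

// a triad is the 1st, 3rd, and 5th notes of the key
function triad(root, offsets) {
  var key = notesInKey(root, offsets);
  return [key[0], key[2], key[4]];
}

var cMajorChord = triad("C", MAJOR_OFFSETS); // C, E, G
var aMinorChord = triad("A", MINOR_OFFSETS); // A, C, E
```

The only difference between the major and minor triad is that middle note: it sits 4 half steps above the root in a major key, and 3 in a minor key.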

Live Instrumentation in Flash Part 3 – Generating a Tone

Filed in flash | flash/flex | flex | music | ui | web Leave a comment

Basic Sine Wave Generation

OK!  Let’s step into the wayback machine and go back to part 1 of this series.  We were talking about building audio samples with a byte array:

var bytes:ByteArray = new ByteArray();
for (var c:int = 0; c < 8192; c++) {
    bytes.writeFloat(number); // left channel
    bytes.writeFloat(number); // right channel
}

I never DID tell you guys what goes in number, did I?

Well, that’s where we can start making something listenable (and not the white noise we made before).

Lets make some changes:

for(var c:int = 0; c < durationinseconds*44100; c++) {
    // one sample of a 440hz sine wave at 44,100 samples per second
    var number:Number = Math.sin(c * 2*Math.PI/44100 * 440);
    bytes.writeFloat(number); // left channel
    bytes.writeFloat(number); // right channel
}

OK, so what’s going on here? We’ve introduced, first of all, “durationinseconds”. We multiply by 44100. Since there are 44,100 samples per second, all we need to do is multiply by the number of seconds we want to produce, and generate that many samples.

Next up is to generate a sine wave. Why a sine wave? Well, think about your high school algebra class. There are all sorts of mathematical equations, and you can graph any of them. Most basic is a simple line.

(Figure: a line - it starts infinitely low, and ends infinitely high, never repeating)

The problem with a line, is that it keeps going up and up and up and up. That’s really not what we want here. What we want is something that goes high, comes back down, goes up again, and can do that forever.

We COULD utilize a little programming, and make a line, but at a set high point, reset and make it start at the low point again.

(Figure: a line that cycles from going up and up, but resetting at 0 again)

This is actually a good way to produce a sound, but let’s talk sine waves for now.

The sine function to produce wave cycles is fantastic. The line can go up and down all day as you continually increment, but not only that, it’s curvy! So it goes up and down in a very smooth way.

Goes up and down repeatedly (forever) and is curvy

The thing about sine waves is that since they repeat themselves, they can ALSO be measured like a circle – in degrees. If you go 360 degrees, you’re right back where you started from. The bummer about trigonometry though, is that everything is measured in radians.

So 360 actually is equal to 2 Pi. 180 is Pi, and 90 is Pi/2. So above, we took 2 Pi and divided by 44100. We’re basically making a baseline here and saying that one full revolution through a sine wave is equivalent to one second. We multiply this by our iterator (c), to make the individual samples for each data point.

This would actually produce a SUPER low tone (I doubt you’d be able to hear it). I’m going to go ahead and ALSO multiply by 440 to produce something listenable.
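If you want to poke at the numbers outside of Flash, the same math can be sketched in plain JavaScript. This is just an illustration under my own naming (`sineSamples` and `SAMPLE_RATE` are not from Flashamaphone), using a `Float32Array` in place of the byte array.

```javascript
const SAMPLE_RATE = 44100; // samples per second, as in the post

// Generate mono sine samples using the same formula as the loop above.
function sineSamples(frequency, durationInSeconds) {
  const count = Math.floor(durationInSeconds * SAMPLE_RATE);
  const samples = new Float32Array(count);
  for (let c = 0; c < count; c++) {
    // 2*PI/SAMPLE_RATE is one cycle per second; multiplying by the
    // frequency gives that many cycles per second.
    samples[c] = Math.sin(c * 2 * Math.PI / SAMPLE_RATE * frequency);
  }
  return samples;
}

const oneSecond = sineSamples(440, 1);
console.log(oneSecond.length); // 44100
```

Passing 1 instead of 440 gives the super low one-cycle-per-second wave described above.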

Frequency and Amplitude

So let’s talk about what we did when we multiplied by 440. The effect we had was to take that super low tone (the one that takes a whole second to go a complete cycle) and make it go 440 cycles in one second.

How often something cycles is known as frequency. And that’s the thing about frequency in sound. If the frequency is too low, that is, it doesn’t cycle fast enough, our brain doesn’t put together that it’s a repeating pattern, and it can’t latch on to the fact that it’s an audible tone. If too high, our ears can’t actually discern the signals either.

440 cycles is pretty much smack in the middle of what we are comfortable hearing. There’s a LITTLE more to this, and I’ll tell you more in part 4.

Anyway, all those cycles going so fast come together sounding like one tone. The more cycles per second, the higher the tone.

When you’ve heard about frequency, you’ve also heard about amplitude. In terms of audio, amplitude is just volume. It’s pretty easy. If you want a volume that’s 1/4 as loud, just multiply that “number” variable by .25 (OK I’m lying, volume is actually logarithmic, but I don’t feel like getting into that whole thing right now).
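Here is a small JavaScript sketch of that multiplication, plus the standard decibel formula for amplitude ratios (20 times the base-10 log of the gain) to hint at the logarithmic part. The function names are mine, purely for illustration.

```javascript
// Scale every sample by a gain factor (0.25 = a quarter of the amplitude).
function applyGain(samples, gain) {
  return samples.map(sample => sample * gain);
}

// Standard conversion from an amplitude ratio to decibels.
function gainToDecibels(gain) {
  return 20 * Math.log10(gain);
}

console.log(applyGain([1, -1, 0.5], 0.25));   // [ 0.25, -0.25, 0.125 ]
console.log(gainToDecibels(0.25).toFixed(2)); // "-12.04"
```

So a quarter of the amplitude is about 12 dB quieter, which is why it will not sound like "a quarter as loud" to your ears.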

Here’s a nice little demo to allow you to play around with frequency and amplitude on a sine wave.

Different Types of Cycles

Like we discussed, sine waves are curvy… When we generate the tone, it sounds nice (if not a little whiny). There are other ways to go. I alluded to this before (with the line that we keep resetting in a cycle).
You can make things sound edgy too! Like you could make a square wave. I’m going to start giving examples now, and copy over some of what I have in the Flashamaphone project.

First of all though, let’s simplify things and pop a bunch of math into a phase variable, like so:

var phase = c * 2*Math.PI/44100 * 440;

There! Now we don’t have to write all that stuff out each time. So let’s revisit how to do a sine wave:

// loop
number = Math.sin(phase);
// end loop

Goes up and down repeatedly (forever) and is curvy

Next, let’s try a square wave:

// loop
number = Math.floor(Math.sin(phase));
// end loop

It sounds very 8-bit and harsh.

At this point, I can start making up my own terminology – and I came up with a stepped wave. It’s half-way between harsh and smooth, between a sine wave and a square wave.

// loop
number = Math.floor(Math.sin(phase)*4)/8 - Math.floor(Math.cos(phase)*4)/8;
// end loop

I have more!

A Step wave?

// loop
number = Math.floor(Math.sin(phase)) - Math.floor(Math.cos(phase));
// end loop

Shark fin?

// loop
number = Math.cos(phase) - Math.floor(Math.sin(phase));
// end loop

Saw tooth?

// loop
number = phase - Math.floor(phase);
// end loop

a line that cycles from going up and up, but resetting at 0 again

Saw Sine?

// loop
number = Math.sin(phase) - Math.floor(Math.sin(phase));
// end loop
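If you'd rather look at numbers than listen, the formulas above can be collected into one JavaScript object and scanned over a single sine cycle to see how high and low each one goes. This is only a sketch for experimenting; the `waves` object and `range` helper are my own scaffolding, not Flashamaphone code.

```javascript
// The waveform formulas from this post, each taking the phase in radians.
const waves = {
  sine:     p => Math.sin(p),
  square:   p => Math.floor(Math.sin(p)),
  stepped:  p => Math.floor(Math.sin(p) * 4) / 8 - Math.floor(Math.cos(p) * 4) / 8,
  step:     p => Math.floor(Math.sin(p)) - Math.floor(Math.cos(p)),
  sharkFin: p => Math.cos(p) - Math.floor(Math.sin(p)),
  sawTooth: p => p - Math.floor(p),
  sawSine:  p => Math.sin(p) - Math.floor(Math.sin(p))
};

// Scan one full sine cycle (0 to 2*PI) and report the lowest and highest value.
function range(fn, steps = 1000) {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < steps; i++) {
    const value = fn(i * 2 * Math.PI / steps);
    min = Math.min(min, value);
    max = Math.max(max, value);
  }
  return { min, max };
}

for (const [name, fn] of Object.entries(waves)) {
  const { min, max } = range(fn);
  console.log(name, min.toFixed(2), max.toFixed(2));
}
```

Two things this makes visible: several of these waves are not centered around zero, and the saw tooth formula resets every time the phase grows by 1 rather than by 2 Pi, so with a phase in radians it ramps roughly 6 times per sine cycle and sounds correspondingly higher.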

To listen to these examples, check out this demo!

That’s all I have in the Flashamaphone project.  We’ll see soon in part 5 of this series that there is much more to producing different types of sound than just the math to generate the cycle.  First though, jump on over to part 4, and we’ll talk music theory!

Live Instrumentation in Flash Part 2 – Enter Flashamaphone

Filed in development | flash | flash/flex | flex | music | web Leave a comment

As I said in part 1 of this Live Instrumentation series, working with byte arrays can be hard, and take some getting used to.  I still stand by the statement that you shouldn’t be afraid though!  One of the best ways to get cracking is to get your hands on an existing project and start changing stuff.  See what works and what doesn’t work.

I don’t know of too many projects that deal with audio generation.  Probably the best one out there is called Tonfall (http://code.google.com/p/tonfall/).  Tonfall brands itself as a Tiny AS3 Audio Framework.  It has lots of cool stuff to get you going playing with audio.  What I didn’t like about it for my purposes was the learning curve, especially if you don’t know much about audio already.  They have some cool demos – but when I was getting started in this stuff, the first thing that popped into my mind was a piano keyboard.  With Tonfall, I didn’t see a way to easily play a specific note like an A sharp.  Instead, I’d have to know the frequency of the note.

So that’s where I went when I started up my own project called Flashamaphone!  I wanted a way to easily make notes and chords and play them live.  I wanted a way to get started if you only knew a tiny bit of Flex/AS3 and a tiny bit of music knowledge (only knowing your way around piano keys a bit).

Let’s rehash a little bit of what we already covered as it relates to Flashamaphone.  First off, I’ve included the concept of a buffer in Flashamaphone.  Currently there are two types of audio buffers in the project.

Recorded Buffer

This is the simplest of the buffers.  It only does a few things.  You can add bytes to the buffer with “addToBuffer”, you can get the buffer bytes with “get buffer”, clear the buffer with “clearBuffer”, or activate or pull from a controller.  We’ll get into the controller in a bit….

So what you end up with here, is basically an audio file in memory that you can write to and read from.  Pretty simple!

Live Buffer

The live buffer is a little bit more complicated, but uses the same concepts.  You can add bytes to the buffer, but that’s not really the point here.  The real point is to pull from a controller (wait one more second for that explanation!).  The real mechanical stuff going on here is a sound object with a sample data event.  Every time the audio buffer reaches out and needs more data, we’ll pull a certain number of samples from our controller.  If a controller doesn’t exist, we’ll pull it from a queue of audio bytes.

Pulling data when we need it (and only when we need it), means that we are creating a sort of live audio playback scenario.
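To show the shape of that "pull from the controller, otherwise pull from the queue" decision, here is a rough JavaScript sketch. It is not the Flashamaphone source: the class name, the `onSampleData` signature, and the choice to pad with silence when the queue runs dry are all assumptions made for the example.

```javascript
// Hypothetical stand-in for a live buffer; names and behavior are assumptions.
class LiveBufferSketch {
  constructor() {
    this.controller = null; // anything with a pull(sampleCount) method
    this.queue = [];        // queued samples used when there is no controller
  }

  addToBuffer(samples) {
    this.queue.push(...samples);
  }

  // Called whenever the audio system asks for more samples.
  onSampleData(sampleCount) {
    if (this.controller) {
      return this.controller.pull(sampleCount);
    }
    const chunk = this.queue.splice(0, sampleCount);
    // Assumption: fill the remainder with silence if the queue is short.
    while (chunk.length < sampleCount) chunk.push(0);
    return chunk;
  }
}

const liveBuffer = new LiveBufferSketch();
liveBuffer.addToBuffer([0.1, 0.2, 0.3]);
console.log(liveBuffer.onSampleData(2)); // [ 0.1, 0.2 ]
console.log(liveBuffer.onSampleData(2)); // [ 0.3, 0 ]
```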

Keyboard Controller

So far, there’s just one type of controller in the Flashamaphone project.  It’s the keyboard controller.  The keyboard controller allows you to “press” a key and “release”.  With each press and release, you pass in the tone or note that you’re pressing.  In the end, all this controller does is keep track of what keys are currently being pressed, and what keys are in a “released” mode as the sound fades away after it’s been released.

Now, when we attach a controller to the buffer (whether live or recorded), the buffer can pull from the controller.  The act of pulling from the controller means that the buffer will ask for the bytes of a specific number of samples from the controller.  If the “A”, “D”, and “D#” keys are pressed on the controller, the controller will send back a mix of those bytes.  The buffer doesn’t care what keys are pressed, it just wants whatever the bytes are.
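As a rough picture of what "a mix of those bytes" could mean, here is a JavaScript sketch of a controller that tracks pressed keys and averages a sine wave per key. Everything here is an assumption for illustration: the class, the method signatures, averaging as the mixing strategy, and the approximate frequencies. It also skips the "released" fade entirely.

```javascript
// Hypothetical keyboard controller; not the Flashamaphone implementation.
class KeyboardControllerSketch {
  constructor(sampleRate = 44100) {
    this.sampleRate = sampleRate;
    this.pressed = new Map(); // key name -> frequency in Hz
    this.position = 0;        // running sample index so pulls stay continuous
  }

  pressKey(name, frequency) {
    this.pressed.set(name, frequency);
  }

  releaseKey(name) {
    this.pressed.delete(name); // a real controller would fade the note out
  }

  // Return sampleCount samples mixing every pressed key.
  pull(sampleCount) {
    const out = new Float32Array(sampleCount);
    for (let i = 0; i < sampleCount; i++) {
      const t = (this.position + i) / this.sampleRate;
      let sum = 0;
      for (const frequency of this.pressed.values()) {
        sum += Math.sin(2 * Math.PI * frequency * t);
      }
      // Assumption: average the keys so the mix stays between -1 and 1.
      out[i] = this.pressed.size > 0 ? sum / this.pressed.size : 0;
    }
    this.position += sampleCount;
    return out;
  }
}

const keys = new KeyboardControllerSketch();
keys.pressKey("A", 440);
keys.pressKey("D", 293.66);  // approximate frequencies
keys.pressKey("D#", 311.13);
const mixedSamples = keys.pull(2048);
console.log(mixedSamples.length); // 2048
```

The buffer on the other end never sees the key names, only the array that comes back from `pull`.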

You’ve been Caught Up

Right, so you’ve been caught up on the “engine” behind Flashamaphone.  That’s all it really is behind the scenes.  A couple of buffers and a controller.  It’s not very fun to just push bytes back and forth, but it needed to be done to get to the real cool stuff.  Next we’ll be making some actual recognizable tones!

First, let me hit you with a Flex code sample of our white noise using the keycontroller and live buffer using Flashamaphone:

 
<s:Application xmlns:fx="http://ns.adobe.com/mxml/2009"
			   xmlns:s="library://ns.adobe.com/flex/spark"
			   xmlns:mx="library://ns.adobe.com/flex/mx"
			   xmlns:buffer="org.flashamaphone.buffer.*"
			   xmlns:controller="org.flashamaphone.controller.*"
			   xmlns:tones="org.flashamaphone.tones.*"
			   xmlns:debug="org.flashamaphone.debug.*"
			   addedToStage="init(event)" viewSourceURL="srcview/index.html">
 
	<fx:Script>
		<![CDATA[
			import org.flashamaphone.debug.Debugger;
			import org.flashamaphone.tones.ITone;
			import org.flashamaphone.tones.Tone;
			import org.flashamaphone.voices.SimpleWaveform;
			import org.flashamaphone.voices.waveformFactories.SineWaveFactory;
			import org.flashamaphone.voices.waveformFactories.WhiteNoiseFactory;
 
			protected function start(event:MouseEvent):void {
				keyController.pressKey(tone);
			}
			protected function stop(event:MouseEvent):void {
				keyController.releaseKey(tone);
			}
			protected function init(event:Event):void {
				Debugger.graphicalOutput = debugViz;
				Debugger.debug(tone.voice.sustain(440, 2000));
			}
 
		]]>
	</fx:Script>
 
	<fx:Declarations>
		<!-- Place non-visual elements (e.g., services, value objects) here -->
		<buffer:LiveBuffer id="buffer" />
		<controller:KeyboardControllerBoard id="keyController" soundBuffer="{buffer}" />
		<tones:Tone 
			voice="{new SimpleWaveform( new WhiteNoiseFactory() )}" 
			debug="true"
			id = "tone" />
	</fx:Declarations>
 
	<s:Button mouseDown="start(event)" mouseUp="stop(event)" label="White Noise" />
 
	<s:Spacer height="100" />
	<s:SpriteVisualElement id="debugViz" width="100%" height="100" />
 
	<s:layout>
		<s:VerticalLayout />
	</s:layout>
</s:Application>

The code above simply creates both a live buffer and a keyboard controller. It assigns the live buffer to the keyboard controller. Now, all we need to figure out is how to send some actual notes to our keyboard controller, and not just white noise!

Live Instrumentation in Flash Part 1 – Basics of Dynamic Sound

Filed in development | flash | flash/flex | flex | music Leave a comment

Let’s start with the basics.  We need to create sound from thin air and write that sound somewhere so we can hear it.  This could be live as you press a button, or into a file so you can play it later.

 

Don’t Be Afraid of the Byte Array!

In Flash, or most platforms in fact, there are a lot of different data types.  You have integers, floats, strings, etc.  I can make a sentence by putting strings together, and I can make an array by putting various types of data together.  A byte array is like an array, but it’s a long concatenation of binary data.

Unfortunately byteArrays are hard to debug! You can’t really just trace stuff out to your console like with a string, since it’s not in a readable format anymore.

How do you make binary data?  Well it’s easy with the Flash APIs.  You can write a string into a byte array – or more appropriate to us for sound: copy and paste bytes into a byte array or write a floating point number into a byte array.

When specifically talking about creating sound, we’ll mostly be dealing with writing floats into a byte array many times over in a loop.

For example, if you wanted to create a 4 second sound, there is some math to figure out.  Let’s start with a loop:

var bytes:ByteArray = new ByteArray();
for (var c:int = 0; c < len; c++) {
    bytes.writeFloat(number);
}

So what is “len”? How many times should we iterate and add floats into our byte array? Well typically, we’re dealing with 44,100 samples per second. That means for every second of audio we’re using 44,100 data points if you picture our sound like a line graph.

Our code snippet becomes:

var bytes:ByteArray = new ByteArray();
for (var c:int = 0; c < 4 * 44100; c++) {
    bytes.writeFloat(number);
}

One more thing, though. The code above assumes mono, or one channel sound. If we want stereo sound, we need to write twice as many floats: one for the left channel and one for the right. Unless you’re trying to accomplish different effects on the left and right side, it’s usually fair to say that the float you write for each channel in a loop iteration will be the same number. This is why you’ll usually see the writeFloat line repeated instead of the iteration count going up to 88,200 per second:

var bytes:ByteArray = new ByteArray();
for (var c:int = 0; c < 4 * 44100; c++) {
    bytes.writeFloat(number);
    bytes.writeFloat(number);
}
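For anyone following along without Flash, the same left/right layout can be sketched in JavaScript with a `Float32Array`, which also stores 4-byte floats. The `interleavedStereo` helper and its `sampleAt` callback are names I made up for this example.

```javascript
const SAMPLE_RATE = 44100;

// Build interleaved stereo audio: left sample, right sample, left, right...
function interleavedStereo(seconds, sampleAt) {
  const frames = seconds * SAMPLE_RATE;      // one frame = one left + one right
  const data = new Float32Array(frames * 2);
  for (let c = 0; c < frames; c++) {
    const number = sampleAt(c);
    data[2 * c] = number;     // left channel
    data[2 * c + 1] = number; // right channel gets the same value
  }
  return data;
}

const fourSeconds = interleavedStereo(4, () => 0); // 4 seconds of silence
console.log(fourSeconds.length);     // 352800 floats
console.log(fourSeconds.byteLength); // 1411200 bytes
```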

What “number” are we writing? Well….that’s where the magic happens, and is a bit more complicated. Don’t worry, we’ll get to that later.

Writing File Output

Having a byte array in memory is all well and good, but how can we use this? In terms of audio, we can dump it to a file, or play it live.

Dumping to a file is pretty easy if you find a good encoder. MP3 is compact, but processor intensive. It’s also kinda hard to figure out if you’re starting from scratch. Long story short, you’re writing chunks of data (frames) and giving each frame a header describing the contents of each data frame. You’re also compacting the data somehow. It’s a little beyond me, which is why if I ever need to do it, I’d grab a fast, ready made library like Shine (http://code.google.com/p/flash-kikko/). Shine is made in Alchemy – which means someone wrote it in C++ and compiled the library to a Flash SWC file. Doing it this way can result in much faster number crunching!

But, I like to be a little more basic – and write to a WAV file. WAV files are a format that contains raw audio data. For the most part we can take our generated audio data, and write it as is! Just dump all those bytes into a file!

Unfortunately, even writing audio to a WAV format takes some know how. You need to construct a file header which includes the number of channels, sampling rate, and a whole bunch of other things. Once all that stuff gets written to the file, you can just dump your actual audio data.
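To give a feel for what that header holds, here is a sketch of the common 44-byte PCM WAV header written in JavaScript with a `DataView`. This is not how Tonfall does it; it is the canonical RIFF/WAVE layout for 16-bit PCM audio, and the `wavHeader` function name and its defaults are my own.

```javascript
// Build a 44-byte header for uncompressed PCM audio.
// sampleCount is the number of sample frames (one frame = all channels).
function wavHeader(sampleCount, channels = 2, sampleRate = 44100, bitsPerSample = 16) {
  const blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
  const dataSize = sampleCount * blockAlign;       // bytes of audio that follow
  const view = new DataView(new ArrayBuffer(44));
  const writeText = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeText(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);           // file size minus 8 bytes
  writeText(8, "WAVE");
  writeText(12, "fmt ");
  view.setUint32(16, 16, true);                     // size of the fmt chunk
  view.setUint16(20, 1, true);                      // format 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // bytes per second
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeText(36, "data");
  view.setUint32(40, dataSize, true);
  return new Uint8Array(view.buffer);
}

const header = wavHeader(4 * 44100);
console.log(header.length); // 44
```

All multi-byte numbers are little-endian, which is what the `true` argument asks for. The raw audio bytes go directly after these 44 bytes.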

I’m not a WAV file format scholar, nor do I really even understand what’s going on in the header. This is why I’m using the WAV audio encoder from the Tonfall project by Andre Michelle (http://code.google.com/p/tonfall/).

Here’s an example of writing to a file on my desktop with AIR, but passing it through Tonfall’s WAV encoder first.

var we:WAVEncoder = new WAVEncoder(WAV16BitStereo44Khz.INSTANCE);
we.write32BitStereo44KHz(myaudiobytes.bytes, myaudiobytes.bytes.length/8);
var file:File = File.desktopDirectory.resolvePath("output/song.wav");
var stream:FileStream = new FileStream();
stream.open(file, FileMode.WRITE);
stream.writeBytes(we.bytes);
stream.close();

Did you notice the line where we’re passing myaudiobytes.bytes.length/8? Well this parameter is how many “samples” we’re passing. It leads to an important fun fact. A floating point number is 4 bytes. If we’re talking a mono file, a sample would be one floating point number or the 4 bytes.

Since we’re talking about a STEREO file, it’s 4 times 2, or 8 bytes. So to figure out how many stereo samples are in a byte array, we’d just take the length of the array and divide by 8.

What if we wanted to know how many SECONDS of audio are in our byte array? Well, first we’d find the number of samples, but then divide THAT by the sample rate (44,100 samples per second).
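The arithmetic from the last few paragraphs fits in a couple of small functions. Here it is as a JavaScript sketch, with function names of my own choosing:

```javascript
const BYTES_PER_FLOAT = 4;

// A stereo sample is two floats (left + right), so 8 bytes.
function stereoSampleCount(byteLength) {
  return byteLength / (BYTES_PER_FLOAT * 2);
}

// Divide the sample count by the sample rate to get seconds.
function durationInSeconds(byteLength, sampleRate = 44100) {
  return stereoSampleCount(byteLength) / sampleRate;
}

console.log(stereoSampleCount(1411200)); // 176400
console.log(durationInSeconds(1411200)); // 4
```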

Live Playback with an Audio Buffer

I’m glad we talked about dumping stuff into a file first, because basically using an audio buffer is just like that, but you do it in smaller chunks.

To get the ball rolling, what you’d do is start up a blank sound in Flash. Usually you give it a source to play, but not here! Here, just a blank sound. Next, add an event listener to the sound “SAMPLE_DATA”. Whenever the audio buffer runs out of stuff to play, it will fire off the sample data event. Your code will pick up the event and run a function. Ideally, in this function, you’d be putting stuff into the audio buffer.

After adding the callback, play the sound to get things started:

sound = new Sound();
sound.addEventListener(SampleDataEvent.SAMPLE_DATA, onSampleData, false, 0, true);
sound.play();

What does our callback method look like? Well, for the most part, it’s what we discussed when popping stuff into a file. Just in smaller chunks.

function onSampleData(event:SampleDataEvent):void {
    var bytes:ByteArray = new ByteArray();
    for (var c:int = 0; c < 8192; c++) {
        bytes.writeFloat(number);
        bytes.writeFloat(number);
    }
    event.data.writeBytes(bytes);
}

OK, so there’s not much difference here! We’re iterating over the same type of loop – writing stereo audio. We’re writing 8192 samples though. This IS a magic number. It’s the upper limit of what Flash allows you to write in the buffer. The lower limit is 2048. Now, why use one or the other (or something in between)?

So, if you use the lower limit of 2048, audio will be sampled more frequently. It will tax your processor more often – and it might not be able to keep up properly. This means that when you listen to the audio being generated there may be some clicks or pops in your headphones as audio drops out and comes back in.

If you use the maximum limit of 8192, your processor is hit less often, and when it is hit, it can almost always handle the sample data generation. So that’s good! The problem is that all 8192 samples represent close to a fifth of a second of audio (8192 / 44,100 is about 0.19 seconds). This means that your audio could be lagging by roughly that much. I don’t mean that audio won’t play back smoothly. What I DO mean is that if you press a button, it could take nearly 1/5 of a second to hear the result of that button press. This is a little slow to make something like a virtual piano that feels reactive!
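The tradeoff is easy to put numbers on. This JavaScript sketch (the `bufferLatencyMs` name is mine) converts a buffer size into the amount of audio it holds, which is roughly the delay before a button press can be heard:

```javascript
// How many milliseconds of audio does a buffer of this many samples hold?
function bufferLatencyMs(samples, sampleRate = 44100) {
  return samples / sampleRate * 1000;
}

console.log(bufferLatencyMs(2048).toFixed(1)); // "46.4"
console.log(bufferLatencyMs(8192).toFixed(1)); // "185.8"
```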

At the very end though, you’re writing to the event.data byte array. In fact, you don’t even have to generate your own sound. You could just be reading slowly from a byte array of a raw audio file you already read in. In fact, you could run some mathematical filter on it, and alter the values. Folks have done this before by loading an MP3 file into the sound object, using sound.extract to load it into a byte array, and THEN dropping every other sample as it gets sent to the buffer to do something like increasing the speed by 2x.
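The "drop every other sample" trick is simple enough to sketch. Here is one way it could look in JavaScript on interleaved stereo data, keeping left/right pairs together. The `doubleSpeed` function is just an illustration, not code from any of the projects mentioned.

```javascript
// Keep every other stereo frame (left + right pair) and drop the rest.
function doubleSpeed(interleaved) {
  const frames = Math.floor(interleaved.length / 2);
  const out = new Float32Array(Math.ceil(frames / 2) * 2);
  for (let f = 0, o = 0; f < frames; f += 2, o += 2) {
    out[o] = interleaved[2 * f];         // left
    out[o + 1] = interleaved[2 * f + 1]; // right
  }
  return out;
}

const original = new Float32Array([1, 1, 2, 2, 3, 3, 4, 4]);
console.log(Array.from(doubleSpeed(original))); // [ 1, 1, 3, 3 ]
```

Since the same wave now fits into half the time, the pitch goes up an octave along with the speed.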

White Noise

I told you we’d get into what the number value is when we write the float into the byte array. That time is not yet – it’ll come in part 3. However, let’s just put some random values in there!

function onSampleData(event:SampleDataEvent):void {
    var bytes:ByteArray = new ByteArray();
    for (var c:int = 0; c < 8192; c++) {
        var number:Number = Math.random();
        bytes.writeFloat(number);
        bytes.writeFloat(number);
    }
    event.data.writeBytes(bytes);
}

By adding random values, we’ve just added random samples….at no real discernible frequency. This means there’s tons of noise, but no way to hear one strong tone from the noise. It results in white noise – like TV snow, or waves at the beach.
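One optional tweak, sketched here in JavaScript: Math.random() only returns values from 0 up to (but not including) 1, so the noise above sits entirely on the positive side. Stretching it to run from -1 to 1 centers it around zero the way the later waveforms are. The `whiteNoise` function is my own illustration, not Flashamaphone code.

```javascript
// Fill an array with random samples centered around zero.
function whiteNoise(sampleCount) {
  const out = new Float32Array(sampleCount);
  for (let c = 0; c < sampleCount; c++) {
    out[c] = Math.random() * 2 - 1; // stretch 0..1 to -1..1
  }
  return out;
}

const noise = whiteNoise(8192);
console.log(noise.length); // 8192
```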

Let’s take a peek at what this looks like when we visualize it:

And let’s listen to what it sounds like (Flash Example)

 
