scope
by Elie De Brauwer

Goal of this file
In this article we are going to visualise the information stored in a wav audio file. Along the way we'll use the scissor test and print text in the window.

History of this file
How this article is organized
The project consists of three files: Wav.h, which handles everything that has to do with wav file access; Scope.h, which draws the information it retrieves from the wav file through the Wav class; and sound.cpp, which opens the window, defines the callbacks and initialises the Scope class.

Part 1: Using/Understanding the wav file

Material to use
One of my friends and also a classmate, Bart Crets, was kind enough to create a sample wav file which may be freely used and distributed. This file contains a sine with a frequency of 440 Hz. It is ten seconds long and is sampled at 44100 Hz, using 16 bits per sample on two channels (stereo). Hexdumps and references in this article are always made to this file.

What we will try to create
If you are a bit interested in science and want to know why and how it works, read the following section. If you don't care a bit and just want to go to the example, skip the following section. The choice is yours.

Scientific approach
In this article we will create a visualisation of sound samples. We start from a .WAV file, which is created by using so called PCM (Pulse Code Modulation). This is needed because sound is a wave (analog) and a computer works with discrete (digital) data. So the sound is stored using ADC (analog to digital conversion), which is done by taking a sample at certain discrete intervals. In a 44100 Hz file like the one we are using here, a sample is taken 44100 times a second and this value is stored in the WAV file.

Now there are various possible visualisations we could create. We could create a spectrum analyser, which gives you an idea of which frequencies the sound consists of, but this technique requires that you split the sound into elementary waves whose frequencies are easily known. This requires too much mathematics for this (easy) article, so here I have chosen to draw the oscillogram. This is the visual reconstruction of the actual wave that existed before the PCM occurred. So we start from the samples and try to create a wave from them, which can be considered as DAC (digital to analog conversion), but we aren't reconstructing a signal, we are merely visualising it.
Below you can see two examples of an xmms screenshot, the first showing the spectrum analyser on a song (a sum of multiple frequency waves) and the second showing the oscillogram of another part of the same song.
Below you can see the spectrogram and the oscillogram of the WAV file we are examining. Our WAV file is a 10 second long sine wave with a frequency of 440 Hz. You can see in the spectrogram that it isn't a combined wave.
Other examples of a spectrogram and an oscillogram can be seen at: http://www.clarku.edu/research/access/psychology/thompson/sound.html. If you don't believe me or don't trust me, go ask Google.

The 'sound is a wave ?' approach
Basically we read the header, draw a horizontal line in the center of the screen, read a sample, take its value as a Y value and put it on the screen, then read the next sample, increment our X position, draw the Y value, and so on.

The .wav file format
A .wav file contains raw pieces of data that are taken a certain number of times a second. In the sample case 44100 samples a second were taken. (For more information search Google for analog to digital conversion and for the Shannon and Nyquist sampling theorem.) For more information about the wav file format specification you might want to look at the following two sites:
Below you can find a small summary of which fields are in the header file. This data is used in the Wav class.
This is a sample header of a 16 bit stereo file, sampled at 44100 Hz. The file contains 10 seconds of data.

helios@qntal:~/vuilbak$ hexdump -n 48 -C test.wav
00000000  52 49 46 46 c4 ea 1a 00  57 41 56 45 66 6d 74 20  |RIFF....WAVEfmt |
00000010  10 00 00 00 01 00 02 00  44 ac 00 00 10 b1 02 00  |........D.......|
00000020  04 00 10 00 64 61 74 61  a0 ea 1a 00 00 00 00 00  |....data........|
00000030

The Wav class
Below you can find the content of the file Wav.h. The class in this file (the Wav class) has two main public methods: string Wav::toString(), which returns the information about the wav file as a string, and int Wav::readSample(int,bool), which reads a sample from the file and returns its value. Those not interested in the details can skip the code below and use the class as a black box. For the others, enjoy your read.
#ifndef __WAV_H
#define __WAV_H
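#include <cstdio>   // FILE, fopen, fread, fseek, rewind, fclose
#include <cstdlib>
#include <cstring>  // strlen, strcpy, strcmp
#include <iostream> // cout
#include <sstream>
#include <string>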
using namespace std;
typedef unsigned char uchar;
typedef unsigned int uint;
class Wav{
private:
char * filename;
FILE * wavfile;
bool valid; // True if it is a wav
bool mono; // True == mono, False == stereo
int s_rate; // Sample rate in Hz, e.g. 44100 or 22050
int byte_sec; // number of bytes for a second
int byte_samp; // 1 = 8 bit mono, 2 = 8 bit stereo or 16 bit mono, 4 = 16 bit stereo
int bit_samp; // number of bits in a sample
int length; // length = amount of data after the header in bytes
double time; // length/byte_sec
Wav();// not available
public:
Wav(char *);
~Wav();
// String getInfo();
string toString();
int readSample(int,bool);
bool isValid();
int getSampleRate();
};
Above you can see the class prototype. All the useful information is stored there. The default constructor is hidden and the only valid constructor is Wav(char *), which expects a filename.
///////////////////////////////////////////////////
// Wav::Wav
Wav::Wav(char *c){
valid = false;
filename = new char[strlen(c)+1];
strcpy(filename,c);
// open filepointer readonly, in binary mode so no bytes get translated
wavfile = fopen(filename,"rb");
if(wavfile==NULL){
cout << "Could not open " << filename << endl;
}else{
// declare a uchar buff to store some values in
uchar *buff = new uchar[5];
buff[4]='\0';
// read the first 4 bytes
fread((void *)buff,1,4,wavfile);
// the first four bytes should be 'RIFF'
if(strcmp((char *)buff,"RIFF")==0){
// read byte 8,9,10 and 11
fseek(wavfile,4,SEEK_CUR);
fread((void *)buff,1,4,wavfile);
// this should read "WAVE"
if(strcmp((char *)buff,"WAVE")==0){
// read byte 12,13,14,15
fread((void *)buff,1,4,wavfile);
// this should read "fmt " (lowercase, with a trailing space)
if(strcmp((char *)buff,"fmt ")==0){
fseek(wavfile,20,SEEK_CUR);
// final one read byte 36,37,38,39
fread((void *)buff,1,4,wavfile);
if(strcmp((char *)buff,"data")==0){
valid=true;
// Now we know it is a wav file, rewind the stream
rewind(wavfile);
// now is it mono or stereo ?
fseek(wavfile,22,SEEK_CUR);
fread((void *)buff,1,2,wavfile);
if(buff[0]==0x02){
mono=false;
}else{
mono=true;
}
// read the sample rate
fread((void *)&s_rate,1,4,wavfile);
fread((void *)&byte_sec,1,4,wavfile);
byte_samp=0;
fread((void *)&byte_samp,1,2,wavfile);
bit_samp=0;
fread((void *)&bit_samp,1,2,wavfile);
fseek(wavfile,4,SEEK_CUR);
fread((void *)&length,1,4,wavfile);
}
}
}
}
delete [] buff;
}
}
Above you can see how the header is being read and being stored in the private data members.
//////////////////////////
// Wav::toString()
string Wav::toString(){
// we put all the data in a stringstream using
// only strings would result in int->string
// conversion errors
ostringstream info;
if(wavfile!=NULL){
if(valid){
info << "Opened WAV file: " << filename << endl;
info << "This file is in ";
if(byte_samp==1){
info << "8 bit mono" << endl;
}else if(byte_samp==4){
info << "16 bit stereo" << endl;
}else if(byte_samp==2){
if(mono){
info << "16 bit mono" << endl;
}else{
info << "8 bit stereo" << endl;
}
}
info << "Sample rate " << s_rate << " samples each second" << endl;
info << "There are " << byte_sec << " bytes for a second data" << endl;
info << "There are " << byte_samp << " bytes for sample" << endl;
info << "A sample consists of " << bit_samp << " bits" << endl;
info << "There are " << length << " bytes of audio data in the file " << endl;
info << "The file is " << length/(byte_sec*1.0) << " seconds long" << endl;
info << "The file contains " << length/byte_samp << " samples" << endl;
}else{
info << filename << " is not a WAV file " << endl;
}
}else{
info << "Could not open file" << endl ;
}
return info.str();
}
string Wav::toString() returns a string which contains all the useful information stored in the data members. Not a really vital part but handy when debugging. The result for test.wav is:

Opened WAV file: test.wav
This file is in 16 bit stereo
Sample rate 44100 samples each second
There are 176400 bytes for a second data
There are 4 bytes for sample
A sample consists of 16 bits
There are 1764000 bytes of audio data in the file
The file is 10 seconds long
The file contains 441000 samples

We use a stringstream from the STL here to store the data. The nice part is that stringstream does all the nasty conversions (double to string, integer to string, ...) for us and we don't have to care about it. Don't forget to include sstream.
//////////////////////////////////////////////////////
// Wav::readSample()
int Wav::readSample(int number,bool leftchannel=true){
/*
Reads sample number, returns it as an int, if
this.mono==false we look at the leftchannel bool
to determine which to return.
number is in the range [0,length/byte_samp[
returns 0xefffffff on failure
*/
if(valid && number>=0 && number<length/byte_samp){
// go to beginning of the file
rewind(wavfile);
// we start reading at sample_number * sample_size + header length
int offset = number * byte_samp + 44;
// unless this is a stereo file and the rightchannel is requested.
if(!mono && !leftchannel){
offset += byte_samp/2;
}
// read this many bytes;
int amount = mono ? byte_samp : byte_samp/2;
fseek(wavfile,offset,SEEK_CUR);
short sample = 0;
fread((void *)&sample,1,amount,wavfile);
return sample;
}else{
// return 0xefffffff if failed
return (int)0xefffffff;
}
}
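readSample() reads a sample and returns its integer value, or returns 0xefffffff on failure, which allows us to use a nice while loop to step through the file. The sample is read into a short because a 16 bit value read into a wider int would not be sign-extended: the upper bytes would stay zero and negative samples would come out as large positive numbers.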
//////////////////////////////////////////////////
// Wav::~Wav
Wav::~Wav(){
delete []filename;
if(wavfile!=NULL){
fclose(wavfile);
}
}
///////////////////
// Wav::isValid()
bool Wav::isValid(){
return valid;
}
/////////////////////////
// Wav::getSampleRate()
int Wav::getSampleRate(){
return s_rate;
}
#endif /* Wav.h included */
Now all that is left is the destructor, which closes the file, a boolean isValid() function to know whether a valid file has been opened, and a getSampleRate() function which tells us the sample rate.

Part 2: Wav meets OpenGL
Part will be added soon

Scope.h
Below you can see the code from the Scope.h file. This file contains the Scope class, which handles the filling of the window with the data from the .wav file.
#ifndef __SCOPE_H
#define __SCOPE_H
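#include <cmath>     // fabs
#include <sstream>
#include <string>
#include <GL/glut.h> // OpenGL and GLUT calls used in draw()
#include "Wav.h"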
class Scope{
private:
int height; // dimensions of the window
int width;
int num; // number of sample to draw
double limit; // maximum amplitude
int start; // start reading at this point
string info;
Wav *wavfile;
string toString();
public:
Scope();
Scope(char *);
void draw();
void setDimensions(int,int);
void alterNum(int);
void alterLimit(int);
void alterStart(int);
};
This is the class definition. The scope contains some variables that define what data to display and how to display it. It also contains a Wav object (see above) from which we get our data. The private toString() function builds some text that is drawn on the screen.
/////////////////////////
// Scope::Scope()
Scope::Scope(){
wavfile=NULL;
}
/////////////////////////
// Scope::Scope()
Scope::Scope(char *file){
// set the defaults first so they are defined even if opening fails
height=0;
width=0;
num=1000;
limit=5000.0;
start=0;
wavfile=new Wav(file);
if(!wavfile->isValid()){
delete wavfile;
wavfile=NULL;
}else{
info=wavfile->toString();
}
}
The constructor creates a Wav object, sets the boundaries or sets the Wav file pointer to NULL if somehow the opening of the wav file failed.
////////////////////
// Scope::draw()
void Scope::draw(){
glEnable(GL_SCISSOR_TEST);
// left, the information
glViewport(0,0,300,height);
glScissor(0,0,300,height);
glClear(GL_COLOR_BUFFER_BIT);
int i=0;
float y=0.90;
glColor3f(1,1,0);
// the position on the screen
glRasterPos2f(-1,y);
while(info[i]!='\0'){
if(info[i]=='\n'){
y-=0.08;
glRasterPos2f(-1,y);
}else{
glutBitmapCharacter(GLUT_BITMAP_HELVETICA_12,info[i]);
}
i++;
}
// now show some information about what we see
string info2 = toString();
i=0;
y-=0.16;
glRasterPos2f(-1,y);
while(info2[i]!='\0'){
if(info2[i]=='\n'){
y-=0.08;
glRasterPos2f(-1,y);
}else{
glutBitmapCharacter(GLUT_BITMAP_HELVETICA_12,info2[i]);
}
i++;
}
// lower right
glViewport(300,0,width-300,height);
glScissor(300,0,width-300,height);
glClear(GL_COLOR_BUFFER_BIT);
// only draw it if we have a valid wavfile
if(wavfile!=NULL){
int i=0;
glBegin(GL_LINE_STRIP);
// keep the raw value in an int so the failure value compares exactly
int raw=wavfile->readSample(start+i);
while(i<num && raw!=(int)0xefffffff){
// scale the sample to the [-1,1] range of the viewport
float sample=(float)(raw/limit);
glColor3f(fabs(sample),1.0-fabs(sample),0.0);
glVertex2f(-1+i*2.0/(num*1.0),sample);
i++;
raw=wavfile->readSample(start+i);
}
glEnd();
}
}
The first thing you notice is that we are using the GL_SCISSOR_TEST to
split the screen into a text part and a drawing part (the GL_SCISSOR_TEST was the
main subject of another article which can be found here).
So first we split the window, then we get the info from the Wav object and use the
void glutBitmapCharacter(void *font, int character) function to put each letter
on the screen. We have to manually set the place where the text will be shown,
which we do with the void glRasterPos2f( GLfloat x, GLfloat y )
function. Since the positioning is manual we also have to interpret the newlines ourselves.
The available fonts (these are bitmapped fonts so don't expect miracles) can be obtained
from the glutBitmapCharacter manpage.
//////////////////////////////////////
// Scope::setDimensions(int,int)
void Scope::setDimensions(int x,int y){
width=x;
height=y;
}
///////////////////////
// Scope::alterNum()
void Scope::alterNum(int delta){
if(num+delta>0){
num+=delta;
}
}
////////////////////////
// Scope::alterLimit()
void Scope::alterLimit(int delta){
if(limit+delta>0){
limit+=delta;
}
}
////////////////////////
// Scope::alterStart()
void Scope::alterStart(int delta){
if(start+delta>=0){
start+=delta;
}
}
These functions do nothing more than change some private data members. This way the user can determine what data is being displayed.
////////////////////////
// Scope::toString()
string Scope::toString(void){
ostringstream info;
if(wavfile == NULL){
info << "No valid wav file opened" << endl;
}else{
info << "Drawing " << num << " samples" << endl;
info << "Amplitude goes from " << -limit << " to " << limit << endl;
info << "Starting at sample " << start << endl;
info << "Beginning at " << (start*1.0)/wavfile->getSampleRate() << " seconds" << endl;
info << "Ending at " << ((num+start)*1.0)/wavfile->getSampleRate() << " seconds" << endl;
}
return info.str();
}
#endif /* Scope.h included */
The toString() function uses a stringstream (from the C++ STL) to create a string with information about what the user can see on the screen. The returned string is drawn by the draw() method (see above).

The glue
The file sound.cpp creates the window and a Scope object. It also defines some keybindings.
#include <iostream>
#include <GL/glut.h>
#include "Scope.h"
using namespace std;
typedef unsigned char uchar;
// callbacks
void disp(void);
void keyb(uchar key, int x, int y);
void reshape(int x, int y);
Scope *scope;
////////////////////////////////
// main
int main(int argc, char **argv){
glutInit(&argc, argv);
glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);
glutInitWindowSize(600,600);
glutInitWindowPosition(100,100);
glutCreateWindow("Sound");
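// create a scope object; the constructor takes a non-const char *,
// so the filename is kept in a writable array
static char filename[] = "test.wav";
scope=new Scope(filename);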
glClearColor(0.0,0.0,0.0,0.0);
glutDisplayFunc(disp);
glutKeyboardFunc(keyb);
glutReshapeFunc(reshape);
glutMainLoop();
return 0;
}
The parameter that is passed to the Scope class constructor is the filename of the Wav file to open.
////////////////////////////////////
// disp
void disp(void){
scope->draw();
glutSwapBuffers();
}
The Scope class can draw itself, so we only swap the buffers here.
////////////////////////////////////
// keyb
void keyb(uchar key, int x, int y ){
if(key == 'q'){
exit(0);
}else if(key == 'n'){
scope->alterNum(1);
}else if(key == 'N'){
scope->alterNum(-1);
}else if(key == 'l'){
scope->alterLimit(10);
}else if(key == 'L'){
scope->alterLimit(-10);
}else if(key == 's'){
scope->alterStart(1);
}else if(key == 'S'){
scope->alterStart(-1);
}
glutPostRedisplay();
}
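We define 7 keybindings:
q: quit the program
n / N: draw one sample more / one sample fewer
l / L: raise / lower the amplitude limit by 10
s / S: move the starting sample one forward / one back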
//////////////////////////
// reshape
void reshape(int x,int y){
scope->setDimensions(x,y);
}
When the window is reshaped we store the new dimensions so that the maximum viewable area is used.

The result
An archive containing the source files and the test.wav file can be downloaded by clicking here. The result of our test.wav can be seen in the screenshot below:
And another view of our sine:
Until now we've only been using the (predictable) test.wav so I asked someone (Aergn) to provide me with another random wav file and below you can see the result of that file:
Below you can see the result after zooming into the same wav file: