LeliaThomas.Com


I have just finished The Worst Class Ever (and hope I’ve passed).

Date: June 10, 2008

University continues to be about my getting that little piece of paper that pleases the questionably-intelligent masses. While I have an interesting class here and there, there are a lot that don’t teach me much that I don’t already know (or can’t easily find out) and more than a couple that remind me how poor mass education can be. Because degrees aren’t tailored for individual educational needs or desires, I ended up in a required course called Multimedia Technology. This has really been the biggest pole up my other end all this semester.

I knew that something was wrong when counselors started trying to tell me this was a requirement because it “covered the basics” I would “need to know to talk to sound and visual technicians in the industry.” Nothing that “covers basics” takes twelve weeks to talk about. I can break down the Internet to someone who doesn’t understand it at all in half a day. They probably won’t totally understand everything in the end, of course, and I’ll probably use some bad examples along the way, but they’ll have some basics under their belt. So, as one might be, I was a bit skeptical about this course.

Turns out, I had every right to be.

Multimedia Technology is a twelve-week course that has two streams in it, sound technology and visual technology. Both have lectures, there is a tutorial, and toward the middle to end of the course, there are biweekly labs. This course has been a mixture of mathematics, physics and engineering. So, yes, one can see how that’s going to help me so much with web design and artwork. You can also see how advanced mathematics, physics and engineering would be the basics needed to understand what another person is saying about equipment or a technology used.

I gave up on the lectures early on, because, as much as I like the lecturer, who is very nice, I found them to be useless. The lecturer is so smart when it comes to the material that he doesn’t know how to simplify it enough for those of us who are randomly grasping to learn what we can. The lectures consisted of him reading, word for word, the lecture notes that were already available to us online. When he would ask, “Do you have any questions?” you most certainly would, but the material was so complex that you didn’t know what you should ask or even how to ask. There have been few classes in my life that have struck me so dumb that I didn’t even know what questions I should ask.

The tutorials, which are much smaller, intimate classes done for each course in Australia, weren’t much better. My tutor is usually the one in the labs himself, doing research, and so asking a question of him resulted in massive answers that went entirely over my head. Asking around, it seemed to go over most other people’s heads, too.

Then there were the labs. They were a little more practical, and I could understand a very few things from them, but overall, I just followed instructions. That isn’t hard. That’s typical education. It didn’t teach me much, though.

And so it was that last night I found myself trying to go through twenty-four lectures that averaged about 15 pages each and nine tutorial sheets that averaged around four pages each, in preparation for a 9am exam. A lot of the material for this course is either vague, unnecessarily descriptive to the point of making simple things complex, or bad about assuming what is and isn’t general knowledge (i.e., using terms and acronyms that only sound and visual engineers would know off the top of their heads).

If anyone thinks I’m exaggerating, I can prove it quite easily with a question regarding JPEG (image) compression. Now, JPEG compression is actually pretty easy to understand. I’ve understood it for a long while, and some of my other classes have discussed it without my getting confused in the least. If this course was just about learning basics so you could communicate with industry technicians, JPEG compression would be really simple to explain and to understand. Here is my definition of JPEG compression:

JPEGs are a type of image that is “lossy,” meaning some information is “lost” to compress the data needed to produce the image; the lost data cannot be restored unless you still have a previous, lossless (or whole/raw data) version. JPEGs are compressed by changing certain pixels (small squares) of data in favor of simpler data. I would probably then direct someone to a good visual representation of this type of compression, like the photos on this page, as a visual helps one learn JPEG compression much better than any explanation. That would be all you would need to know to talk to someone about it at a basic level.

Meanwhile, here is the answer to my tutorial’s fairly basic question of “explain the steps involved in JPEG compression.”

A. Transform the image from RGB into luminance/chrominance colour space (Y Cb Cr). The luminance component is greyscale, derived by combining R, G and B values. The reason for doing this is that if we separate RGB values into a brightness component and a colour component, we can take advantage of our lower sensitivity to colour detail and compress the colour information more heavily – see next step.
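As a rough illustration of the colour-space transform described above, here is a minimal Python sketch for a single pixel. The weights are an assumption on my part: they are the ones commonly used in JFIF-style JPEG files, and the tutorial sheet itself does not give any numbers. The function name `rgb_to_ycbcr` is made up for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr) as floats.

    Assumes the JFIF-style weights; other standards use slightly
    different numbers.
    """
    # Luminance: a weighted mix of R, G and B (green counts the most).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chrominance: blue and red differences, centred on 128.
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# A mid-grey pixel has no colour, so both chroma values sit at about 128.
print(rgb_to_ycbcr(128, 128, 128))
```

A grey pixel carrying no colour information is the easiest sanity check: all of its content ends up in the Y value.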

B. Downsample the chrominance (colour) information by averaging together groups of pixels. The luminance component is left at full resolution, while the chroma components are often reduced 2:1 horizontally and 2:1 vertically.

This step immediately reduces the data volume by one-half. In numerical terms it is highly lossy, but for most images it has almost no impact on perceived quality, because of the eye’s poorer resolution for chroma info.

Essentially this means that we can get away with less detail in colour information provided the detail is there in the luminance.

[As a simple (imperfect) analogy consider a typical children’s colouring-book image painted with watercolours – we barely notice any bleeding of colour from one section to another because the detailed information (edges) is maintained in the distinctive lines and shading.]
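The downsampling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a chroma plane is a list of rows, its width and height are even, and each 2×2 group is replaced by its plain average. The helper names `downsample_2x2` and `sample_counts` are invented for the example.

```python
def downsample_2x2(plane):
    """Average each 2x2 group of chroma samples (2:1 in both directions).

    Assumes `plane` is a list of equal-length rows with even width/height.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def sample_counts(width, height):
    """Return (samples before, samples after) downsampling Cb and Cr."""
    before = 3 * width * height                       # Y, Cb, Cr at full size
    after = width * height + 2 * (width // 2) * (height // 2)
    return before, after


cb = [[10, 20, 30, 40],
      [10, 20, 30, 40],
      [50, 60, 70, 80],
      [50, 60, 70, 80]]
print(downsample_2x2(cb))     # [[15.0, 35.0], [55.0, 75.0]]
print(sample_counts(16, 16))  # (768, 384): half as many samples
```

Counting samples this way shows where the "one-half" figure comes from: the luminance plane is untouched and each chroma plane shrinks to a quarter of its size.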

C. Transform to the spatial frequency domain. Group the pixel values for each component (Y, Cr, Cb) into 8×8 blocks and transform the pixel information in each block from the spatial domain [i.e. x,y coordinates + brightness] into the frequency domain [a representation of the two-dimensional waves that combine to make up the image] using a discrete cosine transform (DCT). The DCT gives a frequency map, with 8×8 components that describe the 2-D spatial frequency waves (surface undulations) that combine to make up the image information we see in each block.

Note: this is similar to the way a square wave can be made up by adding a series of sine waves of appropriate amplitude and frequency.

After the DCT we now have numbers representing the average intensity value in each block, followed by numbers that represent the successively higher frequency changes within the block. If the intensity in the block does not vary much, the values of these subsequent numbers will be low or zero. Even if there is some variation in intensity values across the block (i.e. higher frequency components are present), the subsequent numbers are usually quite low. It turns out that if we ignore some of the higher frequency components, the image looks pretty much the same when reconstructed. The motivation for transforming to the frequency domain is that we can now throw away high-frequency information without affecting low-frequency information. This is achieved in the next step.
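For anyone who wants to see the transform rather than read about it, here is a slow but direct Python sketch of an 8×8 DCT. It assumes the usual type-II DCT with orthonormal scaling, and it leaves out the step where real encoders subtract 128 from each sample first. The name `dct_8x8` is just for this example.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct_8x8(block):
    """Return the 8x8 grid of DCT coefficients for an 8x8 block of samples."""
    def scale(k):
        # The zero-frequency term gets a smaller weight than the others.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * scale(u) * scale(v) * total
    return coeffs


# A completely flat block: every sample has the same brightness.
flat = [[100] * N for _ in range(N)]
coeffs = dct_8x8(flat)
print(round(coeffs[0][0], 6))  # 800.0, i.e. 8 times the average of 100
```

With this scaling the first number is eight times the block's average rather than the average itself, and for a flat block every other coefficient comes out as zero (up to floating-point noise), which matches the tutorial's point that unvarying blocks leave little in the higher frequencies.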

D. Quantize the frequency values (divide by a quantization coefficient). Quantization means limiting the possible values of a quantity to a discrete set of values – usually resulting in fewer possible values. This can therefore represent a data reduction step.

For example, consider the following series of numbers:

3 8 13 17 18 20 24 26 29 30

If we divide these numbers by a quantization coefficient of 10 using integer arithmetic (that means ignoring the fraction parts of a number) the results would be:

0 0 1 1 1 2 2 2 2 3

After quantization, the 10 different values are now represented as only 4 different values. Consequently, information has been lost. Even if we try to restore the data by multiplying by the quantization coefficient, we get:

0 0 10 10 10 20 20 20 20 30

Although the numbers now span the same range, they are now quantized into only four distinct values (information remains lost).
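The worked example above is easy to reproduce in Python. This sketch uses the same ten numbers and the same coefficient of 10 from the tutorial; the variable names are my own. Integer division (`//`) drops the fractional part for these non-negative values, which is what the tutorial means by integer arithmetic.

```python
values = [3, 8, 13, 17, 18, 20, 24, 26, 29, 30]
coefficient = 10

# Quantize: integer division throws away the fractional part.
quantized = [v // coefficient for v in values]

# Try to restore: multiply back by the same coefficient.
restored = [q * coefficient for q in quantized]

print(quantized)            # [0, 0, 1, 1, 1, 2, 2, 2, 2, 3]
print(restored)             # [0, 0, 10, 10, 10, 20, 20, 20, 20, 30]
print(len(set(quantized)))  # 4 distinct values instead of 10
```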

In JPEG compression, we quantize the frequency components in each block by dividing each of the 64 frequency values by a quantization coefficient (rounding to integers). This is the fundamental information-losing step. The larger the quantization coefficient, the more likely our quantized values will be rounded down to zero, which leads to the discarding of frequency components.

Even where we divide by the minimum possible quantization coefficient (1) information is still lost due to rounding errors. Because higher frequency components are usually of a small amplitude, the numbers representing them are always quantized less accurately than for the lower frequency components, since higher frequencies are represented by values initially closer to zero.

Most JPEG-creating programs allow the specification of a quality factor. It is this factor that controls the degree of quantization (or rounding down) that occurs. Choosing a low quality factor results in a relatively large quantization coefficient. This means that the frequency values are divided by a large number, and consequently more of them will be rounded down to below one, so more of the higher frequency components are lost.

[e.g. for a frequency value of 13, and quantization coefficient of 10: 13/10 = 1.3 which rounds to 1 in integer arithmetic. For a frequency value of 7 (same coefficient), 7/10 = 0.7 which rounds to 0].

E. Use Huffman encoding to further compress the quantized frequency data. Because there are many zeros in the quantized data blocks, there is scope for more efficient representation of this data using a lossless technique such as Huffman encoding. Because this step is lossless, it doesn’t further affect image quality.
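To show why all those zeros help, here is a small Python sketch of Huffman coding. It is a simplification and not what a real JPEG encoder does: real encoders Huffman-code run-length symbols rather than raw values, and the sample data and the name `huffman_code` are made up for illustration.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(symbols):
    """Build a prefix-free bit string for each distinct symbol."""
    order = count()  # tie-breaker so the heap never compares dicts
    heap = [(n, next(order), {sym: ""}) for sym, n in Counter(symbols).items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Only one distinct symbol: give it a one-bit code.
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        # Merge the two rarest groups, prefixing their codes with 0 and 1.
        n1, _, group1 = heapq.heappop(heap)
        n2, _, group2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (n1 + n2, next(order), merged))
    return heap[0][2]


# Hypothetical quantized values from one block: mostly zeros.
sample = [0] * 10 + [1] * 3 + [2] * 2 + [5]
codes = huffman_code(sample)
total_bits = sum(len(codes[s]) for s in sample)
print(len(codes[0]))  # 1: the very common zero gets a one-bit code
print(total_bits)     # 25 bits, versus 32 at a fixed 2 bits per value
```

The common value gets the shortest code and the rare ones get longer codes, so nothing is lost but the total shrinks.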

F. Tack on appropriate header and output the result. All of the compression parameters are included in the header so that the decompressor can reverse the process. These parameters include the quantization tables and the Huffman coding tables.

The decompression algorithm reverses this process. The decompressor multiplies the reduced coefficients by the quantization table entries to produce approximate DCT coefficients. Since these are only approximate, the reconstructed pixel values are also approximate; however, the errors won’t be highly visible. A high-quality decompressor will typically add some smoothing steps to reduce pixel-to-pixel discontinuities.
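The whole round trip for one block can be sketched in Python as below. Several things here are assumptions made to keep it short: a single quantization coefficient `q` is used for all 64 values (real JPEGs use a table of 64), values are rounded to the nearest integer, and the colour conversion, Huffman coding and any smoothing are left out. The function names are invented for the example.

```python
import math

N = 8
# Orthonormal DCT basis: T[u][x] = 0.5 * C(u) * cos((2x + 1) * u * pi / 16)
T = [[0.5 * (1 / math.sqrt(2) if u == 0 else 1.0)
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]


def matmul(a, b):
    """Multiply two 8x8 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct(block):
    return matmul(matmul(T, block), transpose(T))


def idct(coeffs):
    return matmul(matmul(transpose(T), coeffs), T)


def round_trip(block, q):
    """Compress one block with coefficient q, then decompress it again."""
    quantized = [[round(c / q) for c in row] for row in dct(block)]
    approx = [[n * q for n in row] for row in quantized]  # decoder's multiply
    return idct(approx)


# A hypothetical block whose brightness ramps smoothly from 0 to 63.
block = [[x * N + y for y in range(N)] for x in range(N)]
for q in (1, 20):
    out = round_trip(block, q)
    worst = max(abs(out[x][y] - block[x][y])
                for x in range(N) for y in range(N))
    print(q, round(worst, 2))  # worst pixel error for this coefficient
```

Because the decoder only ever sees the rounded numbers, the pixels it rebuilds are close to the originals but not identical, and the gap can be no larger than about four times `q` for any pixel in this setup.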

Sorry for the massive word vomit there, but that is the full quote from the tutorial solutions sheet, and that question was on the test. If you have little idea of what the above was talking about, join the club. Like I said, I have a pretty good understanding of JPEG compression, but when things get so wordy, you begin to have trouble understanding even that which you knew before.

As one might imagine, the exam was a lovely mess.


Comments ordered from oldest to newest.

Himani

June 10, 2008 at 3:14 pm

Wow, it just went on and on. Your explanation is what I would have used to explain JPEG on a “basic” level, as well. I’ve had so many classes like the one you described — utterly useless or nearly useless, extremely complicated for no reason, and/or a pain in the backside — in the University system that I’m quite jaded now. But even after graduating I had to go back because my degree was too specialized to get much work. Joy.

Lelia

June 10, 2008 at 3:50 pm

That was really how most of the explanations went in the lecture notes. There was some basic science and biology learning toward the beginning of the class (concave/convex lenses, waves, the anatomy of the eye), all of which I’d had even in elementary school science, but their explanations were so all over the place that I had trouble understanding them now. The diagrams they drew were pretty horrible, too, and it turned out that you needed to draw a number of them for the exam!

The exam was worth 60% of my entire grade, to make matters worse!

It doesn’t surprise me that you’ve ended up back in uni yourself, because of what you said. Unis love to recruit people for things like that, I guess, because they know you’ll be back. Considering no one seems to give a flying flip when it comes to my web design work whether I have a degree or not, it would appear a lot of my time in classrooms will have been a bit of a waste, really. Can’t say I’m surprised!

Josh

June 10, 2008 at 11:25 pm

That explanation is insane. You were supposed to be able to puke that back up as an answer?

Lelia

June 11, 2008 at 2:20 am

Yes, and I unfortunately couldn’t remember it this morning! I think I’d just retained all the junk I could at a certain point, and it wasn’t really enough at all!