Principal Component Analysis and the Chemical Composition of Coffee Odors

    A key requirement of an electronic nose is a way to correlate human responses with the responses of the sensors. This can be accomplished by using a neural network to analyze the pattern of the data and compare it to the pattern of an unknown. Each sensor's response could be plotted on its own axis. However, when there are more than three sensors, direct visualization becomes impossible. Principal Component Analysis (PCA) takes the points plotted in higher-dimensional space and projects them onto a 2-D plane while preserving as much of the variation between the points as possible. A graphical representation allows the user to find patterns, and perhaps correlate the responses to the characteristics of the smell, that is, sweet, offensive, green, nutty, etc.

    The odour causes a response in the sensors, which is amplified by the circuitry and sent to the computer via a data acquisition board. The data is then analyzed in LabVIEW, a graphical programming language used to build virtual instruments (VIs) that can be operated like real instruments. The HAL2001.vi will be able to acquire, display, and analyze the data and find the patterns by using the Eigenanalysis.vi as described below.

    The Eigenanalysis.vi carries out a principal component analysis in the following steps.

    1. The data is scaled by subtracting the mean and then dividing by the standard deviation, (x − Σx/N)/σ, for each sensor, producing data with a mean of zero and a standard deviation of one. This prevents any one sensor's readings from dominating the other sensors.
    2. All of the scaled data is put into an m x n matrix, A (one row per data run, one column per sensor).
    3. AᵀA produces an n x n matrix.
    4. The eigenvalues (λ) and eigenvectors (V) of AᵀA are calculated.
    5. The fraction of the total variance carried by the k-th eigenvector is λk/Σλk. The larger this fraction, the more of the information in the data is captured along that axis.
    6. The eigenvectors corresponding to the two largest eigenvalues are taken.
    7. The dot product of each data run is taken with each of the two eigenvectors, producing a pair of coordinates (X, Y).
    8. The two numbers are plotted on a graph, where trends and relationships can be seen.
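
    The steps above can be sketched in a few lines of Python with NumPy. This is only an illustration, not the Eigenanalysis.vi itself: the function name pca_2d is hypothetical, the sample standard deviation (N-1 in the denominator) is an assumption because the text does not say which form is used, and the sketch assumes no sensor column is constant (that would give a zero standard deviation).

```python
import numpy as np

def pca_2d(data):
    """Project an m x n array (m data runs, n sensors) onto two axes.

    Hypothetical helper that follows steps 1-7 of the text.
    Returns the (X, Y) coordinates of every run and the variance
    fraction of every eigenvector, largest first.
    """
    data = np.asarray(data, dtype=float)

    # Step 1: scale each sensor column to mean 0 and standard deviation 1.
    # ddof=1 (sample standard deviation) is an assumption.
    A = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

    # Steps 2-3: form the n x n matrix A^T A.
    ata = A.T @ A

    # Step 4: eigenvalues and eigenvectors. eigh is meant for symmetric
    # matrices and returns the eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(ata)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 5: fraction of the total variance carried by each eigenvector.
    explained = eigvals / eigvals.sum()

    # Steps 6-7: dot product of every run with the two leading eigenvectors.
    scores = A @ eigvecs[:, :2]
    return scores, explained

# Made-up readings: 6 data runs from 4 sensors (not real MOSES II data).
rng = np.random.default_rng(0)
runs = rng.normal(size=(6, 4))
scores, explained = pca_2d(runs)
print(scores)      # one (X, Y) pair per run, ready for plotting (step 8)
print(explained)   # variance fractions, which sum to 1
```

    The sign of an eigenvector is arbitrary, so a plot made this way may come out mirrored along either axis compared with another implementation. The distances between points are unaffected.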

    Here is a plot of different coffee odours using sensor data taken from the MOSES II

    The C's represent the known coffees and the U's represent the unknown samples, which were taken from the known coffees at a later time. The closer two points lie on the plot, the more likely the samples are related. As you can see, the sensors don't produce responses at the same point for the same coffee. The most likely reason is that the sensors don't recover enough between runs to give the same reading. A longer clean air flush time may help. The sensors also aren't perfect. Over time and with exposure to chemicals, their surfaces degrade and can't be expected to produce the same response.
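
    One simple way to read such a plot numerically is to match each unknown to the closest known point. The sketch below uses plain Euclidean distance in the 2-D plane. This nearest-neighbour rule is an assumption made for illustration, not the method used here (the text mentions a neural network for the pattern analysis), and the coordinates and labels are invented rather than taken from the MOSES II data.

```python
import numpy as np

def nearest_known(known_xy, known_labels, unknown_xy):
    """Return (label, distance) of the closest known point for each unknown.

    Hypothetical helper: all coordinates are (X, Y) pairs in the PCA plane.
    """
    known_xy = np.asarray(known_xy, dtype=float)
    unknown_xy = np.asarray(unknown_xy, dtype=float)

    # Pairwise Euclidean distances, shape (number of unknowns, number of knowns).
    diffs = unknown_xy[:, None, :] - known_xy[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)

    closest = dists.argmin(axis=1)
    return [(known_labels[i], float(dists[row, i]))
            for row, i in enumerate(closest)]

# Invented coordinates for three known coffees and two unknowns.
known = [(-2.0, 0.5), (1.5, 1.0), (0.5, -2.0)]
labels = ["C1", "C2", "C3"]
unknown = [(-1.6, 0.2), (0.9, -1.5)]

matches = nearest_known(known, labels, unknown)
for name, dist in matches:
    print(name, round(dist, 2))
```

    Because the sensors drift between runs, a match is only as trustworthy as its distance is small compared with the spacing between the known coffees.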

    Chemistry

    I have also been researching the chemical composition of coffee. Here are some informative websites.

    Jamaican Coffee is an excellent site listing coffee facts, GC/MS data for the major constituents in coffee, and 3-D representations of the molecules. Be sure to get the Plugin!

    Coffee Brewing is an OK site. The main thing it lists is the chemicals in coffee.

    The following are from the SIS Application Notes, which contain research papers that address headspace and other techniques for preparing organic volatiles for GC and GC/MS experiments.

    Comparison of Sensitivity of Headspace GC, Purge and Trap Thermal Desorption and Direct Thermal Extraction Techniques for Volatile Organics

    Direct Analysis Using the Short Path Thermal Desorption System

    Flavor/Fragrance Profiles of Instant and Ground Coffees by Short Path Thermal Desorption

    Direct Analysis of Spices and Coffee


    cmorong1@hotmail.com