Exploratory Data Analysis - HamNoSys Baby Steps
Dive deeper into our data.
In our previous post we told you a story about collecting and preprocessing sign language datasets. Now it is time to take a closer look at the datasets we use through exploratory data analysis, or EDA for short.
Out of all available datasets, we chose five with Hamburg Notation System (HamNoSys) annotations. As our work still revolves around this notation system, we have already prepared some posts about it; you can find them here and here.
We ended up with a list of datasets that in total amounts to 9.5 hours and 10k glosses, as presented in Table 1. The list contains the following datasets:
- PJM corpus - a collection of video data from 2 signers of Polish Sign Language (PJM) at a resolution of 1280x720 px, recorded frontally against a plain background.
- basic lexicon - a multilingual corpus of four sign languages, BSL (British), DGS (German), GSL (Greek) and LSF (French), covering the topic of European travel at a resolution of 320x240 px, recorded mainly frontally against a plain background.
- GaLex - a corpus-based dictionary of DGS for technical terms from the field of landscaping and horticulture at a resolution of 160x90 px.
- GLex - a corpus-based dictionary of DGS for technical terms from the field of health and nursing care at a resolution of 108x80 px.
- GSL dataset - a large-scale RGB+D dataset divided into isolated glosses of GSL and saved as frames (separate images) at a resolution of 1440x1080 px.
|Language||Time||Number of glosses|
Table 1. Basic time/instance statistics.
As you can see from these descriptions, the resulting collection consists of various glosses in different languages and from different subject domains. However, because HamNoSys describes the initial position of the movement, it should be possible to concatenate the datasets in order to improve sign recognition.
Counting meaningful symbols
At the beginning of our studies we decided to use eight types of blocks to describe the analyzed gestures. They are incorporated in our pipeline through eight classification heads, one per block. The number of classes differs significantly between heads, as does the number of instances in each class of a given head. Fig. 1 shows basic numerical statistics for each defined block. For some classes of the block describing extended finger direction we do not observe a single case (try to find class 10 in the light green bar plot!).
Fig 1. Class size for each subgroup.
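To make the counting step concrete, here is a minimal sketch of how class sizes per block could be tallied and empty classes spotted. The block names, the number of classes per block, and the toy annotations are all made up for illustration; they are not taken from our pipeline.

```python
from collections import Counter

# Hypothetical block names and class counts (assumptions for this sketch).
NUM_CLASSES = {"symmetry_operator": 3, "thumb_position": 5}

# Toy annotations: one dict per gloss, mapping block name -> class index.
glosses = [
    {"symmetry_operator": 0, "thumb_position": 1},
    {"symmetry_operator": 0, "thumb_position": 3},
    {"symmetry_operator": 2, "thumb_position": 1},
    {"symmetry_operator": 0, "thumb_position": 0},
]


def class_sizes(rows, num_classes):
    """Return {block: [instances of class 0, class 1, ...]}."""
    sizes = {}
    for block, n in num_classes.items():
        counts = Counter(row[block] for row in rows)
        # Classes that never occur are kept with an explicit count of 0.
        sizes[block] = [counts.get(c, 0) for c in range(n)]
    return sizes


def empty_classes(sizes):
    """Return {block: [class indices with no instances at all]}."""
    return {
        block: [c for c, k in enumerate(counts) if k == 0]
        for block, counts in sizes.items()
    }


sizes = class_sizes(glosses, NUM_CLASSES)
empty = empty_classes(sizes)
print(sizes)
print(empty)
```

Listing the empty classes explicitly is useful because a classification head cannot learn a class it never sees during training.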
The most common classes for each block and each dataset are presented in Table 2. For the symmetry operator and hand bending the most frequent class is 0, which means that no symbol of this type is present. For the base form the most common classes are the flat hand and the hand with one finger pointed out (usually the index finger). The thumb position block is fairly evenly distributed (not counting class 4, where the thumb is placed under all fingers) - the frequency of the different classes depends on the language. The most common hand location in the top/bottom frontal plane is the center (class 2), which is not surprising at all.
|Symmetry operator||Base form||Thumb position||Bending||Ext finger direction||Palm orientation||Frontal plane l/r||Frontal plane t/b|
Table 2. Most frequent classes per analyzed dataset.
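A table like this can be derived by grouping the annotations by dataset and block and taking the mode of each group. The sketch below assumes a flat list of (dataset, block, class) records; the record layout and the toy values are illustrative only and do not reproduce the numbers in Table 2.

```python
from collections import Counter, defaultdict

# Toy records: (dataset name, block name, class index). Values are made up.
records = [
    ("PJM", "base_form", 0),
    ("PJM", "base_form", 0),
    ("PJM", "base_form", 2),
    ("GLex", "base_form", 2),
    ("GLex", "base_form", 2),
    ("GLex", "base_form", 0),
    ("PJM", "frontal_plane_tb", 2),
    ("GLex", "frontal_plane_tb", 2),
]


def most_frequent(rows):
    """Return {(dataset, block): most frequent class index}."""
    counters = defaultdict(Counter)
    for dataset, block, cls in rows:
        counters[(dataset, block)][cls] += 1
    # most_common(1) yields a one-element list of (class, count) pairs.
    return {key: c.most_common(1)[0][0] for key, c in counters.items()}


modes = most_frequent(records)
for (dataset, block), cls in sorted(modes.items()):
    print(f"{dataset:5s} {block:18s} class {cls}")
```

When two classes are tied, `Counter.most_common` returns the one that was seen first, so a real analysis may want to report ties explicitly.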
Finally, we can also look at a correlation matrix of all the variables in the concatenated dataset. We can see some positive and negative correlations, which makes sense: the presence of some HamNoSys symbols in a gloss description excludes the appearance of others, while other symbols only appear together.
Fig 2. Correlation matrix.
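To illustrate why mutually exclusive symbols produce negative correlations and co-occurring symbols produce positive ones, here is a small sketch that computes a Pearson correlation matrix by hand. It assumes each variable is a binary indicator of whether a symbol appears in a gloss; the symbol names and the data are hypothetical, and the encoding behind Fig. 2 may differ.

```python
from math import sqrt

# Toy binary indicators: one row per gloss, one column per hypothetical symbol.
# sym_a and sym_b never co-occur; sym_c always accompanies sym_a.
columns = ["sym_a", "sym_b", "sym_c"]
rows = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 0],
]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation with anything.
        return float("nan")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(data):
    """Return the column-by-column Pearson correlation matrix."""
    cols = list(zip(*data))  # transpose rows into columns
    return [[pearson(a, b) for b in cols] for a in cols]


matrix = correlation_matrix(rows)
for name, line in zip(columns, matrix):
    print(name, [round(v, 2) for v in line])
```

In this toy data the two symbols that exclude each other reach a correlation of -1 and the two that always appear together reach +1; real annotations will fall somewhere in between.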
In general, going through the exploratory data analysis has given us a much better understanding of the data, which will make it easier to choose the model and its attributes and to refine it overall. Stay tuned to read about the results of our training!