HearYou2.0

Seminar project for the Multimodal Corpus Linguistics seminar, summer semester 2018 (SS18).

Prerequisites

Python 3.5+
Keras 2
TensorFlow 1.8

Supported Features

class weights and an attention wrapper
train on different data types (scripted / improvised / both)
train on different speech features (static / dynamic, i.e. deltas and delta-deltas)
train on different modalities (speech / text / motion / all, or configure your own combinations in ./configs)
train on different ANN architectures (convolutional / recurrent networks, or configure your own models in ./models)
speaker-independent configuration for speech
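
As an illustration of the class-weight feature listed above, the sketch below computes inverse-frequency weights from integer emotion labels. It is a minimal example under assumptions: the function name `balanced_class_weights` and the example labels are hypothetical, and the weighting scheme used in this repository may differ.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class_index: weight} with weight = n_samples / (n_classes * count).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in sorted(counts.items())}


# Hypothetical label distribution: class 0 is twice as frequent as classes 1 and 2.
example_labels = [0, 0, 0, 0, 1, 1, 2, 2]
weights = balanced_class_weights(example_labels)
print(weights)
```

A dictionary of this shape can be passed to Keras through the `class_weight` argument of `model.fit`.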

How to run

python HearYou2.0.py -c configs/<model_to_run>.json

If the -c flag is not given, all configurations stored in ./configs are run.
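
The flag handling described above could look roughly like the sketch below. It is not the repository's actual entry point: the function names `select_configs` and `load_config` and the long option `--config` are hypothetical, and only the `-c` flag and the ./configs directory come from this README.

```python
import argparse
import glob
import json
import os


def select_configs(argv, config_dir="configs"):
    """Return the list of config files to run.

    With -c, only the given file is returned; without it, every
    *.json file in config_dir is returned in sorted order.
    """
    parser = argparse.ArgumentParser(description="Run one or all configurations.")
    parser.add_argument("-c", "--config", default=None,
                        help="path to a single JSON configuration")
    args = parser.parse_args(argv)
    if args.config is not None:
        return [args.config]
    return sorted(glob.glob(os.path.join(config_dir, "*.json")))


def load_config(path):
    """Read one JSON configuration file into a dictionary."""
    with open(path) as f:
        return json.load(f)


if __name__ == "__main__":
    import sys
    for path in select_configs(sys.argv[1:]):
        print("would run configuration:", path)
```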

Results

static speech feature

modality	Scripted (MFCC)	Scripted (all)	Improvised (MFCC)	Improvised (all)	Both (MFCC)	Both (all)
text	xx%	xx%	56%	xx%	61%	xx%
speech	xx%	xx%	34%	xx%	51%	xx%
mocap	xx%	xx%	xx%	xx%	45%	xx%
text+speech	xx%	xx%	xx%	xx%	67%	xx%
text+speech+mocap	xx%	xx%	xx%	xx%	70%	xx%

dynamic speech feature (with 1st/2nd derivative)
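
For orientation, the first and second derivatives (deltas and delta-deltas) of a static feature matrix can be computed with the standard regression formula, as sketched below. This is a self-contained NumPy illustration rather than the code used for the results: the function names and the window size `N=2` are assumptions.

```python
import numpy as np


def delta(features, N=2):
    """Delta features over a window of N frames on each side.

    features: array of shape (num_frames, num_coeffs).
    d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2),
    with the first and last frames repeated at the edges.
    """
    if N < 1:
        raise ValueError("N must be >= 1")
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    denom = 2 * sum(n * n for n in range(1, N + 1))
    padded = np.pad(features, ((N, N), (0, 0)), mode="edge")
    out = np.zeros_like(features)
    for n in range(1, N + 1):
        ahead = padded[N + n: N + n + num_frames]    # c_{t+n}
        behind = padded[N - n: N - n + num_frames]   # c_{t-n}
        out += n * (ahead - behind)
    return out / denom


def add_dynamics(static, N=2):
    """Stack static features with their deltas and delta-deltas column-wise."""
    d1 = delta(static, N)
    d2 = delta(d1, N)
    return np.hstack([static, d1, d2])
```

Stacking in this way triples the feature dimension, for example from 13 to 39 columns for 13 MFCCs.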

improvised data

modality	MFCC	34 features
speech	57%	51%
speech+mocap	73%	74%
text+speech	65%	50%
text+speech+mocap	76%	69%

scripted data

modality	MFCC	34 features
speech	53%	51%
speech+mocap	41%	47%
text+speech	54%	51%
text+speech+mocap	44%	38%

complete data

modality	MFCC	34 features
speech	50%	50%
speech+mocap	61%	52%
text+speech	50%	52%
text+speech+mocap	60%	50%

(text LSTM without attention)

References

IEMOCAP data

https://sail.usc.edu/iemocap/iemocap_release.htm

Feature Extraction Library

https://github.com/tyiannak/pyAudioAnalysis/blob/master/pyAudioAnalysis/audioFeatureExtraction.py

Deltas & DeltasDeltas

https://github.com/jameslyons/python_speech_features/blob/master/python_speech_features/base.py

Conceptor

https://github.com/littleowen/Conceptor

Multimodality

https://github.com/Samarth-Tripathi/IEMOCAP-Emotion-Detection
