Speech and Computer [electronic resource] : 18th International Conference, SPECOM 2016, Budapest, Hungary, August 23-27, 2016, Proceedings /
edited by Andrey Ronzhin, Rodmonga Potapova, Géza Németh.
- XVIII, 731 p. 197 illus. online resource.
- Lecture Notes in Computer Science, 0302-9743 ; 9811.
Automatic Speech Recognition based on Neural Networks -- Machine Processing of Dialogue States; Speculations on Conversational Entropy -- Speech Recognition Challenges in the Car Navigation Industry -- A Comparison of Acoustic Features of Speech of Typically Developing Children and Children with Autism Spectrum Disorders -- A Deep Neural Networks (DNN) Based models for a Computer Aided Pronunciation Learning System -- A Linguistic Interpretation of the Atom Decomposition of Fundamental Frequency Contour for American English -- A Phonetic Segmentation Procedure Based on Hidden Markov Models -- A Preliminary Exploration of Group Social Engagement Level Recognition in Multiparty Casual Conversation -- An Agonist-Antagonist Pitch Production Model -- An Algorithm for Phase Manipulation in a Speech Signal -- An Exploratory Study on Sociolinguistic Variation of Russian Everyday Speech -- Adaptation of DNN Acoustic Models using KL-divergence Regularization and Multi-Task Training -- Advances in STC Russian Spontaneous Speech Recognition System -- Approaches for Out-of-Domain Adaptation to Improve Speaker Recognition Performance -- Assessment of the Relation Between Low-Frequency Features and Velum Opening by Using Real Articulatory Data -- Automatic Summarization of Highly Spontaneous Speech -- Backchanneling via Twitter Data for Conversational Dialogue Systems -- Bio-Inspired Sparse Representation of Speech and Audio Using Psychoacoustic Adaptive Matching Pursuit -- Combining Atom Decomposition of the F0 Track and HMM-based Phonological Phrase Modelling for Robust Stress Detection in Speech -- Comparative analysis of classifiers for automatic language recognition in spontaneous speech -- Comparison of Retrieval Approaches and Blind Relevance Feedback Methods within the Czech Speech Information Retrieval -- Convolutional Neural Network in the Task of Speaker Change Detection -- Design of a Speech Corpus for Research on Cross-Lingual Prosody Transfer -- Designing 
High-Coverage Multi-Level Text Corpus for Non-Professional-Voice Conservation -- Designing Syllable Models for an HMM based Speech Recognition System -- Detecting Filled Pauses and Lengthenings in Russian Spontaneous Speech using SVM -- Detecting Laughter and Filler Events by Time Series Smoothing with Genetic Algorithms -- Detecting State of Aggression in Sentences using CNN -- DNN-based Acoustic Modeling for Russian Speech Recognition Using Kaldi -- DNN-Based Duration Modeling for Synthesizing Short Sentences -- Emotional Speech of 3-Years Old Children: Norm-Risk-Deprivation -- Ensemble Deep Neural Network based Waveform-Driven Stress Model for Speech Synthesis -- Evaluation of Response Times on a Touch Screen using Stereo Panned Speech Command Auditory Feedback -- Evaluation of the Speech Quality During Rehabilitation after Surgical Treatment of the Cancer of Oral Cavity and Oropharynx based on a Comparison of the Fourier Spectra -- Experiments with One-Class Classifier as a Predictor of Spectral Discontinuities in Unit Concatenation -- Exploring GMM-derived Features for Unsupervised Adaptation of Deep Neural Network Acoustic Models -- Feature Space VTS with Phase Term Modeling -- Finding Speaker Position Under Difficult Acoustic Conditions -- Fusing Various Audio Feature Sets for Detection of Parkinson's Disease from Sustained Voice and Speech Recordings -- HAVRUS Corpus: High-speed Recordings of Audio-Visual Russian Speech -- Human-Smartphone Interaction for Dangerous Situation Detection & Recommendation Generation while Driving -- Improving Automatic Speech Recognition Containing Additive Noise Using Deep Denoising Autoencoders of LSTM Networks -- Improving the Quality of Automatic Speech Recognition in Trucks -- Improving Recognition of Dysarthric Speech Using Severity Based Tempo Adaptation -- Improving Robustness of Speaker Verification by Fusion of Prompted Text-Dependent & Text-Independent Operation Modalities -- Improvements to Prosodic Variation in 
Long Short-Term Memory based Intonation Models Using Random Forest -- In-document Adaptation for a Human Guided Automatic Transcription Service -- Interaction Quality as a Human-Human Task-Oriented Conversation Performance -- Investigation of Segmentation in i-Vector based Speaker Diarization of Telephone Speech -- Investigation of Speech Signal Parameters Reflecting the Truth of Transmitted Information -- Investigating Signal Correlation as Continuity Metric in a Syllable based Unit Selection Synthesis System -- Knowledge Transfer for Utterance Classification in Low-Resource Languages -- Language Identification using Time Delay Neural Network D-Vector on Short Utterances -- Lexical Stress in Punjabi and its Representation in PLS -- Low Inter-Annotator Agreement in Sentence Boundary Detection and Personality -- LSTM-based Language Models for Spontaneous Speech Recognition -- Measuring Prosodic Entrainment in Italian Collaborative Game-based Dialogues -- Microphone Array Directivity Improvement in Low-Frequency Domain for Speech Processing -- Modeling Imperative Utterances in Russian Spoken Dialogue: Verb-Central Quantitative Approach -- Multimodal Perception of Aggressive Behavior -- On Individual Polyinformativity of Speech and Voice Regarding Speaker's Auditive Attribution (Forensic Phonetic Aspect) -- Online Biometric Identification With Face Analysis in Web Applications -- Optimization of Zelinski post-filtering calculation -- Phonetic Aspects of High Level of Naturalness in Speech Synthesis -- Polybasic Attribution of Social Network Discourse -- Precise Estimation of Harmonic Parameter Trend and Modification of a Speech Signal -- Profiling a Set of Personality Traits of a Text's Author: a Corpus-Based Approach -- Prosody Analysis of Malay Language Storytelling Corpus -- Quality Assessment of two Fullband Audio Codecs Supporting Real-Time Communication -- Robust Speech Analysis Based on Source-Filter Model Using Multivariate Empirical Mode Decomposition in 
Noisy Environments -- Scenarios of Multimodal Information Navigation Services for Users in Cyberphysical Environment -- Scores Calibration in Speaker Recognition Systems -- Selecting Keypoint Detector and Descriptor Combination for Augmented Reality Application -- Semi-automatic Speaker Verification System Based on Analysis of Formant, Durational and Pitch Characteristics -- Speaker-Dependent Bottleneck Features for Egyptian Arabic Speech Recognition -- Speech Acts Annotation of Everyday Conversations in the ORD corpus of Spoken Russian -- Speech Enhancement with Microphone Array Using a Multi Beam Adaptive Noise Suppressor -- Speech Features Evaluation for Small Set Automatic Speaker Verification Using GMM-UBM System -- Speech Recognition combining MFCCs and Image Features -- Sociolinguistic Extension of the ORD Corpus of Russian Everyday Speech -- Statistical Analysis of Acoustical Parameters in the Voice of Children with Juvenile Dysphonia -- Stress, Arousal, and Stress Detector Trained on Acted Speech Database -- Study on the Improvement of Intelligibility for Elderly Speech using Formant Frequency Shift Method -- Text Classification in the Domain of Applied Linguistics as Part of a Pre-editing Module for Machine Translation Systems -- Tonal Specification of Perceptually Prominent Non-Nuclear Pitch Accents in Russian -- Toward Sign Language Motion Capture Dataset Building -- Trade-off Between Speed and Accuracy for Noise Variance Minimization (NVM) Pitch Estimation Algorithm -- Unsupervised Trained Functional Discourse Parser for E-Learning Materials Scaffolding.
This book constitutes the proceedings of the 18th International Conference on Speech and Computer, SPECOM 2016, held in Budapest, Hungary, in August 2016. The 85 papers presented in this volume were carefully reviewed and selected from 154 submissions.
9783319439587
10.1007/978-3-319-43958-7 doi
Computer science.
Database management.
Information storage and retrieval.
Artificial intelligence.
Image processing.
Pattern recognition.
Computer Science.
Artificial Intelligence (incl. Robotics).
Information Systems Applications (incl. Internet).
Pattern Recognition.
Information Storage and Retrieval.
Image Processing and Computer Vision.
Database Management.
Q334-342 TJ210.2-211.495
006.3