2023
Speech Audio Synthesis from Tagged MRI and Non-negative Matrix Factorization via Plastic Transformer
Liu X, Xing F, Stone M, Zhuo J, Fels S, Prince J, El Fakhri G, Woo J. Speech Audio Synthesis from Tagged MRI and Non-negative Matrix Factorization via Plastic Transformer. Lecture Notes In Computer Science 2023, 14226: 435-445. PMID: 38651032, PMCID: PMC11034915, DOI: 10.1007/978-3-031-43990-2_41.
Peer-Reviewed Original Research
Concepts: Weight map; Audio waveform; End-to-end deep learning framework; Matrix factorization-based approaches; Factorization-based approach; Deep learning framework; Non-negative matrix factorization; End-to-end; Adversarial training; Process of speech production; Two-dimensional spectrogram; Conventional convolution; Learning framework; Motion features; Training samples; Audio synthesis; Dimension expansion; Matrix input; Matrix factorization; Tagged MRI; Speech production; Transformation model; Experimental results; Spectrogram; Plastic transformation
Synthesizing audio from tongue motion during speech using tagged MRI via transformer
Liu X, Xing F, Prince J, Stone M, Fakhri G, Woo J. Synthesizing audio from tongue motion during speech using tagged MRI via transformer. Proceedings Of SPIE--the International Society For Optical Engineering 2023, 12464: 1246410-1246410-5. PMID: 38009135, PMCID: PMC10669779, DOI: 10.1117/12.2653345.
Peer-Reviewed Original Research
Concepts: Motion field; Audio waveform; Adversarial training approach; Improve synthesis quality; Convolutional decoder; Audio data; Synthesis quality; Translation network; Data structure; Speech waveform; Temporal model; Tagged MRI; Tongue motion; Training approach; Spectrogram; Muscle deformation; Source of information; Speech; Intelligible speech; Framework; Decoding; Information; Predictive information; Encoding; Network
Quantifying velopharyngeal motion variation in speech sound production using an audio-informed dynamic MRI atlas
Xing F, Jin R, Gilbert I, El Fakhri G, Perry J, Sutton B, Woo J. Quantifying velopharyngeal motion variation in speech sound production using an audio-informed dynamic MRI atlas. Proceedings Of SPIE--the International Society For Optical Engineering 2023, 12464: 124642m-124642m-6. PMID: 37621417, PMCID: PMC10448831, DOI: 10.1117/12.2654082.
Peer-Reviewed Original Research
Concepts: Motion field; Real-time speech; High-dimensional datasets; Audio waveform; Atlas space; Temporal alignment; Motion variations; Dataset; Magnetic resonance imaging; Motion atlas; Motion differences; Speech variation; Image acquisition; Task; Speech; Motion characteristics; Dynamic magnetic resonance imaging; Principal components; Images
2022
Tagged-MRI Sequence to Audio Synthesis via Self Residual Attention Guided Heterogeneous Translator
Liu X, Xing F, Prince J, Zhuo J, Stone M, El Fakhri G, Woo J. Tagged-MRI Sequence to Audio Synthesis via Self Residual Attention Guided Heterogeneous Translator. Lecture Notes In Computer Science 2022, 13436: 376-386. PMID: 36820764, PMCID: PMC9942274, DOI: 10.1007/978-3-031-16446-0_36.
Peer-Reviewed Original Research
Concepts: Audio waveform; End-to-end deep learning framework; Adversarial training approach; Deep learning framework; End-to-end; Two-dimensional spectrogram; Adversarial network; Intermediate representation; Learning framework; Residual attention; Disentanglement strategy; Audio synthesis; Dataset size; Improve realism; Heterogeneous representations; Heterogeneous translation; Attentional strategies; Training approach; Experimental results; Muscle deformation; Intelligible speech; Motor control theories; Tagged-MRI; Related-disorders; Speech acoustics
Tagged-MRI to audio synthesis with a pairwise heterogeneous deep translator
Liu X, Xing F, Stone M, Prince J, Kim J, Fakhri G, Woo J. Tagged-MRI to audio synthesis with a pairwise heterogeneous deep translator. The Journal Of The Acoustical Society Of America 2022, 151: a133-a133. DOI: 10.1121/10.0010891.
Peer-Reviewed Original Research
Concepts: Latent space features; Encoder-decoder structure; CNN-based encoder; Space features; Deep learning framework; Tagged MRI sequences; Kullback-Leibler divergence; Mel-spectrogram; Speech-related disorders; Learning framework; Audio synthesis; Audio waveform; Speech production; Kullback-Leibler; Heterogeneous representations; Evaluation strategies; Intelligible speech; Framework; Tagged-MRI; Speech; Decoding; Audio; Visual movement; Encoding; Utterances
2021
4D magnetic resonance imaging atlas construction using temporally aligned audio waveforms in speech
Xing F, Jin R, Gilbert I, Perry J, Sutton B, Liu X, Fakhri G, Shosted R, Woo J. 4D magnetic resonance imaging atlas construction using temporally aligned audio waveforms in speech. The Journal Of The Acoustical Society Of America 2021, 150: 3500-3508. PMID: 34852570, PMCID: PMC8580575, DOI: 10.1121/10.0007064.
Peer-Reviewed Original Research
Concepts: Audio waveform; Temporal domain information; Multi-subject data; Atlas construction; Mutual information measure; MR image datasets; Image datasets; Target domain; Domain information; Post-processing method; Image sequences; Temporal alignment; Spatiotemporal alignment; Matching patterns; Information measures; Image data; Square error; Aligned volumes; Alignment map; Overall score increase; MR technology; Cross-correlation; Deformable registration; Speech; Images