SUNGJIN LEE

Research Scientist at Yahoo

Location
New York, New York
Industry
Higher Education
Previous
  1. Carnegie Mellon University,
  2. PIOLINK,
  3. Corecess
Education
  1. Pohang University of Science and Technology
166 connections

Background

Experience

Research Scientist

Yahoo
– Present (3 months), New York, New York

Postdoctoral Associate

Carnegie Mellon University
(2 years 2 months)

My research interests lie in various areas of speech and language processing as well as machine learning. I am currently a postdoctoral research fellow in the Language Technologies Institute at Carnegie Mellon University. I am primarily working on statistical dialog modeling, including dialog state tracking and dialog strategy learning. I have developed dialog state tracking systems that achieved state-of-the-art performance on average in the Dialog State Tracking Challenge 2013. I have also developed a rapid sparse Bayesian reinforcement learning algorithm for online dialog strategy optimization through interactions with real users. I am also interested in applying spoken language technologies to computer-assisted language learning settings.

Research Activities:
Dialog State Tracking Challenge (DSTC), Advisory Committee
The REAL Challenge, Scientific Committee
SIGDIAL, Program Committee
ACL, Program Committee
Interspeech, Program Committee
IJCNLP, Program Committee

Visiting Scholar

Carnegie Mellon University
(1 year)

Working on statistical dialog modeling

Senior team member

PIOLINK
(2 years 8 months)

PIOLINK is a leading application networking company in Korea. It manufactures Application Switches, Network Load Balancers, and Web Security Switches. I worked as a senior team member on the design and development of multiprocessor and multitasking systems. I implemented and managed firmware and software for embedded Linux systems running on both PowerPC and MIPS. I wrote numerous device drivers, including communications processor drivers (Broadcom BCM1250, BCM1480), SSL chip drivers (Britestream BN1010; Cavium CN1000), network chip drivers (Broadcom BCM5690, BCM5464SR), and peripheral device drivers (PCI, HT, System Monitoring chip, RTC & NVRAM, Flash memory, UART).

Senior team member

Corecess
(3 years 6 months)

CORECESS manufactures telecommunications equipment for broadband access networks such as Optical Link Technologies (GEPON and WDM PON), Intelligent Multilayer Switches, VDSL, and DSLAM. I worked on the design and development of quality of service (Corecess QoS architecture design and implementation), layer 2 network protocols (RSTP, LACP), and a Logging File System. I implemented and managed system software for embedded Linux and pSOS. I wrote numerous device drivers, including programmable network processor drivers (Agere APP550), network chip drivers (Switchcore CXE 1000, CXE16; Galileo Galnet II, Galnet II+, Galnet III), and peripheral device drivers (PCI, IIC, System Monitoring chip, RTC & NVRAM, Flash memory, UART).

Volunteer Experience & Causes

Advisory Committee

Dialog State Tracking Challenge
– Present (2 years), Science and Technology

Scientific Committee

The REAL Challenge
– Present (2 years), Science and Technology

Program Committee

SIGDIAL
– Present (2 years), Science and Technology

Program Committee

ACL
– Present (1 year), Science and Technology

Program Committee

Interspeech

Program Committee

IJCNLP
– Present (3 years), Science and Technology

Program Committee

SLT

Program Committee

IWSDS

Causes SUNGJIN cares about:

  • Science and Technology

Publications

Incremental Dialog Processing in a Task-Oriented Dialog

Interspeech 2014
2014

Incremental Dialog Processing (IDP) enables Spoken Dialog Systems to gradually process minimal units of user speech in order to give the user an early system response. In this paper, we present an application of IDP that shows its effectiveness in a task-oriented dialog system. We have implemented an IDP strategy and deployed it for one month on a real-user system. We compared the resulting dialogs with dialogs produced over the previous month without IDP. Results show that the incremental strategy significantly improved system performance by eliminating long and often off-task utterances that generally produce poor speech recognition results. User behavior is also affected; the user tends to shorten utterances after being interrupted by the system.

Extrinsic Evaluation of Dialog State Tracking and Predictive Metrics for Dialog Policy Optimization

SIGDIAL
2014

During the recent Dialog State Tracking Challenge (DSTC), a fundamental question was raised: “Would better performance in dialog state tracking translate to better performance of the optimized policy by reinforcement learning?” Also, during the challenge system evaluation, another non-trivial question arose: “Which evaluation metric and schedule would best predict improvement in overall dialog performance?” This paper aims to answer these questions by applying an off-policy reinforcement learning method to the output of each challenge system. The results give a positive answer to the first question. Thus the effort to separately improve the performance of dialog state tracking as carried out in the DSTC may be justified. The answer to the second question also draws several insightful conclusions on the characteristics of different evaluation metrics and schedules.

Postech Immersive English Study (POMY): Dialog-based Language Learning Game

IEICE transactions on information and systems, 97(7), 2014
2014

This study examines the dialog-based language learning game (DB-LLG) realized in a 3D environment built with game contents. We designed the DB-LLG so that users can conduct interactive conversations with game characters in various immersive environments. The pilot test identified several technologies as essential to constructing the DB-LLG, such as dialog management, hint generation, and grammar error detection and feedback. We describe the technical details of our system, POSTECH Immersive English Study (POMY). We evaluated the performance of each technology using a simulator and through field tests with users.

Structured Discriminative Model For Dialog State Tracking

SIGDIAL
2013

Many dialog state tracking algorithms have been limited to generative modeling due to the influence of the Partially Observable Markov Decision Process framework. Recent analyses, however, raised fundamental questions on the effectiveness of the generative formulation. In this paper, we present a structured discriminative model for dialog state tracking as an alternative. Unlike generative models, the proposed method affords the incorporation of features without having to consider dependencies between observations. It also provides a flexible mechanism for imposing relational constraints. To verify the effectiveness of the proposed method, we applied it to the Let’s Go domain (Raux et al., 2005). The results show that the proposed model is superior to the baseline and generative model-based systems in accuracy, discrimination, and robustness to mismatches between training and test datasets.

Recipe For Building Robust Spoken Dialog State Trackers: Dialog State Tracking Challenge System Description (nominated for best paper)

SIGDIAL
2013

For robust spoken conversational interaction, many dialog state tracking algorithms have been developed. Few studies, however, have reported the strengths and weaknesses of each method. The Dialog State Tracking Challenge (DSTC) is designed to address this issue by comparing various methods on the same domain. In this paper, we present a set of techniques that build a robust dialog state tracker with high performance: wide-coverage and well-calibrated data selection, feature-rich discriminative model design, generalization improvement techniques and unsupervised prior adaptation. The DSTC results show that the proposed method is superior to other systems on average on both the development and test datasets.

Incremental Sparse Bayesian Method for Online Dialog Strategy Learning

The IEEE Journal of Selected Topics in Signal Processing, Vol. 6, No. 8, December 2012
December 2012

This paper proposes an incremental sparse Bayesian learning method to allow continuous dialog strategy learning from interactions with real users. Since conventional reinforcement learning (RL) methods require a huge number of dialogs to reach convergence, it has been essential to use a simulated user in training dialog policies. The disadvantage of this approach is that the trained dialog policies always lag behind the optimal one for live users. To tackle this problem, a few studies applying online RL methods to dialog management have emerged and shown very promising results. However, these methods are limited to learning online the weight parameters of the basis functions in the model, and so need batch learning on a fixed data set or some heuristics to find appropriate values for other meta parameters such as sparsity-controlling thresholds, basis function parameters, and noise parameters. The proposed method attempts to overcome this limitation and achieve fully incremental and fast dialog strategy learning by adopting a sparse Bayesian learning method for value function approximation. To verify the proposed method, three different experimental conditions were used: artificial data, a simulated user, and real users. The experiment on the artificial data showed that the proposed method successfully learns all the parameters in an incremental manner. Also, the experiment on training and evaluating dialog policies with a simulated user clearly demonstrated that the proposed method is much faster than conventional RL methods. A live user study showed that the dialog strategy learned from real users performed as well as the best past systems, although it slightly underperformed the one trained on simulated dialogs due to the difficulty of user feedback elicitation.

POMDP-Based Let's Go System for Spoken Dialog Challenge

IEEE SLT 2012, Miami, USA
2012

This paper describes a POMDP-based Let’s Go system which incorporates belief tracking and dialog policy optimization into the dialog manager of the reference system for the Spoken Dialog Challenge (SDC). Since all components except for the dialog manager were kept the same, component-wise comparison can be performed to investigate the effect of belief tracking and dialog policy optimization on the overall system performance. In addition, since unsupervised methods have been adopted to learn all required models to reduce human labor and development time, the effectiveness of the unsupervised approaches compared to conventional supervised approaches can be investigated. The resulting system participated in the 2011 SDC and showed performance comparable to that of the base system, which has been enhanced from the reference system for the 2010 SDC. This shows the capability of the proposed method to rapidly produce an effective system with minimal human labor and expert knowledge.

An Unsupervised Approach to User Simulation: toward Self-Improving Dialog Systems

SIGDIAL, 2012, Seoul, Korea
2012

This paper proposes an unsupervised approach to user simulation in order to automatically furnish updates and assessments of a deployed spoken dialog system. The proposed method adopts a dynamic Bayesian network to infer the unobservable true user action from which the parameters of other components are naturally derived. To verify the quality of the simulation, the proposed method was applied to the Let’s Go domain (Raux et al., 2005) and a set of measures was used to analyze the simulated data at several levels. The results showed a very close correspondence between the real and simulated data, implying that it is possible to create a realistic user simulator that does not necessitate human intervention.

Exploiting Machine-Transcribed Dialog Corpus to Improve Multiple Dialog States Tracking Methods

SIGDIAL, 2012, Seoul, Korea
2012

This paper proposes the use of unsupervised approaches to improve components of partition-based belief tracking systems. The proposed method adopts a dynamic Bayesian network to learn the user action model directly from a machine-transcribed dialog corpus. It also addresses confidence score calibration to improve the observation model in an unsupervised manner using dialog-level grounding information. To verify the effectiveness of the proposed method, we applied it to the Let’s Go domain (Raux et al., 2005). Overall system performance was measured for several comparative models. The results show that the proposed method can learn an effective user action model without human intervention. In addition, the calibrated confidence score was verified by demonstrating its positive influence on the user action model learning process and on overall system performance.

Grammatical error detection for corrective feedback provision in oral conversations

25th AAAI conference on artificial intelligence (AAAI-11)
2011

The demand for computer-assisted language learning systems that can provide corrective feedback on language learners’ speaking has increased. However, it is not a trivial task to detect grammatical errors in oral conversations because of the unavoidable errors of automatic speech recognition systems. To provide corrective feedback, a novel method to detect grammatical errors in speaking performance is proposed. The proposed method consists of two sub-models: the grammaticality-checking model and the error-type classification model. We automatically generate grammatical errors that learners are likely to commit and construct error patterns based on the articulated errors.
When a particular speech pattern is recognized, the grammaticality-checking model performs a binary classification based on the similarity between the error patterns and the recognition result using the confidence score. The error-type classification model chooses the error type based on the most similar error pattern and the error frequency extracted from a learner corpus. The grammaticality-checking method largely outperformed the two comparative models by 56.36% and 42.61% in F-score while keeping the false positive rate very low. The error type classification model exhibited very high performance with a 99.6% accuracy rate. Because high precision and a low false positive rate are important criteria for the language-tutoring setting, the proposed method will be helpful for intelligent computer-assisted language learning systems.

Authors:
  • SUNGJIN LEE,
  • Hyungjong Noh,
  • Kyusong Lee,
  • Gary Geunbae Lee

On the effectiveness of robot-assisted language learning

ReCALL, 2011, Volume 23, Issue 01

This study introduces the educational assistant robots that we developed for foreign language learning and explores the effectiveness of robot-assisted language learning (RALL), which is in its early stages. To achieve this purpose, a course was designed in which students have meaningful interactions with intelligent robots in an immersive environment. A total of 24 elementary students, ranging in age from ten to twelve, were enrolled in English lessons. A pre-test/post-test design was used to investigate the cognitive effects of the RALL approach on the students’ oral skills. No significant difference in listening skills was found, but speaking skills improved with a large effect size at the significance level of 0.01. Descriptive statistics and the pre-test/post-test design were used to investigate the affective effects of the RALL approach. The results showed that RALL promoted and improved students’ satisfaction, interest, confidence, and motivation at the significance level of 0.01.

Authors:
  • SUNGJIN LEE,
  • Hyungjong Noh,
  • Jonghoon Lee,
  • Kyusong Lee,
  • Gary Geunbae Lee,
  • SeongDae Sagong,
  • Moonsang Kim

Grammar error simulation for computer-assisted language learning

Knowledge-Based Systems, 2011, Volume 24, Issue 6

This paper presents an automated method to generate realistic grammatical errors that can perform crucial functions for advanced technologies in computer-assisted language learning (CALL), including generating corrective feedback in dialog-based CALL (DB-CALL) systems, simulating a language learner to optimize tutoring strategies, and generating context-dependent grammar quizzes as educational materials. The goal of this study is to make grammatical errors generated by automatic simulators more realistic. To generate realistic errors, expert knowledge of language learners’ error characteristics was imported into a statistical modeling system that uses Markov logic, which provides a theoretically sound way to encode knowledge into probabilistic first-order logic. We learned the weights of first-order formulas from a learner corpus. The improved quality of the proposed method was demonstrated through comparative experiments using automatic evaluations (precision and recall rate and Kullback–Leibler divergence between error distributions) and human assessments. The proposed method increased precision by 6% and recall by 8.33% averaged across all proficiency levels. It also exhibited a relative improvement of 37.5% in the average Kullback–Leibler divergence. Judgment by human evaluators showed that the proposed method increased the average scores in two different evaluation tasks by 7 and by 0.411. Finally, we present a measure of labor savings to help predict the time and cost associated with this method for those who plan to exploit grammatical error simulation for their applications. The results indicate that using the proposed method could reduce the grammatical error generation time by 59% on average.

Authors:
  • SUNGJIN LEE,
  • Jonghoon Lee,
  • Hyungjong Noh,
  • Kyusong Lee,
  • Gary Geunbae Lee

Foreign language tutoring in oral conversations using spoken dialog systems

IEICE transactions on information and systems
Authors:
  • SUNGJIN LEE,
  • Hyungjong Noh,
  • Jonghoon Lee,
  • Kyusong Lee,
  • Gary Geunbae Lee

Stacking model-based Korean prosodic phrasing using speaker variability reduction and linguistic feature engineering

ACM Transactions on Asian Language Information Processing (TALIP)
Authors:
  • SUNGJIN LEE,
  • Jinsik Lee,
  • Jonghoon Lee,
  • Byeongchang Kim,
  • Gary Geunbae Lee

Iteratively constrained selection of word alignment links from knowledge and statistics

Knowledge-Based Systems
Authors:
  • SUNGJIN LEE,
  • Jonghoon Lee,
  • Hyungjong Noh,
  • Gary Geunbae Lee

Seamless error correction interface for voice word processor

IEEE international conference on acoustics, speech, and signal processing (ICASSP 2012)
Authors:
  • SUNGJIN LEE,
  • Junhwi Choi,
  • Kyungduk Kim,
  • Seokhwan Kim,
  • Donghyeon Lee,
  • Injae Lee,
  • Gary Geunbae Lee

Ranking dialog acts using discourse coherence indicator for English tutoring dialog systems

international workshop on spoken dialog systems technology (IWSDS 2011)
Authors:
  • SUNGJIN LEE,
  • Hyungjong Noh,
  • Kyusong Lee,
  • Gary Geunbae Lee

Effects of language learning game on Korean elementary school students

SLaTE 2011
Authors:
  • SUNGJIN LEE,
  • Kyusong Lee,
  • Soo-Ok Kweon,
  • Hyungjong Noh,
  • Jonghoon Lee,
  • Jinsik Lee,
  • Hae-Ri Kim,
  • Gary Geunbae Lee

Affective effects of speech-enabled robots for language learning

IEEE workshop on spoken language technology (SLT 2010)
Authors:
  • SUNGJIN LEE,
  • Changgu Kim,
  • Jonghoon Lee,
  • Hyungjong Noh,
  • Kyusong Lee,
  • Gary Geunbae Lee

POSTECH approaches for dialog-based English conversation tutoring

APSIPA annual summit and conference 2010
Authors:
  • SUNGJIN LEE,
  • Hyungjong Noh,
  • Jonghoon Lee,
  • Kyusong Lee,
  • Gary Geunbae Lee

Cognitive effects of robot-assisted language learning on oral skills

Interspeech 2010 workshop on second language studies: acquisition, learning, education and technology
Authors:
  • SUNGJIN LEE,
  • Hyungjong Noh,
  • Jonghoon Lee,
  • Kyusong Lee,
  • Gary Geunbae Lee

Realistic grammar error simulation using Markov logic

ACL 2009

Correlation-based Query Relaxation for Example-based dialog modeling

IEEE Automatic Speech Recognition and Understanding Workshop (ASRU09)
Authors:
  • SUNGJIN LEE,
  • Cheongjae Lee,
  • Sangkeun Jung,
  • Kyungduk Kim,
  • Gary Geunbae Lee

Intention-based Corrective Feedback Generation using Context-aware Model

international conference on computer supported education (CSEDU 2010)
Authors:
  • SUNGJIN LEE,
  • Cheongjae Lee,
  • Jonghoon Lee,
  • Hyungjong Noh,
  • Gary Geunbae Lee

Importing human tutor’s conversation strategy into dialog systems for language learning using fuzzy logic

international workshop on spoken dialog systems (IWSDS2009)
Authors:
  • SUNGJIN LEE,
  • Cheongjae Lee,
  • Jonghoon Lee,
  • Hyungjong Noh,
  • Gary Geunbae Lee

Script-description pair extraction on English as second language podcast text documents

international conference on computer supported education (CSEDU 2010)
Authors:
  • SUNGJIN LEE,
  • Hyungjong Noh,
  • Minwoo Jeong,
  • Gary Geunbae Lee

Projects

English Speaking Assessment and Provision of Feedback for Korean (SK Telecom Corporation Project)

I participated in making the project proposal and drawing the system architecture. In particular, I am developing a component for grammaticality judgment and provision of feedback for oral output.

Natural Language-based Immersive English Tutoring System (Ministry of Education Science and Technology Project)

I was responsible for project management and participated in designing the system architecture consisting of various technologies such as Speech, Vision, and Haptic. I am developing dialog strategies in consideration of students’ proficiency level, emotion, and gameplay.

Mobile Platforms for Dialog-based Speech Interfaces (Ministry of Knowledge and Economy Project)

I was responsible for investigating dialog management for English tutoring. I am working on data collection, robust language understanding, dialog management, and corrective feedback generation. I am researching an ASR combination to generate feedback on both global and local errors.

Intelligent Robots for English Conversation Tutoring (Ministry of Knowledge and Economy Project)

I was responsible for speech and language processing of intelligent robots. I collaborated with English teachers to make educational material and developed communicative robots capable of providing recast feedback in response to students’ errors. I participated in a pilot project for elementary students.

Conversational Agent for English Conversation Tutoring (Microsoft Research Asia Funded Project)

I was responsible for project management. I designed and implemented a hybrid language understanding component for robust language understanding and corrective feedback generation.

Dialog-based Computer Aided Language Learning for Spoken Dialog System in English (KT Corporation Project)

I was responsible for project management. I designed and implemented a spoken dialog system for English conversation practice in the immigration domain. I developed a method to provide recast feedback and suggest expressions in the case of a timeout.

Languages

  1. English

  2. Korean

Skills

  • Machine Learning
  • Natural Language...
  • Computer Science
  • Artificial Intelligence
  • Spoken Dialog System
  • Python
  • LaTeX
  • C
  • Speech Recognition
  • Research
  • Simulation
  • Human-computer...
  • C++
  • Java
  • Matlab
  • Android

Education

Pohang University of Science and Technology

Ph.D., Computer Science
