[JS] SpeechSynthesisUtterance in the Web Speech API

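SpeechSynthesisUtterance

The SpeechSynthesisUtterance interface of the Web Speech API represents a speech request.

It is used to specify the text to speak, along with the language, voice, and other speech attributes.

A SpeechSynthesisUtterance object has the following properties.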

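SpeechSynthesisUtterance properties

text: The text to speak.
lang: The language of the text.
voice: The voice to use.
rate: The speed at which the text is read.
pitch: The pitch of the voice.
volume: The volume of the voice.
paused: Whether speech is paused. Note that this is a read-only property of the speechSynthesis object, not of the utterance itself.
onend: Event handler that runs when the utterance has finished being spoken.
onerror: Event handler that runs when an error occurs during speech synthesis.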
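
The onend and onerror handlers listed above can be wired up roughly as in the following minimal sketch. The helper name attachLifecycleHandlers and the log callback are hypothetical, not part of the Web Speech API; the function accepts any object with onend/onerror properties so it can also be tried outside a browser.

```javascript
// Hypothetical helper: attaches onend/onerror handlers to an utterance-like object.
// In a browser, pass a real SpeechSynthesisUtterance instance.
function attachLifecycleHandlers(utterance, log) {
  utterance.onend = () => {
    log("finished speaking");
  };
  utterance.onerror = (event) => {
    // The error event carries an `error` code string, for example "canceled".
    log("speech error: " + event.error);
  };
  return utterance;
}

// Browser usage (assumed, not from the original post):
//   const speech = attachLifecycleHandlers(new SpeechSynthesisUtterance("Hello"), console.log);
//   window.speechSynthesis.speak(speech);
```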

How to use the SpeechSynthesisUtterance interface

1. Create a SpeechSynthesisUtterance object.

2. Set the object's properties as needed.

3. Call the speak() method of the speechSynthesis object to start speaking the text.
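
The three steps above can be sketched in a single function. This is only an illustration: speakText is a hypothetical name, and the synth and Utterance parameters are an assumption added so the function falls back gracefully where the Web Speech API is missing.

```javascript
// Hypothetical helper covering the three steps: create, configure, speak.
// `synth` and `Utterance` default to the browser globals when they exist.
function speakText(
  text,
  options = {},
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  if (!synth || !Utterance) {
    return null; // Web Speech API is not available in this environment
  }
  const utterance = new Utterance(text); // step 1: create the object
  Object.assign(utterance, options);     // step 2: set properties (lang, rate, ...)
  synth.speak(utterance);                // step 3: start speaking
  return utterance;
}

// Browser usage (assumed): speakText("Hello", { lang: "en", rate: 1.2 });
```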

SpeechSynthesisUtterance example

The example below uses the Web Speech API to build a simple web page that converts text to speech.

Project directory

First, create a new directory project-directory/ containing index.html and textToSpeech.js.

HTML page

Create index.html as follows.

<!DOCTYPE html>
<html lang="en">
  <head>
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta1/dist/css/bootstrap.min.css" rel="stylesheet" />
    <link rel="stylesheet" href="index.css" />
    <title>Text to Speech</title>
  </head>
  <body class="container mt-5 bg-dark">
    <h1 class="text-light">Text to Speech</h1>
    <p class="lead text-light mt-4">Select Voice</p>
    
    <!-- Select Menu for Voice -->
    <select id="voices" class="form-select bg-secondary text-light"></select>

    <!-- Range Sliders for Volume, Rate & Pitch -->
    <div class="d-flex mt-4 text-light">
      <div>
        <p class="lead">Volume</p>
        <input type="range" min="0" max="1" value="1" step="0.1" id="volume" />
        <span id="volume-label" class="ms-2">1</span>
      </div>
      <div class="mx-5">
        <p class="lead">Rate</p>
        <input type="range" min="0.1" max="10" value="1" id="rate" step="0.1" />
        <span id="rate-label" class="ms-2">1</span>
      </div>
      <div>
        <p class="lead">Pitch</p>
        <input type="range" min="0" max="2" value="1" step="0.1" id="pitch" />
        <span id="pitch-label" class="ms-2">1</span>
      </div>
    </div>

    <!-- Text Area for the User to Type -->
    <textarea class="form-control bg-dark text-light mt-5" cols="30" rows="10" placeholder="Type here..."></textarea>

    <!-- Control Buttons -->
    <div class="mb-5">
      <button id="start" class="btn btn-success mt-5 me-3">Start</button>
      <button id="pause" class="btn btn-warning mt-5 me-3">Pause</button>
      <button id="resume" class="btn btn-info mt-5 me-3">Resume</button>
      <button id="cancel" class="btn btn-danger mt-5 me-3">Cancel</button>
    </div>
    <script src="./textToSpeech.js"></script>
  </body>
</html>

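<select> : An empty select menu that JavaScript will fill with the list of available voices.

<input type="range"> : Range sliders for adjusting volume, rate, and pitch.

<textarea> : The box where you type the text to convert to speech.

<button> : Buttons for controlling the speech.

The JavaScript file

Creating an instance

Create an instance of the SpeechSynthesisUtterance class.

The instance can then be configured through its properties.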

let speech = new SpeechSynthesisUtterance();

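Setting properties

Let's configure the SpeechSynthesisUtterance instance through its properties.

The following six properties can be adjusted.

• Language

The lang property gets and sets the language of the utterance.

If it is not set, the value of the lang attribute on the html tag, as in <html lang="en">, is used.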

speech.lang = "en";

• Text

The text property gets and sets the text that will be synthesized when the utterance is spoken.

In this example, a click event listener on the Start button reads the text from the textarea and assigns it to the property.

document.querySelector("#start").addEventListener("click", () => {
  speech.text = document.querySelector("textarea").value;
});

• Volume

The volume property gets and sets the volume of the utterance.

It is a float between 0 (lowest) and 1 (highest).

If the property is not set, the default is 1.

In this example, an input event listener updates the property whenever the slider value changes; the slider's minimum, maximum, and default values are already set in the HTML.

The current value is also shown in the <span> next to the volume range slider.

document.querySelector("#volume").addEventListener("input", () => {
  // Get the volume value from the slider
  const volume = document.querySelector("#volume").value;

  // Set the volume property of the SpeechSynthesisUtterance instance
  speech.volume = volume;

  // Update the volume label
  document.querySelector("#volume-label").innerHTML = volume;
});

• Rate

The rate property gets and sets the speed of the utterance.

It is a float between 0.1 (lowest) and 10 (highest).

If the property is not set, the default is 1.

In this example, an input event listener updates the property whenever the slider value changes; the slider's minimum, maximum, and default values are already set in the HTML.

The current value is also shown in the <span> next to the rate range slider.

document.querySelector("#rate").addEventListener("input", () => {
  // Get the rate value from the slider
  const rate = document.querySelector("#rate").value;

  // Set the rate property of the SpeechSynthesisUtterance instance
  speech.rate = rate;

  // Update the rate label
  document.querySelector("#rate-label").innerHTML = rate;
});

• Pitch

The pitch property gets and sets the pitch of the utterance.

It is a float between 0 (lowest) and 2 (highest).

If the property is not set, the default is 1.

In this example, an input event listener updates the property whenever the slider value changes; the slider's minimum, maximum, and default values are already set in the HTML.

The current value is also shown in the <span> next to the pitch range slider.

document.querySelector("#pitch").addEventListener("input", () => {
  // Get the pitch value from the slider
  const pitch = document.querySelector("#pitch").value;

  // Set the pitch property of the SpeechSynthesisUtterance instance
  speech.pitch = pitch;

  // Update the pitch label
  document.querySelector("#pitch-label").innerHTML = pitch;
});
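
A slider's value arrives as a string, so it can help to convert it to a number and keep it inside the ranges described above before assigning it. The following is a minimal sketch of that idea; RANGES and toSetting are hypothetical names, not part of the original example or of the Web Speech API.

```javascript
// Allowed ranges as stated in the text: volume 0 to 1, rate 0.1 to 10, pitch 0 to 2.
const RANGES = {
  volume: { min: 0, max: 1, fallback: 1 },
  rate: { min: 0.1, max: 10, fallback: 1 },
  pitch: { min: 0, max: 2, fallback: 1 },
};

// Hypothetical helper: converts a slider's string value to a number within range.
function toSetting(name, rawValue) {
  const range = RANGES[name];
  if (!range) {
    throw new Error("unknown setting: " + name);
  }
  const n = Number.parseFloat(rawValue);
  if (Number.isNaN(n)) {
    return range.fallback; // unparsable input falls back to the default of 1
  }
  return Math.min(range.max, Math.max(range.min, n));
}
```

In the page above it could be used as speech.rate = toSetting("rate", document.querySelector("#rate").value);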

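• Voice

The voice property gets and sets the voice used to speak the utterance.

It must be set to one of the SpeechSynthesisVoice objects (the voices the system supports).

If it is not set, the most suitable default voice available for the utterance's language is used.

To set the utterance's voice, first get the list of available voices from the window object.

The voices are not necessarily available as soon as the window object loads, because they are loaded asynchronously.

Once the voices have loaded, a voiceschanged event fires.

You can assign a function to run when that happens.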

window.speechSynthesis.onvoiceschanged = () => {
  // On Voices Loaded
};

You can get the voice list with window.speechSynthesis.getVoices().

It returns an array of the available SpeechSynthesisVoice objects.

In this example, the list is stored in a global array and the select menu on the page is filled with the available voices.

let voices = []; // global array

window.speechSynthesis.onvoiceschanged = () => {
  // Get the list of voices
  voices = window.speechSynthesis.getVoices();

  // Start with the first voice in the array
  speech.voice = voices[0];

  // Populate the select menu. (Each option's value is the voice's index, used later when the user changes the voice.)
  let voiceSelect = document.querySelector("#voices");
  voices.forEach((voice, i) => (voiceSelect.options[i] = new Option(voice.name, i)));
};
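
Depending on the browser, getVoices() may already return a populated list on the first call, in which case waiting only for voiceschanged could miss it. The sketch below handles both cases; onVoicesReady is a hypothetical helper name, and the synth parameter stands for window.speechSynthesis.

```javascript
// Hypothetical helper: calls `callback` with the voice list once it is non-empty.
function onVoicesReady(synth, callback) {
  const voices = synth.getVoices();
  if (voices.length > 0) {
    callback(voices); // the list was already available
    return;
  }
  // Otherwise wait for the voiceschanged event described in the text above.
  synth.onvoiceschanged = () => {
    callback(synth.getVoices());
  };
}

// Browser usage (assumed):
//   onVoicesReady(window.speechSynthesis, (list) => { voices = list; });
```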

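Now that the voice menu is populated,

add a change event listener to it to update the voice of the SpeechSynthesisUtterance instance.

When the user picks a voice,

the index number (set as each option's value) and the global voices array are used to update the voice.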

document.querySelector("#voices").addEventListener("change", () => {
  speech.voice = voices[document.querySelector("#voices").value];
});
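
Instead of always starting with the first voice in the array, a voice matching the utterance language could be chosen as the initial one. This is a minimal sketch under that assumption; pickVoice is a hypothetical helper that relies only on the lang and default properties of SpeechSynthesisVoice objects.

```javascript
// Hypothetical helper: picks a voice whose language matches `lang` (e.g. "ko" or "en-GB").
// Falls back to the first voice in the list, or null when the list is empty.
function pickVoice(voices, lang) {
  const prefix = lang.toLowerCase().split("-")[0];
  const matches = voices.filter((v) => v.lang.toLowerCase().startsWith(prefix));
  // Prefer the voice flagged as default among the matches, if there is one.
  return matches.find((v) => v.default) || matches[0] || voices[0] || null;
}

// Browser usage (assumed): speech.voice = pickVoice(voices, speech.lang);
```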

Controls

Let's add controls for the speechSynthesis object.

• Start

Pass the SpeechSynthesisUtterance instance to the window.speechSynthesis.speak() method.

This starts converting the text to speech. The text property must be set before speak() is called.

*Note: if you start another text-to-speech conversion while one is already speaking, it is queued behind the one currently in progress.

document.querySelector("#start").addEventListener("click", () => {
  speech.text = document.querySelector("textarea").value;
  window.speechSynthesis.speak(speech);
});
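
If queuing is not what you want, one option is to clear the queue before speaking so the new text starts right away. The sketch below shows that variation, not the original tutorial's behavior; speakNow is a hypothetical helper, and synth stands for window.speechSynthesis.

```javascript
// Hypothetical helper: speaks `utterance` immediately instead of queuing it.
function speakNow(synth, utterance) {
  if (synth.speaking || synth.pending) {
    synth.cancel(); // removes the current and all queued utterances
  }
  synth.speak(utterance);
}

// Browser usage (assumed): speakNow(window.speechSynthesis, speech);
```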

• Pause

window.speechSynthesis.pause() pauses the SpeechSynthesisUtterance that is currently being spoken.

Let's add a click event listener so that pressing the Pause button pauses it.

document.querySelector("#pause").addEventListener("click", () => {
  window.speechSynthesis.pause();
});

• Resume

window.speechSynthesis.resume() resumes a paused SpeechSynthesisUtterance.

Let's add a click event listener so that pressing the Resume button resumes it.

document.querySelector("#resume").addEventListener("click", () => {
  window.speechSynthesis.resume();
});
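
Pause and Resume could also share a single button by checking the paused and speaking properties of the speechSynthesis object. A minimal sketch, assuming a hypothetical togglePause helper that is not part of the original page:

```javascript
// Hypothetical helper: pauses when speaking, resumes when paused.
// Returns a label describing what was done.
function togglePause(synth) {
  if (synth.paused) {
    synth.resume();
    return "resumed";
  }
  if (synth.speaking) {
    synth.pause();
    return "paused";
  }
  return "idle"; // nothing is being spoken
}

// Browser usage (assumed): togglePause(window.speechSynthesis);
```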

• Cancel

window.speechSynthesis.cancel() cancels the utterance that is currently being spoken.

Let's add a click event listener so that pressing the Cancel button cancels it.

document.querySelector("#cancel").addEventListener("click", () => {
  window.speechSynthesis.cancel();
});

Complete textToSpeech.js

// Create a new SpeechSynthesisUtterance object
let speech = new SpeechSynthesisUtterance();

// Set the utterance language
speech.lang = "ko";

let voices = []; // global array holding the list of available voices

window.speechSynthesis.onvoiceschanged = () => {
  // Get the list of voices
  voices = window.speechSynthesis.getVoices();

  // Start with the first voice in the array
  speech.voice = voices[0];

  // Populate the select menu. (Each option's value is the voice's index, used when the user changes the voice.)
  let voiceSelect = document.querySelector("#voices");
  voices.forEach((voice, i) => (voiceSelect.options[i] = new Option(voice.name, i)));
};

document.querySelector("#rate").addEventListener("input", () => {
  // Get the rate value from the slider
  const rate = document.querySelector("#rate").value;

  // Set the rate property of the SpeechSynthesisUtterance instance
  speech.rate = rate;

  // Update the rate label
  document.querySelector("#rate-label").innerHTML = rate;
});

document.querySelector("#volume").addEventListener("input", () => {
  // Get the volume value from the slider
  const volume = document.querySelector("#volume").value;

  // Set the volume property of the SpeechSynthesisUtterance instance
  speech.volume = volume;

  // Update the volume label
  document.querySelector("#volume-label").innerHTML = volume;
});

document.querySelector("#pitch").addEventListener("input", () => {
  // Get the pitch value from the slider
  const pitch = document.querySelector("#pitch").value;

  // Set the pitch property of the SpeechSynthesisUtterance instance
  speech.pitch = pitch;

  // Update the pitch label
  document.querySelector("#pitch-label").innerHTML = pitch;
});

document.querySelector("#voices").addEventListener("change", () => {
  // On change, use the select menu's value (the voice's index in the global voices array)
  speech.voice = voices[document.querySelector("#voices").value];
});

document.querySelector("#start").addEventListener("click", () => {
  // Set the text property to the value of the textarea
  speech.text = document.querySelector("textarea").value;

  // Start speaking
  window.speechSynthesis.speak(speech);
});

document.querySelector("#pause").addEventListener("click", () => {
  // Pause speech synthesis
  window.speechSynthesis.pause();
});

document.querySelector("#resume").addEventListener("click", () => {
  // Resume paused speech synthesis
  window.speechSynthesis.resume();
});

document.querySelector("#cancel").addEventListener("click", () => {
  // Cancel speech synthesis
  window.speechSynthesis.cancel();
});

See the Pen SpeechSynesisUtterance by rebornbb (@bongcasso01) on CodePen.

Original source: https://www.section.io/engineering-education/text-to-speech-in-javascript/

Browser compatibility: https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API#Browser_compatibility
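
Besides checking the compatibility table, support can be detected at runtime before any of the code above runs. A minimal sketch, assuming a hypothetical isSpeechSynthesisSupported helper; the scope parameter exists only so the check can be tried against any object.

```javascript
// Hypothetical helper: reports whether the speech synthesis part of the
// Web Speech API is exposed on the given global scope (window in a browser).
function isSpeechSynthesisSupported(scope = globalThis) {
  return "speechSynthesis" in scope && "SpeechSynthesisUtterance" in scope;
}

// Browser usage (assumed):
//   if (!isSpeechSynthesisSupported()) {
//     document.querySelector("#start").disabled = true;
//   }
```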
