Japanese phonology

This article contains phonetic transcriptions in the International Phonetic Alphabet (IPA). For an introductory guide on IPA symbols, see Help:IPA. For the distinction between [ ], / / and ⟨ ⟩, see IPA § Brackets and transcription delimiters.

The phonology of Japanese features a phonemic inventory of five vowels (/a, e, i, o, u/) and 15 or more consonants (depending on how certain sounds are analyzed). The phonotactics are relatively simple, allowing for few consonant clusters. Japanese phonology has been affected by the presence of several layers of vocabulary in the language: in addition to native Japanese vocabulary, Japanese has a large amount of Chinese-based vocabulary and loanwords from other languages.

Standard Japanese is a pitch-accent language, wherein the position or absence of a pitch drop may determine the meaning of a word: /haꜜsiɡa/ (箸が, 'chopsticks'), /hasiꜜɡa/ (橋が, 'bridge'), /hasiɡa/ (端が, 'edge').

Unless otherwise noted, the following describes the standard variety of Japanese based on the Tokyo dialect.

Lexical strata

Discussions of Japanese phonology often refer to different 'strata' or layers of vocabulary, as many statements about phonemes and phonotactics are only valid as generalizations over a subset of vocabulary items. For example, the consonant /p/ is generally absent in word-initial position in Yamato and Sino-Japanese words, but occurs freely in this position in Mimetic and Foreign words.

Yamato

Called wago (和語) or yamato kotoba (大和言葉) in Japanese, this category comprises inherited native vocabulary. Morphemes in this category show a number of restrictions on structure that may be violated by vocabulary in other layers.

Mimetic

Japanese possesses a variety of mimetic words that make use of sound symbolism to serve an expressive function. Like Yamato vocabulary, these words are also of native origin, and can be considered to belong to the same overarching group. However, words of this type show some phonological peculiarities that cause some theorists to regard them as a separate layer of Japanese vocabulary.

Sino-Japanese

Called kango (漢語) in Japanese, words in this stratum originate from several waves of large-scale borrowing from Chinese that occurred from the 6th to the 14th centuries AD. They comprise 60% of dictionary entries and 20% of ordinary spoken Japanese, ranging from formal vocabulary to everyday words. Most Sino-Japanese vocabulary consists of combinations of more than one Sino-Japanese morpheme. Sino-Japanese morphemes display a limited phonological shape: each has a length of at most two moras, which Ito and Mester 2015 argue reflects a limitation in size to a single prosodic foot. These morphemes represent the Japanese phonetic adaptation of Middle Chinese monosyllabic morphemes, which in writing were each generally represented by a single Chinese character (hanzi). These characters were taken into Japanese as kanji; as Japanese writers also repurposed kanji to represent native vocabulary, there is a distinction between the Sino-Japanese readings of kanji, called on'yomi, and native readings, called kun'yomi.

The moraic nasal /N/ is relatively common in Sino-Japanese, and contact with Middle Chinese is often described as being responsible for the presence of /N/ in Japanese (starting from approximately 800 AD in Early Middle Japanese), although /N/ also came to exist in native Japanese words as a result of sound changes.

Foreign

Called gairaigo (外来語) in Japanese, this is the newest layer of vocabulary, consisting of recent loanwords, many from English. In words of this stratum, a number of consonant-vowel sequences that did not previously exist in Japanese are tolerated. This has led to the introduction of new spelling conventions and complicates the phonemic analysis of these consonant sounds in Japanese: some consonants that were once allophones may now be analyzed as having attained phonemic status.

Consonants

	Bilabial	Alveolar	Alveolo-palatal	Palatal	Velar	Uvular	Glottal
Nasal	m	n	(ɲ)		(ŋ)	(ɴ)
Plosive	p b	t d			k ɡ
Affricate		(t͡s) (d͡z)	(t͡ɕ) (d͡ʑ)
Fricative	(ɸ)	s z	(ɕ) (ʑ)	(ç)			h
Liquid		r
Semivowel				j	w
Special moras	/N/ /Q/

Consonants inside parentheses can be analyzed as allophones of other phonemes, at least in native words. In loanwords, /ɸ, ɕ, ʑ, t͡s, d͡z, t͡ɕ, d͡ʑ/ sometimes occur phonemically.

Phonetic notes

Voiceless stops /p, t, k/ are slightly aspirated—less so than English stops, but more than those in Spanish.
Non-coronal voiced stops /b, ɡ/ between vowels may be weakened to fricatives, especially in fast or casual speech:

/b/ > bilabial fricative [β]	/abareru/ > [aβaɾeɾɯ]	暴れる, abareru, 'to behave violently'
/ɡ/ > velar fricative [ɣ]	/haɡe/ > [haɣe]	はげ, hage, 'baldness'

However, /ɡ/ is further complicated by its variant realization as [ŋ].

/t, d, n/ are laminal denti-alveolar (that is, the blade of the tongue contacts the back of the upper teeth and the front part of the alveolar ridge) and /s, z/ are laminal alveolar.
/w/ is traditionally described as a velar [ɰ] or labialized velar approximant [w] or something between the two, or as the semivocalic equivalent of /u/ with little to no rounding, while a 2020 real-time MRI study found it is better described as a bilabial approximant [β̞].
/h/ is [ç] before /i/ and /j/, and [ɸ] before /u/, coarticulated with the labial compression of that vowel.
Realization of the liquid phoneme /r/ varies greatly depending on environment and dialect. The prototypical and most common pronunciation is an apical tap, either alveolar [ɾ] or postalveolar [ɾ̠]. Utterance-initially and after /N/, the tap is typically articulated in such a way that the tip of the tongue is at first momentarily in light contact with the alveolar ridge before being released rapidly by airflow. This sound is described variably as a tap, a "variant of [ɾ]", "a kind of weak plosive", and "an affricate with short friction". The apical alveolar or postalveolar lateral approximant [l] is a common variant in all conditions, particularly utterance-initially and before /i, j/. According to Akamatsu (1997), utterance-initially and intervocalically (that is, except after /N/), the lateral variant is better described as a tap [ɺ] rather than an approximant. The retroflex lateral approximant [ɭ] is also found before /i, j/. In Tokyo's Shitamachi dialect, the alveolar trill [r] is a variant marked with vulgarity. Other reported variants include the alveolar approximant [ɹ], the alveolar stop [d], the retroflex flap [ɽ], the lateral fricative [ɮ], and the retroflex stop [ɖ].
/N/ is a syllable-final moraic nasal with variable pronunciation: depending on what follows, it undergoes a variety of assimilatory processes. These assimilations occur beyond word boundaries. It is variously:
- bilabial [m] before /p, b, m/.
- velar [ŋ] before /k, ɡ/. This is palatalized when the following stop is palatalized.
- lamino-alveolar [n] before the coronals /t, d, n/; never found utterance-finally.
- lamino-alveolopalatal [ɲ̟] before alveolo-palatal consonants.
- apico-alveolar [n̺] before /r/.
- some sort of nasalized vowel before vowels, approximants /j, w/, and fricatives. Depending on context and speaker, the vowel's quality may closely match that of the preceding vowel or be more constricted in articulation. It is thus broadly transcribed with [ɰ̃], an ad hoc semivocalic notation undefined for the exact place of articulation. It is also found utterance-finally. When utterance-final, the moraic nasal is traditionally described as uvular [ɴ], sometimes with qualification that the occlusion may not always be complete or that it is, or approaches, velar [ŋ] after front vowels. However, instrumental studies in the 2010s showed that there is considerable variability in the realization of utterance-final /N/ and that it often involves a lip closure or constriction. A 2023 real-time MRI study found that the tongue position of utterance-final /N/ largely corresponds to that of the preceding vowel, though with overlapping locations, leading the researcher to conclude that /N/ has no specified place of articulation rather than a clear allophonic rule. In that study, 5% of the samples of utterance-final /N/ were realized as nasalized vowels with no closure, where appreciable tongue raising was observed only when following /a/.
/Q/ is a syllable-final moraic obstruent consonant; it is unreleased and completely assimilated to the following consonant, producing a phonetically lengthened obstruent consonant.
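
The assimilation of /N/ described in the notes above can be sketched as a simple lookup on the following sound. This is only an illustration under stated assumptions, not a phonetic model: the function name and the groupings of sounds are invented for the example, the coronal set /t, d, n/ and the cover symbol [ɰ̃] for the nasalized-vowel realization are assumed, palatalized contexts are not modelled, and the utterance-final value follows the traditional description despite the variability reported in instrumental studies.

```python
# Hypothetical sketch: choose a broad allophone of the moraic nasal /N/
# from the phoneme that follows it (None means utterance-final).
BILABIALS = {"p", "b", "m"}
VELARS = {"k", "ɡ"}
CORONALS = {"t", "d", "n"}      # assumed set for the [n] realization
LIQUIDS = {"r"}
VOWELS = {"a", "e", "i", "o", "u"}
GLIDES = {"j", "w"}
FRICATIVES = {"s", "z", "h"}

NASALIZED = "\u0270\u0303"      # [ɰ̃], ad hoc cover symbol (assumed)
APICAL_N = "n\u033a"            # [n̺]
UVULAR = "\u0274"               # [ɴ], traditional utterance-final value


def moraic_nasal(next_sound=None):
    """Return a broad transcription of /N/ before `next_sound`."""
    if next_sound is None:
        return UVULAR
    if next_sound in BILABIALS:
        return "m"
    if next_sound in VELARS:
        return "ŋ"
    if next_sound in CORONALS:
        return "n"
    if next_sound in LIQUIDS:
        return APICAL_N
    if next_sound in VOWELS | GLIDES | FRICATIVES:
        return NASALIZED
    raise ValueError(f"unknown following sound: {next_sound!r}")
```

For instance, `moraic_nasal("p")` gives the bilabial realization, while calling the function with no argument gives the traditional utterance-final value.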

Debated or marginal consonant phonemes

Voiced affricate vs. fricative

The distinction between the voiced sibilant fricatives and the voiced sibilant affricates is neutralized in most dialects, including Standard Japanese. A 2010 corpus study found that in neutralizing varieties, both variants were found in all positions, and that the time it takes to produce the consonant or consonant cluster (to which /N/, /Q/, and pauses contribute) was the most reliable predictor for affricate realization.

In non-merging dialects, the affricates can be analyzed as conditioned allophones of /d/ (pronounced [d͡z] before /u/, [d͡ʑ] before /i/ or /j/, and [d] elsewhere) whereas the fricatives [z], [ʑ] can be analyzed as allophones of /z/ (pronounced [ʑ] before /i/ or /j/, and [z] elsewhere). In neutralizing dialects, the phoneme resulting from the merger is often assumed to be /z/, though some analyze it as /d͡z/, the voiced counterpart to [t͡s].
Some dialects (e.g. Tosa) retain the distinctions between /zi/ and /di/ and between /zu/ and /du/, while others retain only /zu/ and /du/ but not /zi/ and /di/, or merge all four (e.g. north Tōhoku).
As a result of the neutralization, the historical spelling distinction between these sounds has been eliminated from the modern written standard except in cases where a mora is repeated once voiceless and once voiced, or where rendaku occurs in a compound word: つづく /tuduku/, いちづける /itidukeru/ from |iti+tukeru|.

Voiceless coronal affricate

In core vocabulary, [t͡s] can be analyzed as an allophone of /t/ before /u/:

/t/ > [t͡s]	/tuɡi/ > [t͡sɯɡi]	次, tsugi, 'next'

In loanwords, however, [t͡s] can occur before other vowels: examples include ツァイトガイスト, tsaitogaisuto, 'zeitgeist'; エリツィン, Eritsin, 'Yeltsin'. There are also a small number of native forms with [t͡s] before a vowel other than /u/, such as otottsan, 'dad', although these are marginal (the standard form of this word is otōsan). Shibatani 1990 prefers not to abandon the analysis of [t͡s] as an allophone of /t/, but notes that Hattori 1955 concluded that the voiceless affricates (transcribed phonemically with the symbol /c/) constituted a separate phoneme from /t/.

Palatalized consonants

Most consonants possess phonetically palatalized counterparts. Pairs of palatalized and non-palatalized consonants contrast before the back vowels /a o u/, but are in complementary distribution before the front vowels: only the palatalized version occurs before /i/, and only the non-palatalized version occurs before /e/. Palatalized consonants are normally analyzed as allophones conditioned by the presence of a following /i/ or /j/. When this analysis is adopted, the surface contrast between non-palatalized and palatalized consonants before back vowels is interpreted as a contrast between plain consonants and biphonemic /Cj/ sequences.

/mi/ > [mʲi]	/umi/ > [ɯmʲi]	海, umi, 'sea'
/mj/ > [mʲ]	/mjaku/ > [mʲakɯ]	脈, myaku, 'pulse'

Some phonologists have suggested that palatalized consonants could instead be analyzed as distinct consonants of their own (/Cʲ/). However, Nogita 2006 argues for the cluster analysis /Cj/, noting that in Japanese, syllables with a palatalized consonant show a longer average duration than their non-palatalized counterparts (whereas comparable duration differences were not generally found between pairs of palatalized and unpalatalized consonants in Russian).

The phonemic analysis described above can be applied straightforwardly to the palatalized counterparts of /p b k ɡ m n r/, as in the following examples:

/ɡ/ > [ɡʲ]	/ɡjoːza/ > [ɡʲoːza]	ぎょうざ, gyōza, 'fried dumpling'
/r/ > [ɾʲ]	/kiri/ > [kʲiɾʲi]	霧, kiri, 'fog'

The palatalized counterpart of /n/ is often described as prepalatal [ɲ̟], although Martin 1959 and Labrune 2012 question the extent to which its realization in this context is truly palatal, and Recasens 2013 reports its place of articulation as dentoalveolar or alveolar:

/n/ > [ɲ̟]	/nihoN/ > [ɲ̟ihoɴ]	日本, Nihon, 'Japan'

The glides /j w/ cannot precede /j/. The alveolo-palatal sibilants can be analyzed as the palatalized allophones of /t s z/, but it is debated whether this phonemic interpretation remains accurate in light of contrasts found in loanword phonology. For example, [t͡ɕi] and [t͡ɕɯ] would, according to the traditional view, be transcribed as /ti/ and /tju/, but some analysts would instead transcribe them as /t͡si/ and /t͡sju/ or as /t͡ɕi/ and /t͡ɕu/.

The palatalized counterpart of /h/ is normally described as [ç] (although some speakers do not distinguish [ç] from [ɕ]):

/h/ > [ç]	/hito/ > [çito]	人, hito, 'person'
/hj/ > [ç]	/hjaku/ > [çakɯ]	百, hyaku, 'hundred'

A few palatalized consonants turn up only in loanword vocabulary.

Alveolo-palatal sibilants

For coronal obstruents, the palatalization goes further, resulting in alveolo-palatal sibilants (e.g. 田, ta, 'field' versus 茶, cha, 'tea'):

/s/ > [ɕ]	/sio/ > [ɕio]	塩, shio, 'salt'
/z/ > [d͡ʑ] or [ʑ]	/zisiN/ > [d͡ʑiɕiɴ]	地震, jishin, 'earthquake'
/t/ > [t͡ɕ]	/tiziN/ > [t͡ɕid͡ʑiɴ] ~ [t͡ɕiʑiɴ]	知人, chijin, 'acquaintance'

The coronal obstruents /t d s z/ underwent coalescent palatalization when historically followed by /j/:

/sj/ > [ɕ]	/sjaboN/ > [ɕaboɴ]	シャボン, shabon, 'soap'
/zj/ > [d͡ʑ] or [ʑ]	/ɡozjuː/ > [ɡod͡ʑɯː] ~ [ɡoʑɯː]	五十, gojū, 'fifty'
	/zjaɡaimo/ > [d͡ʑaɡaimo]	じゃがいも, jagaimo, 'potato'
/tj/ > [t͡ɕ]	/tja/ > [t͡ɕa]	茶, cha, 'tea'

Therefore, alveolo-palatal [t͡ɕ d͡ʑ ɕ ʑ] can be analyzed as positional allophones of /t d s z/ before /i/, or as the surface realization of underlying /tj dj sj zj/ clusters before other vowels. For example, [ɕi] can be analyzed as /si/ and [ɕa] as /sja/. Likewise, [t͡ɕi] can be analyzed as /ti/ and [t͡ɕa] as /tja/. (These analyses correspond to the representation of these sounds in the Japanese spelling system.) Most dialects show a merger in the pronunciation of underlying /d/ and /z/ before /j/ or /i/, with the resulting merged phone varying between [ʑ] and [d͡ʑ]. The contrast between /d/ and /z/ is also neutralized before /u/ in most dialects (see above).
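
The traditional allophonic analysis of the voiceless onsets /t, s, h/ can be sketched as a small rule table. This is a hedged illustration only: the function name and the restriction to three consonants are assumptions made for the example, the voiced obstruents are left out because their realization varies between fricative and affricate, and loanword contrasts such as unaffricated [ti] are deliberately ignored.

```python
# Hypothetical sketch of the traditional analysis for /t, s, h/:
# palatalized realization before /i/ or /j/, special realization before /u/.
TS = "t\u0361s"        # [t͡s]
TC = "t\u0361\u0255"   # [t͡ɕ]

PALATALIZED = {"t": TC, "s": "\u0255", "h": "\u00e7"}   # [t͡ɕ], [ɕ], [ç]
BEFORE_U = {"t": TS, "h": "\u0278"}                      # [t͡s], [ɸ]


def realize_onset(onset, vowel):
    """Map a phonemic onset (/t/, /s/, /h/, optionally + /j/) to a phone."""
    base = onset[:1]
    if base not in PALATALIZED or onset not in (base, base + "j"):
        raise ValueError(f"onset not covered by this sketch: {onset!r}")
    if onset.endswith("j") or vowel == "i":
        return PALATALIZED[base]
    if vowel == "u":
        # /s/ has no special realization before /u/ in this sketch.
        return BEFORE_U.get(base, base)
    return base
```

Under these assumptions, `realize_onset("s", "i")` and `realize_onset("sj", "a")` both return the alveolo-palatal fricative, mirroring the analysis of the two surface forms as /si/ and /sja/.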

Some linguists adopt an analysis where the Japanese alveolo-palatal sibilants (but not other palatalized consonants) are their own phonemes. Arguments for this include the following:

Standard Japanese is widely recognized to now show a surface contrast between [t͡ɕi d͡ʑi] and unaffricated [ti di]. The latter of these pairs occurs in vocabulary with a foreign origin. A more marginal contrast may also exist between other such pairs. (*[si] and *[zi] usually do not occur even in loanwords, so that English cinema > シネマ, shinema; although they may be written スィ and ズィ respectively, they are rarely found even among the most innovative speakers and do not occur phonemically.)
The sequences of an alveolo-palatal sibilant followed by /e/ are used and faithfully realized in loanwords, whereas /je/ is variably replaced with /ie/ and other consonant + /je/ sequences are generally absent.
Phonetically, the alveolo-palatal sibilants display not only palatalization, but also a shift from alveolar to postalveolar articulation.
The aforementioned duration contrast observed between /Ca/ and /Cja/ syllables was not found between the corresponding plain and alveolo-palatal sibilant syllables.

Alternatively, affrication but not palatalization may be analyzed as phonemic for both voiceless and voiced coronal obstruents. When this analysis is adopted, [t͡ɕ] is analyzed as a palatalized allophone of an underlying affricate phoneme /t͡s/, just as [(d)ʑ] is analyzed as a palatalized allophone of /(d)z/.

Voiceless bilabial fricative

In core vocabulary, [ɸ] can be analyzed as an allophone of /h/ before /u/:

/h/ > [ɸ]	/huta/ > [ɸɯta]	ふた, futa, 'lid'

In loanwords, [ɸ] can occur before other vowels or before /j/. Examples include ファイト, faito, 'fight'; フュージョン, fyūjon, 'fusion'. [ɸ] and [h] are distinguished before vowels except [ɯ] (e.g. English fork > フォーク, fōku versus hawk > ホーク, hōku). Even in loanwords, *[hɯ] is not distinguished from [ɸɯ] (e.g. English hood and food > フード, fūdo).

Some old borrowings show adaptation of foreign [f] to Japanese [h] before a vowel other than /u/, but in borrowings more recent than around 1890, [ɸ] has fairly consistently been used in this context. Another adaptation pattern once used by Japanese speakers was replacement of foreign [f] with [ɸɯ] before any vowel other than /u/ (e.g. film > フイルム, fuirumu), but this also is largely obsolete.

Moraic consonants N Q

The phonemic analysis of moraic consonants is disputed. (Phonetically, gemination can be transcribed with a length mark (e.g. [pː]), but this notation obscures mora boundaries.)

One analysis, particularly popular among Japanese scholars, posits that geminate (that is, double) obstruent consonants begin with a special "mora phoneme" (モーラ音素, Mōra onso) /Q/, which corresponds to a unit of Japanese orthography, the sokuon (Hiragana: ⟨っ⟩; Katakana: ⟨ッ⟩). Likewise, the moraic nasal may be analyzed as a placeless nasal /N/, which likewise corresponds to a unit of Japanese orthography, the hatsuon (Hiragana: ⟨ん⟩; Katakana: ⟨ン⟩). These can be seen as 'placeless' consonant phonemes that have no underlying place of articulation (and also no manner of articulation, in the case of /Q/), instead manifesting as several phonetic realizations depending on context. According to this kind of analysis, geminate nasal consonants are phonemically /Nn/ and /Nm/, and other geminate consonants are phonemically /Q/ followed by an obstruent:

[p̚] before [p]	/niQ.poN/ > [ɲip̚.poɴ]	日本, nippon, 'Japan'
[s] before [s]	/kaQ.seN/ > [kas.seɴ]	合戦, kassen, 'battle'
[t̚] before [t͡ɕ]	/saQ.ti/ > [sat̚.t͡ɕi]	察知, satchi, 'inference'

Less abstractly, the moraic nasal /N/ may be interpreted as a phoneme with an underlyingly uvular place of articulation, i.e. /ɴ/, based on the traditional description of its word-final realization. Similarly, it has been suggested that the underlying phonemic representation of /Q/ might be a glottal stop /ʔ/ (despite the fact that phonetically, it is not always a stop, and is usually not glottal). This proposal is based on the occurrence of [ʔ] in certain marginal forms that can be interpreted as containing /Q/ not followed by another obstruent. For example, [ʔ] can be found at the end of an exclamation, or before a sonorant in forms with emphatic gemination, and the use of the sokuon as a written representation of [ʔ] in these contexts suggests that Japanese speakers identify [ʔ] as the default form of /Q/, or the form it takes when it is not possible for it to share its place and manner of articulation with a following obstruent.

A competing analysis dispenses entirely with /Q/ and /N/. The moraic obstruent can be interpreted as having the same phonemic value as the following consonant, as shown below:

[p̚] before [p]	/nip.poN/ > [ɲip̚.poɴ]	日本, nippon, 'Japan'
[s] before [s]	/kas.seN/ > [kas.seɴ]	合戦, kassen, 'battle'
[t̚] before [t͡ɕ]	/sat.ti/ > [sat̚.t͡ɕi]	察知, satchi, 'inference'
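
The relationship between the two notations can be sketched as a mechanical rewrite: each /Q/ is replaced by a copy of the consonant that follows it. This is a minimal illustration with an assumed function name; the fallback to a glottal stop when nothing follows /Q/ is an assumption borrowed from the glottal-stop proposal mentioned earlier, not part of the competing analysis itself.

```python
# Hypothetical sketch: convert a transcription that uses the mora phoneme /Q/
# into one where the moraic obstruent copies the following consonant.
GLOTTAL_STOP = "\u0294"  # [ʔ], assumed fallback when nothing follows /Q/


def spell_out_q(phonemic):
    """Replace each 'Q' with the next segment, skipping '.' mora breaks."""
    out = []
    for i, seg in enumerate(phonemic):
        if seg != "Q":
            out.append(seg)
            continue
        j = i + 1
        while j < len(phonemic) and phonemic[j] == ".":
            j += 1
        out.append(phonemic[j] if j < len(phonemic) else GLOTTAL_STOP)
    return "".join(out)
```

Applied to the three forms in the tables, the function turns `niQ.poN`, `kaQ.seN` and `saQ.ti` into `nip.poN`, `kas.seN` and `sat.ti`.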

Likewise, rather than being considered a distinct phoneme /N/, the moraic nasal may be considered an allophone of the coronal nasal /n/ when it occurs in syllable-final (coda) position (this requires treating syllable or mora boundaries as potentially distinctive, in order to explain the contrast between the moraic nasal and non-moraic /n/ before a vowel).

Alternatively, as there is no contrast in coda position between /m/ and /n/, the coda nasal can be interpreted as an 'archiphoneme' (a neutralization between otherwise contrastive phonemes). Likewise, the moraic obstruent can be interpreted as an archiphoneme representing the syllable-final neutralization of Japanese obstruent consonant phonemes.

Velar nasal onset

/ɡ/ may be realized as a velar nasal [ŋ] when it occurs within words—this includes not only between vowels but also between a consonant and a vowel. There is a fair amount of variation between speakers, however. Vance (1987) suggests that the variation follows social class, while Akamatsu (1997) suggests that the variation follows age and geographic location. The generalized situation is as follows.

Standard Japanese speakers can be categorized into 3 groups, referred to as A-, B-, and C-speakers, as defined below. If a B-speaker consistently realizes a given word with the allophone [ŋ], they will never employ [ɣ] as an allophone in that same word. A-speakers vary between [ŋ] and [ɡ], and C-speakers are generally consistent in using [ɡ]: for these groups, the velar fricative [ɣ] is another possible allophone in fast speech.

At the beginning of words

All present-day standard Japanese speakers generally use the stop [ɡ] at the beginning of words.

外遊, gaiyū, 'overseas trip'

/ɡaijuu/ > [ɡaijɯː], but not *[ŋaijɯː]

In the middle of simple words (i.e. non-compounds)

	家具, kagu, 'furniture'
A-speakers, a majority, use either [ŋ] or [ɡ] in free variation.	/kaɡu/ > [kaŋɯ] or [kaɡɯ]
B-speakers, a minority, consistently use [ŋ].	/kaɡu/ > [kaŋɯ] but not *[kaɡɯ]
C-speakers, the majority in western Japan with a smaller minority in Kantō, consistently use [ɡ].	/kaɡu/ > [kaɡɯ] but not *[kaŋɯ]

In the middle of compound words (morpheme-initially)

B-speakers consistently use [ɡ] when /ɡ/ occurs morpheme-initially. Thus, for them the words 千五, sengo, 'one thousand and five' and 戦後, sengo, 'postwar' are a minimal pair, while for others they are homophonous.

To summarize:

	はげ, hage, 'baldness'
A-speakers	/haɡe/ > [haŋe] or [haɡe] or [haɣe]
B-speakers	/haɡe/ > [haŋe]
C-speakers	/haɡe/ > [haɡe] or [haɣe]
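
The speaker-group summary can be written as a small lookup, shown here as a rough sketch. The function name and the `fast_speech` flag are assumptions made for the example, and only word-initial /ɡ/ and /ɡ/ in the middle of simple words are covered; morpheme-initial /ɡ/ inside compounds is not modelled.

```python
# Hypothetical sketch of the A/B/C speaker groups for /ɡ/ in simple words.
STOP, NASAL, FRICATIVE = "\u0261", "\u014b", "\u0263"   # [ɡ], [ŋ], [ɣ]

MEDIAL = {
    "A": {NASAL, STOP},   # free variation
    "B": {NASAL},         # consistently nasal
    "C": {STOP},          # consistently stop
}


def g_allophones(group, word_initial, fast_speech=False):
    """Return the set of possible realizations of /ɡ/ for a speaker group."""
    if group not in MEDIAL:
        raise ValueError(f"unknown speaker group: {group!r}")
    if word_initial:
        return {STOP}                 # all groups use the stop word-initially
    result = set(MEDIAL[group])
    if fast_speech and group in ("A", "C"):
        result.add(FRICATIVE)         # fricative variant only for A and C
    return result
```

For example, `g_allophones("B", word_initial=False, fast_speech=True)` still returns only the nasal, matching the statement that B-speakers never use the fricative in a word they realize with [ŋ].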

Some phonologists posit a distinct phoneme /ŋ/, citing pairs such as [oːɡaɾasɯ] (大硝子, 'big sheet of glass') versus [oːŋaɾasɯ] (大烏, 'big raven').

Vowels

Vowel phonemes of Japanese
	Front	Central	Back
Close	i		u
Mid	e		o
Open		a

/u/ is a close near-back vowel with the lips unrounded ([ɯ̟]) or compressed ([ɯ̟ᵝ]). When compressed, it is pronounced with the side portions of the lips in contact but with no salient protrusion. In conversational speech, compression may be weakened or completely dropped. It is centralized [ɨ] after /s, z, t/ and palatalized consonants (/Cj/), and possibly also after /n/.
/e, o/ are mid [e̞, o̞].
/a/ is central [ä].

Except for /u/, the short vowels are similar to their Spanish counterparts.

Long vowels and vowel sequences

All vowels display a length contrast: short vowels are phonemically distinct from long vowels:

	小母さん, obasan, 'aunt'		お婆さん, obaasan, 'grandmother'
	怪訝, kegen, 'dubious'		軽減, keigen, 'reduction'
	蛭, hiru, 'leech'		ヒール, hiiru, 'heel'
	都会, tokai, 'city'		倒壊, tōkai, 'destruction'
	区, ku, 'district'		空, kū, 'void'

Long vowels are pronounced with around 2.5 or 3 times the phonetic duration of short vowels, but are considered to be two moras long at the phonological level. In normal speech, a "double vowel", that is, a sequence of two identical short vowels (for example, across morpheme boundaries), is pronounced the same way as a long vowel. However, a distinction may be produced in slow or formal speech, where an audible hiatus (sometimes enunciated as a glottal stop) may occur between a sequence of two identical short vowels, but not in the middle of an intrinsically long vowel:

[satoːja]	砂糖屋, satō-ya, 'sugar shop'
[sato.oja] or [satoʔoja]	里親, sato-oya, 'foster parent'

In addition, a double vowel may bear pitch accent on either the first or second element, whereas an intrinsically long vowel can be accented only on its first mora. The distinction between double vowels and long vowels may be phonologically analyzed in various ways. One analysis interprets long vowels as ending in a special segment /R/ that adds a mora to the preceding vowel sound (a chroneme). Another analysis interprets long vowels as sequences of the same vowel phoneme twice, with double vowels distinguished by the presence of a "zero consonant" or empty onset between the vowels.

Within words and phrases, Japanese allows long sequences of phonetic vowels without intervening consonants. Sequences of two vowels within a single word are extremely common, occurring at the end of many i-type adjectives, for example, and having three or more vowels in sequence within a word also occurs, as in あおい, aoi, 'blue/green'. In phrases, sequences with multiple o sounds are most common, due to the direct object particle を, wo (which comes after a word) being realized as o and the honorific prefix お〜, o, which can occur in sequence, and may follow a word itself terminating in an o sound; these may be dropped in rapid speech. A fairly common construction exhibiting these is 「〜をお送りします」, wo o-okuri-shimasu, '...humbly send...'. More extreme examples follow:

/hoː.oː.o.o.oː/	hōō o oō (鳳凰（ほうおう）を追（お）おう)	'let's chase the fenghuang'
/toː.oː.o.oː.oː/	tōō o ōō (東欧（とうおう）を覆（おお）おう)	'let's cover Eastern Europe'

Devoicing

In many dialects, the close vowels /i/ and /u/ become voiceless when placed between two voiceless consonants or, unless accented, between a voiceless consonant and a pausa.

/kutu/ > [kɯ̥t͡sɯ]	靴, kutsu, 'shoe'
/atu/ > [at͡sɯ̥]	圧, atsu, 'pressure'
/hikaN/ > [çi̥kaɴ]	悲観, hikan, 'pessimism'

Generally, devoicing does not occur in a consecutive manner:

/kisitu/ > [kʲi̥ɕit͡sɯ]	気質, kishitsu, 'temperament'
/kusikumo/ > [kɯ̥ɕikɯmo]	奇しくも, kushikumo, 'strangely'

This devoicing is not restricted to only fast speech, though consecutive devoicing may occur in fast speech.
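
The devoicing rule for close vowels, including the tendency to avoid consecutive devoicing, can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the set of voiceless consonants, and the `prepausal` flag are invented for the example; input is a plain phonemic string with one character per segment; long vowels, /Q/ and the optional consecutive devoicing of fast speech are not handled; and the caller is expected to leave `prepausal` off when the final vowel is accented.

```python
# Hypothetical sketch: mark devoiced /i, u/ with a combining ring below.
VOICELESS = set("ptksh")
CLOSE_VOWELS = set("iu")
VOWELS = set("aeiou")
RING = "\u0325"  # combining ring below, the IPA voiceless diacritic


def devoice(word, prepausal=False):
    """Return `word` with devoiced close vowels marked by RING."""
    out = []
    prev_vowel_devoiced = False
    for i, seg in enumerate(word):
        if seg not in VOWELS:
            out.append(seg)
            continue
        before = word[i - 1] if i > 0 else None
        after = word[i + 1] if i + 1 < len(word) else None
        between_voiceless = before in VOICELESS and after in VOICELESS
        at_pause = prepausal and before in VOICELESS and after is None
        devoiced = (
            seg in CLOSE_VOWELS
            and (between_voiceless or at_pause)
            and not prev_vowel_devoiced   # avoid consecutive devoicing
        )
        out.append(seg + RING if devoiced else seg)
        prev_vowel_devoiced = devoiced
    return "".join(out)
```

With these assumptions, `devoice("kisitu")` marks only the first /i/, because the second /i/ follows an already devoiced vowel.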

To a lesser extent, /o, a/ may be devoiced with the further requirement that there be two or more adjacent moras containing the same phoneme:

/kokoro/ > [ko̥koɾo]	心, kokoro, 'heart'
/haka/ > [hḁka]	墓, haka, 'grave'

The common sentence-ending copula です, desu and polite suffix ます, masu are typically pronounced [desɯ̥] and [masɯ̥].

Japanese speakers are usually not even aware of the difference between the voiced and devoiced variants. On the other hand, gender roles play a part in prolonging the terminal vowel: it is regarded as effeminate to prolong it, particularly the terminal /u/ as in あります, arimasu, 'there is'. Some nonstandard varieties of Japanese can be recognized by their hyper-devoicing, while in some Western dialects and some registers of formal speech, every vowel is voiced. Recent research has argued that "vowel deletion" more accurately describes the phenomenon.

However, Japanese contrasts a devoiced vowel between two identical voiceless fricatives with a geminate voiceless fricative. A vowel between two identical voiceless fricatives may have either a weak voiceless approximant release or a revoiced vowel, depending on the rate of speech and individual speech habits.

[ɲiɕɕimbaɕi]	日進橋, Nisshinbashi	vs.	[ɲiɕi̥ɕimbaɕi] or [ɲiɕiɕimbaɕi]	西新橋, Nishi-shinbashi
[kessai]	決済, 'check out'	vs.	[kesɯ̥sai] or [kesɯsai]	消す際, 'while erasing'

Nasalization

Japanese vowels are slightly nasalized when adjacent to nasals /m, n/. Before the moraic nasal /N/, vowels are heavily nasalized:

/kaNtoo/ > [kãntoː]	関東, Kantō, 'Kanto region'
/seesaN/ > [seːsãɴ]	生産, seisan, 'production'

Glottal stop insertion

At the beginning and end of utterances, Japanese vowels may be preceded and followed by a glottal stop [ʔ], respectively. This is demonstrated below with the following words (as pronounced in isolation):

/eN/ > [eɴ] ~ [ʔeɴ]	円, en, 'yen'
/kisi/ > [kʲiɕiʔ]	岸, kishi, 'shore'
/u/ > [ʔɯʔ]	鵜, u, 'cormorant'

When an utterance-final word is uttered with emphasis, this glottal stop is plainly audible, and is often indicated in the writing system with a small letter っ, tsu, called a sokuon. This is also found in interjections like あっ, a and えっ, e.

Prosody

Moras

Japanese words have traditionally been analysed as composed of moras, a distinct concept from that of syllables. Each mora occupies one rhythmic unit, i.e. it is perceived to have the same time value. A mora may be "regular" consisting of just a vowel (V) or a consonant and a vowel (CV), or may be one of two "special" moras, /N/ and /Q/. A glide /j/ may precede the vowel in "regular" moras (CjV). Some analyses posit a third "special" mora, /R/, the second part of a long vowel (a chroneme). In the following table, the period represents a mora break, rather than the conventional syllable break.

Mora type	Example	Japanese	Moras per word
V	/o/	尾, o, 'tail'	1-mora word
jV	/jo/	世, yo, 'world'	1-mora word
CV	/ko/	子, ko, 'child'	1-mora word
CjV	/kjo/¹	巨, kyo, 'hugeness'	1-mora word
R	/R/ in /kjo.R/ or /kjo.o/	今日, kyō, 'today'	2-mora word
N	/N/ in /ko.N/	紺, kon, 'deep blue'	2-mora word
Q	/Q/ in /ko.Q.ko/ or /ko.k.ko/	国庫, kokko, 'national treasury'	3-mora word

^1 Traditionally, moras were divided into plain and palatal sets, the latter of which entail palatalization of the consonant element.

Thus, the disyllabic word Nippon (日本, 'Japan') may be analyzed as /niQpoN/, dissected into four moras: /ni/, /Q/, /po/, and /N/.
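
The division into moras can be sketched as a short routine over a phonemic string. This is an illustrative sketch only: the function name is an assumption, the input is expected to use one character per phoneme with /N/, /Q/ and /R/ written as capital letters, and anything outside the mora types in the table raises an error.

```python
# Hypothetical sketch: split a phonemic string into moras.
VOWELS = set("aeiou")
SPECIAL = set("NQR")   # moraic nasal, moraic obstruent, chroneme


def split_moras(phonemic):
    """Split e.g. 'niQpoN' into ['ni', 'Q', 'po', 'N']."""
    moras = []
    onset = ""
    for seg in phonemic:
        if seg in SPECIAL:
            if onset:
                raise ValueError(f"onset {onset!r} not followed by a vowel")
            moras.append(seg)
        elif seg in VOWELS:
            moras.append(onset + seg)   # (C)(j)V mora
            onset = ""
        else:
            onset += seg                # collect consonant and optional /j/
    if onset:
        raise ValueError(f"dangling onset {onset!r}")
    return moras
```

For instance, `split_moras("niQpoN")` yields the four moras named above, and `len(split_moras("koQko"))` is 3, in line with the table.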

In English, stressed syllables in a word are pronounced louder, longer, and with higher pitch, while unstressed syllables are relatively shorter in duration. Japanese is often considered a mora-timed language, as each mora tends to be of the same length, though not strictly: geminate consonants and moras with devoiced vowels may be shorter than other moras. Factors such as pitch have negligible influence on mora length.

Pitch accent

Standard Japanese has a distinctive pitch accent system: a word can have one of its moras bearing an accent or not. An accented mora is pronounced with a relatively high tone and is followed by a drop in pitch. The various Japanese dialects have different accent patterns, and some exhibit more complex tonic systems.

Feet

The bimoraic foot, a unit composed of two moras, plays an important role in linguistic analyses of Japanese prosody. The relevance of the bimoraic foot can be seen in the formation of hypocoristic names, clipped compounds, and shortened forms of longer words.

For example, the hypocoristic suffix -chan is attached to the end of a name to form an affectionate term of address. When this suffix is used, the name may be unchanged in form, or it may optionally be modified: modified forms always have an even number of moras before the suffix. It is common to use the first two moras of the base name, but there are also variations that are not produced by simple truncation:

Truncation to the first two moras:

/o.sa.mu/	osamu	>	/o.sa.tja.N/	osachan
/ta.ro.ː/	taroo	>	/ta.ro.tja.N/	tarochan
/jo.ː.su.ke/	yoosuke	>	/jo.ː.tja.N/	yoochan
/ta.i.zo.ː/	taizoo	>	/ta.i.tja.N/	taichan
/ki.N.su.ke/	kinsuke	>	/ki.N.tja.N/	kinchan

From first mora, with lengthening:

/ti/	chi	>	/ti.ː.tja.N/	chiichan
/ka.jo.ko/	kayoko	>	/ka.ː.tja.N/	kaachan

With formation of a moraic obstruent:

/a.tu.ko/	atsuko	>	/a.Q.tja.N/	atchan
/mi.ti.ko/	michiko	>	/mi.Q.tja.N/	mitchan
/bo.ː/	boo	>	/bo.Q.tja.N/	botchan

With formation of a moraic nasal:

/a.ni/	ani	>	/a.N.tja.N/	anchan
/me.ɡu.mi/	megumi	>	/me.N.tja.N/	menchan
/no.bu.ko/	nobuko	>	/no.N.tja.N/	nonchan

From two non-adjacent moras:

/a.ki.ko/	akiko	>	/a.ko.tja.N/	akochan
/mo.to.ko/	motoko	>	/mo.ko.tja.N/	mokochan

Poser 1990 argues that the various kinds of modifications are best explained in terms of a two-mora 'template' used in the formation of this type of hypocoristic: the bimoraic foot.
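
The idea of a two-mora template can be illustrated by generating candidate forms from a name's moras. This sketch is deliberately over-generating and is not Poser's analysis: the function name and the labels are assumptions, the conditions that determine which candidates are actually acceptable for a given name are not modelled, and the "non-adjacent" pattern is assumed to pair the first and third moras as in the examples above.

```python
# Hypothetical sketch: build -chan candidates whose base is exactly two moras.
LENGTH = "\u02d0"          # length mark
SUFFIX = ["tja", "N"]      # the suffix -chan as two moras


def chan_candidates(moras):
    """Return a dict of pattern name -> candidate form, dot-separated."""
    if not moras:
        raise ValueError("name must contain at least one mora")
    first = moras[0]
    bases = {
        "lengthening": [first, LENGTH],
        "obstruent": [first, "Q"],
        "nasal": [first, "N"],
    }
    if len(moras) >= 2:
        bases["truncation"] = list(moras[:2])
    if len(moras) >= 3:
        bases["non_adjacent"] = [first, moras[2]]
    # Every base has two moras, so each form has four moras in total.
    return {name: ".".join(base + SUFFIX) for name, base in bases.items()}
```

For example, `chan_candidates(["a", "tu", "ko"])` includes both `a.tu.tja.N` and `a.Q.tja.N`, the latter corresponding to the attested form atchan.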

Monomoraic (one-mora) feet, also called "degenerate" feet, exist in other contexts. Labrune, citing Tanaka (2008:203), argues that feet may also be trimoraic, whereas Ito and Mester describe the foot as 'maximally bimoraic'.

Syllables

Although there is debate about the usefulness or relevance of syllables to the phonology of Japanese, it is possible to analyze Japanese words as being divided into syllables. When setting Japanese lyrics to (modern Western-style) music, a single note may correspond either to a mora or to a syllable.

Normally, each syllable contains at least one vowel and has a length of either one mora (called a light syllable) or two moras (called a heavy syllable); thus, the structure of a typical Japanese syllable can be represented as (C)(j)V(V/N/Q), where C represents an onset consonant, V represents a vowel, N represents a moraic nasal, Q represents a moraic obstruent, components in parentheses are optional, and components separated by a slash are mutually exclusive. However, other, more marginal syllable types (such as trimoraic syllables or vowelless syllables) may exist in restricted contexts.

The majority of syllables in spontaneous Japanese speech are 'light', that is, one mora long, with the form (C)(j)V.
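
The template (C)(j)V(V/N/Q) can be expressed as a regular expression that also reports syllable weight. This is a rough sketch under assumptions: the function name and the consonant inventory in the pattern are chosen for the example, /w/ is not allowed before /j/, and marginal syllable types (trimoraic or vowelless) are treated as non-matching.

```python
import re

# Hypothetical sketch of the (C)(j)V(V/N/Q) template; the optional final
# group holds the second mora of a heavy syllable.
SYLLABLE = re.compile(r"(?:[pbtdkg\u0261szhmnr]j?|w|j)?[aeiou]([aeiouNQ])?")


def syllable_weight(syllable):
    """Return 'light', 'heavy', or None if the template does not match."""
    match = SYLLABLE.fullmatch(syllable)
    if match is None:
        return None
    return "heavy" if match.group(1) else "light"
```

Under these assumptions, `syllable_weight("kjo")` is `'light'` and `syllable_weight("koN")` is `'heavy'`, while a cluster-initial string such as `"sto"` returns `None`.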

Heavy syllables

'Heavy' syllables (two moras long) may potentially take any of the following forms:

(C)(j)VN (ending in a short vowel + /N/)
(C)(j)VQ (ending in a short vowel + /Q/)
(C)(j)VR (ending in a long vowel). May be analyzed either as a special case of (C)(j)VV with both V as the same vowel phoneme, or as ending in a vowel followed by a special chroneme segment (written as R or sometimes H).
(C)(j)V₁V₂, where V₁ is different from V₂. Sometimes notated as (C)(j)VJ.
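The four shapes listed above can be told apart by looking only at the last two segments of a syllable. The following is a minimal sketch, assuming an ASCII notation in which N and Q stand for the moraic nasal and obstruent and a long vowel is written as a doubled vowel; the function name is hypothetical, and the onset is not validated.

```python
VOWELS = "aeiou"


def heavy_syllable_type(syllable: str) -> str:
    """Classify a heavy syllable by its final two segments (onset not checked)."""
    if len(syllable) < 2 or syllable[-2] not in VOWELS:
        raise ValueError(f"not a heavy syllable: {syllable!r}")
    nucleus, final = syllable[-2], syllable[-1]
    if final == "N":
        return "(C)(j)VN"
    if final == "Q":
        return "(C)(j)VQ"
    if final in VOWELS:
        # Identical vowels are treated as a long vowel, different ones as V1V2.
        return "(C)(j)VR" if final == nucleus else "(C)(j)V1V2"
    raise ValueError(f"not a heavy syllable: {syllable!r}")
```

Whether a given V₁V₂ sequence really belongs to one syllable is an analytical question that this sketch does not address; it only labels the shape of a string already assumed to be a single syllable.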

Some descriptions of Japanese phonology refer to a VV sequence within a syllable as a 'diphthong'; others use the term 'quasi-diphthong' as a means of clarifying that these are analyzed as sequences of two vowel phonemes within one syllable, rather than as unitary phonemes. There is disagreement about which non-identical vowel sequences can occur within the same syllable. One criterion used to evaluate this question is the placement of pitch accent: it has been argued that, like syllables ending in long vowels, syllables ending in diphthongs cannot bear a pitch accent on their final mora. Kubozono 2015 argues that only /ai/, /oi/ and /ui/ can be diphthongs, whereas some prior literature has included other sequences such as /ae/, /ao/, /oe/, /au/, when they occur within a morpheme. Labrune 2012 argues against the syllable as a unit of Japanese phonology and thus concludes that no vowel sequences ought to be analyzed as diphthongs.

The distinction between a heterosyllabic vowel sequence and a long vowel or diphthong is not always predictable from the position of morpheme boundaries: that is, syllable breaks between vowels do not always correspond to morpheme boundaries (or vice versa).

For example, some speakers may pronounce the word 炎, honoo, 'flame' with a heterosyllabic /o.o/ sequence, even though this word is arguably monomorphemic in modern Japanese. This is an exceptional case: for the most part, heterosyllabic sequences of two identical short vowels are found only across a morpheme boundary. On the other hand, it is not so rare for a heterosyllabic sequence of two non-identical vowels to occur within a morpheme.

In addition, when VV sequences that may form a valid diphthong (such as /ai/) occur across morpheme boundaries, it seems to be possible for them to be pronounced in one syllable. For example, 歯医者, haisha, 'dentist' is morphologically a compound of 歯, ha, 'tooth' and 医者, isha, 'doctor' (itself composed of the morphemes 医, i, 'medical' and 者, sha, 'person'); despite the morpheme boundary between /a/ and /i/ in this word, they seem to be pronounced in one syllable as a diphthong, making it a homophone with 敗者, haisha, 'defeated person'. Likewise, the morpheme /i/ used as a suffix to form the dictionary form (or affirmative nonpast-tense form) of an i-adjective is almost never pronounced as a separate syllable; instead, it combines with a preceding stem-final /i/ to form the long vowel [iː], or with a preceding stem-final /a/, /o/ or /u/ to form a diphthong.

Superheavy syllables

Syllables of three or more moras, called "superheavy" syllables, are uncommon and exceptional (or 'marked'); the extent to which they occur in Japanese words is debated. Superheavy syllables never occur within a morpheme in Yamato or Sino-Japanese. Apparent superheavy syllables can be found in certain morphologically derived Yamato forms (including inflected verb forms where a suffix starting with /t/ is attached to a root ending in -VVC-, derived adjectives in っぽい, -ppoi, or derived demonyms in っこ, -kko) as well as in many loanwords.

Apparent superheavy syllables
Syllable type	Morphologically complex forms	Loanwords
(C)(j)VRN		English: green → Japanese: グリーン, romanized: gurīn
(C)(j)V₁V₂N		English: Spain → Japanese: スペイン, romanized: supein
(C)(j)VRQ	通った, tootta, 'pass-PAST'; 東京っ子, tōkyōkko, 'Tokyoite'
(C)(j)V₁V₂Q	入って, haitte, 'enter-GERUNDIVE'; 仙台っ子, sendaikko, 'Sendai-ite'
(C)(j)VNQ	ロンドンっ子, rondonkko, 'Londoner'; ドラえもんっぽい, doraemonppoi, 'like Doraemon'
(C)(j)VRNQ	ウィーンっ子, uiinkko, 'Viennese'; ウィーンって言った, uiintte itta, 'Vienna, (s)he said'

According to some accounts, certain forms listed in the above table may be avoided in favor of a different pronunciation with an ordinary heavy syllable (by reducing a long vowel to a short vowel or a geminate to a singleton consonant). Vance 1987 suggests there might be a strong tendency to reduce superheavy syllables to the length of two moras in speech at a normal conversational speed, saying that tooQta is often indistinguishable from toQta. Vance 2008 again affirms the existence of a tendency to shorten superheavy syllables in speech at a conversational tempo (specifically, to replace VRQ with VQ, VRN with VN, and VNQ with VN), but stipulates that the distinctions between 通った, tootta and 取った, totta; シーン, shiin and 芯, shin; and コンテ, konte, 'script' and 紺って, kontte, 'navy blue-QUOTATIVE' are clearly audible in careful pronunciation. Ito and Mester explicitly deny that there is a general tendency to shorten the long vowel of forms such as tootte in most styles of speech. Ohta 1991 accepts superheavy syllables ending in /RQ/ and /JQ/ but describes /NQ/ as hardly possible, stating that he and the majority of the informants he consulted judged examples such as /roNdoNQko/ to be questionably well-formed in comparison to /roNdoNko/.

It has also been argued that in some cases, what appears to be a superheavy syllable is really a sequence of a light syllable followed by a heavy syllable. Based on the location of the pitch accent, Kubozono 2015 argues that forms such as スペイン風邪, supeiꜜnkaze, 'Spanish influenza', リンカーン杯, rinkaaꜜnhai, 'Lincoln Cup', グリーン車, guriiꜜnsha, 'Green Car' (first-class car of a train) are syllabified as su.pe.in.ka.ze, rin.ka.an.hai, gu.ri.in.sha, and concludes that /VVN/ sequences are generally syllabified as /V.VN/. Ito and Mester 2018 state that compounds formed from words of this shape often exhibit variable accentuation, citing guriꜜinsha~guriiꜜnsha, Uターン率, yuutaaꜜnritsu~yuutaꜜanritsu, 'U-turn percentage', and マクリーン館, makuriiꜜnkan ~ makuriꜜinkan, 'McLean Building'.

Ito and Mester 2015 note that the pitch-based criterion for syllabifying VV sequences would suggest that Sendaiꜜkko is syllabified as Sen.da.ik.ko; likewise, Ohta 1991 reports a suggestion by Shin’ichi Tanaka (per personal communication) that the accentuation tookyooꜜkko implies the syllable division -kyo.oQ-, although Ohta favors the analysis with a superheavy syllable based on the intuition that this word contains a long vowel and not a sequence of two separate vowels. Ito and Mester ultimately question whether the placement of pitch accent on the second mora really rules out analyzing a three-mora sequence as a single superheavy syllable. A pitch accent is reported to fall on /N/ in the form rondonꜜkko. Ito and Mester find the syllabification ron.do.nk.ko implausible, and propose that pitch accent, rather than always falling on the first mora of a syllable, may fall on its penultimate mora (when there is more than one). Per Kubozono 2015, the superheavy syllable in toꜜotta bears accent on its first mora.

Evidence for the avoidance of superheavy syllables includes the adaptation of foreign long vowels or diphthongs to Japanese short vowels before /N/ in loanwords such as the following:

English: foundation → Japanese: ファンデーション, romanized: fandēshon

English: stainless → Japanese: ステンレス, romanized: sutenresu

English: corned beef → Japanese: コンビーフ, romanized: konbīfu

There are exceptions to this shortening: /ai/ seems to never be affected, and /au/, although often replaced with /a/ in this context, can be kept, as in the following words:

English: sound → Japanese: サウンド, romanized: saundo

English: mountain → Japanese: マウンテン, romanized: maunten

Vowelless syllables

Some analyses recognize vowelless syllables in restricted contexts.

Kawahara and Shaw 2018 argue that high vowel deletion may produce syllabic fricatives or affricates.
Per Vance 2008, /N/ is syllabic in the marginal circumstances where it occurs word-initially, such as ン十億, njūoku, 'several billion'.

Phonotactics

Within a mora

Phonotactically legal phoneme sequences, each counting as one mora
	/-a/	/-i/	/-u/	/-e/	/-o/	/-ja/	/-ju/	/-jo/
/∅-/	/a/	/i/	/u/	/e/	/o/	/ja/	/ju/	/jo/
/k-/	/ka/	/ki/	/ku/	/ke/	/ko/	/kja/	/kju/	/kjo/
/ɡ-/	/ɡa/	/ɡi/	/ɡu/	/ɡe/	/ɡo/	/ɡja/	/ɡju/	/ɡjo/
/s-/	/sa/	/si/	/su/	/se/	/so/	/sja/	/sju/	/sjo/
/z-/	/za/	/zi/	/zu/	/ze/	/zo/	/zja/	/zju/	/zjo/
/t-/	/ta/	/ti/	/tu/	/te/	/to/	/tja/	/tju/	/tjo/
/d-/	/da/	(/di/)	(/du/)	/de/	/do/	(/dja/)	(/dju/)	(/djo/)
/n-/	/na/	/ni/	/nu/	/ne/	/no/	/nja/	/nju/	/njo/
/h-/	/ha/	/hi/	/hu/	/he/	/ho/	/hja/	/hju/	/hjo/
/b-/	/ba/	/bi/	/bu/	/be/	/bo/	/bja/	/bju/	/bjo/
/p-/	/pa/	/pi/	/pu/	/pe/	/po/	/pja/	/pju/	/pjo/
/m-/	/ma/	/mi/	/mu/	/me/	/mo/	/mja/	/mju/	/mjo/
/r-/	/ra/	/ri/	/ru/	/re/	/ro/	/rja/	/rju/	/rjo/
/w-/	/wa/
Marginal combinations mostly found in Western loans









Special moras
/V-/	/N/
/V-C/	/Q/ (geminates the following consonant)
/V-/	/R/

Palatals

Japanese syllables may start with the palatal glide /j/ or with consonant + /j/ clusters. These onsets normally can be found only before the back vowels /a o u/.

Before /i/, /j/ never occurs. All consonants are phonetically palatalized before /i/, but do not contrast in this position with unpalatalized consonants: as a result, palatalization in this context can be analyzed as allophonic. In native Japanese vocabulary, coronal obstruent phones (i.e. ) do not occur before /i/, and in contexts where a morphological process such as verb inflection would place a coronal obstruent phoneme before /i/, the coronal is replaced with an alveolo-palatal sibilant, resulting in alternations such as 'wait' (negative) vs. 'wait' (polite) or 'lend' (negative) vs. 'lend' (polite). Thus, function in native vocabulary as the palatalized counterparts of coronal consonant phonemes. However, the analysis of alveolo-palatal sibilants as palatalized allophones of coronal consonants is complicated by loanwords. The sequences are distinguished from in recent loanwords (with generally preserved in words borrowed more recently than 1930) and to a lesser extent, some speakers may exhibit a contrast in loanwords between and .

Before /e/, was lost in the current standard language, but some dialects (such as Kyushu) and pre-modern versions of the language contain as well as exhibiting in place of modern standard . In standard Japanese, non-foreign words do not contain . There are no morphological alternations related to this gap. As discussed above, these sequences can occur in loanwords. The sequence has been consistently used in borrowed words at all time periods; セロ (sero) from cello seems to be a unique exception showing adaptation of to . Another rare exception, showing adaptation to (vowel raising), is チッキ (chikki) from English check (less common than チェック (chekku)). The sequences and tend to be used in words borrowed more recently than around 1950, whereas words borrowed before that point may show depalatalization to and respectively. Examples of depalatalized forms include ゼリー (zerī) from English jelly and セパード ( sepādo) from English shepherd (the latter borrowing dates to the 19th century).

Pre-U consonants

Several Japanese consonants developed special phonetic values before /u/. These variants, while initially allophonic, may however have attained phonemic status through later neutralizations or the introduction of novel contrasts in loanwords.

There is no distinction between [hɯ] and [ɸɯ]. In core vocabulary, the voiceless bilabial fricative [ɸ] occurs only before the vowel /u/. Thus, [ɸɯ] can be analyzed as an allophonic realization of /hu/.

Outside of loanwords, [tɯ] and [dɯ] do not occur, because /t d/ were affricated to [ts dz] before /u/.

In dialects that show neutralization of the contrast, the merged phone can occur before any vowel (not only before /u/); thus, for these dialects, the affrication of original /du/ can be analyzed as resulting in a phonemically distinct sequence /zu/ (resulting in a gap for the sequence /du/).

In core vocabulary, the voiceless coronal affricate [ts] occurs only before the vowel /u/; thus [tsɯ] can be analyzed as an allophonic realization of /tu/. Verb inflection shows alternations between [t] and [ts], as in [katanai] 'win' (negative) and [katsɯ] 'win' (present tense). However, the interpretation of [tsɯ] as /tu/ (with [ts] merely an allophone of /t/) is complicated by the occurrence of [ts] before vowels other than /u/ in loanwords.

In addition, in recent loanwords there is some use of unaffricated ; they can be represented in kana by トゥ and ドゥ, which received official recognition by a cabinet notice in 1991 as an alternative to the use of or to adapt foreign . Forms where and can be found include the following:

English: Today → /tudei/

French: toujours → /tuzjuuru/

French: douze → /duuzu/

Older loanwords from French display adaptation of [tu] as [tsɯ] and of [du] as [do]:

French: Toulouse → /t͡suuruuzu/

French: Pompidou → /poNpidoo/

Vance 2008 argues that and remain 'foreignisms' in Japanese phonology; they are less frequent than , and this has been interpreted as evidence that a constraint against * remained active in Japanese phonology for longer than the constraint against *.

In both old and recent loanwords, the epenthetic vowel used after word-final or pre-consonantal /t/ or /d/ is normally /o/ rather than /u/ (there is also some use of and ). However, adapted forms show some fluctuation between and in this context, e.g. French estrade 'stage', in addition to being adapted as /esutoraddo/, has a variant adaptation /esuturaddu/.

Between moras

Special moras

If analyzed as phonemes, the moraic consonants /N/ and /Q/ show a number of phonotactic restrictions (although some constraints can be violated in certain contexts, or may apply only within certain layers of Japanese vocabulary).

N

In general, the moraic nasal /N/ can occur between a vowel and a consonant, between vowels (where it contrasts with non-moraic nasal onsets), or at the end of a word. It is usually not found word-initially, but word-initial /N/ may occur in colloquial speech as a result of dropping of a preceding mora. In this context, its pronunciation is invariably assimilated to the place of articulation of the following consonant:

/naN-bjaku-neN/ → /N-bjaku-neN/ ('several hundred years')

/soNna koto/ → /Nna koto/ ('such thing')

In Sino-Japanese vocabulary, /N/ can occur as the second and final mora of a Sino-Japanese morpheme. It may be followed by any other consonant or vowel. However, in some contexts Sino-Japanese morpheme-final /N/ may cause changes to the start of a closely connected following morpheme:

After /N/, morpheme-initial /h/ is regularly replaced with /p/ in Sino-Japanese words. However, this does not occur across word boundaries or across the juncture in the middle of a 'complex compound' where the second element of the compound is a prosodic word composed of more than one Sino-Japanese morpheme: for example, /h/ remains unchanged in 完全敗北, kan+zen#hai+boku, 'total defeat' and 新発明, shin#hatsu+mei, 'new invention'.
Some words where /N/ is followed by a morpheme that starts in modern Japanese with a vowel or semivowel developed a pronunciation with a geminate nasal (/Nn/ or /Nm/) as the result of historic sound changes (see renjō). Aside from these isolated exceptions, /N/ followed by a vowel is regularly pronounced without resyllabification in Sino-Japanese compounds.
A following /t k h s/ is sometimes changed to /d g b z/; this can be interpreted as a special case of the more general sound change of rendaku.
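The first of these changes, together with its boundary condition, can be sketched as follows. This is an illustrative simplification rather than a full account: it assumes forms written phonemically with `+` marking a morpheme boundary and `#` marking a word-level juncture (as in the examples above), and the function name is hypothetical.

```python
def apply_post_nasal_p(form: str) -> str:
    """Replace morpheme-initial /h/ with /p/ after /N/, but never across '#'."""
    words = []
    for word in form.split("#"):
        morphemes = word.split("+")
        result = morphemes[0]
        for morpheme in morphemes[1:]:
            # The change applies only at '+' boundaries inside one word.
            if result.endswith("N") and morpheme.startswith("h"):
                morpheme = "p" + morpheme[1:]
            result += morpheme
        words.append(result)
    return "".join(words)
```

For instance, `apply_post_nasal_p("oN+hu")` gives `oNpu` (音符, onpu), while `apply_post_nasal_p("kaN+zeN#hai+boku")` leaves the /h/ of 敗北 unchanged because it follows a `#` juncture.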

Q

The moraic obstruent /Q/ generally occurs only in word-medial position between a vowel and a consonant. However, word-initial geminates may occur in casual speech as the result of elision:

/mattaku/ (an expression of exasperation) → [ttakɯ]

/usseena/ ('shut up') → [sseːna]

In native Japanese vocabulary, /Q/ is found only before /p t k s/ (this includes [ts], [tɕ] and [ɕ], which can be viewed as allophones of /t/ and /s/); in other words, before voiceless obstruents other than /h/. The same generally applies to Sino-Japanese vocabulary. In these layers of the vocabulary, [pp] functions as the geminate counterpart of /h/ due to the historical development of Japanese /h/ from Old Japanese [p].

Tamaoka and Makioka 2004 found that in a Japanese newspaper corpus, /Q/ was followed over 98% of the time by one of /p t k s/; however, there were also at least some cases where it was followed by /h b d g z r/.

Geminate /h/ is found only in recent loanwords (e.g. ゴッホ, Gohho, '(van) Gogh', バッハ, Bahha, 'Bach'), and rarely in Sino-Japanese or mixed compounds (e.g. 十針, juhhari, 'ten stitches', 絶不調, zeffuchō, 'terrible slump').

Voiced obstruents (/b d g z/) do not occur as geminates in native Japanese words. This can be seen with suffixation that would otherwise feature voiced geminates. For example, Japanese has a suffix, |ri|, that contains what Kawahara (2006) calls a "floating mora" that triggers gemination in certain cases (e.g. |tapu| + |ri| > [tappɯɾi] 'a lot of'). When this would otherwise lead to a geminated voiced obstruent, a moraic nasal appears instead as a sort of "partial gemination" (e.g. |zabu| + |ri| > [zambɯɾi] 'splashing').

However, voiced geminate obstruents have been used in words adapted from foreign languages since the 19th century. These loanwords can even come from languages, such as English, that do not feature gemination in the first place. For example, when an English word features a coda consonant preceded by a lax vowel, it can be borrowed into Japanese with a geminate; gemination may also appear as a result of borrowing via written materials, where a word spelled with doubled letters leads to a geminated pronunciation. Because these loanwords can feature voiced geminates, Japanese now exhibits a voice distinction with geminates where it formerly did not:

スラッガー, suraggā ('slugger') vs. surakkā ('slacker')

キッド, kiddo ('kid') vs. kitto ('kit')

Voiced geminate obstruents may also occur in truncated word forms (created by blending some moras from each word in a longer phrase) and in forms produced as the outcome of word games:

カットモデル, katto moderu, 'cut model' /kaQto moderu/ → kadderu /kaQderu/ (blend)

バット, batto, 'bat' /baQto/ → tobba /toQba/ (form produced in a reversing language game)

The most frequent geminated voiced obstruent is /Qd/, followed by /Qɡ/, /Qz/, /Qb/. In borrowed words, /d/ is the only voiced stop that is regularly adapted as a geminate when it occurs in word-final position after a lax/short vowel; gemination of /b/ and /ɡ/ in this context is sporadic.

Phonetically, voiced geminate obstruents in Japanese tend to have a 'semi-devoiced' pronunciation where phonetic voicing stops partway through the closure of the consonant. High vowels are not devoiced after phonemically voiced geminates.

In some cases, voiced geminate obstruents can optionally be replaced with the corresponding voiceless geminate phonemes:

バッド, baddo → バット, batto, 'bad'

ドッグ, doggu → ドック, dokku, 'dog'

ベッド, beddo → ベット, betto, 'bed'

Phonemic devoicing like this (which may be marked in spelling) has been argued to be conditioned by the presence of another voiced obstruent. Another example is doreddo ~ doretto 'dreadlocks'. Kawahara (2006) attributes this to a less reliable distinction between voiced and voiceless geminates compared to the same distinction in non-geminated consonants, noting that speakers may have difficulty distinguishing them due to the partial devoicing of voiced geminates and their resistance to the weakening process mentioned above, both of which can make them sound like voiceless geminates.

A small number of foreign proper names have katakana spellings that would imply a pronunciation with /Qr/, such as アッラー, arrā, 'Allah' and チェッリーニ, Cherrīni, 'Cellini'. The phonetic realization of /Qr/ in such forms varies between a lengthened sonorant sound and a sequence of a glottal stop followed by a sonorant.

Aside from loanwords, consonants that cannot normally occur after /Q/ may be geminated in certain emphatic variants of native words.

Reduplicative mimetics may be used in an intensified form where the second consonant of the first portion is geminated, and this can affect consonants that otherwise do not occur as geminates, such as /r/ and /j/ in the following forms:

barra-bara, 'in disorder'

borro-boro, 'worn out'

gurra-gura, 'shaky'

karra-kara, 'dry'

perra-pera, 'thin'

buyyo-buyo, 'flabby'

Adjectives may take an emphatic pronunciation where the second consonant is geminated and the following vowel is lengthened, as in the following forms:

naggaai < nagai, 'long'

karraai < karai, 'hot'

kowwaai < kowai, 'dreadful'

Similarly, Vance 2008 cites the following emphatic pronunciations as examples of /Qj/ and /Qm/:

/haQjai/ < 速い, hayai, 'fast'

/saQmui/ < 寒い, samui, 'cold'

Vowel sequences and long vowels

Sequences of vowels with no intervening consonant occur often in Japanese:

The sequences /ai oi ui ie ae oe ue io ao uo/ can be found within a morpheme in indigenous or Sino-Japanese words.
In Sino-Japanese morphemes, which show a fairly restricted structure, the only vowel sequences that can normally be found are /ai ui/ (as sequences of non-identical vowels) or (as long vowels). Sino-Japanese is historically derived from /ei/ and may variably be realized phonetically as (possibly due to spelling pronunciation) rather than as the long vowel .
Other vowel sequences can be found within a morpheme in foreign words.
In addition, any pair of vowels may occur in sequence across morpheme boundaries.

When the first of two vowels in a VV sequence is higher than the second, there is often not a clear distinction between a pronunciation with hiatus and a pronunciation where a glide with the same frontness as the first vowel is inserted before the second: i.e., the VV sequences /ia io ua ea oa/ may sound like /ija ijo uwa eja owa/. For example, English gear has been borrowed into Japanese as ギア, gia, 'gear', but an alternative form of this word is ギヤ, giya. Per Kawahara 2003, the sequences /eo eu/ are not pronounced like *, nor is /iu/ pronounced like * (although it may sometimes be replaced by ). In addition, Kawahara states that this glide formation process may be blocked by a syntactic boundary or by some (though not all) morpheme boundaries (Kawahara suggests that apparent cases of glide formation across morpheme boundaries are best interpreted as evidence that the boundary is no longer transparent).
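The glide-insertion variants described here can be sketched as a small lookup. This is a minimal illustration that encodes only the five sequences listed above (/ia io ua ea oa/) and ignores the morphological and syntactic blocking conditions; the names are hypothetical and forms are assumed to be written phonemically in ASCII.

```python
# Only the sequences named in the text; /eo eu iu/ are deliberately excluded.
GLIDE_SEQUENCES = {"ia", "io", "ua", "ea", "oa"}


def glide_variant(form: str) -> str:
    """Return the variant in which a glide is inserted into qualifying VV sequences."""
    out = []
    for index, segment in enumerate(form):
        out.append(segment)
        if form[index:index + 2] in GLIDE_SEQUENCES:
            # The glide matches the frontness of the first vowel: j after i/e, w after u/o.
            out.append("j" if segment in "ie" else "w")
    return "".join(out)
```

With this sketch, `glide_variant("gia")` returns `gija`, corresponding to the ギア ~ ギヤ alternation, while a form containing /eo/ is returned unchanged.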

Many long vowels historically developed from vowel sequences by coalescence, such as /au ou eu iu/ > [oː oː joː jɯː]. In addition, some vowel sequences in contemporary Japanese may optionally undergo coalescence to a long vowel in colloquial or casual speech (for some sequences, such as /oi/ and /ui/, coalescence is not possible in all contexts, but only in adjective forms):

/ai/ > [eː]	/itai/ > [iteː]	痛い, itai, 'painful, ouch'
/oi/ > [eː]	/sugoi/ > [sɯɡeː]	凄い, sugoi, 'great'

Distribution of consonant phonemes based on word position

In Yamato vocabulary, certain consonant phonemes, such as /p/, /h/, /r/ and voiced obstruents, tend to be found only in certain positions in a word. None of these restrictions applies to foreign vocabulary; some do not apply to mimetic or Sino-Japanese vocabulary; and certain generalizations have exceptions even within Yamato vocabulary; nevertheless, some linguists interpret them as still playing a role in Japanese phonology, based on the model of a 'stratified' lexicon where some active phonological constraints affect only certain layers of the vocabulary. The gaps in the distribution of these consonant phonemes can also be explained in terms of diachronic sound changes.

The voiced obstruents /b d g z/ occur without restriction at the start of Sino-Japanese and foreign morphemes, but usually do not occur at the start of Yamato words. However, word-initial /b d g z/ occur frequently in the mimetic stratum of native Japanese vocabulary, where they often function as sound-symbolic variants of their voiceless counterparts /p h t k s/. In addition, some non-mimetic Yamato words show a voiced initial obstruent; in some cases, voicing seems to have had an expressive function, adding a negative or pejorative shade to a root. Diachronically, Japanese voiced obstruents developed in native words from Old Japanese prenasalized consonants, which are thought to come from nasal + obstruent clusters derived from Proto-Japonic sequences of a nasal phoneme followed by an obstruent phoneme. Since these nasal + obstruent clusters did not occur word-initially, there was no regular source of word-initial voiced obstruents in Yamato vocabulary.

Yamato and mimetic words almost never start with /r/. In contrast, word-initial /r/ occurs without restriction in Sino-Japanese and foreign vocabulary.

In Yamato words, /p/ occurs only as a word-medial geminate (or equivalently, only after /Q/) as in 河童, kappa. In Sino-Japanese words, /p/ occurs only after /Q/ or /N/ (as in 切腹, seppuku, 北方, hoppō, 音符, onpu), alternating with /h/ in other positions. In contrast, mimetic words can contain singleton /p/, either word-initially or word-medially. Singleton /p/ also occurs freely in foreign words, such as パオズ, paozu, ペテン, peten, パーティー, pātī. The gap in the distribution of singleton [p] results from the fact that original *p developed in Japanese to [ɸ] in word-initial position and to /w/ in intervocalic position, leaving geminate [pp] as the only context where [p] occurred in Yamato words. (The fricative [ɸ] remained labial before all vowels up through Late Middle Japanese, but was eventually debuccalized to [h] before any vowel other than /u/, resulting in the modern Japanese /h/ phoneme. The glide /w/ was eventually lost before any vowel other than /a/.) The few non-mimetic words where /p/ occurs initially include 風太郎, pūtarō, although as a personal name it is still pronounced Fūtarō.

The phoneme /h/ is rarely found in the middle of a Yamato morpheme (a small number of exceptions exist, such as ahureru, ahiru, yahari) or in the middle of a mimetic root (examples are mostly confined to mimetics that imitate 'guttural' or 'laryngeal' sounds, such as goho-goho, 'coughing' and ahaha, 'laughing'). In addition, /h/ never occurs in the middle of a Sino-Japanese morpheme. This gap results from the aforementioned development of original *p to /w/, rather than /h/, in intervocalic position.

Morphophonology

As an agglutinative language, Japanese has generally very regular pronunciation, with much simpler morphophonology than a fusional language would have. Nevertheless, there are a number of prominent sound change phenomena, primarily in morpheme combination and in the conjugation of verbs and adjectives. Phonemic changes are generally reflected in the spelling, while those that are not typically indicate informal or dialectal speech, which further simplifies pronunciation.

Sandhi

Various forms of sandhi exist; the Japanese term for sandhi generally is ren'on (連音).

Rendaku

In Japanese, sandhi is prominently exhibited in rendaku – consonant mutation of the initial consonant of a morpheme from unvoiced to voiced in some contexts when it occurs in the middle of a word. This phonetic difference is reflected in the spelling via the addition of dakuten, as in ka, ga (か／が). In cases where this combines with the yotsugana mergers, notably ji, dzi (じ／ぢ) and zu, dzu (ず／づ) in standard Japanese, the resulting spelling is morphophonemic rather than purely phonemic.
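The voicing step of rendaku can be sketched as a mapping over the initial consonant. This sketch assumes phonemic ASCII forms (so /h/ voices to /b/ and /t/ to /d/) and a hypothetical function name; it shows only what the voiced form looks like and makes no attempt to predict whether rendaku applies in a given compound, which depends on conditions not modeled here.

```python
# Voiceless initial obstruents and their voiced (rendaku) counterparts.
RENDAKU_VOICING = {"k": "g", "s": "z", "t": "d", "h": "b"}


def rendaku_form(morpheme: str) -> str:
    """Return the morpheme with its initial consonant voiced, if it can be voiced."""
    if morpheme and morpheme[0] in RENDAKU_VOICING:
        return RENDAKU_VOICING[morpheme[0]] + morpheme[1:]
    return morpheme
```

For example, `rendaku_form("hito")` gives `bito`, the form of 人 found in some compounds, and a vowel-initial morpheme is returned unchanged.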

Gemination

The other common sandhi in Japanese is the conversion of a morpheme-final つ or く (tsu, ku), ち or き (chi, ki), and rarely ふ or ひ (fu, hi) into a geminate consonant when not word-final; orthographically, this is written with the sokuon っ, as it occurs most often with つ. For example:

一 (いつ itsu) + 緒 (しょ sho) = 一緒 (いっしょ issho)
学 (がく gaku) + 校 (こう kō) = 学校 (がっこう gakkō)
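The kana-level effect of this gemination can be sketched as follows. The conditioning used here is a simplifying assumption, not a rule stated in the article: final つ or ち is taken to geminate before the k-, s-, t- and h-rows (with h-row kana becoming p-row), and final く or き only before the k-row. The function name is hypothetical, and real compounds have lexical exceptions that this does not capture.

```python
K_ROW = "かきくけこ"
S_ROW = "さしすせそ"
T_ROW = "たちつてと"
H_ROW = "はひふへほ"
H_TO_P = str.maketrans(H_ROW, "ぱぴぷぺぽ")


def join_with_gemination(first: str, second: str) -> str:
    """Join two morphemes written in hiragana, writing the sokuon where assumed."""
    last, head = first[-1], second[0]
    if last in "つち" and head in K_ROW + S_ROW + T_ROW + H_ROW:
        if head in H_ROW:
            # After the sokuon, an h-row kana is replaced by its p-row counterpart.
            second = head.translate(H_TO_P) + second[1:]
        return first[:-1] + "っ" + second
    if last in "くき" and head in K_ROW:
        return first[:-1] + "っ" + second
    return first + second
```

Under these assumptions, `join_with_gemination("いつ", "しょ")` yields いっしょ and `join_with_gemination("がく", "こう")` yields がっこう, matching the two examples above.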

Some long vowels derive from an earlier combination of a vowel and fu ふ (see onbin). The f often causes gemination when it is joined with another word:

法 (hafu はふ > hō ほう) + 被 (hi ひ) = 法被 (happi はっぴ), instead of hōhi ほうひ
合 (kafu かふ > gō ごう) + 戦 (sen せん) = 合戦 (kassen), instead of gōsen
入 (nifu > nyū) + 声 (shō) = 入声 (nisshō), instead of nyūshō
十 (jifu > jū) + 戒 (kai) = 十戒 (jikkai) instead of jūkai

Most words exhibiting this change are Sino-Japanese words deriving from Middle Chinese morphemes ending in /t̚/, /k̚/ or /p̚/, which were borrowed on their own into Japanese with a prop vowel after them (e.g., 日 MC */nit̚/ > Japanese /niti/) but in compounds were assimilated to the following consonant (e.g. 日本 MC */nit̚.pu̯ən/ > Japanese /niQ.poN/).

Renjō

Sandhi also occurs much less often in renjō (連声), where, most commonly, a terminal /N/ or /Q/ on one morpheme results in /n/ (or /m/ when derived from historical m) or /t̚/ respectively being added to the start of a following morpheme beginning with a vowel or semivowel, as in ten + ō → tennō (天皇: てん + おう → てんのう). Examples:

First syllable ending with /N/

銀杏 (ginnan): ぎん (gin) + あん (an) → ぎんなん (ginnan)
観音 (kannon): くゎん (kwan) + おむ (om) → くゎんのむ (kwannom) → かんのん (kannon)
天皇 (tennō): てん (ten) + わう (wau) → てんなう (tennau) → てんのう (tennō)

First syllable ending with /N/ from original /m/

三位 (sanmi): さむ (sam) + ゐ (wi) → さむみ (sammi) → さんみ (sanmi)
陰陽 (onmyō): おむ (om) + やう (yau) → おむみゃう (ommyau) → おんみょう (onmyō)

First syllable ending with /Q/

雪隠 (setchin): せつ (setsu) + いん (in) → せっちん (setchin)
屈惑 (kuttaku): くつ (kutsu) + わく (waku) → くったく (kuttaku)
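The nasal cases of renjō can be sketched on modern romanized forms. This is a rough illustration with a hypothetical function and parameter: it inserts a copy of the nasal before a following vowel or y, using m when the caller indicates that the nasal derives from historical m. Renjō is lexically restricted, so this shows the shape of the change rather than a rule that applies to every /N/ + vowel sequence, and the /Q/ cases are not covered.

```python
VOWELS = "aeiouāēīōū"


def renjo(first: str, second: str, historical_m: bool = False) -> str:
    """Join two romanized morphemes, doubling a final nasal before a vowel or y."""
    if first.endswith("n") and second[:1] and second[0] in VOWELS + "y":
        # The inserted consonant is m only where the nasal goes back to historical m.
        return first + ("m" if historical_m else "n") + second
    return first + second
```

For example, `renjo("gin", "an")` gives `ginnan` and `renjo("san", "i", historical_m=True)` gives `sanmi`, in line with 銀杏 and 三位 above.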

Onbin

Spelling changes
Archaic	Modern
あ＋う (a + u) あ＋ふ (a + fu)	おう (ō)
い＋う (i + u) い＋ふ (i + fu)	ゆう (yū)¹
う＋ふ (u + fu)	うう (ū)
え＋う (e + u) え＋ふ (e + fu)	よう (yō)
お＋ふ (o + fu)	おう (ō)
お＋ほ (o + ho) お＋を (o + wo)	おお (ō)
auxiliary verb む (mu)	ん (n)
medial or final は (ha)	わ (wa)
medial or final ひ (hi), へ (he), ほ (ho)	い (i), え (e), お (o) (via wi, we, wo, see below)
any ゐ (wi), ゑ (we), を (wo)	い (i), え (e), お (o)¹

1. usually not reflected in spelling

Another prominent feature is onbin (音便, euphonic sound change), particularly historical sound changes.

In cases where this has occurred within a morpheme, the morpheme itself is still distinct but with a different sound, as in hōki (箒 (ほうき), broom), which underwent two sound changes from earlier hahaki (ははき) → hauki (はうき) (onbin) → houki (ほうき) (historical vowel change) → hōki (ほうき) (long vowel, sound change not reflected in kana spelling).

However, certain forms are still recognizable as irregular morphology, particularly forms that occur in basic verb conjugation, as well as some compound words.

Verb conjugation

Polite adjective forms

The polite adjective forms (used before the polite copula gozaru (ござる, be) and verb zonjiru (存じる, think, know)) exhibit a one-step or two-step sound change. Firstly, these use the continuative form, -ku (-く), which exhibits onbin, dropping the k as -ku (-く) → -u (-う). Secondly, the vowel may combine with the preceding vowel, according to historical sound changes; if the resulting new sound is palatalized, meaning yu, yo (ゆ、よ), this combines with the preceding consonant, yielding a palatalized syllable.

This is most prominent in certain everyday terms that derive from an i-adjective ending in -ai changing to -ō (-ou), which is because these terms are abbreviations of polite phrases ending in gozaimasu, sometimes with a polite o- prefix. The terms are also used in their full form, with notable examples being:

arigatō (有難う、ありがとう, Thank you), from arigatai (有難い、ありがたい, (I am) grateful).
ohayō (お早う、おはよう, Good morning), from hayai (早い、はやい, (It is) early).
omedetō (お目出度う、おめでとう, Congratulations), from medetai (目出度い、めでたい, (It is) auspicious).

Other transforms of this type are found in polite speech, such as oishiku (美味しく) → oishū (美味しゅう) and ōkiku (大きく) → ōkyū (大きゅう).
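The two steps described in this section (dropping k from -ku, then merging the resulting u with the preceding vowel) can be sketched on Hepburn-style romanization. The function name is hypothetical, and the treatment of stems ending in sh, ch or j is an assumption about how palatalization is spelled in romanization rather than something stated in the text.

```python
def polite_adjective_form(ku_form: str) -> str:
    """Turn a romanized -ku continuative form into its polite onbin form."""
    if not ku_form.endswith("ku") or len(ku_form) < 3:
        raise ValueError(f"expected a form ending in -ku: {ku_form!r}")
    stem = ku_form[:-2]  # step 1: -ku becomes -u (the k is dropped)
    base, vowel = stem[:-1], stem[-1]
    # Step 2: the remaining u merges with the stem-final vowel.
    if vowel in "ao":
        return base + "ō"
    if vowel == "u":
        return base + "ū"
    if vowel == "i":
        # i + u gives yū; after sh/ch/j the y is already implied by the spelling.
        return base + ("ū" if base.endswith(("sh", "ch", "j")) else "yū")
    raise ValueError(f"unsupported stem vowel in {ku_form!r}")
```

With this sketch, `hayaku` becomes `hayō`, `oishiku` becomes `oishū`, and `ōkiku` becomes `ōkyū`, matching the forms cited in this section.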

-hito

The morpheme hito (人 (ひと), person) (with rendaku -bito (〜びと)) has changed to uto (うと) or udo (うど), respectively, in a number of compounds. This in turn often combined with a historical vowel change, resulting in a pronunciation rather different from that of the components, as in nakōdo (仲人 (なこうど), matchmaker) (see below). These include:

otōto (弟 (おとうと), younger brother), from otohito (弟人 (おとひと)) 'younger sibling' + 'person' → otouto (おとうと) → otōto.
imōto (妹 (いもうと), younger sister), from imohito (妹人 (いもひと)) 'sister' + 'person' → imouto (いもうと) → imōto.
shirōto (素人 (しろうと), novice), from shirohito (白人 (しろひと)) 'white' + 'person' → shirouto (しろうと) → shirōto.
kurōto (玄人 (くろうと), veteran), from kurohito (黒人 (くろひと)) 'black' + 'person' → kurouto (くろうと) → kurōto.
nakōdo (仲人 (なこうど), matchmaker), from nakabito (仲人 (なかびと)) → nakaudo (なかうど) → nakoudo (なこうど) → nakōdo.
karyūdo (狩人 (かりゅうど), hunter), from karibito (狩人 (かりびと)) → kariudo (かりうど) → karyuudo (かりゅうど) → karyūdo.
shūto (舅 (しゅうと), father-in-law), from shihito (舅人 (しひと)) → shiuto (しうと) → shuuto (しゅうと) → shūto.
kurōdo (蔵人 (くろうど), warehouse keeper (archivist, sake/soy sauce/miso maker)), from kurabito (蔵人 (くらびと)) 'storehouse' + 'person' → kurando (くらんど) → kuraudo (くらうど) → kuroudo (くろうど) → kurōdo. kurauzu (くらうず) is also found, as a variant of kuraudo (くらうど).
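
The derivations in this list share a common shape, which can be sketched in Python over Hepburn-style romanization. This is a minimal sketch, not an account of the historical changes themselves: the names fuse_with_u and hito_compound_stages are hypothetical, the intermediate kana-spelling stage (such as nakoudo) is collapsed into the final long-vowel form, and variants such as kurando are not modeled.

```python
def fuse_with_u(first: str) -> str:
    """Fuse the final vowel of `first` with a following u."""
    head, last = first[:-1], first[-1]
    if last in ("a", "o"):            # a + u, o + u -> ō
        return head + "ō"
    if last == "i":                   # i + u -> yū (palatalized)
        # Hepburn sh, ch and j already spell a palatal consonant, so no y.
        return head + ("ū" if head.endswith(("sh", "ch", "j")) else "yū")
    raise ValueError(f"unhandled final sound in {first!r}")


def hito_compound_stages(first: str, rendaku: bool = False) -> list[str]:
    """Return [full compound, reduced form, fused modern form]."""
    full, reduced = ("bito", "udo") if rendaku else ("hito", "uto")
    return [
        first + full,                       # e.g. otohito, nakabito
        first + reduced,                    # hito -> uto, bito -> udo
        fuse_with_u(first) + reduced[1:],   # vowel fusion with the u
    ]


print(" → ".join(hito_compound_stages("oto")))
print(" → ".join(hito_compound_stages("kari", rendaku=True)))
```

The rendaku flag selects the -bito/-udo pair, as in nakōdo and karyūdo; without it the -hito/-uto pair is used, as in otōto and shūto.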

Fusion

In some cases, morphemes have effectively fused and are no longer recognizable as being composed of two separate morphemes.

Notes

References

^ Itō & Mester (1995), p. 817.
^ Nasu (2015), p. 253.
^ Ito & Mester (2015a), pp. 289–290.
^ Starr & Shih (2017), p. 11.
^ Nasu (2015), p. 255.
^ Labrune (2012), pp. 96–98.
^ Labrune (2012), p. 59.
^ Riney et al. (2007).
^ Maekawa (2020).
^ ^a ^b ^c ^d ^e ^f ^g ^h ^i ^j Okada (1999), p. 118.
^ ^a ^b ^c ^d ^e Labrune (2012), p. 92.
^ ^a ^b Vance (2008), p. 89.
^ ^a ^b ^c ^d Akamatsu (1997), p. 106.
^ Akamatsu (1997) employs a different symbol, [l̆], for the lateral tap.
^ Arai, Warner & Greenberg (2007), p. 48.
^ Vance (2008), pp. 101–102.
^ Labrune (2012), pp. 133–134.
^ Vance (2008), pp. 96, 99.
^ ^a ^b ^c Vance (2008), p. 96.
^ Vance (2008), pp. 97, 99.
^ ^a ^b Vance (2008), p. 97.
^ Saito (2005:94) and National Language Research Institute (1990:514), cited in Maekawa (2023:191–192).
^ Yamane & Gick (2010).
^ Hashi et al. (2014).
^ Nogita & Yamane (2015).
^ Mizoguchi (2019), p. 65.
^ Maekawa (2023), p. 209.
^ Maekawa (2023), pp. 209–210.
^ Maekawa (2010).
^ ^a ^b Jeroen van de Weijer; Kensuke Nanjo; Tetsuo Nishihara (2005). Voicing in Japanese. Walter de Gruyter. p. 150. ISBN 978-3-11-019768-6.
^ ^a ^b ^c ^d ^e Itō & Mester (1995), p. 825.
^ Labrune (2012), p. 68.
^ Shibatani (1990), pp. 164–165.
^ Nogita (2006), p. 73.
^ Nogita (2006), pp. 78–79.
^ Nogita (2006), p. 83.
^ Martin (1959), p. 376.
^ Labrune (2012), p. 78.
^ Recasens (2013), p. 11.
^ ^a ^b ^c ^d ^e ^f Nogita (2006), p. 75.
^ Labrune (2012), p. 69.
^ Itō & Mester (1995), p. 827.
^ ^a ^b Itō & Mester (1995), p. 828.
^ Irwin (2011), p. 84.
^ Hall (2013).
^ Nogita (2006), p. 79.
^ Crawford (2009), p. 97.
^ Pintér (2015), p. 145.
^ Labrune (2012), p. 135.
^ Labrune (2012), pp. 132–133.
^ Shibatani (1990), p. 170.
^ Maekawa (2023), p. 2.
^ Vance (2008), pp. 110–112, 223–225.
^ ^a ^b Kubozono (2015a), p. 34.
^ Aoyama (2001), p. 17.
^ Mizoguchi (2019), p. 2.
^ Vance (1987), pp. 110–111.
^ Akamatsu (1997), p. 130.
^ Japanese academics represent [ɡo] as ご and [ŋo] as こ゚.
^ Shibatani (1990), p. 172.
^ ^a ^b Labrune (2012), p. 25.
^ Akamatsu (1997), p. 31.
^ ^a ^b ^c Vance (2008), pp. 54–56.
^ ^a ^b Okada (1999), p. 117.
^ Labrune (2012), p. 39.
^ ^a ^b Labrune (2012), p. 40.
^ ^a ^b Labrune (2012), p. 45.
^ Labrune (2012), p. 47.
^ Labrune (2012), p. 44.
^ Labrune (2012), p. 46.
^ ^a ^b Labrune (2012), pp. 34–35.
^ Tsuchida (2001), p. 225.
^ Tsuchida (2001), fn 3.
^ Seward (1992), p. 9.
^ Shaw & Kawahara (2018), pp. 101–102.
^ Moras are represented orthographically in katakana and hiragana – each mora, with the exception of CjV clusters, being one kana – and are referred to in Japanese as 'on' or 'onji'.
^ Verdonschot, Rinus G.; Kiyama, Sachiko; Tamaoka, Katsuo; Kinoshita, Sachiko; Heij, Wido La; Schiller, Niels O. (2011). "The functional unit of Japanese word naming: Evidence from masked priming". Journal of Experimental Psychology: Learning, Memory, and Cognition. 37 (6): 1458–1473. doi:10.1037/a0024491. hdl:1887/18409. PMID 21895391. S2CID 18278865.
^ Labrune (2012), p. 143.
^ Also notated /H/, following the conventional usage of h for lengthened vowels in romanization.
^ Labrune (2012), pp. 143–144.
^ Itō & Mester (1995:827). In such a classification scheme, the plain counterparts of moras with a palatal glide are onsetless moras.
^ Aoyama (2001), pp. 1–2.
^ Aoyama (2001), p. 11.
^ Aoyama (2001), pp. 7–8.
^ ^a ^b ^c Labrune (2012), p. 171.
^ Ito & Mester (2015b), p. 384.
^ Poser (1990), p. 82.
^ In content, all examples are taken from Poser (1990:82–89); however, the original phonemic transcriptions have been altered and mora boundaries and romanizations have been added.
^ Poser (1990), pp. 82, 84.
^ Poser (1990), pp. 84–85.
^ Poser (1990), p. 85.
^ Poser (1990), pp. 85, 89.
^ Poser (1990), p. 86.
^ Poser (1990), pp. 86, 88–89.
^ Ito & Mester (2015a), p. 290.
^ Starr & Shih (2017), pp. 6–8.
^ Otake (2015), p. 504.
^ Shinohara (2004), p. 295.
^ Vance (2017), p. 26.
^ Vance (2008), p. 135.
^ Kubozono (2015a), pp. 5–6.
^ Labrune (2012), p. 54.
^ Labrune (2012), pp. 53–56.
^ ^a ^b ^c Vance (2008), p. 62.
^ Vance (2008), p. 63.
^ Vance (2008), pp. 174–175.
^ ^a ^b Ito & Mester (2016).
^ Ito & Mester (2015b), pp. 375–376.
^ Ito & Mester (2015b:375–376), Ito & Mester (2016).
^ ^a ^b Kubozono (2015a), p. 13.
^ ^a ^b Aoyama (2001), p. 9.
^ ^a ^b Ito & Mester (2015b), p. 376.
^ Ohta (1991), p. 168.
^ ^a ^b ^c ^d ^e ^f ^g Ito & Mester (2015b), p. 377.
^ Vance (1987), pp. 72–73.
^ Vance (2008), p. 132.
^ Ito & Mester (2015b:376), Ito & Mester (2016).
^ Ohta (1991), pp. 168, 177.
^ Kubozono (2015a:13–14), Kubozono (2015c:341–342).
^ Ito & Mester (2018), p. 214.
^ Ohta (1991), p. 177.
^ Ohta (1991:177), Ito & Mester (2015b:377).
^ Kubozono (2015c), p. 343.
^ Kubozono, Itô & Mester (2009), p. 956.
^ Kubozono (2015c), p. 337.
^ Kubozono (2015c), p. 338.
^ Kawahara & Shaw (2018), §5.
^ Vance (2008), p. 119.
^ Irwin (2011), pp. 75–76.
^ Crawford (2009), p. 15.
^ Broselow et al. (2012), p. 99.
^ ^a ^b Smith (1980), §3.1.4.2.5.
^ Crawford (2009), p. 69.
^ Crawford (2009), pp. 71–72.
^ ^a ^b Smith (1980), §5.6.
^ ^a ^b Crawford (2009), p. 72.
^ Vance (2008), p. 84.
^ Watanabe (2009), p. 163.
^ ^a ^b Shinohara (2004), p. 316.
^ Shinohara (2004), p. 299.
^ Vance (2008), pp. 84, 87.
^ Watanabe (2009), p. 151.
^ Pintér (2015), pp. 121–122.
^ Watanabe (2009), p. 174.
^ Kitaoka (2017), p. 6.
^ Ito & Mester (2015a), p. 291.
^ Ito & Mester (2015a), pp. 304–305.
^ Ito & Mester (2015a), pp. 295, 297.
^ Vance (2015), p. 421.
^ Kawahara (2015), p. 66.
^ Labrune (2012), p. 136.
^ Kubozono, Itô & Mester (2009), pp. 955, 972.
^ ^a ^b Tamaoka & Makioka (2004), pp. 540, 542.
^ Labrune (2012), pp. 70, 136.
^ Labrune (2012), p. 104.
^ Kawahara (2006), p. 550.
^ Labrune (2012:104–105) points out that the prefix |bu| has the same effect.
^ Crawford (2009), pp. 62–65.
^ Kawahara (2006:537–538), citing Katayama (1998).
^ Kawahara (2006), p. 538.
^ ^a ^b ^c ^d Kitaoka (2017), p. 5.
^ Ito, Kubozono & Mester (2017), p. 296.
^ ^a ^b Kawahara (2015), p. 54.
^ ^a ^b ^c Kawahara (2011), pp. 1–2.
^ ^a ^b Sano (2013), pp. 245–246.
^ Kawahara (2011:2) and Sano (2013:246), citing Nishimura (2003).
^ Kawahara (2006), pp. 559, 561, 565.
^ Vance (2008), p. 113.
^ Kawahara (2015), p. 68.
^ ^a ^b Schourup & Tamori (1992), pp. 137–138.
^ Vance (2008), p. 110.
^ Labrune (2012), p. 53.
^ Ito & Mester (2015a), p. 292.
^ ^a ^b Kawahara (2003).
^ Vance (2008), p. 133.
^ Kubozono (2015b), pp. 225–228.
^ Kubozono (2015b), pp. 226–227.
^ Nasu (2015), p. 257.
^ Nasu (2015), pp. 260–261.
^ Nasu (2015), p. 284.
^ ^a ^b Nasu (2015), p. 276.
^ Nasu (2015), pp. 261, 266, 280.
^ Nasu (2015), p. 264.

Bibliography

Akamatsu, Tsutomu (1997), Japanese Phonetics: Theory and Practice, München: Lincom Europa, ISBN 978-3-89586-095-9
Aoyama, Katsura (2001), A Psycholinguistic Perspective on Finnish and Japanese Prosody: Perception, Production and Child Acquisition of Consonantal Quantity Distinctions, Springer Science & Business Media, ISBN 978-0-7923-7216-5
Arai, Takayuki; Warner, Natasha; Greenberg, Steven (2007), "Analysis of spontaneous Japanese in a multi-language telephone-speech corpus", Acoustical Science and Technology, 28 (1): 46–48, doi:10.1250/ast.28.46
Broselow, Ellen; Huffman, Marie; Hwang, Jiwon; Kao, Sophia; Lu, Yu-An (2012), "Emergent Rankings in Foreign Word Adaptations", in Arnett, Nathan; Bennett, Ryan (eds.), Proceedings of the 30th West Coast Conference on Formal Linguistics, pp. 98–108
Crawford, Clifford James (2009), Adaptation and Transmission in Japanese Loanword Phonology (PhD thesis)
Hall, Kathleen Currie (2013), "Documenting phonological change: A comparison of two Japanese phonemic splits" (PDF), in Luo, Shan (ed.), Proceedings of the 2013 Annual Conference of the Canadian Linguistic Association
Hashi, Michiko; Komada, Akina; Miura, Takao; Daimon, Shotaro; Takakura, Yuhki; Hayashi, Ryoko (2014), "Articulatory Variability in Word-Final Japanese Moraic-Nasals: An X-ray Microbeam Study", Journal of the Phonetic Society of Japan, 18 (2): 95–105, doi:10.24467/onseikenkyu.20.1_77
Irwin, Mark (2011), Loanwords in Japanese, John Benjamins, ISBN 978-90-2720592-6
Ito, Junko; Kubozono, Haruo; Mester, Armin (2017), "A prosodic account of consonant gemination in Japanese loanwords", in Kubozono, Haruo (ed.), The Phonetics and Phonology of Geminate Consonants, Oxford University Press, pp. 283–320
Itō, Junko; Mester, R. Armin (1995), "Japanese phonology", in Goldsmith, John A (ed.), The Handbook of Phonological Theory, Blackwell Handbooks in Linguistics, Blackwell Publishers, pp. 817–838
Ito, Junko; Mester, Armin (2015a), "Sino-Japanese phonology", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 289–312
Ito, Junko; Mester, Armin (2015b), "Word formation and phonological processes", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 363–395
Ito, Junko; Mester, R. Armin (2016), "Pitch accents as tonal complexes: Evidence from superheavies" (PDF), Keio University
Ito, Junko; Mester, R. Armin (2018), "Tonal alignment and preaccentuation", Journal of Japanese Linguistics, 34 (2): 195–222, doi:10.1515/jjl-2018-0014
Kawahara, Shigeto (2003), Phonological Society of Japan (ed.), "On a Certain Type of Hiatus Resolution in Japanese" (PDF), On'in Kenkyuu 音韻研究 [Phonological Studies], 6: 11–20
Kawahara, Shigeto (2006), "A faithfulness ranking projected from a perceptibility scale: The case of [+voice] in Japanese", Language, 82 (3): 536–574, doi:10.1353/lan.2006.0146, S2CID 145093954
Kawahara, Shigeto (2011), Japanese loanword devoicing revisited: A rating study
Kawahara, Shigeto (2015), "The phonetics of sokuon, or geminate obstruents", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 43–77
Kawahara, Shigeto; Shaw, Jason (2018), "Persistence of prosody", Hana-bana (花々): A Festschrift for Junko Ito and Armin Mester, UC Santa Cruz: Festschrifts
Kitaoka, Daiho (2017), "Repair Strategies for failed feature specification in Japanese: Evidence from loanwords, a reversing word game, and blending.", Proceedings of the Annual Meetings on Phonology, 4, doi:10.3765/amp.v4i0.3978
Kubozono, Haruo; Itô, Junko; Mester, Armin (2009), The Linguistic Society of Korea (ed.), "Consonant Gemination in Japanese Loanword Phonology", Current Issues in Unity and Diversity of Languages. Collection of Papers Selected from the 18th International Congress of Linguists, Republic of Korea: Dongam Publishing Co.
Kubozono, Haruo (2015a), "Introduction to Japanese phonetics and phonology", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 1–40, doi:10.1515/9781614511984.1, ISBN 978-1-61451-252-3
Kubozono, Haruo (2015b), "Diphthongs and vowel coalescence", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 215–249, doi:10.1515/9781614511984.1, ISBN 978-1-61451-252-3
Kubozono, Haruo (2015c), "Loanword phonology", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 313–361, doi:10.1515/9781614511984.1, ISBN 978-1-61451-252-3
Labrune, Laurence (2012), The Phonology of Japanese, Oxford, England: Oxford University Press, ISBN 978-0-19-954583-4
Maekawa, Kikuo (2010), "Coarticulatory reinterpretation of allophonic variation: Corpus-based analysis of /z/ in spontaneous Japanese", Journal of Phonetics, 38 (3): 360–374, doi:10.1016/j.wocn.2010.03.001
Maekawa, Kikuo (2020), "Remarks on Japanese /w/", ICU Working Papers in Linguistics, 10: 45–52, doi:10.34577/00004625
Maekawa, Kikuo (2023), "Production of the utterance-final moraic nasal in Japanese: A real-time MRI study", Journal of the International Phonetic Association, 53 (1): 189–212, doi:10.1017/S0025100321000050
Martin, Samuel E. (1959), "Japanische Phonetik by Günther Wenck", Language, 35 (2): 370–382, doi:10.2307/410554, JSTOR 410554
Mizoguchi, Ai (2019), Articulation of the Japanese Moraic Nasal: Place of Articulation, Assimilation, and L2 Transfer (PhD thesis), City University of New York
Nasu, Akio (2015), "The phonological lexicon and mimetic phonology", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 253–288
National Language Research Institute (1990), Nihongo no boin, shiin, onsetsu: Chōon undō no jikken onseigaku-teki kenkyū 日本語の母音，子音，音節: 調音運動の実験音声学的研究 [Japanese vowels, consonants, syllables: Experimental phonetics research of articulatory movements] (in Japanese), Tokyo: National Language Research Institute, doi:10.15084/00001212
Nogita, Akitsugu (2006), "Arguments that Japanese [Cj]s are complex onsets: durations of Japanese [Cj]s and Russian [Cʲ]s and blocking of Japanese vowel devoicing", Working Papers of the Linguistics Circle of the University of Victoria, 26 (1): 73–99
Nogita, Akitsugu; Yamane, Noriko (2015), "Japanese moraic dorsalized nasal stop" (PDF), Phonological Studies, 18: 75–84, archived from the original (PDF) on 2019-08-19, retrieved 2020-04-09
Ohta, Satoshi (1991), "Syllable and Mora Geometry in Japanese", Tsukuba English Studies, 10: 157–181
Okada, Hideo (1999), "Japanese", in International Phonetic Association (ed.), Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet, Cambridge University Press, pp. 117–119, ISBN 978-0-52163751-0
Otake, Takashi (2015), "Mora and mora-timing", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 493–523
Pintér, Gábor (2015), "The emergence of new consonant contrasts", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 121–165, doi:10.1515/9781614511984.167, ISBN 978-1-61451-252-3
Poser, William J. (1990), "Evidence for Foot Structure in Japanese", Language, 66 (1): 78–105
Recasens, Daniel (2013), "On the articulatory classification of (alveolo)palatal consonants" (PDF), Journal of the International Phonetic Association, 43 (1): 1–22, doi:10.1017/S0025100312000199, S2CID 145463946, archived from the original (PDF) on 2021-05-06, retrieved 2015-11-23
Riney, Timothy James; Takagi, Naoyuki; Ota, Kaori; Uchida, Yoko (2007), "The intermediate degree of VOT in Japanese initial voiceless stops", Journal of Phonetics, 35 (3): 439–443, doi:10.1016/j.wocn.2006.01.002
Saito, Yoshio (2005), Nihongo Onseigaku Nyūmon 日本語音声学入門 (in Japanese) (2nd ed.), Tokyo: Sanseido, ISBN 4-385-34588-0
Sano, Shin-ichiro (2013), "Patterns in Avoidance of Marked Segmental Configurations in Japanese Loanword Phonology" (PDF), Proceedings of GLOW in Asia IX: Main Session: 245–260
Schourup, Lawrence; Tamori, Ikuhiro (1992), "Japanese Palatalization in Relation to Theories of Restricted Underspecification", Gengo Kenkyu 言語研究, 101: 107–145
Seward, Jack (1992), Easy Japanese, McGraw-Hill Professional, ISBN 978-0-8442-8495-8
Shaw, Jason A.; Kawahara, Shigeto (2018), "The lingual articulation of devoiced /u/ in Tokyo Japanese" (PDF), Journal of Phonetics, 66: 100–119, doi:10.1016/j.wocn.2017.09.007
Shibatani, Masayoshi (1990), The Languages of Japan, Cambridge: Cambridge University Press, ISBN 978-0-521-36070-8
Shinohara, Shigeko (2004), "Emergence of universal grammar in foreign word adaptations", in Kager, René; Pater, Joe; Zonneveld, Wim (eds.), Constraints in Phonological Acquisition, Cambridge University Press, pp. 292–320
Smith, R. Edward (1980), Natural Phonology of Japanese (Thesis)
Starr, Rebecca Lurie; Shih, Stephanie S (2017), "The syllable as a prosodic unit in Japanese lexical strata: Evidence from text-setting", Glossa: A Journal of General Linguistics, 2 (1) 93: 1–34, doi:10.5334/gjgl.355
Tamaoka, Katsuo; Makioka, Shogo (2004), "Frequency of occurrence for units of phonemes, morae, and syllables appearing in a lexical corpus of a Japanese newspaper", Behavior Research Methods, Instruments, & Computers, 36 (3): 531–547
Tsuchida, Ayako (2001), "Japanese vowel devoicing", Journal of East Asian Linguistics, 10 (3): 225–245, doi:10.1023/A:1011221225072, S2CID 117861220
Vance, Timothy J. (1987), An Introduction to Japanese Phonology, Albany, NY: State University of New York Press, ISBN 978-0-88706-360-2
Vance, Timothy J. (2008), The Sounds of Japanese, Cambridge University Press, ISBN 978-0-5216-1754-3
Vance, Timothy J. (2015), "Rendaku", in Kubozono, Haruo (ed.), Handbook of Japanese Phonetics and Phonology, Berlin: De Gruyter, pp. 397–441
Vance, Timothy J. (2017), "The Japanese Syllable Debate: A Skeptical Look at Some Anti-Syllable Arguments", Proceedings of GLOW in Asia XI, MIT Working Papers in Linguistics 84, 1
Watanabe, Seiji (2009), Cultural and Educational Contributions to Recent Phonological Changes in Japanese (PhD thesis), The University of Arizona
Yamane, Noriko; Gick, Bryan (2010), "Speaker-specific place of articulation: Idiosyncratic targets for Japanese coda nasal", Canadian Acoustics, 38 (3): 136–137
