m and n are both nasals (nasal being their 'manner of articulation', meaning air flows out through the nose due to a lowering of the velum/soft palate), but the m is labial (involves the use of the lips) in its 'place of articulation', whilst the n is alveolar (meaning the tongue tip moves forward to touch the top tooth ridge and the back of the teeth where they meet this ridge) in its place of articulation. Putting that into full linguistic parlance, one would say that m is a bilabial nasal (sound) whereas n is an alveolar nasal. Both incidentally are voiced (meaning the vocal cords in the larynx vibrate in producing each sound).
z meanwhile could be compared to s, not so much in terms of voiced vs voiceless (which they are), but to my mind in the slight difference between them in tongue position/direction/pressure/sealing (which isn't quite conveyed by describing them simply as voiced vs voiceless alveolar fricatives). With the s, the crest (my conscious use of what I believe is a non-technical term) of the tongue feels (judging from its sides) as if it is at the point of the alveolar ridge where the premolars and canines are, and the tongue's very tip is thus lax and sagging somewhat. With the z meanwhile, the crest is in more or less the same place, but the tip has come up to nearly rest against almost the whole alveolar ridge, and is also almost touching the backs/roots of the two front incisors. The exception is the portion of the ridge behind the two or so very front incisors: a gap of some sort has to be left in the seal for some air to escape, otherwise the z would start to sound like an s again (if the whole alveolar ridge were completely sealed off and no "buzz" allowed at the tongue tip). The distinction between s and z is therefore like a flap or trapdoor sitting slightly higher in its second (z) position (but not to the point of closing completely) than it was in its first, somewhat more open (s) one; the z is a more "constricted" (constrained?) fricative, i.e. one where the air flows through a smaller gap than it does for s.
So the z is the "greater" fricative, and the hiss produced by the air escaping will be higher in pitch than with the s, and the greater resulting internal pressure (indeed, "buzz") could be regarded (for practical purposes) as what leads to the voicing back down in the larynx. The net result is that voiceless sounds will sound louder to the listener they are addressed to and reach (more air is escaping from the speaker's mouth), but voiced sounds might sound louder to the speaker producing them, to his or her "internal ear".
It will help one to hear the change in "hiss pitch" if one tries not to voice the z when switching from "s to z airflow" - it should be possible to let a slight continuous flow out during the switch without activating the larynx (but such a sound in normal conversation would likely be inaudible, or make the speaker appear strange to the listener even if it were audible!).
I am not sure if all this is quite linguistically sound, but it's how I'd make sense of things with a view to your Spaniard's apparent problem with English z.
Same thing really with t versus d - I feel that the "t tongue" is behaving somewhat like the s's, whilst the "d tongue" is behaving somewhat like the z's i.e. that there is a slight but perceivable difference in forward pressure of the tongue, so that in the t it just "flicks" at the gumline, but in the d it "bunches" slightly against that and partly the backs of the teeth too, increasing pressure and "leading" to voicing. Which all isn't that surprising, when you consider that t like s is voiceless, whilst d like z is voiced.
Quick summary:
m: voiced bilabial nasal
n: voiced alveolar nasal
s: voiceless alveolar fricative
z: voiced alveolar fricative
t: voiceless alveolar stop
d: voiced alveolar stop
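If it helps to see the three parameters (voicing, place, manner) laid out side by side, here is a rough Python sketch of the summary above as a little lookup table. The names Consonant, CONSONANTS, describe and voicing_pairs are just my own made-up labels for illustration, not anything from Kennedy's book, and the table only covers the six sounds discussed here.

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    """Three-parameter description of a consonant (hypothetical helper type)."""
    voiced: bool
    place: str
    manner: str


# Feature values copied straight from the quick summary; nothing else is assumed.
CONSONANTS = {
    "m": Consonant(True, "bilabial", "nasal"),
    "n": Consonant(True, "alveolar", "nasal"),
    "s": Consonant(False, "alveolar", "fricative"),
    "z": Consonant(True, "alveolar", "fricative"),
    "t": Consonant(False, "alveolar", "stop"),
    "d": Consonant(True, "alveolar", "stop"),
}


def describe(symbol):
    """Return the full label, e.g. 'voiced alveolar fricative'."""
    c = CONSONANTS[symbol]
    voicing = "voiced" if c.voiced else "voiceless"
    return f"{voicing} {c.place} {c.manner}"


def voicing_pairs():
    """Return (voiceless, voiced) pairs that share both place and manner."""
    pairs = []
    for a, ca in CONSONANTS.items():
        for b, cb in CONSONANTS.items():
            same_slot = (ca.place, ca.manner) == (cb.place, cb.manner)
            if not ca.voiced and cb.voiced and same_slot:
                pairs.append((a, b))
    return pairs


print(describe("z"))    # voiced alveolar fricative
print(voicing_pairs())  # [('s', 'z'), ('t', 'd')]
```

The pairs it prints (s/z and t/d) are exactly the ones compared above: same place and manner, differing only in voicing. m and n don't pair up this way because they differ in place rather than voicing.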
And a few silly phrases to practise all the above and get the tongue moving between the paired positions:
'my name' repeated quite a few times in a loop (my name my name my name etc etc). "maineimaineimaineim...".
'Sue's zoo sizzles!' (=is a red-hot attraction. Or maybe it's on fire and the animals are literally sizzling? LOL). "su:zu:si(z)zoolz" (z gets subtly repeated twice at the end, helping to underline the sound and its tongue action).
'Ted told Todd, don't dust!' (I guess they cohabit LOL). You can practise each word separately at first, then later as connected into a phrase; in reasonably swift/unpedantic connected speech, the t sound would be predominant in the first three words, and the d sound in the last two: "Te(d)tol(d)to(dd)doughn(t)dus(t)".
The above "technical" guff uses the terms in Kennedy's Structure and Meaning in English (a nice crash course in English from phoneme all the way through to discourse level):
http://books.google.co.uk/books?id=wE5X ... r#PPA19,M1 . The picture of the speech organs on page 19, and the table of English consonant phonemes on page 22*, are both particularly useful, as is the description of the three parameters for describing sounds that follows under the table on page 22 and onto page 23. Lastly, the stuff on voiced/voiceless on pages 20 and 23 should probably also be read, and you'll notice that there are exercises that you might like to try, as well as frequency statistics and various facts and figures that could prove useful (e.g. the distribution/position of phonemes, and the various consonant clusters, on page 33 and pages 34-36 respectively).
You might also find this interesting:
http://forums.eslcafe.com/teacher/viewt ... 9740#39740
And have you heard of Swan & Smith's Learner English?
http://books.google.co.uk/books?id=5jVg ... r#PPA90,M1
(Actually, this book is on that same Distance DELTA reading list that I directed you to (I've now realized!) over on the Hong Kong forum!)
Lastly, this could be a very useful resource (esp. as it's completely free!):
http://www.cambridge.org/elt/peterroach/resources.htm
>
http://www.cambridge.org/elt/peterroach ... ossary.pdf
*Note that the "bold" voiceless consonants in this table have not reproduced that well in the Google Book Search scan, but are basically (in my actual copy of the book) clearly only the left-hand one of any boxed pair of phonemes e.g. p is voiceless, b voiced. (And by this logic, any box with only a single right-hand phoneme in it must be referring to a voiced phoneme).