From an English-learning perspective, vocabulary is best learned through a kind of mental stretching. I mean that new vocabulary is learned by building on vocabulary you already know (e.g., from "big" one can learn "huge" and then "giant" and so on). A technique that works well for this is "mind mapping" (similar to brainstorming maps).
At Caves Bookstore (specifically the Taoyuan branch near the train station) I found a book that does exactly that. It maps out words in different units. The book, 翻譯大師教你記單字—進階篇, is written for those learning English, but with a little white-out and a good electronic dictionary (and language exchange partners), it becomes a good book for expanding my word bank in my second language of choice.
Last week I learned some words about actors, film dubbing/narrating, and movie genres. Next week I'm on to astrology and descriptive adjectives.
If only I spent more time studying instead of planning to study...
Friday, August 1, 2008
Corpus Linguistics and Chinese
Word frequency has been on the tips of tongues, and typed hundreds of times, by those in linguistics, especially the applied linguists. Unfortunately, they're usually talking about the English language in the context of second language acquisition.
So what about Chinese? More than a billion people speak the language (and more than 23 million use the traditional, full form (thankfully!))...there must be a corpus in use to calculate the high-frequency words so learners know which words are important. And book publishers would know about these lists so their books would teach those top words so learners don't waste their time learning archaic words no one says anymore, right? Right?
W R O N G .
Integrated Chinese still uses words like {Na-lee} which is an old China-Chinese form of "Oh, you're too much! Stop embarrassing me!" Practical Audio-Visual Chinese teaches {ku1} which means "to cry" in Unit 24 (of 26 in Book 1).
Sinosplice links to a top 1000 list, which appears great, with the first entries being truly common words. But look a little further, like around the late 900s and early 1000s. Yes, that's right: 魚 {u3} appears at 971, 爹 {dai1} is at 965 whereas 爸 is at 991, 汽 {cheee1} at 1117, and the list goes on. I question this list, and wonder where Patrick, the creator, got his stats from.
Sinosplice links to another list, this one created by Jun Da and used by yellowbridge.com in their pay-for-its-convenience service. The left-hand menu bar offers some background info and the site is a university (.edu) site...but wait! 爸 is 1698? What kind of data are these sites using? I'm guessing very few spoken instances and mostly classical written texts. But before I really start criticizing this, I should read the introductory letter...but I can't just now since the link is opening a PDF that only shows the even pages...but you're more than welcome to read it in the meantime and maybe let me know what the odd pages say.
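For the curious, here is a rough sketch of why I suspect the source texts matter so much. It counts characters in two tiny samples and ranks them by frequency. Both samples are sentences I made up for illustration (they are NOT data from either list above), and `char_ranks` is my own hypothetical helper, not anything those sites use.

```python
from collections import Counter

def char_ranks(text):
    """Map each Chinese character in text to its frequency rank (1 = most common)."""
    # Count only characters in the basic CJK Unified Ideographs block,
    # which skips punctuation, spaces, and Latin letters.
    counts = Counter(ch for ch in text if '\u4e00' <= ch <= '\u9fff')
    # Sort by descending count; break ties by code point so results are stable.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {ch: rank for rank, (ch, _) in enumerate(ordered, start=1)}

# Two made-up samples: one conversational, one in a formal written style.
spoken = "爸爸今天買魚。爸爸說魚很好吃。我也說好吃。"
written = "汽車工業發展迅速，汽車產量逐年增加，工業產值亦然。"

spoken_ranks = char_ranks(spoken)
written_ranks = char_ranks(written)

# Compare where a few characters land in each sample (None = never appears).
for ch in "爸魚汽":
    print(ch, spoken_ranks.get(ch), written_ranks.get(ch))
```

The same character can sit at the top of one ranking and be missing entirely from the other, so a list built mostly from formal written texts could plausibly push everyday spoken words like 爸 far down.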