Learners of Chinese frequently ask how frequent individual characters are, since this helps to order them in a reasonable sequence for learning. Numerous lists of common characters are available from various sources (the Oxford Dictionary, Wenlin or online lists). They are often treated as absolute, although they obviously depend on the corpus (the list in the Oxford Dictionary, for example, is skewed towards newspaper texts). The Chinese Internet corpus is a snapshot of the Chinese Web from 2005, so the frequency list of characters derived from it may be more general (though still not ideal). The list of characters is available from here.
The first column is the rank; the second is the frequency, normalised per million characters. This means that if you read Internet texts, 的 will occur 38343 times in every million characters and 汽 205 times (rank 877), while on average you have to read about 100 million characters on the Internet to come across 腙 (in modern Chinese it is used for naming chemical compounds, e.g., 安巴腙 Ambazone).
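The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration, not the script used to build the list: the whitespace-separated layout with the character in a third column, the rank 1 given to 的, and the names `Entry`, `parse_line`, `per_million` and `chars_per_occurrence` are all assumptions made for the example.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    rank: int
    per_million: float
    char: str


def parse_line(line: str) -> Entry:
    # Assumed layout: rank, frequency per million, character,
    # separated by whitespace. The real file may differ.
    rank, freq, char = line.split()
    return Entry(int(rank), float(freq), char)


def per_million(count: int, total_chars: int) -> float:
    # Normalise a raw count to occurrences per million characters.
    return count / total_chars * 1_000_000


def chars_per_occurrence(freq_per_million: float) -> float:
    # Average number of characters read per occurrence.
    return 1_000_000 / freq_per_million


# Frequencies quoted in the text; the rank of 的 is assumed here.
sample = ["1 38343 的", "877 205 汽"]
for entry in map(parse_line, sample):
    gap = round(chars_per_occurrence(entry.per_million))
    print(f"{entry.char}: rank {entry.rank}, about one in every {gap} characters")
```

On these figures 的 turns up roughly once in every 26 characters and 汽 roughly once in every 4878. A character seen once per 100 million characters, like 腙, corresponds to a normalised frequency of about 0.01 per million.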
The three corpora listed above are:
If you use these corpora in your studies, please refer to:
Sharoff, S. (2006) Creating general-purpose corpora using automated search engine queries. In Marco Baroni and Silvia Bernardini, editors, WaCky! Working papers on the Web as Corpus. Gedit, Bologna.